Re: Re: Question on KASAN calltrace record in RT
From: Mike Galbraith
Sent: April 14, 2021 15:56
To: Zhang, Qiang; Dmitry Vyukov
Cc: Andrew Halaney; andreyk...@gmail.com; ryabinin@gmail.com; a...@linux-foundation.org; linux-kernel@vger.kernel.org; kasan-...@googlegroups.com
Subject: Re: Re: Question on KASAN calltrace record in RT

[Please note: This e-mail is from an EXTERNAL e-mail address]

On Wed, 2021-04-14 at 07:29 +, Zhang, Qiang wrote:
> > If CONFIG_PREEMPT_RT is enabled but we are not preemptible, the prealloc
> > should be allowed
>
> No, you can't take an rtmutex when not preemptible.

Oh, I was confused. Thank you for your explanation.

> -Mike
Re: Question on KASAN calltrace record in RT
From: Mike Galbraith
Sent: April 14, 2021 12:00
To: Dmitry Vyukov; Zhang, Qiang
Cc: Andrew Halaney; andreyk...@gmail.com; ryabinin@gmail.com; a...@linux-foundation.org; linux-kernel@vger.kernel.org; kasan-...@googlegroups.com
Subject: Re: Question on KASAN calltrace record in RT

[Please note: This e-mail is from an EXTERNAL e-mail address]

On Tue, 2021-04-13 at 17:29 +0200, Dmitry Vyukov wrote:
> On Tue, Apr 6, 2021 at 10:26 AM Zhang, Qiang wrote:
> >
> > Hello everyone
> >
> > In an RT system, after Andrew's testing, we found the calltrace below.
> > In KASAN we record the callstack through stack_depot_save(), which may
> > call alloc_pages(); but on RT the spin_lock in alloc_pages() is replaced
> > with an rt_mutex, so if irqs are disabled before calling this function,
> > the following calltrace is triggered.
> >
> > Maybe add array[KASAN_STACK_DEPTH] in struct kasan_track to record the
> > callstack on RT systems.
> >
> > Is there a better solution?
>
> Hi Qiang,
>
> Adding 2 full stacks per heap object can increase memory usage too much.
> The stackdepot has a preallocation mechanism, I would start with
> adding interrupts check here:
> https://elixir.bootlin.com/linux/v5.12-rc7/source/lib/stackdepot.c#L294
> and just not do preallocation in interrupt context. This will solve
> the problem, right?

Hm, this thing might actually be (sorta?) working, modulo one startup gripe.
The CRASH_DUMP inspired gripe I get with !RT appeared (and shut up when told
I don't care given kdump has worked just fine for ages:), but no more
might_sleep() gripeage.
CONFIG_KASAN_SHADOW_OFFSET=0xdc00
CONFIG_HAVE_ARCH_KASAN=y
CONFIG_HAVE_ARCH_KASAN_VMALLOC=y
CONFIG_CC_HAS_KASAN_GENERIC=y
CONFIG_KASAN=y
CONFIG_KASAN_GENERIC=y
CONFIG_KASAN_OUTLINE=y
# CONFIG_KASAN_INLINE is not set
CONFIG_KASAN_STACK=1
CONFIG_KASAN_VMALLOC=y
# CONFIG_KASAN_MODULE_TEST is not set
---
 lib/stackdepot.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

--- a/lib/stackdepot.c
+++ b/lib/stackdepot.c
@@ -71,7 +71,7 @@ static void *stack_slabs[STACK_ALLOC_MAX_SLABS];
 static int depot_index;
 static int next_slab_inited;
 static size_t depot_offset;
-static DEFINE_SPINLOCK(depot_lock);
+static DEFINE_RAW_SPINLOCK(depot_lock);

 static bool init_stack_slab(void **prealloc)
 {
@@ -265,7 +265,7 @@ depot_stack_handle_t stack_depot_save(unsigned long *entries,
 	struct page *page = NULL;
 	void *prealloc = NULL;
 	unsigned long flags;
-	u32 hash;
+	u32 hash, may_prealloc = !IS_ENABLED(CONFIG_PREEMPT_RT) || preemptible();

If CONFIG_PREEMPT_RT is enabled but we are not preemptible, the prealloc
should be allowed; it should be changed like this:

may_prealloc = !(IS_ENABLED(CONFIG_PREEMPT_RT) && preemptible());

Thanks
Qiang

 	if (unlikely(nr_entries == 0) || stack_depot_disable)
 		goto fast_exit;
@@ -291,7 +291,7 @@ depot_stack_handle_t stack_depot_save(unsigned long *entries,
 	 * The smp_load_acquire() here pairs with smp_store_release() to
 	 * |next_slab_inited| in depot_alloc_stack() and init_stack_slab().
 	 */
-	if (unlikely(!smp_load_acquire(&next_slab_inited))) {
+	if (unlikely(!smp_load_acquire(&next_slab_inited) && may_prealloc)) {
 		/*
 		 * Zero out zone modifiers, as we don't have specific zone
 		 * requirements.
 		 * Keep the flags related to allocation in atomic contexts.
@@ -305,7 +305,7 @@ depot_stack_handle_t stack_depot_save(unsigned long *entries,
 		prealloc = page_address(page);
 	}

-	spin_lock_irqsave(&depot_lock, flags);
+	raw_spin_lock_irqsave(&depot_lock, flags);

 	found = find_stack(*bucket, entries, nr_entries, hash);
 	if (!found) {
@@ -329,7 +329,7 @@ depot_stack_handle_t stack_depot_save(unsigned long *entries,
 			WARN_ON(!init_stack_slab(&prealloc));
 	}

-	spin_unlock_irqrestore(&depot_lock, flags);
+	raw_spin_unlock_irqrestore(&depot_lock, flags);
 exit:
 	if (prealloc) {
 		/* Nobody used this memory, ok to free it. */

[0.692437] BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:943
[0.692439] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 1, name: swapper/0
[0.692442] Preemption disabled at:
[0.692443] [] on_each_cpu_cond_mask+0x30/0xb0
[0.692451] CPU: 5 PID: 1 Comm: swapper/0 Not tainted 5.12.0.g2afefec-tip-rt #5
[0.692454] Hardware name: MEDION MS-7848/MS-7848, BIOS M7848W08.20C 09/23/2013
[0.692456] Call Trace:
[0.692458]  ? on_each_cpu_cond_mask+0x30/0xb0
[0.692462]  dump_stack+0x8a/0xb5
[0.692467]  ___might_sleep.cold+0xfe/0x112
[0.692471]  rt_spin_lock+0x1c/0x60
[0.692475]  free_unref_page+0x117/0x3c0
[0.692481]  qlist_free_all+0x60/0xd0
[0.692485]  per_cpu_remove_cache+0x5b/0x70
[0.692488]  smp_call_function_many_cond+0x185/
Re: Question on KASAN calltrace record in RT
From: Dmitry Vyukov
Sent: April 13, 2021 23:29
To: Zhang, Qiang
Cc: Andrew Halaney; andreyk...@gmail.com; ryabinin@gmail.com; a...@linux-foundation.org; linux-kernel@vger.kernel.org; kasan-...@googlegroups.com
Subject: Re: Question on KASAN calltrace record in RT

[Please note: This e-mail is from an EXTERNAL e-mail address]

On Tue, Apr 6, 2021 at 10:26 AM Zhang, Qiang wrote:
> >
> > Hello everyone
> >
> > In an RT system, after Andrew's testing, we found the calltrace below.
> > In KASAN we record the callstack through stack_depot_save(), which may
> > call alloc_pages(); but on RT the spin_lock in alloc_pages() is replaced
> > with an rt_mutex, so if irqs are disabled before calling this function,
> > the following calltrace is triggered.
> >
> > Maybe add array[KASAN_STACK_DEPTH] in struct kasan_track to record the
> > callstack on RT systems.
> >
> > Is there a better solution?
>
> Hi Qiang,
>
> Adding 2 full stacks per heap object can increase memory usage too much.
> The stackdepot has a preallocation mechanism, I would start with
> adding interrupts check here:
> https://elixir.bootlin.com/linux/v5.12-rc7/source/lib/stackdepot.c#L294
> and just not do preallocation in interrupt context. This will solve
> the problem, right?

That seems useful; however, there is the following situation: if a lot of
stack traces need to be saved from interrupt context, the memory that was
preallocated to hold stack traces can be depleted, and when a stack needs to
be saved in an interrupt again, no memory will be available.
Thanks
Qiang

> Thanks
> Qiang
>
> BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:951
> [ 14.522262] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 640, name: mount
> [ 14.522304] Call Trace:
> [ 14.522306] dump_stack+0x92/0xc1
> [ 14.522313] ___might_sleep.cold.99+0x1b0/0x1ef
> [ 14.522319] rt_spin_lock+0x3e/0xc0
> [ 14.522329] local_lock_acquire+0x52/0x3c0
> [ 14.522332] get_page_from_freelist+0x176c/0x3fd0
> [ 14.522543] __alloc_pages_nodemask+0x28f/0x7f0
> [ 14.522559] stack_depot_save+0x3a1/0x470
> [ 14.522564] kasan_save_stack+0x2f/0x40
> [ 14.523575] kasan_record_aux_stack+0xa3/0xb0
> [ 14.523580] insert_work+0x48/0x340
> [ 14.523589] __queue_work+0x430/0x1280
> [ 14.523595] mod_delayed_work_on+0x98/0xf0
> [ 14.523607] kblockd_mod_delayed_work_on+0x17/0x20
> [ 14.523611] blk_mq_run_hw_queue+0x151/0x2b0
> [ 14.523620] blk_mq_sched_insert_request+0x2ad/0x470
> [ 14.523633] blk_mq_submit_bio+0xd2a/0x2330
> [ 14.523675] submit_bio_noacct+0x8aa/0xfe0
> [ 14.523693] submit_bio+0xf0/0x550
> [ 14.523714] submit_bio_wait+0xfe/0x200
> [ 14.523724] xfs_rw_bdev+0x370/0x480 [xfs]
> [ 14.523831] xlog_do_io+0x155/0x320 [xfs]
> [ 14.524032] xlog_bread+0x23/0xb0 [xfs]
> [ 14.524133] xlog_find_head+0x131/0x8b0 [xfs]
> [ 14.524375] xlog_find_tail+0xc8/0x7b0 [xfs]
> [ 14.524828] xfs_log_mount+0x379/0x660 [xfs]
> [ 14.524927] xfs_mountfs+0xc93/0x1af0 [xfs]
> [ 14.525424] xfs_fs_fill_super+0x923/0x17f0 [xfs]
> [ 14.525522] get_tree_bdev+0x404/0x680
> [ 14.525622] vfs_get_tree+0x89/0x2d0
> [ 14.525628] path_mount+0xeb2/0x19d0
> [ 14.525648] do_mount+0xcb/0xf0
> [ 14.525665] __x64_sys_mount+0x162/0x1b0
> [ 14.525670] do_syscall_64+0x33/0x40
> [ 14.525674] entry_SYSCALL_64_after_hwframe+0x44/0xae
> [ 14.525677] RIP: 0033:0x7fd6c15eaade
Question on KASAN calltrace record in RT
Hello everyone

In an RT system, after Andrew's testing, we found the calltrace below. In
KASAN we record the callstack through stack_depot_save(), which may call
alloc_pages(); but on RT the spin_lock in alloc_pages() is replaced with an
rt_mutex, so if irqs are disabled before calling this function, the
following calltrace is triggered.

Maybe add array[KASAN_STACK_DEPTH] in struct kasan_track to record the
callstack on RT systems.

Is there a better solution?

Thanks
Qiang

BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:951
[ 14.522262] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 640, name: mount
[ 14.522304] Call Trace:
[ 14.522306] dump_stack+0x92/0xc1
[ 14.522313] ___might_sleep.cold.99+0x1b0/0x1ef
[ 14.522319] rt_spin_lock+0x3e/0xc0
[ 14.522329] local_lock_acquire+0x52/0x3c0
[ 14.522332] get_page_from_freelist+0x176c/0x3fd0
[ 14.522543] __alloc_pages_nodemask+0x28f/0x7f0
[ 14.522559] stack_depot_save+0x3a1/0x470
[ 14.522564] kasan_save_stack+0x2f/0x40
[ 14.523575] kasan_record_aux_stack+0xa3/0xb0
[ 14.523580] insert_work+0x48/0x340
[ 14.523589] __queue_work+0x430/0x1280
[ 14.523595] mod_delayed_work_on+0x98/0xf0
[ 14.523607] kblockd_mod_delayed_work_on+0x17/0x20
[ 14.523611] blk_mq_run_hw_queue+0x151/0x2b0
[ 14.523620] blk_mq_sched_insert_request+0x2ad/0x470
[ 14.523633] blk_mq_submit_bio+0xd2a/0x2330
[ 14.523675] submit_bio_noacct+0x8aa/0xfe0
[ 14.523693] submit_bio+0xf0/0x550
[ 14.523714] submit_bio_wait+0xfe/0x200
[ 14.523724] xfs_rw_bdev+0x370/0x480 [xfs]
[ 14.523831] xlog_do_io+0x155/0x320 [xfs]
[ 14.524032] xlog_bread+0x23/0xb0 [xfs]
[ 14.524133] xlog_find_head+0x131/0x8b0 [xfs]
[ 14.524375] xlog_find_tail+0xc8/0x7b0 [xfs]
[ 14.524828] xfs_log_mount+0x379/0x660 [xfs]
[ 14.524927] xfs_mountfs+0xc93/0x1af0 [xfs]
[ 14.525424] xfs_fs_fill_super+0x923/0x17f0 [xfs]
[ 14.525522] get_tree_bdev+0x404/0x680
[ 14.525622] vfs_get_tree+0x89/0x2d0
[ 14.525628] path_mount+0xeb2/0x19d0
[ 14.525648] do_mount+0xcb/0xf0
[ 14.525665] __x64_sys_mount+0x162/0x1b0
[ 14.525670] do_syscall_64+0x33/0x40
[ 14.525674] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 14.525677] RIP: 0033:0x7fd6c15eaade
Re: [PATCH v2] loop: call __loop_clr_fd() with lo_mutex locked to avoid autoclear race
From: Pavel Tatashin
Sent: March 27, 2021 5:41
To: Zhang, Qiang
Cc: Jens Axboe; linux-bl...@vger.kernel.org; LKML
Subject: Re: [PATCH v2] loop: call __loop_clr_fd() with lo_mutex locked to avoid autoclear race

[Please note: This e-mail is from an EXTERNAL e-mail address]

On Fri, Mar 26, 2021 at 5:00 AM wrote:
>
> From: Zqiang
>
> lo->lo_refcnt = 0
>
> CPU0                                    CPU1
> lo_open()                               lo_open()
>  mutex_lock(&lo->lo_mutex)
>  atomic_inc(&lo->lo_refcnt)
>  lo_refcnt == 1
>  mutex_unlock(&lo->lo_mutex)
>                                          mutex_lock(&lo->lo_mutex)
>                                          atomic_inc(&lo->lo_refcnt)
>                                          lo_refcnt == 2
>                                          mutex_unlock(&lo->lo_mutex)
> loop_clr_fd()
>  mutex_lock(&lo->lo_mutex)
>  atomic_read(&lo->lo_refcnt) > 1
>  lo->lo_flags |= LO_FLAGS_AUTOCLEAR
>  mutex_unlock(&lo->lo_mutex)            lo_release()
>  return                                  mutex_lock(&lo->lo_mutex)
>                                          atomic_dec_return(&lo->lo_refcnt)
>                                          lo_refcnt == 1
>                                          mutex_unlock(&lo->lo_mutex)
>                                          return
>
> lo_release()
>  mutex_lock(&lo->lo_mutex)
>  atomic_dec_return(&lo->lo_refcnt)
>  lo_refcnt == 0
>  lo->lo_flags & LO_FLAGS_AUTOCLEAR == true
>  mutex_unlock(&lo->lo_mutex)            loop_control_ioctl()
>                                          case LOOP_CTL_REMOVE:
>                                           mutex_lock(&lo->lo_mutex)
>                                           atomic_read(&lo->lo_refcnt) == 0
>  __loop_clr_fd(lo, true)                 mutex_unlock(&lo->lo_mutex)
>   mutex_lock(&lo->lo_mutex)             loop_remove(lo)
>                                          mutex_destroy(&lo->lo_mutex)
>                                          kfree(lo)
>   ... data race
>
> When different tasks on two CPUs perform the above operations on the same
> lo device, a data race may occur. Do not drop lo->lo_mutex before calling
> __loop_clr_fd(), so the refcnt and LO_FLAGS_AUTOCLEAR check in lo_release()
> stay in sync.

> There is a race with the autoclear logic where a use-after-free may occur,
> as shown in the above scenario. Do not drop lo->lo_mutex before calling
> __loop_clr_fd(), so the refcnt and LO_FLAGS_AUTOCLEAR check in lo_release()
> stay in sync.

Hi Pasha,

This patch is incorrect; the lo->lo_state status check was ignored by me. In
lo_release() the lo_state is set to Lo_rundown, and when LOOP_CTL_REMOVE is
called, if lo->lo_state != Lo_unbound it returns directly and loop_remove()
is not called. I'm sorry to have misled you.
> > Reviewed-by: Pavel Tatashin
>
> Fixes: 6cc8e7430801 ("loop: scale loop device by introducing per device lock")
> Signed-off-by: Zqiang
> ---
> v1->v2:
>  Modify the title and commit message.
>
>  drivers/block/loop.c | 11 ++++-------
>  1 file changed, 4 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> index d58d68f3c7cd..5712f1698a66 100644
> --- a/drivers/block/loop.c
> +++ b/drivers/block/loop.c
> @@ -1201,7 +1201,6 @@ static int __loop_clr_fd(struct loop_device *lo, bool release)
>  	bool partscan = false;
>  	int lo_number;
>
> -	mutex_lock(&lo->lo_mutex);
>  	if (WARN_ON_ONCE(lo->lo_state != Lo_rundown)) {
>  		err = -ENXIO;
>  		goto out_unlock;
> @@ -1257,7 +1256,6 @@ static int __loop_clr_fd(struct loop_device *lo, bool release)
>  	lo_number = lo->lo_number;
>  	loop_unprepare_queue(lo);
>  out_unlock:
> -	mutex_unlock(&lo->lo_mutex);
>  	if (partscan) {
>  		/*
>  		 * bd_mutex has been held already in release path, so don't
> @@ -1288,12 +1286,11 @@ static int __loop_clr_fd(struct loop_device *lo, bool release)
>  	 * protects us from all the other places trying to change the 'lo'
>  	 * device.
>  	 */
> -	mutex_lock(&lo->lo_mutex);
> +
>  	lo->lo_flags = 0;
>  	if (!part_shift)
>  		lo->lo_disk->flags |= GENHD_FL_NO_PART_SCAN;
>  	lo->lo_state = Lo_unbound;
> -	mutex_unlock(&lo->lo_mutex);
>
>  	/*
>  	 * Need not hold lo_mutex to fput backing file. Calling fput holding
> @@ -1332,9 +1329,10 @@ static int loop_clr_fd(struct loop_device *lo)
>  		return 0;
>  	}
>
Re: [PATCH] loop: Fix use of unsafe lo->lo_mutex locks
From: Pavel Tatashin
Sent: March 25, 2021 21:09
To: Zhang, Qiang
Cc: Jens Axboe; linux-bl...@vger.kernel.org; LKML
Subject: Re: [PATCH] loop: Fix use of unsafe lo->lo_mutex locks

[Please note: This e-mail is from an EXTERNAL e-mail address]

> Hi Qiang,
>
> Thank you for root causing this issue. Did you encounter this issue or
> find it by inspection?
>
> I would change the title to what is actually being changed, something like:
>
> loop: call __loop_clr_fd() with lo_mutex locked to avoid autoclear race
>
> > ... kfree(lo)
> > UAF
> >
> > When different tasks on two CPUs perform the above operations on the same
> > lo device, a UAF may occur.
>
> Please also explain the fix:
>
> Do not drop lo->lo_mutex before calling __loop_clr_fd(), so the refcnt and
> LO_FLAGS_AUTOCLEAR check in lo_release() stay in sync.

Sorry Pasha, please ignore this; I sent a v2 patch.

In lo_release() we set lo->lo_state = Lo_rundown. In loop_control_ioctl(),
under LOOP_CTL_REMOVE, if (lo->lo_state != Lo_unbound) is true it returns
and loop_remove() is not called. I'm sorry to have misled you.
Thanks
Qiang

> Fixes: 6cc8e7430801 ("loop: scale loop device by introducing per device lock")
> Signed-off-by: Zqiang
> ---
>  drivers/block/loop.c | 11 ++++-------
>  1 file changed, 4 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> index d58d68f3c7cd..5712f1698a66 100644
> --- a/drivers/block/loop.c
> +++ b/drivers/block/loop.c
> @@ -1201,7 +1201,6 @@ static int __loop_clr_fd(struct loop_device *lo, bool release)
>  	bool partscan = false;
>  	int lo_number;
>
> -	mutex_lock(&lo->lo_mutex);
>  	if (WARN_ON_ONCE(lo->lo_state != Lo_rundown)) {
>  		err = -ENXIO;
>  		goto out_unlock;
> @@ -1257,7 +1256,6 @@ static int __loop_clr_fd(struct loop_device *lo, bool release)
>  	lo_number = lo->lo_number;
>  	loop_unprepare_queue(lo);
>  out_unlock:
> -	mutex_unlock(&lo->lo_mutex);
>  	if (partscan) {
>  		/*
>  		 * bd_mutex has been held already in release path, so don't
> @@ -1288,12 +1286,11 @@ static int __loop_clr_fd(struct loop_device *lo, bool release)
>  	 * protects us from all the other places trying to change the 'lo'
>  	 * device.
>  	 */
> -	mutex_lock(&lo->lo_mutex);
> +
>  	lo->lo_flags = 0;
>  	if (!part_shift)
>  		lo->lo_disk->flags |= GENHD_FL_NO_PART_SCAN;
>  	lo->lo_state = Lo_unbound;
> -	mutex_unlock(&lo->lo_mutex);
>
>  	/*
>  	 * Need not hold lo_mutex to fput backing file. Calling fput holding
> @@ -1332,9 +1329,10 @@ static int loop_clr_fd(struct loop_device *lo)
>  		return 0;
>  	}
>  	lo->lo_state = Lo_rundown;
> +	err = __loop_clr_fd(lo, false);
>  	mutex_unlock(&lo->lo_mutex);
>
> -	return __loop_clr_fd(lo, false);
> +	return err;
>  }
>
>  static int
> @@ -1916,13 +1914,12 @@ static void lo_release(struct gendisk *disk, fmode_t mode)
>  	if (lo->lo_state != Lo_bound)
>  		goto out_unlock;
>  	lo->lo_state = Lo_rundown;
> -	mutex_unlock(&lo->lo_mutex);
>  	/*
>  	 * In autoclear mode, stop the loop thread
>  	 * and remove configuration after last close.
>  	 */
>  	__loop_clr_fd(lo, true);
> -	return;
> +	goto out_unlock;
>  } else if (lo->lo_state == Lo_bound) {
>  	/*
>  	 * Otherwise keep thread (if running) and config,
> --
> 2.17.1

> LGTM
> Reviewed-by: Pavel Tatashin
>
> Thank you,
> Pasha
Re: [PATCH] loop: Fix use of unsafe lo->lo_mutex locks
From: Pavel Tatashin
Sent: March 25, 2021 21:09
To: Zhang, Qiang
Cc: Jens Axboe; linux-bl...@vger.kernel.org; LKML
Subject: Re: [PATCH] loop: Fix use of unsafe lo->lo_mutex locks

[Please note: This e-mail is from an EXTERNAL e-mail address]

> Hi Qiang,
>
> Thank you for root causing this issue. Did you encounter this issue or
> find it by inspection?

Hi Pasha,

I found the problem during inspection.

> I would change the title to what is actually being changed, something like:
>
> loop: call __loop_clr_fd() with lo_mutex locked to avoid autoclear race

Agreed.

> > ... kfree(lo)
> > UAF
> >
> > When different tasks on two CPUs perform the above operations on the same
> > lo device, a UAF may occur.
>
> Please also explain the fix:
>
> Do not drop lo->lo_mutex before calling __loop_clr_fd(), so the refcnt and
> LO_FLAGS_AUTOCLEAR check in lo_release() stay in sync.

I will modify it and resend.

Thanks
Qiang

> Fixes: 6cc8e7430801 ("loop: scale loop device by introducing per device lock")
> Signed-off-by: Zqiang
> ---
>  drivers/block/loop.c | 11 ++++-------
>  1 file changed, 4 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> index d58d68f3c7cd..5712f1698a66 100644
> --- a/drivers/block/loop.c
> +++ b/drivers/block/loop.c
> @@ -1201,7 +1201,6 @@ static int __loop_clr_fd(struct loop_device *lo, bool release)
>  	bool partscan = false;
>  	int lo_number;
>
> -	mutex_lock(&lo->lo_mutex);
>  	if (WARN_ON_ONCE(lo->lo_state != Lo_rundown)) {
>  		err = -ENXIO;
>  		goto out_unlock;
> @@ -1257,7 +1256,6 @@ static int __loop_clr_fd(struct loop_device *lo, bool release)
>  	lo_number = lo->lo_number;
>  	loop_unprepare_queue(lo);
>  out_unlock:
> -	mutex_unlock(&lo->lo_mutex);
>  	if (partscan) {
>  		/*
>  		 * bd_mutex has been held already in release path, so don't
> @@ -1288,12 +1286,11 @@ static int __loop_clr_fd(struct loop_device *lo, bool release)
>  	 * protects us from all the other places trying to change the 'lo'
>  	 * device.
>  	 */
> -	mutex_lock(&lo->lo_mutex);
> +
>  	lo->lo_flags = 0;
>  	if (!part_shift)
>  		lo->lo_disk->flags |= GENHD_FL_NO_PART_SCAN;
>  	lo->lo_state = Lo_unbound;
> -	mutex_unlock(&lo->lo_mutex);
>
>  	/*
>  	 * Need not hold lo_mutex to fput backing file. Calling fput holding
> @@ -1332,9 +1329,10 @@ static int loop_clr_fd(struct loop_device *lo)
>  		return 0;
>  	}
>  	lo->lo_state = Lo_rundown;
> +	err = __loop_clr_fd(lo, false);
>  	mutex_unlock(&lo->lo_mutex);
>
> -	return __loop_clr_fd(lo, false);
> +	return err;
>  }
>
>  static int
> @@ -1916,13 +1914,12 @@ static void lo_release(struct gendisk *disk, fmode_t mode)
>  	if (lo->lo_state != Lo_bound)
>  		goto out_unlock;
>  	lo->lo_state = Lo_rundown;
> -	mutex_unlock(&lo->lo_mutex);
>  	/*
>  	 * In autoclear mode, stop the loop thread
>  	 * and remove configuration after last close.
>  	 */
>  	__loop_clr_fd(lo, true);
> -	return;
> +	goto out_unlock;
>  } else if (lo->lo_state == Lo_bound) {
>  	/*
>  	 * Otherwise keep thread (if running) and config,
> --
> 2.17.1

> LGTM
> Reviewed-by: Pavel Tatashin
>
> Thank you,
> Pasha
Re: [PATCH v2] bpf: Fix memory leak in copy_process()
Hello Alexei Starovoitov, Daniel Borkmann,

Please review this patch.

Thanks
Qiang

From: Zhang, Qiang
Sent: March 15, 2021 16:53
To: a...@kernel.org; dan...@iogearbox.net; and...@kernel.org
Cc: dvyu...@google.com; linux-kernel@vger.kernel.org; syzbot+44908bb56d2bfe56b...@syzkaller.appspotmail.com; b...@vger.kernel.org; Zhang, Qiang
Subject: [PATCH v2] bpf: Fix memory leak in copy_process()

From: Zqiang

syzbot reported a memleak as follows:

BUG: memory leak
unreferenced object 0x888101b41d00 (size 120):
  comm "kworker/u4:0", pid 8, jiffies 4294944270 (age 12.780s)
  backtrace:
    [] alloc_pid+0x66/0x560
    [] copy_process+0x1465/0x25e0
    [] kernel_clone+0xf3/0x670
    [] kernel_thread+0x61/0x80
    [] call_usermodehelper_exec_work
    [] call_usermodehelper_exec_work+0xc4/0x120
    [] process_one_work+0x2c9/0x600
    [] worker_thread+0x59/0x5d0
    [] kthread+0x178/0x1b0
    [] ret_from_fork+0x1f/0x30
unreferenced object 0x888110ef5c00 (size 232):
  comm "kworker/u4:0", pid 8414, jiffies 4294944270 (age 12.780s)
  backtrace:
    [] kmem_cache_zalloc
    [] __alloc_file+0x1f/0xf0
    [] alloc_empty_file+0x69/0x120
    [] alloc_file+0x33/0x1b0
    [] alloc_file_pseudo+0xb2/0x140
    [] create_pipe_files+0x138/0x2e0
    [] umd_setup+0x33/0x220
    [] call_usermodehelper_exec_async+0xb4/0x1b0
    [] ret_from_fork+0x1f/0x30

After the UMD process exits, the pipe_to_umh/pipe_from_umh and tgid need to
be released.

Fixes: d71fa5c9763c ("bpf: Add kernel module with user mode driver that populates bpffs.")
Reported-by: syzbot+44908bb56d2bfe56b...@syzkaller.appspotmail.com
Signed-off-by: Zqiang
---
v1->v2:
 Judge whether the pointer variable tgid is valid.
 kernel/bpf/preload/bpf_preload_kern.c | 24 ++++++++++++++++----
 1 file changed, 20 insertions(+), 4 deletions(-)

diff --git a/kernel/bpf/preload/bpf_preload_kern.c b/kernel/bpf/preload/bpf_preload_kern.c
index 79c5772465f1..5009875f01d3 100644
--- a/kernel/bpf/preload/bpf_preload_kern.c
+++ b/kernel/bpf/preload/bpf_preload_kern.c
@@ -4,6 +4,7 @@
 #include
 #include
 #include
+#include
 #include
 #include "bpf_preload.h"

@@ -20,6 +21,14 @@ static struct bpf_preload_ops umd_ops = {
 	.owner = THIS_MODULE,
 };

+static void bpf_preload_umh_cleanup(struct umd_info *info)
+{
+	fput(info->pipe_to_umh);
+	fput(info->pipe_from_umh);
+	put_pid(info->tgid);
+	info->tgid = NULL;
+}
+
 static int preload(struct bpf_preload_info *obj)
 {
 	int magic = BPF_PRELOAD_START;
@@ -61,8 +70,10 @@ static int finish(void)
 	if (n != sizeof(magic))
 		return -EPIPE;
 	tgid = umd_ops.info.tgid;
-	wait_event(tgid->wait_pidfd, thread_group_exited(tgid));
-	umd_ops.info.tgid = NULL;
+	if (tgid) {
+		wait_event(tgid->wait_pidfd, thread_group_exited(tgid));
+		bpf_preload_umh_cleanup(&umd_ops.info);
+	}
 	return 0;
 }

@@ -80,10 +91,15 @@ static int __init load_umd(void)

 static void __exit fini_umd(void)
 {
+	struct pid *tgid;
 	bpf_preload_ops = NULL;
 	/* kill UMD in case it's still there due to earlier error */
-	kill_pid(umd_ops.info.tgid, SIGKILL, 1);
-	umd_ops.info.tgid = NULL;
+	tgid = umd_ops.info.tgid;
+	if (tgid) {
+		kill_pid(tgid, SIGKILL, 1);
+		wait_event(tgid->wait_pidfd, thread_group_exited(tgid));
+		bpf_preload_umh_cleanup(&umd_ops.info);
+	}
 	umd_unload_blob(&umd_ops.info);
 }
 late_initcall(load_umd);
--
2.17.1
Re: [PATCH] ARM: Fix incorrect use of smp_processor_id() by syzbot report
From: Dmitry Vyukov
Sent: March 12, 2021 14:30
To: Zhang, Qiang
Cc: Russell King - ARM Linux; Andrew Morton; LKML; Linux ARM; syzkaller-bugs
Subject: Re: [PATCH] ARM: Fix incorrect use of smp_processor_id() by syzbot report

[Please note: This e-mail is from an EXTERNAL e-mail address]

On Fri, Mar 12, 2021 at 5:13 AM wrote:
> >
> > From: Zqiang
> >
> > BUG: using smp_processor_id() in preemptible [] code:
> > syz-executor.0/15841
> > caller is debug_smp_processor_id+0x20/0x24
> > lib/smp_processor_id.c:64
> >
> > smp_processor_id() should be used in a code segment where preemption has
> > been disabled; otherwise, when preemption is enabled, the pointer it
> > yields is usually no longer useful since it may no longer point to the
> > per-cpu data of the current processor.
> >
> > Reported-by: syzbot
> > Fixes: f5fe12b1eaee ("ARM: spectre-v2: harden user aborts in kernel space")
> > Signed-off-by: Zqiang
> > ---
> >  arch/arm/include/asm/system_misc.h | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/arch/arm/include/asm/system_misc.h b/arch/arm/include/asm/system_misc.h
> > index 66f6a3ae68d2..61916dc7d361 100644
> > --- a/arch/arm/include/asm/system_misc.h
> > +++ b/arch/arm/include/asm/system_misc.h
> > @@ -21,8 +21,10 @@ typedef void (*harden_branch_predictor_fn_t)(void);
> >  DECLARE_PER_CPU(harden_branch_predictor_fn_t, harden_branch_predictor_fn);
> >  static inline void harden_branch_predictor(void)
> >  {
> > +	preempt_disable();
> >  	harden_branch_predictor_fn_t fn = per_cpu(harden_branch_predictor_fn,
> >  						  smp_processor_id());
> > +	preempt_enable();
> >  	if (fn)
> >  		fn();
> >  }
>
> Hi Qiang,
>
> If the CPU can change here, what if it changes right after preempt_enable()?
> Disabling preemption just around reading the callback looks like a no-op.
> Shouldn't we disable preemption at least around reading and calling the
> callback?

Hi dvyukov,

Oh, I was confused; we should call preempt_enable() after calling the
callback function, to make sure the callback is called on the current
processor. Thank you for the reminder.
> On second look, fn seems to be const after init, so maybe we need to use
> raw_smp_processor_id() instead, with an explanatory comment?
Re: possible deadlock in io_poll_double_wake (2)
From: Zhang, Qiang
Sent: March 3, 2021 20:15
To: Jens Axboe; syzbot; asml.sile...@gmail.com; io-ur...@vger.kernel.org; linux-fsde...@vger.kernel.org; linux-kernel@vger.kernel.org; syzkaller-b...@googlegroups.com; v...@zeniv.linux.org.uk
Subject: Re: possible deadlock in io_poll_double_wake (2)

From: Jens Axboe
Sent: March 3, 2021 1:20
To: syzbot; asml.sile...@gmail.com; io-ur...@vger.kernel.org; linux-fsde...@vger.kernel.org; linux-kernel@vger.kernel.org; syzkaller-b...@googlegroups.com; v...@zeniv.linux.org.uk
Subject: Re: possible deadlock in io_poll_double_wake (2)

[Please note: This e-mail is from an EXTERNAL e-mail address]

On 2/28/21 9:18 PM, syzbot wrote:
> Hello,
>
> syzbot has tested the proposed patch but the reproducer is still triggering an issue:
> possible deadlock in io_poll_double_wake
>
> WARNING: possible recursive locking detected
> 5.11.0-syzkaller #0 Not tainted
>
> syz-executor.0/10241 is trying to acquire lock:
> 888012e09130 (>sleep){..-.}-{2:2}, at: spin_lock include/linux/spinlock.h:354 [inline]
> 888012e09130 (>sleep){..-.}-{2:2}, at: io_poll_double_wake+0x25f/0x6a0 fs/io_uring.c:4921
>
> but task is already holding lock:
> 888013b00130 (>sleep){..-.}-{2:2}, at: __wake_up_common_lock+0xb4/0x130 kernel/sched/wait.c:137
>
> other info that might help us debug this:
> Possible unsafe locking scenario:
>
>        CPU0
>        lock(>sleep);
>        lock(>sleep);
>
> *** DEADLOCK ***
>
> May be due to missing lock nesting notation

> Since the fix is in yet this keeps failing (and I didn't get it), I looked
> closer at this report. While the names of the locks are the same, they are
> really two different locks. So let's try this...

Hello Jens Axboe,

Sorry for the noise; please ignore this information. I provided the wrong
information before.

> I'm not very familiar with io_uring; before we start vfs_poll again, should
> we set 'poll->head = NULL'?
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 42b675939582..cae605c14510 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -4824,7 +4824,7 @@ static bool io_poll_rewait(struct io_kiocb *req, struct io_poll_iocb *poll)
>
>  	if (!req->result && !READ_ONCE(poll->canceled)) {
>  		struct poll_table_struct pt = { ._key = poll->events };
> -
> +		poll->head = NULL;
>  		req->result = vfs_poll(req->file, &pt) & poll->events;
>  	}

Thanks
Qiang

> #syz test: git://git.kernel.dk/linux-block syzbot-test
>
> --
> Jens Axboe
Re: possible deadlock in io_poll_double_wake (2)
From: Jens Axboe
Sent: March 3, 2021 1:20
To: syzbot; asml.sile...@gmail.com; io-ur...@vger.kernel.org; linux-fsde...@vger.kernel.org; linux-kernel@vger.kernel.org; syzkaller-b...@googlegroups.com; v...@zeniv.linux.org.uk
Subject: Re: possible deadlock in io_poll_double_wake (2)

[Please note: This e-mail is from an EXTERNAL e-mail address]

On 2/28/21 9:18 PM, syzbot wrote:
> Hello,
>
> syzbot has tested the proposed patch but the reproducer is still triggering an issue:
> possible deadlock in io_poll_double_wake
>
> WARNING: possible recursive locking detected
> 5.11.0-syzkaller #0 Not tainted
>
> syz-executor.0/10241 is trying to acquire lock:
> 888012e09130 (>sleep){..-.}-{2:2}, at: spin_lock include/linux/spinlock.h:354 [inline]
> 888012e09130 (>sleep){..-.}-{2:2}, at: io_poll_double_wake+0x25f/0x6a0 fs/io_uring.c:4921
>
> but task is already holding lock:
> 888013b00130 (>sleep){..-.}-{2:2}, at: __wake_up_common_lock+0xb4/0x130 kernel/sched/wait.c:137
>
> other info that might help us debug this:
> Possible unsafe locking scenario:
>
>        CPU0
>        lock(>sleep);
>        lock(>sleep);
>
> *** DEADLOCK ***
>
> May be due to missing lock nesting notation

> Since the fix is in yet this keeps failing (and I didn't get it), I looked
> closer at this report. While the names of the locks are the same, they are
> really two different locks. So let's try this...

Hello Jens Axboe,

Sorry, I provided the wrong information before. I'm not very familiar with
io_uring; before we start vfs_poll again, should we set 'poll->head = NULL'?
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 42b675939582..cae605c14510 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -4824,7 +4824,7 @@ static bool io_poll_rewait(struct io_kiocb *req, struct io_poll_iocb *poll)

 	if (!req->result && !READ_ONCE(poll->canceled)) {
 		struct poll_table_struct pt = { ._key = poll->events };
-
+		poll->head = NULL;
 		req->result = vfs_poll(req->file, &pt) & poll->events;
 	}

Thanks
Qiang

> #syz test: git://git.kernel.dk/linux-block syzbot-test
>
> --
> Jens Axboe
Re: [PATCH v2] workqueue: Move the position of debug_work_activate() in __queue_work()
Hello Tejun,

Please review this change.

Thanks
Qiang

> From: Zhang, Qiang
> Sent: February 18, 2021 11:17
> To: jiangshan...@gmail.com; t...@kernel.org
> Cc: linux-kernel@vger.kernel.org
> Subject: [PATCH v2] workqueue: Move the position of debug_work_activate() in __queue_work()
>
> From: Zqiang
>
> debug_work_activate() is called on the premise that the work can be
> inserted, because if the wq is in __WQ_DRAINING state, inserting the work
> may fail.
>
> Fixes: e41e704bc4f4 ("workqueue: improve destroy_workqueue() debuggability")
> Signed-off-by: Zqiang
> Reviewed-by: Lai Jiangshan
> ---
> v1->v2:
>  add Fixes tag.
>
>  kernel/workqueue.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 0d150da252e8..21fb00b52def 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -1412,7 +1412,6 @@ static void __queue_work(int cpu, struct workqueue_struct *wq,
>  	 */
>  	lockdep_assert_irqs_disabled();
>
> -	debug_work_activate(work);
>
>  	/* if draining, only works from the same workqueue are allowed */
>  	if (unlikely(wq->flags & __WQ_DRAINING) &&
> @@ -1494,6 +1493,7 @@ static void __queue_work(int cpu, struct workqueue_struct *wq,
>  		worklist = &pwq->delayed_works;
>  	}
>
> +	debug_work_activate(work);
>  	insert_work(pwq, work, worklist, work_flags);
>
>  out:
> --
> 2.25.1
Re: possible deadlock in io_poll_double_wake (2)
From: Jens Axboe
Sent: March 1, 2021 7:08
To: syzbot; asml.sile...@gmail.com; io-ur...@vger.kernel.org; linux-fsde...@vger.kernel.org; linux-kernel@vger.kernel.org; syzkaller-b...@googlegroups.com; v...@zeniv.linux.org.uk
Subject: Re: possible deadlock in io_poll_double_wake (2)

[Please note: This e-mail is from an EXTERNAL e-mail address]

On 2/27/21 5:42 PM, syzbot wrote:
> syzbot has found a reproducer for the following issue on:
>
> HEAD commit: 5695e516 Merge tag 'io_uring-worker.v3-2021-02-25' of git:..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=114e3866d0
> kernel config: https://syzkaller.appspot.com/x/.config?x=8c76dad0946df1f3
> dashboard link: https://syzkaller.appspot.com/bug?extid=28abd693db9e92c160d8
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=122ed9b6d0
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=14d5a292d0
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+28abd693db9e92c16...@syzkaller.appspotmail.com
>
> WARNING: possible recursive locking detected
> 5.11.0-syzkaller #0 Not tainted
>
> swapper/1/0 is trying to acquire lock:
> 88801b2b1130 (>sleep){..-.}-{2:2}, at: spin_lock include/linux/spinlock.h:354 [inline]
> 88801b2b1130 (>sleep){..-.}-{2:2}, at: io_poll_double_wake+0x25f/0x6a0 fs/io_uring.c:4960
>
> but task is already holding lock:
> 88801b2b3130 (>sleep){..-.}-{2:2}, at: __wake_up_common_lock+0xb4/0x130 kernel/sched/wait.c:137
>
> other info that might help us debug this:
> Possible unsafe locking scenario:
>
>        CPU0
>        lock(>sleep);
>        lock(>sleep);
>
> *** DEADLOCK ***
>
> May be due to missing lock nesting notation
>
> 2 locks held by swapper/1/0:
> #0: 888147474908 (>lock){..-.}-{2:2}, at: _snd_pcm_stream_lock_irqsave+0x9f/0xd0 sound/core/pcm_native.c:170
> #1: 88801b2b3130 (>sleep){..-.}-{2:2}, at: __wake_up_common_lock+0xb4/0x130 kernel/sched/wait.c:137
>
> stack backtrace:
> CPU: 1 PID: 0 Comm: swapper/1 Not tainted
5.11.0-syzkaller #0 > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS > Google 01/01/2011 > Call Trace: > > __dump_stack lib/dump_stack.c:79 [inline] > dump_stack+0xfa/0x151 lib/dump_stack.c:120 > print_deadlock_bug kernel/locking/lockdep.c:2829 [inline] > check_deadlock kernel/locking/lockdep.c:2872 [inline] > validate_chain kernel/locking/lockdep.c:3661 [inline] > __lock_acquire.cold+0x14c/0x3b4 kernel/locking/lockdep.c:4900 > lock_acquire kernel/locking/lockdep.c:5510 [inline] > lock_acquire+0x1ab/0x730 kernel/locking/lockdep.c:5475 > __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline] > _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151 > spin_lock include/linux/spinlock.h:354 [inline] > io_poll_double_wake+0x25f/0x6a0 fs/io_uring.c:4960 > __wake_up_common+0x147/0x650 kernel/sched/wait.c:108 > __wake_up_common_lock+0xd0/0x130 kernel/sched/wait.c:138 > snd_pcm_update_state+0x46a/0x540 sound/core/pcm_lib.c:203 > snd_pcm_update_hw_ptr0+0xa75/0x1a50 sound/core/pcm_lib.c:464 > snd_pcm_period_elapsed+0x160/0x250 sound/core/pcm_lib.c:1805 > dummy_hrtimer_callback+0x94/0x1b0 sound/drivers/dummy.c:378 > __run_hrtimer kernel/time/hrtimer.c:1519 [inline] > __hrtimer_run_queues+0x609/0xe40 kernel/time/hrtimer.c:1583 > hrtimer_run_softirq+0x17b/0x360 kernel/time/hrtimer.c:1600 > __do_softirq+0x29b/0x9f6 kernel/softirq.c:345 > invoke_softirq kernel/softirq.c:221 [inline] > __irq_exit_rcu kernel/softirq.c:422 [inline] > irq_exit_rcu+0x134/0x200 kernel/softirq.c:434 > sysvec_apic_timer_interrupt+0x93/0xc0 arch/x86/kernel/apic/apic.c:1100 > > asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:632 > RIP: 0010:native_save_fl arch/x86/include/asm/irqflags.h:29 [inline] > RIP: 0010:arch_local_save_flags arch/x86/include/asm/irqflags.h:70 [inline] > RIP: 0010:arch_irqs_disabled arch/x86/include/asm/irqflags.h:137 [inline] > RIP: 0010:acpi_safe_halt drivers/acpi/processor_idle.c:111 [inline] > RIP: 
0010:acpi_idle_do_entry+0x1c9/0x250 drivers/acpi/processor_idle.c:516 > Code: dd 38 6e f8 84 db 75 ac e8 54 32 6e f8 e8 0f 1c 74 f8 e9 0c 00 00 00 e8 > 45 32 6e f8 0f 00 2d 4e 4a c5 00 e8 39 32 6e f8 fb f4 <9c> 5b 81 e3 00 02 00 > 00 fa 31 ff 48 89 de e8 14 3a 6e f8 48 85 db > RSP: 0018:c9d47d18 EFLAGS: 0293 > RAX: RBX: RCX: > RDX: 8880115c3780 RSI: 89052537 RDI: > RBP: 888141127064 R08: 0001 R09: 0001 > R10: 81794168 R11: R12: 0001 > R13: 888141127000 R14: 888141127064 R15: 888143331804 > acpi_idle_enter+0x361/0x500 drivers/acpi/processor_idle.c:647 > cpuidle_enter_state+0x1b1/0xc80
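The splat above is a lock-class recursion report rather than a proven deadlock: the PCM stream's runtime->sleep wait queue and io_uring's second poll wait queue are distinct wait_queue_head_t instances, but their embedded spinlocks share one lockdep class, so taking the second head's lock from inside the wakeup of the first trips the "possible recursive locking" check. A reduced sketch of the shape lockdep objects to (illustrative only; the field names and this function are hypothetical, not the actual io_uring code or fix):

```c
/*
 * Illustrative sketch only. Both wait queue heads embed a spinlock of
 * the same lockdep class; __wake_up_common_lock() already holds the
 * first head's lock when it invokes the wakeup callback below.
 */
static int double_wake_sketch(struct wait_queue_entry *wait,
			      unsigned mode, int sync, void *key)
{
	/* hypothetical: the second poll wait queue stashed in ->private */
	struct wait_queue_head *second = wait->private;

	/* first->lock is held by the caller at this point */
	spin_lock(&second->lock);	/* same class => lockdep recursion */
	/* ... wake/complete the second poll entry ... */
	spin_unlock(&second->lock);
	return 1;
}
```

Breaking the report requires either never nesting the two heads or annotating the inner acquisition as a distinct nesting level; which of these the eventual fix used is not shown in this thread.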
Re: [PATCH] workqueue: Remove rcu_read_lock/unlock() in workqueue_congested()
发件人: Paul E. McKenney 发送时间: 2021年2月18日 23:17 收件人: Lai Jiangshan 抄送: Zhang, Qiang; Tejun Heo; Tejun Heo; LKML 主题: Re: [PATCH] workqueue: Remove rcu_read_lock/unlock() in workqueue_congested() [Please note: This e-mail is from an EXTERNAL e-mail address] On Thu, Feb 18, 2021 at 11:04:00AM +0800, Lai Jiangshan wrote: > +CC Paul > > > On Wed, Feb 17, 2021 at 7:58 PM wrote: > > > > From: Zqiang > > > > The RCU read critical area already by preempt_disable/enable() > > (equivalent to rcu_read_lock_sched/unlock_sched()) mark, so remove > > rcu_read_lock/unlock(). > > I think we can leave it which acks like document, especially > workqueue_congested() is not performance crucial. Either way > is Ok for me. > >If the rcu_read_lock() is removed, should there be a comment saying >that >it interacts with synchronize_rcu()? Just in case one of the real-time >guys figures out a way to get the job done without disabling >preemption... > >Thanx, Paul > > If it needs to be changed, please also do the same for > rcu_read_lock() in wq_watchdog_timer_fn(). > > > And __queue_work() and try_to_grab_pending() also use local_irq_save() > and rcu_read_lock() at the same time, but I don't know will these > local_irq_save() be changed to raw_local_irq_save() in PREEMPT_RT. The local_irq_save function is not change in PREEMPT_RT system. 
Thanks Qiang > > > > > > Signed-off-by: Zqiang > > --- > > kernel/workqueue.c | 2 -- > > 1 file changed, 2 deletions(-) > > > > diff --git a/kernel/workqueue.c b/kernel/workqueue.c > > index 0d150da252e8..c599835ad6c3 100644 > > --- a/kernel/workqueue.c > > +++ b/kernel/workqueue.c > > @@ -4540,7 +4540,6 @@ bool workqueue_congested(int cpu, struct > > workqueue_struct *wq) > > struct pool_workqueue *pwq; > > bool ret; > > > > - rcu_read_lock(); > > preempt_disable(); > > > > if (cpu == WORK_CPU_UNBOUND) > > @@ -4553,7 +4552,6 @@ bool workqueue_congested(int cpu, struct > > workqueue_struct *wq) > > > > ret = !list_empty(&pwq->delayed_works); > > preempt_enable(); > > - rcu_read_unlock(); > > > > return ret; > > } > > -- > > 2.25.1 > >
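If the explicit rcu_read_lock() does go away, the comment Paul asks for could look something like this. A sketch of the patched function (the comment wording and the simplified pwq lookup are assumptions, not the actual commit):

```c
bool workqueue_congested(int cpu, struct workqueue_struct *wq)
{
	struct pool_workqueue *pwq;
	bool ret;

	/*
	 * preempt_disable() doubles as the RCU-sched read-side marker
	 * (equivalent to rcu_read_lock_sched()) that keeps the pwq from
	 * being freed under us; it pairs with synchronize_rcu() on the
	 * pwq release path.  Do not replace it with a mechanism that
	 * lacks an RCU read-side section (e.g. on PREEMPT_RT).
	 */
	preempt_disable();

	if (cpu == WORK_CPU_UNBOUND)
		cpu = smp_processor_id();

	/* lookup simplified; the real function distinguishes bound
	 * and unbound workqueues here */
	pwq = unbound_pwq_by_node(wq, cpu_to_node(cpu));
	ret = !list_empty(&pwq->delayed_works);

	preempt_enable();
	return ret;
}
```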
Re: [PATCH] workqueue: Move the position of debug_work_activate() in __queue_work()
Hello Tejun Heo, Excuse me, do you have time to make some suggestions for this modification? Thanks Qiang From: Zhang, Qiang Sent: February 11, 2021 16:24 To: t...@kernel.org; jiangshan...@gmail.com Cc: linux-kernel@vger.kernel.org Subject: [PATCH] workqueue: Move the position of debug_work_activate() in __queue_work() From: Zqiang debug_work_activate() should only be called once it is known that the work can actually be inserted: if the wq is in __WQ_DRAINING state, inserting the work may fail. Signed-off-by: Zqiang --- kernel/workqueue.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/workqueue.c b/kernel/workqueue.c index 0d150da252e8..21fb00b52def 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -1412,7 +1412,6 @@ static void __queue_work(int cpu, struct workqueue_struct *wq, */ lockdep_assert_irqs_disabled(); - debug_work_activate(work); /* if draining, only works from the same workqueue are allowed */ if (unlikely(wq->flags & __WQ_DRAINING) && @@ -1494,6 +1493,7 @@ static void __queue_work(int cpu, struct workqueue_struct *wq, worklist = &pwq->delayed_works; } + debug_work_activate(work); insert_work(pwq, work, worklist, work_flags); out: -- 2.25.1
Re: Re: [PATCH v3] kvfree_rcu: Release page cache under memory pressure
发件人: Uladzislau Rezki 发送时间: 2021年2月4日 22:09 收件人: Zhang, Qiang 抄送: Uladzislau Rezki; paul...@kernel.org; j...@joelfernandes.org; r...@vger.kernel.org; linux-kernel@vger.kernel.org 主题: Re: 回复: [PATCH v3] kvfree_rcu: Release page cache under memory pressure [Please note: This e-mail is from an EXTERNAL e-mail address] > 发件人: Uladzislau Rezki > 发送时间: 2021年2月2日 3:57 > 收件人: Zhang, Qiang > 抄送: ure...@gmail.com; paul...@kernel.org; j...@joelfernandes.org; > r...@vger.kernel.org; linux-kernel@vger.kernel.org > 主题: Re: [PATCH v3] kvfree_rcu: Release page cache under memory pressure > > [Please note: This e-mail is from an EXTERNAL e-mail address] > > Hello, Zqiang. > > > From: Zqiang > > > > Add free per-cpu existing krcp's page cache operation, when > > the system is under memory pressure. > > > > Signed-off-by: Zqiang > > --- > > kernel/rcu/tree.c | 26 ++ > > 1 file changed, 26 insertions(+) > > > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c > > index c1ae1e52f638..644b0f3c7b9f 100644 > > --- a/kernel/rcu/tree.c > > +++ b/kernel/rcu/tree.c > > @@ -3571,17 +3571,41 @@ void kvfree_call_rcu(struct rcu_head *head, > > rcu_callback_t func) > > } > > EXPORT_SYMBOL_GPL(kvfree_call_rcu); > > > > +static int free_krc_page_cache(struct kfree_rcu_cpu *krcp) > > +{ > > + unsigned long flags; > > + struct llist_node *page_list, *pos, *n; > > + int freed = 0; > > + > > + raw_spin_lock_irqsave(>lock, flags); > > + page_list = llist_del_all(>bkvcache); > > + krcp->nr_bkv_objs = 0; > > + raw_spin_unlock_irqrestore(>lock, flags); > > + > > + llist_for_each_safe(pos, n, page_list) { > > + free_page((unsigned long)pos); > > + freed++; > > + } > > + > > + return freed; > > +} > > + > > static unsigned long > > kfree_rcu_shrink_count(struct shrinker *shrink, struct shrink_control *sc) > > { > > int cpu; > > unsigned long count = 0; > > + unsigned long flags; > > > > /* Snapshot count of all CPUs */ > > for_each_possible_cpu(cpu) { > > struct kfree_rcu_cpu *krcp = per_cpu_ptr(, cpu); > > 
> > count += READ_ONCE(krcp->count); > > + > > + raw_spin_lock_irqsave(>lock, flags); > > + count += krcp->nr_bkv_objs; > > + raw_spin_unlock_irqrestore(>lock, flags); > > } > > > > return count; > > @@ -3598,6 +3622,8 @@ kfree_rcu_shrink_scan(struct shrinker *shrink, struct > > shrink_control *sc) > > struct kfree_rcu_cpu *krcp = per_cpu_ptr(, cpu); > > > > count = krcp->count; > > + count += free_krc_page_cache(krcp); > > + > > raw_spin_lock_irqsave(>lock, flags); > > if (krcp->monitor_todo) > > kfree_rcu_drain_unlock(krcp, flags); > > -- > > 2.17.1 > >> > >Thank you for your patch! > > > >I spent some time to see how the patch behaves under low memory condition. > >To simulate it, i used "rcuscale" tool with below parameters: > > > >../rcutorture/bin/kvm.sh --torture rcuscale --allcpus --duration 10 > >--kconfig >CONFIG_NR_CPUS=64 \ > >--bootargs "rcuscale.kfree_rcu_test=1 rcuscale.kfree_nthreads=16 > >>rcuscale.holdoff=20 rcuscale.kfree_loops=1 \ > >torture.disable_onoff_at_boot" --trust-make > > > >64 CPUs + 512 MB of memory. In general, my test system was running on edge > >hitting an out of memory sometimes, but could be considered as stable in > >regards to a test completion and taken time, so both were pretty solid. > > > >You can find a comparison on a plot, that can be downloaded following > >a link: wget > >>ftp://vps418301.ovh.net/incoming/release_page_cache_under_low_memory.png > > > >In short, i see that a patched version can lead to longer test completion, > >whereas the default variant is stable on almost all runs. After some analysis > >and further digging i came to conclusion that a shrinker > >free_krc_page_cache() > >concurs with run_page_cache_worker(krcp) running from kvfree_rcu() context. > > > >i.e. During the test a page shrinker is pretty active, because of low memory > >condition. Our callback drains it whereas kvfree_rcu() part refill it right > >away making kind of vicious circle. > > > >
Re: [PATCH] uprobes: Fix kasan UAF reported by syzbot
Hello peterz, The "rbtree, uprobes: Use rbtree helpers" change misses taking a reference on the looked-up uprobe; syzbot has been reporting the resulting use-after-free recently. Thanks Qiang From: Zhang, Qiang Sent: February 2, 2021 17:17 To: pet...@infradead.org; mi...@redhat.com; syzbot+2f6d683983e3905ad...@syzkaller.appspotmail.com Cc: o...@redhat.com; linux-kernel@vger.kernel.org Subject: [PATCH] uprobes: Fix kasan UAF reported by syzbot From: Zqiang Call Trace: __dump_stack [inline] dump_stack+0x107/0x163 print_address_description.constprop.0.cold+0x5b/0x2f8 __kasan_report [inline] kasan_report.cold+0x7c/0xd8 uprobe_cmp [inline] __uprobe_cmp [inline] rb_find_add [inline] __insert_uprobe [inline] insert_uprobe [inline] alloc_uprobe [inline] __uprobe_register+0x70f/0x850 .. __do_sys_perf_event_open+0x647/0x2e60 do_syscall_64+0x2d/0x70 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Allocated by task 12710: kzalloc [inline] alloc_uprobe [inline] __uprobe_register+0x19c/0x850 trace_uprobe_enable [inline] trace_uprobe_register+0x443/0x880 ... __do_sys_perf_event_open+0x647/0x2e60 do_syscall_64+0x2d/0x70 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Freed by task 12710: kfree+0xe5/0x7b0 put_uprobe [inline] put_uprobe+0x13b/0x190 uprobe_apply+0xfc/0x130 uprobe_perf_open [inline] trace_uprobe_register+0x5c9/0x880 ...
__do_sys_perf_event_open+0x647/0x2e60 do_syscall_64+0x2d/0x70 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fix the reference count lost in __find_uprobe(): the lookup must take a reference on the uprobe it returns. Fixes: c6bc9bd06dff ("rbtree, uprobes: Use rbtree helpers") Reported-by: syzbot+1182ffb2063c5d087...@syzkaller.appspotmail.com Signed-off-by: Zqiang --- kernel/events/uprobes.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c index 7e15b2efdd87..6addc9780319 100644 --- a/kernel/events/uprobes.c +++ b/kernel/events/uprobes.c @@ -661,7 +661,7 @@ static struct uprobe *__find_uprobe(struct inode *inode, loff_t offset) struct rb_node *node = rb_find(&key, &uprobes_tree, __uprobe_cmp_key); if (node) - return __node_2_uprobe(node); + return get_uprobe(__node_2_uprobe(node)); return NULL; } -- 2.17.1
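The one-line fix above makes the lookup take its own reference before handing the object out. The pattern, reduced to a hypothetical userspace sketch (the names are illustrative, not uprobe code):

```c
#include <stddef.h>
#include <stdlib.h>

struct obj {
	int refcount;
	long key;
	struct obj *next;
};

/* Take a reference; mirrors get_uprobe(). */
static struct obj *obj_get(struct obj *o)
{
	if (o)
		o->refcount++;
	return o;
}

/* Drop a reference, freeing at zero; mirrors put_uprobe(). */
static void obj_put(struct obj *o)
{
	if (o && --o->refcount == 0)
		free(o);
}

/*
 * Correct lookup: bump the refcount while the object is still known to
 * be alive, so a concurrent obj_put() by another path cannot free it
 * out from under the caller (the use-after-free syzbot hit).
 */
static struct obj *find_and_get(struct obj *head, long key)
{
	for (struct obj *o = head; o; o = o->next)
		if (o->key == key)
			return obj_get(o);	/* not just "return o;" */
	return NULL;
}
```

Returning the bare pointer, as the pre-fix __find_uprobe() did, leaves the caller holding an object whose last reference anyone else may drop at any time.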
Re: [PATCH v3] kvfree_rcu: Release page cache under memory pressure
发件人: Uladzislau Rezki 发送时间: 2021年2月2日 3:57 收件人: Zhang, Qiang 抄送: ure...@gmail.com; paul...@kernel.org; j...@joelfernandes.org; r...@vger.kernel.org; linux-kernel@vger.kernel.org 主题: Re: [PATCH v3] kvfree_rcu: Release page cache under memory pressure [Please note: This e-mail is from an EXTERNAL e-mail address] Hello, Zqiang. > From: Zqiang > > Add free per-cpu existing krcp's page cache operation, when > the system is under memory pressure. > > Signed-off-by: Zqiang > --- > kernel/rcu/tree.c | 26 ++ > 1 file changed, 26 insertions(+) > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c > index c1ae1e52f638..644b0f3c7b9f 100644 > --- a/kernel/rcu/tree.c > +++ b/kernel/rcu/tree.c > @@ -3571,17 +3571,41 @@ void kvfree_call_rcu(struct rcu_head *head, > rcu_callback_t func) > } > EXPORT_SYMBOL_GPL(kvfree_call_rcu); > > +static int free_krc_page_cache(struct kfree_rcu_cpu *krcp) > +{ > + unsigned long flags; > + struct llist_node *page_list, *pos, *n; > + int freed = 0; > + > + raw_spin_lock_irqsave(>lock, flags); > + page_list = llist_del_all(>bkvcache); > + krcp->nr_bkv_objs = 0; > + raw_spin_unlock_irqrestore(>lock, flags); > + > + llist_for_each_safe(pos, n, page_list) { > + free_page((unsigned long)pos); > + freed++; > + } > + > + return freed; > +} > + > static unsigned long > kfree_rcu_shrink_count(struct shrinker *shrink, struct shrink_control *sc) > { > int cpu; > unsigned long count = 0; > + unsigned long flags; > > /* Snapshot count of all CPUs */ > for_each_possible_cpu(cpu) { > struct kfree_rcu_cpu *krcp = per_cpu_ptr(, cpu); > > count += READ_ONCE(krcp->count); > + > + raw_spin_lock_irqsave(>lock, flags); > + count += krcp->nr_bkv_objs; > + raw_spin_unlock_irqrestore(>lock, flags); > } > > return count; > @@ -3598,6 +3622,8 @@ kfree_rcu_shrink_scan(struct shrinker *shrink, struct > shrink_control *sc) > struct kfree_rcu_cpu *krcp = per_cpu_ptr(, cpu); > > count = krcp->count; > + count += free_krc_page_cache(krcp); > + > raw_spin_lock_irqsave(>lock, 
flags); > if (krcp->monitor_todo) > kfree_rcu_drain_unlock(krcp, flags); > -- > 2.17.1 >> >Thank you for your patch! > >I spent some time to see how the patch behaves under low memory condition. >To simulate it, i used "rcuscale" tool with below parameters: > >../rcutorture/bin/kvm.sh --torture rcuscale --allcpus --duration 10 --kconfig >>CONFIG_NR_CPUS=64 \ >--bootargs "rcuscale.kfree_rcu_test=1 rcuscale.kfree_nthreads=16 >>rcuscale.holdoff=20 rcuscale.kfree_loops=1 \ >torture.disable_onoff_at_boot" --trust-make > >64 CPUs + 512 MB of memory. In general, my test system was running on edge >hitting an out of memory sometimes, but could be considered as stable in >regards to a test completion and taken time, so both were pretty solid. > >You can find a comparison on a plot, that can be downloaded following >a link: wget >>ftp://vps418301.ovh.net/incoming/release_page_cache_under_low_memory.png > >In short, i see that a patched version can lead to longer test completion, >whereas the default variant is stable on almost all runs. After some analysis >and further digging i came to conclusion that a shrinker free_krc_page_cache() >concurs with run_page_cache_worker(krcp) running from kvfree_rcu() context. > >i.e. During the test a page shrinker is pretty active, because of low memory >condition. Our callback drains it whereas kvfree_rcu() part refill it right >away making kind of vicious circle. 
> >So, a run_page_cache_worker() should be backoff for some time when a system >runs into a low memory condition or high pressure: > >diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c >index 7077d73fcb53..446723b9646b 100644 >--- a/kernel/rcu/tree.c >+++ b/kernel/rcu/tree.c >@@ -3163,7 +3163,7 @@ struct kfree_rcu_cpu { >bool initialized; >int count; > >- struct work_struct page_cache_work; >+ struct delayed_work page_cache_work; >atomic_t work_in_progress; >struct hrtimer hrtimer; > >@@ -3419,7 +3419,7 @@ schedule_page_work_fn(struct hrtimer *t) >struct kfree_rcu_cpu *krcp = >container_of(t, struct kfree_rcu_cpu, hrtimer); > >- queue_work(system_highpri_wq, >page_cache_work); >+ queue_delayed_work(system_highpri_wq, &
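The diff above is cut off mid-hunk; the idea it sketches (this continuation is an assumption, not Vlad's exact patch) is to convert the refill work into a delayed_work and queue it with a back-off delay when the system has just been under enough pressure for the shrinker to drain the cache:

```c
/*
 * Hedged sketch completing the truncated diff above.  The
 * backoff_page_cache_fill field is hypothetical: something the
 * shrinker path would set to signal "do not refill right away".
 */
static void run_page_cache_worker(struct kfree_rcu_cpu *krcp)
{
	unsigned long delay = krcp->backoff_page_cache_fill ? HZ : 0;

	if (rcu_scheduler_active == RCU_SCHEDULER_RUNNING &&
	    !atomic_xchg(&krcp->work_in_progress, 1))
		queue_delayed_work(system_highpri_wq,
				   &krcp->page_cache_work, delay);
}
```

This breaks the drain/refill cycle described above: under low memory the refill waits, so the pages the shrinker just returned to the buddy allocator are not immediately re-grabbed.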
Re: [PATCH v2] kvfree_rcu: Release page cache under memory pressure
发件人: Uladzislau Rezki 发送时间: 2021年1月29日 22:19 收件人: Zhang, Qiang 抄送: ure...@gmail.com; paul...@kernel.org; j...@joelfernandes.org; r...@vger.kernel.org; linux-kernel@vger.kernel.org 主题: Re: [PATCH v2] kvfree_rcu: Release page cache under memory pressure [Please note: This e-mail is from an EXTERNAL e-mail address] On Fri, Jan 29, 2021 at 04:04:42PM +0800, qiang.zh...@windriver.com wrote: > From: Zqiang > > Add free per-cpu existing krcp's page cache operation, when > the system is under memory pressure. > > Signed-off-by: Zqiang > --- > kernel/rcu/tree.c | 25 + > 1 file changed, 25 insertions(+) > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c > index c1ae1e52f638..ec098910d80b 100644 > --- a/kernel/rcu/tree.c > +++ b/kernel/rcu/tree.c > @@ -3571,17 +3571,40 @@ void kvfree_call_rcu(struct rcu_head *head, > rcu_callback_t func) > } > EXPORT_SYMBOL_GPL(kvfree_call_rcu); > > +static int free_krc_page_cache(struct kfree_rcu_cpu *krcp) > +{ > + unsigned long flags; > + struct kvfree_rcu_bulk_data *bnode; > + int i; > + > + for (i = 0; i < rcu_min_cached_objs; i++) { > + raw_spin_lock_irqsave(>lock, flags); >I am not sure why we should disable IRQs. I think it can be >avoided. Suppose in multi CPU system, the kfree_rcu_shrink_scan function is runing on CPU2, and we just traverse to CPU2, and then call free_krc_page_cache function, if not disable irq, a interrupt may be occurs on CPU2 after the CPU2 corresponds to krcp variable 's lock be acquired, if the interrupt or softirq handler function to call kvfree_rcu function, in this function , acquire CPU2 corresponds to krcp variable 's lock , will happen deadlock. Or in single CPU scenario. > + bnode = get_cached_bnode(krcp); > + raw_spin_unlock_irqrestore(>lock, flags); > + if (!bnode) > + break; > + free_page((unsigned long)bnode); > + } > + > + return i; > +} >Also i forgot to add in my previous comment to this path. Can we >access >to page cache once and then do the drain work? 
I mean if we had >100 objects >in the cache we would need to access to a krcp->lock 100 times. > >What about something like below: > > >static int free_krc_page_cache(struct kfree_rcu_cpu *krcp) >{ >struct llist_node *page_list, *pos, *n; >int freed = 0; > >raw_spin_lock(>lock); >page_list = llist_del_all(>bkvcache); >krcp->nr_bkv_objs = 0; >raw_spin_unlock(>lock); > >llist_for_each_safe(pos, n, page_list) { >free_page((unsigned long) pos); >freed++; >} > >return freed; >} > this change looks better. Thanks Qiang > + > static unsigned long > kfree_rcu_shrink_count(struct shrinker *shrink, struct shrink_control *sc) > { > int cpu; > unsigned long count = 0; > + unsigned long flags; > > /* Snapshot count of all CPUs */ > for_each_possible_cpu(cpu) { > struct kfree_rcu_cpu *krcp = per_cpu_ptr(, cpu); > > count += READ_ONCE(krcp->count); > + > + raw_spin_lock_irqsave(>lock, flags); > + count += krcp->nr_bkv_objs; > + raw_spin_unlock_irqrestore(>lock, flags); >Should we disable irqs? > > return count; > @@ -3598,6 +3621,8 @@ kfree_rcu_shrink_scan(struct shrinker *shrink, struct > shrink_control *sc) > struct kfree_rcu_cpu *krcp = per_cpu_ptr(, cpu); > > count = krcp->count; > + count += free_krc_page_cache(krcp); > + > raw_spin_lock_irqsave(>lock, flags); > if (krcp->monitor_todo) > kfree_rcu_drain_unlock(krcp, flags); > -- > 2.17.1 Thanks! -- Vlad Rezki
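Vlad's suggested free_krc_page_cache() detaches the whole cache in one lock acquisition and frees the pages outside the lock, instead of taking krcp->lock once per cached page. The detach-then-free pattern in a plain userspace sketch (names illustrative; in the kernel the detach is llist_del_all() under krcp->lock):

```c
#include <stdlib.h>

struct page_node {
	struct page_node *next;
};

/* Detach the entire list in O(1) so the lock is held only briefly. */
static struct page_node *detach_all(struct page_node **head)
{
	struct page_node *list = *head;

	*head = NULL;
	return list;
}

/* Free the detached nodes outside the lock, counting them. */
static int free_all(struct page_node *list)
{
	int freed = 0;

	while (list) {
		struct page_node *next = list->next;

		free(list);
		freed++;
		list = next;
	}
	return freed;
}
```

With 100 cached objects this touches the lock once instead of 100 times, which is exactly the objection raised above.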
Re: [PATCH] PM: remove PF_WQ_WORKER mask
发件人: Rafael J. Wysocki 发送时间: 2021年1月28日 2:16 收件人: Zhang, Qiang 抄送: Rafael Wysocki; Linux PM; Linux Kernel Mailing List 主题: Re: [PATCH] PM: remove PF_WQ_WORKER mask [Please note: This e-mail is from an EXTERNAL e-mail address] On Mon, Jan 25, 2021 at 5:01 AM wrote: > > From: Zqiang > > Due to kworker also is kernel thread, it's already included > PF_KTHREAD mask, so remove PF_WQ_WORKER mask. >So you are saying that all threads having PF_WQ_WORKER set must also >have PF_KTHREAD set, right? yes #define PF_KTHREAD 0x0020 #define PF_WQ_WORKER 0x0020 I tracing kwoker's task->flags as follows: comm kworker/1:0, cpu 1, task->flags 0x4208060, delayed 3, func intel_fbc_work_fn Thanks Qiang >That sounds correct, so I'm going to rewrite the changelog and apply >the patch as 5.12 material, thanks! > Signed-off-by: Zqiang > --- > kernel/power/process.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/kernel/power/process.c b/kernel/power/process.c > index 45b054b7b5ec..50cc63534486 100644 > --- a/kernel/power/process.c > +++ b/kernel/power/process.c > @@ -235,7 +235,7 @@ void thaw_kernel_threads(void) > > read_lock(_lock); > for_each_process_thread(g, p) { > - if (p->flags & (PF_KTHREAD | PF_WQ_WORKER)) > + if (p->flags & PF_KTHREAD) > __thaw_task(p); > } > read_unlock(_lock); > -- > 2.17.1 >
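The two #define values quoted in the reply above look garbled by the mail archive; in mainline headers of this era the masks are distinct bits (the values below are taken from include/linux/sched.h around v5.11 and should be double-checked against your tree). Checking the traced flags word 0x4208060 against them shows both bits set, consistent with the claim that a kworker always carries PF_KTHREAD:

```c
/* Values as in include/linux/sched.h around v5.11; verify per tree. */
#define PF_WQ_WORKER	0x00000020	/* task is a workqueue worker */
#define PF_KTHREAD	0x00200000	/* task is a kernel thread */

/* Returns 1 when all bits of mask are present in flags. */
static int has_flag(unsigned int flags, unsigned int mask)
{
	return (flags & mask) == mask;
}
```

For the kworker/1:0 trace quoted above, has_flag(0x4208060, PF_KTHREAD) and has_flag(0x4208060, PF_WQ_WORKER) both hold, so dropping PF_WQ_WORKER from the thaw test does not change which tasks are thawed.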
Re: Re: Re: [PATCH] rcu: Release per-cpu krcp page cache when CPU going offline
发件人: Uladzislau Rezki 发送时间: 2021年1月26日 22:07 收件人: Zhang, Qiang 抄送: Uladzislau Rezki; Paul E. McKenney; r...@vger.kernel.org; linux-kernel@vger.kernel.org 主题: Re: 回复: 回复: [PATCH] rcu: Release per-cpu krcp page cache when CPU going offline > > On Fri, Jan 22, 2021 at 01:44:36AM +, Zhang, Qiang wrote: > > > > > > > > 发件人: Uladzislau Rezki > > 发送时间: 2021年1月22日 4:26 > > 收件人: Zhang, Qiang > > 抄送: Paul E. McKenney; r...@vger.kernel.org; linux-kernel@vger.kernel.org; > > ure...@gmail.com > > 主题: Re: [PATCH] rcu: Release per-cpu krcp page cache when CPU going offline > > >Hello, Qiang, > > > > > On Thu, Jan 21, 2021 at 02:49:49PM +0800, qiang.zh...@windriver.com wrote: > > > > From: Zqiang > > > > > > > > If CPUs go offline, the corresponding krcp's page cache can > > > > not be use util the CPU come back online, or maybe the CPU > > > > will never go online again, this commit therefore free krcp's > > > > page cache when CPUs go offline. > > > > > > > > Signed-off-by: Zqiang > > > > > >Do you consider it as an issue? We have 5 pages per CPU, that is 20480 > > >bytes. > > > > > > > Hello Rezki > > > > In a multi CPUs system, more than one CPUs may be offline, there are more > > than 5 pages, and these offline CPUs may never go online again or in the > > process of CPUs online, there are errors, which lead to the failure of > > online, these scenarios will lead to the per-cpu krc page cache will never > > be released. > > > >Thanks for your answer. I was thinking more about if you knew some >platforms > >which suffer from such extra page usage when CPU goes offline. Any >issues > >your platforms or devices run into because of that. > > > >So i understand that if CPU goes offline the 5 pages associated with it >are > >unused until it goes online back. 
> > I agree with you, But I still want to talk about what I think > > My understanding is that when the CPU is offline, the pages is not > accessible, beacuse we don't know when this CPU will > go online again, so we best to return these page to the buddy system, > when the CPU goes online again, we can allocate page from the buddy > system to fill krcp's page cache. maybe you may think that this memory > is small and don't need to. > >BTW, we can release the caches via shrinker path instead, what is more makes >sense to me. We already have a callback, that frees pages when a page allocator >asks for it. I think in that case it would be fair to return it to the buddy >system. It happens under low memory condition I agree. it can be done in shrink callback, can release the currently existing per-cpu page cache. Thanks Qiang > or can be done manually to flush >system caches: > >echo 3 > /proc/sys/vm/drop_caches > >What do you think? > >-- >Vlad Rezki
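Releasing the caches via the shrinker path, as agreed above, means hanging the release off the existing count/scan callback pair discussed in the sibling kvfree_rcu threads. A minimal sketch of the registration shape (callback names follow the thread; the init glue is an assumption):

```c
/*
 * Sketch: the shrinker core calls count_objects for a cheap estimate
 * of reclaimable objects and scan_objects to actually free them;
 * the scan callback is where free_krc_page_cache() would run.
 */
static struct shrinker kfree_rcu_shrinker = {
	.count_objects	= kfree_rcu_shrink_count,
	.scan_objects	= kfree_rcu_shrink_scan,
	.batch		= 0,
	.seeks		= DEFAULT_SEEKS,
};

/* registered once during RCU init (exact location assumed) */
register_shrinker(&kfree_rcu_shrinker);
```

This also covers the offline-CPU concern naturally: kfree_rcu_shrink_scan() iterates all possible CPUs, so pages cached by a CPU that went offline are still drained under memory pressure, and `echo 3 > /proc/sys/vm/drop_caches` triggers the same path manually.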
Re: [PATCH] PM: remove PF_WQ_WORKER mask
From: Zhang, Qiang Sent: January 25, 2021 12:00 To: rafael.j.wyso...@intel.com Cc: linux...@vger.kernel.org; linux-kernel@vger.kernel.org Subject: [PATCH] PM: remove PF_WQ_WORKER mask
From: Zqiang

Since a kworker is also a kernel thread, its flags already include the PF_KTHREAD mask, so the PF_WQ_WORKER mask can be removed.

Signed-off-by: Zqiang
---
 kernel/power/process.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/power/process.c b/kernel/power/process.c
index 45b054b7b5ec..50cc63534486 100644
--- a/kernel/power/process.c
+++ b/kernel/power/process.c
@@ -235,7 +235,7 @@ void thaw_kernel_threads(void)
 	read_lock(&tasklist_lock);
 	for_each_process_thread(g, p) {
-		if (p->flags & (PF_KTHREAD | PF_WQ_WORKER))
+		if (p->flags & PF_KTHREAD)
 			__thaw_task(p);
 	}
 	read_unlock(&tasklist_lock);
--
2.17.1
Re: [PATCH] sched/core: add rcu_read_lock/unlock() protection
From: Zhang, Qiang Sent: January 26, 2021 16:29 To: valentin.schnei...@arm.com Cc: pet...@infradead.org; linux-kernel@vger.kernel.org Subject: [PATCH] sched/core: add rcu_read_lock/unlock() protection
>From: Zqiang
>Since for_each_process_thread() is an RCU read-side operation, rcu_read_lock/unlock() protection needs to be added.
Sorry to disturb you; I found this code is already inside an RCU critical section. Please ignore this change.
>Signed-off-by: Zqiang
>---
> kernel/sched/core.c | 2 ++
> 1 file changed, 2 insertions(+)
>
>diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>index 8c5481077c9c..c3f0103fdf53 100644
>--- a/kernel/sched/core.c
>+++ b/kernel/sched/core.c
>@@ -7738,6 +7738,7 @@ static void dump_rq_tasks(struct rq *rq, const char *loglvl)
>	lockdep_assert_held(&rq->lock);
>
>	printk("%sCPU%d enqueued tasks (%u total):\n", loglvl, cpu, rq->nr_running);
>+	rcu_read_lock();
>	for_each_process_thread(g, p) {
>		if (task_cpu(p) != cpu)
>			continue;
>@@ -7747,6 +7748,7 @@ static void dump_rq_tasks(struct rq *rq, const char *loglvl)
>
>		printk("%s\tpid: %d, name: %s\n", loglvl, p->pid, p->comm);
>	}
>+	rcu_read_unlock();
> }
>
> int sched_cpu_dying(unsigned int cpu)
--
2.17.1
Re: Re: Re: [PATCH 3/3] kvfree_rcu: use migrate_disable/enable()
From: Uladzislau Rezki Sent: January 25, 2021 21:49 To: Zhang, Qiang Cc: Uladzislau Rezki; LKML; RCU; Paul E . McKenney; Michael Ellerman; Andrew Morton; Daniel Axtens; Frederic Weisbecker; Neeraj Upadhyay; Joel Fernandes; Peter Zijlstra; Michal Hocko; Thomas Gleixner; Theodore Y . Ts'o; Sebastian Andrzej Siewior; Oleksiy Avramchenko Subject: Re: Re: Re: [PATCH 3/3] kvfree_rcu: use migrate_disable/enable()
> >Hello, Zhang.
> >
> > >From: Uladzislau Rezki (Sony)
> > >Sent: January 21, 2021 0:21
> > >To: LKML; RCU; Paul E . McKenney; Michael Ellerman
> > >Cc: Andrew Morton; Daniel Axtens; Frederic Weisbecker; Neeraj Upadhyay; Joel Fernandes; Peter Zijlstra; Michal Hocko; Thomas Gleixner; Theodore Y . Ts'o; Sebastian Andrzej Siewior; Uladzislau Rezki; Oleksiy Avramchenko
> > >Subject: [PATCH 3/3] kvfree_rcu: use migrate_disable/enable()
> > >
> > >Since the page is obtained in a fully preemptible context, dropping
> > >the lock can lead to migration onto another CPU. As a result the prev.
> > >bnode of that CPU may be underutilised, because a decision has been
> > >made for a CPU that had run out of free slots to store a pointer.
> > >
> > >migrate_disable/enable() are now independent of RT; use them in order
> > >to prevent any migration during a page request for the specific CPU it
> > >is requested for.
> > >
> > Hello Rezki
> >
> > The migrate_disable/enable() critical area is not allowed to block, under RT and non-RT.
> > There is such a description in preempt.h:
> >
> > * Notes on the implementation.
> > *
> > * The implementation is particularly tricky since existing code patterns
> > * dictate neither migrate_disable() nor migrate_enable() is allowed to block.
> > * This means that it cannot use cpus_read_lock() to serialize against hotplug,
> > * nor can it easily migrate itself into a pending affinity mask change on
> > * migrate_enable().
> >
> >How I interpret it is that migrate_enable()/migrate_disable() are not allowed to
> >use any blocking primitives, such as rwsems/mutexes/etc., in order to mark the
> >current context as non-migratable.
> >
> >void migrate_disable(void)
> >{
> >	struct task_struct *p = current;
> >
> >	if (p->migration_disabled) {
> >		p->migration_disabled++;
> >		return;
> >	}
> >
> >	preempt_disable();
> >	this_rq()->nr_pinned++;
> >	p->migration_disabled = 1;
> >	preempt_enable();
> >}
> >
> >It does nothing that prevents you from doing schedule() or even waiting for any
> >event (mutex slow-path behaviour), once the process is removed from the
> >run-queue, i.e. after migrate_disable() is invoked. Or do I miss something?
>
> Hello Rezki
>
> Sorry, there was something wrong with the previous description.
> There is the following scenario:
>
> migrate_disable() will increase this_rq()->nr_pinned; after that, if
> __get_free_page() blocks, and at that time the CPU goes offline,
> sched_cpu_wait_empty() will be called in the per-cpu "cpuhp/%d" task
> and will block.
>
>But after migrate_disable() is invoked, a CPU cannot be brought down.
>If there are pinned tasks, the "hotplug path" will be blocked on the
>balance_hotplug_wait() call.
>
> blocked:
> sched_cpu_wait_empty()
> {
>	struct rq *rq = this_rq();
>	rcuwait_wait_event(&rq->hotplug_wait,
>			   rq->nr_running == 1 && !rq_has_pinned_tasks(rq),
>			   TASK_UNINTERRUPTIBLE);
> }
>
>Exactly.
>
> wakeup:
> balance_push()
> {
>	if (is_per_cpu_kthread(push_task) || is_migration_disabled(push_task)) {
>		if (!rq->nr_running && !rq_has_pinned_tasks(rq) &&
>		    rcuwait_active(&rq->hotplug_wait)) {
>			raw_spin_unlock(&rq->lock);
>			rcuwait_wake_up(&rq->hotplug_wait);
>			raw_spin_lock(&rq->lock);
>		}
>		return;
>	}
> }
>
> One of the conditions for this wakeup is "rq->nr_pinned == 0"; that is to say,
> if blocking happens between migrate_disable/enable(), it will delay the CPU
> going offline with a longer blocking time.
>
>Indeed, the hotplug time is affected.
For example, in case of waiting for a mutex to be released, an owner will wake up waiters. But this is expected.
> > I'm not sure that's a problem, and I didn't find possible sleep calls between migrate_disable/enable() in the kernel code.
Question on migrate_disable/enable()
Hello Peterz, tglx
I have some questions about migrate_disable/enable(). In the past, migrate_disable/enable() was replaced by preempt_disable/enable() on non-RT systems, and now migrate_disable/enable() has its own implementation. I want to know: is blocking allowed inside a migrate_disable/enable() critical area? If it is allowed, and there is blocking in a migrate_disable/enable() critical area while a CPU is going offline, the offline will take longer. Is this a normal phenomenon?
Thanks
Qiang
Re: Re: [PATCH 3/3] kvfree_rcu: use migrate_disable/enable()
From: Uladzislau Rezki Sent: January 25, 2021 5:57 To: Zhang, Qiang Cc: Uladzislau Rezki (Sony); LKML; RCU; Paul E . McKenney; Michael Ellerman; Andrew Morton; Daniel Axtens; Frederic Weisbecker; Neeraj Upadhyay; Joel Fernandes; Peter Zijlstra; Michal Hocko; Thomas Gleixner; Theodore Y . Ts'o; Sebastian Andrzej Siewior; Oleksiy Avramchenko Subject: Re: Re: [PATCH 3/3] kvfree_rcu: use migrate_disable/enable()
>Hello, Zhang.
>
> >From: Uladzislau Rezki (Sony)
> >Sent: January 21, 2021 0:21
> >To: LKML; RCU; Paul E . McKenney; Michael Ellerman
> >Cc: Andrew Morton; Daniel Axtens; Frederic Weisbecker; Neeraj Upadhyay; Joel Fernandes; Peter Zijlstra; Michal Hocko; Thomas Gleixner; Theodore Y . Ts'o; Sebastian Andrzej Siewior; Uladzislau Rezki; Oleksiy Avramchenko
> >Subject: [PATCH 3/3] kvfree_rcu: use migrate_disable/enable()
> >
> >Since the page is obtained in a fully preemptible context, dropping
> >the lock can lead to migration onto another CPU. As a result the prev.
> >bnode of that CPU may be underutilised, because a decision has been
> >made for a CPU that had run out of free slots to store a pointer.
> >
> >migrate_disable/enable() are now independent of RT; use them in order
> >to prevent any migration during a page request for the specific CPU it
> >is requested for.
> >
> Hello Rezki
>
> The migrate_disable/enable() critical area is not allowed to block, under RT and non-RT.
> There is such a description in preempt.h:
>
> * Notes on the implementation.
> *
> * The implementation is particularly tricky since existing code patterns
> * dictate neither migrate_disable() nor migrate_enable() is allowed to block.
> * This means that it cannot use cpus_read_lock() to serialize against hotplug,
> * nor can it easily migrate itself into a pending affinity mask change on
> * migrate_enable().
>
>How I interpret it is that migrate_enable()/migrate_disable() are not allowed to
>use any blocking primitives, such as rwsems/mutexes/etc., in order to mark the
>current context as non-migratable.
>
>void migrate_disable(void)
>{
>	struct task_struct *p = current;
>
>	if (p->migration_disabled) {
>		p->migration_disabled++;
>		return;
>	}
>	preempt_disable();
>	this_rq()->nr_pinned++;
>	p->migration_disabled = 1;
>	preempt_enable();
>}
>
>It does nothing that prevents you from doing schedule() or even waiting for any
>event (mutex slow-path behaviour), once the process is removed from the
>run-queue, i.e. after migrate_disable() is invoked. Or do I miss something?
Hello Rezki
Sorry, there was something wrong with the previous description. There is the following scenario:
migrate_disable() will increase rq's nr_pinned; after that, if __get_free_page() blocks, and at that time the CPU goes offline, sched_cpu_wait_empty() will be called in the per-cpu "cpuhp/%d" task and will block.
sched_cpu_wait_empty()
{
	rcuwait_wait_event(&rq->hotplug_wait,
			   rq->nr_running == 1 && !rq_has_pinned_tasks(rq),
			   TASK_UNINTERRUPTIBLE);
}
>
> How about the following changes:
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index e7a226abff0d..2aa19537ac7c 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -3488,12 +3488,10 @@ add_ptr_to_bulk_krc_lock(struct kfree_rcu_cpu **krcp,
> 		(*krcp)->bkvhead[idx]->nr_records == KVFREE_BULK_MAX_ENTR) {
> 		bnode = get_cached_bnode(*krcp);
> 		if (!bnode && can_alloc) {
> -			migrate_disable();
> 			krc_this_cpu_unlock(*krcp, *flags);
> 			bnode = (struct kvfree_rcu_bulk_data *)
> 				__get_free_page(GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOMEMALLOC | __GFP_NOWARN);
> -			*krcp = krc_this_cpu_lock(flags);
> -			migrate_enable();
> +			raw_spin_lock_irqsave(&(*krcp)->lock, *flags);
>
Hm.. Taking the former lock can lead to a pointer leaking; I mean, the CPU associated with "krcp" might go offline during the page-request process, so queuing would occur on an off-lined CPU. Apart from that, acquiring the former lock still does not solve:
- CPU1 is in the process of a page allocation;
- CPU1 gets migrated to CPU2;
- another task running on CPU1 also allocates a page;
- both bnodes are added to the krcp associated with CPU1.
I agree that such a scenario will probably never happen, or I would say it can be considered a corner case. We can drop the:
[PATCH 3/3] kvfree_rcu: use migrate_disable/enable()
and
Re: Re: [PATCH 3/3] kvfree_rcu: use migrate_disable/enable()
From: Uladzislau Rezki Sent: January 25, 2021 5:57 To: Zhang, Qiang Cc: Uladzislau Rezki (Sony); LKML; RCU; Paul E . McKenney; Michael Ellerman; Andrew Morton; Daniel Axtens; Frederic Weisbecker; Neeraj Upadhyay; Joel Fernandes; Peter Zijlstra; Michal Hocko; Thomas Gleixner; Theodore Y . Ts'o; Sebastian Andrzej Siewior; Oleksiy Avramchenko Subject: Re: Re: [PATCH 3/3] kvfree_rcu: use migrate_disable/enable()
>Hello, Zhang.
>
> >From: Uladzislau Rezki (Sony)
> >Sent: January 21, 2021 0:21
> >To: LKML; RCU; Paul E . McKenney; Michael Ellerman
> >Cc: Andrew Morton; Daniel Axtens; Frederic Weisbecker; Neeraj Upadhyay; Joel Fernandes; Peter Zijlstra; Michal Hocko; Thomas Gleixner; Theodore Y . Ts'o; Sebastian Andrzej Siewior; Uladzislau Rezki; Oleksiy Avramchenko
> >Subject: [PATCH 3/3] kvfree_rcu: use migrate_disable/enable()
> >
> >Since the page is obtained in a fully preemptible context, dropping
> >the lock can lead to migration onto another CPU. As a result the prev.
> >bnode of that CPU may be underutilised, because a decision has been
> >made for a CPU that had run out of free slots to store a pointer.
> >
> >migrate_disable/enable() are now independent of RT; use them in order
> >to prevent any migration during a page request for the specific CPU it
> >is requested for.
> >
> Hello Rezki
>
> The migrate_disable/enable() critical area is not allowed to block, under RT and non-RT.
> There is such a description in preempt.h:
>
> * Notes on the implementation.
> *
> * The implementation is particularly tricky since existing code patterns
> * dictate neither migrate_disable() nor migrate_enable() is allowed to block.
> * This means that it cannot use cpus_read_lock() to serialize against hotplug,
> * nor can it easily migrate itself into a pending affinity mask change on
> * migrate_enable().
>
>How I interpret it is that migrate_enable()/migrate_disable() are not allowed to
>use any blocking primitives, such as rwsems/mutexes/etc., in order to mark the
>current context as non-migratable.
>
>void migrate_disable(void)
>{
>	struct task_struct *p = current;
>
>	if (p->migration_disabled) {
>		p->migration_disabled++;
>		return;
>	}
>	preempt_disable();
>	this_rq()->nr_pinned++;
>	p->migration_disabled = 1;
>	preempt_enable();
>}
>
>It does nothing that prevents you from doing schedule() or even waiting for any
>event (mutex slow-path behaviour), once the process is removed from the
>run-queue, i.e. after migrate_disable() is invoked. Or do I miss something?
Hello Rezki
Sorry, there was something wrong with the previous description. There is the following scenario:
migrate_disable() will increase this_rq()->nr_pinned; after that, if __get_free_page() blocks, and at that time the CPU goes offline, sched_cpu_wait_empty() will be called in the per-cpu "cpuhp/%d" task and will block.
blocked:
sched_cpu_wait_empty()
{
	struct rq *rq = this_rq();
	rcuwait_wait_event(&rq->hotplug_wait,
			   rq->nr_running == 1 && !rq_has_pinned_tasks(rq),
			   TASK_UNINTERRUPTIBLE);
}
wakeup:
balance_push()
{
	if (is_per_cpu_kthread(push_task) || is_migration_disabled(push_task)) {
		if (!rq->nr_running && !rq_has_pinned_tasks(rq) &&
		    rcuwait_active(&rq->hotplug_wait)) {
			raw_spin_unlock(&rq->lock);
			rcuwait_wake_up(&rq->hotplug_wait);
			raw_spin_lock(&rq->lock);
		}
		return;
	}
}
One of the conditions for this wakeup is "rq->nr_pinned == 0"; that is to say, if blocking happens between migrate_disable/enable(), it will delay the CPU going offline with a longer blocking time.
I'm not sure that's a problem, and I didn't find possible sleep calls between migrate_disable/enable() in the kernel code.
> > How about the following changes: > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c > index e7a226abff0d..2aa19537ac7c 100644 > --- a/kernel/rcu/tree.c > +++ b/kernel/rcu/tree.c > @@ -3488,12 +3488,10 @@ add_ptr_to_bulk_krc_lock(struct kfree_rcu_cpu **krcp, > (*krcp)->bkvhead[idx]->nr_records == > KVFREE_BULK_MAX_ENTR) { > bnode = get_cached_bnode(*krcp); > if (!bnode && can_alloc) { > - migrate_disable(); > krc_this_cpu_unlock(*krcp, *flags); > bnode = (struct kvfree_rcu_bulk_data *) > __get_free_page(GFP_KERNEL | > __GFP_RETRY_MAYFAIL |
Re: Re: [PATCH] rcu: Release per-cpu krcp page cache when CPU going offline
From: Uladzislau Rezki Sent: January 22, 2021 22:31 To: Zhang, Qiang Cc: Uladzislau Rezki; Paul E. McKenney; r...@vger.kernel.org; linux-kernel@vger.kernel.org Subject: Re: Re: [PATCH] rcu: Release per-cpu krcp page cache when CPU going offline
On Fri, Jan 22, 2021 at 01:44:36AM +, Zhang, Qiang wrote:
>
> From: Uladzislau Rezki
> Sent: January 22, 2021 4:26
> To: Zhang, Qiang
> Cc: Paul E. McKenney; r...@vger.kernel.org; linux-kernel@vger.kernel.org; ure...@gmail.com
> Subject: Re: [PATCH] rcu: Release per-cpu krcp page cache when CPU going offline
> >Hello, Qiang,
> >
> > On Thu, Jan 21, 2021 at 02:49:49PM +0800, qiang.zh...@windriver.com wrote:
> > > From: Zqiang
> > >
> > > If CPUs go offline, the corresponding krcp's page cache cannot
> > > be used until the CPU comes back online, or maybe the CPU
> > > will never go online again; this commit therefore frees krcp's
> > > page cache when CPUs go offline.
> > >
> > > Signed-off-by: Zqiang
> >
> >Do you consider it as an issue? We have 5 pages per CPU, that is 20480 bytes.
>
> Hello Rezki
>
> In a multi-CPU system, more than one CPU may be offline, so there can be more than 5 pages, and these offline CPUs may never go online again, or errors in the CPU-online process may cause the online to fail; these scenarios will cause the per-cpu krc page cache to never be released.
>
>Thanks for your answer. I was thinking more about whether you knew of some platforms which suffer from such extra page usage when a CPU goes offline, or any issues your platforms or devices run into because of that.
>
>So I understand that if a CPU goes offline, the 5 pages associated with it are unused until it goes online again.
I agree with you, but I still want to say what I think.
My understanding is that when the CPU is offline, the pages are not accessible, because we don't know when this CPU will go online again, so it is best to return these pages to the buddy system; when the CPU goes online again, we can allocate pages from the buddy system to refill krcp's page cache. Maybe you think that this memory is small and this isn't needed.
Thanks
Qiang
>
>--
>Vlad Rezki
Re: [PATCH 3/3] kvfree_rcu: use migrate_disable/enable()
>
>From: Uladzislau Rezki (Sony)
>Sent: January 21, 2021 0:21
>To: LKML; RCU; Paul E . McKenney; Michael Ellerman
>Cc: Andrew Morton; Daniel Axtens; Frederic Weisbecker; Neeraj Upadhyay; Joel Fernandes; Peter Zijlstra; Michal Hocko; Thomas Gleixner; Theodore Y . Ts'o; Sebastian Andrzej Siewior; Uladzislau Rezki; Oleksiy Avramchenko
>Subject: [PATCH 3/3] kvfree_rcu: use migrate_disable/enable()
>
>Since the page is obtained in a fully preemptible context, dropping
>the lock can lead to migration onto another CPU. As a result the prev.
>bnode of that CPU may be underutilised, because a decision has been
>made for a CPU that had run out of free slots to store a pointer.
>
>migrate_disable/enable() are now independent of RT; use them in order
>to prevent any migration during a page request for the specific CPU it
>is requested for.
Hello Rezki
The migrate_disable/enable() critical area is not allowed to block, under RT and non-RT.
There is such a description in preempt.h:

* Notes on the implementation.
*
* The implementation is particularly tricky since existing code patterns
* dictate neither migrate_disable() nor migrate_enable() is allowed to block.
* This means that it cannot use cpus_read_lock() to serialize against hotplug,
* nor can it easily migrate itself into a pending affinity mask change on
* migrate_enable().

How about the following changes:

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index e7a226abff0d..2aa19537ac7c 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3488,12 +3488,10 @@ add_ptr_to_bulk_krc_lock(struct kfree_rcu_cpu **krcp,
		(*krcp)->bkvhead[idx]->nr_records == KVFREE_BULK_MAX_ENTR) {
		bnode = get_cached_bnode(*krcp);
		if (!bnode && can_alloc) {
-			migrate_disable();
			krc_this_cpu_unlock(*krcp, *flags);
			bnode = (struct kvfree_rcu_bulk_data *)
				__get_free_page(GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOMEMALLOC | __GFP_NOWARN);
-			*krcp = krc_this_cpu_lock(flags);
-			migrate_enable();
+			raw_spin_lock_irqsave(&(*krcp)->lock, *flags);
		}

		if (!bnode)

Thanks
Qiang
>
>Signed-off-by: Uladzislau Rezki (Sony)
>---
> kernel/rcu/tree.c | 2 ++
> 1 file changed, 2 insertions(+)
>
>diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
>index 454809514c91..cad36074366d 100644
>--- a/kernel/rcu/tree.c
>+++ b/kernel/rcu/tree.c
>@@ -3489,10 +3489,12 @@ add_ptr_to_bulk_krc_lock(struct kfree_rcu_cpu **krcp,
>		(*krcp)->bkvhead[idx]->nr_records == KVFREE_BULK_MAX_ENTR) {
>		bnode = get_cached_bnode(*krcp);
>		if (!bnode && can_alloc) {
>+			migrate_disable();
>			krc_this_cpu_unlock(*krcp, *flags);
>			bnode = (struct kvfree_rcu_bulk_data *)
>				__get_free_page(GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOMEMALLOC | __GFP_NOWARN);
>			*krcp = krc_this_cpu_lock(flags);
>+			migrate_enable();
>		}
>
>		if (!bnode)
>--
>2.20.1
Re: [PATCH] rcu: Release per-cpu krcp page cache when CPU going offline
From: Uladzislau Rezki Sent: January 22, 2021 4:26 To: Zhang, Qiang Cc: Paul E. McKenney; r...@vger.kernel.org; linux-kernel@vger.kernel.org; ure...@gmail.com Subject: Re: [PATCH] rcu: Release per-cpu krcp page cache when CPU going offline
>Hello, Qiang,
>
> On Thu, Jan 21, 2021 at 02:49:49PM +0800, qiang.zh...@windriver.com wrote:
> > From: Zqiang
> >
> > If CPUs go offline, the corresponding krcp's page cache cannot
> > be used until the CPU comes back online, or maybe the CPU
> > will never go online again; this commit therefore frees krcp's
> > page cache when CPUs go offline.
> >
> > Signed-off-by: Zqiang
>
>Do you consider it as an issue? We have 5 pages per CPU, that is 20480 bytes.
>
Hello Rezki
In a multi-CPU system, more than one CPU may be offline, so there can be more than 5 pages, and these offline CPUs may never go online again, or errors in the CPU-online process may cause the online to fail; these scenarios will cause the per-cpu krc page cache to never be released.
Thanks
Qiang
>--
>Vlad Rezki
Re: Question on workqueue: Manually break affinity on hotplug
Hello Peter, Lai
Sorry to disturb you again; I'm still confused. When a CPU goes offline, we actively call set_cpus_allowed_ptr() to reset a per-cpu kthread's cpumask; in sched_cpu_dying(), migrate_tasks() will reset the cpumask of per-cpu kthreads on the runqueue, and even for those not on the runqueue, another online CPU will be selected to run on when they wake up. What I want to ask is: why do we set it proactively?
Thanks
Qiang
From: Peter Zijlstra Sent: January 14, 2021 17:11 To: Zhang, Qiang Cc: linux-kernel@vger.kernel.org Subject: Re: Question on workqueue: Manually break affinity on hotplug [Please note this e-mail is from an EXTERNAL e-mail address]
On Thu, Jan 14, 2021 at 08:03:23AM +, Zhang, Qiang wrote:
> Hello Peter
>
> Excuse me, I have some questions for you about a description of this change:
>
> ''Don't rely on the scheduler to force break affinity for us -- it will
> stop doing that for per-cpu-kthreads."
>
> Does this mean that on CPU hotplug the scheduler does not change the affinity
> of a per-cpu kthread's task if we do not actively set the affinity?
> But if the per-cpu kthread's task is not in the running state, its affinity
> will be reset when it wakes up; this is done automatically.
>
> Or was this place modified to fit the new hotplug mechanism
> ("sched/hotplug: Consolidate task migration on CPU unplug")?
https://lkml.kernel.org/r/20201214155457.3430-1-jiangshan...@gmail.com
https://lkml.kernel.org/r/20201218170919.2950-1-jiangshan...@gmail.com
https://lkml.kernel.org/r/20201226025117.2770-1-jiangshan...@gmail.com
https://lkml.kernel.org/r/2021052638.2417-1-jiangshan...@gmail.com
https://lkml.kernel.org/r/20210112144344.850850...@infradead.org
Question on workqueue: Manually break affinity on hotplug
Hello Peter
Excuse me, I have some questions for you about a description of this change:
''Don't rely on the scheduler to force break affinity for us -- it will stop doing that for per-cpu-kthreads."
Does this mean that on CPU hotplug the scheduler does not change the affinity of a per-cpu kthread's task if we do not actively set the affinity? But if the per-cpu kthread's task is not in the running state, its affinity will be reset when it wakes up; this is done automatically.
Or was this place modified to fit the new hotplug mechanism ("sched/hotplug: Consolidate task migration on CPU unplug")?
Thanks
Qiang
Re: KASAN: use-after-free Read in usb_anchor_resume_wakeups (2)
From: Zhang, Qiang Sent: January 12, 2021 11:28 To: syzbot; a.darw...@linutronix.de; allen.l...@gmail.com; andreyk...@google.com; dvyu...@google.com; el...@google.com; gre...@linuxfoundation.org; gustavo...@kernel.org; linux-kernel@vger.kernel.org; linux-...@vger.kernel.org; syzkaller-b...@googlegroups.com; t...@linutronix.de Subject: Re: KASAN: use-after-free Read in usb_anchor_resume_wakeups (2)
From: syzbot Sent: January 12, 2021 0:11 To: a.darw...@linutronix.de; allen.l...@gmail.com; andreyk...@google.com; dvyu...@google.com; el...@google.com; gre...@linuxfoundation.org; gustavo...@kernel.org; linux-kernel@vger.kernel.org; linux-...@vger.kernel.org; syzkaller-b...@googlegroups.com; t...@linutronix.de Subject: KASAN: use-after-free Read in usb_anchor_resume_wakeups (2)
Hello,
syzbot found the following issue on:
HEAD commit: 841081d8 usb: usbip: Use DEFINE_SPINLOCK() for spinlock
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git usb-testing
console output: https://syzkaller.appspot.com/x/log.txt?x=12f42a3f50
kernel config: https://syzkaller.appspot.com/x/.config?x=6f9911c273a88e5
dashboard link: https://syzkaller.appspot.com/bug?extid=39c636a0650bcbb172ec
compiler: gcc (GCC) 10.1.0-syz 20200507
Unfortunately, I don't have any reproducer for this issue yet.
IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+39c636a0650bcbb17...@syzkaller.appspotmail.com xpad 6-1:0.65: xpad_irq_in - usb_submit_urb failed with result -19 xpad 6-1:0.65: xpad_irq_out - usb_submit_urb failed with result -19 == BUG: KASAN: use-after-free in register_lock_class+0xecc/0x1100 kernel/locking/lockdep.c:1291 Read of size 2 at addr 888137488092 by task systemd-udevd/7474 CPU: 1 PID: 7474 Comm: systemd-udevd Not tainted 5.11.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0x107/0x163 lib/dump_stack.c:120 print_address_description.constprop.0.cold+0x5b/0x2f8 mm/kasan/report.c:230 __kasan_report mm/kasan/report.c:396 [inline] kasan_report.cold+0x79/0xd5 mm/kasan/report.c:413 register_lock_class+0xecc/0x1100 kernel/locking/lockdep.c:1291 __lock_acquire+0x101/0x54f0 kernel/locking/lockdep.c:4711 lock_acquire kernel/locking/lockdep.c:5437 [inline] lock_acquire+0x288/0x700 kernel/locking/lockdep.c:5402 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0x36/0x50 kernel/locking/spinlock.c:159 __wake_up_common_lock+0xb4/0x130 kernel/sched/wait.c:137 usb_anchor_resume_wakeups drivers/usb/core/urb.c:937 [inline] usb_anchor_resume_wakeups+0xbe/0xe0 drivers/usb/core/urb.c:930 __usb_hcd_giveback_urb+0x2df/0x5c0 drivers/usb/core/hcd.c:1661 usb_hcd_giveback_urb+0x367/0x410 drivers/usb/core/hcd.c:1728 dummy_timer+0x11f4/0x32a0 drivers/usb/gadget/udc/dummy_hcd.c:1971 call_timer_fn+0x1a5/0x630 kernel/time/timer.c:1417 expire_timers kernel/time/timer.c:1462 [inline] __run_timers.part.0+0x67c/0xa10 kernel/time/timer.c:1731 __run_timers kernel/time/timer.c:1712 [inline] run_timer_softirq+0x80/0x120 kernel/time/timer.c:1744 __do_softirq+0x1b7/0x977 kernel/softirq.c:343 asm_call_irq_on_stack+0xf/0x20 __run_on_irqstack arch/x86/include/asm/irq_stack.h:26 
[inline] run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:77 [inline] do_softirq_own_stack+0x80/0xa0 arch/x86/kernel/irq_64.c:77 invoke_softirq kernel/softirq.c:226 [inline] __irq_exit_rcu kernel/softirq.c:420 [inline] irq_exit_rcu+0x110/0x1a0 kernel/softirq.c:432 sysvec_apic_timer_interrupt+0x43/0xa0 arch/x86/kernel/apic/apic.c:1096 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:628 RIP: 0010:__sanitizer_cov_trace_pc+0x37/0x60 kernel/kcov.c:197 Code: 81 e1 00 01 00 00 65 48 8b 14 25 40 ef 01 00 a9 00 01 ff 00 74 0e 85 c9 74 35 8b 82 dc 13 00 00 85 c0 74 2b 8b 82 b8 13 00 00 <83> f8 02 75 20 48 8b 8a c0 13 00 00 8b 92 bc 13 00 00 48 8b 01 48 RSP: 0018:c90005f875b0 EFLAGS: 0246 RAX: RBX: 0003 RCX: RDX: 888116d85040 RSI: 81dabe81 RDI: 0003 RBP: 888102c2bf00 R08: R09: 0003 R10: 81dabeba R11: 0010 R12: 0002 R13: 01cc R14: dc00 R15: tomoyo_domain_quota_is_ok+0x2f1/0x550 security/tomoyo/util.c:1093 tomoyo_supervisor+0x2f2/0xf00 security/tomoyo/common.c:2089 tomoyo_audit_path_log security/tomoyo/file.c:168 [inline] tomoyo_path_permission security/tomoyo/file.c:587 [inline] tomoyo_path_permission+0x270/0x3a0 security/tomoyo/file.c:573 tomoyo_check_open_permission+0x33e/0x380 security/tomoyo/file.c:777 tomoyo_file_open security/tomoyo/tomoyo.c:313 [
Re: KASAN: use-after-free Read in usb_anchor_resume_wakeups (2)
From: syzbot Sent: January 12, 2021 0:11 To: a.darw...@linutronix.de; allen.l...@gmail.com; andreyk...@google.com; dvyu...@google.com; el...@google.com; gre...@linuxfoundation.org; gustavo...@kernel.org; linux-kernel@vger.kernel.org; linux-...@vger.kernel.org; syzkaller-b...@googlegroups.com; t...@linutronix.de Subject: KASAN: use-after-free Read in usb_anchor_resume_wakeups (2) Hello, syzbot found the following issue on: HEAD commit: 841081d8 usb: usbip: Use DEFINE_SPINLOCK() for spinlock git tree: https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git usb-testing console output: https://syzkaller.appspot.com/x/log.txt?x=12f42a3f50 kernel config: https://syzkaller.appspot.com/x/.config?x=6f9911c273a88e5 dashboard link: https://syzkaller.appspot.com/bug?extid=39c636a0650bcbb172ec compiler: gcc (GCC) 10.1.0-syz 20200507 Unfortunately, I don't have any reproducer for this issue yet. IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+39c636a0650bcbb17...@syzkaller.appspotmail.com xpad 6-1:0.65: xpad_irq_in - usb_submit_urb failed with result -19 xpad 6-1:0.65: xpad_irq_out - usb_submit_urb failed with result -19 == BUG: KASAN: use-after-free in register_lock_class+0xecc/0x1100 kernel/locking/lockdep.c:1291 Read of size 2 at addr 888137488092 by task systemd-udevd/7474 CPU: 1 PID: 7474 Comm: systemd-udevd Not tainted 5.11.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0x107/0x163 lib/dump_stack.c:120 print_address_description.constprop.0.cold+0x5b/0x2f8 mm/kasan/report.c:230 __kasan_report mm/kasan/report.c:396 [inline] kasan_report.cold+0x79/0xd5 mm/kasan/report.c:413 register_lock_class+0xecc/0x1100 kernel/locking/lockdep.c:1291 __lock_acquire+0x101/0x54f0 kernel/locking/lockdep.c:4711 lock_acquire kernel/locking/lockdep.c:5437 [inline] lock_acquire+0x288/0x700 kernel/locking/lockdep.c:5402
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0x36/0x50 kernel/locking/spinlock.c:159 __wake_up_common_lock+0xb4/0x130 kernel/sched/wait.c:137 usb_anchor_resume_wakeups drivers/usb/core/urb.c:937 [inline] usb_anchor_resume_wakeups+0xbe/0xe0 drivers/usb/core/urb.c:930 __usb_hcd_giveback_urb+0x2df/0x5c0 drivers/usb/core/hcd.c:1661 usb_hcd_giveback_urb+0x367/0x410 drivers/usb/core/hcd.c:1728 dummy_timer+0x11f4/0x32a0 drivers/usb/gadget/udc/dummy_hcd.c:1971 call_timer_fn+0x1a5/0x630 kernel/time/timer.c:1417 expire_timers kernel/time/timer.c:1462 [inline] __run_timers.part.0+0x67c/0xa10 kernel/time/timer.c:1731 __run_timers kernel/time/timer.c:1712 [inline] run_timer_softirq+0x80/0x120 kernel/time/timer.c:1744 __do_softirq+0x1b7/0x977 kernel/softirq.c:343 asm_call_irq_on_stack+0xf/0x20 __run_on_irqstack arch/x86/include/asm/irq_stack.h:26 [inline] run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:77 [inline] do_softirq_own_stack+0x80/0xa0 arch/x86/kernel/irq_64.c:77 invoke_softirq kernel/softirq.c:226 [inline] __irq_exit_rcu kernel/softirq.c:420 [inline] irq_exit_rcu+0x110/0x1a0 kernel/softirq.c:432 sysvec_apic_timer_interrupt+0x43/0xa0 arch/x86/kernel/apic/apic.c:1096 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:628 RIP: 0010:__sanitizer_cov_trace_pc+0x37/0x60 kernel/kcov.c:197 Code: 81 e1 00 01 00 00 65 48 8b 14 25 40 ef 01 00 a9 00 01 ff 00 74 0e 85 c9 74 35 8b 82 dc 13 00 00 85 c0 74 2b 8b 82 b8 13 00 00 <83> f8 02 75 20 48 8b 8a c0 13 00 00 8b 92 bc 13 00 00 48 8b 01 48 RSP: 0018:c90005f875b0 EFLAGS: 0246 RAX: RBX: 0003 RCX: RDX: 888116d85040 RSI: 81dabe81 RDI: 0003 RBP: 888102c2bf00 R08: R09: 0003 R10: 81dabeba R11: 0010 R12: 0002 R13: 01cc R14: dc00 R15: tomoyo_domain_quota_is_ok+0x2f1/0x550 security/tomoyo/util.c:1093 tomoyo_supervisor+0x2f2/0xf00 security/tomoyo/common.c:2089 tomoyo_audit_path_log security/tomoyo/file.c:168 [inline] tomoyo_path_permission security/tomoyo/file.c:587 
[inline] tomoyo_path_permission+0x270/0x3a0 security/tomoyo/file.c:573 tomoyo_check_open_permission+0x33e/0x380 security/tomoyo/file.c:777 tomoyo_file_open security/tomoyo/tomoyo.c:313 [inline] tomoyo_file_open+0xa3/0xd0 security/tomoyo/tomoyo.c:308 security_file_open+0x52/0x4f0 security/security.c:1576 do_dentry_open+0x353/0x1090 fs/open.c:804 do_open fs/namei.c:3254 [inline] path_openat+0x1b9a/0x2730 fs/namei.c:3371 do_filp_open+0x17e/0x3c0 fs/namei.c:3398 do_sys_openat2+0x16d/0x420 fs/open.c:1172 do_sys_open fs/open.c:1188 [inline] __do_sys_open fs/open.c:1196 [inline] __se_sys_open fs/open.c:1192 [inline]
Re: KASAN: use-after-free Read in service_outstanding_interrupt
From: Oliver Neukum Sent: January 5, 2021 0:28 To: syzbot; andreyk...@google.com; gre...@linuxfoundation.org; gustavo...@kernel.org; ingras...@epigenesys.com; lee.jo...@linaro.org; linux-kernel@vger.kernel.org; linux-...@vger.kernel.org; penguin-ker...@i-love.sakura.ne.jp; syzkaller-b...@googlegroups.com Subject: Re: KASAN: use-after-free Read in service_outstanding_interrupt On Thursday, 17.12.2020, at 19:21 -0800, syzbot wrote: > syzbot has found a reproducer for the following issue on: > > HEAD commit: 5e60366d Merge tag 'fallthrough-fixes-clang-5.11-rc1' of g.. > git tree: > https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git usb-testing > console output: https://syzkaller.appspot.com/x/log.txt?x=12c5b62350 > kernel config: https://syzkaller.appspot.com/x/.config?x=5cea7506b7139727 > dashboard link: https://syzkaller.appspot.com/bug?extid=9e04e2df4a32fb661daf > compiler: gcc (GCC) 10.1.0-syz 20200507 > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=175adf0750 > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1672680f50 > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > Reported-by: >syzbot+9e04e2df4a32fb661...@syzkaller.appspotmail.com > >#syz test: https://github.com/google/kasan.git 5e60366d > Hello Oliver, this use-after-free still exists. The calltrace shows that the usb_device object has already been released at disconnect time; adding a reference count on the usb_device object would avoid this problem: diff --git a/drivers/usb/class/cdc-wdm.c b/drivers/usb/class/cdc-wdm.c index 508b1c3f8b73..001cb93da6bf 100644 --- a/drivers/usb/class/cdc-wdm.c +++ b/drivers/usb/class/cdc-wdm.c @@ -106,6 +106,7 @@ struct wdm_device { struct list_head device_list; int (*manage_power)(struct usb_interface *, int); + struct usb_device *usb_dev; }; static struct usb_driver wdm_driver; @@ -338,6 +339,7 @@ static void free_urbs(struct wdm_device *desc) static void cleanup(struct wdm_device *desc) { + usb_put_dev(desc->usb_dev);
kfree(desc->sbuf); kfree(desc->inbuf); kfree(desc->orq); @@ -855,6 +857,7 @@ static int wdm_create(struct usb_interface *intf, struct usb_endpoint_descriptor desc->intf = intf; INIT_WORK(&desc->rxwork, wdm_rxwork); INIT_WORK(&desc->service_outs_intr, service_interrupt_work); + desc->usb_dev = usb_get_dev(interface_to_usbdev(intf)); rv = -EINVAL; if (!usb_endpoint_is_int_in(ep)) >From f51e3c5a202f3abc805edd64b21a68d29dd9d60e Mon Sep 17 >00:00:00 2001 >From: Oliver Neukum >Date: Mon, 4 Jan 2021 17:26:33 +0100 >Subject: [PATCH] cdc-wdm: poison URBs upon disconnect > >We have a chicken and egg issue between interrupt and work. >This should break the cycle. > >Signed-off-by: Oliver Neukum >--- >drivers/usb/class/cdc-wdm.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > >diff --git a/drivers/usb/class/cdc-wdm.c b/drivers/usb/class/cdc-wdm.c >index 02d0cfd23bb2..14eddda35280 100644 >--- a/drivers/usb/class/cdc-wdm.c >+++ b/drivers/usb/class/cdc-wdm.c >@@ -324,9 +324,9 @@ static void wdm_int_callback(struct urb *urb) >static void kill_urbs(struct wdm_device *desc) > { > /* the order here is essential */ >- usb_kill_urb(desc->command); >- usb_kill_urb(desc->validity); >- usb_kill_urb(desc->response); >+ usb_poison_urb(desc->command); >+ usb_poison_urb(desc->validity); >+ usb_poison_urb(desc->response); > } > > static void free_urbs(struct wdm_device *desc) >-- >2.26.2
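The fix proposed above follows the usual get/put reference-counting pattern: wdm_create() takes an extra reference on the usb_device and cleanup() drops it, so the device object cannot be freed while the wdm code still uses it. A minimal userspace sketch of that pattern (the struct and function names here are illustrative, not the real USB core API):

```c
#include <assert.h>

/* Toy stand-in for a refcounted device object; "freed" replaces the
 * real kfree() so the lifetime can be observed. */
struct fake_dev {
	int refcount;
	int freed;
};

static struct fake_dev *fake_get_dev(struct fake_dev *d)
{
	d->refcount++;
	return d;
}

static void fake_put_dev(struct fake_dev *d)
{
	if (--d->refcount == 0)
		d->freed = 1;	/* stands in for the final kfree() */
}

/* Core holds one reference; the wdm code takes a second at create
 * time, so the core's put at disconnect does not free the object. */
static int refcount_demo(void)
{
	struct fake_dev dev = { .refcount = 1, .freed = 0 };

	fake_get_dev(&dev);	/* wdm_create(): usb_get_dev(...) analogue */
	fake_put_dev(&dev);	/* core drops its reference at disconnect */
	if (dev.freed)		/* must still be alive here */
		return -1;
	fake_put_dev(&dev);	/* cleanup(): usb_put_dev() analogue */
	return dev.freed;	/* now really gone */
}
```

The point is that the last user, not the first, releases the object, which is exactly what moves the free past the use in the reported use-after-free.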
Re: [PATCH] ipc/sem.c: Convert kfree_rcu() to call_rcu() in freeary function
From: Paul E. McKenney Sent: December 31, 2020 0:19 To: Zhang, Qiang Cc: a...@linux-foundation.org; manf...@colorfullife.com; gustavo...@kernel.org; linux-kernel@vger.kernel.org Subject: Re: [PATCH] ipc/sem.c: Convert kfree_rcu() to call_rcu() in freeary function On Wed, Dec 30, 2020 at 08:00:38PM +0800, qiang.zh...@windriver.com wrote: > From: Zqiang > > Because freeary() is called with a spinlock held, and kfree_rcu() > may call synchronize_rcu(), a schedule could happen inside the > spinlock critical region; replace kfree_rcu() with call_rcu(). > >Except that the call to kfree_rcu() below has two arguments, and >thus >provides a link for queuing the callback. It will never directly invoke >synchronize_rcu(). It is only the single-argument variant of >kfree_rcu() >that might invoke synchronize_rcu(). Sorry. It was my mistake, please ignore this patch. Thanks Qiang >Or are you seeing lockdep or might-sleep failures with the current >code? >If so, please post the relevant portions of the console output. > > Thanx, Paul > > Fixes: 693a8b6eecce ("ipc,rcu: Convert call_rcu(free_un) to kfree_rcu()") > Signed-off-by: Zqiang > --- > ipc/sem.c | 9 ++++++++- > 1 file changed, 8 insertions(+), 1 deletion(-) > > diff --git a/ipc/sem.c b/ipc/sem.c > index f6c30a85dadf..12c3184347d9 100644 > --- a/ipc/sem.c > +++ b/ipc/sem.c > @@ -1132,6 +1132,13 @@ static int count_semcnt(struct sem_array *sma, ushort > semnum, > return semcnt; > } > > +static void free_un(struct rcu_head *head) > +{ > + struct sem_undo *un = container_of(head, struct sem_undo, rcu); > + > + kfree(un); > +} > + > /* Free a semaphore set. freeary() is called with sem_ids.rwsem locked > * as a writer and the spinlock for this semaphore set hold. sem_ids.rwsem > * remains locked on exit.
> @@ -1152,7 +1159,7 @@ static void freeary(struct ipc_namespace *ns, struct > kern_ipc_perm *ipcp) > un->semid = -1; > list_del_rcu(&un->list_proc); > spin_unlock(&un->ulp->lock); > - kfree_rcu(un, rcu); > + call_rcu(&un->rcu, free_un); > } > > /* Wake up all pending processes and let them fail with EIDRM. */ > -- > 2.17.1 >
Re: INFO: task hung in ath6kl_usb_destroy (3)
From: syzbot Sent: November 30, 2020 23:31 To: andreyk...@google.com; da...@davemloft.net; k...@kernel.org; kv...@codeaurora.org; linux-kernel@vger.kernel.org; linux-...@vger.kernel.org; linux-wirel...@vger.kernel.org; net...@vger.kernel.org; syzkaller-b...@googlegroups.com Subject: INFO: task hung in ath6kl_usb_destroy (3) [Please note this e-mail is from an EXTERNAL e-mail address] Hello, syzbot found the following issue on: HEAD commit: ebad4326 Merge 5.10-rc6 into usb-next git tree: https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git usb-testing console output: https://syzkaller.appspot.com/x/log.txt?x=1566291d50 kernel config: https://syzkaller.appspot.com/x/.config?x=fe8988e4dc252d01 dashboard link: https://syzkaller.appspot.com/bug?extid=bccb3d118a39c43b6c9d compiler: gcc (GCC) 10.1.0-syz 20200507 Unfortunately, I don't have any reproducer for this issue yet. IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+bccb3d118a39c43b6...@syzkaller.appspotmail.com INFO: task kworker/1:4:7246 blocked for more than 143 seconds. Not tainted 5.10.0-rc6-syzkaller #0 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/1:4 state:D stack:22864 pid: 7246 ppid: 2 flags:0x4000 Workqueue: usb_hub_wq hub_event Call Trace: context_switch kernel/sched/core.c:3779 [inline] __schedule+0x8a2/0x1f30 kernel/sched/core.c:4528 schedule+0xcb/0x270 kernel/sched/core.c:4606 schedule_timeout+0x1d8/0x250 kernel/time/timer.c:1847 do_wait_for_common kernel/sched/completion.c:85 [inline] __wait_for_common kernel/sched/completion.c:106 [inline] wait_for_common kernel/sched/completion.c:117 [inline] wait_for_completion+0x168/0x270 kernel/sched/completion.c:138 flush_workqueue+0x3ff/0x13e0 kernel/workqueue.c:2835 flush_scheduled_work include/linux/workqueue.h:597 [inline] ath6kl_usb_flush_all drivers/net/wireless/ath/ath6kl/usb.c:476 [inline] [note: ath6kl_usb_flush_all() calls flush_scheduled_work(), which flushes all work in system_wq — should flush_work() be called on the driver's own work items instead?] ath6kl_usb_destroy+0xc6/0x290 drivers/net/wireless/ath/ath6kl/usb.c:609 ath6kl_usb_probe+0xc7b/0x11f0 drivers/net/wireless/ath/ath6kl/usb.c:1166 usb_probe_interface+0x315/0x7f0 drivers/usb/core/driver.c:396 really_probe+0x291/0xde0 drivers/base/dd.c:554 driver_probe_device+0x26b/0x3d0 drivers/base/dd.c:738 __device_attach_driver+0x1d1/0x290 drivers/base/dd.c:844 bus_for_each_drv+0x15f/0x1e0 drivers/base/bus.c:431 __device_attach+0x228/0x4a0 drivers/base/dd.c:912 bus_probe_device+0x1e4/0x290 drivers/base/bus.c:491 device_add+0xbb2/0x1ce0 drivers/base/core.c:2936 usb_set_configuration+0x113c/0x1910 drivers/usb/core/message.c:2168 usb_generic_driver_probe+0xba/0x100 drivers/usb/core/generic.c:238 usb_probe_device+0xd9/0x2c0 drivers/usb/core/driver.c:293 really_probe+0x291/0xde0 drivers/base/dd.c:554 driver_probe_device+0x26b/0x3d0 drivers/base/dd.c:738 __device_attach_driver+0x1d1/0x290 drivers/base/dd.c:844 bus_for_each_drv+0x15f/0x1e0 drivers/base/bus.c:431 __device_attach+0x228/0x4a0 drivers/base/dd.c:912 bus_probe_device+0x1e4/0x290 drivers/base/bus.c:491 device_add+0xbb2/0x1ce0
drivers/base/core.c:2936 usb_new_device.cold+0x71d/0xfe9 drivers/usb/core/hub.c:2555 hub_port_connect drivers/usb/core/hub.c:5223 [inline] hub_port_connect_change drivers/usb/core/hub.c:5363 [inline] port_event drivers/usb/core/hub.c:5509 [inline] hub_event+0x2348/0x42d0 drivers/usb/core/hub.c:5591 process_one_work+0x933/0x1520 kernel/workqueue.c:2272 process_scheduled_works kernel/workqueue.c:2334 [inline] worker_thread+0x82b/0x1120 kernel/workqueue.c:2420 kthread+0x38c/0x460 kernel/kthread.c:292 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296 Showing all locks held in the system: 5 locks held by kworker/0:0/5: #0: 888103c7ed38 ((wq_completion)usb_hub_wq){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline] #0: 888103c7ed38 ((wq_completion)usb_hub_wq){+.+.}-{0:0}, at: atomic64_set include/asm-generic/atomic-instrumented.h:856 [inline] #0: 888103c7ed38 ((wq_completion)usb_hub_wq){+.+.}-{0:0}, at: atomic_long_set include/asm-generic/atomic-long.h:41 [inline] #0: 888103c7ed38 ((wq_completion)usb_hub_wq){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:616 [inline] #0: 888103c7ed38 ((wq_completion)usb_hub_wq){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:643 [inline] #0: 888103c7ed38 ((wq_completion)usb_hub_wq){+.+.}-{0:0}, at: process_one_work+0x821/0x1520 kernel/workqueue.c:2243 #1: c905fda8 ((work_completion)(>events)){+.+.}-{0:0}, at: process_one_work+0x854/0x1520 kernel/workqueue.c:2247 #2: 888108dd6218 (>mutex){}-{3:3}, at: device_lock include/linux/device.h:731 [inline] #2: 888108dd6218 (>mutex){}-{3:3}, at:
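The question embedded in the trace above — flush_scheduled_work() vs flush_work() — can be illustrated with a toy single-threaded model (illustrative names, not the kernel workqueue API): flushing the shared queue has to wait for every pending item, whether the driver owns it or not, while flushing a single work item waits only for that item. If any unrelated item blocks, flushing the whole queue hangs, which is the shape of this report:

```c
#include <assert.h>

#define MAX_WORK 8

struct work {
	void (*fn)(void);
	int pending;
};

static struct work queue[MAX_WORK];	/* stands in for system_wq */
static int nwork;

static void queue_work_model(void (*fn)(void))
{
	queue[nwork].fn = fn;
	queue[nwork].pending = 1;
	nwork++;
}

/* flush_work() analogue: run (wait for) exactly one item. */
static int flush_one(struct work *w)
{
	if (w->pending) {
		w->fn();
		w->pending = 0;
	}
	return 1;
}

/* flush_scheduled_work() analogue: must wait for *everything* queued,
 * including work belonging to other users of the shared queue. */
static int flush_all(void)
{
	int waited = 0;

	for (int i = 0; i < nwork; i++)
		waited += flush_one(&queue[i]);
	return waited;
}

static int ours_ran, others_ran;
static void our_work(void)   { ours_ran++; }
static void other_work(void) { others_ran++; }
```

With one unrelated item and one of our own queued, flush_one() on our item touches only our work, while flush_all() also has to wait out the unrelated item — which is why a driver should flush the work items it owns rather than the whole shared queue.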
Re: [PATCH] srcu: Remove srcu_cblist_invoking member from sdp
From: Paul E. McKenney Sent: November 20, 2020 2:12 To: Zhang, Qiang Cc: jiangshan...@gmail.com; rost...@goodmis.org; j...@joshtriplett.org; r...@vger.kernel.org; linux-kernel@vger.kernel.org Subject: Re: [PATCH] srcu: Remove srcu_cblist_invoking member from sdp [Please note this e-mail is from an EXTERNAL e-mail address] On Thu, Nov 19, 2020 at 01:34:11PM +0800, qiang.zh...@windriver.com wrote: > From: Zqiang > > The workqueue ensures that multiple instances of the same sdp->work > execute sequentially in rcu_gp_wq, so srcu_cblist_invoking is not > needed to prevent concurrent execution; remove it. > > Signed-off-by: Zqiang >Good job analyzing the code, which is very good to see!!! > >But these do have a potential purpose. Right now, it is OK to invoke >synchronize_srcu() during early boot, that is, before the scheduler >has started. But there is a gap from the time that the scheduler has >initialized (so that preemption and blocking are possible) and the time >that workqueues are initialized and fully functional. Only after that >is it once again OK to use synchronize_srcu(). > >If synchronize_srcu() is ever required to work correctly during that >time period, it will need to directly invoke the functions that are >currently run in workqueue context. Which means that there will then be >the possibility of two instances of these functions running just after >workqueues are available. > > Thanx, Paul Thanks Paul. > --- > include/linux/srcutree.h | 1 - > kernel/rcu/srcutree.c | 8 ++------ > 2 files changed, 2 insertions(+), 7 deletions(-) > > diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h > index 9cfcc8a756ae..62d8312b5451 100644 > --- a/include/linux/srcutree.h > +++ b/include/linux/srcutree.h > @@ -31,7 +31,6 @@ struct srcu_data { > struct rcu_segcblist srcu_cblist; /* List of callbacks.*/ > unsigned long srcu_gp_seq_needed; /* Furthest future GP needed. */ > unsigned long srcu_gp_seq_needed_exp; /* Furthest future exp GP. */ > - bool srcu_cblist_invoking; /* Invoking these CBs?
*/ > struct timer_list delay_work; /* Delay for CB invoking */ > struct work_struct work; /* Context for CB invoking. */ > struct rcu_head srcu_barrier_head; /* For srcu_barrier() use. */ > diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c > index 3c5e2806e0b9..c4d5cd2567a6 100644 > --- a/kernel/rcu/srcutree.c > +++ b/kernel/rcu/srcutree.c > @@ -134,7 +134,6 @@ static void init_srcu_struct_nodes(struct srcu_struct > *ssp, bool is_static) > sdp = per_cpu_ptr(ssp->sda, cpu); > spin_lock_init(&ACCESS_PRIVATE(sdp, lock)); > rcu_segcblist_init(&sdp->srcu_cblist); > - sdp->srcu_cblist_invoking = false; > sdp->srcu_gp_seq_needed = ssp->srcu_gp_seq; > sdp->srcu_gp_seq_needed_exp = ssp->srcu_gp_seq; > sdp->mynode = &snp_first[cpu / levelspread[level]]; > @@ -1254,14 +1253,11 @@ static void srcu_invoke_callbacks(struct work_struct > *work) > spin_lock_irq_rcu_node(sdp); > rcu_segcblist_advance(&sdp->srcu_cblist, > rcu_seq_current(&ssp->srcu_gp_seq)); > - if (sdp->srcu_cblist_invoking || > - !rcu_segcblist_ready_cbs(&sdp->srcu_cblist)) { > + if (!rcu_segcblist_ready_cbs(&sdp->srcu_cblist)) { > spin_unlock_irq_rcu_node(sdp); > return; /* Someone else on the job or nothing to do. */ > } > > - /* We are on the job! Extract and invoke ready callbacks. */ > - sdp->srcu_cblist_invoking = true; > rcu_segcblist_extract_done_cbs(&sdp->srcu_cblist, &ready_cbs); > len = ready_cbs.len; > spin_unlock_irq_rcu_node(sdp); > @@ -1282,7 +1278,7 @@ static void srcu_invoke_callbacks(struct work_struct > *work) > rcu_segcblist_add_len(&sdp->srcu_cblist, -len); > (void)rcu_segcblist_accelerate(&sdp->srcu_cblist, > rcu_seq_snap(&ssp->srcu_gp_seq)); > - sdp->srcu_cblist_invoking = false; > + > more = rcu_segcblist_ready_cbs(&sdp->srcu_cblist); > spin_unlock_irq_rcu_node(sdp); > if (more) > -- > 2.17.1 >
Re: [PATCH] kthread_worker: Add flush delayed work func
From: Andrew Morton Sent: November 13, 2020 8:01 To: Zhang, Qiang Cc: pmla...@suse.com; t...@kernel.org; linux...@kvack.org; linux-kernel@vger.kernel.org Subject: Re: [PATCH] kthread_worker: Add flush delayed work func [Please note this e-mail is from an EXTERNAL e-mail address] On Wed, 11 Nov 2020 17:13:55 +0800 qiang.zh...@windriver.com wrote: > Add 'kthread_flush_delayed_work' func, the principle of > this func is wait for a dwork to finish executing the > last queueing. > >We'd like to see some code which actually uses this new function >please. Either in this patch or as one or more followup patches. > >btw, we call it "function", not "func". But neither is really needed - >just use () to identify a function. ie: >: Add kthread_flush_delayed_work(). The principle of this is to wait for >: a dwork to finish executing the last queueing. I don't see it being used in the kernel code so far, and I'm not sure whether it will be used in later scenarios (it is like flush_delayed_work() in the workqueue code), or whether some code that currently uses kthread_work needs it. Thanks Qiang
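The semantics being proposed mirror workqueue's flush_delayed_work(): if the delay timer is still pending, cancel it and queue the work immediately, then flush the work so that the last queueing has finished by the time the call returns. A userspace sketch of that logic (illustrative names, not a real kernel implementation):

```c
#include <assert.h>

struct dwork_model {
	int timer_pending;	/* delay timer armed, work not yet queued */
	int work_queued;	/* work sitting on the worker's list */
	int runs;		/* how many times the work has executed */
};

/* Worker-loop analogue: execute the work if it is queued. */
static void run_queued(struct dwork_model *dw)
{
	if (dw->work_queued) {
		dw->work_queued = 0;
		dw->runs++;
	}
}

/* flush_delayed_work() analogue: promote a pending timer to an
 * immediate queueing, then wait until the work has run. */
static int flush_delayed_work_model(struct dwork_model *dw)
{
	if (dw->timer_pending) {	/* del_timer_sync() succeeded */
		dw->timer_pending = 0;
		dw->work_queued = 1;	/* queue it right now */
	}
	run_queued(dw);			/* flush_work(): wait for it */
	return dw->runs;
}
```

After the flush returns, neither the timer nor the work is pending, which is the "wait for a dwork to finish executing the last queueing" guarantee the patch describes.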
Re: memory leak in __usbhid_submit_report
From: syzbot Sent: November 11, 2020 21:55 To: benjamin.tissoi...@redhat.com; ji...@kernel.org; linux-in...@vger.kernel.org; linux-kernel@vger.kernel.org; linux-...@vger.kernel.org; syzkaller-b...@googlegroups.com Subject: memory leak in __usbhid_submit_report [Please note this e-mail is from an EXTERNAL e-mail address] Hello, syzbot found the following issue on: HEAD commit: f8394f23 Linux 5.10-rc3 git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?x=12ebbdc650 kernel config: https://syzkaller.appspot.com/x/.config?x=a3f13716fa0212fd dashboard link: https://syzkaller.appspot.com/bug?extid=47b26cd837ececfc666d compiler: gcc (GCC) 10.1.0-syz 20200507 syz repro: https://syzkaller.appspot.com/x/repro.syz?x=14497b8250 C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1586ff1450 IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+47b26cd837ececfc6...@syzkaller.appspotmail.com BUG: memory leak unreferenced object 0x8881097e5ec0 (size 32): comm "kworker/0:1", pid 7, jiffies 4294949214 (age 33.520s) hex dump (first 32 bytes): 04 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 backtrace: [<8296eaa1>] __usbhid_submit_report+0x116/0x490 drivers/hid/usbhid/hid-core.c:588 [] usbhid_submit_report drivers/hid/usbhid/hid-core.c:638 [inline] [ ] usbhid_request+0x59/0xa0 drivers/hid/usbhid/hid-core.c:1272 [<428a854b>] hidinput_led_worker+0x59/0x160 drivers/hid/hid-input.c:1507 [<1bb8d86d>] process_one_work+0x27d/0x590 kernel/workqueue.c:2272 [<5d9a2f9c>] worker_thread+0x59/0x5d0 kernel/workqueue.c:2418 [ ] kthread+0x178/0x1b0 kernel/kthread.c:292 [<99d5a9ee>] ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296 When the USB device disconnects, the "raw_report" buffers should be freed in usbhid_stop().
Can we release them in this function, as shown below? (kfree(NULL) is a no-op, so no NULL check is needed.) usbhid_stop(struct hid_device *hid) { for (index = 0; index < HID_CONTROL_FIFO_SIZE; index++) { kfree(usbhid->ctrl[index].raw_report); kfree(usbhid->out[index].raw_report); } ... } BUG: memory leak unreferenced object 0x8881120200c0 (size 32): comm "kworker/0:1", pid 7, jiffies 4294949214 (age 33.520s) hex dump (first 32 bytes): 04 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 backtrace: [<8296eaa1>] __usbhid_submit_report+0x116/0x490 drivers/hid/usbhid/hid-core.c:588 [ ] usbhid_submit_report drivers/hid/usbhid/hid-core.c:638 [inline] [ ] usbhid_request+0x59/0xa0 drivers/hid/usbhid/hid-core.c:1272 [<428a854b>] hidinput_led_worker+0x59/0x160 drivers/hid/hid-input.c:1507 [<1bb8d86d>] process_one_work+0x27d/0x590 kernel/workqueue.c:2272 [<5d9a2f9c>] worker_thread+0x59/0x5d0 kernel/workqueue.c:2418 [ ] kthread+0x178/0x1b0 kernel/kthread.c:292 [<99d5a9ee>] ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296 BUG: memory leak unreferenced object 0x888107fa9420 (size 32): comm "kworker/0:1", pid 7, jiffies 4294949214 (age 33.520s) hex dump (first 32 bytes): 04 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 backtrace: [<8296eaa1>] __usbhid_submit_report+0x116/0x490 drivers/hid/usbhid/hid-core.c:588 [ ] usbhid_submit_report drivers/hid/usbhid/hid-core.c:638 [inline] [ ] usbhid_request+0x59/0xa0 drivers/hid/usbhid/hid-core.c:1272 [<428a854b>] hidinput_led_worker+0x59/0x160 drivers/hid/hid-input.c:1507 [<1bb8d86d>] process_one_work+0x27d/0x590 kernel/workqueue.c:2272 [<5d9a2f9c>] worker_thread+0x59/0x5d0 kernel/workqueue.c:2418 [ ] kthread+0x178/0x1b0 kernel/kthread.c:292 [<99d5a9ee>] ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296 BUG: memory leak unreferenced object 0x888112020b60 (size 32): comm "kworker/1:4", pid 8569, jiffies 4294949237 (age 33.290s)
hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 backtrace: [<8296eaa1>] __usbhid_submit_report+0x116/0x490 drivers/hid/usbhid/hid-core.c:588 [ ] usbhid_submit_report drivers/hid/usbhid/hid-core.c:638
Re: Re: [PATCH v2] kthread_worker: re-set CPU affinities if CPU come online
From: Thomas Gleixner Sent: October 28, 2020 17:23 To: Zhang, Qiang; pmla...@suse.com; t...@kernel.org Cc: a...@linux-foundation.org; linux...@kvack.org; linux-kernel@vger.kernel.org Subject: Re: Re: [PATCH v2] kthread_worker: re-set CPU affinities if CPU come online > [Please note this e-mail is from an EXTERNAL e-mail address] > > On Wed, Oct 28 2020 at 15:30, qiang zhang wrote: > >>How is that addressing any of the comments I made on V1 of this? > > Do you mean the following problem: > > "The dynamic hotplug states run late. What's preventing work to be queued > on such a worker before it is bound to the CPU again?" > >This is one problem, but there are more and I explained them in great >length. If there is anything unclear, then please ask. Right, this patch does not consider that work may be queued to the worker after its bound CPU has gone offline; in addition, when the bound CPU comes online again, work may be queued before the worker's CPU affinity is restored. In the powerclamp driver this is not a problem, because it is solved by destroying and re-creating the tasks on CPU hotplug; but when a CPU goes down, that approach needs to call cancel_work_sync() in the offline callback, which may block for a long time, and these operations are expensive. This patch only restores the affinity of worker tasks created by kthread_create_worker_on_cpu() when their CPU comes online again, like the per-CPU worker handling on CPU hotplug in workqueue and io-wq. Thanks Qiang > >Thanks, > >tglx
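The state machine under discussion can be sketched in a toy userspace model (illustrative names, not kernel code): a worker bound to a CPU loses its binding when that CPU goes offline, and a CPU-hotplug online callback later re-binds it. Work queued inside that window may still run on the wrong CPU, which is exactly the race Thomas is pointing at:

```c
#include <assert.h>

#define ANY_CPU (-1)	/* affinity widened: may run anywhere */

struct worker_model {
	int bind_cpu;	/* CPU the worker was created for */
	int affinity;	/* where it is allowed to run now */
};

/* When the bound CPU goes offline, the kernel lets the task run
 * anywhere rather than leaving it unrunnable. */
static void cpu_offline(struct worker_model *w, int cpu)
{
	if (w->affinity == cpu)
		w->affinity = ANY_CPU;
}

/* cpuhp online-callback analogue: restore the original binding. */
static void cpu_online_cb(struct worker_model *w, int cpu)
{
	if (w->bind_cpu == cpu)
		w->affinity = cpu;
}

/* Returns the CPU a work item queued right now would execute on. */
static int run_work(struct worker_model *w)
{
	return w->affinity;
}
```

Between cpu_offline() and cpu_online_cb(), run_work() reports ANY_CPU — the patch closes the steady state but not that window, which is why queuing has to be fenced (or the worker torn down) rather than merely re-binding it after the fact.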
Re: [PATCH v2] kthread_worker: re-set CPU affinities if CPU come online
From: Thomas Gleixner Sent: October 28, 2020 16:30 To: Zhang, Qiang; pmla...@suse.com; t...@kernel.org Cc: a...@linux-foundation.org; linux...@kvack.org; linux-kernel@vger.kernel.org Subject: Re: [PATCH v2] kthread_worker: re-set CPU affinities if CPU come online [Please note this e-mail is from an EXTERNAL e-mail address] On Wed, Oct 28 2020 at 15:30, qiang zhang wrote: > From: Zqiang > > When a CPU is offlined, the 'kthread_worker' bound to that CPU > will run anywhere; when the CPU comes online again, recover the > 'kthread_worker' affinity via cpuhp notifiers. > > Signed-off-by: Zqiang > --- > v1->v2: > rename variable kworker_online to kthread_worker_online. > add 'cpuhp_node' and 'bind_cpu' init in KTHREAD_WORKER_INIT. > add a comment explaining for WARN_ON_ONCE. >How is that addressing any of the comments I made on V1 of this? Do you mean the following problem: "The dynamic hotplug states run late. What's preventing work to be queued on such a worker before it is bound to the CPU again?" Thanks Qiang > >Thanks, > > tglx
Re: [PATCH] io-wq: set task TASK_INTERRUPTIBLE state before schedule_timeout
From: Jens Axboe Sent: October 27, 2020 21:35 To: Zhang, Qiang Cc: io-ur...@vger.kernel.org; linux-kernel@vger.kernel.org Subject: Re: [PATCH] io-wq: set task TASK_INTERRUPTIBLE state before schedule_timeout On 10/26/20 9:09 PM, qiang.zh...@windriver.com wrote: > From: Zqiang > > In the 'io_wqe_worker' thread, once the work on 'wqe->work_list' is > finished and 'wqe->work_list' is empty, '__io_worker_idle' returns > false with the task state still TASK_RUNNING; the state needs to be > set to TASK_INTERRUPTIBLE before calling schedule_timeout(). > >I don't think that's safe - what if someone added work right before you >call schedule_timeout_interruptible? Something ala: > > >io_wq_enqueue() >set_current_state(TASK_INTERRUPTIBLE); >schedule_timeout(WORKER_IDLE_TIMEOUT); > >then we'll have work added and the task state set to running, but the >worker itself just sets us to non-running and will hence wait >WORKER_IDLE_TIMEOUT before the work is processed. > >The current situation will do one extra loop for this case, as the >schedule_timeout() just ends up being a nop and we go around again Although the worker task state is running, the call to schedule_timeout() means the current worker can still be switched out. If the current worker task is set to non-running and then switched out, schedule() will call io_wq_worker_sleeping() to wake up a free worker task if wqe->free_list is not empty. >checking for work. Since we already unused the mm, the next iteration >will go to sleep properly unless new work came in. > >-- >Jens Axboe
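The race Jens describes is the classic lost-wakeup pattern, and it can be sketched as a single-threaded toy model (illustrative names, not kernel or io-wq code): wake_up() from the enqueue side puts the task back to TASK_RUNNING, and schedule() only actually sleeps if the state is still TASK_INTERRUPTIBLE, so the sleeper must re-check its condition after setting the state:

```c
#include <assert.h>

enum { TASK_RUNNING, TASK_INTERRUPTIBLE };

static int task_state;
static int have_work;

/* Producer side: add work, then wake_up() the worker. */
static void io_wq_enqueue_model(void)
{
	have_work = 1;
	task_state = TASK_RUNNING;	/* wake_up(): cancels a pending sleep */
}

/* schedule() sleeps only if the state is still INTERRUPTIBLE. */
static int schedule_would_sleep(void)
{
	return task_state == TASK_INTERRUPTIBLE;
}

/* Lost wakeup: the enqueue lands first, then we blindly mark
 * ourselves sleeping without re-checking the work list. */
static int lost_wakeup_step(void)
{
	io_wq_enqueue_model();		/* work + wakeup arrive here */
	task_state = TASK_INTERRUPTIBLE;
	return schedule_would_sleep();	/* sleeps despite pending work */
}

/* Canonical pattern: set the state, then re-check the condition and
 * bail out of the sleep if work is already there. */
static int safe_step(void)
{
	io_wq_enqueue_model();		/* same racy arrival */
	task_state = TASK_INTERRUPTIBLE;
	if (have_work) {		/* re-check after setting state */
		task_state = TASK_RUNNING;
		return 0;		/* process the work, don't sleep */
	}
	return schedule_would_sleep();
}
```

The buggy ordering sleeps for the full timeout with work pending; the safe ordering never does, which is why a bare set_current_state() before schedule_timeout() without the condition re-check is not enough.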
Re: Question on io-wq
From: Zhang, Qiang Sent: October 23, 2020 11:55 To: Jens Axboe Cc: v...@zeniv.linux.org.uk; io-ur...@vger.kernel.org; linux-kernel@vger.kernel.org; linux-fsde...@vger.kernel.org Subject: Re: Question on io-wq From: Jens Axboe Sent: October 22, 2020 22:08 To: Zhang, Qiang Cc: v...@zeniv.linux.org.uk; io-ur...@vger.kernel.org; linux-kernel@vger.kernel.org; linux-fsde...@vger.kernel.org Subject: Re: Question on io-wq On 10/22/20 3:02 AM, Zhang,Qiang wrote: > > Hi Jens Axboe > > There is a problem in the 'io_wqe_worker' thread: the 'io_wqe_worker' > is created with its affinity set to the CPUs of a NUMA node, but due > to CPU hotplug, when the last CPU goes down the 'io_wqe_worker' thread > will run anywhere. When a CPU in the node goes online again, should we > restore their cpu bindings? >Something like the below should help in ensuring affinities are >always correct - trigger an affinity set for an online CPU event. We >should not need to do it for offlining. Can you test it? >diff --git a/fs/io-wq.c b/fs/io-wq.c >index 4012ff541b7b..3bf029d1170e 100644 >--- a/fs/io-wq.c >+++ b/fs/io-wq.c >@@ -19,6 +19,7 @@ >#include >#include >#include >+#include >#include "io-wq.h" > >@@ -123,9 +124,13 @@ struct io_wq { > refcount_t refs; > struct completion done; > >+ struct hlist_node cpuhp_node; >+ > refcount_t use_refs; >}; > >+static enum cpuhp_state io_wq_online; >+ >static bool io_worker_get(struct io_worker *worker) >{ > return refcount_inc_not_zero(&worker->ref); >@@ -1096,6 +1101,13 @@ struct io_wq *io_wq_create(unsigned bounded, >struct >io_wq_data *data) > return ERR_PTR(-ENOMEM); > } > >+ ret = cpuhp_state_add_instance_nocalls(io_wq_online, >&wq->cpuhp_node); >+ if (ret) { >+ kfree(wq->wqes); >+ kfree(wq); >+ return ERR_PTR(ret); >+ } >+ >wq->free_work = data->free_work; >wq->do_work = data->do_work; > >@@ -1145,6 +1157,7 @@ struct io_wq *io_wq_create(unsigned bounded, >struct >io_wq_data *data) > ret = PTR_ERR(wq->manager); > complete(&wq->done); >err: >+
cpuhp_state_remove_instance_nocalls(io_wq_online, >&wq->cpuhp_node); > for_each_node(node) > kfree(wq->wqes[node]); > kfree(wq->wqes); >@@ -1164,6 +1177,8 @@ static void __io_wq_destroy(struct io_wq *wq) >{ > int node; > >+ cpuhp_state_remove_instance_nocalls(io_wq_online, >&wq->cpuhp_node); >+ > set_bit(IO_WQ_BIT_EXIT, &wq->state); > if (wq->manager) > kthread_stop(wq->manager); >@@ -1191,3 +1206,40 @@ struct task_struct *io_wq_get_task(struct io_wq >*wq) >{ > return wq->manager; >} >+ >+static bool io_wq_worker_affinity(struct io_worker *worker, void *data) >+{ >+ struct task_struct *task = worker->task; >+ unsigned long flags; >+ struct rq_flags rf; struct rq *rq; rq = task_rq_lock(task, &rf); --- raw_spin_lock_irqsave(&task->pi_lock, flags); >+ do_set_cpus_allowed(task, cpumask_of_node(worker->wqe->node)); >+ task->flags |= PF_NO_SETAFFINITY; --- raw_spin_unlock_irqrestore(&task->pi_lock, flags); task_rq_unlock(rq, task, &rf); >+ return false; >+} >+ >+static int io_wq_cpu_online(unsigned int cpu, struct hlist_node *node) >+{ >+ struct io_wq *wq = hlist_entry_safe(node, struct io_wq, cpuhp_node); >+ int i; >+ >+ rcu_read_lock(); >+ for_each_node(i) >+ io_wq_for_each_worker(wq->wqes[i], io_wq_worker_affinity, >NULL); >+ rcu_read_unlock(); >+ return 0; >+} >+ >+static __init int io_wq_init(void) >+{ >+ int ret; >+ >+ ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN, >"io-wq/online", >+ io_wq_cpu_online, NULL); >+ if (ret < 0) >+ return ret; >+ io_wq_online = ret; >+ return 0; >+} >+subsys_initcall(io_wq_init); > >-- >Jens Axboe
Re: Question on io-wq
From: Jens Axboe Sent: October 22, 2020 22:08 To: Zhang, Qiang Cc: v...@zeniv.linux.org.uk; io-ur...@vger.kernel.org; linux-kernel@vger.kernel.org; linux-fsde...@vger.kernel.org Subject: Re: Question on io-wq On 10/22/20 3:02 AM, Zhang,Qiang wrote: > > Hi Jens Axboe > > There is a problem in the 'io_wqe_worker' thread: the 'io_wqe_worker' > is created with its affinity set to the CPUs of a NUMA node, but due > to CPU hotplug, when the last CPU goes down the 'io_wqe_worker' thread > will run anywhere. When a CPU in the node goes online again, should we > restore their cpu bindings? >Something like the below should help in ensuring affinities are >always correct - trigger an affinity set for an online CPU event. We >should not need to do it for offlining. Can you test it? >diff --git a/fs/io-wq.c b/fs/io-wq.c >index 4012ff541b7b..3bf029d1170e 100644 >--- a/fs/io-wq.c >+++ b/fs/io-wq.c >@@ -19,6 +19,7 @@ >#include >#include >#include >+#include >#include "io-wq.h" > >@@ -123,9 +124,13 @@ struct io_wq { > refcount_t refs; > struct completion done; > >+ struct hlist_node cpuhp_node; >+ > refcount_t use_refs; >}; > >+static enum cpuhp_state io_wq_online; >+ >static bool io_worker_get(struct io_worker *worker) >{ > return refcount_inc_not_zero(&worker->ref); >@@ -1096,6 +1101,13 @@ struct io_wq *io_wq_create(unsigned bounded, >struct >io_wq_data *data) > return ERR_PTR(-ENOMEM); > } > >+ ret = cpuhp_state_add_instance_nocalls(io_wq_online, >&wq->cpuhp_node); >+ if (ret) { >+ kfree(wq->wqes); >+ kfree(wq); >+ return ERR_PTR(ret); >+ } >+ >wq->free_work = data->free_work; >wq->do_work = data->do_work; > >@@ -1145,6 +1157,7 @@ struct io_wq *io_wq_create(unsigned bounded, >struct >io_wq_data *data) > ret = PTR_ERR(wq->manager); > complete(&wq->done); >err: >+ cpuhp_state_remove_instance_nocalls(io_wq_online, >&wq->cpuhp_node); > for_each_node(node) > kfree(wq->wqes[node]); > kfree(wq->wqes); >@@ -1164,6 +1177,8 @@ static void __io_wq_destroy(struct io_wq *wq) >{ > int node; > >+
cpuhp_state_remove_instance_nocalls(io_wq_online, >&wq->cpuhp_node); >+ > set_bit(IO_WQ_BIT_EXIT, &wq->state); > if (wq->manager) > kthread_stop(wq->manager); >@@ -1191,3 +1206,40 @@ struct task_struct *io_wq_get_task(struct io_wq >*wq) >{ > return wq->manager; >} >+ >+static bool io_wq_worker_affinity(struct io_worker *worker, void *data) >+{ >+ struct task_struct *task = worker->task; >+ unsigned long flags; >+ struct rq_flags rf; >+ raw_spin_lock_irqsave(&task->pi_lock, flags); >+ do_set_cpus_allowed(task, cpumask_of_node(worker->wqe->node)); >+ task->flags |= PF_NO_SETAFFINITY; >+ raw_spin_unlock_irqrestore(&task->pi_lock, flags); >+ return false; >+} >+ >+static int io_wq_cpu_online(unsigned int cpu, struct hlist_node *node) >+{ >+ struct io_wq *wq = hlist_entry_safe(node, struct io_wq, cpuhp_node); >+ int i; >+ >+ rcu_read_lock(); >+ for_each_node(i) >+ io_wq_for_each_worker(wq->wqes[i], io_wq_worker_affinity, >NULL); >+ rcu_read_unlock(); >+ return 0; >+} >+ >+static __init int io_wq_init(void) >+{ >+ int ret; >+ >+ ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN, >"io-wq/online", >+ io_wq_cpu_online, NULL); >+ if (ret < 0) >+ return ret; >+ io_wq_online = ret; >+ return 0; >+} >+subsys_initcall(io_wq_init); > >-- >Jens Axboe
Question on io-wq
Hi Jens Axboe There is a problem in the 'io_wqe_worker' thread: the 'io_wqe_worker' is created with its affinity set to the CPUs of a NUMA node, but due to CPU hotplug, when the last CPU goes down the 'io_wqe_worker' thread will run anywhere. When a CPU in the node goes online again, should we restore their cpu bindings? Thanks Qiang
Re: Re: [PATCH] btrfs: Fix missing close devices
From: Johannes Thumshirn Sent: September 21, 2020 17:17 To: Zhang, Qiang; c...@fb.com; jo...@toxicpanda.com; dste...@suse.com; syzbot+582e66e5edf36a22c...@syzkaller.appspotmail.com Cc: linux-bt...@vger.kernel.org; linux-kernel@vger.kernel.org Subject: Re: Re: [PATCH] btrfs: Fix missing close devices On 21/09/2020 11:14, Zhang, Qiang wrote: > > > > From: Johannes Thumshirn > Sent: September 21, 2020 16:52 > To: Zhang, Qiang; c...@fb.com; jo...@toxicpanda.com; dste...@suse.com > Cc: linux-bt...@vger.kernel.org; linux-kernel@vger.kernel.org > Subject: Re: [PATCH] btrfs: Fix missing close devices > > On 21/09/2020 10:27, qiang.zh...@windriver.com wrote: >> From: Zqiang >> >> When btrfs fill super errors out, we should first close the devices >> and then call deactivate_locked_super() to free fs_info. >> >> Signed-off-by: Zqiang >> --- >> fs/btrfs/super.c | 1 + >> 1 file changed, 1 insertion(+) >> >> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c >> index 8840a4fa81eb..3bfd54e8f388 100644 >> --- a/fs/btrfs/super.c >> +++ b/fs/btrfs/super.c >> @@ -1675,6 +1675,7 @@ static struct dentry *btrfs_mount_root(struct >> file_system_type *fs_type, >> error = security_sb_set_mnt_opts(s, new_sec_opts, 0, NULL); >> security_free_mnt_opts(&new_sec_opts); >> if (error) { >> + btrfs_close_devices(fs_devices); >> deactivate_locked_super(s); >> return ERR_PTR(error); >> } >> > >> I think this is the fix for the syzkaller issue: >> Reported-by: syzbot+582e66e5edf36a22c...@syzkaller.appspotmail.com > > Please try this patch.
> >Nope, with this patch I get the following Null-ptr-deref: >[ 39.065209] >>== >[ 39.066318] BUG: KASAN: null-ptr-deref in bdev_name.constprop.0+0xd4/0x240 >[ 39.067307] Read of size 4 at addr 03ac by task syz-repro/273 >[ 39.068289] >[ 39.069602] >>== >[ 39.070837] BUG: kernel NULL pointer dereference, address: 03ac >[ 39.071837] #PF: supervisor read access in kernel mode >[ 39.072580] #PF: error_code(0x) - not-present page >[ 39.073318] PGD 8001cd3b1067 P4D 8001cd3b1067 PUD 1c6de7067 PMD >0 >[ 39.074306] Oops: [#1] SMP KASAN PTI >[ 39.074887] CPU: 0 PID: 273 Comm: syz-repro Tainted: GB >5.9.0-rc5+ >#772 >[ 39.076031] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS >>rel-1.13.0-0-gf21b5a4-rebuilt.opensuse.org 04/01/2014 >[ 39.077638] RIP: 0010:bdev_name.constprop.0+0xd4/0x240 >[ 39.078387] Code: ca 4c 89 4c 24 08 e8 0b e9 ff ff 48 89 df 49 89 c6 e8 40 >42 c6 ff 49 >8b ac 24 e0 00 00 00 48 8d bd ac 03 00 00 e8 2c 41 c6 ff <8b> 85 >ac 03 00 00 4c 8b 4c >24 08 85 c0 0f 84 fe 00 00 00 4c 89 cf >[ 39.080991] RSP: 0018:8881f1a97878 EFLAGS: 00010286 >[ 39.081728] RAX: 0001 RBX: 8881c9fb80e0 RCX: >>dc00 >[ 39.082725] RDX: 0007 RSI: 0004 RDI: >81acd784 >[ 39.083717] RBP: R08: R09: >>> >[ 39.084722] R10: fbfff0539591 R11: 0001 R12: >8881c9fb8000 >[ 39.085711] R13: 8881ef6e2698 R14: 8881ef6e2680 R15: > >[ 39.086704] FS: 7f5d36eb9540() GS:8881f760() >>knlGS: >[ 39.087827] CS: 0010 DS: ES: CR0: 80050033 >[ 39.088623] CR2: 03ac CR3: 0001ef552000 CR4: >06b0 >[ 39.089607] DR0: DR1: DR2: > >[ 39.090603] DR3: DR6: fffe0ff0 DR7: >0400 >[ 39.091583] Call Trace: >[ 39.091943] ? mac_address_string+0x380/0x380 >[ 39.092559] ? mark_held_locks+0x65/0x90 >[ 39.093116] pointer+0x21c/0x650 >[ 39.093578] ? format_decode+0x1cf/0x4e0 >[ 39.094139] ? resource_string.isra.0+0xc10/0xc10 >[ 39.094809] vsnprintf+0x2e0/0x820 >[ 39.095292] ? pointer+0x650/0x650 >[ 39.095785] snprintf+0x88/0xa0 >[ 39.096234] ? vsprintf+0x10/0x10 >[ 39.096708] ? 
rcu_read_lock_sched_held+0x3a/0x70 >[ 39.097378] ? sget+0x200/0x240 >[ 39.097908] ? btrfs_kill_super+0x30/0x30 [btrfs] >[ 39.098644] btrfs_mount_root+0x442/0x5d0 [btrfs] >[ 39.099377] ? parse_rescue_options+0x150/0x150 [btrfs] >[ 39.100103] ? rcu_read_lock_sched_held+0x3a/0x70 >[ 39.100759] ? vfs_parse_fs_string+0xbc/0xf0 >[ 39.10
Re: [PATCH] btrfs: Fix missing close devices
From: Johannes Thumshirn Sent: 2020-09-21 16:52 To: Zhang, Qiang; c...@fb.com; jo...@toxicpanda.com; dste...@suse.com Cc: linux-bt...@vger.kernel.org; linux-kernel@vger.kernel.org Subject: Re: [PATCH] btrfs: Fix missing close devices On 21/09/2020 10:27, qiang.zh...@windriver.com wrote: > From: Zqiang > > When btrfs_fill_super() fails, we should first close the devices and > then call deactivate_locked_super() to free fs_info. > > Signed-off-by: Zqiang > --- > fs/btrfs/super.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c > index 8840a4fa81eb..3bfd54e8f388 100644 > --- a/fs/btrfs/super.c > +++ b/fs/btrfs/super.c > @@ -1675,6 +1675,7 @@ static struct dentry *btrfs_mount_root(struct > file_system_type *fs_type, > error = security_sb_set_mnt_opts(s, new_sec_opts, 0, NULL); > security_free_mnt_opts(&new_sec_opts); > if (error) { > + btrfs_close_devices(fs_devices); > deactivate_locked_super(s); > return ERR_PTR(error); > } > >I think this is the fix for the syzkaller issue: >Reported-by: syzbot+582e66e5edf36a22c...@syzkaller.appspotmail.com Please try this patch.
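The ordering bug the patch fixes can be modeled outside the kernel (all names below are illustrative stand-ins, not btrfs APIs): once device state has been set up, every error path taken afterwards must tear it down before dropping the superblock, otherwise the devices stay open and later mounts trip over them.

```c
#include <stdbool.h>
#include <assert.h>

/* Toy model of the mount error path: track what is open/active. */
struct mount_state {
	bool devices_open;
	bool sb_active;
};

static int mount_root(struct mount_state *st, bool security_fails)
{
	st->devices_open = true;	/* device-open step */
	st->sb_active = true;		/* superblock setup step */
	if (security_fails) {
		st->devices_open = false; /* the added btrfs_close_devices() */
		st->sb_active = false;	  /* deactivate_locked_super() */
		return -1;
	}
	return 0;
}
```

Without the `devices_open = false` line the failed mount would leak the open device state, which is the miniature version of the missing btrfs_close_devices() call.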
Re: Re: RCU: Question on force_qs_rnp
On 9/16/20 2:06 AM, Paul E. McKenney wrote: On Tue, Sep 15, 2020 at 01:16:39PM +0800, Zhang, Qiang wrote: On 9/15/20 11:41 AM, Paul E. McKenney wrote: On Tue, Sep 15, 2020 at 03:18:23AM +0000, Zhang, Qiang wrote: From: Paul E. McKenney Sent: 2020-09-15 4:56 To: Joel Fernandes Cc: Zhang, Qiang; Uladzislau Rezki; j...@joshtriplett.org; rost...@goodmis.org; mathieu.desnoy...@efficios.com; Lai Jiangshan; r...@vger.kernel.org; LKML Subject: Re: RCU: Question on force_qs_rnp On Mon, Sep 14, 2020 at 03:42:08PM -0400, Joel Fernandes wrote: On Mon, Sep 14, 2020 at 07:55:18AM +0000, Zhang, Qiang wrote: Hello Paul I have a question for you. In the force_qs_rnp() func, if the "f(rdp)" func returns true we will call rcu_report_qs_rnp() to report a quiescent state for this rnp node, and clear the grpmask from rnp->qsmask. After that, can we make a check for this rnp->qsmask: if rnp->qsmask == 0, we check for blocked readers in this rnp node, instead of jumping directly to the next node? Could you clarify what good is this going to do? What problem are you trying to address? You could have a task that is blocked in an RCU leaf node, but the force_qs_rnp() decided to call rcu_report_qs_rnp(). This is perfectly Ok. The CPU could be dyntick-idle and a quiescent state is reported. However, the GP must not end and the rcu leaf node should still be present in its parent intermediate nodes ->qsmask. In this case, the ->qsmask == 0 does not have any relevance. Or am I missing the point of the question? Hello, Qiang, Another way of making Joel's point is to say that the additional check you are asking for is already being done, but by rcu_report_qs_rnp(). Thanx, Paul Hello Paul, Joel What I want to express is as follows: diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 7623128d0020..beb554539f01 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -2622,6 +2622,11 @@ static void force_qs_rnp(int (*f)(struct rcu_data *rdp)) if (mask != 0) { /* Idle/offline CPUs, report (releases rnp->lock). */ rcu_report_qs_rnp(mask, rnp, rnp->gp_seq, flags); + raw_spin_lock_irqsave_rcu_node(rnp, flags); + if (rnp->qsmask == 0 && rcu_preempt_blocked_readers_cgp(rnp)) + rcu_initiate_boost(rnp, flags); + else + raw_spin_unlock_irqrestore_rcu_node(rnp, flags); } else { /* Nothing to do here, so just drop the lock. */ raw_spin_unlock_irqrestore_rcu_node(rnp, flags); But in that case, why duplicate the code from rcu_initiate_boost()? Thanx, Paul Hello Paul When we force a qs for an rnp, we first check the leaf node's "rnp->qsmask"; if it has reached zero, we check whether there are blocked readers in this leaf rnp node, and if so we need to priority-boost the blocked readers. If not, we check for cpu dyntick-idle and report the leaf node qs. After this leaf rnp node reports a qs, there may still be some blocked readers in this node; should we also priority-boost those blocked readers? Yes, but we will do that on the next time around, a few milliseconds later. And by that time, it is quite possible that the reader will have completed, which will save us from having to priority-boost it. Thanx, Paul Thanks Paul, I see.
Re: Re: [PATCH v3] debugobjects: install CPU hotplug callback
From: Waiman Long Sent: 2020-09-10 10:50 To: Zhang, Qiang; t...@linutronix.de; mi...@kernel.org; el...@google.com Cc: linux-kernel@vger.kernel.org Subject: Re: Re: [PATCH v3] debugobjects: install CPU hotplug callback On 9/9/20 9:48 PM, Zhang, Qiang wrote: > > > From: Waiman Long > Sent: 2020-09-09 2:23 > To: Zhang, Qiang; t...@linutronix.de; mi...@kernel.org; el...@google.com > Cc: linux-kernel@vger.kernel.org > Subject: Re: [PATCH v3] debugobjects: install CPU hotplug callback > > On 9/8/20 2:27 AM, qiang.zh...@windriver.com wrote: >> From: Zqiang >> >> Due to CPU hotplug, a CPU may never come online again after it goes >> offline, so some objects in its percpu pool are never freed. In order >> to avoid this, install a CPU hotplug callback, and call this callback >> func to free the objects in the percpu pool when the CPU goes offline. >> >> Signed-off-by: Zqiang >> --- >>v1->v2: >>Modify submission information. >> >>v2->v3: >>In the CPU hotplug callback func, add a clear of the percpu pool "obj_free" counter. >>Capitalize 'CPU', and use a shorter preprocessor sequence. 
>> >>include/linux/cpuhotplug.h | 1 + >>lib/debugobjects.c | 24 >>2 files changed, 25 insertions(+) >> >> diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h >> index 3215023d4852..0c39d57e5342 100644 >> --- a/include/linux/cpuhotplug.h >> +++ b/include/linux/cpuhotplug.h >> @@ -36,6 +36,7 @@ enum cpuhp_state { >>CPUHP_X86_MCE_DEAD, >>CPUHP_VIRT_NET_DEAD, >>CPUHP_SLUB_DEAD, >> + CPUHP_DEBUG_OBJ_DEAD, >>CPUHP_MM_WRITEBACK_DEAD, >>CPUHP_MM_VMSTAT_DEAD, >>CPUHP_SOFTIRQ_DEAD, >> diff --git a/lib/debugobjects.c b/lib/debugobjects.c >> index fe4557955d97..bb69a02c3e7b 100644 >> --- a/lib/debugobjects.c >> +++ b/lib/debugobjects.c >> @@ -19,6 +19,7 @@ >>#include >>#include >>#include >> +#include >> >>#define ODEBUG_HASH_BITS 14 >>#define ODEBUG_HASH_SIZE (1 << ODEBUG_HASH_BITS) >> @@ -433,6 +434,24 @@ static void free_object(struct debug_obj *obj) >>} >>} >> >> +#ifdef CONFIG_HOTPLUG_CPU >> +static int object_cpu_offline(unsigned int cpu) >> +{ >> + struct debug_percpu_free *percpu_pool; >> + struct hlist_node *tmp; >> + struct debug_obj *obj; >> + >> + percpu_pool = per_cpu_ptr(&percpu_obj_pool, cpu); >> + hlist_for_each_entry_safe(obj, tmp, &percpu_pool->free_objs, node) { >> + hlist_del(&obj->node); >> + kmem_cache_free(obj_cache, obj); >> + } >> + percpu_pool->obj_free = 0; >>> For pointer, it is better to use NULL for clarity. >>> Cheers, >>> Longman > Do you mean the "->obj_free" variable? This represents the number of free > objects in the percpu_pool. > >>You are right. I got confused. Sorry for the noise. >>Cheers, >>Longman Hello tglx, mingo Is this patch acceptable? Thanks Qiang
Re: Re: RCU: Question on force_qs_rnp
On 9/15/20 11:41 AM, Paul E. McKenney wrote: On Tue, Sep 15, 2020 at 03:18:23AM +0000, Zhang, Qiang wrote: From: Paul E. McKenney Sent: 2020-09-15 4:56 To: Joel Fernandes Cc: Zhang, Qiang; Uladzislau Rezki; j...@joshtriplett.org; rost...@goodmis.org; mathieu.desnoy...@efficios.com; Lai Jiangshan; r...@vger.kernel.org; LKML Subject: Re: RCU: Question on force_qs_rnp On Mon, Sep 14, 2020 at 03:42:08PM -0400, Joel Fernandes wrote: On Mon, Sep 14, 2020 at 07:55:18AM +0000, Zhang, Qiang wrote: Hello Paul I have a question for you. In the force_qs_rnp() func, if the "f(rdp)" func returns true we will call rcu_report_qs_rnp() to report a quiescent state for this rnp node, and clear the grpmask from rnp->qsmask. After that, can we make a check for this rnp->qsmask: if rnp->qsmask == 0, we check for blocked readers in this rnp node, instead of jumping directly to the next node? Could you clarify what good is this going to do? What problem are you trying to address? You could have a task that is blocked in an RCU leaf node, but the force_qs_rnp() decided to call rcu_report_qs_rnp(). This is perfectly Ok. The CPU could be dyntick-idle and a quiescent state is reported. However, the GP must not end and the rcu leaf node should still be present in its parent intermediate nodes ->qsmask. In this case, the ->qsmask == 0 does not have any relevance. Or am I missing the point of the question? Hello, Qiang, Another way of making Joel's point is to say that the additional check you are asking for is already being done, but by rcu_report_qs_rnp(). Thanx, Paul Hello Paul, Joel What I want to express is as follows: diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 7623128d0020..beb554539f01 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -2622,6 +2622,11 @@ static void force_qs_rnp(int (*f)(struct rcu_data *rdp)) if (mask != 0) { /* Idle/offline CPUs, report (releases rnp->lock). */ rcu_report_qs_rnp(mask, rnp, rnp->gp_seq, flags); + raw_spin_lock_irqsave_rcu_node(rnp, flags); + if (rnp->qsmask == 0 && rcu_preempt_blocked_readers_cgp(rnp)) + rcu_initiate_boost(rnp, flags); + else + raw_spin_unlock_irqrestore_rcu_node(rnp, flags); } else { /* Nothing to do here, so just drop the lock. */ raw_spin_unlock_irqrestore_rcu_node(rnp, flags); But in that case, why duplicate the code from rcu_initiate_boost()? Thanx, Paul Hello Paul When we force a qs for an rnp, we first check the leaf node's "rnp->qsmask"; if it has reached zero, we check whether there are blocked readers in this leaf rnp node, and if so we need to priority-boost the blocked readers. If not, we check for cpu dyntick-idle and report the leaf node qs. After this leaf rnp node reports a qs, there may still be some blocked readers in this node; should we also priority-boost those blocked readers? Thanks Qiang
Re: RCU: Question on force_qs_rnp
From: Paul E. McKenney Sent: 2020-09-15 4:56 To: Joel Fernandes Cc: Zhang, Qiang; Uladzislau Rezki; j...@joshtriplett.org; rost...@goodmis.org; mathieu.desnoy...@efficios.com; Lai Jiangshan; r...@vger.kernel.org; LKML Subject: Re: RCU: Question on force_qs_rnp On Mon, Sep 14, 2020 at 03:42:08PM -0400, Joel Fernandes wrote: > On Mon, Sep 14, 2020 at 07:55:18AM +0000, Zhang, Qiang wrote: > > Hello Paul > > > > I have a question for you. > > In the force_qs_rnp() func, if the "f(rdp)" func returns true we will call > > rcu_report_qs_rnp() to > > report a quiescent state for this rnp node, and clear the grpmask from > > rnp->qsmask. > > After that, can we make a check for this rnp->qsmask: if rnp->qsmask == > > 0, > > we check for blocked readers in this rnp node, instead of jumping > > directly to the next node? > > Could you clarify what good is this going to do? What problem are you trying > to > address? > > You could have a task that is blocked in an RCU leaf node, but the > force_qs_rnp() decided to call rcu_report_qs_rnp(). This is perfectly Ok. The > CPU could be dyntick-idle and a quiescent state is reported. However, the GP > must not end and the rcu leaf node should still be present in its parent > intermediate nodes ->qsmask. In this case, the ->qsmask == 0 does not have > any relevance. > > Or am I missing the point of the question? >Hello, Qiang, >Another way of making Joel's point is to say that the additional check >you are asking for is already being done, but by rcu_report_qs_rnp(). >Thanx, Paul Hello Paul, Joel What I want to express is as follows: diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 7623128d0020..beb554539f01 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -2622,6 +2622,11 @@ static void force_qs_rnp(int (*f)(struct rcu_data *rdp)) if (mask != 0) { /* Idle/offline CPUs, report (releases rnp->lock). 
*/ rcu_report_qs_rnp(mask, rnp, rnp->gp_seq, flags); + raw_spin_lock_irqsave_rcu_node(rnp, flags); + if (rnp->qsmask == 0 && rcu_preempt_blocked_readers_cgp(rnp)) + rcu_initiate_boost(rnp, flags); + else + raw_spin_unlock_irqrestore_rcu_node(rnp, flags); } else { /* Nothing to do here, so just drop the lock. */ raw_spin_unlock_irqrestore_rcu_node(rnp, flags); Thanks Qiang
RCU: Question on force_qs_rnp
Hello Paul, I have a question for you. In the force_qs_rnp() func, if the "f(rdp)" func returns true, we will call rcu_report_qs_rnp() to report a quiescent state for this rnp node, and clear the grpmask from rnp->qsmask. After that, can we make a check for this rnp->qsmask: if rnp->qsmask == 0, we check for blocked readers in this rnp node, instead of jumping directly to the next node? Thanks Qiang
Re: RCU: Question rcu_preempt_blocked_readers_cgp in rcu_gp_fqs_loop func
From: Paul E. McKenney Sent: 2020-09-09 19:22 To: Zhang, Qiang Cc: Joel Fernandes; Uladzislau Rezki; Josh Triplett; Steven Rostedt; Mathieu Desnoyers; Lai Jiangshan; rcu; LKML Subject: Re: RCU: Question rcu_preempt_blocked_readers_cgp in rcu_gp_fqs_loop func On Wed, Sep 09, 2020 at 07:03:39AM +0000, Zhang, Qiang wrote: > > When preempt RCU is configured and there are multiple levels of nodes, a task > preempted in an rcu read-side critical section is added to the > "rnp->blkd_tasks" link list, and "rnp->gp_tasks" may be assigned a value; these rnp are leaf > nodes in the RCU tree. > > But in the "rcu_gp_fqs_loop" func, we check for blocked readers on the root node. > > static void rcu_gp_fqs_loop(void) > { > . > struct rcu_node *rnp = rcu_get_root(); > . > if (!READ_ONCE(rnp->qsmask) && >!rcu_preempt_blocked_readers_cgp(rnp)) > --> rnp is root node > break; > > } > > The root node's blkd_tasks list never has tasks added to it, and "rnp->gp_tasks" is never > assigned a value, so this check is invalid. > Should we check the leaf nodes like this >There are two cases: >1. There is only a single rcu_node structure, which is both root > and leaf. In this case, the current check is required: Both > ->qsmask and the ->blkd_tasks list must be checked. Your >rcu_preempt_blocked_readers() would work in this case, but >the current code is a bit faster because it does not need >to acquire the ->lock nor does it need the loop overhead. >2. There are multiple levels. In this case, as you say, the root >rcu_node structure's ->blkd_tasks list will always be empty. >But also in this case, the root rcu_node structure's ->qsmask >cannot be zero until all the leaf rcu_node structures' ->qsmask >fields are zero and their ->blkd_tasks lists no longer have >tasks blocking the current grace period. This means that your > rcu_preempt_blocked_readers() function would never return > true in this case. >So the current code is fine. >Are you seeing failures on mainline kernels? If so, what is the failure >mode? 
Yes it's right, thank you for your explanation. thanks Qiang > Thanx, Paul > --- a/kernel/rcu/tree.c > +++ b/kernel/rcu/tree.c > @@ -1846,6 +1846,25 @@ static bool rcu_gp_init(void) > return true; > } > > +static bool rcu_preempt_blocked_readers(void) > +{ > + struct rcu_node *rnp; > + unsigned long flags; > + bool ret = false; > + > + rcu_for_each_leaf_node(rnp) { > + raw_spin_lock_irqsave_rcu_node(rnp, flags); > + if (rcu_preempt_blocked_readers_cgp(rnp)) { > + ret = true; > + raw_spin_unlock_irqrestore_rcu_node(rnp, flags); > + break; > + } > + raw_spin_unlock_irqrestore_rcu_node(rnp, flags); > + } > + > + return ret; > +} > + > /* > * Helper function for swait_event_idle_exclusive() wakeup at > force-quiescent-state > * time. > @@ -1864,7 +1883,7 @@ static bool rcu_gp_fqs_check_wake(int *gfp) > return true; > > // The current grace period has completed. > - if (!READ_ONCE(rnp->qsmask) && !rcu_preempt_blocked_readers_cgp(rnp)) > + if (!READ_ONCE(rnp->qsmask) && !rcu_preempt_blocked_readers()) > return true; > > return false; > @@ -1927,7 +1946,7 @@ static void rcu_gp_fqs_loop(void) > /* Locking provides needed memory barriers. */ > /* If grace period done, leave loop. */ > if (!READ_ONCE(rnp->qsmask) && > - !rcu_preempt_blocked_readers_cgp(rnp)) > + !rcu_preempt_blocked_readers()) > break; > /* If time for quiescent-state forcing, do it. */ > if (!time_after(rcu_state.jiffies_force_qs, jiffies) || > -- > > > thanks > Qiang
Re: [PATCH v3] debugobjects: install CPU hotplug callback
From: Waiman Long Sent: 2020-09-09 2:23 To: Zhang, Qiang; t...@linutronix.de; mi...@kernel.org; el...@google.com Cc: linux-kernel@vger.kernel.org Subject: Re: [PATCH v3] debugobjects: install CPU hotplug callback On 9/8/20 2:27 AM, qiang.zh...@windriver.com wrote: > From: Zqiang > > Due to CPU hotplug, a CPU may never come online again after it goes > offline, so some objects in its percpu pool are never freed. In order > to avoid this, install a CPU hotplug callback, and call this callback > func to free the objects in the percpu pool when the CPU goes offline. > > Signed-off-by: Zqiang > --- > v1->v2: > Modify submission information. > > v2->v3: > In the CPU hotplug callback func, add a clear of the percpu pool "obj_free" counter. > Capitalize 'CPU', and use a shorter preprocessor sequence. > > include/linux/cpuhotplug.h | 1 + > lib/debugobjects.c | 24 > 2 files changed, 25 insertions(+) > > diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h > index 3215023d4852..0c39d57e5342 100644 > --- a/include/linux/cpuhotplug.h > +++ b/include/linux/cpuhotplug.h > @@ -36,6 +36,7 @@ enum cpuhp_state { > CPUHP_X86_MCE_DEAD, > CPUHP_VIRT_NET_DEAD, > CPUHP_SLUB_DEAD, > + CPUHP_DEBUG_OBJ_DEAD, > CPUHP_MM_WRITEBACK_DEAD, > CPUHP_MM_VMSTAT_DEAD, > CPUHP_SOFTIRQ_DEAD, > diff --git a/lib/debugobjects.c b/lib/debugobjects.c > index fe4557955d97..bb69a02c3e7b 100644 > --- a/lib/debugobjects.c > +++ b/lib/debugobjects.c > @@ -19,6 +19,7 @@ > #include > #include > #include > +#include > > #define ODEBUG_HASH_BITS 14 > #define ODEBUG_HASH_SIZE (1 << ODEBUG_HASH_BITS) > @@ -433,6 +434,24 @@ static void free_object(struct debug_obj *obj) > } > } > > +#ifdef CONFIG_HOTPLUG_CPU > +static int object_cpu_offline(unsigned int cpu) > +{ > + struct debug_percpu_free *percpu_pool; > + struct hlist_node *tmp; > + struct debug_obj *obj; > + > + percpu_pool = per_cpu_ptr(&percpu_obj_pool, cpu); > + hlist_for_each_entry_safe(obj, tmp, &percpu_pool->free_objs, node) { > + hlist_del(&obj->node); > + kmem_cache_free(obj_cache, obj); > + } > + 
percpu_pool->obj_free = 0; >>For pointer, it is better to use NULL for clarity. >>Cheers, >>Longman Do you mean "->obj_free" variable ? this represents the number of free objects in percpu_pool . > + > + return 0; > +} > +#endif > + > /* >* We run out of memory. That means we probably have tons of objects >* allocated. > @@ -1367,6 +1386,11 @@ void __init debug_objects_mem_init(void) > } else > debug_objects_selftest(); > > +#ifdef CONFIG_HOTPLUG_CPU > + cpuhp_setup_state_nocalls(CPUHP_DEBUG_OBJ_DEAD, "object:offline", NULL, > + object_cpu_offline); > +#endif > + > /* >* Increase the thresholds for allocating and freeing objects >* according to the number of possible CPUs available in the system.
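A minimal userspace analog of the offline callback discussed above (plain malloc/free stand in for the kmem_cache, and the struct names are simplified): walk the per-CPU free list, free every node, and reset both the list head and the obj_free counter that v3 of the patch adds.

```c
#include <stdlib.h>
#include <stddef.h>
#include <assert.h>

struct obj {
	struct obj *next;
};

struct percpu_free_pool {
	struct obj *free_objs;	/* singly-linked free list */
	int obj_free;		/* number of objects on the list */
};

static void pool_add(struct percpu_free_pool *p)
{
	struct obj *o = malloc(sizeof(*o));

	o->next = p->free_objs;
	p->free_objs = o;
	p->obj_free++;
}

/* What object_cpu_offline() does for the dying CPU's pool. */
static int pool_cpu_offline(struct percpu_free_pool *p)
{
	struct obj *o = p->free_objs, *tmp;

	while (o) {
		tmp = o->next;
		free(o);	/* kmem_cache_free() in the kernel version */
		o = tmp;
	}
	p->free_objs = NULL;
	p->obj_free = 0;	/* the counter clear added in v3 */
	return 0;
}
```

In the kernel this is safe without locking because, as Thomas notes later in the thread, CPU hotplug is globally serialized, so nothing races with the offline CPU's pool.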
RCU: Question rcu_preempt_blocked_readers_cgp in rcu_gp_fqs_loop func
When preempt RCU is configured and there are multiple levels of nodes, a task preempted in an rcu read-side critical section is added to the "rnp->blkd_tasks" link list, and "rnp->gp_tasks" may be assigned a value; these rnp are leaf nodes in the RCU tree. But in the "rcu_gp_fqs_loop" func, we check for blocked readers on the root node. static void rcu_gp_fqs_loop(void) { . struct rcu_node *rnp = rcu_get_root(); . if (!READ_ONCE(rnp->qsmask) && !rcu_preempt_blocked_readers_cgp(rnp)) --> rnp is root node break; } The root node's blkd_tasks list never has tasks added to it, and "rnp->gp_tasks" is never assigned a value, so this check is invalid. Should we check the leaf nodes like this --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -1846,6 +1846,25 @@ static bool rcu_gp_init(void) return true; } +static bool rcu_preempt_blocked_readers(void) +{ + struct rcu_node *rnp; + unsigned long flags; + bool ret = false; + + rcu_for_each_leaf_node(rnp) { + raw_spin_lock_irqsave_rcu_node(rnp, flags); + if (rcu_preempt_blocked_readers_cgp(rnp)) { + ret = true; + raw_spin_unlock_irqrestore_rcu_node(rnp, flags); + break; + } + raw_spin_unlock_irqrestore_rcu_node(rnp, flags); + } + + return ret; +} + /* * Helper function for swait_event_idle_exclusive() wakeup at force-quiescent-state * time. @@ -1864,7 +1883,7 @@ static bool rcu_gp_fqs_check_wake(int *gfp) return true; // The current grace period has completed. - if (!READ_ONCE(rnp->qsmask) && !rcu_preempt_blocked_readers_cgp(rnp)) + if (!READ_ONCE(rnp->qsmask) && !rcu_preempt_blocked_readers()) return true; return false; @@ -1927,7 +1946,7 @@ static void rcu_gp_fqs_loop(void) /* Locking provides needed memory barriers. */ /* If grace period done, leave loop. */ if (!READ_ONCE(rnp->qsmask) && - !rcu_preempt_blocked_readers_cgp(rnp) + !rcu_preempt_blocked_readers()) break; /* If time for quiescent-state forcing, do it. */ if (!time_after(rcu_state.jiffies_force_qs, jiffies) || -- thanks Qiang
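The multi-level case Paul describes in his reply can be modeled in a few lines (a toy model, not kernel code): a leaf's bit in the parent's qsmask is cleared only once the leaf has no pending CPUs and no blocked readers, which is why root->qsmask == 0 already implies that no leaf is still holding the grace period.

```c
#include <assert.h>

/* Toy two-level model of rcu_node qsmask propagation: leaf i owns
 * bit (1 << i) in the root's qsmask. */
struct toy_node {
	unsigned int qsmask;	/* pending CPUs below this node */
	int blkd_readers;	/* readers blocked in this (leaf) node */
};

/* A leaf reports its quiescent states; the root bit clears only when
 * the leaf also has no blocked readers holding the grace period. */
static void leaf_report_qs(struct toy_node *root, struct toy_node *leaf,
			   unsigned int bit)
{
	leaf->qsmask = 0;
	if (!leaf->blkd_readers)
		root->qsmask &= ~bit;
}
```

A blocked reader in any leaf keeps that leaf's bit set in the root, so checking the root's qsmask alone is enough for the grace-period-done test.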
[no subject]
When preempt RCU is configured, if a task switch happens in an rcu read-side critical section, the current task is added to the "rnp->blkd_tasks" link list; these rnp are leaf nodes in the RCU tree. In the "rcu_gp_fqs_loop" func, static void rcu_gp_fqs_loop(void) { struct rcu_node *rnp = rcu_get_root(); }
Re: [PATCH v2] debugobjects: install cpu hotplug callback
tglx please review. Thanks Qiang From: linux-kernel-ow...@vger.kernel.org on behalf of qiang.zh...@windriver.com Sent: 2020-08-27 13:06 To: t...@linutronix.de; long...@redhat.com; el...@google.com Cc: linux-kernel@vger.kernel.org Subject: [PATCH v2] debugobjects: install cpu hotplug callback From: Zqiang Due to cpu hotplug, a cpu may never be online again after it goes offline, so some objects in the percpu pool are never freed. In order to avoid this, install a cpu hotplug callback, and call this callback func to free the objects in the percpu pool when the cpu goes offline. Signed-off-by: Zqiang --- v1->v2: Modify submission information. include/linux/cpuhotplug.h | 1 + lib/debugobjects.c | 23 +++ 2 files changed, 24 insertions(+) diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h index a2710e654b64..2e77db655cfa 100644 --- a/include/linux/cpuhotplug.h +++ b/include/linux/cpuhotplug.h @@ -36,6 +36,7 @@ enum cpuhp_state { CPUHP_X86_MCE_DEAD, CPUHP_VIRT_NET_DEAD, CPUHP_SLUB_DEAD, + CPUHP_DEBUG_OBJ_DEAD, CPUHP_MM_WRITEBACK_DEAD, CPUHP_MM_VMSTAT_DEAD, CPUHP_SOFTIRQ_DEAD, diff --git a/lib/debugobjects.c b/lib/debugobjects.c index fe4557955d97..50e21ed0519e 100644 --- a/lib/debugobjects.c +++ b/lib/debugobjects.c @@ -19,6 +19,7 @@ #include #include #include +#include #define ODEBUG_HASH_BITS 14 #define ODEBUG_HASH_SIZE (1 << ODEBUG_HASH_BITS) @@ -433,6 +434,23 @@ static void free_object(struct debug_obj *obj) } } +#if defined(CONFIG_HOTPLUG_CPU) +static int object_cpu_offline(unsigned int cpu) +{ + struct debug_percpu_free *percpu_pool; + struct hlist_node *tmp; + struct debug_obj *obj; + + percpu_pool = per_cpu_ptr(&percpu_obj_pool, cpu); + hlist_for_each_entry_safe(obj, tmp, &percpu_pool->free_objs, node) { + hlist_del(&obj->node); + kmem_cache_free(obj_cache, obj); + } + + return 0; +} +#endif + /* * We run out of memory. That means we probably have tons of objects * allocated. 
@@ -1367,6 +1385,11 @@ void __init debug_objects_mem_init(void) } else debug_objects_selftest(); +#if defined(CONFIG_HOTPLUG_CPU) + cpuhp_setup_state_nocalls(CPUHP_DEBUG_OBJ_DEAD, "object:offline", NULL, + object_cpu_offline); +#endif + /* * Increase the thresholds for allocating and freeing objects * according to the number of possible CPUs available in the system. -- 2.17.1
Re: Re: [PATCH] debugobjects: install cpu hotplug callback
From: linux-kernel-ow...@vger.kernel.org on behalf of Thomas Gleixner Sent: 2020-08-26 7:53 To: Waiman Long; Zhang, Qiang; el...@google.com Cc: linux-kernel@vger.kernel.org; a...@linux-foundation.org Subject: Re: Re: [PATCH] debugobjects: install cpu hotplug callback On Tue, Aug 25 2020 at 18:26, Waiman Long wrote: > On 8/25/20 12:53 AM, Zhang, Qiang wrote: >> >> When a cpu goes offline, we should free the objects in the "percpu_obj_pool" >> free_objs list corresponding to that cpu. > > The percpu free object pool is supposed to be accessed only by that > particular cpu without any lock. Trying to access it from another cpu > can cause a race condition unless one can make sure that the offline cpu > won't become online in the mean time. >It is actually safe because CPU hotplug is globally serialized and there >is no way that an offline CPU will come back from death valley >magically. If such a zombie ever surfaces then we have surely more >serious problems than accessing that pool :) > There shouldn't be too many free objects in the percpu pool. Is it > worth the effort to free them? >That's a really good question nevertheless. The only case where this >ever matters is physical hotplug. All other CPU hotplug stuff is >temporarily or in case of a late (post boottime) SMT disable it's going >to be a handful of free objects on that pool. As debugobjects is as the >name says a debug facility the benefit is questionable unless there is a >good reason to do so. There may not be too many objects in the percpu pool, but that doesn't mean they don't need to be freed: a CPU may never come online again after it goes offline, so some objects in the percpu pool are never freed. >Thanks, > tglx
Re: [PATCH] debugobjects: install cpu hotplug callback
From: linux-kernel-ow...@vger.kernel.org on behalf of qiang.zh...@windriver.com Sent: 2020-08-20 11:24 To: t...@linutronix.de; el...@google.com; long...@redhat.com Cc: linux-kernel@vger.kernel.org Subject: [PATCH] debugobjects: install cpu hotplug callback From: Zqiang When a cpu goes offline, we should free the objects in the "percpu_obj_pool" free_objs list corresponding to that cpu. Signed-off-by: Zqiang --- include/linux/cpuhotplug.h | 1 + lib/debugobjects.c | 23 +++ 2 files changed, 24 insertions(+) diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h index a2710e654b64..2e77db655cfa 100644 --- a/include/linux/cpuhotplug.h +++ b/include/linux/cpuhotplug.h @@ -36,6 +36,7 @@ enum cpuhp_state { CPUHP_X86_MCE_DEAD, CPUHP_VIRT_NET_DEAD, CPUHP_SLUB_DEAD, + CPUHP_DEBUG_OBJ_DEAD, CPUHP_MM_WRITEBACK_DEAD, CPUHP_MM_VMSTAT_DEAD, CPUHP_SOFTIRQ_DEAD, diff --git a/lib/debugobjects.c b/lib/debugobjects.c index fe4557955d97..50e21ed0519e 100644 --- a/lib/debugobjects.c +++ b/lib/debugobjects.c @@ -19,6 +19,7 @@ #include #include #include +#include #define ODEBUG_HASH_BITS 14 #define ODEBUG_HASH_SIZE (1 << ODEBUG_HASH_BITS) @@ -433,6 +434,23 @@ static void free_object(struct debug_obj *obj) } } +#if defined(CONFIG_HOTPLUG_CPU) +static int object_cpu_offline(unsigned int cpu) +{ + struct debug_percpu_free *percpu_pool; + struct hlist_node *tmp; + struct debug_obj *obj; + + percpu_pool = per_cpu_ptr(&percpu_obj_pool, cpu); + hlist_for_each_entry_safe(obj, tmp, &percpu_pool->free_objs, node) { + hlist_del(&obj->node); + kmem_cache_free(obj_cache, obj); + } + + return 0; +} +#endif + /* * We run out of memory. That means we probably have tons of objects * allocated. 
@@ -1367,6 +1385,11 @@ void __init debug_objects_mem_init(void) } else debug_objects_selftest(); +#if defined(CONFIG_HOTPLUG_CPU) + cpuhp_setup_state_nocalls(CPUHP_DEBUG_OBJ_DEAD, "object:offline", NULL, + object_cpu_offline); +#endif + /* * Increase the thresholds for allocating and freeing objects * according to the number of possible CPUs available in the system. -- 2.17.1
Re: [PATCH v2] libnvdimm: KASAN: global-out-of-bounds Read in internal_create_group
cc: Dan Williams Please review. From: linux-kernel-ow...@vger.kernel.org on behalf of qiang.zh...@windriver.com Sent: 2020-08-12 16:55 To: dan.j.willi...@intel.com; vishal.l.ve...@intel.com; dave.ji...@intel.com; ira.we...@intel.com Cc: linux-nvd...@lists.01.org; linux-kernel@vger.kernel.org Subject: [PATCH v2] libnvdimm: KASAN: global-out-of-bounds Read in internal_create_group From: Zqiang Because the last member of the "nvdimm_firmware_attributes" array was not assigned a NULL pointer, traversal of the "grp->attrs" array in the "create_files" func runs out of bounds. func: create_files: ->for (i = 0, attr = grp->attrs; *attr && !error; i++, attr++) -> BUG: KASAN: global-out-of-bounds in create_files fs/sysfs/group.c:43 [inline] BUG: KASAN: global-out-of-bounds in internal_create_group+0x9d8/0xb20 fs/sysfs/group.c:149 Read of size 8 at addr 8a2e4cf0 by task kworker/u17:10/959 CPU: 2 PID: 959 Comm: kworker/u17:10 Not tainted 5.8.0-syzkaller #0 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 Workqueue: events_unbound async_run_entry_fn Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x18f/0x20d lib/dump_stack.c:118 print_address_description.constprop.0.cold+0x5/0x497 mm/kasan/report.c:383 __kasan_report mm/kasan/report.c:513 [inline] kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530 create_files fs/sysfs/group.c:43 [inline] internal_create_group+0x9d8/0xb20 fs/sysfs/group.c:149 internal_create_groups.part.0+0x90/0x140 fs/sysfs/group.c:189 internal_create_groups fs/sysfs/group.c:185 [inline] sysfs_create_groups+0x25/0x50 fs/sysfs/group.c:215 device_add_groups drivers/base/core.c:2024 [inline] device_add_attrs drivers/base/core.c:2178 [inline] device_add+0x7fd/0x1c40 drivers/base/core.c:2881 nd_async_device_register+0x12/0x80 drivers/nvdimm/bus.c:506 async_run_entry_fn+0x121/0x530 kernel/async.c:123 process_one_work+0x94c/0x1670 kernel/workqueue.c:2269 worker_thread+0x64c/0x1120 kernel/workqueue.c:2415 
kthread+0x3b5/0x4a0 kernel/kthread.c:292 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294 The buggy address belongs to the variable: nvdimm_firmware_attributes+0x10/0x40 Reported-by: syzbot+1cf0ffe61aecf46f5...@syzkaller.appspotmail.com Signed-off-by: Zqiang --- v1->v2: Modify the description of the error. drivers/nvdimm/dimm_devs.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/nvdimm/dimm_devs.c b/drivers/nvdimm/dimm_devs.c index 61374def5155..b59032e0859b 100644 --- a/drivers/nvdimm/dimm_devs.c +++ b/drivers/nvdimm/dimm_devs.c @@ -529,6 +529,7 @@ static DEVICE_ATTR_ADMIN_RW(activate); static struct attribute *nvdimm_firmware_attributes[] = { &dev_attr_activate.attr, &dev_attr_result.attr, + NULL, }; static umode_t nvdimm_firmware_visible(struct kobject *kobj, struct attribute *a, int n) -- 2.17.1
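The bug is easy to reproduce in miniature: sysfs walks an attribute array with a `for (attr = grp->attrs; *attr; ...)` loop, so the array must end with a NULL sentinel. A standalone sketch (simplified struct, not the real sysfs one):

```c
#include <assert.h>
#include <stddef.h>

struct attr {
	const char *name;
};

static struct attr activate = { "activate" };
static struct attr result   = { "result" };

/* With the NULL sentinel the walk terminates; without it, the loop
 * below reads past the end of the array, which is exactly what KASAN
 * reported for nvdimm_firmware_attributes. */
static struct attr *firmware_attrs[] = {
	&activate,
	&result,
	NULL,
};

static int count_attrs(struct attr **attrs)
{
	int n = 0;

	for (struct attr **a = attrs; *a; a++)
		n++;
	return n;
}
```

The one-line patch adds that sentinel, which is the convention every sysfs attribute array relies on.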
Re: [PATCH] rcu: shrink each possible cpu krcp
From: linux-kernel-ow...@vger.kernel.org on behalf of Joel Fernandes
Sent: August 19, 2020 8:04
To: Paul E. McKenney
Cc: Uladzislau Rezki; Zhang, Qiang; Josh Triplett; Steven Rostedt; Mathieu Desnoyers; Lai Jiangshan; rcu; LKML
Subject: Re: [PATCH] rcu: shrink each possible cpu krcp

On Tue, Aug 18, 2020 at 6:02 PM Paul E. McKenney wrote:
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index b8ccd7b5af82..6decb9ad2421 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -2336,10 +2336,15 @@ int rcutree_dead_cpu(unsigned int cpu)
> > {
> > 	struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
> > 	struct rcu_node *rnp = rdp->mynode; /* Outgoing CPU's rdp & rnp. */
> > +	struct kfree_rcu_cpu *krcp;
> >
> > 	if (!IS_ENABLED(CONFIG_HOTPLUG_CPU))
> > 		return 0;
> >
> > +	/* Drain the kcrp of this CPU. IRQs should be disabled? */
> > +	krcp = this_cpu_ptr(&krc);
> > +	schedule_delayed_work(&krcp->monitor_work, 0);
> > +
> >
> > A cpu can be offlined and its krp will be stuck until a shrinker is
> > involved. Maybe never.
>
> Does the same apply to its kmalloc() per-CPU caches? If so, I have a
> hard time getting too worried about it. ;-)

>Looking at slab_offline_cpu(), that calls cancel_delayed_work_sync()
>on the cache reaper whose job is to flush the per-cpu caches. So I
>believe during CPU offlining, the per-cpu slab caches are flushed.
>
>thanks,
>
>- Joel

When a CPU goes offline, slab/slub only flushes the free objects in the offline CPU's cache, putting them on the node list or returning them to the buddy system; objects that are still in use stay in the offline CPU's cache. If we want to clean up the per-CPU "krcp" objects when a CPU goes offline, we should free the "krcp" cached objects in "rcutree_offline_cpu", which is called before the other RCU CPU-offline functions; "rcutree_offline_cpu" runs in the "cpuhp/%u" per-CPU thread.
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 8ce77d9ac716..1812d4a1ac1b 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3959,6 +3959,7 @@ int rcutree_offline_cpu(unsigned int cpu)
 	unsigned long flags;
 	struct rcu_data *rdp;
 	struct rcu_node *rnp;
+	struct kfree_rcu_cpu *krcp;
 
 	rdp = per_cpu_ptr(&rcu_data, cpu);
 	rnp = rdp->mynode;
@@ -3970,6 +3971,11 @@ int rcutree_offline_cpu(unsigned int cpu)
 	// nohz_full CPUs need the tick for stop-machine to work quickly
 	tick_dep_set(TICK_DEP_BIT_RCU);
+
+	krcp = per_cpu_ptr(&krc, cpu);
+	raw_spin_lock_irqsave(&krcp->lock, flags);
+	schedule_delayed_work(&krcp->monitor_work, 0);
+	raw_spin_unlock_irqrestore(&krcp->lock, flags);
 	return 0;
 }

thanks,
Zqiang
Fwd: upstream test error: WARNING in do_epoll_wait
>From: linux-kernel-ow...@vger.kernel.org on behalf of syzbot
>Sent: August 5, 2020 15:19
>To: linux-fsde...@vger.kernel.org; linux-kernel@vger.kernel.org; syzkaller-b...@googlegroups.com; v...@zeniv.linux.org.uk
>Subject: upstream test error: WARNING in do_epoll_wait
>
>Hello,
>
>syzbot found the following issue on:
>
>HEAD commit: 4f30a60a Merge tag 'close-range-v5.9' of git://git.kernel...
>git tree: upstream
>console output: https://syzkaller.appspot.com/x/log.txt?x=14c5a7da90
>kernel config: https://syzkaller.appspot.com/x/.config?x=8bdd9944dedf0f16
>dashboard link: https://syzkaller.appspot.com/bug?extid=4429670d8213f5f26352
>compiler: clang version 11.0.0 (https://github.com/llvm/llvm-project.git ca2dcbd030eadbf0aa9b660efe864ff08af6e18b)
>
>IMPORTANT: if you fix the issue, please add the following tag to the commit:
>Reported-by: syzbot+4429670d8213f5f26...@syzkaller.appspotmail.com
>
>[ cut here ]
>WARNING: CPU: 1 PID: 8728 at fs/eventpoll.c:1828 ep_poll fs/eventpoll.c:1828 [inline]
>WARNING: CPU: 1 PID: 8728 at fs/eventpoll.c:1828 do_epoll_wait+0x337/0x920 fs/eventpoll.c:2333
>Kernel panic - not syncing: panic_on_warn set ...
>CPU: 1 PID: 8728 Comm: syz-fuzzer Not tainted 5.8.0-syzkaller #0
>Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
>Call Trace:
>__dump_stack lib/dump_stack.c:77 [inline]
>dump_stack+0x16e/0x25d lib/dump_stack.c:118
>panic+0x20c/0x69a kernel/panic.c:231
>__warn+0x211/0x240 kernel/panic.c:600
>report_bug+0x153/0x1d0 lib/bug.c:198
>handle_bug+0x4d/0x90 arch/x86/kernel/traps.c:235
>exc_invalid_op+0x16/0x70 arch/x86/kernel/traps.c:255
>asm_exc_invalid_op+0x12/0x20 arch/x86/include/asm/idtentry.h:547
>RIP: 0010:ep_poll fs/eventpoll.c:1828 [inline]
>RIP: 0010:do_epoll_wait+0x337/0x920 fs/eventpoll.c:2333
>Code: 41 be 01 00 00 00 31 c0 48 89 44 24 20 45 31 e4 e9 7f 01 00 00 e8 59 ab c6 ff 41 bc f2 ff ff ff e9 c8 03 00 00 e8 49 ab c6 ff <0f> 0b e9 58 fe ff ff 49 bf ff ff ff ff ff ff ff 7f e9 f0 fe ff ff
>RSP: 0018:c9e1fe28 EFLAGS: 00010293
>RAX: 81856297 RBX: 888120fafa00 RCX: 88811e196400
>RDX: RSI: RDI:
>RBP: R08: 818560d8 R09: 88619eb7
>R10: R11: R12: 7000
>R13: 0080 R14: 0001 R15: 0003
>__do_sys_epoll_pwait fs/eventpoll.c:2364 [inline]
>__se_sys_epoll_pwait fs/eventpoll.c:2350 [inline]
>__x64_sys_epoll_pwait+0x92/0x150 fs/eventpoll.c:2350
>do_syscall_64+0x6a/0xe0 arch/x86/entry/common.c:384
>entry_SYSCALL_64_after_hwframe+0x44/0xa9
>RIP: 0033:0x469240
>Code: 0f 05 89 44 24 20 c3 cc cc cc 8b 7c 24 08 48 8b 74 24 10 8b 54 24 18 44 8b 54 24 1c 49 c7 c0 00 00 00 00 b8 19 01 00 00 0f 05 <89> 44 24 20 c3 cc cc cc cc cc cc cc cc cc cc cc 8b 7c 24 08 48 c7
>RSP: 002b:00c4b7f0 EFLAGS: 0246 ORIG_RAX: 0119
>RAX: ffda RBX: 0001 RCX: 00469240
>RDX: 0080 RSI: 00c4b840 RDI: 0003
>RBP: 00c4be40 R08: R09:
>R10: 0001 R11: 0246 R12: 0003
>R13: 00c9cc00 R14: 00c00032c180 R15:
>Kernel Offset: disabled
>Rebooting in 86400 seconds..

In the "ep_poll" function, lockdep_assert_irqs_enabled() checks the lockdep-tracked interrupt state. Although interrupts are already enabled before entering "ep_poll", the lockdep irq-state update was missed.
--- a/kernel/entry/common.c
+++ b/kernel/entry/common.c
@@ -76,6 +76,7 @@ noinstr long syscall_enter_from_user_mode(struct pt_regs *regs, long syscall)
 	instrumentation_begin();
 	local_irq_enable();
+	lockdep_hardirqs_on(CALLER_ADDR0);
 	ti_work = READ_ONCE(current_thread_info()->flags);
 	if (ti_work & SYSCALL_ENTER_WORK)
 		syscall = syscall_trace_enter(regs, syscall, ti_work);
--

Should we add lockdep_hardirqs_on() here?

>---
>This report is generated by a bot. It may contain errors.
>See https://goo.gl/tpsmEJ for more information about syzbot.
>syzbot engineers can be reached at syzkal...@googlegroups.com.
>syzbot will keep track of this issue. See:
>https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
Re: Re: [PATCH] ALSA: seq: KASAN: use-after-free Read in delete_and_unsubscribe_port
From: Takashi Iwai
Sent: August 3, 2020 14:16
To: Zhang, Qiang
Cc: pe...@perex.cz; ti...@suse.com; alsa-de...@alsa-project.org; linux-kernel@vger.kernel.org
Subject: Re: Re: [PATCH] ALSA: seq: KASAN: use-after-free Read in delete_and_unsubscribe_port

On Mon, 03 Aug 2020 03:35:05 +0200, Zhang, Qiang wrote:
>
> >Thanks for the patch. But I'm afraid that this change would break the
> >existing behavior and might have a bad side-effect.
>
> >It's likely the same issue as reported in another syzkaller report
> >("KASAN: invalid-free in snd_seq_port_disconnect"), and Hillf's patch
> >below should cover this as well. Could you check whether it works?
>
> Yes, it should be the same issue. Adding the mutex lock in odev_ioctl
> ensures serialization; however, it should not be necessary to be
> mutually exclusive with open and close.

>>>That's a big-hammer approach indeed, but it should be more reasonable
>>>in this case. It makes the patch shorter and simpler, while the OSS
>>>sequencer is an ancient interface that wasn't considered much for
>>>concurrency, and this might also cover the case of access to another
>>>sequencer object that is about to be closed.

>>>So, it'd be great if you can confirm that the patch actually works.
>>>Then we can brush it up and merge it for 5.9-rc1.

Just like you said, this change is more reasonable. It makes the patch shorter and simpler.

>>>thanks,
>>>Takashi

> >---
> >--- a/sound/core/seq/oss/seq_oss.c
> >+++ b/sound/core/seq/oss/seq_oss.c
> >@@ -167,11 +167,17 @@ odev_write(struct file *file, const char
> > static long
> > odev_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
> > {
> >+	long rc;
> > 	struct seq_oss_devinfo *dp;
> >+
> >+	mutex_lock(&register_mutex);
> > 	dp = file->private_data;
> > 	if (snd_BUG_ON(!dp))
> >-		return -ENXIO;
> >-	return snd_seq_oss_ioctl(dp, cmd, arg);
> >+		rc = -ENXIO;
> >+	else
> >+		rc = snd_seq_oss_ioctl(dp, cmd, arg);
> >+	mutex_unlock(&register_mutex);
> >+	return rc;
> > }
>
> >#ifdef CONFIG_COMPAT
Re: [PATCH] ALSA: seq: KASAN: use-after-free Read in delete_and_unsubscribe_port
From: Takashi Iwai
Sent: August 1, 2020 17:39
To: Zhang, Qiang
Cc: pe...@perex.cz; ti...@suse.com; alsa-de...@alsa-project.org; linux-kernel@vger.kernel.org
Subject: Re: [PATCH] ALSA: seq: KASAN: use-after-free Read in delete_and_unsubscribe_port

On Sat, 01 Aug 2020 08:24:03 +0200, wrote:
>
> From: Zhang Qiang
>
> There is a potential race window: a task acquires the "src->list_mutex"
> write semaphore in snd_seq_port_disconnect(), traverses the linked list
> to find the "subs" object via the "info" parameter, and then releases
> the semaphore. Before this task acquires the write semaphore again, it
> may be acquired by another task, which finds the same "subs" object
> through the same "info"; a use-after-free can then happen later. A
> simple solution is to delete the object from the linked list as soon as
> it is found.
>
> BUG: KASAN: use-after-free in list_empty include/linux/list.h:282 [inline]
> BUG: KASAN: use-after-free in delete_and_unsubscribe_port+0x8b/0x450 sound/core/seq/seq_ports.c:530
> Read of size 8 at addr 888098523060 by task syz-executor.0/7202
>
> Call Trace:
> __dump_stack lib/dump_stack.c:77 [inline]
> dump_stack+0x1f0/0x31e lib/dump_stack.c:118
> print_address_description+0x66/0x5a0 mm/kasan/report.c:383
> __kasan_report mm/kasan/report.c:513 [inline]
> kasan_report+0x132/0x1d0 mm/kasan/report.c:530
> list_empty include/linux/list.h:282 [inline]
> delete_and_unsubscribe_port+0x8b/0x450 sound/core/seq/seq_ports.c:530
> snd_seq_port_disconnect+0x568/0x610 sound/core/seq/seq_ports.c:612
> snd_seq_ioctl_unsubscribe_port+0x349/0x6c0 sound/core/seq/seq_clientmgr.c:1525
> snd_seq_oss_midi_close+0x397/0x620 sound/core/seq/oss/seq_oss_midi.c:405
> snd_seq_oss_synth_reset+0x335/0x8b0 sound/core/seq/oss/seq_oss_synth.c:406
> snd_seq_oss_reset+0x5b/0x250 sound/core/seq/oss/seq_oss_init.c:435
> snd_seq_oss_ioctl+0x5c2/0x1090 sound/core/seq/oss/seq_oss_ioctl.c:93
> odev_ioctl+0x51/0x70 sound/core/seq/oss/seq_oss.c:174
> vfs_ioctl fs/ioctl.c:48 [inline]
> ksys_ioctl fs/ioctl.c:753 [inline]
> __do_sys_ioctl fs/ioctl.c:762 [inline]
> __se_sys_ioctl+0xf9/0x160 fs/ioctl.c:760
> do_syscall_64+0x73/0xe0 arch/x86/entry/common.c:384
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> Allocated by task 7202:
> save_stack mm/kasan/common.c:48 [inline]
> set_track mm/kasan/common.c:56 [inline]
> __kasan_kmalloc+0x103/0x140 mm/kasan/common.c:494
> kmem_cache_alloc_trace+0x234/0x300 mm/slab.c:3551
> kmalloc include/linux/slab.h:555 [inline]
> kzalloc include/linux/slab.h:669 [inline]
> snd_seq_port_connect+0x66/0x460 sound/core/seq/seq_ports.c:553
> snd_seq_ioctl_subscribe_port+0x349/0x6c0 sound/core/seq/seq_clientmgr.c:1484
> snd_seq_oss_midi_open+0x4db/0x830 sound/core/seq/oss/seq_oss_midi.c:364
> snd_seq_oss_synth_setup_midi+0x108/0x510 sound/core/seq/oss/seq_oss_synth.c:269
> snd_seq_oss_open+0x899/0xe90 sound/core/seq/oss/seq_oss_init.c:261
> odev_open+0x5e/0x90 sound/core/seq/oss/seq_oss.c:125
> chrdev_open+0x498/0x580 fs/char_dev.c:414
> do_dentry_open+0x813/0x1070 fs/open.c:828
> do_open fs/namei.c:3243 [inline]
> path_openat+0x278d/0x37f0 fs/namei.c:3360
> do_filp_open+0x191/0x3a0 fs/namei.c:3387
> do_sys_openat2+0x463/0x770 fs/open.c:1179
> do_sys_open fs/open.c:1195 [inline]
> __do_sys_openat fs/open.c:1209 [inline]
> __se_sys_openat fs/open.c:1204 [inline]
> __x64_sys_openat+0x1c8/0x1f0 fs/open.c:1204
> do_syscall_64+0x73/0xe0 arch/x86/entry/common.c:384
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> Freed by task 7203:
> save_stack mm/kasan/common.c:48 [inline]
> set_track mm/kasan/common.c:56 [inline]
> kasan_set_free_info mm/kasan/common.c:316 [inline]
> __kasan_slab_free+0x114/0x170 mm/kasan/common.c:455
> __cache_free mm/slab.c:3426 [inline]
> kfree+0x10a/0x220 mm/slab.c:3757
> snd_seq_port_disconnect+0x570/0x610 sound/core/seq/seq_ports.c:614
> snd_seq_ioctl_unsubscribe_port+0x349/0x6c0 sound/core/seq/seq_clientmgr.c:1525
> snd_seq_oss_midi_close+0x397/0x620 sound/core/seq/oss/seq_oss_midi.c:405
> snd_seq_oss_synth_reset+0x335/0x8b0 sound/core/seq/oss/seq_oss_synth.c:406
> snd_seq_oss_reset+0x5b/0x250 sound/core/seq/oss/seq_oss_init.c:435
> snd_seq_oss_ioctl+0x5c2/0x1090 sound/core/seq/oss/seq_oss_ioctl.c:93
> odev_ioctl+0x51/0x70 sound/core/seq/oss/seq_oss.c:174
> vfs_ioctl fs/ioctl.c:48 [inline]
> ksys_ioctl fs/ioctl.c:753 [inline]
> __do_sys_ioctl fs/ioctl.c:762 [inline]
> __se_sys_ioctl+0xf9/0x160 fs/ioctl.c:760
> do_syscall_64+0x73/0xe0 arch/x86/entry/common.c:384
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> The buggy address belongs to the object at fff
Re: [PATCH v3] mm/slab.c: add node spinlock protect in __cache_free_alien

From: Zhang, Qiang
Sent: July 31, 2020 9:27
To: David Rientjes
Cc: c...@linux.com; penb...@kernel.org; iamjoonsoo@lge.com; a...@linux-foundation.org; linux...@kvack.org; linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3] mm/slab.c: add node spinlock protect in __cache_free_alien

From: David Rientjes
Sent: July 31, 2020 7:45
To: Zhang, Qiang
Cc: c...@linux.com; penb...@kernel.org; iamjoonsoo@lge.com; a...@linux-foundation.org; linux...@kvack.org; linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3] mm/slab.c: add node spinlock protect in __cache_free_alien

On Thu, 30 Jul 2020, qiang.zh...@windriver.com wrote:

> From: Zhang Qiang
>
> for example:
> node0
> cpu0                                    cpu1
> slab_dead_cpu
>   mutex_lock(&slab_mutex)
>   cpuup_canceled                        slab_dead_cpu
>     mask = cpumask_of_node(node)          mutex_lock(&slab_mutex)
>     n = get_node(cachep0, node0)
>     spin_lock_irq(&n->list_lock)
>     if (!cpumask_empty(mask)) == true
>     spin_unlock_irq(&n->list_lock)
>     goto free_slab
>   mutex_unlock(&slab_mutex)
>                                         cpuup_canceled
>                                           mask = cpumask_of_node(node)
> kmem_cache_free(cachep0)                  n = get_node(cachep0, node0)
>   __cache_free_alien(cachep0)             spin_lock_irq(&n->list_lock)
>     n = get_node(cachep0, node0)          if (!cpumask_empty(mask)) == false
>     if (n->alien && n->alien[page_node])    alien = n->alien
>       alien = n->alien[page_node]           n->alien = NULL
>                                           spin_unlock_irq(&n->list_lock)
>
>As mentioned in the review of v1 of this patch, we likely want to do a fix for cpuup_canceled() instead.

>I see, you mean the fix should be in the "cpuup_canceled" function?

I'm very sorry. Because cpu_down() holds the global "cpu_hotplug_lock" write lock, multiple CPU offlines are serialized, so the scenario I described above does not exist.
Re: KASAN: use-after-free Read in delete_and_unsubscribe_port (2)
From: linux-kernel-ow...@vger.kernel.org on behalf of syzbot
Sent: July 30, 2020 11:33
To: alsa-devel-ow...@alsa-project.org; alsa-de...@alsa-project.org; linux-kernel@vger.kernel.org; pe...@perex.cz; syzkaller-b...@googlegroups.com; ti...@suse.com
Subject: Re: KASAN: use-after-free Read in delete_and_unsubscribe_port (2)

syzbot has found a reproducer for the following issue on:

HEAD commit: d3590ebf Merge tag 'audit-pr-20200729' of git://git.kernel..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1207e0b890
kernel config: https://syzkaller.appspot.com/x/.config?x=812bbfcb6ae2cd60
dashboard link: https://syzkaller.appspot.com/bug?extid=1a54a94bd32716796edd
compiler: clang version 10.0.0 (https://github.com/llvm/llvm-project/ c2443155a0fb245c8f17f2c1c72b6ea391e86e81)
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=11b227f890

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+1a54a94bd32716796...@syzkaller.appspotmail.com

==
BUG: KASAN: use-after-free in list_empty include/linux/list.h:282 [inline]
BUG: KASAN: use-after-free in delete_and_unsubscribe_port+0x8b/0x450 sound/core/seq/seq_ports.c:530
Read of size 8 at addr 888098523060 by task syz-executor.0/7202

CPU: 1 PID: 7202 Comm: syz-executor.0 Not tainted 5.8.0-rc7-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1f0/0x31e lib/dump_stack.c:118
print_address_description+0x66/0x5a0 mm/kasan/report.c:383
__kasan_report mm/kasan/report.c:513 [inline]
kasan_report+0x132/0x1d0 mm/kasan/report.c:530
list_empty include/linux/list.h:282 [inline]

It looks like a "subs->ref_count" problem.
delete_and_unsubscribe_port+0x8b/0x450 sound/core/seq/seq_ports.c:530 snd_seq_port_disconnect+0x568/0x610 sound/core/seq/seq_ports.c:612 snd_seq_ioctl_unsubscribe_port+0x349/0x6c0 sound/core/seq/seq_clientmgr.c:1525 snd_seq_oss_midi_close+0x397/0x620 sound/core/seq/oss/seq_oss_midi.c:405 snd_seq_oss_synth_reset+0x335/0x8b0 sound/core/seq/oss/seq_oss_synth.c:406 snd_seq_oss_reset+0x5b/0x250 sound/core/seq/oss/seq_oss_init.c:435 snd_seq_oss_ioctl+0x5c2/0x1090 sound/core/seq/oss/seq_oss_ioctl.c:93 odev_ioctl+0x51/0x70 sound/core/seq/oss/seq_oss.c:174 vfs_ioctl fs/ioctl.c:48 [inline] ksys_ioctl fs/ioctl.c:753 [inline] __do_sys_ioctl fs/ioctl.c:762 [inline] __se_sys_ioctl+0xf9/0x160 fs/ioctl.c:760 do_syscall_64+0x73/0xe0 arch/x86/entry/common.c:384 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x45c429 Code: 8d b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 5b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:7f6e48930c78 EFLAGS: 0246 ORIG_RAX: 0010 RAX: ffda RBX: 000154c0 RCX: 0045c429 RDX: RSI: 5100 RDI: 0003 RBP: 0078bf38 R08: R09: R10: R11: 0246 R12: 0078bf0c R13: 7ffe51b9d10f R14: 7f6e489319c0 R15: 0078bf0c Allocated by task 7202: save_stack mm/kasan/common.c:48 [inline] set_track mm/kasan/common.c:56 [inline] __kasan_kmalloc+0x103/0x140 mm/kasan/common.c:494 kmem_cache_alloc_trace+0x234/0x300 mm/slab.c:3551 kmalloc include/linux/slab.h:555 [inline] kzalloc include/linux/slab.h:669 [inline] snd_seq_port_connect+0x66/0x460 sound/core/seq/seq_ports.c:553 snd_seq_ioctl_subscribe_port+0x349/0x6c0 sound/core/seq/seq_clientmgr.c:1484 snd_seq_oss_midi_open+0x4db/0x830 sound/core/seq/oss/seq_oss_midi.c:364 snd_seq_oss_synth_setup_midi+0x108/0x510 sound/core/seq/oss/seq_oss_synth.c:269 snd_seq_oss_open+0x899/0xe90 sound/core/seq/oss/seq_oss_init.c:261 odev_open+0x5e/0x90 sound/core/seq/oss/seq_oss.c:125 chrdev_open+0x498/0x580 fs/char_dev.c:414 
do_dentry_open+0x813/0x1070 fs/open.c:828 do_open fs/namei.c:3243 [inline] path_openat+0x278d/0x37f0 fs/namei.c:3360 do_filp_open+0x191/0x3a0 fs/namei.c:3387 do_sys_openat2+0x463/0x770 fs/open.c:1179 do_sys_open fs/open.c:1195 [inline] __do_sys_openat fs/open.c:1209 [inline] __se_sys_openat fs/open.c:1204 [inline] __x64_sys_openat+0x1c8/0x1f0 fs/open.c:1204 do_syscall_64+0x73/0xe0 arch/x86/entry/common.c:384 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Freed by task 7203: save_stack mm/kasan/common.c:48 [inline] set_track mm/kasan/common.c:56 [inline] kasan_set_free_info mm/kasan/common.c:316 [inline] __kasan_slab_free+0x114/0x170 mm/kasan/common.c:455 __cache_free mm/slab.c:3426 [inline] kfree+0x10a/0x220 mm/slab.c:3757 snd_seq_port_disconnect+0x570/0x610 sound/core/seq/seq_ports.c:614 snd_seq_ioctl_unsubscribe_port+0x349/0x6c0 sound/core/seq/seq_clientmgr.c:1525
Re: INFO: rcu detected stall in tc_modify_qdisc
From: linux-kernel-ow...@vger.kernel.org on behalf of syzbot
Sent: July 29, 2020 13:53
To: da...@davemloft.net; fweis...@gmail.com; j...@mojatatu.com; j...@resnulli.us; linux-kernel@vger.kernel.org; mi...@kernel.org; net...@vger.kernel.org; syzkaller-b...@googlegroups.com; t...@linutronix.de; vinicius.go...@intel.com; xiyou.wangc...@gmail.com
Subject: INFO: rcu detected stall in tc_modify_qdisc

Hello,

syzbot found the following issue on:

HEAD commit: 181964e6 fix a braino in cmsghdr_from_user_compat_to_kern()
git tree: net
console output: https://syzkaller.appspot.com/x/log.txt?x=12925e3890
kernel config: https://syzkaller.appspot.com/x/.config?x=f87a5e4232fdb267
dashboard link: https://syzkaller.appspot.com/bug?extid=9f78d5c664a8c33f4cce
compiler: gcc (GCC) 10.1.0-syz 20200507
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=16587f8c90
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=15b2d79090

The issue was bisected to:

commit 5a781ccbd19e4664babcbe4b4ead7aa2b9283d22
Author: Vinicius Costa Gomes
Date: Sat Sep 29 00:59:43 2018 +

    tc: Add support for configuring the taprio scheduler

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=160e1bac90
console output: https://syzkaller.appspot.com/x/log.txt?x=110e1bac90

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+9f78d5c664a8c33f4...@syzkaller.appspotmail.com
Fixes: 5a781ccbd19e ("tc: Add support for configuring the taprio scheduler")

rcu: INFO: rcu_preempt self-detected stall on CPU
rcu: 1-...!: (1 GPs behind) idle=6f6/1/0x4000 softirq=10195/10196 fqs=1 (t=27930 jiffies g=9233 q=413)
rcu: rcu_preempt kthread starved for 27901 jiffies! g9233 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
rcu: RCU grace-period kthread stack dump: rcu_preempt R running task2911210 2 0x4000 Call Trace: context_switch kernel/sched/core.c:3458 [inline] __schedule+0x8ea/0x2210 kernel/sched/core.c:4219 schedule+0xd0/0x2a0 kernel/sched/core.c:4294 schedule_timeout+0x148/0x250 kernel/time/timer.c:1908 rcu_gp_fqs_loop kernel/rcu/tree.c:1874 [inline] rcu_gp_kthread+0xae5/0x1b50 kernel/rcu/tree.c:2044 kthread+0x3b5/0x4a0 kernel/kthread.c:291 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293 NMI backtrace for cpu 1 CPU: 1 PID: 6799 Comm: syz-executor494 Not tainted 5.8.0-rc6-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x18f/0x20d lib/dump_stack.c:118 nmi_cpu_backtrace.cold+0x70/0xb1 lib/nmi_backtrace.c:101 nmi_trigger_cpumask_backtrace+0x1b3/0x223 lib/nmi_backtrace.c:62 trigger_single_cpu_backtrace include/linux/nmi.h:164 [inline] rcu_dump_cpu_stacks+0x194/0x1cf kernel/rcu/tree_stall.h:320 print_cpu_stall kernel/rcu/tree_stall.h:553 [inline] check_cpu_stall kernel/rcu/tree_stall.h:627 [inline] rcu_pending kernel/rcu/tree.c:3489 [inline] rcu_sched_clock_irq.cold+0x5b3/0xccc kernel/rcu/tree.c:2504 update_process_times+0x25/0x60 kernel/time/timer.c:1737 tick_sched_handle+0x9b/0x180 kernel/time/tick-sched.c:176 tick_sched_timer+0x108/0x290 kernel/time/tick-sched.c:1320 __run_hrtimer kernel/time/hrtimer.c:1520 [inline] __hrtimer_run_queues+0x1d5/0xfc0 kernel/time/hrtimer.c:1584 hrtimer_interrupt+0x32a/0x930 kernel/time/hrtimer.c:1646 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1080 [inline] __sysvec_apic_timer_interrupt+0x142/0x5e0 arch/x86/kernel/apic/apic.c:1097 asm_call_on_stack+0xf/0x20 arch/x86/entry/entry_64.S:711 __run_on_irqstack arch/x86/include/asm/irq_stack.h:22 [inline] run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:48 [inline] sysvec_apic_timer_interrupt+0xe0/0x120 arch/x86/kernel/apic/apic.c:1091 
asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:585 RIP: 0010:arch_local_irq_restore arch/x86/include/asm/paravirt.h:770 [inline] RIP: 0010:__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160 [inline] RIP: 0010:_raw_spin_unlock_irqrestore+0x8c/0xe0 kernel/locking/spinlock.c:191 Code: 48 c7 c0 88 e0 b4 89 48 ba 00 00 00 00 00 fc ff df 48 c1 e8 03 80 3c 10 00 75 37 48 83 3d e3 52 cc 01 00 74 22 48 89 df 57 9d <0f> 1f 44 00 00 bf 01 00 00 00 e8 35 e5 66 f9 65 8b 05 fe 70 19 78 RSP: 0018:c900016672c0 EFLAGS: 0282 RAX: 11369c11 RBX: 0282 RCX: 0002 RDX: dc00 RSI: RDI: 0282 RBP: 888093a052e8 R08: R09: R10: 0001 R11: R12: 0282 R13: 0078100c35c3 R14: 888093a05000 R15: spin_unlock_irqrestore include/linux/spinlock.h:408 [inline] taprio_change+0x1fdc/0x2960 net/sched/sch_taprio.c:1557 It looks
Re: [PATCH] mm/slab.c: add node spinlock protect in __cache_free_alien
From: David Rientjes
Sent: July 29, 2020 3:46
To: Zhang, Qiang
Cc: c...@linux.com; penb...@kernel.org; iamjoonsoo@lge.com; a...@linux-foundation.org; linux...@kvack.org; linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm/slab.c: add node spinlock protect in __cache_free_alien

On Tue, 28 Jul 2020, qiang.zh...@windriver.com wrote:

> From: Zhang Qiang
>
> We should add the node spinlock to protect "n->alien", which may be
> assigned NULL in the cpuup_canceled() function and cause an invalid
> address access.
>
>Hi, do you have an example NULL pointer dereference where you have hit this?

>This rather looks like something to fix up in cpuup_canceled() since it's currently manipulating the alien cache for the canceled cpu's node.

Yes, it should be fixed for cpuup_canceled(): it currently manipulates the alien cache for the canceled CPU's node, which may be the same node being operated on in the __cache_free_alien() function.

void cpuup_canceled()
{
	n = get_node(cachep, node);
	spin_lock_irq(&n->list_lock);
	...
	n->alien = NULL;
	spin_unlock_irq(&n->list_lock);
}

> Fixes: 18bf854117c6 ("slab: use get_node() and kmem_cache_node() functions")
> Signed-off-by: Zhang Qiang
> ---
> mm/slab.c | 7 +--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/mm/slab.c b/mm/slab.c
> index a89633603b2d..290523c90b4e 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -759,8 +759,10 @@ static int __cache_free_alien(struct kmem_cache *cachep, void *objp,
>
> 	n = get_node(cachep, node);
> 	STATS_INC_NODEFREES(cachep);
> +	spin_lock(&n->list_lock);
> 	if (n->alien && n->alien[page_node]) {
> 		alien = n->alien[page_node];
> +		spin_unlock(&n->list_lock);
> 		ac = &alien->ac;
> 		spin_lock(&ac->lock);
> 		if (unlikely(ac->avail == ac->limit)) {
> @@ -769,14 +771,15 @@ static int __cache_free_alien(struct kmem_cache *cachep, void *objp,
> 		}
> 		ac->entry[ac->avail++] = objp;
> 		spin_unlock(&ac->lock);
> -		slabs_destroy(cachep, &list);
> 	} else {
> +		spin_unlock(&n->list_lock);
> 		n = get_node(cachep, page_node);
> 		spin_lock(&n->list_lock);
> 		free_block(cachep, &objp, 1, page_node, &list);
> 		spin_unlock(&n->list_lock);
> -		slabs_destroy(cachep, &list);
> 	}
> +
> +	slabs_destroy(cachep, &list);
> 	return 1;
> }
>
> --
> 2.26.2
Re: [PATCH v2] tipc: Don't using smp_processor_id() in preemptible code
From: Zhang, Qiang
Sent: July 15, 2020 13:27
To: Tuong Tong Lien; Eric Dumazet; jma...@redhat.com; da...@davemloft.net; k...@kernel.org; Xue, Ying
Cc: net...@vger.kernel.org; tipc-discuss...@lists.sourceforge.net; linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] tipc: Don't using smp_processor_id() in preemptible code

From: Tuong Tong Lien
Sent: July 15, 2020 11:53
To: Zhang, Qiang; Eric Dumazet; jma...@redhat.com; da...@davemloft.net; k...@kernel.org; Xue, Ying
Cc: net...@vger.kernel.org; tipc-discuss...@lists.sourceforge.net; linux-kernel@vger.kernel.org
Subject: RE: [PATCH v2] tipc: Don't using smp_processor_id() in preemptible code

> -----Original Message-----
> From: Zhang, Qiang
> Sent: Wednesday, July 15, 2020 9:13 AM
> To: Eric Dumazet ; jma...@redhat.com; da...@davemloft.net; k...@kernel.org; Tuong Tong Lien ; Xue, Ying
> Cc: net...@vger.kernel.org; tipc-discuss...@lists.sourceforge.net; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v2] tipc: Don't using smp_processor_id() in preemptible code
>
> From: Eric Dumazet
> Sent: July 14, 2020 22:15
> To: Zhang, Qiang; jma...@redhat.com; da...@davemloft.net; k...@kernel.org; tuong.t.l...@dektech.com.au; eric.duma...@gmail.com; Xue, Ying
> Cc: net...@vger.kernel.org; tipc-discuss...@lists.sourceforge.net; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v2] tipc: Don't using smp_processor_id() in preemptible code
>
> On 7/14/20 1:05 AM, qiang.zh...@windriver.com wrote:
> > From: Zhang Qiang
> >
> > CPU: 0 PID: 6801 Comm: syz-executor201 Not tainted 5.8.0-rc4-syzkaller #0
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> >
> > Fixes: fc1b6d6de2208 ("tipc: introduce TIPC encryption & authentication")
> > Reported-by: syzbot+263f8c0d007dc09b2...@syzkaller.appspotmail.com
> > Signed-off-by: Zhang Qiang
> > ---
> > v1->v2:
> > add fixes tags.
> >
> > net/tipc/crypto.c | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/net/tipc/crypto.c b/net/tipc/crypto.c
> > index 8c47ded2edb6..520af0afe1b3 100644
> > --- a/net/tipc/crypto.c
> > +++ b/net/tipc/crypto.c
> > @@ -399,9 +399,10 @@ static void tipc_aead_users_set(struct tipc_aead __rcu *aead, int val)
> >  */
> > static struct crypto_aead *tipc_aead_tfm_next(struct tipc_aead *aead)
> > {
> > -	struct tipc_tfm **tfm_entry = this_cpu_ptr(aead->tfm_entry);
> > +	struct tipc_tfm **tfm_entry = get_cpu_ptr(aead->tfm_entry);
> >
> > 	*tfm_entry = list_next_entry(*tfm_entry, list);
> > +	put_cpu_ptr(tfm_entry);
> > 	return (*tfm_entry)->tfm;
> > }
>
> > You have not explained why this was safe.
> >
> > This seems to hide a real bug.
> >
> > Presumably callers of this function should have disabled preemption, and maybe interrupts as well.
> >
> > Right after put_cpu_ptr(tfm_entry), this thread could migrate to another cpu, and still access data owned by the old cpu.
>
> Thanks for your suggestion, I will check the code again.

>Actually, last week I sent a similar patch to tipc-discussion which covers the case as well (there is also another place causing the same issue...). If you don't mind, you can take a look at below (just copied/pasted).
>BR/Tuong

>Hi Tuong Tong Lien
>The tipc_aead_free is an RCU callback; this function is called in softirq context, where preemption is disabled, so we should not add preempt_disable/enable.
>thanks
>Zhang Qiang

Sorry, there are some mistakes in my reply: the tipc_aead_free() function may also be called in thread context if CONFIG_RCU_BOOST is enabled.

>-----Original Message-----
>From: Tuong Tong Lien
>Sent: Friday, July 10, 2020 5:11 PM
>To: jma...@redhat.com; ma...@donjonn.com; ying@windriver.com; tipc-discuss...@lists.sourceforge.net
>Cc: tipc-dek
>Subject: [PATCH RFC 1/5] tipc: fix using smp_processor_id() in preemptible

The 'this_cpu_ptr()' is used to obtain the AEAD key's TFM on the current CPU for encryption; however, the execution can be preemptible since it's actually user-space context, so 'using smp_processor_id() in preemptible' has been observed. We fix the issue by using the 'get/put_cpu_ptr()' API, which includes a 'preempt_disable()' instead.

Signed-off-by: Tuong Lien
---
net/tipc/crypto.c | 12 +---
1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/net/tipc/crypto.c b/net/tipc/crypto.c
index c8c47fc72653..
Re: [PATCH v2] tipc: Don't using smp_processor_id() in preemptible code
From: Tuong Tong Lien
Sent: July 15, 2020 11:53
To: Zhang, Qiang; Eric Dumazet; jma...@redhat.com; da...@davemloft.net; k...@kernel.org; Xue, Ying
Cc: net...@vger.kernel.org; tipc-discuss...@lists.sourceforge.net; linux-kernel@vger.kernel.org
Subject: RE: [PATCH v2] tipc: Don't using smp_processor_id() in preemptible code

> -----Original Message-----
> From: Zhang, Qiang
> Sent: Wednesday, July 15, 2020 9:13 AM
> To: Eric Dumazet; jma...@redhat.com; da...@davemloft.net; k...@kernel.org; Tuong Tong Lien; Xue, Ying
> Cc: net...@vger.kernel.org; tipc-discuss...@lists.sourceforge.net; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v2] tipc: Don't using smp_processor_id() in preemptible code
>
> > From: Eric Dumazet
> > Sent: July 14, 2020 22:15
> > To: Zhang, Qiang; jma...@redhat.com; da...@davemloft.net; k...@kernel.org; tuong.t.l...@dektech.com.au; eric.duma...@gmail.com; Xue, Ying
> > Cc: net...@vger.kernel.org; tipc-discuss...@lists.sourceforge.net; linux-kernel@vger.kernel.org
> > Subject: Re: [PATCH v2] tipc: Don't using smp_processor_id() in preemptible code
> >
> > > On 7/14/20 1:05 AM, qiang.zh...@windriver.com wrote:
> > > > From: Zhang Qiang
> > > >
> > > > CPU: 0 PID: 6801 Comm: syz-executor201 Not tainted 5.8.0-rc4-syzkaller #0
> > > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> > > >
> > > > Fixes: fc1b6d6de2208 ("tipc: introduce TIPC encryption & authentication")
> > > > Reported-by: syzbot+263f8c0d007dc09b2...@syzkaller.appspotmail.com
> > > > Signed-off-by: Zhang Qiang
> > > > ---
> > > > v1->v2:
> > > > add fixes tags.
> > > > net/tipc/crypto.c | 3 ++-
> > > > 1 file changed, 2 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/net/tipc/crypto.c b/net/tipc/crypto.c
> > > > index 8c47ded2edb6..520af0afe1b3 100644
> > > > --- a/net/tipc/crypto.c
> > > > +++ b/net/tipc/crypto.c
> > > > @@ -399,9 +399,10 @@ static void tipc_aead_users_set(struct tipc_aead __rcu *aead, int val)
> > > >  */
> > > > static struct crypto_aead *tipc_aead_tfm_next(struct tipc_aead *aead)
> > > > {
> > > > -	struct tipc_tfm **tfm_entry = this_cpu_ptr(aead->tfm_entry);
> > > > +	struct tipc_tfm **tfm_entry = get_cpu_ptr(aead->tfm_entry);
> > > >
> > > >  	*tfm_entry = list_next_entry(*tfm_entry, list);
> > > > +	put_cpu_ptr(tfm_entry);
> > > >  	return (*tfm_entry)->tfm;
> > > > }
> > >
> > > You have not explained why this was safe.
> > >
> > > This seems to hide a real bug.
> > >
> > > Presumably callers of this function should have disabled preemption, and maybe interrupts as well.
> > >
> > > Right after put_cpu_ptr(tfm_entry), this thread could migrate to another cpu, and still access data owned by the old cpu.
> >
> > Thanks for your suggestion, I will check the code again.
>
> Actually, last week I sent a similar patch to tipc-discussion which covers the case as well (there is also another place causing the same issue...). If you don't mind, you can take a look at below (just copied/pasted).
> BR/Tuong

Hi Tuong Tong Lien,
tipc_aead_free() is an RCU callback, so this function is called in softirq context, where preemption is already disabled; it should not add preempt_disable()/preempt_enable().

thanks
Zhang Qiang

> -----Original Message-----
> From: Tuong Tong Lien
> Sent: Friday, July 10, 2020 5:11 PM
> To: jma...@redhat.com; ma...@donjonn.com; ying@windriver.com; tipc-discuss...@lists.sourceforge.net
> Cc: tipc-dek
> Subject: [PATCH RFC 1/5] tipc: fix using smp_processor_id() in preemptible
>
> 'this_cpu_ptr()' is used to obtain the AEAD key's TFM on the current CPU for encryption, however the execution can be preemptible since it's actually user-space context, so the 'using smp_processor_id() in preemptible' warning has been observed. We fix the issue by using the 'get/put_cpu_ptr()' API which consists of a 'preempt_disable()' instead.
>
> Signed-off-by: Tuong Lien
> ---
> net/tipc/crypto.c | 12 +++++++++---
> 1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/net/tipc/crypto.c b/net/tipc/crypto.c
> index c8c47fc72653..1827ce4fac5d 100644
> --- a/net/tipc/crypto.c
> +++ b/net/tipc/crypto.c
> @@ -326,7 +326,8 @@ static void tipc_aead_free(struct rcu_head *rp)
>  	if (aead->cloned) {
>  		tipc_aead_put(aead->cloned);
>  	} else {
> -		head = *this_cpu_ptr(aead->tfm_entry);
> +		head = *get_cpu_ptr(aead->tfm_entry);
> +		put_cpu_ptr(aead->tfm_entry);
>  		list_for_each_entry_safe(tfm_entry, tmp, &head->list, list) {
>  			crypto_free_aead(tfm_entry->tfm);
Re: [PATCH v2] tipc: Don't using smp_processor_id() in preemptible code
From: Eric Dumazet
Sent: July 14, 2020 22:15
To: Zhang, Qiang; jma...@redhat.com; da...@davemloft.net; k...@kernel.org; tuong.t.l...@dektech.com.au; eric.duma...@gmail.com; Xue, Ying
Cc: net...@vger.kernel.org; tipc-discuss...@lists.sourceforge.net; linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] tipc: Don't using smp_processor_id() in preemptible code

> On 7/14/20 1:05 AM, qiang.zh...@windriver.com wrote:
> > From: Zhang Qiang
> >
> > CPU: 0 PID: 6801 Comm: syz-executor201 Not tainted 5.8.0-rc4-syzkaller #0
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> >
> > Fixes: fc1b6d6de2208 ("tipc: introduce TIPC encryption & authentication")
> > Reported-by: syzbot+263f8c0d007dc09b2...@syzkaller.appspotmail.com
> > Signed-off-by: Zhang Qiang
> > ---
> > v1->v2:
> > add fixes tags.
> >
> > net/tipc/crypto.c | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/net/tipc/crypto.c b/net/tipc/crypto.c
> > index 8c47ded2edb6..520af0afe1b3 100644
> > --- a/net/tipc/crypto.c
> > +++ b/net/tipc/crypto.c
> > @@ -399,9 +399,10 @@ static void tipc_aead_users_set(struct tipc_aead __rcu *aead, int val)
> >  */
> > static struct crypto_aead *tipc_aead_tfm_next(struct tipc_aead *aead)
> > {
> > -	struct tipc_tfm **tfm_entry = this_cpu_ptr(aead->tfm_entry);
> > +	struct tipc_tfm **tfm_entry = get_cpu_ptr(aead->tfm_entry);
> >
> >  	*tfm_entry = list_next_entry(*tfm_entry, list);
> > +	put_cpu_ptr(tfm_entry);
> >  	return (*tfm_entry)->tfm;
> > }
>
> You have not explained why this was safe.
>
> This seems to hide a real bug.
>
> Presumably callers of this function should have disabled preemption, and maybe interrupts as well.
>
> Right after put_cpu_ptr(tfm_entry), this thread could migrate to another cpu, and still access data owned by the old cpu.

Thanks for your suggestion, I will check the code again.
Re: WARNING in submit_audio_out_urb/usb_submit_urb
From: linux-kernel-ow...@vger.kernel.org on behalf of syzbot
Sent: July 9, 2020 21:34
To: andreyk...@google.com; gre...@linuxfoundation.org; ingras...@epigenesys.com; linux-kernel@vger.kernel.org; linux-...@vger.kernel.org; syzkaller-b...@googlegroups.com
Subject: WARNING in submit_audio_out_urb/usb_submit_urb

Hello,

syzbot found the following crash on:

HEAD commit:    768a0741 usb: dwc2: gadget: Remove assigned but never used..
git tree:       https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git usb-testing
console output: https://syzkaller.appspot.com/x/log.txt?x=1568d11f10
kernel config:  https://syzkaller.appspot.com/x/.config?x=999be4eb2478ffa5
dashboard link: https://syzkaller.appspot.com/bug?extid=c190f6858a04ea7fbc52
compiler:       gcc (GCC) 10.1.0-syz 20200507
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=123aa2fb10

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+c190f6858a04ea7fb...@syzkaller.appspotmail.com

usb 1-1: send failed (error -32)
snd_usb_toneport 1-1:0.0: Line 6 TonePort GX now attached
------------[ cut here ]------------
usb 1-1: BOGUS urb xfer, pipe 0 != type 3
WARNING: CPU: 0 PID: 12 at drivers/usb/core/urb.c:478 usb_submit_urb+0xa17/0x13e0 drivers/usb/core/urb.c:478
Kernel panic - not syncing: panic_on_warn set ...
CPU: 0 PID: 12 Comm: kworker/0:1 Not tainted 5.8.0-rc3-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: events line6_startup_work
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0xf6/0x16e lib/dump_stack.c:118
 panic+0x2aa/0x6e1 kernel/panic.c:231
 __warn.cold+0x20/0x50 kernel/panic.c:600
 report_bug+0x1bd/0x210 lib/bug.c:198
 handle_bug+0x41/0x80 arch/x86/kernel/traps.c:235
 exc_invalid_op+0x13/0x40 arch/x86/kernel/traps.c:255
 asm_exc_invalid_op+0x12/0x20 arch/x86/include/asm/idtentry.h:563
RIP: 0010:usb_submit_urb+0xa17/0x13e0 drivers/usb/core/urb.c:478
Code: 84 e7 04 00 00 e8 a9 10 ca fd 4c 89 ef e8 41 79 12 ff 41 89 d8 44 89 e1 4c 89 f2 48 89 c6 48 c7 c7 80 a0 5d 86 e8 db 77 9e fd <0f> 0b e8 82 10 ca fd 0f b6 6c 24 08 48 c7 c6 e0 a1 5d 86 48 89 ef
RSP: 0018:8881da227b10 EFLAGS: 00010086
RAX: RBX: 0003 RCX:
RDX: 8881da211900 RSI: 8129b4e3 RDI: ed103b444f54
RBP: 0030 R08: 0001 R09: 8881db21fe8b
R10: R11: 0004 R12:
R13: 8881d6ecd0a0 R14: 8881d3d8c690 R15: 8881d54c4000
 submit_audio_out_urb+0x6d6/0x1a00 sound/usb/line6/playback.c:271
 line6_submit_audio_out_all_urbs+0xc9/0x120 sound/usb/line6/playback.c:291
 line6_stream_start+0x187/0x230 sound/usb/line6/pcm.c:195
 line6_pcm_acquire+0x137/0x210 sound/usb/line6/pcm.c:318
 line6_startup_work+0x42/0x50 sound/usb/line6/driver.c:734
 process_one_work+0x94c/0x15f0 kernel/workqueue.c:2269
 worker_thread+0x64c/0x1120 kernel/workqueue.c:2415
 kthread+0x392/0x470 kernel/kthread.c:291
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293

This looks like Alan Stern's reply to the email titled "KASAN: use-after-free Read in line6_submit_audio_in_all_urbs"; it also looks like a problem with asynchronous operations. Can we replace cancel_delayed_work() with cancel_delayed_work_sync() in the line6_disconnect() func?

Zhang Qiang

Kernel Offset: disabled
Rebooting in 86400 seconds..

---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkal...@googlegroups.com. syzbot will keep track of this bug report. See: https://goo.gl/tpsmEJ#status for how to communicate with syzbot. syzbot can test patches for this bug, for details see: https://goo.gl/tpsmEJ#testing-patches
Re: Re: [kthread] a90477f0c9: WARNING:at_kernel/kthread.c:#kthread_queue_work
Thanks for your reply, Petr. If the patch is added, as you said, there is more work to do, so the patch needs to be removed from the -mm tree.

Best Regards,
Zhang Qiang

From: Petr Mladek
Sent: July 7, 2020 17:47
To: Zhang, Qiang
Cc: Andrew Morton; linux-kernel@vger.kernel.org
Subject: Re: Re: [kthread] a90477f0c9: WARNING:at_kernel/kthread.c:#kthread_queue_work

On Mon 2020-07-06 10:17:31, Zhang, Qiang wrote:
> Hi, Petr Mladek
> There are some questions about the "Work could not be queued when worker being destroyed" patch.
>
> In the "spi_init_queue" func:
> "kthread_init_worker(&ctlr->kworker);  (worker->task = NULL)
>  ctlr->kworker_task = kthread_run(kthread_worker_fn, &ctlr->kworker, "%s", dev_name(&ctlr->dev));"

I see. I have missed that there are some kthread_worker users that start the worker this way. They rely on the fact that worker->task is set also by kthread_worker_fn.

The proper solution is to start the worker using either kthread_create_worker() or kthread_create_worker_on_cpu(). They set worker->task immediately.

It means that more work is needed:

1. Convert all users that start the kthread_worker via kthread_worker_fn to use either kthread_create_worker() or kthread_create_worker_on_cpu().

2. Remove the kthread_worker_fn declaration from include/linux/kthread.h to prevent starting the worker the temporary way. In the same patch, also the assignment to worker->task and the FIXME might get removed from kthread_worker_fn().

3. Finally, it should be safe to add the WARN_ON() into queuing_blocked().

Best Regards,
Petr

> In the "spi_start_queue" func:
> "kthread_queue_work(&ctlr->kworker, &ctlr->pump_messages);"
>
> Because kthread_worker_fn has not begun running, if work is queued to the worker, "!worker->task" is true, which triggers the WARN.
>
> Do we need to add the check "test_bit(KTHREAD_SHOULD_STOP, &to_kthread(current)->flags) && WARN_ON(!worker->task)" in the queuing_blocked func?
> Zhang Qiang
>
> From: kernel test robot
> Sent: July 6, 2020 17:38
> To: Zhang, Qiang
> Cc: l...@lists.01.org
> Subject: [kthread] a90477f0c9: WARNING:at_kernel/kthread.c:#kthread_queue_work
>
> Greeting,
>
> FYI, we noticed the following commit (built with gcc-7):
>
> commit: a90477f0c956621eb0dd69f0abfb6066ad8fbef7 ("kthread: work could not be queued when worker being destroyed")
> https://github.com/hnaz/linux-mm master
>
> in testcase: trinity
> with following parameters:
>
> 	runtime: 300s
>
> test-description: Trinity is a linux system call fuzz tester.
> test-url: http://codemonkey.org.uk/projects/trinity/
>
> on test machine: qemu-system-i386 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
>
> caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
>
> +-------------------------------------------------+------------+------------+
> |                                                 | 85c6127e06 | a90477f0c9 |
> +-------------------------------------------------+------------+------------+
> | boot_successes                                  | 6          | 0          |
> | boot_failures                                   | 0          | 30         |
> | WARNING:at_kernel/kthread.c:#kthread_queue_work | 0          | 30         |
> | EIP:kthread_queue_work                          | 0          | 30         |
> | BUG:kernel_hang_in_test_stage                   | 0          | 2          |
> +-------------------------------------------------+------------+------------+
>
> If you fix the issue, kindly add following tag
> Reported-by: kernel test robot
>
> [5.554282] WARNING: CPU: 0 PID: 1 at kernel/kthread.c:817 kthread_queue_work+0xf8/0x120
> [5.556204] Modules linked in:
> [5.556204] CPU: 0 PID: 1 Comm: swapper Tainted: G S 5.8.0-rc3-00014-ga90477f0c9566 #1
> [5.556204] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
> [5.556204] EIP: kthread_queue_work+0xf8/0x120
> [5.556204] Code: 00 59 e9 67 ff ff ff 8d 76 00 8b 4e 10 85 c9 75 a6 8d 4b 28 89 f2 89 d8 bf 01 00 00 00 e8 f0 f5 ff ff eb 93 8d b6 00 00 00 00 <0f> 0b 6a 00 31 c9 ba 01 00 00 00 b8 08 6c 64 c3 e8 f3 04 0b 00 5b
> [5.556204] EAX: EBX: eeff538c ECX: EDX: 0001
> [5.556204] ESI: eeff53d0 EDI: EBP: f5edde70 ESP: f5edde5c
> [5.556204] DS: 007b ES: 007b FS: GS: SS: 0068 EFLAGS: 00010046
> [5.556204] CR0: 80050033 CR2: b7eda844 CR3: 038ae000 CR4: 000406d0
> [5.556204] Ca
Re: [PATCH v4] kthread: Work could not be queued when worker being destroyed
Sorry, Petr Mladek, this is my mistake; please ignore this change.

From: linux-kernel-ow...@vger.kernel.org on behalf of Petr Mladek
Sent: July 7, 2020 17:06
To: Zhang, Qiang
Cc: ben.do...@codethink.co.uk; bfie...@redhat.com; c...@rock-chips.com; pet...@infradead.org; t...@kernel.org; a...@linux-foundation.org; naresh.kamb...@linaro.org; mm-comm...@vger.kernel.org; linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4] kthread: Work could not be queued when worker being destroyed

On Mon 2020-07-06 13:46:47, qiang.zh...@windriver.com wrote:
> From: Zhang Qiang
>
> Before the work is put into the queue of the worker thread, the state of the worker thread needs to be detected, because the worker thread may be in the destruction state at this time.
>
> Signed-off-by: Zhang Qiang
> Suggested-by: Petr Mladek
> Reviewed-by: Petr Mladek

This patch is completely different from the one that I suggested or acked. Please, never keep acks when doing major rework and people did not agree with it.

For this patch:

Nacked-by: Petr Mladek

See below why.

> ---
> v1->v2:
> Add warning information for condition "!worker->task".
> v2->v3:
> Modify submission information and add "Reviewed-by" tags.
> v3->v4:
> Fix the warning triggered during spi controller registration:
> when a spi controller registers, a "kthread_worker_fn" worker is created through "kthread_run" instead of "kthread_create_worker", which would initialize "worker->task"; then the "spi_start_queue" func queues a work to the worker queue. At this time, if the worker has not begun running, "!worker->task" is true, so a warning is triggered.
> kernel/kthread.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/kernel/kthread.c b/kernel/kthread.c
> index bfbfa481be3a..825bd4dcdb95 100644
> --- a/kernel/kthread.c
> +++ b/kernel/kthread.c
> @@ -791,6 +791,11 @@ static inline bool queuing_blocked(struct kthread_worker *worker,
> {
>  	lockdep_assert_held(&worker->lock);
>
> +	if (kthread_should_stop()) {

This does not make much sense. kthread_should_stop() checks a flag set for the "current" process. It works only when called from inside the kthread worker thread.

queuing_blocked() is called from kthread_queue_work() or kthread_queue_delayed_work(). These are typically called from another process. The only exception is when they get re-queued from inside the work.

Best Regards,
Petr

> +		WARN_ON(1);
> +		return true;
> +	}
> +
>  	return !list_empty(&work->node) || work->canceling;
> }
>
> --
> 2.24.1
Re: [PATCH v3] usb: gadget: function: fix missing spinlock in f_uac1_legacy
Hi Greg KH,

In the early submission, commit c6994e6f067cf0fc4c6cca3d164018b1150916f8, which added the USB Audio Gadget driver, the "audio->play_queue" was protected by the "audio->lock" spinlock in the "playback_work" func, but in the "f_audio_out_ep_complete" func there is no protection for the operations on this "audio->play_queue"; the spinlock is missing there. Should the Fixes tag point at that commit?

________
From: Greg KH
Sent: July 7, 2020 3:55
To: Zhang, Qiang
Cc: ba...@kernel.org; colin.k...@canonical.com; linux-...@vger.kernel.org; linux-kernel@vger.kernel.org; sta...@vger.kernel.org
Subject: Re: [PATCH v3] usb: gadget: function: fix missing spinlock in f_uac1_legacy

On Sun, Jul 05, 2020 at 08:40:27PM +0800, qiang.zh...@windriver.com wrote:
> From: Zhang Qiang
>
> Add a missing spinlock protection for play_queue, because
> the play_queue may be destroyed when the "playback_work"
> work func and "f_audio_out_ep_complete" callback func
> operate this paly_queue at the same time.

"play_queue", right?

>
> Cc: stable
> Signed-off-by: Zhang Qiang

Because you do not have a Fixes: tag in here, how far back do you want the stable patch to go to? That's why, if you can, it's always good to have a "Fixes:" tag in there to show what commit caused the problem you are fixing here.

So, what commit caused this?

thanks,

greg k-h
Re: [PATCH v3] kthread: Work could not be queued when worker being destroyed
I'm very sorry that there are some problems with my change. as follows: [1.203300] loop: module loaded [1.204599] megasas: 07.714.04.00-rc1 [1.211124] spi_qup 78b7000.spi: IN:block:16, fifo:64, OUT:block:16, fifo:64 [1.211509] [ cut here ] [1.217238] WARNING: CPU: 0 PID: 1 at kernel/kthread.c:819 kthread_queue_work+0x90/0xa0 [1.221832] Modules linked in: [1.229554] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.8.0-rc3-next-20200706 #1 [1.232683] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT) [1.240237] pstate: 4085 (nZcv daIf -PAN -UAO BTYPE=--) [1.246918] pc : kthread_queue_work+0x90/0xa0 [1.252211] lr : kthread_queue_work+0x2c/0xa0 [1.256722] sp : 80001002ba50 [1.261061] x29: 80001002ba50 x28: 3b868000 [1.264363] x27: 3fcf63c0 x26: 3b868680 [1.269744] x25: 3b868400 x24: 3d116810 [1.275039] x23: 800012025304 x22: 3b8683bc [1.280335] x21: x20: 3b8683f8 [1.285630] x19: 3b8683b8 x18: [1.290925] x17: x16: 800011167420 [1.296220] x15: 0eb90480 x14: 0267 [1.301515] x13: 0004 x12: [1.306810] x11: x10: 0003 [1.312105] x9 : 3fcbac10 x8 : 3fcba240 [1.317400] x7 : 3bc3c800 x6 : 0003 [1.322696] x5 : x4 : [1.327991] x3 : 3b8683bc x2 : 0001 [1.333285] x1 : x0 : [1.338583] Call trace: [1.343875] kthread_queue_work+0x90/0xa0 [1.346050] spi_start_queue+0x50/0x78 [1.350213] spi_register_controller+0x458/0x820 [1.353860] devm_spi_register_controller+0x44/0xa0 [1.358638] spi_qup_probe+0x5d8/0x638 [1.363235] platform_drv_probe+0x54/0xa8 [1.367053] really_probe+0xd8/0x320 [1.371133] driver_probe_device+0x58/0xb8 [1.374779] device_driver_attach+0x74/0x80 [1.378685] __driver_attach+0x58/0xe0 [1.382766] bus_for_each_dev+0x70/0xc0 [1.386583] driver_attach+0x24/0x30 [1.390317] bus_add_driver+0x14c/0x1f0 [1.394137] driver_register+0x64/0x120 [1.397696] __platform_driver_register+0x48/0x58 [1.401519] spi_qup_driver_init+0x1c/0x28 [1.406378] do_one_initcall+0x54/0x1a0 [1.410372] kernel_init_freeable+0x1d4/0x254 [1.414106] kernel_init+0x14/0x110 [1.418616] ret_from_fork+0x10/0x34 
[1.421918] ---[ end trace 4b59f327623c9e10 ]---
[1.426526] spi_qup 78b9000.spi: IN:block:16, fifo:64, OUT:block:16, fifo:64
[1.430721] ------------[ cut here ]------------
[1.437374] WARNING: CPU: 0 PID: 1 at kernel/kthread.c:819

When in the "spi_init_queue" func:
"kthread_init_worker(&ctlr->kworker);  (worker->task = NULL)
 ctlr->kworker_task = kthread_run(kthread_worker_fn, &ctlr->kworker, "%s", dev_name(&ctlr->dev));"

In the "spi_start_queue" func:
"kthread_queue_work(&ctlr->kworker, &ctlr->pump_messages);"

Because kthread_worker_fn has not begun running, if work is queued to the worker, "!worker->task" is true, which triggers the WARN.

________
From: Tejun Heo on behalf of Tejun Heo
Sent: July 6, 2020 22:59
To: Zhang, Qiang
Cc: ben.do...@codethink.co.uk; bfie...@redhat.com; c...@rock-chips.com; pet...@infradead.org; pmla...@suse.com; a...@linux-foundation.org; mm-comm...@vger.kernel.org; linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3] kthread: Work could not be queued when worker being destroyed

On Sun, Jul 05, 2020 at 09:30:18AM +0800, qiang.zh...@windriver.com wrote:
> From: Zhang Qiang
>
> Before the work is put into the queue of the worker thread, the state of the worker thread needs to be detected, because the worker thread may be in the destruction state at this time.
>
> Signed-off-by: Zhang Qiang
> Suggested-by: Petr Mladek
> Reviewed-by: Petr Mladek

Andrew already brought this up but can you please provide some context on why you're making this change?

Thanks.

--
tejun
Re: [PATCH v2] usb: gadget: function: fix missing spinlock in f_uac1_legacy
Thanks for your suggestion, Greg KH. I think there is no need for a Fixes tag. I will resend.

thanks,
Zhang Qiang

From: Greg KH
Sent: July 6, 2020 18:31
To: Zhang, Qiang
Cc: ba...@kernel.org; colin.k...@canonical.com; linux-...@vger.kernel.org; linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] usb: gadget: function: fix missing spinlock in f_uac1_legacy

On Sun, Jul 05, 2020 at 02:16:16PM +0800, qiang.zh...@windriver.com wrote:
> From: Zhang Qiang
>
> Add a missing spinlock protection to the add operation of the "audio->play_queue" in "f_audio_out_ep_complete" function.

That says _what_ you did, but not _why_ you did that.

Why is a lock needed here? What does this protect?

What kernel commit does this "fix"? Put that in the "Fixes:" line, and probably you need a "cc: stable" in that area too, right?

thanks,

greg k-h
Re: [kthread] a90477f0c9: WARNING:at_kernel/kthread.c:#kthread_queue_work
From: Zhang, Qiang
Sent: July 6, 2020 18:17
To: Petr Mladek
Cc: Andrew Morton; linux-kernel@vger.kernel.org
Subject: Re: [kthread] a90477f0c9: WARNING:at_kernel/kthread.c:#kthread_queue_work

Hi, Petr Mladek

There are some questions about the "Work could not be queued when worker being destroyed" patch.

In the "spi_init_queue" func:
"kthread_init_worker(&ctlr->kworker);  (worker->task = NULL)
 ctlr->kworker_task = kthread_run(kthread_worker_fn, &ctlr->kworker, "%s", dev_name(&ctlr->dev));"

In the "spi_start_queue" func:
"kthread_queue_work(&ctlr->kworker, &ctlr->pump_messages);"

Because kthread_worker_fn has not begun running, if work is queued to the worker, "!worker->task" is true, which triggers the WARN.

Do we need to replace "WARN_ON(!worker->task)" with "test_bit(KTHREAD_SHOULD_STOP, &to_kthread(current)->flags)" in the queuing_blocked func, or replace "kthread_run(kthread_worker_fn, ...)" with "kthread_create_worker" in the spi_init_queue func (because in kthread_create_worker, "worker->task" will be assigned a value)?

Zhang Qiang

From: kernel test robot
Sent: July 6, 2020 17:38
To: Zhang, Qiang
Cc: l...@lists.01.org
Subject: [kthread] a90477f0c9: WARNING:at_kernel/kthread.c:#kthread_queue_work

Greeting,

FYI, we noticed the following commit (built with gcc-7):

commit: a90477f0c956621eb0dd69f0abfb6066ad8fbef7 ("kthread: work could not be queued when worker being destroyed")
https://github.com/hnaz/linux-mm master

in testcase: trinity
with following parameters:

	runtime: 300s

test-description: Trinity is a linux system call fuzz tester.
test-url: http://codemonkey.org.uk/projects/trinity/

on test machine: qemu-system-i386 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):

+-------------------------------------------------+------------+------------+
|                                                 | 85c6127e06 | a90477f0c9 |
+-------------------------------------------------+------------+------------+
| boot_successes                                  | 6          | 0          |
| boot_failures                                   | 0          | 30         |
| WARNING:at_kernel/kthread.c:#kthread_queue_work | 0          | 30         |
| EIP:kthread_queue_work                          | 0          | 30         |
| BUG:kernel_hang_in_test_stage                   | 0          | 2          |
+-------------------------------------------------+------------+------------+

If you fix the issue, kindly add following tag
Reported-by: kernel test robot

[5.554282] WARNING: CPU: 0 PID: 1 at kernel/kthread.c:817 kthread_queue_work+0xf8/0x120
[5.556204] Modules linked in:
[5.556204] CPU: 0 PID: 1 Comm: swapper Tainted: G S 5.8.0-rc3-00014-ga90477f0c9566 #1
[5.556204] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[5.556204] EIP: kthread_queue_work+0xf8/0x120
[5.556204] Code: 00 59 e9 67 ff ff ff 8d 76 00 8b 4e 10 85 c9 75 a6 8d 4b 28 89 f2 89 d8 bf 01 00 00 00 e8 f0 f5 ff ff eb 93 8d b6 00 00 00 00 <0f> 0b 6a 00 31 c9 ba 01 00 00 00 b8 08 6c 64 c3 e8 f3 04 0b 00 5b
[5.556204] EAX: EBX: eeff538c ECX: EDX: 0001
[5.556204] ESI: eeff53d0 EDI: EBP: f5edde70 ESP: f5edde5c
[5.556204] DS: 007b ES: 007b FS: GS: SS: 0068 EFLAGS: 00010046
[5.556204] CR0: 80050033 CR2: b7eda844 CR3: 038ae000 CR4: 000406d0
[5.556204] Call Trace:
[5.556204]  spi_start_queue+0x50/0x70
[5.556204]  spi_register_controller+0x642/0xa80
[5.556204]  ? kobject_get+0x54/0xf0
[5.556204]  ? parport_pc_platform_probe+0x10/0x10
[5.556204]  spi_bitbang_start+0x2f/0x70
[5.556204]  ? parport_pc_platform_probe+0x10/0x10
[5.556204]  butterfly_attach+0x164/0x2c0
[5.556204]  ? driver_detach+0x30/0x30
[5.556204]  port_check+0x1c/0x30
[5.556204]  bus_for_each_dev+0x5a/0x90
[5.556204]  __parport_register_driver+0x76/0xa0
[5.556204]  ? driver_detach+0x30/0x30
[5.556204]  ? spi_engine_driver_init+0x16/0x16
[5.556204]  butterfly_init+0x19/0x1b
[5.556204]  do_one_initcall+0x79/0x310
[5.556204]  ? parse_args+0x70/0x420
[5.556204]  ? rcu_read_lock_sched_held+0x2f/0x50
[5.556204]  ? trace_initcall_level+0x95/0xc7
[5.556204]  ? kernel_init_freeable+0x129/0x19f
[5.556204]  kernel_init_freeable+0x148/0x19f
[5.556204]  ? rest_init+0x100/0x100
[5.556204]  kernel_init+0xd/0xf0
[5.556204]  ret_from_fork+0x1c/0x28
[5.556204] irq event stamp: 8620410
[5.556204] hardirqs last enabled at (8620409): [] _raw_spin_unlock_irqrestore+0x2a/0x50
[5.556204] hardirqs last disabled at (8620410): [] _raw_spin_lock_irqsave+0
Re: [PATCH] usb: gadget: function: fix missing spinlock in f_uac1_legacy
Sorry, I will add a changelog and resend.

Zhang Qiang

From: Greg KH
Sent: July 6, 2020 15:40
To: Zhang, Qiang
Cc: ba...@kernel.org; colin.k...@canonical.com; linux-...@vger.kernel.org; linux-kernel@vger.kernel.org
Subject: Re: [PATCH] usb: gadget: function: fix missing spinlock in f_uac1_legacy

On Sun, Jul 05, 2020 at 09:59:41AM +0800, qiang.zh...@windriver.com wrote:
> From: Zhang Qiang
>
> Signed-off-by: Zhang Qiang

No changelog text? I can't take patches like that, sorry.

greg k-h
Re: [PATCH] usb: gadget: function: fix missing spinlock in f_uac1_legacy
From: linux-kernel-ow...@vger.kernel.org on behalf of qiang.zh...@windriver.com
Sent: July 5, 2020 9:59
To: ba...@kernel.org
Cc: gre...@linuxfoundation.org; colin.k...@canonical.com; linux-...@vger.kernel.org; linux-kernel@vger.kernel.org
Subject: [PATCH] usb: gadget: function: fix missing spinlock in f_uac1_legacy

From: Zhang Qiang

Signed-off-by: Zhang Qiang
---
 drivers/usb/gadget/function/f_uac1_legacy.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/usb/gadget/function/f_uac1_legacy.c b/drivers/usb/gadget/function/f_uac1_legacy.c
index 349deae7cabd..e2d7f69128a0 100644
--- a/drivers/usb/gadget/function/f_uac1_legacy.c
+++ b/drivers/usb/gadget/function/f_uac1_legacy.c
@@ -336,7 +336,9 @@ static int f_audio_out_ep_complete(struct usb_ep *ep, struct usb_request *req)
 	/* Copy buffer is full, add it to the play_queue */
 	if (audio_buf_size - copy_buf->actual < req->actual) {
+		spin_lock_irq(&audio->lock);
 		list_add_tail(&copy_buf->list, &audio->play_queue);
+		spin_unlock_irq(&audio->lock);
 		schedule_work(&audio->playback_work);
 		copy_buf = f_audio_buffer_alloc(audio_buf_size);
 		if (IS_ERR(copy_buf))
--
2.24.1
Re: [PATCH] kthread: Don't cancel a work that is being cancelled
Thank you for your advice. Adding kthread_cancel_work() without the "_sync" would be dangerous, but I think it is unnecessary to cancel a work that is already being cancelled.

From: linux-kernel-ow...@vger.kernel.org on behalf of Petr Mladek
Sent: July 3, 2020 15:28
To: Zhang, Qiang
Cc: ben.do...@codethink.co.uk; bfie...@redhat.com; c...@rock-chips.com; pet...@infradead.org; t...@kernel.org; linux-kernel@vger.kernel.org
Subject: Re: [PATCH] kthread: Don't cancel a work that is being cancelled

On Thu 2020-07-02 12:43:24, qiang.zh...@windriver.com wrote:
> From: Zhang Qiang
>
> When canceling a work, if it is found that the work is in
> the cancelling state, we should directly exit the cancelled
> operation.

No, the function guarantees that the work is no longer running when it returns. This is why it has the suffix "_sync" in the name.

We would need to add kthread_cancel_work() without the "_sync" wrappers that would not wait for the work in progress. But it might be dangerous. The API users usually want to make sure that the work is no longer running to avoid races.

What is the use case for the non-sync behavior, please?

Best Regards,
Petr
Re: [PATCH] usb: gadget: function: printer: The device interface is reset and should return error code
Hi Felipe,

Please review this patch and make suggestions.

Thanks
Zqiang

From: linux-usb-ow...@vger.kernel.org on behalf of qiang.zh...@windriver.com
Sent: June 28, 2020 9:57
To: felipe.ba...@linux.intel.com
Cc: gre...@linuxfoundation.org; linux-...@vger.kernel.org; linux-kernel@vger.kernel.org
Subject: [PATCH] usb: gadget: function: printer: The device interface is reset and should return error code

From: Zqiang

After the device is disconnected from the host side, the interface of the device is reset. If the userspace operates the device again, an error code should be returned.

Signed-off-by: Zqiang
---
 drivers/usb/gadget/function/f_printer.c | 36 ++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/drivers/usb/gadget/function/f_printer.c b/drivers/usb/gadget/function/f_printer.c
index 9c7ed2539ff7..2b45a61e4213 100644
--- a/drivers/usb/gadget/function/f_printer.c
+++ b/drivers/usb/gadget/function/f_printer.c
@@ -338,6 +338,11 @@ printer_open(struct inode *inode, struct file *fd)
 	spin_lock_irqsave(&dev->lock, flags);

+	if (dev->interface < 0) {
+		spin_unlock_irqrestore(&dev->lock, flags);
+		return -ENODEV;
+	}
+
 	if (!dev->printer_cdev_open) {
 		dev->printer_cdev_open = 1;
 		fd->private_data = dev;
@@ -430,6 +435,12 @@ printer_read(struct file *fd, char __user *buf, size_t len, loff_t *ptr)
 	mutex_lock(&dev->lock_printer_io);
 	spin_lock_irqsave(&dev->lock, flags);

+	if (dev->interface < 0) {
+		spin_unlock_irqrestore(&dev->lock, flags);
+		mutex_unlock(&dev->lock_printer_io);
+		return -ENODEV;
+	}
+
 	/* We will use this flag later to check if a printer reset happened
 	 * after we turn interrupts back on.
 	 */
@@ -561,6 +572,12 @@ printer_write(struct file *fd, const char __user *buf, size_t len, loff_t *ptr)
 	mutex_lock(&dev->lock_printer_io);
 	spin_lock_irqsave(&dev->lock, flags);

+	if (dev->interface < 0) {
+		spin_unlock_irqrestore(&dev->lock, flags);
+		mutex_unlock(&dev->lock_printer_io);
+		return -ENODEV;
+	}
+
 	/* Check if a printer reset happens while we have interrupts on */
 	dev->reset_printer = 0;
@@ -667,6 +684,13 @@ printer_fsync(struct file *fd, loff_t start, loff_t end, int datasync)
 	inode_lock(inode);
 	spin_lock_irqsave(&dev->lock, flags);
+
+	if (dev->interface < 0) {
+		spin_unlock_irqrestore(&dev->lock, flags);
+		inode_unlock(inode);
+		return -ENODEV;
+	}
+
 	tx_list_empty = (likely(list_empty(&dev->tx_reqs)));
 	spin_unlock_irqrestore(&dev->lock, flags);
@@ -689,6 +713,13 @@ printer_poll(struct file *fd, poll_table *wait)
 	mutex_lock(&dev->lock_printer_io);
 	spin_lock_irqsave(&dev->lock, flags);
+
+	if (dev->interface < 0) {
+		spin_unlock_irqrestore(&dev->lock, flags);
+		mutex_unlock(&dev->lock_printer_io);
+		return EPOLLERR | EPOLLHUP;
+	}
+
 	setup_rx_reqs(dev);
 	spin_unlock_irqrestore(&dev->lock, flags);
 	mutex_unlock(&dev->lock_printer_io);
@@ -722,6 +753,11 @@ printer_ioctl(struct file *fd, unsigned int code, unsigned long arg)
 	spin_lock_irqsave(&dev->lock, flags);

+	if (dev->interface < 0) {
+		spin_unlock_irqrestore(&dev->lock, flags);
+		return -ENODEV;
+	}
+
 	switch (code) {
 	case GADGET_GET_PRINTER_STATUS:
 		status = (int)dev->printer_status;
--
2.24.1
Re: [PATCH] usb: gadget: function: printer: Add gadget dev interface status judgment
From: linux-usb-ow...@vger.kernel.org on behalf of qiang.zh...@windriver.com
Sent: June 15, 2020 17:46
To: ba...@kernel.org
Cc: gre...@linuxfoundation.org; linux-...@vger.kernel.org; linux-kernel@vger.kernel.org
Subject: [PATCH] usb: gadget: function: printer: Add gadget dev interface status judgment

From: Zqiang

After the interface of the gadget printer device was disabled, we should not continue to operate the device.

Signed-off-by: Zqiang
---
 drivers/usb/gadget/function/f_printer.c | 36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)
--
2.24.1
Re: [PATCH] usb: gadget: function: printer: Add gadget dev interface status judgment
Hello Greg KH,

Could you please review the patch? Thanks,
Zqiang

From: linux-usb-ow...@vger.kernel.org on behalf of qiang.zh...@windriver.com
Sent: June 15, 2020 17:46
To: ba...@kernel.org
Cc: gre...@linuxfoundation.org; linux-...@vger.kernel.org; linux-kernel@vger.kernel.org
Subject: [PATCH] usb: gadget: function: printer: Add gadget dev interface status judgment

From: Zqiang

After the interface of the gadget printer device was disabled, we should not continue to operate the device.

Signed-off-by: Zqiang
---
 drivers/usb/gadget/function/f_printer.c | 36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)
--
2.24.1
Re: [PATCH] usb: gadget: function: printer: fix use-after-free in __lock_acquire
I cannot find a reference count for this structure (printer_dev). The scenario is: while the character device is still open, if you operate the device through configfs and execute commands like unlink, the resources allocated when the device was bound (printer_dev) are released. After that, if you perform an ioctl operation again, a use-after-free occurs. Adding a kref makes it explicit that this resource (printer_dev) is still in use until the character device is closed. Similar problems can occur in f_hid.c. As for the other gadget drivers, I haven't looked at their implementation, so I'm not sure.

Thanks,
Zqiang

On 6/18/20 4:30 PM, Greg KH wrote:
On Fri, Jun 05, 2020 at 11:56:52AM +0800, qiang.zh...@windriver.com wrote:

From: Zqiang

Fix this by increasing the object reference count.

BUG: KASAN: use-after-free in __lock_acquire+0x3fd4/0x4180 kernel/locking/lockdep.c:3831
Read of size 8 at addr 8880683b0018 by task syz-executor.0/3377

CPU: 1 PID: 3377 Comm: syz-executor.0 Not tainted 5.6.11 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0xce/0x128 lib/dump_stack.c:118
 print_address_description.constprop.4+0x21/0x3c0 mm/kasan/report.c:374
 __kasan_report+0x131/0x1b0 mm/kasan/report.c:506
 kasan_report+0x12/0x20 mm/kasan/common.c:641
 __asan_report_load8_noabort+0x14/0x20 mm/kasan/generic_report.c:135
 __lock_acquire+0x3fd4/0x4180 kernel/locking/lockdep.c:3831
 lock_acquire+0x127/0x350 kernel/locking/lockdep.c:4488
 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
 _raw_spin_lock_irqsave+0x35/0x50 kernel/locking/spinlock.c:159
 printer_ioctl+0x4a/0x110 drivers/usb/gadget/function/f_printer.c:723
 vfs_ioctl fs/ioctl.c:47 [inline]
 ksys_ioctl+0xfb/0x130 fs/ioctl.c:763
 __do_sys_ioctl fs/ioctl.c:772 [inline]
 __se_sys_ioctl fs/ioctl.c:770 [inline]
 __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:770
 do_syscall_64+0x9e/0x510 arch/x86/entry/common.c:294
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4531a9
Code: ed 60 fc ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 bb 60 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:7fd14ad72c78 EFLAGS: 0246 ORIG_RAX: 0010
RAX: ffda RBX: 0073bfa8 RCX: 004531a9
RDX: fff9 RSI: 009e RDI: 0003
RBP: 0003 R08: R09:
R10: R11: 0246 R12: 004bbd61
R13: 004d0a98 R14: 7fd14ad736d4 R15:

Allocated by task 2393:
 save_stack+0x21/0x90 mm/kasan/common.c:72
 set_track mm/kasan/common.c:80 [inline]
 __kasan_kmalloc.constprop.3+0xa7/0xd0 mm/kasan/common.c:515
 kasan_kmalloc+0x9/0x10 mm/kasan/common.c:529
 kmem_cache_alloc_trace+0xfa/0x2d0 mm/slub.c:2813
 kmalloc include/linux/slab.h:555 [inline]
 kzalloc include/linux/slab.h:669 [inline]
 gprinter_alloc+0xa1/0x870 drivers/usb/gadget/function/f_printer.c:1416
 usb_get_function+0x58/0xc0 drivers/usb/gadget/functions.c:61
 config_usb_cfg_link+0x1ed/0x3e0 drivers/usb/gadget/configfs.c:444
 configfs_symlink+0x527/0x11d0 fs/configfs/symlink.c:202
 vfs_symlink+0x33d/0x5b0 fs/namei.c:4201
 do_symlinkat+0x11b/0x1d0 fs/namei.c:4228
 __do_sys_symlinkat fs/namei.c:4242 [inline]
 __se_sys_symlinkat fs/namei.c:4239 [inline]
 __x64_sys_symlinkat+0x73/0xb0 fs/namei.c:4239
 do_syscall_64+0x9e/0x510 arch/x86/entry/common.c:294
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Freed by task 3368:
 save_stack+0x21/0x90 mm/kasan/common.c:72
 set_track mm/kasan/common.c:80 [inline]
 kasan_set_free_info mm/kasan/common.c:337 [inline]
 __kasan_slab_free+0x135/0x190 mm/kasan/common.c:476
 kasan_slab_free+0xe/0x10 mm/kasan/common.c:485
 slab_free_hook mm/slub.c:1444 [inline]
 slab_free_freelist_hook mm/slub.c:1477 [inline]
 slab_free mm/slub.c:3034 [inline]
 kfree+0xf7/0x410 mm/slub.c:3995
 gprinter_free+0x49/0xd0 drivers/usb/gadget/function/f_printer.c:1353
 usb_put_function+0x38/0x50 drivers/usb/gadget/functions.c:87
 config_usb_cfg_unlink+0x2db/0x3b0 drivers/usb/gadget/configfs.c:485
 configfs_unlink+0x3b9/0x7f0 fs/configfs/symlink.c:250
 vfs_unlink+0x287/0x570 fs/namei.c:4073
 do_unlinkat+0x4f9/0x620 fs/namei.c:4137
 __do_sys_unlink fs/namei.c:4184 [inline]
 __se_sys_unlink fs/namei.c:4182 [inline]
 __x64_sys_unlink+0x42/0x50 fs/namei.c:4182
 do_syscall_64+0x9e/0x510 arch/x86/entry/common.c:294
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

The buggy address belongs to the object at 8880683b
which belongs to the cache kmalloc-1k of size 1024
The buggy address is located 24 bytes inside of
1024-byte region [8880683b, 8880683b0400)
The buggy address belongs to the page:
page:ea0001a0ec00 refcount:1 mapcount:0 mapping:88806c00e300 index:0x8880683b1800 compound_mapcount: 0
flags:
Re: Re: [PATCH v2] usb: gadget: function: printer: fix use-after-free in __lock_acquire
The cdev object reference count and the "struct printer_dev" object reference count (kref) are different things; these two reference counts do not conflict. The file usb-skeleton.c also uses a similar method: "struct usb_skel" contains a kref member.

Thanks,
Zqiang

From: Greg KH
Sent: June 9, 2020 17:48
To: Zhang, Qiang
Cc: ba...@kernel.org; linux-...@vger.kernel.org; linux-kernel@vger.kernel.org
Subject: Re: Re: [PATCH v2] usb: gadget: function: printer: fix use-after-free in __lock_acquire

A: http://en.wikipedia.org/wiki/Top_post
Q: Were do I find info about this thing called top-posting?
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing in e-mail?

A: No.
Q: Should I include quotations after my reply?

http://daringfireball.net/2007/07/on_top

On Tue, Jun 09, 2020 at 09:35:11AM +0000, Zhang, Qiang wrote:
> Thank you for your suggestion
> two referenced counted objects in the same exact structure. another
> referenced is "dev->printer_cdev_open"?

Maybe, I don't know, but a cdev does have a reference count already, right? Why do you need printer_cdev_open as well?

thanks,

greg k-h
Re: [PATCH v2] usb: gadget: function: printer: fix use-after-free in __lock_acquire
Thank you for your suggestion. There are two reference-counted objects in the same exact structure; is the other reference "dev->printer_cdev_open"?

From: Greg KH
Sent: June 8, 2020 15:33
To: Zhang, Qiang
Cc: ba...@kernel.org; linux-...@vger.kernel.org; linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] usb: gadget: function: printer: fix use-after-free in __lock_acquire

On Mon, Jun 08, 2020 at 03:16:22PM +0800, qiang.zh...@windriver.com wrote:
> From: Zqiang
>
> Increase the reference count of the printer dev through kref to avoid
> being released by other tasks when in use.
>
> BUG: KASAN: use-after-free in __lock_acquire+0x3fd4/0x4180
> kernel/locking/lockdep.c:3831
> Read of size 8 at addr 8880683b0018 by task syz-executor.0/3377
> [...]
Re: [PATCH] usb: gadget: function: printer: Fix use-after-free in __lock_acquire()
Hi Markus,

I don't think I need to add a Fixes tag; please review the code.

From: Markus Elfring
Sent: June 5, 2020 16:57
To: Zhang, Qiang; linux-...@vger.kernel.org
Cc: kernel-janit...@vger.kernel.org; linux-kernel@vger.kernel.org; Alan Stern; Felipe Balbi; Greg Kroah-Hartman; Kyungtae Kim
Subject: Re: [PATCH] usb: gadget: function: printer: Fix use-after-free in __lock_acquire()

> Fix this by increase object reference count.

I find this description incomplete according to the proposed changes.

Would you like to add the tag "Fixes" to the commit message?

Regards,
Markus
Re: [PATCH v5] workqueue: Remove unnecessary kfree() call in rcu_free_wq()
Thanks for your guidance. I will try to improve the weak wording.

From: Zhang, Qiang
Sent: May 28, 2020 9:41
To: Markus Elfring; Tejun Heo; Lai Jiangshan
Cc: linux-kernel@vger.kernel.org; kernel-janit...@vger.kernel.org
Subject: Re: [PATCH v5] workqueue: Remove unnecessary kfree() call in rcu_free_wq()

Thanks for your guidance. I have tried to improve the weak wording.

From: linux-kernel-ow...@vger.kernel.org on behalf of Markus Elfring
Sent: May 27, 2020 16:20
To: Zhang, Qiang; Tejun Heo; Lai Jiangshan
Cc: linux-kernel@vger.kernel.org; kernel-janit...@vger.kernel.org
Subject: Re: [PATCH v5] workqueue: Remove unnecessary kfree() call in rcu_free_wq()

> Thus delete this function call which became unnecessary with the referenced
> software update.
…
> Suggested-by: Markus Elfring

Would the tag "Co-developed-by" be more appropriate according to the patch review to achieve a more pleasing commit message?

> v1->v2->v3->v4->v5:
> Modify weakly submitted information.

Now I wonder about your wording choice "weakly".

Regards,
Markus
Re: [PATCH v3] workqueue: Fix double kfree for rescuer
Thank you for your reply. There is something wrong with my description; is it feasible to describe it as follows?

The rescuer is already freed in destroy_workqueue() and "wq->rescuer = NULL" was executed, but in rcu_free_wq() it is freed again (equivalent to kfree(NULL)). This is unnecessary, so the call should be removed.

On 5/26/20 4:56 PM, Lai Jiangshan wrote:
> On Mon, May 25, 2020 at 5:22 PM wrote:
>>
>> From: Zhang Qiang
>>
>> The callback function "rcu_free_wq" could be called after memory
>> was released for "rescuer" already, Thus delete a misplaced call
>> of the function "kfree".
>
> Hello
>
> wq->rescuer is guaranteed to be NULL in rcu_free_wq()
> since def98c84b6cd
> ("workqueue: Fix spurious sanity check failures in destroy_workqueue()")
>
> And the rescuer is already freed in destroy_workqueue()
> since 8efe1223d73c
> ("workqueue: Fix missing kfree(rescuer) in destroy_workqueue()")
>
> The patch is a cleanup to remove a "kfree(NULL);".
> But the changelog is misleading.
>
>>
>> Fixes: 6ba94429c8e7 ("workqueue: Reorder sysfs code")
>
> It is totally unrelated.
>
>> Signed-off-by: Zhang Qiang
>> ---
>> v1->v2->v3:
>> Only commit information modification.
>> kernel/workqueue.c | 1 -
>> 1 file changed, 1 deletion(-)
>>
>> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
>> index 891ccad5f271..a2451cdcd503 100644
>> --- a/kernel/workqueue.c
>> +++ b/kernel/workqueue.c
>> @@ -3491,7 +3491,6 @@ static void rcu_free_wq(struct rcu_head *rcu)
>>         else
>>                 free_workqueue_attrs(wq->unbound_attrs);
>>
>> -       kfree(wq->rescuer);
>>         kfree(wq);
>>  }
>>
>> --
>> 2.24.1