Re: [LKP] [writeback] 8bc4ad9498: INFO: suspicious RCU usage. ]

2016-09-08 Thread Ye Xiaolong
On 09/08, Jens Axboe wrote:
>On 09/07/2016 08:04 PM, Ye Xiaolong wrote:
>>On 09/08, Fengguang Wu wrote:
>>>Hi Jens and Tejun,
>>>
The command line was in the original email:

$ qemu-system-x86_64 -enable-kvm -cpu Westmere -m 512M

And agree, in general it'd be nice if there was a link to the image as
well, so that folks can reproduce.
>>>
>>>Yes we have a reproduce script internally, however need to polish it
>>>up and make it general usable by kernel developers.
>>>
>>>CC Philip. It'd be helpful to give more priority to the "lkp qemu"
>>>reproduce command.
>>>
>>>For the warning here, it does not depend on any initrd, I just
>>>confirmed this simple script can reproduce the problem.
>>
>>Yes, just confirmed that v4.8-rc5 doesn't have this warning, while kernel
>>for the updated commit (fa0e2cb1af765691fabd329f03cad563a0eebf18) on branch
>>wb-buf-throttle could reproduce it.
>
>Anyone willing to try the below patch?

The warning log is gone with this patch.

Tested-by: Xiaolong Ye 

Thanks,
Xiaolong

>
>diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
>index ef61bda76317..a7bb2e6bcaef 100644
>--- a/block/cfq-iosched.c
>+++ b/block/cfq-iosched.c
>@@ -3764,9 +3764,11 @@ static void check_blkcg_changed(struct
>cfq_io_cq *cic, struct bio *bio)
>   struct cfq_data *cfqd = cic_to_cfqd(cic);
>   struct cfq_queue *cfqq;
>   uint64_t serial_nr;
>+  bool nonroot_cg;
>
>   rcu_read_lock();
>   serial_nr = bio_blkcg(bio)->css.serial_nr;
>+  nonroot_cg = bio_blkcg(bio) != &blkcg_root;
>   rcu_read_unlock();
>
>   /*
>@@ -3781,7 +3783,7 @@ static void check_blkcg_changed(struct
>cfq_io_cq *cic, struct bio *bio)
>* do proper throttling of writes. Turn off wbt for that
>* case.
>*/
>-  if (bio_blkcg(bio) != &blkcg_root) {
>+  if (nonroot_cg) {
>   struct request_queue *q = cfqd->queue;
>
>   if (q->rq_wb)
>
>-- 
>Jens Axboe
>


Re: [LKP] [writeback] 8bc4ad9498: INFO: suspicious RCU usage. ]

2016-09-08 Thread Jens Axboe

On 09/07/2016 08:04 PM, Ye Xiaolong wrote:

On 09/08, Fengguang Wu wrote:

Hi Jens and Tejun,


The command line was in the original email:

$ qemu-system-x86_64 -enable-kvm -cpu Westmere -m 512M

And agree, in general it'd be nice if there was a link to the image as
well, so that folks can reproduce.


Yes we have a reproduce script internally, however need to polish it
up and make it general usable by kernel developers.

CC Philip. It'd be helpful to give more priority to the "lkp qemu"
reproduce command.

For the warning here, it does not depend on any initrd, I just
confirmed this simple script can reproduce the problem.


Yes, just confirmed that v4.8-rc5 doesn't have this warning, while kernel
for the updated commit (fa0e2cb1af765691fabd329f03cad563a0eebf18) on branch
wb-buf-throttle could reproduce it.


Anyone willing to try the below patch?

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index ef61bda76317..a7bb2e6bcaef 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -3764,9 +3764,11 @@ static void check_blkcg_changed(struct cfq_io_cq 
*cic, struct bio *bio)

struct cfq_data *cfqd = cic_to_cfqd(cic);
struct cfq_queue *cfqq;
uint64_t serial_nr;
+   bool nonroot_cg;

rcu_read_lock();
serial_nr = bio_blkcg(bio)->css.serial_nr;
+   nonroot_cg = bio_blkcg(bio) != &blkcg_root;
rcu_read_unlock();

/*
@@ -3781,7 +3783,7 @@ static void check_blkcg_changed(struct cfq_io_cq 
*cic, struct bio *bio)

 * do proper throttling of writes. Turn off wbt for that
 * case.
 */
-   if (bio_blkcg(bio) != &blkcg_root) {
+   if (nonroot_cg) {
struct request_queue *q = cfqd->queue;

if (q->rq_wb)

--
Jens Axboe



Re: [LKP] [writeback] 8bc4ad9498: INFO: suspicious RCU usage. ]

2016-09-07 Thread Ye Xiaolong
On 09/08, Fengguang Wu wrote:
>Hi Jens and Tejun,
>
>>The command line was in the original email:
>>
>>$ qemu-system-x86_64 -enable-kvm -cpu Westmere -m 512M
>>
>>And agree, in general it'd be nice if there was a link to the image as
>>well, so that folks can reproduce.
>
>Yes we have a reproduce script internally, however need to polish it
>up and make it general usable by kernel developers.
>
>CC Philip. It'd be helpful to give more priority to the "lkp qemu"
>reproduce command.
>
>For the warning here, it does not depend on any initrd, I just
>confirmed this simple script can reproduce the problem.

Yes, just confirmed that v4.8-rc5 doesn't have this warning, while kernel
for the updated commit (fa0e2cb1af765691fabd329f03cad563a0eebf18) on branch 
wb-buf-throttle could reproduce it.

Thanks,
Xiaolong
>
>-
>#!/bin/bash
>
>kernel=$1
>
>kvm=(
>   qemu-system-x86_64
>   -enable-kvm
>   -cpu Westmere
>   -kernel $kernel
>   -m 1024
>   -smp 1
>   -device e1000,netdev=net0
>   -netdev user,id=net0
>   -boot order=nc
>   -no-reboot
>   -watchdog i6300esb
>   -watchdog-action debug
>   -rtc base=localtime
>   -nographic
>)
>
>
>append=(
>   ip=vm-kbuild-yocto-ia32-17::dhcp
>   root=/dev/ram0
>   debug
>   apic=debug
>   sysrq_always_enabled
>   rcupdate.rcu_cpu_stall_timeout=100
>   net.ifnames=0
>   printk.devkmsg=on
>   panic=-1
>   softlockup_panic=1
>   nmi_watchdog=panic
>   oops=panic
>   load_ramdisk=2
>   prompt_ramdisk=0
>   systemd.log_level=err
>   ignore_loglevel
>   earlyprintk=ttyS0,115200
>   console=ttyS0,115200
>   console=tty0
>   vga=normal
>   rw
>)
>
>"${kvm[@]}" -append "${append[*]}"
>-
>
>Thanks,
>Fengguang


Re: [LKP] [writeback] 8bc4ad9498: INFO: suspicious RCU usage. ]

2016-09-07 Thread Fengguang Wu

Hi Jens and Tejun,


The command line was in the original email:

$ qemu-system-x86_64 -enable-kvm -cpu Westmere -m 512M

And agree, in general it'd be nice if there was a link to the image as
well, so that folks can reproduce.


Yes we have a reproduce script internally, however need to polish it
up and make it general usable by kernel developers.

CC Philip. It'd be helpful to give more priority to the "lkp qemu"
reproduce command.

For the warning here, it does not depend on any initrd, I just
confirmed this simple script can reproduce the problem.

-
#!/bin/bash

kernel=$1

kvm=(
   qemu-system-x86_64
   -enable-kvm
   -cpu Westmere
   -kernel $kernel
   -m 1024
   -smp 1
   -device e1000,netdev=net0
   -netdev user,id=net0
   -boot order=nc
   -no-reboot
   -watchdog i6300esb
   -watchdog-action debug
   -rtc base=localtime
   -nographic
)


append=(
   ip=vm-kbuild-yocto-ia32-17::dhcp
   root=/dev/ram0
   debug
   apic=debug
   sysrq_always_enabled
   rcupdate.rcu_cpu_stall_timeout=100
   net.ifnames=0
   printk.devkmsg=on
   panic=-1
   softlockup_panic=1
   nmi_watchdog=panic
   oops=panic
   load_ramdisk=2
   prompt_ramdisk=0
   systemd.log_level=err
   ignore_loglevel
   earlyprintk=ttyS0,115200
   console=ttyS0,115200
   console=tty0
   vga=normal
   rw
)

"${kvm[@]}" -append "${append[*]}"
-

Thanks,
Fengguang


Re: [writeback] 8bc4ad9498: INFO: suspicious RCU usage. ]

2016-09-07 Thread Jens Axboe

On 09/07/2016 10:41 AM, Tejun Heo wrote:

On Thu, Sep 01, 2016 at 06:21:30PM +0800, kernel test robot wrote:


FYI, we noticed the following commit:

https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git 
wb-buf-throttle
commit 8bc4ad9498f81d6689da42457615db0989e720d3 ("writeback: throttle buffered 
writeback")

in testcase: boot

on test machine: qemu-system-x86_64 -enable-kvm -cpu Westmere -m 512M

caused below changes:


+--+++
|  | b2e4e4e34d | 8bc4ad9498 |
+--+++
| boot_successes   | 6  | 0  |
| boot_failures| 0  | 8  |
| INFO:suspicious_RCU_usage| 0  | 8  |
| message:INFO:suspicious_RCU_usage| 0  | 0.0|
| pattern:INFO:suspicious_RCU_usage| 0  | 0  |
| calltrace:disk_events_workfn | 0  | 8  |
| message:calltrace:disk_events_workfn | 0  | 0.0|
| pattern:calltrace:disk_events_workfn | 0  | 0  |
+--+++



[7.389264] stack backtrace:
[7.391300] CPU: 0 PID: 19 Comm: kworker/0:1 Not tainted 
4.8.0-rc4-8-g8bc4ad9 #1
[7.395279] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
Debian-1.8.2-1 04/01/2014
[7.398012] Workqueue: events_freezable_power_ disk_events_workfn
[7.399933]   88003db1f8e8 8178ed3b 
88003db184c0
[7.404348]  0001 88003db1f918 81123472 
88003db184c0
[7.407158]  84091180 0003 880039d4c400 
88003db1f938
[7.410074] Call Trace:
[7.411328]  [] dump_stack+0x82/0xb8
[7.413982]  [] lockdep_rcu_suspicious+0xf7/0x100
[7.415828]  [] bio_blkcg+0x89/0x93
[7.417336]  [] check_blkcg_changed+0x58/0x1b8


Hmm... can't reproduce here on qemu with v4.8-rc5 and the same kernel
config.  Can you please check whether it reproduces on v4.8-rc5?  If
so, is there any way I can get the qemu image and command line used?


I've been getting them for a while. I just updated the branch to -rc5
this morning, so if it does reproduce, the same mail would likely show
up again. I'll keep you posted :-)

The command line was in the original email:

$ qemu-system-x86_64 -enable-kvm -cpu Westmere -m 512M

And agree, in general it'd be nice if there was a link to the image as
well, so that folks can reproduce.

--
Jens Axboe



Re: [writeback] 8bc4ad9498: INFO: suspicious RCU usage. ]

2016-09-07 Thread Tejun Heo
On Thu, Sep 01, 2016 at 06:21:30PM +0800, kernel test robot wrote:
> 
> FYI, we noticed the following commit:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git 
> wb-buf-throttle
> commit 8bc4ad9498f81d6689da42457615db0989e720d3 ("writeback: throttle 
> buffered writeback")
> 
> in testcase: boot
> 
> on test machine: qemu-system-x86_64 -enable-kvm -cpu Westmere -m 512M
> 
> caused below changes:
> 
> 
> +--+++
> |  | b2e4e4e34d | 8bc4ad9498 |
> +--+++
> | boot_successes   | 6  | 0  |
> | boot_failures| 0  | 8  |
> | INFO:suspicious_RCU_usage| 0  | 8  |
> | message:INFO:suspicious_RCU_usage| 0  | 0.0|
> | pattern:INFO:suspicious_RCU_usage| 0  | 0  |
> | calltrace:disk_events_workfn | 0  | 8  |
> | message:calltrace:disk_events_workfn | 0  | 0.0|
> | pattern:calltrace:disk_events_workfn | 0  | 0  |
> +--+++
> 
> 
> 
> [7.389264] stack backtrace:
> [7.391300] CPU: 0 PID: 19 Comm: kworker/0:1 Not tainted 
> 4.8.0-rc4-8-g8bc4ad9 #1
> [7.395279] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
> Debian-1.8.2-1 04/01/2014
> [7.398012] Workqueue: events_freezable_power_ disk_events_workfn
> [7.399933]   88003db1f8e8 8178ed3b 
> 88003db184c0
> [7.404348]  0001 88003db1f918 81123472 
> 88003db184c0
> [7.407158]  84091180 0003 880039d4c400 
> 88003db1f938
> [7.410074] Call Trace:
> [7.411328]  [] dump_stack+0x82/0xb8
> [7.413982]  [] lockdep_rcu_suspicious+0xf7/0x100
> [7.415828]  [] bio_blkcg+0x89/0x93
> [7.417336]  [] check_blkcg_changed+0x58/0x1b8

Hmm... can't reproduce here on qemu with v4.8-rc5 and the same kernel
config.  Can you please check whether it reproduces on v4.8-rc5?  If
so, is there any way I can get the qemu image and command line used?

Thanks.

-- 
tejun


Re: [writeback] 8bc4ad9498: INFO: suspicious RCU usage. ]

2016-09-02 Thread Tejun Heo
Hello, Paul.

On Fri, Sep 02, 2016 at 10:00:56AM -0700, Paul E. McKenney wrote:
> Note that lockdep believes that the rcu_read_lock() has also been
> dropped, otherwise it would have listed it.

There isn't any actual code between rcu_read_lock() and the following
rcu deref.  That's the first thing which happens.  It looked
mysterious to me so I was wondering whether you knew any brekages on
rcu lockdep side.  Unfortunately, that doesn't seem to be the case.

> At this point, I have to suggest "git bisect"...

I'll try to reproduce it and see wth is going on.

Thanks!

-- 
tejun


Re: [writeback] 8bc4ad9498: INFO: suspicious RCU usage. ]

2016-09-02 Thread Paul E. McKenney
On Fri, Sep 02, 2016 at 09:49:25AM -0600, Jens Axboe wrote:
> On 09/02/2016 09:36 AM, Paul E. McKenney wrote:
> >On Fri, Sep 02, 2016 at 10:56:22AM -0400, Tejun Heo wrote:
> >>(cc'ing Paul, hi!)
> >>
> >>Hello,
> >>
> >>On Thu, Sep 01, 2016 at 02:13:34PM -0600, Jens Axboe wrote:
> >>>On 09/01/2016 04:21 AM, kernel test robot wrote:
> [7.323356] cdrom: Uniform CD-ROM driver Revision: 3.20
> [7.334239]
> [7.337256] ===
> [7.340532] [ INFO: suspicious RCU usage. ]
> [7.342419] 4.8.0-rc4-8-g8bc4ad9 #1 Not tainted
> [7.347065] ---
> [7.350132] include/linux/cgroup.h:435 suspicious 
> rcu_dereference_check() usage!
> >
> >Lockdep does not believe that any locks are held, correct?
> 
> Which is correct, the queue lock has been dropped at this point.

Note that lockdep believes that the rcu_read_lock() has also been
dropped, otherwise it would have listed it.

> [7.410074] Call Trace:
> [7.411328]  [] dump_stack+0x82/0xb8
> [7.413982]  [] lockdep_rcu_suspicious+0xf7/0x100
> [7.415828]  [] bio_blkcg+0x89/0x93
> [7.417336]  [] check_blkcg_changed+0x58/0x1b8
> [7.428722]  [] cfq_set_request+0xd1/0x2a3
> [7.439690]  [] elv_set_request+0x1f/0x24
> [7.442157]  [] get_request+0x38f/0x77f
> [7.447449]  [] blk_get_request+0x65/0xa8
> [7.449868]  [] ide_cd_queue_pc+0x76/0x19d
> [7.453757]  [] cdrom_check_status+0x51/0x53
> [7.455372]  [] ide_cdrom_check_events_real+0x20/0x3f
> [7.457294]  [] cdrom_update_events+0x18/0x21
> [7.458987]  [] cdrom_check_events+0x12/0x1f
> [7.460713]  [] idecd_check_events+0x1c/0x1e
> [7.462393]  [] disk_check_events+0x47/0x103
> [7.464129]  [] disk_events_workfn+0x1c/0x1e
> [7.465844]  [] process_one_work+0x272/0x4ee
> [7.467462]  [] worker_thread+0x1eb/0x2c9
> >>
> >>The warning is from
> >>
> >>#define task_css_set_check(task, __c)   
> >>\
> >>rcu_dereference_check((task)->cgroups,  \
> >>lockdep_is_held(&cgroup_mutex) ||   \
> >>lockdep_is_held(&css_set_lock) ||   \
> >>((task)->flags & PF_EXITING) || (__c))
> >>
> >>which is used by bio_blkcg() which is called by the following code in
> >>check_blkcg_changed().
> >>
> >>rcu_read_lock();
> >>serial_nr = bio_blkcg(bio)->css.serial_nr;
> >>rcu_read_unlock();
> >>
> >>So, I have no idea.  It looks like rcu_dereference_check() is being
> >>called with rcu read locked but still triggering suspicious RCU usage
> >>warning.
> >
> >Perhaps there is an rcu_read_unlock() somewhere on the code path?
> >
> >>The code hasn't changed for quite a while now, so it's also really
> >>weird that it's triggering now.  Paul, does anything ring a bell?
> >
> >I have not see something like this recently.
> 
> The question is if it's really new, or just re-triggering because the
> writeback branch changes that function? It's further down though, so
> can't impact the RCU section. And the writeback changes don't have any
> RCU code in them...

At this point, I have to suggest "git bisect"...

Thanx, Paul



Re: [writeback] 8bc4ad9498: INFO: suspicious RCU usage. ]

2016-09-02 Thread Jens Axboe

On 09/02/2016 09:36 AM, Paul E. McKenney wrote:

On Fri, Sep 02, 2016 at 10:56:22AM -0400, Tejun Heo wrote:

(cc'ing Paul, hi!)

Hello,

On Thu, Sep 01, 2016 at 02:13:34PM -0600, Jens Axboe wrote:

On 09/01/2016 04:21 AM, kernel test robot wrote:

[7.323356] cdrom: Uniform CD-ROM driver Revision: 3.20
[7.334239]
[7.337256] ===
[7.340532] [ INFO: suspicious RCU usage. ]
[7.342419] 4.8.0-rc4-8-g8bc4ad9 #1 Not tainted
[7.347065] ---
[7.350132] include/linux/cgroup.h:435 suspicious rcu_dereference_check() 
usage!


Lockdep does not believe that any locks are held, correct?


Which is correct, the queue lock has been dropped at this point.


[7.410074] Call Trace:
[7.411328]  [] dump_stack+0x82/0xb8
[7.413982]  [] lockdep_rcu_suspicious+0xf7/0x100
[7.415828]  [] bio_blkcg+0x89/0x93
[7.417336]  [] check_blkcg_changed+0x58/0x1b8
[7.428722]  [] cfq_set_request+0xd1/0x2a3
[7.439690]  [] elv_set_request+0x1f/0x24
[7.442157]  [] get_request+0x38f/0x77f
[7.447449]  [] blk_get_request+0x65/0xa8
[7.449868]  [] ide_cd_queue_pc+0x76/0x19d
[7.453757]  [] cdrom_check_status+0x51/0x53
[7.455372]  [] ide_cdrom_check_events_real+0x20/0x3f
[7.457294]  [] cdrom_update_events+0x18/0x21
[7.458987]  [] cdrom_check_events+0x12/0x1f
[7.460713]  [] idecd_check_events+0x1c/0x1e
[7.462393]  [] disk_check_events+0x47/0x103
[7.464129]  [] disk_events_workfn+0x1c/0x1e
[7.465844]  [] process_one_work+0x272/0x4ee
[7.467462]  [] worker_thread+0x1eb/0x2c9


The warning is from

#define task_css_set_check(task, __c)   \
rcu_dereference_check((task)->cgroups,   \
lockdep_is_held(&cgroup_mutex) ||   \
lockdep_is_held(&css_set_lock) ||   \
((task)->flags & PF_EXITING) || (__c))

which is used by bio_blkcg() which is called by the following code in
check_blkcg_changed().

rcu_read_lock();
serial_nr = bio_blkcg(bio)->css.serial_nr;
rcu_read_unlock();

So, I have no idea.  It looks like rcu_dereference_check() is being
called with rcu read locked but still triggering suspicious RCU usage
warning.


Perhaps there is an rcu_read_unlock() somewhere on the code path?


The code hasn't changed for quite a while now, so it's also really
weird that it's triggering now.  Paul, does anything ring a bell?


I have not see something like this recently.


The question is if it's really new, or just re-triggering because the
writeback branch changes that function? It's further down though, so
can't impact the RCU section. And the writeback changes don't have any
RCU code in them...


--
Jens Axboe


Re: [writeback] 8bc4ad9498: INFO: suspicious RCU usage. ]

2016-09-02 Thread Paul E. McKenney
On Fri, Sep 02, 2016 at 10:56:22AM -0400, Tejun Heo wrote:
> (cc'ing Paul, hi!)
> 
> Hello,
> 
> On Thu, Sep 01, 2016 at 02:13:34PM -0600, Jens Axboe wrote:
> > On 09/01/2016 04:21 AM, kernel test robot wrote:
> > > [7.323356] cdrom: Uniform CD-ROM driver Revision: 3.20
> > > [7.334239]
> > > [7.337256] ===
> > > [7.340532] [ INFO: suspicious RCU usage. ]
> > > [7.342419] 4.8.0-rc4-8-g8bc4ad9 #1 Not tainted
> > > [7.347065] ---
> > > [7.350132] include/linux/cgroup.h:435 suspicious 
> > > rcu_dereference_check() usage!

Lockdep does not believe that any locks are held, correct?

> ...
> > > [7.410074] Call Trace:
> > > [7.411328]  [] dump_stack+0x82/0xb8
> > > [7.413982]  [] lockdep_rcu_suspicious+0xf7/0x100
> > > [7.415828]  [] bio_blkcg+0x89/0x93
> > > [7.417336]  [] check_blkcg_changed+0x58/0x1b8
> > > [7.428722]  [] cfq_set_request+0xd1/0x2a3
> > > [7.439690]  [] elv_set_request+0x1f/0x24
> > > [7.442157]  [] get_request+0x38f/0x77f
> > > [7.447449]  [] blk_get_request+0x65/0xa8
> > > [7.449868]  [] ide_cd_queue_pc+0x76/0x19d
> > > [7.453757]  [] cdrom_check_status+0x51/0x53
> > > [7.455372]  [] ide_cdrom_check_events_real+0x20/0x3f
> > > [7.457294]  [] cdrom_update_events+0x18/0x21
> > > [7.458987]  [] cdrom_check_events+0x12/0x1f
> > > [7.460713]  [] idecd_check_events+0x1c/0x1e
> > > [7.462393]  [] disk_check_events+0x47/0x103
> > > [7.464129]  [] disk_events_workfn+0x1c/0x1e
> > > [7.465844]  [] process_one_work+0x272/0x4ee
> > > [7.467462]  [] worker_thread+0x1eb/0x2c9
> 
> The warning is from
> 
> #define task_css_set_check(task, __c) \
>   rcu_dereference_check((task)->cgroups,  \
>   lockdep_is_held(&cgroup_mutex) ||   \
>   lockdep_is_held(&css_set_lock) ||   \
>   ((task)->flags & PF_EXITING) || (__c))
> 
> which is used by bio_blkcg() which is called by the following code in
> check_blkcg_changed().
> 
>   rcu_read_lock();
>   serial_nr = bio_blkcg(bio)->css.serial_nr;
>   rcu_read_unlock();
> 
> So, I have no idea.  It looks like rcu_dereference_check() is being
> called with rcu read locked but still triggering suspicious RCU usage
> warning.

Perhaps there is an rcu_read_unlock() somewhere on the code path?

> The code hasn't changed for quite a while now, so it's also really
> weird that it's triggering now.  Paul, does anything ring a bell?

I have not see something like this recently.

Thanx, Paul



Re: [writeback] 8bc4ad9498: INFO: suspicious RCU usage. ]

2016-09-02 Thread Tejun Heo
(cc'ing Paul, hi!)

Hello,

On Thu, Sep 01, 2016 at 02:13:34PM -0600, Jens Axboe wrote:
> On 09/01/2016 04:21 AM, kernel test robot wrote:
> > [7.323356] cdrom: Uniform CD-ROM driver Revision: 3.20
> > [7.334239]
> > [7.337256] ===
> > [7.340532] [ INFO: suspicious RCU usage. ]
> > [7.342419] 4.8.0-rc4-8-g8bc4ad9 #1 Not tainted
> > [7.347065] ---
> > [7.350132] include/linux/cgroup.h:435 suspicious 
> > rcu_dereference_check() usage!
...
> > [7.410074] Call Trace:
> > [7.411328]  [] dump_stack+0x82/0xb8
> > [7.413982]  [] lockdep_rcu_suspicious+0xf7/0x100
> > [7.415828]  [] bio_blkcg+0x89/0x93
> > [7.417336]  [] check_blkcg_changed+0x58/0x1b8
> > [7.428722]  [] cfq_set_request+0xd1/0x2a3
> > [7.439690]  [] elv_set_request+0x1f/0x24
> > [7.442157]  [] get_request+0x38f/0x77f
> > [7.447449]  [] blk_get_request+0x65/0xa8
> > [7.449868]  [] ide_cd_queue_pc+0x76/0x19d
> > [7.453757]  [] cdrom_check_status+0x51/0x53
> > [7.455372]  [] ide_cdrom_check_events_real+0x20/0x3f
> > [7.457294]  [] cdrom_update_events+0x18/0x21
> > [7.458987]  [] cdrom_check_events+0x12/0x1f
> > [7.460713]  [] idecd_check_events+0x1c/0x1e
> > [7.462393]  [] disk_check_events+0x47/0x103
> > [7.464129]  [] disk_events_workfn+0x1c/0x1e
> > [7.465844]  [] process_one_work+0x272/0x4ee
> > [7.467462]  [] worker_thread+0x1eb/0x2c9

The warning is from

#define task_css_set_check(task, __c)   \
rcu_dereference_check((task)->cgroups,  \
lockdep_is_held(&cgroup_mutex) ||   \
lockdep_is_held(&css_set_lock) ||   \
((task)->flags & PF_EXITING) || (__c))

which is used by bio_blkcg() which is called by the following code in
check_blkcg_changed().

rcu_read_lock();
serial_nr = bio_blkcg(bio)->css.serial_nr;
rcu_read_unlock();

So, I have no idea.  It looks like rcu_dereference_check() is being
called with rcu read locked but still triggering suspicious RCU usage
warning.

The code hasn't changed for quite a while now, so it's also really
weird that it's triggering now.  Paul, does anything ring a bell?

Thanks.

-- 
tejun


Re: [writeback] 8bc4ad9498: INFO: suspicious RCU usage. ]

2016-09-01 Thread Jens Axboe

On 09/01/2016 04:21 AM, kernel test robot wrote:


FYI, we noticed the following commit:

https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git 
wb-buf-throttle
commit 8bc4ad9498f81d6689da42457615db0989e720d3 ("writeback: throttle buffered 
writeback")

in testcase: boot

on test machine: qemu-system-x86_64 -enable-kvm -cpu Westmere -m 512M

caused below changes:


+--+++
|  | b2e4e4e34d | 8bc4ad9498 |
+--+++
| boot_successes   | 6  | 0  |
| boot_failures| 0  | 8  |
| INFO:suspicious_RCU_usage| 0  | 8  |
| message:INFO:suspicious_RCU_usage| 0  | 0.0|
| pattern:INFO:suspicious_RCU_usage| 0  | 0  |
| calltrace:disk_events_workfn | 0  | 8  |
| message:calltrace:disk_events_workfn | 0  | 0.0|
| pattern:calltrace:disk_events_workfn | 0  | 0  |
+--+++



[7.323356] cdrom: Uniform CD-ROM driver Revision: 3.20
[7.334239]
[7.337256] ===
[7.340532] [ INFO: suspicious RCU usage. ]
[7.342419] 4.8.0-rc4-8-g8bc4ad9 #1 Not tainted
[7.347065] ---
[7.350132] include/linux/cgroup.h:435 suspicious rcu_dereference_check() 
usage!
[7.355928]
[7.355928] other info that might help us debug this:
[7.355928]
[7.360548]
[7.360548] rcu_scheduler_active = 1, debug_locks = 0
[7.366185] 3 locks held by kworker/0:1/19:
[7.369225]  #0:  ("events_freezable_power_efficient"){.+.+..}, at: 
[] process_one_work+0x151/0x4ee
[7.377225]  #1:  ((&(&ev->dwork)->work)){+.+...}, at: [] 
process_one_work+0x151/0x4ee
[7.384175]  #2:  (&(&q->__queue_lock)->rlock){-.}, at: 
[] cfq_set_request+0x5e/0x2a3
[7.389264]
[7.389264] stack backtrace:
[7.391300] CPU: 0 PID: 19 Comm: kworker/0:1 Not tainted 
4.8.0-rc4-8-g8bc4ad9 #1
[7.395279] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
Debian-1.8.2-1 04/01/2014
[7.398012] Workqueue: events_freezable_power_ disk_events_workfn
[7.399933]   88003db1f8e8 8178ed3b 
88003db184c0
[7.404348]  0001 88003db1f918 81123472 
88003db184c0
[7.407158]  84091180 0003 880039d4c400 
88003db1f938
[7.410074] Call Trace:
[7.411328]  [] dump_stack+0x82/0xb8
[7.413982]  [] lockdep_rcu_suspicious+0xf7/0x100
[7.415828]  [] bio_blkcg+0x89/0x93
[7.417336]  [] check_blkcg_changed+0x58/0x1b8
[7.419027]  [] ? lock_acquire+0x125/0x1be
[7.421418]  [] ? lock_acquire+0x125/0x1be
[7.424049]  [] ? lock_acquired+0xfc/0x2f2
[7.426893]  [] ? cfq_set_request+0x5e/0x2a3
[7.428722]  [] cfq_set_request+0xd1/0x2a3
[7.430315]  [] ? _raw_spin_unlock_irq+0x2c/0x3b
[7.432134]  [] ? trace_hardirqs_on_caller+0x17c/0x198
[7.434083]  [] ? trace_hardirqs_on+0xd/0xf
[7.435703]  [] ? _raw_spin_unlock_irq+0x31/0x3b
[7.437472]  [] ? ioc_create_icq+0x15e/0x175
[7.439690]  [] elv_set_request+0x1f/0x24
[7.442157]  [] get_request+0x38f/0x77f
[7.444694]  [] ? add_wait_queue_exclusive+0x44/0x44
[7.447449]  [] blk_get_request+0x65/0xa8
[7.449868]  [] ide_cd_queue_pc+0x76/0x19d
[7.451986]  [] ? __lock_acquire+0x679/0xed1
[7.453757]  [] cdrom_check_status+0x51/0x53
[7.455372]  [] ide_cdrom_check_events_real+0x20/0x3f
[7.457294]  [] cdrom_update_events+0x18/0x21
[7.458987]  [] cdrom_check_events+0x12/0x1f
[7.460713]  [] idecd_check_events+0x1c/0x1e
[7.462393]  [] disk_check_events+0x47/0x103
[7.464129]  [] disk_events_workfn+0x1c/0x1e
[7.465844]  [] process_one_work+0x272/0x4ee
[7.467462]  [] worker_thread+0x1eb/0x2c9
[7.469137]  [] ? rescuer_thread+0x2b3/0x2b3
[7.471338]  [] kthread+0xc5/0xcd
[7.472978]  [] ret_from_fork+0x1f/0x40
[7.474501]  [] ? kthread_create_on_node+0x1b0/0x1b0
[7.477061] Loading iSCSI transport class v2.0-870.
[7.479285] rdac: device handler registered


Pretty sure this has nothing to do with the writeback code, it's just 
triggering and now blaming one of those commits since they touch the 
same function. The RCU usage in there dates back to:


commit 598971bfbdfdc8701337dc1636c7919c44699914
Author: Tejun Heo 
Date:   Mon Mar 19 15:10:58 2012 -0700

cfq: don't use icq_get_changed()

Tejun, see what's up there, if anything?

--
Jens Axboe