subject:"\"\\\[d_alloc_parallel\\\] WARNING\\\: bad unlock balance detected\\\!\""

Re: [d_alloc_parallel] WARNING: bad unlock balance detected!

2017-11-07 Thread Peter Zijlstra

On Tue, Nov 07, 2017 at 12:03:29PM +0800, Fengguang Wu wrote:

> >>[  428.512005] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow 
> >>Control: RX
> >>LKP: HOSTNAME vm-lkp-wsx03-openwrt-i386-8, MAC , kernel 4.14.0-rc8 158, 
> >>serial console /dev/ttyS0
> >>[  429.798345] Kernel tests: Boot OK!
> >>[  430.761760] [  430.766166] =
> >>[  430.775297] WARNING: bad unlock balance detected!
> >>[  430.784342] 4.14.0-rc8 #158 Not tainted
> >>[  430.792153] -
> >>[  430.801319] pidof/1024 is trying to release lock (rcu_preempt_state) at:
> >>[  430.813514] [] rcu_read_unlock_special+0x5f8/0x620
> >>[  430.824041] but there are no more locks to release!
> >
> >Er... yes?  What of that?  Since when is rcu_read_lock() not allowed to
> >be used under an rwsem?

That's not what it says, it is.. The warning is about trying to release
a lock that's not held. And its right, RCU was doing that. It would
acquire a lock without lockdep knowing about it and then telling lockdep
about freeing it.

This is fixed by commit:

  02a7c234e540 ("rcu: Suppress lockdep false-positive ->boost_mtx complaints")

The problem is that RCU boosting was mixing futex and !futex rt_mutex
ops.

Re: [d_alloc_parallel] WARNING: bad unlock balance detected!

2017-11-06 Thread Fengguang Wu


CC locking people.

On Tue, Nov 07, 2017 at 02:33:28AM +, Al Viro wrote:

On Tue, Nov 07, 2017 at 10:01:13AM +0800, Fengguang Wu wrote:

Hi,

Here is a warning in v4.14-rc8 -- it's not necessarily a new bug.


Why is it a bug at all?


[  428.512005] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: 
RX
LKP: HOSTNAME vm-lkp-wsx03-openwrt-i386-8, MAC , kernel 4.14.0-rc8 158, serial 
console /dev/ttyS0
[  429.798345] Kernel tests: Boot OK!
[  430.761760] [  430.766166] =
[  430.775297] WARNING: bad unlock balance detected!
[  430.784342] 4.14.0-rc8 #158 Not tainted
[  430.792153] -
[  430.801319] pidof/1024 is trying to release lock (rcu_preempt_state) at:
[  430.813514] [] rcu_read_unlock_special+0x5f8/0x620
[  430.824041] but there are no more locks to release!


Er... yes?  What of that?  Since when is rcu_read_lock() not allowed to
be used under an rwsem?


[  430.833342] [  430.833342] other info that might help us debug this:
[  430.845985] 2 locks held by pidof/1024:
[  430.853826]  #0:  (&sb->s_type->i_mutex_key){}, at: [] 
lookup_slow+0x8a/0x310
[  430.869344]  #1:  (rcu_read_lock){}, at: [] 
d_alloc_parallel+0x7e/0xd10


No shit - we are doing RCU cache chain walk while holding ->i_rwsem.  As in
down_read(&rwsem);
...
rcu_read_lock();
...
rcu_read_unlock();

Why is that a problem?  If we are suddenly not allowed to have an RCU reader
section while holding any kind of a blocking lock, a *lot* of places in the
kernel are screwed.

Please, explain.


Good question! Actually it's not only you.

There are dozens of occurrences for this "unlock balance" warning
in RC8:

 ((console_sem).lock){-...}, at:  up
 (gcov_lock){+.+.}, at:  gcov_enable_events
 (&mm->mmap_sem){}, at:  __do_page_fault
 (node_lock){+.+.}, at:  gcov_event
 (&p->pi_lock){-.-.}, at:  try_to_wake_up
 (rcu_read_lock){}, at:  aa_file_perm
 (rcu_read_lock){}, at:  copy_namespaces
 (rcu_read_lock){}, at:  d_alloc_parallel
 (rcu_read_lock){}, at:  __d_lookup
 (rcu_read_lock){}, at:  dput
 (rcu_read_lock){}, at:  find_get_entry
 (rcu_read_lock){}, at:  insert_retry
 (rcu_read_lock){}, at:  mntput_no_expire
 (rcu_read_lock){}, at:  netlink_insert
 (rcu_read_lock){}, at:  rcu_read_lock
 (rcu_read_lock){}, at:  rcu_torture_read_lock
 (rcu_read_lock){}, at:  rhashtable_insert_slow
 (rcu_read_lock){}, at:  rhashtable_walk_start
 (rcu_read_lock){}, at:  sock_def_readable
 (rcu_read_lock){}, at:  SyS_setpriority
 (rcu_read_lock){}, at:  T.947
 (rcu_read_lock){}, at:  T.949
 (rcu_read_lock){}, at:  test_rhashtable
 (rcu_read_lock){}, at:  test_rht_lookup
 (rcu_read_lock){}, at:  threadfunc
 (rcu_read_lock){}, at:  thread_lookup_test
 (rcu_read_lock){}, at:  watchdog
 (rcu_sched_state.exp_mutex){+.+.}, at:  _synchronize_rcu_expedited
 (&sb->s_type->i_mutex_key#3){+.+.}, at:  start_creating
 (&sb->s_type->i_mutex_key){}, at:  lookup_slow
 (&sig->cred_guard_mutex){}, at:  prepare_bprm_creds
 (sk_lock-AF_NETLINK){}, at:  netlink_insert
 (tasklist_lock){}, at:  debug_show_all_locks
 (tasklist_lock){.+.+}, at:  debug_show_all_locks
 (tty_mutex){}, at:  tty_open
 (tty_mutex){+.+.}, at:  tty_open
 (tty_mutex){+.+.}, at:  tty_release_struct

Maybe some recent core locking changes triggered this warning.
In particular, some warnings show up since this commit.

commit cde50a67397c0da7d11795d4b4418384022ab8e6
Author: Levin, Alexander (Sasha Levin) 
AuthorDate: Sun Jun 18 14:06:01 2017 +
Commit: Ingo Molnar 
CommitDate: Tue Jun 20 11:53:09 2017 +0200

   locking/rtmutex: Don't initialize lockdep when not required

   pi_mutex isn't supposed to be tracked by lockdep, but just
   passing NULLs for name and key will cause lockdep to spew a
   warning and die, which is not what we want it to do.

   Skip lockdep initialization if the caller passed NULLs for
   name and key, suggesting such initialization isn't desired.

   Signed-off-by: Sasha Levin 
   Cc: Linus Torvalds 
   Cc: Peter Zijlstra 
   Cc: Thomas Gleixner 
   Fixes: f5694788ad8d ("rt_mutex: Add lockdep annotations")
   Link: 
http://lkml.kernel.org/r/20170618140548.4763-1-alexander.le...@verizon.com
   Signed-off-by: Ingo Molnar 
---
kernel/locking/rtmutex.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 43123533e9b10..78069895032a9 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1661,6 +1661,7 @@ void __rt_mutex_init(struct rt_mutex *lock, const char 
*name,
   lock->waiters = RB_ROOT;
   lock->waiters_leftmost = NULL;

+   if (name && key)
   debug_rt_mutex_init(lock, name, key);
}
EXPORT_SYMBOL_GPL(__rt_mutex_init);

018956d641  locking/selftest: Add RT-mutex support
cde50a6739  locking/rtmutex: Don't initialize lockdep when not requir

Re: [d_alloc_parallel] WARNING: bad unlock balance detected!

2017-11-06 Thread Al Viro

On Tue, Nov 07, 2017 at 10:01:13AM +0800, Fengguang Wu wrote:
> Hi,
> 
> Here is a warning in v4.14-rc8 -- it's not necessarily a new bug.

Why is it a bug at all?

> [  428.512005] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow 
> Control: RX
> LKP: HOSTNAME vm-lkp-wsx03-openwrt-i386-8, MAC , kernel 4.14.0-rc8 158, 
> serial console /dev/ttyS0
> [  429.798345] Kernel tests: Boot OK!
> [  430.761760] [  430.766166] =
> [  430.775297] WARNING: bad unlock balance detected!
> [  430.784342] 4.14.0-rc8 #158 Not tainted
> [  430.792153] -
> [  430.801319] pidof/1024 is trying to release lock (rcu_preempt_state) at:
> [  430.813514] [] rcu_read_unlock_special+0x5f8/0x620
> [  430.824041] but there are no more locks to release!

Er... yes?  What of that?  Since when is rcu_read_lock() not allowed to
be used under an rwsem?

> [  430.833342] [  430.833342] other info that might help us debug this:
> [  430.845985] 2 locks held by pidof/1024:
> [  430.853826]  #0:  (&sb->s_type->i_mutex_key){}, at: [] 
> lookup_slow+0x8a/0x310
> [  430.869344]  #1:  (rcu_read_lock){}, at: [] 
> d_alloc_parallel+0x7e/0xd10

No shit - we are doing RCU cache chain walk while holding ->i_rwsem.  As in
down_read(&rwsem);
...
rcu_read_lock();
...
rcu_read_unlock();

Why is that a problem?  If we are suddenly not allowed to have an RCU reader
section while holding any kind of a blocking lock, a *lot* of places in the
kernel are screwed.

Please, explain.

[d_alloc_parallel] WARNING: bad unlock balance detected!

2017-11-06 Thread Fengguang Wu


Hi,

Here is a warning in v4.14-rc8 -- it's not necessarily a new bug.

[  428.512005] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: 
RX
LKP: HOSTNAME vm-lkp-wsx03-openwrt-i386-8, MAC , kernel 4.14.0-rc8 158, serial 
console /dev/ttyS0
[  429.798345] Kernel tests: Boot OK!
[  430.761760] 
[  430.766166] =

[  430.775297] WARNING: bad unlock balance detected!
[  430.784342] 4.14.0-rc8 #158 Not tainted
[  430.792153] -
[  430.801319] pidof/1024 is trying to release lock (rcu_preempt_state) at:
[  430.813514] [] rcu_read_unlock_special+0x5f8/0x620
[  430.824041] but there are no more locks to release!
[  430.833342] 
[  430.833342] other info that might help us debug this:

[  430.845985] 2 locks held by pidof/1024:
[  430.853826]  #0:  (&sb->s_type->i_mutex_key){}, at: [] 
lookup_slow+0x8a/0x310
[  430.869344]  #1:  (rcu_read_lock){}, at: [] 
d_alloc_parallel+0x7e/0xd10
[  430.884215] 
[  430.884215] stack backtrace:

[  430.893488] CPU: 0 PID: 1024 Comm: pidof Not tainted 4.14.0-rc8 #158
[  430.905217] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
1.10.2-1 04/01/2014
[  430.920682] Call Trace:
[  430.926481]  dump_stack+0x16/0x1c
[  430.933406]  print_unlock_imbalance_bug+0xb9/0xd0
[  430.942512]  ? rcu_read_unlock_special+0x5f8/0x620
[  430.951690]  ? rcu_read_unlock_special+0x5f8/0x620
[  430.960928]  lock_release+0x1cc/0x490
[  430.968446]  ? rcu_gp_kthread_wake+0x34/0x50
[  430.976960]  ? rcu_read_unlock_special+0x5f8/0x620
[  430.986108]  rt_mutex_unlock+0x1e/0xb0
[  430.993754]  rcu_read_unlock_special+0x5f8/0x620
[  431.003016]  __rcu_read_unlock+0xa7/0xb0
[  431.010665]  d_alloc_parallel+0x345/0xd10
[  431.018441]  lookup_slow+0xe4/0x310
[  431.025500]  walk_component+0x146/0x590
[  431.031915]  path_lookupat+0xcc/0x3f0
[  431.038634]  ? 0xc100
[  431.042398]  ? 0xc100
[  431.046246]  filename_lookup+0xd8/0x290
[  431.051246]  ? strncpy_from_user+0x13f/0x260
[  431.056818]  user_path_at_empty+0x25/0x30
[  431.062068]  SyS_readlink+0x52/0x130
[  431.066929]  do_int80_syscall_32+0x95/0x290
[  431.072306]  entry_INT80_32+0x2f/0x2f
[  431.077230] EIP: 0xb7f0e6a2
[  431.081187] EFLAGS: 0296 CPU: 0
[  431.085910] EAX: ffda EBX: bf82714b ECX: 080964b8 EDX: 0050
[  431.095247] ESI: 0050 EDI: bf82714b EBP: bf827124 ESP: bf827100
[  431.107404]  DS: 007b ES: 007b FS:  GS: 0033 SS: 007b
[  431.617153] 8021q: adding VLAN 0 to HW filter on device eth0
[  433.538079] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: 
RX

Thanks,
Fengguang

Re: [d_alloc_parallel] WARNING: bad unlock balance detected!

Re: [d_alloc_parallel] WARNING: bad unlock balance detected!

Re: [d_alloc_parallel] WARNING: bad unlock balance detected!

[d_alloc_parallel] WARNING: bad unlock balance detected!

4 matches

Site Navigation

Mail list logo

Footer information