On 07/25/2014 07:19 PM, Paul E. McKenney wrote:
> On Thu, Jul 24, 2014 at 07:28:35PM -0400, Sasha Levin wrote:
>> On 07/24/2014 06:54 PM, Paul E. McKenney wrote:
>>> On Thu, Jul 24, 2014 at 06:19:11PM -0400, Sasha Levin wrote:
>>>> Hi all,
>>>>
>>>> While fuzzing with trinity inside a KVM tools guest running the latest -next
>>>> kernel I've stumbled on the following stack trace (full log attached):
>>>>
>>>> [  370.662014] INFO: task trinity-main:8727 blocked for more than 120 seconds.
>>>> [  370.662891]       Not tainted 3.16.0-rc6-next-20140724-sasha-00046-g7324c87-dirty #932
>>>> [  370.663655] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>> [  370.664562] trinity-main    D ffff88053cc80000 13064  8727   8714 0x00000000
>>>> [  370.665328]  ffff88053da6fc10 0000000000000002 ffff8805483e2dc8 ffff880541873000
>>>> [  370.666147]  000000276ed30787 ffff88053da6c010 ffff88053da6c000 ffff8805452a0000
>>>> [  370.667243]  ffff880541873000 0000000000000000 7fffffffffffffff ffffffffb3ec51d8
>>>> [  370.668788] Call Trace:
>>>> [  370.669118] schedule (kernel/sched/core.c:2847)
>>>> [  370.670538] schedule_timeout (kernel/time/timer.c:1476)
>>>> [  370.671524] ? mark_lock (kernel/locking/lockdep.c:2894)
>>>> [  370.672299] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63)
>>>> [  370.673227] ? get_parent_ip (kernel/sched/core.c:2561)
>>>> [  370.674085] wait_for_completion (include/linux/spinlock.h:328 kernel/sched/completion.c:76 kernel/sched/completion.c:93 kernel/sched/completion.c:101 kernel/sched/completion.c:122)
>>>> [  370.674960] ? wake_up_state (kernel/sched/core.c:2942)
>>>> [  370.675576] _rcu_barrier (kernel/rcu/tree.c:3325 (discriminator 8))
>>>> [  370.676109] rcu_barrier (kernel/rcu/tree_plugin.h:920)
>>>> [  370.676627] netdev_run_todo (net/core/dev.c:6323)
>>>> [  370.677202] rtnl_unlock (net/core/rtnetlink.c:80)
>>>> [  370.677714] unregister_netdev (net/core/dev.c:6687)
>>>> [  370.678266] gprs_attach (net/phonet/pep-gprs.c:311)
>>>> [  370.679641] pep_setsockopt (net/phonet/pep.c:1016)
>>>> [  370.681082] sock_common_setsockopt (net/core/sock.c:2603)
>>>> [  370.682048] SyS_setsockopt (net/socket.c:1914 net/socket.c:1894)
>>>> [  370.682854] tracesys (arch/x86/kernel/entry_64.S:541)
>>>> [  370.683586] 1 lock held by trinity-main/8727:
>>>> [  370.684232] #0: (rcu_preempt_state.barrier_mutex){+.+...}, at: _rcu_barrier (kernel/rcu/tree.c:3233)
>>>>
>>>> This has reproduced couple of times, and has always originated from
>>>> gprs_attach. I don't see any obvious issues with the code there, so
>>>> I'm not sure if it's a fault of the phonet or the rcu code.
>>>
>>> Can't tell much from this.  Any chance of a .config?
>>>
>>>                                                         Thanx, Paul
>>>
>>
>> Attached.
> If you were doing partial nohz_full= CPUs, there is a recent RCU bug
> that would result in these symptoms.  No idea how you would make it
> happen without specifying the nohz_full= boot parameter, but I should
> be getting the fix into -next in a few days.
> 
> But you never know.  So if you are interested in testing sooner, and if
> my local tests pass, I could send you a modified patch that applies on
> top of rcu/next.  If you would like such a patch, let me know.

Sure, if you Cc me on it I'll be happy to test it out. Just don't go out
of your way: I've disabled phonet for now anyway, so this isn't really
delaying me.


Thanks,
Sasha