On Mon, Dec 03, 2018 at 07:44:03AM +0000, He, Bo wrote:
> Thanks, we ran the test for the whole weekend and did not reproduce the
> issue, so we can confirm that CONFIG_RCU_BOOST fixes the issue.
Very good, that is encouraging. Perhaps I should think about making
CONFIG_RCU_BOOST=y the default for CONFIG_PREEMPT in mainline, at least
for architectures for which rt_mutexes are implemented.
> We have set rcupdate.rcu_cpu_stall_timeout=7 and also enabled panic on
> RCU stall, and will see whether we can trigger the panic. We will keep you
> posted with the test results.
> echo 1 > /proc/sys/kernel/panic_on_rcu_stall
Looking forward to seeing what is going on! Of course, to reproduce, you
will need to again build with CONFIG_RCU_BOOST=n.
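In case it helps when setting up the reproduction run, here is a rough
shell sketch for checking which stall timeout a given kernel command line
asks for. The function name effective_stall_timeout is made up for
illustration, and the 21-second default is an assumption (it is a
build-time setting, so your kernel may differ). The three-second floor is
the one described in the quoted message below.

```shell
#!/bin/sh
# Hypothetical helper: print the RCU CPU stall timeout, in seconds, that a
# kernel command line requests.
# Assumptions: the default is 21 seconds when the parameter is absent, and
# a non-numeric value falls back to that default (a choice made only for
# this sketch).
# The body runs in a subshell so that "set -f" does not leak to the caller.
effective_stall_timeout() (
    set -f                      # do not glob-expand words of the command line
    timeout=21
    for arg in $1; do
        case $arg in
            rcupdate.rcu_cpu_stall_timeout=*)
                # Keep the last occurrence if the parameter is repeated.
                timeout=${arg#rcupdate.rcu_cpu_stall_timeout=}
                ;;
        esac
    done
    case $timeout in
        ''|*[!0-9]*) timeout=21 ;;
    esac
    # Timeouts below three seconds are treated as three seconds.
    if [ "$timeout" -lt 3 ]; then
        timeout=3
    fi
    echo "$timeout"
)
```

Running effective_stall_timeout "$(cat /proc/cmdline)" on the test build
should then print 7 if the parameter made it onto the command line.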
Thanx, Paul
> -----Original Message-----
> From: Paul E. McKenney <[email protected]>
> Sent: Saturday, December 1, 2018 12:49 AM
> To: He, Bo <[email protected]>
> Cc: Steven Rostedt <[email protected]>; [email protected];
> [email protected]; [email protected];
> [email protected]; Zhang, Jun <[email protected]>; Xiao, Jin
> <[email protected]>; Zhang, Yanmin <[email protected]>
> Subject: Re: rcu_preempt caused oom
>
> On Fri, Nov 30, 2018 at 03:18:58PM +0000, He, Bo wrote:
> > Here is the kernel cmdline:
>
> Thank you!
>
> > Kernel command line: androidboot.acpio_idx=0
> > androidboot.bootloader=efiwrapper-02_03-userdebug_kernelflinger-06_03-
> > userdebug androidboot.diskbus=00.0 androidboot.verifiedbootstate=green
> > androidboot.bootreason=power-on androidboot.serialno=R1J56L6006a7bb
> > g_ffs.iSerialNumber=R1J56L6006a7bb no_timer_check noxsaves
> > reboot_panic=p,w i915.hpd_sense_invert=0x7 mem=2G nokaslr nopti
> > ftrace_dump_on_oops trace_buf_size=1024K intel_iommu=off gpt
> > loglevel=4 androidboot.hardware=gordon_peak
> > firmware_class.path=/vendor/firmware relative_sleep_states=1
> > enforcing=0 androidboot.selinux=permissive cpu_init_udelay=10
> > androidboot.android_dt_dir=/sys/bus/platform/devices/ANDR0001:00/prope
> > rties/android/ pstore.backend=ramoops memmap=0x1400000$0x50000000
> > ramoops.mem_address=0x50000000 ramoops.mem_size=0x1400000
> > ramoops.record_size=0x4000 ramoops.console_size=0x1000000
> > ramoops.ftrace_size=0x10000 ramoops.dump_oops=1 vga=current
> > i915.modeset=1 drm.atomic=1 i915.nuclear_pageflip=1
> > drm.vblankoffdelay=
>
> And no sign of any suppression of RCU CPU stall warnings. Hmmm...
> Does it take more than 21 seconds to OOM? Or do things happen faster than
> that? If they do happen faster than that, then one approach would be to add
> something like this to the kernel command line:
>
> rcupdate.rcu_cpu_stall_timeout=7
>
> This would set the stall timeout to seven seconds. Note that timeouts less
> than three seconds are silently interpreted as three seconds.
>
> Thanx, Paul
>
> > -----Original Message-----
> > From: Steven Rostedt <[email protected]>
> > Sent: Friday, November 30, 2018 11:17 PM
> > To: Paul E. McKenney <[email protected]>
> > Cc: He, Bo <[email protected]>; [email protected];
> > [email protected]; [email protected];
> > [email protected]; Zhang, Jun <[email protected]>; Xiao, Jin
> > <[email protected]>; Zhang, Yanmin <[email protected]>
> > Subject: Re: rcu_preempt caused oom
> >
> > On Fri, 30 Nov 2018 06:43:17 -0800
> > "Paul E. McKenney" <[email protected]> wrote:
> >
> > > Could you please send me your list of kernel boot parameters? They
> > > usually appear near the start of your console output.
> >
> > Or just: cat /proc/cmdline
> >
> > -- Steve
> >
>