Hello, Sorry about the long delay.
On Thu, Dec 14, 2017 at 01:21:09PM -0500, Steven Rostedt wrote: > > Yeah, will do, but out of curiosity, Sergey and I already described > > what the root problem was and you didn't really seem to take that. Is > > that because the explanation didn't make sense to you or us > > misunderstanding what your code does? > > Can you post the message id of the discussion you are referencing. > Because I've been swamped with other activities and only been skimming > these threads. This was already on another reply but just in case. http://lkml.kernel.org/r/20171108162813.ga983...@devbig577.frc2.facebook.com I tried your v4 patch and ran the test module and could easily reproduce RCU stall and other issues stemming from a CPU getting pegged down by printk flushing. I'm attaching the test module code at the end. I wrote this before but this isn't a theoretical problem. We see these stalls a lot. Preemption isn't enabled to begin with. Memory pressure is high and OOM triggers and printk starts printing out OOM warnings; then, a network packet comes in which triggers allocations in the network layer, which fails due to memory pressure, which then generates memory allocation failure messages which then generates netconsole packets which then tries to allocate more memory and so on. It's just that there's no one else to give that flushing duty too, so the pingpoinging that your patch implements can't really help anything. You argue that it isn't made worse by your patch, which may be true but your patch doesn't solve actual problems and is most likely unnecessary complication which gets in the way for the actual solution. It's a weird argument to push an approach which is fundamentally broken. Let's please stop that. Thanks. #include <linux/module.h> #include <linux/delay.h> #include <linux/sched.h> #include <linux/mutex.h> #include <linux/workqueue.h> #include <linux/hrtimer.h> static bool in_printk; static bool stop_testing; static struct hrtimer printk_timer; static ktime_t timer_interval; static enum hrtimer_restart printk_timerfn(struct hrtimer *timer) { int i; if (READ_ONCE(in_printk)) for (i = 0; i < 10000; i++) printk("%-80s\n", "XXX TIMER"); hrtimer_forward_now(&printk_timer, timer_interval); return READ_ONCE(stop_testing) ? HRTIMER_NORESTART : HRTIMER_RESTART; } static void preempt_printk_workfn(struct work_struct *work) { int i; hrtimer_init(&printk_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL); printk_timer.function = printk_timerfn; timer_interval = ktime_set(0, NSEC_PER_MSEC); hrtimer_start(&printk_timer, timer_interval, HRTIMER_MODE_REL); while (!READ_ONCE(stop_testing)) { preempt_disable(); WRITE_ONCE(in_printk, true); for (i = 0; i < 100; i++) printk("%-80s\n", "XXX PREEMPT"); WRITE_ONCE(in_printk, false); preempt_enable(); msleep(1); } } static DECLARE_WORK(preempt_printk_work, preempt_printk_workfn); static int __init test_init(void) { queue_work_on(0, system_wq, &preempt_printk_work); return 0; } static void __exit test_exit(void) { WRITE_ONCE(stop_testing, true); flush_work(&preempt_printk_work); hrtimer_cancel(&printk_timer); } module_init(test_init); module_exit(test_exit); MODULE_LICENSE("GPL");