Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-05-09 Thread Sergey Senozhatsky
Hi, Move printk and (some of) MM people to the recipients list. On (01/10/18 09:02), Tejun Heo wrote: [..] > The particular case that we've been seeing regularly in the fleet was > the following scenario. > > 1. Console is IPMI emulated serial console. Super slow. Also > netconsole is in us

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-04-22 Thread Sergey Senozhatsky
On (01/23/18 07:43), Tejun Heo wrote: > > > > We can have more. But if printk is causing printks, that's a major bug. > > And work queues are not going to fix it, it will just spread out the > > pain. Have it be 100 printks, it needs to be fixed if it is happening. > > And having all printks just

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-24 Thread Tejun Heo
Hello, Peter. On Wed, Jan 24, 2018 at 10:36:07AM +0100, Peter Zijlstra wrote: > On Wed, Jan 10, 2018 at 09:02:23AM -0800, Tejun Heo wrote: > > 1. Console is IPMI emulated serial console. Super slow. Also > > netconsole is in use. > > So my IPMI SoE typically run at 115200 Baud (or higher) an

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-24 Thread Peter Zijlstra
On Wed, Jan 10, 2018 at 09:02:23AM -0800, Tejun Heo wrote: > 1. Console is IPMI emulated serial console. Super slow. Also > netconsole is in use. So my IPMI SoE typically run at 115200 Baud (or higher) and I've not had trouble like that (granted I don't typically trigger OOM storms, but they

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-23 Thread Sergey Senozhatsky
On (01/23/18 21:52), Steven Rostedt wrote: > On Wed, 24 Jan 2018 11:11:33 +0900 > Sergey Senozhatsky wrote: > > > Please take a look. > > Was there something specific to look at? Not really. Just my previous email, basically. You said "I have to look at the latest code." so I replied. Well, if

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-23 Thread Steven Rostedt
On Wed, 24 Jan 2018 11:11:33 +0900 Sergey Senozhatsky wrote: > Please take a look. Was there something specific to look at? I'm doing a hundred different things at once, and my memory cache keeps getting flushed. -- Steve

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-23 Thread Sergey Senozhatsky
Hello, On (01/23/18 11:24), Steven Rostedt wrote: [..] > > With WQ we don't lockup the kernel, because we flush printk_safe in > > preemptible context. And people are very much expected to fix the > > misbehaving consoles. But that should not be printk_safe problem. > > Right, but now you just ma

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-23 Thread Tejun Heo
Hello, Sergey. On Wed, Jan 24, 2018 at 01:01:53AM +0900, Sergey Senozhatsky wrote: > On (01/23/18 10:41), Steven Rostedt wrote: > [..] > > We can have more. But if printk is causing printks, that's a major bug. > > And work queues are not going to fix it, it will just spread out the > > pain. Have

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-23 Thread Tejun Heo
Hey, On Tue, Jan 23, 2018 at 11:13:30AM -0500, Steven Rostedt wrote: > From what I understand is that there's an issue with one of the printk > consoles, due to memory pressure or whatnot. Then a printk happens > within a printk recursively. It gets put into the safe buffer and an > irq is sent to

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-23 Thread Steven Rostedt
On Wed, 24 Jan 2018 01:01:53 +0900 Sergey Senozhatsky wrote: > On (01/23/18 10:41), Steven Rostedt wrote: > [..] > > We can have more. But if printk is causing printks, that's a major bug. > > And work queues are not going to fix it, it will just spread out the > > pain. Have it be 100 printks, i

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-23 Thread Steven Rostedt
On Tue, 23 Jan 2018 07:43:47 -0800 Tejun Heo wrote: > So, at least in the case that we were seeing, it isn't that black and > white. printk keeps causing printks but only because printk buffer > flushing is preventing the printk'ing context from making forward > progress. The key problem there

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-23 Thread Sergey Senozhatsky
Hello, Tejun On (01/23/18 07:43), Tejun Heo wrote: > Hello, Steven. > > On Tue, Jan 23, 2018 at 10:41:21AM -0500, Steven Rostedt wrote: > > > I don't want to have heuristics in print_safe, I don't want to have a > > > magic > > > number controlled by a user-space visible knob, I don't want to ha

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-23 Thread Sergey Senozhatsky
On (01/23/18 10:41), Steven Rostedt wrote: [..] > We can have more. But if printk is causing printks, that's a major bug. > And work queues are not going to fix it, it will just spread out the > pain. Have it be 100 printks, it needs to be fixed if it is happening. > And having all printks just gen

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-23 Thread Tejun Heo
Hello, Steven. On Tue, Jan 23, 2018 at 10:41:21AM -0500, Steven Rostedt wrote: > > I don't want to have heuristics in print_safe, I don't want to have a magic > > number controlled by a user-space visible knob, I don't want to have the > > first 3 lines of a lockdep splat. > > We can have more. B

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-23 Thread Steven Rostedt
On Wed, 24 Jan 2018 00:21:30 +0900 Sergey Senozhatsky wrote: > On (01/23/18 09:56), Steven Rostedt wrote: > [..] > > > Why do we even use irq_work for printk_safe? > > > > Why not? > > > > Really, I think you are trying to solve a symptom and not the problem. > > If we are having issues with

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-23 Thread Sergey Senozhatsky
On (01/23/18 09:56), Steven Rostedt wrote: [..] > > Why do we even use irq_work for printk_safe? > > Why not? > > Really, I think you are trying to solve a symptom and not the problem. > If we are having issues with irq_work, we are going to have issues with > a work queue. It's just spreading ou

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-23 Thread Steven Rostedt
On Tue, 23 Jan 2018 15:40:23 +0900 Sergey Senozhatsky wrote: > Why do we even use irq_work for printk_safe? Why not? Really, I think you are trying to solve a symptom and not the problem. If we are having issues with irq_work, we are going to have issues with a work queue. It's just spreading o

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-22 Thread Sergey Senozhatsky
On (01/23/18 15:40), Sergey Senozhatsky wrote: > > Why do we even use irq_work for printk_safe? > ... perhaps because of wq: pool->lock -> printk -> call_console_drivers -> printk -> vprintk_safe -> wq: pool->lock Which is a "many things have gone wrong" type of scenario. Maybe we can workaro

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-22 Thread Sergey Senozhatsky
On (01/23/18 15:40), Sergey Senozhatsky wrote: [..] > Why do we even use irq_work for printk_safe? > > Okay... So, how about this. For printk_safe we use system_wq for flushing. > IOW, we flush from a task running exactly on the same CPU which hit printk > recursion, not from IRQ. From vprintk_saf

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-22 Thread Sergey Senozhatsky
Hello, On (01/21/18 23:15), Sergey Senozhatsky wrote: [..] > we have printk recursion from console drivers. it's redirected to > printk_safe and we queue an IRQ work to flush the buffer > > printk > console_unlock >call_console_drivers > net_console > printk > printk_save ->

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-22 Thread Sergey Senozhatsky
On (01/22/18 19:28), Sergey Senozhatsky wrote: > On (01/22/18 17:56), Sergey Senozhatsky wrote: > [..] > > Assume the following, > > But more importantly we are missing another huge thing - console_unlock(). IOW, not every console_unlock() is from vprintk_emit(). We can have console_trylock() ->

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-22 Thread Sergey Senozhatsky
On (01/22/18 17:56), Sergey Senozhatsky wrote: [..] > Assume the following, But more importantly we are missing another huge thing - console_unlock(). Suppose: console_lock(); << preemption >> printk

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-22 Thread Sergey Senozhatsky
On (01/21/18 16:04), Steven Rostedt wrote: [..] > > The problem is that we flush printk_safe right when console_unlock() > > printing > > loop enables local IRQs via printk_safe_exit_irqrestore() [given that IRQs > > were enabled in the first place when the CPU went to console_unlock()]. > > This

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-21 Thread Steven Rostedt
On Sun, 21 Jan 2018 23:15:21 +0900 Sergey Senozhatsky wrote: > so fix the console drivers ;) Totally agree! > > > > > just kidding. ok... Darn it! ;-) > the problem is that we flush printk_safe right when console_unlock() printing > loop enables local IRQs via printk_safe_exit_irqrest

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-21 Thread Sergey Senozhatsky
On (01/20/18 10:49), Steven Rostedt wrote: [..] > > printks from console_unlock()->call_console_drivers() are redirected > > to printk_safe buffer. we need irq_work on that CPU to flush its > > printk_safe buffer. > > So is the issue that we keep triggering this irq work then? Then this > solution

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-20 Thread Steven Rostedt
On Sat, 20 Jan 2018 16:14:02 +0900 Sergey Senozhatsky wrote: > [..] > > asmlinkage int vprintk_emit(int facility, int level, > > const char *dict, size_t dictlen, > > @@ -1849,6 +1918,17 @@ asmlinkage int vprintk_emit(int facility, int level, > > > > /* This stops t

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-20 Thread Steven Rostedt
On Sat, 20 Jan 2018 04:19:53 -0800 Tejun Heo wrote: > I'm a bit worried tho because this essentially seems like "detect > recursion, ignore messages" approach. netcons can have a very large > surface for bugs. Suppressing those messages would make them > difficult to debug. For example, all ou

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-20 Thread Tejun Heo
Hello, Steven. On Fri, Jan 19, 2018 at 01:20:52PM -0500, Steven Rostedt wrote: > I was thinking about this a bit more, and instead of offloading a > recursive printk, perhaps it's best to simply throttle it. Because the > problem may not go away if a printk thread takes over, because the bug > is r

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-19 Thread Sergey Senozhatsky
On (01/19/18 13:20), Steven Rostedt wrote: [..] > I was thinking about this a bit more, and instead of offloading a > recursive printk, perhaps it's best to simply throttle it. Because the > problem may not go away if a printk thread takes over, because the bug > is really the printk infrastructure

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-19 Thread Steven Rostedt
Tejun, I was thinking about this a bit more, and instead of offloading a recursive printk, perhaps it's best to simply throttle it. Because the problem may not go away if a printk thread takes over, because the bug is really the printk infrastructure filling the printk buffer keeping printk from ev

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-18 Thread Steven Rostedt
On Thu, 18 Jan 2018 13:31:16 +0900 Sergey Senozhatsky wrote: > d'oh... indeed, I copy-pasted the wrong URL... it should > have been lkml.kernel.org/r/ [and it actually was]. I've learned to do a copy after entering the lkml.kernel.org link into the browser url, and before hitting enter. The redi

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-18 Thread Petr Mladek
On Wed 2018-01-17 12:05:51, Tejun Heo wrote: > Hello, Steven. > > On Wed, Jan 17, 2018 at 12:12:51PM -0500, Steven Rostedt wrote: > > From what I gathered, you said an OOM would trigger, and then the > > network console would not be able to allocate memory and it would > > trigger a printk too, an

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-17 Thread Sergey Senozhatsky
On (01/17/18 12:05), Tejun Heo wrote: [..] > > This could very well be a great place to force offloading. If a printk > > is called from within a printk, at the same context (normal, softirq, > > irq or NMI), then we should trigger the offloading. > > I was thinking more of a timeout based approac

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-17 Thread Sergey Senozhatsky
On (01/17/18 12:12), Steven Rostedt wrote: [..] > /* > * Can we actually use the console at this time on this cpu? > @@ -2333,6 +2390,7 @@ void console_unlock(void) > > for (;;) { > struct printk_log *msg; > + bool offload; > size_t ext_len = 0; >

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-17 Thread Sergey Senozhatsky
On (01/17/18 14:04), Petr Mladek wrote: > On Wed 2018-01-17 11:18:56, Sergey Senozhatsky wrote: > > On (01/16/18 10:45), Steven Rostedt wrote: > > [..] > > > > [1] https://marc.info/?l=linux-mm&m=145692016122716 > > > > > > Especially since Konstantin is working on pulling in all LKML archives, >

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-17 Thread Tejun Heo
Hello, Steven. On Wed, Jan 17, 2018 at 12:12:51PM -0500, Steven Rostedt wrote: > From what I gathered, you said an OOM would trigger, and then the > network console would not be able to allocate memory and it would > trigger a printk too, and cause an infinite amount of printks. Yeah, it falls in

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-17 Thread Steven Rostedt
On Wed, 17 Jan 2018 12:12:51 -0500 Steven Rostedt wrote: > @@ -2393,15 +2451,20 @@ void console_unlock(void) >* waiter waiting to take over. >*/ > console_lock_spinning_enable(); > + offload = recursion_check_start(); > > s

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-17 Thread Steven Rostedt
On Wed, 17 Jan 2018 07:15:09 -0800 Tejun Heo wrote: > It's great that Steven's patches solve a good number of problems. It > is also true that there's a class of problems that it doesn't solve, > which other approaches do. The productive thing to do here is trying > to solve the unsolved one to

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-17 Thread Steven Rostedt
On Wed, 17 Jan 2018 14:04:07 +0100 Petr Mladek wrote: > On Wed 2018-01-17 11:18:56, Sergey Senozhatsky wrote: > > On (01/16/18 10:45), Steven Rostedt wrote: > > [..] > > > > [1] https://marc.info/?l=linux-mm&m=145692016122716 > > > > > > Especially since Konstantin is working on pulling in a

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-17 Thread Tejun Heo
Hello, On Wed, Jan 17, 2018 at 10:12:08AM +0100, Petr Mladek wrote: > IMHO, the bad scenario with OOM was that any printk() called in > the OOM report became console_lock owner and was responsible > for pushing all new messages to the console. There was a possible > livelock because OOM Killer was

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-17 Thread Petr Mladek
On Wed 2018-01-17 11:18:56, Sergey Senozhatsky wrote: > On (01/16/18 10:45), Steven Rostedt wrote: > [..] > > > [1] https://marc.info/?l=linux-mm&m=145692016122716 > > > > Especially since Konstantin is working on pulling in all LKML archives, > > the above should be denoted as: > > > > Link: >

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-17 Thread Petr Mladek
On Tue 2018-01-16 11:44:56, Tejun Heo wrote: > Hello, Steven. > > On Thu, Jan 11, 2018 at 09:55:47PM -0500, Steven Rostedt wrote: > > All I did was start off a work queue on each CPU, and each CPU does one > > printk() followed by a millisecond sleep. No 10,000 printks, nothing > > in an interrupt

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-16 Thread Sergey Senozhatsky
On (01/16/18 11:13), Petr Mladek wrote: [..] > IMHO, it would make sense if flushing the printk buffer behaves > the same when called either from printk() or from any other path. > I mean that it should be aggressive and allow an effective > hand off. > > It should be safe as long as foo_specific_

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-16 Thread Sergey Senozhatsky
On (01/16/18 11:19), Petr Mladek wrote: [..] > > [1] https://marc.info/?l=linux-mm&m=145692016122716 > > Fixes: 6b97a20d3a79 ("printk: set may_schedule for some of > > console_trylock() callers") > > Signed-off-by: Sergey Senozhatsky > > Reported-by: Tetsuo Handa > > IMHO, this is a step in the

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-16 Thread Sergey Senozhatsky
On (01/16/18 10:45), Steven Rostedt wrote: [..] > > [1] https://marc.info/?l=linux-mm&m=145692016122716 > > Especially since Konstantin is working on pulling in all LKML archives, > the above should be denoted as: > > Link: > http://lkml.kernel.org/r/201603022101.CAH73907.OVOOMFHFFtQJSL%20()%20

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-16 Thread Tejun Heo
Hello, Steven. On Thu, Jan 11, 2018 at 09:55:47PM -0500, Steven Rostedt wrote: > All I did was start off a work queue on each CPU, and each CPU does one > printk() followed by a millisecond sleep. No 10,000 printks, nothing > in an interrupt handler. Preemption is disabled while the printk > happe

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-16 Thread Steven Rostedt
On Tue, 16 Jan 2018 15:10:13 +0900 Sergey Senozhatsky wrote: > overall that's very close to what I have in one of my private branches. > console_trylock_spinning() for some reason does not perform really > well on my made-up internal printk torture tests. it seems that I One thing I noticed in m

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-16 Thread Steven Rostedt
On Tue, 16 Jan 2018 13:47:16 +0900 Sergey Senozhatsky wrote: > From: Sergey Senozhatsky > Subject: [PATCH] printk: never set console_may_schedule in console_trylock() > > This patch, basically, reverts commit 6b97a20d3a79 ("printk: > set may_schedule for some of console_trylock() callers"). > T

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-16 Thread Petr Mladek
On Tue 2018-01-16 13:47:16, Sergey Senozhatsky wrote: > if you don't mind, let me fix the thing that I broke. > that would be responsible. I believe I also must say the following: > Tetsuo, many thanks for reporting the issues for so long, and > sorry that it took quite a while to revert that

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-16 Thread Petr Mladek
On Tue 2018-01-16 11:23:49, Sergey Senozhatsky wrote: > On (01/15/18 15:45), Petr Mladek wrote: > > > I think adding the preempt_disable() would fix printk() but let non > > > printk console_unlock() still preempt. > > > > I would personally remove cond_resched() from console_unlock() > > complete

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-16 Thread Sergey Senozhatsky
On (01/16/18 10:36), Petr Mladek wrote: [..] > > unfortunately disabling preemption in console_unlock() is a bit > > dangerous :( we have paths that call console_unlock() exactly > > to flush everything (not only new pending messages, but everything) > > that is in logbuf and we cannot return from c

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-16 Thread Petr Mladek
On Tue 2018-01-16 15:10:13, Sergey Senozhatsky wrote: > Hi, > > On (01/15/18 12:50), Petr Mladek wrote: > > On Mon 2018-01-15 11:17:43, Petr Mladek wrote: > > > PS: Sergey, you have many good points. The printk-stuff is very > > > complex and we could spend years discussing the perfect solution. >

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-16 Thread Petr Mladek
On Tue 2018-01-16 14:16:22, Sergey Senozhatsky wrote: > On (01/15/18 09:51), Petr Mladek wrote: > > On Sat 2018-01-13 16:31:00, Sergey Senozhatsky wrote: > > > On (01/12/18 13:55), Petr Mladek wrote: > > > [..] > > > > > I'm not fixing console_unlock(), I'm fixing printk(). BTW, all my > > > > > ke

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-15 Thread Sergey Senozhatsky
Hi, On (01/15/18 12:50), Petr Mladek wrote: > On Mon 2018-01-15 11:17:43, Petr Mladek wrote: > > PS: Sergey, you have many good points. The printk-stuff is very > > complex and we could spend years discussing the perfect solution. > > BTW: One solution that comes to my mind is based on ideas > al

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-15 Thread Sergey Senozhatsky
Hi, On (01/15/18 11:17), Petr Mladek wrote: > Hi Sergey, > > I wonder if there is still some misunderstanding. > > Steven and I are trying to get this patch in because we believe > that it is a step forward. We know that it is not perfect. But > we believe that it makes things better. In part

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-15 Thread Sergey Senozhatsky
On (01/15/18 09:51), Petr Mladek wrote: > On Sat 2018-01-13 16:31:00, Sergey Senozhatsky wrote: > > On (01/12/18 13:55), Petr Mladek wrote: > > [..] > > > > I'm not fixing console_unlock(), I'm fixing printk(). BTW, all my > > > > kernels are CONFIG_PREEMPT (I'm a RT guy), my mind thinks more about

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-15 Thread Sergey Senozhatsky
On (01/15/18 07:08), Steven Rostedt wrote: > On Fri, 12 Jan 2018 13:55:37 +0100 > Petr Mladek wrote: > > > > I'm not fixing console_unlock(), I'm fixing printk(). BTW, all my > > > kernels are CONFIG_PREEMPT (I'm a RT guy), my mind thinks more about > > > PREEMPT kernels than !PREEMPT ones. > >

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-15 Thread Sergey Senozhatsky
On (01/16/18 11:23), Sergey Senozhatsky wrote: [..] > > Adding the preempt_disable() basically means to revert the already > > mentioned commit 6b97a20d3a7909daa06625 ("printk: set may_schedule > > for some of console_trylock() callers"). > > > > I originally wanted to solve this separately to mak

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-15 Thread Sergey Senozhatsky
On (01/15/18 15:45), Petr Mladek wrote: [..] > > With the preempt_disable() there really isn't a delay. I agree, we > > shouldn't let printk preempt (unless we have CONFIG_PREEMPT_RT enabled, > > but that's another story). > > > > > > > > so very schematically, for hand-off it's something like >

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-15 Thread Sergey Senozhatsky
On (01/15/18 07:06), Steven Rostedt wrote: > > > Yep, but I'm still not convinced you are seeing an issue with a single > > > printk. > > > > what do you mean by this? > > I'm not sure your issues happen because a single printk is locked up, > but you have many printks in one area. hm, need to

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-15 Thread Petr Mladek
On Mon 2018-01-15 07:06:37, Steven Rostedt wrote: > On Sat, 13 Jan 2018 16:28:34 +0900 > Sergey Senozhatsky wrote: > > On (01/12/18 07:21), Steven Rostedt wrote: > > > > > An OOM does not do everything in one printk, it calls hundreds. > > > Having hundreds of printks is an issue, especially in c

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-15 Thread Steven Rostedt
On Fri, 12 Jan 2018 13:55:37 +0100 Petr Mladek wrote: > > I'm not fixing console_unlock(), I'm fixing printk(). BTW, all my > > kernels are CONFIG_PREEMPT (I'm a RT guy), my mind thinks more about > > PREEMPT kernels than !PREEMPT ones. > > I would say that the patch improves also console_unlo

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-15 Thread Steven Rostedt
On Sat, 13 Jan 2018 16:28:34 +0900 Sergey Senozhatsky wrote: > On (01/12/18 07:21), Steven Rostedt wrote: > [..] > > Yep, but I'm still not convinced you are seeing an issue with a single > > printk. > > what do you mean by this? I'm not sure your issues happen because a single printk is lock

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-15 Thread Petr Mladek
On Mon 2018-01-15 11:17:43, Petr Mladek wrote: > PS: Sergey, you have many good points. The printk-stuff is very > complex and we could spend years discussing the perfect solution. BTW: One solution that comes to my mind is based on ideas already mentioned in this thread: void console_unlock(voi

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-15 Thread Petr Mladek
Hi Sergey, I wonder if there is still some misunderstanding. Steven and I are trying to get this patch in because we believe that it is a step forward. We know that it is not perfect. But we believe that it makes things better. In particular, it limits the time spent in console_unlock() in ato

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-15 Thread Sergey Senozhatsky
On (01/15/18 09:51), Petr Mladek wrote: > On Sat 2018-01-13 16:31:00, Sergey Senozhatsky wrote: > > On (01/12/18 13:55), Petr Mladek wrote: > > [..] > > > > I'm not fixing console_unlock(), I'm fixing printk(). BTW, all my > > > > kernels are CONFIG_PREEMPT (I'm a RT guy), my mind thinks more about

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-15 Thread Petr Mladek
On Sat 2018-01-13 16:31:00, Sergey Senozhatsky wrote: > On (01/12/18 13:55), Petr Mladek wrote: > [..] > > > I'm not fixing console_unlock(), I'm fixing printk(). BTW, all my > > > kernels are CONFIG_PREEMPT (I'm a RT guy), my mind thinks more about > > > PREEMPT kernels than !PREEMPT ones. > > >

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-12 Thread Sergey Senozhatsky
On (01/12/18 13:55), Petr Mladek wrote: [..] > > I'm not fixing console_unlock(), I'm fixing printk(). BTW, all my > > kernels are CONFIG_PREEMPT (I'm a RT guy), my mind thinks more about > > PREEMPT kernels than !PREEMPT ones. > > I would say that the patch improves also console_unlock() but only

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-12 Thread Sergey Senozhatsky
On (01/12/18 07:21), Steven Rostedt wrote: [..] > Yep, but I'm still not convinced you are seeing an issue with a single > printk. what do you mean by this? > An OOM does not do everything in one printk, it calls hundreds. > Having hundreds of printks is an issue, especially in critical sections.

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-12 Thread Petr Mladek
On Fri 2018-01-12 07:21:23, Steven Rostedt wrote: > On Fri, 12 Jan 2018 19:05:44 +0900 > Sergey Senozhatsky wrote: > > 3) console_unlock(void) > >{ > > for (;;) { > > printk_safe_enter_irqsave(flags); > > // lock-unlock logbuf > > call_console_drivers(ex

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-12 Thread Steven Rostedt
On Fri, 12 Jan 2018 19:05:44 +0900 Sergey Senozhatsky wrote: > Steven, we are having too many things in one email, I've dropped most > of them to concentrate on one topic only. I totally agree, and I believe this is the reason behind the tensions between us. We are not discussing the topic of th

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-12 Thread Sergey Senozhatsky
Steven, we are having too many things in one email, I've dropped most of them to concentrate on one topic only. On (01/11/18 22:21), Steven Rostedt wrote: [..] > > After playing with the module in my last email, I think you're trying to > solve multiple printks, not one that is stuck I wouldn't say

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-11 Thread Steven Rostedt
On Thu, 11 Jan 2018 21:55:47 -0500 Steven Rostedt wrote: > I ran this on a box with 4 CPUs and a serial console (so it has a slow > console). Again, all I have is each CPU doing exactly ONE printk()! > then sleeping for a full millisecond! It will cause a lot of output, > and perhaps slow the sys

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-11 Thread Steven Rostedt
On Fri, 12 Jan 2018 11:56:12 +0900 Sergey Senozhatsky wrote: > Hi, > > On (01/11/18 11:29), Steven Rostedt wrote: > [..] > > > - if the patch's goal is to bound (not necessarily to watchdog's > > > threshold) > > > the amount of time we spend in console_unlock(), then the patch is kinda > > > o

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-11 Thread Sergey Senozhatsky
On (01/11/18 20:30), Steven Rostedt wrote: [..] > Today, printk() can print for a time of A * B, where, as you state > above: > >A is the amount of data to print in the worst case >B the time call_console_drivers() needs to print a single > char to all registered and enabled consol

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-11 Thread Sergey Senozhatsky
Hi, On (01/11/18 11:29), Steven Rostedt wrote: [..] > > - if the patch's goal is to bound (not necessarily to watchdog's threshold) > > the amount of time we spend in console_unlock(), then the patch is kinda > > overcomplicated. but no further questions in this case. > > Its goal is to keep pri

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-11 Thread Steven Rostedt
On Thu, 11 Jan 2018 20:30:57 -0500 Steven Rostedt wrote: > I have to say that your analysis here really does point out the benefit > of my patch. > > Today, printk() can print for a time of A * B, where, as you state > above: > >A is the amount of data to print in the worst case >B the

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-11 Thread Steven Rostedt
On Thu, 11 Jan 2018 11:29:08 -0500 Steven Rostedt wrote: > > claiming that for any given A, B, C the following is always true > > > > A * B < C > > > > where > > A is the amount of data to print in the worst case > > B the time call_console_drivers() needs to
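
For a rough sense of scale of the `A * B` product quoted above, here is a minimal Python sketch. It is only back-of-the-envelope arithmetic, not code from the thread or the kernel. The 115200 baud figure comes from Peter Zijlstra's message in this thread; the 128 KiB backlog, the 10 bits on the wire per byte, and the 10 second value used for `C` are all assumptions chosen for illustration.

```python
def worst_case_flush_seconds(bytes_to_print: int, baud: int,
                             bits_per_byte: int = 10) -> float:
    """Estimate A * B for a serial console.

    A = bytes_to_print (amount of data to print in the worst case).
    B = time to push one byte out of the console, approximated here as
        bits_per_byte / baud (assumption: 8 data bits + start + stop bit).
    """
    seconds_per_byte = bits_per_byte / baud
    return bytes_to_print * seconds_per_byte


# Hypothetical inputs: a full 128 KiB backlog on a 115200 baud console.
ASSUMED_BACKLOG_BYTES = 128 * 1024
ASSUMED_BAUD = 115200
ASSUMED_THRESHOLD_C = 10.0  # seconds; illustrative stand-in for "C"

estimate = worst_case_flush_seconds(ASSUMED_BACKLOG_BYTES, ASSUMED_BAUD)
print(f"A * B is roughly {estimate:.1f} s; A * B < C holds: "
      f"{estimate < ASSUMED_THRESHOLD_C}")
```

Under these assumed numbers a single context that has to flush the whole backlog would be busy for longer than the chosen threshold, which is the kind of case the `A * B < C` objection is about. Different buffer sizes or faster consoles change the result.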

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-11 Thread Steven Rostedt
On Thu, 11 Jan 2018 19:38:45 +0900 Sergey Senozhatsky wrote: > > the non-atomic -> atomic context console_sem transfer. we previously > would have kept the console_sem owner to its non-atomic owner. we now > will make sure that if printk from atomic context happens then it will > make it to cons

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-11 Thread Sergey Senozhatsky
On (01/11/18 12:24), Petr Mladek wrote: [..] > You might argue that we already know that Steven's solution will > not be enough. IMHO, the problem here is the term "real life example". this is really boring, how real life examples happen only on Steven's PC or Petr's qemu image. whatever.

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-11 Thread Petr Mladek
On Thu 2018-01-11 19:38:45, Sergey Senozhatsky wrote: > On (01/11/18 10:34), Petr Mladek wrote: > [..] > > > except that handing off a console_sem to atomic task when there > > > is O(logbuf) > watchdog_thresh is a regression, basically... > > > it is what it is. > > > > How this could be a re

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-11 Thread Petr Mladek
On Thu 2018-01-11 16:36:18, Sergey Senozhatsky wrote: > Hi Mathieu, > > On (01/10/18 18:40), Mathieu Desnoyers wrote: > [..] > > > > There appear to be two problems at hand. One is making sure a console > > buffer owner only flushes a bounded amount of data. > > which, realistically, has quite

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-11 Thread Sergey Senozhatsky
On (01/11/18 10:34), Petr Mladek wrote: [..] > > except that handing off a console_sem to atomic task when there > > is O(logbuf) > watchdog_thresh is a regression, basically... > > it is what it is. > > How this could be a regression? Is not the victim that handles > other printk's random? Wh

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-11 Thread Petr Mladek
On Thu 2018-01-11 13:58:17, Sergey Senozhatsky wrote: > On (01/10/18 13:05), Steven Rostedt wrote: > > The solution is simple, everyone at KS agreed with it, there should be > > no controversy here. > > frankly speaking, that's not what I recall ;) To be honest, I do not longer remember the detai

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-10 Thread Sergey Senozhatsky
Hi Mathieu, On (01/10/18 18:40), Mathieu Desnoyers wrote: [..] > > There appears to be two problems at hand. One is making sure a console > buffer owner only flushes a bounded amount of data. which, realistically, has quite little to do with the "and thus it fixes the lockups". logbuf size is mu

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-10 Thread Sergey Senozhatsky
On (01/10/18 14:17), Steven Rostedt wrote: [..] > OK, lets start over. good. > Right now my focus is an incremental approach. I'm not trying to solve > all issues that printk has. I've focused on a single issue, and that is > that printk is unbounded. Coming from a Real Time background, I find >

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-10 Thread Sergey Senozhatsky
On (01/10/18 19:21), Peter Zijlstra wrote: > > On Wed, Jan 10, 2018 at 09:02:23AM -0800, Tejun Heo wrote: > > 2. System runs out of memory, OOM triggers. > > 3. OOM handler is printing out OOM debug info. > > 4. While trying to emit the messages for netconsole, the network stack > >/ driver tr

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-10 Thread Sergey Senozhatsky
On (01/10/18 17:29), Petr Mladek wrote: [..] > The next versions used lazy offload from console_unlock() when > the thread spent there too much time. IMHO, this is one > very promising solution. It guarantees that softlockup > would never happen. But it tries hard to get the messages > out immediat

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-10 Thread Sergey Senozhatsky
On (01/10/18 13:05), Steven Rostedt wrote: [..] > My solution takes printk from its current unbounded state, and makes it > fixed bounded. Which means printk() is now a O(1) algorithm. ^^^ O(logbuf) and

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-10 Thread Tejun Heo
Hello, Steven. On Wed, Jan 10, 2018 at 02:44:55PM -0500, Steven Rostedt wrote: > Yes, there can be the case that printks are added via an interrupt, but > then again, it's an issue that a single CPU. And printks from interrupt > context should be considered critical, part of the ASAP category. If

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-10 Thread Steven Rostedt
On Wed, 10 Jan 2018 11:34:51 -0800 Tejun Heo wrote: > > Right now my focus is an incremental approach. I'm not trying to solve > > all issues that printk has. I've focused on a single issue, and that is > > that printk is unbounded. Coming from a Real Time background, I find > > that is a big pro

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-10 Thread Tejun Heo
Hello, Steven. On Wed, Jan 10, 2018 at 02:17:58PM -0500, Steven Rostedt wrote: > > I'm not really sure why punting to a safe context is necessarily > > unacceptable in terms of #1 because there seems to be a pretty wide > > gap between printing useful messages synchronously and a system being > >

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-10 Thread Steven Rostedt
On Wed, 10 Jan 2018 10:57:47 -0800 Tejun Heo wrote: > Hello, Steven. > > On Wed, Jan 10, 2018 at 01:41:57PM -0500, Steven Rostedt wrote: > > The issue with the solution you want to do with printk is that it can > > break existing printk usages. As Petr said, people want printk to do two > > thin

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-10 Thread Tejun Heo
Hello, On Wed, Jan 10, 2018 at 07:41:44PM +0100, Peter Zijlstra wrote: > Typically we (scheduler) have removed printk()s (on boot) when BIGSMP > folks say it creates boot pain. Much of it is now behind the sched_debug > parameter, others are compressed. > > I've also seen other people reduce prin

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-10 Thread Tejun Heo
Hello, Steven. On Wed, Jan 10, 2018 at 01:41:57PM -0500, Steven Rostedt wrote: > The issue with the solution you want to do with printk is that it can > break existing printk usages. As Petr said, people want printk to do two > things. 1 - print out data ASAP, 2 - not lock up the system. The two >

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-10 Thread Steven Rostedt
On Wed, 10 Jan 2018 17:29:00 +0100 Petr Mladek wrote: > The next versions used lazy offload from console_unlock() when > the thread spent there too much time. IMHO, this is one > very promising solution. It guarantees that softlockup > would never happen. But it tries hard to get the messages > ou

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-10 Thread Steven Rostedt
On Wed, 10 Jan 2018 10:14:59 -0800 Tejun Heo wrote: > On Wed, Jan 10, 2018 at 10:12:52AM -0800, Tejun Heo wrote: > > Hello, Steven. > > > > So, everything else on your message, sure. You do what you have to > > do, but I really don't understand the following part, and this has > > been the main

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-10 Thread Peter Zijlstra
On Wed, Jan 10, 2018 at 10:30:55AM -0800, Tejun Heo wrote: > > Why not kill recursive OOM (msgs) ? > > Sure, we can do that too, e.g. marking flushing thread and ignoring > new messages from it, although that does come with its own downsides. Typically we (scheduler) have removed printk()s (on bo

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-10 Thread Steven Rostedt
On Wed, 10 Jan 2018 10:12:52 -0800 Tejun Heo wrote: > Hello, Steven. > > So, everything else on your message, sure. You do what you have to > do, but I really don't understand the following part, and this has > been the main source of frustration in the whole discussion. > > On Wed, Jan 10, 20

Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup

2018-01-10 Thread Mathieu Desnoyers
On Jan 10, 2018, at 12:02 PM, Tejun Heo t...@kernel.org wrote: > Hello, Linus, Andrew. > > On Wed, Jan 10, 2018 at 05:29:00PM +0100, Petr Mladek wrote: >> Where is the acceptable compromise? I am not sure. So far, the most >> forceful people (Linus) did not see softlockups as a big problem.
