Hi,
Move printk and (some of) MM people to the recipients list.
On (01/10/18 09:02), Tejun Heo wrote:
[..]
> The particular case that we've been seeing regularly in the fleet was
> the following scenario.
>
> 1. Console is IPMI emulated serial console. Super slow. Also
>    netconsole is in us
On (01/23/18 07:43), Tejun Heo wrote:
> >
> > We can have more. But if printk is causing printks, that's a major bug.
> > And work queues are not going to fix it, it will just spread out the
> > pain. Have it be 100 printks, it needs to be fixed if it is happening.
> > And having all printks just
Hello, Peter.
On Wed, Jan 24, 2018 at 10:36:07AM +0100, Peter Zijlstra wrote:
> On Wed, Jan 10, 2018 at 09:02:23AM -0800, Tejun Heo wrote:
> > 1. Console is IPMI emulated serial console. Super slow. Also
> >    netconsole is in use.
>
> So my IPMI SoE typically run at 115200 Baud (or higher) an
On Wed, Jan 10, 2018 at 09:02:23AM -0800, Tejun Heo wrote:
> 1. Console is IPMI emulated serial console. Super slow. Also
>    netconsole is in use.
So my IPMI SoE typically run at 115200 Baud (or higher) and I've not had
trouble like that (granted I don't typically trigger OOM storms, but
they
On (01/23/18 21:52), Steven Rostedt wrote:
> On Wed, 24 Jan 2018 11:11:33 +0900
> Sergey Senozhatsky wrote:
>
> > Please take a look.
>
> Was there something specific to look at?
Not really. Just my previous email, basically.
You said "I have to look at the latest code." so I replied.
Well, if
On Wed, 24 Jan 2018 11:11:33 +0900
Sergey Senozhatsky wrote:
> Please take a look.
Was there something specific to look at?
I'm doing a hundred different things at once, and my memory cache keeps
getting flushed.
-- Steve
Hello,
On (01/23/18 11:24), Steven Rostedt wrote:
[..]
> > With WQ we don't lockup the kernel, because we flush printk_safe in
> > preemptible context. And people are very much expected to fix the
> > misbehaving consoles. But that should not be printk_safe problem.
>
> Right, but now you just ma
Hello, Sergey.
On Wed, Jan 24, 2018 at 01:01:53AM +0900, Sergey Senozhatsky wrote:
> On (01/23/18 10:41), Steven Rostedt wrote:
> [..]
> > We can have more. But if printk is causing printks, that's a major bug.
> > And work queues are not going to fix it, it will just spread out the
> > pain. Have
Hey,
On Tue, Jan 23, 2018 at 11:13:30AM -0500, Steven Rostedt wrote:
> From what I understand is that there's an issue with one of the printk
> consoles, due to memory pressure or whatnot. Then a printk happens
> within a printk recursively. It gets put into the safe buffer and an
> irq is sent to
On Wed, 24 Jan 2018 01:01:53 +0900
Sergey Senozhatsky wrote:
> On (01/23/18 10:41), Steven Rostedt wrote:
> [..]
> > We can have more. But if printk is causing printks, that's a major bug.
> > And work queues are not going to fix it, it will just spread out the
> > pain. Have it be 100 printks, i
On Tue, 23 Jan 2018 07:43:47 -0800
Tejun Heo wrote:
> So, at least in the case that we were seeing, it isn't that black and
> white. printk keeps causing printks but only because printk buffer
> flushing is preventing the printk'ing context from making forward
> progress. The key problem there
Hello, Tejun
On (01/23/18 07:43), Tejun Heo wrote:
> Hello, Steven.
>
> On Tue, Jan 23, 2018 at 10:41:21AM -0500, Steven Rostedt wrote:
> > > I don't want to have heuristics in print_safe, I don't want to have a
> > > magic
> > > number controlled by a user-space visible knob, I don't want to ha
On (01/23/18 10:41), Steven Rostedt wrote:
[..]
> We can have more. But if printk is causing printks, that's a major bug.
> And work queues are not going to fix it, it will just spread out the
> pain. Have it be 100 printks, it needs to be fixed if it is happening.
> And having all printks just gen
Hello, Steven.
On Tue, Jan 23, 2018 at 10:41:21AM -0500, Steven Rostedt wrote:
> > I don't want to have heuristics in print_safe, I don't want to have a magic
> > number controlled by a user-space visible knob, I don't want to have the
> > first 3 lines of a lockdep splat.
>
> We can have more. B
On Wed, 24 Jan 2018 00:21:30 +0900
Sergey Senozhatsky wrote:
> On (01/23/18 09:56), Steven Rostedt wrote:
> [..]
> > > Why do we even use irq_work for printk_safe?
> >
> > Why not?
> >
> > Really, I think you are trying to solve a symptom and not the problem.
> > If we are having issues with
On (01/23/18 09:56), Steven Rostedt wrote:
[..]
> > Why do we even use irq_work for printk_safe?
>
> Why not?
>
> Really, I think you are trying to solve a symptom and not the problem.
> If we are having issues with irq_work, we are going to have issues with
> a work queue. It's just spreading ou
On Tue, 23 Jan 2018 15:40:23 +0900
Sergey Senozhatsky wrote:
> Why do we even use irq_work for printk_safe?
Why not?
Really, I think you are trying to solve a symptom and not the problem.
If we are having issues with irq_work, we are going to have issues with
a work queue. It's just spreading o
On (01/23/18 15:40), Sergey Senozhatsky wrote:
>
> Why do we even use irq_work for printk_safe?
>
... perhaps because of
wq: pool->lock -> printk -> call_console_drivers -> printk -> vprintk_safe ->
wq: pool->lock
Which is a "many things have gone wrong" type of scenario. Maybe we
can workaro
On (01/23/18 15:40), Sergey Senozhatsky wrote:
[..]
> Why do we even use irq_work for printk_safe?
>
> Okay... So, how about this. For printk_safe we use system_wq for flushing.
> IOW, we flush from a task running exactly on the same CPU which hit printk
> recursion, not from IRQ. From vprintk_saf
Hello,
On (01/21/18 23:15), Sergey Senozhatsky wrote:
[..]
> we have printk recursion from console drivers. it's redirected to
> printk_safe and we queue an IRQ work to flush the buffer
>
> printk
>  console_unlock
>   call_console_drivers
>    net_console
>     printk
>      printk_safe ->
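The recursion described in this message can be modelled in user space. The sketch below is a toy, not kernel code: every name in it (toy_printk, toy_console_write, toy_flush_safe, memory_pressure and the rest) is hypothetical, the "safe buffer" is a plain array, and the queued irq_work is reduced to a flag. It assumes a console driver that logs a message of its own whenever it runs under memory pressure.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SAFE_BUF_SIZE 256

int memory_pressure;            /* hypothetical: driver misbehaves while set */
int in_console;                 /* nonzero while the console driver is running */
char safe_buf[SAFE_BUF_SIZE];   /* stand-in for the per-CPU printk_safe buffer */
size_t safe_len;
int flush_pending;              /* stand-in for the queued flush work */
int console_writes;             /* how many messages reached the console */

void toy_printk(const char *msg);

/* Toy console driver: under memory pressure it logs while printing. */
void toy_console_write(const char *msg)
{
    (void)msg;
    console_writes++;
    if (memory_pressure)
        toy_printk("netcons: alloc failed\n");
}

void toy_printk(const char *msg)
{
    if (in_console) {
        /* Recursive call: store the message and request a later flush. */
        size_t n = strlen(msg);
        if (safe_len + n < SAFE_BUF_SIZE) {
            memcpy(safe_buf + safe_len, msg, n + 1);
            safe_len += n;
        }
        flush_pending = 1;
        return;
    }
    in_console = 1;
    toy_console_write(msg);
    in_console = 0;
}

/* Deferred flush: replays the stored text through the normal path. */
void toy_flush_safe(void)
{
    char copy[SAFE_BUF_SIZE];

    if (!flush_pending)
        return;
    assert(safe_len < SAFE_BUF_SIZE);
    memcpy(copy, safe_buf, safe_len + 1);
    safe_len = 0;
    safe_buf[0] = '\0';
    flush_pending = 0;
    toy_printk(copy);
}
```

In this model, while memory_pressure stays set, every call to toy_flush_safe() reaches the driver again and leaves another flush pending, so the flush never drains. Moving the flush to a different context changes where that work runs, but the model keeps producing messages until the driver stops logging.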
On (01/22/18 19:28), Sergey Senozhatsky wrote:
> On (01/22/18 17:56), Sergey Senozhatsky wrote:
> [..]
> > Assume the following,
>
> But more importantly we are missing another huge thing - console_unlock().
IOW, not every console_unlock() is from vprintk_emit(). We can have
console_trylock() ->
On (01/22/18 17:56), Sergey Senozhatsky wrote:
[..]
> Assume the following,
But more importantly we are missing another huge thing - console_unlock().
Suppose:
console_lock();
<< preemption >>
printk
On (01/21/18 16:04), Steven Rostedt wrote:
[..]
> > The problem is that we flush printk_safe right when console_unlock()
> > printing
> > loop enables local IRQs via printk_safe_exit_irqrestore() [given that IRQs
> > were enabled in the first place when the CPU went to console_unlock()].
> > This
On Sun, 21 Jan 2018 23:15:21 +0900
Sergey Senozhatsky wrote:
> so fix the console drivers ;)
Totally agree!
>
>
>
>
> just kidding. ok...
Darn it! ;-)
> the problem is that we flush printk_safe right when console_unlock() printing
> loop enables local IRQs via printk_safe_exit_irqrest
On (01/20/18 10:49), Steven Rostedt wrote:
[..]
> > printks from console_unlock()->call_console_drivers() are redirected
> > to printk_safe buffer. we need irq_work on that CPU to flush its
> > printk_safe buffer.
>
> So is the issue that we keep triggering this irq work then? Then this
> solution
On Sat, 20 Jan 2018 16:14:02 +0900
Sergey Senozhatsky wrote:
> [..]
> > asmlinkage int vprintk_emit(int facility, int level,
> > const char *dict, size_t dictlen,
> > @@ -1849,6 +1918,17 @@ asmlinkage int vprintk_emit(int facility, int level,
> >
> > /* This stops t
On Sat, 20 Jan 2018 04:19:53 -0800
Tejun Heo wrote:
> I'm a bit worried tho because this essentially seems like "detect
> recursion, ignore messages" approach. netcons can have a very large
> surface for bugs. Suppressing those messages would make them
> difficult to debug. For example, all ou
Hello, Steven.
On Fri, Jan 19, 2018 at 01:20:52PM -0500, Steven Rostedt wrote:
> I was thinking about this a bit more, and instead of offloading a
> recursive printk, perhaps it's best to simply throttle it. Because the
> problem may not go away if a printk thread takes over, because the bug
> is r
On (01/19/18 13:20), Steven Rostedt wrote:
[..]
> I was thinking about this a bit more, and instead of offloading a
> recursive printk, perhaps it's best to simply throttle it. Because the
> problem may not go away if a printk thread takes over, because the bug
> is really the printk infrastructure
Tejun,
I was thinking about this a bit more, and instead of offloading a
recursive printk, perhaps it's best to simply throttle it. Because the
problem may not go away if a printk thread takes over, because the bug
is really the printk infrastructure filling the printk buffer keeping
printk from ev
On Thu, 18 Jan 2018 13:31:16 +0900
Sergey Senozhatsky wrote:
> d'oh... indeed, I copy-pasted the wrong URL... it should
> have been lkml.kernel.org/r/ [and it actually was].
I've learned to do a copy after entering the lkml.kernel.org link into
the browser url, and before hitting enter. The redi
On Wed 2018-01-17 12:05:51, Tejun Heo wrote:
> Hello, Steven.
>
> On Wed, Jan 17, 2018 at 12:12:51PM -0500, Steven Rostedt wrote:
> > From what I gathered, you said an OOM would trigger, and then the
> > network console would not be able to allocate memory and it would
> > trigger a printk too, an
On (01/17/18 12:05), Tejun Heo wrote:
[..]
> > This could very well be a great place to force offloading. If a printk
> > is called from within a printk, at the same context (normal, softirq,
> > irq or NMI), then we should trigger the offloading.
>
> I was thinking more of a timeout based approac
On (01/17/18 12:12), Steven Rostedt wrote:
[..]
> /*
> * Can we actually use the console at this time on this cpu?
> @@ -2333,6 +2390,7 @@ void console_unlock(void)
>
> for (;;) {
> struct printk_log *msg;
> + bool offload;
> size_t ext_len = 0;
>
On (01/17/18 14:04), Petr Mladek wrote:
> On Wed 2018-01-17 11:18:56, Sergey Senozhatsky wrote:
> > On (01/16/18 10:45), Steven Rostedt wrote:
> > [..]
> > > > [1] https://marc.info/?l=linux-mm&m=145692016122716
> > >
> > > Especially since Konstantin is working on pulling in all LKML archives,
>
Hello, Steven.
On Wed, Jan 17, 2018 at 12:12:51PM -0500, Steven Rostedt wrote:
> From what I gathered, you said an OOM would trigger, and then the
> network console would not be able to allocate memory and it would
> trigger a printk too, and cause an infinite amount of printks.
Yeah, it falls in
On Wed, 17 Jan 2018 12:12:51 -0500
Steven Rostedt wrote:
> @@ -2393,15 +2451,20 @@ void console_unlock(void)
>  * waiter waiting to take over.
>  */
> console_lock_spinning_enable();
> + offload = recursion_check_start();
>
> s
On Wed, 17 Jan 2018 07:15:09 -0800
Tejun Heo wrote:
> It's great that Steven's patches solve a good number of problems. It
> is also true that there's a class of problems that it doesn't solve,
> which other approaches do. The productive thing to do here is trying
> to solve the unsolved one to
On Wed, 17 Jan 2018 14:04:07 +0100
Petr Mladek wrote:
> On Wed 2018-01-17 11:18:56, Sergey Senozhatsky wrote:
> > On (01/16/18 10:45), Steven Rostedt wrote:
> > [..]
> > > > [1] https://marc.info/?l=linux-mm&m=145692016122716
> > >
> > > Especially since Konstantin is working on pulling in a
Hello,
On Wed, Jan 17, 2018 at 10:12:08AM +0100, Petr Mladek wrote:
> IMHO, the bad scenario with OOM was that any printk() called in
> the OOM report became console_lock owner and was responsible
> for pushing all new messages to the console. There was a possible
> livelock because OOM Killer was
On Wed 2018-01-17 11:18:56, Sergey Senozhatsky wrote:
> On (01/16/18 10:45), Steven Rostedt wrote:
> [..]
> > > [1] https://marc.info/?l=linux-mm&m=145692016122716
> >
> > Especially since Konstantin is working on pulling in all LKML archives,
> > the above should be denoted as:
> >
> > Link:
>
On Tue 2018-01-16 11:44:56, Tejun Heo wrote:
> Hello, Steven.
>
> On Thu, Jan 11, 2018 at 09:55:47PM -0500, Steven Rostedt wrote:
> > All I did was start off a work queue on each CPU, and each CPU does one
> > printk() followed by a millisecond sleep. No 10,000 printks, nothing
> > in an interrupt
On (01/16/18 11:13), Petr Mladek wrote:
[..]
> IMHO, it would make sense if flushing the printk buffer behaves
> the same when called either from printk() or from any other path.
> I mean that it should be aggressive and allow an effective
> hand off.
>
> It should be safe as long as foo_specific_
On (01/16/18 11:19), Petr Mladek wrote:
[..]
> > [1] https://marc.info/?l=linux-mm&m=145692016122716
> > Fixes: 6b97a20d3a79 ("printk: set may_schedule for some of
> > console_trylock() callers")
> > Signed-off-by: Sergey Senozhatsky
> > Reported-by: Tetsuo Handa
>
> IMHO, this is a step in the
On (01/16/18 10:45), Steven Rostedt wrote:
[..]
> > [1] https://marc.info/?l=linux-mm&m=145692016122716
>
> Especially since Konstantin is working on pulling in all LKML archives,
> the above should be denoted as:
>
> Link:
> http://lkml.kernel.org/r/201603022101.CAH73907.OVOOMFHFFtQJSL%20()%20
Hello, Steven.
On Thu, Jan 11, 2018 at 09:55:47PM -0500, Steven Rostedt wrote:
> All I did was start off a work queue on each CPU, and each CPU does one
> printk() followed by a millisecond sleep. No 10,000 printks, nothing
> in an interrupt handler. Preemption is disabled while the printk
> happe
On Tue, 16 Jan 2018 15:10:13 +0900
Sergey Senozhatsky wrote:
> overall that's very close to what I have in one of my private branches.
> console_trylock_spinning() for some reason does not perform really
> well on my made-up internal printk torture tests. it seems that I
One thing I noticed in m
On Tue, 16 Jan 2018 13:47:16 +0900
Sergey Senozhatsky wrote:
> From: Sergey Senozhatsky
> Subject: [PATCH] printk: never set console_may_schedule in console_trylock()
>
> This patch, basically, reverts commit 6b97a20d3a79 ("printk:
> set may_schedule for some of console_trylock() callers").
> T
On Tue 2018-01-16 13:47:16, Sergey Senozhatsky wrote:
> if you don't mind, let me fix the thing that I broke.
> that would be responsible. I believe I also must say the following:
> Tetsuo, many thanks for reporting the issues for so long, and
> sorry that it took quite a while to revert that
On Tue 2018-01-16 11:23:49, Sergey Senozhatsky wrote:
> On (01/15/18 15:45), Petr Mladek wrote:
> > > I think adding the preempt_disable() would fix printk() but let non
> > > printk console_unlock() still preempt.
> >
> > I would personally remove cond_resched() from console_unlock()
> > complete
On (01/16/18 10:36), Petr Mladek wrote:
[..]
> > unfortunately disabling preemption in console_unlock() is a bit
> > dangerous :( we have paths that call console_unlock() exactly
> > to flush everything (not only new pending messages, but everything)
> > that is in logbuf and we cannot return from c
On Tue 2018-01-16 15:10:13, Sergey Senozhatsky wrote:
> Hi,
>
> On (01/15/18 12:50), Petr Mladek wrote:
> > On Mon 2018-01-15 11:17:43, Petr Mladek wrote:
> > > PS: Sergey, you have many good points. The printk-stuff is very
> > > complex and we could spend years discussing the perfect solution.
>
On Tue 2018-01-16 14:16:22, Sergey Senozhatsky wrote:
> On (01/15/18 09:51), Petr Mladek wrote:
> > On Sat 2018-01-13 16:31:00, Sergey Senozhatsky wrote:
> > > On (01/12/18 13:55), Petr Mladek wrote:
> > > [..]
> > > > > I'm not fixing console_unlock(), I'm fixing printk(). BTW, all my
> > > > > ke
Hi,
On (01/15/18 12:50), Petr Mladek wrote:
> On Mon 2018-01-15 11:17:43, Petr Mladek wrote:
> > PS: Sergey, you have many good points. The printk-stuff is very
> > complex and we could spend years discussing the perfect solution.
>
> BTW: One solution that comes to my mind is based on ideas
> al
Hi,
On (01/15/18 11:17), Petr Mladek wrote:
> Hi Sergey,
>
> I wonder if there is still some misunderstanding.
>
> Steven and I are trying to get this patch in because we believe
> that it is a step forward. We know that it is not perfect. But
> we believe that it makes things better. In part
On (01/15/18 09:51), Petr Mladek wrote:
> On Sat 2018-01-13 16:31:00, Sergey Senozhatsky wrote:
> > On (01/12/18 13:55), Petr Mladek wrote:
> > [..]
> > > > I'm not fixing console_unlock(), I'm fixing printk(). BTW, all my
> > > > kernels are CONFIG_PREEMPT (I'm a RT guy), my mind thinks more about
On (01/15/18 07:08), Steven Rostedt wrote:
> On Fri, 12 Jan 2018 13:55:37 +0100
> Petr Mladek wrote:
>
> > > I'm not fixing console_unlock(), I'm fixing printk(). BTW, all my
> > > kernels are CONFIG_PREEMPT (I'm a RT guy), my mind thinks more about
> > > PREEMPT kernels than !PREEMPT ones.
> >
On (01/16/18 11:23), Sergey Senozhatsky wrote:
[..]
> > Adding the preempt_disable() basically means to revert the already
> > mentioned commit 6b97a20d3a7909daa06625 ("printk: set may_schedule
> > for some of console_trylock() callers").
> >
> > I originally wanted to solve this separately to mak
On (01/15/18 15:45), Petr Mladek wrote:
[..]
> > With the preempt_disable() there really isn't a delay. I agree, we
> > shouldn't let printk preempt (unless we have CONFIG_PREEMPT_RT enabled,
> > but that's another story).
> >
> > >
> > > so very schematically, for hand-off it's something like
>
On (01/15/18 07:06), Steven Rostedt wrote:
> > > Yep, but I'm still not convinced you are seeing an issue with a single
> > > printk.
> >
> > what do you mean by this?
>
> I'm not sure your issues happen because a single printk is locked up,
> but you have many printks in one area.
hm, need to
On Mon 2018-01-15 07:06:37, Steven Rostedt wrote:
> On Sat, 13 Jan 2018 16:28:34 +0900
> Sergey Senozhatsky wrote:
> > On (01/12/18 07:21), Steven Rostedt wrote:
> >
> > > An OOM does not do everything in one printk, it calls hundreds.
> > > Having hundreds of printks is an issue, especially in c
On Fri, 12 Jan 2018 13:55:37 +0100
Petr Mladek wrote:
> > I'm not fixing console_unlock(), I'm fixing printk(). BTW, all my
> > kernels are CONFIG_PREEMPT (I'm a RT guy), my mind thinks more about
> > PREEMPT kernels than !PREEMPT ones.
>
> I would say that the patch improves also console_unlo
On Sat, 13 Jan 2018 16:28:34 +0900
Sergey Senozhatsky wrote:
> On (01/12/18 07:21), Steven Rostedt wrote:
> [..]
> > Yep, but I'm still not convinced you are seeing an issue with a single
> > printk.
>
> what do you mean by this?
I'm not sure your issues happen because a single printk is lock
On Mon 2018-01-15 11:17:43, Petr Mladek wrote:
> PS: Sergey, you have many good points. The printk-stuff is very
> complex and we could spend years discussing the perfect solution.
BTW: One solution that comes to my mind is based on ideas
already mentioned in this thread:
void console_unlock(voi
Hi Sergey,
I wonder if there is still some misunderstanding.
Steven and I are trying to get this patch in because we believe
that it is a step forward. We know that it is not perfect. But
we believe that it makes things better. In particular, it limits
the time spent in console_unlock() in ato
On (01/15/18 09:51), Petr Mladek wrote:
> On Sat 2018-01-13 16:31:00, Sergey Senozhatsky wrote:
> > On (01/12/18 13:55), Petr Mladek wrote:
> > [..]
> > > > I'm not fixing console_unlock(), I'm fixing printk(). BTW, all my
> > > > kernels are CONFIG_PREEMPT (I'm a RT guy), my mind thinks more about
On Sat 2018-01-13 16:31:00, Sergey Senozhatsky wrote:
> On (01/12/18 13:55), Petr Mladek wrote:
> [..]
> > > I'm not fixing console_unlock(), I'm fixing printk(). BTW, all my
> > > kernels are CONFIG_PREEMPT (I'm a RT guy), my mind thinks more about
> > > PREEMPT kernels than !PREEMPT ones.
> >
>
On (01/12/18 13:55), Petr Mladek wrote:
[..]
> > I'm not fixing console_unlock(), I'm fixing printk(). BTW, all my
> > kernels are CONFIG_PREEMPT (I'm a RT guy), my mind thinks more about
> > PREEMPT kernels than !PREEMPT ones.
>
> I would say that the patch improves also console_unlock() but only
On (01/12/18 07:21), Steven Rostedt wrote:
[..]
> Yep, but I'm still not convinced you are seeing an issue with a single
> printk.
what do you mean by this?
> An OOM does not do everything in one printk, it calls hundreds.
> Having hundreds of printks is an issue, especially in critical sections.
On Fri 2018-01-12 07:21:23, Steven Rostedt wrote:
> On Fri, 12 Jan 2018 19:05:44 +0900
> Sergey Senozhatsky wrote:
> > 3) console_unlock(void)
> >    {
> >        for (;;) {
> >            printk_safe_enter_irqsave(flags);
> >            // lock-unlock logbuf
> >            call_console_drivers(ex
On Fri, 12 Jan 2018 19:05:44 +0900
Sergey Senozhatsky wrote:
> Steven, we are having too many things in one email, I've dropped most
> of them to concentrate on one topic only.
I totally agree, and I believe this is the reason behind the tensions
between us. We are not discussing the topic of th
Steven, we are having too many things in one email, I've dropped most
of them to concentrate on one topic only.
On (01/11/18 22:21), Steven Rostedt wrote:
[..]
>
> After playing with the module in my last email, I think you're trying to
> solve multiple printks, not one that is stuck
I wouldn't say
On Thu, 11 Jan 2018 21:55:47 -0500
Steven Rostedt wrote:
> I ran this on a box with 4 CPUs and a serial console (so it has a slow
> console). Again, all I have is each CPU doing exactly ONE printk()!
> then sleeping for a full millisecond! It will cause a lot of output,
> and perhaps slow the sys
On Fri, 12 Jan 2018 11:56:12 +0900
Sergey Senozhatsky wrote:
> Hi,
>
> On (01/11/18 11:29), Steven Rostedt wrote:
> [..]
> > > - if the patch's goal is to bound (not necessarily to watchdog's
> > > threshold)
> > > the amount of time we spend in console_unlock(), then the patch is kinda
> > > o
On (01/11/18 20:30), Steven Rostedt wrote:
[..]
> Today, printk() can print for a time of A * B, where, as you state
> above:
>
>    A is the amount of data to print in the worst case
>    B the time call_console_drivers() needs to print a single
> char to all registered and enabled consol
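To put rough numbers on the A * B estimate quoted above, here is a minimal user-space sketch. It rests on assumptions that are not in the thread: a single serial console with 8N1 framing (10 bits on the wire per byte) and no other overhead. The function name flush_time_us is hypothetical, and the 128 KiB buffer size in the note below is only an illustrative value for A.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Microseconds needed to push nbytes through a UART at the given baud
 * rate, assuming 10 bits on the wire per byte (8N1). This models
 * A * B with A = nbytes and B = time per character.
 */
uint64_t flush_time_us(uint64_t nbytes, uint64_t baud)
{
    assert(baud > 0);
    return nbytes * 10u * 1000000u / baud;
}
```

Under these assumptions, flush_time_us(131072, 115200) returns 11377777, so draining a full 128 KiB buffer at the 115200 baud mentioned elsewhere in the thread would keep one context printing for about 11.4 seconds. Slower or additional consoles increase B and lengthen that time.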
Hi,
On (01/11/18 11:29), Steven Rostedt wrote:
[..]
> > - if the patch's goal is to bound (not necessarily to watchdog's threshold)
> > the amount of time we spend in console_unlock(), then the patch is kinda
> > overcomplicated. but no further questions in this case.
>
> Its goal is to keep pri
On Thu, 11 Jan 2018 20:30:57 -0500
Steven Rostedt wrote:
> I have to say that your analysis here really does point out the benefit
> of my patch.
>
> Today, printk() can print for a time of A * B, where, as you state
> above:
>
>    A is the amount of data to print in the worst case
>    B the
On Thu, 11 Jan 2018 11:29:08 -0500
Steven Rostedt wrote:
> > claiming that for any given A, B, C the following is always true
> >
> > A * B < C
> >
> > where
> > A is the amount of data to print in the worst case
> > B the time call_console_drivers() needs to
On Thu, 11 Jan 2018 19:38:45 +0900
Sergey Senozhatsky wrote:
>
> the non-atomic -> atomic context console_sem transfer. we previously
> would have kept the console_sem owner to its non-atomic owner. we now
> will make sure that if printk from atomic context happens then it will
> make it to cons
On (01/11/18 12:24), Petr Mladek wrote:
[..]
> You might argue that we already know that Steven's solution will
> not be enough. IMHO, the problem here is the term "real life example".
this is really boring, how real life examples happen only on Steven's PC
or Petr's qemu image. whatever.
On Thu 2018-01-11 19:38:45, Sergey Senozhatsky wrote:
> On (01/11/18 10:34), Petr Mladek wrote:
> [..]
> > > except that handing off a console_sem to atomic task when there
> > > is O(logbuf) > watchdog_thresh is a regression, basically...
> > > it is what it is.
> >
> > How this could be a re
On Thu 2018-01-11 16:36:18, Sergey Senozhatsky wrote:
> Hi Mathieu,
>
> On (01/10/18 18:40), Mathieu Desnoyers wrote:
> [..]
> >
> > There appear to be two problems at hand. One is making sure a console
> > buffer owner only flushes a bounded amount of data.
>
> which, realistically, has quite
On (01/11/18 10:34), Petr Mladek wrote:
[..]
> > except that handing off a console_sem to atomic task when there
> > is O(logbuf) > watchdog_thresh is a regression, basically...
> > it is what it is.
>
> How this could be a regression? Is not the victim that handles
> other printk's random? Wh
On Thu 2018-01-11 13:58:17, Sergey Senozhatsky wrote:
> On (01/10/18 13:05), Steven Rostedt wrote:
> > The solution is simple, everyone at KS agreed with it, there should be
> > no controversy here.
>
> frankly speaking, that's not what I recall ;)
To be honest, I no longer remember the detai
Hi Mathieu,
On (01/10/18 18:40), Mathieu Desnoyers wrote:
[..]
>
> There appear to be two problems at hand. One is making sure a console
> buffer owner only flushes a bounded amount of data.
which, realistically, has quite little to do with the "and thus it
fixes the lockups". logbuf size is mu
On (01/10/18 14:17), Steven Rostedt wrote:
[..]
> OK, let's start over.
good.
> Right now my focus is an incremental approach. I'm not trying to solve
> all issues that printk has. I've focused on a single issue, and that is
> that printk is unbounded. Coming from a Real Time background, I find
>
On (01/10/18 19:21), Peter Zijlstra wrote:
>
> On Wed, Jan 10, 2018 at 09:02:23AM -0800, Tejun Heo wrote:
> > 2. System runs out of memory, OOM triggers.
> > 3. OOM handler is printing out OOM debug info.
> > 4. While trying to emit the messages for netconsole, the network stack
> >    / driver tr
On (01/10/18 17:29), Petr Mladek wrote:
[..]
> The next versions used lazy offload from console_unlock() when
> the thread spent there too much time. IMHO, this is one
> very promising solution. It guarantees that softlockup
> would never happen. But it tries hard to get the messages
> out immediat
On (01/10/18 13:05), Steven Rostedt wrote:
[..]
> My solution takes printk from its current unbounded state, and makes it
> fixed bounded. Which means printk() is now a O(1) algorithm.
^^^
O(logbuf)
and
Hello, Steven.
On Wed, Jan 10, 2018 at 02:44:55PM -0500, Steven Rostedt wrote:
> Yes, there can be the case that printks are added via an interrupt, but
> then again, it's an issue that a single CPU. And printks from interrupt
> context should be considered critical, part of the ASAP category. If
On Wed, 10 Jan 2018 11:34:51 -0800
Tejun Heo wrote:
> > Right now my focus is an incremental approach. I'm not trying to solve
> > all issues that printk has. I've focused on a single issue, and that is
> > that printk is unbounded. Coming from a Real Time background, I find
> > that is a big pro
Hello, Steven.
On Wed, Jan 10, 2018 at 02:17:58PM -0500, Steven Rostedt wrote:
> > I'm not really sure why punting to a safe context is necessarily
> > unacceptable in terms of #1 because there seems to be a pretty wide
> > gap between printing useful messages synchronously and a system being
> >
On Wed, 10 Jan 2018 10:57:47 -0800
Tejun Heo wrote:
> Hello, Steven.
>
> On Wed, Jan 10, 2018 at 01:41:57PM -0500, Steven Rostedt wrote:
> > The issue with the solution you want to do with printk is that it can
> > break existing printk usages. As Petr said, people want printk to do two
> > thin
Hello,
On Wed, Jan 10, 2018 at 07:41:44PM +0100, Peter Zijlstra wrote:
> Typically we (scheduler) have removed printk()s (on boot) when BIGSMP
> folks say it creates boot pain. Much of it is now behind the sched_debug
> parameter, others are compressed.
>
> I've also seen other people reduce prin
Hello, Steven.
On Wed, Jan 10, 2018 at 01:41:57PM -0500, Steven Rostedt wrote:
> The issue with the solution you want to do with printk is that it can
> break existing printk usages. As Petr said, people want printk to do two
> things. 1 - print out data ASAP, 2 - not lock up the system. The two
>
On Wed, 10 Jan 2018 17:29:00 +0100
Petr Mladek wrote:
> The next versions used lazy offload from console_unlock() when
> the thread spent there too much time. IMHO, this is one
> very promising solution. It guarantees that softlockup
> would never happen. But it tries hard to get the messages
> ou
On Wed, 10 Jan 2018 10:14:59 -0800
Tejun Heo wrote:
> On Wed, Jan 10, 2018 at 10:12:52AM -0800, Tejun Heo wrote:
> > Hello, Steven.
> >
> > So, everything else on your message, sure. You do what you have to
> > do, but I really don't understand the following part, and this has
> > been the main
On Wed, Jan 10, 2018 at 10:30:55AM -0800, Tejun Heo wrote:
> > Why not kill recursive OOM (msgs) ?
>
> Sure, we can do that too, e.g. marking flushing thread and ignoring
> new messages from it, although that does come with its own downsides.
Typically we (scheduler) have removed printk()s (on bo
On Wed, 10 Jan 2018 10:12:52 -0800
Tejun Heo wrote:
> Hello, Steven.
>
> So, everything else on your message, sure. You do what you have to
> do, but I really don't understand the following part, and this has
> been the main source of frustration in the whole discussion.
>
> On Wed, Jan 10, 20
- On Jan 10, 2018, at 12:02 PM, Tejun Heo t...@kernel.org wrote:
> Hello, Linus, Andrew.
>
> On Wed, Jan 10, 2018 at 05:29:00PM +0100, Petr Mladek wrote:
>> Where is the acceptable compromise? I am not sure. So far, the most
>> forceful people (Linus) did not see softlockups as a big problem.