On (01/29/16 08:54), Byungchul Park wrote: > > The problem is you have postulated a very shallow recursion. > > This looks much worse if this happens 1000 times, and > > probably won't recover to output anything. > > > > Additionally, what if the console_sem is simply corrupted? > > A livelock with no output ever is not very helpful. > > > > As I wrote earlier, I don't think this is the way to fix > > recursion problems with printk() [by eliding output]. > > I think you are currently misunderstading about this patch. Or I'm > misunderstanding you.. The patch was changed in v4 so that it can print > a debug message even in the recursive cycle case, at the first time. > > I also thought printing nothing in the case was not helpful at all which I > did in v1,2,3. But it's changed in v4, that is, this patch.
because you don't give any details and don't answer any questions. it took a while to even find out that you are reporting this issues not against a real H/W, but a qemu. I suppose qemu-arm running on x86_64 box. now, what else we don't know? explain STEP-BY-STEP why do you think spinlock debug code can lockup itself. not just "I don't think this is the case, I don't think that is the case". on very spin_dump recursive call it waits for the spin_lock and when it eventually grabs it, it does the job that it wanted to do under that spin lock, unlock it and return back. and the only case when it never "return back" is when it never "eventually grabs it". so I still don't see what issue you fix here -- the possibility to consume the entire kernel stack doing recursive spin_dump->spin_lock() calls because: a) something never unlocks the lock (no matter why.. corruption, HW fault, etc.) or b) everything was OK, but we attempted to printk() already being in a very-very deep callstack, so doing 5 extra printk->spin_dump->printk->spin_dump would simply kill it. if none of the above. then what you report and fix is simply non realistic. spin_dump must eventually unwind the stack back. yes, you'll see a lot of dump_stack() and all cpus backtraces done on every roollback stack. but you would still see some of them anyway, even w/o the spinlock debug code -- because you'd just raw_spin_lock_irqsave() on that lock for a very long time; which does upset watchdog, etc. please start explaining the things. -ss