On (12/15/17 09:31), Petr Mladek wrote:
> > On (12/14/17 22:18), Steven Rostedt wrote:
> > > > Steven, your approach works ONLY when we have the following
> > > > preconditions:
> > > >
> > > >  a) there is a CPU that is calling printk() from the 'safe' (non-atomic,
> > > >     etc) context
> > > >
> > > >         what does guarantee that? what happens if there is NO non-atomic
> > > >         CPU or that non-atomic simply misses the console_owner != false
> > > >         point? we are going to conclude
> > > >
> > > >         "if printk() doesn't work for you, it's because you are holding
> > > >         it wrong"?
> > > >
> > > >
> > > >         what if that non-atomic CPU does not call printk(), but instead
> > > >         it does console_lock()/console_unlock()? why there is no handoff?
> > > >
> > > >         CPU0                    CPU1 ~ CPU10
> > > >                                 in atomic contexts [!]. ping-ponging console_sem
> > > >                                 ownership to each other. while what they really
> > > >                                 need to do is to simply up() and let CPU0 to
> > > >                                 handle it.
> > > >                                 printk
> > > >         console_lock()
> > > >          schedule()
> > > >                                 ...
> > > >                                 printk
> > > >                                 printk
> > > >                                 ...
> > > >                                 printk
> > > >                                 printk
> > > >
> > > >                                 up()
> > > >
> > > >         // woken up
> > > >         console_unlock()
> > > >
> > > >         why do we make an emphasis on fixing vprintk_printk()?
>
> Is the above scenario really dangerous? console_lock() owner is
> able to sleep. Therefore there is no risk of a softlockup.
>
> Sure, many messages will get stacked in the meantime and the console
> owner may get then passed to another owner in atomic context. But
> do you really see this in the real life?

console_sem is locked by atomic printk CPU1~CPU10. non-atomic CPU is just
sleeping waiting for the console_sem. while atomic printk CPUs just hand
off console_sem ownership to each other without ever up()-ing the
console_sem. what's the point of hand off here? how is that going to work?

what we need to do is to offload printing from atomic contexts to a
non-atomic one - which is CPU0. and that non-atomic CPU is sleeping on
the console_sem, ready to continue printing. but it never gets its chance
to do so, because CPU1 ~ CPU10 just pass console_sem ownership around,
resulting in the same "we print from atomic context" thing.

> Of course, there is a chance that it will pass the work from
> a safe context to atomic one. But there was the same chance that
> the work already started in the atomic context. Therefore statistically
> this should not make things worse.

which is not a justification. we are not looking for a solution that
does not make the things worse. we are looking for a solution that
does improve the things.

	-ss
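
p.s. a toy model of the CPU0 vs CPU1 ~ CPU10 scenario above, in case it
helps to see the difference in numbers. this is NOT kernel code and not
the patch under discussion: simulate(), struct result and the
up_for_sleeper flag are made up for illustration. the model assumes that
one owner flushes one line at a time, that CPU0 is already sleeping in
console_lock() when the first line is flushed, and that a woken CPU0
keeps the console until the log is empty.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_ATOMIC_CPUS 10	/* CPU1 ~ CPU10 from the diagram */

/* hypothetical bookkeeping, not a kernel structure */
struct result {
	int printed_atomic;	/* lines flushed by CPU1 ~ CPU10 */
	int printed_sleepable;	/* lines flushed by CPU0 */
};

/*
 * Toy model (assumption, see the text above):
 *   up_for_sleeper == false: the atomic owner always hands console
 *                            ownership to the next atomic printk() caller
 *   up_for_sleeper == true:  the atomic owner does up() as soon as it
 *                            sees a task sleeping on console_sem
 */
struct result simulate(int nr_messages, bool up_for_sleeper)
{
	struct result res = { 0, 0 };
	int pending = nr_messages;	/* lines still to be flushed */
	int owner = 1;			/* CPU1 took console_sem first */
	bool cpu0_sleeping = true;	/* CPU0 sleeps in console_lock() */

	assert(nr_messages >= 0);

	while (pending > 0) {
		pending--;

		if (owner == 0) {
			/* CPU0 can sleep, so it simply keeps printing */
			res.printed_sleepable++;
			continue;
		}

		res.printed_atomic++;

		if (up_for_sleeper && cpu0_sleeping) {
			/* up(): wake CPU0 and let it do the rest */
			owner = 0;
			cpu0_sleeping = false;
		} else {
			/* hand off to the next atomic CPU, 1..10 */
			owner = owner % NR_ATOMIC_CPUS + 1;
		}
	}

	return res;
}
```

with 100 pending lines, simulate(100, false) flushes all of them from
atomic contexts and CPU0 never prints anything, while simulate(100, true)
flushes one line from atomic context and the remaining 99 from CPU0.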