On Wed 2019-09-18 11:05:28, John Ogness wrote:
> On 2019-09-18, Sergey Senozhatsky <sergey.senozhatsky.w...@gmail.com> wrote:
> >> Each console has its own iterator. These iterators will need to
> >> advance, regardless if the message was printed via write() or
> >> write_atomic().
> >
> > Great.
> >
> > ->atomic_write() path will make sure that kthread is parked or will
> > those compete for uart port?
> 
> A cpu-lock (probably per-console) will be used to synchronize the
> two. Unlike my RFCv1, we want to keep the cpu-lock out of the console
> drivers and we want it to be less aggressive (using trylock's instead of
> spinning). This should make the cpu-lock less "dangerous". I talked with
> PeterZ, Thomas, and PetrM about how this can be implemented, but there
> may still be some corner cases.

If we take cpu_lock() only in non-preemptive context and the system is
working normally then try_lock() should be pretty reliable. I mean
that try_lock() would either succeed or the other CPU, the one that
currently owns the lock, would be able to flush the messages.
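
To illustrate the "either succeed or the owner flushes" idea, here is a
rough userspace model using C11 atomics. It is only a sketch and not
kernel code: struct console_model, model_trylock(), model_flush() and
model_printk() are made-up names, and "printing" is reduced to advancing
the console iterator. The point is the re-check after unlock, so that a
message stored by a CPU whose trylock failed is picked up by the owner.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define NO_OWNER (-1)

/* Hypothetical model of one console with its own iterator. */
struct console_model {
	atomic_int owner;       /* CPU id holding the cpu-lock, or NO_OWNER */
	atomic_ulong head_seq;  /* number of messages stored in the log */
	atomic_ulong con_seq;   /* console iterator: messages already printed */
};

static bool model_trylock(struct console_model *c, int cpu)
{
	int expected = NO_OWNER;

	return atomic_compare_exchange_strong(&c->owner, &expected, cpu);
}

static void model_unlock(struct console_model *c)
{
	atomic_store(&c->owner, NO_OWNER);
}

/*
 * Try to print everything that is pending. Never spins on the lock.
 * Returns true when this CPU printed something.
 */
static bool model_flush(struct console_model *c, int cpu)
{
	bool flushed = false;

	/* The loop condition is the re-check after each unlock. */
	while (atomic_load(&c->con_seq) < atomic_load(&c->head_seq)) {
		if (!model_trylock(c, cpu))
			break;	/* the owner will see the pending messages */

		/* "Print": advance the iterator to the current head. */
		atomic_store(&c->con_seq, atomic_load(&c->head_seq));
		model_unlock(c);
		flushed = true;
	}
	return flushed;
}

/* Store one message first, then try to flush it. */
static bool model_printk(struct console_model *c, int cpu)
{
	atomic_fetch_add(&c->head_seq, 1);
	return model_flush(c, cpu);
}
```

A caller whose trylock fails simply returns. The message is already
stored, and the owner notices it either while printing or in the
re-check after it drops the lock.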

We might need to be more aggressive in panic(). But then it should be
easier because only one CPU can be running panic. This CPU would try
to stop the other CPUs and flush the consoles.

I also thought about reusing the console-waiter logic in panic().
We could try to steal the cpu_lock() in a safer way. We would only
need to limit the busy waiting to 1 sec or so.
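
A minimal userspace sketch of such a bounded handover, again with C11
atomics. All names here (struct cpu_lock_model, model_owner_safe_point(),
model_panic_acquire()) are hypothetical, and the wait is bounded by a
poll budget instead of a real timeout only to keep the example
deterministic. Real code would bound it by time, 1 sec or so.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define NO_OWNER (-1)

/* Hypothetical model of the cpu-lock with a console-waiter style flag. */
struct cpu_lock_model {
	atomic_int owner;    /* CPU id holding the lock, or NO_OWNER */
	atomic_bool waiter;  /* set by the panic CPU to request a handover */
};

enum acquire_result { ACQ_FREE, ACQ_HANDOVER, ACQ_STOLEN };

/*
 * Called by the current owner at a safe point, e.g. between two
 * records. Returns true when the lock was given up for a waiter and
 * the caller must stop printing.
 */
static bool model_owner_safe_point(struct cpu_lock_model *l)
{
	if (!atomic_load(&l->waiter))
		return false;

	atomic_store(&l->owner, NO_OWNER);
	return true;
}

/* Panic path: ask for a handover first, steal only as a last resort. */
static enum acquire_result model_panic_acquire(struct cpu_lock_model *l,
					       int cpu,
					       unsigned long max_polls)
{
	int expected = NO_OWNER;

	if (atomic_compare_exchange_strong(&l->owner, &expected, cpu))
		return ACQ_FREE;

	atomic_store(&l->waiter, true);

	/* Bounded busy wait for the owner to reach a safe point. */
	for (unsigned long i = 0; i < max_polls; i++) {
		expected = NO_OWNER;
		if (atomic_compare_exchange_strong(&l->owner, &expected, cpu)) {
			atomic_store(&l->waiter, false);
			return ACQ_HANDOVER;
		}
	}

	/* Budget used up: the owner is probably stopped. Take the lock. */
	atomic_store(&l->waiter, false);
	atomic_store(&l->owner, cpu);
	return ACQ_STOLEN;
}
```

The steal at the end is the unsafe part. The handover only reduces how
often we get there. It does not make the forced takeover itself safe.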

Regarding SysRq: I could imagine introducing another SysRq that
would just call panic(). I mean that it would try to flush the
logs and reboot in the safest way.

I am not completely sure what to do with suspend, halt, and other
operations where we could not rely on the kthread. I would prefer to
allow only atomic consoles there in the beginning.

These are just some ideas. I do not think that everything needs to be
done immediately. I am sure that we will break some scenarios. We
should not complicate the code too much proactively because of
scenarios that are not very reliable even now.


> I would like to put everything together now so that we can run and test
> if the decisions made in that meeting hold up for all the cases. I think
> it will be easier to identify/add the missing pieces, once we have it
> coded.

Makes sense. Just please do not hold the entire series until all
details are solved.

It is always easier to review small pieces. Also it is a big pain
to rework/rebase a huge series. IMHO, we need to reasonably handle
the normal state and panic() at the beginning. All the other special
situations can be solved by follow-up patches.

Best Regards,
Petr
