On Mon, Jun 29, 2020 at 04:50:20PM +0200, Petr Mladek wrote:
> On Mon 2020-06-29 16:59:24, Cengiz Can wrote:
> > `kdb_msg_write` operates on a global `struct kgdb_io *` called
> > `dbg_io_ops`.
> > 
> > Although it is initialized in `debug_core.c`, there's a null check in
> > `kdb_msg_write` which implies that it can be null whenever we dereference
> > it in this function call.
> > 
> > Coverity scanner caught this as CID 1465042.
> > 
> > I have modified the function to bail out if `dbg_io_ops` is not properly
> > initialized.
> > 
> > Signed-off-by: Cengiz Can <cen...@kernel.wtf>
> > ---
> >  kernel/debug/kdb/kdb_io.c | 15 ++++++++-------
> >  1 file changed, 8 insertions(+), 7 deletions(-)
> > 
> > diff --git a/kernel/debug/kdb/kdb_io.c b/kernel/debug/kdb/kdb_io.c
> > index 683a799618ad..85e579812458 100644
> > --- a/kernel/debug/kdb/kdb_io.c
> > +++ b/kernel/debug/kdb/kdb_io.c
> > @@ -549,14 +549,15 @@ static void kdb_msg_write(const char *msg, int 
> > msg_len)
> >     if (msg_len == 0)
> >             return;
> >  
> > -   if (dbg_io_ops) {
> > -           const char *cp = msg;
> > -           int len = msg_len;
> > +   if (!dbg_io_ops)
> > +           return;
> 
> This looks wrong. The message should be printed to the consoles
> even when dbg_io_ops is NULL. I mean that the for_each_console(c)
> cycle should always get called.
> 
> Well, the code really looks racy. dbg_io_ops is set under
> kgdb_registration_lock. IMHO, it should also get accessed under this lock.
> 
> It seems that the race is possible. kdb_msg_write() is called from
> vkdb_printf(). This function is serialized on more CPUs using
> kdb_printf_cpu lock. But it is not serialized with
> kgdb_register_io_module() and kgdb_unregister_io_module() calls.

We can't take the lock from the trap handler itself since we cannot
have spinlocks contended between regular threads and the debug trap
(which could be an NMI).

Instead, the call to kgdb_unregister_callbacks() at the beginning
of kgdb_unregister_io_module() should render kdb_msg_write()
unreachable prior to dbg_io_ops becoming NULL.

As it happens I am starting to believe there is a race in this area but
the race is between register/unregister calls rather than against the
trap handler (if there were register/unregister races then the trap
handler is be a potential victim of the race though).


> But I might miss something. dbg_io_ops is dereferenced on many other
> locations without any check.

There is already a paranoid "just in case there are bugs" check in
kgdb_io_ready() so in any case I think the check in kdb_msg_write() can
simply be removed.

As I said in my other post, if dbg_io_ops were ever NULL then the
system is completely hosed anyway: we can never receive the keystroke
needed to leave the debugger... and may not be able to tell anybody
why.


> >  
> > -           while (len--) {
> > -                   dbg_io_ops->write_char(*cp);
> > -                   cp++;
> > -           }
> > +   const char *cp = msg;
> > +   int len = msg_len;
> > +
> > +   while (len--) {
> > +           dbg_io_ops->write_char(*cp);
> > +           cp++;
> >     }
> >  
> >     for_each_console(c) {
> 
> You probably got confused by this new code:
> 
>               if (c == dbg_io_ops->cons)
>                       continue;
> 
> It dereferences dbg_io_ops without NULL check. It should probably
> get replaced by:
> 
>               if (dbg_io_ops && c == dbg_io_ops->cons)
>                       continue;
> 
> Daniel, Sumit, could you please put some light on this?

As above, I think the NULL check that confuses coverity can simply be
removed.


Daniel.

Reply via email to