Hi Junichi,

Peter Hurley posted a patch to fix this problem a little earlier:
https://lkml.org/lkml/2015/11/27/546

It may have first been found on arm by Pratyush Anand <pan...@redhat.com>.
I hit it too this week on Fedora 23.

Anyway, it's great that the problem has been fixed so quickly. I'm just
replying to let you know.

Thanks
Baoquan

On 12/16/15 at 06:32am, Junichi Nomura wrote:
> Since kernel v4.4-rc1, kdump capture service with Fedora23 / RHEL7.2
> almost always fails on my test system which uses serial console. It
> used to work fine until kernel v4.3.
> 
> Kdump fails with an error like this:
>   kdump.sh[1040]: /bin/kdump.sh: line 8: /dev/console: Input/output error
> 
> Line 8 of kdump.sh does this:
>   exec &> /dev/console
> (http://pkgs.fedoraproject.org/cgit/kexec-tools.git/tree/dracut-kdump.sh)
> 
> and the EIO is returned by this code in tty_reopen():
>         if (!tty->count)
>                 return -EIO;
> 
> Bisection shows that commit 79c1faa4511e ("tty: Remove
> tty_wait_until_sent_from_close()") is the first bad commit.
> Indeed, after reverting the commit, kdump capture starts working
> again.
> 
> Open of /dev/console used to return -EIO when it races with close.
> (https://bugs.launchpad.net/ubuntu/+source/linux/+bug/554172/comments/245)
> But the commit seems to widen the race window.
> 
>   Before the commit:
>     tty_release()
>       tty_lock(tty)
>       tty->ops->close(tty, filp)
>         tty_unlock(tty)
>         tty_wait_until_sent()
>         // the window starts from here
>         tty_lock(tty)
>       decrement tty->count
>       tty_unlock(tty)
>       (releasing tty if count became zero)
> 
>   After the commit:
>     tty_release()
>       // the window starts from here
>       tty_lock(tty)
>       tty->ops->close(tty, filp)
>         tty_wait_until_sent()
>       decrement tty->count
>       tty_unlock(tty)
>       (releasing tty if count became zero)
> 
> While it might be possible for user space to cope with the problem
> by retrying open(), there is no indication of whether, or for how
> long, it should retry. Also, the current situation makes shell scripts
> like the above kdump.sh fragile against this sort of timing change.
> 
> How about retrying tty_open in the kernel instead, like the attached
> patch? If !tty->count in tty_reopen() means the race has happened, that
> seems reasonable.
> 
> ---
> Jun'ichi Nomura, NEC Corporation
> 
> diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
> index bcc8e1e..070ea66 100644
> --- a/drivers/tty/tty_io.c
> +++ b/drivers/tty/tty_io.c
> @@ -1462,8 +1462,9 @@ static int tty_reopen(struct tty_struct *tty)
>  {
>       struct tty_driver *driver = tty->driver;
>  
> +     /* We cannot re-open tty which is being released. */
>       if (!tty->count)
> -             return -EIO;
> +             return -ERESTARTSYS;
>  
>       if (driver->type == TTY_DRIVER_TYPE_PTY &&
>           driver->subtype == PTY_TYPE_MASTER)
> @@ -2087,6 +2088,11 @@ retry_open:
>  
>       if (IS_ERR(tty)) {
>               retval = PTR_ERR(tty);
> +             if (retval == -ERESTARTSYS && !signal_pending(current)) {
> +                     tty_free_file(filp);
> +                     schedule();
> +                     goto retry_open;
> +             }
>               goto err_file;
>       }
>  
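
For reference, the user-space workaround mentioned in the quoted message
(retrying open() when it fails with EIO) could look roughly like the sketch
below. The helper name open_retry_eio() and its max_tries/delay_ms parameters
are hypothetical, made up for illustration only. It also shows the weakness
the quoted message points out: the caller has to guess how many retries and
how long a delay are enough.

```c
#include <errno.h>
#include <fcntl.h>
#include <time.h>

/*
 * Hypothetical helper, not part of any existing tool: retry open() while
 * it fails with EIO, at most max_tries times, sleeping delay_ms between
 * attempts. Returns the file descriptor, or -1 with errno set by the
 * last open() call. max_tries is assumed to be >= 1.
 */
int open_retry_eio(const char *path, int flags, int max_tries, long delay_ms)
{
	struct timespec delay = {
		.tv_sec = delay_ms / 1000,
		.tv_nsec = (delay_ms % 1000) * 1000000L,
	};
	int fd = -1;

	for (int i = 0; i < max_tries; i++) {
		fd = open(path, flags);
		if (fd >= 0 || errno != EIO)
			break;	/* success, or an error that retrying won't fix */
		if (i + 1 < max_tries)
			nanosleep(&delay, NULL);	/* let the racing close finish */
	}
	return fd;
}
```

A caller would use something like open_retry_eio("/dev/console", O_WRONLY,
10, 50), where 10 and 50 are arbitrary guesses. That guesswork is what an
in-kernel fix avoids.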