Re: May close() return any error code?

2015-08-04 Thread Takashi Iwai
On Sun, 02 Aug 2015 16:07:06 +0200,
Al Viro wrote:
> 
> On Wed, Jul 29, 2015 at 07:45:11AM -0700, Dmitry Torokhov wrote:
> > > This seems coming from evdev_flush().  As there is no fd leak, it's no
> > > big problem per se.  But, now the question is whether returning such
> > > an error code is correct behavior at all.  At least, it doesn't seem
> > > defined in POSIX:
> > >   http://pubs.opengroup.org/onlinepubs/009695399/functions/close.html
> > 
> > Hmm, if I checked the right version of the code close_nointr_nofail()
> > expects only 0 as the return code so even if we change the kernel to
> > use more conforming -EIO instead of -ENODEV systemd will still die...
> > 
> > The question is whether we really need to propagate return value from
> > f_op->flush() up to userspace in filp_close(). Why don't we ask Al?
> 
> That's the whole damn point of having ->flush().  And yes, we do need that -
> things like NFS (not to mention tapes, etc.) do rely on that.
> 
> Whether it makes sense to do this kind of "do something that might have
> a failure to report on each close()" for evdev is up to driver, obviously.

So, the behavior of VFS layer is as designed.  Then I suppose the fix
should be rather in evdev.c.  Dmitry, could you paper over it?


thanks,

Takashi
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: May close() return any error code?

2015-08-04 Thread Takashi Iwai
On Sun, 02 Aug 2015 16:07:06 +0200,
Al Viro wrote:
 
 On Wed, Jul 29, 2015 at 07:45:11AM -0700, Dmitry Torokhov wrote:
   This seems coming from evdev_flush().  As there is no fd leak, it's no
   big problem per se.  But, now the question is whether returning such
   an error code is correct behavior at all.  At least, it doesn't seem
   defined in POSIX:
 http://pubs.opengroup.org/onlinepubs/009695399/functions/close.html
  
  Hmm, if I checked the right version of the code close_nointr_nofail()
  expects only 0 as the return code so even if we change the kernel to
  use more conforming -EIO instead of -ENODEV systemd will still die...
  
  The question is whether we really need to propagate return value from
  f_op-flush() up to userspace in filp_close(). Why don't we ask Al?
 
 That's the whole damn point of having -flush().  And yes, we do need that -
 things like NFS (not to mention tapes, etc.) do rely on that.
 
 Whether it makes sense to do this kind of do something that might have
 a failure to report on each close() for evdev is up to driver, obviously.

So, the behavior of VFS layer is as designed.  Then I suppose the fix
should be rather in evdev.c.  Dmitry, could you paper over it?


thanks,

Takashi
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: May close() return any error code?

2015-08-02 Thread Al Viro
On Wed, Jul 29, 2015 at 07:45:11AM -0700, Dmitry Torokhov wrote:
> > This seems coming from evdev_flush().  As there is no fd leak, it's no
> > big problem per se.  But, now the question is whether returning such
> > an error code is correct behavior at all.  At least, it doesn't seem
> > defined in POSIX:
> >   http://pubs.opengroup.org/onlinepubs/009695399/functions/close.html
> 
> Hmm, if I checked the right version of the code close_nointr_nofail()
> expects only 0 as the return code so even if we change the kernel to
> use more conforming -EIO instead of -ENODEV systemd will still die...
> 
> The question is whether we really need to propagate return value from
> f_op->flush() up to userspace in filp_close(). Why don't we ask Al?

That's the whole damn point of having ->flush().  And yes, we do need that -
things like NFS (not to mention tapes, etc.) do rely on that.

Whether it makes sense to do this kind of "do something that might have
a failure to report on each close()" for evdev is up to driver, obviously.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: May close() return any error code?

2015-08-02 Thread Al Viro
On Sun, Aug 02, 2015 at 09:42:20AM +0200, Pavel Machek wrote:
> > This seems coming from evdev_flush().  As there is no fd leak, it's no
> > big problem per se.  But, now the question is whether returning such
> > an error code is correct behavior at all.  At least, it doesn't seem
> > defined in POSIX:
> >
> http://pubs.opengroup.org/onlinepubs/009695399/functions/close.html
> 
> Returning an error from close() would imply that file descriptor is
> not closed seems like bad idea. Just fix the kernel not to do it.

The only thing implied here is failure to RTF{M,S}.  To quote close(2):

NOTES
   Not checking the return value of close() is a common  but  nevertheless
   serious  programming error.  It is quite possible that errors on a pre‐
   vious write(2) operation are first reported at the final close().   Not
   checking the return value when closing the file may lead to silent loss
   of data.  This can especially be observed with NFS and with disk quota.
   Note  that  the  return  value should only be used for diagnostics.  In
   particular close() should not be retried after an EINTR since this  may
   cause a reused descriptor from another thread to be closed.

That's Linux one.  FreeBSD one says

 In case of any error except EBADF, the supplied file descriptor is deal-
 located and therefore is no longer valid.

and that matches behaviour of historical BSD variants as well.  POSIX is being
its usual charming self and says "if a kernel shipped by $VALUED_MEMBER does
this and that cretinous thing, far be it from us to call it broken".
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: May close() return any error code?

2015-08-02 Thread Pavel Machek
On Wed 2015-07-29 12:46:59, Takashi Iwai wrote:
> Hi,
> 
> while debugging a problem of X and gdm with the old systemd-210, we
> encountered a sudden death of systemd-logind, and this turned out to
> be an unexpected errno from close().  The close() call for input
> devices returns ENODEV error.  The logind in systemd-210 treats this
> error code as fatal, triggers assert() and eventually kills itself.
> The details are found in an openSUSE bugzilla thread:
>   https://bugzilla.opensuse.org/show_bug.cgi?id=939571
> 
> This seems coming from evdev_flush().  As there is no fd leak, it's no
> big problem per se.  But, now the question is whether returning such
> an error code is correct behavior at all.  At least, it doesn't seem
> defined in POSIX:
>
http://pubs.opengroup.org/onlinepubs/009695399/functions/close.html


Returning an error from close() would imply that file descriptor is
not closed seems like bad idea. Just fix the kernel not to do it.

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: May close() return any error code?

2015-08-02 Thread Pavel Machek
On Wed 2015-07-29 12:46:59, Takashi Iwai wrote:
 Hi,
 
 while debugging a problem of X and gdm with the old systemd-210, we
 encountered a sudden death of systemd-logind, and this turned out to
 be an unexpected errno from close().  The close() call for input
 devices returns ENODEV error.  The logind in systemd-210 treats this
 error code as fatal, triggers assert() and eventually kills itself.
 The details are found in an openSUSE bugzilla thread:
   https://bugzilla.opensuse.org/show_bug.cgi?id=939571
 
 This seems coming from evdev_flush().  As there is no fd leak, it's no
 big problem per se.  But, now the question is whether returning such
 an error code is correct behavior at all.  At least, it doesn't seem
 defined in POSIX:

http://pubs.opengroup.org/onlinepubs/009695399/functions/close.html


Returning an error from close() would imply that file descriptor is
not closed seems like bad idea. Just fix the kernel not to do it.

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: May close() return any error code?

2015-08-02 Thread Al Viro
On Wed, Jul 29, 2015 at 07:45:11AM -0700, Dmitry Torokhov wrote:
  This seems coming from evdev_flush().  As there is no fd leak, it's no
  big problem per se.  But, now the question is whether returning such
  an error code is correct behavior at all.  At least, it doesn't seem
  defined in POSIX:
http://pubs.opengroup.org/onlinepubs/009695399/functions/close.html
 
 Hmm, if I checked the right version of the code close_nointr_nofail()
 expects only 0 as the return code so even if we change the kernel to
 use more conforming -EIO instead of -ENODEV systemd will still die...
 
 The question is whether we really need to propagate return value from
 f_op-flush() up to userspace in filp_close(). Why don't we ask Al?

That's the whole damn point of having -flush().  And yes, we do need that -
things like NFS (not to mention tapes, etc.) do rely on that.

Whether it makes sense to do this kind of do something that might have
a failure to report on each close() for evdev is up to driver, obviously.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: May close() return any error code?

2015-08-02 Thread Al Viro
On Sun, Aug 02, 2015 at 09:42:20AM +0200, Pavel Machek wrote:
  This seems coming from evdev_flush().  As there is no fd leak, it's no
  big problem per se.  But, now the question is whether returning such
  an error code is correct behavior at all.  At least, it doesn't seem
  defined in POSIX:
 
 http://pubs.opengroup.org/onlinepubs/009695399/functions/close.html
 
 Returning an error from close() would imply that file descriptor is
 not closed seems like bad idea. Just fix the kernel not to do it.

The only thing implied here is failure to RTF{M,S}.  To quote close(2):

NOTES
   Not checking the return value of close() is a common  but  nevertheless
   serious  programming error.  It is quite possible that errors on a pre‐
   vious write(2) operation are first reported at the final close().   Not
   checking the return value when closing the file may lead to silent loss
   of data.  This can especially be observed with NFS and with disk quota.
   Note  that  the  return  value should only be used for diagnostics.  In
   particular close() should not be retried after an EINTR since this  may
   cause a reused descriptor from another thread to be closed.

That's Linux one.  FreeBSD one says

 In case of any error except EBADF, the supplied file descriptor is deal-
 located and therefore is no longer valid.

and that matches behaviour of historical BSD variants as well.  POSIX is being
its usual charming self and says if a kernel shipped by $VALUED_MEMBER does
this and that cretinous thing, far be it from us to call it broken.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: May close() return any error code?

2015-07-30 Thread Takashi Iwai
On Wed, 29 Jul 2015 16:45:11 +0200,
Dmitry Torokhov wrote:
> 
> HI Takashi,
> 
> On Wed, Jul 29, 2015 at 12:46:59PM +0200, Takashi Iwai wrote:
> > Hi,
> > 
> > while debugging a problem of X and gdm with the old systemd-210, we
> > encountered a sudden death of systemd-logind, and this turned out to
> > be an unexpected errno from close().  The close() call for input
> > devices returns ENODEV error.  The logind in systemd-210 treats this
> > error code as fatal, triggers assert() and eventually kills itself.
> > The details are found in an openSUSE bugzilla thread:
> >   https://bugzilla.opensuse.org/show_bug.cgi?id=939571
> > 
> > This seems coming from evdev_flush().  As there is no fd leak, it's no
> > big problem per se.  But, now the question is whether returning such
> > an error code is correct behavior at all.  At least, it doesn't seem
> > defined in POSIX:
> >   http://pubs.opengroup.org/onlinepubs/009695399/functions/close.html
> 
> Hmm, if I checked the right version of the code close_nointr_nofail()
> expects only 0 as the return code so even if we change the kernel to
> use more conforming -EIO instead of -ENODEV systemd will still die...

Yes, it can be seen rather as a bug in systemd-210, and the later
systemd version doesn't seem to trigger assert() with that condition.

> The question is whether we really need to propagate return value from
> f_op->flush() up to userspace in filp_close(). Why don't we ask Al?

Oh yes, of course, Al is the best person to ask :)

Looking back at this problem now, I find it even dangerous to
propagate an error, especially -EINTR.  Then user-space may interpret
it as if the close didn't succeed and the fd were alive, although the
kernel performed the close itself.


thanks,

Takashi
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: May close() return any error code?

2015-07-30 Thread Takashi Iwai
On Wed, 29 Jul 2015 16:45:11 +0200,
Dmitry Torokhov wrote:
 
 HI Takashi,
 
 On Wed, Jul 29, 2015 at 12:46:59PM +0200, Takashi Iwai wrote:
  Hi,
  
  while debugging a problem of X and gdm with the old systemd-210, we
  encountered a sudden death of systemd-logind, and this turned out to
  be an unexpected errno from close().  The close() call for input
  devices returns ENODEV error.  The logind in systemd-210 treats this
  error code as fatal, triggers assert() and eventually kills itself.
  The details are found in an openSUSE bugzilla thread:
https://bugzilla.opensuse.org/show_bug.cgi?id=939571
  
  This seems coming from evdev_flush().  As there is no fd leak, it's no
  big problem per se.  But, now the question is whether returning such
  an error code is correct behavior at all.  At least, it doesn't seem
  defined in POSIX:
http://pubs.opengroup.org/onlinepubs/009695399/functions/close.html
 
 Hmm, if I checked the right version of the code close_nointr_nofail()
 expects only 0 as the return code so even if we change the kernel to
 use more conforming -EIO instead of -ENODEV systemd will still die...

Yes, it can be seen rather as a bug in systemd-210, and the later
systemd version doesn't seem to trigger assert() with that condition.

 The question is whether we really need to propagate return value from
 f_op-flush() up to userspace in filp_close(). Why don't we ask Al?

Oh yes, of course, Al is the best person to ask :)

Looking back at this problem now, I find it even dangerous to
propagate an error, especially -EINTR.  Then user-space may interpret
it as if the close didn't succeed and the fd were alive, although the
kernel performed the close itself.


thanks,

Takashi
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: May close() return any error code?

2015-07-29 Thread Dmitry Torokhov
HI Takashi,

On Wed, Jul 29, 2015 at 12:46:59PM +0200, Takashi Iwai wrote:
> Hi,
> 
> while debugging a problem of X and gdm with the old systemd-210, we
> encountered a sudden death of systemd-logind, and this turned out to
> be an unexpected errno from close().  The close() call for input
> devices returns ENODEV error.  The logind in systemd-210 treats this
> error code as fatal, triggers assert() and eventually kills itself.
> The details are found in an openSUSE bugzilla thread:
>   https://bugzilla.opensuse.org/show_bug.cgi?id=939571
> 
> This seems coming from evdev_flush().  As there is no fd leak, it's no
> big problem per se.  But, now the question is whether returning such
> an error code is correct behavior at all.  At least, it doesn't seem
> defined in POSIX:
>   http://pubs.opengroup.org/onlinepubs/009695399/functions/close.html

Hmm, if I checked the right version of the code close_nointr_nofail()
expects only 0 as the return code so even if we change the kernel to
use more conforming -EIO instead of -ENODEV systemd will still die...

The question is whether we really need to propagate return value from
f_op->flush() up to userspace in filp_close(). Why don't we ask Al?

Thanks.

-- 
Dmitry
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


May close() return any error code?

2015-07-29 Thread Takashi Iwai
Hi,

while debugging a problem of X and gdm with the old systemd-210, we
encountered a sudden death of systemd-logind, and this turned out to
be an unexpected errno from close().  The close() call for input
devices returns ENODEV error.  The logind in systemd-210 treats this
error code as fatal, triggers assert() and eventually kills itself.
The details are found in an openSUSE bugzilla thread:
  https://bugzilla.opensuse.org/show_bug.cgi?id=939571

This seems coming from evdev_flush().  As there is no fd leak, it's no
big problem per se.  But, now the question is whether returning such
an error code is correct behavior at all.  At least, it doesn't seem
defined in POSIX:
  http://pubs.opengroup.org/onlinepubs/009695399/functions/close.html

I myself have no strong opinion here, so would like just to ask others
suggestions / comments.

thanks,


Takashi
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


May close() return any error code?

2015-07-29 Thread Takashi Iwai
Hi,

while debugging a problem of X and gdm with the old systemd-210, we
encountered a sudden death of systemd-logind, and this turned out to
be an unexpected errno from close().  The close() call for input
devices returns ENODEV error.  The logind in systemd-210 treats this
error code as fatal, triggers assert() and eventually kills itself.
The details are found in an openSUSE bugzilla thread:
  https://bugzilla.opensuse.org/show_bug.cgi?id=939571

This seems coming from evdev_flush().  As there is no fd leak, it's no
big problem per se.  But, now the question is whether returning such
an error code is correct behavior at all.  At least, it doesn't seem
defined in POSIX:
  http://pubs.opengroup.org/onlinepubs/009695399/functions/close.html

I myself have no strong opinion here, so would like just to ask others
suggestions / comments.

thanks,


Takashi
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: May close() return any error code?

2015-07-29 Thread Dmitry Torokhov
HI Takashi,

On Wed, Jul 29, 2015 at 12:46:59PM +0200, Takashi Iwai wrote:
 Hi,
 
 while debugging a problem of X and gdm with the old systemd-210, we
 encountered a sudden death of systemd-logind, and this turned out to
 be an unexpected errno from close().  The close() call for input
 devices returns ENODEV error.  The logind in systemd-210 treats this
 error code as fatal, triggers assert() and eventually kills itself.
 The details are found in an openSUSE bugzilla thread:
   https://bugzilla.opensuse.org/show_bug.cgi?id=939571
 
 This seems coming from evdev_flush().  As there is no fd leak, it's no
 big problem per se.  But, now the question is whether returning such
 an error code is correct behavior at all.  At least, it doesn't seem
 defined in POSIX:
   http://pubs.opengroup.org/onlinepubs/009695399/functions/close.html

Hmm, if I checked the right version of the code close_nointr_nofail()
expects only 0 as the return code so even if we change the kernel to
use more conforming -EIO instead of -ENODEV systemd will still die...

The question is whether we really need to propagate return value from
f_op-flush() up to userspace in filp_close(). Why don't we ask Al?

Thanks.

-- 
Dmitry
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/