Re: poll()-ing a pipe descriptor, watching for POLLHUP

2009-06-11 Thread Vlad Galu
On Wed, Jun 3, 2009 at 9:42 PM, Bruce Evans wrote:
[...]

Hello Bruce, Kostik, Oliver. Was any consensus reached on how to
tackle this issue? The RELENG_7 code looks slightly different so I
haven't (yet) tried adapting the provided patches. As for the
interface behavior, I think the nicest way would be to signal EOF by
POLLHUP alone, without POLLIN.

Regards,
Vlad
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: poll()-ing a pipe descriptor, watching for POLLHUP

2009-06-03 Thread Bruce Evans

On Wed, 3 Jun 2009, Kostik Belousov wrote:


On Wed, Jun 03, 2009 at 05:30:51PM +0300, Kostik Belousov wrote:

On Wed, Jun 03, 2009 at 04:10:34PM +0300, Vlad Galu wrote:

Hm, I was having an issue with an internal piece of software, but
never checked what kind of pipe caused the problem. Turns out it was a
FIFO, and I got bitten by the same bug described here:
http://lists.freebsd.org/pipermail/freebsd-bugs/2006-March/017591.html

The problem is that the reader process isn't notified when the writer

G>>> process exits or closes the FIFO fd...


So you did found the relevant PR with long audit trail and patches
attached. You obviously should contact the author of the patches,
Oliver Fromme, who is FreeBSD committer for some time (CCed).

I agree that the thing shall be fixed finally. Skimming over the
patches in kern/94772, I have some doubts about removal of
POLLINIGNEOF flag. The reason is that we are generally do not
remove exposed user interfaces.


Maybe, but this flag was not a documented interface, and too much
ugliness might be required to preserve its behaviour bug-for-bug
compatibly (the old buggy behaviour would probably be more wanted
for compatibility than the strange behaviour given by this flag!


I forward-ported Bruce' patch to the CURRENT. It passes the tests
from tools/regression/fifo and a test from kern/94772.


Thanks.  I won't be committing it any time soon, so you should.

I rewrote the test programs extensively (enclosed at the end) in Oct
2007 and updated the kernel patches to match.  Please run the new tests
to see if you are missing anything important in the kernel part.  If
so, I will search for the kernel patches later (actually, now --
enclosed in the middle).  I just ran them under RELENG_7 and unpatched
-current and found no differences with the Oct 2007 version for RELENG_7
in the old test output.

The old test output is in the following subdirectories:
4: FreeBSD-4
7: FreeBSD-7
l: Linux-2.6.10
m: my version of FreeBSD-5.2 including patches for this problem.
AFAIR, the FreeBSD output in "m" is the same as the Linux output in
all except a couple of cases where Linux select is inconsistent with
itself and/or with Linux poll.  However, the differences in the saved
output are that the Linux output is mysteriously missing results for
tests 5-8.  The tests attempt to test certain race possibilities in a
non-racy way.  This is not easy and the differences might be due to
some races/states not occurring under Linux.

POSIX finally specified the behaviour strictly enough for it to be possible
to test it a couple of years ago.  I didn't follow all the developments
and forget the details, but it was close to the Linux behaviour.


For my liking, I did not removed POLLINIGNEOF.

diff --git a/sys/fs/fifofs/fifo_vnops.c b/sys/fs/fifofs/fifo_vnops.c
index 66963bc..7e279ca 100644
--- a/sys/fs/fifofs/fifo_vnops.c
+++ b/sys/fs/fifofs/fifo_vnops.c
@@ -226,11 +226,47 @@ fail1:
if (ap->a_mode & FREAD) {
fip->fi_readers++;
if (fip->fi_readers == 1) {
+   SOCKBUF_LOCK(&fip->fi_readsock->so_rcv);
+   if (fip->fi_writers > 0)
+   fip->fi_readsock->so_rcv.sb_state |=
+   SBS_COULDRCV;


My current version is in fact completely different.  It doesn't have
SBS_COULDRCV, but uses a generation count.  IIRC, this is the same
method as is used in Linux, and is needed for the same reasons
(something to do with keeping new "connections" separate from old ones).
So I will try to enclose the components of the patch in the order of
your diff (might miss some).  First one:

% Index: fifo_vnops.c
% ===
% RCS file: /home/ncvs/src/sys/fs/fifofs/fifo_vnops.c,v
% retrieving revision 1.100
% diff -u -2 -r1.100 fifo_vnops.c
% --- fifo_vnops.c  23 Jun 2004 00:35:50 -  1.100
% +++ fifo_vnops.c  17 Oct 2007 11:36:23 -
% @@ -36,4 +36,5 @@
%  #include 
%  #include 
% +#include 
%  #include 
%  #include 
% @@ -61,4 +62,5 @@
%   longfi_readers;
%   longfi_writers;
% + int fi_wgen;
%  };
% 
% @@ -182,8 +184,11 @@

%   struct ucred *a_cred;
%   struct thread *a_td;
% + int  a_fdidx;
%   } */ *ap;
%  {
%   struct vnode *vp = ap->a_vp;
%   struct fifoinfo *fip;
% + struct file *fp;
% + struct filedesc *fdp;
%   struct thread *td = ap->a_td;
%   struct ucred *cred = ap->a_cred;
% @@ -240,4 +245,10 @@
%   }
%   }
% + fdp = td->td_proc->p_fd;
% + FILEDESC_LOCK(fdp);
% + fp = fget_locked(fdp, ap->a_fdidx);
% + /* Abuse f_msgcount as a generation count. */
% + fp->f_msgcount = fip->fi_wgen - fip->fi_writers;
% + FILEDESC_UNLOCK(fdp);
%   }
%   if (ap->a_mode & FWRITE) {
% @@ -

Re: poll()-ing a pipe descriptor, watching for POLLHUP

2009-06-03 Thread Kostik Belousov
On Wed, Jun 03, 2009 at 05:30:51PM +0300, Kostik Belousov wrote:
> On Wed, Jun 03, 2009 at 04:10:34PM +0300, Vlad Galu wrote:
> > Hm, I was having an issue with an internal piece of software, but
> > never checked what kind of pipe caused the problem. Turns out it was a
> > FIFO, and I got bitten by the same bug described here:
> > http://lists.freebsd.org/pipermail/freebsd-bugs/2006-March/017591.html
> > 
> > The problem is that the reader process isn't notified when the writer
> > process exits or closes the FIFO fd...
> 
> So you did found the relevant PR with long audit trail and patches
> attached. You obviously should contact the author of the patches,
> Oliver Fromme, who is FreeBSD committer for some time (CCed).
> 
> I agree that the thing shall be fixed finally. Skimming over the
> patches in kern/94772, I have some doubts about removal of
> POLLINIGNEOF flag. The reason is that we are generally do not
> remove exposed user interfaces.

I forward-ported Bruce' patch to the CURRENT. It passes the tests
from tools/regression/fifo and a test from kern/94772.
For my liking, I did not removed POLLINIGNEOF.

diff --git a/sys/fs/fifofs/fifo_vnops.c b/sys/fs/fifofs/fifo_vnops.c
index 66963bc..7e279ca 100644
--- a/sys/fs/fifofs/fifo_vnops.c
+++ b/sys/fs/fifofs/fifo_vnops.c
@@ -226,11 +226,47 @@ fail1:
if (ap->a_mode & FREAD) {
fip->fi_readers++;
if (fip->fi_readers == 1) {
+   SOCKBUF_LOCK(&fip->fi_readsock->so_rcv);
+   if (fip->fi_writers > 0)
+   fip->fi_readsock->so_rcv.sb_state |=
+   SBS_COULDRCV;
+   else
+   /*
+* Sloppy?  Might be necessary to clear it
+* in all the places where fi_readers is
+* decremented to 0.  I think only writers
+* polling for input could be confused by
+* having it not set, and there is a problem
+* with these anyway now that we have
+* reversed the sense of the flag -- they
+* now block (?), but shouldn't.
+*/
+   fip->fi_readsock->so_rcv.sb_state &=
+   ~SBS_COULDRCV;
+   SOCKBUF_UNLOCK(&fip->fi_readsock->so_rcv);
SOCKBUF_LOCK(&fip->fi_writesock->so_snd);
fip->fi_writesock->so_snd.sb_state &= ~SBS_CANTSENDMORE;
SOCKBUF_UNLOCK(&fip->fi_writesock->so_snd);
if (fip->fi_writers > 0) {
wakeup(&fip->fi_writers);
+   SOCKBUF_LOCK(&fip->fi_readsock->so_rcv);
+   if (fip->fi_writers > 0)
+   fip->fi_readsock->so_rcv.sb_state |=
+   SBS_COULDRCV;
+   else
+   /*
+* Sloppy?  Might be necessary to clear it
+* in all the places where fi_readers is
+* decremented to 0.  I think only writers
+* polling for input could be confused by
+* having it not set, and there is a problem
+* with these anyway now that we have
+* reversed the sense of the flag -- they
+* now block (?), but shouldn't.
+*/
+   fip->fi_readsock->so_rcv.sb_state &=
+   ~SBS_COULDRCV;
+   SOCKBUF_UNLOCK(&fip->fi_readsock->so_rcv);
sowwakeup(fip->fi_writesock);
}
}
@@ -244,6 +280,7 @@ fail1:
if (fip->fi_writers == 1) {
SOCKBUF_LOCK(&fip->fi_readsock->so_rcv);
fip->fi_readsock->so_rcv.sb_state &= ~SBS_CANTRCVMORE;
+   fip->fi_readsock->so_rcv.sb_state |= SBS_COULDRCV;
SOCKBUF_UNLOCK(&fip->fi_readsock->so_rcv);
if (fip->fi_readers > 0) {
wakeup(&fip->fi_readers);
@@ -672,28 +709,10 @@ fifo_poll_f(struct file *fp, int events, struct ucred 
*cred, struct thread *td)
levents = events &
(POLLIN | POLLINIGNEOF | POLLPRI | POLLRDNORM | POLLRDBAND);
if ((fp->f_flag & FREAD) && levents) {
-   /*
-* If POLLIN or POLLRDNORM is requested and POLLINIGNEOF is
-* not, then convert the first two to the last one.  This
-* tells the socket poll 

Re: poll()-ing a pipe descriptor, watching for POLLHUP

2009-06-03 Thread Kostik Belousov
On Wed, Jun 03, 2009 at 04:10:34PM +0300, Vlad Galu wrote:
> Hm, I was having an issue with an internal piece of software, but
> never checked what kind of pipe caused the problem. Turns out it was a
> FIFO, and I got bitten by the same bug described here:
> http://lists.freebsd.org/pipermail/freebsd-bugs/2006-March/017591.html
> 
> The problem is that the reader process isn't notified when the writer
> process exits or closes the FIFO fd...

So you did found the relevant PR with long audit trail and patches
attached. You obviously should contact the author of the patches,
Oliver Fromme, who is FreeBSD committer for some time (CCed).

I agree that the thing shall be fixed finally. Skimming over the
patches in kern/94772, I have some doubts about removal of
POLLINIGNEOF flag. The reason is that we are generally do not
remove exposed user interfaces.


pgpHiZR2gixgh.pgp
Description: PGP signature


Re: poll()-ing a pipe descriptor, watching for POLLHUP

2009-06-03 Thread Vlad Galu
On Wed, Jun 3, 2009 at 3:35 PM, Vlad Galu  wrote:
> On Wed, Jun 3, 2009 at 3:32 PM, Kostik Belousov  wrote:
>> On Wed, Jun 03, 2009 at 03:15:32PM +0300, Vlad Galu wrote:
>>> Hello,
>>>
>>> Please take a look at the attached code. Shouldn't poll() get a
>>> POLLHUP event when the child process exits, closing the write end of
>>> the pipe?
>>
>> It seems that you code forgot to close the write end of the pipe in
>> parent. Thus, pipe is referenced by another file descriptor from
>> the parent process, and you do not get close event.
>>
>
> Aaarhg! You're right! Sorry for the noise!
>

Hm, I was having an issue with an internal piece of software, but
never checked what kind of pipe caused the problem. Turns out it was a
FIFO, and I got bitten by the same bug described here:
http://lists.freebsd.org/pipermail/freebsd-bugs/2006-March/017591.html

The problem is that the reader process isn't notified when the writer
process exits or closes the FIFO fd...
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: poll()-ing a pipe descriptor, watching for POLLHUP

2009-06-03 Thread Vlad Galu
On Wed, Jun 3, 2009 at 3:32 PM, Kostik Belousov  wrote:
> On Wed, Jun 03, 2009 at 03:15:32PM +0300, Vlad Galu wrote:
>> Hello,
>>
>> Please take a look at the attached code. Shouldn't poll() get a
>> POLLHUP event when the child process exits, closing the write end of
>> the pipe?
>
> It seems that you code forgot to close the write end of the pipe in
> parent. Thus, pipe is referenced by another file descriptor from
> the parent process, and you do not get close event.
>

Aaarhg! You're right! Sorry for the noise!
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: poll()-ing a pipe descriptor, watching for POLLHUP

2009-06-03 Thread Kostik Belousov
On Wed, Jun 03, 2009 at 03:15:32PM +0300, Vlad Galu wrote:
> Hello,
> 
> Please take a look at the attached code. Shouldn't poll() get a
> POLLHUP event when the child process exits, closing the write end of
> the pipe?

It seems that you code forgot to close the write end of the pipe in
parent. Thus, pipe is referenced by another file descriptor from
the parent process, and you do not get close event.


pgpztAkKQhW1z.pgp
Description: PGP signature


Re: poll()-ing a pipe descriptor, watching for POLLHUP

2009-06-03 Thread Vlad Galu
Hm, according to the code at
http://www.greenend.org.uk/rjk/2001/06/poll.html, it seems to work as
expected (returning both POLLIN and POLLHUP), when closing the write
end of the pipe from within the same process.


On Wed, Jun 3, 2009 at 3:15 PM, Vlad Galu  wrote:
> Hello,
>
> Please take a look at the attached code. Shouldn't poll() get a
> POLLHUP event when the child process exits, closing the write end of
> the pipe?
>
> Thanks,
> Vlad
>
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"