Re: svn commit: r196460 - head/sys/kern

2009-08-26 Thread Jilles Tjoelker
On Wed, Aug 26, 2009 at 05:08:57PM +1000, Bruce Evans wrote:
> On Tue, 25 Aug 2009, Jilles Tjoelker wrote:
> > On Tue, Aug 25, 2009 at 04:07:11AM +1000, Bruce Evans wrote:
> >> On Sun, 23 Aug 2009, Jilles Tjoelker wrote:
> >> % Index: sys/fs/fifofs/fifo_vnops.c
> >> % ===
> >> % --- sys/fs/fifofs/fifo_vnops.c   (revision 196459)
> >> % +++ sys/fs/fifofs/fifo_vnops.c   (working copy)
> >> % @@ -193,6 +193,10 @@
> >> %  goto fail2;
> >> %  fip->fi_writesock = wso;
> >> %  error = soconnect2(wso, rso);
> >> % +if (error == 0)
> >> % +error = soshutdown(rso, SHUT_WR);
> >> % +if (error == 0)
> >> % +error = soshutdown(wso, SHUT_RD);
> >> %  if (error) {
> >> %  (void)soclose(wso);
> >> %  fail2:

> The second shutdown became harmless for me when I fixed the clobbering of
> sb_state.  Does it have any effect?

It seems not. shutdown(SHUT_RD) basically calls the pru_flush protocol
function (which uipc_usrreq.c does not provide), calls socantrecvmore
(which uipc_usrreq.c had already done synchronously with the first
shutdown) and clears the receive socket buffer (which is already empty).

Regarding the direct access to SBS_CANTRCVMORE and SBS_CANTSENDMORE,
this seems related to the special property of fifos that they can
disconnect and reconnect a stream using the same object. The socket
layer normally does not allow clearing SBS_CANTRCVMORE and
SBS_CANTSENDMORE. The only case where fifo_vnops.c touches these flags
directly where a so* function call could be used is the setting of
SBS_CANTRCVMORE on the read side at initial creation (possibly because a
wakeup is not necessary there).

I suppose the new fifo implementation will avoid abusing sockets this
way.

-- 
Jilles Tjoelker
___
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r196460 - head/sys/kern

2009-08-26 Thread Bruce Evans

On Tue, 25 Aug 2009, Jilles Tjoelker wrote:


On Tue, Aug 25, 2009 at 04:07:11AM +1000, Bruce Evans wrote:

On Sun, 23 Aug 2009, Jilles Tjoelker wrote:



I think poll on fifos should instead be fixed by closing the
half-connection corresponding to writing from fi_readsock to
fi_writesock. I have tried this out, see attached patch. With the patch,
pipepoll only gives "expected POLLHUP; got POLLIN | POLLHUP" errors and
an error in fifo case 6b caused by our distinction between new and old
readers.



This sort of worked for me, but has several problems:



% Index: sys/fs/fifofs/fifo_vnops.c
% ===
% --- sys/fs/fifofs/fifo_vnops.c(revision 196459)
% +++ sys/fs/fifofs/fifo_vnops.c(working copy)
% @@ -193,6 +193,10 @@
%   goto fail2;
%   fip->fi_writesock = wso;
%   error = soconnect2(wso, rso);
% + if (error == 0)
% + error = soshutdown(rso, SHUT_WR);
% + if (error == 0)
% + error = soshutdown(wso, SHUT_RD);
%   if (error) {
%   (void)soclose(wso);
%  fail2:



The second soshutdown() is only harmful.  I see the following state changes
- the first soshutdown() sets SBS_CANTRCVMORE on rso like you would expect,
   and also sets SBS_CANTSENDMORE on wso.  This gives the desired state.
- the second soshutdown() then clears SBS_CANTRCVMORE on rso (without
   the first soshutdown() it leaves both flags clear in both directions).
   This clobbers the desired state.  The failure shows in just one of my
   uncommitted regression tests (when there is a writer and there was
   a reader, poll() returns POLLOUT for the writer, but should return
   POLLHUP; the missing SBS_CANTRCVMORE on rso prevents it ever returning
   POLLHUP for writers).



After removing the second soshutdown() and fixing a spurious POLLIN (see
below), all my tests pass.


I have removed the second shutdown, it is not necessary.


The second shutdown became harmless for me when I fixed the clobbering of
sb_state.  Does it have any effect?


% Index: sys/kern/uipc_socket.c
% ===
% --- sys/kern/uipc_socket.c(revision 196469)
% +++ sys/kern/uipc_socket.c(working copy)
% @@ -2898,11 +2898,13 @@
%   if (so->so_oobmark || (so->so_rcv.sb_state & SBS_RCVATMARK))
%   revents |= events & (POLLPRI | POLLRDBAND);
%
% - if ((events & POLLINIGNEOF) == 0)
% - if (so->so_rcv.sb_state & SBS_CANTRCVMORE)
% - revents |= POLLHUP;
% - if (so->so_snd.sb_state & SBS_CANTSENDMORE)
% - revents |= POLLHUP;
% + if ((events & POLLINIGNEOF) == 0) {



Old problems become larger:



I don't like POLLINIGNEOF (for input) affecting POLLHUP for output.  This
seems to cause no problems for fifos, at least when the kernel sets
POLLINIGNEOF, but it is hard to understand even why it doesn't cause
problems, and kib@ wants POLLINIGNEOF to remain user-settable, so the
complications might remain exported to userland for for fifos and
sockets, where they are harder to document and understand.


I do not like userland POLLINIGNEOF either. I think programs can do fine
with the standard functionality (closing and reopening a fifo to reset
the POLLHUP state).


% + revents |= events & (POLLIN | POLLRDNORM);



This gives spurious POLLINs when POLLHUP is also returned, and thus defeats
the point of soreadable_data() being different from soreadable().  Tests
6a-6d show the spurious POLLIN.  I don't understand how tests 6a and 6c-6d
passed for you.


Same problem here. I think kib@ wants to keep this in 8.x for the sake
of buggy programs that do not check for POLLHUP. I suppose we can do it
properly in 9.x.


But it may break non-buggy programs, and doesn't help for buggy programs
like gdb.  (gdb exits immediately when it sees POLLHUP, despite POLLIN
also being set and input being queued.  gdb used to work before it was
converted to use poll().  Then it saw a ready input fd and had to read()
it to detect EOF.)

I now have better tests for POLLOUT, and have tested the buggy case
for select some more (test 9, and 6b (?) in current): if POLLINIGNEOF
is not applied for select(), then test 9 passes as expected.  Test 9
only covers a fifo in initial state with a reader and no writers.  This
is EOF for read() and input-ready for select(), but neither POLLIN nor
POLLHUP for poll().  Test 6b is for essentially the initial state for
read() and select(), but for poll() the state is too-sticky hangup
(POSIX) or back to the initial state (Linux-2.6.10/FreeBSD-current).

Bruce
___
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r196460 - head/sys/kern

2009-08-25 Thread Jilles Tjoelker
On Tue, Aug 25, 2009 at 04:07:11AM +1000, Bruce Evans wrote:
> On Sun, 23 Aug 2009, Jilles Tjoelker wrote:

> > I think poll on fifos should instead be fixed by closing the
> > half-connection corresponding to writing from fi_readsock to
> > fi_writesock. I have tried this out, see attached patch. With the patch,
> > pipepoll only gives "expected POLLHUP; got POLLIN | POLLHUP" errors and
> > an error in fifo case 6b caused by our distinction between new and old
> > readers.

> This sort of worked for me, but has several problems:

> % Index: sys/fs/fifofs/fifo_vnops.c
> % ===
> % --- sys/fs/fifofs/fifo_vnops.c  (revision 196459)
> % +++ sys/fs/fifofs/fifo_vnops.c  (working copy)
> % @@ -193,6 +193,10 @@
> % goto fail2;
> % fip->fi_writesock = wso;
> % error = soconnect2(wso, rso);
> % +   if (error == 0)
> % +   error = soshutdown(rso, SHUT_WR);
> % +   if (error == 0)
> % +   error = soshutdown(wso, SHUT_RD);
> % if (error) {
> % (void)soclose(wso);
> %  fail2:

> The second soshutdown() is only harmful.  I see the following state changes
> - the first soshutdown() sets SBS_CANTRCVMORE on rso like you would expect,
>and also sets SBS_CANTSENDMORE on wso.  This gives the desired state.
> - the second soshutdown() then clears SBS_CANTRCVMORE on rso (without
>the first soshutdown() it leaves both flags clear in both directions).
>This clobbers the desired state.  The failure shows in just one of my
>uncommitted regression tests (when there is a writer and there was
>a reader, poll() returns POLLOUT for the writer, but should return
>POLLHUP; the missing SBS_CANTRCVMORE on rso prevents it ever returning
>POLLHUP for writers).

> After removing the second soshutdown() and fixing a spurious POLLIN (see
> below), all my tests pass.

I have removed the second shutdown, it is not necessary.

> Elsewhere, fifo_vnops.c hacks on SBS_CANT*MORE directly.  Perhaps it should
> call soshutdown(), or if the direct access there is safe then it is probably
> safe above.

That's for a later commit to fix.

> % Index: sys/kern/uipc_socket.c
> % ===
> % --- sys/kern/uipc_socket.c  (revision 196469)
> % +++ sys/kern/uipc_socket.c  (working copy)
> % @@ -2898,11 +2898,13 @@
> % if (so->so_oobmark || (so->so_rcv.sb_state & SBS_RCVATMARK))
> % revents |= events & (POLLPRI | POLLRDBAND);
> % 
> % -   if ((events & POLLINIGNEOF) == 0)
> % -   if (so->so_rcv.sb_state & SBS_CANTRCVMORE)
> % -   revents |= POLLHUP;
> % -   if (so->so_snd.sb_state & SBS_CANTSENDMORE)
> % -   revents |= POLLHUP;
> % +   if ((events & POLLINIGNEOF) == 0) {

> Old problems become larger:

> I don't like POLLINIGNEOF (for input) affecting POLLHUP for output.  This
> seems to cause no problems for fifos, at least when the kernel sets
> POLLINIGNEOF, but it is hard to understand even why it doesn't cause
> problems, and kib@ wants POLLINIGNEOF to remain user-settable, so the
> complications might remain exported to userland for for fifos and
> sockets, where they are harder to document and understand.

I do not like userland POLLINIGNEOF either. I think programs can do fine
with the standard functionality (closing and reopening a fifo to reset
the POLLHUP state).

> % +   revents |= events & (POLLIN | POLLRDNORM);

> This gives spurious POLLINs when POLLHUP is also returned, and thus defeats
> the point of soreadable_data() being different from soreadable().  Tests
> 6a-6d show the spurious POLLIN.  I don't understand how tests 6a and 6c-6d
> passed for you.

Same problem here. I think kib@ wants to keep this in 8.x for the sake
of buggy programs that do not check for POLLHUP. I suppose we can do it
properly in 9.x.

> % +   if (so->so_snd.sb_state & SBS_CANTSENDMORE)
> % +   revents |= POLLHUP;

> Tests 6a-6d pass with the above 3 lines changed to:

>   if (so->so_snd.sb_state & SBS_CANTSENDMORE)
>   revents |= POLLHUP;
>   else
>   revents |= events & (POLLIN | POLLRDNORM);

> Returning POLLIN will cause poll() to not block on input descriptors, but
> this seems to be as correct as possible since there really is a read-EOF.
> Applications just can't see this EOF as POLLHUP -- they will see POLLIN
> and have to try to read(), and then interpret read() returning 0 as meaning
> EOF.  Device-independent applications must do precisely this anyway since
> the input descriptor might be a regular file and there is no POLLHUP for
> regular files.

Yes. Even more, any program must handle it because it is also possible
that an EOF happens between poll() and read().

-- 
Jill

Re: svn commit: r196460 - head/sys/kern

2009-08-24 Thread Bruce Evans

On Tue, 25 Aug 2009, Bruce Evans wrote:


On Sun, 23 Aug 2009, Jilles Tjoelker wrote:


I think poll on fifos should instead be fixed by closing the
half-connection corresponding to writing from fi_readsock to
fi_writesock. I have tried this out, see attached patch. With the patch,
pipepoll only gives "expected POLLHUP; got POLLIN | POLLHUP" errors and
an error in fifo case 6b caused by our distinction between new and old
readers.


This sort of worked for me, but has several problems:

% Index: sys/fs/fifofs/fifo_vnops.c
% ===
% --- sys/fs/fifofs/fifo_vnops.c(revision 196459)
% +++ sys/fs/fifofs/fifo_vnops.c(working copy)
% @@ -193,6 +193,10 @@
%   goto fail2;
%   fip->fi_writesock = wso;
%   error = soconnect2(wso, rso);
% + if (error == 0)
% + error = soshutdown(rso, SHUT_WR);
% + if (error == 0)
% + error = soshutdown(wso, SHUT_RD);
%   if (error) {
%   (void)soclose(wso);
%  fail2:

The second soshutdown() is only harmful.  I see the following state changes
- the first soshutdown() sets SBS_CANTRCVMORE on rso like you would expect,
 and also sets SBS_CANTSENDMORE on wso.  This gives the desired state.
- the second soshutdown() then clears SBS_CANTRCVMORE on rso (without
 the first soshutdown() it leaves both flags clear in both directions).
 This clobbers the desired state.  The failure shows in just one of my
 uncommitted regression tests (when there is a writer and there was
 a reader, poll() returns POLLOUT for the writer, but should return
 POLLHUP; the missing SBS_CANTRCVMORE on rso prevents it ever returning
 POLLHUP for writers).


SBS_CANTRCVMORE on rso is cleared here in ~5.2, but this seems to be fixed
in -current:

% void
% sorflush(struct socket *so)
% {
%   struct sockbuf *sb = &so->so_rcv;
%   struct protosw *pr = so->so_proto;
%   struct sockbuf asb;
% 
% 	/*

%* In order to avoid calling dom_dispose with the socket buffer mutex
%* held, and in order to generally avoid holding the lock for a long
%* time, we make a copy of the socket buffer and clear the original
%* (except locks, state).  The new socket buffer copy won't have
%* initialized locks so we can only call routines that won't use or
%* assert those locks.
%*
%* Dislodge threads currently blocked in receive and wait to acquire
%* a lock against other simultaneous readers before clearing the
%* socket buffer.  Don't let our acquire be interrupted by a signal
%* despite any existing socket disposition on interruptable waiting.
%*/
%   CURVNET_SET(so->so_vnet);
%   socantrcvmore(so);
%   (void) sblock(sb, SBL_WAIT | SBL_NOINTR);
% 
% 	/*

%* Invalidate/clear most of the sockbuf structure, but leave selinfo
%* and mutex data unchanged.
%*/
%   SOCKBUF_LOCK(sb);
%   bzero(&asb, offsetof(struct sockbuf, sb_startzero));
%   bcopy(&sb->sb_startzero, &asb.sb_startzero,
%   sizeof(*sb) - offsetof(struct sockbuf, sb_startzero));
%   bzero(&sb->sb_startzero,
%   sizeof(*sb) - offsetof(struct sockbuf, sb_startzero));
^

sb_state is at the end of the struct in 5.2, but is now just before the
startzero section.

sockbuf.h has usless history (no repo copy), but according to history
of socketvar.h, sb_state was moved mainly to fix clobbering of
SBS_CANTRCVMORE:

% RCS file: /home/ncvs/src/sys/sys/socketvar.h,v
% Working file: socketvar.h
% head: 1.171
% 
% revision 1.137
% date: 2005/01/30 13:11:44;  author: glebius;  state: Exp;  lines: +1 -1
% Move sb_state to the beginning of structure, above sb_startzero member.
% sb_state shouldn't be erased, when socket buffer is flushed by sorflush().
% 
% When sb_state was bzero'ed, a recently set SBS_CANTRCVMORE flag was cleared.

% If a socket was shutdown(SHUT_RD), a subsequent read() would block on it.
% 
% Reported by:	Ed Maste, Gerrit Nagelhout

% Reviewed by:  rwatson
% 

What were previous symptoms of this bug?  This bug lived in FreeBSD-5-2+
for about a year and for 4 more years in my version of 5.2 :-(.  Older
versions didn't have it since the flags were all in so_state which has
never been cleared.

Bruce
___
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r196460 - head/sys/kern

2009-08-24 Thread Bruce Evans

On Sun, 23 Aug 2009, Jilles Tjoelker wrote:


I think poll on fifos should instead be fixed by closing the
half-connection corresponding to writing from fi_readsock to
fi_writesock. I have tried this out, see attached patch. With the patch,
pipepoll only gives "expected POLLHUP; got POLLIN | POLLHUP" errors and
an error in fifo case 6b caused by our distinction between new and old
readers.


This sort of worked for me, but has several problems:

% Index: sys/fs/fifofs/fifo_vnops.c
% ===
% --- sys/fs/fifofs/fifo_vnops.c(revision 196459)
% +++ sys/fs/fifofs/fifo_vnops.c(working copy)
% @@ -193,6 +193,10 @@
%   goto fail2;
%   fip->fi_writesock = wso;
%   error = soconnect2(wso, rso);
% + if (error == 0)
% + error = soshutdown(rso, SHUT_WR);
% + if (error == 0)
% + error = soshutdown(wso, SHUT_RD);
%   if (error) {
%   (void)soclose(wso);
%  fail2:

The second soshutdown() is only harmful.  I see the following state changes
- the first soshutdown() sets SBS_CANTRCVMORE on rso like you would expect,
  and also sets SBS_CANTSENDMORE on wso.  This gives the desired state.
- the second soshutdown() then clears SBS_CANTRCVMORE on rso (without
  the first soshutdown() it leaves both flags clear in both directions).
  This clobbers the desired state.  The failure shows in just one of my
  uncommitted regression tests (when there is a writer and there was
  a reader, poll() returns POLLOUT for the writer, but should return
  POLLHUP; the missing SBS_CANTRCVMORE on rso prevents it ever returning
  POLLHUP for writers).

After removing the second soshutdown() and fixing a spurious POLLIN (see
below), all my tests pass.

Elsewhere, fifo_vnops.c hacks on SBS_CANT*MORE directly.  Perhaps it should
call soshutdown(), or if the direct access there is safe then it is probably
safe above.

% Index: sys/kern/uipc_socket.c
% ===
% --- sys/kern/uipc_socket.c(revision 196469)
% +++ sys/kern/uipc_socket.c(working copy)
% @@ -2898,11 +2898,13 @@
%   if (so->so_oobmark || (so->so_rcv.sb_state & SBS_RCVATMARK))
%   revents |= events & (POLLPRI | POLLRDBAND);
% 
% -	if ((events & POLLINIGNEOF) == 0)

% - if (so->so_rcv.sb_state & SBS_CANTRCVMORE)
% - revents |= POLLHUP;
% - if (so->so_snd.sb_state & SBS_CANTSENDMORE)
% - revents |= POLLHUP;
% + if ((events & POLLINIGNEOF) == 0) {

Old problems become larger:

I don't like POLLINIGNEOF (for input) affecting POLLHUP for output.  This
seems to cause no problems for fifos, at least when the kernel sets
POLLINIGNEOF, but it is hard to understand even why it doesn't cause
problems, and kib@ wants POLLINIGNEOF to remain user-settable, so the
complications might remain exported to userland for for fifos and
sockets, where they are harder to document and understand.

Note that fifo_poll_f() always calls here twice, first for the read
socket, then for the write socket.  For the first call, the kernel
sets POLLINIGNEOF according to its idea of the stickyness of EOF on
input.  For the second call, the kernel never sets POLLINIGNEOF.  Thus
this code is always executed (unless the user sets POLLINIGNEOF) and
since returning POLLHUP is independent of the `events' arg, it is hard
to understand how setting POLLINIGNEOF for the first call has any
effect...

% + if (so->so_rcv.sb_state & SBS_CANTRCVMORE) {

...well, it has an effect since the first call is on a different socket.
The soshutdown()s make the states involved clearer:
- SBS_CANTRCVMORE is always set for wso, so we always get here for the
  second call, and the resulting POLLHUP is purely for the write direction.
- SBS_CANTSENDMORE is aways set for rso, so if we get here for the first
  call then we will set POLLHUP, and this POLLHUP is purely for the read
  direction.
Then fifo_poll_f() will combine the POLLHUPs, so POLLHUP becomes impure,
but POLLHUP remains useful.

% + revents |= events & (POLLIN | POLLRDNORM);

This gives spurious POLLINs when POLLHUP is also returned, and thus defeats
the point of soreadable_data() being different from soreadable().  Tests
6a-6d show the spurious POLLIN.  I don't understand how tests 6a and 6c-6d
passed for you.

% + if (so->so_snd.sb_state & SBS_CANTSENDMORE)
% + revents |= POLLHUP;

Tests 6a-6d pass with the above 3 lines changed to:

if (so->so_snd.sb_state & SBS_CANTSENDMORE)
revents |= POLLHUP;
else
revents |= events & (POLLIN | POLLRDNORM);

Returning POLLIN will cause poll() to not block on input descriptors, but
this seems to be as corre

Re: svn commit: r196460 - head/sys/kern

2009-08-23 Thread Bruce Evans

On Sun, 23 Aug 2009, Jilles Tjoelker wrote:


On Sun, Aug 23, 2009 at 12:44:15PM +, Konstantin Belousov wrote:
...

Modified: head/sys/kern/uipc_socket.c
==
--- head/sys/kern/uipc_socket.c Sun Aug 23 12:23:24 2009(r196459)
+++ head/sys/kern/uipc_socket.c Sun Aug 23 12:44:15 2009(r196460)
@@ -2898,13 +2898,11 @@ sopoll_generic(struct socket *so, int ev
if (so->so_oobmark || (so->so_rcv.sb_state & SBS_RCVATMARK))
revents |= events & (POLLPRI | POLLRDBAND);

-   if ((events & POLLINIGNEOF) == 0) {
-   if (so->so_rcv.sb_state & SBS_CANTRCVMORE) {
-   revents |= events & (POLLIN | POLLRDNORM);
-   if (so->so_snd.sb_state & SBS_CANTSENDMORE)
-   revents |= POLLHUP;
-   }
-   }
+   if ((events & POLLINIGNEOF) == 0)
+   if (so->so_rcv.sb_state & SBS_CANTRCVMORE)
+   revents |= POLLHUP;
+   if (so->so_snd.sb_state & SBS_CANTSENDMORE)
+   revents |= POLLHUP;

if (revents == 0) {
if (events & (POLLIN | POLLPRI | POLLRDNORM | POLLRDBAND)) {


This sets POLLHUP when either of the directions has shut down, instead
of the entire connection. This breaks use of select and poll with
shutdown(2):


We already knew about this in general, but not its details.


* sending some data, shutdown(SHUT_WR) then polling for input
* (the other side for the above) receiving some data, getting EOF on
 that direction then polling for output


This could possibly be fixed by not breaking things perfectly to POSIX
spec, and keep returning POLLOUT with POLLHUP.  We already return POLLIN
with POLLHUP.  This state is meaningful and supported by POSIX.  It means
that hangup has occurred but there is queued input.  The current problem
shows that the state { POLLOUT with POLLHUP } is also meaningful, although
it is forbidden by POSIX.  It should mean that hangup has occurred (for
the read direction only) but output is still possible.


* a paranoid HTTP-like server that wants to wait for the client to close
 the read end after shutting down the write end

Either some data is lost or the program busy-waits for the data.


And it would wait forever if the client fails to tell it after failing
in polling for output?

I can't see any way to fix this short of uncombining the EOFs again.
Setting POLLHUP would at best reduce to busy-waiting since poll() would
always return immediately due to the POLLHUP.


I think the POLLHUP setting before this commit was correct for sockets:
POLLHUP is set if both directions are closed/error. An EOF that only
affects one direction sets the corresponding POLLIN/POLLOUT so that the
EOF becomes known but the other direction can still be used normally.

(The POSIX spec explicitly describes something like this for POLLIN
(zero length message) although it erroneously restricts it to STREAMS
files only; the POLLOUT case has to be derived from the fact that the
read end should still work normally but the EOF should be notified.)


I thought that the zero-length message part of the spec just requires
poll() to not block if read() would return EOF.  If done in all cases
of EOF, this makes POLLHUP almost useless for at least fifos.  If done
only in special cases, then no one, especially POSIX, knows when it
is done.  Note that POSIX has much better wording for requiring select()
on a read descriptor to not block if read() would return EOF.  This makes
select() useless for detecting certain state changes involving EOF on at
least fifos, and previous rounds of "fixes" in this area (about 6 years
ago) broke it away from POSIX.  The POLLINIGNEOF stuff is a failed attempt
to avoid breaking select() in this way while making poll() more useful.


I think poll on fifos should instead be fixed by closing the
half-connection corresponding to writing from fi_readsock to
fi_writesock. I have tried this out, see attached patch.


Interesting.  I wonder what the corresponding magic for socketpair()
and the 4.4BSD implementation of pipes is (I couldn't find any shutdown()
there, and with fully bidirectional pipes or sockets it is unclear how
EOF can ever be detected, since a r/w descriptor can always talk to
itself).


With the patch,
pipepoll only gives "expected POLLHUP; got POLLIN | POLLHUP" errors and
an error in fifo case 6b caused by our distinction between new and old
readers.


6b tests more breakage to POSIX spec.  Another thing to decide is
whether to do this.  Could someone test this on newer versions of
Linux?  I use a hackish kernel change (essentially, throw away the
distinction by corrupting the generation count) to pass this test.


tools/regression/poll does not test shutdown(2) interaction, so it does
not find this problem.


More tests would be welcome.  I have only a few hackish ones for POLLOUT.

Bruce
_

Re: svn commit: r196460 - head/sys/kern

2009-08-23 Thread Kostik Belousov
On Sun, Aug 23, 2009 at 11:38:00PM +0200, Jilles Tjoelker wrote:
> On Sun, Aug 23, 2009 at 12:44:15PM +, Konstantin Belousov wrote:
> > Author: kib
> > Date: Sun Aug 23 12:44:15 2009
> > New Revision: 196460
> > URL: http://svn.freebsd.org/changeset/base/196460
> 
> > Log:
> >   Fix the conformance of poll(2) for sockets after r195423 by
> >   returning POLLHUP instead of POLLIN for several cases. Now, the
> >   tools/regression/poll results for FreeBSD are closer to that of the
> >   Solaris and Linux.
> 
> >   Also, improve the POSIX conformance by explicitely clearing POLLOUT
> >   when POLLHUP is reported in pollscan(), making the fix global.
> 
> >   Submitted by: bde
> >   Reviewed by:  rwatson
> >   MFC after:1 week
...
> > Modified: head/sys/kern/uipc_socket.c
> > ==
> > --- head/sys/kern/uipc_socket.c Sun Aug 23 12:23:24 2009
> > (r196459)
> > +++ head/sys/kern/uipc_socket.c Sun Aug 23 12:44:15 2009
> > (r196460)
> > @@ -2898,13 +2898,11 @@ sopoll_generic(struct socket *so, int ev
> > if (so->so_oobmark || (so->so_rcv.sb_state & SBS_RCVATMARK))
> > revents |= events & (POLLPRI | POLLRDBAND);
> >  
> > -   if ((events & POLLINIGNEOF) == 0) {
> > -   if (so->so_rcv.sb_state & SBS_CANTRCVMORE) {
> > -   revents |= events & (POLLIN | POLLRDNORM);
> > -   if (so->so_snd.sb_state & SBS_CANTSENDMORE)
> > -   revents |= POLLHUP;
> > -   }
> > -   }
> > +   if ((events & POLLINIGNEOF) == 0)
> > +   if (so->so_rcv.sb_state & SBS_CANTRCVMORE)
> > +   revents |= POLLHUP;
> > +   if (so->so_snd.sb_state & SBS_CANTSENDMORE)
> > +   revents |= POLLHUP;
> >  
> > if (revents == 0) {
> > if (events & (POLLIN | POLLPRI | POLLRDNORM | POLLRDBAND)) {
> 
> This sets POLLHUP when either of the directions has shut down, instead
> of the entire connection. This breaks use of select and poll with
> shutdown(2):
> 
> * sending some data, shutdown(SHUT_WR) then polling for input
> * (the other side for the above) receiving some data, getting EOF on
>   that direction then polling for output
> * a paranoid HTTP-like server that wants to wait for the client to close
>   the read end after shutting down the write end
> 
> Either some data is lost or the program busy-waits for the data.
Exactly this situation concerned me and Robert.

> 
> I think the POLLHUP setting before this commit was correct for sockets:
> POLLHUP is set if both directions are closed/error. An EOF that only
> affects one direction sets the corresponding POLLIN/POLLOUT so that the
> EOF becomes known but the other direction can still be used normally.
> 
> (The POSIX spec explicitly describes something like this for POLLIN
> (zero length message) although it erroneously restricts it to STREAMS
> files only; the POLLOUT case has to be derived from the fact that the
> read end should still work normally but the EOF should be notified.)
> 
> I think poll on fifos should instead be fixed by closing the
> half-connection corresponding to writing from fi_readsock to
> fi_writesock. I have tried this out, see attached patch. With the patch,
> pipepoll only gives "expected POLLHUP; got POLLIN | POLLHUP" errors and
> an error in fifo case 6b caused by our distinction between new and old
> readers.
This sounds right.
Please go ahead.

> 
> tools/regression/poll does not test shutdown(2) interaction, so it does
> not find this problem.
> 
> -- 
> Jilles Tjoelker




pgp3YZ8tlHvjU.pgp
Description: PGP signature


Re: svn commit: r196460 - head/sys/kern

2009-08-23 Thread Jilles Tjoelker
On Sun, Aug 23, 2009 at 12:44:15PM +, Konstantin Belousov wrote:
> Author: kib
> Date: Sun Aug 23 12:44:15 2009
> New Revision: 196460
> URL: http://svn.freebsd.org/changeset/base/196460

> Log:
>   Fix the conformance of poll(2) for sockets after r195423 by
>   returning POLLHUP instead of POLLIN for several cases. Now, the
>   tools/regression/poll results for FreeBSD are closer to that of the
>   Solaris and Linux.

>   Also, improve the POSIX conformance by explicitely clearing POLLOUT
>   when POLLHUP is reported in pollscan(), making the fix global.

>   Submitted by:   bde
>   Reviewed by:rwatson
>   MFC after:  1 week

> Modified:
>   head/sys/kern/sys_generic.c
>   head/sys/kern/uipc_socket.c

> Modified: head/sys/kern/sys_generic.c
> ==
> --- head/sys/kern/sys_generic.c   Sun Aug 23 12:23:24 2009
> (r196459)
> +++ head/sys/kern/sys_generic.c   Sun Aug 23 12:44:15 2009
> (r196460)
> @@ -1228,6 +1228,13 @@ pollscan(td, fds, nfd)
>   selfdalloc(td, fds);
>   fds->revents = fo_poll(fp, fds->events,
>   td->td_ucred, td);
> + /*
> +  * POSIX requires POLLOUT to be never
> +  * set simultaneously with POLLHUP.
> +  */
> + if ((fds->revents & POLLHUP) != 0)
> + fds->revents &= ~POLLOUT;
> +
>   if (fds->revents != 0)
>   n++;
>   }

This looks useful.

> Modified: head/sys/kern/uipc_socket.c
> ==
> --- head/sys/kern/uipc_socket.c   Sun Aug 23 12:23:24 2009
> (r196459)
> +++ head/sys/kern/uipc_socket.c   Sun Aug 23 12:44:15 2009
> (r196460)
> @@ -2898,13 +2898,11 @@ sopoll_generic(struct socket *so, int ev
>   if (so->so_oobmark || (so->so_rcv.sb_state & SBS_RCVATMARK))
>   revents |= events & (POLLPRI | POLLRDBAND);
>  
> - if ((events & POLLINIGNEOF) == 0) {
> - if (so->so_rcv.sb_state & SBS_CANTRCVMORE) {
> - revents |= events & (POLLIN | POLLRDNORM);
> - if (so->so_snd.sb_state & SBS_CANTSENDMORE)
> - revents |= POLLHUP;
> - }
> - }
> + if ((events & POLLINIGNEOF) == 0)
> + if (so->so_rcv.sb_state & SBS_CANTRCVMORE)
> + revents |= POLLHUP;
> + if (so->so_snd.sb_state & SBS_CANTSENDMORE)
> + revents |= POLLHUP;
>  
>   if (revents == 0) {
>   if (events & (POLLIN | POLLPRI | POLLRDNORM | POLLRDBAND)) {

This sets POLLHUP when either of the directions has shut down, instead
of the entire connection. This breaks use of select and poll with
shutdown(2):

* sending some data, shutdown(SHUT_WR) then polling for input
* (the other side for the above) receiving some data, getting EOF on
  that direction then polling for output
* a paranoid HTTP-like server that wants to wait for the client to close
  the read end after shutting down the write end

Either some data is lost or the program busy-waits for the data.

I think the POLLHUP setting before this commit was correct for sockets:
POLLHUP is set if both directions are closed/error. An EOF that only
affects one direction sets the corresponding POLLIN/POLLOUT so that the
EOF becomes known but the other direction can still be used normally.

(The POSIX spec explicitly describes something like this for POLLIN
(zero length message) although it erroneously restricts it to STREAMS
files only; the POLLOUT case has to be derived from the fact that the
read end should still work normally but the EOF should be notified.)

I think poll on fifos should instead be fixed by closing the
half-connection corresponding to writing from fi_readsock to
fi_writesock. I have tried this out, see attached patch. With the patch,
pipepoll only gives "expected POLLHUP; got POLLIN | POLLHUP" errors and
an error in fifo case 6b caused by our distinction between new and old
readers.

tools/regression/poll does not test shutdown(2) interaction, so it does
not find this problem.

-- 
Jilles Tjoelker
Index: sys/kern/uipc_socket.c
===
--- sys/kern/uipc_socket.c	(revision 196469)
+++ sys/kern/uipc_socket.c	(working copy)
@@ -2898,11 +2898,13 @@
 		if (so->so_oobmark || (so->so_rcv.sb_state & SBS_RCVATMARK))
 			revents |= events & (POLLPRI | POLLRDBAND);
 
-	if ((events & POLLINIGNEOF) == 0)
-		if (so->so_rcv.sb_state & SBS_CANTRCVMORE)
-			revents |= POLLHUP;
-	if (so->so_snd.sb_state & SBS_CANTSENDMORE)
-		revents |= POLLHUP;
+	if ((events & POLLINIGNEOF) == 0) {
+		if (so->so_r

svn commit: r196460 - head/sys/kern

2009-08-23 Thread Konstantin Belousov
Author: kib
Date: Sun Aug 23 12:44:15 2009
New Revision: 196460
URL: http://svn.freebsd.org/changeset/base/196460

Log:
  Fix the conformance of poll(2) for sockets after r195423 by
  returning POLLHUP instead of POLLIN for several cases. Now, the
  tools/regression/poll results for FreeBSD are closer to that of the
  Solaris and Linux.
  
  Also, improve the POSIX conformance by explicitely clearing POLLOUT
  when POLLHUP is reported in pollscan(), making the fix global.
  
  Submitted by: bde
  Reviewed by:  rwatson
  MFC after:1 week

Modified:
  head/sys/kern/sys_generic.c
  head/sys/kern/uipc_socket.c

Modified: head/sys/kern/sys_generic.c
==
--- head/sys/kern/sys_generic.c Sun Aug 23 12:23:24 2009(r196459)
+++ head/sys/kern/sys_generic.c Sun Aug 23 12:44:15 2009(r196460)
@@ -1228,6 +1228,13 @@ pollscan(td, fds, nfd)
selfdalloc(td, fds);
fds->revents = fo_poll(fp, fds->events,
td->td_ucred, td);
+   /*
+* POSIX requires POLLOUT to be never
+* set simultaneously with POLLHUP.
+*/
+   if ((fds->revents & POLLHUP) != 0)
+   fds->revents &= ~POLLOUT;
+
if (fds->revents != 0)
n++;
}

Modified: head/sys/kern/uipc_socket.c
==
--- head/sys/kern/uipc_socket.c Sun Aug 23 12:23:24 2009(r196459)
+++ head/sys/kern/uipc_socket.c Sun Aug 23 12:44:15 2009(r196460)
@@ -2898,13 +2898,11 @@ sopoll_generic(struct socket *so, int ev
if (so->so_oobmark || (so->so_rcv.sb_state & SBS_RCVATMARK))
revents |= events & (POLLPRI | POLLRDBAND);
 
-   if ((events & POLLINIGNEOF) == 0) {
-   if (so->so_rcv.sb_state & SBS_CANTRCVMORE) {
-   revents |= events & (POLLIN | POLLRDNORM);
-   if (so->so_snd.sb_state & SBS_CANTSENDMORE)
-   revents |= POLLHUP;
-   }
-   }
+   if ((events & POLLINIGNEOF) == 0)
+   if (so->so_rcv.sb_state & SBS_CANTRCVMORE)
+   revents |= POLLHUP;
+   if (so->so_snd.sb_state & SBS_CANTSENDMORE)
+   revents |= POLLHUP;
 
if (revents == 0) {
if (events & (POLLIN | POLLPRI | POLLRDNORM | POLLRDBAND)) {
___
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"