The select(2) family of syscalls was designed to work this way (block when
file descriptors are not available, even when they cannot become available).
For example:

        select(0, NULL, NULL, NULL, NULL);

blocks forever.

In fact in the very old days (before usleep, nanosleep, clock_nanosleep),
select(2) was the only way to get sub-second sleeps from the kernel, and
the idiom was similar to the above:

void
usleep(int usec) {
        struct timeval tv;

        tv.tv_sec = 0;
        tv.tv_usec = usec;

        select(0, NULL, NULL, NULL, &tv);
}

Best,

christos

> On Apr 1, 2020, at 8:19 AM, Theo de Raadt <dera...@openbsd.org> wrote:
> 
> The regress test is wrong.
> 
> The first parameter to pselect has to tell it which bits to look at.
> It should be something like getdtablesize(), or fd+1, or FD_SETSIZE.
> 
> And then there's all the handwaving about fd_set overflow, meaning what
> if fd >= FD_SETSIZE.  I've never heard of an exploitable one  though.
> 
> Martin Pieuchot <m...@openbsd.org> wrote:
> 
>> While working on moving pselect(2) outside of the KERNEL_LOCK() I found
>> this surprising {p,}select(2) behavior exposed by the libc/sys/t_select.c
>> regression test.  Should it be considered as a bug or feature?
>> 
>> The above mentioned test does the following (simplified):
>> 
>>      fd = open("/dev/null", O_RDONLY);
>>      FD_ZERO(&rset);
>>      FD_SET(fd, &rset);
>>      pselect(1, &rset, NULL, NULL, NULL, &set);
>> 
>> Note that in most environments, open(2) will return a value higher than
>> 1 which means the value written by FD_SET() will never be checked by the
>> current implementation of {p,}select(2).  In this case, even if it isn't
>> waiting for any condition, the current implementation will block.
>> 
>> To put it differently, the code above is an obfuscated way to write:
>> 
>>      sigsuspend(&set);
>> 
>> If we agree that this is not what the user wants to do, I'm suggesting
>> the diff below which makes {p,}select(2) return EINVAL if no bit has
>> been found in the given set up to `nfds'.
>> 
>> Independently from our decision, I'd suggest to rename the variable
>> name `nfds' in the syscall documentation because it can lead to this
>> kind of confusion.
>> 
>> Finally if we accept this change the regression test will have to be
>> adapted because polling "/dev/null" for reading always return true so
>> pselect(2) wont block waiting for a signal.  Any suggestion?
>> 
>> Index: sys/kern/sys_generic.c
>> ===================================================================
>> RCS file: /cvs/src/sys/kern/sys_generic.c,v
>> retrieving revision 1.131
>> diff -u -p -r1.131 sys_generic.c
>> --- sys/kern/sys_generic.c   20 Mar 2020 04:11:05 -0000      1.131
>> +++ sys/kern/sys_generic.c   1 Apr 2020 10:27:51 -0000
>> @@ -703,7 +703,7 @@ selscan(struct proc *p, fd_set *ibits, f
>> {
>>      caddr_t cibits = (caddr_t)ibits, cobits = (caddr_t)obits;
>>      struct filedesc *fdp = p->p_fd;
>> -    int msk, i, j, fd;
>> +    int msk, i, j, fd, foundfds = 0;
>>      fd_mask bits;
>>      struct file *fp;
>>      int n = 0;
>> @@ -719,6 +719,7 @@ selscan(struct proc *p, fd_set *ibits, f
>>                              bits &= ~(1 << j);
>>                              if ((fp = fd_getfile(fdp, fd)) == NULL)
>>                                      return (EBADF);
>> +                            foundfds++;
>>                              if ((*fp->f_ops->fo_poll)(fp, flag[msk], p)) {
>>                                      FD_SET(fd, pobits);
>>                                      n++;
>> @@ -727,6 +728,8 @@ selscan(struct proc *p, fd_set *ibits, f
>>                      }
>>              }
>>      }
>> +    if (foundfds == 0)
>> +            return (EINVAL);
>>      *retval = n;
>>      return (0);
>> }
>> Index: lib/libc/sys/select.2
>> ===================================================================
>> RCS file: /cvs/src/lib/libc/sys/select.2,v
>> retrieving revision 1.42
>> diff -u -p -r1.42 select.2
>> --- lib/libc/sys/select.2    17 Sep 2016 01:01:42 -0000      1.42
>> +++ lib/libc/sys/select.2    1 Apr 2020 10:35:29 -0000
>> @@ -188,7 +188,7 @@ The specified time limit is invalid.
>> One of its components is negative or too large.
>> .It Bq Er EINVAL
>> .Fa nfds
>> -was less than 0.
>> +was smaller than the smallest descriptor specified in the sets.
>> .El
>> .Sh SEE ALSO
>> .Xr accept 2 ,
>> 

Attachment: signature.asc
Description: Message signed with OpenPGP

Reply via email to