On Sat, Oct 03, 2020 at 09:09:00AM +0200, Martin Pieuchot wrote:
> On 02/10/20(Fri) 19:09, Scott Cheloha wrote:
> > On Fri, Oct 02, 2020 at 12:19:35PM +0200, Martin Pieuchot wrote:
> > > @@ -635,12 +642,39 @@ sys_kevent(struct proc *p, void *v, regi
> > > goto done;
> > > }
> > >
> > > +
> > > KQREF(kq);
> > > FRELE(fp, p);
> > > - error = kqueue_scan(kq, SCARG(uap, nevents), SCARG(uap, eventlist),
> > > - tsp, kev, p, &n);
> > > + /*
> > > + * Collect as many events as we can. The timeout on successive
> > > + * loops is disabled (kqueue_scan() becomes non-blocking).
> > > + */
> > > + total = 0;
> > > + error = 0;
> > > + kqueue_scan_setup(&scan, kq);
> > > + while ((n = SCARG(uap, nevents) - total) > 0) {
> > > + if (n > nitems(kev))
> > > + n = nitems(kev);
> > > + ready = kqueue_scan(&scan, n, kev, tsp, p, &error);
> > > + if (ready == 0)
> > > + break;
> > > + error = copyout(kev, SCARG(uap, eventlist) + total,
> > > + sizeof(struct kevent) * ready);
> > > +#ifdef KTRACE
> > > + if (KTRPOINT(p, KTR_STRUCT))
> > > + ktrevent(p, kev, ready);
> > > +#endif
> > > + total += ready;
> > > + if (error || ready < n)
> > > + break;
> > > + tsp = &ts; /* successive loops non-blocking */
> > > + timespecclear(tsp);
> >
> > Here, this. Why do we force a non-blocking loop the second time?
>
> If there's a second time that implies the first time already reported
> some events so there's already something to return to userland. In that
> case we just want to gather the events that were not collected the first
> time and not sleep again.
Okay, now I see it, thank you for the explanation.
> > > + }
> > > + kqueue_scan_finish(&scan);
> > > KQRELE(kq);
> > > - *retval = n;
> > > + if (error == EWOULDBLOCK)
> > > + error = 0;
> > > + *retval = total;
> > > return (error);
> > >
> > > done:
> > > @@ -894,24 +928,22 @@ kqueue_sleep(struct kqueue *kq, struct t
> > > return (error);
> > > }
> > >
> > > +/*
> > > + * Scan the kqueue, blocking if necessary until the target time is
> > > reached.
> > > + * If tsp is NULL we block indefinitely. If tsp->ts_secs/nsecs are both
> > > + * 0 we do not block at all.
> > > + */
> > > int
> > > -kqueue_scan(struct kqueue *kq, int maxevents, struct kevent *ulistp,
> > > - struct timespec *tsp, struct kevent *kev, struct proc *p, int
> > > *retval)
> > > +kqueue_scan(struct kqueue_scan_state *scan, int maxevents,
> > > + struct kevent *kevp, struct timespec *tsp, struct proc *p, int
> > > *errorp)
> > > {
> > > - struct kevent *kevp;
> > > - struct knote mend, mstart, *kn;
> > > - int s, count, nkev, error = 0;
> > > -
> > > - nkev = 0;
> > > - kevp = kev;
> > > + struct knote *kn;
> > > + struct kqueue *kq = scan->kqs_kq;
> > > + int s, count, nkev = 0, error = 0;
> > >
> > > count = maxevents;
> > > if (count == 0)
> > > goto done;
> > > -
> > > - memset(&mstart, 0, sizeof(mstart));
> > > - memset(&mend, 0, sizeof(mend));
> > > -
> > > retry:
> > > KASSERT(count == maxevents);
> > > KASSERT(nkev == 0);
> > > @@ -923,7 +955,8 @@ retry:
> > >
> > > s = splhigh();
> > > if (kq->kq_count == 0) {
> > > - if (tsp != NULL && !timespecisset(tsp)) {
> > > + if ((tsp != NULL && !timespecisset(tsp)) ||
> > > + scan->kqs_nevent != 0) {
> > > splx(s);
> > > error = 0;
> > > goto done;
> > > @@ -931,7 +964,7 @@ retry:
> > > kq->kq_state |= KQ_SLEEP;
> > > error = kqueue_sleep(kq, tsp);
> > > splx(s);
> > > - if (error == 0 || error == EWOULDBLOCK)
> > > + if (error == 0)
> > > goto retry;
> >
> > Why wouldn't we want to retry in the EWOULDBLOCK case?
> > You have a check for
> >
> > tsp != NULL && !timespecisset(tsp)
> >
> > e.g., when you time out.
>
> I don't recall why or even if there was a reason. I'll change it back,
> thanks.
Cool.
> > > +
> > > + /*
> > > + * The poll/select family of syscalls has been designed to
> > > + * block when file descriptors are not available, even if
> > > + * there's nothing to wait for.
> > > + */
> > > + if (nevents == 0) {
> > > + uint64_t nsecs = INFSLP;
> > > +
> > > + if (timeout != NULL)
> > > + nsecs = MAX(1, MIN(TIMESPEC_TO_NSEC(timeout), MAXTSLP));
> > > +
> > > + error = tsleep_nsec(&p->p_kq, PSOCK | PCATCH, "kqsel", nsecs);
> > > + /* select is not restarted after signals... */
> > > + if (error == ERESTART)
> > > + error = EINTR;
> > > + }
> >
> > Aside: can the new logic (below) not handle the case where
> > nevents == 0? Like, what happens if we go into kqueue_scan()
> > with count == 0?
>
> First like of kqueue_scan() does:
>
> if (count == 0)
> goto done
>
> So there's no sleep. I opted for the explicit sleep rather than the
> obfuscated one ;)
I'm a bit torn on this. On the one hand, I like being overt about
special cases. On the other hand, I worry that this will introduce
bugs.
For instance, I just noticed that you have changed the behavior
in the zero timeout case. select(2) and pselect(2) are not supposed
to block if the timeout is zero, e.g. code like:
struct timeval tv;
timerclear(&tv);
select(0, NULL, NULL, NULL, &tv);
should not block. But your new code blocks.
In theory there are no other bugs...
> > > + /* Collect at most `nevents' possibly waiting in kqueue_scan() */
> > > + kqueue_scan_setup(&scan, p->p_kq);
> > > + while (nevents > 0) {
> > > + struct kevent kev[KQ_NEVENTS];
> > > + int i, ready, count;
> > > +
> > > + /* Maxium number of events per iteration */
> > > + count = MIN(nitems(kev), nevents);
> > > + ready = kqueue_scan(&scan, count, kev, timeout, p, &error);
> > > +#ifdef KTRACE
> > > + if (KTRPOINT(p, KTR_STRUCT))
> > > + ktrevent(p, kev, ready);
> > > +#endif
> > > + /* Convert back events that are ready. */
> > > + for (i = 0; i < ready; i++)
> > > + *retval += pselcollect(p, &kev[i], pobits);
> > > +
> > > + /*
> > > + * Stop if there was an error or if we had enough
> > > + * place to collect all events that were ready.
> > > + */
> > > + if (error || ready < count)
> > > + break;
> > > +
> > > + timeout = &ts; /* successive loops non-blocking */
> > > + timespecclear(timeout);
> >
> > ... again, it looks like we now truncate instead of resleeping.
>
> Yes, that's on purpose. Maybe you can suggest a comment to explain this
> behavior that seems surprising?
After thinking about it a bit I think it is less surprising than I
first thought.
What a surprise :)
I'm trying to communicate:
kev is a fixed-length array. The caller's event array might
be longer than kev. In that case, we may need to call kqueue_scan()
multiple times to capture all events. But we only want to
block in kqueue_scan() on the 1st iteration because we necessarily
captured events during the first call.
Here's my best attempt:
/*
* There may be more ready events. We will try to capture them during
* the next iteration. However, we have already captured other events,
* so we do not want to block in kqueue_scan() if it turns out that there
* aren't any.
*/
... meh.