Re: cron(8): add '@' interval time specifier

2021-07-11 Thread Todd C. Miller
On Sat, 10 Jul 2021 15:54:51 -, Job Snijders wrote:

> > > I borrowed the idea from FreeBSD's cron [1]. A difference between
> > > the below changeset and the freebsd implementation is that they
> > > specify the interval in seconds, while the below specifies in
> > > minutes.
> > 
> > Why be incompatible?  Better have a strong justification.
>
> OpenBSD cron would need to be extended to support 'up to 1 second
> precise' scheduling, perhaps some code can be borrowed from 
> https://github.com/freebsd/freebsd-src/commit/7a5c30c5b67555c9f76a053812ba80ae258b8874#diff-f8f4ba67b7df93eb4b436203d0c70670560029bd8c5b9a642691a914aa33d400
> to accomplish this.

We use ppoll(2) when waiting for the next event so it should be
simple to add support for sub-minute resolution.  Our code differs
greatly from FreeBSD's in this way.
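
As a rough illustration (a sketch only, not cron's actual code; the
helper name is made up), ppoll(2) takes a struct timespec, so a wait
can already be expressed with sub-second resolution:

#include <sys/time.h>
#include <poll.h>
#include <time.h>

/* Sleep until an absolute wakeup time, with sub-second resolution. */
static void
sleep_until(const struct timespec *target)
{
	struct timespec now, timeout;

	clock_gettime(CLOCK_REALTIME, &now);
	if (timespeccmp(target, &now, <=))
		return;		/* target already passed */
	timespecsub(target, &now, &timeout);
	/* No fds to watch; ppoll() just waits for the timeout or a signal. */
	(void)ppoll(NULL, 0, &timeout, NULL);
}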

> Alternatively, I can remove the below '* SECONDS_PER_MINUTE' multiplier,
> so that from a syntax perspective we are equivalent to FreeBSD, but our
> cron will have 'one minute precision' regarding when the job is
> executed.

I think maintaining compatibility with FreeBSD is the way to go
since that is where this feature originates.  We can add sub-minute
precision separately.
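
For reference, with the FreeBSD-compatible syntax an interval job in a
crontab would look something like this (hypothetical command path; the
number is the interval in seconds):

@300	/usr/local/bin/sync-feeds

As I read the FreeBSD semantics, the job is rescheduled relative to its
previous run and only one instance runs at a time, which is what the
SINGLE_JOB flag in the diff below reflects.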

> entry.c:
> + } else if (*cmd != '\0' &&
> + (interval = strtol(cmd, &endptr, 10)) > 0 &&
> + *endptr == '\0') {
> + e->interval = interval * SECONDS_PER_MINUTE;
> + e->flags |= INTERVAL | SINGLE_JOB;
>
> For me 'one minute' precision is good enough, but if others think it is
> worth putting in the effort to get to '1 second' precision scheduling
> I'm happy to try to hack it.

Currently, cron_sleep() takes a target time in minutes, but it doesn't
have to be that way.  We could set the target time based on the first
job in the queue or now + 1 minute, whichever is smaller.
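
In sketch form (illustrative names, not a diff):

/* Sketch: wake for the first queued job or in one minute, whichever
 * comes first, instead of always sleeping to the next minute. */
time_t
next_target(time_t first_entry_time, time_t now)
{
	if (first_entry_time < now + 60)
		return first_entry_time;
	return now + 60;
}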

 - todd



Re: Add f_modify and f_process callbacks to FIFO filterops

2021-07-11 Thread Todd C. Miller
On Sun, 11 Jul 2021 14:27:00 -, Visa Hankala wrote:

> This patch adds f_modify and f_process callbacks to FIFO event filters.
> The general idea is the same as with sockets, discussed in thread
> "Add f_modify and f_process callbacks to socket filterops" on tech@.
>
> I think it is best to add the callbacks now, before further changes to
> socket locking.

This matches the socket changes and looks correct to me.
OK millert@

 - todd



ddb trace: fix output for too many arguments

2021-07-11 Thread Jasper Lievisse Adriaanse
Hi,

When printing a trace from ddb, some architectures are limited by the
number of registers used to pass arguments.  If the number of arguments
to a function exceeded this number, the code in db_stack_trace_print()
would print that many arguments without any indication that one or more
arguments weren't printed.

Here's a diff that tweaks the output to make it clear there were more
arguments.  Do we want to print ',...' for each omitted argument (like
this diff does) or perhaps just a single ',...'?
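
For example, a function called with eight arguments would then be
printed along these lines (hypothetical output; six arguments come from
registers, the two omitted ones show as ',...'):

somefunc(1,2,3,4,5,6,...,...) at somefunc+0x2a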

Index: amd64/amd64/db_trace.c
===
RCS file: /cvs/src/sys/arch/amd64/amd64/db_trace.c,v
retrieving revision 1.53
diff -u -p -r1.53 db_trace.c
--- amd64/amd64/db_trace.c  14 May 2020 06:58:54 -  1.53
+++ amd64/amd64/db_trace.c  11 Jul 2021 13:23:04 -
@@ -137,7 +136,7 @@ db_stack_trace_print(db_expr_t addr, int
 
lastframe = 0;
while (count && frame != 0) {
-   int narg;
+   int narg, extra_args = 0;
unsigned int    i;
char *  name;
db_expr_t   offset;
@@ -165,8 +164,12 @@ db_stack_trace_print(db_expr_t addr, int
}
 
narg = db_ctf_func_numargs(sym);
-   if (narg < 0 || narg > 6)
+   if (narg < 0)
+   narg = 6;
+   else if (narg > 6) {
+   extra_args = narg - 6;
narg = 6;
+   }
 
if (name == NULL)
(*pr)("%lx(", callpc);
@@ -204,6 +207,9 @@ db_stack_trace_print(db_expr_t addr, int
if (--narg != 0)
(*pr)(",");
}
+   for (i = extra_args; i > 0; i--)
+   (*pr)(",...");  
+
(*pr)(") at ");
db_printsym(callpc, DB_STGY_PROC, pr);
(*pr)("\n");
Index: powerpc/ddb/db_trace.c
===
RCS file: /cvs/src/sys/arch/powerpc/ddb/db_trace.c,v
retrieving revision 1.17
diff -u -p -r1.17 db_trace.c
--- powerpc/ddb/db_trace.c  14 May 2020 06:58:54 -  1.17
+++ powerpc/ddb/db_trace.c  11 Jul 2021 13:23:04 -
@@ -123,7 +123,7 @@ db_stack_trace_print(db_expr_t addr, int
Elf_Sym *sym;
char    *name;
char c, *cp = modif;
-   int  i, narg, trace_proc = 0;
+   int  i, narg, trace_proc = 0, extra_args = 0;
 
while ((c = *cp++) != 0) {
if (c == 't')
@@ -158,8 +158,12 @@ db_stack_trace_print(db_expr_t addr, int
(*pr)("at 0x%lx", lr - 4);
} else {
narg = db_ctf_func_numargs(sym);
-   if (narg < 0 || narg > 8)
+   if (narg < 0)
narg = 8;
+   else if (narg > 8) {
+   extra_args = narg - 8;
+   narg = 8;
+   }
 
(*pr)("%s(", name);
 
@@ -172,6 +176,9 @@ db_stack_trace_print(db_expr_t addr, int
(*pr)(",");
}
}
+
+   for (i = extra_args; i > 0; i--)
+   (*pr)(",...");
 
(*pr)(") at ");
db_printsym(lr - 4, DB_STGY_PROC, pr);
-- 
jasper



Re: Do not spin on the NET_LOCK() in kqueue

2021-07-11 Thread Visa Hankala
On Sat, Jul 10, 2021 at 05:26:57PM +0200, Martin Pieuchot wrote:
> One of the reasons for the drop in performance of the kqueue-based
> poll/select is the fact that kqueue filters are called up to 3 times
> per syscall and that they all spin on the NET_LOCK() for TCP/UDP
> packets.
> 
> Diff below is a RFC for improving the situation.
> 
> socket kqueue filters mainly check for the amount of available items to
> read/write.  This involves comparing various socket buffer fields (sb_cc,
> sb_lowat, etc).  The diff below introduces a new mutex to serialize
> updates of those fields with reads in the kqueue filters.
> 
> Since these fields are always modified with the socket lock held, either
> the mutex or the solock is enough to have a coherent view of them.
> Note that either of these locks is necessary only if multiple fields
> have to be read (like in sbspace()).
> 
> Other per-socket fields accessed in the kqueue filters are never
> combined (with &&) to determine a condition.  So assuming it is fine to
> read register-sized fields w/o the socket lock, we can safely remove it
> there.
> 
> Could such mutex also be used to serialize klist updates?

I think the lock should be such that it can serialize socket klists.

As the main motivator for this change is kqueue, the viability of using
the mutex for the klist locking should be checked now. The mutex has to
be held whenever calling KNOTE() on sb_sel.si_note, or selwakeup() on
sb_sel. Then the socket f_event callbacks will not need to lock the
mutex themselves.

I had a diff that serialized socket klists using solock(). It did not
work well because it increased lock contention, especially when using
kqueue as backend for poll(2) and select(2). The diff is not even
correct any longer since recent changes to socket locking have
introduced new lock order constraints that conflict with it.
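
To make the pattern concrete, here is a minimal sketch (illustrative
struct and function names, not the actual diff): a per-buffer mutex
lets a filter check several socket buffer fields coherently without
taking the socket lock.

/* Sketch only: serialize sb field reads/updates with a mutex. */
struct sockbuf_sketch {
	struct mutex	sb_mtx;		/* protects the fields below */
	u_long		sb_cc;		/* bytes queued in the buffer */
	u_long		sb_lowat;	/* low-water mark */
};

int
filt_read_ready(struct sockbuf_sketch *sb)
{
	int rv;

	/* Either sb_mtx or the solock gives a coherent view of both fields. */
	mtx_enter(&sb->sb_mtx);
	rv = (sb->sb_cc >= sb->sb_lowat);
	mtx_leave(&sb->sb_mtx);

	return (rv);
}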



Add f_modify and f_process callbacks to FIFO filterops

2021-07-11 Thread Visa Hankala
This patch adds f_modify and f_process callbacks to FIFO event filters.
The general idea is the same as with sockets, discussed in thread
"Add f_modify and f_process callbacks to socket filterops" on tech@.

I think it is best to add the callbacks now, before further changes to
socket locking.

OK?

Index: miscfs/fifofs/fifo_vnops.c
===
RCS file: src/sys/miscfs/fifofs/fifo_vnops.c,v
retrieving revision 1.79
diff -u -p -r1.79 fifo_vnops.c
--- miscfs/fifofs/fifo_vnops.c  17 Jan 2021 05:23:34 -  1.79
+++ miscfs/fifofs/fifo_vnops.c  11 Jul 2021 12:04:07 -
@@ -104,14 +104,22 @@ const struct vops fifo_vops = {
 
 void   filt_fifordetach(struct knote *kn);
int    filt_fiforead(struct knote *kn, long hint);
+int    filt_fiforeadmodify(struct kevent *kev, struct knote *kn);
+int    filt_fiforeadprocess(struct knote *kn, struct kevent *kev);
+int    filt_fiforead_common(struct knote *kn, struct socket *so);
void   filt_fifowdetach(struct knote *kn);
int    filt_fifowrite(struct knote *kn, long hint);
+int    filt_fifowritemodify(struct kevent *kev, struct knote *kn);
+int    filt_fifowriteprocess(struct knote *kn, struct kevent *kev);
+int    filt_fifowrite_common(struct knote *kn, struct socket *so);
 
 const struct filterops fiforead_filtops = {
.f_flags= FILTEROP_ISFD,
.f_attach   = NULL,
.f_detach   = filt_fifordetach,
.f_event= filt_fiforead,
+   .f_modify   = filt_fiforeadmodify,
+   .f_process  = filt_fiforeadprocess,
 };
 
 const struct filterops fifowrite_filtops = {
@@ -119,6 +127,8 @@ const struct filterops fifowrite_filtops
.f_attach   = NULL,
.f_detach   = filt_fifowdetach,
.f_event= filt_fifowrite,
+   .f_modify   = filt_fifowritemodify,
+   .f_process  = filt_fifowriteprocess,
 };
 
 /*
@@ -546,13 +556,12 @@ filt_fifordetach(struct knote *kn)
 }
 
 int
-filt_fiforead(struct knote *kn, long hint)
+filt_fiforead_common(struct knote *kn, struct socket *so)
 {
-   struct socket *so = (struct socket *)kn->kn_hook;
-   int s, rv;
+   int rv;
+
+   soassertlocked(so);
 
-   if ((hint & NOTE_SUBMIT) == 0)
-   s = solock(so);
kn->kn_data = so->so_rcv.sb_cc;
if (so->so_state & SS_CANTRCVMORE) {
kn->kn_flags |= EV_EOF;
@@ -565,8 +574,46 @@ filt_fiforead(struct knote *kn, long hin
kn->kn_flags &= ~EV_EOF;
rv = (kn->kn_data > 0);
}
-   if ((hint & NOTE_SUBMIT) == 0)
-   sounlock(so, s);
+
+   return (rv);
+}
+
+int
+filt_fiforead(struct knote *kn, long hint)
+{
+   struct socket *so = kn->kn_hook;
+
+   return (filt_fiforead_common(kn, so));
+}
+
+int
+filt_fiforeadmodify(struct kevent *kev, struct knote *kn)
+{
+   struct socket *so = kn->kn_hook;
+   int rv, s;
+
+   s = solock(so);
+   knote_modify(kev, kn);
+   rv = filt_fiforead_common(kn, so);
+   sounlock(so, s);
+
+   return (rv);
+}
+
+int
+filt_fiforeadprocess(struct knote *kn, struct kevent *kev)
+{
+   struct socket *so = kn->kn_hook;
+   int rv, s;
+
+   s = solock(so);
+   if (kev != NULL && (kn->kn_flags & EV_ONESHOT))
+   rv = 1;
+   else
+   rv = filt_fiforead_common(kn, so);
+   if (rv != 0)
+   knote_submit(kn, kev);
+   sounlock(so, s);
 
return (rv);
 }
@@ -580,13 +627,12 @@ filt_fifowdetach(struct knote *kn)
 }
 
 int
-filt_fifowrite(struct knote *kn, long hint)
+filt_fifowrite_common(struct knote *kn, struct socket *so)
 {
-   struct socket *so = (struct socket *)kn->kn_hook;
-   int s, rv;
+   int rv;
+
+   soassertlocked(so);
 
-   if ((hint & NOTE_SUBMIT) == 0)
-   s = solock(so);
kn->kn_data = sbspace(so, &so->so_snd);
if (so->so_state & SS_CANTSENDMORE) {
kn->kn_flags |= EV_EOF;
@@ -595,8 +641,46 @@ filt_fifowrite(struct knote *kn, long hi
kn->kn_flags &= ~EV_EOF;
rv = (kn->kn_data >= so->so_snd.sb_lowat);
}
-   if ((hint & NOTE_SUBMIT) == 0)
-   sounlock(so, s);
+
+   return (rv);
+}
+
+int
+filt_fifowrite(struct knote *kn, long hint)
+{
+   struct socket *so = kn->kn_hook;
+
+   return (filt_fifowrite_common(kn, so));
+}
+
+int
+filt_fifowritemodify(struct kevent *kev, struct knote *kn)
+{
+   struct socket *so = kn->kn_hook;
+   int rv, s;
+
+   s = solock(so);
+   knote_modify(kev, kn);
+   rv = filt_fifowrite_common(kn, so);
+   sounlock(so, s);
+
+   return (rv);
+}
+
+int
+filt_fifowriteprocess(struct knote *kn, struct kevent *kev)
+{
+   struct socket *so = kn->kn_hook;
+   int rv, s;
+
+   s = solock(so);
+   if (kev != NULL && (kn->kn_flags & EV_ONESHOT))
+   rv = 1;
+   else
+   rv = filt_fifowrite_common(kn, so);
+   if (rv != 0)
+   knote_submit(kn, kev);
+   sounlock(so, s);
+
+   return (rv);
+}

Re: vmd(8): simplify vcpu logic, removing uart & net reads

2021-07-11 Thread Dave Voutila


Ping...looking for OK. Would like to get this committed this week.

Dave Voutila writes:

> Looking for an OK for this one now. Anyone?
>
> Dave Voutila  writes:
>
>> Dave Voutila writes:
>>
>>> Looking for some broader testing of the following diff. It cleans up
>>> some complicated logic predominantly left over from the early days of
>>> vmd prior to its having a dedicated device thread.
>>
>> Still looking for tester feedback. I've been running this diff while
>> hosting multiple guests continuously (OpenBSD-current, Alpine 3.14,
>> Debian 10.10, Ubuntu 20.04) with no issues.
>>
>> I know a few folks have told me they've applied the diff and have not
>> seen issues.
>
> I've had positive reports from 4 people. Thanks everyone that tested and
> provided feedback!
>
>>
>> I'll prod for OK next week, so if you've tested the diff please let me
>> know!
>
> OK to commit?
>
>>
>>>
>>> In summary, this diff:
>>>
>>> - Removes vionet "rx pending" state handling and removes the code path
>>>   for the vcpu thread to possibly take control of the virtio net device
>>>   and attempt a read of the underlying tap(4). (virtio.{c,h}, vm.c)
>>>
>>> - Removes ns8250 "rcv pending" state handling and removes the code path
>>>   for the vcpu thread to read the pty via com_rcv(). (ns8250.{c,h})
>>>
>>> In both of the above cases, the event handling thread will be notified
>>> of readable data and deal with it.
>>>
>>> Why remove them? The logic is overly complicated and hard to reason
>>> about for zero gain. (This diff results in no intended functional
>>> change.) Plus, some of the above logic I helped add to deal with the
>>> race conditions and state corruption over a year ago. The logic was
>>> needed once upon a time, but shouldn't be needed at present.
>>>
>>> I've had positive testing feedback from abieber@ so far with at least
>>> the ns8250/uart diff, but want to cast a broader net here with both
>>> before either part is committed. I debated splitting these up, but
>>> they're thematically related.
>>>
>>> -dv
>>>
>>> Index: virtio.c
>>> ===
>>> RCS file: /cvs/src/usr.sbin/vmd/virtio.c,v
>>> retrieving revision 1.91
>>> diff -u -p -r1.91 virtio.c
>>> --- virtio.c21 Jun 2021 02:38:18 -  1.91
>>> +++ virtio.c23 Jun 2021 11:28:03 -
>>> @@ -1254,12 +1254,12 @@ static int
>>>  vionet_rx(struct vionet_dev *dev)
>>>  {
>>> char buf[PAGE_SIZE];
>>> -   int hasdata, num_enq = 0, spc = 0;
>>> +   int num_enq = 0, spc = 0;
>>> struct ether_header *eh;
>>> ssize_t sz;
>>>
>>> do {
>>> -   sz = read(dev->fd, buf, sizeof buf);
>>> +   sz = read(dev->fd, buf, sizeof(buf));
>>> if (sz == -1) {
>>> /*
>>>  * If we get EAGAIN, no data is currently available.
>>> @@ -1270,21 +1270,17 @@ vionet_rx(struct vionet_dev *dev)
>>> "device");
>>> } else if (sz > 0) {
>>> eh = (struct ether_header *)buf;
>>> -   if (!dev->lockedmac || sz < ETHER_HDR_LEN ||
>>> +   if (!dev->lockedmac ||
>>> ETHER_IS_MULTICAST(eh->ether_dhost) ||
>>> memcmp(eh->ether_dhost, dev->mac,
>>> sizeof(eh->ether_dhost)) == 0)
>>> num_enq += vionet_enq_rx(dev, buf, sz, &spc);
>>> } else if (sz == 0) {
>>> log_debug("process_rx: no data");
>>> -   hasdata = 0;
>>> break;
>>> }
>>> +   } while (spc > 0 && sz > 0);
>>>
>>> -   hasdata = fd_hasdata(dev->fd);
>>> -   } while (spc && hasdata);
>>> -
>>> -   dev->rx_pending = hasdata;
>>> return (num_enq);
>>>  }
>>>
>>> @@ -1301,16 +1297,6 @@ vionet_rx_event(int fd, short kind, void
>>>
>>> mutex_lock(&dev->mutex);
>>>
>>> -   /*
>>> -* We already have other data pending to be received. The data that
>>> -* has become available now will be enqueued to the vionet_dev
>>> -* later.
>>> -*/
>>> -   if (dev->rx_pending) {
>>> mutex_unlock(&dev->mutex);
>>> -   return;
>>> -   }
>>> -
>>> if (vionet_rx(dev) > 0) {
>>> /* XXX: vcpu_id */
>>> vcpu_assert_pic_irq(dev->vm_id, 0, dev->irq);
>>> @@ -1320,40 +1306,6 @@ vionet_rx_event(int fd, short kind, void
>>>  }
>>>
>>>  /*
>>> - * vionet_process_rx
>>> - *
>>> - * Processes any remaining pending receivable data for a vionet device.
>>> - * Called on VCPU exit. Although we poll on the tap file descriptor of
>>> - * a vionet_dev in a separate thread, this function still needs to be
>>> - * called on VCPU exit: it can happen that not all data fits into the
>>> - * receive queue of the vionet_dev immediately. So any outstanding data
>>> - * is handled here.
>>> - *
>>> - * Parameters:
>>> - *  vm_id: VM ID of the VM for which to process vionet events
>>> - */
>>> -void
>>>