Re: [PATCH] Convert ddb_sysctl to sysctl_bounded_arr

2020-12-04 Thread Greg Steuck
George Koehler  writes:

> On Mon, 30 Nov 2020 20:04:10 -0800
> Greg Steuck  wrote:
>
>> Tested with a bunch of manual sysctl -w's.
>> 
>> OK?
>
> I'm not sure about this diff.  I'm more likely to do
>   ddb{0}> set $radix = 0t2
> and less likely to do
>   # sysctl ddb.radix=2
> but you enforce the range check (radix in 8..16) only in sysctl,
> not in ddb set.
>
> If I do ddb set a bad value, then sysctl refuses to show the value:
>
>   # sysctl ddb.console=1
>   ddb.console: 0 -> 1
>   # sysctl ddb.trigger=1
>   Stopped at  ddb_sysctl+0x114:   ori r0,r0,0x0
>   ddb{0}> set $radix = 0t2
>   ddb{0}> c
>   ddb.trigger: 0 -> 1
>   # sysctl ddb.radix
>   sysctl: ddb.radix: Invalid argument
>
> This diff might be better than doing nothing?  I'm not sure.  --George

I'm game for changing the range of ddb.radix to [2..INT_MAX] if you
think that's better. I doubt it makes that much of a difference either
way.

Thanks
Greg



Re: [PATCH] fixes relayd Websocket "Connection: close" header when Upgrade is requested

2020-12-04 Thread Franz Bettag
thanks for bringing it up again, i always have to patch multiple relayds after 
upgrades. -.-

Sent from my iPad

> On 4. Dec 2020, at 14:18, Marcus MERIGHI  wrote:
> 
> Hello!
> 
> This patch wasn't commited and not discussed (publicly). 
> 
> It lets me use relayd as a front-end to the mattermost-server.
> 
> Just a friendly reminder...
> 
> @franz: Thank you!
> 
> Marcus
> 
> fr...@bett.ag (Franz Bettag), 2020.03.04 (Wed) 17:52 (CET):
>> After migrating my home setup from nginx reverse proxying to relayd, i
>> noticed my iOS devices having issues connecting through Websockets.
>> After debugging, i noticed that relayd adds the "Connection: close"
>> regardless of upgrade being requested.
>> 
>> This issue is also reported on a blog-post using relayd as a Websocket
>> Client. https://deftly.net/posts/2019-10-23-websockets-with-relayd.html
>> 
>> This resulted in the response having two Connection Headers, one
>> "Connection: upgrade" and one "Connection: close", which iOS strict
>> implementation does not allow.
>> 
>> I have fixed the if condition that leads to this issue.
>> 
>> The fix has been tested with Apple iOS 13 and the problem can be
>> observed using Firefox and just connecting to something Websocket over
>> relayd. Both "Connection:" headers will appear.
>> 
>> best regards
>> 
>> Franz
>> 
>> 
>> diff --git usr.sbin/relayd/relay_http.c usr.sbin/relayd/relay_http.c
>> index 960d4c54a..3a6678790 100644
>> --- usr.sbin/relayd/relay_http.c
>> +++ usr.sbin/relayd/relay_http.c
>> @@ -524,9 +524,11 @@ relay_read_http(struct bufferevent *bev, void *arg)
>> 
>>/*
>> * Ask the server to close the connection after this request
>> - * since we don't read any further request headers.
>> + * since we don't read any further request headers, unless
>> + * an Upgrade is requested, in which case we do NOT want to add
>> + * this header.
>> */
>> -if (cre->toread == TOREAD_UNLIMITED)
>> +if (cre->toread == TOREAD_UNLIMITED && upgrade == NULL)
>>if (kv_add(>http_headers, "Connection",
>>"close", 0) == NULL)
>>goto fail;
>> 



Re: srp_finalize(9): tsleep(9) -> tsleep_nsec(9)

2020-12-04 Thread Scott Cheloha
On Fri, Dec 04, 2020 at 09:56:02AM +0100, Claudio Jeker wrote:
> On Thu, Dec 03, 2020 at 10:05:30PM -0600, Scott Cheloha wrote:
> > Hi,
> > 
> > srp_finalize(9) uses tsleep(9) to spin while it waits for the object's
> > refcount to reach zero.  It blocks for up to 1 tick and then checks
> > the refecount again and again.
> > 
> > We can just as easily do this with tsleep_nsec(9) and block for 1
> > millisecond per interval.
> > 
> > ok?
> > 
> > Index: kern_srp.c
> > ===
> > RCS file: /cvs/src/sys/kern/kern_srp.c,v
> > retrieving revision 1.12
> > diff -u -p -r1.12 kern_srp.c
> > --- kern_srp.c  8 Sep 2017 05:36:53 -   1.12
> > +++ kern_srp.c  4 Dec 2020 04:04:39 -
> > @@ -274,7 +274,7 @@ void
> >  srp_finalize(void *v, const char *wmesg)
> >  {
> > while (srp_referenced(v))
> > -   tsleep(v, PWAIT, wmesg, 1);
> > +   tsleep_nsec(v, PWAIT, wmesg, MSEC_TO_NSEC(1));
> >  }
> >  
> >  #else /* MULTIPROCESSOR */
> > 
> 
> Why only 1ms instead of the original 10ms (at least on most archs)?

The underlying implementation can only process timeouts from
hardclock(9) which runs about hz times per second.  If we tell the
thread to "sleep for 10ms" it's almost always going to overshoot the
next hardclock(9) and wind up sleeping ~20ms.

Some people run with HZ=1000 kernels.  I don't think many people run
with kernels with a higher HZ than that, though.  So I figure a 1ms
sleep is "good enough" for all practical kernels.  On HZ=100 kernels
the thread will oversleep because it doesn't process timeouts often
enough to honor the 1ms request.

Basically I'm trying to pick a reasonable polling interval (not too
fast) that also won't cause the existing default kernel to block for
longer than it already does (~10ms).  The default kernel is HZ=100, so
a 1ms sleep will, in this case, almost always sleep ~10ms per
iteration of this loop.

It's a bit of a chicken-and-egg problem.

Does that make any sense?



Re: mbg(4): tsleep(9) -> tsleep_nsec(9)

2020-12-04 Thread Scott Cheloha
On Fri, Dec 04, 2020 at 10:07:07AM +0100, Claudio Jeker wrote:
> On Thu, Dec 03, 2020 at 10:42:50PM -0600, Scott Cheloha wrote:
> > Hi,
> > 
> > mbg(4) is among the few remaining drivers using tsleep(9).
> > 
> > In a few spots, when the kernel is not cold, the driver will spin for
> > up to 1/10 seconds waiting for the MBG_BUSY flag to go low.
> > 
> > We can approximate this behavior by spinning 10 times and sleeping 10
> > milliseconds each iteration.  10 x 10ms = 100ms = 1/10 seconds.
> > 
> > I can't test this but I was able to compile it on amd64.  It isn't
> > normally built for amd64, though.  Just i386.
> > 
> > I have my doubts that anyone has this card and is able to actually
> > test this diff.
> > 
> > Anybody ok?
> 
> This code needs to wait for around 70us for the card to process the
> command (according to the comment). The cold code sleeps a max of
> 50 * 20us (1ms). I don't see why the regular code should sleep so much
> longer. I would suggest to use a 1ms timeout and loop 10 times. This is a
> magnitude more than enough and most probably only one cycle will be
> needed.
> 
> IIRC someone got a mbg(4) device some time ago apart from mbalmer@

Makes sense to me.  Updated diff attached.

How are we going to find this person?

Index: mbg.c
===
RCS file: /cvs/src/sys/dev/pci/mbg.c,v
retrieving revision 1.31
diff -u -p -r1.31 mbg.c
--- mbg.c   29 Nov 2020 03:17:27 -  1.31
+++ mbg.c   4 Dec 2020 18:07:43 -
@@ -417,12 +417,12 @@ mbg_read_amcc_s5920(struct mbg_softc *sc
 
/* wait for the BUSY flag to go low (approx 70 us on i386) */
timer = 0;
-   tmax = cold ? 50 : hz / 10;
+   tmax = cold ? 50 : 10;
do {
if (cold)
delay(20);
else
-   tsleep(tstamp, 0, "mbg", 1);
+   tsleep_nsec(tstamp, 0, "mbg", MSEC_TO_NSEC(1));
status = bus_space_read_1(sc->sc_iot, sc->sc_ioh,
AMCC_IMB4 + 3);
} while ((status & MBG_BUSY) && timer++ < tmax);
@@ -473,12 +473,12 @@ mbg_read_amcc_s5933(struct mbg_softc *sc
 
/* wait for the BUSY flag to go low (approx 70 us on i386) */
timer = 0;
-   tmax = cold ? 50 : hz / 10;
+   tmax = cold ? 50 : 10;
do {
if (cold)
delay(20);
else
-   tsleep(tstamp, 0, "mbg", 1);
+   tsleep_nsec(tstamp, 0, "mbg", MSEC_TO_NSEC(1));
status = bus_space_read_1(sc->sc_iot, sc->sc_ioh,
AMCC_IMB4 + 3);
} while ((status & MBG_BUSY) && timer++ < tmax);
@@ -525,12 +525,12 @@ mbg_read_asic(struct mbg_softc *sc, int 
 
/* wait for the BUSY flag to go low */
timer = 0;
-   tmax = cold ? 50 : hz / 10;
+   tmax = cold ? 50 : 10;
do {
if (cold)
delay(20);
else
-   tsleep(tstamp, 0, "mbg", 1);
+   tsleep_nsec(tstamp, 0, "mbg", MSEC_TO_NSEC(1));
status = bus_space_read_1(sc->sc_iot, sc->sc_ioh, ASIC_STATUS);
} while ((status & MBG_BUSY) && timer++ < tmax);
 



Re: Switch select(2) to kqueue-based implementation

2020-12-04 Thread Martin Pieuchot
On 30/11/20(Mon) 17:23, Martin Pieuchot wrote:
> On 30/11/20(Mon) 17:06, Visa Hankala wrote:
> > On Mon, Nov 30, 2020 at 01:28:14PM -0300, Martin Pieuchot wrote:
> > > I plan to commit this in 3 steps, to ease a possible revert:
> > >  - kevent(2) refactoring
> > >  - introduction of newer kq* APIs
> > >  - dopselect rewrite
> > 
> > Please send a separate patch for the first step.
> 
> Here's it.  Diff below changes kevent(2) to possibly call kqueue_scan()
> multiple times.  The same pattern is/will be used by select(2) and
> poll(2).
> 
> The copyout(2) and ktrace entry point have been moved to the syscall
> function.

visa@ noticed that clearing the timeout was redundant with the element
check in kqueue_scan() so move the comment there. 

Comments?  Oks?

Index: kern/kern_event.c
===
RCS file: /cvs/src/sys/kern/kern_event.c,v
retrieving revision 1.145
diff -u -p -r1.145 kern_event.c
--- kern/kern_event.c   25 Nov 2020 13:49:00 -  1.145
+++ kern/kern_event.c   1 Dec 2020 17:20:57 -
@@ -567,6 +567,7 @@ sys_kevent(struct proc *p, void *v, regi
struct timespec ts;
struct timespec *tsp = NULL;
int i, n, nerrors, error;
+   int ready, total;
struct kevent kev[KQ_NEVENTS];
 
if ((fp = fd_getfile(fdp, SCARG(uap, fd))) == NULL)
@@ -595,9 +596,9 @@ sys_kevent(struct proc *p, void *v, regi
kq = fp->f_data;
nerrors = 0;
 
-   while (SCARG(uap, nchanges) > 0) {
-   n = SCARG(uap, nchanges) > KQ_NEVENTS ?
-   KQ_NEVENTS : SCARG(uap, nchanges);
+   while ((n = SCARG(uap, nchanges)) > 0) {
+   if (n > nitems(kev))
+   n = nitems(kev);
error = copyin(SCARG(uap, changelist), kev,
n * sizeof(struct kevent));
if (error)
@@ -635,11 +636,30 @@ sys_kevent(struct proc *p, void *v, regi
 
kqueue_scan_setup(, kq);
FRELE(fp, p);
-   error = kqueue_scan(, SCARG(uap, nevents), SCARG(uap, eventlist),
-   tsp, kev, p, );
+   /*
+* Collect as many events as we can.  The timeout on successive
+* loops is disabled (kqueue_scan() becomes non-blocking).
+*/
+   total = 0;
+   error = 0;
+   while ((n = SCARG(uap, nevents) - total) > 0) {
+   if (n > nitems(kev))
+   n = nitems(kev);
+   ready = kqueue_scan(, n, kev, tsp, p, );
+   if (ready == 0)
+   break;
+   error = copyout(kev, SCARG(uap, eventlist) + total,
+   sizeof(struct kevent) * ready);
+#ifdef KTRACE
+   if (KTRPOINT(p, KTR_STRUCT))
+   ktrevent(p, kev, ready);
+#endif
+   total += ready;
+   if (error || ready < n)
+   break;
+   }
kqueue_scan_finish();
-
-   *retval = n;
+   *retval = total;
return (error);
 
  done:
@@ -893,22 +913,22 @@ kqueue_sleep(struct kqueue *kq, struct t
return (error);
 }
 
+/*
+ * Scan the kqueue, blocking if necessary until the target time is reached.
+ * If tsp is NULL we block indefinitely.  If tsp->ts_secs/nsecs are both
+ * 0 we do not block at all.
+ */
 int
 kqueue_scan(struct kqueue_scan_state *scan, int maxevents,
-struct kevent *ulistp, struct timespec *tsp, struct kevent *kev,
-struct proc *p, int *retval)
+struct kevent *kevp, struct timespec *tsp, struct proc *p, int *errorp)
 {
struct kqueue *kq = scan->kqs_kq;
-   struct kevent *kevp;
struct knote *kn;
-   int s, count, nkev, error = 0;
+   int s, count, nkev = 0, error = 0;
 
-   nkev = 0;
-   kevp = kev;
count = maxevents;
if (count == 0)
goto done;
-
 retry:
KASSERT(count == maxevents);
KASSERT(nkev == 0);
@@ -920,6 +940,10 @@ retry:
 
s = splhigh();
if (kq->kq_count == 0) {
+   /*
+* Successive loops are only necessary if there are more
+* ready events to gather, so they don't need to block.
+*/
if ((tsp != NULL && !timespecisset(tsp)) ||
scan->kqs_nevent != 0) {
splx(s);
@@ -958,14 +982,8 @@ retry:
while (count) {
kn = TAILQ_NEXT(>kqs_start, kn_tqe);
if (kn->kn_filter == EVFILT_MARKER) {
-   if (kn == >kqs_end) {
-   TAILQ_REMOVE(>kq_head, >kqs_start,
-   kn_tqe);
-   splx(s);
-   if (scan->kqs_nevent == 0)
-   goto retry;
-   goto done;
-   }
+   if (kn == >kqs_end)
+   break;
 
/* Move start 

Re: [PATCH] fixes relayd Websocket "Connection: close" header when Upgrade is requested

2020-12-04 Thread Marcus MERIGHI
Hello!

This patch wasn't commited and not discussed (publicly). 

It lets me use relayd as a front-end to the mattermost-server.

Just a friendly reminder...

@franz: Thank you!

Marcus

fr...@bett.ag (Franz Bettag), 2020.03.04 (Wed) 17:52 (CET):
> After migrating my home setup from nginx reverse proxying to relayd, i
> noticed my iOS devices having issues connecting through Websockets.
> After debugging, i noticed that relayd adds the "Connection: close"
> regardless of upgrade being requested.
> 
> This issue is also reported on a blog-post using relayd as a Websocket
> Client. https://deftly.net/posts/2019-10-23-websockets-with-relayd.html
> 
> This resulted in the response having two Connection Headers, one
> "Connection: upgrade" and one "Connection: close", which iOS strict
> implementation does not allow.
> 
> I have fixed the if condition that leads to this issue.
> 
> The fix has been tested with Apple iOS 13 and the problem can be
> observed using Firefox and just connecting to something Websocket over
> relayd. Both "Connection:" headers will appear.
> 
> best regards
> 
> Franz
> 
> 
> diff --git usr.sbin/relayd/relay_http.c usr.sbin/relayd/relay_http.c
> index 960d4c54a..3a6678790 100644
> --- usr.sbin/relayd/relay_http.c
> +++ usr.sbin/relayd/relay_http.c
> @@ -524,9 +524,11 @@ relay_read_http(struct bufferevent *bev, void *arg)
> 
>   /*
>* Ask the server to close the connection after this request
> -  * since we don't read any further request headers.
> +  * since we don't read any further request headers, unless
> +  * an Upgrade is requested, in which case we do NOT want to add
> +  * this header.
>*/
> - if (cre->toread == TOREAD_UNLIMITED)
> + if (cre->toread == TOREAD_UNLIMITED && upgrade == NULL)
>   if (kv_add(>http_headers, "Connection",
>   "close", 0) == NULL)
>   goto fail;
> 



Re: relax loopback rule for networks

2020-12-04 Thread Stuart Henderson
On 2020/12/04 12:36, Claudio Jeker wrote:
> In bgpd network inet static and network inet connected should skip
> networks that use 127.0.0.1 as gateway. (This is to prevent network inet
> static picking up reject routes like 224/4).
> This does not really make sense for network inet rtlabel "theones".
> Using rtlabels the operator is in control and can do the selection of
> routes carefully. Similar to rtlabel is priority.
> 
> Because of this weaken the route exclusion when networks are selected and
> only do it for static and connected filters.

I think this makes sense.

The only place where I announce networks with a fib entry pointing to
127.0.0.1 is when I'm announcing a default route to certain peers,
done with prefix-sets and previously with "network 0.0.0.0/0",
I guess it may also be used for distributing blackhole routes, in
all of those cases it would be with more specific distribution than
"network inet static" or "network inet connected".

OK.

> OK?
> -- 
> :wq Claudio
> 
> Index: kroute.c
> ===
> RCS file: /cvs/src/usr.sbin/bgpd/kroute.c,v
> retrieving revision 1.239
> diff -u -p -r1.239 kroute.c
> --- kroute.c  1 Oct 2019 08:57:48 -   1.239
> +++ kroute.c  4 Dec 2020 11:31:09 -
> @@ -110,7 +110,7 @@ int   kr6_delete(struct ktable *, struct k
>  int  krVPN4_delete(struct ktable *, struct kroute_full *, u_int8_t);
>  int  krVPN6_delete(struct ktable *, struct kroute_full *, u_int8_t);
>  void kr_net_delete(struct network *);
> -int  kr_net_match(struct ktable *, struct network_config *, u_int16_t);
> +int  kr_net_match(struct ktable *, struct network_config *, u_int16_t, int);
>  struct network *kr_net_find(struct ktable *, struct network *);
>  void kr_net_clear(struct ktable *);
>  void kr_redistribute(int, struct ktable *, struct kroute *);
> @@ -1318,7 +1318,8 @@ kr_net_redist_del(struct ktable *kt, str
>  }
>  
>  int
> -kr_net_match(struct ktable *kt, struct network_config *net, u_int16_t flags)
> +kr_net_match(struct ktable *kt, struct network_config *net, u_int16_t flags,
> +int loopback)
>  {
>   struct network  *xn;
>  
> @@ -1330,10 +1331,16 @@ kr_net_match(struct ktable *kt, struct n
>   /* static match already redistributed */
>   continue;
>   case NETWORK_STATIC:
> + /* Skip networks with nexthop on loopback. */
> + if (loopback)
> + continue;
>   if (flags & F_STATIC)
>   break;
>   continue;
>   case NETWORK_CONNECTED:
> + /* Skip networks with nexthop on loopback. */
> + if (loopback)
> + continue;
>   if (flags & F_CONNECTED)
>   break;
>   continue;
> @@ -1419,6 +1426,7 @@ kr_redistribute(int type, struct ktable 
>  {
>   struct network_confignet;
>   u_int32_ta;
> + int  loflag = 0;
>  
>   bzero(, sizeof(net));
>   net.prefix.aid = AID_INET;
> @@ -1449,9 +1457,9 @@ kr_redistribute(int type, struct ktable 
>   (a >> IN_CLASSA_NSHIFT) == IN_LOOPBACKNET)
>   return;
>  
> - /* Consider networks with nexthop loopback as not redistributable. */
> + /* Check if the nexthop is the loopback addr. */
>   if (kr->nexthop.s_addr == htonl(INADDR_LOOPBACK))
> - return;
> + loflag = 1;
>  
>   /*
>* never allow 0.0.0.0/0 the default route can only be redistributed
> @@ -1460,7 +1468,7 @@ kr_redistribute(int type, struct ktable 
>   if (kr->prefix.s_addr == INADDR_ANY && kr->prefixlen == 0)
>   return;
>  
> - if (kr_net_match(kt, , kr->flags) == 0)
> + if (kr_net_match(kt, , kr->flags, loflag) == 0)
>   /* no longer matches, if still present remove it */
>   kr_net_redist_del(kt, , 1);
>  }
> @@ -1468,7 +1476,8 @@ kr_redistribute(int type, struct ktable 
>  void
>  kr_redistribute6(int type, struct ktable *kt, struct kroute6 *kr6)
>  {
> - struct network_confignet;
> + struct network_config   net;
> + int loflag = 0;
>  
>   bzero(, sizeof(net));
>   net.prefix.aid = AID_INET6;
> @@ -1503,11 +1512,9 @@ kr_redistribute6(int type, struct ktable
>   IN6_IS_ADDR_V4COMPAT(>prefix))
>   return;
>  
> - /*
> -  * Consider networks with nexthop loopback as not redistributable.
> -  */
> + /* Check if the nexthop is the loopback addr. */
>   if (IN6_IS_ADDR_LOOPBACK(>nexthop))
> - return;
> + loflag = 1;
>  
>   /*
>* never allow ::/0 the default route can only be redistributed
> @@ -1517,7 +1524,7 @@ kr_redistribute6(int type, struct ktable
>   memcmp(>prefix, _any, 

Re: Use SMR_TAILQ for `ps_threads'

2020-12-04 Thread Martin Pieuchot
On 04/12/20(Fri) 12:01, Jonathan Matthew wrote:
> On Wed, Dec 02, 2020 at 11:41:04AM -0300, Martin Pieuchot wrote:
> > [...] 
> > Could you try the diff below that only call smr_barrier() for multi-
> > threaded processes with threads still in the list.  I guess this also
> > answers guenther@'s question.  The same could be done with smr_flush().
> 
> This removes the overhead, more or less.  Are we only looking at unlocking 
> access
> to ps_threads from within a process (not the sysctl or ptrace stuff)?  
> Otherwise
> this doesn't seem safe.

I'd argue that if `ps_thread' is being iterated the CPU doing the
iteration must already have a reference to the "struct process" so
the serialization should be done on this reference.

Now I doubt we'll be able to answer all the questions right now.  If we
can find a path forward that doesn't decrease performances too much and
allow us to move signal delivery and sleep out of the KERNEL_LOCK()
that's a huge win.



relax loopback rule for networks

2020-12-04 Thread Claudio Jeker
In bgpd network inet static and network inet connected should skip
networks that use 127.0.0.1 as gateway. (This is to prevent network inet
static picking up reject routes like 224/4).
This does not really make sense for network inet rtlabel "theones".
Using rtlabels the operator is in control and can do the selection of
routes carefully. Similar to rtlabel is priority.

Because of this weaken the route exclusion when networks are selected and
only do it for static and connected filters.

OK?
-- 
:wq Claudio

Index: kroute.c
===
RCS file: /cvs/src/usr.sbin/bgpd/kroute.c,v
retrieving revision 1.239
diff -u -p -r1.239 kroute.c
--- kroute.c1 Oct 2019 08:57:48 -   1.239
+++ kroute.c4 Dec 2020 11:31:09 -
@@ -110,7 +110,7 @@ int kr6_delete(struct ktable *, struct k
 intkrVPN4_delete(struct ktable *, struct kroute_full *, u_int8_t);
 intkrVPN6_delete(struct ktable *, struct kroute_full *, u_int8_t);
 void   kr_net_delete(struct network *);
-intkr_net_match(struct ktable *, struct network_config *, u_int16_t);
+intkr_net_match(struct ktable *, struct network_config *, u_int16_t, int);
 struct network *kr_net_find(struct ktable *, struct network *);
 void   kr_net_clear(struct ktable *);
 void   kr_redistribute(int, struct ktable *, struct kroute *);
@@ -1318,7 +1318,8 @@ kr_net_redist_del(struct ktable *kt, str
 }
 
 int
-kr_net_match(struct ktable *kt, struct network_config *net, u_int16_t flags)
+kr_net_match(struct ktable *kt, struct network_config *net, u_int16_t flags,
+int loopback)
 {
struct network  *xn;
 
@@ -1330,10 +1331,16 @@ kr_net_match(struct ktable *kt, struct n
/* static match already redistributed */
continue;
case NETWORK_STATIC:
+   /* Skip networks with nexthop on loopback. */
+   if (loopback)
+   continue;
if (flags & F_STATIC)
break;
continue;
case NETWORK_CONNECTED:
+   /* Skip networks with nexthop on loopback. */
+   if (loopback)
+   continue;
if (flags & F_CONNECTED)
break;
continue;
@@ -1419,6 +1426,7 @@ kr_redistribute(int type, struct ktable 
 {
struct network_confignet;
u_int32_ta;
+   int  loflag = 0;
 
bzero(, sizeof(net));
net.prefix.aid = AID_INET;
@@ -1449,9 +1457,9 @@ kr_redistribute(int type, struct ktable 
(a >> IN_CLASSA_NSHIFT) == IN_LOOPBACKNET)
return;
 
-   /* Consider networks with nexthop loopback as not redistributable. */
+   /* Check if the nexthop is the loopback addr. */
if (kr->nexthop.s_addr == htonl(INADDR_LOOPBACK))
-   return;
+   loflag = 1;
 
/*
 * never allow 0.0.0.0/0 the default route can only be redistributed
@@ -1460,7 +1468,7 @@ kr_redistribute(int type, struct ktable 
if (kr->prefix.s_addr == INADDR_ANY && kr->prefixlen == 0)
return;
 
-   if (kr_net_match(kt, , kr->flags) == 0)
+   if (kr_net_match(kt, , kr->flags, loflag) == 0)
/* no longer matches, if still present remove it */
kr_net_redist_del(kt, , 1);
 }
@@ -1468,7 +1476,8 @@ kr_redistribute(int type, struct ktable 
 void
 kr_redistribute6(int type, struct ktable *kt, struct kroute6 *kr6)
 {
-   struct network_confignet;
+   struct network_config   net;
+   int loflag = 0;
 
bzero(, sizeof(net));
net.prefix.aid = AID_INET6;
@@ -1503,11 +1512,9 @@ kr_redistribute6(int type, struct ktable
IN6_IS_ADDR_V4COMPAT(>prefix))
return;
 
-   /*
-* Consider networks with nexthop loopback as not redistributable.
-*/
+   /* Check if the nexthop is the loopback addr. */
if (IN6_IS_ADDR_LOOPBACK(>nexthop))
-   return;
+   loflag = 1;
 
/*
 * never allow ::/0 the default route can only be redistributed
@@ -1517,7 +1524,7 @@ kr_redistribute6(int type, struct ktable
memcmp(>prefix, _any, sizeof(struct in6_addr)) == 0)
return;
 
-   if (kr_net_match(kt, , kr6->flags) == 0)
+   if (kr_net_match(kt, , kr6->flags, loflag) == 0)
/* no longer matches, if still present remove it */
kr_net_redist_del(kt, , 1);
 }



Re: PF synproxy should act on inbound packets only

2020-12-04 Thread Alexander Bluhm
On Fri, Dec 04, 2020 at 01:08:53AM +0100, Alexandr Nedvedicky wrote:
> below is updated diff. The new diff also updates pf.conf(5) manpage.

OK bluhm@

A note for the man page.

> @@ -2126,6 +2126,9 @@ will not work if
>  .Xr pf 4
>  operates on a
>  .Xr bridge 4 .
> +Also
> +.Cm synproxy state
> +option acts on inbound packets only.

The synproxy rules are the subject of the previous sentence.  I
would not repeate synproxy state in one paragraph.  What about

Also they act on incoming SYN packets only.



Re: hvn(4): msleep(9) -> msleep_nsec(9)

2020-12-04 Thread Andre Stoebe
Hello Scott,

works fine here.

Regards,
Andre

OpenBSD 6.8-current (GENERIC.MP) #8: Fri Dec  4 10:21:19 CET 2020
an...@dev.stoe.be:/sys/arch/amd64/compile/GENERIC.MP
real mem = 4278124544 (4079MB)
avail mem = 4133175296 (3941MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.3 @ 0xf93d0 (338 entries)
bios0: vendor American Megatrends Inc. version "090008" date 12/07/2018
bios0: Microsoft Corporation Virtual Machine
acpi0 at bios0: ACPI 2.0
acpi0: sleep states S0 S5
acpi0: tables DSDT FACP WAET SLIC OEM0 SRAT APIC OEMB
acpi0: wakeup devices
acpitimer0 at acpi0: 3579545 Hz, 32 bits
acpihve0 at acpi0
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
ioapic0 at mainbus0: apid 0 pa 0xfec0, version 11, 24 pins, remapped
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Xeon(R) CPU E3-1245 v5 @ 3.50GHz, 2772.06 MHz, 06-5e-03
cpu0: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,FSGSBASE,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,RDSEED,ADX,SMAP,CLFLUSHOPT,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
cpu0: apic clock running at 156MHz
cpu1 at mainbus0: apid 1 (application processor)
cpu1: Intel(R) Xeon(R) CPU E3-1245 v5 @ 3.50GHz, 2746.97 MHz, 06-5e-03
cpu1: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,FSGSBASE,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,RDSEED,ADX,SMAP,CLFLUSHOPT,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN
cpu1: 256KB 64b/line 8-way L2 cache
cpu1: smt 0, core 1, package 0
cpu2 at mainbus0: apid 2 (application processor)
cpu2: Intel(R) Xeon(R) CPU E3-1245 v5 @ 3.50GHz, 2746.49 MHz, 06-5e-03
cpu2: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,FSGSBASE,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,RDSEED,ADX,SMAP,CLFLUSHOPT,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN
cpu2: 256KB 64b/line 8-way L2 cache
cpu2: smt 0, core 2, package 0
cpu3 at mainbus0: apid 3 (application processor)
cpu3: Intel(R) Xeon(R) CPU E3-1245 v5 @ 3.50GHz, 2746.53 MHz, 06-5e-03
cpu3: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,FSGSBASE,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,RDSEED,ADX,SMAP,CLFLUSHOPT,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN
cpu3: 256KB 64b/line 8-way L2 cache
cpu3: smt 0, core 3, package 0
acpiprt0 at acpi0: bus 0 (PCI0)
acpipci0 at acpi0 PCI0
acpicmos0 at acpi0
"VMBus" at acpi0 not configured
"Hyper_V_Gen_Counter_V1" at acpi0 not configured
acpicpu0 at acpi0: C1(@1 halt!)
acpicpu1 at acpi0: C1(@1 halt!)
acpicpu2 at acpi0: C1(@1 halt!)
acpicpu3 at acpi0: C1(@1 halt!)
cpu0: using VERW MDS workaround (except on vmm entry)
pvbus0 at mainbus0: Hyper-V 10.0
hyperv0 at pvbus0: protocol 4.0, features 0x2e7f
hyperv0: heartbeat, kvp, shutdown, timesync
hvs0 at hyperv0 channel 2: ide, protocol 6.2
scsibus1 at hvs0: 2 targets
sd0 at scsibus1 targ 0 lun 0:  
naa.600224802050cb43ae008c0f97857048
sd0: 65536MB, 512 bytes/sector, 134217728 sectors, thin
hvs1 at hyperv0 channel 15: scsi, protocol 6.2
scsibus2 at hvs1: 2 targets
hvn0 at hyperv0 channel 16: NVS 5.0 NDIS 6.30, address 00:15:5d:9a:72:2f
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 "Intel 82443BX" rev 0x03
pcib0 at pci0 dev 7 function 0 "Intel 82371AB PIIX4 ISA" rev 0x01
pciide0 at pci0 dev 7 function 1 "Intel 82371AB IDE" rev 0x01: DMA, channel 0 
wired to compatibility, channel 1 wired to compatibility
pciide0: channel 0 disabled (no drives)
atapiscsi0 at pciide0 channel 1 drive 0
scsibus3 at atapiscsi0: 2 targets
cd0 at scsibus3 targ 0 lun 0:  removable
cd0(pciide0:1:0): using PIO mode 4, DMA mode 2
piixpm0 at pci0 dev 7 function 3 "Intel 82371AB Power" rev 0x02: SMBus disabled
vga1 at pci0 dev 8 function 0 "Microsoft VGA" rev 0x00
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
isa0 at pcib0
isadma0 at isa0
fdc0 at isa0 port 0x3f0/6 irq 6 drq 2
com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
com1 at isa0 port 0x2f8/8 irq 3: ns16550a, 16 byte fifo
pckbc0 at isa0 port 0x60/5 irq 1 irq 12
pckbd0 at pckbc0 

Re: mbg(4): tsleep(9) -> tsleep_nsec(9)

2020-12-04 Thread Claudio Jeker
On Thu, Dec 03, 2020 at 10:42:50PM -0600, Scott Cheloha wrote:
> Hi,
> 
> mbg(4) is among the few remaining drivers using tsleep(9).
> 
> In a few spots, when the kernel is not cold, the driver will spin for
> up to 1/10 seconds waiting for the MBG_BUSY flag to go low.
> 
> We can approximate this behavior by spinning 10 times and sleeping 10
> milliseconds each iteration.  10 x 10ms = 100ms = 1/10 seconds.
> 
> I can't test this but I was able to compile it on amd64.  It isn't
> normally built for amd64, though.  Just i386.
> 
> I have my doubts that anyone has this card and is able to actually
> test this diff.
> 
> Anybody ok?

This code needs to wait for around 70us for the card to process the
command (according to the comment). The cold code sleeps a max of
50 * 20us (1ms). I don't see why the regular code should sleep so much
longer. I would suggest to use a 1ms timeout and loop 10 times. This is a
magnitude more than enough and most probably only one cycle will be
needed.

IIRC someone got a mbg(4) device some time ago apart from mbalmer@
 
> Index: mbg.c
> ===
> RCS file: /cvs/src/sys/dev/pci/mbg.c,v
> retrieving revision 1.31
> diff -u -p -r1.31 mbg.c
> --- mbg.c 29 Nov 2020 03:17:27 -  1.31
> +++ mbg.c 4 Dec 2020 04:39:59 -
> @@ -417,12 +417,12 @@ mbg_read_amcc_s5920(struct mbg_softc *sc
>  
>   /* wait for the BUSY flag to go low (approx 70 us on i386) */
>   timer = 0;
> - tmax = cold ? 50 : hz / 10;
> + tmax = cold ? 50 : 10;
>   do {
>   if (cold)
>   delay(20);
>   else
> - tsleep(tstamp, 0, "mbg", 1);
> + tsleep_nsec(tstamp, 0, "mbg", MSEC_TO_NSEC(10));
>   status = bus_space_read_1(sc->sc_iot, sc->sc_ioh,
>   AMCC_IMB4 + 3);
>   } while ((status & MBG_BUSY) && timer++ < tmax);
> @@ -473,12 +473,12 @@ mbg_read_amcc_s5933(struct mbg_softc *sc
>  
>   /* wait for the BUSY flag to go low (approx 70 us on i386) */
>   timer = 0;
> - tmax = cold ? 50 : hz / 10;
> + tmax = cold ? 50 : 10;
>   do {
>   if (cold)
>   delay(20);
>   else
> - tsleep(tstamp, 0, "mbg", 1);
> + tsleep_nsec(tstamp, 0, "mbg", MSEC_TO_NSEC(10));
>   status = bus_space_read_1(sc->sc_iot, sc->sc_ioh,
>   AMCC_IMB4 + 3);
>   } while ((status & MBG_BUSY) && timer++ < tmax);
> @@ -525,12 +525,12 @@ mbg_read_asic(struct mbg_softc *sc, int 
>  
>   /* wait for the BUSY flag to go low */
>   timer = 0;
> - tmax = cold ? 50 : hz / 10;
> + tmax = cold ? 50 : 10;
>   do {
>   if (cold)
>   delay(20);
>   else
> - tsleep(tstamp, 0, "mbg", 1);
> + tsleep_nsec(tstamp, 0, "mbg", MSEC_TO_NSEC(10));
>   status = bus_space_read_1(sc->sc_iot, sc->sc_ioh, ASIC_STATUS);
>   } while ((status & MBG_BUSY) && timer++ < tmax);
>  
> 

-- 
:wq Claudio



Re: srp_finalize(9): tsleep(9) -> tsleep_nsec(9)

2020-12-04 Thread Claudio Jeker
On Thu, Dec 03, 2020 at 10:05:30PM -0600, Scott Cheloha wrote:
> Hi,
> 
> srp_finalize(9) uses tsleep(9) to spin while it waits for the object's
> refcount to reach zero.  It blocks for up to 1 tick and then checks
> the refecount again and again.
> 
> We can just as easily do this with tsleep_nsec(9) and block for 1
> millisecond per interval.
> 
> ok?
> 
> Index: kern_srp.c
> ===
> RCS file: /cvs/src/sys/kern/kern_srp.c,v
> retrieving revision 1.12
> diff -u -p -r1.12 kern_srp.c
> --- kern_srp.c8 Sep 2017 05:36:53 -   1.12
> +++ kern_srp.c4 Dec 2020 04:04:39 -
> @@ -274,7 +274,7 @@ void
>  srp_finalize(void *v, const char *wmesg)
>  {
>   while (srp_referenced(v))
> - tsleep(v, PWAIT, wmesg, 1);
> + tsleep_nsec(v, PWAIT, wmesg, MSEC_TO_NSEC(1));
>  }
>  
>  #else /* MULTIPROCESSOR */
> 

Why only 1ms instead of the original 10ms (at least on most archs)?

-- 
:wq Claudio