Re: slaacd(8) improve movement between IPv6 & legacy-IP-only networks

2020-01-09 Thread Florian Obser
Hi Fernando,

On Thu, Jan 09, 2020 at 08:49:15AM -0300, Fernando Gont wrote:
> Hi, Pamela,
> 
> On 27/8/19 15:31, Pamela Mosiejczuk wrote:
> > Hi all,
> > 
> > When slaacd(8) detects a link-state change, it sets the preferred
> > lifetime (pltime) of all configured addresses to zero and tries to
> > obtain a new router advertisement.
> > 
> > If none is received after some time, slaacd will give up, but existing
> > configured IPv6 addresses remain. This leads applications to believe
> > that IPv6 connectivity is available when it is not, leading to
> > potentially long timeouts. RFC 6059 suggests removing these addresses.
> > 
> > Please test this with IPv6-enabled -> IPv6-enabled transitions (there
> > should be no regression) and IPv6-enabled -> legacy-only transitions
> > (things should be improved).
> > 
> > RFC 6059 has additional suggestions about what to do when moving among
> > various IPv6-enabled networks.
> 
> Just happened to see this (sorry for the bad timing!).
> 
> Note: It may be appropriate to remove existing addresses as soon as you
> realize you've connected to a different link (e.g., different SSID for
> wifi, or if you somehow detect a different net as per RFC6059).
> 
> However, a link-state change shouldn't, per se, lead to addresses
> being removed. Otherwise, you might break e.g. existing TCP connections
> upon transient network problems.

Pamela is currently busy with other things, so let me jump in here.
I helped test this and gave a bit of guidance on where things are in
slaacd(8), so I know a bit of the history.

This had been considered and was actually my biggest fear, so I always
refused to implement something like this. For example, what happens if
you hover right at the edge of wifi reception?

Pamela and others then convinced me to try and see, and operational
experience shows that this is not an issue at all. In fact the diff has
been received overwhelmingly positively. No issues have been reported
at all (and it has been in for a long time).

We don't just nuke addresses on link state change, we go through the
normal procedure of sending 3 router solicitations. Only if we receive
no response do we remove the addresses. This seems to be good enough.
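
In rough C, the shape of that logic is as follows -- a minimal sketch
only, with illustrative names, not slaacd's actual internals:

#define MAX_RTSOL_ATTEMPTS 3    /* the "3 router solicitations" above */

struct slaacd_iface;                                    /* opaque here */
void send_solicitation(struct slaacd_iface *);          /* hypothetical */
int  ra_received(struct slaacd_iface *);                /* hypothetical */
void remove_autoconf_addresses(struct slaacd_iface *);  /* hypothetical */

void
link_state_changed(struct slaacd_iface *iface)
{
    int i;

    for (i = 0; i < MAX_RTSOL_ATTEMPTS; i++) {
        send_solicitation(iface);
        if (ra_received(iface))
            return;    /* got an RA, addresses stay */
    }
    /* no answer after all solicitations: drop the addresses */
    remove_autoconf_addresses(iface);
}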

Of course there is still more to implement from RFC 6059, but it seems
the effects would not be big.

> 
> Since we are at it: heads-up:
> https://www.ietf.org/internet-drafts/draft-ietf-6man-rfc4941bis-05.txt
> 
> Thanks!
> 
> Cheers,
> -- 
> Fernando Gont
> e-mail: ferna...@gont.com.ar || fg...@si6networks.com
> PGP Fingerprint: 7809 84F5 322E 45C7 F1C9 3945 96EE A9EF D076 FFF1
> 
> 
> 

-- 
I'm not entirely sure you are real.



Re: man.cgi(8): turn off HTML5 autocomplete for the query input field

2020-01-09 Thread Landry Breuil
On Thu, Jan 09, 2020 at 09:18:25PM -0600, Tim Baumgard wrote:
> This turns off HTML5 autocomplete for the query input field for
> man.cgi(8). This essentially makes smartphones and tablets behave the
> same as PCs since they otherwise would try to annoyingly capitalize the
> first character and change things like "mandoc" to "man doctor."

hilarious.. 'smart'phones :)



Re: xlights(4): timeout_add(9) -> timeout_add_msec(9)

2020-01-09 Thread Jonathan Gray
On Wed, Dec 18, 2019 at 08:47:54PM -0600, Scott Cheloha wrote:
> 250 ticks at macppc's 100hz is 2500 milliseconds.
> 
> ok?

ok jsg@

> 
> Index: xlights.c
> ===
> RCS file: /cvs/src/sys/arch/macppc/dev/xlights.c,v
> retrieving revision 1.9
> diff -u -p -r1.9 xlights.c
> --- xlights.c 8 Oct 2019 13:21:38 -   1.9
> +++ xlights.c 19 Dec 2019 02:45:33 -
> @@ -282,7 +282,7 @@ xlights_startdma(struct xlights_softc *s
>   dbdma_command_t *cmdp = sc->sc_dmacmd;
>  
>   sc->sc_dmasts = 1;
> - timeout_add(&sc->sc_tmo, 250);
> + timeout_add_msec(&sc->sc_tmo, 2500);
>  
>   DBDMA_BUILD(cmdp, DBDMA_CMD_OUT_LAST, 0,
>   sc->sc_bufmap->dm_segs[0].ds_len,
> 
> 



Re: pluart(4): timeout_add(9) -> timeout_add_sec(9)

2020-01-09 Thread Scott Cheloha
On Fri, Jan 10, 2020 at 02:57:41PM +1100, Jonathan Gray wrote:
> On Sun, Dec 22, 2019 at 03:35:09PM -0600, Scott Cheloha wrote:
> > ok?
> > 
> > Index: pluart.c
> > ===
> > RCS file: /cvs/src/sys/dev/ic/pluart.c,v
> > retrieving revision 1.4
> > diff -u -p -r1.4 pluart.c
> > --- pluart.c11 Aug 2019 10:03:32 -  1.4
> > +++ pluart.c22 Dec 2019 21:32:50 -
> > @@ -225,7 +225,7 @@ pluart_intr(void *arg)
> > if (p >= sc->sc_ibufend) {
> > sc->sc_floods++;
> > if (sc->sc_errors++ == 0)
> > -   timeout_add(&sc->sc_diag_tmo, 60 * hz);
> > +   timeout_add_sec(&sc->sc_diag_tmo, 60);
> > } else {
> > *p++ = c;
> > if (p == sc->sc_ibufhigh && ISSET(tp->t_cflag, 
> > CRTSCTS)) {
> > @@ -428,7 +428,7 @@ pluart_softint(void *arg)
> > if (ISSET(c, IMXUART_RX_OVERRUN)) {
> > sc->sc_overflows++;
> > if (sc->sc_errors++ == 0)
> > -   timeout_add(&sc->sc_diag_tmo, 60 * hz);
> > +   timeout_add_sec(&sc->sc_diag_tmo, 60);
> > }
> > */
> > /* This is ugly, but fast. */
> > @@ -605,7 +605,7 @@ pluartclose(dev_t dev, int flag, int mod
> > /* tty device is waiting for carrier; drop dtr then re-raise */
> > //CLR(sc->sc_ucr3, IMXUART_CR3_DSR);
> > //bus_space_write_4(iot, ioh, IMXUART_UCR3, sc->sc_ucr3);
> > -   timeout_add(&sc->sc_dtr_tmo, hz * 2);
> > +   timeout_add_sec(&sc->sc_dtr_tmo, 2);
> > } else {
> > /* no one else waiting; turn off the uart */
> > pluart_pwroff(sc);
> > 
> > 
> 
> ok jsg@
> 
> same diff for exuart

That diff below is what I came up with here too; I just wasn't sure
whether that code was dead or not.

ok cheloha@

> 
> Index: exuart.c
> ===
> RCS file: /cvs/src/sys/arch/armv7/exynos/exuart.c,v
> retrieving revision 1.16
> diff -u -p -r1.16 exuart.c
> --- exuart.c  19 Jul 2019 00:17:15 -  1.16
> +++ exuart.c  10 Jan 2020 03:52:57 -
> @@ -283,7 +283,7 @@ exuart_intr(void *arg)
>   if (p >= sc->sc_ibufend) {
>   sc->sc_floods++;
>   if (sc->sc_errors++ == 0)
> - timeout_add(&sc->sc_diag_tmo, 60 * hz);
> + timeout_add_sec(&sc->sc_diag_tmo, 60);
>   } else {
>   *p++ = c;
>   if (p == sc->sc_ibufhigh && ISSET(tp->t_cflag, CRTSCTS))
> @@ -710,7 +710,7 @@ exuartclose(dev_t dev, int flag, int mod
>   /* tty device is waiting for carrier; drop dtr then re-raise */
>   //CLR(sc->sc_ucr3, EXUART_CR3_DSR);
>   //bus_space_write_2(iot, ioh, EXUART_UCR3, sc->sc_ucr3);
> - timeout_add(&sc->sc_dtr_tmo, hz * 2);
> + timeout_add_sec(&sc->sc_dtr_tmo, 2);
>   } else {
>   /* no one else waiting; turn off the uart */
>   exuart_pwroff(sc);



ubsec(4): timeout_add(9) -> timeout_add_msec(9)

2020-01-09 Thread Scott Cheloha
The intent here is to poll no faster than 100 times a second,
so change sc_rnghz to sc_rngms and set it to 10 milliseconds,
then call timeout_add_msec() instead of timeout_add().

ok?
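
For the conversion itself: 100 polls per second means a period of
1000 / 100 = 10 milliseconds, which is exactly what the old hz / 100
tick count expressed. A standalone sketch of the equivalence (nothing
here is driver code):

/* old: period in ticks, clamped to at least one tick */
int
rng_period_ticks(int hz)
{
    return (hz >= 100) ? hz / 100 : 1;
}

/* new: the same period expressed directly in milliseconds */
int
rng_period_msec(void)
{
    return 1000 / 100;    /* 10 ms */
}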

Index: ubsec.c
===
RCS file: /cvs/src/sys/dev/pci/ubsec.c,v
retrieving revision 1.164
diff -u -p -r1.164 ubsec.c
--- ubsec.c 28 Apr 2018 15:44:59 -  1.164
+++ ubsec.c 10 Jan 2020 04:04:01 -
@@ -318,11 +318,8 @@ ubsec_attach(struct device *parent, stru
}
 
timeout_set(&sc->sc_rngto, ubsec_rng, sc);
-   if (hz >= 100)
-   sc->sc_rnghz = hz / 100;
-   else
-   sc->sc_rnghz = 1;
-   timeout_add(&sc->sc_rngto, sc->sc_rnghz);
+   sc->sc_rngms = 10;
+   timeout_add_msec(&sc->sc_rngto, sc->sc_rngms);
printf(" RNG");
 skip_rng:
;
@@ -1455,7 +1452,7 @@ ubsec_callback2(struct ubsec_softc *sc, 
for (i = 0; i < UBSEC_RNG_BUFSIZ; p++, i++)
enqueue_randomness(*p);
rng->rng_used = 0;
-   timeout_add(&sc->sc_rngto, sc->sc_rnghz);
+   timeout_add_msec(&sc->sc_rngto, sc->sc_rngms);
break;
}
 #endif
@@ -1530,7 +1527,7 @@ out:
 */
(*nqueue)--;
splx(s);
-   timeout_add(&sc->sc_rngto, sc->sc_rnghz);
+   timeout_add_msec(&sc->sc_rngto, sc->sc_rngms);
 }
 #endif /* UBSEC_NO_RNG */
 
Index: ubsecvar.h
===
RCS file: /cvs/src/sys/dev/pci/ubsecvar.h,v
retrieving revision 1.40
diff -u -p -r1.40 ubsecvar.h
--- ubsecvar.h  15 Aug 2014 15:37:51 -  1.40
+++ ubsecvar.h  10 Jan 2020 04:04:01 -
@@ -147,7 +147,7 @@ struct ubsec_softc {
int sc_nsessions;   /* # of sessions */
struct ubsec_session    *sc_sessions;   /* sessions */
struct timeout  sc_rngto;   /* rng timeout */
-   int sc_rnghz;   /* rng poll time */
+   int sc_rngms;   /* rng poll time (msecs) */
struct ubsec_q2_rng sc_rng;
struct ubsec_dma    sc_dmaa[UBS_MAX_NQUEUE];
struct ubsec_q  *sc_queuea[UBS_MAX_NQUEUE];



Re: pluart(4): timeout_add(9) -> timeout_add_sec(9)

2020-01-09 Thread Jonathan Gray
On Sun, Dec 22, 2019 at 03:35:09PM -0600, Scott Cheloha wrote:
> ok?
> 
> Index: pluart.c
> ===
> RCS file: /cvs/src/sys/dev/ic/pluart.c,v
> retrieving revision 1.4
> diff -u -p -r1.4 pluart.c
> --- pluart.c  11 Aug 2019 10:03:32 -  1.4
> +++ pluart.c  22 Dec 2019 21:32:50 -
> @@ -225,7 +225,7 @@ pluart_intr(void *arg)
>   if (p >= sc->sc_ibufend) {
>   sc->sc_floods++;
>   if (sc->sc_errors++ == 0)
> - timeout_add(&sc->sc_diag_tmo, 60 * hz);
> + timeout_add_sec(&sc->sc_diag_tmo, 60);
>   } else {
>   *p++ = c;
>   if (p == sc->sc_ibufhigh && ISSET(tp->t_cflag, 
> CRTSCTS)) {
> @@ -428,7 +428,7 @@ pluart_softint(void *arg)
>   if (ISSET(c, IMXUART_RX_OVERRUN)) {
>   sc->sc_overflows++;
>   if (sc->sc_errors++ == 0)
> - timeout_add(&sc->sc_diag_tmo, 60 * hz);
> + timeout_add_sec(&sc->sc_diag_tmo, 60);
>   }
>   */
>   /* This is ugly, but fast. */
> @@ -605,7 +605,7 @@ pluartclose(dev_t dev, int flag, int mod
>   /* tty device is waiting for carrier; drop dtr then re-raise */
>   //CLR(sc->sc_ucr3, IMXUART_CR3_DSR);
>   //bus_space_write_4(iot, ioh, IMXUART_UCR3, sc->sc_ucr3);
> - timeout_add(&sc->sc_dtr_tmo, hz * 2);
> + timeout_add_sec(&sc->sc_dtr_tmo, 2);
>   } else {
>   /* no one else waiting; turn off the uart */
>   pluart_pwroff(sc);
> 
> 

ok jsg@

same diff for exuart

Index: exuart.c
===
RCS file: /cvs/src/sys/arch/armv7/exynos/exuart.c,v
retrieving revision 1.16
diff -u -p -r1.16 exuart.c
--- exuart.c19 Jul 2019 00:17:15 -  1.16
+++ exuart.c10 Jan 2020 03:52:57 -
@@ -283,7 +283,7 @@ exuart_intr(void *arg)
if (p >= sc->sc_ibufend) {
sc->sc_floods++;
if (sc->sc_errors++ == 0)
-   timeout_add(&sc->sc_diag_tmo, 60 * hz);
+   timeout_add_sec(&sc->sc_diag_tmo, 60);
} else {
*p++ = c;
if (p == sc->sc_ibufhigh && ISSET(tp->t_cflag, CRTSCTS))
@@ -710,7 +710,7 @@ exuartclose(dev_t dev, int flag, int mod
/* tty device is waiting for carrier; drop dtr then re-raise */
//CLR(sc->sc_ucr3, EXUART_CR3_DSR);
//bus_space_write_2(iot, ioh, EXUART_UCR3, sc->sc_ucr3);
-   timeout_add(&sc->sc_dtr_tmo, hz * 2);
+   timeout_add_sec(&sc->sc_dtr_tmo, 2);
} else {
/* no one else waiting; turn off the uart */
exuart_pwroff(sc);



Re: pluart(4): timeout_add(9) -> timeout_add_sec(9)

2020-01-09 Thread Scott Cheloha
On Sun, Dec 22, 2019 at 03:35:09PM -0600, Scott Cheloha wrote:
> ok?

Bump.

> Index: pluart.c
> ===
> RCS file: /cvs/src/sys/dev/ic/pluart.c,v
> retrieving revision 1.4
> diff -u -p -r1.4 pluart.c
> --- pluart.c  11 Aug 2019 10:03:32 -  1.4
> +++ pluart.c  22 Dec 2019 21:32:50 -
> @@ -225,7 +225,7 @@ pluart_intr(void *arg)
>   if (p >= sc->sc_ibufend) {
>   sc->sc_floods++;
>   if (sc->sc_errors++ == 0)
> - timeout_add(&sc->sc_diag_tmo, 60 * hz);
> + timeout_add_sec(&sc->sc_diag_tmo, 60);
>   } else {
>   *p++ = c;
>   if (p == sc->sc_ibufhigh && ISSET(tp->t_cflag, 
> CRTSCTS)) {
> @@ -428,7 +428,7 @@ pluart_softint(void *arg)
>   if (ISSET(c, IMXUART_RX_OVERRUN)) {
>   sc->sc_overflows++;
>   if (sc->sc_errors++ == 0)
> - timeout_add(&sc->sc_diag_tmo, 60 * hz);
> + timeout_add_sec(&sc->sc_diag_tmo, 60);
>   }
>   */
>   /* This is ugly, but fast. */
> @@ -605,7 +605,7 @@ pluartclose(dev_t dev, int flag, int mod
>   /* tty device is waiting for carrier; drop dtr then re-raise */
>   //CLR(sc->sc_ucr3, IMXUART_CR3_DSR);
>   //bus_space_write_4(iot, ioh, IMXUART_UCR3, sc->sc_ucr3);
> - timeout_add(&sc->sc_dtr_tmo, hz * 2);
> + timeout_add_sec(&sc->sc_dtr_tmo, 2);
>   } else {
>   /* no one else waiting; turn off the uart */
>   pluart_pwroff(sc);



Re: xlights(4): timeout_add(9) -> timeout_add_msec(9)

2020-01-09 Thread Scott Cheloha
On Wed, Dec 18, 2019 at 08:47:54PM -0600, Scott Cheloha wrote:
> 250 ticks at macppc's 100hz is 2500 milliseconds.
> 
> ok?

Bump.

> Index: xlights.c
> ===
> RCS file: /cvs/src/sys/arch/macppc/dev/xlights.c,v
> retrieving revision 1.9
> diff -u -p -r1.9 xlights.c
> --- xlights.c 8 Oct 2019 13:21:38 -   1.9
> +++ xlights.c 19 Dec 2019 02:45:33 -
> @@ -282,7 +282,7 @@ xlights_startdma(struct xlights_softc *s
>   dbdma_command_t *cmdp = sc->sc_dmacmd;
>  
>   sc->sc_dmasts = 1;
> - timeout_add(&sc->sc_tmo, 250);
> + timeout_add_msec(&sc->sc_tmo, 2500);
>  
>   DBDMA_BUILD(cmdp, DBDMA_CMD_OUT_LAST, 0,
>   sc->sc_bufmap->dm_segs[0].ds_len,



Re: fd(4): tsleep(9) -> tsleep_nsec(9), timeout_add(9) -> timeout_add_msec(9)

2020-01-09 Thread Scott Cheloha
On Tue, Jan 07, 2020 at 08:03:27AM -, Miod Vallat wrote:
> 
> > Convert ticks to milliseconds.
> >
> > The 33 milliseconds in the timeout_add_msec(9) call will be truncated
> > to 3 ticks, but that was already the case with the current (hz / 30)
> > code so this is no worse.
> 
> [...]
> 
> > /* allow 1/30 second for heads to settle */
> > -   timeout_add(&fdc->fdcpseudointr_to, hz / 30);
> > +   timeout_add_msec(&fdc->fdcpseudointr_to, 33);
> 
> Why not simply write `1000 / 30' rather than 33, to match the comment?

Eh, sometimes when I see that stuff I wish I didn't have to do the
integer division in my head.

Not that big of a deal though.  Sure.
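
For anyone double-checking the arithmetic, a small standalone program
(assuming hz = 100, as discussed) shows why both spellings land on the
same tick count:

#include <stdio.h>

int
main(void)
{
    int hz = 100;

    printf("old: %d ticks\n", hz / 30);          /* 3 */
    printf("new arg: %d msec\n", 1000 / 30);     /* 33 */
    printf("as ticks: %d\n", 33 * hz / 1000);    /* 3 again */
    return 0;
}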

Index: dev/isa/fd.c
===
RCS file: /cvs/src/sys/dev/isa/fd.c,v
retrieving revision 1.106
diff -u -p -r1.106 fd.c
--- dev/isa/fd.c30 Dec 2017 23:08:29 -  1.106
+++ dev/isa/fd.c10 Jan 2020 03:21:35 -
@@ -210,11 +210,11 @@ fdprobe(struct device *parent, void *mat
/* select drive and turn on motor */
bus_space_write_1(iot, ioh, fdout, drive | FDO_FRST | FDO_MOEN(drive));
/* wait for motor to spin up */
-   tsleep(fdc, 0, "fdprobe", 250 * hz / 1000);
+   tsleep_nsec(fdc, 0, "fdprobe", MSEC_TO_NSEC(250));
out_fdc(iot, ioh, NE7CMD_RECAL);
out_fdc(iot, ioh, drive);
/* wait for recalibrate */
-   tsleep(fdc, 0, "fdprobe", 2000 * hz / 1000);
+   tsleep_nsec(fdc, 0, "fdprobe", MSEC_TO_NSEC(2000));
out_fdc(iot, ioh, NE7CMD_SENSEI);
n = fdcresult(fdc);
 #ifdef FD_DEBUG
@@ -228,7 +228,7 @@ fdprobe(struct device *parent, void *mat
 #endif
 
/* turn off motor */
-   tsleep(fdc, 0, "fdprobe", 250 * hz / 1000);
+   tsleep_nsec(fdc, 0, "fdprobe", MSEC_TO_NSEC(250));
bus_space_write_1(iot, ioh, fdout, FDO_FRST);
 
/* flags & 0x20 forces the drive to be found even if it won't probe */
@@ -900,7 +900,7 @@ loop:
timeout_del(&fdc->fdtimeout_to);
fdc->sc_state = RECALCOMPLETE;
/* allow 1/30 second for heads to settle */
-   timeout_add(&fdc->fdcpseudointr_to, hz / 30);
+   timeout_add_msec(&fdc->fdcpseudointr_to, 1000 / 30);
return 1;   /* will return later */
 
case RECALCOMPLETE:



man.cgi(8): turn off HTML5 autocomplete for the query input field

2020-01-09 Thread Tim Baumgard
This turns off HTML5 autocomplete for the query input field for
man.cgi(8). This essentially makes smartphones and tablets behave the
same as PCs since they otherwise would try to annoyingly capitalize the
first character and change things like "mandoc" to "man doctor."

Tim


Index: cgi.c
===
RCS file: /cvs/src/usr.bin/mandoc/cgi.c,v
retrieving revision 1.107
diff -u -p -r1.107 cgi.c
--- cgi.c10 Nov 2019 22:18:01 -1.107
+++ cgi.c10 Jan 2020 02:55:23 -
@@ -417,7 +417,7 @@ resp_searchform(const struct req *req, e
 printf("<input type=\"search\" name=\"query\" value=\"");
 if (req->q.query != NULL)
 html_print(req->q.query);
-printf( "\" size=\"40\"");
+printf( "\" autocomplete=\"off\" size=\"40\"");
 if (focus == FOCUS_QUERY)
 printf(" autofocus");
 puts(">");



Re: dangling vnode panic

2020-01-09 Thread Todd C . Miller
On Thu, 09 Jan 2020 23:03:37 +0100, Alexander Bluhm wrote:

> That is why I have to keep the list alive while flushing it out.
> So I came to the TAILQ solution.

It seems like the best solution right now.

> > The alternative would be to add another loop around the list processing. I
> > think that's worse because we don't know how many times to keep looping.
>
> There might be a race when TAILQ_FOREACH_SAFE reaches the end, next
> is NULL, but during the final sleep a new vnode is inserted.  Then
> we may fail to flush it.
>
> I think vflush() should use a while(!TAILQ_EMPTY()) logic.  But I
> don't want to change that in a huge diff.  There are other _SAFE
> macros that are not needed or even unsafe.  I want to convert to
> TAILQ first to be more flexible.  Reversing the list is enough to
> fix my existing panic.

That makes sense.

> > bluhm's diff is the best approach imo. ok me.
>
> I also think this is the easiest step in the right direction.

OK millert@

 - todd



Re: dangling vnode panic

2020-01-09 Thread Alexander Bluhm
On Thu, Jan 09, 2020 at 01:22:21PM -0500, Ted Unangst wrote:
> Martin Pieuchot wrote:
> > On 09/01/20(Thu) 10:46, Alexander Bluhm wrote:
> > > Without this diff my regress machines became so unstable, that they
> > > often do not finish the test run.
> > >
> > > With this diff, they run fine.  No regressions.
> >
> > Sadly this is a workaround.  What is the issue?  We're missing a barrier
> > during umount?  How do other BSDs solve that?
> >
> > If your diff works because you changed a TAIL vs a HEAD insert and you
> > now know how to reproduce it, you could capture stack traces from the function
> > doing the insert.  Is it insmntque()?  Could we prevent adding the vnode
> > to the list if a dying flag is on the mount point?
>
> I don't think that's possible.  Syncing the filesystem can cause more work to
> be queued. If you prevent adding the vnode to the list, how will that work be
> done?

Exactly!  My first approach was to add a flag to prevent list
insertion during unmount.  I have added the diff below for reference.
But it does not work.

The first problem is that checkalias() calls insmntque(vp, mp).  It
does not provide an error path, but that could be fixed by changing
all callers in the file system specific code.  Also, creating a
device alias during unmount is unlikely.

But it fails miserably with softdeps.  When a buffer gets flushed,
new vnodes are allocated and added to the list.  As my diff prevents
vnode allocation during unmount, the metadata is not updated.  So
you end up with a clean file system where fsck -f finds wrong reference
counters.

That is why I have to keep the list alive while flushing it out.
So I came to the TAILQ solution.

> The alternative would be to add another loop around the list processing. I
> think that's worse because we don't know how many times to keep looping.

There might be a race when TAILQ_FOREACH_SAFE reaches the end, next
is NULL, but during the final sleep a new vnode is inserted.  Then
we may fail to flush it.

I think vflush() should use a while(!TAILQ_EMPTY()) logic.  But I
don't want to change that in a huge diff.  There are other _SAFE
macros that are not needed or even unsafe.  I want to convert to
TAILQ first to be more flexible.  Reversing the list is enough to
fix my existing panic.
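
The while(!TAILQ_EMPTY()) shape would look roughly like this -- an
illustrative sketch with hypothetical helpers, not the real vflush():

struct mount;
struct vnode;
struct vnode *first_dirty(struct mount *);   /* hypothetical */
void flush_one(struct vnode *);              /* may sleep */

void
flush_all(struct mount *mp)
{
    struct vnode *vp;

    /*
     * Keep draining until the list is really empty, even if
     * flush_one() sleeps and new dirty vnodes get appended to
     * the tail in the meantime.
     */
    while ((vp = first_dirty(mp)) != NULL)
        flush_one(vp);
}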

> bluhm's diff is the best approach imo. ok me.

I also think this is the easiest step in the right direction.

bluhm

Index: kern/vfs_subr.c
===
RCS file: /data/mirror/openbsd/cvs/src/sys/kern/vfs_subr.c,v
retrieving revision 1.297
diff -u -p -r1.297 vfs_subr.c
--- kern/vfs_subr.c 30 Dec 2019 13:49:40 -  1.297
+++ kern/vfs_subr.c 3 Jan 2020 13:13:42 -
@@ -401,6 +401,11 @@ getnewvnode(enum vtagtype tag, struct mo
toggle = 0;

s = splbio();
+   if (mp != NULL && mp->mnt_flag & MNT_FORCE) {
+   splx(s);
+   *vpp = NULLVP;
+   return (EIO);
+   }
if ((numvnodes < maxvnodes) ||
((TAILQ_FIRST(listhd = &vnode_free_list) == NULL) &&
((TAILQ_FIRST(listhd = &vnode_hold_list) == NULL) || toggle))) {
@@ -425,7 +430,7 @@ getnewvnode(enum vtagtype tag, struct mo
if (vp == NULL) {
splx(s);
tablefull("vnode");
-   *vpp = 0;
+   *vpp = NULLVP;
return (ENFILE);
}

@@ -480,8 +485,12 @@ insmntque(struct vnode *vp, struct mount
/*
 * Insert into list of vnodes for the new mount point, if available.
 */
-   if ((vp->v_mount = mp) != NULL)
+   if ((vp->v_mount = mp) != NULL) {
+   /* Do not insert new vnodes while forcefully unmounting. */
+   if (mp->mnt_flag & MNT_FORCE)
+   printf("%s: insert vnode during unmount\n", __func__);
LIST_INSERT_HEAD(&mp->mnt_vnodelist, vp, v_mntvnodes);
+   }
 }

 /*
Index: kern/vfs_syscalls.c
===
RCS file: /data/mirror/openbsd/cvs/src/sys/kern/vfs_syscalls.c,v
retrieving revision 1.338
diff -u -p -r1.338 vfs_syscalls.c
--- kern/vfs_syscalls.c 29 Nov 2019 20:58:17 -  1.338
+++ kern/vfs_syscalls.c 1 Jan 2020 17:49:55 -
@@ -477,6 +477,9 @@ dounmount_leaf(struct mount *mp, int fla
int error;
int hadsyncer = 0;

+   KASSERT(vfs_isbusy(mp));
+   if (flags & MNT_FORCE)
+   mp->mnt_flag |= MNT_FORCE;
mp->mnt_flag &=~ MNT_ASYNC;
cache_purgevfs(mp); /* remove cache entries for this file sys */
if (mp->mnt_syncer != NULL) {
@@ -501,6 +504,7 @@ dounmount_leaf(struct mount *mp, int fla
if (error && !(flags & MNT_DOOMED)) {
if ((mp->mnt_flag & MNT_RDONLY) == 0 && hadsyncer)
(void) vfs_allocate_syncvnode(mp);
+   mp->mnt_flag &=~ MNT_FORCE;
vfs_unbusy(mp);

Re: small bgpd performance improvement

2020-01-09 Thread Sebastian Benoit
nice, ok

Claudio Jeker(cje...@diehard.n-r-g.com) on 2020.01.09 16:25:58 +0100:
> The path_hash function is called reasonably often. Calling
> SipHash24_Update() over and over for small data is not optimal.
> Inspired by /sys/sys/proc.h, add an aspath_hashstart and aspath_hashend to
> the struct rde_aspath and use that for the hash.
> 
> -- 
> :wq Claudio
> 
> Index: rde.h
> ===
> RCS file: /cvs/src/usr.sbin/bgpd/rde.h,v
> retrieving revision 1.230
> diff -u -p -r1.230 rde.h
> --- rde.h 9 Jan 2020 13:31:52 -   1.230
> +++ rde.h 9 Jan 2020 15:12:47 -
> @@ -215,14 +215,16 @@ struct rde_aspath {
>   struct attr **others;
>   struct aspath   *aspath;
>   u_int64_t    hash;
> + int  refcnt;
>   u_int32_t    flags; /* internally used */
> +#define  aspath_hashstart    med
>   u_int32_t    med;   /* multi exit disc */
>   u_int32_t    lpref; /* local pref */
>   u_int32_t    weight;    /* low prio lpref */
> - int  refcnt;
>   u_int16_t    rtlabelid; /* route label id */
>   u_int16_t    pftableid; /* pf table id */
>   u_int8_t origin;
> +#define  aspath_hashend  others_len
>   u_int8_t others_len;
>  };
>  
> Index: rde_rib.c
> ===
> RCS file: /cvs/src/usr.sbin/bgpd/rde_rib.c,v
> retrieving revision 1.212
> diff -u -p -r1.212 rde_rib.c
> --- rde_rib.c 9 Jan 2020 11:55:25 -   1.212
> +++ rde_rib.c 9 Jan 2020 15:14:27 -
> @@ -701,12 +701,8 @@ path_hash(struct rde_aspath *asp)
>   u_int64_t   hash;
>  
>   SipHash24_Init(&ctx, &pathtablekey);
> - SipHash24_Update(&ctx, &asp->origin, sizeof(asp->origin));
> - SipHash24_Update(&ctx, &asp->med, sizeof(asp->med));
> - SipHash24_Update(&ctx, &asp->lpref, sizeof(asp->lpref));
> - SipHash24_Update(&ctx, &asp->weight, sizeof(asp->weight));
> - SipHash24_Update(&ctx, &asp->rtlabelid, sizeof(asp->rtlabelid));
> - SipHash24_Update(&ctx, &asp->pftableid, sizeof(asp->pftableid));
> + SipHash24_Update(&ctx, &asp->aspath_hashstart,
> + (char *)&asp->aspath_hashend - (char *)&asp->aspath_hashstart);
>  
>   if (asp->aspath)
>   SipHash24_Update(&ctx, asp->aspath->data, asp->aspath->len);
> 



Re: bgpd softreconfigure and export default

2020-01-09 Thread Sebastian Benoit
Claudio Jeker(cje...@diehard.n-r-g.com) on 2020.01.09 16:05:06 +0100:
> When using 'export default-route' you still need an output filter to allow
> the default route out. I'm probably not the only one forgetting this fact
> from time to time. Now, to make things worse, adding the filter rule to
> allow the route plus a config reload does not work, since the
> softreconfigure code does not handle the export default-route case
> correctly.

ok
 
> The following diff fixes this. There is still the problem that changing
> the 'export default-route' setting itself needs a session reset but that
> one is less easy to tackle :(
>
> -- 
> :wq Claudio
> 
> Index: rde.c
> ===
> RCS file: /cvs/src/usr.sbin/bgpd/rde.c,v
> retrieving revision 1.498
> diff -u -p -r1.498 rde.c
> --- rde.c 9 Jan 2020 13:31:52 -   1.498
> +++ rde.c 9 Jan 2020 14:42:37 -
> @@ -3221,11 +3221,25 @@ rde_softreconfig_in_done(void *arg, u_in
>   }
>  
>   LIST_FOREACH(peer, &peerlist, peer_l) {
> - if (peer->reconf_out)
> - rib_byid(peer->loc_rib_id)->state = RECONF_RELOAD;
> - else if (peer->reconf_rib) {
> - u_int8_t aid;
> + u_int8_t aid;
>  
> + if (peer->reconf_out) {
> + if (peer->conf.export_type == EXPORT_NONE) {
> + /* nothing to do here */
> + peer->reconf_out = 0;
> + } else if (peer->conf.export_type ==
> + EXPORT_DEFAULT_ROUTE) {
> + /* just resend the default route */
> + for (aid = 0; aid < AID_MAX; aid++) {
> + if (peer->capa.mp[aid])
> + up_generate_default(out_rules,
> + peer, aid);
> + }
> + peer->reconf_out = 0;
> + } else
> + rib_byid(peer->loc_rib_id)->state =
> + RECONF_RELOAD;
> + } else if (peer->reconf_rib) {
>   /* dump the full table to neighbors that changed rib */
>   for (aid = 0; aid < AID_MAX; aid++) {
>   if (peer->capa.mp[aid])
> 



Re: ldomctl: Fail on incomplete domain parameters

2020-01-09 Thread Mark Kettenis
> Date: Thu, 9 Jan 2020 21:53:09 +0100
> From: Klemens Nanni 
> 
> Each guest needs vcpu and memory, otherwise it's invalid.
> Each guest also needs at least one of vdisk, vnet or iodevice, otherwise
> it has nothing to boot from.
> 
> Consider this example:
> 
>   # cat foo.conf
>   domain no-boot {
>   vcpu 1
>   memory 32G
>   }
>   domain no-vcpu {
>   memory 32G
>   }
>   domain no-memory {
>   vcpu 2
>   }
> 
> Current code accepts the first two guests; only the last one would fail,
> with a bogus "ldomctl: unable to allocate guest memory".
> 
> Diff below fixes each of these cases:
> 
>   # ldomctl init-system ../foo.conf
>   foo.conf:4 at least one bootable device is required: no-boot
>   foo.conf:7 vcpu is required: no-vcpu
>   foo.conf:10 memory is required: no-memory
> 
> Alternatively, we could make vcpu and memory default to something,
> eliminating such validation checks - we only need sane default values.

I don't think there is a sane default for memory.

> For devices there is no such possible default.
> 
> Feedback? OK?

ok kettenis@

> Index: parse.y
> ===
> RCS file: /cvs/src/usr.sbin/ldomctl/parse.y,v
> retrieving revision 1.14
> diff -u -p -r1.14 parse.y
> --- parse.y   9 Jan 2020 09:23:57 -   1.14
> +++ parse.y   9 Jan 2020 20:44:51 -
> @@ -124,6 +124,25 @@ domain   : DOMAIN STRING optnl '{' optnl 
>   SIMPLEQ_INIT(&domain->iodev_list);
>   }
>   domainopts_l '}' {
> + if (strcmp(domain->name, "primary") != 0) {
> + if (domain->vcpu == 0) {
> + yyerror("vcpu is required: %s",
> + domain->name);
> + YYERROR;
> + }
> + if ( domain->memory == 0) {

there is a space that shouldn't be there

> + yyerror("memory is required: %s",
> + domain->name);
> + YYERROR;
> + }
> + if (SIMPLEQ_EMPTY(&domain->vdisk_list) &&
> + SIMPLEQ_EMPTY(&domain->vnet_list) &&
> + SIMPLEQ_EMPTY(&domain->iodev_list)) {
> + yyerror("at least one bootable device"
> + " is required: %s", domain->name);
> + YYERROR;
> + }
> + }
>   SIMPLEQ_INSERT_TAIL(&conf->domain_list, domain, entry);
>   domain = NULL;
>   }
> 
> 



ldomctl: Fail on incomplete domain parameters

2020-01-09 Thread Klemens Nanni
Each guest needs vcpu and memory, otherwise it's invalid.
Each guest also needs at least one of vdisk, vnet or iodevice, otherwise
it has nothing to boot from.

Consider this example:

# cat foo.conf
domain no-boot {
vcpu 1
memory 32G
}
domain no-vcpu {
memory 32G
}
domain no-memory {
vcpu 2
}

Current code accepts the first two guests; only the last one would fail,
with a bogus "ldomctl: unable to allocate guest memory".

Diff below fixes each of these cases:

# ldomctl init-system ../foo.conf
foo.conf:4 at least one bootable device is required: no-boot
foo.conf:7 vcpu is required: no-vcpu
foo.conf:10 memory is required: no-memory

Alternatively, we could make vcpu and memory default to something,
eliminating such validation checks - we only need sane default values.

For devices there is no such possible default.

Feedback? OK?


Index: parse.y
===
RCS file: /cvs/src/usr.sbin/ldomctl/parse.y,v
retrieving revision 1.14
diff -u -p -r1.14 parse.y
--- parse.y 9 Jan 2020 09:23:57 -   1.14
+++ parse.y 9 Jan 2020 20:44:51 -
@@ -124,6 +124,25 @@ domain : DOMAIN STRING optnl '{' optnl 
SIMPLEQ_INIT(&domain->iodev_list);
}
domainopts_l '}' {
+   if (strcmp(domain->name, "primary") != 0) {
+   if (domain->vcpu == 0) {
+   yyerror("vcpu is required: %s",
+   domain->name);
+   YYERROR;
+   }
+   if ( domain->memory == 0) {
+   yyerror("memory is required: %s",
+   domain->name);
+   YYERROR;
+   }
+   if (SIMPLEQ_EMPTY(&domain->vdisk_list) &&
+   SIMPLEQ_EMPTY(&domain->vnet_list) &&
+   SIMPLEQ_EMPTY(&domain->iodev_list)) {
+   yyerror("at least one bootable device"
+   " is required: %s", domain->name);
+   YYERROR;
+   }
+   }
SIMPLEQ_INSERT_TAIL(&conf->domain_list, domain, entry);
domain = NULL;
}



Re: dangling vnode panic

2020-01-09 Thread Ted Unangst
Martin Pieuchot wrote:
> On 09/01/20(Thu) 10:46, Alexander Bluhm wrote:
> > Without this diff my regress machines became so unstable, that they
> > often do not finish the test run.
> > 
> > With this diff, they run fine.  No regressions.
> 
> Sadly this is a workaround.  What is the issue?  We're missing a barrier
> during umount?  How do other BSDs solve that?
> 
> If your diff works because you changed a TAIL vs a HEAD insert and you
> now know how to reproduce it, you could capture stack traces from the function
> doing the insert.  Is it insmntque()?  Could we prevent adding the vnode
> to the list if a dying flag is on the mount point?

I don't think that's possible.  Syncing the filesystem can cause more work to
be queued. If you prevent adding the vnode to the list, how will that work be
done?

The alternative would be to add another loop around the list processing. I
think that's worse because we don't know how many times to keep looping.

bluhm's diff is the best approach imo. ok me.



small bgpd performance improvement

2020-01-09 Thread Claudio Jeker
The path_hash function is called reasonably often. Calling
SipHash24_Update() over and over for small data is not optimal.
Inspired by /sys/sys/proc.h, add an aspath_hashstart and aspath_hashend to
the struct rde_aspath and use that for the hash.

-- 
:wq Claudio
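
The /sys/sys/proc.h trick being borrowed: put marker macros on the
first and last members of a contiguous run of fields and hash that
whole byte range with a single update. A generic sketch of the idiom
(example struct, not rde_aspath):

#include <stddef.h>

struct example {
    int   not_hashed;
#define hash_start a
    int   a;
    short b;
    char  c;
#define hash_end c
    char *also_not_hashed;
};

/*
 * The range runs from &e->hash_start up to, but not including,
 * &e->hash_end; padding in between is hashed too, so the fields
 * must stay contiguous and fully initialized.
 */
size_t
hash_range_len(const struct example *e)
{
    return (const char *)&e->hash_end - (const char *)&e->hash_start;
}

Note how the diff also moves refcnt out of the hashed range: it mutates
over the object's lifetime, so it must not contribute to the hash.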

Index: rde.h
===
RCS file: /cvs/src/usr.sbin/bgpd/rde.h,v
retrieving revision 1.230
diff -u -p -r1.230 rde.h
--- rde.h   9 Jan 2020 13:31:52 -   1.230
+++ rde.h   9 Jan 2020 15:12:47 -
@@ -215,14 +215,16 @@ struct rde_aspath {
struct attr **others;
struct aspath   *aspath;
u_int64_t    hash;
+   int  refcnt;
u_int32_t    flags; /* internally used */
+#define aspath_hashstart    med
u_int32_t    med;   /* multi exit disc */
u_int32_t    lpref; /* local pref */
u_int32_t    weight;    /* low prio lpref */
-   int  refcnt;
u_int16_t    rtlabelid; /* route label id */
u_int16_t    pftableid; /* pf table id */
u_int8_t origin;
+#define aspath_hashend  others_len
u_int8_t others_len;
 };
 
Index: rde_rib.c
===
RCS file: /cvs/src/usr.sbin/bgpd/rde_rib.c,v
retrieving revision 1.212
diff -u -p -r1.212 rde_rib.c
--- rde_rib.c   9 Jan 2020 11:55:25 -   1.212
+++ rde_rib.c   9 Jan 2020 15:14:27 -
@@ -701,12 +701,8 @@ path_hash(struct rde_aspath *asp)
u_int64_t   hash;
 
SipHash24_Init(&ctx, &pathtablekey);
-   SipHash24_Update(&ctx, &asp->origin, sizeof(asp->origin));
-   SipHash24_Update(&ctx, &asp->med, sizeof(asp->med));
-   SipHash24_Update(&ctx, &asp->lpref, sizeof(asp->lpref));
-   SipHash24_Update(&ctx, &asp->weight, sizeof(asp->weight));
-   SipHash24_Update(&ctx, &asp->rtlabelid, sizeof(asp->rtlabelid));
-   SipHash24_Update(&ctx, &asp->pftableid, sizeof(asp->pftableid));
+   SipHash24_Update(&ctx, &asp->aspath_hashstart,
+   (char *)&asp->aspath_hashend - (char *)&asp->aspath_hashstart);
 
if (asp->aspath)
SipHash24_Update(&ctx, asp->aspath->data, asp->aspath->len);



bgpd softreconfigure and export default

2020-01-09 Thread Claudio Jeker
When using 'export default-route' you still need an output filter to allow
the default route out. I'm probably not the only one forgetting this fact
from time to time. Now, to make things worse, adding the filter rule to
allow the route plus a config reload does not work, since the
softreconfigure code does not handle the export default-route case
correctly.

The following diff fixes this. There is still the problem that changing
the 'export default-route' setting itself needs a session reset but that
one is less easy to tackle :(
-- 
:wq Claudio

Index: rde.c
===
RCS file: /cvs/src/usr.sbin/bgpd/rde.c,v
retrieving revision 1.498
diff -u -p -r1.498 rde.c
--- rde.c   9 Jan 2020 13:31:52 -   1.498
+++ rde.c   9 Jan 2020 14:42:37 -
@@ -3221,11 +3221,25 @@ rde_softreconfig_in_done(void *arg, u_in
}
 
LIST_FOREACH(peer, &peerlist, peer_l) {
-   if (peer->reconf_out)
-   rib_byid(peer->loc_rib_id)->state = RECONF_RELOAD;
-   else if (peer->reconf_rib) {
-   u_int8_t aid;
+   u_int8_t aid;
 
+   if (peer->reconf_out) {
+   if (peer->conf.export_type == EXPORT_NONE) {
+   /* nothing to do here */
+   peer->reconf_out = 0;
+   } else if (peer->conf.export_type ==
+   EXPORT_DEFAULT_ROUTE) {
+   /* just resend the default route */
+   for (aid = 0; aid < AID_MAX; aid++) {
+   if (peer->capa.mp[aid])
+   up_generate_default(out_rules,
+   peer, aid);
+   }
+   peer->reconf_out = 0;
+   } else
+   rib_byid(peer->loc_rib_id)->state =
+   RECONF_RELOAD;
+   } else if (peer->reconf_rib) {
/* dump the full table to neighbors that changed rib */
for (aid = 0; aid < AID_MAX; aid++) {
if (peer->capa.mp[aid])



Re: dangling vnode panic

2020-01-09 Thread Martin Pieuchot
On 09/01/20(Thu) 10:46, Alexander Bluhm wrote:
> Without this diff my regress machines became so unstable, that they
> often do not finish the test run.
> 
> With this diff, they run fine.  No regressions.

Sadly this is a workaround.  What is the issue?  We're missing a barrier
during umount?  How do other BSDs solve that?

If your diff works because you changed a TAIL vs a HEAD insert and you
now know how to reproduce it, you could capture stack traces from the function
doing the insert.  Is it insmntque()?  Could we prevent adding the vnode
to the list if a dying flag is on the mount point?

> Any ok?
> 
> bluhm
> 
> On Sat, Jan 04, 2020 at 10:55:46PM +0100, Alexander Bluhm wrote:
> > On my amd64 test machine regress/sys/kern/mount run-regress-unmount-busy
> > triggers the dangling vnode panic regulary.  Something has changed
> > in the previous months that makes it more likely.
> >
> > Problem is that while flushing the mnt_vnodelist list, the unmount
> > process sleeps.  Then active processes can write new dirty vnodes
> > to the list.  This happens without softdep.
> >
> > My first attempt was to disable insertions while unmounting.  That
> > does not work.  A buffer flush can cause softdep to add new dirty
> > vnodes to the list.
> >
> > This diff adds the dirty vnodes to the end of the list.  vflush()
> > traverses it from the beginning.  Then new vnodes, that are added
> > during syncing, will be flushed.  This fixes my test.
> >
> > Please test with and without softdep.
> >
> > ok?
> >
> > bluhm
> >
> > Index: sys/kern/vfs_subr.c
> > ===
> > RCS file: /data/mirror/openbsd/cvs/src/sys/kern/vfs_subr.c,v
> > retrieving revision 1.297
> > diff -u -p -r1.297 vfs_subr.c
> > --- sys/kern/vfs_subr.c 30 Dec 2019 13:49:40 -  1.297
> > +++ sys/kern/vfs_subr.c 4 Jan 2020 18:42:25 -
> > @@ -178,7 +178,7 @@ vfs_mount_alloc(struct vnode *vp, struct
> > rw_init_flags(&mp->mnt_lock, "vfslock", RWL_IS_VNODE);
> > (void)vfs_busy(mp, VB_READ|VB_NOWAIT);
> >
> > -   LIST_INIT(&mp->mnt_vnodelist);
> > +   TAILQ_INIT(&mp->mnt_vnodelist);
> > mp->mnt_vnodecovered = vp;
> >
> > atomic_inc_int(&vfsp->vfc_refcount);
> > @@ -476,12 +476,12 @@ insmntque(struct vnode *vp, struct mount
> >  * Delete from old mount point vnode list, if on one.
> >  */
> > if (vp->v_mount != NULL)
> > -   LIST_REMOVE(vp, v_mntvnodes);
> > +   TAILQ_REMOVE(&vp->v_mount->mnt_vnodelist, vp, v_mntvnodes);
> > /*
> >  * Insert into list of vnodes for the new mount point, if available.
> >  */
> > if ((vp->v_mount = mp) != NULL)
> > -   LIST_INSERT_HEAD(&mp->mnt_vnodelist, vp, v_mntvnodes);
> > +   TAILQ_INSERT_TAIL(&mp->mnt_vnodelist, vp, v_mntvnodes);
> >  }
> >
> >  /*
> > @@ -872,7 +872,7 @@ vfs_mount_foreach_vnode(struct mount *mp
> > int error = 0;
> >
> >  loop:
> > -   LIST_FOREACH_SAFE(vp , &mp->mnt_vnodelist, v_mntvnodes, nvp) {
> > +   TAILQ_FOREACH_SAFE(vp , &mp->mnt_vnodelist, v_mntvnodes, nvp) {
> > if (vp->v_mount != mp)
> > goto loop;
> >
> > @@ -1299,7 +1299,7 @@ printlockedvnodes(void)
> > TAILQ_FOREACH(mp, &mountlist, mnt_list) {
> > if (vfs_busy(mp, VB_READ|VB_NOWAIT))
> > continue;
> > -   LIST_FOREACH(vp, &mp->mnt_vnodelist, v_mntvnodes) {
> > +   TAILQ_FOREACH(vp, &mp->mnt_vnodelist, v_mntvnodes) {
> > if (VOP_ISLOCKED(vp))
> > vprint(NULL, vp);
> > }
> > @@ -2287,7 +2287,7 @@ vfs_mount_print(struct mount *mp, int fu
> > (*pr)("locked vnodes:");
> > /* XXX would take mountlist lock, except ddb has no context */
> > cnt = 0;
> > -   LIST_FOREACH(vp, &mp->mnt_vnodelist, v_mntvnodes) {
> > +   TAILQ_FOREACH(vp, &mp->mnt_vnodelist, v_mntvnodes) {
> > if (VOP_ISLOCKED(vp)) {
> > if (cnt == 0)
> > (*pr)("\n  %p", vp);
> > @@ -2304,7 +2304,7 @@ vfs_mount_print(struct mount *mp, int fu
> > (*pr)("all vnodes:");
> > /* XXX would take mountlist lock, except ddb has no context */
> > cnt = 0;
> > -   LIST_FOREACH(vp, &mp->mnt_vnodelist, v_mntvnodes) {
> > +   TAILQ_FOREACH(vp, &mp->mnt_vnodelist, v_mntvnodes) {
> > if (cnt == 0)
> > (*pr)("\n  %p", vp);
> > else if ((cnt % (72 / (sizeof(void *) * 2 + 4))) == 0)
> > Index: sys/kern/vfs_syscalls.c
> > ===
> > RCS file: /data/mirror/openbsd/cvs/src/sys/kern/vfs_syscalls.c,v
> > retrieving revision 1.338
> > diff -u -p -r1.338 vfs_syscalls.c
> > --- sys/kern/vfs_syscalls.c 29 Nov 2019 20:58:17 -  1.338
> > +++ sys/kern/vfs_syscalls.c 4 Jan 2020 18:41:20 -
> > @@ -489,7 +489,7 @@ dounmount_leaf(struct mount *mp, int fla
> >  * Before calling file system unmount, make sure
> >  * all unveils to vnodes 

bgpd, remove getifaddrs call in RDE

2020-01-09 Thread Claudio Jeker
The RDE needs to know the local v4 and v6 address of a session so that
nexthop self works. Until now the lookup of the other AF's address was
done in the RDE when the session got established. This diff moves this
code over to the SE, where it fits better. In particular, this allows
removing the route pledge from the RDE.

This diff works for me but I would like more tests especially on link
local IPv6 sessions (the KAME embedded scope curse haunts me again).
-- 
:wq Claudio
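
For readers who haven't met the "KAME embedded scope" curse: KAME-derived
kernels return link-local sockaddr_in6 structures with the scope id
embedded in bytes 2 and 3 of the address itself, and userland has to
extract it and zero those bytes before addresses compare equal -- the
sa_cmp() being moved in the diff below does exactly this. A standalone
sketch of the unmangling (assuming a KAME stack):

#include <sys/types.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>

/* Recover the embedded scope id and clear it from the address. */
void
kame_unembed_scope(struct sockaddr_in6 *sin6)
{
    u_int16_t scope;

    if (IN6_IS_ADDR_LINKLOCAL(&sin6->sin6_addr)) {
        memcpy(&scope, &sin6->sin6_addr.s6_addr[2], sizeof(scope));
        sin6->sin6_scope_id = ntohs(scope);
        sin6->sin6_addr.s6_addr[2] = 0;
        sin6->sin6_addr.s6_addr[3] = 0;
    }
}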

? obj
Index: bgpd.h
===
RCS file: /cvs/src/usr.sbin/bgpd/bgpd.h,v
retrieving revision 1.397
diff -u -p -r1.397 bgpd.h
--- bgpd.h  9 Jan 2020 11:55:25 -   1.397
+++ bgpd.h  9 Jan 2020 13:37:19 -
@@ -656,7 +656,8 @@ struct kif {
 };
 
 struct session_up {
-   struct bgpd_addr    local_addr;
+   struct bgpd_addr    local_v4_addr;
+   struct bgpd_addr    local_v6_addr;
struct bgpd_addr    remote_addr;
struct capabilities capa;
u_int32_t   remote_bgpid;
Index: rde.c
===
RCS file: /cvs/src/usr.sbin/bgpd/rde.c,v
retrieving revision 1.498
diff -u -p -r1.498 rde.c
--- rde.c   9 Jan 2020 13:31:52 -   1.498
+++ rde.c   9 Jan 2020 13:37:19 -
@@ -177,7 +177,7 @@ rde_main(int debug, int verbose)
setresuid(pw->pw_uid, pw->pw_uid, pw->pw_uid))
fatal("can't drop privileges");
 
-   if (pledge("stdio route recvfd", NULL) == -1)
+   if (pledge("stdio recvfd", NULL) == -1)
fatal("pledge");
 
signal(SIGTERM, rde_sighdlr);
Index: rde_peer.c
===
RCS file: /cvs/src/usr.sbin/bgpd/rde_peer.c,v
retrieving revision 1.2
diff -u -p -r1.2 rde_peer.c
--- rde_peer.c  9 Jan 2020 13:31:52 -   1.2
+++ rde_peer.c  9 Jan 2020 13:37:19 -
@@ -17,9 +17,7 @@
  */
 #include 
 #include 
-#include 
 
-#include 
 #include 
 #include 
 #include 
@@ -306,102 +304,6 @@ rde_up_dump_done(void *ptr, u_int8_t aid
fatal("%s: prefix_dump_new", __func__);
 }
 
-static int
-sa_cmp(struct bgpd_addr *a, struct sockaddr *b)
-{
-   struct sockaddr_in  *in_b;
-   struct sockaddr_in6 *in6_b;
-
-   if (aid2af(a->aid) != b->sa_family)
-   return (1);
-
-   switch (b->sa_family) {
-   case AF_INET:
-   in_b = (struct sockaddr_in *)b;
-   if (a->v4.s_addr != in_b->sin_addr.s_addr)
-   return (1);
-   break;
-   case AF_INET6:
-   in6_b = (struct sockaddr_in6 *)b;
-#ifdef __KAME__
-   /* directly stolen from sbin/ifconfig/ifconfig.c */
-   if (IN6_IS_ADDR_LINKLOCAL(&in6_b->sin6_addr)) {
-   in6_b->sin6_scope_id =
-   ntohs(*(u_int16_t *)&in6_b->sin6_addr.s6_addr[2]);
-   in6_b->sin6_addr.s6_addr[2] =
-   in6_b->sin6_addr.s6_addr[3] = 0;
-   }
-#endif
-   if (bcmp(&a->v6, &in6_b->sin6_addr,
-   sizeof(struct in6_addr)))
-   return (1);
-   break;
-   default:
-   fatal("king bula sez: unknown address family");
-   /* NOTREACHED */
-   }
-
-   return (0);
-}
-
-/*
- * Figure out the local IP addresses most suitable for this session.
- * This looks up the local address of other address family based on
- * the address of the TCP session.
- */
-static int
-peer_localaddrs(struct rde_peer *peer, struct bgpd_addr *laddr)
-{
-   struct ifaddrs  *ifap, *ifa, *match;
-
-   if (getifaddrs(&ifap) == -1)
-   fatal("getifaddrs");
-
-   for (match = ifap; match != NULL; match = match->ifa_next)
-   if (sa_cmp(laddr, match->ifa_addr) == 0)
-   break;
-
-   if (match == NULL) {
-   log_warnx("peer_localaddrs: local address not found");
-   return (-1);
-   }
-
-   for (ifa = ifap; ifa != NULL; ifa = ifa->ifa_next) {
-   if (ifa->ifa_addr->sa_family == AF_INET &&
-   strcmp(ifa->ifa_name, match->ifa_name) == 0) {
-   if (ifa->ifa_addr->sa_family ==
-   match->ifa_addr->sa_family)
-   ifa = match;
-   sa2addr(ifa->ifa_addr, &peer->local_v4_addr, NULL);
-   break;
-   }
-   }
-   for (ifa = ifap; ifa != NULL; ifa = ifa->ifa_next) {
-   if (ifa->ifa_addr->sa_family == AF_INET6 &&
-   strcmp(ifa->ifa_name, match->ifa_name) == 0) {
-   /*
-* only accept global scope addresses except explicitly
-* specified.
-*/
-   if (ifa->ifa_addr->sa_family ==
- 

Re: slaacd(8) improve movement between IPv6 & legacy-IP-only networks

2020-01-09 Thread Fernando Gont
Hi, Pamela,

On 27/8/19 15:31, Pamela Mosiejczuk wrote:
> Hi all,
> 
> When slaacd(8) detects a link-state change, it sets the preferred
> lifetime (pltime) of all configured addresses to zero and tries to
> obtain a new router advertisement.
> 
> If none is received after some time, slaacd will give up, but existing
> configured IPv6 addresses remain. This leads applications to believe
> that IPv6 connectivity is available when it is not, leading to
> potentially long timeouts. RFC 6059 suggests removing these addresses.
> 
> Please test this with IPv6-enabled -> IPv6-enabled transitions (there
> should be no regression) and IPv6-enabled -> legacy-only transitions
> (things should be improved).
> 
> RFC 6059 has additional suggestions about what to do when moving among
> various IPv6-enabled networks.

Just happened to see this (sorry for the bad timing!).

Note: It may be appropriate to remove existing addresses as soon as you
realize you've connected to a different link (e.g., different SSID for
wifi, or if you somehow detect a different net as per RFC6059).

However, a link-state change shouldn't, per se, lead to addresses being
removed. Otherwise, you might break e.g. existing TCP connections upon
transient network problems.

Since we are at it: heads-up:
https://www.ietf.org/internet-drafts/draft-ietf-6man-rfc4941bis-05.txt

Thanks!

Cheers,
-- 
Fernando Gont
e-mail: ferna...@gont.com.ar || fg...@si6networks.com
PGP Fingerprint: 7809 84F5 322E 45C7 F1C9 3945 96EE A9EF D076 FFF1





Re: amd64 SMEP SMAP trap panic print

2020-01-09 Thread Mark Kettenis
> Date: Thu, 9 Jan 2020 14:01:32 +0100
> From: Alexander Bluhm 
> 
> On Thu, Jan 09, 2020 at 12:01:07PM +0100, Mark Kettenis wrote:
> > By splitting this out, the "retain the kernel lock" bit of the
> > comment doesn't make a lot of sense anymore...
> 
> Then move the comment to the return where we retain it.

Yes, I think that's better.

ok kettenis@

> Index: arch/amd64/amd64/db_trace.c
> ===
> RCS file: /data/mirror/openbsd/cvs/src/sys/arch/amd64/amd64/db_trace.c,v
> retrieving revision 1.47
> diff -u -p -r1.47 db_trace.c
> --- arch/amd64/amd64/db_trace.c   10 Nov 2019 10:03:33 -  1.47
> +++ arch/amd64/amd64/db_trace.c   9 Jan 2020 12:25:54 -
> @@ -150,7 +150,7 @@ db_stack_trace_print(db_expr_t addr, int
>   name = NULL;
>   }
> 
> - if (lastframe == 0 && sym == NULL) {
> + if (lastframe == 0 && sym == NULL && callpc != 0) {
>   /* Symbol not found, peek at code */
>   unsigned long instr = db_get_value(callpc, 8, 0);
> 
> Index: arch/amd64/amd64/trap.c
> ===
> RCS file: /data/mirror/openbsd/cvs/src/sys/arch/amd64/amd64/trap.c,v
> retrieving revision 1.77
> diff -u -p -r1.77 trap.c
> --- arch/amd64/amd64/trap.c   6 Sep 2019 12:22:01 -   1.77
> +++ arch/amd64/amd64/trap.c   9 Jan 2020 12:53:10 -
> @@ -77,6 +77,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
> 
>  #include 
> 
> @@ -132,6 +133,23 @@ static inline void verify_smap(const cha
>  static inline void debug_trap(struct trapframe *_frame, struct proc *_p,
>  long _type);
> 
> +static inline void
> +fault(const char *format, ...)
> +{
> + static char faultbuf[512];
> + va_list ap;
> +
> + /*
> +  * Save the fault info for DDB.  Kernel lock protects
> +  * faultbuf from being overwritten by another CPU.
> +  */
> + va_start(ap, format);
> + vsnprintf(faultbuf, sizeof faultbuf, format, ap);
> + va_end(ap);
> + printf("%s\n", faultbuf);
> + faultstr = faultbuf;
> +}
> +
>  /*
>   * pageflttrap(frame, usermode): page fault handler
>   * Returns non-zero if the fault was handled (possibly by generating
> @@ -160,17 +178,21 @@ pageflttrap(struct trapframe *frame, int
>   KERNEL_LOCK();
> 
>   if (!usermode) {
> - extern struct vm_map *kernel_map;
> -
>   /* This will only trigger if SMEP is enabled */
> - if (cr2 <= VM_MAXUSER_ADDRESS && frame->tf_err & PGEX_I)
> - panic("attempt to execute user address %p "
> + if (cr2 <= VM_MAXUSER_ADDRESS && frame->tf_err & PGEX_I) {
> + fault("attempt to execute user address %p "
>   "in supervisor mode", (void *)cr2);
> + /* retain kernel lock */
> + return 0;
> + }
>   /* This will only trigger if SMAP is enabled */
>   if (pcb->pcb_onfault == NULL && cr2 <= VM_MAXUSER_ADDRESS &&
> - frame->tf_err & PGEX_P)
> - panic("attempt to access user address %p "
> + frame->tf_err & PGEX_P) {
> + fault("attempt to access user address %p "
>   "in supervisor mode", (void *)cr2);
> + /* retain kernel lock */
> + return 0;
> + }
> 
>   /*
>* It is only a kernel address space fault iff:
> @@ -211,17 +233,10 @@ pageflttrap(struct trapframe *frame, int
>   frame->tf_rip = (u_int64_t)pcb->pcb_onfault;
>   return 1;
>   } else {
> - /*
> -  * Bad memory access in the kernel; save the fault
> -  * info for DDB and retain the kernel lock to keep
> -  * faultbuf from being overwritten by another CPU.
> -  */
> - static char faultbuf[512];
> - snprintf(faultbuf, sizeof faultbuf,
> - "uvm_fault(%p, 0x%llx, 0, %d) -> %x",
> + /* bad memory access in the kernel */
> + fault("uvm_fault(%p, 0x%llx, 0, %d) -> %x",
>   map, cr2, ftype, error);
> - printf("%s\n", faultbuf);
> - faultstr = faultbuf;
> + /* retain kernel lock */
>   return 0;
>   }
>   } else {
> @@ -395,7 +410,7 @@ trap_print(struct trapframe *frame, int
>   printf(" in %s mode\n", KERNELMODE(frame->tf_cs, frame->tf_rflags) ?
>   "supervisor" : "user");
>   printf("trap type %d code %llx rip %llx cs %llx rflags %llx cr2 "
> -" %llx cpl %x rsp %llx\n",
> +"%llx cpl %x rsp %llx\n",
>   type, 

Re: amd64 SMEP SMAP trap panic print

2020-01-09 Thread Alexander Bluhm
On Thu, Jan 09, 2020 at 12:01:07PM +0100, Mark Kettenis wrote:
> By splitting this out, the "retain the kernel lock" bit of the
> comment doesn't make a lot of sense anymore...

Then move the comment to the return where we retain it.

bluhm

Index: arch/amd64/amd64/db_trace.c
===
RCS file: /data/mirror/openbsd/cvs/src/sys/arch/amd64/amd64/db_trace.c,v
retrieving revision 1.47
diff -u -p -r1.47 db_trace.c
--- arch/amd64/amd64/db_trace.c 10 Nov 2019 10:03:33 -  1.47
+++ arch/amd64/amd64/db_trace.c 9 Jan 2020 12:25:54 -
@@ -150,7 +150,7 @@ db_stack_trace_print(db_expr_t addr, int
name = NULL;
}

-   if (lastframe == 0 && sym == NULL) {
+   if (lastframe == 0 && sym == NULL && callpc != 0) {
/* Symbol not found, peek at code */
unsigned long instr = db_get_value(callpc, 8, 0);

Index: arch/amd64/amd64/trap.c
===
RCS file: /data/mirror/openbsd/cvs/src/sys/arch/amd64/amd64/trap.c,v
retrieving revision 1.77
diff -u -p -r1.77 trap.c
--- arch/amd64/amd64/trap.c 6 Sep 2019 12:22:01 -   1.77
+++ arch/amd64/amd64/trap.c 9 Jan 2020 12:53:10 -
@@ -77,6 +77,7 @@
 #include 
 #include 
 #include 
+#include 

 #include 

@@ -132,6 +133,23 @@ static inline void verify_smap(const cha
 static inline void debug_trap(struct trapframe *_frame, struct proc *_p,
 long _type);

+static inline void
+fault(const char *format, ...)
+{
+   static char faultbuf[512];
+   va_list ap;
+
+   /*
+* Save the fault info for DDB.  Kernel lock protects
+* faultbuf from being overwritten by another CPU.
+*/
+   va_start(ap, format);
+   vsnprintf(faultbuf, sizeof faultbuf, format, ap);
+   va_end(ap);
+   printf("%s\n", faultbuf);
+   faultstr = faultbuf;
+}
+
 /*
  * pageflttrap(frame, usermode): page fault handler
  * Returns non-zero if the fault was handled (possibly by generating
@@ -160,17 +178,21 @@ pageflttrap(struct trapframe *frame, int
KERNEL_LOCK();

if (!usermode) {
-   extern struct vm_map *kernel_map;
-
/* This will only trigger if SMEP is enabled */
-   if (cr2 <= VM_MAXUSER_ADDRESS && frame->tf_err & PGEX_I)
-   panic("attempt to execute user address %p "
+   if (cr2 <= VM_MAXUSER_ADDRESS && frame->tf_err & PGEX_I) {
+   fault("attempt to execute user address %p "
"in supervisor mode", (void *)cr2);
+   /* retain kernel lock */
+   return 0;
+   }
/* This will only trigger if SMAP is enabled */
if (pcb->pcb_onfault == NULL && cr2 <= VM_MAXUSER_ADDRESS &&
-   frame->tf_err & PGEX_P)
-   panic("attempt to access user address %p "
+   frame->tf_err & PGEX_P) {
+   fault("attempt to access user address %p "
"in supervisor mode", (void *)cr2);
+   /* retain kernel lock */
+   return 0;
+   }

/*
 * It is only a kernel address space fault iff:
@@ -211,17 +233,10 @@ pageflttrap(struct trapframe *frame, int
frame->tf_rip = (u_int64_t)pcb->pcb_onfault;
return 1;
} else {
-   /*
-* Bad memory access in the kernel; save the fault
-* info for DDB and retain the kernel lock to keep
-* faultbuf from being overwritten by another CPU.
-*/
-   static char faultbuf[512];
-   snprintf(faultbuf, sizeof faultbuf,
-   "uvm_fault(%p, 0x%llx, 0, %d) -> %x",
+   /* bad memory access in the kernel */
+   fault("uvm_fault(%p, 0x%llx, 0, %d) -> %x",
map, cr2, ftype, error);
-   printf("%s\n", faultbuf);
-   faultstr = faultbuf;
+   /* retain kernel lock */
return 0;
}
} else {
@@ -395,7 +410,7 @@ trap_print(struct trapframe *frame, int
printf(" in %s mode\n", KERNELMODE(frame->tf_cs, frame->tf_rflags) ?
"supervisor" : "user");
printf("trap type %d code %llx rip %llx cs %llx rflags %llx cr2 "
-  " %llx cpl %x rsp %llx\n",
+  "%llx cpl %x rsp %llx\n",
type, frame->tf_err, frame->tf_rip, frame->tf_cs,
frame->tf_rflags, rcr2(), curcpu()->ci_ilevel, frame->tf_rsp);
printf("gsbase %p  kgsbase %p\n",
Index: ddb/db_examine.c

Re: bgpd, move peer related functions to rde_peer.c

2020-01-09 Thread Sebastian Benoit


ok

Claudio Jeker(cje...@diehard.n-r-g.com) on 2020.01.09 13:03:13 +0100:
> On Thu, Jan 09, 2020 at 09:42:39AM +0100, Claudio Jeker wrote:
> > This diff just moves some of the code from rde.c to rde_peer.c, where
> > it should be. Apart from moving the code, peer_down() was modified so
> > that it can be used as a callback for peer_foreach(). Also,
> > peer_foreach() was changed to just walk the peer list instead of
> > traversing the peer hash table.
> > 
> 
> Updated version after last commit.
> 
> -- 
> :wq Claudio
> 
> ? obj
> Index: rde.c
> ===
> RCS file: /cvs/src/usr.sbin/bgpd/rde.c,v
> retrieving revision 1.497
> diff -u -p -r1.497 rde.c
> --- rde.c 9 Jan 2020 11:55:25 -   1.497
> +++ rde.c 9 Jan 2020 12:01:51 -
> @@ -20,12 +20,10 @@
>   */
>  
>  #include 
> -#include 
>  #include 
>  #include 
>  
>  #include 
> -#include 
>  #include 
>  #include 
>  #include 
> @@ -63,9 +61,6 @@ int  rde_get_mp_nexthop(u_char *, u_int
>struct filterstate *);
>  void  rde_update_err(struct rde_peer *, u_int8_t , u_int8_t,
>void *, u_int16_t);
> -void  rde_update_log(const char *, u_int16_t,
> -  const struct rde_peer *, const struct bgpd_addr *,
> -  const struct bgpd_addr *, u_int8_t);
>  void  rde_as4byte_fixup(struct rde_peer *, struct rde_aspath *);
>  void  rde_reflector(struct rde_peer *, struct rde_aspath *);
>  
> @@ -94,20 +89,8 @@ voidrde_mark_prefixsets_dirty(struct 
>  u_int8_t  rde_roa_validity(struct rde_prefixset *,
>struct bgpd_addr *, u_int8_t, u_int32_t);
>  
> -void  peer_init(u_int32_t);
> -void  peer_shutdown(void);
> -void  peer_foreach(void (*)(struct rde_peer *, void *), void *);
> -int   peer_imsg_pending(void);
> -int   peer_localaddrs(struct rde_peer *, struct bgpd_addr *);
> -struct rde_peer *peer_match(struct ctl_neighbor *, u_int32_t);
> -struct rde_peer  *peer_add(u_int32_t, struct peer_config *);
> -void  peer_up(struct rde_peer *, struct session_up *);
> -void  peer_down(struct rde_peer *);
> -void  peer_flush(struct rde_peer *, u_int8_t, time_t);
> -void  peer_stale(struct rde_peer *, u_int8_t);
> -void  peer_dump(struct rde_peer *, u_int8_t);
> -static void   peer_recv_eor(struct rde_peer *, u_int8_t);
> -static void   peer_send_eor(struct rde_peer *, u_int8_t);
> +static void   rde_peer_recv_eor(struct rde_peer *, u_int8_t);
> +static void   rde_peer_send_eor(struct rde_peer *, u_int8_t);
>  
>  void  network_add(struct network_config *, struct filterstate *);
>  void  network_delete(struct network_config *);
> @@ -115,13 +98,10 @@ static void   network_dump_upcall(struct 
>  static void   network_flush_upcall(struct rib_entry *, void *);
>  
>  void  rde_shutdown(void);
> -int   sa_cmp(struct bgpd_addr *, struct sockaddr *);
>  int   ovs_match(struct prefix *, u_int32_t);
>  
>  volatile sig_atomic_t rde_quit = 0;
>  struct bgpd_config   *conf, *nconf;
> -struct rde_peer_head  peerlist;
> -struct rde_peer  *peerself;
>  struct filter_head   *out_rules, *out_rules_tmp;
>  struct imsgbuf   *ibuf_se;
>  struct imsgbuf   *ibuf_se_ctl;
> @@ -129,6 +109,9 @@ struct imsgbuf*ibuf_main;
>  struct rde_memstats   rdemem;
>  int   softreconfig;
>  
> +extern struct rde_peer_head   peerlist;
> +extern struct rde_peer   *peerself;
> +
>  struct rde_dump_ctx {
>   LIST_ENTRY(rde_dump_ctx)entry;
>   struct ctl_show_rib_request req;
> @@ -990,10 +973,14 @@ rde_dispatch_imsg_peer(struct rde_peer *
>   if (imsg.hdr.len - IMSG_HEADER_SIZE != sizeof(sup))
>   fatalx("incorrect size of session request");
>   memcpy(&sup, imsg.data, sizeof(sup));
> - peer_up(peer, &sup);
> + if (peer_up(peer, &sup) == -1) {
> + peer->state = PEER_DOWN;
> + imsg_compose(ibuf_se, IMSG_SESSION_DOWN, peer->conf.id,
> + 0, -1, NULL, 0);
> + }
>   break;
>   case IMSG_SESSION_DOWN:
> - peer_down(peer);
> + peer_down(peer, NULL);
>   break;
>   case IMSG_SESSION_STALE:
>   case IMSG_SESSION_FLUSH:
> @@ -1088,7 +1075,7 @@ rde_update_dispatch(struct rde_peer *pee
>   }
>   if (withdrawn_len == 0) {
>   /* EoR marker */
> - peer_recv_eor(peer, AID_INET);
> + rde_peer_recv_eor(peer, AID_INET);
>   return;
>   }
>   }
> @@ -1200,7 +1187,7 @@ rde_update_dispatch(struct rde_peer *pee
>   if ((state.aspath.flags & ~F_ATTR_MP_UNREACH) == 0 &&
>  

MAKE: redux patch

2020-01-09 Thread Marc Espie
So my development branch is getting a bit too far
ahead compared to what's committed.

Here's a big diff to test.  Comments as to what's going on
and the changes this contains:

- buffer changes: add support for "permanent static buffers"
that don't get reinit'd all the time (that's a mostly trivial
change)

- parser change: SPECIAL_TARGET and SPECIAL_SOURCE are not
really needed (figured out in netbsd)
- (related): create the special nodes once in a simpler way

- a bit of reorg: take the code for expand_children out
of suff.c proper, as this file is always fairly unwieldy.
This only requires Link_Parent, and find_best_path, which
are suff-independent.

- engine change: stop mixing parallel and compat so much.
In particular, don't run Make_Update when in compat mode (which
would happen because handle_all_signals goes into the main loop)

- (related): abstract the engine running into operations in
'enginechoice.c' so that it's perfectly clear what part is compatMake
and what part is not.   Have it report out_of_date and errors in 
a simple way.  Don't mix up .BEGIN/.END handling.

- (related): job.c determines whether to use the expensive heuristics
or not depending on how many jobs we run. 

- jobrunner cleanup: we have no need for maxjobs, it's enough to move
stuff around from available into running or error... do the move to 
error later so that everything is in the same location.

- with the above changes: the special case of running_one_job is no
longer needed at all.

- feature coming from bmake: treat .BEGIN/.END more as normal targets.
More specifically, just invoke the engine on them if they exist.
This makes it possible to have
.BEGIN: somethingelse
which does make a lot of sense, actually.
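
To make the buffer change concrete, a minimal sketch of the
before/after usage pattern, modelled on the Arch_ParseArchive hunk
below; do_parse() is a made-up stand-in for parse_archive():

/* Before: a fresh buffer per call, destroyed on return. */
bool
parse_something(const char **line)
{
	BUFFER buf;
	bool result;

	Buf_Init(&buf, MAKE_BSIZE);
	result = do_parse(&buf, line);
	Buf_Destroy(&buf);
	return result;
}

/* After: a permanent static buffer.  Buf_Reinit() allocates it on
 * the first call and merely resets it afterwards, so there is no
 * per-call allocation and no Buf_Destroy(). */
bool
parse_something(const char **line)
{
	static BUFFER buf;

	Buf_Reinit(&buf, MAKE_BSIZE);
	return do_parse(&buf, line);
}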

I would really need this to go in so that I can keep pushing forward.

I think I might take out the "old" parallel engine (which is somewhat
broken, and frankly, I don't get how it can work) because modifying
compat to actually queue stuff and build it is a distinct possibility
now, i.e., having an actual parallel engine that works.


diff --git a/Makefile b/Makefile
index 0cd84fc..90747de 100644
--- a/Makefile
+++ b/Makefile
@@ -16,8 +16,8 @@ CFLAGS+=${CDEFS}
 HOSTCFLAGS+=${CDEFS}
 
 SRCS=  arch.c buf.c cmd_exec.c compat.c cond.c dir.c direxpand.c dump.c \
-   engine.c \
-   error.c for.c init.c job.c lowparse.c main.c make.c memory.c parse.c \
+   engine.c enginechoice.c error.c expandchildren.c \
+   for.c init.c job.c lowparse.c main.c make.c memory.c parse.c \
parsevar.c str.c stats.c suff.c targ.c targequiv.c timestamp.c \
var.c varmodifiers.c varname.c
 
diff --git a/arch.c b/arch.c
index 85e8e7e..02c18b5 100644
--- a/arch.c
+++ b/arch.c
@@ -195,11 +195,10 @@ bool
 Arch_ParseArchive(const char **line, Lst nodes, SymTable *ctxt)
 {
bool result;
-   BUFFER expand;
+   static BUFFER expand;
 
-   Buf_Init(&expand, MAKE_BSIZE);
+   Buf_Reinit(&expand, MAKE_BSIZE);
result = parse_archive(&expand, line, nodes, ctxt);
-   Buf_Destroy(&expand);
return result;
 }
 
diff --git a/buf.c b/buf.c
index 74cbfe8..931bd7e 100644
--- a/buf.c
+++ b/buf.c
@@ -153,6 +153,15 @@ Buf_printf(Buffer bp, const char *fmt, ...)
bp->inPtr += n;
 }
 
+void
+Buf_Reinit(Buffer bp, size_t size)
+{
+   if (bp->buffer == NULL)
+   Buf_Init(bp, size);
+   else
+   Buf_Reset(bp);
+}
+
 void
 Buf_Init(Buffer bp, size_t size)
 {
diff --git a/buf.h b/buf.h
index 1b56b27..20ea56a 100644
--- a/buf.h
+++ b/buf.h
@@ -106,6 +106,9 @@ extern void Buf_AddChars(Buffer, size_t, const char *);
  * Initializes a buffer, to hold approximately init chars.
  * Set init to 0 if you have no idea.  */
 extern void Buf_Init(Buffer, size_t);
+/* Buf_Reinit(buf, init);
+ * Initializes/resets a static buffer */
+extern void Buf_Reinit(Buffer, size_t);
 /* Buf_Destroy(buf);
  * Nukes a buffer and all its resources.   */
 #define Buf_Destroy(bp) ((void)free((bp)->buffer))
diff --git a/compat.c b/compat.c
index 069e52f..fd78d78 100644
--- a/compat.c
+++ b/compat.c
@@ -193,14 +193,12 @@ CompatMake(void *gnp, /* The node to make */
/* copy over what we just did */
gn->built_status = sib->built_status;
 
-   if (gn->built_status != ERROR) {
-   /* If the node was built successfully, mark it so,
+   if (gn->built_status == REBUILT) {
+   /* If the node was built successfully, 
 * update its modification time and timestamp all
 * its parents.
 * This is to keep its state from affecting that of
 * its parent.  */
-   gn->built_status = REBUILT;
-   sib->built_status = REBUILT;
/* This is what Make does and it's actually a good
 * thing, as it allows rules like
 *
@@ 

Re: bgpd, move peer related functions to rde_peer.c

2020-01-09 Thread Claudio Jeker
On Thu, Jan 09, 2020 at 09:42:39AM +0100, Claudio Jeker wrote:
> This diff just moves some of the code from rde.c to rde_peer.c where it
> should be. Apart from moving the code peer_down() was modified so that it
> can be used as callback of peer_foreach(). Also peer_foreach() was changed
> to just walk the peer list instead of traversing the peer hash table.
> 

Updated version after last commit.

-- 
:wq Claudio

? obj
Index: rde.c
===
RCS file: /cvs/src/usr.sbin/bgpd/rde.c,v
retrieving revision 1.497
diff -u -p -r1.497 rde.c
--- rde.c   9 Jan 2020 11:55:25 -   1.497
+++ rde.c   9 Jan 2020 12:01:51 -
@@ -20,12 +20,10 @@
  */
 
 #include 
-#include 
 #include 
 #include 
 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -63,9 +61,6 @@ intrde_get_mp_nexthop(u_char *, u_int
 struct filterstate *);
 voidrde_update_err(struct rde_peer *, u_int8_t , u_int8_t,
 void *, u_int16_t);
-voidrde_update_log(const char *, u_int16_t,
-const struct rde_peer *, const struct bgpd_addr *,
-const struct bgpd_addr *, u_int8_t);
 voidrde_as4byte_fixup(struct rde_peer *, struct rde_aspath *);
 voidrde_reflector(struct rde_peer *, struct rde_aspath *);
 
@@ -94,20 +89,8 @@ void  rde_mark_prefixsets_dirty(struct 
 u_int8_trde_roa_validity(struct rde_prefixset *,
 struct bgpd_addr *, u_int8_t, u_int32_t);
 
-voidpeer_init(u_int32_t);
-voidpeer_shutdown(void);
-voidpeer_foreach(void (*)(struct rde_peer *, void *), void *);
-int peer_imsg_pending(void);
-int peer_localaddrs(struct rde_peer *, struct bgpd_addr *);
-struct rde_peer *peer_match(struct ctl_neighbor *, u_int32_t);
-struct rde_peer*peer_add(u_int32_t, struct peer_config *);
-voidpeer_up(struct rde_peer *, struct session_up *);
-voidpeer_down(struct rde_peer *);
-voidpeer_flush(struct rde_peer *, u_int8_t, time_t);
-voidpeer_stale(struct rde_peer *, u_int8_t);
-voidpeer_dump(struct rde_peer *, u_int8_t);
-static void peer_recv_eor(struct rde_peer *, u_int8_t);
-static void peer_send_eor(struct rde_peer *, u_int8_t);
+static void rde_peer_recv_eor(struct rde_peer *, u_int8_t);
+static void rde_peer_send_eor(struct rde_peer *, u_int8_t);
 
 voidnetwork_add(struct network_config *, struct filterstate *);
 voidnetwork_delete(struct network_config *);
@@ -115,13 +98,10 @@ static void network_dump_upcall(struct 
 static void network_flush_upcall(struct rib_entry *, void *);
 
 voidrde_shutdown(void);
-int sa_cmp(struct bgpd_addr *, struct sockaddr *);
 int ovs_match(struct prefix *, u_int32_t);
 
 volatile sig_atomic_t   rde_quit = 0;
 struct bgpd_config *conf, *nconf;
-struct rde_peer_headpeerlist;
-struct rde_peer*peerself;
 struct filter_head *out_rules, *out_rules_tmp;
 struct imsgbuf *ibuf_se;
 struct imsgbuf *ibuf_se_ctl;
@@ -129,6 +109,9 @@ struct imsgbuf  *ibuf_main;
 struct rde_memstats rdemem;
 int softreconfig;
 
+extern struct rde_peer_head peerlist;
+extern struct rde_peer *peerself;
+
 struct rde_dump_ctx {
LIST_ENTRY(rde_dump_ctx)entry;
struct ctl_show_rib_request req;
@@ -990,10 +973,14 @@ rde_dispatch_imsg_peer(struct rde_peer *
if (imsg.hdr.len - IMSG_HEADER_SIZE != sizeof(sup))
fatalx("incorrect size of session request");
memcpy(&sup, imsg.data, sizeof(sup));
-   peer_up(peer, &sup);
+   if (peer_up(peer, &sup) == -1) {
+   peer->state = PEER_DOWN;
+   imsg_compose(ibuf_se, IMSG_SESSION_DOWN, peer->conf.id,
+   0, -1, NULL, 0);
+   }
break;
case IMSG_SESSION_DOWN:
-   peer_down(peer);
+   peer_down(peer, NULL);
break;
case IMSG_SESSION_STALE:
case IMSG_SESSION_FLUSH:
@@ -1088,7 +1075,7 @@ rde_update_dispatch(struct rde_peer *pee
}
if (withdrawn_len == 0) {
/* EoR marker */
-   peer_recv_eor(peer, AID_INET);
+   rde_peer_recv_eor(peer, AID_INET);
return;
}
}
@@ -1200,7 +1187,7 @@ rde_update_dispatch(struct rde_peer *pee
if ((state.aspath.flags & ~F_ATTR_MP_UNREACH) == 0 &&
mplen == 0) {
/* EoR marker */
-   peer_recv_eor(peer, aid);
+   rde_peer_recv_eor(peer, aid);
}
 

Re: amd64 SMEP SMAP trap panic print

2020-01-09 Thread Mark Kettenis
> Date: Thu, 9 Jan 2020 10:42:55 +0100
> From: Alexander Bluhm 
> 
> On Thu, Dec 26, 2019 at 08:12:50PM +0100, Alexander Bluhm wrote:
> > Any more concerns or can this be commited?
> 
> Any ok?
> 
> > Index: arch/amd64/amd64/db_trace.c
> > ===
> > RCS file: /data/mirror/openbsd/cvs/src/sys/arch/amd64/amd64/db_trace.c,v
> > retrieving revision 1.47
> > diff -u -p -r1.47 db_trace.c
> > --- arch/amd64/amd64/db_trace.c 10 Nov 2019 10:03:33 -  1.47
> > +++ arch/amd64/amd64/db_trace.c 26 Dec 2019 19:08:35 -
> > @@ -150,7 +150,7 @@ db_stack_trace_print(db_expr_t addr, int
> > name = NULL;
> > }
> >
> > -   if (lastframe == 0 && sym == NULL) {
> > +   if (lastframe == 0 && sym == NULL && callpc != 0) {
> > /* Symbol not found, peek at code */
> > unsigned long instr = db_get_value(callpc, 8, 0);
> >
> > Index: arch/amd64/amd64/trap.c
> > ===
> > RCS file: /data/mirror/openbsd/cvs/src/sys/arch/amd64/amd64/trap.c,v
> > retrieving revision 1.77
> > diff -u -p -r1.77 trap.c
> > --- arch/amd64/amd64/trap.c 6 Sep 2019 12:22:01 -   1.77
> > +++ arch/amd64/amd64/trap.c 26 Dec 2019 19:08:35 -
> > @@ -77,6 +77,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >
> >  #include 
> >
> > @@ -132,6 +133,24 @@ static inline void verify_smap(const cha
> >  static inline void debug_trap(struct trapframe *_frame, struct proc *_p,
> >  long _type);
> >
> > +static inline int
> > +fault(const char *format, ...)
> > +{
> > +   static char faultbuf[512];
> > +   va_list ap;
> > +
> > +   /*
> > +* Save the fault info for DDB and retain the kernel lock to keep
> > +* faultbuf from being overwritten by another CPU.
> > +*/

By splitting this out, the "retain the kernel lock" bit of the
comment doesn't make a lot of sense anymore...
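
One way the comment could read after the split (a suggestion only;
the caller, pageflttrap(), still takes the kernel lock before any of
the fault() calls):

	/*
	 * Save the fault info for DDB.  The caller holds the kernel
	 * lock, which keeps faultbuf from being overwritten by
	 * another CPU.
	 */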

> > +   va_start(ap, format);
> > +   vsnprintf(faultbuf, sizeof faultbuf, format, ap);
> > +   va_end(ap);
> > +   printf("%s\n", faultbuf);
> > +   faultstr = faultbuf;
> > +   return 0;
> > +}
> > +
> >  /*
> >   * pageflttrap(frame, usermode): page fault handler
> >   * Returns non-zero if the fault was handled (possibly by generating
> > @@ -160,16 +179,14 @@ pageflttrap(struct trapframe *frame, int
> > KERNEL_LOCK();
> >
> > if (!usermode) {
> > -   extern struct vm_map *kernel_map;
> > -
> > /* This will only trigger if SMEP is enabled */
> > if (cr2 <= VM_MAXUSER_ADDRESS && frame->tf_err & PGEX_I)
> > -   panic("attempt to execute user address %p "
> > +   return fault("attempt to execute user address %p "
> > "in supervisor mode", (void *)cr2);
> > /* This will only trigger if SMAP is enabled */
> > if (pcb->pcb_onfault == NULL && cr2 <= VM_MAXUSER_ADDRESS &&
> > frame->tf_err & PGEX_P)
> > -   panic("attempt to access user address %p "
> > +   return fault("attempt to access user address %p "
> > "in supervisor mode", (void *)cr2);
> >
> > /*
> > @@ -211,18 +228,9 @@ pageflttrap(struct trapframe *frame, int
> > frame->tf_rip = (u_int64_t)pcb->pcb_onfault;
> > return 1;
> > } else {
> > -   /*
> > -* Bad memory access in the kernel; save the fault
> > -* info for DDB and retain the kernel lock to keep
> > -* faultbuf from being overwritten by another CPU.
> > -*/
> > -   static char faultbuf[512];
> > -   snprintf(faultbuf, sizeof faultbuf,
> > -   "uvm_fault(%p, 0x%llx, 0, %d) -> %x",
> > +   /* bad memory access in the kernel */
> > +   return fault("uvm_fault(%p, 0x%llx, 0, %d) -> %x",
> > map, cr2, ftype, error);
> > -   printf("%s\n", faultbuf);
> > -   faultstr = faultbuf;
> > -   return 0;
> > }
> > } else {
> > union sigval sv;
> > @@ -395,7 +403,7 @@ trap_print(struct trapframe *frame, int
> > printf(" in %s mode\n", KERNELMODE(frame->tf_cs, frame->tf_rflags) ?
> > "supervisor" : "user");
> > printf("trap type %d code %llx rip %llx cs %llx rflags %llx cr2 "
> > -  " %llx cpl %x rsp %llx\n",
> > +  "%llx cpl %x rsp %llx\n",
> > type, frame->tf_err, frame->tf_rip, frame->tf_cs,
> > frame->tf_rflags, rcr2(), curcpu()->ci_ilevel, frame->tf_rsp);
> > printf("gsbase %p  kgsbase %p\n",
> > Index: ddb/db_examine.c
> > ===
> > RCS file: 

Re: bgpd and time stamps

2020-01-09 Thread Denis Fondras
On Tue, Dec 31, 2019 at 04:32:24PM +0100, Claudio Jeker wrote:
> This changes bgpd to only use CLOCK_MONOTONIC via the getmonotime()
> function. Additionally it changes the export of RIB entries to report the
> last modified time relative to now. Other places also needed some fixup,
> like the mrt dump code since the Originated Time field in those messages
> is epoch based.
> 
> It would be nice to also send the last_read, last_write, and last_updown in
> a relative form to bgpctl but that is more complex.
> 
> In bgpctl fmt_timeframe is now for relative timestamps and fmt_monotime
> for those using CLOCK_MONOTONIC. The goal is to remove the latter once
> last_read and friends got changed.
> 
> OK?

OK denis@
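
For illustration, a minimal sketch of what fmt_monotime() in the
quoted diff computes before handing the result to fmt_timeframe():
the age, in seconds, of a timestamp taken with CLOCK_MONOTONIC
(monotime_age() is a made-up name):

#include <err.h>
#include <time.h>

/* Age in seconds of a CLOCK_MONOTONIC timestamp. */
static time_t
monotime_age(time_t t)
{
	struct timespec ts;

	if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
		err(1, "clock_gettime");
	if (t > ts.tv_sec)	/* time in the future is not possible */
		t = ts.tv_sec;
	return ts.tv_sec - t;
}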

> -- 
> :wq Claudio
> 
> Index: bgpctl/bgpctl.c
> ===
> RCS file: /cvs/src/usr.sbin/bgpctl/bgpctl.c,v
> retrieving revision 1.255
> diff -u -p -r1.255 bgpctl.c
> --- bgpctl/bgpctl.c   31 Dec 2019 14:09:27 -  1.255
> +++ bgpctl/bgpctl.c   31 Dec 2019 15:31:39 -
> @@ -595,8 +595,8 @@ print_neighbor_msgstats(struct peer *p)
>  #define TF_BUFS  8
>  #define TF_LEN   9
>  
> -static const char *
> -fmt_timeframe_core(time_t t)
> +const char *
> +fmt_timeframe(time_t t)
>  {
>   char*buf;
>   static char  tfbuf[TF_BUFS][TF_LEN];/* ring buffer */
> @@ -630,17 +630,18 @@ fmt_timeframe_core(time_t t)
>  }
>  
>  const char *
> -fmt_timeframe(time_t t)
> +fmt_monotime(time_t t)
>  {
> - time_t now;
> + struct timespec ts;
>  
>   if (t == 0)
>   return ("Never");
>  
> - now = time(NULL);
> - if (t > now)/* time in the future is not possible */
> - t = now;
> - return (fmt_timeframe_core(now - t));
> + if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
> + err(1, "clock_gettime");
> + if (t > ts.tv_sec)  /* time in the future is not possible */
> + t = ts.tv_sec;
> + return (fmt_timeframe(ts.tv_sec - t));
>  }
>  
>  void
> @@ -1414,17 +1415,20 @@ show_mrt_dump(struct mrt_rib *mr, struct
>   struct parse_result  res;
>   struct ctl_show_rib_request *req = arg;
>   struct mrt_rib_entry*mre;
> + time_t   now;
>   u_int16_ti, j;
>  
>   memset(&res, 0, sizeof(res));
>   res.flags = req->flags;
> + now = time(NULL);
>  
>   for (i = 0; i < mr->nentries; i++) {
>   mre = &mr->entries[i];
>   bzero(&ctl, sizeof(ctl));
>   ctl.prefix = mr->prefix;
>   ctl.prefixlen = mr->prefixlen;
> - ctl.lastchange = mre->originated;
> + if (mre->originated <= now)
> + ctl.age = now - mre->originated;
>   ctl.true_nexthop = mre->nexthop;
>   ctl.exit_nexthop = mre->nexthop;
>   ctl.origin = mre->origin;
> @@ -1490,14 +1494,17 @@ network_mrt_dump(struct mrt_rib *mr, str
>   struct ctl_show_rib_request *req = arg;
>   struct mrt_rib_entry*mre;
>   struct ibuf *msg;
> + time_t   now;
>   u_int16_ti, j;
>  
> + now = time(NULL);
>   for (i = 0; i < mr->nentries; i++) {
>   mre = &mr->entries[i];
>   bzero(&ctl, sizeof(ctl));
>   ctl.prefix = mr->prefix;
>   ctl.prefixlen = mr->prefixlen;
> - ctl.lastchange = mre->originated;
> + if (mre->originated <= now)
> + ctl.age = now - mre->originated;
>   ctl.true_nexthop = mre->nexthop;
>   ctl.exit_nexthop = mre->nexthop;
>   ctl.origin = mre->origin;
> Index: bgpctl/bgpctl.h
> ===
> RCS file: /cvs/src/usr.sbin/bgpctl/bgpctl.h,v
> retrieving revision 1.2
> diff -u -p -r1.2 bgpctl.h
> --- bgpctl/bgpctl.h   31 Dec 2019 14:09:27 -  1.2
> +++ bgpctl/bgpctl.h   31 Dec 2019 15:31:39 -
> @@ -44,6 +44,7 @@ const char  *print_auth_method(enum auth_
>  const char   *fmt_mem(long long);
>  
>  const char   *fmt_timeframe(time_t);
> +const char   *fmt_monotime(time_t);
>  char *fmt_peer(const char *, const struct bgpd_addr *, int);
>  const char   *get_errstr(u_int8_t, u_int8_t);
>  
> Index: bgpctl/output.c
> ===
> RCS file: /cvs/src/usr.sbin/bgpctl/output.c,v
> retrieving revision 1.2
> diff -u -p -r1.2 output.c
> --- bgpctl/output.c   31 Dec 2019 14:09:27 -  1.2
> +++ bgpctl/output.c   31 Dec 2019 15:31:39 -
> @@ -111,7 +111,7 @@ show_summary(struct peer *p)
>   p->stats.msg_sent_update + p->stats.msg_sent_keepalive +
>   p->stats.msg_sent_rrefresh,
>   p->wbuf.queued,
> - fmt_timeframe(p->stats.last_updown));
> + fmt_monotime(p->stats.last_updown));
>  

Re: dangling vnode panic

2020-01-09 Thread Alexander Bluhm
Without this diff my regress machines become so unstable that they
often do not finish the test run.

With this diff, they run fine.  No regressions.

Any ok?

bluhm
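
To illustrate the mechanism described in the quoted mail below, a
self-contained sketch: entries appended at the tail stay ahead of a
head-to-tail traversal, so they are still visited.  struct
vnode_demo and the scenario are made up; the queue macros are the
ones from <sys/queue.h> that the diff switches to.

#include <sys/queue.h>
#include <stdio.h>

struct vnode_demo {
	int id;
	TAILQ_ENTRY(vnode_demo) entries;
};
TAILQ_HEAD(, vnode_demo) list = TAILQ_HEAD_INITIALIZER(list);

int
main(void)
{
	struct vnode_demo a = { .id = 1 }, b = { .id = 2 }, c = { .id = 3 };
	struct vnode_demo *vp, *nvp;

	TAILQ_INSERT_TAIL(&list, &a, entries);
	TAILQ_INSERT_TAIL(&list, &b, entries);
	TAILQ_FOREACH_SAFE(vp, &list, entries, nvp) {
		if (vp->id == 1)	/* a vnode gets dirty mid-walk */
			TAILQ_INSERT_TAIL(&list, &c, entries);
		printf("visit %d\n", vp->id);	/* prints 1, 2, 3 */
	}
	/* With LIST_INSERT_HEAD the new entry would land behind the
	 * iterator and be missed for the rest of the walk. */
	return 0;
}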

On Sat, Jan 04, 2020 at 10:55:46PM +0100, Alexander Bluhm wrote:
> On my amd64 test machine regress/sys/kern/mount run-regress-unmount-busy
> triggers the dangling vnode panic regulary.  Something has changed
> in the previous months that makes it more likely.
>
> Problem is that while flushing the mnt_vnodelist list, the unmount
> process sleeps.  Then active processes can write new dirty vnodes
> to the list.  This happens without softdep.
>
> My first attempt was to disable insertions while unmounting.  That
> does not work.  A buffer flush can cause softdep to add new dirty
> vnodes to the list.
>
> This diff adds the dirty vnodes to the end of the list.  vflush()
> traverses it from the beginning.  Then new vnodes that are added
> during syncing will be flushed.  This fixes my test.
>
> Please test with and without softdep.
>
> ok?
>
> bluhm
>
> Index: sys/kern/vfs_subr.c
> ===
> RCS file: /data/mirror/openbsd/cvs/src/sys/kern/vfs_subr.c,v
> retrieving revision 1.297
> diff -u -p -r1.297 vfs_subr.c
> --- sys/kern/vfs_subr.c   30 Dec 2019 13:49:40 -  1.297
> +++ sys/kern/vfs_subr.c   4 Jan 2020 18:42:25 -
> @@ -178,7 +178,7 @@ vfs_mount_alloc(struct vnode *vp, struct
>   rw_init_flags(&mp->mnt_lock, "vfslock", RWL_IS_VNODE);
>   (void)vfs_busy(mp, VB_READ|VB_NOWAIT);
>
> - LIST_INIT(&mp->mnt_vnodelist);
> + TAILQ_INIT(&mp->mnt_vnodelist);
>   mp->mnt_vnodecovered = vp;
>
>   atomic_inc_int(&vfsp->vfc_refcount);
> @@ -476,12 +476,12 @@ insmntque(struct vnode *vp, struct mount
>* Delete from old mount point vnode list, if on one.
>*/
>   if (vp->v_mount != NULL)
> - LIST_REMOVE(vp, v_mntvnodes);
> + TAILQ_REMOVE(&vp->v_mount->mnt_vnodelist, vp, v_mntvnodes);
>   /*
>* Insert into list of vnodes for the new mount point, if available.
>*/
>   if ((vp->v_mount = mp) != NULL)
> - LIST_INSERT_HEAD(&mp->mnt_vnodelist, vp, v_mntvnodes);
> + TAILQ_INSERT_TAIL(&mp->mnt_vnodelist, vp, v_mntvnodes);
>  }
>
>  /*
> @@ -872,7 +872,7 @@ vfs_mount_foreach_vnode(struct mount *mp
>   int error = 0;
>
>  loop:
> - LIST_FOREACH_SAFE(vp , &mp->mnt_vnodelist, v_mntvnodes, nvp) {
> + TAILQ_FOREACH_SAFE(vp , &mp->mnt_vnodelist, v_mntvnodes, nvp) {
>   if (vp->v_mount != mp)
>   goto loop;
>
> @@ -1299,7 +1299,7 @@ printlockedvnodes(void)
>   TAILQ_FOREACH(mp, &mountlist, mnt_list) {
>   if (vfs_busy(mp, VB_READ|VB_NOWAIT))
>   continue;
> - LIST_FOREACH(vp, &mp->mnt_vnodelist, v_mntvnodes) {
> + TAILQ_FOREACH(vp, &mp->mnt_vnodelist, v_mntvnodes) {
>   if (VOP_ISLOCKED(vp))
>   vprint(NULL, vp);
>   }
> @@ -2287,7 +2287,7 @@ vfs_mount_print(struct mount *mp, int fu
>   (*pr)("locked vnodes:");
>   /* XXX would take mountlist lock, except ddb has no context */
>   cnt = 0;
> - LIST_FOREACH(vp, &mp->mnt_vnodelist, v_mntvnodes) {
> + TAILQ_FOREACH(vp, &mp->mnt_vnodelist, v_mntvnodes) {
>   if (VOP_ISLOCKED(vp)) {
>   if (cnt == 0)
>   (*pr)("\n  %p", vp);
> @@ -2304,7 +2304,7 @@ vfs_mount_print(struct mount *mp, int fu
>   (*pr)("all vnodes:");
>   /* XXX would take mountlist lock, except ddb has no context */
>   cnt = 0;
> - LIST_FOREACH(vp, &mp->mnt_vnodelist, v_mntvnodes) {
> + TAILQ_FOREACH(vp, &mp->mnt_vnodelist, v_mntvnodes) {
>   if (cnt == 0)
>   (*pr)("\n  %p", vp);
>   else if ((cnt % (72 / (sizeof(void *) * 2 + 4))) == 0)
> Index: sys/kern/vfs_syscalls.c
> ===
> RCS file: /data/mirror/openbsd/cvs/src/sys/kern/vfs_syscalls.c,v
> retrieving revision 1.338
> diff -u -p -r1.338 vfs_syscalls.c
> --- sys/kern/vfs_syscalls.c   29 Nov 2019 20:58:17 -  1.338
> +++ sys/kern/vfs_syscalls.c   4 Jan 2020 18:41:20 -
> @@ -489,7 +489,7 @@ dounmount_leaf(struct mount *mp, int fla
>* Before calling file system unmount, make sure
>* all unveils to vnodes in here are dropped.
>*/
> - LIST_FOREACH_SAFE(vp , &mp->mnt_vnodelist, v_mntvnodes, nvp) {
> + TAILQ_FOREACH_SAFE(vp , &mp->mnt_vnodelist, v_mntvnodes, nvp) {
>   unveil_removevnode(vp);
>   }
>
> @@ -511,7 +511,7 @@ dounmount_leaf(struct mount *mp, int fla
>   vrele(coveredvp);
>   }
>
> - if (!LIST_EMPTY(&mp->mnt_vnodelist))
> + if (!TAILQ_EMPTY(&mp->mnt_vnodelist))
>   panic("unmount: dangling vnode");
>
>   vfs_unbusy(mp);
> Index: sys/nfs/nfs_subs.c
> 

Re: amd64 SMEP SMAP trap panic print

2020-01-09 Thread Alexander Bluhm
On Thu, Dec 26, 2019 at 08:12:50PM +0100, Alexander Bluhm wrote:
> Any more concerns or can this be commited?

Any ok?
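
For readers skimming the diff: fault() is the usual stdarg pattern,
format once into a static buffer, publish it for ddb via faultstr,
and return 0 meaning "not handled".  A minimal userland sketch of
the same pattern (the kernel version is in the diff below):

#include <stdarg.h>
#include <stdio.h>

static char faultbuf[512];
const char *faultstr;		/* consumed later, by ddb in the kernel */

static int
fault(const char *format, ...)
{
	va_list ap;

	/* Format once, remember the message, report it. */
	va_start(ap, format);
	vsnprintf(faultbuf, sizeof faultbuf, format, ap);
	va_end(ap);
	printf("%s\n", faultbuf);
	faultstr = faultbuf;
	return 0;	/* tell the caller the fault was not handled */
}

int
main(void)
{
	fault("uvm_fault(%p, 0x%llx, 0, %d) -> %x",
	    (void *)0, 0x1000ULL, 1, 14);
	return 0;
}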

> Index: arch/amd64/amd64/db_trace.c
> ===
> RCS file: /data/mirror/openbsd/cvs/src/sys/arch/amd64/amd64/db_trace.c,v
> retrieving revision 1.47
> diff -u -p -r1.47 db_trace.c
> --- arch/amd64/amd64/db_trace.c   10 Nov 2019 10:03:33 -  1.47
> +++ arch/amd64/amd64/db_trace.c   26 Dec 2019 19:08:35 -
> @@ -150,7 +150,7 @@ db_stack_trace_print(db_expr_t addr, int
>   name = NULL;
>   }
>
> - if (lastframe == 0 && sym == NULL) {
> + if (lastframe == 0 && sym == NULL && callpc != 0) {
>   /* Symbol not found, peek at code */
>   unsigned long instr = db_get_value(callpc, 8, 0);
>
> Index: arch/amd64/amd64/trap.c
> ===
> RCS file: /data/mirror/openbsd/cvs/src/sys/arch/amd64/amd64/trap.c,v
> retrieving revision 1.77
> diff -u -p -r1.77 trap.c
> --- arch/amd64/amd64/trap.c   6 Sep 2019 12:22:01 -   1.77
> +++ arch/amd64/amd64/trap.c   26 Dec 2019 19:08:35 -
> @@ -77,6 +77,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>
>  #include 
>
> @@ -132,6 +133,24 @@ static inline void verify_smap(const cha
>  static inline void debug_trap(struct trapframe *_frame, struct proc *_p,
>  long _type);
>
> +static inline int
> +fault(const char *format, ...)
> +{
> + static char faultbuf[512];
> + va_list ap;
> +
> + /*
> +  * Save the fault info for DDB and retain the kernel lock to keep
> +  * faultbuf from being overwritten by another CPU.
> +  */
> + va_start(ap, format);
> + vsnprintf(faultbuf, sizeof faultbuf, format, ap);
> + va_end(ap);
> + printf("%s\n", faultbuf);
> + faultstr = faultbuf;
> + return 0;
> +}
> +
>  /*
>   * pageflttrap(frame, usermode): page fault handler
>   * Returns non-zero if the fault was handled (possibly by generating
> @@ -160,16 +179,14 @@ pageflttrap(struct trapframe *frame, int
>   KERNEL_LOCK();
>
>   if (!usermode) {
> - extern struct vm_map *kernel_map;
> -
>   /* This will only trigger if SMEP is enabled */
>   if (cr2 <= VM_MAXUSER_ADDRESS && frame->tf_err & PGEX_I)
> - panic("attempt to execute user address %p "
> + return fault("attempt to execute user address %p "
>   "in supervisor mode", (void *)cr2);
>   /* This will only trigger if SMAP is enabled */
>   if (pcb->pcb_onfault == NULL && cr2 <= VM_MAXUSER_ADDRESS &&
>   frame->tf_err & PGEX_P)
> - panic("attempt to access user address %p "
> + return fault("attempt to access user address %p "
>   "in supervisor mode", (void *)cr2);
>
>   /*
> @@ -211,18 +228,9 @@ pageflttrap(struct trapframe *frame, int
>   frame->tf_rip = (u_int64_t)pcb->pcb_onfault;
>   return 1;
>   } else {
> - /*
> -  * Bad memory access in the kernel; save the fault
> -  * info for DDB and retain the kernel lock to keep
> -  * faultbuf from being overwritten by another CPU.
> -  */
> - static char faultbuf[512];
> - snprintf(faultbuf, sizeof faultbuf,
> - "uvm_fault(%p, 0x%llx, 0, %d) -> %x",
> + /* bad memory access in the kernel */
> + return fault("uvm_fault(%p, 0x%llx, 0, %d) -> %x",
>   map, cr2, ftype, error);
> - printf("%s\n", faultbuf);
> - faultstr = faultbuf;
> - return 0;
>   }
>   } else {
>   union sigval sv;
> @@ -395,7 +403,7 @@ trap_print(struct trapframe *frame, int
>   printf(" in %s mode\n", KERNELMODE(frame->tf_cs, frame->tf_rflags) ?
>   "supervisor" : "user");
>   printf("trap type %d code %llx rip %llx cs %llx rflags %llx cr2 "
> -" %llx cpl %x rsp %llx\n",
> +"%llx cpl %x rsp %llx\n",
>   type, frame->tf_err, frame->tf_rip, frame->tf_cs,
>   frame->tf_rflags, rcr2(), curcpu()->ci_ilevel, frame->tf_rsp);
>   printf("gsbase %p  kgsbase %p\n",
> Index: ddb/db_examine.c
> ===
> RCS file: /data/mirror/openbsd/cvs/src/sys/ddb/db_examine.c,v
> retrieving revision 1.26
> diff -u -p -r1.26 db_examine.c
> --- ddb/db_examine.c  7 Nov 2019 13:16:25 -   1.26
> +++ ddb/db_examine.c  26 Dec 2019 19:08:35 -
> @@ -288,8 +288,10 @@ void
>  db_print_loc_and_inst(vaddr_t loc)
>  {
>   db_printsym(loc, 

bgpd, move peer related functions to rde_peer.c

2020-01-09 Thread Claudio Jeker
This diff just moves some of the code from rde.c to rde_peer.c, where it
should be. Apart from moving the code, peer_down() was modified so that it
can be used as a callback of peer_foreach(). Also, peer_foreach() was changed
to just walk the peer list instead of traversing the peer hash table.

OK?
-- 
:wq Claudio
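
To illustrate the peer_foreach() change, a minimal sketch of the
plain list walk; the struct and field names are made up, not copied
from bgpd:

#include <sys/queue.h>

struct demo_peer {
	LIST_ENTRY(demo_peer) entry;
	/* ... peer state ... */
};
LIST_HEAD(, demo_peer) demo_peers = LIST_HEAD_INITIALIZER(demo_peers);

/* Walk the flat peer list; the _SAFE variant allows a callback such
 * as peer_down() to remove the current entry during the walk. */
void
demo_peer_foreach(void (*callback)(struct demo_peer *, void *), void *arg)
{
	struct demo_peer *p, *np;

	LIST_FOREACH_SAFE(p, &demo_peers, entry, np)
		callback(p, arg);
}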


Index: rde.c
===
RCS file: /cvs/src/usr.sbin/bgpd/rde.c,v
retrieving revision 1.496
diff -u -p -r1.496 rde.c
--- rde.c   8 Jan 2020 18:01:22 -   1.496
+++ rde.c   9 Jan 2020 07:51:33 -
@@ -20,12 +20,10 @@
  */
 
 #include 
-#include 
 #include 
 #include 
 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -63,9 +61,6 @@ intrde_get_mp_nexthop(u_char *, u_int
 struct filterstate *);
 voidrde_update_err(struct rde_peer *, u_int8_t , u_int8_t,
 void *, u_int16_t);
-voidrde_update_log(const char *, u_int16_t,
-const struct rde_peer *, const struct bgpd_addr *,
-const struct bgpd_addr *, u_int8_t);
 voidrde_as4byte_fixup(struct rde_peer *, struct rde_aspath *);
 voidrde_reflector(struct rde_peer *, struct rde_aspath *);
 
@@ -94,20 +89,8 @@ void  rde_mark_prefixsets_dirty(struct 
 u_int8_trde_roa_validity(struct rde_prefixset *,
 struct bgpd_addr *, u_int8_t, u_int32_t);
 
-voidpeer_init(u_int32_t);
-voidpeer_shutdown(void);
-voidpeer_foreach(void (*)(struct rde_peer *, void *), void *);
-int peer_imsg_pending(void);
-int peer_localaddrs(struct rde_peer *, struct bgpd_addr *);
-struct rde_peer *peer_match(struct ctl_neighbor *, u_int32_t);
-struct rde_peer*peer_add(u_int32_t, struct peer_config *);
-voidpeer_up(struct rde_peer *, struct session_up *);
-voidpeer_down(struct rde_peer *);
-voidpeer_flush(struct rde_peer *, u_int8_t, time_t);
-voidpeer_stale(struct rde_peer *, u_int8_t);
-voidpeer_dump(struct rde_peer *, u_int8_t);
-static void peer_recv_eor(struct rde_peer *, u_int8_t);
-static void peer_send_eor(struct rde_peer *, u_int8_t);
+static void rde_peer_recv_eor(struct rde_peer *, u_int8_t);
+static void rde_peer_send_eor(struct rde_peer *, u_int8_t);
 
 voidnetwork_add(struct network_config *, struct filterstate *);
 voidnetwork_delete(struct network_config *);
@@ -115,13 +98,10 @@ static void network_dump_upcall(struct 
 static void network_flush_upcall(struct rib_entry *, void *);
 
 voidrde_shutdown(void);
-int sa_cmp(struct bgpd_addr *, struct sockaddr *);
 int ovs_match(struct prefix *, u_int32_t);
 
 volatile sig_atomic_t   rde_quit = 0;
 struct bgpd_config *conf, *nconf;
-struct rde_peer_headpeerlist;
-struct rde_peer*peerself;
 struct filter_head *out_rules, *out_rules_tmp;
 struct imsgbuf *ibuf_se;
 struct imsgbuf *ibuf_se_ctl;
@@ -129,6 +109,9 @@ struct imsgbuf  *ibuf_main;
 struct rde_memstats rdemem;
 int softreconfig;
 
+extern struct rde_peer_head peerlist;
+extern struct rde_peer *peerself;
+
 struct rde_dump_ctx {
LIST_ENTRY(rde_dump_ctx)entry;
struct ctl_show_rib_request req;
@@ -990,10 +973,14 @@ rde_dispatch_imsg_peer(struct rde_peer *
if (imsg.hdr.len - IMSG_HEADER_SIZE != sizeof(sup))
fatalx("incorrect size of session request");
memcpy(&sup, imsg.data, sizeof(sup));
-   peer_up(peer, &sup);
+   if (peer_up(peer, &sup) == -1) {
+   peer->state = PEER_DOWN;
+   imsg_compose(ibuf_se, IMSG_SESSION_DOWN, peer->conf.id,
+   0, -1, NULL, 0);
+   }
break;
case IMSG_SESSION_DOWN:
-   peer_down(peer);
+   peer_down(peer, NULL);
break;
case IMSG_SESSION_STALE:
case IMSG_SESSION_FLUSH:
@@ -1088,7 +1075,7 @@ rde_update_dispatch(struct rde_peer *pee
}
if (withdrawn_len == 0) {
/* EoR marker */
-   peer_recv_eor(peer, AID_INET);
+   rde_peer_recv_eor(peer, AID_INET);
return;
}
}
@@ -1200,7 +1187,7 @@ rde_update_dispatch(struct rde_peer *pee
if ((state.aspath.flags & ~F_ATTR_MP_UNREACH) == 0 &&
mplen == 0) {
/* EoR marker */
-   peer_recv_eor(peer, aid);
+   rde_peer_recv_eor(peer, aid);
}
 
switch (aid) {
@@ -2290,7 +2277,7 @@ rde_dump_rib_as(struct prefix *p, struct
}
 }
 
-static int
+int