Re: Use SMR instead of SRP list in rtsock.c

2022-08-10 Thread Visa Hankala
On Wed, Aug 10, 2022 at 11:08:06AM +0200, Claudio Jeker wrote:
> On Fri, Jul 01, 2022 at 04:03:21PM +, Visa Hankala wrote:
> > On Fri, Jul 01, 2022 at 09:59:11AM +0200, Claudio Jeker wrote:
> > > On Thu, Jun 30, 2022 at 03:46:35PM +, Visa Hankala wrote:
> > > > On Thu, Jun 30, 2022 at 11:51:52AM +0200, Claudio Jeker wrote:
> > > > > After discussing this with mpi@ and jmatthew@ we came to the 
> > > > > conclusion
> > > > > that we need to smr_barrier() before refcnt_finalize() to ensure that 
> > > > > no
> > > > > other CPU is between the SMR_TAILQ_FOREACH, refcnt_take() and
> > > > > smr_read_leave().
> > > > 
> > > > [...]
> > > > 
> > > > > @@ -509,7 +487,8 @@ route_input(struct mbuf *m0, struct sock
> > > > >   return;
> > > > >   }
> > > > >  
> > > > > - SRPL_FOREACH(rop, , _list, rop_list) {
> > > > > + smr_read_enter();
> > > > > + SMR_TAILQ_FOREACH(rop, _list, rop_list) {
> > > > >   /*
> > > > >* If route socket is bound to an address family only 
> > > > > send
> > > > >* messages that match the address family. Address 
> > > > > family
> > > > > @@ -519,7 +498,8 @@ route_input(struct mbuf *m0, struct sock
> > > > >   rop->rop_proto != sa_family)
> > > > >   continue;
> > > > >  
> > > > > -
> > > > > + refcnt_take(>rop_refcnt);
> > > > > + smr_read_leave();
> > > > >   so = rop->rop_socket;
> > > > >   solock(so);
> > > > >  
> > > > > @@ -579,8 +559,10 @@ route_input(struct mbuf *m0, struct sock
> > > > >   rtm_sendup(so, m);
> > > > >  next:
> > > > >   sounlock(so);
> > > > > + smr_read_enter();
> > > > > + refcnt_rele_wake(>rop_refcnt);
> > > > 
> > > > This does not look correct.
> > > > 
> > > > smr_barrier() can proceed after smr_read_leave(), so refcnt_rele_wake()
> > > > might drop the final reference and this thread can no longer access
> > > > rop safely (SMR_TAILQ_NEXT() inside SMR_TAILQ_FOREACH()).
> > > > 
> > > > Also, SMR_TAILQ_NEXT() of rop becomes potentially dangling after
> > > > smr_read_leave(). After this thread leaves the read-side critical
> > > > section, another thread might free rop's successor.
> > > 
> > > So we need to either smr_barrier() before and after the refcnt_finalize()
> > > to make sure that the rop pointer remains stable in both cases or we alter
> > > the SMR_TAILQ_FOREACH() loop so that SMR_TAILQ_NEXT can be grabbed before
> > > refcnt_rele_wake().
> > > 
> > > While the double smr_barrier() is trivial it is not ideal and I think it
> > > is better to adjust the loop since SMR loops with sleep points is a
> > > somewhat common issue and so we should have a good clear way on how to
> > > solve it.
> > 
> > Adjusting SMR_TAILQ_FOREACH() will not help.
> > 
> > In general, a reader cannot resume a lockless iteration after it has
> > left the read-side critical section and crossed a sleep point. The
> > guarantee of consistent(-looking) forward linkage is gone. The reader
> > no longer knows if the value of SMR_TAILQ_NEXT() is valid. If the
> > reader wants to continue with the list, it has to re-enter the read-side
> > critical section and restart the iteration.
> 
> This is not a real SMR_TAILQ_FOREACH() use case so trying to use
> SMR_TAILQ_FOREACH() here is not right. The code wants to walk the list of
> route pcbs linked via rop_list. The code just needs to walk all active
> connections and does not care about races with sockets that are concurrently
> closed or opened. In the first case SS_CANTRCVMORE will be set and the
> socket is skipped and in the second case the socket is simply ignored
> because new sockets are inserted at the head of the list.
>  
> It is not a lockless iteration over the full list. It is not required to
> be either. The only thing that matters is that the forward linkage is
> consitent (not pointing to invalid objects).
> 
> There is no need to restart the iteration because element on the list can
> not be reinserted. They can only be removed and a removed element does not
> get any message anyway (either by not visiting the object or by skipping
> it in the loop).
> 
> The refcnt ensures that the currently used pcb is not freed before the
> next element is picked. As long as the refcnt is hold the object can't be
> removed.

Lets assume that another thread begins to detach rop while
route_input() sleeps. The reference prevents early freeing of rop.

Lets assume further that yet another thread detaches and frees the
successor of rop while the first thread is still sleeping. What will
SMR_LIST_NEXT(rop, rop_list) point to?



Re: echo(1): check for stdio errors

2022-08-10 Thread Scott Cheloha
On Thu, Aug 11, 2022 at 02:22:08AM +0200, Jeremie Courreges-Anglas wrote:
> On Wed, Aug 10 2022, Scott Cheloha  wrote:
> > [...]
> >
> > 1. Our ksh(1) already checks for stdout errors in the echo builtin.
> 
> So do any of the scripts in our source tree use /bin/echo for whatever
> reason?  If so, could one of these scripts be broken if /bin/echo
> started to report an error?  Shouldn't those scripts be reviewed?

I didn't look.

There are... hundreds files that look like shell scripts in src.

$ cd /usr/src
$ find . -exec egrep -l '\#.*\!.*sh' > ~/src-shell-script-paths
$ wc -l ~/src-shell-script-paths
1118 /home/ssc/src-shell-script-paths

A lot of them are in regress/.

I guess I better start looking.



Re: echo(1): check for stdio errors

2022-08-10 Thread Jeremie Courreges-Anglas
On Wed, Aug 10 2022, Scott Cheloha  wrote:
> On Wed, Aug 10, 2022 at 02:23:08PM -0600, Theo de Raadt wrote:
>> Scott Cheloha  wrote:
>> 
>> > On Wed, Aug 10, 2022 at 12:26:17PM -0600, Theo de Raadt wrote:
>> > > Scott Cheloha  wrote:
>> > > 
>> > > > We're sorta-kinda circling around adding the missing (?) stdio error
>> > > > checking to other utilities in bin/ and usr.bin/, no?  I want to be
>> > > > sure I understand how to do the next patch, because if we do that it
>> > > > will probably be a bunch of programs all at once.
>> > > 
>> > > This specific program has not checked for this condition since at least
>> > > 2 AT UNIX.
>> > > 
>> > > Your change does not just add a new warning.  It adds a new exit code
>> > > condition.
>> > > 
>> > > Some scripts using echo, which accepted the condition because echo would
>> > > exit 0 and not check for this condition, will now see this exit 1.  Some
>> > > scripts will abort, because they use "set -o errexit" or similar.
>> > > 
>> > > You are changing the exit code for a command which is used a lot.
>> > > 
>> > > POSIX does not require or specify exit 1 for this condition.  If you
>> > > disagree, please show where it says so.
>> > 
>> > It's the usual thing.  >0 if "an error occurred".
>> 
>> The 40 year old code base says otherwise.
>> 
>> > Here is my thinking:
>> > 
>> >echo(1) has ONE job: print the arguments given.
>> > 
>> > If it fails to print those arguments, shouldn't we signal that to the
>> > program that forked echo(1)?
>> 
>> Only if you validate all callers can handle this change in behaviour.
>> 
>> > How is echo(1) supposed to differentiate between a write(2) that is
>> > allowed to fail, e.g. a diagnostic printout from fw_update to the
>> > user's stderr, and one that is not allowed to fail?
>> 
>> Perhaps it is not supposed to validate this problem  in 2022, because it
>> didn't validate it for 40 years.
>> 
>> > Consider this scenario:
>> > 
>> > 1.  A shell script uses echo(1) to write something to a file.
>> > 
>> >/bin/echo foo.dat >> /var/workerd/data-processing-list
>> > 
>> > 2.  The bytes don't arrive on disk because the file system is full.
>> > 
>> > 3.  The shell script succeeds because echo(1) can't fail, even if
>> > it fails to do what it was told to do.
>> > 
>> > Isn't that bad?
>> > 
>> > And it isn't necessarily true that some other thing will fail later
>> > and the whole interlocking system will fail.  ENOSPC is a transient
>> > condition.  One write(2) can fail and the next write(2) can succeed.
>> 
>> Yet, for 40 years noone complained.
>> 
>> Consider the situation you break and change the behaviour of 1000's of
>> shell scripts, and haven'd lifted your finger once to review all the
>> shell scripts that call echo.
>> 
>> Have you even compared this behaviour to the echo built-ins in all
>> the shells?
>
> I assume what you mean to say is, roughly:
>
>   Gee, this seems risky.
>
>   What do other echo implementations do?
>
> 1. Our ksh(1) already checks for stdout errors in the echo builtin.

So do any of the scripts in our source tree use /bin/echo for whatever
reason?  If so, could one of these scripts be broken if /bin/echo
started to report an error?  Shouldn't those scripts be reviewed?

/bin/echo may look unimportant but I think it deserves the same
treatment as more complex tools that you may modify in a similar way.
While I agree that adding more error reporting would be good, it
sounds fair that it should be done while caring for the actual errors. ;)

[...]

-- 
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE



Re: echo(1): check for stdio errors

2022-08-10 Thread Scott Cheloha
On Wed, Aug 10, 2022 at 02:23:08PM -0600, Theo de Raadt wrote:
> Scott Cheloha  wrote:
> 
> > On Wed, Aug 10, 2022 at 12:26:17PM -0600, Theo de Raadt wrote:
> > > Scott Cheloha  wrote:
> > > 
> > > > We're sorta-kinda circling around adding the missing (?) stdio error
> > > > checking to other utilities in bin/ and usr.bin/, no?  I want to be
> > > > sure I understand how to do the next patch, because if we do that it
> > > > will probably be a bunch of programs all at once.
> > > 
> > > This specific program has not checked for this condition since at least
> > > 2 AT UNIX.
> > > 
> > > Your change does not just add a new warning.  It adds a new exit code
> > > condition.
> > > 
> > > Some scripts using echo, which accepted the condition because echo would
> > > exit 0 and not check for this condition, will now see this exit 1.  Some
> > > scripts will abort, because they use "set -o errexit" or similar.
> > > 
> > > You are changing the exit code for a command which is used a lot.
> > > 
> > > POSIX does not require or specify exit 1 for this condition.  If you
> > > disagree, please show where it says so.
> > 
> > It's the usual thing.  >0 if "an error occurred".
> 
> The 40 year old code base says otherwise.
> 
> > Here is my thinking:
> > 
> > echo(1) has ONE job: print the arguments given.
> > 
> > If it fails to print those arguments, shouldn't we signal that to the
> > program that forked echo(1)?
> 
> Only if you validate all callers can handle this change in behaviour.
> 
> > How is echo(1) supposed to differentiate between a write(2) that is
> > allowed to fail, e.g. a diagnostic printout from fw_update to the
> > user's stderr, and one that is not allowed to fail?
> 
> Perhaps it is not supposed to validate this problem  in 2022, because it
> didn't validate it for 40 years.
> 
> > Consider this scenario:
> > 
> > 1.  A shell script uses echo(1) to write something to a file.
> > 
> > /bin/echo foo.dat >> /var/workerd/data-processing-list
> > 
> > 2.  The bytes don't arrive on disk because the file system is full.
> > 
> > 3.  The shell script succeeds because echo(1) can't fail, even if
> > it fails to do what it was told to do.
> > 
> > Isn't that bad?
> > 
> > And it isn't necessarily true that some other thing will fail later
> > and the whole interlocking system will fail.  ENOSPC is a transient
> > condition.  One write(2) can fail and the next write(2) can succeed.
> 
> Yet, for 40 years noone complained.
> 
> Consider the situation you break and change the behaviour of 1000's of
> shell scripts, and haven'd lifted your finger once to review all the
> shell scripts that call echo.
> 
> Have you even compared this behaviour to the echo built-ins in all
> the shells?

I assume what you mean to say is, roughly:

Gee, this seems risky.

What do other echo implementations do?

1. Our ksh(1) already checks for stdout errors in the echo builtin.

2. FreeBSD's /bin/echo has checked for writev(2) errors in /bin/echo
   since 2003:

https://cgit.freebsd.org/src/commit/bin/echo/echo.c?id=91b7d6dc5871f532b1a86ee76389a9bc348bdf58

3. NetBSD's /bin/echo has checked for stdout errors with ferror(3)
   since 2008:

http://cvsweb.netbsd.org/bsdweb.cgi/src/bin/echo/echo.c?rev=1.18=text/x-cvsweb-markup_with_tag=MAIN

4. NetBSD's /bin/sh echo builtin has checked for write errors since
   2008:

http://cvsweb.netbsd.org/bsdweb.cgi/src/bin/sh/bltin/echo.c?rev=1.14=text/x-cvsweb-markup_with_tag=MAIN

5. OpenSolaris has checked for fflush(3) errors in /usr/bin/echo since
   2005 (OpenSolaris launch):

https://github.com/illumos/illumos-gate/blob/7c478bd95313f5f23a4c958a745db2134aa03244/usr/src/cmd/echo/echo.c#L144

6. Looking forward, illumos inherited and retains the behavior in
   their /usr/bin/echo.

7. Extrapolating backward, we can assume Solaris did that checking in
   /usr/bin/echo prior to 2005.

8. GNU Coreutils echo has checked for fflush(3) and fclose(3) errors on
   stdout since 2000:

https://git.savannah.gnu.org/cgit/coreutils.git/commit/src/echo.c?id=d3683509b3953beb014e540f6d6194658ede1dea

   They use close_stdout() in an atexit(3) hook.  close_stdout() is a
   convenience function provided by gnulib since 1998 that does what I
   described:

https://git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commitdiff;h=23928550db5d400f27fa67de29c738ca324a31ea;hp=f76477e515b36a1e10f7734aac3c5478ccf75989

   Maybe of note is that they do this atexit(3) stdout flush/close
   error checking for many of their utilities.

9. The GNU Bash echo builtin has checked for write errors since v2.04,
   in 2000:

https://git.savannah.gnu.org/cgit/bash.git/commit/builtins/echo.def?id=bb70624e964126b7ac4ff085ba163a9c35ffa18f

   They even noted it in the CHANGES file for that release:

https://git.savannah.gnu.org/cgit/bash.git/commit/CHANGES?id=bb70624e964126b7ac4ff085ba163a9c35ffa18f

--

I don't think that we are first movers in this case.



Re: echo(1): check for stdio errors

2022-08-10 Thread Jeremie Courreges-Anglas
On Wed, Aug 10 2022, "Theo de Raadt"  wrote:
> Scott Cheloha  wrote:
>
>> We're sorta-kinda circling around adding the missing (?) stdio error
>> checking to other utilities in bin/ and usr.bin/, no?  I want to be
>> sure I understand how to do the next patch, because if we do that it
>> will probably be a bunch of programs all at once.
>
>
> This specific program has not checked for this condition since at least
> 2 AT UNIX.
>
> Your change does not just add a new warning.  It adds a new exit code
> condition.
>
> Some scripts using echo, which accepted the condition because echo would
> exit 0 and not check for this condition, will now see this exit 1.  Some
> scripts will abort, because they use "set -o errexit" or similar.

About this, and only related to echo: the builtin version in ksh(1) does
report an error.  I have just tried the mentioned ENOSPC case with

  exec 3>/tmp/o/file; echo foobar >&3; echo $?

and /tmp/o being a full MFS filesystem.  So shell scripts run with
ksh(1) should already handle echo - the builtin - failures.
I think it makes sense for the echo(1) executable to behave like the
shell builtin.

For the other programs, well, I don't have a strong feeling.  I'll just
note that adding stdio error reporting has been done in some other
projects.

-- 
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE



Re: echo(1): check for stdio errors

2022-08-10 Thread Theo de Raadt
Scott Cheloha  wrote:

> On Wed, Aug 10, 2022 at 12:26:17PM -0600, Theo de Raadt wrote:
> > Scott Cheloha  wrote:
> > 
> > > We're sorta-kinda circling around adding the missing (?) stdio error
> > > checking to other utilities in bin/ and usr.bin/, no?  I want to be
> > > sure I understand how to do the next patch, because if we do that it
> > > will probably be a bunch of programs all at once.
> > 
> > This specific program has not checked for this condition since at least
> > 2 AT UNIX.
> > 
> > Your change does not just add a new warning.  It adds a new exit code
> > condition.
> > 
> > Some scripts using echo, which accepted the condition because echo would
> > exit 0 and not check for this condition, will now see this exit 1.  Some
> > scripts will abort, because they use "set -o errexit" or similar.
> > 
> > You are changing the exit code for a command which is used a lot.
> > 
> > POSIX does not require or specify exit 1 for this condition.  If you
> > disagree, please show where it says so.
> 
> It's the usual thing.  >0 if "an error occurred".

The 40 year old code base says otherwise.

> Here is my thinking:
> 
>   echo(1) has ONE job: print the arguments given.
> 
> If it fails to print those arguments, shouldn't we signal that to the
> program that forked echo(1)?

Only if you validate all callers can handle this change in behaviour.

> How is echo(1) supposed to differentiate between a write(2) that is
> allowed to fail, e.g. a diagnostic printout from fw_update to the
> user's stderr, and one that is not allowed to fail?

Perhaps it is not supposed to validate this problem  in 2022, because it
didn't validate it for 40 years.

> Consider this scenario:
> 
> 1.  A shell script uses echo(1) to write something to a file.
> 
>   /bin/echo foo.dat >> /var/workerd/data-processing-list
> 
> 2.  The bytes don't arrive on disk because the file system is full.
> 
> 3.  The shell script succeeds because echo(1) can't fail, even if
> it fails to do what it was told to do.
> 
> Isn't that bad?
> 
> And it isn't necessarily true that some other thing will fail later
> and the whole interlocking system will fail.  ENOSPC is a transient
> condition.  One write(2) can fail and the next write(2) can succeed.

Yet, for 40 years noone complained.


Consider the situation you break and change the behaviour of 1000's of
shell scripts, and haven'd lifted your finger once to review all the
shell scripts that call echo.

Have you even compared this behaviour to the echo built-ins in all
the shells?

Isn't that bad?


Your proposal is dead in the water until you assess *the code that
calls echo*.  If you find one script that breaks, the proposal is
even more dead.



Re: echo(1): check for stdio errors

2022-08-10 Thread Scott Cheloha
On Wed, Aug 10, 2022 at 12:26:17PM -0600, Theo de Raadt wrote:
> Scott Cheloha  wrote:
> 
> > We're sorta-kinda circling around adding the missing (?) stdio error
> > checking to other utilities in bin/ and usr.bin/, no?  I want to be
> > sure I understand how to do the next patch, because if we do that it
> > will probably be a bunch of programs all at once.
> 
> This specific program has not checked for this condition since at least
> 2 AT UNIX.
> 
> Your change does not just add a new warning.  It adds a new exit code
> condition.
> 
> Some scripts using echo, which accepted the condition because echo would
> exit 0 and not check for this condition, will now see this exit 1.  Some
> scripts will abort, because they use "set -o errexit" or similar.
> 
> You are changing the exit code for a command which is used a lot.
> 
> POSIX does not require or specify exit 1 for this condition.  If you
> disagree, please show where it says so.

It's the usual thing.  >0 if "an error occurred".

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/echo.html

EXIT STATUS

The following exit values shall be returned:

 0
Successful completion.
>0
An error occurred.

CONSEQUENCES OF ERRORS

Default.

> So my question is:  What will be broken by this change?
> 
> Nothing isn't an answer.  I can write a 5 line shell script that will
> observe the change in behaviour.  Many large shell scripts could break
> from this.  I am thinking of fw_update and the installer, but it could
> also be a problem in Makefiles.

Here is my thinking:

echo(1) has ONE job: print the arguments given.

If it fails to print those arguments, shouldn't we signal that to the
program that forked echo(1)?

How is echo(1) supposed to differentiate between a write(2) that is
allowed to fail, e.g. a diagnostic printout from fw_update to the
user's stderr, and one that is not allowed to fail?

> > I want to be sure I understand how to do the next patch, because if we
> > do that it will probably be a bunch of programs all at once.
> 
> If you cannot speak to the exit code command changing for this one
> simple program, I think there is no case for adding to to hundreds of
> other programs.  Unless POSIX specifies the requirement, I'd like to see
> some justification.
> 
> There will always be situations that UNIX didn't anticipate or handle,
> and then POSIX failed to specify.  Such things are now unhandled, probably
> forever, and have become defacto standards.
> 
> On the balance, is your diff improving on some dangerous problem, or is
> it introducing a vast number of dangerous new risks which cannot be
> identified (and which would require an audit of every known script
> calling echo).  Has such an audit been started?

Consider this scenario:

1.  A shell script uses echo(1) to write something to a file.

/bin/echo foo.dat >> /var/workerd/data-processing-list

2.  The bytes don't arrive on disk because the file system is full.

3.  The shell script succeeds because echo(1) can't fail, even if
it fails to do what it was told to do.

Isn't that bad?

And it isn't necessarily true that some other thing will fail later
and the whole interlocking system will fail.  ENOSPC is a transient
condition.  One write(2) can fail and the next write(2) can succeed.



Re: slowcgi, httpd and fastcgi abnormal termination

2022-08-10 Thread Omar Polo
On 2022/08/10 15:07:15 +0200, Claudio Jeker wrote:
> On Sun, Aug 07, 2022 at 11:10:22AM +0200, Omar Polo wrote:
> > blob - ddf83f965d0e6a99ada695694bea77b775bae2aa
> > blob + 1d577ba63efca388ca3644d1a52d9b3d9f246014
> > --- usr.sbin/slowcgi/slowcgi.c
> > +++ usr.sbin/slowcgi/slowcgi.c
> > @@ -515,7 +515,34 @@ slowcgi_accept(int fd, short events, void *arg)
> >  void
> >  slowcgi_timeout(int fd, short events, void *arg)
> >  {
> > -   cleanup_request((struct request*) arg);
> > +   struct request  *c;
> > +
> > +   c = arg;
> > +
> > +   if (c->script_flags & (STDOUT_DONE | STDERR_DONE | SCRIPT_DONE))
> 
> Is this the correct check? If we hit the timeout we want to close all
> pipes to the CGI process and hope it dies because of that. Now one of
> STDOUT_DONE or STDERR_DONE may have happened but the script is still
> running.
> 
> I think it would be better to use:
> if (c->script_flags == (STDOUT_DONE | STDERR_DONE | SCRIPT_DONE))
> return;
> here.

Woops!  Yes, I meant to use == and not & here.

> Should slowcgi kill the command if SCRIPT_DONE is not set? 

RFC 3875 says that a script should be prepared to handle abnormal
termination and tha the server acn interrupt or terminate it at any
time and without warning, so killing the script is a possibility.

However, I'm a bit worried.  We could hit the timer not because of a
faulty script but because an HTTP client is purposefully reading data
slowly.  This could even allow to kill the scripts at specific points.
Maybe I'm overthinking and it's not an issue.

Anyway, here's an updated version of the diff for slowcgi with the
fixed logic.

diff refs/remotes/origin/master refs/heads/wip
commit - 3f5af05ce0a79edb8ab7606a968e2ff2617f1972
commit + acc148801b858e3177b9b43a93d20fd53f97d40d
blob - 9a0f8b5656d10fd09e62691a5ca827d357f1ff7a
blob + 77e370c99d0d26646e699a285665f90323411444
--- usr.sbin/slowcgi/slowcgi.c
+++ usr.sbin/slowcgi/slowcgi.c
@@ -515,7 +515,34 @@ slowcgi_accept(int fd, short events, void *arg)
 void
 slowcgi_timeout(int fd, short events, void *arg)
 {
-   cleanup_request((struct request*) arg);
+   struct request  *c;
+
+   c = arg;
+
+   if (c->script_flags == (STDOUT_DONE | STDERR_DONE | SCRIPT_DONE))
+   return;
+
+   if (!c->stdout_fd_closed) {
+   c->stdout_fd_closed = 1;
+   close(EVENT_FD(>script_ev));
+   event_del(>script_ev);
+   }
+
+   if (!c->stderr_fd_closed) {
+   c->stderr_fd_closed = 1;
+   close(EVENT_FD(>script_err_ev));
+   event_del(>script_err_ev);
+   }
+
+   if (!c->stdin_fd_closed) {
+   c->stdin_fd_closed = 1;
+   close(EVENT_FD(>script_stdin_ev));
+   event_del(>script_stdin_ev);
+   }
+
+   c->script_status = SIGALRM;
+   c->script_flags = STDOUT_DONE | STDERR_DONE | SCRIPT_DONE;
+   create_end_record(c);
 }
 
 void



Re: echo(1): check for stdio errors

2022-08-10 Thread Theo de Raadt
Scott Cheloha  wrote:

> We're sorta-kinda circling around adding the missing (?) stdio error
> checking to other utilities in bin/ and usr.bin/, no?  I want to be
> sure I understand how to do the next patch, because if we do that it
> will probably be a bunch of programs all at once.


This specific program has not checked for this condition since at least
2 AT UNIX.

Your change does not just add a new warning.  It adds a new exit code
condition.

Some scripts using echo, which accepted the condition because echo would
exit 0 and not check for this condition, will now see this exit 1.  Some
scripts will abort, because they use "set -o errexit" or similar.

You are changing the exit code for a command which is used a lot.

POSIX does not require or specify exit 1 for this condition.  If you
disagree, please show where it says so.

So my question is:  What will be broken by this change?

Nothing isn't an answer.  I can write a 5 line shell script that will
observe the change in behaviour.  Many large shell scripts could break
from this.  I am thinking of fw_update and the installer, but it could
also be a problem in Makefiles.

> I want to be sure I understand how to do the next patch, because if we
> do that it will probably be a bunch of programs all at once.

If you cannot speak to the exit code command changing for this one
simple program, I think there is no case for adding to to hundreds of
other programs.  Unless POSIX specifies the requirement, I'd like to see
some justification.

There will always be situations that UNIX didn't anticipate or handle,
and then POSIX failed to specify.  Such things are now unhandled, probably
forever, and have become defacto standards.

On the balance, is your diff improving on some dangerous problem, or is
it introducing a vast number of dangerous new risks which cannot be
identified (and which would require an audit of every known script
calling echo).  Has such an audit been started?







Re: echo(1): check for stdio errors

2022-08-10 Thread Scott Cheloha
On Sat, Jul 30, 2022 at 05:23:37PM -0600, Todd C. Miller wrote:
> On Sat, 30 Jul 2022 18:19:02 -0500, Scott Cheloha wrote:
> 
> > Bump.  The standard's error cases for fflush(3) are identical to those
> > for fclose(3):
> >
> > https://pubs.opengroup.org/onlinepubs/9699919799/functions/fflush.html
> > https://pubs.opengroup.org/onlinepubs/9699919799/functions/fclose.html
> >
> > Is the fact that our fclose(3) can succeed even if the error flag is
> > set a bug?
> 
> As far as I can tell, neither fflush() nor fclose() check the status
> of the error flag, though they may set it of course.  That is why
> I was suggesting an explicit ferror() call at the end.

I'm sorry, I'm having a dumb moment, I don't quite understand what
you're looking for.

Please tweak my patch so it's the way you want it, with the ferror(3)
call in the right spot.

We're sorta-kinda circling around adding the missing (?) stdio error
checking to other utilities in bin/ and usr.bin/, no?  I want to be
sure I understand how to do the next patch, because if we do that it
will probably be a bunch of programs all at once.

Index: echo.c
===
RCS file: /cvs/src/bin/echo/echo.c,v
retrieving revision 1.10
diff -u -p -r1.10 echo.c
--- echo.c  9 Oct 2015 01:37:06 -   1.10
+++ echo.c  10 Aug 2022 18:00:12 -
@@ -53,12 +53,15 @@ main(int argc, char *argv[])
nflag = 0;
 
while (*argv) {
-   (void)fputs(*argv, stdout);
-   if (*++argv)
-   putchar(' ');
+   if (fputs(*argv, stdout) == EOF)
+   err(1, "stdout");
+   if (*++argv && putchar(' ') == EOF)
+   err(1, "stdout");
}
-   if (!nflag)
-   putchar('\n');
+   if (!nflag && putchar('\n') == EOF)
+   err(1, "stdout");
+   if (fflush(stdout) == EOF || ferror(stdout) || fclose(stdout) == EOF)
+   err(1, "stdout");
 
return 0;
 }



Re: ts(1): parse input format string only once

2022-08-10 Thread Scott Cheloha
On Fri, Jul 29, 2022 at 08:13:14AM -0500, Scott Cheloha wrote:
> On Wed, Jul 13, 2022 at 12:50:24AM -0500, Scott Cheloha wrote:
> > We reduce overhead if we only parse the user's format string once.  To
> > achieve that, this patch does the following:
> > 
> > [...]
> > 
> > - When parsing the user format string in fmtfmt(), keep a list of
> >   where each microsecond substring lands in buf.  We'll need it later.
> > 
> > - Move the printing part of fmtfmt() into a new function, fmtprint().
> >   fmtprint() is now called from the main loop instead of fmtfmt().
> > 
> > - In fmtprint(), before calling strftime(3), update any microsecond
> >   substrings in buf using the list we built earlier in fmtfmt().  Note
> >   that if there aren't any such substrings we don't call snprintf(3)
> >   at all.
> > 
> > [...]
> 
> Two week bump.
> 
> Here is a stripped-down patch with only the above changes.  Hopefully
> this makes the intent of the patch more obvious.

Four week bump + rebase.

Index: ts.c
===
RCS file: /cvs/src/usr.bin/ts/ts.c,v
retrieving revision 1.9
diff -u -p -r1.9 ts.c
--- ts.c3 Aug 2022 16:54:30 -   1.9
+++ ts.c10 Aug 2022 17:49:53 -
@@ -17,6 +17,7 @@
  */
 
 #include 
+#include 
 #include 
 
 #include 
@@ -27,13 +28,20 @@
 #include 
 #include 
 
+SIMPLEQ_HEAD(, usec) usec_queue = SIMPLEQ_HEAD_INITIALIZER(usec_queue);
+struct usec {
+   SIMPLEQ_ENTRY(usec) next;
+   char *pos;
+};
+
 static char*format = "%b %d %H:%M:%S";
 static char*buf;
 static char*outbuf;
 static size_t   bufsize;
 static size_t   obsize;
 
-static void fmtfmt(const struct timespec *);
+static void fmtfmt(void);
+static void fmtprint(const struct timespec *);
 static void __dead  usage(void);
 
 int
@@ -90,6 +98,8 @@ main(int argc, char *argv[])
if ((outbuf = calloc(1, obsize)) == NULL)
err(1, NULL);
 
+   fmtfmt();
+
/* force UTC for interval calculations */
if (iflag || sflag)
if (setenv("TZ", "UTC", 1) == -1)
@@ -108,7 +118,7 @@ main(int argc, char *argv[])
timespecadd(, _offset, );
else
ts = now;
-   fmtfmt();
+   fmtprint();
if (iflag)
start = now;
}
@@ -134,15 +144,11 @@ usage(void)
  * so you can format while you format
  */
 static void
-fmtfmt(const struct timespec *ts)
+fmtfmt(void)
 {
-   struct tm *tm;
-   char *f, us[7];
-
-   if ((tm = localtime(>tv_sec)) == NULL)
-   err(1, "localtime");
+   char *f;
+   struct usec *u;
 
-   snprintf(us, sizeof(us), "%06ld", ts->tv_nsec / 1000);
strlcpy(buf, format, bufsize);
f = buf;
 
@@ -161,12 +167,34 @@ fmtfmt(const struct timespec *ts)
f[0] = f[1];
f[1] = '.';
f += 2;
+   u = malloc(sizeof u);
+   if (u == NULL)
+   err(1, NULL);
+   u->pos = f;
+   SIMPLEQ_INSERT_TAIL(_queue, u, next);
l = strlen(f);
memmove(f + 6, f, l + 1);
-   memcpy(f, us, 6);
f += 6;
}
} while (*f != '\0');
+}
+
+static void
+fmtprint(const struct timespec *ts)
+{
+   char us[8];
+   struct tm *tm;
+   struct usec *u;
+
+   if ((tm = localtime(>tv_sec)) == NULL)
+   err(1, "localtime");
+
+   /* Update any microsecond substrings in the format buffer. */
+   if (!SIMPLEQ_EMPTY(_queue)) {
+   snprintf(us, sizeof(us), "%06ld", ts->tv_nsec / 1000);
+   SIMPLEQ_FOREACH(u, _queue, next)
+   memcpy(u->pos, us, 6);
+   }
 
*outbuf = '\0';
if (*buf != '\0') {



Re: rpki-client: disallow inherit in ROA EE IP Resources extension

2022-08-10 Thread Job Snijders
On Wed, Aug 10, 2022 at 06:16:30PM +0200, Theo Buehler wrote:
> On Wed, Aug 10, 2022 at 03:10:19PM +, Job Snijders wrote:
> > An errata exists for RFC 6482, which informs us: """The EE certificate
> > MUST NOT use "inherit" elements as described in [RFC3779].""" Read the
> > full report here: https://www.rfc-editor.org/errata/eid3166
> > 
> > Although it might seem a bit 'wasteful' to d2i the IP Resources
> > extension in multiple places, noodling through parameters when to check
> > for inheritance and when not to check didn't improve code readability.
> > I'm open to suggestions how to perform this check differently.
> 
> As I understand it, what really is missing isn't a check for inheritance
> per se, but rather a check whether the prefixes in the ROA are covered
> by the EE cert's IP address delegation extension (the bullet point in
> RFC 6482, section 4). If we had such a check, that would be the natural
> place for adding an inheritance check for the EE cert.
> 
> Below is my "overclaim" diff from a few weeks back that prepended the EE
> cert to the auth chain for ROAs and RSCs so that we check their
> resources against the EE cert instead of our currently incorrect checks
> that permitted overclaiming. The diff was ok job and claudio told me
> that it looked ok - I will need to think it through in detail once more,
> however.

Thank you.

> I believe that with something like this diff, your desired inheritance
> check should be added to valid_roa() above the for() loop.
> 
> Does that make sense?

Yes it does.

Kind regards,

Job



Re: rpki-client: disallow inherit in ROA EE IP Resources extension

2022-08-10 Thread Theo Buehler
On Wed, Aug 10, 2022 at 03:10:19PM +, Job Snijders wrote:
> Hi all,
> 
> An errata exists for RFC 6482, which informs us: """The EE certificate
> MUST NOT use "inherit" elements as described in [RFC3779].""" Read the
> full report here: https://www.rfc-editor.org/errata/eid3166
> 
> Although it might seem a bit 'wasteful' to d2i the IP Resources
> extension in multiple places, noodling through parameters when to check
> for inheritance and when not to check didn't improve code readability.
> I'm open to suggestions how to perform this check differently.

As I understand it, what really is missing isn't a check for inheritance
per se, but rather a check whether the prefixes in the ROA are covered
by the EE cert's IP address delegation extension (the bullet point in
RFC 6482, section 4). If we had such a check, that would be the natural
place for adding an inheritance check for the EE cert.

Below is my "overclaim" diff from a few weeks back that prepended the EE
cert to the auth chain for ROAs and RSCs so that we check their
resources against the EE cert instead of our currently incorrect checks
that permitted overclaiming. The diff was ok job and claudio told me
that it looked ok - I will need to think it through in detail once more,
however.

I believe that with something like this diff, your desired inheritance
check should be added to valid_roa() above the for() loop.

Does that make sense?

Index: cert.c
===
RCS file: /cvs/src/usr.sbin/rpki-client/cert.c,v
retrieving revision 1.84
diff -u -p -r1.84 cert.c
--- cert.c  31 May 2022 18:51:35 -  1.84
+++ cert.c  10 Aug 2022 15:40:25 -
@@ -467,23 +467,19 @@ sbgp_sia(struct parse *p, X509_EXTENSION
}
}
 
-   if (p->res->mft == NULL || p->res->repo == NULL) {
-   warnx("%s: RFC 6487 section 4.8.8: SIA: missing caRepository "
-   "or rpkiManifest", p->fn);
-   goto out;
-   }
-
-   if (strstr(p->res->mft, p->res->repo) != p->res->mft) {
-   warnx("%s: RFC 6487 section 4.8.8: SIA: "
-   "conflicting URIs for caRepository and rpkiManifest",
-   p->fn);
-   goto out;
-   }
+   if (p->res->mft != NULL && p->res->repo != NULL) {
+   if (strstr(p->res->mft, p->res->repo) != p->res->mft) {
+   warnx("%s: RFC 6487 section 4.8.8: SIA: conflicting "
+   "URIs for caRepository and rpkiManifest",
+   p->fn);
+   goto out;
+   }
 
-   if (rtype_from_file_extension(p->res->mft) != RTYPE_MFT) {
-   warnx("%s: RFC 6487 section 4.8.8: SIA: "
-   "not an MFT file", p->fn);
-   goto out;
+   if (rtype_from_file_extension(p->res->mft) != RTYPE_MFT) {
+   warnx("%s: RFC 6487 section 4.8.8: SIA: "
+   "not an MFT file", p->fn);
+   goto out;
+   }
}
 
rc = 1;
@@ -570,42 +566,20 @@ certificate_policies(struct parse *p, X5
return rc;
 }
 
-/*
- * Parse and partially validate an RPKI X509 certificate (either a trust
- * anchor or a certificate) as defined in RFC 6487.
- * Returns the parse results or NULL on failure.
- */
-struct cert *
-cert_parse_pre(const char *fn, const unsigned char *der, size_t len)
+static struct cert *
+cert_extract_x509(const char *fn, X509 *x)
 {
int  extsz;
-   int  sia_present = 0;
size_t   i;
-   X509*x = NULL;
X509_EXTENSION  *ext = NULL;
ASN1_OBJECT *obj;
struct parse p;
 
-   /* just fail for empty buffers, the warning was printed elsewhere */
-   if (der == NULL)
-   return NULL;
-
memset(, 0, sizeof(struct parse));
p.fn = fn;
if ((p.res = calloc(1, sizeof(struct cert))) == NULL)
err(1, NULL);
 
-   if ((x = d2i_X509(NULL, , len)) == NULL) {
-   cryptowarnx("%s: d2i_X509", p.fn);
-   goto out;
-   }
-
-   /* Cache X509v3 extensions, see X509_check_ca(3). */
-   if (X509_check_purpose(x, -1, -1) <= 0) {
-   cryptowarnx("%s: could not cache X509v3 extensions", p.fn);
-   goto out;
-   }
-
/* Look for X509v3 extensions. */
 
if ((extsz = X509_get_ext_count(x)) < 0)
@@ -627,7 +601,6 @@ cert_parse_pre(const char *fn, const uns
goto out;
break;
case NID_sinfo_access:
-   sia_present = 1;
if (!sbgp_sia(, ext))
goto out;
break;
@@ -647,12 +620,6 @@ cert_parse_pre(const char *fn, const uns
case NID_ext_key_usage:
break;
   

Re: store pf rules in a tree

2022-08-10 Thread Alexandr Nedvedicky
Hello,


On Wed, Aug 10, 2022 at 02:38:16PM +, Stefan Butz wrote:
> Hi everyone,
> 
> this mail includes a patch to store pf rules in a red-black tree.
> Currently they are stored in a linked list.
> My system configured with 16000 rules takes about 10 minutes
> to print them out using `pfctl -sr`.
> This patch decreases the time to 4 seconds.
> I was not able to measure a time increase for rule insertion.
> Inserting a lot of labels is still very slow.
> This has to be attacked separatly.

took me while to figure out why it can take sooo lng
to retrieve 16k rules from kernel. this is the answer

1480 case DIOCGETRULE: {
1496 if (pr->ticket != ruleset->rules.active.ticket) {
1497 error = EBUSY;
1498 PF_UNLOCK();
1499 NET_UNLOCK();
1500 goto fail;
1501 }
1502 rule = TAILQ_FIRST(ruleset->rules.active.ptr);
1503 while ((rule != NULL) && (rule->nr != pr->nr))
1504 rule = TAILQ_NEXT(rule, entries);
1505 if (rule == NULL) {

so yes, in order to retrieve the next rule in list we must
always start with the first (HEAD) rule and iterate to rule n+1.

so I understand a motivation why to solve it using  rb-tree.

I also (as todd) have some doubts about code in DIOCHANGERULE.
the current code is able to insert to desired location.
I don't understand how your new code does that:


1675 
1676 if (pcr->action == PF_CHANGE_ADD_HEAD)
1677 oldrule = RB_MIN(pf_ruletree,
1678 ruleset->rules.active.ptr);
1679 else if (pcr->action == PF_CHANGE_ADD_TAIL)
1680 oldrule = RB_MAX(pf_ruletree,
1681 ruleset->rules.active.ptr);
   
1682 else {
1683 oldrule = RB_MIN(pf_ruletree,
1684 ruleset->rules.active.ptr);
1685 while ((oldrule != NULL) && (oldrule->nr != 
pcr->nr))
1686 oldrule = RB_NEXT(pf_ruletree,
1687 ruleset->rules.active.ptr, oldrule);   
   
1688 if (oldrule == NULL) {
1689 if (newrule != NULL)
1690 pf_rm_rule(NULL, newrule); 
   
1691 error = EINVAL;
   
1692 PF_UNLOCK();
1693 NET_UNLOCK();
1694 goto fail; 
   
1695 }  
   
1696 } 
1697 
1698 if (pcr->action == PF_CHANGE_REMOVE) {
1699 pf_rm_rule(ruleset->rules.active.ptr, oldrule);
   
1700 ruleset->rules.active.rcount--;
1701 } else {
1702 RB_INSERT(pf_ruletree,
1703 ruleset->rules.active.ptr, newrule);
1704 ruleset->rules.active.rcount++;
1705 newrule->ruleset = ruleset;
1706 }
1707 

how does RB_INSERT() at line 1702 insert rule to
desired location. Consider the current rules look like:

[ 1, 2, 3, 4 ]

i'm going to insert rule x, which should follow existing rule nr 3

[ 1, 2, 3, x, 4]

current code just re-indexes/renumbers the rules after insertion

[ 1, 2, 3, 4, 5 ], where x gets assigned 4.

I don't see how does that happen with RB_INSERT().

I think what we really need to fix is the iteration to n-th retrieved
rule we currently have in DIOCGETRULE. I'm working on slightly large
change which updates pf(4) transactions. I'll try to cherry pick
some changes from my in-progress tree and factor out a smaller diff 
which will solve that DIOCGETRULE itch for you. please stay tuned.

thanks and
regards
sashan



Re: store pf rules in a tree

2022-08-10 Thread Todd C . Miller
On Wed, 10 Aug 2022 14:38:16 -, Stefan Butz wrote:

> this mail includes a patch to store pf rules in a red-black tree.
> Currently they are stored in a linked list.
> My system configured with 16000 rules takes about 10 minutes
> to print them out using `pfctl -sr`.
> This patch decreases the time to 4 seconds.
> I was not able to measure a time increase for rule insertion.
> Inserting a lot of labels is still very slow.
> This has to be attacked separatly.

You are using the rule number 'nr' as the key for items in the tree.
However, that field can change post-insertion.  Won't that invalidate
the tree order?

See pf_purge_rule() and the DIOCCHANGERULE case in pfioctl()
for examples of what I mean.

 - todd



rpki-client: disallow inherit in ROA EE IP Resources extension

2022-08-10 Thread Job Snijders
Hi all,

An errata exists for RFC 6482, which informs us: """The EE certificate
MUST NOT use "inherit" elements as described in [RFC3779].""" Read the
full report here: https://www.rfc-editor.org/errata/eid3166

Although it might seem a bit 'wasteful' to d2i the IP Resources
extension in multiple places, noodling through parameters when to check
for inheritance and when not to check didn't improve code readability.
I'm open to suggestions how to perform this check differently.

OK?

Kind regards,

Job

Index: roa.c
===
RCS file: /cvs/src/usr.sbin/rpki-client/roa.c,v
retrieving revision 1.49
diff -u -p -r1.49 roa.c
--- roa.c   10 Aug 2022 14:54:03 -  1.49
+++ roa.c   10 Aug 2022 15:02:05 -
@@ -206,6 +206,7 @@ roa_parse(X509 **x509, const char *fn, c
unsigned char   *cms;
int  rc = 0;
const ASN1_TIME *at;
+   STACK_OF(IPAddressFamily)   *addrblk = NULL;
 
memset(, 0, sizeof(struct parse));
p.fn = fn;
@@ -234,6 +235,16 @@ roa_parse(X509 **x509, const char *fn, c
goto out;
}
 
+   addrblk = X509_get_ext_d2i(*x509, NID_sbgp_ipAddrBlock, NULL, NULL);
+   if (addrblk == NULL) {
+   warnx("%s: X509_get_ext_d2i NID_sbgp_ipAddrBlock failed", fn);
+   goto out;
+   }
+   if (X509v3_addr_inherits(addrblk)) {
+   warnx("%s: inherit is disallowed (IETF Errata ID 3166)", fn);
+   goto out;
+   }
+
at = X509_get0_notAfter(*x509);
if (at == NULL) {
warnx("%s: X509_get0_notAfter failed", fn);
@@ -255,6 +266,7 @@ out:
X509_free(*x509);
*x509 = NULL;
}
+   sk_IPAddressFamily_pop_free(addrblk, IPAddressFamily_free);
free(cms);
return p.res;
 }



store pf rules in a tree

2022-08-10 Thread Stefan Butz
Hi everyone,

this mail includes a patch to store pf rules in a red-black tree.
Currently they are stored in a linked list.
My system configured with 16000 rules takes about 10 minutes
to print them out using `pfctl -sr`.
This patch decreases the time to 4 seconds.
I was not able to measure a time increase for rule insertion.
Inserting a lot of labels is still very slow.
This has to be attacked separatly.


diff --git sbin/pfctl/parse.y sbin/pfctl/parse.y
index 8a92a7e895c..c6f8629c690 100644
--- sbin/pfctl/parse.y
+++ sbin/pfctl/parse.y
@@ -5548,11 +5548,13 @@ mv_rules(struct pf_ruleset *src, struct pf_ruleset *dst)
 {
struct pf_rule *r;
 
-   TAILQ_FOREACH(r, src->rules.active.ptr, entries)
+   RB_FOREACH(r, pf_ruletree, src->rules.active.ptr) {
dst->anchor->match++;
-   TAILQ_CONCAT(dst->rules.active.ptr, src->rules.active.ptr, entries);
+   RB_INSERT(pf_ruletree, dst->rules.active.ptr, r);
+   }
src->anchor->match = 0;
-   TAILQ_CONCAT(dst->rules.inactive.ptr, src->rules.inactive.ptr, entries);
+   RB_FOREACH(r, pf_ruletree, src->rules.inactive.ptr)
+   RB_INSERT(pf_ruletree, dst->rules.inactive.ptr, r);
 }
 
 void
diff --git sbin/pfctl/pfctl.c sbin/pfctl/pfctl.c
index 9f14b955ca7..49aafbed4e5 100644
--- sbin/pfctl/pfctl.c
+++ sbin/pfctl/pfctl.c
@@ -1170,7 +1170,7 @@ pfctl_add_rule(struct pfctl *pf, struct pf_rule *r)
err(1, "calloc");
bcopy(r, rule, sizeof(*rule));
 
-   TAILQ_INSERT_TAIL(rs->rules.active.ptr, rule, entries);
+   RB_INSERT(pf_ruletree, rs->rules.active.ptr, rule);
 }
 
 int
@@ -1409,7 +1409,7 @@ pfctl_check_qassignments(struct pf_ruleset *rs)
}
}
 
-   TAILQ_FOREACH(r, rs->rules.active.ptr, entries) {
+   RB_FOREACH(r, pf_ruletree, rs->rules.active.ptr) {
if (r->anchor)
errs += pfctl_check_qassignments(>anchor->ruleset);
if (pfctl_leafqueue_check(r->qname) ||
@@ -1436,7 +1436,7 @@ pfctl_load_ruleset(struct pfctl *pf, char *path, struct 
pf_ruleset *rs,
snprintf([len], PATH_MAX - len, "%s", pf->anchor->path);
 
if (depth) {
-   if (TAILQ_FIRST(rs->rules.active.ptr) != NULL) {
+   if (!RB_EMPTY(rs->rules.active.ptr)) {
brace++;
if (pf->opts & PF_OPT_VERBOSE)
printf(" {\n");
@@ -1454,8 +1454,8 @@ pfctl_load_ruleset(struct pfctl *pf, char *path, struct 
pf_ruleset *rs,
if (pf->optimize)
pfctl_optimize_ruleset(pf, rs);
 
-   while ((r = TAILQ_FIRST(rs->rules.active.ptr)) != NULL) {
-   TAILQ_REMOVE(rs->rules.active.ptr, r, entries);
+   while ((r = RB_MIN(pf_ruletree, rs->rules.active.ptr)) != NULL) {
+   RB_REMOVE(pf_ruletree, rs->rules.active.ptr, r);
pfctl_expand_label_nr(r, rno);
rno++;
if ((error = pfctl_load_rule(pf, path, r, depth)))
diff --git sbin/pfctl/pfctl_optimize.c sbin/pfctl/pfctl_optimize.c
index dc93b882bef..d0ba5c2aca4 100644
--- sbin/pfctl/pfctl_optimize.c
+++ sbin/pfctl/pfctl_optimize.c
@@ -191,6 +191,7 @@ struct pf_rule_field {
 PF_RULE_FIELD(src_nodes,   DC),
 PF_RULE_FIELD(nr,  DC),
 PF_RULE_FIELD(entries, DC),
+PF_RULE_FIELD(usage,   DC),
 PF_RULE_FIELD(qid, DC),
 PF_RULE_FIELD(pqid,DC),
 PF_RULE_FIELD(anchor_relative, DC),
@@ -268,9 +269,9 @@ pfctl_optimize_ruleset(struct pfctl *pf, struct pf_ruleset 
*rs)
struct superblock *block;
struct pf_opt_rule *por;
struct pf_rule *r;
-   struct pf_rulequeue *old_rules;
+   struct pf_ruletree *old_rules;
 
-   if (TAILQ_EMPTY(rs->rules.active.ptr))
+   if (RB_EMPTY(rs->rules.active.ptr))
return (0);
 
DEBUG("optimizing ruleset \"%s\"", rs->anchor->path);
@@ -286,8 +287,8 @@ pfctl_optimize_ruleset(struct pfctl *pf, struct pf_ruleset 
*rs)
 * XXX expanding the pf_opt_rule format throughout pfctl might allow
 * us to avoid all this copying.
 */
-   while ((r = TAILQ_FIRST(rs->rules.inactive.ptr)) != NULL) {
-   TAILQ_REMOVE(rs->rules.inactive.ptr, r, entries);
+   while ((r = RB_MIN(pf_ruletree, rs->rules.inactive.ptr)) != NULL) {
+   RB_REMOVE(pf_ruletree, rs->rules.inactive.ptr, r);
if ((por = calloc(1, sizeof(*por))) == NULL)
err(1, "calloc");
memcpy(>por_rule, r, sizeof(*r));
@@ -319,7 +320,7 @@ pfctl_optimize_ruleset(struct pfctl *pf, struct pf_ruleset 
*rs)
if ((r = calloc(1, sizeof(*r))) == NULL)
err(1, "calloc");
memcpy(r, >por_rule, sizeof(*r));
-   TAILQ_INSERT_TAIL(rs->rules.active.ptr, r, entries);
+

Re: bgpd: inverted NULL check in krVPN6_change()

2022-08-10 Thread Claudio Jeker
On Wed, Aug 10, 2022 at 03:59:12PM +0200, Theo Buehler wrote:
> The below matches the VPN4 code and makes more sense given that we deref
> kr6 in the else block.
> 
> Index: kroute.c
> ===
> RCS file: /cvs/src/usr.sbin/bgpd/kroute.c,v
> retrieving revision 1.287
> diff -u -p -U5 -r1.287 kroute.c
> --- kroute.c  3 Aug 2022 08:16:05 -   1.287
> +++ kroute.c  10 Aug 2022 13:56:28 -
> @@ -619,11 +619,11 @@ krVPN6_change(struct ktable *kt, struct 
>   /* for blackhole and reject routes nexthop needs to be ::1 */
>   if (kf->flags & (F_BLACKHOLE|F_REJECT))
>   bcopy(, >nexthop.v6, sizeof(kf->nexthop.v6));
>  
>   if ((kr6 = kroute6_find(kt, >prefix, kf->prefixlen,
> - kf->priority)) != NULL) {
> + kf->priority)) == NULL) {
>   if (kroute_insert(kt, kf) == -1)
>   return (-1);
>   } else {
>   kr6->mplslabel = mplslabel;
>   kr6->ifindex = kf->ifindex;
> 

Nice find. This is indeed wrong. OK claudio@

-- 
:wq Claudio



Re: rpki-client: tighten ROA parsing by forbidding AS Resources extension on the EE

2022-08-10 Thread Theo Buehler
On Wed, Aug 10, 2022 at 01:58:14PM +, Job Snijders wrote:
> Hi,
> 
> The ROA specification (RFC 6482 § 4) is a bit underspecified, but in the
> wild the RFC 3779 AS Resources extension never ever appears on ROA EE
> certificates, as it serves no purpose in the validation process. I've
> seen it happen once, in the past, which was a CA mistake.
> 
> Related reading material in the 3779 space:
> 
> The BGPSec profile (RFC 8209 § 3.1.3.4) is better in this regard: it
> explicitly forbids NID_sbgp_ipAddrBlock from being present (rpki-client
> checks this), and the upcoming ASPA RFC will also be less ambigious,
> ASPA forbids NID_sbgp_ipAddrBlock too (my WIP ASPA code checks this).
> 
> OK?

I'm fine with adding such a check. See below

> 
> Kind regards,
> 
> Job
> 
> Index: roa.c
> ===
> RCS file: /cvs/src/usr.sbin/rpki-client/roa.c,v
> retrieving revision 1.47
> diff -u -p -r1.47 roa.c
> --- roa.c 10 Jun 2022 10:36:43 -  1.47
> +++ roa.c 10 Aug 2022 13:49:58 -
> @@ -229,6 +229,12 @@ roa_parse(X509 **x509, const char *fn, c
>   goto out;
>   }
>  
> + if (X509_get_ext_d2i(*x509, NID_sbgp_autonomousSysNum, NULL, NULL)
> + != NULL) {

The pointer returned by X509_get_ext_d2i() needs to be freed, so this
would leak. You can check presence of an extension without allocating as
follows:

if (X509_get_ext_by_NID(*x509, NID_sbgp_autonomousSysNum, -1) != -1) {


> + warnx("%s: superfluous AS Resources extension present", fn);
> + goto out;
> + }
> +
>   at = X509_get0_notAfter(*x509);
>   if (at == NULL) {
>   warnx("%s: X509_get0_notAfter failed", fn);
> 



bgpd: inverted NULL check in krVPN6_change()

2022-08-10 Thread Theo Buehler
The below matches the VPN4 code and makes more sense given that we deref
kr6 in the else block.

Index: kroute.c
===
RCS file: /cvs/src/usr.sbin/bgpd/kroute.c,v
retrieving revision 1.287
diff -u -p -U5 -r1.287 kroute.c
--- kroute.c3 Aug 2022 08:16:05 -   1.287
+++ kroute.c10 Aug 2022 13:56:28 -
@@ -619,11 +619,11 @@ krVPN6_change(struct ktable *kt, struct 
/* for blackhole and reject routes nexthop needs to be ::1 */
if (kf->flags & (F_BLACKHOLE|F_REJECT))
bcopy(, >nexthop.v6, sizeof(kf->nexthop.v6));
 
if ((kr6 = kroute6_find(kt, >prefix, kf->prefixlen,
-   kf->priority)) != NULL) {
+   kf->priority)) == NULL) {
if (kroute_insert(kt, kf) == -1)
return (-1);
} else {
kr6->mplslabel = mplslabel;
kr6->ifindex = kf->ifindex;



rpki-client: tighten ROA parsing by forbidding AS Resources extension on the EE

2022-08-10 Thread Job Snijders
Hi,

The ROA specification (RFC 6482 § 4) is a bit underspecified, but in the
wild the RFC 3779 AS Resources extension never ever appears on ROA EE
certificates, as it serves no purpose in the validation process. I've
seen it happen once, in the past, which was a CA mistake.

Related reading material in the 3779 space:

The BGPSec profile (RFC 8209 § 3.1.3.4) is better in this regard: it
explicitly forbids NID_sbgp_ipAddrBlock from being present (rpki-client
checks this), and the upcoming ASPA RFC will also be less ambigious,
ASPA forbids NID_sbgp_ipAddrBlock too (my WIP ASPA code checks this).

OK?

Kind regards,

Job

Index: roa.c
===
RCS file: /cvs/src/usr.sbin/rpki-client/roa.c,v
retrieving revision 1.47
diff -u -p -r1.47 roa.c
--- roa.c   10 Jun 2022 10:36:43 -  1.47
+++ roa.c   10 Aug 2022 13:49:58 -
@@ -229,6 +229,12 @@ roa_parse(X509 **x509, const char *fn, c
goto out;
}
 
+   if (X509_get_ext_d2i(*x509, NID_sbgp_autonomousSysNum, NULL, NULL)
+   != NULL) {
+   warnx("%s: superfluous AS Resources extension present", fn);
+   goto out;
+   }
+
at = X509_get0_notAfter(*x509);
if (at == NULL) {
warnx("%s: X509_get0_notAfter failed", fn);



Re: bgpd more nexthop cleanup

2022-08-10 Thread Theo Buehler
On Wed, Aug 10, 2022 at 02:54:58PM +0200, Claudio Jeker wrote:
> This is more of what I just did in other places. Use direct assignment
> instead of memcpy(), remove double bzero() calls, switch to memset()
> and order struct kroute_nexthop in a more sensible way.

ok

> There should be no behaviour change from all this.

Agreed.



Re: slowcgi, httpd and fastcgi abnormal termination

2022-08-10 Thread Claudio Jeker
On Sun, Aug 07, 2022 at 11:10:22AM +0200, Omar Polo wrote:
> I'm not sure httpd(8) handles correctly when the fastcgi application
> (e.g. slowcgi) closes the connection prematurely.
> 
> To verify it, I'm playing with three simple CGI scripts running under
> slowcgi with a very low timeout (-t2).  The scripts test "hangs"
> (which slowcgi turns into hard disconnections) in different places:
> 
>   /* slow.c -- hang before replying */
>   sleep(10);
>   puts("Content-Type: text/plain\n");
>   puts("hello, world");
>   return 0;
> 
>   /* slow2.c -- hang during headers */
>   puts("Content-Type: text/plain");
>   fflush(stdout);
>   sleep(10);
>   puts("");
>   puts("hello, world");
>   return 0;
> 
>   /* slow3.c -- hang during the body */
>   puts("Content-Type: text/plain\n");
>   printf("hello, (wait for it...)");
>   fflush(stdout);
>   sleep(10);
>   puts("world!");
>   return 0;
> 
> Those sleep calls are meant to hit slowcgi_timeout which abruptly
> closes the connection with httpd(8).
> 
> httpd handles the fastcgi termination via server_file_error which
> assumes the request' headers and body were already sent and so goes on
> to set up for the next request (if clt_persist.)  This leaves the HTTP
> client hanging and waiting for a response that won't arrive, until
> they give up when the fastcgi reply wasn't complete.
> 
> Diff below (first three hunks) addresses this issue by introducing a
> server_fcgi_error callback that handles the "no header" and
> "incomplete data" cases before calling server_file_error.  With it,
> the first two scripts become an httpd' "500 error" page and the third
> is correctly closed after the first line.
> 
> I'm also bundling another small change for slowcgi to make it send a
> FCGI_END_REQUEST record instead of abruptly closing the request.  I
> think it's better to let the fastcgi server know that the request is
> over, but the slowcgi/httpd combo works even without it.
> 
> 
> diff refs/heads/wip refs/heads/fastcgi
> commit - 3aac3f8529325cbd58806247542caabd23befb6e
> commit + cab6b39fe822d105a2dc852f2435693a6eff834f
> blob - 381fade2924c4b5cea77cd9cd6500e75d4d59257
> blob + b6541b7c68235ac1dfc5a9d0243db988e5932a7f
> --- usr.sbin/httpd/server_fcgi.c
> +++ usr.sbin/httpd/server_fcgi.c
> @@ -77,6 +77,7 @@ struct server_fcgi_param {
>  };
>  
>  int  server_fcgi_header(struct client *, unsigned int);
> +void server_fcgi_error(struct bufferevent *, short, void *);
>  void server_fcgi_read(struct bufferevent *, void *);
>  int  server_fcgi_writeheader(struct client *, struct kv *, void *);
>  int  server_fcgi_writechunk(struct client *);
> @@ -133,7 +134,7 @@ server_fcgi(struct httpd *env, struct client *clt)
>  
>   clt->clt_srvbev_throttled = 0;
>   clt->clt_srvbev = bufferevent_new(fd, server_fcgi_read,
> - NULL, server_file_error, clt);
> + NULL, server_fcgi_error, clt);
>   if (clt->clt_srvbev == NULL) {
>   errstr = "failed to allocate fcgi buffer event";
>   goto fail;
> @@ -482,6 +483,23 @@ fcgi_add_param(struct server_fcgi_param *p, const char
>  }
>  
>  void
> +server_fcgi_error(struct bufferevent *bev, short error, void *arg)
> +{
> + struct client   *clt = arg;
> +
> + if ((error & EVBUFFER_EOF) && !clt->clt_fcgi.headersdone) {
> + server_abort_http(clt, 500, "malformed or no headers");
> + return;
> + }
> +
> + /* send the end marker if not already */
> + if (clt->clt_fcgi.chunked && !clt->clt_fcgi.end++)
> + server_bufferevent_print(clt, "0\r\n\r\n");
> +
> + server_file_error(bev, error, arg);
> +}
> +
> +void
>  server_fcgi_read(struct bufferevent *bev, void *arg)
>  {
>   uint8_t  buf[FCGI_RECORD_SIZE];

I already gave an OK for this. It still is OK.

> blob - ddf83f965d0e6a99ada695694bea77b775bae2aa
> blob + 1d577ba63efca388ca3644d1a52d9b3d9f246014
> --- usr.sbin/slowcgi/slowcgi.c
> +++ usr.sbin/slowcgi/slowcgi.c
> @@ -515,7 +515,34 @@ slowcgi_accept(int fd, short events, void *arg)
>  void
>  slowcgi_timeout(int fd, short events, void *arg)
>  {
> - cleanup_request((struct request*) arg);
> + struct request  *c;
> +
> + c = arg;
> +
> + if (c->script_flags & (STDOUT_DONE | STDERR_DONE | SCRIPT_DONE))

Is this the correct check? If we hit the timeout we want to close all
pipes to the CGI process and hope it dies because of that. Now one of
STDOUT_DONE or STDERR_DONE may have happened but the script is still
running.

I think it would be better to use:
if (c->script_flags == (STDOUT_DONE | STDERR_DONE | SCRIPT_DONE))
return;
here.

Should slowcgi kill the command if SCRIPT_DONE is not set? 

> + return;
> +
> + if (!c->stdout_fd_closed) {
> + c->stdout_fd_closed = 1;
> + close(EVENT_FD(>script_ev));
> + event_del(>script_ev);
> + 

bgpd more nexthop cleanup

2022-08-10 Thread Claudio Jeker
This is more of what I just did in other places. Use direct assignment
instead of memcpy(), remove double bzero() calls, switch to memset()
and order struct kroute_nexthop in a more sensible way.

There should be no behaviour change from all this.
-- 
:wq Claudio

Index: bgpd.h
===
RCS file: /cvs/src/usr.sbin/bgpd/bgpd.h,v
retrieving revision 1.448
diff -u -p -r1.448 bgpd.h
--- bgpd.h  28 Jul 2022 13:11:48 -  1.448
+++ bgpd.h  10 Aug 2022 12:46:11 -
@@ -717,9 +717,9 @@ struct kroute_nexthop {
struct bgpd_addrnexthop;
struct bgpd_addrgateway;
struct bgpd_addrnet;
+   uint8_t netlen;
uint8_t valid;
uint8_t connected;
-   uint8_t netlen;
 };
 
 struct session_dependon {
Index: rde.c
===
RCS file: /cvs/src/usr.sbin/bgpd/rde.c,v
retrieving revision 1.561
diff -u -p -r1.561 rde.c
--- rde.c   10 Aug 2022 11:11:02 -  1.561
+++ rde.c   10 Aug 2022 12:46:11 -
@@ -2474,7 +2474,7 @@ rde_dump_rib_as(struct prefix *p, struct
 
nexthop = prefix_nexthop(p);
peer = prefix_peer(p);
-   bzero(, sizeof(rib));
+   memset(, 0, sizeof(rib));
rib.age = getmonotime() - p->lastchange;
rib.local_pref = asp->lpref;
rib.med = asp->med;
@@ -2484,16 +2484,12 @@ rde_dump_rib_as(struct prefix *p, struct
sizeof(rib.remote_addr));
rib.remote_id = peer->remote_bgpid;
if (nexthop != NULL) {
-   memcpy(_nexthop, >true_nexthop,
-   sizeof(rib.true_nexthop));
-   memcpy(_nexthop, >exit_nexthop,
-   sizeof(rib.exit_nexthop));
+   rib.exit_nexthop = nexthop->exit_nexthop;
+   rib.true_nexthop = nexthop->true_nexthop;
} else {
-   /* announced network may have a NULL nexthop */
-   bzero(_nexthop, sizeof(rib.true_nexthop));
-   bzero(_nexthop, sizeof(rib.exit_nexthop));
-   rib.true_nexthop.aid = p->pt->aid;
+   /* announced network can have a NULL nexthop */
rib.exit_nexthop.aid = p->pt->aid;
+   rib.true_nexthop.aid = p->pt->aid;
}
pt_getaddr(p->pt, );
rib.prefixlen = p->pt->prefixlen;
Index: rde_rib.c
===
RCS file: /cvs/src/usr.sbin/bgpd/rde_rib.c,v
retrieving revision 1.242
diff -u -p -r1.242 rde_rib.c
--- rde_rib.c   28 Jul 2022 13:11:51 -  1.242
+++ rde_rib.c   10 Aug 2022 12:46:11 -
@@ -1741,14 +1741,11 @@ nexthop_update(struct kroute_nexthop *ms
 
if (msg->connected) {
nh->flags |= NEXTHOP_CONNECTED;
-   memcpy(>true_nexthop, >exit_nexthop,
-   sizeof(nh->true_nexthop));
+   nh->true_nexthop = nh->exit_nexthop;
} else
-   memcpy(>true_nexthop, >gateway,
-   sizeof(nh->true_nexthop));
+   nh->true_nexthop = msg->gateway;
 
-   memcpy(>nexthop_net, >net,
-   sizeof(nh->nexthop_net));
+   nh->nexthop_net = msg->net;
nh->nexthop_netlen = msg->netlen;
 
nh->next_prefix = LIST_FIRST(>prefix_h);



Re: bgpd fix bgpctl show network

2022-08-10 Thread Theo Buehler
On Wed, Aug 10, 2022 at 11:59:30AM +0200, Claudio Jeker wrote:
> When introducing prefix_nhvalid(p) the code in network_dump_upcall()
> was not correctly adjusted:
> 
> Before:
>   if (prefix_nexthop(p) == NULL ||
>   prefix_nexthop(p)->state != NEXTHOP_REACH)
>   kf.nexthop.aid = kf.prefix.aid;
>   else
>   kf.nexthop = prefix_nexthop(p)->true_nexthop;
> 
> Now:
>   if (prefix_nhvalid(p))
>   kf.nexthop.aid = kf.prefix.aid;
>   else
>   kf.nexthop = prefix_nexthop(p)->true_nexthop;
> 
> What it should be:
> 
>   if (prefix_nhvalid(p) &&  prefix_nexthop(p) != NULL)
>   kf.nexthop = prefix_nexthop(p)->exit_nexthop;
>   else
>   kf.nexthop.aid = kf.prefix.aid;
> 
> If the nexthop is valid we want to show it but in most cases the nexthop
> of announced networks is NULL so make sure we don't dereference NULL.
> Also I think it makes more sense to show the exit_nexthop (which is what
> is shown in show rib output by default). It is also what will be sent to
> the peers.
> 
> While there adjust the code to be a bit more logical and modern.

This all makes sense and is much nicer this way.

ok

> -- 
> :wq Claudio
> 
> Index: rde.c
> ===
> RCS file: /cvs/src/usr.sbin/bgpd/rde.c,v
> retrieving revision 1.560
> diff -u -p -r1.560 rde.c
> --- rde.c 28 Jul 2022 13:11:50 -  1.560
> +++ rde.c 10 Aug 2022 09:56:51 -
> @@ -4247,14 +4247,13 @@ network_dump_upcall(struct rib_entry *re
>   continue;
>   pt_getaddr(p->pt, );
>  
> - bzero(, sizeof(kf));
> - memcpy(, , sizeof(kf.prefix));
> - if (prefix_nhvalid(p))
> - kf.nexthop.aid = kf.prefix.aid;
> - else
> - memcpy(, _nexthop(p)->true_nexthop,
> - sizeof(kf.nexthop));
> + memset(, 0, sizeof(kf));
> + kf.prefix = addr;
>   kf.prefixlen = p->pt->prefixlen;
> + if (prefix_nhvalid(p) && prefix_nexthop(p) != NULL)
> + kf.nexthop = prefix_nexthop(p)->exit_nexthop;
> + else
> + kf.nexthop.aid = kf.prefix.aid;
>   if ((asp->flags & F_ANN_DYNAMIC) == 0)
>   kf.flags = F_STATIC;
>   if (imsg_compose(ibuf_se_ctl, IMSG_CTL_SHOW_NETWORK, 0,
> 



Re: fix bgpctl show network header

2022-08-10 Thread Theo Buehler
On Wed, Aug 10, 2022 at 12:17:29PM +0200, Claudio Jeker wrote:
> bgpctl show network uses the same data handler as bgpctl show fib.
> I increased the space between destination and gateway for IPv6 for the
> latter but forgot to adjust the former.
> 
> Before:
> flags: S = Static
> flags prio destination  gateway
> S0 10.2.3.0/24  0.0.0.0
> 
> After:
> flags: S = Static
> flags prio destination  gateway 
> S0 10.2.3.0/24  0.0.0.0

ok

> 
> -- 
> :wq Claudio
> 
> Index: output.c
> ===
> RCS file: /cvs/src/usr.sbin/bgpctl/output.c,v
> retrieving revision 1.25
> diff -u -p -r1.25 output.c
> --- output.c  28 Jul 2022 10:40:25 -  1.25
> +++ output.c  10 Aug 2022 10:11:35 -
> @@ -46,8 +46,7 @@ show_head(struct parse_result *res)
>   break;
>   case SHOW_FIB:
>   printf("flags: B = BGP, C = Connected, S = Static\n");
> - printf("   "
> - "N = BGP Nexthop reachable via this route\n");
> + printf("   N = BGP Nexthop reachable via this route\n");
>   printf("   r = reject route, b = blackhole route\n\n");
>   printf("%-5s %-4s %-32s %-32s\n", "flags", "prio",
>   "destination", "gateway");
> @@ -83,7 +82,8 @@ show_head(struct parse_result *res)
>   break;
>   case NETWORK_SHOW:
>   printf("flags: S = Static\n");
> - printf("flags prio destination  gateway\n");
> + printf("%-5s %-4s %-32s %-32s\n", "flags", "prio",
> + "destination", "gateway");
>   break;
>   default:
>   break;
> 



fix bgpctl show network header

2022-08-10 Thread Claudio Jeker
bgpctl show network uses the same data handler as bgpctl show fib.
I increased the space between destination and gateway for IPv6 for the
latter but forgot to adjust the former.

Before:
flags: S = Static
flags prio destination  gateway
S0 10.2.3.0/24  0.0.0.0

After:
flags: S = Static
flags prio destination  gateway 
S0 10.2.3.0/24  0.0.0.0

-- 
:wq Claudio

Index: output.c
===
RCS file: /cvs/src/usr.sbin/bgpctl/output.c,v
retrieving revision 1.25
diff -u -p -r1.25 output.c
--- output.c28 Jul 2022 10:40:25 -  1.25
+++ output.c10 Aug 2022 10:11:35 -
@@ -46,8 +46,7 @@ show_head(struct parse_result *res)
break;
case SHOW_FIB:
printf("flags: B = BGP, C = Connected, S = Static\n");
-   printf("   "
-   "N = BGP Nexthop reachable via this route\n");
+   printf("   N = BGP Nexthop reachable via this route\n");
printf("   r = reject route, b = blackhole route\n\n");
printf("%-5s %-4s %-32s %-32s\n", "flags", "prio",
"destination", "gateway");
@@ -83,7 +82,8 @@ show_head(struct parse_result *res)
break;
case NETWORK_SHOW:
printf("flags: S = Static\n");
-   printf("flags prio destination  gateway\n");
+   printf("%-5s %-4s %-32s %-32s\n", "flags", "prio",
+   "destination", "gateway");
break;
default:
break;



Re: rpki-client: decrease how long to wait for the remote peer to send IO

2022-08-10 Thread Claudio Jeker
On Wed, Aug 10, 2022 at 02:17:53AM +, Job Snijders wrote:
> Dear all,
> 
> I like to run rpki-client very often, and not be bogged down with
> non-responsive respositories. If a repository is uncommunicative,
> rpki-client as-is will try other transports, or come back later (because
> of a next crontab invocation).
> 
> In rpki-client, RSYNC & HTTPS timeouts now are unified:
> 
> - if the initial connection can't be established within MAX_CONN_TIMEOUT
>   seconds, try one of: Another Address Family (such as IPv6), or Another
>   Transport Protocol (like RSYNC).
> 
> - If the remote peer at the other side of the HTTPS or RSYNC connection
>   doens't send us any data for the duration of MAX_IO_TIMEOUT, try one
>   of Another Address Family (such as IPv6), or Another Transport
>   Protocol (like RSYNC).
> 
> I propose that if the remote side fails to send any new data for the
> duration of 30 seconds, we should give up, try something else (which
> includes locally cached data from the previous invocation).
> 
> I've tested with MAX_IO_TIMEOUT set to 15 seconds; and would feel
> comfortable committing that; but some in our community expressed
> hesitance, hence the below proposal for 30.
> 
> OK?

I'm OK with 30sec I would not go lower because on shitty networks with
lots of retransmits the time for data to arrive may be a fair bit longer
than what you see. Also once the connection is open it should be ok to try
a bit harder to finish the transaction.

> Index: extern.h
> ===
> RCS file: /cvs/src/usr.sbin/rpki-client/extern.h,v
> retrieving revision 1.146
> diff -u -p -r1.146 extern.h
> --- extern.h  9 Aug 2022 09:02:26 -   1.146
> +++ extern.h  10 Aug 2022 02:01:45 -
> @@ -731,7 +731,7 @@ int   mkpathat(int, const char *);
>  #define MAX_CONN_TIMEOUT 15
>  
>  /* How long to wait for IO from a remote server. */
> -#define MAX_IO_TIMEOUT   180
> +#define MAX_IO_TIMEOUT   30
>  
>  /* Maximum allowd repositories per tal */
>  #define MAX_REPO_PER_TAL 1000
> 

OK claudio@

-- 
:wq Claudio



bgpd fix bgpctl show network

2022-08-10 Thread Claudio Jeker
When introducing prefix_nhvalid(p) the code in network_dump_upcall()
was not correctly adjusted:

Before:
if (prefix_nexthop(p) == NULL ||
prefix_nexthop(p)->state != NEXTHOP_REACH)
kf.nexthop.aid = kf.prefix.aid;
else
kf.nexthop = prefix_nexthop(p)->true_nexthop;

Now:
if (prefix_nhvalid(p))
kf.nexthop.aid = kf.prefix.aid;
else
kf.nexthop = prefix_nexthop(p)->true_nexthop;

What it should be:

if (prefix_nhvalid(p) &&  prefix_nexthop(p) != NULL)
kf.nexthop = prefix_nexthop(p)->exit_nexthop;
else
kf.nexthop.aid = kf.prefix.aid;

If the nexthop is valid we want to show it but in most cases the nexthop
of announced networks is NULL so make sure we don't dereference NULL.
Also I think it makes more sense to show the exit_nexthop (which is what
is shown in show rib output by default). It is also what will be sent to
the peers.

While there adjust the code to be a bit more logical and modern.
-- 
:wq Claudio

Index: rde.c
===
RCS file: /cvs/src/usr.sbin/bgpd/rde.c,v
retrieving revision 1.560
diff -u -p -r1.560 rde.c
--- rde.c   28 Jul 2022 13:11:50 -  1.560
+++ rde.c   10 Aug 2022 09:56:51 -
@@ -4247,14 +4247,13 @@ network_dump_upcall(struct rib_entry *re
continue;
pt_getaddr(p->pt, );
 
-   bzero(, sizeof(kf));
-   memcpy(, , sizeof(kf.prefix));
-   if (prefix_nhvalid(p))
-   kf.nexthop.aid = kf.prefix.aid;
-   else
-   memcpy(, _nexthop(p)->true_nexthop,
-   sizeof(kf.nexthop));
+   memset(, 0, sizeof(kf));
+   kf.prefix = addr;
kf.prefixlen = p->pt->prefixlen;
+   if (prefix_nhvalid(p) && prefix_nexthop(p) != NULL)
+   kf.nexthop = prefix_nexthop(p)->exit_nexthop;
+   else
+   kf.nexthop.aid = kf.prefix.aid;
if ((asp->flags & F_ANN_DYNAMIC) == 0)
kf.flags = F_STATIC;
if (imsg_compose(ibuf_se_ctl, IMSG_CTL_SHOW_NETWORK, 0,



Re: Use SMR instead of SRP list in rtsock.c

2022-08-10 Thread Claudio Jeker
On Fri, Jul 01, 2022 at 04:03:21PM +, Visa Hankala wrote:
> On Fri, Jul 01, 2022 at 09:59:11AM +0200, Claudio Jeker wrote:
> > On Thu, Jun 30, 2022 at 03:46:35PM +, Visa Hankala wrote:
> > > On Thu, Jun 30, 2022 at 11:51:52AM +0200, Claudio Jeker wrote:
> > > > After discussing this with mpi@ and jmatthew@ we came to the conclusion
> > > > that we need to smr_barrier() before refcnt_finalize() to ensure that no
> > > > other CPU is between the SMR_TAILQ_FOREACH, refcnt_take() and
> > > > smr_read_leave().
> > > 
> > > [...]
> > > 
> > > > @@ -509,7 +487,8 @@ route_input(struct mbuf *m0, struct sock
> > > > return;
> > > > }
> > > >  
> > > > -   SRPL_FOREACH(rop, , _list, rop_list) {
> > > > +   smr_read_enter();
> > > > +   SMR_TAILQ_FOREACH(rop, _list, rop_list) {
> > > > /*
> > > >  * If route socket is bound to an address family only 
> > > > send
> > > >  * messages that match the address family. Address 
> > > > family
> > > > @@ -519,7 +498,8 @@ route_input(struct mbuf *m0, struct sock
> > > > rop->rop_proto != sa_family)
> > > > continue;
> > > >  
> > > > -
> > > > +   refcnt_take(>rop_refcnt);
> > > > +   smr_read_leave();
> > > > so = rop->rop_socket;
> > > > solock(so);
> > > >  
> > > > @@ -579,8 +559,10 @@ route_input(struct mbuf *m0, struct sock
> > > > rtm_sendup(so, m);
> > > >  next:
> > > > sounlock(so);
> > > > +   smr_read_enter();
> > > > +   refcnt_rele_wake(>rop_refcnt);
> > > 
> > > This does not look correct.
> > > 
> > > smr_barrier() can proceed after smr_read_leave(), so refcnt_rele_wake()
> > > might drop the final reference and this thread can no longer access
> > > rop safely (SMR_TAILQ_NEXT() inside SMR_TAILQ_FOREACH()).
> > > 
> > > Also, SMR_TAILQ_NEXT() of rop becomes potentially dangling after
> > > smr_read_leave(). After this thread leaves the read-side critical
> > > section, another thread might free rop's successor.
> > 
> > So we need to either smr_barrier() before and after the refcnt_finalize()
> > to make sure that the rop pointer remains stable in both cases or we alter
> > the SMR_TAILQ_FOREACH() loop so that SMR_TAILQ_NEXT can be grabbed before
> > refcnt_rele_wake().
> > 
> > While the double smr_barrier() is trivial it is not ideal and I think it
> > is better to adjust the loop since SMR loops with sleep points is a
> > somewhat common issue and so we should have a good clear way on how to
> > solve it.
> 
> Adjusting SMR_TAILQ_FOREACH() will not help.
> 
> In general, a reader cannot resume a lockless iteration after it has
> left the read-side critical section and crossed a sleep point. The
> guarantee of consistent(-looking) forward linkage is gone. The reader
> no longer knows if the value of SMR_TAILQ_NEXT() is valid. If the
> reader wants to continue with the list, it has to re-enter the read-side
> critical section and restart the iteration.

This is not a real SMR_TAILQ_FOREACH() use case so trying to use
SMR_TAILQ_FOREACH() here is not right. The code wants to walk the list of
route pcbs linked via rop_list. The code just needs to walk all active
connections and does not care about races with sockets that are concurrently
closed or opened. In the first case SS_CANTRCVMORE will be set and the
socket is skipped and in the second case the socket is simply ignored
because new sockets are inserted at the head of the list.
 
It is not a lockless iteration over the full list. It is not required to
be either. The only thing that matters is that the forward linkage is
consitent (not pointing to invalid objects).

There is no need to restart the iteration because element on the list can
not be reinserted. They can only be removed and a removed element does not
get any message anyway (either by not visiting the object or by skipping
it in the loop).

The refcnt ensures that the currently used pcb is not freed before the
next element is picked. As long as the refcnt is hold the object can't be
removed.
 
> I guess I should finish the sleepable variant of SMR that I was
> tinkering with long ago...

I switched from SMR_TAILQ to SMR_LIST since there is no need to access the
tail. Also the code uses just SMR_LIST_FIRST() and SMR_LIST_NEXT() and not
SMR_LIST_FOREACH().

-- 
:wq Claudio

Index: rtsock.c
===
RCS file: /cvs/src/sys/net/rtsock.c,v
retrieving revision 1.334
diff -u -p -r1.334 rtsock.c
--- rtsock.c28 Jun 2022 10:01:13 -  1.334
+++ rtsock.c10 Aug 2022 08:43:31 -
@@ -71,7 +71,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 
 #include 
 #include 
@@ -107,8 +107,6 @@ struct walkarg {
 };
 
 void   route_prinit(void);
-void   rcb_ref(void *, void *);
-void   rcb_unref(void *, void *);
 introute_output(struct mbuf