Re: Add CT methods to standard_exts, fix timestamp printing
ok beck@ > On Nov 23, 2021, at 21:14, Theo Buehler wrote: > > Two small diffs now that beck has linked the certificate transparency > code to the build. > > The diff for ext_dat.h links the CT methods to the standard extensions. > This replaces the gibberish from the CT extensions which are now present > in most certs with something readable. Try > > $ openssl s_client -connect libressl.org:443 | openssl x509 -noout -text > > The diff for ct_prn makes sure that the timestamp is actually printed. > Our ASN1_GENERALIZEDTIME_set_string() does not accept fractional > seconds, so don't feed them into it for printing. eopenssl11 doesn't > print the fractional sections either. > > Index: x509/ext_dat.h > === > RCS file: /cvs/src/lib/libcrypto/x509/ext_dat.h,v > retrieving revision 1.3 > diff -u -p -r1.3 ext_dat.h > --- x509/ext_dat.h2 Sep 2021 21:27:26 -1.3 > +++ x509/ext_dat.h16 Nov 2021 16:56:19 - > @@ -73,6 +73,7 @@ extern X509V3_EXT_METHOD v3_crl_hold, v3 > extern X509V3_EXT_METHOD v3_policy_mappings, v3_policy_constraints; > extern X509V3_EXT_METHOD v3_name_constraints, v3_inhibit_anyp, v3_idp; > extern const X509V3_EXT_METHOD v3_addr, v3_asid; > +extern const X509V3_EXT_METHOD v3_ct_scts[3]; > > /* This table will be searched using OBJ_bsearch so it *must* kept in > * order of the ext_nid values. > @@ -129,6 +130,11 @@ static const X509V3_EXT_METHOD *standard >&v3_idp, >&v3_alt[2], >&v3_freshest_crl, > +#ifndef OPENSSL_NO_CT > +&v3_ct_scts[0], > +&v3_ct_scts[1], > +&v3_ct_scts[2], > +#endif > }; > > /* Number of standard extensions */ > Index: ct/ct_prn.c > === > RCS file: /cvs/src/lib/libcrypto/ct/ct_prn.c,v > retrieving revision 1.3 > diff -u -p -r1.3 ct_prn.c > --- ct/ct_prn.c20 Nov 2021 01:10:49 -1.3 > +++ ct/ct_prn.c21 Nov 2021 15:32:56 - > @@ -71,8 +71,7 @@ timestamp_print(uint64_t timestamp, BIO > * Note GeneralizedTime from ASN1_GENERALIZETIME_adj is always 15 > * characters long with a final Z. Update it with fractional seconds. > */ > -snprintf(genstr, sizeof(genstr), "%.14s.%03dZ", > -ASN1_STRING_get0_data(gen), (unsigned int)(timestamp % 1000)); > +snprintf(genstr, sizeof(genstr), "%.14sZ", ASN1_STRING_get0_data(gen)); >if (ASN1_GENERALIZEDTIME_set_string(gen, genstr)) >ASN1_GENERALIZEDTIME_print(out, gen); >ASN1_GENERALIZEDTIME_free(gen); >
Re: dhcpleased - set ciaddr per RFC
On Fri, Nov 19, 2021 at 1:01 PM Joel Knight wrote: > > One thing that got missed in the refactor was that the requested-ip > option should not be set in a RENEWING or BINDING state (or in other > words, when ciaddr is set). This chunk on top of your diff also works > as expected (successful unicast renewal at T1). Hi Florian, Does the chunk below make sense? > Index: frontend.c > === > RCS file: /data/cvs-mirror/OpenBSD/src/sbin/dhcpleased/frontend.c,v > retrieving revision 1.23 > diff -p -u -r1.23 frontend.c > --- frontend.c 20 Oct 2021 07:04:49 - 1.23 > +++ frontend.c 19 Nov 2021 16:25:18 - > @@ -963,11 +966,13 @@ build_packet(uint8_t message_type, char > p += sizeof(dhcp_req_list); > > if (message_type == DHCPREQUEST) { > - memcpy(dhcp_requested_address + 2, requested_ip, > - sizeof(*requested_ip)); > - memcpy(p, dhcp_requested_address, > - sizeof(dhcp_requested_address)); > - p += sizeof(dhcp_requested_address); > + if (ciaddr->s_addr == 0) { > + memcpy(dhcp_requested_address + 2, requested_ip, > + sizeof(*requested_ip)); > + memcpy(p, dhcp_requested_address, > + sizeof(dhcp_requested_address)); > + p += sizeof(dhcp_requested_address); > + } > > if (server_identifier->s_addr != INADDR_ANY) { > memcpy(dhcp_server_identifier + 2, server_identifier,
Add CT methods to standard_exts, fix timestamp printing
Two small diffs now that beck has linked the certificate transparency code to the build. The diff for ext_dat.h links the CT methods to the standard extensions. This replaces the gibberish from the CT extensions which are now present in most certs with something readable. Try $ openssl s_client -connect libressl.org:443 | openssl x509 -noout -text The diff for ct_prn makes sure that the timestamp is actually printed. Our ASN1_GENERALIZEDTIME_set_string() does not accept fractional seconds, so don't feed them into it for printing. eopenssl11 doesn't print the fractional sections either. Index: x509/ext_dat.h === RCS file: /cvs/src/lib/libcrypto/x509/ext_dat.h,v retrieving revision 1.3 diff -u -p -r1.3 ext_dat.h --- x509/ext_dat.h 2 Sep 2021 21:27:26 - 1.3 +++ x509/ext_dat.h 16 Nov 2021 16:56:19 - @@ -73,6 +73,7 @@ extern X509V3_EXT_METHOD v3_crl_hold, v3 extern X509V3_EXT_METHOD v3_policy_mappings, v3_policy_constraints; extern X509V3_EXT_METHOD v3_name_constraints, v3_inhibit_anyp, v3_idp; extern const X509V3_EXT_METHOD v3_addr, v3_asid; +extern const X509V3_EXT_METHOD v3_ct_scts[3]; /* This table will be searched using OBJ_bsearch so it *must* kept in * order of the ext_nid values. @@ -129,6 +130,11 @@ static const X509V3_EXT_METHOD *standard &v3_idp, &v3_alt[2], &v3_freshest_crl, +#ifndef OPENSSL_NO_CT + &v3_ct_scts[0], + &v3_ct_scts[1], + &v3_ct_scts[2], +#endif }; /* Number of standard extensions */ Index: ct/ct_prn.c === RCS file: /cvs/src/lib/libcrypto/ct/ct_prn.c,v retrieving revision 1.3 diff -u -p -r1.3 ct_prn.c --- ct/ct_prn.c 20 Nov 2021 01:10:49 - 1.3 +++ ct/ct_prn.c 21 Nov 2021 15:32:56 - @@ -71,8 +71,7 @@ timestamp_print(uint64_t timestamp, BIO * Note GeneralizedTime from ASN1_GENERALIZETIME_adj is always 15 * characters long with a final Z. Update it with fractional seconds. */ - snprintf(genstr, sizeof(genstr), "%.14s.%03dZ", - ASN1_STRING_get0_data(gen), (unsigned int)(timestamp % 1000)); + snprintf(genstr, sizeof(genstr), "%.14sZ", ASN1_STRING_get0_data(gen)); if (ASN1_GENERALIZEDTIME_set_string(gen, genstr)) ASN1_GENERALIZEDTIME_print(out, gen); ASN1_GENERALIZEDTIME_free(gen);
Re: vport: set UP on ip assign
Klemens Nanni wrote: > Then, finally, interfaces only go UP if users do `ifconfig ... up' > or hostname.* contain the word "up". Otherwise they stay DOWN. > > This would be a dead simple thing to reason. Yeah it is so reasonable in fact why don't we add a chunk to the top of netstart to force all interfaces people configured by hand down!?!?!?!?! Or hey force them all The chain of proposals written in these emails are not going to happen, because netstart and hostname.* are a CONVENIENCE and may not be authoritative wrt the actual configuration. At boottime they are close to authoritative, but netstart can be run later. I don't know why you have gone on this bizzare tangent that "everything must change". No, things don't need to change.
Re: vport: set UP on ip assign
And here is the root of the argument -- where it is all going towards. > If we decide to handle this in netstart alone, shouldn't all interfaces > behave like vport(4) and not mess with their state unless explicitly > requested to do so? the implication here, is let's go change all the drivers and network stack to not "auto-up", because netstart will handle it. Absolutely no, because noone has tested this in all configurations and all the new implications that are being ignored in this proposal because it is only trying to fix another small nit. I am not thrilled about where this is going. Yes, there are obscure corner cases configuring network device "graphs". And the answer is to keep re-arranging netstart? I'm not sure about that. And now the conversation heads towards "drivers should not come up automatically", and "drivers should be pre-created". I think this is just deck chair re-arrangement. Lots of people do manual configuration, where they DON'T USE sh netstart, and their fingers have memories ... but this proposal of "netstart does it", and "drivers don't do it", is basically telling them to register at a re-education camp. Another way of looking at this, is that the situation isn't that bad, because it makes the simple device/network usages very simple to use. OpenBSD isn't a brand-name enterprise switch or router. I'm familiar with the usage pattern on such devices. That does not mean I (or others) automatically require the "kind of implied atomicity" of only bringing up interfaces "late". Even on major routers/switches, this is such a small little thing -- the management of those devices requires that you to have brought the devices down before you make changes, otherwise there is no manual "up" step, because you as an admin left it "up" while you were making changes. So the transaction model you want to enforce here with netstart is only concerned with the "up" side, there is no implied "down", there cannot be an implied "down" side... unless you mean people should reboot to test? A bit of hyperbole, but why solve the "do not auto up" if during re-configuriaton people are very often already going to be up??? So, I do not understand the end-game of this proposal. I mean, beyond fresh boot. But now to tie this into "devices should not come up on their own", why does that even MATTER during the few moments of bring-up. Chasing ghosts? And yes I know the routing daemons are processing excessive messages, but isn't it their job to normally process potentially many more messages than this meager amount, and isn't it good that such code is actually being tested to behave correctly? In summary, there are undeal setups. Does it mean everything has to change those to satisfy those few odd cases? I don't see justification.
Re: vport: set UP on ip assign
On Wed, Nov 24, 2021 at 02:30:08AM +0100, Klemens Nanni wrote: > On Tue, Nov 16, 2021 at 09:22:26AM +1000, David Gwynne wrote: > > On Mon, Nov 15, 2021 at 02:31:42PM +, Klemens Nanni wrote: > > > On Mon, Nov 15, 2021 at 01:37:49PM +, Stuart Henderson wrote: > > > > On 2021/11/15 12:27, Klemens Nanni wrote: > > > > > On Sun, Nov 14, 2021 at 07:04:42PM -0700, Theo de Raadt wrote: > > > > > > I think physical interfaces should come up when something is > > > > > > configured > > > > > > on them, but virtual interfaces shouldn't -- mostly because the > > > > > > order of > > > > > > configuration is often muddled. > > > > > > > > > > So "inet6 2001:db8::1" in hostname.em0 will do the trick but > > > > > hostname.vport0 would need another "up" for the same behaviour: > > > > > that's > > > > > rather confusing me as a user. > > > > > > > > hostname.* files are orthogonal to this; netstart can process all the > > > > lines, > > > > then if it has seen a line doing address configuration and has not seen > > > > an > > > > explicit "down", it can bring the interface up automatically at the end. > > > > (if this changed, it would be a nightmare for users to do anything > > > > else). > > > > > > Yes, netstart can and should deal with this correctly, just like you > > > describe. > > > > > > > Users would need to make sure they have a netstart which does that if > > > > updating a kernel, but that's just a case of matching kernel+userland > > > > and is > > > > nothing new for OpenBSD. > > > > > > > > The different behaviour would be apparent with separate runs of > > > > ifconfig. > > > > some scripts may need adapting and users might need to run "ifconfig XX > > > > up" > > > > themselves but I don't think that would be a problem. > > > > > > Agreed. > > > > > > Having the implicit-up logic entirely contained in netstart would make > > > lifer much easier, both for network stack hackers and users, imho. > > > > this was my attempt at just that. > > Given netstart(8) always pulls interfaces UP or DOWN as the last step, > surely this wouldn't be all, right? Put differently: Such netstart change really just tries to abstract a consistent behaviour across multiple interfaces doing it differently. > Interfaces/drivers still pull themselves up, thus create route message > churn, transition from initial DOWN to implicit UP due to address > configuration to possibly administrative DOWN again due to "down" in > hostname.*, etc. So once/iff this change lands, I think all interfaces should do (or not do) what netstart compensates for. Once all interfaces avoid implicit state changes and only transition upon administrative command (and possibly `autoconf'), the netstart workaround --that's what it really is-- ought to be removed. > If we decide to handle this in netstart alone, shouldn't all interfaces > behave like vport(4) and not mess with their state unless explicitly > requested to do so? Then, finally, interfaces only go UP if users do `ifconfig ... up' or hostname.* contain the word "up". Otherwise they stay DOWN. This would be a dead simple thing to reason. The netstart workaround itself already changes behaviour and tells users to add a very specific line to their configuration -- a format which really shouldn't have such specific requirements but instead should work like regular `ifconfig' lines on the shell, btw. So If that already needs attention, the final solution of no interface doing implicit changes and netstart not doing any implicit stuff would also eventually result in the simplest and most intuitive form of configuration: "up" anywhere pulls UP, "down" anywhere pulls DOWN, neither anywhere leaves interfaces at their creation default of DOWN. > Not sure if the (now finally documented) implicit `up' in `autoconf' > should stay after the netstart change, but that's another topic > (pulling interfaces UP two times won't hurt, I guess). > > > the installer has its own netstart though, right? > > Yes, diff for that at the end after tweaking yours and adapting it. > > > Index: etc/netstart > > === > > RCS file: /cvs/src/etc/netstart,v > > retrieving revision 1.216 > > diff -u -p -r1.216 netstart > > --- etc/netstart2 Sep 2021 19:38:20 - 1.216 > > +++ etc/netstart15 Nov 2021 23:20:00 - > > @@ -71,6 +71,9 @@ parse_hn_line() { > > dhcp) _cmds[${#_cmds[*]}]="ifconfig $_if inet autoconf" > > V4_AUTOCONF=true > > ;; > > + down) _c[0]= > > This reset seems unneeded, `_c' isn't used afterwards and I don't > undertand why you do it. > > > + _ifup=down > > So `_ifup' comes from ifstart() and not parse_hn_line(). > > We use the "_" prefix to denote local variables... > > > + ;; > > '!'*) _cmd=$(print -- "${_c[@]}" | sed 's/\$if/'$_if'/g') > > _cmds[${#_cmds[*]}]="${_cmd#!}" > > ;; > > @@ -118,6 +121,7 @@ vifs
Re: vport: set UP on ip assign
On Tue, Nov 16, 2021 at 09:22:26AM +1000, David Gwynne wrote: > On Mon, Nov 15, 2021 at 02:31:42PM +, Klemens Nanni wrote: > > On Mon, Nov 15, 2021 at 01:37:49PM +, Stuart Henderson wrote: > > > On 2021/11/15 12:27, Klemens Nanni wrote: > > > > On Sun, Nov 14, 2021 at 07:04:42PM -0700, Theo de Raadt wrote: > > > > > I think physical interfaces should come up when something is > > > > > configured > > > > > on them, but virtual interfaces shouldn't -- mostly because the order > > > > > of > > > > > configuration is often muddled. > > > > > > > > So "inet6 2001:db8::1" in hostname.em0 will do the trick but > > > > hostname.vport0 would need another "up" for the same behaviour: that's > > > > rather confusing me as a user. > > > > > > hostname.* files are orthogonal to this; netstart can process all the > > > lines, > > > then if it has seen a line doing address configuration and has not seen an > > > explicit "down", it can bring the interface up automatically at the end. > > > (if this changed, it would be a nightmare for users to do anything else). > > > > Yes, netstart can and should deal with this correctly, just like you > > describe. > > > > > Users would need to make sure they have a netstart which does that if > > > updating a kernel, but that's just a case of matching kernel+userland and > > > is > > > nothing new for OpenBSD. > > > > > > The different behaviour would be apparent with separate runs of ifconfig. > > > some scripts may need adapting and users might need to run "ifconfig XX > > > up" > > > themselves but I don't think that would be a problem. > > > > Agreed. > > > > Having the implicit-up logic entirely contained in netstart would make > > lifer much easier, both for network stack hackers and users, imho. > > this was my attempt at just that. Given netstart(8) always pulls interfaces UP or DOWN as the last step, surely this wouldn't be all, right? Interfaces/drivers still pull themselves up, thus create route message churn, transition from initial DOWN to implicit UP due to address configuration to possibly administrative DOWN again due to "down" in hostname.*, etc. If we decide to handle this in netstart alone, shouldn't all interfaces behave like vport(4) and not mess with their state unless explicitly requested to do so? Not sure if the (now finally documented) implicit `up' in `autoconf' should stay after the netstart change, but that's another topic (pulling interfaces UP two times won't hurt, I guess). > the installer has its own netstart though, right? Yes, diff for that at the end after tweaking yours and adapting it. > Index: etc/netstart > === > RCS file: /cvs/src/etc/netstart,v > retrieving revision 1.216 > diff -u -p -r1.216 netstart > --- etc/netstart 2 Sep 2021 19:38:20 - 1.216 > +++ etc/netstart 15 Nov 2021 23:20:00 - > @@ -71,6 +71,9 @@ parse_hn_line() { > dhcp) _cmds[${#_cmds[*]}]="ifconfig $_if inet autoconf" > V4_AUTOCONF=true > ;; > + down) _c[0]= This reset seems unneeded, `_c' isn't used afterwards and I don't undertand why you do it. > + _ifup=down So `_ifup' comes from ifstart() and not parse_hn_line(). We use the "_" prefix to denote local variables... > + ;; > '!'*) _cmd=$(print -- "${_c[@]}" | sed 's/\$if/'$_if'/g') > _cmds[${#_cmds[*]}]="${_cmd#!}" > ;; > @@ -118,6 +121,7 @@ vifscreate() { > ifstart() { > local _if=$1 _hn=/etc/hostname.$1 _cmds _i=0 _line _stat > set -A _cmds > + _ifup=up except you define a global variable inside a function. This should be local to ifstart() (deserving its prefix), which makes it reach into the scope of parse_hn_line() (as is usual semantic with `local' variables in at least ksh and bash). I've added comments to reflect on this special use, as I'm unaware of any other piece of shell code were we actively use function-local variables up the call stack. > > # Interface names must be alphanumeric only. We check to avoid > # configuring backup or temp files, and to catch the "*" case. > @@ -145,6 +149,8 @@ ifstart() { > while IFS= read -- _line; do > parse_hn_line $_line > done <$_hn > + > + _cmds[${#_cmds[*]}]="ifconfig $_if $_ifup" Lastly, I'd say `_ifstate' because that equally means "up" and "down". > > # Apply the interface configuration commands stored in _cmds array. > while ((_i < ${#_cmds[*]})); do > Index: share/man/man5/hostname.if.5 > === > RCS file: /cvs/src/share/man/man5/hostname.if.5,v > retrieving revision 1.77 > diff -u -p -r1.77 hostname.if.5 > --- share/man/man5/hostname.if.5 17 Jul 2021 15:28:31 - 1.77 > +++ share/man/man5/hostname.if.5 15 Nov 2021 23:20:01 - > @@ -57,6 +57,9 @@ the administrator should not expect magi >
Re: IPsec tdb ref counting
> On 23 Nov 2021, at 18:16, Tobias Heider wrote: > > On Tue, Nov 23, 2021 at 02:18:26PM +0100, Alexander Bluhm wrote: >> On Tue, Nov 23, 2021 at 06:54:59AM +0100, Hrvoje Popovski wrote: >>> after 24 hours hitting sasyncd setup one box panic >> >> Thanks for testing. >> >> I have reduced my iked lifetime to about 10 seconds and got the >> same panic on my new 8 core test machine. >> >> ddb{2}> trace >> db_enter() at db_enter+0x10 >> panic(81eaa8e3) at panic+0xbf >> pool_do_get(821e64d8,9,8000238b0524) at pool_do_get+0x35c >> pool_get(821e64d8,9) at pool_get+0x93 >> tdb_alloc(0) at tdb_alloc+0x62 >> reserve_spi(0,100,,80d41254,80d41238,32,cbd2b00c6d3d3ec >> d) at reserve_spi+0xfc >> pfkeyv2_send(fd8739174900,81b3ba80,50) at pfkeyv2_send+0x19c6 >> pfkeyv2_output(fd80948cea00,fd8739174900,0,0) at pfkeyv2_output+0x8a >> pfkeyv2_usrreq(fd8739174900,9,fd80948cea00,0,0,8000238857b0) at >> pfk >> eyv2_usrreq+0x1b0 >> sosend(fd8739174900,0,8000238b0b60,0,0,0) at sosend+0x3a9 >> dofilewritev(8000238857b0,3,8000238b0b60,0,8000238b0c60) at >> dofilew >> ritev+0x14d >> sys_writev(8000238857b0,8000238b0c00,8000238b0c60) at >> sys_writev+0x >> d2 >> syscall(8000238b0cd0) at syscall+0x3a9 >> Xsyscall() at Xsyscall+0x128 >> >>> ddb{3}> show tdb >> >> You have to add the pool item addr to this command. >> >> I additionally have refcount tracing diff on my machine. With that >> I see this result: >> >> ddb{2}> show panic >> *cpu2: pool_do_get: tdb free list modified: page 0x8801; item >> addr 0 >> x8801c998; offset 0x28=0xdeadbeee >> >> ddb{2}> show tdb /f 0x8801c998 >> tdb at 0x8801c998 >> hnext: 0x4c38c8f8ffb0cab5 >> dnext: 0xff2c2a5ac7964242 >> snext: 0xdeadbeefdeadbeef >> ... >> tdb_trace[78]: 350309838: refs 5 -1 cpu2 ipsec_forward_check:1081 >> tdb_trace[79]: 350309839: refs 4 +1 cpu2 gettdb_dir:358 >> tdb_trace[80]: 350309840: refs 5 -1 cpu2 ipsec_common_input:355 >> tdb_trace[81]: 350309841: refs 4 +1 cpu2 gettdb_dir:358 >> tdb_trace[82]: 350309842: refs 5 -1 cpu2 ipsec_forward_check:1081 >> tdb_trace[83]: 350310888: refs 4 -1 cpu2 ipsp_spd_lookup:529 >> tdb_trace[84]: 350816099: refs 3 -1 cpu0 tdb_soft_timeout:726 >> tdb_trace[85]: 351266117: refs 2 +1 cpu2 gettdb_dir:358 >> tdb_trace[86]: 351266118: refs 3 +0 cpu2 pfkeyv2_send:1599 >> tdb_trace[87]: 351266119: refs 3 -1 cpu2 tdb_delete0:997 >> tdb_trace[88]: 351271898: refs 2 -1 cpu2 pfkeyv2_send:2143 >> tdb_trace[89]: 351300368: refs 1 +0 cpu0 tdb_timeout:688 >> tdb_trace[90]: 351300369: refs 1 -1 cpu0 tdb_delete0:997 >> tdb_trace[91]: 351300370: refs 3735928559 -1 cpu0 tdb_timeout:691 >> >> I will try mvs@ IPL_NET fix and think a bit more about the problem. >> >> bluhm >> > > It looks like the problem is that we are calling tdb_delete() twice, once from > pfkey_send() and again from tdb_timeout(). My guess is that the timeout task > is already scheduled and waiting for NETLOCK, which is why tdb_deltimeouts() > can't delete it. > The diff below adds a flag to prevent the TDB from being deleted more than > once. > This should fix the problem above. > This is strange. If timeout(9) handler is already running timeout_del(9) returns 0 and tdb_unref() will not be called. Also I expect to see refcounter assertion instead of pool corruption. > + > +void > +tdb_deltimeouts(struct tdb *tdbp) > +{ > + if (timeout_del(&tdbp->tdb_timer_tmo)) > + tdb_unref(tdbp); > Index: sys/net/if_bridge.c > === > RCS file: /cvs/src/sys/net/if_bridge.c,v > retrieving revision 1.358 > diff -u -p -r1.358 if_bridge.c > --- sys/net/if_bridge.c 11 Nov 2021 18:08:17 - 1.358 > +++ sys/net/if_bridge.c 23 Nov 2021 15:12:53 - > @@ -1567,20 +1567,28 @@ bridge_ipsec(struct ifnet *ifp, struct e > tdb->tdb_xform != NULL) { > if (tdb->tdb_first_use == 0) { > tdb->tdb_first_use = gettime(); > - if (tdb->tdb_flags & TDBF_FIRSTUSE) > - timeout_add_sec(&tdb->tdb_first_tmo, > - tdb->tdb_exp_first_use); > - if (tdb->tdb_flags & TDBF_SOFT_FIRSTUSE) > - timeout_add_sec(&tdb->tdb_sfirst_tmo, > - tdb->tdb_soft_first_use); > + if (tdb->tdb_flags & TDBF_FIRSTUSE) { > + if (timeout_add_sec( > + &tdb->tdb_first_tmo, > + tdb->tdb_exp_first_use)) > + tdb_ref(tdb); > +
Re: asr(3): strip AD flag in responses
ksh/ksh.1:Note that ksh/ksh.1:Note that any unquoted space before and after a pattern is ksh/ksh.1:Note that redirections specified after a function definition are ksh/ksh.1:Note that changing the ksh/ksh.1:Note that if the ksh/ksh.1:Note that ksh/ksh.1:Note that both the parameter name and the ksh/ksh.1:Note that if ksh/ksh.1:Note that since the command-line editors try to figure out how long the prompt ksh/ksh.1:Note that the backslash itself may be interpreted by the shell. ksh/ksh.1:Note that none of the above pattern elements match either a period ksh/ksh.1:Note that this means the command ksh/ksh.1:Note that this behaviour is slightly ksh/ksh.1:Note that if a command is not found using ksh/ksh.1:Note that special parameters (e.g.\& ksh/ksh.1:Note that ksh/ksh.1:Note that the Bourne shell differs here; ksh/ksh.1:Note that ksh/ksh.1:Note that setting this option does not affect the current value of the ksh/ksh.1:Note that some special rules are applied (courtesy of POSIX) ksh/ksh.1:Note that the output of ksh/ksh.1:Note that for ksh/ksh.1:Note that only commands that create processes (e.g. asynchronous commands, ksh/ksh.1:Note that editing command names are used only with the ksh/ksh.1:Note that the ^X stands for control-X; also To pick one example { list; } Compound construct; list is executed, but not in a subshell. Note that `{' and `}' are reserved words, not meta-characters. What would be wrong with saying it like this? { list; } Compound construct; list is executed, but not in a subshell. `{' and `}' are reserved words, not meta-characters. Hang on, I'm finding a piece of paper to make a note. Apparently this 2nd sentence is very important, the requirement I make a note of it must mean it is more important than the 1st sentence!
Re: asr(3): strip AD flag in responses
You mean to say Note that, you can drop "Note that". I have no idea where this construct came from. Maybe it should be replaced with "PAY ATTENTION NOW". It is just rude. Imagine if the cat manual page went like this: DESCRIPTION Note that the cat utility reads files sequentially, writing them to the standard output. Note that The file operands are processed in command-line order. Note that if file is a single dash (`-') or absent, cat reads from the standard input. Florian Obser wrote: > You could drop "Note that". Either way, OK florian > > On 23 November 2021 13:39:51 CET, Jeremie Courreges-Anglas > wrote: > >On Mon, Nov 22 2021, Florian Obser wrote: > >> On 2021-11-21 22:21 +01, Jeremie Courreges-Anglas wrote: > >>> On Sun, Nov 21 2021, Jeremie Courreges-Anglas wrote: > On Sat, Nov 20 2021, Florian Obser wrote: > >>> > >>> [...] > >>> > >> Index: lib/libc/asr/res_mkquery.c > >> === > >> RCS file: /home/cvs/src/lib/libc/asr/res_mkquery.c,v > >> retrieving revision 1.13 > >> diff -u -p -r1.13 res_mkquery.c > >> --- lib/libc/asr/res_mkquery.c 14 Jan 2019 06:49:42 - 1.13 > >> +++ lib/libc/asr/res_mkquery.c 20 Nov 2021 14:24:08 - > >> @@ -62,6 +62,9 @@ res_mkquery(int op, const char *dname, i > >>h.flags |= RD_MASK; > >>if (ac->ac_options & RES_USE_CD) > >>h.flags |= CD_MASK; > >> + if ((ac->ac_options & RES_TRUSTAD) && > >> + !(ac->ac_options & RES_USE_DNSSEC)) > >> + h.flags |= AD_MASK; > > > > do you remember why you check for !RES_USE_DNSSEC? > > I'd like to leave it out. > > First, here's my understanding of RES_USE_DNSSEC: it's supposed to > activate EDNS and set the DO bit. The server is then supposed to reply > with AD set only if the data has been validated (with or without > DNSSEC), and possibly append add DNSSEC data if available. > > Since I didn't know the semantics of both setting AD and DO in a query > (I would expect such semantics to be sane) I wrote those more > conservative checks instead. Does this make sense? If so, maybe > a comment would help? > > /* Set the AD flag unless we already plan to set the DNSSEC DO bit. */ > if ((ac->ac_options & RES_TRUSTAD) && > !(ac->ac_options & RES_USE_DNSSEC)) > > > In fact I don't think RES_USE_DNSSEC is useful > > at all. > > If you want to set the DO flag and start DNSSEC from first principles > > you are better of talking to the network directly, i.e. rewrite unwind. > > RES_USE_DNSSEC has been there since some time already and it's used by > software in the ports tree, precisely to detect AD=1 - and not so much > for the key records I think. > >>> > >>> BTW clearing AD might break software that depends on it for stuff like > >>> DANE. postfix uses RES_USE_DNSSEC for this, but also force-enables > >>> RES_TRUSTAD if available so it shouldn't be affected (upstream's stance > >>> is that you should only use a trusted resolver when running postfix). > >> > >> Ambitious... > >> > >> I actually had considerd to only do automatic trust-ad without any way > >> for users and programs to change it / expand it to non-localhost. > > > >> But then I thought it might be nice in a datacenter setting where you > >> might trust the path to the resolver. > > > >I also think that'trust-ad" is good thing to support. > > > >> And then postfix comes along :( > > > >If we don't want uptreams to use RES_TRUSTAD and thus force the admin to > >do a conscious choice, we could hide it as an implementation detail in > >asr/ or make it a noop (decoupling it from "options trust-ad" and your > >automatic detection). It would require careful thinking about how > >people use the API. Personnally I don't feel too strongly about giving > >people rope... > > > >>> On the other hand mail/exim knowlingly doesn't force RES_TRUSTAD. > >> > >> I don't think we'll brake anything, at leat it should not stop > >> progress. DANE verification has to be oportunistic, no? Good luck > >> sending/receiving mail while requiring that certificates are either > >> signed by a known CA or TLSA records are published. That will probably > >> lose you a whole lot of the internet... > > > >Some people in the corporate world actually start to require the use of > >certificates signed by a known CA. I only spotted this twice so far. > >Regarding DANE and TLSA, one of the goals is to be able to use strong > >authentication without deferring to commercial CAs. So I'm pretty sure > >some nerdy admins have started enforcing it where they can, just because > >they can. > > > >>> I don't know of other ports using RES_USE_DNSSEC and caring about the AD > >>> flag. > >>> > >>> The effect of RES_TRUSTAD/trust-ad on RES_USE_DNSSEC ought to
Re: asr(3): strip AD flag in responses
You could drop "Note that". Either way, OK florian On 23 November 2021 13:39:51 CET, Jeremie Courreges-Anglas wrote: >On Mon, Nov 22 2021, Florian Obser wrote: >> On 2021-11-21 22:21 +01, Jeremie Courreges-Anglas wrote: >>> On Sun, Nov 21 2021, Jeremie Courreges-Anglas wrote: On Sat, Nov 20 2021, Florian Obser wrote: >>> >>> [...] >>> >> Index: lib/libc/asr/res_mkquery.c >> === >> RCS file: /home/cvs/src/lib/libc/asr/res_mkquery.c,v >> retrieving revision 1.13 >> diff -u -p -r1.13 res_mkquery.c >> --- lib/libc/asr/res_mkquery.c 14 Jan 2019 06:49:42 - 1.13 >> +++ lib/libc/asr/res_mkquery.c 20 Nov 2021 14:24:08 - >> @@ -62,6 +62,9 @@ res_mkquery(int op, const char *dname, i >> h.flags |= RD_MASK; >> if (ac->ac_options & RES_USE_CD) >> h.flags |= CD_MASK; >> +if ((ac->ac_options & RES_TRUSTAD) && >> +!(ac->ac_options & RES_USE_DNSSEC)) >> +h.flags |= AD_MASK; > > do you remember why you check for !RES_USE_DNSSEC? > I'd like to leave it out. First, here's my understanding of RES_USE_DNSSEC: it's supposed to activate EDNS and set the DO bit. The server is then supposed to reply with AD set only if the data has been validated (with or without DNSSEC), and possibly append add DNSSEC data if available. Since I didn't know the semantics of both setting AD and DO in a query (I would expect such semantics to be sane) I wrote those more conservative checks instead. Does this make sense? If so, maybe a comment would help? /* Set the AD flag unless we already plan to set the DNSSEC DO bit. */ if ((ac->ac_options & RES_TRUSTAD) && !(ac->ac_options & RES_USE_DNSSEC)) > In fact I don't think RES_USE_DNSSEC is useful > at all. > If you want to set the DO flag and start DNSSEC from first principles > you are better of talking to the network directly, i.e. rewrite unwind. RES_USE_DNSSEC has been there since some time already and it's used by software in the ports tree, precisely to detect AD=1 - and not so much for the key records I think. >>> >>> BTW clearing AD might break software that depends on it for stuff like >>> DANE. postfix uses RES_USE_DNSSEC for this, but also force-enables >>> RES_TRUSTAD if available so it shouldn't be affected (upstream's stance >>> is that you should only use a trusted resolver when running postfix). >> >> Ambitious... >> >> I actually had considerd to only do automatic trust-ad without any way >> for users and programs to change it / expand it to non-localhost. > >> But then I thought it might be nice in a datacenter setting where you >> might trust the path to the resolver. > >I also think that'trust-ad" is good thing to support. > >> And then postfix comes along :( > >If we don't want uptreams to use RES_TRUSTAD and thus force the admin to >do a conscious choice, we could hide it as an implementation detail in >asr/ or make it a noop (decoupling it from "options trust-ad" and your >automatic detection). It would require careful thinking about how >people use the API. Personnally I don't feel too strongly about giving >people rope... > >>> On the other hand mail/exim knowlingly doesn't force RES_TRUSTAD. >> >> I don't think we'll brake anything, at leat it should not stop >> progress. DANE verification has to be oportunistic, no? Good luck >> sending/receiving mail while requiring that certificates are either >> signed by a known CA or TLSA records are published. That will probably >> lose you a whole lot of the internet... > >Some people in the corporate world actually start to require the use of >certificates signed by a known CA. I only spotted this twice so far. >Regarding DANE and TLSA, one of the goals is to be able to use strong >authentication without deferring to commercial CAs. So I'm pretty sure >some nerdy admins have started enforcing it where they can, just because >they can. > >>> I don't know of other ports using RES_USE_DNSSEC and caring about the AD >>> flag. >>> >>> The effect of RES_TRUSTAD/trust-ad on RES_USE_DNSSEC ought to be >>> documented, but let's do that in another diff: the documentation of the >>> latter option is already lacking. > >Proposal below. Feedback / oks? > > >Index: res_init.3 >=== >RCS file: /home/cvs/src/lib/libc/net/res_init.3,v >retrieving revision 1.5 >diff -u -p -p -u -r1.5 res_init.3 >--- res_init.3 22 Nov 2021 20:18:27 - 1.5 >+++ res_init.3 23 Nov 2021 12:25:01 - >@@ -218,6 +218,19 @@ uses 4096 bytes as input buffer size. > Request that the resolver uses > Domain Name System Security Extensions (DNSSEC), > as defined in RFCs 4033, 4034, and 4035. >+The resolver routines will use the EDNS0 extension and
Re: IPsec tdb ref counting
On 23.11.2021. 14:18, Alexander Bluhm wrote: > On Tue, Nov 23, 2021 at 06:54:59AM +0100, Hrvoje Popovski wrote: >> after 24 hours hitting sasyncd setup one box panic > > Thanks for testing. > > I have reduced my iked lifetime to about 10 seconds and got the > same panic on my new 8 core test machine. > > ddb{2}> trace > db_enter() at db_enter+0x10 > panic(81eaa8e3) at panic+0xbf > pool_do_get(821e64d8,9,8000238b0524) at pool_do_get+0x35c > pool_get(821e64d8,9) at pool_get+0x93 > tdb_alloc(0) at tdb_alloc+0x62 > reserve_spi(0,100,,80d41254,80d41238,32,cbd2b00c6d3d3ec > d) at reserve_spi+0xfc > pfkeyv2_send(fd8739174900,81b3ba80,50) at pfkeyv2_send+0x19c6 > pfkeyv2_output(fd80948cea00,fd8739174900,0,0) at pfkeyv2_output+0x8a > pfkeyv2_usrreq(fd8739174900,9,fd80948cea00,0,0,8000238857b0) at > pfk > eyv2_usrreq+0x1b0 > sosend(fd8739174900,0,8000238b0b60,0,0,0) at sosend+0x3a9 > dofilewritev(8000238857b0,3,8000238b0b60,0,8000238b0c60) at > dofilew > ritev+0x14d > sys_writev(8000238857b0,8000238b0c00,8000238b0c60) at > sys_writev+0x > d2 > syscall(8000238b0cd0) at syscall+0x3a9 > Xsyscall() at Xsyscall+0x128 > >> ddb{3}> show tdb > > You have to add the pool item addr to this command. > > I additionally have refcount tracing diff on my machine. With that > I see this result: > > ddb{2}> show panic > *cpu2: pool_do_get: tdb free list modified: page 0x8801; item > addr 0 > x8801c998; offset 0x28=0xdeadbeee > > ddb{2}> show tdb /f 0x8801c998 > tdb at 0x8801c998 > hnext: 0x4c38c8f8ffb0cab5 > dnext: 0xff2c2a5ac7964242 > snext: 0xdeadbeefdeadbeef > ... > tdb_trace[78]: 350309838: refs 5 -1 cpu2 ipsec_forward_check:1081 > tdb_trace[79]: 350309839: refs 4 +1 cpu2 gettdb_dir:358 > tdb_trace[80]: 350309840: refs 5 -1 cpu2 ipsec_common_input:355 > tdb_trace[81]: 350309841: refs 4 +1 cpu2 gettdb_dir:358 > tdb_trace[82]: 350309842: refs 5 -1 cpu2 ipsec_forward_check:1081 > tdb_trace[83]: 350310888: refs 4 -1 cpu2 ipsp_spd_lookup:529 > tdb_trace[84]: 350816099: refs 3 -1 cpu0 tdb_soft_timeout:726 > tdb_trace[85]: 351266117: refs 2 +1 cpu2 gettdb_dir:358 > tdb_trace[86]: 351266118: refs 3 +0 cpu2 pfkeyv2_send:1599 > tdb_trace[87]: 351266119: refs 3 -1 cpu2 tdb_delete0:997 > tdb_trace[88]: 351271898: refs 2 -1 cpu2 pfkeyv2_send:2143 > tdb_trace[89]: 351300368: refs 1 +0 cpu0 tdb_timeout:688 > tdb_trace[90]: 351300369: refs 1 -1 cpu0 tdb_delete0:997 > tdb_trace[91]: 351300370: refs 3735928559 -1 cpu0 tdb_timeout:691 > > I will try mvs@ IPL_NET fix and think a bit more about the problem. > > bluhm > Hi, i've got panic with mvs@ diff bluhm@ thank you for tips .. r620-2# panic: pool_do_get: tdb free list modified: page 0x812ee000; item addr 0 x812f1bb0; offset 0x28=0xdeafbeac Stopped at db_enter+0x10: popq%rbp TIDPIDUID PRFLAGS PFLAGS CPU COMMAND 263347 98359 680x10 02 sasyncd *136177 87522 680x10 03 isakmpd 282035451 0 0x14000 0x2001 softnet db_enter() at db_enter+0x10 panic(81ea6d34) at panic+0xbf pool_do_get(822308b8,9,800022da0f94) at pool_do_get+0x35c pool_get(822308b8,9) at pool_get+0x93 tdb_alloc(0) at tdb_alloc+0x62 reserve_spi(0,100,,812c4254,812c4238,32,3f96bc02a5ef3ac f) at reserve_spi+0xff pfkeyv2_send(fd83b1902a90,812c3400,50) at pfkeyv2_send+0x146a pfkeyv2_output(fd80a5358c00,fd83b1902a90,0,0) at pfkeyv2_output+0x8a pfkeyv2_usrreq(fd83b1902a90,9,fd80a5358c00,0,0,800022d03a48) at pfk eyv2_usrreq+0x1b0 sosend(fd83b1902a90,0,800022da15e0,0,0,0) at sosend+0x3a9 dofilewritev(800022d03a48,7,800022da15e0,0,800022da16e0) at dofilew ritev+0x14d sys_writev(800022d03a48,800022da1680,800022da16e0) at sys_writev+0x d2 syscall(800022da1750) at syscall+0x3a9 Xsyscall() at Xsyscall+0x128 end of kernel end trace frame: 0x7f7c6cc0, count: 1 https://www.openbsd.org/ddb.html describes the minimum info required in bug reports. Insufficient info makes it difficult to find and fix bugs. ddb{3}> show tdb 0x812ee000: (unknown address family)->(unknown address family) :0 #-2135853982 dead4110 ddb{3}> show all tdb 0x812ee8b0: fac0dfe4 192.168.42.113->192.168.42.100:50 #3 1082 0x812efdf0: 4e927a9b 192.168.42.112->192.168.42.100:50 #2 0012 0x812f0ab0: c630e737 192.168.42.100->192.168.42.113:50 #4 000d1082 ddb{3}> ddb{3}> show tdb /f 0x812ee000 tdb at 0x812ee000 hnext: 0x7105245c18ca6678 dnext: 0xf0a45d406013ea71 snext: 0xdeafbeaddeafbead inext: 0xdeafbeaddeafbead onext: 0xdeafbeaddeafbead
Re: IPsec tdb ref counting
On Tue, Nov 23, 2021 at 02:18:26PM +0100, Alexander Bluhm wrote: > On Tue, Nov 23, 2021 at 06:54:59AM +0100, Hrvoje Popovski wrote: > > after 24 hours hitting sasyncd setup one box panic > > Thanks for testing. > > I have reduced my iked lifetime to about 10 seconds and got the > same panic on my new 8 core test machine. > > ddb{2}> trace > db_enter() at db_enter+0x10 > panic(81eaa8e3) at panic+0xbf > pool_do_get(821e64d8,9,8000238b0524) at pool_do_get+0x35c > pool_get(821e64d8,9) at pool_get+0x93 > tdb_alloc(0) at tdb_alloc+0x62 > reserve_spi(0,100,,80d41254,80d41238,32,cbd2b00c6d3d3ec > d) at reserve_spi+0xfc > pfkeyv2_send(fd8739174900,81b3ba80,50) at pfkeyv2_send+0x19c6 > pfkeyv2_output(fd80948cea00,fd8739174900,0,0) at pfkeyv2_output+0x8a > pfkeyv2_usrreq(fd8739174900,9,fd80948cea00,0,0,8000238857b0) at > pfk > eyv2_usrreq+0x1b0 > sosend(fd8739174900,0,8000238b0b60,0,0,0) at sosend+0x3a9 > dofilewritev(8000238857b0,3,8000238b0b60,0,8000238b0c60) at > dofilew > ritev+0x14d > sys_writev(8000238857b0,8000238b0c00,8000238b0c60) at > sys_writev+0x > d2 > syscall(8000238b0cd0) at syscall+0x3a9 > Xsyscall() at Xsyscall+0x128 > > > ddb{3}> show tdb > > You have to add the pool item addr to this command. > > I additionally have refcount tracing diff on my machine. With that > I see this result: > > ddb{2}> show panic > *cpu2: pool_do_get: tdb free list modified: page 0x8801; item > addr 0 > x8801c998; offset 0x28=0xdeadbeee > > ddb{2}> show tdb /f 0x8801c998 > tdb at 0x8801c998 > hnext: 0x4c38c8f8ffb0cab5 > dnext: 0xff2c2a5ac7964242 > snext: 0xdeadbeefdeadbeef > ... > tdb_trace[78]: 350309838: refs 5 -1 cpu2 ipsec_forward_check:1081 > tdb_trace[79]: 350309839: refs 4 +1 cpu2 gettdb_dir:358 > tdb_trace[80]: 350309840: refs 5 -1 cpu2 ipsec_common_input:355 > tdb_trace[81]: 350309841: refs 4 +1 cpu2 gettdb_dir:358 > tdb_trace[82]: 350309842: refs 5 -1 cpu2 ipsec_forward_check:1081 > tdb_trace[83]: 350310888: refs 4 -1 cpu2 ipsp_spd_lookup:529 > tdb_trace[84]: 350816099: refs 3 -1 cpu0 tdb_soft_timeout:726 > tdb_trace[85]: 351266117: refs 2 +1 cpu2 gettdb_dir:358 > tdb_trace[86]: 351266118: refs 3 +0 cpu2 pfkeyv2_send:1599 > tdb_trace[87]: 351266119: refs 3 -1 cpu2 tdb_delete0:997 > tdb_trace[88]: 351271898: refs 2 -1 cpu2 pfkeyv2_send:2143 > tdb_trace[89]: 351300368: refs 1 +0 cpu0 tdb_timeout:688 > tdb_trace[90]: 351300369: refs 1 -1 cpu0 tdb_delete0:997 > tdb_trace[91]: 351300370: refs 3735928559 -1 cpu0 tdb_timeout:691 > > I will try mvs@ IPL_NET fix and think a bit more about the problem. > > bluhm > It looks like the problem is that we are calling tdb_delete() twice, once from pfkey_send() and again from tdb_timeout(). My guess is that the timeout task is already scheduled and waiting for NETLOCK, which is why tdb_deltimeouts() can't delete it. The diff below adds a flag to prevent the TDB from being deleted more than once. This should fix the problem above. Index: sys/net/if_bridge.c === RCS file: /cvs/src/sys/net/if_bridge.c,v retrieving revision 1.358 diff -u -p -r1.358 if_bridge.c --- sys/net/if_bridge.c 11 Nov 2021 18:08:17 - 1.358 +++ sys/net/if_bridge.c 23 Nov 2021 15:12:53 - @@ -1567,20 +1567,28 @@ bridge_ipsec(struct ifnet *ifp, struct e tdb->tdb_xform != NULL) { if (tdb->tdb_first_use == 0) { tdb->tdb_first_use = gettime(); - if (tdb->tdb_flags & TDBF_FIRSTUSE) - timeout_add_sec(&tdb->tdb_first_tmo, - tdb->tdb_exp_first_use); - if (tdb->tdb_flags & TDBF_SOFT_FIRSTUSE) - timeout_add_sec(&tdb->tdb_sfirst_tmo, - tdb->tdb_soft_first_use); + if (tdb->tdb_flags & TDBF_FIRSTUSE) { + if (timeout_add_sec( + &tdb->tdb_first_tmo, + tdb->tdb_exp_first_use)) + tdb_ref(tdb); + } + if (tdb->tdb_flags & TDBF_SOFT_FIRSTUSE) { + if (timeout_add_sec( + &tdb->tdb_sfirst_tmo, + tdb->tdb_soft_first_use)) + tdb_ref(tdb); + } } prot = (*(tdb->tdb_xform->xf_input))(&m, tdb, hlen,
Re: IPsec tdb ref counting
On Tue, Nov 23, 2021 at 06:54:59AM +0100, Hrvoje Popovski wrote: > after 24 hours hitting sasyncd setup one box panic Thanks for testing. I have reduced my iked lifetime to about 10 seconds and got the same panic on my new 8 core test machine. ddb{2}> trace db_enter() at db_enter+0x10 panic(81eaa8e3) at panic+0xbf pool_do_get(821e64d8,9,8000238b0524) at pool_do_get+0x35c pool_get(821e64d8,9) at pool_get+0x93 tdb_alloc(0) at tdb_alloc+0x62 reserve_spi(0,100,,80d41254,80d41238,32,cbd2b00c6d3d3ec d) at reserve_spi+0xfc pfkeyv2_send(fd8739174900,81b3ba80,50) at pfkeyv2_send+0x19c6 pfkeyv2_output(fd80948cea00,fd8739174900,0,0) at pfkeyv2_output+0x8a pfkeyv2_usrreq(fd8739174900,9,fd80948cea00,0,0,8000238857b0) at pfk eyv2_usrreq+0x1b0 sosend(fd8739174900,0,8000238b0b60,0,0,0) at sosend+0x3a9 dofilewritev(8000238857b0,3,8000238b0b60,0,8000238b0c60) at dofilew ritev+0x14d sys_writev(8000238857b0,8000238b0c00,8000238b0c60) at sys_writev+0x d2 syscall(8000238b0cd0) at syscall+0x3a9 Xsyscall() at Xsyscall+0x128 > ddb{3}> show tdb You have to add the pool item addr to this command. I additionally have refcount tracing diff on my machine. With that I see this result: ddb{2}> show panic *cpu2: pool_do_get: tdb free list modified: page 0x8801; item addr 0 x8801c998; offset 0x28=0xdeadbeee ddb{2}> show tdb /f 0x8801c998 tdb at 0x8801c998 hnext: 0x4c38c8f8ffb0cab5 dnext: 0xff2c2a5ac7964242 snext: 0xdeadbeefdeadbeef ... tdb_trace[78]: 350309838: refs 5 -1 cpu2 ipsec_forward_check:1081 tdb_trace[79]: 350309839: refs 4 +1 cpu2 gettdb_dir:358 tdb_trace[80]: 350309840: refs 5 -1 cpu2 ipsec_common_input:355 tdb_trace[81]: 350309841: refs 4 +1 cpu2 gettdb_dir:358 tdb_trace[82]: 350309842: refs 5 -1 cpu2 ipsec_forward_check:1081 tdb_trace[83]: 350310888: refs 4 -1 cpu2 ipsp_spd_lookup:529 tdb_trace[84]: 350816099: refs 3 -1 cpu0 tdb_soft_timeout:726 tdb_trace[85]: 351266117: refs 2 +1 cpu2 gettdb_dir:358 tdb_trace[86]: 351266118: refs 3 +0 cpu2 pfkeyv2_send:1599 tdb_trace[87]: 351266119: refs 3 -1 cpu2 tdb_delete0:997 tdb_trace[88]: 351271898: refs 2 -1 cpu2 pfkeyv2_send:2143 tdb_trace[89]: 351300368: refs 1 +0 cpu0 tdb_timeout:688 tdb_trace[90]: 351300369: refs 1 -1 cpu0 tdb_delete0:997 tdb_trace[91]: 351300370: refs 3735928559 -1 cpu0 tdb_timeout:691 I will try mvs@ IPL_NET fix and think a bit more about the problem. bluhm
Re: extern optind etc already declared in unistd.h
On Mon, 22 Nov 2021 17:53:48 -0700, "Theo de Raadt" wrote: > > This is the usr.sbin part; I am not touching nsd > > and other third-party stuff. > > Well, some of our programs do have -portable variations (not just openssh), > and some of them will have to cope. I don't think we really care about portability to pre-POSIX systems, do we? - todd
Re: asr(3): strip AD flag in responses
On Mon, Nov 22 2021, Florian Obser wrote: > On 2021-11-21 22:21 +01, Jeremie Courreges-Anglas wrote: >> On Sun, Nov 21 2021, Jeremie Courreges-Anglas wrote: >>> On Sat, Nov 20 2021, Florian Obser wrote: >> >> [...] >> > Index: lib/libc/asr/res_mkquery.c > === > RCS file: /home/cvs/src/lib/libc/asr/res_mkquery.c,v > retrieving revision 1.13 > diff -u -p -r1.13 res_mkquery.c > --- lib/libc/asr/res_mkquery.c14 Jan 2019 06:49:42 - 1.13 > +++ lib/libc/asr/res_mkquery.c20 Nov 2021 14:24:08 - > @@ -62,6 +62,9 @@ res_mkquery(int op, const char *dname, i > h.flags |= RD_MASK; > if (ac->ac_options & RES_USE_CD) > h.flags |= CD_MASK; > + if ((ac->ac_options & RES_TRUSTAD) && > + !(ac->ac_options & RES_USE_DNSSEC)) > + h.flags |= AD_MASK; do you remember why you check for !RES_USE_DNSSEC? I'd like to leave it out. >>> >>> First, here's my understanding of RES_USE_DNSSEC: it's supposed to >>> activate EDNS and set the DO bit. The server is then supposed to reply >>> with AD set only if the data has been validated (with or without >>> DNSSEC), and possibly append add DNSSEC data if available. >>> >>> Since I didn't know the semantics of both setting AD and DO in a query >>> (I would expect such semantics to be sane) I wrote those more >>> conservative checks instead. Does this make sense? If so, maybe >>> a comment would help? >>> >>> /* Set the AD flag unless we already plan to set the DNSSEC DO bit. */ >>> if ((ac->ac_options & RES_TRUSTAD) && >>> !(ac->ac_options & RES_USE_DNSSEC)) >>> In fact I don't think RES_USE_DNSSEC is useful at all. If you want to set the DO flag and start DNSSEC from first principles you are better of talking to the network directly, i.e. rewrite unwind. >>> >>> RES_USE_DNSSEC has been there since some time already and it's used by >>> software in the ports tree, precisely to detect AD=1 - and not so much >>> for the key records I think. >> >> BTW clearing AD might break software that depends on it for stuff like >> DANE. postfix uses RES_USE_DNSSEC for this, but also force-enables >> RES_TRUSTAD if available so it shouldn't be affected (upstream's stance >> is that you should only use a trusted resolver when running postfix). > > Ambitious... > > I actually had considerd to only do automatic trust-ad without any way > for users and programs to change it / expand it to non-localhost. > But then I thought it might be nice in a datacenter setting where you > might trust the path to the resolver. I also think that'trust-ad" is good thing to support. > And then postfix comes along :( If we don't want uptreams to use RES_TRUSTAD and thus force the admin to do a conscious choice, we could hide it as an implementation detail in asr/ or make it a noop (decoupling it from "options trust-ad" and your automatic detection). It would require careful thinking about how people use the API. Personnally I don't feel too strongly about giving people rope... >> On the other hand mail/exim knowlingly doesn't force RES_TRUSTAD. > > I don't think we'll brake anything, at leat it should not stop > progress. DANE verification has to be oportunistic, no? Good luck > sending/receiving mail while requiring that certificates are either > signed by a known CA or TLSA records are published. That will probably > lose you a whole lot of the internet... Some people in the corporate world actually start to require the use of certificates signed by a known CA. I only spotted this twice so far. Regarding DANE and TLSA, one of the goals is to be able to use strong authentication without deferring to commercial CAs. So I'm pretty sure some nerdy admins have started enforcing it where they can, just because they can. >> I don't know of other ports using RES_USE_DNSSEC and caring about the AD >> flag. >> >> The effect of RES_TRUSTAD/trust-ad on RES_USE_DNSSEC ought to be >> documented, but let's do that in another diff: the documentation of the >> latter option is already lacking. Proposal below. Feedback / oks? Index: res_init.3 === RCS file: /home/cvs/src/lib/libc/net/res_init.3,v retrieving revision 1.5 diff -u -p -p -u -r1.5 res_init.3 --- res_init.3 22 Nov 2021 20:18:27 - 1.5 +++ res_init.3 23 Nov 2021 12:25:01 - @@ -218,6 +218,19 @@ uses 4096 bytes as input buffer size. Request that the resolver uses Domain Name System Security Extensions (DNSSEC), as defined in RFCs 4033, 4034, and 4035. +The resolver routines will use the EDNS0 extension and set the DNSSEC DO +flag in queries, asking the name server to signal validated records by +setting the AD flag in the reply and to attach additional DNSSEC +records. +Note that the resolver routines will clear the AD flag in replies unless +the name servers are c
Re: IPsec tdb ref counting
On Tue, Nov 23, 2021 at 06:54:59AM +0100, Hrvoje Popovski wrote: > On 21.11.2021. 23:36, Alexander Bluhm wrote: > > Updated tdb refcounting diff after merging with mvs@'s commit. > > Hi, > > after 24 hours hitting sasyncd setup one box panic > > r620-2# panic: pool_do_get: tdb free list modified: page > 0x812e; item addr 0 > x812e2a88; offset 0x28=0xdead410f > Stopped at db_enter+0x10: popq%rbp > TIDPIDUID PRFLAGS PFLAGS CPU COMMAND > 272153 56403 680x10 01 sasyncd > * 32102 91875 680x10 03 isakmpd > 471006 1037 730x100010 0x800 syslogd > 484026 49900 0 0x14000 0x2004 softnet > 106465 49485 0 0x14000 0x2002 systq > db_enter() at db_enter+0x10 > panic(81ea6a83) at panic+0xbf > pool_do_get(822bd5a0,9,800022d80e34) at pool_do_get+0x35c > pool_get(822bd5a0,9) at pool_get+0x93 > tdb_alloc(0) at tdb_alloc+0x62 > reserve_spi(0,100,,812c6254,812c6238,32,73d60b71a1c10a4 > 9) at reserve_spi+0xff > pfkeyv2_send(fd8394d5f550,812c5c80,50) at pfkeyv2_send+0x146a > pfkeyv2_output(fd800a5c5100,fd8394d5f550,0,0) at pfkeyv2_output+0x8a > pfkeyv2_usrreq(fd8394d5f550,9,fd800a5c5100,0,0,800022cd5268) > at pfkeyv2_usrreq+0x1b0 > sosend(fd8394d5f550,0,800022d81480,0,0,0) at sosend+0x3a9 > dofilewritev(800022cd5268,7,800022d81480,0,800022d81580) at > dofilewritev+0x14d > sys_writev(800022cd5268,800022d81520,800022d81580) at > sys_writev+0xd2 > syscall(800022d815f0) at syscall+0x3a9 > Xsyscall() at Xsyscall+0x128 > end of kernel > end trace frame: 0x7f7b2430, count: 1 > https://www.openbsd.org/ddb.html describes the minimum info required in > bug reports. Insufficient info makes it difficult to find and fix bugs. > ddb{3}> > It seems `tdb_pool' was modified within interrupt handler. So it should be initialized with 'IPL_NET' level. 198 ipsp_init(void) 199 { 200 pool_init(&tdb_pool, sizeof(struct tdb), 0, IPL_SOFTNET, 0, 201 "tdb", NULL); I did this modification on bluhm@'s diff. Try it please. Index: sys/net/if_bridge.c === RCS file: /cvs/src/sys/net/if_bridge.c,v retrieving revision 1.358 diff -u -p -r1.358 if_bridge.c --- sys/net/if_bridge.c 11 Nov 2021 18:08:17 - 1.358 +++ sys/net/if_bridge.c 23 Nov 2021 08:40:06 - @@ -1567,20 +1567,28 @@ bridge_ipsec(struct ifnet *ifp, struct e tdb->tdb_xform != NULL) { if (tdb->tdb_first_use == 0) { tdb->tdb_first_use = gettime(); - if (tdb->tdb_flags & TDBF_FIRSTUSE) - timeout_add_sec(&tdb->tdb_first_tmo, - tdb->tdb_exp_first_use); - if (tdb->tdb_flags & TDBF_SOFT_FIRSTUSE) - timeout_add_sec(&tdb->tdb_sfirst_tmo, - tdb->tdb_soft_first_use); + if (tdb->tdb_flags & TDBF_FIRSTUSE) { + if (timeout_add_sec( + &tdb->tdb_first_tmo, + tdb->tdb_exp_first_use)) + tdb_ref(tdb); + } + if (tdb->tdb_flags & TDBF_SOFT_FIRSTUSE) { + if (timeout_add_sec( + &tdb->tdb_sfirst_tmo, + tdb->tdb_soft_first_use)) + tdb_ref(tdb); + } } prot = (*(tdb->tdb_xform->xf_input))(&m, tdb, hlen, off); + tdb_unref(tdb); if (prot != IPPROTO_DONE) ip_deliver(&m, &hlen, prot, af); return (1); } else { + tdb_unref(tdb); skiplookup: /* XXX do an input policy lookup */ return (0); Index: sys/net/if_pfsync.c === RCS file: /cvs/src/sys/net/if_pfsync.c,v retrieving revision 1.298 diff -u -p -r1.298 if_pfsync.c --- sys/net/if_pfsync.c 11 Nov 2021 12:35:01 - 1.298 +++ sys/net/if_pfsync.c 23 Nov 2021 08:40:06 - @@ -1325,11 +1325,13 @@ pfsync_update_net_tdb(struct pfsync_tdb /* Neither replay nor byte counter should ever decrease. */ if (pt->rpl < tdb->tdb_rpl || pt->cur_bytes < tdb->tdb_cur_bytes) { +