date:20220713

Re: Is rdomain correct for rtable in man ifconfig?

2022-07-13 Thread YASUOKA Masahiko

Hello,

On Thu, 14 Jul 2022 14:09:52 +0900 (JST)
Masato Asou  wrote:
> The TUNNEL in the man ifconfig(8) is described as follows:
> 
> TUNNEL
> 
>  tunneldomain rtable
>   ^^here
>  Use routing table rtable instead of the default table.  The
>^^here
>  tunnel does not need to terminate in the same routing domain as
>  the interface itself.  rtable can be set to any valid routing
> ^^here
>  table ID; the corresponding routing domain is derived from this
>  table.
> 
>  -tunneldomain
>  Use the default routing table and routing domain 0.
> 
> Shouldn't rdomain be specified for TUNNELDOMAIN, not rtable?

I think it actually means rtable.

> When tunneldomain is set, rdomain is displayed and Rdomain 1 is
> created as shown below:
> 
> $ netstat -R
> Rdomain 0
>   Interfaces: lo0 em0 enc0 pflog0 gif0
>   Routing table: 0
> 
> $ doas ifconfig gif0 tunneldomain 1
> 0 asou@asou-curr: ~  14:04:15
> $ ifconfig gif0   
> gif0: flags=8010 mtu 1280
> index 7 priority 0 llprio 3
> encap: txprio payload rxprio payload
> groups: gif
> tunnel: (unset) ttl 64 nodf ecn rdomain 1
> $ netstat -R
> Rdomain 0
>   Interfaces: lo0 em0 enc0 pflog0 gif0 wg0
>   Routing table: 0
> 
> Rdomain 1
>   Interface: lo1
>   Routing table: 1
> 
> $ 

Which version?  This doesn't match my test.

 # ifconfig  
 lo0: flags=8049 mtu 32768
 index 3 priority 0 llprio 3
 groups: lo
 inet6 ::1 prefixlen 128
 inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
 inet 127.0.0.1 netmask 0xff00
 em0: flags=8802 mtu 1500
 lladdr 52:54:00:12:34:56
 index 1 priority 0 llprio 3
 media: Ethernet autoselect (1000baseT full-duplex)
 status: active
 enc0: flags=0<>
 index 2 priority 0 llprio 3
 groups: enc
 status: active
 pflog0: flags=141 mtu 33136
 index 4 priority 0 llprio 3
 groups: pflog
 # 
 # netstat -R
 Rdomain 0
   Interfaces: lo0 em0 enc0 pflog0
   Routing table: 0
 
 # 
 # ifconfig gif0 tunneldomain 1
 ifconfig: SIOCSLIFPHYRTABLE: Invalid argument
 # 

tunneldomain X fails if X doesn't exist.

Also,

 # route -T1 add 10.0.0.0/8 127.0.0.1
 add net 10.0.0.0/8: gateway 127.0.0.1
 # 

create a rtable 1 by creating a dummy route.

 # ifconfig gif0 tunneldomain 1
 #
 # ifconfig gif0
 gif0: flags=8010 mtu 1280
 index 5 priority 0 llprio 3
 encap: txprio payload rxprio payload
 groups: gif
 tunnel: (unset) ttl 64 nodf ecn rdomain 1

the command becomes ok.

 # 
 # netstat -R 
 Rdomain 0
   Interfaces: lo0 em0 enc0 pflog0 gif0
   Routing tables: 0 1
 
 # sysctl kern.version   
 kern.version=OpenBSD 7.1 (GENERIC.MP) #465: Mon Apr 11 18:03:57 MDT 2022
 dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
 
 # 

It seems a rtable can be specified for "tunneldomain".

Re: dhcpleased(8): close unneeded bpf FDs

2022-07-13 Thread Florian Obser

On 2022-07-12 14:35 +02, Florian Obser  wrote:
> When the autoconf flag flaps around we might end up with multiple bpf
> FDs in flight. Things then get confusing. The kernel tells us we can
> read from the bpf FD but the data is actually "on the other FD", so
> read(2) returns 0.
>
> Found the hard way by, and patiently debugged with weerd@
>
> One way to trigger this is booting a vmm VM where dhcpleased(8)'s
> init_ifaces() loses a race against netstart(8). init_ifaces() would
> already see the autoconf flag and request a bpf FD.
> But then it would receive a RTM_IFINFO message without the autoconf flag
> set from when the interface came up. Then it will see another RTM_IFINFO
> message with the autoconf flag set and request yet another bpf FD. If
> the first bpf FD had not arrived yet we ended up with two in the frontend
> process.
>
> While here make sure a bpf FD has been configured before calling
> close(2) on it.
>
> OK?
>

Anyone? This is a slightly annoying bug... Difficult to hit though.


diff --git frontend.c frontend.c
index 2959a6c0d0f..275e5352e0b 100644
--- frontend.c
+++ frontend.c
@@ -1170,8 +1170,10 @@ remove_iface(uint32_t if_index)
return;
 
LIST_REMOVE(iface, entries);
-   event_del(&iface->bpfev.ev);
-   close(EVENT_FD(&iface->bpfev.ev));
+   if (event_initialized(&iface->bpfev.ev)) {
+   event_del(&iface->bpfev.ev);
+   close(EVENT_FD(&iface->bpfev.ev));
+   }
if (iface->udpsock != -1)
close(iface->udpsock);
free(iface);
@@ -1185,7 +1187,13 @@ set_bpfsock(int bpfsock, uint32_t if_index)
if ((iface = get_iface_by_id(if_index)) == NULL) {
/*
 * The interface disappeared while we were waiting for the
-* parent process to open the raw socket.
+* parent process to open the bpf socket.
+*/
+   close(bpfsock);
+   } else if (event_initialized(&iface->bpfev.ev)) {
+   /*
+* The autoconf flag is flapping and we have multiple bpf 
sockets in
+* flight. We don't need this one because we already got one.
 */
close(bpfsock);
} else {


-- 
I'm not entirely sure you are real.

Is rdomain correct for rtable in man ifconfig?

2022-07-13 Thread Masato Asou

Hi,

The TUNNEL in the man ifconfig(8) is described as follows:

TUNNEL

 tunneldomain rtable
  ^^here
 Use routing table rtable instead of the default table.  The
   ^^here
 tunnel does not need to terminate in the same routing domain as
 the interface itself.  rtable can be set to any valid routing
^^here
 table ID; the corresponding routing domain is derived from this
 table.

 -tunneldomain
 Use the default routing table and routing domain 0.

Shouldn't rdomain be specified for TUNNELDOMAIN, not rtable?

When tunneldomain is set, rdomain is displayed and Rdomain 1 is
created as shown below:

$ netstat -R
Rdomain 0
  Interfaces: lo0 em0 enc0 pflog0 gif0
  Routing table: 0

$ doas ifconfig gif0 tunneldomain 1
0 asou@asou-curr: ~  14:04:15
$ ifconfig gif0   
gif0: flags=8010 mtu 1280
index 7 priority 0 llprio 3
encap: txprio payload rxprio payload
groups: gif
tunnel: (unset) ttl 64 nodf ecn rdomain 1
$ netstat -R
Rdomain 0
  Interfaces: lo0 em0 enc0 pflog0 gif0 wg0
  Routing table: 0

Rdomain 1
  Interface: lo1
  Routing table: 1

$ 
--
ASOU Masato

Re: unp_solock_peer() and READ_ONCE()

2022-07-13 Thread Vitaliy Makkoveev

> On 14 Jul 2022, at 05:51, Visa Hankala  wrote:
> 
> On Thu, Jul 14, 2022 at 04:39:33AM +0300, Vitaliy Makkoveev wrote:
>> It looks like sometimes the `unp_conn' doesn't reloaded in
>> the "unp->unp_conn != unp2" check, and READ_ONCE() should prevent this.
> 
> Are you sure about the non-reloading of unp->unp_conn? I did a quick
> look in the compiled output on amd64, mips64 and riscv64, and did not
> spot any obvious change (there is no change on mips64 and riscv64).
> 

It seems to be. Otherwise I have no explanation of syzkaller reports.

May be it is not compiler, but CPU cache specific. I see the
difference on amd64, compiler loads to register before compare.

mov0x20(%r13),%rax
cmp%r12,%rax
je 0x...

instead of

cmp%r12,0x20(%r13)
je 0x...

I can’t say what is the real difference in cmp behaviour in both
cases. May be some one could explain.

> The loads are surrounded by lock/reference operations that imply
> memory clobbering. The compiler should not assume that unp->unp_conn
> is unchanged.
> 
>> unp_solock_peer(struct socket *so)
>> {
>>struct unpcb *unp, *unp2;
>>struct socket *so2;
>> 
>>unp = so->so_pcb;
>> 
>> again:
>>if ((unp2 = unp->unp_conn) == NULL)
>>return NULL;
>> 
>>so2 = unp2->unp_socket;
>> 
>>if (so < so2)
>>solock(so2);
>>else if (so > so2){
>>unp_ref(unp2);
>>sounlock(so);
>>solock(so2);
>>solock(so);
>> 
>>/* Datagram socket could be reconnected due to re-lock. */
>>if (unp->unp_conn != unp2) {
>>sounlock(so2);
>>unp_rele(unp2);
>>goto again;
>>}
>> 
>>unp_rele(unp2);
>>}
>> 
>>return so2;
>> }
>> 
>> Index: sys/kern/uipc_usrreq.c
>> ===
>> RCS file: /cvs/src/sys/kern/uipc_usrreq.c,v
>> retrieving revision 1.167
>> diff -u -p -r1.167 uipc_usrreq.c
>> --- sys/kern/uipc_usrreq.c   2 Jul 2022 11:49:23 -   1.167
>> +++ sys/kern/uipc_usrreq.c   14 Jul 2022 01:08:22 -
>> @@ -154,7 +154,7 @@ unp_solock_peer(struct socket *so)
>>  unp = so->so_pcb;
>> 
>> again:
>> -if ((unp2 = unp->unp_conn) == NULL)
>> +if ((unp2 = READ_ONCE(unp->unp_conn)) == NULL)
>>  return NULL;
>> 
>>  so2 = unp2->unp_socket;
>> @@ -168,7 +168,7 @@ again:
>>  solock(so);
>> 
>>  /* Datagram socket could be reconnected due to re-lock. */
>> -if (unp->unp_conn != unp2) {
>> +if (READ_ONCE(unp->unp_conn) != unp2) {
>>  sounlock(so2);
>>  unp_rele(unp2);
>>  goto again;
>>

Re: unp_solock_peer() and READ_ONCE()

2022-07-13 Thread Visa Hankala

On Thu, Jul 14, 2022 at 04:39:33AM +0300, Vitaliy Makkoveev wrote:
> It looks like sometimes the `unp_conn' doesn't reloaded in
> the "unp->unp_conn != unp2" check, and READ_ONCE() should prevent this.

Are you sure about the non-reloading of unp->unp_conn? I did a quick
look in the compiled output on amd64, mips64 and riscv64, and did not
spot any obvious change (there is no change on mips64 and riscv64).

The loads are surrounded by lock/reference operations that imply
memory clobbering. The compiler should not assume that unp->unp_conn
is unchanged.

> unp_solock_peer(struct socket *so)
> {
> struct unpcb *unp, *unp2;
> struct socket *so2;
> 
> unp = so->so_pcb;
> 
> again:
> if ((unp2 = unp->unp_conn) == NULL)
> return NULL;
> 
> so2 = unp2->unp_socket;
> 
> if (so < so2)
> solock(so2);
> else if (so > so2){
> unp_ref(unp2);
> sounlock(so);
> solock(so2);
> solock(so);
> 
> /* Datagram socket could be reconnected due to re-lock. */
> if (unp->unp_conn != unp2) {
> sounlock(so2);
> unp_rele(unp2);
> goto again;
> }
> 
> unp_rele(unp2);
> }
> 
> return so2;
> }
> 
> Index: sys/kern/uipc_usrreq.c
> ===
> RCS file: /cvs/src/sys/kern/uipc_usrreq.c,v
> retrieving revision 1.167
> diff -u -p -r1.167 uipc_usrreq.c
> --- sys/kern/uipc_usrreq.c2 Jul 2022 11:49:23 -   1.167
> +++ sys/kern/uipc_usrreq.c14 Jul 2022 01:08:22 -
> @@ -154,7 +154,7 @@ unp_solock_peer(struct socket *so)
>   unp = so->so_pcb;
>  
>  again:
> - if ((unp2 = unp->unp_conn) == NULL)
> + if ((unp2 = READ_ONCE(unp->unp_conn)) == NULL)
>   return NULL;
>  
>   so2 = unp2->unp_socket;
> @@ -168,7 +168,7 @@ again:
>   solock(so);
>  
>   /* Datagram socket could be reconnected due to re-lock. */
> - if (unp->unp_conn != unp2) {
> + if (READ_ONCE(unp->unp_conn) != unp2) {
>   sounlock(so2);
>   unp_rele(unp2);
>   goto again;
>

unp_solock_peer() and READ_ONCE()

2022-07-13 Thread Vitaliy Makkoveev

It looks like sometimes the `unp_conn' doesn't reloaded in
the "unp->unp_conn != unp2" check, and READ_ONCE() should prevent this.

unp_solock_peer(struct socket *so)
{
struct unpcb *unp, *unp2;
struct socket *so2;

unp = so->so_pcb;

again:
if ((unp2 = unp->unp_conn) == NULL)
return NULL;

so2 = unp2->unp_socket;

if (so < so2)
solock(so2);
else if (so > so2){
unp_ref(unp2);
sounlock(so);
solock(so2);
solock(so);

/* Datagram socket could be reconnected due to re-lock. */
if (unp->unp_conn != unp2) {
sounlock(so2);
unp_rele(unp2);
goto again;
}

unp_rele(unp2);
}

return so2;
}

Index: sys/kern/uipc_usrreq.c
===
RCS file: /cvs/src/sys/kern/uipc_usrreq.c,v
retrieving revision 1.167
diff -u -p -r1.167 uipc_usrreq.c
--- sys/kern/uipc_usrreq.c  2 Jul 2022 11:49:23 -   1.167
+++ sys/kern/uipc_usrreq.c  14 Jul 2022 01:08:22 -
@@ -154,7 +154,7 @@ unp_solock_peer(struct socket *so)
unp = so->so_pcb;
 
 again:
-   if ((unp2 = unp->unp_conn) == NULL)
+   if ((unp2 = READ_ONCE(unp->unp_conn)) == NULL)
return NULL;
 
so2 = unp2->unp_socket;
@@ -168,7 +168,7 @@ again:
solock(so);
 
/* Datagram socket could be reconnected due to re-lock. */
-   if (unp->unp_conn != unp2) {
+   if (READ_ONCE(unp->unp_conn) != unp2) {
sounlock(so2);
unp_rele(unp2);
goto again;

Re: [v3] amd64: simplify TSC sync testing

2022-07-13 Thread Theo de Raadt

Christian Weisgerber  wrote:

> Scott Cheloha:
> 
> > > kern.timecounter.tick=1
> > > kern.timecounter.timestepwarnings=0
> > > kern.timecounter.hardware=i8254
> > > kern.timecounter.choice=i8254(0) tsc(-1000) acpihpet0(1000) 
> > > acpitimer0(1000)
> > 
> > This is expected behavior with the patch.
> > 
> > cpu0's TSC is way out of sync with every
> > other CPU's TSC, so the TSC is marked
> > as a bad timecounter and a different one is
> > chosen.
> 
> Shouldn't it pick acpihpet0 then?

Let's keep in mind the goal:  When it works well, the TSC is a better timer
than all the others, so if the TSC problems can be diagnosed & resolved by 
various
measurements and mitigations, then we will want the TSC to be chosen.

So the long term roadmap probably does NOT include acpihpet0 being chosen.
Long term, that hopefully is a falacy.

Re: ifconfig description for wireguard peers

2022-07-13 Thread Mikolaj Kucharski

On Wed, Jul 13, 2022 at 05:43:59PM +0100, Stuart Henderson wrote:
> > 
> > Not sure how to handle long output in different way. If you don't
> > specify wgdesc to the ifconfig, the diff doesn't change anything and
> > ifconfig(8) output is exactly the same. If you don't find this feature
> > useful, by not using it, nothing changes for you. Isn't that fair?
> 
> It might be better if wgpeer details would be skipped from plain
> "ifconfig" and only listed if "ifconfig wg0" is used, much like
> IPv4 aliases are skipped from "ifconfig" and only listed when
> you use either "ifconfig -A" or "ifconfig em0" etc.
> 

Ah, that helps. Thank you! Will look into it. Indeed this makes more
sense. I was aware of -A, but was not aware of ifconfig  behaviour.

-- 
Regards,
 Mikolaj

Re: ifconfig description for wireguard peers

2022-07-13 Thread Stuart Henderson

On 2022/07/13 16:18, Mikolaj Kucharski wrote:
> On Wed, Jul 13, 2022 at 10:02:30AM -0600, Theo de Raadt wrote:
> > Mikolaj Kucharski  wrote:
> > 
> > > I took the libery and refreshed the patch. What I did so far:
> > > 
> > > - compiled GENERIC.MP on amd64
> > > - compiled new ifconfig, same arch
> > > - booted up new bsd.mp with the patch
> > > - when wgdesc is not set, pre-patch ifconfig seems to work
> > > - when with new ifconfig I set wgdesc, old ifconfig wg segfaults
> > > 
> > > Example from running -current:
> > > 
> > > pce-0067# ifconfig.new wg0 
> > > wg0: flags=80c3 mtu 1420
> > > index 8 priority 0 llprio 3
> > > wgport 51820
> > > wgpubkey qcb...
> > > wgpeer klM...
> > > description: ks2
> > > wgpsk (present)
> > > wgpka 25 (sec)
> > > wgendpoint xxx.xxx.xxx.xxx 51820
> > > tx: 1932, rx: 620
> > > last handshake: 83 seconds ago
> > > wgaip fde4:f456:48c2:13c0::/64
> > > groups: wg
> > > inet6 fde4:f456:48c2:13c0::cc67 prefixlen 64
> > 
> > That seems disgustingly verbose to me.
> > 
> > Who is going to read it?
> 
> Me. I find this one additional line useful.
> 
> > Shall we put all the active counters for normal ethernet/wifi into the
> > default ifconfig output, so that the default ifconfig output scrolls and
> > scrolls and scrolls and noone actually reads it?
> > 
> > I think this has gone off the rails.
> 
> This is the nature of wg(4) interface. I have machine with 25 peers and
> each peer adds lines to the ifconfig(8) output. This is on a machine
> without patch from this thread, -stable, official kernel:
> 
> # ifconfig wg0 | wc -l
>  128
> 
> Not sure how to handle long output in different way. If you don't
> specify wgdesc to the ifconfig, the diff doesn't change anything and
> ifconfig(8) output is exactly the same. If you don't find this feature
> useful, by not using it, nothing changes for you. Isn't that fair?

It might be better if wgpeer details would be skipped from plain
"ifconfig" and only listed if "ifconfig wg0" is used, much like
IPv4 aliases are skipped from "ifconfig" and only listed when
you use either "ifconfig -A" or "ifconfig em0" etc.

Re: ifconfig description for wireguard peers

2022-07-13 Thread Mikolaj Kucharski

On Wed, Jul 13, 2022 at 10:02:30AM -0600, Theo de Raadt wrote:
> Mikolaj Kucharski  wrote:
> 
> > I took the libery and refreshed the patch. What I did so far:
> > 
> > - compiled GENERIC.MP on amd64
> > - compiled new ifconfig, same arch
> > - booted up new bsd.mp with the patch
> > - when wgdesc is not set, pre-patch ifconfig seems to work
> > - when with new ifconfig I set wgdesc, old ifconfig wg segfaults
> > 
> > Example from running -current:
> > 
> > pce-0067# ifconfig.new wg0 
> > wg0: flags=80c3 mtu 1420
> > index 8 priority 0 llprio 3
> > wgport 51820
> > wgpubkey qcb...
> > wgpeer klM...
> > description: ks2
> > wgpsk (present)
> > wgpka 25 (sec)
> > wgendpoint xxx.xxx.xxx.xxx 51820
> > tx: 1932, rx: 620
> > last handshake: 83 seconds ago
> > wgaip fde4:f456:48c2:13c0::/64
> > groups: wg
> > inet6 fde4:f456:48c2:13c0::cc67 prefixlen 64
> 
> That seems disgustingly verbose to me.
> 
> Who is going to read it?

Me. I find this one additional line useful.

> Shall we put all the active counters for normal ethernet/wifi into the
> default ifconfig output, so that the default ifconfig output scrolls and
> scrolls and scrolls and noone actually reads it?
> 
> I think this has gone off the rails.

This is the nature of wg(4) interface. I have machine with 25 peers and
each peer adds lines to the ifconfig(8) output. This is on a machine
without patch from this thread, -stable, official kernel:

# ifconfig wg0 | wc -l
 128

Not sure how to handle long output in different way. If you don't
specify wgdesc to the ifconfig, the diff doesn't change anything and
ifconfig(8) output is exactly the same. If you don't find this feature
useful, by not using it, nothing changes for you. Isn't that fair?

-- 
Regards,
 Mikolaj

Re: ifconfig description for wireguard peers

2022-07-13 Thread Theo de Raadt

Mikolaj Kucharski  wrote:

> I took the libery and refreshed the patch. What I did so far:
> 
> - compiled GENERIC.MP on amd64
> - compiled new ifconfig, same arch
> - booted up new bsd.mp with the patch
> - when wgdesc is not set, pre-patch ifconfig seems to work
> - when with new ifconfig I set wgdesc, old ifconfig wg segfaults
> 
> Example from running -current:
> 
> pce-0067# ifconfig.new wg0 
> wg0: flags=80c3 mtu 1420
> index 8 priority 0 llprio 3
> wgport 51820
> wgpubkey qcb...
> wgpeer klM...
> description: ks2
> wgpsk (present)
> wgpka 25 (sec)
> wgendpoint xxx.xxx.xxx.xxx 51820
> tx: 1932, rx: 620
> last handshake: 83 seconds ago
> wgaip fde4:f456:48c2:13c0::/64
> groups: wg
> inet6 fde4:f456:48c2:13c0::cc67 prefixlen 64

That seems disgustingly verbose to me.

Who is going to read it?

Shall we put all the active counters for normal ethernet/wifi into the
default ifconfig output, so that the default ifconfig output scrolls and
scrolls and scrolls and noone actually reads it?

I think this has gone off the rails.

Re: ifconfig description for wireguard peers

2022-07-13 Thread Mikolaj Kucharski

On Wed, Jul 13, 2022 at 03:37:05PM +, Mikolaj Kucharski wrote:
> Example from running -current:
> 
> pce-0067# ifconfig.new wg0 
> wg0: flags=80c3 mtu 1420
> index 8 priority 0 llprio 3
> wgport 51820
> wgpubkey qcb...
> wgpeer klM...
> description: ks2
> wgpsk (present)
> wgpka 25 (sec)
> wgendpoint xxx.xxx.xxx.xxx 51820
> tx: 1932, rx: 620
> last handshake: 83 seconds ago
> wgaip fde4:f456:48c2:13c0::/64
> groups: wg
> inet6 fde4:f456:48c2:13c0::cc67 prefixlen 64
> 
> Seems to work for me. Looking at above ifconfig(8) output, not sure
> should the output should say `description:` or just follow the pattern
> of other wireguard keywords and just say `wgdesc` (same keyword like
> for ifconfig command and no colon). I think I would prefer that.
> 

I mean like this:

--- ifconfig.c  Wed Jul 13 15:19:24 2022
+++ ifconfig.c  Wed Jul 13 15:39:04 2022
@@ -5972,7 +5972,7 @@
printf("\twgpeer %s\n", key);
 
if (strlen(wg_peer->p_description))
-   printf("\t\tdescription: %s\n", wg_peer->p_description);
+   printf("\t\twgdesc %s\n", wg_peer->p_description);
 
if (wg_peer->p_flags & WG_PEER_HAS_PSK)
printf("\t\twgpsk (present)\n");



pce-0067# ifconfig.new wg
wg0: flags=80c3 mtu 1420
index 8 priority 0 llprio 3
wgport 51820
wgpubkey qcb...
wgpeer klM...
wgdesc ks2
wgpsk (present)
wgpka 25 (sec)
wgendpoint xxx.xxx.xxx.xxx 51820
tx: 3260, rx: 1116
last handshake: 69 seconds ago
wgaip fde4:f456:48c2:13c0::/64
groups: wg
inet6 fde4:f456:48c2:13c0::cc67 prefixlen 64

-- 
Regards,
 Mikolaj

Re: ifconfig description for wireguard peers

2022-07-13 Thread Mikolaj Kucharski

I took the libery and refreshed the patch. What I did so far:

- compiled GENERIC.MP on amd64
- compiled new ifconfig, same arch
- booted up new bsd.mp with the patch
- when wgdesc is not set, pre-patch ifconfig seems to work
- when with new ifconfig I set wgdesc, old ifconfig wg segfaults

Example from running -current:

pce-0067# ifconfig.new wg0 
wg0: flags=80c3 mtu 1420
index 8 priority 0 llprio 3
wgport 51820
wgpubkey qcb...
wgpeer klM...
description: ks2
wgpsk (present)
wgpka 25 (sec)
wgendpoint xxx.xxx.xxx.xxx 51820
tx: 1932, rx: 620
last handshake: 83 seconds ago
wgaip fde4:f456:48c2:13c0::/64
groups: wg
inet6 fde4:f456:48c2:13c0::cc67 prefixlen 64

Seems to work for me. Looking at above ifconfig(8) output, not sure
should the output should say `description:` or just follow the pattern
of other wireguard keywords and just say `wgdesc` (same keyword like
for ifconfig command and no colon). I think I would prefer that.

Patch is not mine. It's from Noah Meier  and
I just re-applied it to -current. It's at the end of this email.

I hope I didn't screw up anything.


On Wed, Jul 13, 2022 at 11:29:51AM +, Mikolaj Kucharski wrote:
> Hi,
> 
> I'm sorry I didn't test this, as I don't have -current OpenBSD on all my
> machines, but I do have one WireGuard gateway running -stable with few
> peers:
> 
> # ifconfig wg | grep -cw wgpeer
> 25
> 
> I would love to have this merged into main repo, as it would make my
> life a tiny bit easier to see which pubkey is which peer.
> 
> Based on what Stefan wrote below, do you have by any chance newer
> version of your diff, Noah?
> 
> 
> 
> On Wed, Dec 01, 2021 at 12:18:52AM +0100, Stefan Sperling wrote:
> > On Tue, Nov 30, 2021 at 02:31:20PM -0500, Noah Meier wrote:
> > > Hi Stefan,
> > > 
> > > Richard Procter offered some kind advice on the ordering of options in 
> > > the man page
> > > (to be done alphabetically) and commented on an unnecessary cast.
> > > 
> > > I also believe that I goofed by failing to initalize the mutex and zero 
> > > the description
> > > upon peer creation (in wg_peer_create).
> > > 
> > > I’ve attempted to address these issues and have pasted the diff below.
> > > 
> > > NM
> > 
> > This new patch does not apply cleanly.
> > Leading whitespace was stripped, and thus patch complains as follows:
> > 
> > Patching file if_wg.c using Plan A...
> > patch:  malformed patch at line 15: };
> > 
> > And it would help if you created a patch where all paths are relative to
> > the /usr/src directory. Something like this should do it:
> > 
> >  cd /usr/src
> >  cvs diff -u sbin/ifconfig sys/net  > /tmp/wgdesc.patch
> > 
> > If your mailer cannot preserve whitespace as-is then please try attaching
> > the patch file instead of inlining it.
> > 
> > Thanks!




Index: sbin/ifconfig/ifconfig.8
===
RCS file: /cvs/src/sbin/ifconfig/ifconfig.8,v
retrieving revision 1.384
diff -u -p -u -r1.384 ifconfig.8
--- sbin/ifconfig/ifconfig.827 Jun 2022 16:27:03 -  1.384
+++ sbin/ifconfig/ifconfig.813 Jul 2022 14:53:22 -
@@ -2251,6 +2251,10 @@ Set the peer's IPv4 or IPv6
 range for tunneled traffic.
 Repeat the option to set multiple ranges.
 By default, no addresses are allowed.
+.It Cm wgdesc Ar value
+Specify a description of the peer.
+.It Cm -wgdesc
+Clear the peer description.
 .It Cm wgendpoint Ar peer_address port
 Address traffic to the peer's IPv4 or IPv6
 .Ar peer_address
Index: sbin/ifconfig/ifconfig.c
===
RCS file: /cvs/src/sbin/ifconfig/ifconfig.c,v
retrieving revision 1.456
diff -u -p -u -r1.456 ifconfig.c
--- sbin/ifconfig/ifconfig.c8 Jul 2022 07:04:54 -   1.456
+++ sbin/ifconfig/ifconfig.c13 Jul 2022 14:53:22 -
@@ -355,12 +355,14 @@ void  setwgpeerep(const char *, const cha
 void   setwgpeeraip(const char *, int);
 void   setwgpeerpsk(const char *, int);
 void   setwgpeerpka(const char *, int);
+void   setwgpeerdesc(const char *, int);
 void   setwgport(const char *, int);
 void   setwgkey(const char *, int);
 void   setwgrtable(const char *, int);
 
 void   unsetwgpeer(const char *, int);
 void   unsetwgpeerpsk(const char *, int);
+void   unsetwgpeerdesc(const char *, int);
 void   unsetwgpeerall(const char *, int);
 
 void   wg_status();
@@ -620,11 +622,13 @@ const struct  cmd {
{ "wgaip",  NEXTARG,A_WIREGUARD,setwgpeeraip},
{ "wgpsk",  NEXTARG,A_WIREGUARD,setwgpeerpsk},
{ "wgpka",  NEXTARG,A_WIREGUARD,setwgpeerpka},
+   { "wgdesc", NEXTARG,A_WIREGUARD,setwgpeerdesc},
{ "wgport", NEXTARG,A_WIREGUARD,setwgport},
{ "wgkey",  NEXTARG,A_WIREGUARD,setwgkey},
{ "wgrtable

Re: Race in disk_attach_callback?

2022-07-13 Thread Stefan Sperling

On Wed, Jul 13, 2022 at 02:18:53PM +0200, Otto Moerbeek wrote:
> Hi,
> 
> after a prompt from stsp@ and florian@, reporting that newfs_msdos
> fails when given an $duid.i argument, I set down to see what could be
> going on. My test using an USB stick failed to reprdouce the problem.
> Then I started using a vnd, and that shows the issue (once in a
> while). The feeling is that any disk devcied created on the fly might
> show this issue.

Just as an additional detail: This also affects softraid volumes,
which is how I ran into the issue. Installing a laptop with
softraid full disk encryption on top of an EFI/GPT disk can result
in a failure to format the MSDOS partition required for installing
the EFI boot loader. When this happens the installer reports that
it could not install a boot loader.
A manual invocation of newfs_msdos with the correct /dev/sdxi device,
and a subsequent invocation of installboot, can be used as a workaround.

Re: Race in disk_attach_callback?

2022-07-13 Thread Otto Moerbeek

On Wed, Jul 13, 2022 at 02:18:53PM +0200, Otto Moerbeek wrote:

> Hi,
> 
> after a prompt from stsp@ and florian@, reporting that newfs_msdos
> fails when given an $duid.i argument, I set down to see what could be
> going on. My test using an USB stick failed to reprdouce the problem.
> Then I started using a vnd, and that shows the issue (once in a
> while). The feeling is that any disk devcied created on the fly might
> show this issue.
> 
> What I have found out so far: sometimes, disk_subr.c:disk_map()
> fails, because the device in question does not have the DKF_LABELVALID
> flag set, so it is skipped in the duid search.
> 
> The only place where DKF_LABELVALID is set is in
> disk_attach_callback().
> 
> Often the disk_readlabel() call in this function succeeds and the flag
> is set, but also often it does not succeed. 
> 
> I have observed it failing setting the message to "cannot read disk
> label, 0xe32/0x2932, error 25", which seems to come from the
> DIOCGPDINFO NOTTY case in dev/vnc.c
> 
> This is how far I got.

Looking a bit more, I see that the attachment is done from a task. I
suppose that if the open happens before the task is run, we end up
with a flag not set. 


> 
> I am using the test script below. In my case, instrumenting
> subr_disk.c with printf made the problem occur less often, so I
> suspect a race between the disk attachment and the label becoming
> available.
> 
> The vnconfig triggers the disk attachment message (but only the first time a
> vnd is created, it seems, I am not seeing any messages after a vnconfig
> -u and re-running the script).
> 
> 
> === script ===
> set -e
> dd if=/dev/random of=/tmp/image$1 bs=1m count=100
> vnconfig vnd$1 /tmp/image$1
> fdisk -ig vnd$1
> echo "a\ni\n\n\n\nw\nl\nq\n" | disklabel -E vnd$1
> ==
> 
> Try to run newfs_msdos on the duid.i reported. Note that you need to
> reboot to see a change a behaviour on a specific vnd. If it starts out
> OK, it keeps on being OK, if it fails, it keeps on failing. This
> corresponds to my observation that the disk_attach_callback() is
> called only once for a specifc vnd, even after unconfig and reconfig.
> 
> Hope somebody who knows how this all shoudl work can see where the
> issue is.
> 
>   -Otto
> 
> Index: subr_disk.c
> ===
> RCS file: /cvs/src/sys/kern/subr_disk.c,v
> retrieving revision 1.248
> diff -u -p -r1.248 subr_disk.c
> --- subr_disk.c   2 Jan 2022 17:26:14 -   1.248
> +++ subr_disk.c   13 Jul 2022 12:02:05 -
> @@ -1118,8 +1118,10 @@ disk_attach_callback(void *xdat)
>   /* Read disklabel. */
>   if (disk_readlabel(&dl, dk->dk_devno, errbuf, sizeof(errbuf)) == NULL) {
>   enqueue_randomness(dl.d_checksum);
> + printf("disk_attach_callback %s: seting DKF_LABELVALID\n", 
> dk->dk_name);
>   dk->dk_flags |= DKF_LABELVALID;
>   }
> + else printf("disk_attach_callback %s: NOT seting DKF_LABELVALID 
> (%s)\n", dk->dk_name, errbuf);
>  
>  done:
>   dk->dk_flags |= DKF_OPENED;
> @@ -1785,6 +1787,7 @@ disk_map(char *path, char *mappath, int 
>  
>   mdk = NULL;
>   TAILQ_FOREACH(dk, &disklist, dk_link) {
> + printf("YYY %p %s %x %p\n", dk, dk->dk_name, dk->dk_flags, 
> dk->dk_label);
>   if ((dk->dk_flags & DKF_LABELVALID) && dk->dk_label &&
>   memcmp(dk->dk_label->d_uid, uid,
>   sizeof(dk->dk_label->d_uid)) == 0) {
> @@ -1795,8 +1798,10 @@ disk_map(char *path, char *mappath, int 
>   }
>   }
>  
> - if (mdk == NULL || mdk->dk_name == NULL)
> + if (mdk == NULL || mdk->dk_name == NULL) {
> + printf("XXX %p\n", mdk);
>   return -1;
> + }
>  
>   snprintf(mappath, size, "/dev/%s%s%c",
>   (flags & DM_OPENBLCK) ? "" : "r", mdk->dk_name, part);
>

Race in disk_attach_callback?

2022-07-13 Thread Otto Moerbeek

Hi,

after a prompt from stsp@ and florian@, reporting that newfs_msdos
fails when given an $duid.i argument, I set down to see what could be
going on. My test using an USB stick failed to reprdouce the problem.
Then I started using a vnd, and that shows the issue (once in a
while). The feeling is that any disk devcied created on the fly might
show this issue.

What I have found out so far: sometimes, disk_subr.c:disk_map()
fails, because the device in question does not have the DKF_LABELVALID
flag set, so it is skipped in the duid search.

The only place where DKF_LABELVALID is set is in
disk_attach_callback().

Often the disk_readlabel() call in this function succeeds and the flag
is set, but also often it does not succeed. 

I have observed it failing setting the message to "cannot read disk
label, 0xe32/0x2932, error 25", which seems to come from the
DIOCGPDINFO NOTTY case in dev/vnc.c

This is how far I got.

I am using the test script below. In my case, instrumenting
subr_disk.c with printf made the problem occur less often, so I
suspect a race between the disk attachment and the label becoming
available.

The vnconfig triggers the disk attachment message (but only the first time a
vnd is created, it seems, I am not seeing any messages after a vnconfig
-u and re-running the script).


=== script ===
set -e
dd if=/dev/random of=/tmp/image$1 bs=1m count=100
vnconfig vnd$1 /tmp/image$1
fdisk -ig vnd$1
echo "a\ni\n\n\n\nw\nl\nq\n" | disklabel -E vnd$1
==

Try to run newfs_msdos on the duid.i reported. Note that you need to
reboot to see a change a behaviour on a specific vnd. If it starts out
OK, it keeps on being OK, if it fails, it keeps on failing. This
corresponds to my observation that the disk_attach_callback() is
called only once for a specifc vnd, even after unconfig and reconfig.

Hope somebody who knows how this all shoudl work can see where the
issue is.

-Otto

Index: subr_disk.c
===
RCS file: /cvs/src/sys/kern/subr_disk.c,v
retrieving revision 1.248
diff -u -p -r1.248 subr_disk.c
--- subr_disk.c 2 Jan 2022 17:26:14 -   1.248
+++ subr_disk.c 13 Jul 2022 12:02:05 -
@@ -1118,8 +1118,10 @@ disk_attach_callback(void *xdat)
/* Read disklabel. */
if (disk_readlabel(&dl, dk->dk_devno, errbuf, sizeof(errbuf)) == NULL) {
enqueue_randomness(dl.d_checksum);
+   printf("disk_attach_callback %s: seting DKF_LABELVALID\n", 
dk->dk_name);
dk->dk_flags |= DKF_LABELVALID;
}
+   else printf("disk_attach_callback %s: NOT seting DKF_LABELVALID 
(%s)\n", dk->dk_name, errbuf);
 
 done:
dk->dk_flags |= DKF_OPENED;
@@ -1785,6 +1787,7 @@ disk_map(char *path, char *mappath, int 
 
mdk = NULL;
TAILQ_FOREACH(dk, &disklist, dk_link) {
+   printf("YYY %p %s %x %p\n", dk, dk->dk_name, dk->dk_flags, 
dk->dk_label);
if ((dk->dk_flags & DKF_LABELVALID) && dk->dk_label &&
memcmp(dk->dk_label->d_uid, uid,
sizeof(dk->dk_label->d_uid)) == 0) {
@@ -1795,8 +1798,10 @@ disk_map(char *path, char *mappath, int 
}
}
 
-   if (mdk == NULL || mdk->dk_name == NULL)
+   if (mdk == NULL || mdk->dk_name == NULL) {
+   printf("XXX %p\n", mdk);
return -1;
+   }
 
snprintf(mappath, size, "/dev/%s%s%c",
(flags & DM_OPENBLCK) ? "" : "r", mdk->dk_name, part);

Re: ifconfig description for wireguard peers

2022-07-13 Thread Mikolaj Kucharski

Hi,

I'm sorry I didn't test this, as I don't have -current OpenBSD on all my
machines, but I do have one WireGuard gateway running -stable with few
peers:

# ifconfig wg | grep -cw wgpeer
25

I would love to have this merged into main repo, as it would make my
life a tiny bit easier to see which pubkey is which peer.

Based on what Stefan wrote below, do you have by any chance newer
version of your diff, Noah?



On Wed, Dec 01, 2021 at 12:18:52AM +0100, Stefan Sperling wrote:
> On Tue, Nov 30, 2021 at 02:31:20PM -0500, Noah Meier wrote:
> > Hi Stefan,
> > 
> > Richard Procter offered some kind advice on the ordering of options in the 
> > man page
> > (to be done alphabetically) and commented on an unnecessary cast.
> > 
> > I also believe that I goofed by failing to initalize the mutex and zero the 
> > description
> > upon peer creation (in wg_peer_create).
> > 
> > I’ve attempted to address these issues and have pasted the diff below.
> > 
> > NM
> 
> This new patch does not apply cleanly.
> Leading whitespace was stripped, and thus patch complains as follows:
> 
> Patching file if_wg.c using Plan A...
> patch:  malformed patch at line 15: };
> 
> And it would help if you created a patch where all paths are relative to
> the /usr/src directory. Something like this should do it:
> 
>  cd /usr/src
>  cvs diff -u sbin/ifconfig sys/net  > /tmp/wgdesc.patch
> 
> If your mailer cannot preserve whitespace as-is then please try attaching
> the patch file instead of inlining it.
> 
> Thanks!
> 

-- 
Regards,
 Mikolaj

Introduce fine grained pipex(4) locking

2022-07-13 Thread Vitaliy Makkoveev

Use per-session `pxs_mtx' mutex(9) to protect session context. Except
MPPE encryption, PPPOE sessions are mostly immutable, so no lock
required for that case.

Global pipex(4) data is already protected by `pipex_list_mtx', so
pipex(4) doesn't rely on netlock anymore. Netlock could be also removed
from pppx(4)/pppac(4) layer with next diffs.

This diff doesn't contain (*if_output)() magic.

PPPOE/L2TP cases were additionally tested by Hrvoje Popovski.

Index: sys/net/if_pppx.c
===
RCS file: /cvs/src/sys/net/if_pppx.c,v
retrieving revision 1.118
diff -u -p -r1.118 if_pppx.c
--- sys/net/if_pppx.c   2 Jul 2022 08:50:42 -   1.118
+++ sys/net/if_pppx.c   13 Jul 2022 09:47:24 -
@@ -637,8 +637,6 @@ pppx_add_session(struct pppx_dev *pxd, s
ifp->if_softc = pxi;
/* ifp->if_rdomain = req->pr_rdomain; */
if_counters_alloc(ifp);
-   /* XXXSMP: be sure pppx_if_qstart() called with NET_LOCK held */
-   ifq_set_maxlen(&ifp->if_snd, 1);
 
/* XXXSMP breaks atomicity */
NET_UNLOCK();
@@ -777,15 +775,27 @@ pppx_if_qstart(struct ifqueue *ifq)
struct ifnet *ifp = ifq->ifq_if;
struct pppx_if *pxi = (struct pppx_if *)ifp->if_softc;
struct mbuf *m;
-   int proto;
+   int proto, lock = 0;
+
+   switch (pxi->pxi_session->protocol) {
+   case PIPEX_PROTO_PPTP:
+   case PIPEX_PROTO_L2TP:
+   lock = 1;
+   break;
+   }
+
+   if (lock)
+   mtx_enter(&pxi->pxi_session->pxs_mtx);
 
-   NET_ASSERT_LOCKED();
while ((m = ifq_dequeue(ifq)) != NULL) {
proto = *mtod(m, int *);
m_adj(m, sizeof(proto));
 
pipex_ppp_output(m, pxi->pxi_session, proto);
}
+
+   if (lock)
+   mtx_leave(&pxi->pxi_session->pxs_mtx);
 }
 
 int
@@ -1022,8 +1032,6 @@ pppacopen(dev_t dev, int flags, int mode
ifp->if_output = pppac_output;
ifp->if_qstart = pppac_qstart;
ifp->if_ioctl = pppac_ioctl;
-   /* XXXSMP: be sure pppac_qstart() called with NET_LOCK held */
-   ifq_set_maxlen(&ifp->if_snd, 1);
 
if_counters_alloc(ifp);
if_attach(ifp);
@@ -1398,7 +1406,6 @@ pppac_qstart(struct ifqueue *ifq)
struct ip ip;
int rv;
 
-   NET_ASSERT_LOCKED();
while ((m = ifq_dequeue(ifq)) != NULL) {
 #if NBPFILTER > 0
if (ifp->if_bpf) {
Index: sys/net/pipex.c
===
RCS file: /cvs/src/sys/net/pipex.c,v
retrieving revision 1.145
diff -u -p -r1.145 pipex.c
--- sys/net/pipex.c 12 Jul 2022 08:58:53 -  1.145
+++ sys/net/pipex.c 13 Jul 2022 09:47:24 -
@@ -90,7 +90,6 @@ struct pool mppe_key_pool;
  * Locks used to protect global data
  *   A   atomic operation
  *   I   immutable after creation
- *   N   net lock
  *   L   pipex_list_mtx
  */
 
@@ -171,7 +170,6 @@ pipex_ioctl(void *ownersc, u_long cmd, c
 {
int ret = 0;
 
-   NET_ASSERT_LOCKED();
switch (cmd) {
case PIPEXGSTAT:
ret = pipex_get_stat((struct pipex_session_stat_req *)data,
@@ -567,8 +565,6 @@ pipex_get_stat(struct pipex_session_stat
struct pipex_session *session;
int error = 0;
 
-   NET_ASSERT_LOCKED();
-
session = pipex_lookup_by_session_id(req->psr_protocol,
req->psr_session_id);
if (session == NULL)
@@ -771,7 +767,7 @@ pipex_timer(void *ignored_arg)
 Static void
 pipex_ip_output(struct mbuf *m0, struct pipex_session *session)
 {
-   int is_idle;
+   int is_idle, lock = 0;
 
if ((session->flags & PIPEX_SFLAGS_MULTICAST) == 0) {
/*
@@ -800,7 +796,20 @@ pipex_ip_output(struct mbuf *m0, struct 
goto dropped;
}
 
+   switch (session->protocol) {
+   case PIPEX_PROTO_PPTP:
+   case PIPEX_PROTO_L2TP:
+   lock = 1;
+   break;
+   }
+
+   if (lock)
+   mtx_enter(&session->pxs_mtx);
+
pipex_ppp_output(m0, session, PPP_IP);
+
+   if (lock)
+   mtx_leave(&session->pxs_mtx);
} else {
struct pipex_session *session_tmp;
struct mbuf *m;
@@ -820,9 +829,22 @@ pipex_ip_output(struct mbuf *m0, struct 
mtx_leave(&pipex_list_mtx);
 
m = m_copym(m0, 0, M_COPYALL, M_NOWAIT);
-   if (m != NULL)
+   if (m != NULL) {
+   switch (session_tmp->protocol) {
+   case PIPEX_PROTO_PPTP:
+   case PIPEX_PROTO_L2TP:
+   lock = 1;
+   break;
+

Re: amd64 serial console changes, part 2

2022-07-13 Thread Mark Kettenis

> Date: Tue, 12 Jul 2022 20:03:15 +0200
> From: Denis Fondras 
> 
> Le Wed, Jul 06, 2022 at 10:45:39PM +0200, Mark Kettenis a écrit :
> > Now that the kernel supports the extended BOOTARG_CONSDEV struct and
> > snaps with that change are out there, here is the diff that changes
> > the amd64 bootloaders to switch to the extended struct and provide the
> > parameters necessary for using the non-standard UART on the AMD Ryzen
> > Embedded V1000 SoCs.
> > 
> > It would be good if someone can confirm this works on something like
> > an APU.
> > 
> 
> I don't have any other EFI appliance to test but it reads fine, applies and
> builds OK.
> 
> Anyway, I could not make it work on the AMD Ryzen Embedded V1000. I
> might be missing a step here. I built a kernel with the diff
> applied, built the ramdrive and tried to boot it but it still
> reboots when in ELFNAME().

You'll need an image with an updated bootloader; check that the
version number of the bootloader is at least 3.61.

The diff is committed now, so the easiest thing is to just wait for a
fresh snapshot.

Cheers,

Mark

Re: Is rdomain correct for rtable in man ifconfig?

Re: dhcpleased(8): close unneeded bpf FDs

Is rdomain correct for rtable in man ifconfig?

Re: unp_solock_peer() and READ_ONCE()

Re: unp_solock_peer() and READ_ONCE()

unp_solock_peer() and READ_ONCE()

Re: [v3] amd64: simplify TSC sync testing

Re: ifconfig description for wireguard peers

Re: ifconfig description for wireguard peers

Re: ifconfig description for wireguard peers

Re: ifconfig description for wireguard peers

Re: ifconfig description for wireguard peers

Re: ifconfig description for wireguard peers

Re: Race in disk_attach_callback?

Re: Race in disk_attach_callback?

Race in disk_attach_callback?

Re: ifconfig description for wireguard peers

Introduce fine grained pipex(4) locking

Re: amd64 serial console changes, part 2

19 matches

Site Navigation

Mail list logo

Footer information