Small change in sysupgrade for custom release and test

2023-05-16 Thread Sven F.
Bienvenue,

--- /usr/sbin/sysupgrade.oldTue May 16 18:53:13 2023
+++ /usr/sbin/sysupgradeTue May 16 19:04:46 2023
@@ -143,6 +143,7 @@
 case ${_LINE} in
 *\ ${_KEY})SIGNIFY_KEY=/etc/signify/${_KEY} ;;
 *\ ${_NEXTKEY})SIGNIFY_KEY=/etc/signify/${_NEXTKEY} ;;
+*\ *.pub)  SIGNIFY_KEY=/etc/signify/${_LINE##* } && echo Using custom
key $SIGNIFY_KEY ;;
 *) err "invalid signing key" ;;
 esac

Read the signing key in the file so you can use a custom key when
testing release,

Have a good one.



Re: Very little patch : ref getrtable in rdomain

2021-05-19 Thread Sven F.
On Wed, May 19, 2021 at 12:21 PM Jason McIntyre  wrote:

> On Wed, May 19, 2021 at 10:55:11AM -0400, Sven F. wrote:
> > Index: rdomain.4
> > ===
> > RCS file: /cvs/src/share/man/man4/rdomain.4,v
> > retrieving revision 1.17
> > diff -u -p -r1.17 rdomain.4
> > --- rdomain.4   24 Sep 2020 11:05:32 -  1.17
> > +++ rdomain.4   19 May 2021 14:51:37 -
> > @@ -139,7 +139,8 @@ Delete rdomain 4 again:
> >  .Xr route 4 ,
> >  .Xr pf.conf 5 ,
> >  .Xr ifconfig 8 ,
> > -.Xr route 8
> > +.Xr route 8,
> > +.Xr getrtable 2
> >  .Sh HISTORY
> >  .Ox
> >  support for
> >
>
> hi.
>
> i'll let others decide whether we expect users of rdomains/rtables to need
> to
> read the programming pages. i suspect not.
>

> note that your addition to SEE ALSO is wrong - there should be a
> space between the section number and punctuation, and pages are
> ordered by section first, so it'd be added after the entry to ps(1).
> get used to running doc changes through "mandoc -Tlint" to pick up
> such small fry...
>
> jmc
>
>
yes the diff just propose , it's not following standard
unsure why the actual getrtable call would not be shown here


Index: rdomain.4
===
RCS file: /cvs/src/share/man/man4/rdomain.4,v
retrieving revision 1.17
diff -u -p -r1.17 rdomain.4
--- rdomain.4   24 Sep 2020 11:05:32 -  1.17
+++ rdomain.4   19 May 2021 14:51:37 -
@@ -139,7 +139,8 @@ Delete rdomain 4 again:
+ .Xr getrtable 2 ,
 .Xr route 4 ,
 .Xr pf.conf 5 ,
 .Xr ifconfig 8 ,
 .Xr route 8
 .Sh HISTORY
 .Ox
 support for

-- 
--
-
Knowing is not enough; we must apply. Willing is not enough; we must do


Re: Very little patch : ref getrtable in rdomain

2021-05-19 Thread Sven F.
On Wed, May 19, 2021 at 10:55 AM Sven F.  wrote:

> Index: rdomain.4
> ===
> RCS file: /cvs/src/share/man/man4/rdomain.4,v
> retrieving revision 1.17
> diff -u -p -r1.17 rdomain.4
> --- rdomain.4   24 Sep 2020 11:05:32 -  1.17
> +++ rdomain.4   19 May 2021 14:51:37 -
> @@ -139,7 +139,8 @@ Delete rdomain 4 again:
>  .Xr route 4 ,
>  .Xr pf.conf 5 ,
>  .Xr ifconfig 8 ,
> -.Xr route 8
> +.Xr route 8,
> +.Xr getrtable 2
>  .Sh HISTORY
>  .Ox
>  support for
>
> PS:
>
> Is it possible to setrtable in perl script ?
>
>
>
syscall(310, XX);


Very little patch : ref getrtable in rdomain

2021-05-19 Thread Sven F.
Index: rdomain.4
===
RCS file: /cvs/src/share/man/man4/rdomain.4,v
retrieving revision 1.17
diff -u -p -r1.17 rdomain.4
--- rdomain.4   24 Sep 2020 11:05:32 -  1.17
+++ rdomain.4   19 May 2021 14:51:37 -
@@ -139,7 +139,8 @@ Delete rdomain 4 again:
 .Xr route 4 ,
 .Xr pf.conf 5 ,
 .Xr ifconfig 8 ,
-.Xr route 8
+.Xr route 8,
+.Xr getrtable 2
 .Sh HISTORY
 .Ox
 support for

PS:

Is it possible to setrtable in perl script ?

-- 
--
-
Knowing is not enough; we must apply. Willing is not enough; we must do


Re: missing arg in esample renice man command ?

2021-04-06 Thread Sven F.
On Tue, Apr 6, 2021 at 2:29 PM Jason McIntyre  wrote:
>
> On Tue, Apr 06, 2021 at 01:42:51PM -0400, Sven F. wrote:
> > Line 120 , in renice.8
> >
> > https://github.com/openbsd/src/blob/master/usr.bin/renice/renice.8#L120
> >
> > # renice -n +1 987 -u daemon root -p 32
> >
> > should be
> >
> > # renice -n +1 -g 987 -u daemon root -p 32 ?
> >
> >
> > --- usr.bin/renice/renice.8.orig   2021-04-06 13:41:09.272347600 -0400
> > +++ usr.bin/renice/renice.82021-04-06 13:41:45.089202200 -0400
> > @@ -117,7 +117,7 @@
> >  changes the priority of process IDs 987 and 32,
> >  and all processes owned by users daemon and root:
> >  .Bd -literal -offset indent
> > -# renice -n +1 987 -u daemon root -p 32
> > +# renice -n +1 -g 987 -u daemon root -p 32
> >  .Ed
> >  .Sh SEE ALSO
> >  .Xr nice 1 ,
> >
>
> hi.
>
> i don;t think the example is wrong. it says it alters the priority of
> process IDs 987 and 32. and the man page notes:
>
>  If none of the -gpu options are specified, the default is
>  to select by process ID.
>
> having said that, i do think the example is a really weird way to
> specify IDs 987 and 32 and users daemon and root. and since it does give
> both -u and -p, it might not be entirely obvious to link it back to the
> text sating that -p is the default. it would be saner to do:
>
> # renice -n +1 -p 987 32 -u daemon root
> or
> # renice -n +1 987 32 -u daemon root
>
> (not tested!)
>
> both netbsd and freebsd have the same example that we currently have, so
> they're probably pretty old. so not sure if it's worth changing. or leaving
> as an exercise to the reader...
>
> jmc
>

example is fine, i got confused doing more than one thing at a time
bit confusing the 987 is not after the option like in SYNOPSIS

On a helping / clarity point of vue,changing
`The following example changes the priority of process`
by something like
`The following raise the priority of process to allocate more
resources to it by 1 `
may be more helping to someone in a hurry


-- 
--
-
Knowing is not enough; we must apply. Willing is not enough; we must do



missing arg in esample renice man command ?

2021-04-06 Thread Sven F.
Line 120 , in renice.8

https://github.com/openbsd/src/blob/master/usr.bin/renice/renice.8#L120

# renice -n +1 987 -u daemon root -p 32

should be

# renice -n +1 -g 987 -u daemon root -p 32 ?


--- usr.bin/renice/renice.8.orig   2021-04-06 13:41:09.272347600 -0400
+++ usr.bin/renice/renice.82021-04-06 13:41:45.089202200 -0400
@@ -117,7 +117,7 @@
 changes the priority of process IDs 987 and 32,
 and all processes owned by users daemon and root:
 .Bd -literal -offset indent
-# renice -n +1 987 -u daemon root -p 32
+# renice -n +1 -g 987 -u daemon root -p 32
 .Ed
 .Sh SEE ALSO
 .Xr nice 1 ,


-- 
--
-
Knowing is not enough; we must apply. Willing is not enough; we must do



Re: [External] : Re: pf route-to issues

2021-01-24 Thread Sven F.
On Sun, Jan 24, 2021 at 11:51 PM David Gwynne  wrote:

> On Mon, Jan 25, 2021 at 02:50:12AM +0100, Alexandr Nedvedicky wrote:
> > Hello,
> >
> > >
> > > ok. i don't know how to split up the rest of the change though.
> > >
> > > here's an updated diff that includes the rest of the kernel changes and
> > > the pfctl and pf.conf tweaks.
> > >
> > > it's probably useful for me to try and explain at a high level what
> > > i think the semantics should be, otherwise we might end up arguing
> about
> > > which bits of the current config i broke.
> > >
> > > so, from an extremely high level point of view, and apologies if
> > > this is condescending, pf sits between the network stack and an
> > > interface that a packet travels on. for connections handled by the
> > > local box, this means packets come from the stack and get an output
> > > interface selected by a route lookup, then pf checks it, and then
> > > it goes out the selected interface. replies come into an interface,
> > > get checked by pf, and then enter the stack. when forwarding, a
> > > packet comes into an interface, pf checks it, the stack does a route
> > > lookup to pick an interface, pf checks it again, and then it goes
> > > out the interface.
> > >
> > > so what does it mean when route-to (or reply-to) gets involved? i'm
> > > saying that when route-to is applied to a packet, pf takes the packet
> > > away from the stack and immediately forwards it toward to specified
> > > destination address. for a packet entering the system, ie, when the
> > > packet is going from the interface into the stack, route-to should
> > > pretend that it is forwarding the packet and basically push it
> > > straight out an interface. however, like normal forwarding via the
> > > stack, there might be some policy on packets leaving that interface
> that
> > > you want to apply, so pf should run pf_test in that situation so the
> > > policy can be applied. this is especially useful if you need to apply
> > > nat-to when packets leave a particular interface.
> > >
> > > however, if you route-to when a packet is on the way out of the
> > > stack, i'm arguing that pf should not run again against that packet.
> > > currently route-to rules run pf_test again if the interface the packet
> > > is routed out of changes, which means pf runs multiple times against a
> > > packet if rules keep changing which interface it goes out. this means
> > > there's loop prevention in pf to mitigate against this, and weird
> > > potentials for multiple states to be created when nat gets involved.
> > >
> > > for simplicity, both in terms of reasoning and code i think pf should
> > > only be run once when a packet enters the system, and only once when it
> > > leaves the system. the only reason i can come up with for running
> > > pf_test multiple times when route-to changes the outgoing interface is
> > > so you can check the packet with "pass out on $new_if" type rules. we
> > > don't rerun pf again when nat/rdr changes addresses, so this feels
> > > inconsistent to me.
> >
> > I understand that simple is better here, so I won't object
> > if we will lean towards simplified model above. However I still
> > would like to share my view on current PF.
> >
> > the way I understand how things (should) work currently is fairly
> simple:
> >
> >   we always run pf_test() as packet crosses interface.
> >   packet can cross interface either in outbound or
> >   inbound direction.
>
> That's how I understand the current code. I'm proposing that we change
> the semantics so they are:
>
> - we always run pf_test as a packet enters or leaves the network stack.
> - pf is able to filter or apply policy based on various attributes
>   of the packet such as addresses and ports, but also metadata about
>   the packet such as the current prio, or the interface it came
>   from or is going to.
> - changing a packet or it's metadata does not cause a rerun of pf_test.
> - route-to on an incoming packet basically bypasses the default
>   stack processing with a "fast route" out of the stack.
>
> > this way we can always create a complex route-to loops,
> > however it can also solve some route-to vs. NAT issues.
> > consider those fairly innocent rules:
> >
> > 8<---8<---8<--8<
> > table  { 10.10.10.10, 172.16.1.1 }
> >
> > pass out on em0 from 192.168.1.0/24 to any route-to 
> > pass out on em1 from 192.168.1.0 to any nat-to (em1)
> > pass out on em2 all
> > 8<---8<---8<--8<
> >
> > Rules above should currently work, but will stop if we will
> > go with simplified model.
>
> The entries in  make the packet go out em1 and em2?
>
> I'm ok with breaking configs like that. We don't run pf_test again for
> other changes to the packet, so if we do want to support something like
> that I think we should make the following work:
>
>   # pf_pdesc kif is em0
>

Re: Enhancing (some) PF state

2020-12-18 Thread Sven F.
On Fri, Dec 18, 2020 at 3:53 AM Alexandr Nedvedicky
 wrote:
>
> Hello Sven,
>
> your change makes me wonder: 'what is the actual problem you are trying to
> solve'?
>
> the reason I'm asking is that latency is just one factor, which contributes to
> TCP connection performance. The other factor (and perhaps more important) is 
> to
> guess amount of retransmitted data. Processes (a.k.a. endpoints), which
> communicate over TCP can experience significant delay once TCP packets starts
> to be dropped. Those dropped TCP packets contribute to delay experienced in 
> the
> more significant way, than 'network latency' in sense of roundtrip.
>
> I'm not much experienced firewall administrator, the only firewall I run is
> APU box at my home, hence I'm sorry if my question sounds naive. So basically
> what sort of problem in network you hope to diagnose with PF?
>
> And also don't get me wrong: I like your idea to extend PF to enable firewall
> to provide better picture of what happens on network. I just want to point
> out that sampling network latency (round-trip) might not be sufficient.
>
> thanks and
> regards
> sashan
>

I need data on perceived server load // compare to network jitter with
different locations.
The retransmission global counter, while interesting, is certainly not
perfect either.

But for now I'd like to be able to keep a record of the latency of
'real' TCP connections.
I cannot do it on the clients, and cannot do it on the servers. It
must be done in
the trusted OpenBSD environment. Having retransmissions per tcp connection
would definitely be a plus, but it's not the goal here. It is not a
problem I need to solve,
just additional data I want to extract to make better informed choices
and cross validate results.

I would like to go through this one step at a time to stay focused on :
computing latency timing of TCP connexion in openbsd 'correctly'


--
--
-
Knowing is not enough; we must apply. Willing is not enough; we must do



Enhancing (some) PF state

2020-12-17 Thread Sven F.
Dear readers,

pfctl -vv -ss shows detailed information on states.
I would like to improve the information provided about specific TCP connections,
regarding the latency of the network.
An obvious way seems to be to measure the time to get ACKs back.
Another way would be to use packets timestamps.

I patched my kernel to extract this information and
log it, before trying to report it to the userland.
I did not use timestamps, because I am not sure how i could do that.
If you have any advice on that, it would be welcome.
Moreover the patch is a prototype,
So I would appreciate any feedback on my diff (attached):
Currently the code is using a LABEL to trigger the measure,
of course, later it should be a keyword like "latency" in the rules.
For example :
match proto tcp to port 80 latency
Or something else.
This would be discuss after choosing a method for latency computation,

Maybe there is a better way to extract network TCP latencies information
( i would like to avoid running in promiscuous, but if a current
packaged software does it well... )
but I did not come across it.
Please share if you know of a better way to tackle this.


Happy holidays !


lag.diff
Description: Binary data


Re: A few questions regarding WG(4)

2020-09-29 Thread Sven F.
On Mon, Sep 28, 2020 at 6:35 PM Sven F.  wrote:

> Dear reader,
>
> i tested 6.8-beta and WG
>
> After going for behind NAT to behind NAT experiment ,
> i went for two 'clients' behind a NAT to an openBSD device with a public IP
> called here 'Server'
>
> First of all , a minor detail, unless I thought wgport was not
> optional because the
> ifconfig output will not tell you the 'random port' chosen.
> So you cannot configure wgpeer after, unless
> you up the interface (1)
>
> 'Server'
>
> # ifconfig wg1
> wg1: flags=80c3 mtu 1420
> index 5 priority 0 llprio 3
> wgport 
> wgpubkey XdbTdbNzEASSXvgwAHrBuuBNHpeDtS0CGH3KsT7TxzY=
> wgpeer XxILKSdZ3JJr7fhAqzVNhNE4wbxJGfFlb4EYijqnU1k=
> wgendpoint XX 
> tx: 13988, rx: 11164
> last handshake: 135 seconds ago
> wgaip 192.168.5.1/24
> wgpeer Xo6rmtAMkXhGIJOtulLhzCialGdzoPhDSHou+LWWfz8=
> wgendpoint XX 
> tx: 10164, rx: 5992
> last handshake: 9 seconds ago
> wgaip 192.168.0.0/16
> groups: wg
> inet 192.168.5.1 netmask 0x broadcast 192.168.255.255
>
> the wgaip filter is a bit confusing to me because i MAY want to
> allow 192.168.5.1
> on both but not having overlapping subnet , or maybe it's dedicated to
> routing.
> The man page of WG(4) or the faq could have a more fancy example to
> illustrate
> correct use of wgaip
>
> The main question is related to the fact that
> I was unable to ping the peers from the 'server'
> until I pinged 192.168.5.1 from the two 'clients'.
>
> # ping 192.168.6.1
> PING 192.168.6.1 (192.168.6.1): 56 data bytes
> ^C
> --- 192.168.6.1 ping statistics ---
> 5 packets transmitted, 0 packets received, 100.0% packet loss
> ## ping 192.168.5.1 or remote device here
> # ping 192.168.6.1
> PING 192.168.6.1 (192.168.6.1): 56 data bytes
> 64 bytes from 192.168.6.1: icmp_seq=0 ttl=255 time=12.564 ms
> 64 bytes from 192.168.6.1: icmp_seq=1 ttl=255 time=16.005 ms
>
> Is this expected and/or due to the fact 192.168.6.1 is behind a NAT ?
>
> Best
> ( one client is i386 the other amd64 , 6.8 beta is working so far !)
>
>
> (1)
> # ifconfig wg2 create wgkey `openssl rand -base64 32`
> # ifconfig wg2
> wg2: flags=8082 mtu 1420
> index 6 priority 0 llprio 3
> wgpubkey iKbEvJvgyyzcdRcefgXaC7BWkmfUTREtL5BWvFeKdHo=
> groups: wg
> vps105766# ifconfig wg2 up
> vps105766# ifconfig wg2
> wg2: flags=80c3 mtu 1420
> index 6 priority 0 llprio 3
> wgport 16326
> wgpubkey iKbEvJvgyyzcdRcefgXaC7BWkmfUTREtL5BWvFeKdHo=
> groups: wg
>
> man
> ```
>  wgport port
>  Set the UDP port that the tunnel operates on.  The interface
> will
>  bind to INADDR_ANY and IN6ADDR_ANY_INIT.  If no port is
>  configured, one will be chosen automatically.
> ```
> to
> ```
>  wgport port
>  Set the UDP port that the tunnel operates on.  The interface
> will
>  bind to INADDR_ANY and IN6ADDR_ANY_INIT.  If no port is
>  configured, one will be chosen automatically when the
> interface is up.
> ```
>
> ?
>


My tunnel did not survive a suspend mode on the crapbook laptop.
public IP  did not roam
nothing after a few minutes ( other device is working fine )

Am I supposed to do a down / up cycle on the wg interface after suspend ?


Re: dhclient new commit are good but....

2020-09-21 Thread Sven F.
On Sat, Sep 19, 2020 at 2:07 PM Denis Fondras  wrote:

> On Wed, Sep 16, 2020 at 03:06:32PM -0400, Sven F. wrote:
> > On Wed, Jun 3, 2020 at 2:13 PM sven falempin 
> > wrote:
> >
>
> Can you send a proper diff please for review please ?
>

bpf.c is sort of ok, the rest need choices .
The configuration of bpf should probably be updated if someone changes the
lladdr after launching dhclient.
Index: bpf.c
===
RCS file: /cvs/src/sbin/dhclient/bpf.c,v
retrieving revision 1.75
diff -u -p -r1.75 bpf.c
--- bpf.c	18 Mar 2019 00:00:59 -	1.75
+++ bpf.c	21 Sep 2020 19:18:52 -
@@ -102,6 +102,49 @@ get_udp_sock(int rdomain)
 	return sock;
 }
 
+
+/*
+ * Packet MAC filter program.
+ *
+ * N.B.: Changes to the filter program may require changes to the
+ * constant offsets used in if_register_receive to patch the BPF program!
+ */
+
+struct bpf_insn dhcp_bpf_mac_filter[] = {
+	/* Make sure this is an IP packet. */
+	BPF_STMT(BPF_LD + BPF_H + BPF_ABS, 12),
+	BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, ETHERTYPE_IP, 0, 8),
+
+	/* Make sure it's a UDP packet. */
+	BPF_STMT(BPF_LD + BPF_B + BPF_ABS, 23),
+	BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, IPPROTO_UDP, 0, 6),
+
+	/* Make sure this isn't a fragment. */
+	BPF_STMT(BPF_LD + BPF_H + BPF_ABS, 20),
+	BPF_JUMP(BPF_JMP + BPF_JSET + BPF_K, 0x1fff, 8, 0),
+
+	/* Get the IP header length. */
+	BPF_STMT(BPF_LDX + BPF_B + BPF_MSH, 14),
+
+	/* Make sure it's to the right port. */
+	BPF_STMT(BPF_LD + BPF_H + BPF_IND, 16),
+	BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, 67, 0, 1),		/* patch */
+
+	/* check bootp.hw.addr 2 bytes */
+	BPF_STMT(BPF_LD + BPF_H + BPF_IND, 50),
+	BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, 0x, 0, 3),  /* patch */
+	/* check bootp.hw.addr 4 bytes */
+	BPF_STMT(BPF_LD + BPF_W + BPF_IND, 52),
+	BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, 0x, 0, 1),  /* patch */ 
+
+	/* If we passed all the tests, ask for the whole packet. */
+	BPF_STMT(BPF_RET+BPF_K, (unsigned int)-1),
+
+	/* Otherwise, drop it. */
+	BPF_STMT(BPF_RET+BPF_K, 0),
+};
+int dhcp_bpf_mac_filter_len = sizeof(dhcp_bpf_mac_filter) / sizeof(struct bpf_insn);
+
 /*
  * Packet filter program.
  *
@@ -177,13 +220,19 @@ struct bpf_insn dhcp_bpf_wfilter[] = {
 
 int dhcp_bpf_wfilter_len = sizeof(dhcp_bpf_wfilter) / sizeof(struct bpf_insn);
 
+
+extern int cmd_opts;
+
 int
-configure_bpf_sock(int bpffd)
+configure_bpf_sock(struct interface_info *ifi)
 {
 	struct bpf_version	 v;
 	struct bpf_program	 p;
 	int			 flag = 1, sz;
 	int			 fildrop = BPF_FILDROP_CAPTURE;
+	uint32_t	 bits;
+	uint16_t	 bits16;
+	int bpffd = ifi->bpffd;
 
 	/* Make sure the BPF version is in range. */
 	if (ioctl(bpffd, BIOCVERSION, &v) == -1)
@@ -218,7 +267,18 @@ configure_bpf_sock(int bpffd)
 	 * N.B.: changes to filter program may require changes to the
 	 * insn number(s) used below!
 	 */
-	dhcp_bpf_filter[8].k = LOCAL_PORT;
+
+	if (cmd_opts & OPT_MAC_FITLER) {
+		p.bf_len = dhcp_bpf_mac_filter_len;
+		p.bf_insns = dhcp_bpf_mac_filter;
+		dhcp_bpf_mac_filter[8].k = LOCAL_PORT;
+		memcpy(&bits16, ifi->hw_address.ether_addr_octet, sizeof(bits16));
+		dhcp_bpf_mac_filter[10].k = ntohs(bits16);
+		memcpy(&bits, ifi->hw_address.ether_addr_octet + 2, sizeof(bits));
+		dhcp_bpf_mac_filter[12].k = ntohl(bits);
+	} else {
+		dhcp_bpf_filter[8].k = LOCAL_PORT;
+	}
 
 	if (ioctl(bpffd, BIOCSETF, &p) == -1)
 		fatal("BIOCSETF");
Index: dhclient.c
===
RCS file: /cvs/src/sbin/dhclient/dhclient.c,v
retrieving revision 1.678
diff -u -p -r1.678 dhclient.c
--- dhclient.c	31 Jul 2020 12:12:11 -	1.678
+++ dhclient.c	21 Sep 2020 19:18:53 -
@@ -471,7 +471,7 @@ main(int argc, char *argv[])
 
 	log_setverbose(0);	/* Don't show log_debug() messages. */
 
-	while ((ch = getopt(argc, argv, "c:di:L:nrv")) != -1)
+	while ((ch = getopt(argc, argv, "c:di:L:nrvm")) != -1)
 		switch (ch) {
 		case 'c':
 			if (optarg == NULL)
@@ -494,6 +494,9 @@ main(int argc, char *argv[])
 			cmd_opts |= OPT_DBPATH;
 			path_option_db = optarg;
 			break;
+		case 'm':
+			cmd_opts |= OPT_MAC_FITLER;
+			break;
 		case 'n':
 			cmd_opts |= OPT_NOACTION;
 			break;
@@ -670,7 +673,7 @@ main(int argc, char *argv[])
 	/* Create the udp and bpf sockets, growing rbuf if needed. */
 	ifi->udpfd = get_udp_sock(ifi->rdomain);
 	ifi->bpffd = get_bpf_sock(ifi->name);
-	newsize = configure_bpf_sock(ifi->bpffd);
+	newsize = configure_bpf_sock(ifi);
 	if (newsize > ifi->rbuf_max) {
 		if ((newp = realloc(ifi->rbuf, newsize)) == NULL)
 			fatal("rbuf");
Index: dhcpd.h
===
RCS file: /cvs/src/sbin/dhclient/dhcpd.h,v
retrieving revision 1.287
diff -u -p -r1.287 dhcpd.h
--- dhcpd.h	21 May 2020 01:07:52 -	1.287
+++ dhcpd.h	21 Se

Re: acme-client: improve account creation error message

2020-09-14 Thread Sven F.
On Mon, Sep 14, 2020 at 9:45 AM Bob Beck  wrote:

>
> But what if I like json and I am already set up to be a hipster and
> feed all the untrusted inputs through jq..
>
> (ok beck@)
>
> On Mon, Sep 14, 2020 at 03:37:25PM +0200, Florian Obser wrote:
> > not helpful:
> > $ doas acme-client $(hostname)
> > acme-client: https://api.test4.buypass.no/acme-v02/new-acct: bad HTTP:
> 400
> >
> > vomitting unformated json is not better:
> > $ doas acme-client -v $(hostname)
> > acme-client: transfer buffer:
> [{"type":"urn:ietf:params:acme:error:malformed","detail":"Email is a
> required
> contact","code":400,"message":"MALFORMED_BAD_REQUEST","details":"HTTP 400
> Bad Request"}] (164 bytes)
> >
> > let's do this:
> > $ doas obj/acme-client -v $(hostname)
> > acme-client: Email is a required contact
> >
> > OK?
> >
> > diff --git extern.h extern.h
> > index 529d3350205..364425b0500 100644
> > --- extern.h
> > +++ extern.h
> > @@ -259,6 +259,7 @@ intjson_parse_order(struct jsmnn *,
> struct order *);
> >  int   json_parse_upd_order(struct jsmnn *, struct order *);
> >  void  json_free_capaths(struct capaths *);
> >  int   json_parse_capaths(struct jsmnn *, struct capaths *);
> > +char *json_getstr(struct jsmnn *, const char *);
> >
> >  char *json_fmt_newcert(const char *);
> >  char *json_fmt_chkacc(void);
> > diff --git json.c json.c
> > index 61d2631359f..a6762eeb258 100644
> > --- json.c
> > +++ json.c
> > @@ -297,7 +297,7 @@ json_getobj(struct jsmnn *n, const char *name)
> >   * that it's the correct type.
> >   * Returns NULL on failure.
> >   */
> > -static char *
> > +char *
> >  json_getstr(struct jsmnn *n, const char *name)
> >  {
> >   size_t   i;
> > diff --git netproc.c netproc.c
> > index 7b8152196d1..05e36897c38 100644
> > --- netproc.c
> > +++ netproc.c
> > @@ -371,15 +371,27 @@ sreq(struct conn *c, const char *addr, int kid,
> const char *req, char **loc)
> >  static int
> >  donewacc(struct conn *c, const struct capaths *p)
> >  {
> > + struct jsmnn*j = NULL;
> >   int  rc = 0;
> > - char*req;
> > + char*req, *detail, *error = NULL;
> >   long lc;
> >
> >   if ((req = json_fmt_newacc()) == NULL)
> >   warnx("json_fmt_newacc");
> >   else if ((lc = sreq(c, p->newaccount, 0, req, &c->kid)) < 0)
> >   warnx("%s: bad comm", p->newaccount);
> > - else if (lc != 200 && lc != 201)
> > + else if (lc == 400) {
> > + if ((j = json_parse(c->buf.buf, c->buf.sz)) == NULL)
> > + warnx("%s: bad JSON object", p->newaccount);
> > + else {
> > + detail = json_getstr(j, "detail");
> > + if (detail != NULL && stravis(&error, detail,
> VIS_SAFE)
> > + != -1) {
> > + warnx("%s", error);
> > + free(error);
> > + }
> > + }
> > + } else if (lc != 200 && lc != 201)
> >   warnx("%s: bad HTTP: %ld", p->newaccount, lc);
> >   else if (c->buf.buf == NULL || c->buf.sz == 0)
> >   warnx("%s: empty response", p->newaccount);
> >
> >
> > --
> > I'm not entirely sure you are real.
> >
>
>
please dont drop the all buffer , or keep it with -vv ?
example : warnx("%s: bad JSON object:%s", p->newaccount, c->buf.buf);

i don't want to ktrace it to see why the new certbot version is not working

-- 
--
-
Knowing is not enough; we must apply. Willing is not enough; we must do


Fwd: pppx(4): move ifnet out of KERNEL_LOCK()

2020-08-07 Thread Sven F.
-- Forwarded message -
From: Vitaliy Makkoveev 
Date: Fri, Aug 7, 2020 at 2:11 PM
Subject: Re: pppx(4): move ifnet out of KERNEL_LOCK()
To: Sven F. 


What reaction do you expect? Look, you did something and you got panic
with *not* modified system. What is expected you will do? At least you
described your actions which caused issue. It's great if you know how to
reproduce it and you also describe required actions. Also dmesg(8)
output welcomed. That's all. It's not expected you have C or OpenBSD
internals knowledge. It's not expected you have a solution to fix. But
without things I described above your problem report is *useless*. Also
we have bugs@ list to report bugs. I guess posting to tech@ is welcomed
if you understand the problem you caught and you have solution.

Did you anything of things above? No you *didn't*. You have system
modified by yourself. You caught panic. Who knows what did you modify?
May be you are the person who introduced issue? Who ever knows? Do you
expect someone of developers will spend his time and do your work
instead of you? Are you serious?

The second your appearance. You did something and you caught panic with
generic system. You posted dmesg(8) output. Ok, but where is the
description of actions you did? Imagine, you are the first person who
caught this issue. We have no telepathy skills, your report is useless.

But well, I looked in your dmesg(8) output and the problem looks the
same problem with I work. I answered you what works is in progress.
Also I said you there is no quick solution for. And my solution can be
rejected by other developers. So there is nothing to share yet.

The problem is not secret, but you looks not the person who understand
the reasons which caused problems. Also your posts about netlock show
you as person which don't understand haw this subsystem works. Do you
remember I asked you what is the data you propose to protect by netlock?
Looks like you don't understand the meaning of rwlocks. What is the
reason for me or other developers to have discussion with you? To teach
you how to code? What is my interest?

And don't assume my reply as "fuck off stupid user". It's *not*. I spend
my time to explain you obvious things. If you are skilled enough, if you
understand what you do - you are welcomed. But you should *describe* the
problem and *describe* what you are doing. That's enough. If you are
wrong you will be corrected. But unfortunately it's looks like it's not
about you.


-- 
--
-
Knowing is not enough; we must apply. Willing is not enough; we must do



Re: pppx(4): move ifnet out of KERNEL_LOCK()

2020-08-07 Thread Sven F.
On Thu, Aug 6, 2020 at 8:11 AM Vitaliy Makkoveev  wrote:
>
> On Thu, Aug 06, 2020 at 01:25:14PM +0200, Martin Pieuchot wrote:
> > On 05/08/20(Wed) 12:50, Vitaliy Makkoveev wrote:
> > > pipex(4) and pppx(4) are ready to became a little bit more MP capable.
> > > Diff below moves pppx(4) related `ifnet' out of KERNEL_LOCK().
> >
> > Nice, one comment below.
> >
> > > Index: sys/net/if_pppx.c
> > >
> > > [skip]
> > >
> > > +   NET_LOCK();
> > > pipex_ppp_output(m, pxi->pxi_session, proto);
> > > +   NET_UNLOCK();
> >
> > This means the lock is taken and released for every packet.  It would be
> > better to grab it outside the loop.
>
> Ok, fixed.
>
> Index: sys/net/if_pppx.c
> ===
> RCS file: /cvs/src/sys/net/if_pppx.c,v
> retrieving revision 1.98
> diff -u -p -r1.98 if_pppx.c
> --- sys/net/if_pppx.c   28 Jul 2020 09:53:36 -  1.98
> +++ sys/net/if_pppx.c   6 Aug 2020 11:54:44 -
> @@ -191,7 +191,7 @@ int pppx_set_session_descr(struct pppx_
> struct pipex_session_descr_req *);
>
>  void   pppx_if_destroy(struct pppx_dev *, struct pppx_if *);
> -void   pppx_if_start(struct ifnet *);
> +void   pppx_if_qstart(struct ifqueue *);
>  intpppx_if_output(struct ifnet *, struct mbuf *,
> struct sockaddr *, struct rtentry *);
>  intpppx_if_ioctl(struct ifnet *, u_long, caddr_t);
> @@ -683,13 +683,12 @@ pppx_add_session(struct pppx_dev *pxd, s
> snprintf(ifp->if_xname, sizeof(ifp->if_xname), "%s%d", "pppx", unit);
> ifp->if_mtu = req->pr_peer_mru; /* XXX */
> ifp->if_flags = IFF_POINTOPOINT | IFF_MULTICAST | IFF_UP;
> -   ifp->if_xflags = IFXF_CLONED;
> -   ifp->if_start = pppx_if_start;
> +   ifp->if_xflags = IFXF_CLONED | IFXF_MPSAFE;
> +   ifp->if_qstart = pppx_if_qstart;
> ifp->if_output = pppx_if_output;
> ifp->if_ioctl = pppx_if_ioctl;
> ifp->if_rtrequest = p2p_rtrequest;
> ifp->if_type = IFT_PPP;
> -   ifq_set_maxlen(&ifp->if_snd, 1);
> ifp->if_softc = pxi;
> /* ifp->if_rdomain = req->pr_rdomain; */
>
> @@ -864,21 +863,15 @@ pppx_if_destroy(struct pppx_dev *pxd, st
>  }
>
>  void
> -pppx_if_start(struct ifnet *ifp)
> +pppx_if_qstart(struct ifqueue *ifq)
>  {
> +   struct ifnet *ifp = ifq->ifq_if;
> struct pppx_if *pxi = (struct pppx_if *)ifp->if_softc;
> struct mbuf *m;
> int proto;
>
> -   if (!ISSET(ifp->if_flags, IFF_RUNNING))
> -   return;
> -
> -   for (;;) {
> -   m = ifq_dequeue(&ifp->if_snd);
> -
> -   if (m == NULL)
> -   break;
> -
> +   NET_LOCK();
> +   while ((m = ifq_dequeue(ifq)) != NULL) {
> proto = *mtod(m, int *);
> m_adj(m, sizeof(proto));
>
> @@ -887,6 +880,7 @@ pppx_if_start(struct ifnet *ifp)
>
> pipex_ppp_output(m, pxi->pxi_session, proto);
> }
> +   NET_UNLOCK();
>  }
>
>  int
>

Hey ,

I think putting things out of locks
while there is known race condition in code, is hurting.
I was candidly asking  polite questions because I create those races sometimes,
but apparently the work in progress must be kept secret .

Because it is a work in progress reading code may just mislead into :
<>
but I think it is hurting the project to have known issues like that,
in the network stack.
Never saw OpenBSD do DDB> outside some strange drivers or snapshots, since
like 3.5 ? But way more recently.

Currently I'm asking myself why each call of
if_hooks_run in if.c are surrounded by NET_LOCK
and then do another mutex : mtx_enter(&if_hooks_mtx);

But then call (*func)(arg);

While some deletion of func are not protected by any lock

like here, in XX_detach which is not NET_LOCK nor &if_hooks_mtx locked:

int
lacp_detach(struct trunk_softc *sc)
{
struct lacp_softc *lsc = LACP_SOFTC(sc);

KASSERT(TAILQ_EMPTY(&lsc->lsc_aggregators));
KASSERT(lsc->lsc_active_aggregator == NULL);

sc->tr_psc = NULL;

Most enthusiastic users would suffer a lot from a failing PPP driver ( not me ).
I understand the work is complex ( because as usual OpenBSD is trying
to do the right thing ).

Maybe a define 'MP_NET_STACK' could activate/deactivate optimization
for MP architecture
so users can easily switch between tests and stable for this complex task.
Because validating Races is not an easy task especially on modern processors.

Thanks for the great MP optimization work.

Have a good weekend y'all.



Re: bugs in bridge ( netlock ? )

2020-08-05 Thread Sven F.
On Wed, Aug 5, 2020 at 9:14 AM Sven F.  wrote:
>
> Never seen before crash ( 6. 7 stable )
>
> My devices run a lot of things in, load is easily 4
> which is good for breaking lock code ?
>
> uvm_fault(0xfd820a916cc0, 0x8, 0, 1) -> e
> kernel: page fault trap, code=0
> Stopped at  bridge_brlconf+0x24:movq0x8(%rdi),%rax
> ddb{1}> bridge_brlconf(0,800022e482b0) at bridge_brlconf+0x24
> bridge_ioctl(80524000,c030694f,800022e482b0) at bridge_ioctl+0x34f
> ifioctl(fd8118f33c98,c030694f,800022e482b0,800022895510)
> at ifioctl+0xa03
> soo_ioctl(fd814c9cf2e8,c030694f,800022e482b0,800022895510)
> at soo_ioctl+0x171
> sys_ioctl(800022895510,800022e483c0,800022e48420) at 
> sys_ioctl+0x2df
> syscall(800022e48490) at syscall+0x389
> Xsyscall() at Xsyscall+0x128
> end of kernel
> end trace frame: 0x7f7d2c30, count: -7
> ddb{1}>PID TID   PPIDUID  S   FLAGS  WAIT  COMMAND
>  93583  108149190  0  3 0x2  smrbarifconfig
>  91803  290614  22775  0  2   0perl
>  10793  382902  22775  0  2   0perl
>  83906   60674  57936  0  2 0x2ifconfig
> *80202  204044  63585  0  7 0x2ifconfig
>  635857414  22775  0  30x80  piperdperl
>  57936  439878  22775  0  30x80  piperdperl
>  70874   88156  22775  0  7   0perl
>  72682  513697  1  0  30x80  poll  openvpn
>  84518  279046  22775  0  2   0perl
>  55262   52512  22775  0  2   0perl
>  36158  256298  22775  0  2   0perl
>  84969  398264  1  0  30x80  poll  openvpn
>  31484  239479  1  0  30x80  poll  openvpn
>   7902   84087  60669  0  30x82  netio sshd
>  25285  366282  1  0  30x80  poll  openvpn
>  96838  424361  1  0  30x80  poll  openvpn
>  42763  368876  1  0  30x80  poll  openvpn
>  92032  243887  1  0  30x80  poll  openvpn
>  22775  200805  21119  0  2 0x2perl
>  21119  407468  75737  0  30x10008a  pause sh
>  ddb{1}> rebooting...
> OpenBSD 6.7-stable (GENERIC.MP) #43: Tue Jul 28 21:46:24 EDT 2020
> root@builder:/sys/arch/amd64/compile/GENERIC.MP
> real mem = 8371683328 (7983MB)
> avail mem = 8105316352 (7729MB)
> mpath0 at root
> scsibus0 at mpath0: 256 targets
> mainbus0 at root
> bios0 at mainbus0: SMBIOS rev. 2.8 @ 0xf6830 (11 entries)
> bios0: vendor SeaBIOS version "2:1.10.2-58953eb7" date 04/01/2014
> bios0: OpenStack Foundation OpenStack Nova
> acpi0 at bios0: ACPI 1.0
> acpi0: sleep states S3 S4 S5
> acpi0: tables DSDT FACP APIC
> acpi0: wakeup devices
> acpitimer0 at acpi0: 3579545 Hz, 24 bits
> acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
> cpu0 at mainbus0: apid 0 (boot processor)
> cpu0: Intel Core Processor (Haswell, no TSX), 2394.80 MHz, 06-3c-01
> cpu0: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,RDTSCP,LONG,LAHF,ABM,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MD_CLEAR,ARAT,XSAVEOPT,MELTDOWN
> cpu0: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 512KB
> 64b/line 16-way L2 cache
> cpu0: ITLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
> cpu0: DTLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
> cpu0: smt 0, core 0, package 0
> mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
> cpu0: apic clock running at 1000MHz
> cpu1 at mainbus0: apid 1 (application processor)
> cpu1: Intel Core Processor (Haswell, no TSX), 2394.56 MHz, 06-3c-01
> cpu1: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,RDTSCP,LONG,LAHF,ABM,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MD_CLEAR,ARAT,XSAVEOPT,MELTDOWN
> cpu1: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 512KB
> 64b/line 16-way L2 cache
> cpu1: ITLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
> cpu1: DTLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
> cpu1: smt 0, core 0, package 1
> ioapic0 at mainbus0: apid 0 pa 0xfec0, version 11, 24 pins
> acpiprt0 at acpi0: bus 0 (PCI0)
> acpicpu0 at acpi0: C1(@1 halt!)
> acpicpu1 a

bugs in bridge ( netlock ? )

2020-08-05 Thread Sven F.
Never seen before crash ( 6. 7 stable )

My devices run a lot of things in, load is easily 4
which is good for breaking lock code ?

uvm_fault(0xfd820a916cc0, 0x8, 0, 1) -> e
kernel: page fault trap, code=0
Stopped at  bridge_brlconf+0x24:movq0x8(%rdi),%rax
ddb{1}> bridge_brlconf(0,800022e482b0) at bridge_brlconf+0x24
bridge_ioctl(80524000,c030694f,800022e482b0) at bridge_ioctl+0x34f
ifioctl(fd8118f33c98,c030694f,800022e482b0,800022895510)
at ifioctl+0xa03
soo_ioctl(fd814c9cf2e8,c030694f,800022e482b0,800022895510)
at soo_ioctl+0x171
sys_ioctl(800022895510,800022e483c0,800022e48420) at sys_ioctl+0x2df
syscall(800022e48490) at syscall+0x389
Xsyscall() at Xsyscall+0x128
end of kernel
end trace frame: 0x7f7d2c30, count: -7
ddb{1}>PID TID   PPIDUID  S   FLAGS  WAIT  COMMAND
 93583  108149190  0  3 0x2  smrbarifconfig
 91803  290614  22775  0  2   0perl
 10793  382902  22775  0  2   0perl
 83906   60674  57936  0  2 0x2ifconfig
*80202  204044  63585  0  7 0x2ifconfig
 635857414  22775  0  30x80  piperdperl
 57936  439878  22775  0  30x80  piperdperl
 70874   88156  22775  0  7   0perl
 72682  513697  1  0  30x80  poll  openvpn
 84518  279046  22775  0  2   0perl
 55262   52512  22775  0  2   0perl
 36158  256298  22775  0  2   0perl
 84969  398264  1  0  30x80  poll  openvpn
 31484  239479  1  0  30x80  poll  openvpn
  7902   84087  60669  0  30x82  netio sshd
 25285  366282  1  0  30x80  poll  openvpn
 96838  424361  1  0  30x80  poll  openvpn
 42763  368876  1  0  30x80  poll  openvpn
 92032  243887  1  0  30x80  poll  openvpn
 22775  200805  21119  0  2 0x2perl
 21119  407468  75737  0  30x10008a  pause sh
 ddb{1}> rebooting...
OpenBSD 6.7-stable (GENERIC.MP) #43: Tue Jul 28 21:46:24 EDT 2020
root@builder:/sys/arch/amd64/compile/GENERIC.MP
real mem = 8371683328 (7983MB)
avail mem = 8105316352 (7729MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.8 @ 0xf6830 (11 entries)
bios0: vendor SeaBIOS version "2:1.10.2-58953eb7" date 04/01/2014
bios0: OpenStack Foundation OpenStack Nova
acpi0 at bios0: ACPI 1.0
acpi0: sleep states S3 S4 S5
acpi0: tables DSDT FACP APIC
acpi0: wakeup devices
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel Core Processor (Haswell, no TSX), 2394.80 MHz, 06-3c-01
cpu0: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,RDTSCP,LONG,LAHF,ABM,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MD_CLEAR,ARAT,XSAVEOPT,MELTDOWN
cpu0: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 512KB
64b/line 16-way L2 cache
cpu0: ITLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
cpu0: DTLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
cpu0: apic clock running at 1000MHz
cpu1 at mainbus0: apid 1 (application processor)
cpu1: Intel Core Processor (Haswell, no TSX), 2394.56 MHz, 06-3c-01
cpu1: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,RDTSCP,LONG,LAHF,ABM,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MD_CLEAR,ARAT,XSAVEOPT,MELTDOWN
cpu1: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 512KB
64b/line 16-way L2 cache
cpu1: ITLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
cpu1: DTLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
cpu1: smt 0, core 0, package 1
ioapic0 at mainbus0: apid 0 pa 0xfec0, version 11, 24 pins
acpiprt0 at acpi0: bus 0 (PCI0)
acpicpu0 at acpi0: C1(@1 halt!)
acpicpu1 at acpi0: C1(@1 halt!)
"ACPI0006" at acpi0 not configured
acpipci0 at acpi0 PCI0: _OSC failed
acpicmos0 at acpi0
"PNP0A06" at acpi0 not configured
"PNP0A06" at acpi0 not configured
"PNP0A06" at acpi0 not configured
"QEMU0002" at acpi0 not configured
"ACPI0010" at acpi0 not configured
cpu0: using VERW MDS workaround
pvbus0 at mainbus0: KVM
pvclock0 at pvbus0
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 "Intel 82441FX" rev 0x02
pcib0 at pci0 dev 1 function 0 "Intel 82371SB ISA" 

Re: NET_LOCK and trunk detach

2020-08-04 Thread Sven F.
On Tue, Jul 28, 2020 at 5:27 PM Vitaliy Makkoveev  wrote:
>
>
>
> > On 29 Jul 2020, at 00:09, sven falempin  wrote:
> >
> > On Tue, Jul 28, 2020 at 4:42 PM Vitaliy Makkoveev  wrote:
> >>
> >> On Tue, Jul 28, 2020 at 04:10:01PM -0400, sven falempin wrote:
> >>> Hello,
> >>>
> >>> I am running some trunk interfaces in a multi core environment,
> >>> it's a slightly modified version, i have a few NET_ASSERT_LOCKED();
> >>> suspecting some multi core shenanigans, which i guess was confirmed:
> >>> (unsure the have X meaning, but i ' m pretty sure 256 is very wrong)
> >>> the if_trunk.c locking is completely unmodified
> >>> The code is 6.7-stable
> >>>
> >>> splassert: lacp_detach: want 2 have 0
> >>> splassert: lacp_detach: want 2 have 0
> >>> splassert: lacp_detach: want 2 have 256
> >>>
> >>> I noticed : trunk_clone_destroy ,call
> >>>
> >>>if (tr->tr_proto != TRUNK_PROTO_NONE)
> >>>tr->tr_detach(tr);
> >>>
> >>> outside the lock, and that trunk_ioctl call it
> >>>
> >>>if (tr->tr_proto != TRUNK_PROTO_NONE)
> >>>error = tr->tr_detach(tr);
> >>>
> >>> but ioctl is as far as i understand locked.
> >>> I'm unsure if the difficult and amazing unlocking work
> >>> did an oopsies or if ioctl is already assumed unlocked.
> >>>
> >>> Kindly inform me.
> >>> Best regards, thank you for reading.
> >>>
> >>
> >> lacp_detach() touches nothing which requires NET_LOCK(). What is the
> >> reason you placed assertion to lacp_detach()?
> >
> > <>
> >
> > the lacp_detach is not yours, i put them here because i have a NULL pointer
> > popping in other 'driver' callback.
> >
> > I'm tracking this and trying to understand  how this memory is 'nullified'
> > mid function.
>
> I have no telepathy skills. Sorry. If you have panics send please dmesg(8)
> output and instruction for reproduce.
>
> >
> > I do not think putting NET_ASSERT_LOCKED can be harmful in any way.
> > If so please tell me.
> >
>
> What is the data you think needs be protected by netlock?
>
> > I am just tracking  a bug  and noticed these detach locking strangeness.
> >
>


After further analysis the *current* conclusion is:
 * current code is sort of safe, but the delete would be safer with a
lock ( for consistency anyway )
 * if the code run into a VM that do strange stuff with the stack it
creates bugs ( the bug only appears in KVM cpu ,
a qemu emulating a real cpu does not have this problem )

-- 
--
-
Knowing is not enough; we must apply. Willing is not enough; we must do