Re: [PATCH] gianfar: fix jumbo packets+napi+rx overrun crash

2021-03-07 Thread michael-dev

Hi,

sorry I missed the mail.

Am 04.03.2021 11:05, schrieb Claudiu Manoil:

Could you help provide some context, e.g.:
On what board/soc were you able to trigger this issue?


I have an OpenWRT running on P1020WLAN boards and have IPsec for some 
gretap tunnel configured.
I run iperf3 -s on the AP and iperf3 -c --udp -b 1000M on some server 
and then the AP reliably crashes.
It also crashed sometimes during normal operations when doing some fast 
download over the IPsec tunnel, but that was hard to reproduce.



How often does the overrun occur?

Reliably within seconds with the above test.

What's the use case? Is the issue triggered with smaller packets than 
9600B?
It cannot be triggered with 1500 Byte  MTU packets as these have FIRST 
and LAST set in one.

I use jumbo frames to speed up ipsec.


Increasing the Rx ring size does significantly reduce ring overruns?


I did not test this.

Regards,
Michael Braun


Re: [PATCHv2] gianfar: fix jumbo packets+napi+rx overrun crash

2021-03-07 Thread michael-dev

Am 04.03.2021 21:17, schrieb Vladimir Oltean:

Just for my understanding, do you have a reproducer for the issue?
I notice you haven't answered Claudiu's questions posted on v1.
On LS1021A I cannot trigger this apparent hardware issue even if I 
force
RX overruns (by reducing the ring size). Judging from the "NIP" 
register

from your stack trace, this is a PowerPC device, which one is it?


This is on P1020WLAN (AP) devices and it happens reproducably when I use 
ipsec + iperf3 --udp -b 1000M on some other server targetting the AP.

Yes this is PPC.

Regards,
Michael



[PATCHv2] gianfar: fix jumbo packets+napi+rx overrun crash

2021-03-04 Thread michael-dev
From: Michael Braun 

When using jumbo packets and overrunning rx queue with napi enabled,
the following sequence is observed in gfar_add_rx_frag:

   | lstatus  |   | skb   |
t  | lstatus,  size, flags| first | len, data_len, *ptr   |
---+--+---+---+
13 | 18002348, 9032, INTERRUPT LAST   | 0 | 9600, 8000,  f554c12e |
12 | 1640, 1600, INTERRUPT| 0 | 8000, 6400,  f554c12e |
11 | 1640, 1600, INTERRUPT| 0 | 6400, 4800,  f554c12e |
10 | 1640, 1600, INTERRUPT| 0 | 4800, 3200,  f554c12e |
09 | 1640, 1600, INTERRUPT| 0 | 3200, 1600,  f554c12e |
08 | 14000640, 1600, INTERRUPT FIRST  | 0 | 1600, 0, f554c12e |
07 | 14000640, 1600, INTERRUPT FIRST  | 1 | 0,0, f554c12e |
06 | 1c80, 128,  INTERRUPT LAST FIRST | 1 | 0,0, abf3bd6e |
05 | 18002348, 9032, INTERRUPT LAST   | 0 | 8000, 6400,  c5a57780 |
04 | 1640, 1600, INTERRUPT| 0 | 6400, 4800,  c5a57780 |
03 | 1640, 1600, INTERRUPT| 0 | 4800, 3200,  c5a57780 |
02 | 1640, 1600, INTERRUPT| 0 | 3200, 1600,  c5a57780 |
01 | 1640, 1600, INTERRUPT| 0 | 1600, 0, c5a57780 |
00 | 14000640, 1600, INTERRUPT FIRST  | 1 | 0,0, c5a57780 |

So at t=7 a new packets is started but not finished, probably due to rx
overrun - but rx overrun is not indicated in the flags. Instead a new
packets starts at t=8. This results in skb->len to exceed size for the LAST
fragment at t=13 and thus a negative fragment size added to the skb.

This then crashes:

kernel BUG at include/linux/skbuff.h:2277!
Oops: Exception in kernel mode, sig: 5 [#1]
...
NIP [c04689f4] skb_pull+0x2c/0x48
LR [c03f62ac] gfar_clean_rx_ring+0x2e4/0x844
Call Trace:
[ec4bfd38] [c06a84c4] _raw_spin_unlock_irqrestore+0x60/0x7c (unreliable)
[ec4bfda8] [c03f6a44] gfar_poll_rx_sq+0x48/0xe4
[ec4bfdc8] [c048d504] __napi_poll+0x54/0x26c
[ec4bfdf8] [c048d908] net_rx_action+0x138/0x2c0
[ec4bfe68] [c06a8f34] __do_softirq+0x3a4/0x4fc
[ec4bfed8] [c0040150] run_ksoftirqd+0x58/0x70
[ec4bfee8] [c0066ecc] smpboot_thread_fn+0x184/0x1cc
[ec4bff08] [c0062718] kthread+0x140/0x144
[ec4bff38] [c0012350] ret_from_kernel_thread+0x14/0x1c

This patch fixes this by checking for computed LAST fragment size, so a
negative sized fragment is never added.
In order to prevent the newer rx frame from getting corrupted, the FIRST
flag is checked to discard the incomplete older frame.

Signed-off-by: Michael Braun 
---
 drivers/net/ethernet/freescale/gianfar.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/drivers/net/ethernet/freescale/gianfar.c 
b/drivers/net/ethernet/freescale/gianfar.c
index 541de32ea662..1cf8ef717453 100644
--- a/drivers/net/ethernet/freescale/gianfar.c
+++ b/drivers/net/ethernet/freescale/gianfar.c
@@ -2390,6 +2390,10 @@ static bool gfar_add_rx_frag(struct gfar_rx_buff *rxb, 
u32 lstatus,
if (lstatus & BD_LFLAG(RXBD_LAST))
size -= skb->len;
 
+   WARN(size < 0, "gianfar: rx fragment size underflow");
+   if (size < 0)
+   return false;
+
skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, page,
rxb->page_offset + RXBUF_ALIGNMENT,
size, GFAR_RXB_TRUESIZE);
@@ -2552,6 +2556,17 @@ static int gfar_clean_rx_ring(struct gfar_priv_rx_q 
*rx_queue,
if (lstatus & BD_LFLAG(RXBD_EMPTY))
break;
 
+   /* lost RXBD_LAST descriptor due to overrun */
+   if (skb &&
+   (lstatus & BD_LFLAG(RXBD_FIRST))) {
+   /* discard faulty buffer */
+   dev_kfree_skb(skb);
+   skb = NULL;
+   rx_queue->stats.rx_dropped++;
+
+   /* can continue normally */
+   }
+
/* order rx buffer descriptor reads */
rmb();
 
-- 
2.20.1



[PATCH] gianfar: fix jumbo packets+napi+rx overrun crash

2021-03-03 Thread michael-dev
From: Michael Braun 

When using jumbo packets and overrunning rx queue with napi enabled,
the following sequence is observed in gfar_add_rx_frag:

   | lstatus  |   | skb   |
t  | lstatus,  size, flags| first | len, data_len, *ptr   |
---+--+---+---+
13 | 18002348, 9032, INTERRUPT LAST   | 0 | 9600, 8000,  f554c12e |
12 | 1640, 1600, INTERRUPT| 0 | 8000, 6400,  f554c12e |
11 | 1640, 1600, INTERRUPT| 0 | 6400, 4800,  f554c12e |
10 | 1640, 1600, INTERRUPT| 0 | 4800, 3200,  f554c12e |
09 | 1640, 1600, INTERRUPT| 0 | 3200, 1600,  f554c12e |
08 | 14000640, 1600, INTERRUPT FIRST  | 0 | 1600, 0, f554c12e |
07 | 14000640, 1600, INTERRUPT FIRST  | 1 | 0,0, f554c12e |
06 | 1c80, 128,  INTERRUPT LAST FIRST | 1 | 0,0, abf3bd6e |
05 | 18002348, 9032, INTERRUPT LAST   | 0 | 8000, 6400,  c5a57780 |
04 | 1640, 1600, INTERRUPT| 0 | 6400, 4800,  c5a57780 |
03 | 1640, 1600, INTERRUPT| 0 | 4800, 3200,  c5a57780 |
02 | 1640, 1600, INTERRUPT| 0 | 3200, 1600,  c5a57780 |
01 | 1640, 1600, INTERRUPT| 0 | 1600, 0, c5a57780 |
00 | 14000640, 1600, INTERRUPT FIRST  | 1 | 0,0, c5a57780 |

So at t=7 a new packets is started but not finished, probably due to rx
overrun - but rx overrun is not indicated in the flags. Instead a new
packets starts at t=8. This results in skb->len to exceed size for the LAST
fragment at t=13 and thus a negative fragment size added to the skb.

This then crashes:

kernel BUG at include/linux/skbuff.h:2277!
Oops: Exception in kernel mode, sig: 5 [#1]
...
NIP [c04689f4] skb_pull+0x2c/0x48
LR [c03f62ac] gfar_clean_rx_ring+0x2e4/0x844
Call Trace:
[ec4bfd38] [c06a84c4] _raw_spin_unlock_irqrestore+0x60/0x7c (unreliable)
[ec4bfda8] [c03f6a44] gfar_poll_rx_sq+0x48/0xe4
[ec4bfdc8] [c048d504] __napi_poll+0x54/0x26c
[ec4bfdf8] [c048d908] net_rx_action+0x138/0x2c0
[ec4bfe68] [c06a8f34] __do_softirq+0x3a4/0x4fc
[ec4bfed8] [c0040150] run_ksoftirqd+0x58/0x70
[ec4bfee8] [c0066ecc] smpboot_thread_fn+0x184/0x1cc
[ec4bff08] [c0062718] kthread+0x140/0x144
[ec4bff38] [c0012350] ret_from_kernel_thread+0x14/0x1c

This patch fixes this by checking for computed LAST fragment size, so a
negative sized fragment is never added.
In order to prevent the newer rx frame from getting corrupted, the FIRST
flag is checked to discard the incomplete older frame.

Signed-off-by: Michael Braun 
---
 drivers/net/ethernet/freescale/gianfar.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/drivers/net/ethernet/freescale/gianfar.c 
b/drivers/net/ethernet/freescale/gianfar.c
index 541de32ea662..2aecae23bfd0 100644
--- a/drivers/net/ethernet/freescale/gianfar.c
+++ b/drivers/net/ethernet/freescale/gianfar.c
@@ -2390,6 +2390,10 @@ static bool gfar_add_rx_frag(struct gfar_rx_buff *rxb, 
u32 lstatus,
if (lstatus & BD_LFLAG(RXBD_LAST))
size -= skb->len;
 
+   WARN(size < 0, "gianfar: rx fragment size underflow");
+   if (size < 0)
+   return false;
+
skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, page,
rxb->page_offset + RXBUF_ALIGNMENT,
size, GFAR_RXB_TRUESIZE);
@@ -2552,6 +2556,16 @@ static int gfar_clean_rx_ring(struct gfar_priv_rx_q 
*rx_queue,
if (lstatus & BD_LFLAG(RXBD_EMPTY))
break;
 
+   /* lost RXBD_LAST descriptor due to overrun */
+   if (skb &&
+   (lstatus & BD_LFLAG(RXBD_FIRST))) {
+   /* discard faulty buffer */
+   dev_kfree_skb(skb);
+   skb = NULL;
+
+   /* can continue normally */
+   }
+
/* order rx buffer descriptor reads */
rmb();
 
-- 
2.20.1



Re: [PATCH] bridge: increase mtu to 64K

2020-05-08 Thread michael-dev

Am 07.05.2020 13:06, schrieb Nikolay Aleksandrov:

That isn't correct, have you tested with a recent kernel?
After:
commit 804b854d374e
Author: Nikolay Aleksandrov 
Date:   Fri Mar 30 13:46:19 2018 +0300


Thanks for pointing out, I'm sorry my test kernel was too old.

Regards,
Michael


[PATCH] Fix dumping vlan rules

2019-07-13 Thread michael-dev
From: "M. Braun" 

Given the following bridge rules:
1. ip protocol icmp accept
2. ether type vlan vlan type ip ip protocol icmp accept

The are currently both dumped by "nft list ruleset" as
1. ip protocol icmp accept
2. ip protocol icmp accept

Though, the netlink code actually is different

bridge filter FORWARD 4
  [ payload load 2b @ link header + 12 => reg 1 ]
  [ cmp eq reg 1 0x0008 ]
  [ payload load 1b @ network header + 9 => reg 1 ]
  [ cmp eq reg 1 0x0001 ]
  [ immediate reg 0 accept ]

bridge filter FORWARD 5 4
  [ payload load 2b @ link header + 12 => reg 1 ]
  [ cmp eq reg 1 0x0081 ]
  [ payload load 2b @ link header + 16 => reg 1 ]
  [ cmp eq reg 1 0x0008 ]
  [ payload load 1b @ network header + 9 => reg 1 ]
  [ cmp eq reg 1 0x0001 ]
  [ immediate reg 0 accept ]

Fix this by avoiding the removal of all vlan statements
in the given example.

Signed-off-by: Michael Braun 
---
 src/payload.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/src/payload.c b/src/payload.c
index 3bf1ecc..905422a 100644
--- a/src/payload.c
+++ b/src/payload.c
@@ -506,6 +506,18 @@ static bool payload_may_dependency_kill(struct 
payload_dep_ctx *ctx,
 dep->left->payload.desc == &proto_ip6) &&
expr->payload.base == PROTO_BASE_TRANSPORT_HDR)
return false;
+   /* Do not kill
+*  ether type vlan and vlan type ip and ip protocol icmp
+* into
+*  ip protocol icmp
+* as this lacks ether type vlan.
+* More generally speaking, do not kill protocol type
+* for stacked protocols if we only have protcol type matches.
+*/
+   if (dep->left->etype == EXPR_PAYLOAD && dep->op == OP_EQ &&
+   expr->flags & EXPR_F_PROTOCOL &&
+   expr->payload.base == dep->left->payload.base)
+   return false;
break;
}
 
-- 
2.20.1



Re: [PATCH net-next] bridge: multicast to unicast

2017-01-09 Thread michael-dev

Am 09.01.2017 13:15, schrieb Johannes Berg:

That is bridge fdb entries (need to) expire so the bridge might
"forget" a still-connected station not sending but only consuming
broadcast traffic.


Ok, that I don't know. Somehow if you address a unicast packet there
the bridge has to make a decision - so it really should know?


If the bridge has not learned the unicast destination mac address on any 
port, it will flood the packet on all ports except the port it received 
the packet on.


Regards,
M. Braun


Re: [PATCH 3/3] mac80211: multicast to unicast conversion

2016-10-06 Thread michael-dev

Am 05.10.2016 13:58, schrieb Johannes Berg:


Anyway, perhaps this needs to change to take DMS/per-station into
account?

Then again, this kind of setting - global multicast-to-unicast -
fundamentally *cannot* be done on a per-station basis, since if you
enable it for one station and not for another, the first station that
has it enabled would get the packets twice...


as I see it, that is exactly how DMS is standarized.

IEEE 802.11-2012 section 10.23.15 DMS procedures:

"If the requested DMS is accepted by the AP, the AP shall send 
subsequent group addressed MSDUs that
match the frame classifier specified in the DMS Descriptors to the 
requesting STA as A-MSDU subframes

within an individually addressed A-MSDU frame (see 8.3.2.2 and 9.11)."

 -> so the multicast packets shall go out as unicast A-MSDU frames to 
stations that requested this


"The AP shall continue to transmit the matching frames as group 
addressed frames (see 9.3.6, and 10.2.1.16) if at least one associated 
STA has not requested DMS for these frames."


 -> so it will continue to send it as multicast frames as well.

As with DMS the station requested DMS for a specific multicast address, 
it could then drop multicast frames addressed to the multicast address 
it registered for DMS.


Regards,
M. Braun


Re: [PATCH 3/3] mac80211: multicast to unicast conversion

2016-10-05 Thread michael-dev

Am 05.10.2016 12:19, schrieb Johannes Berg:

on both ends. Furthermore, I've seen a few mobile phone stations
locally that indicate qos support but won't complete DHCP if their
broadcasts are encapsulated as A-MSDU. Though they work fine with
this series approach.


Presumably those phones also don't even try to use DMS, right?


When I traced this two years ago, almost no device indicated DMS 
support, even though almost all seem to accepted multicast in unicast 
a-msdu frames.





This patch therefore does not opt to implement DMS but instead just
replicates the packet and changes the destination address. As this
works fine with ARP, IPv4 and IPv6, it is limited to these protocols
and normal 802.11 multicast frames are send out for all other payload
protocols.


How did you determine that it "works fine"?


First, I tested this manually using my own devices or asked friends. I 
think this covered at least a recent debian x64 with an intel wireless 
card, a windows 7 x64 with an intel wireless card, an android phone, an 
ios phone and some recent macbook. Manually testing included visiting an 
IPv6 only website (this network uses IPv6 router advertismentens (RA) 
but no DHCPv6), so RA is accepted and ND working. Additionally, 
arping'ing these station using broadcast arp request worked fine, so 
broadcast arp requests are working. Finally, DHCP worked fine and UPNP 
multicast discovery for some closed source media streaming wireless 
device was reported working.


Next, that change was rolled out. It is now in use for at least three 
years with about 300 simulatenously online stations and >2000 currently 
registered devices and there hasn't been a single problem report that 
could be related to that change. Though, e.g. our samsung galaxy users 
report consistently that their devices refuse to connect using WPA-PSK 
as our network advertises FT-PSK next to WPA-PSK and I learned that 
there was at least one device there that did not like the 
multicast-as-unicast-amsdu packets due to a user problem report.



I see at least one undesirable impact of this, which DMS doesn't have;
it breaks a client's MUST NOT requirement from RFC 1122:


Okay, so this cannot go into linux, right?

The thing I dislike most about DMS is that it is client driven, that is 
an AP will only apply unicast conversion if a station actively requests 
so.



You should split the patch into cfg80211 and mac80211, IMHO it's big
enough to do that.


ok


+ * @set_ap_unicast: set the multicast to unicast flag for a AP
interface


That API name isn't very descriptive, I'm sure we can do something
better there.


proposal: "request multicast packets to be trasnmitted as unicast" ?


Also, perhaps we should structure this already like we would DMS, with
a per-station toggle or even list of multicast addresses?


should be possible, yes



+static int ieee80211_set_ap_unicast(struct wiphy *wiphy, struct
net_device *dev,
+   const bool unicast)
+{
+   struct ieee80211_sub_if_data *sdata =
IEEE80211_DEV_TO_SUB_IF(dev);
+
+   if (sdata->vif.type != NL80211_IFTYPE_AP)
+   return -1;


Was this not documented but also intended to apply to its dependent
VLANs?


it was intended as a per per-BSS toggle, so it applies to all dependent 
VLANs automatically.



+/* Check if multicast to unicast conversion is needed and do it.
+ * Returns 1 if skb was freed and should not be send out. */


wrong comment style :)


you mean the */ at end of line instead of on a new line?


+   unicast = nla_data(info->attrs[NL80211_ATTR_UNICAST]);


What's this supposed to mean?


this was supposed to be nla_get_u8.

michael


Re: [PATCH] iproute2: macvlan: add "source" mode

2016-09-25 Thread michael-dev

Please ignore this patch, something went wrong.

Regards,
M. Braun

Am 25.09.2016 20:52, schrieb Michael Braun:

Adjusting iproute2 utility to support new macvlan link type mode called
"source".

Example of commands that can be applied:
  ip link add link eth0 name macvlan0 type macvlan mode source
  ip link set link dev macvlan0 type macvlan macaddr add 
00:11:11:11:11:11
  ip link set link dev macvlan0 type macvlan macaddr del 
00:11:11:11:11:11

  ip link set link dev macvlan0 type macvlan macaddr flush
  ip -details link show dev macvlan0

Based on previous work of Stefan Gula 

Signed-off-by: Michael Braun 

Cc: ste...@gmail.com
---
 include/linux/if_link.h |  2 ++
 man/man8/ip-link.8.in   | 57 
+

 2 files changed, 59 insertions(+)

diff --git a/include/linux/if_link.h b/include/linux/if_link.h
index 1feb708..ec5e64e 100644
--- a/include/linux/if_link.h
+++ b/include/linux/if_link.h
@@ -401,6 +401,8 @@ enum macvlan_macaddr_mode {
 };

 #define MACVLAN_FLAG_NOPROMISC 1
+#define MACVLAN_FLAG_UNICAST   2
+#define MACVLAN_FLAG_UNICAST_ALL   4

 /* VRF section */
 enum {
diff --git a/man/man8/ip-link.8.in b/man/man8/ip-link.8.in
index ffc4160..1ad3cfe 100644
--- a/man/man8/ip-link.8.in
+++ b/man/man8/ip-link.8.in
@@ -138,6 +138,9 @@ ip-link \- network device configuration
 .IR NAME " ]"
 .br
 .RB "[ " addrgenmode " { " eui64 " | " none " | " stable_secret " | "
random " } ]"
+.br
+.B macaddr " |"
+.IR "COMMAND MACADDR |"


 .ti -8
@@ -228,8 +231,46 @@ Link types:
 - IP over Infiniband device
 .sp
 .B macvlan
+.I MODE
 - Virtual interface base on link layer address (MAC)
 .sp
+Modes:
+.in +8
+.B private
+- The device never communicates with any other device on the same 
upper_dev.
+This even includes frames coming back from a reflective relay, where 
supported

+by the adjacent bridge.
+.sp
+.B vepa
+- we assume that the adjacent bridge returns all frames where both 
source and
+destination are local to the macvlan port, i.e. the bridge is set up 
as a
+reflective relay. Broadcast frames coming in from the upper_dev get 
flooded to
+all macvlan interfaces in VEPA mode. We never deliver any frames 
locally.

+.sp
+.B bridge
+- behave as simple bridge between different macvlan interfaces on the 
same
+port. Frames from one interface to another one get delivered directly 
and are
+not sent out externally. Broadcast frames get flooded to all other 
bridge
+ports and to the external interface, but when they come back from a 
reflective
+relay, we don't deliver them again. Since we know all the MAC 
addresses, the
+macvlan bridge mode does not require learning or STP like the bridge 
module

+does.
+.sp
+.B passthru
+- allows takeover of the underlying device and passing it to a guest 
using
+virtio with macvtap backend. Only one macvlan device is allowed in 
passthru
+mode and it inherits the mac address from the underlying device and 
sets it in

+promiscuous mode to receive and forward all the packets.
+.sp
+.B source
+- allows one to set a list of allowed mac address, which is used to 
match
+against source mac address from received frames on underlying 
interface. This
+allows creating mac based VLAN associations, instead of standard port 
or tag

+based. The feature is useful to deploy 802.1x mac based behavior,
+where drivers of underlying interfaces doesn't allows that.
+.sp
+.in -8
+.sp
 .B macvtap
 - Virtual interface based on link layer address (MAC) and TAP.
 .sp
@@ -1074,6 +1115,22 @@ specifies the type of the device.

 .SS ip link set - change device attributes

+.TP
+.BI macaddr " COMMAND MACADDR"
+add or removes MACADDR from allowed list for source mode macvlan type 
link

+Commands:
+.in +8
+.B add
+- add MACADDR to allowed list
+.sp
+.B del
+- remove MACADDR from allowed list
+.sp
+.B flush
+- flush whole allowed list
+.sp
+.in -8
+
 .PP
 .B Warning:
 If multiple parameter changes are requested,