Re: Problem with queuing vlan tagged packets after migration from 3.16.0 to 4.9.0
Hi

On 03.01.2019 at 21:44, Cong Wang wrote:
> On Thu, Jan 3, 2019 at 7:25 AM Bartek Kois wrote:
>> Hi
>> 1. What exactly caused this change in the kernel?
>
> I don't follow VLAN changes, I guess it must be some change which
> inserts the VLAN tag before this ->ndo_start_xmit().
>
>> 2. I don't understand why adding a VLAN tag, which is just 4 additional
>> bytes in the passing packet, makes it impossible to classify.
>
> It is possible, you just have to specify the offset manually, as
> iproute2 can't detect the offset of the IP header in this case. So it is
> just inconvenient.
>
>> 3. This whole thing makes QoS on Linux routers hard to configure in
>> scenarios with more than one VLAN, which is pretty much every slightly
>> bigger router nowadays, especially if we use IFB and hashing filters.
>> Is there any workaround for that problem?
>
> Just move these filters from the physical device to the VLAN device
> instead?

Basically my current scenario looks like this:
- router with eth0 as WAN and eth1 as LAN with 10-20 VLANs,
- around 1000-2000 IP addresses in different subnets behind the router
  (on the LAN side),
- QoS made with tc + ifb (for upload queuing) + hashing filters (for
  performance reasons).

Moving this to two queuing trees (one on the VLAN device and one on ifbX)
for each VLAN makes this really hard to configure, but not impossible, as
long as I can redirect a single VLAN to an ifb (I don't know if that is
possible).

Anton suggested using iptables+ipset, but I don't think that would be a
good idea in a scenario with so many queues.

Best regards
Bartek Kois

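Cong Wang's suggestion of attaching the filters to the VLAN device rather than the physical one can be sketched as follows. This is only an illustration, not a tested configuration: the function name emit_vlan_filter is hypothetical, the function prints the command instead of running it, and it assumes each VLAN device (e.g. eth1.20, the form mentioned in the thread) already has its own root qdisc with handle 1:.

```shell
#!/bin/sh
# Hypothetical helper: print (not execute) a u32 filter bound to a VLAN
# device instead of the physical interface.
#   $1 = parent device, $2 = VLAN ID, $3 = destination IP, $4 = class minor
# Assumption: the VLAN device <dev>.<vid> carries a root qdisc "1:".
emit_vlan_filter() {
    printf 'tc filter add dev %s.%s parent 1:0 protocol ip prio 4 u32 match ip dst %s classid 1:%s\n' \
        "$1" "$2" "$3" "$4"
}

# Example: the 10.0.12.8 host from the thread, assumed to live on VLAN 20.
emit_vlan_filter eth1 20 10.0.12.8 2001
```

On the VLAN device the tag is not part of what the filter sees, so the plain "match ip dst" form from the original scripts is used here; the cost, as noted in the message above, is one queuing tree per VLAN.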
Re: Problem with queuing vlan tagged packets after migration from 3.16.0 to 4.9.0
Hi

Is this as fast as hashing tables?

Best regards
Bartek Kois

On 03.01.2019 at 22:49, Anton Danilov wrote:
> Hi.
>
> There is a workaround - classify the packets with iptables+ipset - it's
> fast enough and more friendly.
>
> On Fri, 4 Jan 2019 at 00:21, Bartek Kois wrote:
>> Hi
>>
>> 1. What exactly caused this change in the kernel?
>> 2. I don't understand why adding a VLAN tag, which is just 4 additional
>> bytes in the passing packet, makes it impossible to classify.
>> 3. This whole thing makes QoS on Linux routers hard to configure in
>> scenarios with more than one VLAN, which is pretty much every slightly
>> bigger router nowadays, especially if we use IFB and hashing filters.
>> Is there any workaround for that problem?
>>
>> Best regards
>> Bartek Kois
>>
>> On 03.01.2019 at 04:30, Cong Wang wrote:
>>> On Tue, Jan 1, 2019 at 11:46 AM Bartek Kois wrote:
>>>> Hi
>>>>
>>>> Yes, it has worked for as long as I can remember (since around 2.4.x)
>>>> and it changed when I moved from Debian 8 to 9. I would appreciate
>>>> fixing that in the future because it is essential for queueing
>>>> traffic on routers, but the question is why these filters don't work
>>>> in that case:
>>>>
>>>> tc filter add dev $LAN_ETH parent 1:0 protocol ip prio 4 u32 match u32 0x0a000c08 0x at 20 classid 1:2001 # for 10.0.12.8 ip address
>>>> tc filter add dev $LAN_ETH parent 1:0 protocol ip prio 4 u32 match u32 0x0a000c09 0x at 20 classid 1:2002 # for 10.0.12.9 ip address
>>>> tc filter add dev $LAN_ETH parent 1:0 protocol ip prio 4 u32 match u32 0x0a000c10 0x at 20 classid 1:2003 # for 10.0.12.10 ip address
>>>>
>>>> I've changed "at 16", which works without VLAN tags, to "at 20" to
>>>> take the VLAN tag into account.
>>>
>>> Yeah, this confirms my speculation.
>>>
>>> The problem is essentially a design flaw of the u32 filter: the IP
>>> header and TCP header offsets are never fixed, for example because of
>>> VLAN tagging and IP options. What's more, it is not easy for
>>> user-space to learn the offset for different packets, as that
>>> requires parsing each packet.
>>>
>>> I don't know whether we can fix this either. The VLAN call path
>>> probably already makes assumptions about the current skb->data
>>> position; if we "fix" it for u32, it would probably break other
>>> things.

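One possible reading of Anton's iptables+ipset workaround is sketched below. It is an untested illustration under several assumptions: the set name qos_clients and the function emit_ipset_rules are hypothetical, the ipset build is assumed to support the skbinfo extension, and the qdisc on the egress device is assumed to honour skb->priority when its major number matches the qdisc handle. The function only prints the commands so they can be reviewed before anything is applied.

```shell
#!/bin/sh
# Hypothetical helper: print the commands that map a destination IP to a
# tc class through an ipset carrying per-entry skb priority.
#   $1 = set name, $2 = client IP, $3 = classid (major:minor, hex)
emit_ipset_rules() {
    # One hash:ip set with the skbinfo extension holds all clients.
    printf 'ipset create %s hash:ip skbinfo\n' "$1"
    # Each entry carries the tc class as its skb priority.
    printf 'ipset add %s %s skbprio %s\n' "$1" "$2" "$3"
    # A single mangle rule copies the priority from the matching entry,
    # so the rule count does not grow with the number of clients.
    printf 'iptables -t mangle -A POSTROUTING -j SET --map-set %s dst --map-prio\n' "$1"
}

emit_ipset_rules qos_clients 10.0.12.8 1:2001
```

The lookup is a hash-set lookup on the IP address, which is the property the question about hashing tables is getting at; whether it is equally fast in practice was not answered in the thread.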
Re: Problem with queuing vlan tagged packets after migration from 3.16.0 to 4.9.0
Hi

1. What exactly caused this change in the kernel?
2. I don't understand why adding a VLAN tag, which is just 4 additional
bytes in the passing packet, makes it impossible to classify.
3. This whole thing makes QoS on Linux routers hard to configure in
scenarios with more than one VLAN, which is pretty much every slightly
bigger router nowadays, especially if we use IFB and hashing filters. Is
there any workaround for that problem?

Best regards
Bartek Kois

On 03.01.2019 at 04:30, Cong Wang wrote:
> On Tue, Jan 1, 2019 at 11:46 AM Bartek Kois wrote:
>> Hi
>>
>> Yes, it has worked for as long as I can remember (since around 2.4.x)
>> and it changed when I moved from Debian 8 to 9. I would appreciate
>> fixing that in the future because it is essential for queueing traffic
>> on routers, but the question is why these filters don't work in that
>> case:
>>
>> tc filter add dev $LAN_ETH parent 1:0 protocol ip prio 4 u32 match u32 0x0a000c08 0x at 20 classid 1:2001 # for 10.0.12.8 ip address
>> tc filter add dev $LAN_ETH parent 1:0 protocol ip prio 4 u32 match u32 0x0a000c09 0x at 20 classid 1:2002 # for 10.0.12.9 ip address
>> tc filter add dev $LAN_ETH parent 1:0 protocol ip prio 4 u32 match u32 0x0a000c10 0x at 20 classid 1:2003 # for 10.0.12.10 ip address
>>
>> I've changed "at 16", which works without VLAN tags, to "at 20" to take
>> the VLAN tag into account.
>
> Yeah, this confirms my speculation.
>
> The problem is essentially a design flaw of the u32 filter: the IP
> header and TCP header offsets are never fixed, for example because of
> VLAN tagging and IP options. What's more, it is not easy for user-space
> to learn the offset for different packets, as that requires parsing
> each packet.
>
> I don't know whether we can fix this either. The VLAN call path probably
> already makes assumptions about the current skb->data position; if we
> "fix" it for u32, it would probably break other things.

Re: Problem with queuing vlan tagged packets after migration from 3.16.0 to 4.9.0
Hi

Yes, it has worked for as long as I can remember (since around 2.4.x) and
it changed when I moved from Debian 8 to 9. I would appreciate fixing that
in the future because it is essential for queueing traffic on routers, but
the question is why these filters don't work in that case:

tc filter add dev $LAN_ETH parent 1:0 protocol ip prio 4 u32 match u32 0x0a000c08 0x at 20 classid 1:2001 # for 10.0.12.8 ip address
tc filter add dev $LAN_ETH parent 1:0 protocol ip prio 4 u32 match u32 0x0a000c09 0x at 20 classid 1:2002 # for 10.0.12.9 ip address
tc filter add dev $LAN_ETH parent 1:0 protocol ip prio 4 u32 match u32 0x0a000c10 0x at 20 classid 1:2003 # for 10.0.12.10 ip address

I've changed "at 16", which works without VLAN tags, to "at 20" to take
the VLAN tag into account.

Best regards
Bartek Kois

On 01.01.2019 at 20:33, Cong Wang wrote:
> On Mon, Dec 31, 2018 at 10:13 AM Bartek Kois wrote:
>> Hi,
>>
>> I tested 4.20 and the problem remains (it is not possible to classify
>> tagged packets if the root filter is on the physical interface).
>
> Hmm, I guess it is because the offset used by the u32 filter is no
> longer accurate when a VLAN tag is inserted into the MAC header. On the
> egress side, skb->data points to the MAC header, so the offset of the IP
> header is different when a VLAN tag is involved.
>
> Did this really work before? I don't follow VLAN changes; it seems it
> has already been like this for a long time.

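The raw "match u32" filters above need the address written as a 32-bit hexadecimal word. The conversion can be sketched as below; the helper name ip_to_u32_hex is hypothetical and the function assumes a dotted-quad IPv4 address with no leading zeros in the octets.

```shell
#!/bin/sh
# Hypothetical helper: convert a dotted-quad IPv4 address into the 32-bit
# hex word used by "match u32 <value> <mask> at <offset>".
ip_to_u32_hex() {
    old_ifs=$IFS
    IFS=.
    # Split the argument on dots into the four positional parameters.
    set -- $1
    IFS=$old_ifs
    # Each octet becomes two hex digits, most significant octet first.
    printf '0x%02x%02x%02x%02x\n' "$1" "$2" "$3" "$4"
}

ip_to_u32_hex 10.0.12.8
ip_to_u32_hex 10.0.12.10
```

By this arithmetic 10.0.12.8 is 0x0a000c08, matching the first filter, while 10.0.12.10 comes out as 0x0a000c0a; the value 0x0a000c10 quoted in the third filter corresponds to 10.0.12.16. That is independent of the VLAN offset question discussed in the thread.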
Re: Problem with queuing vlan tagged packets after migration from 3.16.0 to 4.9.0
Hello

Working setup (driver e1000e):

# ethtool -k eth1 | grep vlan
rx-vlan-offload: on
tx-vlan-offload: on
rx-vlan-filter: on [fixed]
vlan-challenged: off [fixed]
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]

Broken setup (driver e1000e):

# ethtool -k eth1 | grep vlan
rx-vlan-offload: on
tx-vlan-offload: on
rx-vlan-filter: on [fixed]
vlan-challenged: off [fixed]
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]

The same happens with the ixgbe driver (tested on a different machine).
I've been using this for several years and all of a sudden it stopped
working properly. I've tried using the u32 classifier with a value and
mask to match the IP address at byte 16 or 20 (in case the packet contains
the additional 4 bytes of the VLAN tag) to check if it would work, with no
luck. Browsing the internet I found this notation: "protocol 802.1q", but
it doesn't work on my system.

Best regards
Bartek Kois

On 31.12.2018 at 22:47, Jakub Kicinski wrote:
> On Sat, 29 Dec 2018 13:52:23 +0100, Bartek Kois wrote:
>> Hi,
>>
>> I've got a problem queuing VLAN-tagged packets with HFSC after
>> migrating my tc scripts from Debian 8.2 (3.16.0-4-amd64) to Debian 9.5
>> (4.9.0-6-amd64). tc filters added to eth1 no longer classify src and
>> dst IP addresses correctly if the packets are encapsulated with a VLAN
>> tag, which wasn't a problem previously. It works fine if I run them
>> without VLAN tagging or if the root device is a VLAN (e.g. tc filter
>> add dev eth1.20). Could you please help me find out what has changed in
>> the kernel between those two versions and what the workaround for that
>> problem is?
>
> Could this be related to your device driver not stripping VLAN tags by
> default any more? Just a shot in the dark. Try:
>
> $ ethtool -k lo | grep vlan
>
> on the working vs. the broken setup. What is your HW/device driver?
>
>> Example of my classification filters:
>>
>> tc filter add dev eth1 parent 1:0 prio 4 protocol ip u32
>> tc filter add dev eth1 parent 1:0 prio 4 handle ${NETWORK_GROUP_HEX}: protocol ip u32 divisor 256
>> tc filter add dev eth1 protocol ip parent 1:0 prio 4 u32 ht 800:: match ip dst ${NETWORK_ADDRESS}/24 hashkey mask 0x00ff at 16 link ${NETWORK_GROUP_HEX}:
>> tc filter add dev eth1 parent 1:0 protocol ip prio 4 u32 ht ${NETWORK_GROUP_HEX}:0x${ADDR_Q4_HEX} match ip dst $ADDR classid 1:${MARK_NORMAL}
>>
>> Best regards
>> Bartek Kois

Re: Problem with queuing vlan tagged packets after migration from 3.16.0 to 4.9.0
Hi,

I tested 4.20 and the problem remains (it is not possible to classify
tagged packets if the root filter is on the physical interface).

Best regards
Bartek Kois

On 30.12.2018 at 22:14, Bartek Kois wrote:
> Hi
>
> I haven't tested any newer kernels because I thought that something
> related to packet classification had been changed permanently and that I
> have to figure out what. Which one should I test?
>
> Best regards
> Bartek Kois
>
> On 30.12.2018 at 19:53, Cong Wang wrote:
>> Hello,
>>
>> On Sat, Dec 29, 2018 at 11:54 AM Bartek Kois wrote:
>>> Hi,
>>>
>>> I've got a problem queuing VLAN-tagged packets with HFSC after
>>> migrating my tc scripts from Debian 8.2 (3.16.0-4-amd64) to Debian
>>> 9.5 (4.9.0-6-amd64). tc filters added to eth1 no longer classify src
>>> and dst IP addresses correctly if the packets are encapsulated with a
>>> VLAN tag, which wasn't a problem previously. It works fine if I run
>>> them without VLAN tagging or if the root device is a VLAN (e.g. tc
>>> filter add dev eth1.20). Could you please help me find out what has
>>> changed in the kernel between those two versions and what the
>>> workaround for that problem is?
>>
>> Does this problem still exist on the latest kernel? 4.9 is still too
>> old for upstream to be interested. :(

Re: Problem with queuing vlan tagged packets after migration from 3.16.0 to 4.9.0
Hi

I haven't tested any newer kernels because I thought that something
related to packet classification had been changed permanently and that I
have to figure out what. Which one should I test?

Best regards
Bartek Kois

On 30.12.2018 at 19:53, Cong Wang wrote:
> Hello,
>
> On Sat, Dec 29, 2018 at 11:54 AM Bartek Kois wrote:
>> Hi,
>>
>> I've got a problem queuing VLAN-tagged packets with HFSC after
>> migrating my tc scripts from Debian 8.2 (3.16.0-4-amd64) to Debian 9.5
>> (4.9.0-6-amd64). tc filters added to eth1 no longer classify src and
>> dst IP addresses correctly if the packets are encapsulated with a VLAN
>> tag, which wasn't a problem previously. It works fine if I run them
>> without VLAN tagging or if the root device is a VLAN (e.g. tc filter
>> add dev eth1.20). Could you please help me find out what has changed in
>> the kernel between those two versions and what the workaround for that
>> problem is?
>
> Does this problem still exist on the latest kernel? 4.9 is still too
> old for upstream to be interested. :(

Problem with queuing vlan tagged packets after migration from 3.16.0 to 4.9.0
Hi,

I've got a problem queuing VLAN-tagged packets with HFSC after migrating
my tc scripts from Debian 8.2 (3.16.0-4-amd64) to Debian 9.5
(4.9.0-6-amd64). tc filters added to eth1 no longer classify src and dst
IP addresses correctly if the packets are encapsulated with a VLAN tag,
which wasn't a problem previously. It works fine if I run them without
VLAN tagging or if the root device is a VLAN (e.g. tc filter add dev
eth1.20). Could you please help me find out what has changed in the kernel
between those two versions and what the workaround for that problem is?

Example of my classification filters:

tc filter add dev eth1 parent 1:0 prio 4 protocol ip u32
tc filter add dev eth1 parent 1:0 prio 4 handle ${NETWORK_GROUP_HEX}: protocol ip u32 divisor 256
tc filter add dev eth1 protocol ip parent 1:0 prio 4 u32 ht 800:: match ip dst ${NETWORK_ADDRESS}/24 hashkey mask 0x00ff at 16 link ${NETWORK_GROUP_HEX}:
tc filter add dev eth1 parent 1:0 protocol ip prio 4 u32 ht ${NETWORK_GROUP_HEX}:0x${ADDR_Q4_HEX} match ip dst $ADDR classid 1:${MARK_NORMAL}

Best regards
Bartek Kois

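In the hashing filters above, "hashkey mask 0x00ff at 16" keeps the low byte of the 32-bit word at offset 16 of the IP header, which is the last octet of the destination address, and with "divisor 256" that octet selects the bucket. A minimal sketch of deriving the bucket for one address follows. It assumes that ${ADDR_Q4_HEX} in the original scripts means the fourth octet in hex, which the thread does not state; the helper names addr_q4_hex and emit_bucket_filter are hypothetical, and the hash table handle passed in stands in for ${NETWORK_GROUP_HEX}.

```shell
#!/bin/sh
# Hypothetical helper: fourth octet of an IPv4 address as lowercase hex,
# i.e. the bucket chosen by "hashkey mask 0x00ff at 16" with divisor 256.
addr_q4_hex() {
    # ${1##*.} strips everything up to and including the last dot.
    printf '%x\n' "${1##*.}"
}

# Hypothetical helper: print (not execute) the per-address bucket filter.
#   $1 = device, $2 = hash table handle (hex), $3 = address, $4 = class minor
emit_bucket_filter() {
    printf 'tc filter add dev %s parent 1:0 protocol ip prio 4 u32 ht %s:0x%s match ip dst %s classid 1:%s\n' \
        "$1" "$2" "$(addr_q4_hex "$3")" "$3" "$4"
}

emit_bucket_filter eth1 2 10.0.12.10 2003
```

Because every address in a /24 lands in its own bucket, a lookup touches one bucket instead of walking up to 256 rules, which is the performance reason given for using hashing filters.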
Re: kernel BUG at net/core/skbuff.c in linux-2.6.21-rc6
> Your fix is probably needed too. However, I think the issue that
> Patrick was trying to fix is the case where p[0] != PPP_ALLSTATIONS and
> therefore we'd still have a problem there.

I have tested Paul's patch for the last few days and everything seems OK.
The system is stable.

Regards
Bartek