[ovs-dev] [PATCH v14] netdev-dpdk: Add custom rx-steering configuration.
Some control protocols are used to maintain link status between forwarding engines (e.g. LACP). When the system is not sized properly, the PMD threads may not be able to process all incoming traffic from the configured Rx queues. When a signaling packet of such protocols is dropped, it can cause link flapping, worsening the situation. Use the rte_flow API to redirect these protocols into a dedicated Rx queue. The assumption is made that the ratio between control protocol traffic and user data traffic is very low and thus this dedicated Rx queue will never get full. Re-program the RSS redirection table to only use the other Rx queues. The additional Rx queue will be assigned a PMD core like any other Rx queue. Polling that extra queue may introduce increased latency and a slight performance penalty at the benefit of preventing link flapping. This feature must be enabled per port on specific protocols via the rx-steering option. This option takes "rss" followed by a "+" separated list of protocol names. It is only supported on ethernet ports. This feature is experimental. If the user has already configured multiple Rx queues on the port, an additional one will be allocated for control packets. If the hardware cannot satisfy the number of requested Rx queues, the last Rx queue will be assigned for control plane. If only one Rx queue is available, the rx-steering feature will be disabled. If the hardware does not support the rte_flow matchers/actions, the rx-steering feature will be completely disabled on the port and regular rss will be performed instead. It cannot be enabled when other-config:hw-offload=true as it may conflict with the offloaded flows. Similarly, if hw-offload is enabled, custom rx-steering will be forcibly disabled on all ports and replaced by regular rss. Example use: ovs-vsctl add-bond br-phy bond0 phy0 phy1 -- \ set interface phy0 type=dpdk options:dpdk-devargs=:ca:00.0 -- \ set interface phy0 options:rx-steering=rss+lacp -- \ set interface phy1 type=dpdk options:dpdk-devargs=:ca:00.1 -- \ set interface phy1 options:rx-steering=rss+lacp As a starting point, only one protocol is supported: LACP. Other protocols can be added in the future. NIC compatibility should be checked. To validate that this works as intended, I used a traffic generator to generate random traffic slightly above the machine capacity at line rate on a two ports bond interface. OVS is configured to receive traffic on two VLANs and pop/push them in a br-int bridge based on tags set on patch ports. +--+ | DUT | |++| || br-int || in_port=patch10,actions=mod_dl_src:$patch11,mod_dl_dst:$tgen1,output:patch11 |||| in_port=patch11,actions=mod_dl_src:$patch10,mod_dl_dst:$tgen0,output:patch10 || patch10patch11 || |+---|---|+| || | | |+---|---|+| || patch00patch01 || || tag:10tag:20 || |||| || br-phy || default flow, action=NORMAL |||| || bond0|| balance-slb, lacp=passive, lacp-time=fast ||phy0 phy1 || |+--|-|---+| +---|-|+ | | +---|-|+ | port0 port1 | balance L3/L4, lacp=active, lacp-time=fast | lag | mode trunk VLANs 10, 20 | | |switch| | | | vlan 10vlan 20 | mode access | port2 port3 | +-|--|-+ | | +-|--|-+ | tgen0 tgen1 | Random traffic that is properly balanced | | across the bond ports in both directions. | traffic generator | +--+ Without rx-steering, the bond0 links are randomly switching to "defaulted" when one of the LACP packets sent by the switch is dropped because the RX queues are full and the PMD threads did not process them fast enough. When that happens, all traffic must go through a single link which causes above line rate traffic to be dropped. ~# ovs-appctl lacp/show-stats bond0 bond0 statistics member: phy0: TX PDUs: 347246 RX PDUs: 14865 RX Bad PDUs: 0 RX Marker Request PDUs: 0 Link Expired: 168 Link Defaulted: 0 Carrier Status Changed: 0 member: phy1: TX PDUs: 347245 RX PDUs: 14919 RX Bad PDUs: 0 RX Marker Request PDUs: 0 Link Expired: 147 Link Defaulted: 1 Carrier Status Changed: 0 When rx-steering is enabled, no LACP packet is dropped and the bond links remain enabled at all times, maximizing the throughput. Neither the "Link Expired" nor the "Link Defaulted" counters are incremented anymore. This feature may be considered as "QoS". However, it does not work by limiting the rate of traffic explicitly. It only guarantees that some protocols have
Re: [ovs-dev] [PATCH v14] netdev-dpdk: Add custom rx-steering configuration.
Robin Jarry writes: > Some control protocols are used to maintain link status between > forwarding engines (e.g. LACP). When the system is not sized properly, > the PMD threads may not be able to process all incoming traffic from the > configured Rx queues. When a signaling packet of such protocols is > dropped, it can cause link flapping, worsening the situation. > > Use the rte_flow API to redirect these protocols into a dedicated Rx > queue. The assumption is made that the ratio between control protocol > traffic and user data traffic is very low and thus this dedicated Rx > queue will never get full. Re-program the RSS redirection table to only > use the other Rx queues. > > The additional Rx queue will be assigned a PMD core like any other Rx > queue. Polling that extra queue may introduce increased latency and > a slight performance penalty at the benefit of preventing link flapping. > > This feature must be enabled per port on specific protocols via the > rx-steering option. This option takes "rss" followed by a "+" separated > list of protocol names. It is only supported on ethernet ports. This > feature is experimental. > > If the user has already configured multiple Rx queues on the port, an > additional one will be allocated for control packets. If the hardware > cannot satisfy the number of requested Rx queues, the last Rx queue will > be assigned for control plane. If only one Rx queue is available, the > rx-steering feature will be disabled. If the hardware does not support > the rte_flow matchers/actions, the rx-steering feature will be > completely disabled on the port and regular rss will be performed > instead. > > It cannot be enabled when other-config:hw-offload=true as it may > conflict with the offloaded flows. Similarly, if hw-offload is enabled, > custom rx-steering will be forcibly disabled on all ports and replaced > by regular rss. > > Example use: > > ovs-vsctl add-bond br-phy bond0 phy0 phy1 -- \ >set interface phy0 type=dpdk options:dpdk-devargs=:ca:00.0 -- \ >set interface phy0 options:rx-steering=rss+lacp -- \ >set interface phy1 type=dpdk options:dpdk-devargs=:ca:00.1 -- \ >set interface phy1 options:rx-steering=rss+lacp > > As a starting point, only one protocol is supported: LACP. Other > protocols can be added in the future. NIC compatibility should be > checked. > > To validate that this works as intended, I used a traffic generator to > generate random traffic slightly above the machine capacity at line rate > on a two ports bond interface. OVS is configured to receive traffic on > two VLANs and pop/push them in a br-int bridge based on tags set on > patch ports. > >+--+ >| DUT | >|++| >|| br-int || > in_port=patch10,actions=mod_dl_src:$patch11,mod_dl_dst:$tgen1,output:patch11 >|||| > in_port=patch11,actions=mod_dl_src:$patch10,mod_dl_dst:$tgen0,output:patch10 >|| patch10patch11 || >|+---|---|+| >|| | | >|+---|---|+| >|| patch00patch01 || >|| tag:10tag:20 || >|||| >|| br-phy || default flow, action=NORMAL >|||| >|| bond0|| balance-slb, lacp=passive, lacp-time=fast >||phy0 phy1 || >|+--|-|---+| >+---|-|+ >| | >+---|-|+ >| port0 port1 | balance L3/L4, lacp=active, lacp-time=fast >| lag | mode trunk VLANs 10, 20 >| | >|switch| >| | >| vlan 10vlan 20 | mode access >| port2 port3 | >+-|--|-+ > | | >+-|--|-+ >| tgen0 tgen1 | Random traffic that is properly balanced >| | across the bond ports in both directions. >| traffic generator | >+--+ > > Without rx-steering, the bond0 links are randomly switching to > "defaulted" when one of the LACP packets sent by the switch is dropped > because the RX queues are full and the PMD threads did not process them > fast enough. When that happens, all traffic must go through a single > link which causes above line rate traffic to be dropped. > > ~# ovs-appctl lacp/show-stats bond0 > bond0 statistics > member: phy0: >TX PDUs: 347246 >RX PDUs: 14865 >RX Bad PDUs: 0 >RX Marker Request PDUs: 0 >Link Expired: 168 >Link Defaulted: 0 >Carrier Status Changed: 0 > member: phy1: >TX PDUs: 347245 >RX PDUs: 14919 >RX Bad PDUs: 0 >RX Marker Request PDUs: 0 >Link Expired: 147 >Link Defaulted: 1 >Carrier Status Changed: 0 > > When rx-steering is enabled, no LACP packet is dropped and the bond > links remain enabled at all times, maximizing the throughput.
Re: [ovs-dev] [PATCH v14] netdev-dpdk: Add custom rx-steering configuration.
On 7/4/23 21:59, Robin Jarry wrote: > Some control protocols are used to maintain link status between > forwarding engines (e.g. LACP). When the system is not sized properly, > the PMD threads may not be able to process all incoming traffic from the > configured Rx queues. When a signaling packet of such protocols is > dropped, it can cause link flapping, worsening the situation. > > Use the rte_flow API to redirect these protocols into a dedicated Rx > queue. The assumption is made that the ratio between control protocol > traffic and user data traffic is very low and thus this dedicated Rx > queue will never get full. Re-program the RSS redirection table to only > use the other Rx queues. > > The additional Rx queue will be assigned a PMD core like any other Rx > queue. Polling that extra queue may introduce increased latency and > a slight performance penalty at the benefit of preventing link flapping. > > This feature must be enabled per port on specific protocols via the > rx-steering option. This option takes "rss" followed by a "+" separated > list of protocol names. It is only supported on ethernet ports. This > feature is experimental. > > If the user has already configured multiple Rx queues on the port, an > additional one will be allocated for control packets. If the hardware > cannot satisfy the number of requested Rx queues, the last Rx queue will > be assigned for control plane. If only one Rx queue is available, the > rx-steering feature will be disabled. If the hardware does not support > the rte_flow matchers/actions, the rx-steering feature will be > completely disabled on the port and regular rss will be performed > instead. > > It cannot be enabled when other-config:hw-offload=true as it may > conflict with the offloaded flows. Similarly, if hw-offload is enabled, > custom rx-steering will be forcibly disabled on all ports and replaced > by regular rss. > > Example use: > > ovs-vsctl add-bond br-phy bond0 phy0 phy1 -- \ >set interface phy0 type=dpdk options:dpdk-devargs=:ca:00.0 -- \ >set interface phy0 options:rx-steering=rss+lacp -- \ >set interface phy1 type=dpdk options:dpdk-devargs=:ca:00.1 -- \ >set interface phy1 options:rx-steering=rss+lacp > > As a starting point, only one protocol is supported: LACP. Other > protocols can be added in the future. NIC compatibility should be > checked. > > To validate that this works as intended, I used a traffic generator to > generate random traffic slightly above the machine capacity at line rate > on a two ports bond interface. OVS is configured to receive traffic on > two VLANs and pop/push them in a br-int bridge based on tags set on > patch ports. > >+--+ >| DUT | >|++| >|| br-int || > in_port=patch10,actions=mod_dl_src:$patch11,mod_dl_dst:$tgen1,output:patch11 >|||| > in_port=patch11,actions=mod_dl_src:$patch10,mod_dl_dst:$tgen0,output:patch10 >|| patch10patch11 || >|+---|---|+| >|| | | >|+---|---|+| >|| patch00patch01 || >|| tag:10tag:20 || >|||| >|| br-phy || default flow, action=NORMAL >|||| >|| bond0|| balance-slb, lacp=passive, lacp-time=fast >||phy0 phy1 || >|+--|-|---+| >+---|-|+ >| | >+---|-|+ >| port0 port1 | balance L3/L4, lacp=active, lacp-time=fast >| lag | mode trunk VLANs 10, 20 >| | >|switch| >| | >| vlan 10vlan 20 | mode access >| port2 port3 | >+-|--|-+ > | | >+-|--|-+ >| tgen0 tgen1 | Random traffic that is properly balanced >| | across the bond ports in both directions. >| traffic generator | >+--+ > > Without rx-steering, the bond0 links are randomly switching to > "defaulted" when one of the LACP packets sent by the switch is dropped > because the RX queues are full and the PMD threads did not process them > fast enough. When that happens, all traffic must go through a single > link which causes above line rate traffic to be dropped. > > ~# ovs-appctl lacp/show-stats bond0 > bond0 statistics > member: phy0: >TX PDUs: 347246 >RX PDUs: 14865 >RX Bad PDUs: 0 >RX Marker Request PDUs: 0 >Link Expired: 168 >Link Defaulted: 0 >Carrier Status Changed: 0 > member: phy1: >TX PDUs: 347245 >RX PDUs: 14919 >RX Bad PDUs: 0 >RX Marker Request PDUs: 0 >Link Expired: 147 >Link Defaulted: 1 >Carrier Status Changed: 0 > > When rx-steering is enabled, no LACP packet is dropped and the bond > links remain enabled at all times,
Re: [ovs-dev] [PATCH v14] netdev-dpdk: Add custom rx-steering configuration.
Ilya Maximets, Jul 10, 2023 at 15:48: > In order to speed up the process a bit, proposing a following incremental > change: > > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c > index 27b2fa5e0..aa87ee546 100644 > --- a/lib/netdev-dpdk.c > +++ b/lib/netdev-dpdk.c > @@ -2074,23 +2074,23 @@ dpdk_set_rx_steer_config(struct netdev *netdev, > struct netdev_dpdk *dev, > const char *arg = smap_get_def(args, "rx-steering", "rss"); > uint64_t flags = 0; > > -if (nullable_string_is_equal(arg, "rss+lacp")) { > +if (!strcmp(arg, "rss+lacp")) { > flags = DPDK_RX_STEER_LACP; > -} else if (!nullable_string_is_equal(arg, "rss")) { > -VLOG_WARN_BUF(errp, "%s options:rx-steering " > +} else if (strcmp(arg, "rss")) { > +VLOG_WARN_BUF(errp, "%s: options:rx-steering " >"unsupported parameter value '%s'", >netdev_get_name(netdev), arg); > } > > if (flags && dev->type != DPDK_DEV_ETH) { > -VLOG_WARN_BUF(errp, "%s options:rx-steering " > +VLOG_WARN_BUF(errp, "%s: options:rx-steering " >"is only supported on ethernet ports", >netdev_get_name(netdev)); > flags = 0; > } > > if (flags && netdev_is_flow_api_enabled()) { > -VLOG_WARN_BUF(errp, "%s options:rx-steering " > +VLOG_WARN_BUF(errp, "%s: options:rx-steering " >"is incompatible with hw-offload", >netdev_get_name(netdev)); > flags = 0; > diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml > index 2f756b1b7..01408e90a 100644 > --- a/vswitchd/vswitch.xml > +++ b/vswitchd/vswitch.xml > @@ -3539,9 +3539,9 @@ ovs-vsctl add-port br0 p0 -- set Interface p0 > type=patch options:peer=p1 \ > > > > - If the user has already configured multiple > - options:n_rxq on the port, an additional one will be > - allocated for the specified protocols. Even if the hardware cannot > + If the user has already configured multiple + column="options" key="n_rxq" /> on the port, an additional one will > + be allocated for the specified protocols. Even if the hardware > cannot >satisfy the requested number of requested Rx queues, the last Rx >queue will be used. If only one Rx queue is available or if the >hardware does not support the rte_flow matchers/actions required to > @@ -3551,10 +3551,14 @@ ovs-vsctl add-port br0 p0 -- set Interface p0 > type=patch options:peer=p1 \ > >This feature is mutually exclusive with > > - as it may conflict with the offloaded RTE flows. If both are > enabled, > + as it may conflict with the offloaded flows. If both are enabled, >rx-steering will fall back to default rss >mode. > > + > + This option is only applicable to interfaces with type > + dpdk. > + > > > --- > > If that looks good to you, I could fold it in while applying the patch. > What do you think? > > Best regards, Ilya Maximets. Yes, that'll be faster. Thanks! ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v14] netdev-dpdk: Add custom rx-steering configuration.
On 7/10/23 15:51, Robin Jarry wrote: > Ilya Maximets, Jul 10, 2023 at 15:48: >> In order to speed up the process a bit, proposing a following incremental >> change: >> >> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c >> index 27b2fa5e0..aa87ee546 100644 >> --- a/lib/netdev-dpdk.c >> +++ b/lib/netdev-dpdk.c >> @@ -2074,23 +2074,23 @@ dpdk_set_rx_steer_config(struct netdev *netdev, >> struct netdev_dpdk *dev, >> const char *arg = smap_get_def(args, "rx-steering", "rss"); >> uint64_t flags = 0; >> >> -if (nullable_string_is_equal(arg, "rss+lacp")) { >> +if (!strcmp(arg, "rss+lacp")) { >> flags = DPDK_RX_STEER_LACP; >> -} else if (!nullable_string_is_equal(arg, "rss")) { >> -VLOG_WARN_BUF(errp, "%s options:rx-steering " >> +} else if (strcmp(arg, "rss")) { >> +VLOG_WARN_BUF(errp, "%s: options:rx-steering " >>"unsupported parameter value '%s'", >>netdev_get_name(netdev), arg); >> } >> >> if (flags && dev->type != DPDK_DEV_ETH) { >> -VLOG_WARN_BUF(errp, "%s options:rx-steering " >> +VLOG_WARN_BUF(errp, "%s: options:rx-steering " >>"is only supported on ethernet ports", >>netdev_get_name(netdev)); >> flags = 0; >> } >> >> if (flags && netdev_is_flow_api_enabled()) { >> -VLOG_WARN_BUF(errp, "%s options:rx-steering " >> +VLOG_WARN_BUF(errp, "%s: options:rx-steering " >>"is incompatible with hw-offload", >>netdev_get_name(netdev)); >> flags = 0; >> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml >> index 2f756b1b7..01408e90a 100644 >> --- a/vswitchd/vswitch.xml >> +++ b/vswitchd/vswitch.xml >> @@ -3539,9 +3539,9 @@ ovs-vsctl add-port br0 p0 -- set Interface p0 >> type=patch options:peer=p1 \ >> >> >> >> - If the user has already configured multiple >> - options:n_rxq on the port, an additional one will be >> - allocated for the specified protocols. Even if the hardware cannot >> + If the user has already configured multiple > + column="options" key="n_rxq" /> on the port, an additional one >> will >> + be allocated for the specified protocols. Even if the hardware >> cannot >>satisfy the requested number of requested Rx queues, the last Rx >>queue will be used. If only one Rx queue is available or if the >>hardware does not support the rte_flow matchers/actions required >> to >> @@ -3551,10 +3551,14 @@ ovs-vsctl add-port br0 p0 -- set Interface p0 >> type=patch options:peer=p1 \ >> >>This feature is mutually exclusive with >>> /> >> - as it may conflict with the offloaded RTE flows. If both are >> enabled, >> + as it may conflict with the offloaded flows. If both are enabled, >>rx-steering will fall back to default >> rss >>mode. >> >> + >> + This option is only applicable to interfaces with type >> + dpdk. >> + >> >> >>> --- >> >> If that looks good to you, I could fold it in while applying the patch. >> What do you think? >> >> Best regards, Ilya Maximets. > > Yes, that'll be faster. Thanks! > Thanks, Robin, Kevin and Aaron! Applied. Best regards, Ilya Maximets. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev