[ovs-dev] [PATCH] fix vhost user stats
1. "+=" should be "=" 2. tx_errors is a generic param, and should be 0 since vhost does not create such error. Or some app, like libvirt will complain for failure to find this key. Signed-off-by: wangzhike --- lib/netdev-dpdk.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index e90fd0e..1c50aa3 100644 --- a/lib/netdev-dpdk.c +++ b/lib/netdev-dpdk.c @@ -2016,14 +2016,15 @@ netdev_dpdk_vhost_get_stats(const struct netdev *netdev, rte_spinlock_lock(&dev->stats_lock); /* Supported Stats */ -stats->rx_packets += dev->stats.rx_packets; -stats->tx_packets += dev->stats.tx_packets; +stats->rx_packets = dev->stats.rx_packets; +stats->tx_packets = dev->stats.tx_packets; stats->rx_dropped = dev->stats.rx_dropped; -stats->tx_dropped += dev->stats.tx_dropped; +stats->tx_dropped = dev->stats.tx_dropped; stats->multicast = dev->stats.multicast; stats->rx_bytes = dev->stats.rx_bytes; stats->tx_bytes = dev->stats.tx_bytes; stats->rx_errors = dev->stats.rx_errors; +stats->tx_errors = 0; stats->rx_length_errors = dev->stats.rx_length_errors; stats->rx_1_to_64_packets = dev->stats.rx_1_to_64_packets; -- 1.8.3.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] vhost user port is displayed as LINK DOWN after ovs-vswitchd reboot
Thanks Darrell. I just send it out via git send-email. Br, Wang Zhike (lawrence) -Original Message- From: Darrell Ball [mailto:db...@vmware.com] Sent: Wednesday, August 23, 2017 8:48 AM To: 王志克; d...@openvswitch.org Subject: Re: [ovs-dev] vhost user port is displayed as LINK DOWN after ovs-vswitchd reboot This looks reasonable at first glance Lawrence Would you like to try git format-patch and git send-email ? It would make it easier in the long term. Thanks Darrell On 8/21/17, 6:00 AM, "ovs-dev-boun...@openvswitch.org on behalf of 王志克" wrote: Hi, I create a pull request, regarding the vhost user port status. The problem is that the port may be udpated while the vhost_reconfigured is false. Then the vhost_reconfigured is updated. As a result, the vhost user status is kept as LINK-DOWN. Note the traffic is OK in this case. Only the status is wrong. https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_openvswitch_ovs_pull_198&d=DwICAg&c=uilaK90D4TOVoH58JNXRgQ&r=BVhFA09CGX7JQ5Ih-uZnsw&m=AM5cxUEtfcit4Rc-vVRud4qt3gZ4cRk19IR_fqiEppM&s=e55NLote0ZavherSDH6l2VDVlSIpYJ4df-RVDfw1t-U&e= Thanks. Br, Lawrence ___ dev mailing list d...@openvswitch.org https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev&d=DwICAg&c=uilaK90D4TOVoH58JNXRgQ&r=BVhFA09CGX7JQ5Ih-uZnsw&m=AM5cxUEtfcit4Rc-vVRud4qt3gZ4cRk19IR_fqiEppM&s=zTr1Xq-O5yLDfASgQ1Gs9ZYPrhuDk3kayv2fOAznbK4&e= ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH] Fix: vhost user port status
After ovs-vswitchd reboots, vhost user status is displayed as LINK DOWN though the traffic is OK. The problem is that the port may be udpated while the vhost_reconfigured is false. Then the vhost_reconfigured is updated to true. As a result, the vhost user status is kept as LINK-DOWN. Signed-off-by: wangzhike --- lib/netdev-dpdk.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index 1aaf6f7..e90fd0e 100644 --- a/lib/netdev-dpdk.c +++ b/lib/netdev-dpdk.c @@ -3227,7 +3227,11 @@ dpdk_vhost_reconfigure_helper(struct netdev_dpdk *dev) } if (netdev_dpdk_get_vid(dev) >= 0) { -dev->vhost_reconfigured = true; +if (dev->vhost_reconfigured == false) { +dev->vhost_reconfigured = true; +/* change vhost_reconfigured may affect carrier status */ +netdev_change_seq_changed(&dev->up); +} } return 0; -- 1.8.3.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH] ovn: Check for known logical switch port types.
OVN is lenient with the types of logical switch ports. Maybe too lenient. This patch attempts to solve this problem on two fronts: 1) In ovn-nbctl, if you attempt to set the port type to an unknown type, the command will not end up setting the type. 2) In northd, when copying the port type from the northbound database to the corresponding port-binding in the southbound database, a warning will be issued if the port is of an unknown type. Signed-off-by: Mark Michelson --- ovn/lib/ovn-util.c| 30 +++ ovn/lib/ovn-util.h| 2 ++ ovn/northd/ovn-northd.c | 4 +++ ovn/utilities/ovn-nbctl.c | 7 - tests/ovn-nbctl.at| 73 +++ 5 files changed, 115 insertions(+), 1 deletion(-) diff --git a/ovn/lib/ovn-util.c b/ovn/lib/ovn-util.c index 037d0798a..883ce4ccb 100644 --- a/ovn/lib/ovn-util.c +++ b/ovn/lib/ovn-util.c @@ -299,3 +299,33 @@ default_sb_db(void) } return def; } + +/* l3gateway, chassisredirect, and patch + * are not in this list since they are + * only set in the SB DB by northd + */ +const char *OVN_NB_LSP_TYPES[] = { +"router", +"localnet", +"localport", +"l2gateway", +"vtep", +}; + +bool +ovn_is_known_nb_lsp_type(const char *type) +{ +int i; + +if (!type || !type[0]) { +return true; +} + +for (i = 0; i < ARRAY_SIZE(OVN_NB_LSP_TYPES); ++i) { +if (!strcmp(OVN_NB_LSP_TYPES[i], type)) { +return true; +} +} + +return false; +} diff --git a/ovn/lib/ovn-util.h b/ovn/lib/ovn-util.h index b3d2125a3..9b456426d 100644 --- a/ovn/lib/ovn-util.h +++ b/ovn/lib/ovn-util.h @@ -67,4 +67,6 @@ char *alloc_nat_zone_key(const struct uuid *key, const char *type); const char *default_nb_db(void); const char *default_sb_db(void); +bool ovn_is_known_nb_lsp_type(const char *type); + #endif diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c index 49e4ac338..177df5a1e 100644 --- a/ovn/northd/ovn-northd.c +++ b/ovn/northd/ovn-northd.c @@ -1962,6 +1962,10 @@ ovn_port_update_sbrec(struct northd_context *ctx, } sbrec_port_binding_set_options(op->sb, &options); smap_destroy(&options); +if (!ovn_is_known_nb_lsp_type(op->nbsp->type)) { +VLOG_WARN("Unknown port type '%s' set on logical switch '%s'. " + "Using anyway.", op->nbsp->type, op->nbsp->name); +} sbrec_port_binding_set_type(op->sb, op->nbsp->type); } else { const char *chassis = NULL; diff --git a/ovn/utilities/ovn-nbctl.c b/ovn/utilities/ovn-nbctl.c index 46ede4e50..d00bd6ec9 100644 --- a/ovn/utilities/ovn-nbctl.c +++ b/ovn/utilities/ovn-nbctl.c @@ -1173,7 +1173,12 @@ nbctl_lsp_set_type(struct ctl_context *ctx) const struct nbrec_logical_switch_port *lsp; lsp = lsp_by_name_or_uuid(ctx, id, true); -nbrec_logical_switch_port_set_type(lsp, type); +if (ovn_is_known_nb_lsp_type(type)) { +nbrec_logical_switch_port_set_type(lsp, type); +} else { +ctl_fatal("Logical switch port type '%s' is unrecognized. " + "Not setting type.", type); +} } static void diff --git a/tests/ovn-nbctl.at b/tests/ovn-nbctl.at index 354b8df96..e1c4b4473 100644 --- a/tests/ovn-nbctl.at +++ b/tests/ovn-nbctl.at @@ -951,3 +951,76 @@ IPv6 Routes OVN_NBCTL_TEST_STOP AT_CLEANUP + +dnl - + +AT_SETUP([ovn-nbctl - lsp types]) +OVN_NBCTL_TEST_START + +AT_CHECK([ovn-nbctl ls-add ls0]) +AT_CHECK([ovn-nbctl lsp-add ls0 lp0]) + +dnl switchport type defaults to empty +AT_CHECK([ovn-nbctl lsp-get-type lp0], [0], [dnl + +]) + +dnl The following are the valid entries for +dnl switchport type +AT_CHECK([ovn-nbctl lsp-set-type lp0 l2gateway]) +AT_CHECK([ovn-nbctl lsp-get-type lp0], [0], [dnl +l2gateway +]) + +AT_CHECK([ovn-nbctl lsp-set-type lp0 router]) +AT_CHECK([ovn-nbctl lsp-get-type lp0], [0], [dnl +router +]) + +AT_CHECK([ovn-nbctl lsp-set-type lp0 localnet]) +AT_CHECK([ovn-nbctl lsp-get-type lp0], [0], [dnl +localnet +]) + +AT_CHECK([ovn-nbctl lsp-set-type lp0 localport]) +AT_CHECK([ovn-nbctl lsp-get-type lp0], [0], [dnl +localport +]) + +AT_CHECK([ovn-nbctl lsp-set-type lp0 vtep]) +AT_CHECK([ovn-nbctl lsp-get-type lp0], [0], [dnl +vtep +]) + +dnl All of these are valid southbound port types but +dnl should be rejected for northbound logical switch +dnl ports. +AT_CHECK([ovn-nbctl lsp-set-type lp0 l3gateway], [1], [], [dnl +ovn-nbctl: Logical switch port type 'l3gateway' is unrecognized. Not setting type. +]) +AT_CHECK([ovn-nbctl lsp-set-type lp0 patch], [1], [], [dnl +ovn-nbctl: Logical switch port type 'patch' is unrecognized. Not setting type. +]) +AT_CHECK([ovn-nbctl lsp-set-type lp0 chassisredirect], [1], [], [dnl +ovn-nbctl: Logical switch port type 'chassisredirect' is unrecognized. Not setting type. +]) + +dnl switch port type should still be "vt
[ovs-dev] OVS+DPDK QoS rate limit issue
Hi All, I am using OVS2.7.0 and DPDK 16.11, and testing rate limit function. I found that if the policing_rate is set very large, say 5Gbps, the rate is limited dramatically to very low value, like 800Mbps. The command is as below: ovs-vsctl set interface port-7zel2so9sg ingress_policing_rate=500 ingress_policing_burst=50 If we set the rate lower than 4Gbps, the rate is limited correctly. Test setup: Sender (DPDK pktGen) sends out about 10Gbps udp packet, with size about 1420 IP size. The rate limit is set on VM vhost-user-client port. Any idea about this issue? Is that known issue? Br, Wang Zhike ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH 2/5] datapath: Optimize updating for OvS flow_stats.
On 08/23/2017 04:24 PM, Joe Stringer wrote: On 23 August 2017 at 11:21, Greg Rose wrote: > Upstream commit: > commit c57c054eb5b1ccf230c49f736f7a018fcbc3e952 > Author: Tonghao Zhang > Date: Mon Jul 17 23:28:05 2017 -0700 > > openvswitch: Optimize updating for OvS flow_stats. > > In the ovs_flow_stats_update(), we only use the node > var to alloc flow_stats struct. But this is not a > common case, it is unnecessary to call the numa_node_id() > everytime. This patch is not a bugfix, but there maybe > a small increase. > > Signed-off-by: Tonghao Zhang > Signed-off-by: David S. Miller > > Signed-off-by: Greg Rose > --- Thanks for the backports, the rest of this series LGTM. I'm running a quick build check on Travis in the mean time, but once we have an answer on patch #1 I'd be happy to apply this series. Cheers, Joe Thanks Joe - let me work in your suggestion for patch #1 and then I'll resend V2 of the series. - Greg ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH 1/5] datapath: Remove all references to SKB_GSO_UDP.
On 08/23/2017 04:21 PM, Joe Stringer wrote: On 23 August 2017 at 11:21, Greg Rose wrote: > Upstream commit: > commit 880388aa3c07fdea4f9b85e35641753017b1852f > Author: David S. Miller > Date: Mon Jul 3 07:29:12 2017 -0700 > > net: Remove all references to SKB_GSO_UDP. > > Such packets are no longer possible. > > Signed-off-by: David S. Miller > > Apply openvswitch section of this upstream patch. > > Signed-off-by: Greg Rose > --- I think that the background context of this patch is that there are no SKB_GSO_UDP frames in the latest upstream versions of the kernel. However, in the OVS datapath we will be running against earlier kernels that still have these flags. If we think that this could impact forwarding (and we care about such configurations on older kernels) then we might have to consider #ifdefing these pieces. Otherwise this seems OK. Ah, right. The patches all passed a travis build for the various kernels but that doesn't necessarily mean it all works right. I'll add the appropriate acinclude check and #ifdef and then send a V2. Thanks for the review! - Greg ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] Web Content
Hello, I recently sent you an email regarding our company and providing Web Content. Let me know if you are interested and we can discuss this further. I look forward to your response. Kind regards, Nicole Fitzpatrick Content Writer _ From: web...@webdeta.biz [mailto:web...@webdeta.biz] Sent: Tuesday, August 22, 2017 5:46 AM To: d...@openvswitch.org Subject: Web Content Hello! I hope things are well. I am a content writer working for an online services Agency. Being a stickler for content I noticed a couple of meta-data related mistakes on your website that I thought I would bring to your attention. It is on one of the inner pages. I have one of the digital marketers preparing a quick report and details for you which list the errors. I thought you may find it interesting and probably a core reason why your online visibility is not increasing. Can I send the report and proposal to you? Kind regards, Nicole Fitzpatrick Content Writer ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH 2/2] atlocal: Document find_l7_lib()
On 22 August 2017 at 17:52, Yi-Hung Wei wrote: > When a system traffic is skipped due to 'HAVE_FTP = no' or > 'HAVE_TFTP = no', it takes some effort to figure out it is due to > missing the required python library. Add some comments around the > find_l7_lib(), so that user can figure that out by > $ git grep HAVE_FTP. > > Signed-off-by: Yi-Hung Wei > --- Thanks, applied to master. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH 1/2] datapath: compat: Fix build on RHEL 7.4
On 22 August 2017 at 18:00, Joe Stringer wrote: > On 22 August 2017 at 17:52, Yi-Hung Wei wrote: >> RHEL 7.4 introduces netdev_master_upper_dev_link_rh() that breaks the >> backport of OVS kernel module on RHEL 7.4. This patch fixes that issue. >> >> Signed-off-by: Yi-Hung Wei >> --- > > Thanks YiHung, LGTM. > > Unless others have objections, it seems like this would be trivial to > also backport to branch-2.8 and allow that release to compile against > CentOS/RHEL 7.4 kernels. Applied to master and branch-2.8. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH 2/5] datapath: Optimize updating for OvS flow_stats.
On 23 August 2017 at 11:21, Greg Rose wrote: > Upstream commit: > commit c57c054eb5b1ccf230c49f736f7a018fcbc3e952 > Author: Tonghao Zhang > Date: Mon Jul 17 23:28:05 2017 -0700 > > openvswitch: Optimize updating for OvS flow_stats. > > In the ovs_flow_stats_update(), we only use the node > var to alloc flow_stats struct. But this is not a > common case, it is unnecessary to call the numa_node_id() > everytime. This patch is not a bugfix, but there maybe > a small increase. > > Signed-off-by: Tonghao Zhang > Signed-off-by: David S. Miller > > Signed-off-by: Greg Rose > --- Thanks for the backports, the rest of this series LGTM. I'm running a quick build check on Travis in the mean time, but once we have an answer on patch #1 I'd be happy to apply this series. Cheers, Joe ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH 1/5] datapath: Remove all references to SKB_GSO_UDP.
On 23 August 2017 at 11:21, Greg Rose wrote: > Upstream commit: > commit 880388aa3c07fdea4f9b85e35641753017b1852f > Author: David S. Miller > Date: Mon Jul 3 07:29:12 2017 -0700 > > net: Remove all references to SKB_GSO_UDP. > > Such packets are no longer possible. > > Signed-off-by: David S. Miller > > Apply openvswitch section of this upstream patch. > > Signed-off-by: Greg Rose > --- I think that the background context of this patch is that there are no SKB_GSO_UDP frames in the latest upstream versions of the kernel. However, in the OVS datapath we will be running against earlier kernels that still have these flags. If we think that this could impact forwarding (and we care about such configurations on older kernels) then we might have to consider #ifdefing these pieces. Otherwise this seems OK. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] datapath-windows: Move OvsCreateNewNBLsFromMultipleNBs to BuggerMgmt
Thanks Anand and Shashank! Acked-by: Alin Gabriel Serdean I applied this on branch-2.8 and master. Thanks, Alin. > -Original Message- > From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev- > boun...@openvswitch.org] On Behalf Of Anand Kumar > Sent: Tuesday, August 22, 2017 12:53 AM > To: Shashank Ram ; d...@openvswitch.org > Subject: Re: [ovs-dev] [PATCH] datapath-windows: Move > OvsCreateNewNBLsFromMultipleNBs to BuggerMgmt > > Acked-by: Anand Kumar > > Thanks, > Anand Kumar > > On 8/21/17, 2:45 PM, "ovs-dev-boun...@openvswitch.org on behalf of > Shashank Ram" r...@vmware.com> wrote: > > Moves function OvsCreateNewNBLsFromMultipleNBs() to BufferMgmt.c > to facilitate consumption from outside PacketIO.c. > > Signed-off-by: Shashank Ram > --- > datapath-windows/ovsext/BufferMgmt.c | 47 > > datapath-windows/ovsext/BufferMgmt.h | 4 +++ > datapath-windows/ovsext/PacketIO.c | 42 > 3 files changed, 51 insertions(+), 42 deletions(-) > > diff --git a/datapath-windows/ovsext/BufferMgmt.c b/datapath- > windows/ovsext/BufferMgmt.c > index 1ede4a3..5c9e562 100644 > --- a/datapath-windows/ovsext/BufferMgmt.c > +++ b/datapath-windows/ovsext/BufferMgmt.c > @@ -1783,3 +1783,50 @@ OvsGetCtxSourcePortNo(PNET_BUFFER_LIST > nbl, > *portNo = ctx->srcPortNo; > return NDIS_STATUS_SUCCESS; > } > + > +/* > + * -- > + * OvsCreateNewNBLsFromMultipleNBs -- > + * Creates an NBL chain where each NBL has a single NB, > + * from an NBL which has multiple NBs. > + * Sets 'curNbl' and 'lastNbl' to the first and last NBL in the > + * newly created NBL chain respectively, and completes the original > NBL. > + * -- > + */ > +NTSTATUS > +OvsCreateNewNBLsFromMultipleNBs(POVS_SWITCH_CONTEXT > switchContext, > +PNET_BUFFER_LIST *curNbl, > +PNET_BUFFER_LIST *lastNbl) > +{ > +NTSTATUS status = STATUS_SUCCESS; > +PNET_BUFFER_LIST newNbls = NULL; > +PNET_BUFFER_LIST nbl = NULL; > +BOOLEAN error = TRUE; > + > +do { > +/* Create new NBLs from curNbl with multiple net buffers. */ > +newNbls = OvsPartialCopyToMultipleNBLs(switchContext, > + *curNbl, 0, 0, TRUE); > +if (NULL == newNbls) { > +OVS_LOG_ERROR("Failed to allocate NBLs with single NB."); > +status = NDIS_STATUS_RESOURCES; > +break; > +} > + > +nbl = newNbls; > +while (nbl) { > +*lastNbl = nbl; > +nbl = NET_BUFFER_LIST_NEXT_NBL(nbl); > +} > + > +(*curNbl)->Next = NULL; > + > +OvsCompleteNBL(switchContext, *curNbl, TRUE); > + > +*curNbl = newNbls; > + > +error = FALSE; > +} while (error); > + > +return status; > +} > diff --git a/datapath-windows/ovsext/BufferMgmt.h b/datapath- > windows/ovsext/BufferMgmt.h > index e6cc0fe..dcf310a 100644 > --- a/datapath-windows/ovsext/BufferMgmt.h > +++ b/datapath-windows/ovsext/BufferMgmt.h > @@ -141,4 +141,8 @@ NDIS_STATUS > OvsSetCtxSourcePortNo(PNET_BUFFER_LIST nbl, UINT32 portNo); > > NDIS_STATUS OvsGetCtxSourcePortNo(PNET_BUFFER_LIST nbl, UINT32 > *portNo); > > +NTSTATUS OvsCreateNewNBLsFromMultipleNBs(PVOID context, > + PNET_BUFFER_LIST *curNbl, > + PNET_BUFFER_LIST *lastNbl); > + > #endif /* __BUFFER_MGMT_H_ */ > diff --git a/datapath-windows/ovsext/PacketIO.c b/datapath- > windows/ovsext/PacketIO.c > index 81c574e..38e3e5f 100644 > --- a/datapath-windows/ovsext/PacketIO.c > +++ b/datapath-windows/ovsext/PacketIO.c > @@ -46,10 +46,6 @@ extern NDIS_STRING ovsExtFriendlyNameUC; > static VOID OvsFinalizeCompletionList(OvsCompletionList > *completionList); > static VOID OvsCompleteNBLIngress(POVS_SWITCH_CONTEXT > switchContext, > PNET_BUFFER_LIST netBufferLists, ULONG sendCompleteFlags); > -static NTSTATUS OvsCreateNewNBLsFromMultipleNBs( > -POVS_SWITCH_CONTEXT switchContext, > -PNET_BUFFER_LIST *curNbl, > -PNET_BUFFER_LIST *lastNbl); > > VOID > OvsInitCompletionList(OvsCompletionList *completionList, > @@ -500,41 +496,3 @@ OvsExtCancelSendNBL(NDIS_HANDLE > filterModuleContext, > /* All send requests get completed synchronously, so there is no need > to > * implement this
[ovs-dev] [PATCH 5/5] datapath: fix skb_panic due to the incorrect actions attrlen
Upstream commit: commit 494bea39f3201776cdfddc232705f54a0bd210c4 Author: Liping Zhang Date: Wed Aug 16 13:30:07 2017 +0800 openvswitch: fix skb_panic due to the incorrect actions attrlen For sw_flow_actions, the actions_len only represents the kernel part's size, and when we dump the actions to the userspace, we will do the convertions, so it's true size may become bigger than the actions_len. But unfortunately, for OVS_PACKET_ATTR_ACTIONS, we use the actions_len to alloc the skbuff, so the user_skb's size may become insufficient and oops will happen like this: skbuff: skb_over_panic: text:8148fabf len:1749 put:157 head: 881300f39000 data:881300f39000 tail:0x6d5 end:0x6c0 dev: [ cut here ] kernel BUG at net/core/skbuff.c:129! [...] Call Trace: [] skb_put+0x43/0x44 [] skb_zerocopy+0x6c/0x1f4 [] queue_userspace_packet+0x3a3/0x448 [openvswitch] [] ovs_dp_upcall+0x30/0x5c [openvswitch] [] output_userspace+0x132/0x158 [openvswitch] [] ? ip6_rcv_finish+0x74/0x77 [ipv6] [] do_execute_actions+0xcc1/0xdc8 [openvswitch] [] ovs_execute_actions+0x74/0x106 [openvswitch] [] ovs_dp_process_packet+0xe1/0xfd [openvswitch] [] ? key_extract+0x63c/0x8d5 [openvswitch] [] ovs_vport_receive+0xa1/0xc3 [openvswitch] [...] Also we can find that the actions_len is much little than the orig_len: crash> struct sw_flow_actions 0x8812f539d000 struct sw_flow_actions { rcu = { next = 0x8812f5398800, func = 0xe3b00035db32 }, orig_len = 1384, actions_len = 592, actions = 0x8812f539d01c } So as a quick fix, use the orig_len instead of the actions_len to alloc the user_skb. Last, this oops happened on our system running a relative old kernel, but the same risk still exists on the mainline, since we use the wrong actions_len from the beginning. Fixes: ccea74457bbd ("openvswitch: include datapath actions with sampled-pac Cc: Neil McKee Signed-off-by: Liping Zhang Acked-by: Pravin B Shelar Signed-off-by: David S. Miller Fixes: 0e469d3b380c ("datapath: Include datapath actions with sampled-packet upcall to userspace.") Signed-off-by: Greg Rose --- datapath/actions.c | 1 + datapath/datapath.c | 7 --- datapath/datapath.h | 2 ++ 3 files changed, 7 insertions(+), 3 deletions(-) diff --git a/datapath/actions.c b/datapath/actions.c index 59d91b2..ad18c2c 100644 --- a/datapath/actions.c +++ b/datapath/actions.c @@ -1348,6 +1348,7 @@ int ovs_execute_actions(struct datapath *dp, struct sk_buff *skb, goto out; } + OVS_CB(skb)->acts_origlen = acts->orig_len; err = do_execute_actions(dp, skb, key, acts->actions, acts->actions_len); diff --git a/datapath/datapath.c b/datapath/datapath.c index b565fc5..1780819 100644 --- a/datapath/datapath.c +++ b/datapath/datapath.c @@ -388,7 +388,7 @@ static int queue_gso_packets(struct datapath *dp, struct sk_buff *skb, } static size_t upcall_msg_size(const struct dp_upcall_info *upcall_info, - unsigned int hdrlen) + unsigned int hdrlen, int actions_attrlen) { size_t size = NLMSG_ALIGN(sizeof(struct ovs_header)) + nla_total_size(hdrlen) /* OVS_PACKET_ATTR_PACKET */ @@ -405,7 +405,7 @@ static size_t upcall_msg_size(const struct dp_upcall_info *upcall_info, /* OVS_PACKET_ATTR_ACTIONS */ if (upcall_info->actions_len) - size += nla_total_size(upcall_info->actions_len); + size += nla_total_size(actions_attrlen); /* OVS_PACKET_ATTR_MRU */ if (upcall_info->mru) @@ -472,7 +472,8 @@ static int queue_userspace_packet(struct datapath *dp, struct sk_buff *skb, else hlen = skb->len; - len = upcall_msg_size(upcall_info, hlen - cutlen); + len = upcall_msg_size(upcall_info, hlen - cutlen, + OVS_CB(skb)->acts_origlen); user_skb = genlmsg_new(len, GFP_ATOMIC); if (!user_skb) { err = -ENOMEM; diff --git a/datapath/datapath.h b/datapath/datapath.h index f20deed..70ad0ac 100644 --- a/datapath/datapath.h +++ b/datapath/datapath.h @@ -100,11 +100,13 @@ struct datapath { * when a packet is received by OVS. * @mru: The maximum received fragement size; 0 if the packet is not * fragmented. + * @acts_origlen: The netlink size of the flow actions applied to this skb. * @cutlen: The number of bytes from the packet end to be removed. */ struct ovs_skb_cb { struct vport*input_vport; u16 mru; + u16 acts_origlen; u32 cutlen; }; #define OVS_CB(skb) ((struct ovs_skb_cb *)(
[ovs-dev] [PATCH 4/5] datapath: Remove unnecessary newlines from OVS_NLERR uses
Upstream commit: commit 0ed80da518a1f27562a013f106505e495e891fe4 Author: Joe Perches Date: Fri Aug 11 04:26:26 2017 -0700 openvswitch: Remove unnecessary newlines from OVS_NLERR uses OVS_NLERR already adds a newline so these just add blank lines to the logging. Signed-off-by: Joe Perches Acked-by: Joe Stringer Signed-off-by: David S. Miller Signed-off-by: Greg Rose --- datapath/conntrack.c| 14 +- datapath/flow_netlink.c | 2 +- 2 files changed, 6 insertions(+), 10 deletions(-) diff --git a/datapath/conntrack.c b/datapath/conntrack.c index b645ab1..d517a87 100644 --- a/datapath/conntrack.c +++ b/datapath/conntrack.c @@ -1221,15 +1221,13 @@ static int parse_nat(const struct nlattr *attr, int type = nla_type(a); if (type > OVS_NAT_ATTR_MAX) { - OVS_NLERR(log, - "Unknown NAT attribute (type=%d, max=%d).\n", + OVS_NLERR(log, "Unknown NAT attribute (type=%d, max=%d)", type, OVS_NAT_ATTR_MAX); return -EINVAL; } if (nla_len(a) != ovs_nat_attr_lens[type][ip_vers]) { - OVS_NLERR(log, - "NAT attribute type %d has unexpected length (%d != %d).\n", + OVS_NLERR(log, "NAT attribute type %d has unexpected length (%d != %d)", type, nla_len(a), ovs_nat_attr_lens[type][ip_vers]); return -EINVAL; @@ -1239,9 +1237,7 @@ static int parse_nat(const struct nlattr *attr, case OVS_NAT_ATTR_SRC: case OVS_NAT_ATTR_DST: if (info->nat) { - OVS_NLERR(log, - "Only one type of NAT may be specified.\n" - ); + OVS_NLERR(log, "Only one type of NAT may be specified"); return -ERANGE; } info->nat |= OVS_CT_NAT; @@ -1291,13 +1287,13 @@ static int parse_nat(const struct nlattr *attr, break; default: - OVS_NLERR(log, "Unknown nat attribute (%d).\n", type); + OVS_NLERR(log, "Unknown nat attribute (%d)", type); return -EINVAL; } } if (rem > 0) { - OVS_NLERR(log, "NAT attribute has %d unknown bytes.\n", rem); + OVS_NLERR(log, "NAT attribute has %d unknown bytes", rem); return -EINVAL; } if (!info->nat) { diff --git a/datapath/flow_netlink.c b/datapath/flow_netlink.c index 9b48612..df9d88e 100644 --- a/datapath/flow_netlink.c +++ b/datapath/flow_netlink.c @@ -1257,7 +1257,7 @@ static int ovs_key_from_nlattrs(struct net *net, struct sw_flow_match *match, } if (!is_mask && ipv6_key->ipv6_label & htonl(0xFFF0)) { - OVS_NLERR(log, "IPv6 flow label %x is out of range (max=%x).\n", + OVS_NLERR(log, "IPv6 flow label %x is out of range (max=%x)", ntohl(ipv6_key->ipv6_label), (1 << 20) - 1); return -EINVAL; } -- 1.8.3.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH 3/5] datapath: Optimize operations for OvS flow_stats.
Upstream commit: commit c4b2bf6b4a35348fe6d1eb06928eb68d7b9d99a9 Author: Tonghao Zhang Date: Mon Jul 17 23:28:06 2017 -0700 openvswitch: Optimize operations for OvS flow_stats. When calling the flow_free() to free the flow, we call many times (cpu_possible_mask, eg. 128 as default) cpumask_next(). That will take up our CPU usage if we call the flow_free() frequently. When we put all packets to userspace via upcall, and OvS will send them back via netlink to ovs_packet_cmd_execute(will call flow_free). The test topo is shown as below. VM01 sends TCP packets to VM02, and OvS forward packtets. When testing, we use perf to report the system performance. VM01 --- OvS-VM --- VM02 Without this patch, perf-top show as below: The flow_free() is 3.02% CPU usage. 4.23% [kernel][k] _raw_spin_unlock_irqrestore 3.62% [kernel][k] __do_softirq 3.16% [kernel][k] __memcpy 3.02% [kernel][k] flow_free 2.42% libc-2.17.so[.] __memcpy_ssse3_back 2.18% [kernel][k] copy_user_generic_unrolled 2.17% [kernel][k] find_next_bit When applied this patch, perf-top show as below: Not shown on the list anymore. 4.11% [kernel][k] _raw_spin_unlock_irqrestore 3.79% [kernel][k] __do_softirq 3.46% [kernel][k] __memcpy 2.73% libc-2.17.so[.] __memcpy_ssse3_back 2.25% [kernel][k] copy_user_generic_unrolled 1.89% libc-2.17.so[.] _int_malloc 1.53% ovs-vswitchd[.] xlate_actions With this patch, the TCP throughput(we dont use Megaflow Cache + Microflow Cache) between VMs is 1.18Gbs/sec up to 1.30Gbs/sec (maybe ~10% performance imporve). This patch adds cpumask struct, the cpu_used_mask stores the cpu_id that the flow used. And we only check the flow_stats on the cpu we used, and it is unncessary to check all possible cpu when getting, cleaning, and updating the flow_stats. Adding the cpu_used_mask to sw_flow struct does’t increase the cacheline number. Signed-off-by: Tonghao Zhang Acked-by: Pravin B Shelar Signed-off-by: David S. Miller Signed-off-by: Greg Rose --- datapath/flow.c | 7 --- datapath/flow.h | 2 ++ datapath/flow_table.c | 4 +++- 3 files changed, 9 insertions(+), 4 deletions(-) diff --git a/datapath/flow.c b/datapath/flow.c index 09edf8c..8493428 100644 --- a/datapath/flow.c +++ b/datapath/flow.c @@ -71,7 +71,7 @@ void ovs_flow_stats_update(struct sw_flow *flow, __be16 tcp_flags, const struct sk_buff *skb) { struct flow_stats *stats; - int cpu = smp_processor_id(); + unsigned int cpu = smp_processor_id(); int len = skb->len + (skb_vlan_tag_present(skb) ? VLAN_HLEN : 0); stats = rcu_dereference(flow->stats[cpu]); @@ -116,6 +116,7 @@ void ovs_flow_stats_update(struct sw_flow *flow, __be16 tcp_flags, rcu_assign_pointer(flow->stats[cpu], new_stats); + cpumask_set_cpu(cpu, &flow->cpu_used_mask); goto unlock; } } @@ -143,7 +144,7 @@ void ovs_flow_stats_get(const struct sw_flow *flow, memset(ovs_stats, 0, sizeof(*ovs_stats)); /* We open code this to make sure cpu 0 is always considered */ - for (cpu = 0; cpu < nr_cpu_ids; cpu = cpumask_next(cpu, cpu_possible_mask)) { + for (cpu = 0; cpu < nr_cpu_ids; cpu = cpumask_next(cpu, &flow->cpu_used_mask)) { struct flow_stats *stats = rcu_dereference_ovsl(flow->stats[cpu]); if (stats) { @@ -167,7 +168,7 @@ void ovs_flow_stats_clear(struct sw_flow *flow) int cpu; /* We open code this to make sure cpu 0 is always considered */ - for (cpu = 0; cpu < nr_cpu_ids; cpu = cpumask_next(cpu, cpu_possible_mask)) { + for (cpu = 0; cpu < nr_cpu_ids; cpu = cpumask_next(cpu, &flow->cpu_used_mask)) { struct flow_stats *stats = ovsl_dereference(flow->stats[cpu]); if (stats) { diff --git a/datapath/flow.h b/datapath/flow.h index 07af912..0796b09 100644 --- a/datapath/flow.h +++ b/datapath/flow.h @@ -31,6 +31,7 @@ #include #include #include +#include #include #include #include @@ -218,6 +219,7 @@ struct sw_flow { */ struct sw_flow_key key; struct sw_flow_id id; + struct cpumask cpu_used_mask; struct sw_flow_mask *mask; struct sw_flow_actions __rcu *sf_acts; struct flow_stats __rcu *stats[]; /* One for each CPU. First one diff --git a/datapath/flow_table.c b/datapath/flow_table
[ovs-dev] [PATCH 2/5] datapath: Optimize updating for OvS flow_stats.
Upstream commit: commit c57c054eb5b1ccf230c49f736f7a018fcbc3e952 Author: Tonghao Zhang Date: Mon Jul 17 23:28:05 2017 -0700 openvswitch: Optimize updating for OvS flow_stats. In the ovs_flow_stats_update(), we only use the node var to alloc flow_stats struct. But this is not a common case, it is unnecessary to call the numa_node_id() everytime. This patch is not a bugfix, but there maybe a small increase. Signed-off-by: Tonghao Zhang Signed-off-by: David S. Miller Signed-off-by: Greg Rose --- datapath/flow.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/datapath/flow.c b/datapath/flow.c index 9bf3eba..09edf8c 100644 --- a/datapath/flow.c +++ b/datapath/flow.c @@ -71,7 +71,6 @@ void ovs_flow_stats_update(struct sw_flow *flow, __be16 tcp_flags, const struct sk_buff *skb) { struct flow_stats *stats; - int node = numa_node_id(); int cpu = smp_processor_id(); int len = skb->len + (skb_vlan_tag_present(skb) ? VLAN_HLEN : 0); @@ -107,7 +106,7 @@ void ovs_flow_stats_update(struct sw_flow *flow, __be16 tcp_flags, __GFP_THISNODE | __GFP_NOWARN | __GFP_NOMEMALLOC, - node); + numa_node_id()); if (likely(new_stats)) { new_stats->used = jiffies; new_stats->packet_count = 1; -- 1.8.3.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH 1/5] datapath: Remove all references to SKB_GSO_UDP.
Upstream commit: commit 880388aa3c07fdea4f9b85e35641753017b1852f Author: David S. Miller Date: Mon Jul 3 07:29:12 2017 -0700 net: Remove all references to SKB_GSO_UDP. Such packets are no longer possible. Signed-off-by: David S. Miller Apply openvswitch section of this upstream patch. Signed-off-by: Greg Rose --- datapath/flow.c | 6 +- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/datapath/flow.c b/datapath/flow.c index c4f63b0..9bf3eba 100644 --- a/datapath/flow.c +++ b/datapath/flow.c @@ -589,8 +589,7 @@ static int key_extract(struct sk_buff *skb, struct sw_flow_key *key) key->ip.frag = OVS_FRAG_TYPE_LATER; return 0; } - if (nh->frag_off & htons(IP_MF) || - skb_shinfo(skb)->gso_type & SKB_GSO_UDP) + if (nh->frag_off & htons(IP_MF)) key->ip.frag = OVS_FRAG_TYPE_FIRST; else key->ip.frag = OVS_FRAG_TYPE_NONE; @@ -707,9 +706,6 @@ static int key_extract(struct sk_buff *skb, struct sw_flow_key *key) if (key->ip.frag == OVS_FRAG_TYPE_LATER) return 0; - if (skb_shinfo(skb)->gso_type & SKB_GSO_UDP) - key->ip.frag = OVS_FRAG_TYPE_FIRST; - /* Transport layer. */ if (key->ip.proto == NEXTHDR_TCP) { if (tcphdr_ok(skb)) { -- 1.8.3.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] how to run script in *.at file in tests folder?
On 23 August 2017 at 10:32, Joe Stringer wrote: > On 23 August 2017 at 01:04, Sam wrote: >> Hi all, >> >> I'm testing, I add a new at file, and in that file, I run another shell >> script like this: >> >> AT_SETUP([vhost - run prepare-env.sh]) >>> #OVS_VSWITCHD_START >>> # No need to create, as bond1 has been created. >>> echo "@" >>> /root/gangyewei/mvs/mvs/tests/prepare-env.sh >>> AT_CHECK([echo ""]) >>> AT_CHECK([/root/gangyewei/mvs/mvs/tests/prepare_env], [0], [stdout]) >>> #OVS_VSWITCHD_STOP >>> AT_CLEANUP >> >> >> I use root to run the tests, But the log show me permission error, how to >> fix this bug and how to run shell script in at file? >> >> # -*- compilation -*- >>> 1. vhost.at:23: testing vhost - run prepare-env.sh ... >>> @ >>> /root/gangyewei/mvs/mvs/tests/testsuite.dir/at-groups/1/test-source: line >>> 12: /root/gangyewei/mvs/mvs/tests/prepare-env.sh: Permission denied >>> ./vhost.at:30: echo "" > > The above is the command being run. > >>> --- /dev/null 2017-02-21 23:39:22.88249 +0800 >>> +++ /root/gangyewei/mvs/mvs/tests/testsuite.dir/at-groups/1/stdout >>> 2017-08-23 15:43:29.348993538 +0800 >>> @@ -0,0 +1 @@ >>> + > > The above is intended to read like a diff from expected to actual > results, and it states that the line with "" is extra > compared to the expected results. > > AT_CHECK by default will check that the command is successful, and > also check that there is no output. > > Perhaps try something like this? > AT_CHECK([echo ""], [0], [ignore]) Ah, I thought you weren't getting a chance to execute because the test didn't proceed beyond this point. Perhaps you should check the execute permissions on the script. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] how to run script in *.at file in tests folder?
On 23 August 2017 at 01:04, Sam wrote: > Hi all, > > I'm testing, I add a new at file, and in that file, I run another shell > script like this: > > AT_SETUP([vhost - run prepare-env.sh]) >> #OVS_VSWITCHD_START >> # No need to create, as bond1 has been created. >> echo "@" >> /root/gangyewei/mvs/mvs/tests/prepare-env.sh >> AT_CHECK([echo ""]) >> AT_CHECK([/root/gangyewei/mvs/mvs/tests/prepare_env], [0], [stdout]) >> #OVS_VSWITCHD_STOP >> AT_CLEANUP > > > I use root to run the tests, But the log show me permission error, how to > fix this bug and how to run shell script in at file? > > # -*- compilation -*- >> 1. vhost.at:23: testing vhost - run prepare-env.sh ... >> @ >> /root/gangyewei/mvs/mvs/tests/testsuite.dir/at-groups/1/test-source: line >> 12: /root/gangyewei/mvs/mvs/tests/prepare-env.sh: Permission denied >> ./vhost.at:30: echo "" The above is the command being run. >> --- /dev/null 2017-02-21 23:39:22.88249 +0800 >> +++ /root/gangyewei/mvs/mvs/tests/testsuite.dir/at-groups/1/stdout >> 2017-08-23 15:43:29.348993538 +0800 >> @@ -0,0 +1 @@ >> + The above is intended to read like a diff from expected to actual results, and it states that the line with "" is extra compared to the expected results. AT_CHECK by default will check that the command is successful, and also check that there is no output. Perhaps try something like this? AT_CHECK([echo ""], [0], [ignore]) ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] Dried Salted Fish August 2017
[ View in browser ]( http://r.newsletter.bonescamail.nl/nru6ruxqoatrf.html ) This email was sent to d...@openvswitch.org You received this email because you are registered with Bonesca Import en Export BV [ Unsubscribe here ]( http://r.newsletter.bonescamail.nl/nru6ruxqoatrg.html ) Sent by [ ]( http://r.newsletter.bonescamail.nl/click/2n3cr3fqlaoatrd.html ) © 2017 Bonesca Import en Export BV ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v3] netdev-dpdk: Create separate memory pool for each port
> -Original Message- > From: Kevin Traynor [mailto:ktray...@redhat.com] > Sent: Tuesday, August 22, 2017 7:58 PM > To: Aaron Conole ; Fischetti, Antonio > > Cc: Wojciechowicz, RobertX ; > d...@openvswitch.org > Subject: Re: [ovs-dev] [PATCH v3] netdev-dpdk: Create separate memory pool for > each port > > On 08/22/2017 06:14 PM, Aaron Conole wrote: > > "Fischetti, Antonio" writes: > > > >> Hi Kevin, pls see comments inline. > >> I'm going to rebase and rework this patch on behalf of Robert. > >> > >> Thanks, > >> Antonio > >> > >>> -Original Message- > >>> From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev- > boun...@openvswitch.org] > >>> On Behalf Of Kevin Traynor > >>> Sent: Tuesday, April 11, 2017 9:37 AM > >>> To: Wojciechowicz, RobertX ; > >>> d...@openvswitch.org > >>> Subject: Re: [ovs-dev] [PATCH v3] netdev-dpdk: Create separate memory pool > for > >>> each port > >>> > >>> On 04/07/2017 09:20 AM, Robert Wojciechowicz wrote: > Since it's possible to delete memory pool in DPDK > we can try to estimate better required memory size > when port is reconfigured, e.g. with different number > of rx queues. > > Signed-off-by: Robert Wojciechowicz > Acked-by: Ian Stokes > --- > v2: > - removing mempool reference counter > - making sure mempool name isn't longer than RTE_MEMPOOL_NAMESIZE > > v3: > - adding memory for corner cases > --- > lib/netdev-dpdk.c | 118 > ++--- > --- > >>> -- > 1 file changed, 57 insertions(+), 61 deletions(-) > > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c > index ddc651b..6b781ac 100644 > --- a/lib/netdev-dpdk.c > +++ b/lib/netdev-dpdk.c > @@ -275,14 +275,12 @@ static struct ovs_list dpdk_list > >>> OVS_GUARDED_BY(dpdk_mutex) > static struct ovs_mutex dpdk_mp_mutex OVS_ACQ_AFTER(dpdk_mutex) > = OVS_MUTEX_INITIALIZER; > > -static struct ovs_list dpdk_mp_list OVS_GUARDED_BY(dpdk_mp_mutex) > -= OVS_LIST_INITIALIZER(&dpdk_mp_list); > - > struct dpdk_mp { > struct rte_mempool *mp; > int mtu; > int socket_id; > -int refcount; > +char if_name[IFNAMSIZ]; > +unsigned mp_size; > struct ovs_list list_node OVS_GUARDED_BY(dpdk_mp_mutex); > }; > > @@ -463,78 +461,82 @@ ovs_rte_pktmbuf_init(struct rte_mempool *mp, > dp_packet_init_dpdk((struct dp_packet *) pkt, pkt->buf_len); > } > > +/* > + * XXX Full DPDK memory pool name must be unique > + * and cannot be longer than RTE_MEMPOOL_NAMESIZE > + */ > +static char * > +dpdk_mp_name(struct dpdk_mp *dmp) > +{ > +uint32_t h = hash_string(dmp->if_name, 0); > +char *mp_name = xcalloc(RTE_MEMPOOL_NAMESIZE, sizeof(char)); > +int ret = snprintf(mp_name, RTE_MEMPOOL_NAMESIZE, "ovs_%x_%d_%u", > + h, dmp->mtu, dmp->mp_size); > +if (ret < 0 || ret >= RTE_MEMPOOL_NAMESIZE) { > +return NULL; > +} > +return mp_name; > +} > + > static struct dpdk_mp * > -dpdk_mp_create(int socket_id, int mtu) > +dpdk_mp_create(struct netdev_dpdk *dev, int mtu) > { > struct rte_pktmbuf_pool_private mbp_priv; > struct dpdk_mp *dmp; > -unsigned mp_size; > char *mp_name; > > dmp = dpdk_rte_mzalloc(sizeof *dmp); > if (!dmp) { > return NULL; > } > -dmp->socket_id = socket_id; > +dmp->socket_id = dev->requested_socket_id; > dmp->mtu = mtu; > -dmp->refcount = 1; > +strncpy(dmp->if_name, dev->up.name, IFNAMSIZ); > mbp_priv.mbuf_data_room_size = MBUF_SIZE(mtu) - sizeof(struct > >>> dp_packet); > mbp_priv.mbuf_priv_size = sizeof(struct dp_packet) > - - sizeof(struct rte_mbuf); > -/* XXX: this is a really rough method of provisioning memory. > - * It's impossible to determine what the exact memory requirements > are > - * when the number of ports and rxqs that utilize a particular > mempool > >>> can > - * change dynamically at runtime. For now, use this rough > heurisitic. > +- sizeof(struct rte_mbuf); > +/* > + * XXX: rough estimation of memory required for port: > + * > + * + > + * + > + * + > */ > -if (mtu >= ETHER_MTU) { > -mp_size = MAX_NB_MBUF; > -} else { > -mp_size = MIN_NB_MBUF; > -} > > -do { > -mp_name = xasprintf("ovs_mp_%d_%d_%u", dmp->mtu, dmp->socket_id, > -mp_size); > +dmp->mp_size = dev->requested_n_rxq * dev->requested_rxq_size > ++ de
[ovs-dev] [PATCH] ovn: Check for known logical switch port types.
OVN is lenient with the types of logical switch ports. Maybe too lenient. This patch attempts to solve this problem on two fronts: 1) In ovn-nbctl, if you attempt to set the port type to an unknown type, the command will not end up setting the type. 2) In northd, when copying the port type from the northbound database to the corresponding port-binding in the southbound database, a warning will be issued if the port is of an unknown type. Signed-off-by: Mark Michelson --- ovn/lib/ovn-util.c| 32 + ovn/lib/ovn-util.h| 2 ++ ovn/northd/ovn-northd.c | 4 +++ ovn/utilities/ovn-nbctl.c | 7 - tests/ovn-nbctl.at| 73 +++ 5 files changed, 117 insertions(+), 1 deletion(-) diff --git a/ovn/lib/ovn-util.c b/ovn/lib/ovn-util.c index 037d0798a..cc0222438 100644 --- a/ovn/lib/ovn-util.c +++ b/ovn/lib/ovn-util.c @@ -299,3 +299,35 @@ default_sb_db(void) } return def; } + +/* l3gateway, chassisredirect, and patch + * are not in this list since they are + * only set in the SB DB by northd + */ +const char *OVN_NB_LSP_TYPES[] = { +"router", +"localnet", +"localport", +"l2gateway", +"vtep", +}; + +bool +ovn_is_known_nb_lsp_type(const char *type) +{ +int i; + +if (!type || !type[0]) { +return true; +} + +for (i = 0; +i < sizeof OVN_NB_LSP_TYPES / sizeof OVN_NB_LSP_TYPES[0]; +++i) { +if (!strcmp(OVN_NB_LSP_TYPES[i], type)) { +return true; +} +} + +return false; +} diff --git a/ovn/lib/ovn-util.h b/ovn/lib/ovn-util.h index b3d2125a3..9b456426d 100644 --- a/ovn/lib/ovn-util.h +++ b/ovn/lib/ovn-util.h @@ -67,4 +67,6 @@ char *alloc_nat_zone_key(const struct uuid *key, const char *type); const char *default_nb_db(void); const char *default_sb_db(void); +bool ovn_is_known_nb_lsp_type(const char *type); + #endif diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c index 49e4ac338..177df5a1e 100644 --- a/ovn/northd/ovn-northd.c +++ b/ovn/northd/ovn-northd.c @@ -1962,6 +1962,10 @@ ovn_port_update_sbrec(struct northd_context *ctx, } sbrec_port_binding_set_options(op->sb, &options); smap_destroy(&options); +if (!ovn_is_known_nb_lsp_type(op->nbsp->type)) { +VLOG_WARN("Unknown port type '%s' set on logical switch '%s'. " + "Using anyway.", op->nbsp->type, op->nbsp->name); +} sbrec_port_binding_set_type(op->sb, op->nbsp->type); } else { const char *chassis = NULL; diff --git a/ovn/utilities/ovn-nbctl.c b/ovn/utilities/ovn-nbctl.c index 46ede4e50..d00bd6ec9 100644 --- a/ovn/utilities/ovn-nbctl.c +++ b/ovn/utilities/ovn-nbctl.c @@ -1173,7 +1173,12 @@ nbctl_lsp_set_type(struct ctl_context *ctx) const struct nbrec_logical_switch_port *lsp; lsp = lsp_by_name_or_uuid(ctx, id, true); -nbrec_logical_switch_port_set_type(lsp, type); +if (ovn_is_known_nb_lsp_type(type)) { +nbrec_logical_switch_port_set_type(lsp, type); +} else { +ctl_fatal("Logical switch port type '%s' is unrecognized. " + "Not setting type.", type); +} } static void diff --git a/tests/ovn-nbctl.at b/tests/ovn-nbctl.at index 354b8df96..e1c4b4473 100644 --- a/tests/ovn-nbctl.at +++ b/tests/ovn-nbctl.at @@ -951,3 +951,76 @@ IPv6 Routes OVN_NBCTL_TEST_STOP AT_CLEANUP + +dnl - + +AT_SETUP([ovn-nbctl - lsp types]) +OVN_NBCTL_TEST_START + +AT_CHECK([ovn-nbctl ls-add ls0]) +AT_CHECK([ovn-nbctl lsp-add ls0 lp0]) + +dnl switchport type defaults to empty +AT_CHECK([ovn-nbctl lsp-get-type lp0], [0], [dnl + +]) + +dnl The following are the valid entries for +dnl switchport type +AT_CHECK([ovn-nbctl lsp-set-type lp0 l2gateway]) +AT_CHECK([ovn-nbctl lsp-get-type lp0], [0], [dnl +l2gateway +]) + +AT_CHECK([ovn-nbctl lsp-set-type lp0 router]) +AT_CHECK([ovn-nbctl lsp-get-type lp0], [0], [dnl +router +]) + +AT_CHECK([ovn-nbctl lsp-set-type lp0 localnet]) +AT_CHECK([ovn-nbctl lsp-get-type lp0], [0], [dnl +localnet +]) + +AT_CHECK([ovn-nbctl lsp-set-type lp0 localport]) +AT_CHECK([ovn-nbctl lsp-get-type lp0], [0], [dnl +localport +]) + +AT_CHECK([ovn-nbctl lsp-set-type lp0 vtep]) +AT_CHECK([ovn-nbctl lsp-get-type lp0], [0], [dnl +vtep +]) + +dnl All of these are valid southbound port types but +dnl should be rejected for northbound logical switch +dnl ports. +AT_CHECK([ovn-nbctl lsp-set-type lp0 l3gateway], [1], [], [dnl +ovn-nbctl: Logical switch port type 'l3gateway' is unrecognized. Not setting type. +]) +AT_CHECK([ovn-nbctl lsp-set-type lp0 patch], [1], [], [dnl +ovn-nbctl: Logical switch port type 'patch' is unrecognized. Not setting type. +]) +AT_CHECK([ovn-nbctl lsp-set-type lp0 chassisredirect], [1], [], [dnl +ovn-nbctl: Logical switch port type 'chassisredirect' is unrecognized. Not setting t
Re: [ovs-dev] [PATCH] windows, python: Remove code duplication in send/recv functions
Thanks Russell and Alin. I applied this on branch-2.7, branch-2.8 and master. Thanks, Alin. > -Original Message- > From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev- > boun...@openvswitch.org] On Behalf Of Russell Bryant > Sent: Tuesday, August 22, 2017 3:32 PM > To: Alin Balutoiu > Cc: d...@openvswitch.org > Subject: Re: [ovs-dev] [PATCH] windows, python: Remove code duplication > in send/recv functions > > On Tue, Aug 22, 2017 at 6:47 AM, Alin Balutoiu > wrote: > > Move the return value at the end of the function regardless of the > > pending/non-pending operation. > > > > Signed-off-by: Alin Balutoiu > > --- > > python/ovs/stream.py | 78 > > > > 1 file changed, 36 insertions(+), 42 deletions(-) > > Acked-by: Russell Bryant > > -- > Russell Bryant > ___ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH net-next v4] openvswitch: enable NSH support
From: Ben Pfaff > Sent: 22 August 2017 18:35 ... > We solved the alignment problem in OVS userspace a different way, by > defining our versions of the network protocol headers so that they only > need 16-bit alignment. In turn, we did that by defining a > ovs_16aligned_be32 type as a pair of be16s and ovs_16aligned_be64 as > four be16s, and using helper functions for reads and writes. ... If you add __attribute__((aligned(2))) to the 32bit and 64bit structure members then gcc will do the loads and shifts for you. David ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] windows, python: Fix event type returned from poller
Thanks Russel and Alin. I applied this on branch-2.7, branch-2.8 and master. > -Original Message- > From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev- > boun...@openvswitch.org] On Behalf Of Russell Bryant > Sent: Tuesday, August 22, 2017 6:34 PM > To: Alin Balutoiu > Cc: d...@openvswitch.org > Subject: Re: [ovs-dev] [PATCH] windows, python: Fix event type returned > from poller > > Ah, that makes sense for the Python version. > > The Python docs were sparse, so I wasn't sure. > > On Tue, Aug 22, 2017 at 8:35 AM, Alin Balutoiu > wrote: > > Looking at the implementation of WaitForMultipleObjects in Python it > > looks that the call will raise an exception if WAIT_FAILED is > > returned. This case is treated by the try/except block around the > WaitForMultipleObjects function. > > > > Thanks, > > Alin Balutoiu. > > > >> -Original Message- > >> From: Russell Bryant [mailto:russ...@ovn.org] > >> Sent: Tuesday, August 22, 2017 2:22 PM > >> To: Alin Balutoiu > >> Cc: d...@openvswitch.org > >> Subject: Re: [ovs-dev] [PATCH] windows, python: Fix event type > >> returned from poller > >> > >> On Tue, Aug 22, 2017 at 6:47 AM, Alin Balutoiu > >> wrote: > >> > The function poll from poller should return a list of tuples > >> > containing the events and their types. > >> > > >> > On Windows the event type is not returned at the moment. > >> > Instead of returning zero all the time, we check to see the type of > >> > event and we set it accordingly before returning the list. > >> > > >> > This is used only for debugging purposes inside the function > >> > "__log_wakeup" later on. > >> > > >> > Signed-off-by: Alin Balutoiu > >> > --- > >> > python/ovs/poller.py | 9 - > >> > 1 file changed, 8 insertions(+), 1 deletion(-) > >> > > >> > diff --git a/python/ovs/poller.py b/python/ovs/poller.py index > >> > 809e512..2f3fcf9 100644 > >> > --- a/python/ovs/poller.py > >> > +++ b/python/ovs/poller.py > >> > @@ -112,7 +112,14 @@ class _SelectSelect(object): > >> > if retval == winutils.winerror.WAIT_TIMEOUT: > >> > return [] > >> > > >> > -return [(events[retval], 0)] > >> > +if events[retval] in self.rlist: > >> > +revent = POLLIN > >> > +elif events[retval] in self.wlist: > >> > +revent = POLLOUT > >> > +else: > >> > +revent = POLLERR > >> > + > >> > +return [(events[retval], revent)] > >> > else: > >> > if timeout == -1: > >> > # epoll uses -1 for infinite timeout, select uses None. > >> > >> Acked-by: Russell Bryant > >> > >> I tried looking up docs of WaitForMultipleObjects to look at possible > >> return values. It looks like WAIT_FAILED could be returned and it > >> would cause an exception when used as the index to events. If that's > >> right, it was true before the patch as well, though. > >> > >> -- > >> Russell Bryant > > > > -- > Russell Bryant > ___ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v1] netdev-dpdk: Implement TCP/UDP TX cksum in ovs-dpdk side
Yes, maintaining only one implementation is resonable. However making ovs-dpdk to support vhost tx-cksum first is doable as well. We can have it in ovs, and replace it with new DPDK API once ovs update its dpdk version which contains the tx-cksum implementation. Thanks Zhenyu Gao 2017-08-23 21:59 GMT+08:00 Loftus, Ciara : > > > > Hi Ciara > > > > You had a general concern below; can we conclude on that before going > > further ? > > > > Thanks Darrell > > > > “ > > > On another note I have a general concern. I understand similar > functionality > > > is present in the DPDK vhost sample app. I wonder if it would be > feasible > > for > > > this to be implemented in the DPDK vhost library and leveraged here, > > rather > > > than having two implementations in two separate code bases. > > This is something I'd like to see, although I wouldn't block on this patch > waiting for it. > Maybe we can have the initial implementation as it is (if it proves > beneficial), then move to a common DPDK API if/when it becomes available. > > I've cc'ed DPDK users list hoping for some input. To summarise: > From my understanding, the DPDK vhost sample application calculates TX > checksum for packets received from vHost ports with invalid/0 checksums: > http://dpdk.org/browse/dpdk/tree/examples/vhost/main.c#n910 > The patch being discussed in this thread (also here: > https://patchwork.ozlabs.org/patch/802070/) it seems does something very > similar. > Wondering on the feasibility of putting this functionality in a rte_vhost > library call such that we don't have two separate implementations? > > Thanks, > Ciara > > > > > > > I have some other comments inline. > > > > > > Thanks, > > > Ciara > > “ > > > > > > > > From: Gao Zhenyu > > Date: Wednesday, August 16, 2017 at 6:38 AM > > To: "Loftus, Ciara" > > Cc: "b...@ovn.org" , "Chandran, Sugesh" > > , "ktray...@redhat.com" > > , Darrell Ball , > > "d...@openvswitch.org" > > Subject: Re: [ovs-dev] [PATCH v1] netdev-dpdk: Implement TCP/UDP TX > > cksum in ovs-dpdk side > > > > Hi Loftus, > >I had submitted a new version, please see > > https://patchwork.ozlabs.org/patch/802070/ > >It move the cksum to vhost receive side. > > Thanks > > Zhenyu Gao > > > > 2017-08-10 12:35 GMT+08:00 Gao Zhenyu : > > I see, for flows in phy-phy setup, they should not be calculate cksum. > > I will revise my patch to do the cksum for vhost port only. I will send > a new > > patch next week. > > > > Thanks > > Zhenyu Gao > > > > 2017-08-08 17:53 GMT+08:00 Loftus, Ciara : > > > > > > Hi Loftus, > > > > > > Thanks for testing and the comments! > > > Can you show more details about your phy-vm-phy,phy-phy setup and > > > testing steps? Then I can reproduce it to see if I can solve this pps > problem. > > > > You're welcome. I forgot to mention my tests were with 64B packets. > > > > For phy-phy the setup is a single host with 2 dpdk physical ports and 1 > flow > > rule port1 -> port2. > > See figure 3 here: https://tools.ietf.org/html/draft-ietf-bmwg-vswitch- > > opnfv-04#section-4 > > > > For the phy-vm-phy the setup is a single host with 2 dpdk physical ports > and 2 > > vhostuser ports with flow rules: > > Dpdk1 -> vhost 1 & vhost2 -> dpdk2 > > IP rules are set up in the VM to route packets from vhost1 to vhost 2. > > See figure 4 in the link above. > > > > > > > > BTW, how about throughput, did you saw improvment? > > > > By throughput if you mean 0% packet loss, I did not test this. > > > > Thanks, > > Ciara > > > > > > > > I would like to implement vhost->vhost part. > > > > > > Thanks > > > Zhenyu Gao > > > > > > 2017-08-04 22:52 GMT+08:00 Loftus, Ciara : > > > > > > > > Currently, the dpdk-vhost side in ovs doesn't support tcp/udp tx > cksum. > > > > So L4 packets's cksum were calculated in VM side but performance is > not > > > > good. > > > > Implementing tcp/udp tx cksum in ovs-dpdk side improves throughput > > and > > > > makes virtio-net frontend-driver support NETIF_F_SG as well > > > > > > > > Signed-off-by: Zhenyu Gao > > > > --- > > > > > > > > Here is some performance number: > > > > > > > > Setup: > > > > > > > > qperf client > > > > +-+ > > > > | VM| > > > > +-+ > > > > | > > > > | qperf server > > > > +--+ ++ > > > > | vswitch+dpdk | | bare-metal | > > > > +--+ ++ > > > >|| > > > >|| > > > > pNic-PhysicalSwitch > > > > > > > > do cksum in ovs-dpdk: Applied this patch and execute 'ethtool -K > eth0 tx > > > on' > > > > in VM side. > > > > It offload cksum job to ovs-dpdk side. > > > > > > > > do cksum in VM: Applied this patch and execute 'ethtool -K eth0 tx > off' in > > > VM > > > > side. > > > > VM calculate cksum for tcp/udp packets. > > > > > > > > We can see huge improvment in TCP throughput if
[ovs-dev] [PATCH] windows,python: Add restrictions to named pipes
Bump the security around named pipes to be more restrictive: disable network access and allow only administrators and above to access the named pipes. Signed-off-by: Alin Balutoiu --- python/ovs/winutils.py | 59 ++ 1 file changed, 59 insertions(+) diff --git a/python/ovs/winutils.py b/python/ovs/winutils.py index 89e28e1..8f3151a 100644 --- a/python/ovs/winutils.py +++ b/python/ovs/winutils.py @@ -17,6 +17,7 @@ import sys if sys.platform != 'win32': raise Exception("Intended to use only on Windows") else: +import ntsecuritycon import pywintypes import win32con import win32event @@ -139,7 +140,65 @@ def create_named_pipe(pipename, openMode=None, pipeMode=None, if saAttr == -1: # saAttr can be None saAttr = win32security.SECURITY_ATTRIBUTES() + +# The identifier authority. +sia = ntsecuritycon.SECURITY_NT_AUTHORITY + +# Initialize the SID. +remoteAccessSid = win32security.SID() +remoteAccessSid.Initialize( +sia, # The identifier authority. +1) # The number of sub authorities to allocate. +# Disable access over network. +remoteAccessSid.SetSubAuthority( +0, # The index of the sub authority to set +ntsecuritycon.SECURITY_NETWORK_RID) + +allowedPsids = [] +# Allow Windows Services to access the Named Pipe. +allowedPsid_0 = win32security.SID() +allowedPsid_0.Initialize( +sia, # The identifier authority. +1) # The number of sub authorities to allocate. +allowedPsid_0.SetSubAuthority( +0, # The index of the sub authority to set +ntsecuritycon.SECURITY_LOCAL_SYSTEM_RID) +# Allow Administrators to access the Named Pipe. +allowedPsid_1 = win32security.SID() +allowedPsid_1.Initialize( +sia, # The identifier authority. +2) # The number of sub authorities to allocate. +allowedPsid_1.SetSubAuthority( +0, # The index of the sub authority to set +ntsecuritycon.SECURITY_BUILTIN_DOMAIN_RID) +allowedPsid_1.SetSubAuthority( +1, # The index of the sub authority to set +ntsecuritycon.DOMAIN_ALIAS_RID_ADMINS) + +allowedPsids.append(allowedPsid_0) +allowedPsids.append(allowedPsid_1) + +# Initialize an ACL. +acl = win32security.ACL() +acl.Initialize() +# Add denied ACL. +acl.AddAccessDeniedAce(win32security.ACL_REVISION, + ntsecuritycon.GENERIC_ALL, + remoteAccessSid) +# Add allowed ACLs. +for allowedPsid in allowedPsids: +acl.AddAccessAllowedAce(win32security.ACL_REVISION, +ntsecuritycon.GENERIC_ALL, +allowedPsid) + +# Initialize an SD. +sd = win32security.SECURITY_DESCRIPTOR() +sd.Initialize() +# Set DACL. +sd.SetSecurityDescriptorDacl(True, acl, False) + saAttr.bInheritHandle = 1 +saAttr.SECURITY_DESCRIPTOR = sd try: npipe = win32pipe.CreateNamedPipe(pipename, -- 2.10.0.windows.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v3 1/4] dpif-netdev: Skip EMC lookup/insert for recirc packets
> -Original Message- > From: Jan Scheurich [mailto:jan.scheur...@ericsson.com] > Sent: Thursday, August 17, 2017 1:42 PM > To: Fischetti, Antonio ; Darrell Ball > ; d...@openvswitch.org > Subject: RE: [ovs-dev] [PATCH v3 1/4] dpif-netdev: Skip EMC lookup/insert for > recirc packets > > Hi Antonio, > > > > Is there a reason to assume that a deterministic selection on some non- > > random > > > criteria like the recirculation count will on average (over deployments > > and > > > applications) give a better performance than a random selection? > > > > [Antonio] > > If we consider latency and jitter a deterministic solution should be > > more preferable than a solution which behaves differently depending > > on the particular values of the packet fields, eg the IP addresses. > > Do you have measurements showing that latency is significantly affected > by EMC hit vs DPCLS hit? I wouldn't think so. Only throughput should vary. > [Antonio] Agree. What I meant to say is that - broadly speaking - it should be preferable to adopt solutions that seem to be more deterministic, especially in a Telco deployment. This approach - at least at a first glance - seems to be more deterministic than other approaches like the "RSS hash threshold method" because the latter can treat the packet differently depending on their header. IMPO it could be good to have this approach in parallel with some other strategies - like the "RSS hash threshold method" - because they operate on two different causes/levels of the same problem. > Probabilistic EMC lookup should only apply in situations where EMC is > overloaded, meaning we have thousands of packet flows. In this case we > maximize the aggregate throughput of the statistical flow mix. But it is not > that a flow using EMC would see higher throughput than analogous flows that > don't. > > > > I don't believe so. For example, the number of "EMC flows" in each pass > > through > > > the datapath can differ hugely: 1 GRE tunnel flow in first pass (from phy > > > port), 100K tenant flows after tunnel decapsulation. Or 100K tenant flows > > in > > > first pass (from VM) but 1 flow after NSH encapsulation in second pass. > > > > [Antonio] > > Maybe I'm wrong but shouldn't the different flows encapped in a GRE > > tunnel hit the EMC in different locations? Because even if they all have the > > same outer IP addresses, they differ in the L4 ports so the 5-tuple hash > > - and the emc locations - should vary. Same thing for NSH encapsulation? > > Neither GRE nor NSH packets have L4 ports for RSS hashing. GRE is a separate > IP protocol (not UDP). All packets of a GRE tunnel share the same pair of IP > addresses. NSH is even a non-IP protocol. > > > > I believe a random selection with dynamically adapted probability is the > > best > > > we can do without a priori knowledge about the traffic patterns and > > pipeline > > > organization. > > > > [Antonio] > > This proposal is orthogonal to other approaches that look at the usage > > of the single locations, eg policies not to overwrite active locations or to > > reduce in general the emc usage. > > I think we should consider both the two strategies to tackle two different > > aspects of the thrashing and use emc more efficiently: > > 1. skip emc lookup/insert for recirc packets (which is only activated when > >emc entries exceeds EMC_RECIRCT_NO_INSERT_THRESHOLD); > > 2. any other strategy that limits emc usage or offers a better entries > > eviction. > > > > So - being agnostic of what's the traffic type - if we have 100k flows > > that could potentially be recirculated: > > 1. allows to tackle the thrashing due to recirculation, which is activated > > when the emc entries exceeds a threshold. > > 2. allows to limit the emc usage to fewer flows because we don't want > > 100k flows to hit emc. > > First of all: we only discuss limiting EMC lookups in the case of EMC > overload. > I still don't think that it is a good idea to general skip EMC lookup for > recirculated flows in that situation. It may be the right thing to do in some > scenarios (e.g. GRE -> VM), but exactly the wrong in others (e.g. VM -> GRE). > > If we go for a probabilistic reduction of EMC lookups we'd statistically have > a > balanced improvement in all (known and unknown) scenarios. > > BR, Jan ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v1] netdev-dpdk: Implement TCP/UDP TX cksum in ovs-dpdk side
> > Hi Ciara > > You had a general concern below; can we conclude on that before going > further ? > > Thanks Darrell > > “ > > On another note I have a general concern. I understand similar functionality > > is present in the DPDK vhost sample app. I wonder if it would be feasible > for > > this to be implemented in the DPDK vhost library and leveraged here, > rather > > than having two implementations in two separate code bases. This is something I'd like to see, although I wouldn't block on this patch waiting for it. Maybe we can have the initial implementation as it is (if it proves beneficial), then move to a common DPDK API if/when it becomes available. I've cc'ed DPDK users list hoping for some input. To summarise: From my understanding, the DPDK vhost sample application calculates TX checksum for packets received from vHost ports with invalid/0 checksums: http://dpdk.org/browse/dpdk/tree/examples/vhost/main.c#n910 The patch being discussed in this thread (also here: https://patchwork.ozlabs.org/patch/802070/) it seems does something very similar. Wondering on the feasibility of putting this functionality in a rte_vhost library call such that we don't have two separate implementations? Thanks, Ciara > > > > I have some other comments inline. > > > > Thanks, > > Ciara > “ > > > > From: Gao Zhenyu > Date: Wednesday, August 16, 2017 at 6:38 AM > To: "Loftus, Ciara" > Cc: "b...@ovn.org" , "Chandran, Sugesh" > , "ktray...@redhat.com" > , Darrell Ball , > "d...@openvswitch.org" > Subject: Re: [ovs-dev] [PATCH v1] netdev-dpdk: Implement TCP/UDP TX > cksum in ovs-dpdk side > > Hi Loftus, > I had submitted a new version, please see > https://patchwork.ozlabs.org/patch/802070/ > It move the cksum to vhost receive side. > Thanks > Zhenyu Gao > > 2017-08-10 12:35 GMT+08:00 Gao Zhenyu : > I see, for flows in phy-phy setup, they should not be calculate cksum. > I will revise my patch to do the cksum for vhost port only. I will send a new > patch next week. > > Thanks > Zhenyu Gao > > 2017-08-08 17:53 GMT+08:00 Loftus, Ciara : > > > > Hi Loftus, > > > > Thanks for testing and the comments! > > Can you show more details about your phy-vm-phy,phy-phy setup and > > testing steps? Then I can reproduce it to see if I can solve this pps > > problem. > > You're welcome. I forgot to mention my tests were with 64B packets. > > For phy-phy the setup is a single host with 2 dpdk physical ports and 1 flow > rule port1 -> port2. > See figure 3 here: https://tools.ietf.org/html/draft-ietf-bmwg-vswitch- > opnfv-04#section-4 > > For the phy-vm-phy the setup is a single host with 2 dpdk physical ports and 2 > vhostuser ports with flow rules: > Dpdk1 -> vhost 1 & vhost2 -> dpdk2 > IP rules are set up in the VM to route packets from vhost1 to vhost 2. > See figure 4 in the link above. > > > > > BTW, how about throughput, did you saw improvment? > > By throughput if you mean 0% packet loss, I did not test this. > > Thanks, > Ciara > > > > > I would like to implement vhost->vhost part. > > > > Thanks > > Zhenyu Gao > > > > 2017-08-04 22:52 GMT+08:00 Loftus, Ciara : > > > > > > Currently, the dpdk-vhost side in ovs doesn't support tcp/udp tx cksum. > > > So L4 packets's cksum were calculated in VM side but performance is not > > > good. > > > Implementing tcp/udp tx cksum in ovs-dpdk side improves throughput > and > > > makes virtio-net frontend-driver support NETIF_F_SG as well > > > > > > Signed-off-by: Zhenyu Gao > > > --- > > > > > > Here is some performance number: > > > > > > Setup: > > > > > > qperf client > > > +-+ > > > | VM | > > > +-+ > > > | > > > | qperf server > > > +--+ ++ > > > | vswitch+dpdk | | bare-metal | > > > +--+ ++ > > > | | > > > | | > > > pNic-PhysicalSwitch > > > > > > do cksum in ovs-dpdk: Applied this patch and execute 'ethtool -K eth0 tx > > on' > > > in VM side. > > > It offload cksum job to ovs-dpdk side. > > > > > > do cksum in VM: Applied this patch and execute 'ethtool -K eth0 tx off' in > > VM > > > side. > > > VM calculate cksum for tcp/udp packets. > > > > > > We can see huge improvment in TCP throughput if we leverage ovs-dpdk > > > cksum. > > Hi Zhenyu, > > > > Thanks for the patch. I tested some alternative use cases and > unfortunately I > > see a degradation for phy-phy and phy-vm-phy topologies. > > Here are my results: > > > > phy-vm-phy: > > without patch: 0.871Mpps > > with patch (offload=on): 0.877Mpps > > with patch (offload=off): 0.891Mpps > > > > phy-phy: > > without patch: 13.581Mpps > > with patch: 13.055Mpps > > > > The half a million pps drop for the second test case is concerning to me but > > not surprising since we're adding extra complexity to netdev_dpdk_send() > > Could this
[ovs-dev] Jona's Topquality Drysmoked Fish
[ View in browser ]( http://r.newsletter.bonescamail.nl/nru6rskeoatrf.html ) Topquality Drysmoked Kupila / Arius Proops IQF 5 kilo 1 box € 5,99 / 10 box € 5,49 / palet (60 box) € 4,99 per kilo Topquality Drysmoked Catfish filets / Skilled fish IQF 5 kilo 1 box € 16,95 / 10 box € 16,50 / palet (60 box) € 15.95 per kilo [ For all our offers click here ]( http://r.newsletter.bonescamail.nl/click/2n3cr3699aoatrd.html ) This email was sent to d...@openvswitch.org You received this email because you are registered with Bonesca Import en Export BV [ Unsubscribe here ]( http://r.newsletter.bonescamail.nl/nru6rskeoatrg.html ) Sent by [ ]( http://r.newsletter.bonescamail.nl/click/2n3cr36a1qoatrd.html ) © 2017 Bonesca Import en Export BV ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH v5 6/6] dpif-netdev: Add ovs-appctl dpif-netdev/pmd-rxq-rebalance.
Rxqs consumed processing cycles are used to improve the balance of how rxqs are assigned to pmds. Currently some reconfiguration is needed to perform a reassignment. Add an ovs-appctl command to perform a new assignment in order to balance based on the latest rxq processing cycle information. Note: Jan requested this for testing purposes. Suggested-by: Jan Scheurich Signed-off-by: Kevin Traynor --- Documentation/howto/dpdk.rst | 5 - lib/dpif-netdev.c| 35 +++ vswitchd/ovs-vswitchd.8.in | 2 ++ 3 files changed, 41 insertions(+), 1 deletion(-) diff --git a/Documentation/howto/dpdk.rst b/Documentation/howto/dpdk.rst index 493e215..5d009bd 100644 --- a/Documentation/howto/dpdk.rst +++ b/Documentation/howto/dpdk.rst @@ -140,5 +140,8 @@ Core 7: Q4 (70%) | Q5 (10%) core 8: Q0 (60%) | Q0 (30%) -Rxq to pmds assignment takes place whenever there are configuration changes. +Rxq to pmds assignment takes place whenever there are configuration changes +or can be triggered by using:: + +$ ovs-appctl dpif-netdev/pmd-rxq-rebalance QoS diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 6cc0a1e..4ea1c07 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -723,4 +723,6 @@ static inline bool emc_entry_alive(struct emc_entry *ce); static void emc_clear_entry(struct emc_entry *ce); +static void dp_netdev_request_reconfigure(struct dp_netdev *dp); + static void emc_cache_init(struct emc_cache *flow_cache) @@ -1016,4 +1018,34 @@ sorted_poll_thread_list(struct dp_netdev *dp, static void +dpif_netdev_pmd_rebalance(struct unixctl_conn *conn, int argc, + const char *argv[], void *aux OVS_UNUSED) +{ +struct ds reply = DS_EMPTY_INITIALIZER; +struct dp_netdev *dp = NULL; + +ovs_mutex_lock(&dp_netdev_mutex); + +if (argc == 2) { +dp = shash_find_data(&dp_netdevs, argv[1]); +} else if (shash_count(&dp_netdevs) == 1) { +/* There's only one datapath */ +dp = shash_first(&dp_netdevs)->data; +} + +if (!dp) { +ovs_mutex_unlock(&dp_netdev_mutex); +unixctl_command_reply_error(conn, +"please specify an existing datapath"); +return; +} + +dp_netdev_request_reconfigure(dp); +ovs_mutex_unlock(&dp_netdev_mutex); +ds_put_cstr(&reply, "pmd rxq rebalance requested.\n"); +unixctl_command_reply(conn, ds_cstr(&reply)); +ds_destroy(&reply); +} + +static void dpif_netdev_pmd_info(struct unixctl_conn *conn, int argc, const char *argv[], void *aux) @@ -1093,4 +1125,7 @@ dpif_netdev_init(void) 0, 1, dpif_netdev_pmd_info, (void *)&poll_aux); +unixctl_command_register("dpif-netdev/pmd-rxq-rebalance", "[dp]", + 0, 1, dpif_netdev_pmd_rebalance, + NULL); return 0; } diff --git a/vswitchd/ovs-vswitchd.8.in b/vswitchd/ovs-vswitchd.8.in index dfd209e..c18baf6 100644 --- a/vswitchd/ovs-vswitchd.8.in +++ b/vswitchd/ovs-vswitchd.8.in @@ -281,4 +281,6 @@ bridge statistics, only the values shown by the above command. For each pmd thread of the datapath \fIdp\fR shows list of queue-ids with port names, which this thread polls. +.IP "\fBdpif-netdev/pmd-rxq-rebalance\fR [\fIdp\fR]" +Reassigns rxqs to pmds in the datapath \fIdp\fR based on their current usage. . .so ofproto/ofproto-dpif-unixctl.man -- 1.8.3.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH v5 5/6] dpif-netdev: Change pmd selection order.
Up to his point rxqs are sorted by processing cycles they consumed and assigned to pmds in a round robin manner. Ian pointed out that on wrap around the most loaded pmd will be the next one to be assigned an additional rxq and that it would be better to reverse the pmd order when wraparound occurs. In other words, change from assigning by rr to assigning in a forward and reverse cycle through pmds. Also, now that the algothim has finalised, document an example. Suggested-by: Ian Stokes Signed-off-by: Kevin Traynor --- Documentation/howto/dpdk.rst | 16 lib/dpif-netdev.c| 21 - tests/pmd.at | 2 +- 3 files changed, 37 insertions(+), 2 deletions(-) diff --git a/Documentation/howto/dpdk.rst b/Documentation/howto/dpdk.rst index 44737e4..493e215 100644 --- a/Documentation/howto/dpdk.rst +++ b/Documentation/howto/dpdk.rst @@ -124,4 +124,20 @@ will be used where known to assign rxqs with the highest consumption of processing cycles to different pmds. +For example, in the case where here there are 5 rxqs and 3 cores (e.g. 3,7,8) +available, and the measured usage of core cycles per rxq over the last +interval is seen to be: + +- Queue #0: 30% +- Queue #1: 80% +- Queue #3: 60% +- Queue #4: 70% +- Queue #5: 10% + +The rxqs will be assigned to cores 3,7,8 in the following order: + +Core 3: Q1 (80%) | +Core 7: Q4 (70%) | Q5 (10%) +core 8: Q0 (60%) | Q0 (30%) + Rxq to pmds assignment takes place whenever there are configuration changes. diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index afbf591..6cc0a1e 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -3285,4 +3285,5 @@ struct rr_numa { int cur_index; +bool idx_inc; }; @@ -3341,4 +3342,7 @@ rr_numa_list_populate(struct dp_netdev *dp, struct rr_numa_list *rr) numa->pmds = xrealloc(numa->pmds, numa->n_pmds * sizeof *numa->pmds); numa->pmds[numa->n_pmds - 1] = pmd; +/* At least one pmd so initialise curr_idx and idx_inc. */ +numa->cur_index = 0; +numa->idx_inc = true; } } @@ -3347,5 +3351,20 @@ static struct dp_netdev_pmd_thread * rr_numa_get_pmd(struct rr_numa *numa) { -return numa->pmds[numa->cur_index++ % numa->n_pmds]; +int numa_idx = numa->cur_index; + +if (numa->idx_inc == true) { +if (numa->cur_index == numa->n_pmds-1) { +numa->idx_inc = false; +} else { +numa->cur_index++; +} +} else { +if (numa->cur_index == 0) { +numa->idx_inc = true; +} else { +numa->cur_index--; +} +} +return numa->pmds[numa_idx]; } diff --git a/tests/pmd.at b/tests/pmd.at index b6732ea..e39a23a 100644 --- a/tests/pmd.at +++ b/tests/pmd.at @@ -54,5 +54,5 @@ m4_define([CHECK_PMD_THREADS_CREATED], [ m4_define([SED_NUMA_CORE_PATTERN], ["s/\(numa_id \)[[0-9]]*\( core_id \)[[0-9]]*:/\1\2:/"]) -m4_define([SED_NUMA_CORE_QUEUE_PATTERN], ["s/\(numa_id \)[[0-9]]*\( core_id \)[[0-9]]*:/\1\2:/;s/\(queue-id: \)0 2 4 6/\1/;s/\(queue-id: \)1 3 5 7/\1/"]) +m4_define([SED_NUMA_CORE_QUEUE_PATTERN], ["s/\(numa_id \)[[0-9]]*\( core_id \)[[0-9]]*:/\1\2:/;s/\(queue-id: \)1 2 5 6/\1/;s/\(queue-id: \)0 3 4 7/\1/"]) m4_define([DUMMY_NUMA], [--dummy-numa="0,0,0,0"]) -- 1.8.3.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH v5 4/6] dpif-netdev: Change rxq_scheduling to use rxq processing cycles.
Previously rxqs were assigned to pmds by round robin in port/queue order. Now that we have the processing cycles used for existing rxqs, use that information to try and produced a better balanced distribution of rxqs across pmds. i.e. given multiple pmds, the rxqs which have consumed the largest amount of processing cycles will be placed on different pmds. The rxqs are sorted by their processing cycles and assigned (in sorted order) round robin across pmds. Signed-off-by: Kevin Traynor --- Documentation/howto/dpdk.rst | 7 +++ lib/dpif-netdev.c| 123 --- 2 files changed, 99 insertions(+), 31 deletions(-) diff --git a/Documentation/howto/dpdk.rst b/Documentation/howto/dpdk.rst index d7f6610..44737e4 100644 --- a/Documentation/howto/dpdk.rst +++ b/Documentation/howto/dpdk.rst @@ -119,4 +119,11 @@ After that PMD threads on cores where RX queues was pinned will become thread. +If pmd-rxq-affinity is not set for rxqs, they will be assigned to pmds (cores) +automatically. The processing cycles that have been required for each rxq +will be used where known to assign rxqs with the highest consumption of +processing cycles to different pmds. + +Rxq to pmds assignment takes place whenever there are configuration changes. + QoS --- diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 5670c55..afbf591 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -712,4 +712,6 @@ static void dp_netdev_rxq_set_interval(struct dp_netdev_rxq *rx, unsigned long long cycles); +static uint64_t +dp_netdev_rxq_get_interval(struct dp_netdev_rxq *rx, unsigned idx); static void dpif_netdev_xps_revalidate_pmd(const struct dp_netdev_pmd_thread *pmd, @@ -3166,4 +3168,12 @@ dp_netdev_rxq_set_interval(struct dp_netdev_rxq *rx, } +static uint64_t +dp_netdev_rxq_get_interval(struct dp_netdev_rxq *rx, unsigned idx) +{ +unsigned long long tmp; +atomic_read_relaxed(&rx->cycles_intrvl[idx], &tmp); +return tmp; +} + static int dp_netdev_process_rxq_port(struct dp_netdev_pmd_thread *pmd, @@ -3352,8 +3362,37 @@ rr_numa_list_destroy(struct rr_numa_list *rr) } +/* Sort Rx Queues by the processing cycles they are consuming. */ +static int +rxq_cycle_sort(const void *a, const void *b) +{ +struct dp_netdev_rxq * qa; +struct dp_netdev_rxq * qb; +uint64_t total_qa, total_qb; +unsigned i; + +qa = *(struct dp_netdev_rxq **) a; +qb = *(struct dp_netdev_rxq **) b; + +total_qa = total_qb = 0; +for (i = 0; i < PMD_RXQ_INTERVAL_MAX; i++) { +total_qa += dp_netdev_rxq_get_interval(qa, i); +total_qb += dp_netdev_rxq_get_interval(qb, i); +} +dp_netdev_rxq_set_cycles(qa, RXQ_CYCLES_PROC_HIST, total_qa); +dp_netdev_rxq_set_cycles(qb, RXQ_CYCLES_PROC_HIST, total_qb); + +if (total_qa >= total_qb) { +return -1; +} +return 1; +} + /* Assign pmds to queues. If 'pinned' is true, assign pmds to pinned * queues and marks the pmds as isolated. Otherwise, assign non isolated * pmds to unpinned queues. * + * If 'pinned' is false queues will be sorted by processing cycles they are + * consuming and then assigned to pmds in round robin order. + * * The function doesn't touch the pmd threads, it just stores the assignment * in the 'pmd' member of each rxq. */ @@ -3364,18 +3403,14 @@ rxq_scheduling(struct dp_netdev *dp, bool pinned) OVS_REQUIRES(dp->port_mutex) struct rr_numa_list rr; struct rr_numa *non_local_numa = NULL; - -rr_numa_list_populate(dp, &rr); +struct dp_netdev_rxq ** rxqs = NULL; +int i, n_rxqs = 0; +struct rr_numa *numa = NULL; +int numa_id; HMAP_FOR_EACH (port, node, &dp->ports) { -struct rr_numa *numa; -int numa_id; - if (!netdev_is_pmd(port->netdev)) { continue; } -numa_id = netdev_get_numa_id(port->netdev); -numa = rr_numa_list_lookup(&rr, numa_id); - for (int qid = 0; qid < port->n_rxq; qid++) { struct dp_netdev_rxq *q = &port->rxqs[qid]; @@ -3395,34 +3430,60 @@ rxq_scheduling(struct dp_netdev *dp, bool pinned) OVS_REQUIRES(dp->port_mutex) } } else if (!pinned && q->core_id == OVS_CORE_UNSPEC) { -if (!numa) { -/* There are no pmds on the queue's local NUMA node. - Round-robin on the NUMA nodes that do have pmds. */ -non_local_numa = rr_numa_list_next(&rr, non_local_numa); -if (!non_local_numa) { -VLOG_ERR("There is no available (non-isolated) pmd " - "thread for port \'%s\' queue %d. This queue " - "will not be polled. Is pmd-cpu-mask set to " - "zero? Or are all PMDs isolated to other " - "queues?", netdev_get_name(port->netdev), -
[ovs-dev] [PATCH v5 3/6] dpif-netdev: Count the rxq processing cycles for an rxq.
Count the cycles used for processing an rxq during the pmd rxq interval. As this is an in flight counter and pmds run independently, also store the total cycles used during the last full interval. Signed-off-by: Kevin Traynor --- lib/dpif-netdev.c | 78 +++ 1 file changed, 73 insertions(+), 5 deletions(-) diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 51d4050..5670c55 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -182,4 +182,8 @@ struct emc_cache { #define DPCLS_OPTIMIZATION_INTERVAL 1000 +/* Time in ms of the interval in which rxq processing cycles used in + * rxq to pmd assignments is measured and stored. */ +#define PMD_RXQ_INTERVAL_LEN 1 + /* Number of intervals for which cycles are stored * and used during rxq to pmd assignment. */ @@ -568,4 +572,6 @@ struct dp_netdev_pmd_thread { /* Periodically sort subtable vectors according to hit frequencies */ long long int next_optimization; +/* Periodically store the processing cycles used for each rxq. */ +long long int rxq_interval; /* Statistics. */ @@ -694,5 +700,16 @@ static void pmd_load_cached_ports(struct dp_netdev_pmd_thread *pmd) OVS_REQUIRES(pmd->port_mutex); static inline void -dp_netdev_pmd_try_optimize(struct dp_netdev_pmd_thread *pmd); +dp_netdev_pmd_try_optimize(struct dp_netdev_pmd_thread *pmd, + struct polled_queue *poll_list, int poll_cnt); +static void +dp_netdev_rxq_set_cycles(struct dp_netdev_rxq *rx, + enum rxq_cycles_counter_type type, + unsigned long long cycles); +static uint64_t +dp_netdev_rxq_get_cycles(struct dp_netdev_rxq *rx, + enum rxq_cycles_counter_type type); +static void +dp_netdev_rxq_set_interval(struct dp_netdev_rxq *rx, + unsigned long long cycles); static void dpif_netdev_xps_revalidate_pmd(const struct dp_netdev_pmd_thread *pmd, @@ -3124,4 +3141,29 @@ cycles_count_intermediate(struct dp_netdev_pmd_thread *pmd, } +static void +dp_netdev_rxq_set_cycles(struct dp_netdev_rxq *rx, + enum rxq_cycles_counter_type type, + unsigned long long cycles) +{ + atomic_store_relaxed(&rx->cycles[type], cycles); +} + +static uint64_t +dp_netdev_rxq_get_cycles(struct dp_netdev_rxq *rx, + enum rxq_cycles_counter_type type) +{ +unsigned long long tmp; +atomic_read_relaxed(&rx->cycles[type], &tmp); +return tmp; +} + +static void +dp_netdev_rxq_set_interval(struct dp_netdev_rxq *rx, + unsigned long long cycles) +{ + atomic_store_relaxed(&rx->cycles_intrvl[rx->intrvl_idx++ + % PMD_RXQ_INTERVAL_MAX], cycles); +} + static int dp_netdev_process_rxq_port(struct dp_netdev_pmd_thread *pmd, @@ -3168,5 +3210,5 @@ port_reconfigure(struct dp_netdev_port *port) { struct netdev *netdev = port->netdev; -int i, err; +int i, err, last_nrxq, j; port->need_reconfigure = false; @@ -3177,4 +3219,5 @@ port_reconfigure(struct dp_netdev_port *port) port->rxqs[i].rx = NULL; } +last_nrxq = port->n_rxq; port->n_rxq = 0; @@ -3197,4 +3240,12 @@ port_reconfigure(struct dp_netdev_port *port) for (i = 0; i < netdev_n_rxq(netdev); i++) { port->rxqs[i].port = port; +if (i >= last_nrxq) { +/* Only reset cycle stats for new queues */ +dp_netdev_rxq_set_cycles(&port->rxqs[i], RXQ_CYCLES_PROC_CURR, 0); +dp_netdev_rxq_set_cycles(&port->rxqs[i], RXQ_CYCLES_PROC_HIST, 0); +for (j = 0; j < PMD_RXQ_INTERVAL_MAX; j++) { +dp_netdev_rxq_set_interval(&port->rxqs[i], 0); +} +} err = netdev_rxq_open(netdev, &port->rxqs[i].rx, i); if (err) { @@ -3877,5 +3928,5 @@ reload: dp_netdev_process_rxq_port(pmd, poll_list[i].rxq->rx, poll_list[i].port_no); -cycles_count_intermediate(pmd, NULL, +cycles_count_intermediate(pmd, poll_list[i].rxq, process_packets ? PMD_CYCLES_PROCESSING : PMD_CYCLES_IDLE); @@ -3888,5 +3939,5 @@ reload: coverage_try_clear(); -dp_netdev_pmd_try_optimize(pmd); +dp_netdev_pmd_try_optimize(pmd, poll_list, poll_cnt); if (!ovsrcu_try_quiesce()) { emc_cache_slow_sweep(&pmd->flow_cache); @@ -4332,4 +4383,5 @@ dp_netdev_configure_pmd(struct dp_netdev_pmd_thread *pmd, struct dp_netdev *dp, cmap_init(&pmd->classifiers); pmd->next_optimization = time_msec() + DPCLS_OPTIMIZATION_INTERVAL; +pmd->rxq_interval = time_msec() + PMD_RXQ_INTERVAL_LEN; hmap_init(&pmd->poll_list); hmap_init(&pmd->tx_ports); @@ -5769,8 +5821,24 @@ dpcls_sort_subtable_vector(str
[ovs-dev] [PATCH v5 2/6] dpif-netdev: Add rxq processing cycle counters.
Add counters to dp_netdev_rxq which will later be used for storing the processing cycles of an rxq. Processing cycles will be stored in reference to a defined time interval. We will store the cycles of the current in progress interval, a number of completed intervals and the sum of the completed intervals. cycles_count_intermediate was used to count cycles for a pmd. With some small additions we can also use it to count the cycles used for processing an rxq. Signed-off-by: Kevin Traynor --- lib/dpif-netdev.c | 28 +--- 1 file changed, 25 insertions(+), 3 deletions(-) diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index f35c079..51d4050 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -182,4 +182,8 @@ struct emc_cache { #define DPCLS_OPTIMIZATION_INTERVAL 1000 +/* Number of intervals for which cycles are stored + * and used during rxq to pmd assignment. */ +#define PMD_RXQ_INTERVAL_MAX 6 + struct dpcls { struct cmap_node node; /* Within dp_netdev_pmd_thread.classifiers */ @@ -340,4 +344,13 @@ enum pmd_cycles_counter_type { }; +enum rxq_cycles_counter_type { +RXQ_CYCLES_PROC_CURR, /* Cycles spent successfully polling and + processing packets during the current + interval. */ +RXQ_CYCLES_PROC_HIST, /* Total cycles of all intervals that are used + during rxq to pmd assignment. */ +RXQ_N_CYCLES +}; + #define XPS_TIMEOUT_MS 500LL @@ -351,4 +364,9 @@ struct dp_netdev_rxq { particular core. */ struct dp_netdev_pmd_thread *pmd; /* pmd thread that polls this queue. */ + +/* Counters of cycles spent polling and processing packets. */ +atomic_ullong cycles[RXQ_N_CYCLES]; +atomic_ullong cycles_intrvl[PMD_RXQ_INTERVAL_MAX]; +unsigned intrvl_idx; /* Write index for 'cycles_intrvl'. */ }; @@ -677,5 +695,4 @@ static void pmd_load_cached_ports(struct dp_netdev_pmd_thread *pmd) static inline void dp_netdev_pmd_try_optimize(struct dp_netdev_pmd_thread *pmd); - static void dpif_netdev_xps_revalidate_pmd(const struct dp_netdev_pmd_thread *pmd, @@ -3092,4 +3109,5 @@ cycles_count_end(struct dp_netdev_pmd_thread *pmd, static inline void cycles_count_intermediate(struct dp_netdev_pmd_thread *pmd, + struct dp_netdev_rxq *rxq, enum pmd_cycles_counter_type type) OVS_NO_THREAD_SAFETY_ANALYSIS @@ -3100,4 +3118,8 @@ cycles_count_intermediate(struct dp_netdev_pmd_thread *pmd, non_atomic_ullong_add(&pmd->cycles.n[type], interval); +if (rxq && (type == PMD_CYCLES_PROCESSING)) { +/* Add to the amount of current processing cycles. */ +non_atomic_ullong_add(&rxq->cycles[RXQ_CYCLES_PROC_CURR], interval); +} } @@ -3668,5 +3690,5 @@ dpif_netdev_run(struct dpif *dpif) port->rxqs[i].rx, port->port_no); -cycles_count_intermediate(non_pmd, process_packets ? +cycles_count_intermediate(non_pmd, NULL, process_packets ? PMD_CYCLES_PROCESSING : PMD_CYCLES_IDLE); @@ -3855,5 +3877,5 @@ reload: dp_netdev_process_rxq_port(pmd, poll_list[i].rxq->rx, poll_list[i].port_no); -cycles_count_intermediate(pmd, +cycles_count_intermediate(pmd, NULL, process_packets ? PMD_CYCLES_PROCESSING : PMD_CYCLES_IDLE); -- 1.8.3.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH v5 1/6] dpif-netdev: Change polled_queue to use dp_netdev_rxq.
Soon we will want to store processing cycle counts in the dp_netdev_rxq, so use that as a basis for the polled_queue that pmd_thread_main uses. Signed-off-by: Kevin Traynor --- lib/dpif-netdev.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index e2cd931..f35c079 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -486,5 +486,5 @@ struct dp_netdev_pmd_cycles { struct polled_queue { -struct netdev_rxq *rx; +struct dp_netdev_rxq *rxq; odp_port_t port_no; }; @@ -3799,5 +3799,5 @@ pmd_load_queues_and_ports(struct dp_netdev_pmd_thread *pmd, i = 0; HMAP_FOR_EACH (poll, node, &pmd->poll_list) { -poll_list[i].rx = poll->rxq->rx; +poll_list[i].rxq = poll->rxq; poll_list[i].port_no = poll->rxq->port->port_no; i++; @@ -3837,6 +3837,6 @@ reload: for (i = 0; i < poll_cnt; i++) { VLOG_DBG("Core %d processing port \'%s\' with queue-id %d\n", -pmd->core_id, netdev_rxq_get_name(poll_list[i].rx), -netdev_rxq_get_queue_id(poll_list[i].rx)); +pmd->core_id, netdev_rxq_get_name(poll_list[i].rxq->rx), +netdev_rxq_get_queue_id(poll_list[i].rxq->rx)); } @@ -3853,5 +3853,5 @@ reload: for (i = 0; i < poll_cnt; i++) { process_packets = -dp_netdev_process_rxq_port(pmd, poll_list[i].rx, +dp_netdev_process_rxq_port(pmd, poll_list[i].rxq->rx, poll_list[i].port_no); cycles_count_intermediate(pmd, -- 1.8.3.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH v5 0/6] OVS-DPDK rxq to pmd assignment improvements.
For the DPDK datapath, by default rxqs are assigned to available pmds in round robin order with no weight or priority. It can happen that some very busy queues are handled by one pmd which does not have enough cycles to prevent packets being dropped on them. While at the same time another pmd which handles queues with no traffic on them is essentially idle. Rxq to pmd assignment happens as a result of a number of events and when it does, the same unweighted round robin approach is applied each time. This patchset proposes to improve the round robin nature of rxq to pmd assignment by counting the processing cycles used by the rxqs during their operation and incorporating that data into assignment. Before assigning in a round robin manner, the rxqs will be sorted in order of the processing cycles they have been consuming. Assuming multiple pmds, this ensures that the rxqs measured to be using the most processing cycles will be assigned to different cores. In some cases the measured cycles for an rxq may be not available as the rxq is new or may not be useful for assignment as traffic patterns may change. In those cases the code will essentially fallback to being round round similar to what currently exists. However, in the case where data is available and a reliable indication of future rxq cycles consumption, rxq to pmd distribution will be much improved. V4 -> V5 Changed history of rxq considered during assignment to 1 min. In order to have data available quicker than 1 min and not to be using up to 1 min old data, introduced storing of data in multiple intervals similar to suggestion by Darrell. Some minor name changes to reflect this. 2/6 Added storage for multiple intervals 3/6 Store cycles into interval 4/6 Sum cycles from intervals and use total for assignment V3 -> V4 4/6 Rebased to accomodate new cross numa assigment. V2 -> V3 Dropped v2 1/7 as not reusing dpcls optimisation interval anymore 2/6 Moved unused functions to 3/6 to avoid compiler warning 3/6 Made pmd rxq interval independent from dpcls opt interval 4/6 Moved docs about rebalance command to when it is available in 6/6 Added logging info for pmd to rxq assignment 5/6 Added an example to docs 6/6 Noted in commit msg that Jan requested this for testing purposes V1 -> V2 Dropped Ciara's patch to change how pmd cycles are counted as it merged. 6/7: Rebased unit tests. Kevin Traynor (6): dpif-netdev: Change polled_queue to use dp_netdev_rxq. dpif-netdev: Add rxq processing cycle counters. dpif-netdev: Count the rxq processing cycles for an rxq. dpif-netdev: Change rxq_scheduling to use rxq processing cycles. dpif-netdev: Change pmd selection order. dpif-netdev: Add ovs-appctl dpif-netdev/pmd-rxq-rebalance. Documentation/howto/dpdk.rst | 26 lib/dpif-netdev.c| 293 --- tests/pmd.at | 2 +- vswitchd/ovs-vswitchd.8.in | 2 + 4 files changed, 278 insertions(+), 45 deletions(-) -- 1.8.3.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] how to run script in *.at file in tests folder?
Silly question, but is prepare-env.sh executable? On Wed, Aug 23, 2017 at 3:05 AM Sam wrote: > Hi all, > > I'm testing, I add a new at file, and in that file, I run another shell > script like this: > > AT_SETUP([vhost - run prepare-env.sh]) > > #OVS_VSWITCHD_START > > # No need to create, as bond1 has been created. > > echo "@" > > /root/gangyewei/mvs/mvs/tests/prepare-env.sh > > AT_CHECK([echo ""]) > > AT_CHECK([/root/gangyewei/mvs/mvs/tests/prepare_env], [0], [stdout]) > > #OVS_VSWITCHD_STOP > > AT_CLEANUP > > > I use root to run the tests, But the log show me permission error, how to > fix this bug and how to run shell script in at file? > > # -*- compilation -*- > > 1. vhost.at:23: testing vhost - run prepare-env.sh ... > > @ > > /root/gangyewei/mvs/mvs/tests/testsuite.dir/at-groups/1/test-source: line > > 12: /root/gangyewei/mvs/mvs/tests/prepare-env.sh: Permission denied > > ./vhost.at:30: echo "" > > --- /dev/null 2017-02-21 23:39:22.88249 +0800 > > +++ /root/gangyewei/mvs/mvs/tests/testsuite.dir/at-groups/1/stdout > > 2017-08-23 15:43:29.348993538 +0800 > > @@ -0,0 +1 @@ > > + > > 1. vhost.at:23: 1. vhost - run prepare-env.sh (vhost.at:23): FAILED ( > > vhost.at:30) > > > Thank you~ > ___ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] Cooked Shrimps Jona
[ View in browser ]( http://r.newsletter.bonescamail.nl/nru6rrryoatrf.html ) [ ]( http://r.newsletter.bonescamail.nl/click/2n3cr333haoatrd.html ) JONA COOKED PEELED CAT TIGER SHRIMPS 100/200 25 X 400 GRS / 300 GRS NET WEIGHT (25 % GLAZED) 1 BOX 1,99 10 BOX 1,89 PALET (63 BOX) € 1,79 PER BAG!!! [ Click here to find all our latest offers ]( http://r.newsletter.bonescamail.nl/click/2n3cr3349qoatrd.html ) This email was sent to d...@openvswitch.org You received this email because you are registered with Bonesca Import en Export BV [ Unsubscribe here ]( http://r.newsletter.bonescamail.nl/nru6rrryoatrg.html ) Sent by [ ]( http://r.newsletter.bonescamail.nl/click/2n3cr33526oatrd.html ) © 2017 Bonesca Import en Export BV ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] windows, python: set the reset event to automatic
I forgot to mention why you need the events on manual reset: https://msdn.microsoft.com/en-us/library/windows/desktop/ms684342(v=vs.85).a spx "Functions such as ReadFile and WriteFile set this handle to the nonsignaled state before they begin an I/O operation. When the operation has completed, the handle is set to the signaled state. Functions such as GetOverlappedResult and the synchronization wait functions reset auto-reset events to the nonsignaled state. Therefore, you should use a manual reset event; if you use an auto-reset event, your application can stop responding if you wait for the operation to complete and then call GetOverlappedResult with the bWait parameter set to TRUE." We should probably document this in both C and Python code when creating the event. Thanks, Alin. > -Original Message- > From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev- > boun...@openvswitch.org] On Behalf Of aserd...@ovn.org > Sent: Wednesday, August 23, 2017 1:12 PM > To: 'Russell Bryant' ; Alin Balutoiu > > Cc: d...@openvswitch.org > Subject: Re: [ovs-dev] [PATCH] windows, python: set the reset event to > automatic > > Me and Alin talked about this offline. > > In the case of named pipes you need events with manual reset and initial > signaled state. > > We need a different event when dealing with sockets on Windows, he will > address this in a new patch. > > While looking over the code we see another good reason for using ctypes > instead of pypiwin32 (the implementation of ReadFile). > > Also the code needs a bit of cleanup. > > Thanks, > Alin. > > > -Original Message- > > From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev- > > boun...@openvswitch.org] On Behalf Of Russell Bryant > > Sent: Tuesday, August 22, 2017 6:35 PM > > To: Alin Balutoiu > > Cc: d...@openvswitch.org > > Subject: Re: [ovs-dev] [PATCH] windows, python: set the reset event to > > automatic > > > > On Tue, Aug 22, 2017 at 8:41 AM, Alin Balutoiu > > wrote: > > > Comments answered inline. > > > > > > Thanks, > > > Alin Balutoiu. > > > > > ___ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] windows, python: Fix event type returned from poller
I also think that using ctypes instead of pypiwin32 would be a good idea. In this way we remove all the confusion caused by the implementation of wrappers from pypiwin32 and remove the dependency on the pypiwin32 module. Example for WaitForMultipleObjects where the wrapper from pypiwin32 raises an exception if the function returns WAIT_FAILED and returns the error code otherwise: https://github.com/pywin32/pypiwin32/blob/master/win32/src/win32event.i#L340-L343 For ReadFile, if we pass an overlapped structure and the ReadFile finished syncronously we will have to make an extra call to GetOverlappedResult to get the read number of bytes. This is due to the fact that the number of bytes read is not returned by the wrapper implementation from pypiwin32 and the buffer is not resized if an overlapped structure is passed. The implementation can be viewed here: https://github.com/pywin32/pypiwin32/blob/master/win32/src/win32file.i#L896-L987 > -Original Message- > From: aserd...@ovn.org [mailto:aserd...@ovn.org] > Sent: Wednesday, August 23, 2017 12:05 PM > To: 'Russell Bryant' ; Alin Balutoiu > > Cc: d...@openvswitch.org > Subject: RE: [ovs-dev] [PATCH] windows, python: Fix event type returned > from poller > > Does it make sense to use python ctypes instead of pypiwin32? > > It seems better to make our own calls instead of using wrappers which treat > just certain cases. > > Acked-by: Alin Gabriel Serdean > > > -Original Message- > > From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev- > > boun...@openvswitch.org] On Behalf Of Russell Bryant > > Sent: Tuesday, August 22, 2017 6:34 PM > > To: Alin Balutoiu > > Cc: d...@openvswitch.org > > Subject: Re: [ovs-dev] [PATCH] windows, python: Fix event type > > returned from poller > > > > Ah, that makes sense for the Python version. > > > > The Python docs were sparse, so I wasn't sure. > > > > On Tue, Aug 22, 2017 at 8:35 AM, Alin Balutoiu > > wrote: > > > Looking at the implementation of WaitForMultipleObjects in Python it > > > looks that the call will raise an exception if WAIT_FAILED is > > > returned. This case is treated by the try/except block around the > > WaitForMultipleObjects function. > > > > > > Thanks, > > > Alin Balutoiu. > > > ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] windows, python: Remove code duplication in send/recv functions
Acked-by: Alin Gabriel Serdean > -Original Message- > From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev- > boun...@openvswitch.org] On Behalf Of Alin Balutoiu > Sent: Tuesday, August 22, 2017 1:47 PM > To: d...@openvswitch.org > Subject: [ovs-dev] [PATCH] windows, python: Remove code duplication in > send/recv functions > > Move the return value at the end of the function regardless of the > pending/non-pending operation. > > Signed-off-by: Alin Balutoiu > --- > python/ovs/stream.py | 78 - > --- > 1 file changed, 36 insertions(+), 42 deletions(-) > > diff --git a/python/ovs/stream.py b/python/ovs/stream.py index > f82a449..717ea18 100644 > --- a/python/ovs/stream.py > +++ b/python/ovs/stream.py > @@ -321,11 +321,6 @@ class Stream(object): > self._read, > False) > self._read_pending = False > -recvBuffer = self._read_buffer[:nBytesRead] > -# recvBuffer will have the type memoryview in Python3. > -# We can use bytes to convert it to type bytes which works on > -# both Python2 and Python3. > -return (0, bytes(recvBuffer)) > except pywintypes.error as e: > if e.winerror == winutils.winerror.ERROR_IO_INCOMPLETE: > # The operation is still pending, try again @@ -336,30 +331,31 @@ > class Stream(object): > return (0, "") > else: > return (errno.EINVAL, "") > -(errCode, self._read_buffer) = winutils.read_file(self.pipe, > - n, > - self._read) > -if errCode: > -if errCode == winutils.winerror.ERROR_IO_PENDING: > -self._read_pending = True > -return (errno.EAGAIN, "") > -elif errCode in winutils.pipe_disconnected_errors: > -# If the pipe was disconnected, return 0. > -return (0, "") > -else: > -return (errCode, "") > +else: > +(errCode, self._read_buffer) = winutils.read_file(self.pipe, > + n, > + self._read) > +if errCode: > +if errCode == winutils.winerror.ERROR_IO_PENDING: > +self._read_pending = True > +return (errno.EAGAIN, "") > +elif errCode in winutils.pipe_disconnected_errors: > +# If the pipe was disconnected, return 0. > +return (0, "") > +else: > +return (errCode, "") > > -try: > -nBytesRead = winutils.get_overlapped_result(self.pipe, > -self._read, > -False) > -winutils.win32event.SetEvent(self._read.hEvent) > -except pywintypes.error as e: > -if e.winerror in winutils.pipe_disconnected_errors: > -# If the pipe was disconnected, return 0. > -return (0, "") > -else: > -return (e.winerror, "") > +try: > +nBytesRead = winutils.get_overlapped_result(self.pipe, > +self._read, > +False) > +winutils.win32event.SetEvent(self._read.hEvent) > +except pywintypes.error as e: > +if e.winerror in winutils.pipe_disconnected_errors: > +# If the pipe was disconnected, return 0. > +return (0, "") > +else: > +return (e.winerror, "") > > recvBuffer = self._read_buffer[:nBytesRead] > # recvBuffer will have the type memoryview in Python3. > @@ -406,7 +402,6 @@ class Stream(object): > self._write, > False) > self._write_pending = False > -return nBytesWritten > except pywintypes.error as e: > if e.winerror == winutils.winerror.ERROR_IO_INCOMPLETE: > # The operation is still pending, try again @@ -417,19 +412,18 @@ > class Stream(object): > return -errno.ECONNRESET > else: > return -errno.EINVAL > - > -self._write_pending = False > -(errCode, nBytesWritten) = winutils.write_file(self.pipe, > - buf, > -
Re: [ovs-dev] [PATCH] windows, python: set the reset event to automatic
Me and Alin talked about this offline. In the case of named pipes you need events with manual reset and initial signaled state. We need a different event when dealing with sockets on Windows, he will address this in a new patch. While looking over the code we see another good reason for using ctypes instead of pypiwin32 (the implementation of ReadFile). Also the code needs a bit of cleanup. Thanks, Alin. > -Original Message- > From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev- > boun...@openvswitch.org] On Behalf Of Russell Bryant > Sent: Tuesday, August 22, 2017 6:35 PM > To: Alin Balutoiu > Cc: d...@openvswitch.org > Subject: Re: [ovs-dev] [PATCH] windows, python: set the reset event to > automatic > > On Tue, Aug 22, 2017 at 8:41 AM, Alin Balutoiu > wrote: > > Comments answered inline. > > > > Thanks, > > Alin Balutoiu. > > ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] windows, python: Fix event type returned from poller
Does it make sense to use python ctypes instead of pypiwin32? It seems better to make our own calls instead of using wrappers which treat just certain cases. Acked-by: Alin Gabriel Serdean > -Original Message- > From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev- > boun...@openvswitch.org] On Behalf Of Russell Bryant > Sent: Tuesday, August 22, 2017 6:34 PM > To: Alin Balutoiu > Cc: d...@openvswitch.org > Subject: Re: [ovs-dev] [PATCH] windows, python: Fix event type returned > from poller > > Ah, that makes sense for the Python version. > > The Python docs were sparse, so I wasn't sure. > > On Tue, Aug 22, 2017 at 8:35 AM, Alin Balutoiu > wrote: > > Looking at the implementation of WaitForMultipleObjects in Python it > > looks that the call will raise an exception if WAIT_FAILED is > > returned. This case is treated by the try/except block around the > WaitForMultipleObjects function. > > > > Thanks, > > Alin Balutoiu. > > ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v2] ovn: Fix BFD error config on gateway
Reviewed change and it looks fine. Also tested the test case. On Sun, Aug 20, 2017 at 8:07 PM, Zhenyu Gao wrote: > The bfd_calculate_chassis function calculates gateway's peer datapaths > to figure out which tunnel's BFD should be enabled to from the current > chassis. > Existing algorithm only calculats peer datapaths at one hop, but multiple > logical switches and E/W routers could be in the path, making several hops > which were not considered on the calculation. > It may disable BFD on some gw's tunnel ports. Then a port on a remote ovs > cannot send packet out because it believes all remote gateways are down. > > This patch will go through whole graph and visit all datapath's port > which has connection with gateways. > > Signed-off-by: Zhenyu Gao > --- > > v1->v2: Address Miguel's comments and add ovn test > > ovn/controller/bfd.c | 102 +- > tests/ovn.at | 203 ++ > - > 2 files changed, 286 insertions(+), 19 deletions(-) > > diff --git a/ovn/controller/bfd.c b/ovn/controller/bfd.c > index d1448b1..dccfc98 100644 > --- a/ovn/controller/bfd.c > +++ b/ovn/controller/bfd.c > @@ -100,6 +100,88 @@ bfd_calculate_active_tunnels(const struct > ovsrec_bridge *br_int, > } > } > > +struct local_datapath_node { > +struct ovs_list node; > +struct local_datapath *dp; > +}; > + > +static void > +bfd_travel_gw_related_chassis(struct local_datapath *dp, > + const struct hmap *local_datapaths, > + struct ovsdb_idl_index_cursor *cursor, > + struct sbrec_port_binding *lpval, > + struct sset *bfd_chassis) > +{ > +struct ovs_list dp_list; > +const struct sbrec_port_binding *pb; > +struct sset visited_dp = SSET_INITIALIZER(&visited_dp); > +const char *dp_key; > +struct local_datapath_node *dp_binding; > + > +if (!(dp_key = smap_get(&dp->datapath->external_ids, > "logical-router")) && > +!(dp_key = smap_get(&dp->datapath->external_ids, > "logical-switch"))) { > +VLOG_INFO("datapath has no uuid, cannot travel graph"); > +return; > +} > + > +sset_add(&visited_dp, dp_key); > + > +ovs_list_init(&dp_list); > +dp_binding = xmalloc(sizeof *dp_binding); > +dp_binding->dp = dp; > +ovs_list_push_back(&dp_list, &dp_binding->node); > + > +/* > + * Go through whole graph to figure out all chassis which may deliver > + * packetsto gateway. */ > +do { > +dp_binding = CONTAINER_OF(ovs_list_pop_front(&dp_list), > + struct local_datapath_node, node); > +dp = dp_binding->dp; > +free(dp_binding); > +for (size_t i = 0; i < dp->n_peer_dps; i++) { > +const struct sbrec_datapath_binding *pdp = dp->peer_dps[i]; > +if (!pdp) { > +continue; > +} > + > +if (!(dp_key = smap_get(&pdp->external_ids, > "logical-router")) && > +!(dp_key = smap_get(&pdp->external_ids, > "logical-switch"))) { > +continue; > +} > + > +if (sset_contains(&visited_dp, dp_key)) { > +continue; > +} > + > +sset_add(&visited_dp, dp_key); > + > +struct hmap_node *node = hmap_first_with_hash(local_ > datapaths, > + > pdp->tunnel_key); > +if (!node) { > +continue; > +} > + > +dp_binding = xmalloc(sizeof *dp_binding); > +dp_binding->dp = CONTAINER_OF(node, struct local_datapath, > + hmap_node); > +ovs_list_push_back(&dp_list, &dp_binding->node); > + > +sbrec_port_binding_index_set_datapath(lpval, pdp); > +SBREC_PORT_BINDING_FOR_EACH_EQUAL (pb, cursor, lpval) { > +if (pb->chassis) { > +const char *chassis_name = pb->chassis->name; > +if (chassis_name) { > +sset_add(bfd_chassis, chassis_name); > +} > +} > +} > +} > +} while (!ovs_list_is_empty(&dp_list)); > + > +sset_destroy(&visited_dp); > +} > + > static void > bfd_calculate_chassis(struct controller_ctx *ctx, >const struct sbrec_chassis *our_chassis, > @@ -155,24 +237,8 @@ bfd_calculate_chassis(struct controller_ctx *ctx, > } > } > if (our_chassis_is_gw_for_dp) { > -for (size_t i = 0; i < dp->n_peer_dps; i++) { > -const struct sbrec_datapath_binding *pdp = > dp->peer_dps[i]; > -if (!pdp) { > -continue; > -} > - > -sbrec_port_binding_index_set_datapath(lpval, pdp); > -SBREC_PORT_BINDING_FOR_EACH_EQUAL (pb, &cursor, lpval) { > -if (pb->chassi
[ovs-dev] how to run script in *.at file in tests folder?
Hi all, I'm testing, I add a new at file, and in that file, I run another shell script like this: AT_SETUP([vhost - run prepare-env.sh]) > #OVS_VSWITCHD_START > # No need to create, as bond1 has been created. > echo "@" > /root/gangyewei/mvs/mvs/tests/prepare-env.sh > AT_CHECK([echo ""]) > AT_CHECK([/root/gangyewei/mvs/mvs/tests/prepare_env], [0], [stdout]) > #OVS_VSWITCHD_STOP > AT_CLEANUP I use root to run the tests, But the log show me permission error, how to fix this bug and how to run shell script in at file? # -*- compilation -*- > 1. vhost.at:23: testing vhost - run prepare-env.sh ... > @ > /root/gangyewei/mvs/mvs/tests/testsuite.dir/at-groups/1/test-source: line > 12: /root/gangyewei/mvs/mvs/tests/prepare-env.sh: Permission denied > ./vhost.at:30: echo "" > --- /dev/null 2017-02-21 23:39:22.88249 +0800 > +++ /root/gangyewei/mvs/mvs/tests/testsuite.dir/at-groups/1/stdout > 2017-08-23 15:43:29.348993538 +0800 > @@ -0,0 +1 @@ > + > 1. vhost.at:23: 1. vhost - run prepare-env.sh (vhost.at:23): FAILED ( > vhost.at:30) Thank you~ ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH net-next v4] openvswitch: enable NSH support
On Tue, 22 Aug 2017 17:38:26 +0800, Yang, Yi wrote: > I checked GSO support code, we have two cases to handle, one case is to > output NSH packet to VxLAN-gpe port, the other case is to output NSH packet > to Ethernet port (tap, veth, vhost, physical eth, etc.), I don't think > it is a good way to do GSO support in OVS module, because we can't know > the final output port at this point, if the final output port is > VxLAN-gpe port, it will still face GSO issue, but I didn't see vxlan > module handles this, maybe udp tunnel did this. I think a NSH packet can > be split into two packets, one has NSH header, another one doesn't, > so I'm not sure how vxlan module can avoid this situation. As I said before, by either implementing GSO support for NSH or by segmenting in software before pushing the NSH header. In the first case, the packet will be software segmented before being pushed to the hardware, in the second case, it will be software segmented before pushing the NSH header. > For the second case, we can add a nsh GSO module in net/nsh/nsh_gso.c > (as MPLS did in net/mpls/mpls_gso.c), that is the best way to handle > this, what do you think about it? Yes, that's one of the two options that we've been talking about. It's the better and the more efficient one. Jiri ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev