[ovs-dev] 答复: RE: 答复: Re: 答复: Re: 答复: Re: [PATCH] ovn-controller: Support vxlan tunnel in ovn
ovn-controller-vtep could only be used to forwards traffic between networks managed by openstack and physical network openstack not managed. We want to use ovn in the scenary that ovs-computer node and sriov computer node all managed by openstack. Waiting for your suggestions. Thanks Macauley Cheng 2017/05/09 11:16 收件人:"wang.qia...@zte.com.cn" , "ovs-dev-boun...@openvswitch.org" , "b...@ovn.org" , "mickeys@gmail.com" , 抄送: ovs dev , "xu.r...@zte.com.cn" , "ovs-dev-boun...@openvswitch.org" 主题: RE: [ovs-dev] 答复: Re: 答复: Re: 答复: Re: [PATCH] ovn-controller: Support vxlan tunnel in ovn Hi Qianyu, Did you try the ovn-controller-vtep before? You can use the ovn-controller-vtep with HW switch to setup the vxlan tunnel with host which run ovn-controller. -Original Message- From: ovs-dev-boun...@openvswitch.org [ mailto:ovs-dev-boun...@openvswitch.org] On Behalf Of wang.qia...@zte.com.cn Sent: Tuesday, May 09, 2017 9:29 AM To: ovs-dev-boun...@openvswitch.org; b...@ovn.org; mickeys@gmail.com Cc: ovs dev; xu.r...@zte.com.cn; ovs-dev-boun...@openvswitch.org Subject: [ovs-dev] 答复: Re: 答复: Re: 答复: Re: [PATCH] ovn-controller: Support vxlan tunnel in ovn Hi Ben and Mikey, thank you for your review and analysis. As we discribed below, vxlan is used very common, and the use case that we mentioned below is typical architecture of telecom operators' networks. So, we think it is very necessary to support vxlan between ovs and HW switch. If ovn does not support the use case we discribed before, we think the the ovn features may be imperfect. For this reason, we do the changes as this patch. Because vxlan tunnel can not carry enough information like genev to fulfill ovn current pipeline, so we do some assumptions. As Michkey and Ben said, these assumptions may not hold going forward, such as S_SWITCH_IN_L2_LKUP and more than one chassisredirect ports and so on. We think these incompatibles are not impossibly hard problems. If we have the consensus of the necessity to support vxlan, we think we could together to talk over a blueprint that less influence of current planning features and more effective to solve the problem we mentioned. Waiting for your suggestions,Thanks Mickey Spiegel 发件人: ovs-dev-boun...@openvswitch.org 2017/05/08 04:58 收件人:xu.r...@zte.com.cn, 抄送: ovs dev , ovs-dev-boun...@openvswitch.org 主题: Re: [ovs-dev] 答复: Re: 答复: Re: [PATCH] ovn-controller: Support vxlan tunnel in ovn There are some assumptions that you are making which need to be called out. These assumptions may not hold going forward. In fact I refer to two different patches below that are currently under review, that break your assumptions. On Fri, May 5, 2017 at 7:18 PM, wrote: > Hi,Russell > > We think vxlan is the most commonly used tunnel encapsulation in the > overlay network openstack,ovn should better consider it. > > As my workmate wang qianyu said,we would consider computer node connect > with existing hardware switches which associates with SR-IOV as VTEP. > > After discussion, we feel that as long as the following changes for vxlan > tunnel in the table0: > > 1.For local switch, move MFF_TUN_ID to MFF_LOG_DATAPATH, resubmit to > OFTABLE_ETH_UCAST(table29) > It looks like you are overloading OFTABLE_ETH_UCAST that you define here with S_SWITCH_IN_L2_LKUP in ovn/northd/ovn-northd.c. Hardcoding the value to 29 in ovn/controller/lflow.h is not the way to do this. This pipeline stage does move back as new features are added. In fact it is now table 31 due to the recent addition of 2 tables for DNS lookup. > //---In table29, we can find out dst port based on dst mac > You are assuming that outport determination is only based on S_SWITCH_IN_L2_LKUP with no impact from any other ingress pipeline stages. This may not always be true, which I think is the point of Ben's complaint. For example the SFC patch that is currently under review ( http://patchwork.ozlabs.org/patch/754427/) may set outport and then do "output" in the ingress pipeline, in a pipeline stage other than S_SWITCH_IN_L2_LKUP. The alternative is to go through the entire ingress pipeline, but here you have a problem since you do not know the inport. The current VTEP-centric VXLAN code assumes that there is only one port binding per datapath from the VTEP chassis. For the general case that you are trying to address, this assumption does not hold, so you cannot properly determine the inport. The inport may affect the ultimate decision on outport. This is certainly the case for the SFC patch currently under review. You are also assuming that inport does not affect anything in the egress pipeline. This seems to be true at the moment, but this might not always be true as features are added. The existing VTEP functionality does not rely on the assumptions that you made, but since you changed the logic to determine inport in case o
Re: [ovs-dev] [PATCH 1/2] Revert "tunneling: Avoid recirculation on datapath."
On 8 May 2017 at 17:35, William Tu wrote: > Hi Joe and Greg, > > Maybe it's better I put this revert tunneling patch (1/2) and its > tunnel-tests (2/2) in one patch, so the "make check" can pass? They can be separate. It's currently broken, the revert will fix it. The test can be an independent submission. I'd rather not fold yet more changes into the revert. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] 答复: Re: 答复: Re: 答复: Re: [PATCH] ovn-controller: Support vxlan tunnel in ovn
Hi Qianyu, Did you try the ovn-controller-vtep before? You can use the ovn-controller-vtep with HW switch to setup the vxlan tunnel with host which run ovn-controller. -Original Message- From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-boun...@openvswitch.org] On Behalf Of wang.qia...@zte.com.cn Sent: Tuesday, May 09, 2017 9:29 AM To: ovs-dev-boun...@openvswitch.org; b...@ovn.org; mickeys@gmail.com Cc: ovs dev; xu.r...@zte.com.cn; ovs-dev-boun...@openvswitch.org Subject: [ovs-dev] 答复: Re: 答复: Re: 答复: Re: [PATCH] ovn-controller: Support vxlan tunnel in ovn Hi Ben and Mikey, thank you for your review and analysis. As we discribed below, vxlan is used very common, and the use case that we mentioned below is typical architecture of telecom operators' networks. So, we think it is very necessary to support vxlan between ovs and HW switch. If ovn does not support the use case we discribed before, we think the the ovn features may be imperfect. For this reason, we do the changes as this patch. Because vxlan tunnel can not carry enough information like genev to fulfill ovn current pipeline, so we do some assumptions. As Michkey and Ben said, these assumptions may not hold going forward, such as S_SWITCH_IN_L2_LKUP and more than one chassisredirect ports and so on. We think these incompatibles are not impossibly hard problems. If we have the consensus of the necessity to support vxlan, we think we could together to talk over a blueprint that less influence of current planning features and more effective to solve the problem we mentioned. Waiting for your suggestions,Thanks Mickey Spiegel 发件人: ovs-dev-boun...@openvswitch.org 2017/05/08 04:58 收件人:xu.r...@zte.com.cn, 抄送: ovs dev , ovs-dev-boun...@openvswitch.org 主题: Re: [ovs-dev] 答复: Re: 答复: Re: [PATCH] ovn-controller: Support vxlan tunnel in ovn There are some assumptions that you are making which need to be called out. These assumptions may not hold going forward. In fact I refer to two different patches below that are currently under review, that break your assumptions. On Fri, May 5, 2017 at 7:18 PM, wrote: > Hi,Russell > > We think vxlan is the most commonly used tunnel encapsulation in the > overlay network openstack,ovn should better consider it. > > As my workmate wang qianyu said,we would consider computer node connect > with existing hardware switches which associates with SR-IOV as VTEP. > > After discussion, we feel that as long as the following changes for vxlan > tunnel in the table0: > > 1.For local switch, move MFF_TUN_ID to MFF_LOG_DATAPATH, resubmit to > OFTABLE_ETH_UCAST(table29) > It looks like you are overloading OFTABLE_ETH_UCAST that you define here with S_SWITCH_IN_L2_LKUP in ovn/northd/ovn-northd.c. Hardcoding the value to 29 in ovn/controller/lflow.h is not the way to do this. This pipeline stage does move back as new features are added. In fact it is now table 31 due to the recent addition of 2 tables for DNS lookup. > //---In table29, we can find out dst port based on dst mac > You are assuming that outport determination is only based on S_SWITCH_IN_L2_LKUP with no impact from any other ingress pipeline stages. This may not always be true, which I think is the point of Ben's complaint. For example the SFC patch that is currently under review ( http://patchwork.ozlabs.org/patch/754427/) may set outport and then do "output" in the ingress pipeline, in a pipeline stage other than S_SWITCH_IN_L2_LKUP. The alternative is to go through the entire ingress pipeline, but here you have a problem since you do not know the inport. The current VTEP-centric VXLAN code assumes that there is only one port binding per datapath from the VTEP chassis. For the general case that you are trying to address, this assumption does not hold, so you cannot properly determine the inport. The inport may affect the ultimate decision on outport. This is certainly the case for the SFC patch currently under review. You are also assuming that inport does not affect anything in the egress pipeline. This seems to be true at the moment, but this might not always be true as features are added. The existing VTEP functionality does not rely on the assumptions that you made, but since you changed the logic to determine inport in case of VXLAN, you are changing existing functionality. > 2.For local chassisredirect port, move MFF_TUN_ID to MFF_LOG_DATAPATH, set > port tunnel_key to MFF_LOG_OUTPORT and then resubmit to > OFTABLE_LOCAL_OUTPUT. > //---In table 33, we can find out dst local sw and sw patch port based on > the local chassisredirect port,and then follow the exsiting flows. > At the moment, the only case where a tunnel is used for a datapath representing a logical router is when the outport is a chassisredirect port. Your code assumes that will always be the case. If we do what you are suggesting, then that becomes a restriction for all logical router features going
Re: [ovs-dev] [PATCH 1/2] Revert "tunneling: Avoid recirculation on datapath."
On Mon, 2017-05-08 at 17:35 -0700, William Tu wrote: > Hi Joe and Greg, > > Maybe it's better I put this revert tunneling patch (1/2) and its > tunnel-tests (2/2) in one patch, so the "make check" can pass? > > Regards, > William I don't understand... the revert has to be it's own separate patch so that it can address the commit specified. I'm not sure about the other implied policy here - are we not able to revert a patch and then commit another one because in between those two commits a 'make check' will fail? - Greg > > On Mon, May 8, 2017 at 11:34 AM, Greg Rose wrote: > > On Mon, 2017-05-08 at 11:15 -0700, Joe Stringer wrote: > >> This reverts commit f1dac5128ce6db2e493f0d1c7a8b53fb9f34476f. When this > >> commit was introduced, it broke the 'make check-system-userspace' > >> testsuite. It appears that the new translation fails to modify the flow > >> in a way that would represent the flow as an encapsulated flow when the > >> traffic is patched through to the second bridge. As such, rather than > >> matching on, for example, "ip,proto=47" for gre, it would use the inner > >> packet's flow headers. It also results in problems reporting statistics, > >> as the tunnel's header is not reflected in subsequent statistics and > >> truncation is not properly applied during translation. > >> > >> While a refreshed approach to solving the above problem is formed, > >> revert this patch. > >> > >> Reported-at: > >> https://mail.openvswitch.org/pipermail/ovs-dev/2017-May/331972.html > >> Signed-off-by: Joe Stringer > > > > Acked-by: Greg Rose > > > >> --- > >> lib/dpif-netdev.c | 18 ++- > >> ofproto/ofproto-dpif-xlate.c | 280 > >> -- > >> tests/ofproto-dpif.at | 11 +- > >> tests/ovn.at | 6 +- > >> tests/tunnel-push-pop-ipv6.at | 10 +- > >> tests/tunnel-push-pop.at | 12 +- > >> 6 files changed, 173 insertions(+), 164 deletions(-) > >> > >> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c > >> index 4ee5d058aff8..d21515657634 100644 > >> --- a/lib/dpif-netdev.c > >> +++ b/lib/dpif-netdev.c > >> @@ -4970,8 +4970,24 @@ dp_execute_cb(void *aux_, struct dp_packet_batch > >> *packets_, > >> > >> case OVS_ACTION_ATTR_TUNNEL_PUSH: > >> if (*depth < MAX_RECIRC_DEPTH) { > >> +struct dp_packet_batch tnl_pkt; > >> +struct dp_packet_batch *orig_packets_ = packets_; > >> +int err; > >> + > >> +if (!may_steal) { > >> +dp_packet_batch_clone(&tnl_pkt, packets_); > >> +packets_ = &tnl_pkt; > >> +dp_packet_batch_reset_cutlen(orig_packets_); > >> +} > >> + > >> dp_packet_batch_apply_cutlen(packets_); > >> -push_tnl_action(pmd, a, packets_); > >> + > >> +err = push_tnl_action(pmd, a, packets_); > >> +if (!err) { > >> +(*depth)++; > >> +dp_netdev_recirculate(pmd, packets_); > >> +(*depth)--; > >> +} > >> return; > >> } > >> break; > >> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c > >> index b308f21de100..bc3a310227da 100644 > >> --- a/ofproto/ofproto-dpif-xlate.c > >> +++ b/ofproto/ofproto-dpif-xlate.c > >> @@ -425,10 +425,6 @@ static void xlate_action_set(struct xlate_ctx *ctx); > >> static void xlate_commit_actions(struct xlate_ctx *ctx); > >> > >> static void > >> -apply_nested_clone_actions(struct xlate_ctx *ctx, const struct xport > >> *in_dev, > >> - struct xport *out_dev); > >> - > >> -static void > >> ctx_trigger_freeze(struct xlate_ctx *ctx) > >> { > >> ctx->exit = true; > >> @@ -3215,17 +3211,7 @@ build_tunnel_send(struct xlate_ctx *ctx, const > >> struct xport *xport, > >> } > >> tnl_push_data.tnl_port = odp_to_u32(tunnel_odp_port); > >> tnl_push_data.out_port = odp_to_u32(out_dev->odp_port); > >> - > >> -size_t push_action_size = 0; > >> -size_t clone_ofs = nl_msg_start_nested(ctx->odp_actions, > >> - OVS_ACTION_ATTR_CLONE); > >> odp_put_tnl_push_action(ctx->odp_actions, &tnl_push_data); > >> -push_action_size = ctx->odp_actions->size; > >> -apply_nested_clone_actions(ctx, xport, out_dev); > >> -if (ctx->odp_actions->size > push_action_size) { > >> -/* Update the CLONE action only when combined */ > >> -nl_msg_end_nested(ctx->odp_actions, clone_ofs); > >> -} > >> return 0; > >> } > >> > >> @@ -3261,136 +3247,6 @@ xlate_flow_is_protected(const struct xlate_ctx > >> *ctx, const struct flow *flow, co > >> xport_in->xbundle->protected && > >> xport_out->xbundle->protected); > >> } > >> > >> -/* Populate and apply nested actions on 'out_dev'. > >> - * The nested actions are applied on cloned packets in dp while > >> outputting to > >> - * either patch or tunnel ports. > >> - * On output to
Re: [ovs-dev] [PATCHv2] tunnel-tests: Add test to match tunnel traffic.
If you add this second incremental, it is explicit that you want to match on the outer header as part of this test (to check the present breakage). It also makes it clear which packets are involved and needed in this test. diff --git a/tests/tunnel-push-pop.at b/tests/tunnel-push-pop.at index 648b131..1b9b728 100644 --- a/tests/tunnel-push-pop.at +++ b/tests/tunnel-push-pop.at @@ -247,7 +247,7 @@ AT_CHECK([ovs-appctl netdev-dummy/ip4addr br0 1.1.2.88/24], [0], [OK ]) AT_CHECK([ovs-appctl ovs/route/add 1.1.2.92/24 br0], [0], [OK ]) -AT_CHECK([ovs-ofctl add-flow br0 'priority=1,action=normal']) +AT_CHECK([ovs-ofctl add-flow br0 'arp,priority=1,action=normal']) dnl Use arp reply to achieve tunnel next hop mac binding AT_CHECK([ovs-appctl netdev-dummy/receive br0 'recirc_id(0),in_port(100),eth(src=f8:bc:12:44:34:b6,dst=aa:55:aa:55:00:00),eth_type(0x0806),arp(sip=1.1.2.92,tip @@ -261,7 +261,7 @@ Listening ports: gre_sys (3) ]) -AT_CHECK([ovs-ofctl add-flow br0 'priority=99,udp,action=normal']) +AT_CHECK([ovs-ofctl add-flow br0 'ip,ip_src=1.1.2.88,priority=99,action=normal']) dnl Check GRE tunnel push AT_CHECK([ovs-ofctl add-flow int-br action=3]) @@ -273,8 +273,8 @@ AT_CHECK([tail -1 stdout], [0], AT_CHECK([ovs-appctl netdev-dummy/receive int-br '5054000a505400091234']) AT_CHECK([ovs-ofctl dump-flows br0 | ofctl_strip | sort], [0], [dnl) - n_packets=2, n_bytes=98, priority=1 actions=NORMAL - priority=99,udp actions=NORMAL + n_packets=1, n_bytes=42, priority=1,arp actions=NORMAL + n_packets=1, n_bytes=56, priority=99,ip,nw_src=1.1.2.88 actions=NORMAL NXST_FLOW reply: ]) On 5/8/17, 5:27 PM, "ovs-dev-boun...@openvswitch.org on behalf of William Tu" wrote: This test highlights a bug that was affecting master up until the previous patch. Put simply, we have two bridges: an integration bridge which contains a tunnel, and a physical bridge for underlay network connectivity. This test simulates putting UDP traffic through the integration bridge, with the intention to apply GRE tunnel headers and send the packet through the underlay bridge. The underlay bridge should observe GRE traffic. Signed-off-by: William Tu Signed-off-by: Joe Stringer Acked-by: Greg Rose --- tests/tunnel-push-pop.at | 55 1 file changed, 55 insertions(+) diff --git a/tests/tunnel-push-pop.at b/tests/tunnel-push-pop.at index 294d28a2416d..d7936ddc7c1f 100644 --- a/tests/tunnel-push-pop.at +++ b/tests/tunnel-push-pop.at @@ -225,3 +225,58 @@ OVS_WAIT_UNTIL([test `ovs-pcap p0.pcap | grep 5054000a505400091235 | wc OVS_VSWITCHD_STOP AT_CLEANUP + +AT_SETUP([tunnel_push_pop - matching on physical bridge]) + +OVS_VSWITCHD_START([add-port br0 p0 -- set Interface p0 type=dummy ofport_request=1 other-config:hwaddr=aa:55:aa:55:00:00]) +AT_CHECK([ovs-vsctl add-br int-br -- set bridge int-br datapath_type=dummy], [0]) +AT_CHECK([ovs-vsctl add-port int-br t1 -- set Interface t1 type=gre \ + options:remote_ip=1.1.2.92 options:key=456 ofport_request=3], [0]) + +AT_CHECK([ovs-appctl dpif/show], [0], [dnl +dummy@ovs-dummy: hit:0 missed:0 + br0: + br0 65534/100: (dummy-internal) + p0 1/1: (dummy) + int-br: + int-br 65534/2: (dummy-internal) + t1 3/3: (gre: key=456, remote_ip=1.1.2.92) +]) + +AT_CHECK([ovs-appctl netdev-dummy/ip4addr br0 1.1.2.88/24], [0], [OK +]) +AT_CHECK([ovs-appctl ovs/route/add 1.1.2.92/24 br0], [0], [OK +]) +AT_CHECK([ovs-ofctl add-flow br0 'priority=1,action=normal']) + +dnl Use ARP reply to achieve tunnel next hop mac binding +AT_CHECK([ovs-appctl netdev-dummy/receive br0 'recirc_id(0),in_port(100),eth(src=f8:bc:12:44:34:b6,dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),arp(sip=1.1.2.92,tip=1.1.2.88,op=2,sha=f8:bc:12:44:34:b6,tha=00:00:00:00:00:00)']) + +AT_CHECK([ovs-appctl tnl/neigh/show | tail -n+3 | sort], [0], [dnl +1.1.2.92 f8:bc:12:44:34:b6 br0 +]) + +AT_CHECK([ovs-appctl tnl/ports/show |sort], [0], [dnl +Listening ports: +gre_sys (3) +]) + +AT_CHECK([ovs-ofctl add-flow br0 'priority=99,udp,action=normal']) + +dnl Check GRE tunnel push +AT_CHECK([ovs-ofctl add-flow int-br action=3]) + +AT_CHECK([ovs-appctl ofproto/trace ovs-dummy 'in_port(2),eth_type(0x0800),ipv4(src=1.1.3.88,dst=1.1.3.112,proto=17,tos=0,ttl=64,frag=no),udp(src=51283,dst=4789)'], [0], [stdout]) +AT_CHECK([tail -1 stdout], [0], + [Datapath actions: tnl_push(tnl_port(3),header(size=42,type=3,eth(dst=f8:bc:12:44:34:b6,src=aa:55:aa:55:00:00,dl_type=0x0800),ipv4(src=1.1.2.88,dst=1.1.2.92,proto=47,tos=0,ttl=64,frag=0x4000),gre((flags=0x2000,proto=0x6558),key=0x1c8)),out_port(100)) +]) + +
[ovs-dev] 答复: Re: 答复: Re: 答复: Re: [PATCH] ovn-controller: Support vxlan tunnel in ovn
Hi Ben and Mikey, thank you for your review and analysis. As we discribed below, vxlan is used very common, and the use case that we mentioned below is typical architecture of telecom operators' networks. So, we think it is very necessary to support vxlan between ovs and HW switch. If ovn does not support the use case we discribed before, we think the the ovn features may be imperfect. For this reason, we do the changes as this patch. Because vxlan tunnel can not carry enough information like genev to fulfill ovn current pipeline, so we do some assumptions. As Michkey and Ben said, these assumptions may not hold going forward, such as S_SWITCH_IN_L2_LKUP and more than one chassisredirect ports and so on. We think these incompatibles are not impossibly hard problems. If we have the consensus of the necessity to support vxlan, we think we could together to talk over a blueprint that less influence of current planning features and more effective to solve the problem we mentioned. Waiting for your suggestions,Thanks Mickey Spiegel 发件人: ovs-dev-boun...@openvswitch.org 2017/05/08 04:58 收件人:xu.r...@zte.com.cn, 抄送: ovs dev , ovs-dev-boun...@openvswitch.org 主题: Re: [ovs-dev] 答复: Re: 答复: Re: [PATCH] ovn-controller: Support vxlan tunnel in ovn There are some assumptions that you are making which need to be called out. These assumptions may not hold going forward. In fact I refer to two different patches below that are currently under review, that break your assumptions. On Fri, May 5, 2017 at 7:18 PM, wrote: > Hi,Russell > > We think vxlan is the most commonly used tunnel encapsulation in the > overlay network openstack,ovn should better consider it. > > As my workmate wang qianyu said,we would consider computer node connect > with existing hardware switches which associates with SR-IOV as VTEP. > > After discussion, we feel that as long as the following changes for vxlan > tunnel in the table0: > > 1.For local switch, move MFF_TUN_ID to MFF_LOG_DATAPATH, resubmit to > OFTABLE_ETH_UCAST(table29) > It looks like you are overloading OFTABLE_ETH_UCAST that you define here with S_SWITCH_IN_L2_LKUP in ovn/northd/ovn-northd.c. Hardcoding the value to 29 in ovn/controller/lflow.h is not the way to do this. This pipeline stage does move back as new features are added. In fact it is now table 31 due to the recent addition of 2 tables for DNS lookup. > //---In table29, we can find out dst port based on dst mac > You are assuming that outport determination is only based on S_SWITCH_IN_L2_LKUP with no impact from any other ingress pipeline stages. This may not always be true, which I think is the point of Ben's complaint. For example the SFC patch that is currently under review ( http://patchwork.ozlabs.org/patch/754427/) may set outport and then do "output" in the ingress pipeline, in a pipeline stage other than S_SWITCH_IN_L2_LKUP. The alternative is to go through the entire ingress pipeline, but here you have a problem since you do not know the inport. The current VTEP-centric VXLAN code assumes that there is only one port binding per datapath from the VTEP chassis. For the general case that you are trying to address, this assumption does not hold, so you cannot properly determine the inport. The inport may affect the ultimate decision on outport. This is certainly the case for the SFC patch currently under review. You are also assuming that inport does not affect anything in the egress pipeline. This seems to be true at the moment, but this might not always be true as features are added. The existing VTEP functionality does not rely on the assumptions that you made, but since you changed the logic to determine inport in case of VXLAN, you are changing existing functionality. > 2.For local chassisredirect port, move MFF_TUN_ID to MFF_LOG_DATAPATH, set > port tunnel_key to MFF_LOG_OUTPORT and then resubmit to > OFTABLE_LOCAL_OUTPUT. > //---In table 33, we can find out dst local sw and sw patch port based on > the local chassisredirect port,and then follow the exsiting flows. > At the moment, the only case where a tunnel is used for a datapath representing a logical router is when the outport is a chassisredirect port. Your code assumes that will always be the case. If we do what you are suggesting, then that becomes a restriction for all logical router features going forward. This code also assumes that there can only be one chassisredirect port per datapath per chassis. There is a patch that has not yet been reviewed ( http://patchwork.ozlabs.org/patch/732815/) that proposes multiple distributed gateway ports (and correspondingly chassisredirect ports) on one datapath. I am not sure what the use case is, but if that feature were added and more than one distributed gateway port on one datapath specified the same redirect-chassis, it would break this code. This version of the code is better than the original version, which was based on
Re: [ovs-dev] [PATCH 2/2] tunnel-tests: Add test to match tunnel traffic.
On 5/8/17, 5:24 PM, "William Tu" wrote: [snip] > -dnl Check ARP Snoop > +dnl Use arp reply to achieve tunnel next hop mac binding > AT_CHECK([ovs-appctl netdev-dummy/receive br0 'recirc_id(0),in_port(100),eth(src=f8:bc:12:44:34:b6,dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),arp(sip=1.1.2.92,ti > > AT_CHECK([ovs-appctl tnl/neigh/show | tail -n+3 | sort], [0], [dnl > @@ -275,8 +261,6 @@ Listening ports: > gre_sys (3) > ]) > > -AT_CHECK([ovs-ofctl add-flow br0 'priority=99,udp,action=normal']) > - we need this to make sure inner UDP packet isn't matched. That is fine - leave this particular line then. If it is meant as a kind of “smoke test” for the present breakage issue, then a comment is needed. > dnl Check GRE tunnel push > AT_CHECK([ovs-ofctl add-flow int-br action=3]) > > @@ -287,8 +271,7 @@ AT_CHECK([tail -1 stdout], [0], > > AT_CHECK([ovs-appctl netdev-dummy/receive int-br '5054000a505400091234']) > AT_CHECK([ovs-ofctl dump-flows br0 | ofctl_strip | sort], [0], [dnl) > - n_packets=3, n_bytes=140, priority=1 actions=NORMAL > - priority=99,udp actions=NORMAL > + n_packets=2, n_bytes=98, priority=1 actions=NORMAL > NXST_FLOW reply: > ]) > Thanks for the feedback. I will submit v2 patch. William ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH 1/2] Revert "tunneling: Avoid recirculation on datapath."
Hi Joe and Greg, Maybe it's better I put this revert tunneling patch (1/2) and its tunnel-tests (2/2) in one patch, so the "make check" can pass? Regards, William On Mon, May 8, 2017 at 11:34 AM, Greg Rose wrote: > On Mon, 2017-05-08 at 11:15 -0700, Joe Stringer wrote: >> This reverts commit f1dac5128ce6db2e493f0d1c7a8b53fb9f34476f. When this >> commit was introduced, it broke the 'make check-system-userspace' >> testsuite. It appears that the new translation fails to modify the flow >> in a way that would represent the flow as an encapsulated flow when the >> traffic is patched through to the second bridge. As such, rather than >> matching on, for example, "ip,proto=47" for gre, it would use the inner >> packet's flow headers. It also results in problems reporting statistics, >> as the tunnel's header is not reflected in subsequent statistics and >> truncation is not properly applied during translation. >> >> While a refreshed approach to solving the above problem is formed, >> revert this patch. >> >> Reported-at: >> https://mail.openvswitch.org/pipermail/ovs-dev/2017-May/331972.html >> Signed-off-by: Joe Stringer > > Acked-by: Greg Rose > >> --- >> lib/dpif-netdev.c | 18 ++- >> ofproto/ofproto-dpif-xlate.c | 280 >> -- >> tests/ofproto-dpif.at | 11 +- >> tests/ovn.at | 6 +- >> tests/tunnel-push-pop-ipv6.at | 10 +- >> tests/tunnel-push-pop.at | 12 +- >> 6 files changed, 173 insertions(+), 164 deletions(-) >> >> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c >> index 4ee5d058aff8..d21515657634 100644 >> --- a/lib/dpif-netdev.c >> +++ b/lib/dpif-netdev.c >> @@ -4970,8 +4970,24 @@ dp_execute_cb(void *aux_, struct dp_packet_batch >> *packets_, >> >> case OVS_ACTION_ATTR_TUNNEL_PUSH: >> if (*depth < MAX_RECIRC_DEPTH) { >> +struct dp_packet_batch tnl_pkt; >> +struct dp_packet_batch *orig_packets_ = packets_; >> +int err; >> + >> +if (!may_steal) { >> +dp_packet_batch_clone(&tnl_pkt, packets_); >> +packets_ = &tnl_pkt; >> +dp_packet_batch_reset_cutlen(orig_packets_); >> +} >> + >> dp_packet_batch_apply_cutlen(packets_); >> -push_tnl_action(pmd, a, packets_); >> + >> +err = push_tnl_action(pmd, a, packets_); >> +if (!err) { >> +(*depth)++; >> +dp_netdev_recirculate(pmd, packets_); >> +(*depth)--; >> +} >> return; >> } >> break; >> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c >> index b308f21de100..bc3a310227da 100644 >> --- a/ofproto/ofproto-dpif-xlate.c >> +++ b/ofproto/ofproto-dpif-xlate.c >> @@ -425,10 +425,6 @@ static void xlate_action_set(struct xlate_ctx *ctx); >> static void xlate_commit_actions(struct xlate_ctx *ctx); >> >> static void >> -apply_nested_clone_actions(struct xlate_ctx *ctx, const struct xport >> *in_dev, >> - struct xport *out_dev); >> - >> -static void >> ctx_trigger_freeze(struct xlate_ctx *ctx) >> { >> ctx->exit = true; >> @@ -3215,17 +3211,7 @@ build_tunnel_send(struct xlate_ctx *ctx, const struct >> xport *xport, >> } >> tnl_push_data.tnl_port = odp_to_u32(tunnel_odp_port); >> tnl_push_data.out_port = odp_to_u32(out_dev->odp_port); >> - >> -size_t push_action_size = 0; >> -size_t clone_ofs = nl_msg_start_nested(ctx->odp_actions, >> - OVS_ACTION_ATTR_CLONE); >> odp_put_tnl_push_action(ctx->odp_actions, &tnl_push_data); >> -push_action_size = ctx->odp_actions->size; >> -apply_nested_clone_actions(ctx, xport, out_dev); >> -if (ctx->odp_actions->size > push_action_size) { >> -/* Update the CLONE action only when combined */ >> -nl_msg_end_nested(ctx->odp_actions, clone_ofs); >> -} >> return 0; >> } >> >> @@ -3261,136 +3247,6 @@ xlate_flow_is_protected(const struct xlate_ctx *ctx, >> const struct flow *flow, co >> xport_in->xbundle->protected && xport_out->xbundle->protected); >> } >> >> -/* Populate and apply nested actions on 'out_dev'. >> - * The nested actions are applied on cloned packets in dp while outputting >> to >> - * either patch or tunnel ports. >> - * On output to a patch port, the output action will be replaced with set of >> - * nested actions on the peer patch port. >> - * Similarly on output to a tunnel port, the post nested actions on >> - * tunnel are chained up with the tunnel-push action. >> - */ >> -static void >> -apply_nested_clone_actions(struct xlate_ctx *ctx, const struct xport >> *in_dev, >> - struct xport *out_dev) >> -{ >> -struct flow *flow = &ctx->xin->flow; >> -struct flow old_flow = ctx->xin->flow; >> -struct flow_tnl old_flow_tnl_wc = ctx->wc->masks.tunnel; >> -bool old_conntrack = ctx->co
[ovs-dev] [PATCHv2] tunnel-tests: Add test to match tunnel traffic.
This test highlights a bug that was affecting master up until the previous patch. Put simply, we have two bridges: an integration bridge which contains a tunnel, and a physical bridge for underlay network connectivity. This test simulates putting UDP traffic through the integration bridge, with the intention to apply GRE tunnel headers and send the packet through the underlay bridge. The underlay bridge should observe GRE traffic. Signed-off-by: William Tu Signed-off-by: Joe Stringer Acked-by: Greg Rose --- tests/tunnel-push-pop.at | 55 1 file changed, 55 insertions(+) diff --git a/tests/tunnel-push-pop.at b/tests/tunnel-push-pop.at index 294d28a2416d..d7936ddc7c1f 100644 --- a/tests/tunnel-push-pop.at +++ b/tests/tunnel-push-pop.at @@ -225,3 +225,58 @@ OVS_WAIT_UNTIL([test `ovs-pcap p0.pcap | grep 5054000a505400091235 | wc OVS_VSWITCHD_STOP AT_CLEANUP + +AT_SETUP([tunnel_push_pop - matching on physical bridge]) + +OVS_VSWITCHD_START([add-port br0 p0 -- set Interface p0 type=dummy ofport_request=1 other-config:hwaddr=aa:55:aa:55:00:00]) +AT_CHECK([ovs-vsctl add-br int-br -- set bridge int-br datapath_type=dummy], [0]) +AT_CHECK([ovs-vsctl add-port int-br t1 -- set Interface t1 type=gre \ + options:remote_ip=1.1.2.92 options:key=456 ofport_request=3], [0]) + +AT_CHECK([ovs-appctl dpif/show], [0], [dnl +dummy@ovs-dummy: hit:0 missed:0 + br0: + br0 65534/100: (dummy-internal) + p0 1/1: (dummy) + int-br: + int-br 65534/2: (dummy-internal) + t1 3/3: (gre: key=456, remote_ip=1.1.2.92) +]) + +AT_CHECK([ovs-appctl netdev-dummy/ip4addr br0 1.1.2.88/24], [0], [OK +]) +AT_CHECK([ovs-appctl ovs/route/add 1.1.2.92/24 br0], [0], [OK +]) +AT_CHECK([ovs-ofctl add-flow br0 'priority=1,action=normal']) + +dnl Use ARP reply to achieve tunnel next hop mac binding +AT_CHECK([ovs-appctl netdev-dummy/receive br0 'recirc_id(0),in_port(100),eth(src=f8:bc:12:44:34:b6,dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),arp(sip=1.1.2.92,tip=1.1.2.88,op=2,sha=f8:bc:12:44:34:b6,tha=00:00:00:00:00:00)']) + +AT_CHECK([ovs-appctl tnl/neigh/show | tail -n+3 | sort], [0], [dnl +1.1.2.92 f8:bc:12:44:34:b6 br0 +]) + +AT_CHECK([ovs-appctl tnl/ports/show |sort], [0], [dnl +Listening ports: +gre_sys (3) +]) + +AT_CHECK([ovs-ofctl add-flow br0 'priority=99,udp,action=normal']) + +dnl Check GRE tunnel push +AT_CHECK([ovs-ofctl add-flow int-br action=3]) + +AT_CHECK([ovs-appctl ofproto/trace ovs-dummy 'in_port(2),eth_type(0x0800),ipv4(src=1.1.3.88,dst=1.1.3.112,proto=17,tos=0,ttl=64,frag=no),udp(src=51283,dst=4789)'], [0], [stdout]) +AT_CHECK([tail -1 stdout], [0], + [Datapath actions: tnl_push(tnl_port(3),header(size=42,type=3,eth(dst=f8:bc:12:44:34:b6,src=aa:55:aa:55:00:00,dl_type=0x0800),ipv4(src=1.1.2.88,dst=1.1.2.92,proto=47,tos=0,ttl=64,frag=0x4000),gre((flags=0x2000,proto=0x6558),key=0x1c8)),out_port(100)) +]) + +AT_CHECK([ovs-appctl netdev-dummy/receive int-br '5054000a505400091234']) +AT_CHECK([ovs-ofctl dump-flows br0 | ofctl_strip | sort], [0], [dnl) + n_packets=2, n_bytes=98, priority=1 actions=NORMAL + priority=99,udp actions=NORMAL +NXST_FLOW reply: +]) + +OVS_VSWITCHD_STOP +AT_CLEANUP -- 2.7.4 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH 2/2] tunnel-tests: Add test to match tunnel traffic.
[snip] > -dnl Check ARP Snoop > +dnl Use arp reply to achieve tunnel next hop mac binding > AT_CHECK([ovs-appctl netdev-dummy/receive br0 > 'recirc_id(0),in_port(100),eth(src=f8:bc:12:44:34:b6,dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),arp(sip=1.1.2.92,ti > > AT_CHECK([ovs-appctl tnl/neigh/show | tail -n+3 | sort], [0], [dnl > @@ -275,8 +261,6 @@ Listening ports: > gre_sys (3) > ]) > > -AT_CHECK([ovs-ofctl add-flow br0 'priority=99,udp,action=normal']) > - we need this to make sure inner UDP packet isn't matched. > dnl Check GRE tunnel push > AT_CHECK([ovs-ofctl add-flow int-br action=3]) > > @@ -287,8 +271,7 @@ AT_CHECK([tail -1 stdout], [0], > > AT_CHECK([ovs-appctl netdev-dummy/receive int-br > '5054000a505400091234']) > AT_CHECK([ovs-ofctl dump-flows br0 | ofctl_strip | sort], [0], [dnl) > - n_packets=3, n_bytes=140, priority=1 actions=NORMAL > - priority=99,udp actions=NORMAL > + n_packets=2, n_bytes=98, priority=1 actions=NORMAL > NXST_FLOW reply: > ]) > Thanks for the feedback. I will submit v2 patch. William ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v6 3/4] datapath-windows: NAT integration with conntrack
Hi Sai, Thanks for the feedback! When I removed "static __inline", I meant to make the function public. It's not a matter of coding standard or style, but a matter of feature. Please note the changes I made to Actions.h as well. The functions I made public are utility functions. It's not justifiable why OvsUpdateAddressAndPort should be public but OvsUpdateUdpPorts should be private, for example. Please check the inline comments for the reply to your other concerns. Best regards, Yin Lin -Original Message- From: Sairam Venugopal Sent: Monday, May 8, 2017 5:07 PM To: Yin Lin ; d...@openvswitch.org Subject: Re: [ovs-dev] [PATCH v6 3/4] datapath-windows: NAT integration with conntrack Hi Yin, Thanks for the patches. I had some immediate review comments. - OvsUpdateAddressAndPort does not have a return type - Refrain from changing other code that your patch does not require. Please send out separate incremental patch. - This is the correct way to define static inline functions - "static __inline NDIS_STATUS”. Don’t change this. Update your code to follow this instead. Please find the other comments inline. Thanks, Sairam On 5/8/17, 3:38 PM, "ovs-dev-boun...@openvswitch.org on behalf of Yin Lin" wrote: >Signed-off-by: Yin Lin >--- > datapath-windows/automake.mk | 4 +- > datapath-windows/ovsext/Actions.c | 118 - > datapath-windows/ovsext/Actions.h | 20 > datapath-windows/ovsext/Conntrack.c| 187 + > datapath-windows/ovsext/Conntrack.h| 25 +++-- > datapath-windows/ovsext/ovsext.vcxproj | 2 + > 6 files changed, 273 insertions(+), 83 deletions(-) > >diff --git a/datapath-windows/automake.mk b/datapath-windows/automake.mk >index 7177630..f1632cc 100644 >--- a/datapath-windows/automake.mk >+++ b/datapath-windows/automake.mk >@@ -16,9 +16,9 @@ EXTRA_DIST += \ > datapath-windows/ovsext/Conntrack-icmp.c \ > datapath-windows/ovsext/Conntrack-other.c \ > datapath-windows/ovsext/Conntrack-related.c \ >-datapath-windows/ovsext/Conntrack-nat.c \ >+ datapath-windows/ovsext/Conntrack-nat.c \ > datapath-windows/ovsext/Conntrack-tcp.c \ >-datapath-windows/ovsext/Conntrack-nat.h \ >+ datapath-windows/ovsext/Conntrack-nat.h \ > datapath-windows/ovsext/Conntrack.c \ > datapath-windows/ovsext/Conntrack.h \ > datapath-windows/ovsext/Datapath.c \ >diff --git a/datapath-windows/ovsext/Actions.c >b/datapath-windows/ovsext/Actions.c >index e2eae9a..d1938f3 100644 >--- a/datapath-windows/ovsext/Actions.c >+++ b/datapath-windows/ovsext/Actions.c >@@ -1380,7 +1380,7 @@ PUINT8 OvsGetHeaderBySize(OvsForwardingContext >*ovsFwdCtx, > * based on the specified key. > * > */ [Sai]: Please don’t change this. This is the correct - "static __inline NDIS_STATUS" >-static __inline NDIS_STATUS >+NDIS_STATUS > OvsUpdateUdpPorts(OvsForwardingContext *ovsFwdCtx, > const struct ovs_key_udp *udpAttr) > { >@@ -1427,7 +1427,7 @@ OvsUpdateUdpPorts(OvsForwardingContext *ovsFwdCtx, > * based on the specified key. > * > */ >-static __inline NDIS_STATUS >+NDIS_STATUS [Sai]: Please don’t change this. This is the correct - "static __inline NDIS_STATUS" > OvsUpdateTcpPorts(OvsForwardingContext *ovsFwdCtx, > const struct ovs_key_tcp *tcpAttr) > { >@@ -1465,12 +1465,124 @@ OvsUpdateTcpPorts(OvsForwardingContext *ovsFwdCtx, > > /* > * >+ * OvsUpdateAddressAndPort -- >+ * Updates the source/destination IP and port fields in >+ * ovsFwdCtx.curNbl inline based on the specified key. >+ * >+ */ [Sai]: Missing return type [Yin]: Good catch. Fixed. >+OvsUpdateAddressAndPort(OvsForwardingContext *ovsFwdCtx, >+UINT32 newAddr, UINT16 newPort, >+BOOLEAN isSource, BOOLEAN isTx) >+{ >+PUINT8 bufferStart; >+UINT32 hdrSize; >+OVS_PACKET_HDR_INFO *layers = &ovsFwdCtx->layers; >+IPHdr *ipHdr; >+TCPHdr *tcpHdr = NULL; >+UDPHdr *udpHdr = NULL; >+UINT32 *addrField = NULL; >+UINT16 *portField = NULL; >+UINT16 *checkField = NULL; >+BOOLEAN l4Offload = FALSE; >+NDIS_TCP_IP_CHECKSUM_NET_BUFFER_LIST_INFO csumInfo; >+ >+ASSERT(layers->value != 0); >+ >+if (layers->isTcp || layers->isUdp) { >+hdrSize = layers->l4Offset + >+ layers->isTcp ? sizeof (*tcpHdr) : sizeof (*udpHdr); >+} else { >+hdrSize = layers->l3Offset + sizeof (*ipHdr); >+} >+ >+bufferStart = OvsGetHeaderBySize(ovsFwdCtx, hdrSize); >+if (!bufferStart) { >+return NDIS_STATUS_RESOURCES; >+} >+ >+ipHdr = (IPHdr *)(bufferSt
Re: [ovs-dev] [PATCH] tunnel-tests: Add test to match tunnel traffic.
I sent out a suggested incremental Thanks Darrell On 5/8/17, 5:09 PM, "ovs-dev-boun...@openvswitch.org on behalf of William Tu" wrote: This test highlights a bug that was affecting master up until the previous patch. Put simply, we have two bridges: an integration bridge which contains a tunnel, and a physical bridge for underlay network connectivity. This test simulates putting UDP traffic through the integration bridge, with the intention to apply GRE tunnel headers and send the packet through the underlay bridge. The underlay bridge should observe GRE traffic. Signed-off-by: William Tu Signed-off-by: Joe Stringer Acked-by: Greg Rose --- tests/tunnel-push-pop.at | 69 1 file changed, 69 insertions(+) diff --git a/tests/tunnel-push-pop.at b/tests/tunnel-push-pop.at index 294d28a2416d..c7724671c400 100644 --- a/tests/tunnel-push-pop.at +++ b/tests/tunnel-push-pop.at @@ -225,3 +225,72 @@ OVS_WAIT_UNTIL([test `ovs-pcap p0.pcap | grep 5054000a505400091235 | wc OVS_VSWITCHD_STOP AT_CLEANUP + +AT_SETUP([tunnel_push_pop - matching on physical bridge]) + +OVS_VSWITCHD_START([add-port br0 p0 -- set Interface p0 type=dummy ofport_request=1 other-config:hwaddr=aa:55:aa:55:00:00]) +AT_CHECK([ovs-vsctl add-br int-br -- set bridge int-br datapath_type=dummy], [0]) +AT_CHECK([ovs-vsctl add-port int-br t1 -- set Interface t1 type=gre \ + options:remote_ip=1.1.2.92 options:key=456 ofport_request=3], [0]) + +AT_CHECK([ovs-appctl dpif/show], [0], [dnl +dummy@ovs-dummy: hit:0 missed:0 + br0: + br0 65534/100: (dummy-internal) + p0 1/1: (dummy) + int-br: + int-br 65534/2: (dummy-internal) + t1 3/3: (gre: key=456, remote_ip=1.1.2.92) +]) + +AT_CHECK([ovs-appctl dpif/show], [0], [dnl +dummy@ovs-dummy: hit:0 missed:0 + br0: + br0 65534/100: (dummy-internal) + p0 1/1: (dummy) + int-br: + int-br 65534/2: (dummy-internal) + t1 3/3: (gre: key=456, remote_ip=1.1.2.92) +]) + +AT_CHECK([ovs-appctl netdev-dummy/ip4addr br0 1.1.2.88/24], [0], [OK +]) +AT_CHECK([ovs-appctl ovs/route/add 1.1.2.92/24 br0], [0], [OK +]) +AT_CHECK([ovs-ofctl add-flow br0 'priority=1,action=normal']) + +dnl Check ARP request +AT_CHECK([ovs-vsctl -- set Interface p0 options:pcap=p0.pcap]) +AT_CHECK([ovs-appctl netdev-dummy/receive int-br 'in_port(2),eth(src=aa:55:aa:55:00:00,dst=f8:bc:12:ff:ff:ff),eth_type(0x0800),ipv4(src=1.1.3.92,dst=1.1.3.88,proto=1,tos=0,ttl=64,frag=no),icmp(type=0,code=0)']) + +dnl Check ARP Snoop +AT_CHECK([ovs-appctl netdev-dummy/receive br0 'recirc_id(0),in_port(100),eth(src=f8:bc:12:44:34:b6,dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),arp(sip=1.1.2.92,tip=1.1.2.88,op=2,sha=f8:bc:12:44:34:b6,tha=00:00:00:00:00:00)']) + +AT_CHECK([ovs-appctl tnl/neigh/show | tail -n+3 | sort], [0], [dnl +1.1.2.92 f8:bc:12:44:34:b6 br0 +]) + +AT_CHECK([ovs-appctl tnl/ports/show |sort], [0], [dnl +Listening ports: +gre_sys (3) +]) + +AT_CHECK([ovs-ofctl add-flow br0 'priority=99,udp,action=normal']) + +dnl Check GRE tunnel push +AT_CHECK([ovs-ofctl add-flow int-br action=3]) + +AT_CHECK([ovs-appctl ofproto/trace ovs-dummy 'in_port(2),eth_type(0x0800),ipv4(src=1.1.3.88,dst=1.1.3.112,proto=17,tos=0,ttl=64,frag=no),udp(src=51283,dst=4789)'], [0], [stdout]) +AT_CHECK([tail -1 stdout], [0], + [Datapath actions: tnl_push(tnl_port(3),header(size=42,type=3,eth(dst=f8:bc:12:44:34:b6,src=aa:55:aa:55:00:00,dl_type=0x0800),ipv4(src=1.1.2.88,dst=1.1.2.92,proto=47,tos=0,ttl=64,frag=0x4000),gre((flags=0x2000,proto=0x6558),key=0x1c8)),out_port(100)) +]) + +AT_CHECK([ovs-appctl netdev-dummy/receive int-br '5054000a505400091234']) +AT_CHECK([ovs-ofctl dump-flows br0 | ofctl_strip | sort], [0], [dnl) + n_packets=3, n_bytes=140, priority=1 actions=NORMAL + priority=99,udp actions=NORMAL +NXST_FLOW reply: +]) + +OVS_VSWITCHD_STOP +AT_CLEANUP -- 2.7.4 ___ dev mailing list d...@openvswitch.org https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev&d=DwICAg&c=uilaK90D4TOVoH58JNXRgQ&r=BVhFA09CGX7JQ5Ih-uZnsw&m=wG_-kQQiDOExXW_vZKJ028TunQQYspMX29KuRHYftO0&s=8uriqdLtDzod-IkSn5nMtjNb8leTSjsVClNqjJV8kJw&e= ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH 2/2] tunnel-tests: Add test to match tunnel traffic.
I added in the following incremental which eliminates unnecessary/misleading parts, fixes some misleading/incorrect comments and removes redundant parts. dball@ubuntu:~/ovs$ git diff tests/tunnel-push-pop.at diff --git a/tests/tunnel-push-pop.at b/tests/tunnel-push-pop.at index c772467..654e622 100644 --- a/tests/tunnel-push-pop.at +++ b/tests/tunnel-push-pop.at @@ -243,27 +243,13 @@ dummy@ovs-dummy: hit:0 missed:0 t1 3/3: (gre: key=456, remote_ip=1.1.2.92) ]) -AT_CHECK([ovs-appctl dpif/show], [0], [dnl -dummy@ovs-dummy: hit:0 missed:0 - br0: - br0 65534/100: (dummy-internal) - p0 1/1: (dummy) - int-br: - int-br 65534/2: (dummy-internal) - t1 3/3: (gre: key=456, remote_ip=1.1.2.92) -]) - AT_CHECK([ovs-appctl netdev-dummy/ip4addr br0 1.1.2.88/24], [0], [OK ]) AT_CHECK([ovs-appctl ovs/route/add 1.1.2.92/24 br0], [0], [OK ]) AT_CHECK([ovs-ofctl add-flow br0 'priority=1,action=normal']) -dnl Check ARP request -AT_CHECK([ovs-vsctl -- set Interface p0 options:pcap=p0.pcap]) -AT_CHECK([ovs-appctl netdev-dummy/receive int-br 'in_port(2),eth(src=aa:55:aa:55:00:00,dst=f8:bc:12:ff:ff:ff),eth_type(0x0800),ipv4(src=1.1.3.92,dst=1.1.3.88, - -dnl Check ARP Snoop +dnl Use arp reply to achieve tunnel next hop mac binding AT_CHECK([ovs-appctl netdev-dummy/receive br0 'recirc_id(0),in_port(100),eth(src=f8:bc:12:44:34:b6,dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),arp(sip=1.1.2.92,ti AT_CHECK([ovs-appctl tnl/neigh/show | tail -n+3 | sort], [0], [dnl @@ -275,8 +261,6 @@ Listening ports: gre_sys (3) ]) -AT_CHECK([ovs-ofctl add-flow br0 'priority=99,udp,action=normal']) - dnl Check GRE tunnel push AT_CHECK([ovs-ofctl add-flow int-br action=3]) @@ -287,8 +271,7 @@ AT_CHECK([tail -1 stdout], [0], AT_CHECK([ovs-appctl netdev-dummy/receive int-br '5054000a505400091234']) AT_CHECK([ovs-ofctl dump-flows br0 | ofctl_strip | sort], [0], [dnl) - n_packets=3, n_bytes=140, priority=1 actions=NORMAL - priority=99,udp actions=NORMAL + n_packets=2, n_bytes=98, priority=1 actions=NORMAL NXST_FLOW reply: ]) On 5/8/17, 11:15 AM, "ovs-dev-boun...@openvswitch.org on behalf of Joe Stringer" wrote: From: William Tu This test highlights a bug that was affecting master up until the previous patch. Put simply, we have two bridges: an integration bridge which contains a tunnel, and a physical bridge for underlay network connectivity. This test simulates putting UDP traffic through the integration bridge, with the intention to apply GRE tunnel headers and send the packet through the underlay bridge. The underlay bridge should observe GRE traffic. --- tests/tunnel-push-pop.at | 69 1 file changed, 69 insertions(+) diff --git a/tests/tunnel-push-pop.at b/tests/tunnel-push-pop.at index 4eeac4154dbc..fe901525ea62 100644 --- a/tests/tunnel-push-pop.at +++ b/tests/tunnel-push-pop.at @@ -225,3 +225,72 @@ OVS_WAIT_UNTIL([test `ovs-pcap p0.pcap | grep 5054000a505400091235 | wc OVS_VSWITCHD_STOP AT_CLEANUP + +AT_SETUP([tunnel_push_pop - matching on physical bridge]) + +OVS_VSWITCHD_START([add-port br0 p0 -- set Interface p0 type=dummy ofport_request=1 other-config:hwaddr=aa:55:aa:55:00:00]) +AT_CHECK([ovs-vsctl add-br int-br -- set bridge int-br datapath_type=dummy], [0]) +AT_CHECK([ovs-vsctl add-port int-br t1 -- set Interface t1 type=gre \ + options:remote_ip=1.1.2.92 options:key=456 ofport_request=3], [0]) + +AT_CHECK([ovs-appctl dpif/show], [0], [dnl +dummy@ovs-dummy: hit:0 missed:0 + br0: + br0 65534/100: (dummy-internal) + p0 1/1: (dummy) + int-br: + int-br 65534/2: (dummy-internal) + t1 3/3: (gre: key=456, remote_ip=1.1.2.92) +]) + +AT_CHECK([ovs-appctl dpif/show], [0], [dnl +dummy@ovs-dummy: hit:0 missed:0 + br0: + br0 65534/100: (dummy-internal) + p0 1/1: (dummy) + int-br: + int-br 65534/2: (dummy-internal) + t1 3/3: (gre: key=456, remote_ip=1.1.2.92) +]) + +AT_CHECK([ovs-appctl netdev-dummy/ip4addr br0 1.1.2.88/24], [0], [OK +]) +AT_CHECK([ovs-appctl ovs/route/add 1.1.2.92/24 br0], [0], [OK +]) +AT_CHECK([ovs-ofctl add-flow br0 'priority=1,action=normal']) + +dnl Check ARP request +AT_CHECK([ovs-vsctl -- set Interface p0 options:pcap=p0.pcap]) +AT_CHECK([ovs-appctl netdev-dummy/receive int-br 'in_port(2),eth(src=aa:55:aa:55:00:00,dst=f8:bc:12:ff:ff:ff),eth_type(0x0800),ipv4(src=1.1.3.92,dst=1.1.3.88,proto=1,tos=0,ttl=64,frag=no),icmp(type=0,code=0)']) + +dnl Check ARP Snoop +AT_CHECK([ovs-appctl netdev-dummy/receive br0 'recirc_id(0),in_port(100),eth(src=f8:bc:12:44:34:b6,dst=ff:ff:ff:ff:ff:ff),eth_type(0x080
[ovs-dev] [PATCH] tunnel-tests: Add test to match tunnel traffic.
This test highlights a bug that was affecting master up until the previous patch. Put simply, we have two bridges: an integration bridge which contains a tunnel, and a physical bridge for underlay network connectivity. This test simulates putting UDP traffic through the integration bridge, with the intention to apply GRE tunnel headers and send the packet through the underlay bridge. The underlay bridge should observe GRE traffic. Signed-off-by: William Tu Signed-off-by: Joe Stringer Acked-by: Greg Rose --- tests/tunnel-push-pop.at | 69 1 file changed, 69 insertions(+) diff --git a/tests/tunnel-push-pop.at b/tests/tunnel-push-pop.at index 294d28a2416d..c7724671c400 100644 --- a/tests/tunnel-push-pop.at +++ b/tests/tunnel-push-pop.at @@ -225,3 +225,72 @@ OVS_WAIT_UNTIL([test `ovs-pcap p0.pcap | grep 5054000a505400091235 | wc OVS_VSWITCHD_STOP AT_CLEANUP + +AT_SETUP([tunnel_push_pop - matching on physical bridge]) + +OVS_VSWITCHD_START([add-port br0 p0 -- set Interface p0 type=dummy ofport_request=1 other-config:hwaddr=aa:55:aa:55:00:00]) +AT_CHECK([ovs-vsctl add-br int-br -- set bridge int-br datapath_type=dummy], [0]) +AT_CHECK([ovs-vsctl add-port int-br t1 -- set Interface t1 type=gre \ + options:remote_ip=1.1.2.92 options:key=456 ofport_request=3], [0]) + +AT_CHECK([ovs-appctl dpif/show], [0], [dnl +dummy@ovs-dummy: hit:0 missed:0 + br0: + br0 65534/100: (dummy-internal) + p0 1/1: (dummy) + int-br: + int-br 65534/2: (dummy-internal) + t1 3/3: (gre: key=456, remote_ip=1.1.2.92) +]) + +AT_CHECK([ovs-appctl dpif/show], [0], [dnl +dummy@ovs-dummy: hit:0 missed:0 + br0: + br0 65534/100: (dummy-internal) + p0 1/1: (dummy) + int-br: + int-br 65534/2: (dummy-internal) + t1 3/3: (gre: key=456, remote_ip=1.1.2.92) +]) + +AT_CHECK([ovs-appctl netdev-dummy/ip4addr br0 1.1.2.88/24], [0], [OK +]) +AT_CHECK([ovs-appctl ovs/route/add 1.1.2.92/24 br0], [0], [OK +]) +AT_CHECK([ovs-ofctl add-flow br0 'priority=1,action=normal']) + +dnl Check ARP request +AT_CHECK([ovs-vsctl -- set Interface p0 options:pcap=p0.pcap]) +AT_CHECK([ovs-appctl netdev-dummy/receive int-br 'in_port(2),eth(src=aa:55:aa:55:00:00,dst=f8:bc:12:ff:ff:ff),eth_type(0x0800),ipv4(src=1.1.3.92,dst=1.1.3.88,proto=1,tos=0,ttl=64,frag=no),icmp(type=0,code=0)']) + +dnl Check ARP Snoop +AT_CHECK([ovs-appctl netdev-dummy/receive br0 'recirc_id(0),in_port(100),eth(src=f8:bc:12:44:34:b6,dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),arp(sip=1.1.2.92,tip=1.1.2.88,op=2,sha=f8:bc:12:44:34:b6,tha=00:00:00:00:00:00)']) + +AT_CHECK([ovs-appctl tnl/neigh/show | tail -n+3 | sort], [0], [dnl +1.1.2.92 f8:bc:12:44:34:b6 br0 +]) + +AT_CHECK([ovs-appctl tnl/ports/show |sort], [0], [dnl +Listening ports: +gre_sys (3) +]) + +AT_CHECK([ovs-ofctl add-flow br0 'priority=99,udp,action=normal']) + +dnl Check GRE tunnel push +AT_CHECK([ovs-ofctl add-flow int-br action=3]) + +AT_CHECK([ovs-appctl ofproto/trace ovs-dummy 'in_port(2),eth_type(0x0800),ipv4(src=1.1.3.88,dst=1.1.3.112,proto=17,tos=0,ttl=64,frag=no),udp(src=51283,dst=4789)'], [0], [stdout]) +AT_CHECK([tail -1 stdout], [0], + [Datapath actions: tnl_push(tnl_port(3),header(size=42,type=3,eth(dst=f8:bc:12:44:34:b6,src=aa:55:aa:55:00:00,dl_type=0x0800),ipv4(src=1.1.2.88,dst=1.1.2.92,proto=47,tos=0,ttl=64,frag=0x4000),gre((flags=0x2000,proto=0x6558),key=0x1c8)),out_port(100)) +]) + +AT_CHECK([ovs-appctl netdev-dummy/receive int-br '5054000a505400091234']) +AT_CHECK([ovs-ofctl dump-flows br0 | ofctl_strip | sort], [0], [dnl) + n_packets=3, n_bytes=140, priority=1 actions=NORMAL + priority=99,udp actions=NORMAL +NXST_FLOW reply: +]) + +OVS_VSWITCHD_STOP +AT_CLEANUP -- 2.7.4 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v6 3/4] datapath-windows: NAT integration with conntrack
Hi Yin, Thanks for the patches. I had some immediate review comments. - OvsUpdateAddressAndPort does not have a return type - Refrain from changing other code that your patch does not require. Please send out separate incremental patch. - This is the correct way to define static inline functions - "static __inline NDIS_STATUS”. Don’t change this. Update your code to follow this instead. Please find the other comments inline. Thanks, Sairam On 5/8/17, 3:38 PM, "ovs-dev-boun...@openvswitch.org on behalf of Yin Lin" wrote: >Signed-off-by: Yin Lin >--- > datapath-windows/automake.mk | 4 +- > datapath-windows/ovsext/Actions.c | 118 - > datapath-windows/ovsext/Actions.h | 20 > datapath-windows/ovsext/Conntrack.c| 187 + > datapath-windows/ovsext/Conntrack.h| 25 +++-- > datapath-windows/ovsext/ovsext.vcxproj | 2 + > 6 files changed, 273 insertions(+), 83 deletions(-) > >diff --git a/datapath-windows/automake.mk b/datapath-windows/automake.mk >index 7177630..f1632cc 100644 >--- a/datapath-windows/automake.mk >+++ b/datapath-windows/automake.mk >@@ -16,9 +16,9 @@ EXTRA_DIST += \ > datapath-windows/ovsext/Conntrack-icmp.c \ > datapath-windows/ovsext/Conntrack-other.c \ > datapath-windows/ovsext/Conntrack-related.c \ >-datapath-windows/ovsext/Conntrack-nat.c \ >+ datapath-windows/ovsext/Conntrack-nat.c \ > datapath-windows/ovsext/Conntrack-tcp.c \ >-datapath-windows/ovsext/Conntrack-nat.h \ >+ datapath-windows/ovsext/Conntrack-nat.h \ > datapath-windows/ovsext/Conntrack.c \ > datapath-windows/ovsext/Conntrack.h \ > datapath-windows/ovsext/Datapath.c \ >diff --git a/datapath-windows/ovsext/Actions.c >b/datapath-windows/ovsext/Actions.c >index e2eae9a..d1938f3 100644 >--- a/datapath-windows/ovsext/Actions.c >+++ b/datapath-windows/ovsext/Actions.c >@@ -1380,7 +1380,7 @@ PUINT8 OvsGetHeaderBySize(OvsForwardingContext >*ovsFwdCtx, > * based on the specified key. > * > */ [Sai]: Please don’t change this. This is the correct - "static __inline NDIS_STATUS" >-static __inline NDIS_STATUS >+NDIS_STATUS > OvsUpdateUdpPorts(OvsForwardingContext *ovsFwdCtx, > const struct ovs_key_udp *udpAttr) > { >@@ -1427,7 +1427,7 @@ OvsUpdateUdpPorts(OvsForwardingContext *ovsFwdCtx, > * based on the specified key. > * > */ >-static __inline NDIS_STATUS >+NDIS_STATUS [Sai]: Please don’t change this. This is the correct - "static __inline NDIS_STATUS" > OvsUpdateTcpPorts(OvsForwardingContext *ovsFwdCtx, > const struct ovs_key_tcp *tcpAttr) > { >@@ -1465,12 +1465,124 @@ OvsUpdateTcpPorts(OvsForwardingContext *ovsFwdCtx, > > /* > * >+ * OvsUpdateAddressAndPort -- >+ * Updates the source/destination IP and port fields in >+ * ovsFwdCtx.curNbl inline based on the specified key. >+ * >+ */ [Sai]: Missing return type >+OvsUpdateAddressAndPort(OvsForwardingContext *ovsFwdCtx, >+UINT32 newAddr, UINT16 newPort, >+BOOLEAN isSource, BOOLEAN isTx) >+{ >+PUINT8 bufferStart; >+UINT32 hdrSize; >+OVS_PACKET_HDR_INFO *layers = &ovsFwdCtx->layers; >+IPHdr *ipHdr; >+TCPHdr *tcpHdr = NULL; >+UDPHdr *udpHdr = NULL; >+UINT32 *addrField = NULL; >+UINT16 *portField = NULL; >+UINT16 *checkField = NULL; >+BOOLEAN l4Offload = FALSE; >+NDIS_TCP_IP_CHECKSUM_NET_BUFFER_LIST_INFO csumInfo; >+ >+ASSERT(layers->value != 0); >+ >+if (layers->isTcp || layers->isUdp) { >+hdrSize = layers->l4Offset + >+ layers->isTcp ? sizeof (*tcpHdr) : sizeof (*udpHdr); >+} else { >+hdrSize = layers->l3Offset + sizeof (*ipHdr); >+} >+ >+bufferStart = OvsGetHeaderBySize(ovsFwdCtx, hdrSize); >+if (!bufferStart) { >+return NDIS_STATUS_RESOURCES; >+} >+ >+ipHdr = (IPHdr *)(bufferStart + layers->l3Offset); >+ >+if (layers->isTcp) { >+tcpHdr = (TCPHdr *)(bufferStart + layers->l4Offset); >+} else if (layers->isUdp) { >+udpHdr = (UDPHdr *)(bufferStart + layers->l4Offset); >+} >+ >+csumInfo.Value = NET_BUFFER_LIST_INFO(ovsFwdCtx->curNbl, >+ TcpIpChecksumNetBufferListInfo); >+/* >+ * Adjust the IP header inline as dictated by the action, and also update >+ * the IP and the TCP checksum for the data modified. >+ * >+ * In the future, this could be optimized to make one call to >+ * ChecksumUpdate32(). Ignoring this for now, since for the most common >+ * case, we only update the TTL. >+ */ >+ >+if (isSour
Re: [ovs-dev] [PATCH 2/2] tunnel-tests: Add test to match tunnel traffic.
Thanks. I will re-submit this test. William On Mon, May 8, 2017 at 3:56 PM, Joe Stringer wrote: > On 8 May 2017 at 11:35, Greg Rose wrote: >> On Mon, 2017-05-08 at 11:15 -0700, Joe Stringer wrote: >>> From: William Tu >>> >>> This test highlights a bug that was affecting master up until the >>> previous patch. Put simply, we have two bridges: an integration bridge >>> which contains a tunnel, and a physical bridge for underlay network >>> connectivity. This test simulates putting UDP traffic through the >>> integration bridge, with the intention to apply GRE tunnel headers and >>> send the packet through the underlay bridge. The underlay bridge should >>> observe GRE traffic. >>> --- >>> tests/tunnel-push-pop.at | 69 >>> >>> 1 file changed, 69 insertions(+) >> >> No signature. Please fix on push. Otherwise... > > Sorry, yeah. William submitted most of this earlier without sign-off > so looking for it from him. Thanks for looking it over. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v6 4/4] Signed-off-by: Alin Gabriel Serdean
Yin, Can you rebase and rebuild your changes? OvsTcpSegmentNBL arguments have changed and this patch breaks compilation. Can you add commit msgs to your patches? This commit title says: "Signed-off-by:Alin Gabriel…” Can you fix this? Thanks, Sairam On 5/8/17, 3:38 PM, "ovs-dev-boun...@openvswitch.org on behalf of Yin Lin" wrote: >From: Alin Gabriel Serdean > >--- > datapath-windows/ovsext/Actions.c | 38 ++ > 1 file changed, 38 insertions(+) > >diff --git a/datapath-windows/ovsext/Actions.c >b/datapath-windows/ovsext/Actions.c >index d1938f3..bb1e6ea 100644 >--- a/datapath-windows/ovsext/Actions.c >+++ b/datapath-windows/ovsext/Actions.c >@@ -1572,6 +1572,44 @@ OvsUpdateAddressAndPort(OvsForwardingContext *ovsFwdCtx, > } > *portField = newPort; > } >+PNET_BUFFER_LIST curNbl = ovsFwdCtx->curNbl; >+PNET_BUFFER_LIST newNbl = NULL; >+if (layers->isTcp) { >+UINT32 mss = OVSGetTcpMSS(curNbl); >+if (mss) { >+OVS_LOG_TRACE("l4Offset %d", layers->l4Offset); >+newNbl = OvsTcpSegmentNBL(ovsFwdCtx->switchContext, curNbl, >layers, >+ mss, 0); >+if (newNbl == NULL) { >+OVS_LOG_ERROR("Unable to segment NBL"); >+return NDIS_STATUS_FAILURE; >+} >+/* Clear out LSO flags after this point */ >+NET_BUFFER_LIST_INFO(newNbl, TcpLargeSendNetBufferListInfo) = 0; >+} >+} >+/* If we didn't split the packet above, make a copy now */ >+if (newNbl == NULL) { >+csumInfo.Value = NET_BUFFER_LIST_INFO(curNbl, >+ TcpIpChecksumNetBufferListInfo); >+OvsApplySWChecksumOnNB(layers, curNbl, &csumInfo); >+} >+ >+if (newNbl) { >+curNbl = newNbl; >+OvsCompleteNBLForwardingCtx(ovsFwdCtx, >+L"Complete after cloning NBL for >encapsulation"); >+OvsInitForwardingCtx(ovsFwdCtx, ovsFwdCtx->switchContext, >+ newNbl, ovsFwdCtx->srcVportNo, 0, >+ NET_BUFFER_LIST_SWITCH_FORWARDING_DETAIL(newNbl), >+ ovsFwdCtx->completionList, >+ &ovsFwdCtx->layers, FALSE); >+ovsFwdCtx->curNbl = newNbl; >+} >+ >+NET_BUFFER_LIST_INFO(curNbl, >+ TcpIpChecksumNetBufferListInfo) = 0; >+ > return NDIS_STATUS_SUCCESS; > } > >-- >2.10.2.windows.1 > >___ >dev mailing list >d...@openvswitch.org >https://mail.openvswitch.org/mailman/listinfo/ovs-dev ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v6 2/4] datapath-windows: Add NAT module in conntrack
Yin, thanks for the patches. Please briefly describe in the commit message the scope of this patch. Same applies to other patches in this series. Thanks, Shashank From: ovs-dev-boun...@openvswitch.org on behalf of Yin Lin Sent: Monday, May 8, 2017 3:38:43 PM To: d...@openvswitch.org Cc: Yin Lin Subject: [ovs-dev] [PATCH v6 2/4] datapath-windows: Add NAT module in conntrack Signed-off-by: Yin Lin Issue: # Change-Id: I6f37360c36525548b343f0016304015fec8aba7d --- datapath-windows/automake.mk| 2 + datapath-windows/ovsext/Conntrack-nat.c | 424 datapath-windows/ovsext/Conntrack-nat.h | 39 +++ 3 files changed, 465 insertions(+) create mode 100644 datapath-windows/ovsext/Conntrack-nat.c create mode 100644 datapath-windows/ovsext/Conntrack-nat.h diff --git a/datapath-windows/automake.mk b/datapath-windows/automake.mk index 4f7b55a..7177630 100644 --- a/datapath-windows/automake.mk +++ b/datapath-windows/automake.mk @@ -16,7 +16,9 @@ EXTRA_DIST += \ datapath-windows/ovsext/Conntrack-icmp.c \ datapath-windows/ovsext/Conntrack-other.c \ datapath-windows/ovsext/Conntrack-related.c \ +datapath-windows/ovsext/Conntrack-nat.c \ datapath-windows/ovsext/Conntrack-tcp.c \ +datapath-windows/ovsext/Conntrack-nat.h \ datapath-windows/ovsext/Conntrack.c \ datapath-windows/ovsext/Conntrack.h \ datapath-windows/ovsext/Datapath.c \ diff --git a/datapath-windows/ovsext/Conntrack-nat.c b/datapath-windows/ovsext/Conntrack-nat.c new file mode 100644 index 000..edf5f8f --- /dev/null +++ b/datapath-windows/ovsext/Conntrack-nat.c @@ -0,0 +1,424 @@ +#include "Conntrack-nat.h" +#include "Jhash.h" + +PLIST_ENTRY ovsNatTable = NULL; +PLIST_ENTRY ovsUnNatTable = NULL; + +/* + *--- + * OvsHashNatKey + * Hash NAT related fields in a Conntrack key. + *--- + */ +static __inline UINT32 +OvsHashNatKey(const OVS_CT_KEY *key) +{ +UINT32 hash = 0; +#define HASH_ADD(field) \ +hash = OvsJhashBytes(&key->field, sizeof(key->field), hash) + +HASH_ADD(src.addr.ipv4_aligned); +HASH_ADD(dst.addr.ipv4_aligned); +HASH_ADD(src.port); +HASH_ADD(dst.port); +HASH_ADD(zone); +#undef HASH_ADD +return hash; +} + +/* + *--- + * OvsNatKeyAreSame + * Compare NAT related fields in a Conntrack key. + *--- + */ +static __inline BOOLEAN +OvsNatKeyAreSame(const OVS_CT_KEY *key1, const OVS_CT_KEY *key2) +{ +// XXX: Compare IPv6 key as well +#define FIELD_COMPARE(field) \ +if (key1->field != key2->field) return FALSE + +FIELD_COMPARE(src.addr.ipv4_aligned); +FIELD_COMPARE(dst.addr.ipv4_aligned); +FIELD_COMPARE(src.port); +FIELD_COMPARE(dst.port); +FIELD_COMPARE(zone); +return TRUE; +#undef FIELD_COMPARE +} + +/* + *--- + * OvsNaGetBucket + * Returns the row of NAT table that has the same hash as the given NAT + * hash key. If isReverse is TRUE, returns the row of reverse NAT table + * instead. + *--- + */ +static __inline PLIST_ENTRY +OvsNatGetBucket(const OVS_CT_KEY *key, BOOLEAN isReverse) +{ +uint32_t hash = OvsHashNatKey(key); +if (isReverse) { +return &ovsUnNatTable[hash & NAT_HASH_TABLE_MASK]; +} else { +return &ovsNatTable[hash & NAT_HASH_TABLE_MASK]; +} +} + +/* + *--- + * OvsNatInit + * Initialize NAT related resources. + *--- + */ +NTSTATUS OvsNatInit() +{ +ASSERT(ovsNatTable == NULL); + +/* Init the Hash Buffer */ +ovsNatTable = OvsAllocateMemoryWithTag( +sizeof(LIST_ENTRY) * NAT_HASH_TABLE_SIZE, +OVS_CT_POOL_TAG); +if (ovsNatTable == NULL) { +goto failNoMem; +} + +ovsUnNatTable = OvsAllocateMemoryWithTag( +sizeof(LIST_ENTRY) * NAT_HASH_TABLE_SIZE, +OVS_CT_POOL_TAG); +if (ovsUnNatTable == NULL) { +goto freeNatTable; +} + +for (int i = 0; i < NAT_HASH_TABLE_SIZE; i++) { +InitializeListHead(&ovsNatTable[i]); +InitializeListHead(&ovsUnNatTable[i]); +} +return STATUS_SUCCESS; + +freeNatTable: +OvsFreeMemoryWithTag(ovsNatTable, OVS_CT_POOL_TAG); +failNoMem: +return STATUS_INSUFFICIENT_RESOURCES; +} + +/* + * + * OvsNatFlush + * Flushes out all NAT entries that match the given zone. + *--
Re: [ovs-dev] [PATCH 2/2] tunnel-tests: Add test to match tunnel traffic.
On 8 May 2017 at 11:35, Greg Rose wrote: > On Mon, 2017-05-08 at 11:15 -0700, Joe Stringer wrote: >> From: William Tu >> >> This test highlights a bug that was affecting master up until the >> previous patch. Put simply, we have two bridges: an integration bridge >> which contains a tunnel, and a physical bridge for underlay network >> connectivity. This test simulates putting UDP traffic through the >> integration bridge, with the intention to apply GRE tunnel headers and >> send the packet through the underlay bridge. The underlay bridge should >> observe GRE traffic. >> --- >> tests/tunnel-push-pop.at | 69 >> >> 1 file changed, 69 insertions(+) > > No signature. Please fix on push. Otherwise... Sorry, yeah. William submitted most of this earlier without sign-off so looking for it from him. Thanks for looking it over. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH v6 4/4] Signed-off-by: Alin Gabriel Serdean
From: Alin Gabriel Serdean --- datapath-windows/ovsext/Actions.c | 38 ++ 1 file changed, 38 insertions(+) diff --git a/datapath-windows/ovsext/Actions.c b/datapath-windows/ovsext/Actions.c index d1938f3..bb1e6ea 100644 --- a/datapath-windows/ovsext/Actions.c +++ b/datapath-windows/ovsext/Actions.c @@ -1572,6 +1572,44 @@ OvsUpdateAddressAndPort(OvsForwardingContext *ovsFwdCtx, } *portField = newPort; } +PNET_BUFFER_LIST curNbl = ovsFwdCtx->curNbl; +PNET_BUFFER_LIST newNbl = NULL; +if (layers->isTcp) { +UINT32 mss = OVSGetTcpMSS(curNbl); +if (mss) { +OVS_LOG_TRACE("l4Offset %d", layers->l4Offset); +newNbl = OvsTcpSegmentNBL(ovsFwdCtx->switchContext, curNbl, layers, + mss, 0); +if (newNbl == NULL) { +OVS_LOG_ERROR("Unable to segment NBL"); +return NDIS_STATUS_FAILURE; +} +/* Clear out LSO flags after this point */ +NET_BUFFER_LIST_INFO(newNbl, TcpLargeSendNetBufferListInfo) = 0; +} +} +/* If we didn't split the packet above, make a copy now */ +if (newNbl == NULL) { +csumInfo.Value = NET_BUFFER_LIST_INFO(curNbl, + TcpIpChecksumNetBufferListInfo); +OvsApplySWChecksumOnNB(layers, curNbl, &csumInfo); +} + +if (newNbl) { +curNbl = newNbl; +OvsCompleteNBLForwardingCtx(ovsFwdCtx, +L"Complete after cloning NBL for encapsulation"); +OvsInitForwardingCtx(ovsFwdCtx, ovsFwdCtx->switchContext, + newNbl, ovsFwdCtx->srcVportNo, 0, + NET_BUFFER_LIST_SWITCH_FORWARDING_DETAIL(newNbl), + ovsFwdCtx->completionList, + &ovsFwdCtx->layers, FALSE); +ovsFwdCtx->curNbl = newNbl; +} + +NET_BUFFER_LIST_INFO(curNbl, + TcpIpChecksumNetBufferListInfo) = 0; + return NDIS_STATUS_SUCCESS; } -- 2.10.2.windows.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH v6 3/4] datapath-windows: NAT integration with conntrack
Signed-off-by: Yin Lin --- datapath-windows/automake.mk | 4 +- datapath-windows/ovsext/Actions.c | 118 - datapath-windows/ovsext/Actions.h | 20 datapath-windows/ovsext/Conntrack.c| 187 + datapath-windows/ovsext/Conntrack.h| 25 +++-- datapath-windows/ovsext/ovsext.vcxproj | 2 + 6 files changed, 273 insertions(+), 83 deletions(-) diff --git a/datapath-windows/automake.mk b/datapath-windows/automake.mk index 7177630..f1632cc 100644 --- a/datapath-windows/automake.mk +++ b/datapath-windows/automake.mk @@ -16,9 +16,9 @@ EXTRA_DIST += \ datapath-windows/ovsext/Conntrack-icmp.c \ datapath-windows/ovsext/Conntrack-other.c \ datapath-windows/ovsext/Conntrack-related.c \ -datapath-windows/ovsext/Conntrack-nat.c \ + datapath-windows/ovsext/Conntrack-nat.c \ datapath-windows/ovsext/Conntrack-tcp.c \ -datapath-windows/ovsext/Conntrack-nat.h \ + datapath-windows/ovsext/Conntrack-nat.h \ datapath-windows/ovsext/Conntrack.c \ datapath-windows/ovsext/Conntrack.h \ datapath-windows/ovsext/Datapath.c \ diff --git a/datapath-windows/ovsext/Actions.c b/datapath-windows/ovsext/Actions.c index e2eae9a..d1938f3 100644 --- a/datapath-windows/ovsext/Actions.c +++ b/datapath-windows/ovsext/Actions.c @@ -1380,7 +1380,7 @@ PUINT8 OvsGetHeaderBySize(OvsForwardingContext *ovsFwdCtx, * based on the specified key. * */ -static __inline NDIS_STATUS +NDIS_STATUS OvsUpdateUdpPorts(OvsForwardingContext *ovsFwdCtx, const struct ovs_key_udp *udpAttr) { @@ -1427,7 +1427,7 @@ OvsUpdateUdpPorts(OvsForwardingContext *ovsFwdCtx, * based on the specified key. * */ -static __inline NDIS_STATUS +NDIS_STATUS OvsUpdateTcpPorts(OvsForwardingContext *ovsFwdCtx, const struct ovs_key_tcp *tcpAttr) { @@ -1465,12 +1465,124 @@ OvsUpdateTcpPorts(OvsForwardingContext *ovsFwdCtx, /* * + * OvsUpdateAddressAndPort -- + * Updates the source/destination IP and port fields in + * ovsFwdCtx.curNbl inline based on the specified key. + * + */ +OvsUpdateAddressAndPort(OvsForwardingContext *ovsFwdCtx, +UINT32 newAddr, UINT16 newPort, +BOOLEAN isSource, BOOLEAN isTx) +{ +PUINT8 bufferStart; +UINT32 hdrSize; +OVS_PACKET_HDR_INFO *layers = &ovsFwdCtx->layers; +IPHdr *ipHdr; +TCPHdr *tcpHdr = NULL; +UDPHdr *udpHdr = NULL; +UINT32 *addrField = NULL; +UINT16 *portField = NULL; +UINT16 *checkField = NULL; +BOOLEAN l4Offload = FALSE; +NDIS_TCP_IP_CHECKSUM_NET_BUFFER_LIST_INFO csumInfo; + +ASSERT(layers->value != 0); + +if (layers->isTcp || layers->isUdp) { +hdrSize = layers->l4Offset + + layers->isTcp ? sizeof (*tcpHdr) : sizeof (*udpHdr); +} else { +hdrSize = layers->l3Offset + sizeof (*ipHdr); +} + +bufferStart = OvsGetHeaderBySize(ovsFwdCtx, hdrSize); +if (!bufferStart) { +return NDIS_STATUS_RESOURCES; +} + +ipHdr = (IPHdr *)(bufferStart + layers->l3Offset); + +if (layers->isTcp) { +tcpHdr = (TCPHdr *)(bufferStart + layers->l4Offset); +} else if (layers->isUdp) { +udpHdr = (UDPHdr *)(bufferStart + layers->l4Offset); +} + +csumInfo.Value = NET_BUFFER_LIST_INFO(ovsFwdCtx->curNbl, + TcpIpChecksumNetBufferListInfo); +/* + * Adjust the IP header inline as dictated by the action, and also update + * the IP and the TCP checksum for the data modified. + * + * In the future, this could be optimized to make one call to + * ChecksumUpdate32(). Ignoring this for now, since for the most common + * case, we only update the TTL. + */ + +if (isSource) { +addrField = &ipHdr->saddr; +if (tcpHdr) { +portField = &tcpHdr->source; +checkField = &tcpHdr->check; +l4Offload = isTx ? (BOOLEAN)csumInfo.Transmit.TcpChecksum : +((BOOLEAN)csumInfo.Receive.TcpChecksumSucceeded || + (BOOLEAN)csumInfo.Receive.TcpChecksumFailed); +} else if (udpHdr) { +portField = &udpHdr->source; +checkField = &udpHdr->check; +l4Offload = isTx ? (BOOLEAN)csumInfo.Transmit.UdpChecksum : +((BOOLEAN)csumInfo.Receive.UdpChecksumSucceeded || + (BOOLEAN)csumInfo.Receive.UdpChecksumFailed); +} +} else { +addrField = &ipHdr->daddr; +if (tcpHdr) { +portField = &tcpHdr->dest; +che
[ovs-dev] [PATCH v6 2/4] datapath-windows: Add NAT module in conntrack
Signed-off-by: Yin Lin Issue: # Change-Id: I6f37360c36525548b343f0016304015fec8aba7d --- datapath-windows/automake.mk| 2 + datapath-windows/ovsext/Conntrack-nat.c | 424 datapath-windows/ovsext/Conntrack-nat.h | 39 +++ 3 files changed, 465 insertions(+) create mode 100644 datapath-windows/ovsext/Conntrack-nat.c create mode 100644 datapath-windows/ovsext/Conntrack-nat.h diff --git a/datapath-windows/automake.mk b/datapath-windows/automake.mk index 4f7b55a..7177630 100644 --- a/datapath-windows/automake.mk +++ b/datapath-windows/automake.mk @@ -16,7 +16,9 @@ EXTRA_DIST += \ datapath-windows/ovsext/Conntrack-icmp.c \ datapath-windows/ovsext/Conntrack-other.c \ datapath-windows/ovsext/Conntrack-related.c \ +datapath-windows/ovsext/Conntrack-nat.c \ datapath-windows/ovsext/Conntrack-tcp.c \ +datapath-windows/ovsext/Conntrack-nat.h \ datapath-windows/ovsext/Conntrack.c \ datapath-windows/ovsext/Conntrack.h \ datapath-windows/ovsext/Datapath.c \ diff --git a/datapath-windows/ovsext/Conntrack-nat.c b/datapath-windows/ovsext/Conntrack-nat.c new file mode 100644 index 000..edf5f8f --- /dev/null +++ b/datapath-windows/ovsext/Conntrack-nat.c @@ -0,0 +1,424 @@ +#include "Conntrack-nat.h" +#include "Jhash.h" + +PLIST_ENTRY ovsNatTable = NULL; +PLIST_ENTRY ovsUnNatTable = NULL; + +/* + *--- + * OvsHashNatKey + * Hash NAT related fields in a Conntrack key. + *--- + */ +static __inline UINT32 +OvsHashNatKey(const OVS_CT_KEY *key) +{ +UINT32 hash = 0; +#define HASH_ADD(field) \ +hash = OvsJhashBytes(&key->field, sizeof(key->field), hash) + +HASH_ADD(src.addr.ipv4_aligned); +HASH_ADD(dst.addr.ipv4_aligned); +HASH_ADD(src.port); +HASH_ADD(dst.port); +HASH_ADD(zone); +#undef HASH_ADD +return hash; +} + +/* + *--- + * OvsNatKeyAreSame + * Compare NAT related fields in a Conntrack key. + *--- + */ +static __inline BOOLEAN +OvsNatKeyAreSame(const OVS_CT_KEY *key1, const OVS_CT_KEY *key2) +{ +// XXX: Compare IPv6 key as well +#define FIELD_COMPARE(field) \ +if (key1->field != key2->field) return FALSE + +FIELD_COMPARE(src.addr.ipv4_aligned); +FIELD_COMPARE(dst.addr.ipv4_aligned); +FIELD_COMPARE(src.port); +FIELD_COMPARE(dst.port); +FIELD_COMPARE(zone); +return TRUE; +#undef FIELD_COMPARE +} + +/* + *--- + * OvsNaGetBucket + * Returns the row of NAT table that has the same hash as the given NAT + * hash key. If isReverse is TRUE, returns the row of reverse NAT table + * instead. + *--- + */ +static __inline PLIST_ENTRY +OvsNatGetBucket(const OVS_CT_KEY *key, BOOLEAN isReverse) +{ +uint32_t hash = OvsHashNatKey(key); +if (isReverse) { +return &ovsUnNatTable[hash & NAT_HASH_TABLE_MASK]; +} else { +return &ovsNatTable[hash & NAT_HASH_TABLE_MASK]; +} +} + +/* + *--- + * OvsNatInit + * Initialize NAT related resources. + *--- + */ +NTSTATUS OvsNatInit() +{ +ASSERT(ovsNatTable == NULL); + +/* Init the Hash Buffer */ +ovsNatTable = OvsAllocateMemoryWithTag( +sizeof(LIST_ENTRY) * NAT_HASH_TABLE_SIZE, +OVS_CT_POOL_TAG); +if (ovsNatTable == NULL) { +goto failNoMem; +} + +ovsUnNatTable = OvsAllocateMemoryWithTag( +sizeof(LIST_ENTRY) * NAT_HASH_TABLE_SIZE, +OVS_CT_POOL_TAG); +if (ovsUnNatTable == NULL) { +goto freeNatTable; +} + +for (int i = 0; i < NAT_HASH_TABLE_SIZE; i++) { +InitializeListHead(&ovsNatTable[i]); +InitializeListHead(&ovsUnNatTable[i]); +} +return STATUS_SUCCESS; + +freeNatTable: +OvsFreeMemoryWithTag(ovsNatTable, OVS_CT_POOL_TAG); +failNoMem: +return STATUS_INSUFFICIENT_RESOURCES; +} + +/* + * + * OvsNatFlush + * Flushes out all NAT entries that match the given zone. + * + */ +VOID OvsNatFlush(UINT16 zone) +{ +PLIST_ENTRY link, next; +for (int i = 0; i < NAT_HASH_TABLE_SIZE; i++) { +LIST_FORALL_SAFE(&ovsNatTable[i], link, next) { +POVS_NAT_ENTRY entry = +CONTAINING_RECORD(link, OVS_NAT_ENTRY, link); +/* zone is a non-zero value */ +if (!zone || zone == entry->key.zone) { +
[ovs-dev] [PATCH v6 1/4] datapath-windows: Add support for NAT in conntrack
From: Anand Kumar Add support for parsing netlink attributes related to NAT in conntrack. Co-Authored-by: Anand Kumar Co-Authored-by: Darrell Ball Signed-off-by: Yin Lin --- datapath-windows/ovsext/Conntrack.c | 73 - datapath-windows/ovsext/Conntrack.h | 17 + datapath-windows/ovsext/Flow.c | 4 +- 3 files changed, 90 insertions(+), 4 deletions(-) diff --git a/datapath-windows/ovsext/Conntrack.c b/datapath-windows/ovsext/Conntrack.c index dce0c1b..9824368 100644 --- a/datapath-windows/ovsext/Conntrack.c +++ b/datapath-windows/ovsext/Conntrack.c @@ -645,7 +645,8 @@ OvsCtExecute_(PNET_BUFFER_LIST curNbl, UINT16 zone, MD_MARK *mark, MD_LABELS *labels, - PCHAR helper) + PCHAR helper, + PNAT_ACTION_INFO natInfo) { NDIS_STATUS status = NDIS_STATUS_SUCCESS; POVS_CT_ENTRY entry = NULL; @@ -654,6 +655,9 @@ OvsCtExecute_(PNET_BUFFER_LIST curNbl, UINT64 currentTime; NdisGetCurrentSystemTime((LARGE_INTEGER *) ¤tTime); +/* XXX: Not referenced for now */ +UNREFERENCED_PARAMETER(natInfo); + /* Retrieve the Conntrack Key related fields from packet */ OvsCtSetupLookupCtx(key, zone, &ctx, curNbl, layers->l4Offset); @@ -730,11 +734,14 @@ OvsExecuteConntrackAction(OvsForwardingContext *fwdCtx, MD_MARK *mark = NULL; MD_LABELS *labels = NULL; PCHAR helper = NULL; +NAT_ACTION_INFO natActionInfo; PNET_BUFFER_LIST curNbl = fwdCtx->curNbl; OVS_PACKET_HDR_INFO *layers = &fwdCtx->layers; PNET_BUFFER_LIST newNbl = NULL; +NAT_ACTION_INFO natActionInfo; NDIS_STATUS status; +memset(&natActionInfo, 0, sizeof natActionInfo); status = OvsDetectCtPacket(fwdCtx, key, &newNbl); if (status != NDIS_STATUS_SUCCESS) { return status; @@ -757,6 +764,68 @@ OvsExecuteConntrackAction(OvsForwardingContext *fwdCtx, if (ctAttr) { labels = NlAttrGet(ctAttr); } +natActionInfo.natAction = NAT_ACTION_NONE; +ctAttr = NlAttrFindNested(a, OVS_CT_ATTR_NAT); +if (ctAttr) { +/* Pares Nested NAT attributes. */ +PNL_ATTR natAttr; +unsigned int left; +BOOLEAN hasMinIp = FALSE; +BOOLEAN hasMinPort = FALSE; +BOOLEAN hasMaxIp = FALSE; +BOOLEAN hasMaxPort = FALSE; +NL_NESTED_FOR_EACH_UNSAFE (natAttr, left, ctAttr) { +enum ovs_nat_attr sub_type_nest = NlAttrType(natAttr); +switch(sub_type_nest) { +case OVS_NAT_ATTR_SRC: +case OVS_NAT_ATTR_DST: +natActionInfo.natAction |= +((sub_type_nest == OVS_NAT_ATTR_SRC) +? NAT_ACTION_SRC : NAT_ACTION_DST); +break; +case OVS_NAT_ATTR_IP_MIN: +memcpy(&natActionInfo.minAddr, + NlAttrData(natAttr), natAttr->nlaLen - NLA_HDRLEN); +hasMinIp = TRUE; +break; +case OVS_NAT_ATTR_IP_MAX: +memcpy(&natActionInfo.maxAddr, + NlAttrData(natAttr), natAttr->nlaLen - NLA_HDRLEN); +hasMaxIp = TRUE; +break; +case OVS_NAT_ATTR_PROTO_MIN: +natActionInfo.minPort = NlAttrGetU16(natAttr); +hasMinPort = TRUE; +break; +case OVS_NAT_ATTR_PROTO_MAX: +natActionInfo.maxPort = NlAttrGetU16(natAttr); +hasMaxPort = TRUE; +break; +case OVS_NAT_ATTR_PERSISTENT: +case OVS_NAT_ATTR_PROTO_HASH: +case OVS_NAT_ATTR_PROTO_RANDOM: +break; +} +} +if (natActionInfo.natAction == NAT_ACTION_NONE) { +natActionInfo.natAction = NAT_ACTION_REVERSE; +} +if (hasMinIp && !hasMaxIp) { +memcpy(&natActionInfo.maxAddr, + &natActionInfo.minAddr, + sizeof(natActionInfo.maxAddr)); +} +if (hasMinPort && !hasMaxPort) { +natActionInfo.maxPort = natActionInfo.minPort; +} +if (hasMinPort || hasMaxPort) { +if (natActionInfo.natAction & NAT_ACTION_SRC) { +natActionInfo.natAction |= NAT_ACTION_SRC_PORT; +} else if (natActionInfo.natAction & NAT_ACTION_DST) { +natActionInfo.natAction |= NAT_ACTION_DST_PORT; +} +} +} ctAttr = NlAttrFindNested(a, OVS_CT_ATTR_HELPER); if (ctAttr) { helper = NlAttrGetString(ctAttr); @@ -776,7 +845,7 @@ OvsExecuteConntrackAction(OvsForwardingContext *fwdCtx, } /* If newNbl is not allocated, use the current Nbl*/ status = OvsCtExecute_(newNbl != NULL ? newNbl : curNbl, key, layers, - commit, force, zone, mark, labels, helper); + commit, force, zone, mark, labels, helper, &n
Re: [ovs-dev] [PATCH] ovn-openstack.rst: Fix typo.
On Mon, May 08, 2017 at 02:51:21PM -0700, Andy Zhou wrote: > On Mon, May 8, 2017 at 1:53 PM, Ben Pfaff wrote: > > The text here was inconsistent: it referred to port 4 in the text just > > above but the example used port 5 in one place. This fixes the issue. > > > > Signed-off-by: Ben Pfaff > > Acked-by: Andy Zhou Thanks! I applied this to master. > I must have autocorrected this when I typed in this line. :-(. Well, the reader does need to correct it, after all. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] ovn-openstack.rst: Fix typo.
On Mon, May 8, 2017 at 1:53 PM, Ben Pfaff wrote: > The text here was inconsistent: it referred to port 4 in the text just > above but the example used port 5 in one place. This fixes the issue. > > Signed-off-by: Ben Pfaff Acked-by: Andy Zhou I must have autocorrected this when I typed in this line. :-(. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v5] tunneling: Avoid recirculation on datapath by computing the recirculate actions at translate time.
On Mon, May 8, 2017 at 11:29 AM, Zoltán Balogh wrote: > Hi William, > > The reason of 'incorrect' n_bytes stats could be due to the mechanism truncate > and tunneling with clone action do work. As you wrote, the packet size is > changed when the output action is applied. > > Let's say, I have the config below: > >ns1 > | > +--o-+ > | br-int | > +-o--+ > gre0 > LOCAL > +-o--+ > | br-p | > +-o--+ >| > ns2 > > $ ovs-ofctl dump-flows br-int > NXST_FLOW reply (xid=0x4): > cookie=0x0, duration=525.264s, table=0, n_packets=4, n_bytes=4168, > idle_age=62, in_port=1 actions=output(port=2,max_len=200) > $ ovs-ofctl dump-flows br-p > NXST_FLOW reply (xid=0x4): > cookie=0x0, duration=527.255s, table=0, n_packets=5, n_bytes=4210, > idle_age=64, in_port=LOCAL actions=dec_ttl,output:3 > $ ovs-vsctl show > 73ee7b44-e781-4dfd-ad4c-f7790f644000 > Bridge br-int > Port br-int > Interface br-int > type: internal > Port "br-int-ns1" > Interface "br-int-ns1" > Port "gre0" > Interface "gre0" > type: gre > options: {remote_ip="10.0.0.20"} > Bridge br-p > Port "br-p-ns2" > Interface "br-p-ns2" > Port br-p > Interface br-p > type: internal > > If I send IPCM echo request from ns1 towards ns2 through the GRE tunnel, then > I get the following datapath flow: > > netdev@ovs-netdev: hit:7 missed:11 > br-int: > br-int 65534/1: (tap) > br-int-ns1 1/3: (system) > gre0 2/5: (gre: remote_ip=10.0.0.20) > br-p: > br-p 65534/2: (tap) > br-p-ns2 3/4: (system) > > flow-dump from non-dpdk interfaces: > recirc_id(0),in_port(3),eth_type(0x0800),ipv4(tos=0/0x3,ttl=64,frag=no), > packets:0, bytes:0, used:never, actions:trunc(200),clone(tnl_push(tnl_port(5) > ,header(size=38,type=3,eth(dst=be:a5:83:73:e9:dc,src=36:23:30:b4:b2:48,dl_type=0x0800),ipv4(src=10.0.0.10,dst=10.0.0.20,proto=47,tos=0,ttl=64,frag=0x4 > 000),gre((flags=0x0,proto=0x6558))),out_port(2)),set(ipv4(tos=0/0x3,ttl=63)),4) > > The packet arrives on port 3 (br-int-ns1), then cutlen is set via trunc, then > clone is applied. Clone pushes tunnel header, then tos/ttl is set, finally, > the > packet is output to port 4 (br-p-ns2). And this is the point where packet size > is changed, at the end of the datapath flow. So, we have a single flow, and > size of a packet will not be changed while actions are applied except at the > end of processing via output action. It may be cleaner if we add a new trunc action for the datapath, say trunc2 that applies to all outputs within the clone. So the translation will look like: clone(trunc2, native tunnel translation). Would this approach work? > > Without the "Avoid recirculation" patch we have two datapath flows, because > the > packet is recirculated. At the end of the first flow the packet size is > changed > and the packet with modified size enters the OF pipeline again. > > What is the reason not to change packet size when truncate action is applied? > One of the reasons could be that we introduced trunc before clone. Otherwise, a clone(trunc2, output:x) is equivalent to trunc, output:x. Note that the trunc datapath action is different than other datapath actions, which usually applies to all following actions. Native tunneling may be the first use case that motivates trunc2, which should have the normal datapath action behavior. > Best regards, > Zoltan > > >> -Original Message- >> From: William Tu [mailto:u9012...@gmail.com] >> Sent: Monday, May 08, 2017 4:55 PM >> To: Zoltán Balogh >> Cc: Joe Stringer ; Sugesh Chandran >> ; ovs dev ; Ben >> Pfaff ; Andy Zhou ; Jan Scheurich >> >> Subject: Re: [ovs-dev] [PATCH v5] tunneling: Avoid recirculation on datapath >> by computing the recirculate actions at >> translate time. >> >> Hi Zoltan, >> >> Yes, we disallow truncate followed by patch port as output on purpose. >> The reason is that truncate action sets the packet's new length in its >> metadata instead of immediately change the size. Actual change of the >> packet size happens when we see the output action. >> >> Carrying this un-applied truncate metadata to another bridge >> complicates many cases. Ex: what if we truncate, send to patch port, >> and the other bridge does broadcast? Do each of the port gets >> truncated packets? >> Previous discussion: >> https://patchwork.ozlabs.org/patch/631972/ >> >> I applied your patch and it passes my testcase (posted in previous >> email, which makes sure the outer header is matched). Now I see the >> issue you mentioned that flow stats are not correct on the underlay >> bridge. I'm still trying to figure out the reason. >> >> Thanks. >> William >> >> On Mon, May 8, 2017 at 4:15 AM, Zoltán Balogh >> wrote: >> > Hi,
Re: [ovs-dev] [PATCH 0/3] Support OpenFlow 1.5 packet-out
On Mon, May 08, 2017 at 10:24:16AM -0700, Yi-Hung Wei wrote: > Hi Ben, > > Thanks for your valuable review. Yes, it makes sense to use 'struct flow' > instead of 'struct match' to represent metadata. > > As for the "pipeline fields", I briefly look at ovs-fields (7), and I think > the > patch series should be update to include at least the following fields. > * Tunnel fields: (tun_src/dst, tun_ipv6_src/dst, tun_gbp_id, > tun_gbp_flags, tun_flags, tun_metadata0 - tun_metadata63) > * Register fields: (reg0 - reg15, xreg0 - xreg7, xxreg0 - xxreg3) > > I will address these issues and send out a v2 soon. OK, thank you! Maybe there should be an mf_*() function that identifies pipeline fields. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH] ovn-openstack.rst: Fix typo.
The text here was inconsistent: it referred to port 4 in the text just above but the example used port 5 in one place. This fixes the issue. Signed-off-by: Ben Pfaff --- Documentation/tutorials/ovn-openstack.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/tutorials/ovn-openstack.rst b/Documentation/tutorials/ovn-openstack.rst index 60c624923bbc..7b4d9cfd9e96 100644 --- a/Documentation/tutorials/ovn-openstack.rst +++ b/Documentation/tutorials/ovn-openstack.rst @@ -923,7 +923,7 @@ important part is:: which means that the packet is ultimately being output to OpenFlow port 4. That's port ``b``, which you can confirm with:: - $ sudo ovs-vsctl find interface ofport=5 + $ sudo ovs-vsctl find interface ofport=4 _uuid : 840a5aca-ea8d-4c16-a11b-a94e0f408091 admin_state : up bfd : {} -- 2.10.2 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH 4/6] ovsdb: add support for role-based access controls
On Mon, May 08, 2017 at 01:35:54PM -0400, Lance Richardson wrote: > > From: "Ben Pfaff" > > To: "Lance Richardson" > > Cc: d...@openvswitch.org > > > As a high-level comment, it looks to me like documentation is missing > > for the ways that this affects the schema and the wire protocol. We try > > to document those kinds of changes, relative to the RFC 7074 > > specification, in ovsdb/ovsdb-server.1.in. > > > > Would something like this be sufficient? Seems suitable to me, thanks. (I'll go through it carefully in the next version, of course.) Thank you! ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] checkpatch: Fix inconsistencies skipping datapath files.
On Mon, May 08, 2017 at 03:16:55PM -0400, Aaron Conole wrote: > Ben Pfaff writes: > > > The code in checkpatch inconsistently stripped "a/" or "b/" from the > > beginning of a file name, and the check for "datapath" only worked when > > the prefix was not stripped. This fixes the problem. > > > > CC: Aaron Conole > > Signed-off-by: Ben Pfaff > > --- > > LGTM - just tested it out. > > Acked-by: Aaron Conole Thanks! I applied this to master. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] checkpatch: Fix inconsistencies skipping datapath files.
Ben Pfaff writes: > The code in checkpatch inconsistently stripped "a/" or "b/" from the > beginning of a file name, and the check for "datapath" only worked when > the prefix was not stripped. This fixes the problem. > > CC: Aaron Conole > Signed-off-by: Ben Pfaff > --- LGTM - just tested it out. Acked-by: Aaron Conole ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH 2/2] tunnel-tests: Add test to match tunnel traffic.
On Mon, 2017-05-08 at 11:15 -0700, Joe Stringer wrote: > From: William Tu > > This test highlights a bug that was affecting master up until the > previous patch. Put simply, we have two bridges: an integration bridge > which contains a tunnel, and a physical bridge for underlay network > connectivity. This test simulates putting UDP traffic through the > integration bridge, with the intention to apply GRE tunnel headers and > send the packet through the underlay bridge. The underlay bridge should > observe GRE traffic. > --- > tests/tunnel-push-pop.at | 69 > > 1 file changed, 69 insertions(+) No signature. Please fix on push. Otherwise... Acked-by: Greg Rose > > diff --git a/tests/tunnel-push-pop.at b/tests/tunnel-push-pop.at > index 4eeac4154dbc..fe901525ea62 100644 > --- a/tests/tunnel-push-pop.at > +++ b/tests/tunnel-push-pop.at > @@ -225,3 +225,72 @@ OVS_WAIT_UNTIL([test `ovs-pcap p0.pcap | grep > 5054000a505400091235 | wc > > OVS_VSWITCHD_STOP > AT_CLEANUP > + > +AT_SETUP([tunnel_push_pop - matching on physical bridge]) > + > +OVS_VSWITCHD_START([add-port br0 p0 -- set Interface p0 type=dummy > ofport_request=1 other-config:hwaddr=aa:55:aa:55:00:00]) > +AT_CHECK([ovs-vsctl add-br int-br -- set bridge int-br datapath_type=dummy], > [0]) > +AT_CHECK([ovs-vsctl add-port int-br t1 -- set Interface t1 type=gre \ > + options:remote_ip=1.1.2.92 options:key=456 > ofport_request=3], [0]) > + > +AT_CHECK([ovs-appctl dpif/show], [0], [dnl > +dummy@ovs-dummy: hit:0 missed:0 > + br0: > + br0 65534/100: (dummy-internal) > + p0 1/1: (dummy) > + int-br: > + int-br 65534/2: (dummy-internal) > + t1 3/3: (gre: key=456, remote_ip=1.1.2.92) > +]) > + > +AT_CHECK([ovs-appctl dpif/show], [0], [dnl > +dummy@ovs-dummy: hit:0 missed:0 > + br0: > + br0 65534/100: (dummy-internal) > + p0 1/1: (dummy) > + int-br: > + int-br 65534/2: (dummy-internal) > + t1 3/3: (gre: key=456, remote_ip=1.1.2.92) > +]) > + > +AT_CHECK([ovs-appctl netdev-dummy/ip4addr br0 1.1.2.88/24], [0], [OK > +]) > +AT_CHECK([ovs-appctl ovs/route/add 1.1.2.92/24 br0], [0], [OK > +]) > +AT_CHECK([ovs-ofctl add-flow br0 'priority=1,action=normal']) > + > +dnl Check ARP request > +AT_CHECK([ovs-vsctl -- set Interface p0 options:pcap=p0.pcap]) > +AT_CHECK([ovs-appctl netdev-dummy/receive int-br > 'in_port(2),eth(src=aa:55:aa:55:00:00,dst=f8:bc:12:ff:ff:ff),eth_type(0x0800),ipv4(src=1.1.3.92,dst=1.1.3.88,proto=1,tos=0,ttl=64,frag=no),icmp(type=0,code=0)']) > + > +dnl Check ARP Snoop > +AT_CHECK([ovs-appctl netdev-dummy/receive br0 > 'recirc_id(0),in_port(100),eth(src=f8:bc:12:44:34:b6,dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),arp(sip=1.1.2.92,tip=1.1.2.88,op=2,sha=f8:bc:12:44:34:b6,tha=00:00:00:00:00:00)']) > + > +AT_CHECK([ovs-appctl tnl/neigh/show | tail -n+3 | sort], [0], [dnl > +1.1.2.92 f8:bc:12:44:34:b6 br0 > +]) > + > +AT_CHECK([ovs-appctl tnl/ports/show |sort], [0], [dnl > +Listening ports: > +gre_sys (3) > +]) > + > +AT_CHECK([ovs-ofctl add-flow br0 'priority=99,udp,action=normal']) > + > +dnl Check GRE tunnel push > +AT_CHECK([ovs-ofctl add-flow int-br action=3]) > + > +AT_CHECK([ovs-appctl ofproto/trace ovs-dummy > 'in_port(2),eth_type(0x0800),ipv4(src=1.1.3.88,dst=1.1.3.112,proto=17,tos=0,ttl=64,frag=no),udp(src=51283,dst=4789)'], > [0], [stdout]) > +AT_CHECK([tail -1 stdout], [0], > + [Datapath actions: > tnl_push(tnl_port(3),header(size=42,type=3,eth(dst=f8:bc:12:44:34:b6,src=aa:55:aa:55:00:00,dl_type=0x0800),ipv4(src=1.1.2.88,dst=1.1.2.92,proto=47,tos=0,ttl=64,frag=0x4000),gre((flags=0x2000,proto=0x6558),key=0x1c8)),out_port(100)) > +]) > + > +AT_CHECK([ovs-appctl netdev-dummy/receive int-br > '5054000a505400091234']) > +AT_CHECK([ovs-ofctl dump-flows br0 | ofctl_strip | sort], [0], [dnl) > + n_packets=3, n_bytes=140, priority=1 actions=NORMAL > + priority=99,udp actions=NORMAL > +NXST_FLOW reply: > +]) > + > +OVS_VSWITCHD_STOP > +AT_CLEANUP ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH 1/2] Revert "tunneling: Avoid recirculation on datapath."
On Mon, 2017-05-08 at 11:15 -0700, Joe Stringer wrote: > This reverts commit f1dac5128ce6db2e493f0d1c7a8b53fb9f34476f. When this > commit was introduced, it broke the 'make check-system-userspace' > testsuite. It appears that the new translation fails to modify the flow > in a way that would represent the flow as an encapsulated flow when the > traffic is patched through to the second bridge. As such, rather than > matching on, for example, "ip,proto=47" for gre, it would use the inner > packet's flow headers. It also results in problems reporting statistics, > as the tunnel's header is not reflected in subsequent statistics and > truncation is not properly applied during translation. > > While a refreshed approach to solving the above problem is formed, > revert this patch. > > Reported-at: > https://mail.openvswitch.org/pipermail/ovs-dev/2017-May/331972.html > Signed-off-by: Joe Stringer Acked-by: Greg Rose > --- > lib/dpif-netdev.c | 18 ++- > ofproto/ofproto-dpif-xlate.c | 280 > -- > tests/ofproto-dpif.at | 11 +- > tests/ovn.at | 6 +- > tests/tunnel-push-pop-ipv6.at | 10 +- > tests/tunnel-push-pop.at | 12 +- > 6 files changed, 173 insertions(+), 164 deletions(-) > > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c > index 4ee5d058aff8..d21515657634 100644 > --- a/lib/dpif-netdev.c > +++ b/lib/dpif-netdev.c > @@ -4970,8 +4970,24 @@ dp_execute_cb(void *aux_, struct dp_packet_batch > *packets_, > > case OVS_ACTION_ATTR_TUNNEL_PUSH: > if (*depth < MAX_RECIRC_DEPTH) { > +struct dp_packet_batch tnl_pkt; > +struct dp_packet_batch *orig_packets_ = packets_; > +int err; > + > +if (!may_steal) { > +dp_packet_batch_clone(&tnl_pkt, packets_); > +packets_ = &tnl_pkt; > +dp_packet_batch_reset_cutlen(orig_packets_); > +} > + > dp_packet_batch_apply_cutlen(packets_); > -push_tnl_action(pmd, a, packets_); > + > +err = push_tnl_action(pmd, a, packets_); > +if (!err) { > +(*depth)++; > +dp_netdev_recirculate(pmd, packets_); > +(*depth)--; > +} > return; > } > break; > diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c > index b308f21de100..bc3a310227da 100644 > --- a/ofproto/ofproto-dpif-xlate.c > +++ b/ofproto/ofproto-dpif-xlate.c > @@ -425,10 +425,6 @@ static void xlate_action_set(struct xlate_ctx *ctx); > static void xlate_commit_actions(struct xlate_ctx *ctx); > > static void > -apply_nested_clone_actions(struct xlate_ctx *ctx, const struct xport *in_dev, > - struct xport *out_dev); > - > -static void > ctx_trigger_freeze(struct xlate_ctx *ctx) > { > ctx->exit = true; > @@ -3215,17 +3211,7 @@ build_tunnel_send(struct xlate_ctx *ctx, const struct > xport *xport, > } > tnl_push_data.tnl_port = odp_to_u32(tunnel_odp_port); > tnl_push_data.out_port = odp_to_u32(out_dev->odp_port); > - > -size_t push_action_size = 0; > -size_t clone_ofs = nl_msg_start_nested(ctx->odp_actions, > - OVS_ACTION_ATTR_CLONE); > odp_put_tnl_push_action(ctx->odp_actions, &tnl_push_data); > -push_action_size = ctx->odp_actions->size; > -apply_nested_clone_actions(ctx, xport, out_dev); > -if (ctx->odp_actions->size > push_action_size) { > -/* Update the CLONE action only when combined */ > -nl_msg_end_nested(ctx->odp_actions, clone_ofs); > -} > return 0; > } > > @@ -3261,136 +3247,6 @@ xlate_flow_is_protected(const struct xlate_ctx *ctx, > const struct flow *flow, co > xport_in->xbundle->protected && xport_out->xbundle->protected); > } > > -/* Populate and apply nested actions on 'out_dev'. > - * The nested actions are applied on cloned packets in dp while outputting to > - * either patch or tunnel ports. > - * On output to a patch port, the output action will be replaced with set of > - * nested actions on the peer patch port. > - * Similarly on output to a tunnel port, the post nested actions on > - * tunnel are chained up with the tunnel-push action. > - */ > -static void > -apply_nested_clone_actions(struct xlate_ctx *ctx, const struct xport *in_dev, > - struct xport *out_dev) > -{ > -struct flow *flow = &ctx->xin->flow; > -struct flow old_flow = ctx->xin->flow; > -struct flow_tnl old_flow_tnl_wc = ctx->wc->masks.tunnel; > -bool old_conntrack = ctx->conntracked; > -bool old_was_mpls = ctx->was_mpls; > -ovs_version_t old_version = ctx->xin->tables_version; > -struct ofpbuf old_stack = ctx->stack; > -union mf_subvalue new_stack[1024 / sizeof(union mf_subvalue)]; > -struct ofpbuf old_action_set = ctx->action_set; > -struct ovs_list *old_trace = ctx->xin->trace;
Re: [ovs-dev] [PATCH v5] tunneling: Avoid recirculation on datapath by computing the recirculate actions at translate time.
Hi William, The reason of 'incorrect' n_bytes stats could be due to the mechanism truncate and tunneling with clone action do work. As you wrote, the packet size is changed when the output action is applied. Let's say, I have the config below: ns1 | +--o-+ | br-int | +-o--+ gre0 LOCAL +-o--+ | br-p | +-o--+ | ns2 $ ovs-ofctl dump-flows br-int NXST_FLOW reply (xid=0x4): cookie=0x0, duration=525.264s, table=0, n_packets=4, n_bytes=4168, idle_age=62, in_port=1 actions=output(port=2,max_len=200) $ ovs-ofctl dump-flows br-p NXST_FLOW reply (xid=0x4): cookie=0x0, duration=527.255s, table=0, n_packets=5, n_bytes=4210, idle_age=64, in_port=LOCAL actions=dec_ttl,output:3 $ ovs-vsctl show 73ee7b44-e781-4dfd-ad4c-f7790f644000 Bridge br-int Port br-int Interface br-int type: internal Port "br-int-ns1" Interface "br-int-ns1" Port "gre0" Interface "gre0" type: gre options: {remote_ip="10.0.0.20"} Bridge br-p Port "br-p-ns2" Interface "br-p-ns2" Port br-p Interface br-p type: internal If I send IPCM echo request from ns1 towards ns2 through the GRE tunnel, then I get the following datapath flow: netdev@ovs-netdev: hit:7 missed:11 br-int: br-int 65534/1: (tap) br-int-ns1 1/3: (system) gre0 2/5: (gre: remote_ip=10.0.0.20) br-p: br-p 65534/2: (tap) br-p-ns2 3/4: (system) flow-dump from non-dpdk interfaces: recirc_id(0),in_port(3),eth_type(0x0800),ipv4(tos=0/0x3,ttl=64,frag=no), packets:0, bytes:0, used:never, actions:trunc(200),clone(tnl_push(tnl_port(5) ,header(size=38,type=3,eth(dst=be:a5:83:73:e9:dc,src=36:23:30:b4:b2:48,dl_type=0x0800),ipv4(src=10.0.0.10,dst=10.0.0.20,proto=47,tos=0,ttl=64,frag=0x4 000),gre((flags=0x0,proto=0x6558))),out_port(2)),set(ipv4(tos=0/0x3,ttl=63)),4) The packet arrives on port 3 (br-int-ns1), then cutlen is set via trunc, then clone is applied. Clone pushes tunnel header, then tos/ttl is set, finally, the packet is output to port 4 (br-p-ns2). And this is the point where packet size is changed, at the end of the datapath flow. So, we have a single flow, and size of a packet will not be changed while actions are applied except at the end of processing via output action. Without the "Avoid recirculation" patch we have two datapath flows, because the packet is recirculated. At the end of the first flow the packet size is changed and the packet with modified size enters the OF pipeline again. What is the reason not to change packet size when truncate action is applied? Best regards, Zoltan > -Original Message- > From: William Tu [mailto:u9012...@gmail.com] > Sent: Monday, May 08, 2017 4:55 PM > To: Zoltán Balogh > Cc: Joe Stringer ; Sugesh Chandran ; > ovs dev ; Ben > Pfaff ; Andy Zhou ; Jan Scheurich > > Subject: Re: [ovs-dev] [PATCH v5] tunneling: Avoid recirculation on datapath > by computing the recirculate actions at > translate time. > > Hi Zoltan, > > Yes, we disallow truncate followed by patch port as output on purpose. > The reason is that truncate action sets the packet's new length in its > metadata instead of immediately change the size. Actual change of the > packet size happens when we see the output action. > > Carrying this un-applied truncate metadata to another bridge > complicates many cases. Ex: what if we truncate, send to patch port, > and the other bridge does broadcast? Do each of the port gets > truncated packets? > Previous discussion: > https://patchwork.ozlabs.org/patch/631972/ > > I applied your patch and it passes my testcase (posted in previous > email, which makes sure the outer header is matched). Now I see the > issue you mentioned that flow stats are not correct on the underlay > bridge. I'm still trying to figure out the reason. > > Thanks. > William > > On Mon, May 8, 2017 at 4:15 AM, Zoltán Balogh > wrote: > > Hi, > > > > I looked into the code of truncate, I saw that patch ports are not handled. > > On the other hand I saw that "Avoid recirculation" commit should not be > > affected by this fact. I verified that > packets are truncated correctly with my last patch I sent you before, but > flow stats are not correct on the underlay > bridge. > > > > Could you confirm this, please? > > > > Best regards, > > Zoltan > > > >> -Original Message- > >> From: Zoltán Balogh > >> Sent: Sunday, May 07, 2017 9:16 PM > >> To: 'Joe Stringer' ; William Tu > >> Cc: Sugesh Chandran ; ovs dev > >> ; Ben Pfaff ; Andy > Zhou > >> ; Jan Scheurich > >> Subject: RE: [ovs-dev] [PATCH v5] tunneling: Avoid recirculation on > >> datapath by computing the recirculate actions > at > >> translate time. > >> > >> > >> I have a patch that fixes tunneling over patc
[ovs-dev] [PATCH 2/2] tunnel-tests: Add test to match tunnel traffic.
From: William Tu This test highlights a bug that was affecting master up until the previous patch. Put simply, we have two bridges: an integration bridge which contains a tunnel, and a physical bridge for underlay network connectivity. This test simulates putting UDP traffic through the integration bridge, with the intention to apply GRE tunnel headers and send the packet through the underlay bridge. The underlay bridge should observe GRE traffic. --- tests/tunnel-push-pop.at | 69 1 file changed, 69 insertions(+) diff --git a/tests/tunnel-push-pop.at b/tests/tunnel-push-pop.at index 4eeac4154dbc..fe901525ea62 100644 --- a/tests/tunnel-push-pop.at +++ b/tests/tunnel-push-pop.at @@ -225,3 +225,72 @@ OVS_WAIT_UNTIL([test `ovs-pcap p0.pcap | grep 5054000a505400091235 | wc OVS_VSWITCHD_STOP AT_CLEANUP + +AT_SETUP([tunnel_push_pop - matching on physical bridge]) + +OVS_VSWITCHD_START([add-port br0 p0 -- set Interface p0 type=dummy ofport_request=1 other-config:hwaddr=aa:55:aa:55:00:00]) +AT_CHECK([ovs-vsctl add-br int-br -- set bridge int-br datapath_type=dummy], [0]) +AT_CHECK([ovs-vsctl add-port int-br t1 -- set Interface t1 type=gre \ + options:remote_ip=1.1.2.92 options:key=456 ofport_request=3], [0]) + +AT_CHECK([ovs-appctl dpif/show], [0], [dnl +dummy@ovs-dummy: hit:0 missed:0 + br0: + br0 65534/100: (dummy-internal) + p0 1/1: (dummy) + int-br: + int-br 65534/2: (dummy-internal) + t1 3/3: (gre: key=456, remote_ip=1.1.2.92) +]) + +AT_CHECK([ovs-appctl dpif/show], [0], [dnl +dummy@ovs-dummy: hit:0 missed:0 + br0: + br0 65534/100: (dummy-internal) + p0 1/1: (dummy) + int-br: + int-br 65534/2: (dummy-internal) + t1 3/3: (gre: key=456, remote_ip=1.1.2.92) +]) + +AT_CHECK([ovs-appctl netdev-dummy/ip4addr br0 1.1.2.88/24], [0], [OK +]) +AT_CHECK([ovs-appctl ovs/route/add 1.1.2.92/24 br0], [0], [OK +]) +AT_CHECK([ovs-ofctl add-flow br0 'priority=1,action=normal']) + +dnl Check ARP request +AT_CHECK([ovs-vsctl -- set Interface p0 options:pcap=p0.pcap]) +AT_CHECK([ovs-appctl netdev-dummy/receive int-br 'in_port(2),eth(src=aa:55:aa:55:00:00,dst=f8:bc:12:ff:ff:ff),eth_type(0x0800),ipv4(src=1.1.3.92,dst=1.1.3.88,proto=1,tos=0,ttl=64,frag=no),icmp(type=0,code=0)']) + +dnl Check ARP Snoop +AT_CHECK([ovs-appctl netdev-dummy/receive br0 'recirc_id(0),in_port(100),eth(src=f8:bc:12:44:34:b6,dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),arp(sip=1.1.2.92,tip=1.1.2.88,op=2,sha=f8:bc:12:44:34:b6,tha=00:00:00:00:00:00)']) + +AT_CHECK([ovs-appctl tnl/neigh/show | tail -n+3 | sort], [0], [dnl +1.1.2.92 f8:bc:12:44:34:b6 br0 +]) + +AT_CHECK([ovs-appctl tnl/ports/show |sort], [0], [dnl +Listening ports: +gre_sys (3) +]) + +AT_CHECK([ovs-ofctl add-flow br0 'priority=99,udp,action=normal']) + +dnl Check GRE tunnel push +AT_CHECK([ovs-ofctl add-flow int-br action=3]) + +AT_CHECK([ovs-appctl ofproto/trace ovs-dummy 'in_port(2),eth_type(0x0800),ipv4(src=1.1.3.88,dst=1.1.3.112,proto=17,tos=0,ttl=64,frag=no),udp(src=51283,dst=4789)'], [0], [stdout]) +AT_CHECK([tail -1 stdout], [0], + [Datapath actions: tnl_push(tnl_port(3),header(size=42,type=3,eth(dst=f8:bc:12:44:34:b6,src=aa:55:aa:55:00:00,dl_type=0x0800),ipv4(src=1.1.2.88,dst=1.1.2.92,proto=47,tos=0,ttl=64,frag=0x4000),gre((flags=0x2000,proto=0x6558),key=0x1c8)),out_port(100)) +]) + +AT_CHECK([ovs-appctl netdev-dummy/receive int-br '5054000a505400091234']) +AT_CHECK([ovs-ofctl dump-flows br0 | ofctl_strip | sort], [0], [dnl) + n_packets=3, n_bytes=140, priority=1 actions=NORMAL + priority=99,udp actions=NORMAL +NXST_FLOW reply: +]) + +OVS_VSWITCHD_STOP +AT_CLEANUP -- 2.12.0 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH 1/2] Revert "tunneling: Avoid recirculation on datapath."
This reverts commit f1dac5128ce6db2e493f0d1c7a8b53fb9f34476f. When this commit was introduced, it broke the 'make check-system-userspace' testsuite. It appears that the new translation fails to modify the flow in a way that would represent the flow as an encapsulated flow when the traffic is patched through to the second bridge. As such, rather than matching on, for example, "ip,proto=47" for gre, it would use the inner packet's flow headers. It also results in problems reporting statistics, as the tunnel's header is not reflected in subsequent statistics and truncation is not properly applied during translation. While a refreshed approach to solving the above problem is formed, revert this patch. Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2017-May/331972.html Signed-off-by: Joe Stringer --- lib/dpif-netdev.c | 18 ++- ofproto/ofproto-dpif-xlate.c | 280 -- tests/ofproto-dpif.at | 11 +- tests/ovn.at | 6 +- tests/tunnel-push-pop-ipv6.at | 10 +- tests/tunnel-push-pop.at | 12 +- 6 files changed, 173 insertions(+), 164 deletions(-) diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 4ee5d058aff8..d21515657634 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -4970,8 +4970,24 @@ dp_execute_cb(void *aux_, struct dp_packet_batch *packets_, case OVS_ACTION_ATTR_TUNNEL_PUSH: if (*depth < MAX_RECIRC_DEPTH) { +struct dp_packet_batch tnl_pkt; +struct dp_packet_batch *orig_packets_ = packets_; +int err; + +if (!may_steal) { +dp_packet_batch_clone(&tnl_pkt, packets_); +packets_ = &tnl_pkt; +dp_packet_batch_reset_cutlen(orig_packets_); +} + dp_packet_batch_apply_cutlen(packets_); -push_tnl_action(pmd, a, packets_); + +err = push_tnl_action(pmd, a, packets_); +if (!err) { +(*depth)++; +dp_netdev_recirculate(pmd, packets_); +(*depth)--; +} return; } break; diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c index b308f21de100..bc3a310227da 100644 --- a/ofproto/ofproto-dpif-xlate.c +++ b/ofproto/ofproto-dpif-xlate.c @@ -425,10 +425,6 @@ static void xlate_action_set(struct xlate_ctx *ctx); static void xlate_commit_actions(struct xlate_ctx *ctx); static void -apply_nested_clone_actions(struct xlate_ctx *ctx, const struct xport *in_dev, - struct xport *out_dev); - -static void ctx_trigger_freeze(struct xlate_ctx *ctx) { ctx->exit = true; @@ -3215,17 +3211,7 @@ build_tunnel_send(struct xlate_ctx *ctx, const struct xport *xport, } tnl_push_data.tnl_port = odp_to_u32(tunnel_odp_port); tnl_push_data.out_port = odp_to_u32(out_dev->odp_port); - -size_t push_action_size = 0; -size_t clone_ofs = nl_msg_start_nested(ctx->odp_actions, - OVS_ACTION_ATTR_CLONE); odp_put_tnl_push_action(ctx->odp_actions, &tnl_push_data); -push_action_size = ctx->odp_actions->size; -apply_nested_clone_actions(ctx, xport, out_dev); -if (ctx->odp_actions->size > push_action_size) { -/* Update the CLONE action only when combined */ -nl_msg_end_nested(ctx->odp_actions, clone_ofs); -} return 0; } @@ -3261,136 +3247,6 @@ xlate_flow_is_protected(const struct xlate_ctx *ctx, const struct flow *flow, co xport_in->xbundle->protected && xport_out->xbundle->protected); } -/* Populate and apply nested actions on 'out_dev'. - * The nested actions are applied on cloned packets in dp while outputting to - * either patch or tunnel ports. - * On output to a patch port, the output action will be replaced with set of - * nested actions on the peer patch port. - * Similarly on output to a tunnel port, the post nested actions on - * tunnel are chained up with the tunnel-push action. - */ -static void -apply_nested_clone_actions(struct xlate_ctx *ctx, const struct xport *in_dev, - struct xport *out_dev) -{ -struct flow *flow = &ctx->xin->flow; -struct flow old_flow = ctx->xin->flow; -struct flow_tnl old_flow_tnl_wc = ctx->wc->masks.tunnel; -bool old_conntrack = ctx->conntracked; -bool old_was_mpls = ctx->was_mpls; -ovs_version_t old_version = ctx->xin->tables_version; -struct ofpbuf old_stack = ctx->stack; -union mf_subvalue new_stack[1024 / sizeof(union mf_subvalue)]; -struct ofpbuf old_action_set = ctx->action_set; -struct ovs_list *old_trace = ctx->xin->trace; -uint64_t actset_stub[1024 / 8]; - -ofpbuf_use_stub(&ctx->stack, new_stack, sizeof new_stack); -ofpbuf_use_stub(&ctx->action_set, actset_stub, sizeof actset_stub); -flow->in_port.ofp_port = out_dev->ofp_port; -flow->metadata = htonll(0); -memset(&flow->tunnel, 0, sizeof flow->tunnel); -
[ovs-dev] [PATCH 0/2] Revert recirculation on datapath.
Currently, the native (userspace) tunneling functionality is not appropriately applying tunnel headers during translation. In a typical 2-bridge configuration with an integration bridge which connects to a tunnel port, plus an underlay bridge which provides connectivity to the physical network, translation occurs the same for both bridges. However, you would expect that once traffic egresses a tunnel on the integration bridge, it would be encapsulated and the lookup in the underlay bridge should occur based on an encapsulated packet. For example, when sending UDP packets through an integration bridge GRE tunnel, the packet would then match on rules in the underlay bridge that match on UDP packets, rather than on GRE packets. Patch #1 is a simple revert of the patch which broke this functionality. Patch #2 adds a new test to the unit testsuite to prevent breakage in future. Joe Stringer (1): Revert "tunneling: Avoid recirculation on datapath." William Tu (1): tunnel-tests: Add test to match tunnel traffic. lib/dpif-netdev.c | 18 ++- ofproto/ofproto-dpif-xlate.c | 280 -- tests/ofproto-dpif.at | 11 +- tests/ovn.at | 6 +- tests/tunnel-push-pop-ipv6.at | 10 +- tests/tunnel-push-pop.at | 81 +++- 6 files changed, 242 insertions(+), 164 deletions(-) -- 2.12.0 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH 4/6] ovsdb: add support for role-based access controls
> From: "Ben Pfaff" > To: "Lance Richardson" > Cc: d...@openvswitch.org > As a high-level comment, it looks to me like documentation is missing > for the ways that this affects the schema and the wire protocol. We try > to document those kinds of changes, relative to the RFC 7074 > specification, in ovsdb/ovsdb-server.1.in. > Would something like this be sufficient? diff --git a/ovsdb/ovsdb-server.1.in b/ovsdb/ovsdb-server.1.in index 3c798dd..2e80fb9 100644 --- a/ovsdb/ovsdb-server.1.in +++ b/ovsdb/ovsdb-server.1.in @@ -269,6 +269,9 @@ narrow down the particular syntax that could not be parsed. The request triggered a bug in \fBovsdb\-server\fR. .IP "\fBovsdb error\fR" A map or set contains a duplicate key. +.IP "\fBpermission error\fR" +The request was denied by the role-based access control extension, +introduced in version 2.8. .RE . .IP "3.2. Schema Format" @@ -281,6 +284,36 @@ This raises the issue of the behavior of the weak reference when the rows that it references are deleted. Since version 2.6, \fBovsdb\-server\fR forces columns that contain weak references to be mutable. +.IP +Since version 2.8, the table name \fBRBAC_Role\fR is used internally +by the role-based access control extension to \fBovsdb\-server\fR and +should not be used for purposes other than defining mappings of role +names to table access permissions. This table has one row per role +name and the following columns: +.RS +.IP "\fBname\fR" +The role name. +.IP "\fBpermissions\fR" +A map of table name to a reference to a row in a separate permission +table. +.RE +.IP +The separate RBAC permission table has one row per access control +configuration and the following columns: +.RS +.IP "\fBname\fR" +The name of the table to which the row applies. +.IP "\fBauthorization\fR" +The set of column names and column:key pairs to be compared with +the client ID in order to determine the authorization status of +the requested operation. +.IP "\fBinsert_delete\fR" +A boolean value, true if insertions and authorized deletions are allowed, +false if no insertions or deletions are allowed. +.IP "\fBupdate\fR" +The set of columns and column:key pairs for which authorized update and +mutate operations should be permitted. +.RE . .IP "4. Wire Protocol" The original OVSDB specifications included the following reason, @@ -299,6 +332,18 @@ any corresponding advantage. The JSON-RPC specification for HTTP transport is incomplete. .RE . +.IP "4.1.3. Transact" +Since version 2.8, role-based access controls can be applied to operations +within a transaction that would modify the contents of the database +(these operations include row insert, row delet, column update, and +column mutate). Role-based access controls are applied when the database schema +contains a table with the name "\fBRBAC_Role\fR" and the connection +on which the transaction request was received has an associated role +name (from the "\fBrole\fR" column in the remote connection table). When +role-based access controls are enabled, transactions that are otherwise +well-formed may be rejected depending on the client's role, ID, and the +contents of the \fBRBAC_Role\fR table and associated permissions table. +. .IP "4.1.5. Monitor" For backward compatibility, \fBovsdb\-server\fR currently permits a single to be used instead of an array; it is treated ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH 0/3] Support OpenFlow 1.5 packet-out
Hi Jan, Thanks for letting me know, and sorry that I did not aware of this patch [1]. I take a look at patch [1], and it seems like the common part of [1] and [2] are 1) Defining OF 1.5 packet out format 2) Decoding and encoding OF 1.5 packet out message On the other hand, [1] focuses on supporting the new packet_type field, and [2] is more on all the other pipeline fields. We can see that from the testcases in [1] and [2]. The testcases in the two patches are actually quite independent, and they test different part of the OF 1.5 protocol. I am thinking about to pull out the common part of both [1] and [2] in a separate patch in my v2, and we can both contribute our work from there. How do you think? [1] https://mail.openvswitch.org/pipermail/ovs-dev/2017-April/331032.html [2] https://mail.openvswitch.org/pipermail/ovs-dev/2017-May/332066.html Thanks, -Yi-Hung On Mon, May 8, 2017 at 4:40 AM, Jan Scheurich wrote: > Hi Yi-Hung, > > We have already started to add support for the OpenFlow 1.5 Packet Out > message in our patch series for "packet type-aware pipeline: > https://mail.openvswitch.org/pipermail/ovs-dev/2017-April/331023.html > https://mail.openvswitch.org/pipermail/ovs-dev/2017-April/331032.html > > Our focus was to add support for the new packet_type pipeline field, and I'm > not sure we have covered all necessary efforts to support the rest of the > pipeline fields. Please have a look at our patch and make sure that our > efforts are aligned. > > Thanks, Jan > >> -Original Message- >> From: ovs-dev-boun...@openvswitch.org >> [mailto:ovs-dev-boun...@openvswitch.org] On Behalf Of Ben Pfaff >> Sent: Saturday, 06 May, 2017 06:40 >> To: Yi-Hung Wei >> Cc: d...@openvswitch.org >> Subject: Re: [ovs-dev] [PATCH 0/3] Support OpenFlow 1.5 packet-out >> >> On Thu, May 04, 2017 at 04:12:17PM -0700, Yi-Hung Wei wrote: >> > This patch series add support to OpenFlow 1.5 packet-out message that >> > enables setting pipeline fields. While there are six pipeline match >> > fields as listed in OpenFlow spec 1.5.1, this patch series implements >> > three of them (OXM_OF_IN_PORT, OXM_OF_METADATA, and OXM_OF_TUNNEL_ID), >> > since the rest of them (OXM_OF_IN_PHY_PORT, OXM_OF_ACTSET_OUTPUT, and >> > OXM_OF_PACKET_TYPE) does not fit into the ovs. >> > >> > Yi-Hung Wei (3): >> > ofp-util: Add flow metadata to ofputil_packet_out >> > ofproto: Add OpenFlow 1.5 packet-out support >> > ofp-parse: Parse pipeline fields in OF1.5 packet-out >> >> Thanks for working on this feature. I'm looking forward to being able >> to say that OVS supports all the features that OpenFlow 1.5 requires. I >> think that with this and extensible flow entry statistics (which we've >> already seen a draft of from, I believe, a TCS developer) we'll be >> there. >> >> This is a high quality series. Thank you! I have only a few comments. >> >> First, I see that this series represents metadata as a "struct match". >> Certainly, this works, but I wonder about the alternatives. The most >> obvious one is "struct flow", which has all the same features as struct >> match, without the masks. To me, it looks like the masks in struct >> match aren't even used, at least not consistently; for example, this >> code in parse_ofp_packet_out_str__() initializes in_port without its >> mask: >> >> *po = (struct ofputil_packet_out) { >> .buffer_id = UINT32_MAX, >> .flow_metadata.flow.in_port.ofp_port = OFPP_CONTROLLER, >> }; >> >> The other possible data structure for metadata is struct pkt_metadata, >> which is designed specifically for metadata. However it's currently >> used mostly in packet handling rather than in OpenFlow, so it would >> probably be a second choice. >> >> Second, this series seems to take the literal definition of "pipeline >> fields" from OpenFlow, only allowing those fields actually mentioned in >> OpenFlow to be used in "packet-out"s. But I think that we should >> include OVS extension pipeline fields, too, such as the various fields >> with tunnel information. I also think that the implementation misses >> some fields that OpenFlow itself defines as pipeline fields, such as the >> OpenFlow packet register pipeline fields. >> >> Can you think through these issues and write up a v2? >> >> Thanks, >> >> Ben. >> ___ >> dev mailing list >> d...@openvswitch.org >> https://mail.openvswitch.org/mailman/listinfo/ovs-dev ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH 0/3] Support OpenFlow 1.5 packet-out
Hi Ben, Thanks for your valuable review. Yes, it makes sense to use 'struct flow' instead of 'struct match' to represent metadata. As for the "pipeline fields", I briefly look at ovs-fields (7), and I think the patch series should be update to include at least the following fields. * Tunnel fields: (tun_src/dst, tun_ipv6_src/dst, tun_gbp_id, tun_gbp_flags, tun_flags, tun_metadata0 - tun_metadata63) * Register fields: (reg0 - reg15, xreg0 - xreg7, xxreg0 - xxreg3) I will address these issues and send out a v2 soon. Thanks, -Yi-Hung On Fri, May 5, 2017 at 9:39 PM, Ben Pfaff wrote: > On Thu, May 04, 2017 at 04:12:17PM -0700, Yi-Hung Wei wrote: >> This patch series add support to OpenFlow 1.5 packet-out message that >> enables setting pipeline fields. While there are six pipeline match >> fields as listed in OpenFlow spec 1.5.1, this patch series implements >> three of them (OXM_OF_IN_PORT, OXM_OF_METADATA, and OXM_OF_TUNNEL_ID), >> since the rest of them (OXM_OF_IN_PHY_PORT, OXM_OF_ACTSET_OUTPUT, and >> OXM_OF_PACKET_TYPE) does not fit into the ovs. >> >> Yi-Hung Wei (3): >> ofp-util: Add flow metadata to ofputil_packet_out >> ofproto: Add OpenFlow 1.5 packet-out support >> ofp-parse: Parse pipeline fields in OF1.5 packet-out > > Thanks for working on this feature. I'm looking forward to being able > to say that OVS supports all the features that OpenFlow 1.5 requires. I > think that with this and extensible flow entry statistics (which we've > already seen a draft of from, I believe, a TCS developer) we'll be > there. > > This is a high quality series. Thank you! I have only a few comments. > > First, I see that this series represents metadata as a "struct match". > Certainly, this works, but I wonder about the alternatives. The most > obvious one is "struct flow", which has all the same features as struct > match, without the masks. To me, it looks like the masks in struct > match aren't even used, at least not consistently; for example, this > code in parse_ofp_packet_out_str__() initializes in_port without its > mask: > > *po = (struct ofputil_packet_out) { > .buffer_id = UINT32_MAX, > .flow_metadata.flow.in_port.ofp_port = OFPP_CONTROLLER, > }; > > The other possible data structure for metadata is struct pkt_metadata, > which is designed specifically for metadata. However it's currently > used mostly in packet handling rather than in OpenFlow, so it would > probably be a second choice. > > Second, this series seems to take the literal definition of "pipeline > fields" from OpenFlow, only allowing those fields actually mentioned in > OpenFlow to be used in "packet-out"s. But I think that we should > include OVS extension pipeline fields, too, such as the various fields > with tunnel information. I also think that the implementation misses > some fields that OpenFlow itself defines as pipeline fields, such as the > OpenFlow packet register pipeline fields. > > Can you think through these issues and write up a v2? > > Thanks, > > Ben. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH branch-2.7] docs: Add dpdk stable release to DPDK install docs.
On 5/8/17, 9:21 AM, "Stokes, Ian" wrote: > Ian > > This patch does not apply to 2.7; could you respin ? > > Thanks Darrell Hi Darrell, Thanks for looking at this, I think this patch was superseded by a later patch, checking branch 2.7 it already has a commit that makes these changes 3648ff4f614d6e50f896cb28e234d7d0ab33d167 Branch 2.7 has wget http://fast.dpdk.org/rel/dpdk-16.11.1.tar.xz via commit above. This patch proposed having stable specified wget http://dpdk.org/browse/dpdk-stable/snapshot/dpdk-stable-16.11.tar.xz dball@ubuntu:~/dpdk$ ls -l total 17688 -rw-rw-r-- 1 dball dball 9052212 Mar 2 03:19 dpdk-16.11.1.tar.xz -rw-rw-r-- 1 dball dball 9053968 May 8 09:52 dpdk-stable-16.11.tar.xz So, dpdk-16.11.1.tar.xz is the correct one – ok… This also matches with master Thanks Ian ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH branch-2.7] docs: Add dpdk stable release to DPDK install docs.
> Ian > > This patch does not apply to 2.7; could you respin ? > > Thanks Darrell Hi Darrell, Thanks for looking at this, I think this patch was superseded by a later patch, checking branch 2.7 it already has a commit that makes these changes 3648ff4f614d6e50f896cb28e234d7d0ab33d167 Thanks Ian ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] RFC: ovs-dump-flows utility
Hi Ben, Thanks for the look! Ben Pfaff writes: > On Fri, Apr 28, 2017 at 04:44:32PM -0400, Aaron Conole wrote: >> Greetings dev, >> >> I have whipped up a quick little utility (find below), that I've done a >> bit of debugging with and it seems to have made working with dump-flows >> from ovs-ofctl a little easier to use. >> >> If you think it's worthwhile to add to the repository, I'll submit it >> formally. We were using it while debugging some strange packet >> forwarding in openshift. >> >> -Aaron > > Thanks for working to make the ovs-ofctl formatting better! > > I prefer to interpret this script as a kind of "feature request" for > "ovs-ofctl dump-flows". This command already has some special support, > compared to other ovs-ofctl commands, and it might make sense to > continue adding to it. In which way? It calls the same ofp-print code, iirc. > ovs-ofctl dump-flows already has one of the features that this script > adds, that is, sorting the flows and removing "OFPST_FLOW" lines. You > turn this on by using the "--sort" (or "--rsort") option. Ahh, cool - I missed that. > The other features that this script provides all seem like ones that > would be useful to add to ovs-ofctl itself. I'd tend to prefer to > continue enhancing it rather than adding wrapper scripts; I think that > this is likely to yield a more coherent design in the end, and possibly > higher quality. Is that something you're willing to consider? I had started to work on this, back in December, but there were hundreds of lines of existing formatting code that would have to change (this is related to the discussion here: https://mail.openvswitch.org/pipermail/ovs-dev/2016-December/326560.html), and I thought it might be a better use of time to simply wrap the output - especially since I didn't know if any future changes in that area were going to happen. The last thing I want to do is break the existing output (which I do use quite a bit for debugging, so retraining myself would be painful) if someone has scripts which rely on it. Additionally, quite a few print commands would have changed to give the data to the table structure, rather than a long string. It looked to be a rather large change for something that could be resolved with a wrapper. Maybe I misinterpreted your response (from here https://mail.openvswitch.org/pipermail/ovs-dev/2016-December/326201.html). The other thing that adds complication is replacing the port numbers with names from the database. That would require a second transaction (unless there's a way to batch that during the initial dump-flows request, but I couldn't see an obvious way), and I didn't know if it would be okay to do (and how to treat failures... after all, it's convenient, but it isn't requisite to have the numbers replaced with names). There are a few minor changes I have to my copy of the script (I've added back the packet counts, and I have the port output in a way that we can not-quite paste the flow back in to an add-flow), but I ended up also using the direct output of dump-flows. As for how to implement it, I could have put some kind of post processor that would split the strings up (the way I have done with the script), but that felt rather hacky (since it's basically the formatting script, but in c-code form). Anyway, I submitted this as a start. If you think it's better to do the work in the ofp-print library then I can revisit it. Maybe the reduced set of things that were really helpful, and the rest we can just say "don't fear sed". > Thanks, > > Ben. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [RFC PATCH 0/6] Change dpdk rxq scheduling to incorporate rxq processing cycles.
On Mon, 2017-05-08 at 11:49 +0100, Kevin Traynor wrote: > On 05/06/2017 02:01 AM, Greg Rose wrote: > > On Fri, May 5, 2017 at 9:34 AM, Kevin Traynor wrote: > >> Rxqs are scheduled to be handled across available pmds in round robin > >> order with no weight or priority. > >> > >> It can happen that some very busy queues are handled by one pmd which > >> does not have enough cycles to prevent packets being dropped on them. > >> While at the same time another pmd which handles queues with no traffic > >> on them, is essentially idling. > >> > >> Rxq scheduling happens as a result of a number of events and when it does, > >> the same unweighted round robin approach is applied each time. > > > > I've seen and heard of HW that can be configured with a weighted > > round robin approach in which some queues or pools of queues will be > > given a higher priority in which they have more 'credits'. > > Specifically I know of some Intel HW that is capable of that. What I > > do not know is if any of that ever made it into the Linux kernel or if > > it is something that might cause your statement to be inaccurate in > > all cases. > > > > My worry is that this assertion may not be correct. > > > > I'm copying John Fastabend, one of my old co-workers at Intel to catch > > is take. He could probably point us in the right direction. > > > > - Greg > > Hi Greg, > > Thanks for your reply. I'm thinking I was too vague when I said "Rxq > scheduling", and we may be talking about slightly different things. Let > me elaborate on what I mean and why I used that term. > > When using the userspace datapath and dpdk type ports (e.g. type dpdk, > dpdkvhostuser) the rxqs from those ports are polled for packets by pmd > thread(s) in OVS. If there is just one pmd thread, it will poll all the > dpdk ports rxqs. However, if there are more than one pmd threads, the > OVS userspace code will distribute the polling of rxqs across the pmds. > The function that decides the assignment of rxqs to pmds is called > rxq_scheduling() and that is the code I was referring to (see patch > 6/6), with the patchset changing the method of assignment and hoping to > improve the results. > > Either as part of this or separate, it might be worth changing the name > of the function to something like rxq_pmd_assign() to be more specific > about what it does. > > Let me know if this clears things up, or I misinterpreted. > > thanks, > Kevin. That does indeed clear things up. Thank you! - Greg > > > > >> > >> This patchset proposes to augment the round robin nature of rxq scheduling > >> by counting the processing cycles used by the rxqs during their operation > >> and incorporate it into the rxq scheduling. > >> > >> Before distributing in a round robin manner, the rxqs will be sorted in > >> order of the processing cycles they have been consuming. Assuming multiple > >> pmds, this ensures that the measured rxqs using most processing cycles will > >> be distributed to different cores. > >> > >> To try out: > >> This patchset requires the updated pmd counting patch applied as a > >> prerequisite. https://patchwork.ozlabs.org/patch/729970/ > >> > >> Alternatively the series with dependencies can be cloned from here: > >> https://github.com/kevintraynor/ovs-rxq.git > >> > >> Simple way to test is add some dpdk ports, add multiple pmds, vary traffic > >> rates and rxqs on ports and trigger reschedules e.g. by changing rxqs or > >> the pmd-cpu-mask. > >> > >> Check rxq distribution with ovs-appctl dpif-netdev/pmd-rxq-show and see > >> if it matches expected. > >> > >> todo: > >> -possibly add a dedicated reschedule trigger command > >> -use consistent type names > >> -update docs > >> -more testing, especially for dual numa > >> > >> thanks, > >> Kevin. > >> > >> Kevin Traynor (6): > >> dpif-netdev: Add rxq processing cycle counters. > >> dpif-netdev: Update rxq processing cycles from > >> cycles_count_intermediate. > >> dpif-netdev: Change polled_queue to use dp_netdev_rxq. > >> dpif-netdev: Make dpcls optimization interval more generic. > >> dpif-netdev: Count the rxq processing cycles for an rxq. > >> dpif-netdev: Change rxq_scheduling to use rxq processing cycles. > >> > >> lib/dpif-netdev.c | 163 > >> -- > >> 1 file changed, 133 insertions(+), 30 deletions(-) > >> > >> -- > >> 1.8.3.1 > >> > >> ___ > >> dev mailing list > >> d...@openvswitch.org > >> https://mail.openvswitch.org/mailman/listinfo/ovs-dev > ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v5] tunneling: Avoid recirculation on datapath by computing the recirculate actions at translate time.
Hi Zoltan, Yes, we disallow truncate followed by patch port as output on purpose. The reason is that truncate action sets the packet's new length in its metadata instead of immediately change the size. Actual change of the packet size happens when we see the output action. Carrying this un-applied truncate metadata to another bridge complicates many cases. Ex: what if we truncate, send to patch port, and the other bridge does broadcast? Do each of the port gets truncated packets? Previous discussion: https://patchwork.ozlabs.org/patch/631972/ I applied your patch and it passes my testcase (posted in previous email, which makes sure the outer header is matched). Now I see the issue you mentioned that flow stats are not correct on the underlay bridge. I'm still trying to figure out the reason. Thanks. William On Mon, May 8, 2017 at 4:15 AM, Zoltán Balogh wrote: > Hi, > > I looked into the code of truncate, I saw that patch ports are not handled. > On the other hand I saw that "Avoid recirculation" commit should not be > affected by this fact. I verified that packets are truncated correctly with > my last patch I sent you before, but flow stats are not correct on the > underlay bridge. > > Could you confirm this, please? > > Best regards, > Zoltan > >> -Original Message- >> From: Zoltán Balogh >> Sent: Sunday, May 07, 2017 9:16 PM >> To: 'Joe Stringer' ; William Tu >> Cc: Sugesh Chandran ; ovs dev >> ; Ben Pfaff ; Andy Zhou >> ; Jan Scheurich >> Subject: RE: [ovs-dev] [PATCH v5] tunneling: Avoid recirculation on datapath >> by computing the recirculate actions at >> translate time. >> >> > >> I have a patch that fixes tunneling over patch ports. The 14th >> > >> system-userspace >> > >> test still does fail, but now the packet size in the dump flow remains >> > >> 242. >> > >> >> > >> ./system-traffic.at:554: ovs-ofctl dump-flows br0 | grep "in_port=2" | >> > >> sed -n 's/.*\(n\_bytes=[0-9]*\).*/\1/p' >> > >> ./system-traffic.at:558: ovs-ofctl dump-flows br-underlay | grep >> > >> "in_port=LOCAL" | sed -n 's/.*\(n\_bytes=[0- >> > 9]*\).*/\1/p' >> > >> --- - 2017-05-05 19:52:28.987111424 +0200 >> > >> +++ >> > >> /home/ezolbal/work/general_L3_tunneling/ovs/tests/system-userspace-testsuite.dir/at-groups/14/stdout >> > 2017-05-05 19:52:28.983508027 +0200 >> > >> @@ -1,2 +1,2 @@ >> > >> -n_bytes=138 >> > >> +n_bytes=242 >> > >> >> > >> I'm little bit lost with flow statistics. Could you have a look at the >> > >> patch >> > >> below, and help me to find out if it's only a statistics bug, please? >> > > >> > > Ah, so the more interesting part of that test is that it also >> > > truncates the packet during output to the tunnel. So the tunnel should >> > > only see a 100B packet (encapped within GRE, so 138B). Commit >> > > aaca4fe0ce9e ("ofp-actions: Add truncate action.") was where this >> > > functionality was originally introduced, perhaps you can look at that >> > > to determine how the truncation should be applied in this case? >> > >> > Looking again, I think this also states that regardless of the >> > truncate, the second bridge should attribute stats for the encapped >> > packet, so even if there was no truncation it should be more than >> > 242B. When it comes to applying geneve TLVs as well, I'd expect this >> > to be calculated correctly. >> >> Hi Joe, >> Hi William, >> >> I was thinking about the "Avoid recirculation" code created by Sugesh and >> myself. It is based on the code patch >> ports do use. >> So I reverted our "Avoid recirculation" commit on master branch and tried to >> truncate packets on a patch port. >> I used the setup below. >> >> >> 192.168.10.10 192.168.10.20 >> ns0 ns1 >>| | >>| br0-ns0 | br1-ns1 >> +--o---+ +--o---+ >> | | | | >> |br0 | |br1 | >> | | | | >> +--oo--+ +--oo--+ >> veth0 || patch0 patch1 || veth1 >>|++| >>| | >>+--+ >> >> >> I attached two namespaces (ns0, ns1) to two netdev bridges (br0, br1), then >> I connected the bridges over veth and >> patch ports. >> >> netdev@ovs-netdev: hit:915736 missed:28 >> br0: >> br0 65534/1: (tap) >> br0-ns0 1/3: (system) >> patch0 2/none: (patch: peer=patch1) >> veth0 3/5: (system) >> br1: >> br1 65534/2: (tap) >> br1-ns1 1/4: (system) >> patch1 2/none: (patch: peer=patch0) >> veth1 3/6: (system) >> >> When I added flow rules to forward and truncate packets over veth ports, the >> ping from ns0 to ns1 did work. >> >> $ ovs-ofctl del-flows br0 >> $ ovs-ofctl del-flows br1 >> $ ovs-ofctl add-flow br0 in_port=1,action=3 >> $ o
Re: [ovs-dev] checkpatch name checking
On Mon, May 08, 2017 at 10:33:19AM -0400, Aaron Conole wrote: > Ben Pfaff writes: > > > Hi Aaron, checkpatch currently tries to ignores files in the "datapath" > > directories but it's not entirely successful. I think that's because, > > in the "parse == 1" case, it doesn't strip a leading "a/" or "b/" from > > filenames: > > current_file = match.group(2) > > whereas in the "parse == 2", it does: > > current_file = newfile.group(2)[2:] > > and the check for "datapath" relies on the / being there: > > # Skip files which have /datapath in them, since they are > > # linux or windows coding standards > > if '/datapath' in current_file: > > continue > > I'm not sure where the real bug is. Should the prefix be consistently > > stripped or not stripped? Once that's decided, it's easy to fix the > > problem. > > I think it should be consistently stripped. The regex matches don't > care, but when the filename is printed to the screen, it will include b/ > which makes simply opening the file in an editor not work correctly. > > I think given that, the parse==1 case is incorrect, and the datapath > check is also incorrect. > > Make sense? Got it. Thanks! I sent a patch: https://patchwork.ozlabs.org/patch/759661/ ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH] checkpatch: Fix inconsistencies skipping datapath files.
The code in checkpatch inconsistently stripped "a/" or "b/" from the beginning of a file name, and the check for "datapath" only worked when the prefix was not stripped. This fixes the problem. CC: Aaron Conole Signed-off-by: Ben Pfaff --- utilities/checkpatch.py | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/utilities/checkpatch.py b/utilities/checkpatch.py index 387549afe3f6..d486de81c8ff 100755 --- a/utilities/checkpatch.py +++ b/utilities/checkpatch.py @@ -277,7 +277,7 @@ def ovs_checkpatch_parse(text): match = hunks.match(line) if match: parse = parse + 1 -current_file = match.group(2) +current_file = match.group(2)[2:] print_file_name = current_file continue elif parse == 0: @@ -318,7 +318,7 @@ def ovs_checkpatch_parse(text): # Skip files which have /datapath in them, since they are # linux or windows coding standards -if '/datapath' in current_file: +if current_file.startswith('datapath'): continue run_checks(current_file, cmp_line, lineno) if __errors or __warnings: -- 2.10.2 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] RFC: ovs-dump-flows utility
On Fri, Apr 28, 2017 at 04:44:32PM -0400, Aaron Conole wrote: > Greetings dev, > > I have whipped up a quick little utility (find below), that I've done a > bit of debugging with and it seems to have made working with dump-flows > from ovs-ofctl a little easier to use. > > If you think it's worthwhile to add to the repository, I'll submit it > formally. We were using it while debugging some strange packet > forwarding in openshift. > > -Aaron Thanks for working to make the ovs-ofctl formatting better! I prefer to interpret this script as a kind of "feature request" for "ovs-ofctl dump-flows". This command already has some special support, compared to other ovs-ofctl commands, and it might make sense to continue adding to it. ovs-ofctl dump-flows already has one of the features that this script adds, that is, sorting the flows and removing "OFPST_FLOW" lines. You turn this on by using the "--sort" (or "--rsort") option. The other features that this script provides all seem like ones that would be useful to add to ovs-ofctl itself. I'd tend to prefer to continue enhancing it rather than adding wrapper scripts; I think that this is likely to yield a more coherent design in the end, and possibly higher quality. Is that something you're willing to consider? Thanks, Ben. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH] docs: Improve formatting for daemon options in a few manpages.
daemon.man is meant to have a heading above it, but in a few manpages its text was running directly into the previous documentation because this had been overlooked. By adding .PP to daemon.man, we make this problem less severe if the heading is similarly omitted in future manpages, since at least it will then have its own paragraph instead of running into the previous one. Signed-off-by: Ben Pfaff --- lib/daemon.man| 1 + utilities/ovs-ofctl.8.in | 1 + utilities/ovs-testcontroller.8.in | 1 + 3 files changed, 3 insertions(+) diff --git a/lib/daemon.man b/lib/daemon.man index 2e6d99aca528..820a09903103 100644 --- a/lib/daemon.man +++ b/lib/daemon.man @@ -1,3 +1,4 @@ +.PP The following options are valid on POSIX based platforms. .TP \fB\-\-pidfile\fR[\fB=\fIpidfile\fR] diff --git a/utilities/ovs-ofctl.8.in b/utilities/ovs-ofctl.8.in index a7e805587186..ed75b32a3808 100644 --- a/utilities/ovs-ofctl.8.in +++ b/utilities/ovs-ofctl.8.in @@ -2361,6 +2361,7 @@ sort a flow that specifies \fBnw_src=192.168.0.0/24\fR the same as .IP These options currently affect only \fBdump\-flows\fR output. . +.SS "Daemon Options" .ds DD \ \fBovs\-ofctl\fR detaches only when executing the \fBmonitor\fR or \ \fBsnoop\fR commands. diff --git a/utilities/ovs-testcontroller.8.in b/utilities/ovs-testcontroller.8.in index f88bcd0ed59e..df3c35fbf964 100644 --- a/utilities/ovs-testcontroller.8.in +++ b/utilities/ovs-testcontroller.8.in @@ -139,6 +139,7 @@ Use this option more than once to add flows from multiple files. .so lib/ssl.man .so lib/ssl-peer-ca-cert.man .ds DD +.SS "Daemon Options" .so lib/daemon.man .so lib/vlog.man .so lib/unixctl.man -- 2.10.2 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v5 1/6] userspace: Support for push_eth and pop_eth actions
On Sat, May 06, 2017 at 03:49:43PM +, Zoltán Balogh wrote: > From: Jan Scheurich > > Add support for actions push_eth and pop_eth to the netdev datapath and > the supporting libraries. This patch relies on the support for these actions > in the kernel datapath to be present. > > Signed-off-by: Lorand Jakab > Signed-off-by: Simon Horman > Signed-off-by: Jiri Benc > Signed-off-by: Yi Yang > Signed-off-by: Jean Tourrilhes > Signed-off-by: Jan Scheurich > Co-authored-by: Zoltan Balogh Thanks a lot! I applied this to master. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH 2/2] rhel: delete transient ports on boot when starting ovsdb-server
On 05/05/2017 08:41 PM, Aaron Conole wrote: [...] >> @@ -534,7 +538,7 @@ fi >> %doc COPYING NOTICE README.rst NEWS rhel/README.RHEL.rst >> /var/lib/openvswitch >> /var/log/openvswitch >> -%ghost %attr(755,root,root) %{_rundir}/openvswitch >> +%dir %{_rundir}/openvswitch > > Doesn't this still need to be flagged as %ghost? If you use %ghost the directory is not created by rpm and it's only created after a reboot, since tmpfiles.d scripts are only executed by systemd on boot. Using %dir is the suggested way by Fedora Wiki: https://fedoraproject.org/wiki/Packaging:Tmpfiles.d ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] checkpatch name checking
Ben Pfaff writes: > Hi Aaron, checkpatch currently tries to ignores files in the "datapath" > directories but it's not entirely successful. I think that's because, > in the "parse == 1" case, it doesn't strip a leading "a/" or "b/" from > filenames: > current_file = match.group(2) > whereas in the "parse == 2", it does: > current_file = newfile.group(2)[2:] > and the check for "datapath" relies on the / being there: > # Skip files which have /datapath in them, since they are > # linux or windows coding standards > if '/datapath' in current_file: > continue > I'm not sure where the real bug is. Should the prefix be consistently > stripped or not stripped? Once that's decided, it's easy to fix the > problem. I think it should be consistently stripped. The regex matches don't care, but when the filename is printed to the screen, it will include b/ which makes simply opening the file in an editor not work correctly. I think given that, the parse==1 case is incorrect, and the datapath check is also incorrect. Make sense? > Thanks, > > Ben. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v3] Windows: Secure the namedpipe implementation
Thanks for figuring this out! I applied this to all of those branches. On Sun, May 07, 2017 at 03:38:18PM +, Alin Serdean wrote: > Thanks a lot for the patch! Me and Sai talked offline and we will add the > creator (owner) user to be added in another incremental. > > Could you please apply it on master, branch-2.7, branch-2.6? > > Tested-by: Alin Gabriel Serdean > Acked-by: Alin Gabriel Serdean > > > -Original Message- > > From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev- > > boun...@openvswitch.org] On Behalf Of Sairam Venugopal > > Sent: Saturday, May 6, 2017 5:41 AM > > To: d...@openvswitch.org > > Subject: [ovs-dev] [PATCH v3] Windows: Secure the namedpipe > > implementation > > > > Update the security policies around the creation of the namedpipe. The > > current security updates restrict how the namedpipe can be accessed. > > > > - Disable Network access > > - Windows Services can access the namedpipe > > - Administrators can access the namedpipe > > > > Signed-off-by: Sairam Venugopal > > --- > > lib/stream-windows.c | 98 > > > > 1 file changed, 92 insertions(+), 6 deletions(-) > > > > diff --git a/lib/stream-windows.c b/lib/stream-windows.c index > > 44b88bf..c5a26c8 100644 > > --- a/lib/stream-windows.c > > +++ b/lib/stream-windows.c > > @@ -40,6 +40,9 @@ static void maybe_unlink_and_free(char *path); > > /* Default prefix of a local named pipe. */ #define LOCAL_PREFIX > > ".\\pipe\\" > > > > +/* Size of the allowed PSIDs for securing Named Pipe. */ #define > > +ALLOWED_PSIDS_SIZE 2 > > + > > /* This function has the purpose to remove all the slashes received in s. > > */ > > static char * remove_slashes(char *s) @@ -402,16 +405,99 @@ static > > HANDLE create_pnpipe(char *name) { > > SECURITY_ATTRIBUTES sa; > > -sa.nLength = sizeof(sa); > > -sa.lpSecurityDescriptor = NULL; > > +SID_IDENTIFIER_AUTHORITY sia = SECURITY_NT_AUTHORITY; > > +DWORD aclSize; > > +PSID allowedPsid[ALLOWED_PSIDS_SIZE]; > > +PSID remoteAccessSid; > > +PACL acl = NULL; > > +PSECURITY_DESCRIPTOR psd = NULL; > > +HANDLE npipe; > > + > > +/* Disable access over network. */ > > +if (!AllocateAndInitializeSid(&sia, 1, SECURITY_NETWORK_RID, > > + 0, 0, 0, 0, 0, 0, 0, &remoteAccessSid)) { > > +VLOG_ERR_RL(&rl, "Error creating Remote Access SID."); > > +goto handle_error; > > +} > > + > > +aclSize = sizeof(ACL) + sizeof(ACCESS_DENIED_ACE) + > > + GetLengthSid(remoteAccessSid) - sizeof(DWORD); > > + > > +/* Allow Windows Services to access the Named Pipe. */ > > +if (!AllocateAndInitializeSid(&sia, 1, SECURITY_LOCAL_SYSTEM_RID, > > + 0, 0, 0, 0, 0, 0, 0, &allowedPsid[0])) { > > +VLOG_ERR_RL(&rl, "Error creating Services SID."); > > +goto handle_error; > > +} > > + > > +/* Allow Administrators to access the Named Pipe. */ > > +if (!AllocateAndInitializeSid(&sia, 2, SECURITY_BUILTIN_DOMAIN_RID, > > + DOMAIN_ALIAS_RID_ADMINS, 0, 0, 0, 0, 0, > > 0, > > + &allowedPsid[1])) { > > +VLOG_ERR_RL(&rl, "Error creating Administrator SID."); > > +goto handle_error; > > +} > > + > > +for (int i = 0; i < ALLOWED_PSIDS_SIZE; i++) { > > +aclSize += sizeof(ACCESS_ALLOWED_ACE) + > > + GetLengthSid(allowedPsid[i]) - > > + sizeof(DWORD); > > +} > > + > > +acl = xmalloc(aclSize); > > +if (!InitializeAcl(acl, aclSize, ACL_REVISION)) { > > +VLOG_ERR_RL(&rl, "Error initializing ACL."); > > +goto handle_error; > > +} > > + > > +/* Add denied ACL. */ > > +if (!AddAccessDeniedAce(acl, ACL_REVISION, > > +GENERIC_ALL, remoteAccessSid)) { > > +VLOG_ERR_RL(&rl, "Error adding remote access ACE."); > > +goto handle_error; > > +} > > + > > +/* Add allowed ACLs. */ > > +for (int i = 0; i < ALLOWED_PSIDS_SIZE; i++) { > > +if (!AddAccessAllowedAce(acl, ACL_REVISION, > > + GENERIC_ALL, allowedPsid[i])) { > > +VLOG_ERR_RL(&rl, "Error adding ACE."); > > +goto handle_error; > > +} > > +} > > + > > +psd = xmalloc(SECURITY_DESCRIPTOR_MIN_LENGTH); > > +if (!InitializeSecurityDescriptor(psd, SECURITY_DESCRIPTOR_REVISION)) { > > +VLOG_ERR_RL(&rl, "Error initializing Security Descriptor."); > > +goto handle_error; > > +} > > + > > +/* Set DACL. */ > > +if (!SetSecurityDescriptorDacl(psd, TRUE, acl, FALSE)) { > > +VLOG_ERR_RL(&rl, "Error while setting DACL."); > > +goto handle_error; > > +} > > + > > +sa.nLength = sizeof sa; > > sa.bInheritHandle = TRUE; > > +sa.lpSecurityDescriptor = psd; > > + > > if (strlen(name) > 256) { > >
Re: [ovs-dev] [PATCH v8 5/5] datapath-windows: Fragment NBL based on MRU size
Thanks a lot Anand and Alin! I applied all of these to master. On Sat, May 06, 2017 at 01:45:11AM +, Alin Serdean wrote: > Acked-by: Alin Gabriel Serdean > > > -Original Message- > > From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev- > > boun...@openvswitch.org] On Behalf Of Anand Kumar > > Sent: Friday, May 5, 2017 1:13 AM > > To: d...@openvswitch.org > > Subject: [ovs-dev] [PATCH v8 5/5] datapath-windows: Fragment NBL based > > on MRU size > > > > This patch adds support for Fragmenting NBL based on the MRU value. > > MRU value is updated only for Ipv4 fragments, if it is non zero, then > > fragment > > the NBL and send out the new NBL to the vnic. > > > > Signed-off-by: Anand Kumar > > --- > > v7->v8: No change > > v6->v7: Fragment the NBL for tunnel packets > > v5->v6: No Change > > v4->v5: > > - Use MRU information in the _OVS_BUFFER_CONTEXT to fragment > > NBL. > > v3->v4: No Change > > v2->v3: > > - Updated log message > > v1->v2: No change > > --- > > datapath-windows/ovsext/Actions.c | 51 > > ++- > > 1 file changed, 50 insertions(+), 1 deletion(-) > > > > diff --git a/datapath-windows/ovsext/Actions.c b/datapath- > > windows/ovsext/Actions.c > > index b5c13c7..e2eae9a 100644 > > --- a/datapath-windows/ovsext/Actions.c > > +++ b/datapath-windows/ovsext/Actions.c > > @@ -34,6 +34,7 @@ > > #include "Vport.h" > > #include "Vxlan.h" > > #include "Geneve.h" > > +#include "IpFragment.h" > > > > #ifdef OVS_DBG_MOD > > #undef OVS_DBG_MOD > > @@ -144,6 +145,36 @@ OvsInitForwardingCtx(OvsForwardingContext > > *ovsFwdCtx, > > > > /* > > * > > -- > > + * OvsDoFragmentNbl -- > > + * Utility function to Fragment nbl based on mru. > > + * > > +--- > > +--- > > + */ > > +static __inline VOID > > +OvsDoFragmentNbl(OvsForwardingContext *ovsFwdCtx, UINT16 mru) { > > +PNET_BUFFER_LIST fragNbl = NULL; > > +fragNbl = OvsFragmentNBL(ovsFwdCtx->switchContext, > > + ovsFwdCtx->curNbl, > > + &(ovsFwdCtx->layers), > > + mru, 0, TRUE); > > + > > + if (fragNbl != NULL) { > > +OvsCompleteNBL(ovsFwdCtx->switchContext, ovsFwdCtx->curNbl, > > TRUE); > > +OvsInitForwardingCtx(ovsFwdCtx, > > +ovsFwdCtx->switchContext, > > + fragNbl, > > + ovsFwdCtx->srcVportNo, > > + ovsFwdCtx->sendFlags, > > + > > NET_BUFFER_LIST_SWITCH_FORWARDING_DETAIL(fragNbl), > > + ovsFwdCtx->completionList, > > + &ovsFwdCtx->layers, FALSE); > > +} else { > > +OVS_LOG_INFO("Fragment NBL failed for MRU = %u", mru); > > +} > > +} > > + > > +/* > > + * > > +--- > > +--- > > * OvsDetectTunnelRxPkt -- > > * Utility function for an RX packet to detect its tunnel type. > > * > > @@ -604,6 +635,7 @@ OvsTunnelPortTx(OvsForwardingContext > > *ovsFwdCtx) > > UINT32 srcVportNo; > > NDIS_SWITCH_NIC_INDEX srcNicIndex; > > NDIS_SWITCH_PORT_ID srcPortId; > > +POVS_BUFFER_CONTEXT ctx; > > > > /* > > * Setup the source port to be the internal port to as to facilitate > > the @@ - > > 617,6 +649,10 @@ OvsTunnelPortTx(OvsForwardingContext *ovsFwdCtx) > > return NDIS_STATUS_FAILURE; > > } > > > > +ctx = > > (POVS_BUFFER_CONTEXT)NET_BUFFER_LIST_CONTEXT_DATA_START(ovsF > > wdCtx->curNbl); > > +if (ctx->mru != 0) { > > +OvsDoFragmentNbl(ovsFwdCtx, ctx->mru); > > +} > > OVS_FWD_INFO switchFwdInfo = { 0 }; > > /* Apply the encapsulation. The encapsulation will not consume the NBL. > > */ > > switch(ovsFwdCtx->tunnelTxNic->ovsType) { @@ -807,6 +843,7 @@ > > OvsOutputForwardingCtx(OvsForwardingContext *ovsFwdCtx) > > NDIS_STATUS status = STATUS_SUCCESS; > > POVS_SWITCH_CONTEXT switchContext = ovsFwdCtx->switchContext; > > PCWSTR dropReason; > > +POVS_BUFFER_CONTEXT ctx; > > > > /* > > * Handle the case where the some of the destination ports are tunneled > > @@ -829,8 +866,12 @@ OvsOutputForwardingCtx(OvsForwardingContext > > *ovsFwdCtx) > > * before doing encapsulation. > > */ > > if (ovsFwdCtx->tunnelTxNic != NULL || ovsFwdCtx->tunnelRxNic != > > NULL) { > > +POVS_BUFFER_CONTEXT oldCtx, newCtx; > > nb = NET_BUFFER_LIST_FIRST_NB(ovsFwdCtx->curNbl); > > -newNbl = OvsPartialCopyNBL(ovsFwdCtx->switchContext, > > ovsFwdCtx->curNbl, > > +oldCtx = (POVS_BUFFER_CONTEXT) > > +NET_BUFFER_LIST_CONTEXT_DATA_START(ovsFwdCtx->curNbl); > > +newNbl = OvsPartialCopyNBL(ovs
[ovs-dev] checkpatch name checking
Hi Aaron, checkpatch currently tries to ignores files in the "datapath" directories but it's not entirely successful. I think that's because, in the "parse == 1" case, it doesn't strip a leading "a/" or "b/" from filenames: current_file = match.group(2) whereas in the "parse == 2", it does: current_file = newfile.group(2)[2:] and the check for "datapath" relies on the / being there: # Skip files which have /datapath in them, since they are # linux or windows coding standards if '/datapath' in current_file: continue I'm not sure where the real bug is. Should the prefix be consistently stripped or not stripped? Once that's decided, it's easy to fix the problem. Thanks, Ben. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH] NEWS: Reorganize and edit recent items for clarity.
Signed-off-by: Ben Pfaff --- NEWS | 22 ++ 1 file changed, 10 insertions(+), 12 deletions(-) diff --git a/NEWS b/NEWS index 7a2b185bbd84..abfd13cd93f4 100644 --- a/NEWS +++ b/NEWS @@ -1,20 +1,16 @@ Post-v2.7.0 - - - Tunnels: - * Added support to set packet mark for tunnel endpoint using - `egress_pkt_mark` OVSDB option. - - EMC insertion probability is reduced to 1% and is configurable via - the new 'other_config:emc-insert-inv-prob' option. + - New support for multiple VLANs (802.1ad or "QinQ"), including a new + "dot1q-tunnel" port VLAN mode. + - Tunnel endpoints can now set packet mark using `egress_pkt_mark' in OVSDB. - DPDK: + * EMC insertion probability is reduced to 1% and is configurable via + the new 'other_config:emc-insert-inv-prob' option. * DPDK log messages redirected to OVS logging subsystem. Log level can be changed in a usual OVS way using 'ovs-appctl vlog' commands for 'dpdk' module. Lower bound still can be configured via extra arguments for DPDK EAL. - IPFIX now provides additional counters for totals since startup. - - New support for multiple VLANs (802.1ad or "QinQ"), including a new - "dot1q-tunnel" port VLAN mode. - - In ovn-vsctl and vtep-ctl, record UUIDs in commands may now be - abbreviated to 4 hex digits. - OVN: * New built-in DNS support. * IPAM for IPv4 can now exclude user-defined addresses from assignment. @@ -28,7 +24,10 @@ Post-v2.7.0 abbreviated to 4 hex digits. * "ovn-sbctl lflow-list" can now print OpenFlow flows that correspond to logical flows. - - Add the command 'ovs-appctl stp/show' (see ovs-vswitchd(8)). + * In ovn-vsctl and vtep-ctl, record UUIDs in commands may now be + abbreviated to 4 hex digits. + * Fedora packaging: OVN services no longer restart automatically + after upgrade. - OpenFlow: * All features required by OpenFlow 1.4 are now implemented, so ovs-vswitchd now enables OpenFlow 1.4 by default (in addition to @@ -37,8 +36,7 @@ Post-v2.7.0 * Bundles now support hashing by just nw_src or nw_dst. * The "learn" action now supports a "limit" option (see ovs-ofctl(8)). * The port status bit OFPPS_LIVE now reflects link aliveness. - - Fedora Packaging: - * OVN services are no longer restarted automatically after upgrade. + - Add the command 'ovs-appctl stp/show' (see ovs-vswitchd(8)). - Add --cleanup option to command 'ovs-appctl exit' (see ovs-vswitchd(8)). v2.7.0 - 21 Feb 2017 -- 2.10.2 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH 1/2] sparse: Add rte_memcpy.h replacement header.
Darrell, do you plan to review patch 2? On Sun, May 07, 2017 at 08:58:35AM -0400, Ben Pfaff wrote: > OK, I added that it affects i386 and applied this to master. Thank you! > > On Sat, May 06, 2017 at 08:27:47PM +, Darrell Ball wrote: > > I am not able to reproduce this using 64bit arch., although Sparse does > > work for me. > >(I guess I should install a 32 bit VM at some point, but I am low on > > disk space - too many 64bit VMs) > > I tried tip of master and about 2 months old master > > > > Based on the above data point and the error output below, it seems like > > this noise only afflicts 32 bit arch > > Maybe specify in the commit message that this helps on 32 bit arch, unless > > others can see this or > > something similar on 64bit. > > > > Acked-by: Darrell Ball > > > > > > > > On 5/5/17, 9:39 PM, "ovs-dev-boun...@openvswitch.org on behalf of Ben > > Pfaff" wrote: > > > > Without this replacement header, building netdev-dpdk.c provokes several > > "sparse" warnings: > > > > /usr/include/dpdk/rte_memcpy.h:515:33: warning: incorrect type in > > argument 1 (different type sizes) > > /usr/include/dpdk/rte_memcpy.h:515:33:expected long long const > > [usertype] *__P > > /usr/include/dpdk/rte_memcpy.h:515:33:got int const [usertype] > > * > > /usr/lib/gcc/i686-linux-gnu/6//include/emmintrin.h:698:20: error: > > undefined identifier '__builtin_ia32_loaddqu' > > /usr/lib/gcc/i686-linux-gnu/6//include/emmintrin.h:698:11: error: cast > > from unknown type > > /usr/lib/gcc/i686-linux-gnu/6//include/emmintrin.h:716:3: error: > > undefined identifier '__builtin_ia32_storedqu' > > /usr/lib/gcc/i686-linux-gnu/6//include/emmintrin.h:698:43: error: not a > > function > > /usr/lib/gcc/i686-linux-gnu/6//include/emmintrin.h:716:27: error: not a > > function > > ... > > > > Signed-off-by: Ben Pfaff > > --- > > include/sparse/automake.mk | 1 + > > include/sparse/rte_memcpy.h | 39 > > +++ > > 2 files changed, 40 insertions(+) > > create mode 100644 include/sparse/rte_memcpy.h > > > > diff --git a/include/sparse/automake.mk b/include/sparse/automake.mk > > index 0456ee67d481..f3282c260414 100644 > > --- a/include/sparse/automake.mk > > +++ b/include/sparse/automake.mk > > @@ -10,6 +10,7 @@ noinst_HEADERS += \ > > include/sparse/pthread.h \ > > include/sparse/rte_atomic.h \ > > include/sparse/rte_lcore.h \ > > +include/sparse/rte_memcpy.h \ > > include/sparse/rte_vect.h \ > > include/sparse/sys/socket.h \ > > include/sparse/sys/wait.h > > diff --git a/include/sparse/rte_memcpy.h b/include/sparse/rte_memcpy.h > > new file mode 100644 > > index ..5cd3f013ea8b > > --- /dev/null > > +++ b/include/sparse/rte_memcpy.h > > @@ -0,0 +1,39 @@ > > +/* Copyright (c) 2017 Nicira, Inc. > > + * > > + * Licensed under the Apache License, Version 2.0 (the "License"); > > + * you may not use this file except in compliance with the License. > > + * You may obtain a copy of the License at: > > + * > > + * > > https://urldefense.proofpoint.com/v2/url?u=http-3A__www.apache.org_licenses_LICENSE-2D2.0&d=DwICAg&c=uilaK90D4TOVoH58JNXRgQ&r=BVhFA09CGX7JQ5Ih-uZnsw&m=Qv-3YRHpSQuMTnL-vRyqF0jCaFIpiuYqPGtkgPYEwxA&s=AzghyY07JReCfJwWQShu12qMFGiOML9IpKTIdvsOTu0&e= > > > > + * > > + * Unless required by applicable law or agreed to in writing, software > > + * distributed under the License is distributed on an "AS IS" BASIS, > > + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or > > implied. > > + * See the License for the specific language governing permissions and > > + * limitations under the License. > > + */ > > + > > +#ifndef RTE_MEMCPY_H > > +#define RTE_MEMCPY_H 1 > > + > > +#ifndef __CHECKER__ > > +#error "Use this header only with sparse. It is not a correct > > implementation." > > +#endif > > + > > +/* Include the same headers as the real rte_memcpy(). */ > > +#include > > +#include > > +#include > > +#include > > + > > +/* Declare the same functions as the real rte_memcpy.h, without > > defining them. > > + * This gives sparse the information it needs without provoking > > sparse's > > + * complaints about the implementations. */ > > +void rte_mov16(uint8_t *, const uint8_t *); > > +void rte_mov32(uint8_t *, const uint8_t *); > > +void rte_mov64(uint8_t *, const uint8_t *); > > +void rte_mov128(uint8_t *, const uint8_t *); > > +void rte_mov256(uint8_t *, const uint8_t *); > > +void *rte_memcpy(void *, const void *, size_t); > > + > > +#endif /* RTE_MEMCPY_H_WRAPPER */ > > -- > > 2.10.2 > > > > __
Re: [ovs-dev] Package/service name on Ubuntu vs RedHat
On Mon, May 08, 2017 at 12:57:43PM +, Balazs Nemeth wrote: > Currently the deb package and service name is 'openvswitch-switch' on Ubuntu > according to debian/rules, but the rpm package and service name is > 'openvswitch' on Red Hat according to rhel/openvswitch.spec.in. > Why is this difference exist? Is it possible to use 'openvswitch-switch' on > every existing platform? It would be much easier to handle e.g. the > documentation with unified service name. I *think* that I was the original packager in both cases. The likely reason is that, in my perception, Debian tends to break packages up in many small chunks, whereas Red Hat tends to use more monolithic packaging. The result is that the Debian packaging for OVS has something like 11 packages (ignoring OVN) whereas Red Hat has only a few (originally just one or two, I think, although maybe it's been divided up a bit now). In each case, the service was named after the package that actually provided the switch functionality: on Debian, this was the openvswitch-switch package, on Red Hat, this was the openvswitch package. Whether this is good or desirable, I don't know. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v2] ofproto-dpif-ipfix: total counters.
On Thu, Apr 20, 2017 at 03:25:13PM +0100, mweglicx wrote: > Implementation of IPFix counters which hold > total values measured since metering process startup. > > v2: Patch is reapplied for a review because > it hasn't been correctly marked in patchwork. > > Signed-off-by: Michal Weglicki I applied this to master. Thank you for improving Open vSwitch support for IPFIX! ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH 6/6] ovn-sbctl: support setting rbac role for remote connections
On Mon, May 01, 2017 at 10:13:32AM -0400, Lance Richardson wrote: > Add support for specifying rbac "role" when setting remote > connection configuration in the southbound database. > > Prior to this change, usage examples included: > > ovn-sbctl set-connection ptcp:6642 > ovn-sbctl set-connection pssl:6642 \ > read-only ptcp: \ > read-write punix:/tmp.foo > > With this change, in addition to the above: > > ovn-sbctl set-connection role=ovn-controller pssl:6642 \ > read-only role= ptcp: \ > read-write punix:/tmp/foo > > As with the "read-only"/"read-write" attributes, the specified > role is applied to all subsequent connections until changed. > > Signed-off-by: Lance Richardson Looks good, thanks. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v2] ofproto-dpif-ipfix: total counters.
On Mon, May 08, 2017 at 09:22:42AM +, Weglicki, MichalX wrote: > > -Original Message- > > From: Ben Pfaff [mailto:b...@ovn.org] > > Sent: Monday, May 1, 2017 7:36 PM > > To: Weglicki, MichalX > > Cc: d...@openvswitch.org > > Subject: Re: [ovs-dev] [PATCH v2] ofproto-dpif-ipfix: total counters. > > > > On Thu, Apr 20, 2017 at 03:25:13PM +0100, mweglicx wrote: > > > Implementation of IPFix counters which hold > > > total values measured since metering process startup. > > > > > > v2: Patch is reapplied for a review because > > > it hasn't been correctly marked in patchwork. > > > > > > Signed-off-by: Michal Weglicki > > > > Thanks a lot for working on IPFix! I don't have IPFix expertise, and I > > don't think the other OVS committers have much either, so I need to ask > > some dumb questions to better understand the implications of this patch. > > > > Actually, maybe I just need to ask one. This changes the wire format of > > the IPFix reports that go out. Will this require changes to IPFix > > collectors that understood the previous format, that is, is the new > > format backward compatible with collectors that expected the previous > > format? I understand that IPFix uses a template format to tell > > collectors the format of what it's sending, but it's not clear to me > > whether or not that prevents this kind of compatibility issue. > > > > Thanks, > > > > Ben. > > Hello Ben, > > So counters which just got exposed through my patch are part of RFC > which is already partially implemented in OVS. I've just added few > missing ones, and I'm going to implement also others from same > category (https://tools.ietf.org/html/rfc5102#section-5.10) > > However to answer your question directly those counters are > part of standard group of IPFix counters, so no changes on > Collector side are required. IPFix sends counters in kind of > dictionary format, so adding any counter shouldn't break > backward compatibility - IPFix even defines something like > enterprise counters (vendor specific) where any agent can > define own counters in the template and send those over > (not every collector supports it, but data-wise it shouldn't > break anything on collector side). > > My patch was also tested against libipfix/ipfix_collector > and no additional changes were required to see new > counters on collector side. Thanks for the reassurance! I'll take a closer look at the patch now. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] Package/service name on Ubuntu vs RedHat
Hi, Currently the deb package and service name is 'openvswitch-switch' on Ubuntu according to debian/rules, but the rpm package and service name is 'openvswitch' on Red Hat according to rhel/openvswitch.spec.in. Why is this difference exist? Is it possible to use 'openvswitch-switch' on every existing platform? It would be much easier to handle e.g. the documentation with unified service name. Best regards, Balazs Nemeth ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH ovs V8 07/26] netdev-tc-offloads: Implement netdev flow flush using tc interface
On Wed, May 03, 2017 at 06:07:58PM +0300, Roi Dayan wrote: > From: Paul Blakey > > Signed-off-by: Paul Blakey > Reviewed-by: Roi Dayan > Reviewed-by: Simon Horman This does not appear to address Flavio Leitner's review of v7 of this patch. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH ovs V8 06/26] dpif-netlink: Flush added ports using netdev flow api
On Wed, May 03, 2017 at 06:07:57PM +0300, Roi Dayan wrote: > From: Paul Blakey > > If netdev flow offloading is enabled, flush all > added ports using netdev flow api. > > Signed-off-by: Paul Blakey > Reviewed-by: Roi Dayan > Reviewed-by: Simon Horman Flavio Leitner provided an Ack for v7 of this patch. You may want to consider adding that tag. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH ovs V8 05/26] dpif: Save added ports in a port map for netdev flow api use
On Wed, May 03, 2017 at 06:07:56PM +0300, Roi Dayan wrote: > From: Paul Blakey > > To use netdev flow offloading api, dpifs needs to iterate over > added ports. This addition inserts the added dpif ports in a hash map, > The map will also be used to translate dpif ports to netdevs. > > Signed-off-by: Paul Blakey > Reviewed-by: Roi Dayan > Reviewed-by: Simon Horman ... > diff --git a/lib/netdev.h b/lib/netdev.h > index 7435fdf..9aa7e5e 100644 > --- a/lib/netdev.h > +++ b/lib/netdev.h > @@ -181,6 +181,12 @@ int netdev_init_flow_api(struct netdev *); > extern bool netdev_flow_api_enabled; > void netdev_set_flow_api_enabled(const struct smap *ovs_other_config); > > +struct dpif_port; > +int netdev_ports_insert(struct netdev *, const void *obj, struct dpif_port > *); > +struct netdev *netdev_ports_get(odp_port_t port, const void *obj); > +int netdev_ports_remove(odp_port_t port, const void *obj); > +odp_port_t netdev_ifindex_to_odp_port(int ifindex); > + > /* native tunnel APIs */ > /* Structure to pass parameters required to build a tunnel header. */ > struct netdev_tnl_build_header_params { This patch seems to only partially address the review provided by Joe Stringer for v7. In particular: * netdev_ports_get() -> netdev_ports_lookup() * Feedback regarding 'obj' being a not particularly clear abstraction. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH ovs V8 04/26] other-config: Add tc-policy switch to control tc flower flag
On Wed, May 03, 2017 at 06:07:55PM +0300, Roi Dayan wrote: > From: Paul Blakey > > Add a new configuration tc-policy option that controls tc > flower flag. Possible options are none, skip_sw, skip_hw. > The default is none which is to insert the rule both to sw and hw. > This option is only relevant if hw-offload is enabled. > > Signed-off-by: Paul Blakey > Reviewed-by: Roi Dayan > Reviewed-by: Simon Horman This does not appear to address Flavio Leitner's review of v7 of this patch. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH ovs V8 03/26] other-config: Add hw-offload switch to control netdev flow offloading
On Wed, May 03, 2017 at 06:07:54PM +0300, Roi Dayan wrote: > From: Paul Blakey > > Add a new configuration option - hw-offload that enables netdev > flow api. Enabling this option will allow offloading flows > using netdev implementation instead of the kernel datapath. > This configuration option defaults to false - disabled. > > Signed-off-by: Paul Blakey > Reviewed-by: Roi Dayan > Reviewed-by: Simon Horman ... This does not appear to address Flavio Leitner's review of v7 of this patch. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH ovs V8 02/26] netdev: Adding a new netdev api to be used for offloading flows
On Wed, May 03, 2017 at 06:07:53PM +0300, Roi Dayan wrote: > From: Paul Blakey Please add some text to the changelog. > > Signed-off-by: Paul Blakey > Reviewed-by: Roi Dayan > Reviewed-by: Simon Horman ... > diff --git a/lib/netdev-tc-offloads.c b/lib/netdev-tc-offloads.c > new file mode 100644 > index 000..eb5e79a > --- /dev/null > +++ b/lib/netdev-tc-offloads.c > @@ -0,0 +1,154 @@ > +/* > + * Copyright (c) 2016 Mellanox Technologies, Ltd. > + * > + * Licensed under the Apache License, Version 2.0 (the "License"); > + * you may not use this file except in compliance with the License. > + * You may obtain a copy of the License at: > + * > + * http://www.apache.org/licenses/LICENSE-2.0 > + * > + * Unless required by applicable law or agreed to in writing, software > + * distributed under the License is distributed on an "AS IS" BASIS, > + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. > + * See the License for the specific language governing permissions and > + * limitations under the License. > + */ > + > +#include > + > +#include "netdev-tc-offloads.h" > + > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + > +#include "coverage.h" > +#include "dp-packet.h" > +#include "dpif-netlink.h" > +#include "dpif-netdev.h" > +#include "openvswitch/dynamic-string.h" > +#include "fatal-signal.h" > +#include "hash.h" > +#include "openvswitch/hmap.h" > +#include "netdev-provider.h" > +#include "netdev-vport.h" > +#include "netlink-notifier.h" > +#include "netlink-socket.h" > +#include "netlink.h" > +#include "openvswitch/ofpbuf.h" > +#include "openflow/openflow.h" > +#include "ovs-atomic.h" > +#include "packets.h" > +#include "poll-loop.h" > +#include "rtnetlink.h" > +#include "openvswitch/shash.h" > +#include "netdev-provider.h" > +#include "openvswitch/match.h" > +#include "openvswitch/vlog.h" > +#include "tc.h" Are all of the above headers needed for what follows? There seems to be a lot of #includes above. > +VLOG_DEFINE_THIS_MODULE(netdev_tc_offloads); > + > +int > +netdev_tc_flow_flush(struct netdev *netdev OVS_UNUSED) > +{ > +return EOPNOTSUPP; > +} > + > +int > +netdev_tc_flow_dump_create(struct netdev *netdev, > + struct netdev_flow_dump **dump_out) > +{ > +struct netdev_flow_dump *dump = xzalloc(sizeof *dump); > + > +dump->netdev = netdev_ref(netdev); > + > +*dump_out = dump; > + > +return 0; > +} > + > +int > +netdev_tc_flow_dump_destroy(struct netdev_flow_dump *dump) > +{ > +netdev_close(dump->netdev); > +free(dump); > + > +return 0; > +} > + > +bool > +netdev_tc_flow_dump_next(struct netdev_flow_dump *dump OVS_UNUSED, > + struct match *match OVS_UNUSED, > + struct nlattr **actions OVS_UNUSED, > + struct dpif_flow_stats *stats OVS_UNUSED, > + ovs_u128 *ufid OVS_UNUSED, > + struct ofpbuf *rbuffer OVS_UNUSED, > + struct ofpbuf *wbuffer OVS_UNUSED) > +{ > +return false; > +} > + > +int > +netdev_tc_flow_put(struct netdev *netdev OVS_UNUSED, > + struct match *match OVS_UNUSED, > + struct nlattr *actions OVS_UNUSED, > + size_t actions_len OVS_UNUSED, > + struct dpif_flow_stats *stats OVS_UNUSED, > + const ovs_u128 *ufid OVS_UNUSED, Here and above ufid follows stats. > + struct offload_info *info OVS_UNUSED) > +{ > +return EOPNOTSUPP; > +} > + > +int > +netdev_tc_flow_get(struct netdev *netdev OVS_UNUSED, > + struct match *match OVS_UNUSED, > + struct nlattr **actions OVS_UNUSED, > + const ovs_u128 *ufid OVS_UNUSED, > + struct dpif_flow_stats *stats OVS_UNUSED, But here the order is reversed. I always think that when a reviewer asks for entries to be sorted they have run out of things to say. But nonetheless it would be slightly nicer if the order was consistent unless there is a reason for it not to be. > + struct ofpbuf *buf OVS_UNUSED) > +{ > +return EOPNOTSUPP; > +} > + > +int > +netdev_tc_flow_del(struct netdev *netdev OVS_UNUSED, > + const ovs_u128 *ufid OVS_UNUSED, > + struct dpif_flow_stats *stats OVS_UNUSED) > +{ > +return EOPNOTSUPP; > +} > + > +int > +netdev_tc_init_flow_api(struct netdev *netdev OVS_UNUSED) > +{ > +return 0; > +} > + There is a trailing blank line above. > diff --git a/lib/netdev-tc-offloads.h b/lib/netdev-tc-offloads.h > new file mode 100644 > index 000..7
Re: [ovs-dev] [PATCH 0/3] Support OpenFlow 1.5 packet-out
Hi Yi-Hung, We have already started to add support for the OpenFlow 1.5 Packet Out message in our patch series for "packet type-aware pipeline: https://mail.openvswitch.org/pipermail/ovs-dev/2017-April/331023.html https://mail.openvswitch.org/pipermail/ovs-dev/2017-April/331032.html Our focus was to add support for the new packet_type pipeline field, and I'm not sure we have covered all necessary efforts to support the rest of the pipeline fields. Please have a look at our patch and make sure that our efforts are aligned. Thanks, Jan > -Original Message- > From: ovs-dev-boun...@openvswitch.org > [mailto:ovs-dev-boun...@openvswitch.org] On Behalf Of Ben Pfaff > Sent: Saturday, 06 May, 2017 06:40 > To: Yi-Hung Wei > Cc: d...@openvswitch.org > Subject: Re: [ovs-dev] [PATCH 0/3] Support OpenFlow 1.5 packet-out > > On Thu, May 04, 2017 at 04:12:17PM -0700, Yi-Hung Wei wrote: > > This patch series add support to OpenFlow 1.5 packet-out message that > > enables setting pipeline fields. While there are six pipeline match > > fields as listed in OpenFlow spec 1.5.1, this patch series implements > > three of them (OXM_OF_IN_PORT, OXM_OF_METADATA, and OXM_OF_TUNNEL_ID), > > since the rest of them (OXM_OF_IN_PHY_PORT, OXM_OF_ACTSET_OUTPUT, and > > OXM_OF_PACKET_TYPE) does not fit into the ovs. > > > > Yi-Hung Wei (3): > > ofp-util: Add flow metadata to ofputil_packet_out > > ofproto: Add OpenFlow 1.5 packet-out support > > ofp-parse: Parse pipeline fields in OF1.5 packet-out > > Thanks for working on this feature. I'm looking forward to being able > to say that OVS supports all the features that OpenFlow 1.5 requires. I > think that with this and extensible flow entry statistics (which we've > already seen a draft of from, I believe, a TCS developer) we'll be > there. > > This is a high quality series. Thank you! I have only a few comments. > > First, I see that this series represents metadata as a "struct match". > Certainly, this works, but I wonder about the alternatives. The most > obvious one is "struct flow", which has all the same features as struct > match, without the masks. To me, it looks like the masks in struct > match aren't even used, at least not consistently; for example, this > code in parse_ofp_packet_out_str__() initializes in_port without its > mask: > > *po = (struct ofputil_packet_out) { > .buffer_id = UINT32_MAX, > .flow_metadata.flow.in_port.ofp_port = OFPP_CONTROLLER, > }; > > The other possible data structure for metadata is struct pkt_metadata, > which is designed specifically for metadata. However it's currently > used mostly in packet handling rather than in OpenFlow, so it would > probably be a second choice. > > Second, this series seems to take the literal definition of "pipeline > fields" from OpenFlow, only allowing those fields actually mentioned in > OpenFlow to be used in "packet-out"s. But I think that we should > include OVS extension pipeline fields, too, such as the various fields > with tunnel information. I also think that the implementation misses > some fields that OpenFlow itself defines as pipeline fields, such as the > OpenFlow packet register pipeline fields. > > Can you think through these issues and write up a v2? > > Thanks, > > Ben. > ___ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] Liebes Kind Gottes,
Liebes Kind Gottes, Grüße 'ich bin. Frau Eunice Ton, eine Witwe zu spät Herr Fabrice Ton. Ich bin 75 Jahre alt und leide an Bauchspeicheldrüsenkrebs. Mein Zustand ist wirklich schlecht und es ist ganz offensichtlich, dass ich nicht mehr als drei Monate nach meinen Ärzten leben. Ich bin bereit, die Summe von sechs Millionen, vierhunderttausend Dollar (6,4 Mio. US $) united states dollars durch Sie in anderen zu spenden, um den armen Witwen und den weniger privilegierten in den ländlichen Gebieten zu helfen und andere karitative Arbeiten durchzuführen . Ich warte auf Ihre dringende Antwort zu gehen. Freundliche Grüße, Frau Eunice Ton ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v5] tunneling: Avoid recirculation on datapath by computing the recirculate actions at translate time.
Hi, I looked into the code of truncate, I saw that patch ports are not handled. On the other hand I saw that "Avoid recirculation" commit should not be affected by this fact. I verified that packets are truncated correctly with my last patch I sent you before, but flow stats are not correct on the underlay bridge. Could you confirm this, please? Best regards, Zoltan > -Original Message- > From: Zoltán Balogh > Sent: Sunday, May 07, 2017 9:16 PM > To: 'Joe Stringer' ; William Tu > Cc: Sugesh Chandran ; ovs dev > ; Ben Pfaff ; Andy Zhou > ; Jan Scheurich > Subject: RE: [ovs-dev] [PATCH v5] tunneling: Avoid recirculation on datapath > by computing the recirculate actions at > translate time. > > > >> I have a patch that fixes tunneling over patch ports. The 14th > > >> system-userspace > > >> test still does fail, but now the packet size in the dump flow remains > > >> 242. > > >> > > >> ./system-traffic.at:554: ovs-ofctl dump-flows br0 | grep "in_port=2" | > > >> sed -n 's/.*\(n\_bytes=[0-9]*\).*/\1/p' > > >> ./system-traffic.at:558: ovs-ofctl dump-flows br-underlay | grep > > >> "in_port=LOCAL" | sed -n 's/.*\(n\_bytes=[0- > > 9]*\).*/\1/p' > > >> --- - 2017-05-05 19:52:28.987111424 +0200 > > >> +++ > > >> /home/ezolbal/work/general_L3_tunneling/ovs/tests/system-userspace-testsuite.dir/at-groups/14/stdout > > 2017-05-05 19:52:28.983508027 +0200 > > >> @@ -1,2 +1,2 @@ > > >> -n_bytes=138 > > >> +n_bytes=242 > > >> > > >> I'm little bit lost with flow statistics. Could you have a look at the > > >> patch > > >> below, and help me to find out if it's only a statistics bug, please? > > > > > > Ah, so the more interesting part of that test is that it also > > > truncates the packet during output to the tunnel. So the tunnel should > > > only see a 100B packet (encapped within GRE, so 138B). Commit > > > aaca4fe0ce9e ("ofp-actions: Add truncate action.") was where this > > > functionality was originally introduced, perhaps you can look at that > > > to determine how the truncation should be applied in this case? > > > > Looking again, I think this also states that regardless of the > > truncate, the second bridge should attribute stats for the encapped > > packet, so even if there was no truncation it should be more than > > 242B. When it comes to applying geneve TLVs as well, I'd expect this > > to be calculated correctly. > > Hi Joe, > Hi William, > > I was thinking about the "Avoid recirculation" code created by Sugesh and > myself. It is based on the code patch > ports do use. > So I reverted our "Avoid recirculation" commit on master branch and tried to > truncate packets on a patch port. > I used the setup below. > > > 192.168.10.10 192.168.10.20 > ns0 ns1 >| | >| br0-ns0 | br1-ns1 > +--o---+ +--o---+ > | | | | > |br0 | |br1 | > | | | | > +--oo--+ +--oo--+ > veth0 || patch0 patch1 || veth1 >|++| >| | >+--+ > > > I attached two namespaces (ns0, ns1) to two netdev bridges (br0, br1), then I > connected the bridges over veth and > patch ports. > > netdev@ovs-netdev: hit:915736 missed:28 > br0: > br0 65534/1: (tap) > br0-ns0 1/3: (system) > patch0 2/none: (patch: peer=patch1) > veth0 3/5: (system) > br1: > br1 65534/2: (tap) > br1-ns1 1/4: (system) > patch1 2/none: (patch: peer=patch0) > veth1 3/6: (system) > > When I added flow rules to forward and truncate packets over veth ports, the > ping from ns0 to ns1 did work. > > $ ovs-ofctl del-flows br0 > $ ovs-ofctl del-flows br1 > $ ovs-ofctl add-flow br0 in_port=1,action=3 > $ ovs-ofctl add-flow br0 in_port=3,actions=1 > $ ovs-ofctl add-flow br1 in_port=3,actions=1 > $ ovs-ofctl add-flow br1 in_port=1,actions=output\(port=3,max_len=200\) > > $ ip netns exec ns0 ping 192.168.10.20 > > flow-dump from non-dpdk interfaces: > recirc_id(0),in_port(3),eth_type(0x0800),ipv4(frag=no), packets:180, > bytes:17640, used:0.610s, actions:5 > recirc_id(0),in_port(5),eth_type(0x0800),ipv4(frag=no), packets:24, > bytes:2352, used:0.610s, actions:3 > recirc_id(0),in_port(4),eth_type(0x0800),ipv4(frag=no), packets:178, > bytes:17444, used:0.610s, > actions:trunc(200),6 > recirc_id(0),in_port(6),eth_type(0x0800),ipv4(frag=no), packets:25, > bytes:2450, used:0.610s, actions:4 > > When I added flow rules to forward and truncate packets over patch ports, > then ping reply packets did not arrive. > > $ ovs-ofctl del-flows br0 > $ ovs-ofctl del-flows br1 > $ ovs-ofctl add-flow br0 in_port=1,action=2 > $ ovs-ofctl add-flow br0 in_port=2,actions=1 > $
Re: [ovs-dev] [RFC PATCH 0/6] Change dpdk rxq scheduling to incorporate rxq processing cycles.
On 05/06/2017 02:01 AM, Greg Rose wrote: > On Fri, May 5, 2017 at 9:34 AM, Kevin Traynor wrote: >> Rxqs are scheduled to be handled across available pmds in round robin >> order with no weight or priority. >> >> It can happen that some very busy queues are handled by one pmd which >> does not have enough cycles to prevent packets being dropped on them. >> While at the same time another pmd which handles queues with no traffic >> on them, is essentially idling. >> >> Rxq scheduling happens as a result of a number of events and when it does, >> the same unweighted round robin approach is applied each time. > > I've seen and heard of HW that can be configured with a weighted > round robin approach in which some queues or pools of queues will be > given a higher priority in which they have more 'credits'. > Specifically I know of some Intel HW that is capable of that. What I > do not know is if any of that ever made it into the Linux kernel or if > it is something that might cause your statement to be inaccurate in > all cases. > > My worry is that this assertion may not be correct. > > I'm copying John Fastabend, one of my old co-workers at Intel to catch > is take. He could probably point us in the right direction. > > - Greg Hi Greg, Thanks for your reply. I'm thinking I was too vague when I said "Rxq scheduling", and we may be talking about slightly different things. Let me elaborate on what I mean and why I used that term. When using the userspace datapath and dpdk type ports (e.g. type dpdk, dpdkvhostuser) the rxqs from those ports are polled for packets by pmd thread(s) in OVS. If there is just one pmd thread, it will poll all the dpdk ports rxqs. However, if there are more than one pmd threads, the OVS userspace code will distribute the polling of rxqs across the pmds. The function that decides the assignment of rxqs to pmds is called rxq_scheduling() and that is the code I was referring to (see patch 6/6), with the patchset changing the method of assignment and hoping to improve the results. Either as part of this or separate, it might be worth changing the name of the function to something like rxq_pmd_assign() to be more specific about what it does. Let me know if this clears things up, or I misinterpreted. thanks, Kevin. > >> >> This patchset proposes to augment the round robin nature of rxq scheduling >> by counting the processing cycles used by the rxqs during their operation >> and incorporate it into the rxq scheduling. >> >> Before distributing in a round robin manner, the rxqs will be sorted in >> order of the processing cycles they have been consuming. Assuming multiple >> pmds, this ensures that the measured rxqs using most processing cycles will >> be distributed to different cores. >> >> To try out: >> This patchset requires the updated pmd counting patch applied as a >> prerequisite. https://patchwork.ozlabs.org/patch/729970/ >> >> Alternatively the series with dependencies can be cloned from here: >> https://github.com/kevintraynor/ovs-rxq.git >> >> Simple way to test is add some dpdk ports, add multiple pmds, vary traffic >> rates and rxqs on ports and trigger reschedules e.g. by changing rxqs or >> the pmd-cpu-mask. >> >> Check rxq distribution with ovs-appctl dpif-netdev/pmd-rxq-show and see >> if it matches expected. >> >> todo: >> -possibly add a dedicated reschedule trigger command >> -use consistent type names >> -update docs >> -more testing, especially for dual numa >> >> thanks, >> Kevin. >> >> Kevin Traynor (6): >> dpif-netdev: Add rxq processing cycle counters. >> dpif-netdev: Update rxq processing cycles from >> cycles_count_intermediate. >> dpif-netdev: Change polled_queue to use dp_netdev_rxq. >> dpif-netdev: Make dpcls optimization interval more generic. >> dpif-netdev: Count the rxq processing cycles for an rxq. >> dpif-netdev: Change rxq_scheduling to use rxq processing cycles. >> >> lib/dpif-netdev.c | 163 >> -- >> 1 file changed, 133 insertions(+), 30 deletions(-) >> >> -- >> 1.8.3.1 >> >> ___ >> dev mailing list >> d...@openvswitch.org >> https://mail.openvswitch.org/mailman/listinfo/ovs-dev ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [RFC PATCH 1/1] netdev-dpdk: enable multi-segment jumbo frames
Currently, jumbo frame support for OvS-DPDK is implemented by increasing the size of mbufs within a mempool, such that each mbuf within the pool is large enough to contain an entire jumbo frame of a user-defined size. Typically, for each user-defined MTU 'requested_mtu', a new mempool is created, containing mbufs of size ~requested_mtu. With the multi-segment approach, all ports share the same mempool, in which each mbuf is of standard/default size (~2k MB). To accommodate jumbo frames, mbufs may be chained together, each mbuf storing a portion of the jumbo frame; each mbuf in the chain is termed a segment, hence the name. --- lib/dpdk.c | 6 ++ lib/netdev-dpdk.c| 54 ++-- lib/netdev-dpdk.h| 1 + vswitchd/vswitch.xml | 19 ++ 4 files changed, 74 insertions(+), 6 deletions(-) diff --git a/lib/dpdk.c b/lib/dpdk.c index 8da6c32..7d08e34 100644 --- a/lib/dpdk.c +++ b/lib/dpdk.c @@ -450,6 +450,12 @@ dpdk_init__(const struct smap *ovs_other_config) /* Finally, register the dpdk classes */ netdev_dpdk_register(); + +bool multi_seg_mbufs_enable = smap_get_bool(ovs_other_config, "dpdk-multi-seg-mbufs", false); +if (multi_seg_mbufs_enable) { +VLOG_INFO("DPDK multi-segment mbufs enabled\n"); +netdev_dpdk_multi_segment_mbufs_enable(); +} } void diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index 34fc54b..82bc0e2 100644 --- a/lib/netdev-dpdk.c +++ b/lib/netdev-dpdk.c @@ -58,6 +58,7 @@ VLOG_DEFINE_THIS_MODULE(netdev_dpdk); static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 20); +static bool dpdk_multi_segment_mbufs = false; #define DPDK_PORT_WATCHDOG_INTERVAL 5 @@ -480,7 +481,7 @@ dpdk_mp_create(int socket_id, int mtu) * when the number of ports and rxqs that utilize a particular mempool can * change dynamically at runtime. For now, use this rough heurisitic. */ -if (mtu >= ETHER_MTU) { +if (mtu >= ETHER_MTU || dpdk_multi_segment_mbufs) { mp_size = MAX_NB_MBUF; } else { mp_size = MIN_NB_MBUF; @@ -558,17 +559,33 @@ dpdk_mp_put(struct dpdk_mp *dmp) ovs_mutex_unlock(&dpdk_mp_mutex); } -/* Tries to allocate new mempool on requested_socket_id with - * mbuf size corresponding to requested_mtu. +/* Tries to configure a mempool for 'dev' on requested socket_id to accommodate + * packets of size 'requested_mtu'. The properties of the mempool's elements + * are dependent on the value of 'dpdk_multi_segment_mbufs': + * - if 'true', then the mempool contains standard-sized mbufs that are chained + * together to accommodate packets of size 'requested_mtu'. All ports on the + * same socket will share this mempool, irrespective of their MTU. + * - if 'false', then a mempool is allocated, the members of which are non- + * standard-sized mbufs. Each mbuf in the mempool is large enough to fully + * accomdate packets of size 'requested_mtu'. + * * On success new configuration will be applied. * On error, device will be left unchanged. */ static int netdev_dpdk_mempool_configure(struct netdev_dpdk *dev) OVS_REQUIRES(dev->mutex) { -uint32_t buf_size = dpdk_buf_size(dev->requested_mtu); +uint32_t buf_size = 0; struct dpdk_mp *mp; +/* Contiguous mbufs in use - permit oversized mbufs */ +if (!dpdk_multi_segment_mbufs) { +buf_size = dpdk_buf_size(dev->requested_mtu); +} else { +/* multi-segment mbufs - use standard mbuf size */ +buf_size = dpdk_buf_size(ETHER_MTU); +} + mp = dpdk_mp_get(dev->requested_socket_id, FRAME_LEN_TO_MTU(buf_size)); if (!mp) { VLOG_ERR("Failed to create memory pool for netdev " @@ -577,7 +594,13 @@ netdev_dpdk_mempool_configure(struct netdev_dpdk *dev) rte_strerror(rte_errno)); return rte_errno; } else { -dpdk_mp_put(dev->dpdk_mp); +/* When single-segment mbufs are in use, a new mempool is allocated, + * so release the old one. In the case of multi-segment mbufs, the + * same mempool is used for all MTUs. + */ +if (!dpdk_multi_segment_mbufs) { +dpdk_mp_put(dev->dpdk_mp); +} dev->dpdk_mp = mp; dev->mtu = dev->requested_mtu; dev->socket_id = dev->requested_socket_id; @@ -639,6 +662,7 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk *dev, int n_rxq, int n_txq) int diag = 0; int i; struct rte_eth_conf conf = port_conf; +struct rte_eth_txconf txconf; if (dev->mtu > ETHER_MTU) { conf.rxmode.jumbo_frame = 1; @@ -666,9 +690,21 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk *dev, int n_rxq, int n_txq) break; } +/* DPDK PMDs typically attempt to use simple or vectorized + * transmit functions, neither of which are compatible with + * multi-segment mbufs. Ensure that these are disabled in the + * multi-segment mbu
Re: [ovs-dev] [PATCH 0/1] netdev-dpdk: multi-segment mbuf jumbo frame support
Please disregard this patch - incorrect subject line. I'll submit the updated version promptly. Thanks, Mark >-Original Message- >From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-boun...@openvswitch.org] >On Behalf Of >Mark Kavanagh >Sent: Monday, May 8, 2017 10:59 AM >To: ovs-dev@openvswitch.org; qiud...@chinac.com >Subject: [ovs-dev] [PATCH 0/1] netdev-dpdk: multi-segment mbuf jumbo frame >support > >This RFC introduces an approach for implementing jumbo frame support for >OvS-DPDK with multi-segment mbufs. > >== Overview == >Currently, jumbo frame support for OvS-DPDK is implemented by increasing >the size of mbufs within a mempool, such that each mbuf within the pool is >large enough to contain an entire jumbo frame of a user-defined size. >Typically, for each user-defined MTU 'requested_mtu', a new mempool is created, >containing mbufs of size ~requested_mtu. > >With the multi-segment approach, all ports share the same mempool, in which >each mbuf is of standard/default size (~2k MB). To accommodate jumbo frames, >mbufs may be chained together, each mbuf storing a portion of the jumbo frame; >each mbuf in the chain is termed a segment, hence the name. > > >== Enabling multi-segment mbufs == >Multi-segment and single-segment mbufs are mutually exclusive, and the user >must decide on which approach to adopt on init. The introduction of a new >optional OVSDB field, 'dpdk-multi-seg-mbufs', facilitates this; this is a >boolean field, which defaults to false. Setting the field is identical to >setting existing DPDPK-specific OVSDB fields: > >sudo $OVS_DIR/utilities/ovs-vsctl --no-wait set Open_vSwitch . > other_config:dpdk- >init=true >sudo $OVS_DIR/utilities/ovs-vsctl --no-wait set Open_vSwitch . > other_config:dpdk-lcore- >mask=0x10 >sudo $OVS_DIR/utilities/ovs-vsctl --no-wait set Open_vSwitch . > other_config:dpdk-socket- >mem=4096,0 >==> sudo $OVS_DIR/utilities/ovs-vsctl --no-wait set Open_vSwitch . >other_config:dpdk-multi- >seg-mbufs=true > > >== Code base == >This patch is dependent on the multi-segment mbuf patch submitted by Michael >Qiu (currently V2): >https://mail.openvswitch.org/pipermail/ovs-dev/2017-May/331792.html. >The upstream base commit against which this patch was generated is 1e96502; >to test this patch, check out that branch, apply Michael's patchset, and then >apply this patch: > >3. netdev-dpdk: enable multi-segment jumbo frames >2. DPDK multi-segment mbuf support (Michael Qiu) >1. 1e96502 tests: Only run python SSL test if SSL support is configur... > (OvS upstream) > >The DPDK version used during testing is v17.02, although v16.11 should work >equally well. > > >== Testing == >As this is an RFC, only a subset of the total traffic paths/vSwitch >configurations/actions have been tested - a summary of traffic paths tested >thus far is included below. The action tested in all cases is OUTPUT. Tests >in which issues were observed are summarized beneath the table. > >+-+ >| Traffic Path > | >+-+ >| DPDK Phy 0 -> OvS -> DPDK Phy 1 > | >| DPDK Phy 0 -> OvS -> Kernel Phy 0 > [1] | >| Kernel Phy 0 -> OvS -> DPDK Phy 0 > | >| > | >| DPDK Phy 0 -> OvS -> vHost User 0 -> vHost User 1 -> OvS -> DPDK Phy 1 * > | >| DPDK Phy 0 -> OvS -> vHost User 0 -> vHost User 1 -> OvS -> Kernel Phy 0 * > [1] | >| Kernel Phy 0 -> OvS -> vHost User 1 -> vHost User 0 -> OvS -> -> DPDK Phy 0 >* [2] | >| > | >| vHost0 -> OvS -> vHost1 > | >+-+ > > * = guest kernel IP forwarding >[1] = incorrect L4 checksum >[2] = traffic not forwarded in guest kernel. This behaviour is also observed >on OvS master. > >___ >dev mailing list >d...@openvswitch.org >https://mail.openvswitch.org/mailman/listinfo/ovs-dev ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [RFC PATCH 0/1] netdev-dpdk: multi-segment mbuf jumbo frame support
This RFC introduces an approach for implementing jumbo frame support for OvS-DPDK with multi-segment mbufs. == Overview == Currently, jumbo frame support for OvS-DPDK is implemented by increasing the size of mbufs within a mempool, such that each mbuf within the pool is large enough to contain an entire jumbo frame of a user-defined size. Typically, for each user-defined MTU 'requested_mtu', a new mempool is created, containing mbufs of size ~requested_mtu. With the multi-segment approach, all ports share the same mempool, in which each mbuf is of standard/default size (~2k MB). To accommodate jumbo frames, mbufs may be chained together, each mbuf storing a portion of the jumbo frame; each mbuf in the chain is termed a segment, hence the name. == Enabling multi-segment mbufs == Multi-segment and single-segment mbufs are mutually exclusive, and the user must decide on which approach to adopt on init. The introduction of a new optional OVSDB field, 'dpdk-multi-seg-mbufs', facilitates this; this is a boolean field, which defaults to false. Setting the field is identical to setting existing DPDPK-specific OVSDB fields: sudo $OVS_DIR/utilities/ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true sudo $OVS_DIR/utilities/ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-lcore-mask=0x10 sudo $OVS_DIR/utilities/ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem=4096,0 ==> sudo $OVS_DIR/utilities/ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-multi-seg-mbufs=true == Code base == This patch is dependent on the multi-segment mbuf patch submitted by Michael Qiu (currently V2): https://mail.openvswitch.org/pipermail/ovs-dev/2017-May/331792.html. The upstream base commit against which this patch was generated is 1e96502; to test this patch, check out that branch, apply Michael's patchset, and then apply this patch: 3. netdev-dpdk: enable multi-segment jumbo frames 2. DPDK multi-segment mbuf support (Michael Qiu) 1. 1e96502 tests: Only run python SSL test if SSL support is configur... (OvS upstream) The DPDK version used during testing is v17.02, although v16.11 should work equally well. == Testing == As this is an RFC, only a subset of the total traffic paths/vSwitch configurations/actions have been tested - a summary of traffic paths tested thus far is included below. The action tested in all cases is OUTPUT. Tests in which issues were observed are summarized beneath the table. +-+ | Traffic Path | +-+ | DPDK Phy 0 -> OvS -> DPDK Phy 1 | | DPDK Phy 0 -> OvS -> Kernel Phy 0 [1] | | Kernel Phy 0 -> OvS -> DPDK Phy 0 | | | | DPDK Phy 0 -> OvS -> vHost User 0 -> vHost User 1 -> OvS -> DPDK Phy 1 * | | DPDK Phy 0 -> OvS -> vHost User 0 -> vHost User 1 -> OvS -> Kernel Phy 0 * [1] | | Kernel Phy 0 -> OvS -> vHost User 1 -> vHost User 0 -> OvS -> -> DPDK Phy 0 * [2] | | | | vHost0 -> OvS -> vHost1 | +-+ * = guest kernel IP forwarding [1] = incorrect L4 checksum [2] = traffic not forwarded in guest kernel. This behaviour is also observed on OvS master. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH 1/1] netdev-dpdk: enable multi-segment jumbo frames
Currently, jumbo frame support for OvS-DPDK is implemented by increasing the size of mbufs within a mempool, such that each mbuf within the pool is large enough to contain an entire jumbo frame of a user-defined size. Typically, for each user-defined MTU 'requested_mtu', a new mempool is created, containing mbufs of size ~requested_mtu. With the multi-segment approach, all ports share the same mempool, in which each mbuf is of standard/default size (~2k MB). To accommodate jumbo frames, mbufs may be chained together, each mbuf storing a portion of the jumbo frame; each mbuf in the chain is termed a segment, hence the name. --- lib/dpdk.c | 6 ++ lib/netdev-dpdk.c| 54 ++-- lib/netdev-dpdk.h| 1 + vswitchd/vswitch.xml | 19 ++ 4 files changed, 74 insertions(+), 6 deletions(-) diff --git a/lib/dpdk.c b/lib/dpdk.c index 8da6c32..7d08e34 100644 --- a/lib/dpdk.c +++ b/lib/dpdk.c @@ -450,6 +450,12 @@ dpdk_init__(const struct smap *ovs_other_config) /* Finally, register the dpdk classes */ netdev_dpdk_register(); + +bool multi_seg_mbufs_enable = smap_get_bool(ovs_other_config, "dpdk-multi-seg-mbufs", false); +if (multi_seg_mbufs_enable) { +VLOG_INFO("DPDK multi-segment mbufs enabled\n"); +netdev_dpdk_multi_segment_mbufs_enable(); +} } void diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index 34fc54b..82bc0e2 100644 --- a/lib/netdev-dpdk.c +++ b/lib/netdev-dpdk.c @@ -58,6 +58,7 @@ VLOG_DEFINE_THIS_MODULE(netdev_dpdk); static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 20); +static bool dpdk_multi_segment_mbufs = false; #define DPDK_PORT_WATCHDOG_INTERVAL 5 @@ -480,7 +481,7 @@ dpdk_mp_create(int socket_id, int mtu) * when the number of ports and rxqs that utilize a particular mempool can * change dynamically at runtime. For now, use this rough heurisitic. */ -if (mtu >= ETHER_MTU) { +if (mtu >= ETHER_MTU || dpdk_multi_segment_mbufs) { mp_size = MAX_NB_MBUF; } else { mp_size = MIN_NB_MBUF; @@ -558,17 +559,33 @@ dpdk_mp_put(struct dpdk_mp *dmp) ovs_mutex_unlock(&dpdk_mp_mutex); } -/* Tries to allocate new mempool on requested_socket_id with - * mbuf size corresponding to requested_mtu. +/* Tries to configure a mempool for 'dev' on requested socket_id to accommodate + * packets of size 'requested_mtu'. The properties of the mempool's elements + * are dependent on the value of 'dpdk_multi_segment_mbufs': + * - if 'true', then the mempool contains standard-sized mbufs that are chained + * together to accommodate packets of size 'requested_mtu'. All ports on the + * same socket will share this mempool, irrespective of their MTU. + * - if 'false', then a mempool is allocated, the members of which are non- + * standard-sized mbufs. Each mbuf in the mempool is large enough to fully + * accomdate packets of size 'requested_mtu'. + * * On success new configuration will be applied. * On error, device will be left unchanged. */ static int netdev_dpdk_mempool_configure(struct netdev_dpdk *dev) OVS_REQUIRES(dev->mutex) { -uint32_t buf_size = dpdk_buf_size(dev->requested_mtu); +uint32_t buf_size = 0; struct dpdk_mp *mp; +/* Contiguous mbufs in use - permit oversized mbufs */ +if (!dpdk_multi_segment_mbufs) { +buf_size = dpdk_buf_size(dev->requested_mtu); +} else { +/* multi-segment mbufs - use standard mbuf size */ +buf_size = dpdk_buf_size(ETHER_MTU); +} + mp = dpdk_mp_get(dev->requested_socket_id, FRAME_LEN_TO_MTU(buf_size)); if (!mp) { VLOG_ERR("Failed to create memory pool for netdev " @@ -577,7 +594,13 @@ netdev_dpdk_mempool_configure(struct netdev_dpdk *dev) rte_strerror(rte_errno)); return rte_errno; } else { -dpdk_mp_put(dev->dpdk_mp); +/* When single-segment mbufs are in use, a new mempool is allocated, + * so release the old one. In the case of multi-segment mbufs, the + * same mempool is used for all MTUs. + */ +if (!dpdk_multi_segment_mbufs) { +dpdk_mp_put(dev->dpdk_mp); +} dev->dpdk_mp = mp; dev->mtu = dev->requested_mtu; dev->socket_id = dev->requested_socket_id; @@ -639,6 +662,7 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk *dev, int n_rxq, int n_txq) int diag = 0; int i; struct rte_eth_conf conf = port_conf; +struct rte_eth_txconf txconf; if (dev->mtu > ETHER_MTU) { conf.rxmode.jumbo_frame = 1; @@ -666,9 +690,21 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk *dev, int n_rxq, int n_txq) break; } +/* DPDK PMDs typically attempt to use simple or vectorized + * transmit functions, neither of which are compatible with + * multi-segment mbufs. Ensure that these are disabled in the + * multi-segment mbu
[ovs-dev] [PATCH 0/1] netdev-dpdk: multi-segment mbuf jumbo frame support
This RFC introduces an approach for implementing jumbo frame support for OvS-DPDK with multi-segment mbufs. == Overview == Currently, jumbo frame support for OvS-DPDK is implemented by increasing the size of mbufs within a mempool, such that each mbuf within the pool is large enough to contain an entire jumbo frame of a user-defined size. Typically, for each user-defined MTU 'requested_mtu', a new mempool is created, containing mbufs of size ~requested_mtu. With the multi-segment approach, all ports share the same mempool, in which each mbuf is of standard/default size (~2k MB). To accommodate jumbo frames, mbufs may be chained together, each mbuf storing a portion of the jumbo frame; each mbuf in the chain is termed a segment, hence the name. == Enabling multi-segment mbufs == Multi-segment and single-segment mbufs are mutually exclusive, and the user must decide on which approach to adopt on init. The introduction of a new optional OVSDB field, 'dpdk-multi-seg-mbufs', facilitates this; this is a boolean field, which defaults to false. Setting the field is identical to setting existing DPDPK-specific OVSDB fields: sudo $OVS_DIR/utilities/ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true sudo $OVS_DIR/utilities/ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-lcore-mask=0x10 sudo $OVS_DIR/utilities/ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem=4096,0 ==> sudo $OVS_DIR/utilities/ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-multi-seg-mbufs=true == Code base == This patch is dependent on the multi-segment mbuf patch submitted by Michael Qiu (currently V2): https://mail.openvswitch.org/pipermail/ovs-dev/2017-May/331792.html. The upstream base commit against which this patch was generated is 1e96502; to test this patch, check out that branch, apply Michael's patchset, and then apply this patch: 3. netdev-dpdk: enable multi-segment jumbo frames 2. DPDK multi-segment mbuf support (Michael Qiu) 1. 1e96502 tests: Only run python SSL test if SSL support is configur... (OvS upstream) The DPDK version used during testing is v17.02, although v16.11 should work equally well. == Testing == As this is an RFC, only a subset of the total traffic paths/vSwitch configurations/actions have been tested - a summary of traffic paths tested thus far is included below. The action tested in all cases is OUTPUT. Tests in which issues were observed are summarized beneath the table. +-+ | Traffic Path | +-+ | DPDK Phy 0 -> OvS -> DPDK Phy 1 | | DPDK Phy 0 -> OvS -> Kernel Phy 0 [1] | | Kernel Phy 0 -> OvS -> DPDK Phy 0 | | | | DPDK Phy 0 -> OvS -> vHost User 0 -> vHost User 1 -> OvS -> DPDK Phy 1 * | | DPDK Phy 0 -> OvS -> vHost User 0 -> vHost User 1 -> OvS -> Kernel Phy 0 * [1] | | Kernel Phy 0 -> OvS -> vHost User 1 -> vHost User 0 -> OvS -> -> DPDK Phy 0 * [2] | | | | vHost0 -> OvS -> vHost1 | +-+ * = guest kernel IP forwarding [1] = incorrect L4 checksum [2] = traffic not forwarded in guest kernel. This behaviour is also observed on OvS master. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v2] ofproto-dpif-ipfix: total counters.
> -Original Message- > From: Ben Pfaff [mailto:b...@ovn.org] > Sent: Monday, May 1, 2017 7:36 PM > To: Weglicki, MichalX > Cc: d...@openvswitch.org > Subject: Re: [ovs-dev] [PATCH v2] ofproto-dpif-ipfix: total counters. > > On Thu, Apr 20, 2017 at 03:25:13PM +0100, mweglicx wrote: > > Implementation of IPFix counters which hold > > total values measured since metering process startup. > > > > v2: Patch is reapplied for a review because > > it hasn't been correctly marked in patchwork. > > > > Signed-off-by: Michal Weglicki > > Thanks a lot for working on IPFix! I don't have IPFix expertise, and I > don't think the other OVS committers have much either, so I need to ask > some dumb questions to better understand the implications of this patch. > > Actually, maybe I just need to ask one. This changes the wire format of > the IPFix reports that go out. Will this require changes to IPFix > collectors that understood the previous format, that is, is the new > format backward compatible with collectors that expected the previous > format? I understand that IPFix uses a template format to tell > collectors the format of what it's sending, but it's not clear to me > whether or not that prevents this kind of compatibility issue. > > Thanks, > > Ben. Hello Ben, So counters which just got exposed through my patch are part of RFC which is already partially implemented in OVS. I've just added few missing ones, and I'm going to implement also others from same category (https://tools.ietf.org/html/rfc5102#section-5.10) However to answer your question directly those counters are part of standard group of IPFix counters, so no changes on Collector side are required. IPFix sends counters in kind of dictionary format, so adding any counter shouldn't break backward compatibility - IPFix even defines something like enterprise counters (vendor specific) where any agent can define own counters in the template and send those over (not every collector supports it, but data-wise it shouldn't break anything on collector side). My patch was also tested against libipfix/ipfix_collector and no additional changes were required to see new counters on collector side. Br, Michal. -- Intel Research and Development Ireland Limited Registered in Ireland Registered Office: Collinstown Industrial Park, Leixlip, County Kildare Registered Number: 308263 This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev