Re: [ovs-dev] [PATCH v4] Detailed packet drop statistics per dpdk and vhostuser ports
Hi Ben, Thanks for reviewing the patch. As per your suggestion, I will use use net_dev_custom_stats to fetch the new statistics counters and document them in vswitchd/vswitchd.xml. Thanks, Sriram. -Original Message- From: Ben Pfaff Sent: 17 July 2019 01:21 To: Sriram Vatala Cc: ovs-dev@openvswitch.org Subject: Re: [ovs-dev] [PATCH v4] Detailed packet drop statistics per dpdk and vhostuser ports On Tue, Jul 16, 2019 at 07:02:29PM +0530, Sriram Vatala via dev wrote: > OVS may be unable to transmit packets for multiple reasons and today > there is a single counter to track packets dropped due to any of those > reasons. The most common reason is that a VM is unable to read packets > fast enough causing the vhostuser port transmit queue on the OVS side > to become full. This manifests as a problem with VNFs not receiving > all packets. Having a separate drop counter to track packets dropped > because the transmit queue is full will clearly indicate that the > problem is on the VM side and not in OVS. Similarly maintaining > separate counters for all possible drops helps in indicating sensible > cause for packet drops. > > This patch adds counters to track packets dropped due to all possible > reasons and these counters are displayed along with other stats in > "ovs-vsctl get interface statistics" > command. The detailed stats will be available for both dpdk and > vhostuser ports. Did you consider using netdev_get_custom_stats() for these counters? I do not know whether they are likely to be implemented by other network devices, and custom stats are an appropriate way to implement specialized statistics. > Following are the details of the new counters : > > tx_failed_drops : Sometimes DPDK physical/vHost port transmit API > fails to send all/some packets. These untransmited packets are > dropped.The most likely reason for this to happen is because of > transmit queue overrun. Besides transmit queue overrun, there are > other unlikely reasons such as invalid queue id etc. > > tx_mtu_exceeded_drops : These are the packets dropped due to MTU > mismatch (i.e Pkt len > Max Dev MTU). > > tx_qos_drops/rx_qos_drops : These are the packets dropped due to > transmission/reception rate exceeding the configured Egress/Ingress > policer rate on the interface. It would make sense to include the above descriptions in vswitchd/vswitch.xml alongside the other statistics under the "Statistics" group for the Interface table. Thanks for working to make OVS better! ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] 答复: 答复: [ovs-discuss] How can I delete flows which match a given cookie value?
Hmm. I can see how that would be inconvenient. If you have control over the cookies, and enough space in them, then you could encode the (16-bit) priority as part of the (64-bit) cookie. On Thu, Jul 18, 2019 at 12:18:55AM +, Yi Yang (杨燚)-云服务集团 wrote: > Got it, thanks Ben, I want to use cookie and priority to del-flows, but only > --strict option can handle priority. So I only can use cookie to del-flows in > batch, --strict only can delete one flow. > > sudo ovs-ofctl -Oopenflow13 del-flows br-int "cookie=0x01/-1,priority=1" > ovs-ofctl: unknown keyword priority > > -邮件原件- > 发件人: Ben Pfaff [mailto:b...@ovn.org] > 发送时间: 2019年7月18日 2:02 > 收件人: Yi Yang (杨燚)-云服务集团 > 抄送: ovs-disc...@openvswitch.org; ovs-dev@openvswitch.org > 主题: Re: 答复: [ovs-discuss] How can I delete flows which match a given cookie > value? > > --strict means that only exact matches are deleted, so add 'icmp' to your > del-flows command to delete the flow. > > On Wed, Jul 17, 2019 at 03:35:09AM +, Yi Yang (杨燚)-云服务集团 wrote: > > Ben, I found del-flows ran successfully but the flows aren't deleted, > > what's wrong? > > > > [yangyi@localhost ~]$ sudo ovs-ofctl -Oopenflow10 add-flow br-int > > "table=0,cookie=0x1234,priority=1,icmp,actions=drop" > > [yangyi@localhost ~]$ sudo ovs-ofctl -Oopenflow10 dump-flows br-int > > > > NXST_FLOW reply (xid=0x4): > > cookie=0x1234, duration=3.994s, table=0, n_packets=0, n_bytes=0, > > idle_age=3, priority=1,icmp actions=drop [yangyi@localhost ~]$ sudo > > ovs-ofctl -Oopenflow10 --strict del-flows br-int > > "table=0,cookie=0x1234/-1,priority=1" > > [yangyi@localhost ~]$ sudo ovs-ofctl -Oopenflow10 dump-flows br-int > > > > NXST_FLOW reply (xid=0x4): > > cookie=0x1234, duration=49.866s, table=0, n_packets=0, n_bytes=0, > > idle_age=49, priority=1,icmp actions=drop [yangyi@localhost ~]$ > > > > -邮件原件- > > 发件人: Ben Pfaff [mailto:b...@ovn.org] > > 发送时间: 2019年7月17日 1:04 > > 收件人: Yi Yang (杨燚)-云服务集团 > > 抄送: ovs-disc...@openvswitch.org; ovs-dev@openvswitch.org > > 主题: Re: [ovs-discuss] How can I delete flows which match a given cookie > > value? > > > > On Tue, Jul 16, 2019 at 09:35:06AM +, Yi Yang (杨燚)-云服务集团 wrote: > > > I need to add and delete flows according to user operations, I know > > > openflowplugin in Opendaylight can do this, but it seems “ovs-ofctl > > > del-flows” can’t do this way, why can’t cookie value be used to do > > > this for “ovs-ofctl del-flows”? > > > > > > > > > > > > sudo ovs-ofctl -Oopenflow13 --strict del-flows br-int > > > "table=2,cookie=12345" > > > > To match on a cookie, specify a mask, e.g. cookie=12345/-1. > > ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] 答复: 答复: [ovs-discuss] How can I delete flows which match a given cookie value?
Got it, thanks Ben, I want to use cookie and priority to del-flows, but only --strict option can handle priority. So I only can use cookie to del-flows in batch, --strict only can delete one flow. sudo ovs-ofctl -Oopenflow13 del-flows br-int "cookie=0x01/-1,priority=1" ovs-ofctl: unknown keyword priority -邮件原件- 发件人: Ben Pfaff [mailto:b...@ovn.org] 发送时间: 2019年7月18日 2:02 收件人: Yi Yang (杨燚)-云服务集团 抄送: ovs-disc...@openvswitch.org; ovs-dev@openvswitch.org 主题: Re: 答复: [ovs-discuss] How can I delete flows which match a given cookie value? --strict means that only exact matches are deleted, so add 'icmp' to your del-flows command to delete the flow. On Wed, Jul 17, 2019 at 03:35:09AM +, Yi Yang (杨燚)-云服务集团 wrote: > Ben, I found del-flows ran successfully but the flows aren't deleted, what's > wrong? > > [yangyi@localhost ~]$ sudo ovs-ofctl -Oopenflow10 add-flow br-int > "table=0,cookie=0x1234,priority=1,icmp,actions=drop" > [yangyi@localhost ~]$ sudo ovs-ofctl -Oopenflow10 dump-flows br-int > > NXST_FLOW reply (xid=0x4): > cookie=0x1234, duration=3.994s, table=0, n_packets=0, n_bytes=0, > idle_age=3, priority=1,icmp actions=drop [yangyi@localhost ~]$ sudo > ovs-ofctl -Oopenflow10 --strict del-flows br-int > "table=0,cookie=0x1234/-1,priority=1" > [yangyi@localhost ~]$ sudo ovs-ofctl -Oopenflow10 dump-flows br-int > > NXST_FLOW reply (xid=0x4): > cookie=0x1234, duration=49.866s, table=0, n_packets=0, n_bytes=0, > idle_age=49, priority=1,icmp actions=drop [yangyi@localhost ~]$ > > -邮件原件- > 发件人: Ben Pfaff [mailto:b...@ovn.org] > 发送时间: 2019年7月17日 1:04 > 收件人: Yi Yang (杨燚)-云服务集团 > 抄送: ovs-disc...@openvswitch.org; ovs-dev@openvswitch.org > 主题: Re: [ovs-discuss] How can I delete flows which match a given cookie value? > > On Tue, Jul 16, 2019 at 09:35:06AM +, Yi Yang (杨燚)-云服务集团 wrote: > > I need to add and delete flows according to user operations, I know > > openflowplugin in Opendaylight can do this, but it seems “ovs-ofctl > > del-flows” can’t do this way, why can’t cookie value be used to do > > this for “ovs-ofctl del-flows”? > > > > > > > > sudo ovs-ofctl -Oopenflow13 --strict del-flows br-int "table=2,cookie=12345" > > To match on a cookie, specify a mask, e.g. cookie=12345/-1. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] Building OVS DPDK debian package
On Wed, Jul 17, 2019 at 04:15:06PM -0700, Ravi Kerur wrote: > On Wed, Jul 17, 2019 at 11:06 AM Ben Pfaff wrote: > > > On Wed, Jul 17, 2019 at 10:03:06AM -0700, Ravi Kerur wrote: > > > I am going through OVS 2.11.90 document to build Debian package on Ubuntu > > > 16.04.4. > > > > > > http://docs.openvswitch.org/en/latest/intro/install/distributions/ > > > > > > where it claims to have OVS+DPDK debian package. Snippets below > > > /** > > > > > > 3. For DPDK datapath, Open vSwitch with DPDK support is bundled in the > > > package openvswitch-switch-dpdk. > > > > This seems to be true for Ubuntu but not for Debian or for upstream Open > > vSwitch. > > > > Thanks. I am building OVS(2.11.90 latest git source)+DPDK(18.11.2) package > on Ubuntu and it doesn't seem to work. I have set DATAPATH_CONFIGURE_OPTS = > --with-dpdk="". Yes. The documentation is wrong. (That's what I was trying to say above, too.) ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v7] ovn: Add a new logical switch port type - 'virtual'
Guru has some comments and questions for you. I have the following suggested incremental to improve the parsing and error checking: diff --git a/ovn/lib/actions.c b/ovn/lib/actions.c index 318d80d5d1b7..66916a837fd3 100644 --- a/ovn/lib/actions.c +++ b/ovn/lib/actions.c @@ -2607,17 +2607,16 @@ parse_bind_vport(struct action_context *ctx) } if (ctx->lexer->token.type != LEX_T_STRING) { -lexer_error(ctx->lexer, -"bind_vport requires port name to be specified."); +lexer_syntax_error(ctx->lexer, "expecting port name string"); return; } struct ovnact_bind_vport *bind_vp = ovnact_put_BIND_VPORT(ctx->ovnacts); bind_vp->vport = xstrdup(ctx->lexer->token.s); lexer_get(ctx->lexer); -lexer_force_match(ctx->lexer, LEX_T_COMMA); -action_parse_field(ctx, 0, false, _vp->vport_parent); -lexer_force_match(ctx->lexer, LEX_T_RPAREN); +(void) (lexer_force_match(ctx->lexer, LEX_T_COMMA) +&& action_parse_field(ctx, 0, false, _vp->vport_parent) +&& lexer_force_match(ctx->lexer, LEX_T_RPAREN)); } static void diff --git a/tests/ovn.at b/tests/ovn.at index 2837be1671b6..5d6c90c5fb6f 100644 --- a/tests/ovn.at +++ b/tests/ovn.at @@ -1372,10 +1372,19 @@ reg0[0] = check_pkt_larger(foo); # lsp1's port key is 0x11. bind_vport("lsp1", inport); encodes as controller(userdata=00.00.00.11.00.00.00.00.11.00.00.00) - # lsp2 doesn't exist. So it should be encoded as drop. bind_vport("lsp2", inport); encodes as drop +bind_vport; +Syntax error at `;' expecting `('. +bind_vport(; +Syntax error at `;' expecting port name string. +bind_vport("xyzzy"; +Syntax error at `;' expecting `,'. +bind_vport("xyzzy",; +Syntax error at `;' expecting field name. +bind_vport("xyzzy", inport; +Syntax error at `;' expecting `)'. # Miscellaneous negative tests. ; ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] OVS and DPI(Classifiers)
Hi, I am sending this email to gauge if OVS team or community using OVS would be interested in this effort. I read some discussions which had happened couple of years back with OVS and Qosmos integration and at least it looked like OVS team didn't approve of it. I just want to restart it but with OVS+DPDK and all open source libraries and check if it makes sense. I have integrated open source classifiers(aka DPI) nDPI and JOY (from Cisco) libraries into OVS+DPDK. Salient points (1) Follows same model as OVS+DPDK integration i.e. user can choose classifiers with '--with-ndpi' and '--with-joy' option during configuration (2) nDPI and JOY are launched on non-PMD cores so the data path performance is not impacted (3) Packets coming into DPDK Rx path is reference counted and sent to nDPI and JOY via intercore rings (4) nDPI and JOY results are then made available to the control plane to act on it (5) Tested on AWS t3.2xlarge and both UDP and TCP hits 5Gbps for 512 and above packet size Thanks, Ravi ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] Building OVS DPDK debian package
On Wed, Jul 17, 2019 at 11:06 AM Ben Pfaff wrote: > On Wed, Jul 17, 2019 at 10:03:06AM -0700, Ravi Kerur wrote: > > I am going through OVS 2.11.90 document to build Debian package on Ubuntu > > 16.04.4. > > > > http://docs.openvswitch.org/en/latest/intro/install/distributions/ > > > > where it claims to have OVS+DPDK debian package. Snippets below > > /** > > > > 3. For DPDK datapath, Open vSwitch with DPDK support is bundled in the > > package openvswitch-switch-dpdk. > > This seems to be true for Ubuntu but not for Debian or for upstream Open > vSwitch. > Thanks. I am building OVS(2.11.90 latest git source)+DPDK(18.11.2) package on Ubuntu and it doesn't seem to work. I have set DATAPATH_CONFIGURE_OPTS = --with-dpdk="". ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH] system-traffic.at: Make nsh test more robust.
The patch adds '-n' to tcpdump to avoid address coverting. Since '-U' is used to output to stdout, simply use 'cat' to search result. Use OVS_WAIT_UNTIL instead of sleep ,and also remove/add some newlines. Signed-off-by: William Tu --- tests/system-traffic.at | 89 +++-- 1 file changed, 35 insertions(+), 54 deletions(-) diff --git a/tests/system-traffic.at b/tests/system-traffic.at index 8ea450887076..ba8e3776ccd4 100644 --- a/tests/system-traffic.at +++ b/tests/system-traffic.at @@ -5871,8 +5871,7 @@ dnl The flow will encap a nsh header to the TCP syn packet dnl eth/ip/tcp --> OVS --> eth/nsh/eth/ip/tcp AT_CHECK([ovs-ofctl -Oopenflow13 add-flow br0 "table=0,priority=100,in_port=ovs-p0,ip,actions=encap(nsh(md_type=1)),set_field:0x1234->nsh_spi,set_field:0x11223344->nsh_c1,encap(ethernet),set_field:f2:ff:00:00:00:02->dl_dst,set_field:f2:ff:00:00:00:01->dl_src,ovs-p1"]) -rm ovs-p1.pcap -tcpdump -U -i ovs-p1 -w ovs-p1.pcap & +AT_CHECK([tcpdump -n -xx -U -i ovs-p1 > ovs-p1.pcap &]) sleep 1 dnl The hex dump is a TCP syn packet. pkt=eth/ip/tcp @@ -5880,20 +5879,18 @@ dnl The packet is sent from p0(at_ns0) interface directed to dnl p1(at_ns1) interface NS_CHECK_EXEC([at_ns0], [$PYTHON $srcdir/sendpkt.py p0 f2 00 00 00 00 02 f2 00 00 00 00 01 08 00 45 00 00 28 00 01 00 00 40 06 b0 13 c0 a8 00 0a 0a 00 00 0a 04 00 08 00 00 00 00 c8 00 00 00 00 50 02 20 00 b8 5e 00 00 > /dev/null]) -sleep 1 - dnl Check the expected nsh encapsulated packet on the egress interface -AT_CHECK([tcpdump -xx -r ovs-p1.pcap 2>&1 | egrep "0x: *f2ff * *0002 *f2ff * *0001 *894f *0fc6" 2>&1 1>/dev/null]) -AT_CHECK([tcpdump -xx -r ovs-p1.pcap 2>&1 | egrep "0x0010: *0103 *0012 *34ff *1122 *3344 * * *" 2>&1 1>/dev/null]) -AT_CHECK([tcpdump -xx -r ovs-p1.pcap 2>&1 | egrep "0x0020: * * * *f200 * *0002 *f200 *" 2>&1 1>/dev/null]) -AT_CHECK([tcpdump -xx -r ovs-p1.pcap 2>&1 | egrep "0x0030: *0001 *0800 *4500 *0028 *0001 * *4006 *b013" 2>&1 1>/dev/null]) -AT_CHECK([tcpdump -xx -r ovs-p1.pcap 2>&1 | egrep "0x0040: *c0a8 *000a *0a00 *000a *0400 *0800 * *00c8" 2>&1 1>/dev/null]) -AT_CHECK([tcpdump -xx -r ovs-p1.pcap 2>&1 | egrep "0x0050: * * *5002 *2000 *b85e *" 2>&1 1>/dev/null]) - +OVS_WAIT_UNTIL([cat ovs-p1.pcap 2>&1 | egrep "0x: *f2ff * *0002 *f2ff * *0001 *894f *0fc6" 2>&1 1>/dev/null]) +OVS_WAIT_UNTIL([cat ovs-p1.pcap 2>&1 | egrep "0x0010: *0103 *0012 *34ff *1122 *3344 * * *" 2>&1 1>/dev/null]) +OVS_WAIT_UNTIL([cat ovs-p1.pcap 2>&1 | egrep "0x0020: * * * *f200 * *0002 *f200 *" 2>&1 1>/dev/null]) +OVS_WAIT_UNTIL([cat ovs-p1.pcap 2>&1 | egrep "0x0030: *0001 *0800 *4500 *0028 *0001 * *4006 *b013" 2>&1 1>/dev/null]) +OVS_WAIT_UNTIL([cat ovs-p1.pcap 2>&1 | egrep "0x0040: *c0a8 *000a *0a00 *000a *0400 *0800 * *00c8" 2>&1 1>/dev/null]) +OVS_WAIT_UNTIL([cat ovs-p1.pcap 2>&1 | egrep "0x0050: * * *5002 *2000 *b85e *" 2>&1 1>/dev/null]) OVS_TRAFFIC_VSWITCHD_STOP AT_CLEANUP + AT_SETUP([nsh - decap header]) OVS_TRAFFIC_VSWITCHD_START() @@ -5906,8 +5903,7 @@ dnl The flow will decap a nsh header which in turn carries a TCP syn packet dnl eth/nsh/eth/ip/tcp --> OVS --> eth/ip/tcp AT_CHECK([ovs-ofctl -Oopenflow13 add-flow br0 "table=0,priority=100,in_port=ovs-p0,dl_type=0x894f, actions=decap(),decap(), ovs-p1"]) -rm ovs-p1.pcap -tcpdump -U -i ovs-p1 -w ovs-p1.pcap & +AT_CHECK([tcpdump -n -xx -U -i ovs-p1 > ovs-p1.pcap &]) sleep 1 dnl The hex dump is NSH packet with TCP syn payload. pkt=eth/nsh/eth/ip/tcp @@ -5915,18 +5911,16 @@ dnl The packet is sent from p0(at_ns0) interface directed to dnl p1(at_ns1) interface NS_CHECK_EXEC([at_ns0], [$PYTHON $srcdir/sendpkt.py p0 f2 ff 00 00 00 02 f2 ff 00 00 00 01 89 4f 02 06 01 03 00 00 64 03 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 10 f2 00 00 00 00 02 f2 00 00 00 00 01 08 00 45 00 00 28 00 01 00 00 40 06 b0 13 c0 a8 00 0a 0a 00 00 0a 04 00 08 00 00 00 00 c8 00 00 00 00 50 02 20 00 b8 5e 00 00 > /dev/null]) -sleep 1 - dnl Check the expected de-capsulated TCP packet on the egress interface -AT_CHECK([tcpdump -xx -r ovs-p1.pcap 2>&1 | egrep "0x: *f200 * *0002 *f200 * *0001 *0800 *4500" 2>&1 1>/dev/null]) -AT_CHECK([tcpdump -xx -r ovs-p1.pcap 2>&1 | egrep "0x0010: *0028 *0001 * *4006 *b013 *c0a8 *000a *0a00" 2>&1 1>/dev/null]) -AT_CHECK([tcpdump -xx -r ovs-p1.pcap 2>&1 | egrep "0x0020: *000a *0400 *0800 * *00c8 * * *5002" 2>&1 1>/dev/null]) -AT_CHECK([tcpdump -xx -r ovs-p1.pcap 2>&1 | egrep "0x0030: *2000 *b85e *" 2>&1 1>/dev/null]) - +OVS_WAIT_UNTIL([cat ovs-p1.pcap 2>&1 | egrep "0x: *f200 * *0002 *f200 * *0001 *0800 *4500" 2>&1 1>/dev/null]) +OVS_WAIT_UNTIL([cat ovs-p1.pcap 2>&1 | egrep "0x0010: *0028 *0001 * *4006 *b013 *c0a8 *000a *0a00" 2>&1 1>/dev/null]) +OVS_WAIT_UNTIL([cat ovs-p1.pcap 2>&1 | egrep "0x0020: *000a *0400 *0800 *
Re: [ovs-dev] [PATCH v7] ovn: Add a new logical switch port type - 'virtual'
On Tue, 16 Jul 2019 at 17:48, wrote: > From: Numan Siddique > > This new type is added for the following reasons: > > - When a load balancer is created in an OpenStack deployment with Octavia > service, it creates a logical port 'VIP' for the virtual ip. > > - This logical port is not bound to any VIF. > > - Octavia service creates a service VM (with another logical port 'P' > which > belongs to the same logical switch) > > - The virtual ip 'VIP' is configured on this service VM. > Does this mean that the VIP is the IP address of this service VM? Or does the service VM have a different IP but also responds to VIP? > > - This service VM provides the load balancing for the VIP with the > configured > backend IPs. > > - Octavia service can be configured to create few service VMs with > active-standby mode > with the active VM configured with the VIP. The VIP can move between > these service nodes. > > Presently there are few problems: > > - When a floating ip (externally reachable IP) is associated to the VIP > and if > the compute nodes have external connectivity then the external traffic > cannot > reach the VIP using the floating ip as the VIP logical port would be > down. > dnat_and_snat entry in NAT table for this vip will have 'external_mac' > and > 'logical_port' configured. > So floating ip is the DNAT ip in OVN and VIP is the logical port IP? It would have worked if NAT table had logical_port column set as service VM? But you don't do it because you don't know which service VM is active? > > - The only way to make it work is to clear the 'external_mac' entry so > that > the gateway chassis does the DNAT for the VIP. > In this case, floating IP would convert to VIP and OVN will just send it to whichever logical port will respond to arp of that VIP? And in this case it is the "active" service VM? > > To solve these problems, this patch proposes a new logical port type - > virtual. > CMS when creating the logical port for the VIP, should > So your goal of this patch is to not do centralized routing? And you want to do distributed routing and hence jumping through hoops? > > - set the type as 'virtual' > > - configure the VIP in the options - > Logical_Switch_Port.options:virtual-ip > > - And set the virtual parents in the options >Logical_Switch_Port.options:virtual-parents. >These virtual parents are the one which can be configured with the VIP. > > If suppose the virtual_ip is configured to 10.0.0.10 on a virtual logical > port 'sw0-vip' > and the virtual_parents are set to - [sw0-p1, sw0-p2] then below logical > flows are added in the > lsp_in_arp_rsp logical switch pipeline > > - table=11(ls_in_arp_rsp), priority=100, >match=(inport == "sw0-p1" && !is_chassis_resident("sw0-vip") && > ((arp.op == 1 && arp.spa == 10.0.0.10 && arp.tpa == 10.0.0.10) || >(arp.op == 2 && arp.spa == 10.0.0.10))), > action=(bind_vport("sw0-vip", inport); next;) > 1. I haven't looked at ovn-northd code for a while, so this is probably stupid question. "sw0-p1" which has a IP of 10.0.0.10 is sending a ARP request with spa and tpa of 10.0.0.10? I don't understand what is happening here. Can you explain? 2. After bind_vport(sw0-vip) is executed , does is_chassis_resident(sw0-vip) return true? Also, a unit test "ovn -- 1 LR with distributed router gateway port" fails consistently for me. > - table=11(ls_in_arp_rsp), priority=100, >match=(inport == "sw0-p2" && !is_chassis_resident("sw0-vip") && > ((arp.op == 1 && arp.spa == 10.0.0.10 && arp.tpa == 10.0.0.10) || >(arp.op == 2 && arp.spa == 10.0.0.10))), >action=(bind_vport("sw0-vip", inport); next;) > > The action bind_vport will claim the logical port - sw0-vip on the chassis > where this action > is executed. Since the port - sw0-vip is claimed by a chassis, the > dnat_and_snat rule for > the VIP will be handled by the compute node. > > Signed-off-by: Numan Siddique > --- > > v6 -> v7 > > * Resolved merge conflicts. > > v5 -> v6 > > * Resolved conflicts after rebasing to latest master in tests/ovn.at > > v4 -> v5 > === > * Rebased to master to resolve merge conflicts. > > v3 -> v4 > === > * Addressed the review comment and removed the code in northd which > referenced the Southbound db state while adding the logical flows. > Instead > using the ovn match - is_chassis_resident() - which I should have used > it in the first place. > > v2 -> v3 > === > * Addressed the review comments from Ben - deleted the new columns - > virtual_ip and virtual_parents from Logical_Switch_Port and instead > is making use of options column for this purpose. > > v1 -> v2 > > * In v1, was not updating the 'put_vport_binding' struct if it already > exists in the put_vport_bindings hmap in the function - > pinctrl_handle_bind_vport(). > In v2 handled it. > * Improved the
Re: [ovs-dev] [PATCH v2] doc: Remove experimental tag for SMC cache.
On 7/17/2019 12:38 PM, Yipeng Wang wrote: SMC cache was introduced in 2.10 with experimental tag. SMC cache is a layer of software cache located after EMC cache. The purpose is to improve the performance of use cases that many flows missing the EMC cache. One can enable SMC cache using smc-enable=true option. Signed-off-by: Yipeng Wang Thanks for the v2 Yipeng. @Ilya, I think this addresses the issues you raised. If this is good with you I can add your ack and push tomorrow? Regards Ian --- Documentation/topics/dpdk/bridge.rst | 6 +++--- NEWS | 1 + 2 files changed, 4 insertions(+), 3 deletions(-) diff --git a/Documentation/topics/dpdk/bridge.rst b/Documentation/topics/dpdk/bridge.rst index a3ed926..4ca79a4 100644 --- a/Documentation/topics/dpdk/bridge.rst +++ b/Documentation/topics/dpdk/bridge.rst @@ -117,15 +117,15 @@ It is also possible to enable/disable EMC on per-port basis using:: For more information on the EMC refer to :doc:`/intro/install/dpdk` . -SMC cache (experimental) -- +SMC cache +- SMC cache or signature match cache is a new cache level after EMC cache. The difference between SMC and EMC is SMC only stores a signature of a flow thus it is much more memory efficient. With same memory space, EMC can store 8k flows while SMC can store 1M flows. When traffic flow count is much larger than EMC size, it is generally beneficial to turn off EMC and turn on SMC. It is -currently turned off by default and an experimental feature. +currently turned off by default. To turn on SMC:: diff --git a/NEWS b/NEWS index 806e3c8..feae994 100644 --- a/NEWS +++ b/NEWS @@ -34,6 +34,7 @@ Post-v2.11.0 * 'ovs-appctl exit' now implies cleanup of non-internal ports in userspace datapath regardless of '--cleanup' option. Use '--cleanup' to remove internal ports too. + * Removed experimental tag for SMC cache. - OVSDB: * OVSDB clients can now resynchronize with clustered servers much more quickly after a brief disconnection, saving bandwidth and CPU time. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCHv17 2/2] netdev-afxdp: add new netdev type for AF_XDP.
The patch introduces experimental AF_XDP support for OVS netdev. AF_XDP, the Address Family of the eXpress Data Path, is a new Linux socket type built upon the eBPF and XDP technology. It is aims to have comparable performance to DPDK but cooperate better with existing kernel's networking stack. An AF_XDP socket receives and sends packets from an eBPF/XDP program attached to the netdev, by-passing a couple of Linux kernel's subsystems As a result, AF_XDP socket shows much better performance than AF_PACKET For more details about AF_XDP, please see linux kernel's Documentation/networking/af_xdp.rst. Note that by default, this feature is not compiled in. Signed-off-by: William Tu --- v15: * address review feedback from Ilya https://patchwork.ozlabs.org/patch/1125476/ * skip TCP related test cases * reclaim all CONS_NUM_DESC at complete tx * add retries to kick_tx * increase memory pool size * remove redundant xdp flag and bind flag * remove unused rx_dropped var * make tx_dropped counter atomic * refactor dp_packet_init_afxdp using dp_packet_init__ * rebase to ovs master, test with latest bpf-next kernel commit b14a260e33ddb4 Ilya's kernel patches are required commit 455302d1c9ae ("xdp: fix hang while unregistering device bound to xdp socket") commit 162c820ed896 ("xdp: hold device for umem regardless of zero-copy mode") Possible issues: * still lots of afxdp_cq_skip (ovs-appctl coverage/show) afxdp_cq_skip 44325273.6/sec 34362312.683/sec 572705.2114/sec total: 2106010377 * TODO: 'make check-afxdp' still not all pass IP fragmentation expiry test not fix yet, need to implement deferral memory free, s.t like dpdk_mp_sweep. Currently hit some missing umem descs when reclaiming. NSH test case still failed (not due to afxdp) v16: * address feedbacks from Ilya * add deferral memory free * add afxdp testsuites files to gitignore v17: * address feedbacks from Ilya and Ben https://patchwork.ozlabs.org/patch/1131547/ * ovs_spin_lock: add pthread_spin_lock checks, fix typo * update NEWS * add pthread_spin_lock check at OVS_CHECK_LINUX_AF_XDP * fix bug in xmalloc_size_align * rename xdpsock to netdev-afxdp-pool * remove struct umem_elem, use void * * fix style and comments * fix afxdp.rst * rebase to OVS master, tested on kernel 5.2.0-rc6 Note: I still leave the last_tsc in pmd_perf_stats the same as v16 --- Documentation/automake.mk |1 + Documentation/index.rst |1 + Documentation/intro/install/afxdp.rst | 431 ++ Documentation/intro/install/index.rst |1 + NEWS |1 + acinclude.m4 | 35 ++ configure.ac |1 + lib/automake.mk | 10 + lib/dp-packet.c | 23 + lib/dp-packet.h | 18 +- lib/dpif-netdev-perf.h| 24 + lib/netdev-afxdp-pool.c | 178 ++ lib/netdev-afxdp-pool.h | 78 +++ lib/netdev-afxdp.c| 1017 + lib/netdev-afxdp.h| 74 +++ lib/netdev-linux-private.h| 132 + lib/netdev-linux.c| 121 ++-- lib/netdev-provider.h |3 + lib/netdev.c | 11 + lib/util.c| 92 ++- lib/util.h|5 + tests/.gitignore |3 + tests/automake.mk | 16 + tests/system-afxdp-macros.at | 39 ++ tests/system-afxdp-testsuite.at | 26 + tests/system-traffic.at |2 + vswitchd/vswitch.xml | 15 + 27 files changed, 2250 insertions(+), 108 deletions(-) create mode 100644 Documentation/intro/install/afxdp.rst create mode 100644 lib/netdev-afxdp-pool.c create mode 100644 lib/netdev-afxdp-pool.h create mode 100644 lib/netdev-afxdp.c create mode 100644 lib/netdev-afxdp.h create mode 100644 lib/netdev-linux-private.h create mode 100644 tests/system-afxdp-macros.at create mode 100644 tests/system-afxdp-testsuite.at diff --git a/Documentation/automake.mk b/Documentation/automake.mk index 8472921746ba..2a3214a3cc7f 100644 --- a/Documentation/automake.mk +++ b/Documentation/automake.mk @@ -10,6 +10,7 @@ DOC_SOURCE = \ Documentation/intro/why-ovs.rst \ Documentation/intro/install/index.rst \ Documentation/intro/install/bash-completion.rst \ + Documentation/intro/install/afxdp.rst \ Documentation/intro/install/debian.rst \ Documentation/intro/install/documentation.rst \ Documentation/intro/install/distributions.rst \ diff --git a/Documentation/index.rst b/Documentation/index.rst index 331353fd337a..bace34dbf91b 100644 --- a/Documentation/index.rst +++ b/Documentation/index.rst @@ -59,6 +59,7 @@ vSwitch? Start here.
[ovs-dev] [PATCHv17 1/2] ovs-thread: Add pthread spin lock support.
The patch adds the basic spin lock functions: ovs_spin_{lock, try_lock, unlock, init, destroy}. OSX does not support pthread spin lock, so make it linux only. Signed-off-by: William Tu --- configure.ac | 1 + include/openvswitch/thread.h | 24 lib/ovs-thread.c | 31 +++ 3 files changed, 56 insertions(+) diff --git a/configure.ac b/configure.ac index a9f0a06dc140..dd2a674af0c9 100644 --- a/configure.ac +++ b/configure.ac @@ -108,6 +108,7 @@ AC_CHECK_MEMBERS([struct sockaddr_in6.sin6_scope_id], [], [], #include #include ]]) AC_CHECK_FUNCS([mlockall strnlen getloadavg statvfs getmntent_r sendmmsg clock_gettime]) +AC_CHECK_FUNCS([pthread_spin_lock]) AC_CHECK_HEADERS([mntent.h sys/statvfs.h linux/types.h linux/if_ether.h]) AC_CHECK_HEADERS([linux/net_namespace.h stdatomic.h bits/floatn-common.h]) AC_CHECK_HEADERS([net/if_mib.h], [], [], [[#include diff --git a/include/openvswitch/thread.h b/include/openvswitch/thread.h index 2987db37c9dc..b07b7167b303 100644 --- a/include/openvswitch/thread.h +++ b/include/openvswitch/thread.h @@ -17,6 +17,8 @@ #ifndef OPENVSWITCH_THREAD_H #define OPENVSWITCH_THREAD_H 1 +#include + #include #include #include @@ -33,6 +35,13 @@ struct OVS_LOCKABLE ovs_mutex { const char *where; /* NULL if and only if uninitialized. */ }; +#ifdef HAVE_PTHREAD_SPIN_LOCK +struct OVS_LOCKABLE ovs_spin { +pthread_spinlock_t lock; +const char *where; /* NULL if and only if uninitialized. */ +}; +#endif + /* "struct ovs_mutex" initializer. */ #ifdef PTHREAD_ERRORCHECK_MUTEX_INITIALIZER_NP #define OVS_MUTEX_INITIALIZER { PTHREAD_ERRORCHECK_MUTEX_INITIALIZER_NP, \ @@ -70,6 +79,21 @@ int ovs_mutex_trylock_at(const struct ovs_mutex *mutex, const char *where) void ovs_mutex_cond_wait(pthread_cond_t *, const struct ovs_mutex *mutex) OVS_REQUIRES(mutex); + +#ifdef HAVE_PTHREAD_SPIN_LOCK +void ovs_spin_init(const struct ovs_spin *); +void ovs_spin_destroy(const struct ovs_spin *); +void ovs_spin_unlock(const struct ovs_spin *spin) OVS_RELEASES(spin); +void ovs_spin_lock_at(const struct ovs_spin *spin, const char *where) +OVS_ACQUIRES(spin); +#define ovs_spin_lock(spin) \ +ovs_spin_lock_at(spin, OVS_SOURCE_LOCATOR) + +int ovs_spin_trylock_at(const struct ovs_spin *spin, const char *where) +OVS_TRY_LOCK(0, spin); +#define ovs_spin_trylock(spin) \ +ovs_spin_trylock_at(spin, OVS_SOURCE_LOCATOR) +#endif /* Convenient once-only execution. * diff --git a/lib/ovs-thread.c b/lib/ovs-thread.c index 159d87e5b0ca..b686e4548127 100644 --- a/lib/ovs-thread.c +++ b/lib/ovs-thread.c @@ -75,6 +75,9 @@ static bool multithreaded; LOCK_FUNCTION(mutex, lock); LOCK_FUNCTION(rwlock, rdlock); LOCK_FUNCTION(rwlock, wrlock); +#ifdef HAVE_PTHREAD_SPIN_LOCK +LOCK_FUNCTION(spin, lock); +#endif #define TRY_LOCK_FUNCTION(TYPE, FUN) \ int \ @@ -103,6 +106,9 @@ LOCK_FUNCTION(rwlock, wrlock); TRY_LOCK_FUNCTION(mutex, trylock); TRY_LOCK_FUNCTION(rwlock, tryrdlock); TRY_LOCK_FUNCTION(rwlock, trywrlock); +#ifdef HAVE_PTHREAD_SPIN_LOCK +TRY_LOCK_FUNCTION(spin, trylock); +#endif #define UNLOCK_FUNCTION(TYPE, FUN, WHERE) \ void \ @@ -125,6 +131,10 @@ UNLOCK_FUNCTION(mutex, unlock, ""); UNLOCK_FUNCTION(mutex, destroy, NULL); UNLOCK_FUNCTION(rwlock, unlock, ""); UNLOCK_FUNCTION(rwlock, destroy, NULL); +#ifdef HAVE_PTHREAD_SPIN_LOCK +UNLOCK_FUNCTION(spin, unlock, ""); +UNLOCK_FUNCTION(spin, destroy, NULL); +#endif #define XPTHREAD_FUNC1(FUNCTION, PARAM1)\ void\ @@ -268,6 +278,27 @@ ovs_mutex_cond_wait(pthread_cond_t *cond, const struct ovs_mutex *mutex_) } } +#ifdef HAVE_PTHREAD_SPIN_LOCK +static void +ovs_spin_init__(const struct ovs_spin *l_, int pshared) +{ +struct ovs_spin *l = CONST_CAST(struct ovs_spin *, l_); +int error; + +l->where = ""; +error = pthread_spin_init(>lock, pshared); +if (OVS_UNLIKELY(error)) { +ovs_abort(error, "pthread_spin_init failed"); +} +} + +void +ovs_spin_init(const struct ovs_spin *spin) +{ +ovs_spin_init__(spin, PTHREAD_PROCESS_PRIVATE); +} +#endif + /* Initializes the 'barrier'. 'size' is the number of threads * expected to hit the barrier. */ void -- 2.7.4 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v13 0/5] dpcls func ptrs & optimizations
On 7/17/2019 7:21 PM, Harry van Haaren wrote: Hey Folks, Here a v13 of the DPCLS Function Pointer patchset, as has been presented at OVS Conf in Nov '18, and discussed on the ML since then. I'm aware of the soft-freeze for 2.12, I feel this patchset has had enough reviews/versions/testing to be merged in 2.12. Thanks Ilya and Ian for input and suggestions on v12. Only change is that the location of the blocks_scratch array has moved from struct dpcls to a thread local storage internal to the implementation. This avoids leaking the blocks_scratch concept outside the dpcls implementation, while also ensuring that each thread running the dpcls has its own scratch memory. Given the nearing soft-freeze and branch deadlines, I'd like to see this get merged - and any future minor comments / improvements can be handled before release. Regards, -Harry Thanks for the quick turn around on this Harry, I'm hoping to have a closer look on the v13 changes tomorrow. Regards Ian ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v13 5/5] dpif-netdev: Add specialized generic scalar functions
Bleep bloop. Greetings Harry van Haaren, I am a robot and I have tried out your patch. Thanks for your contribution. I encountered some error that I wasn't expecting. See the details below. checkpatch: ERROR: Inappropriate bracing around statement #110 FILE: lib/dpif-netdev-lookup-generic.c:300: if (!f && u0_bits == U0 && u1_bits == U1) { \ Lines checked: 176, Warnings: 0, Errors: 1 Please check this out. If you feel there has been an error, please email acon...@bytheb.org Thanks, 0-day Robot ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v13 2/5] dpif-netdev: Move dpcls lookup structures to .h
Bleep bloop. Greetings Harry van Haaren, I am a robot and I have tried out your patch. Thanks for your contribution. I encountered some error that I wasn't expecting. See the details below. checkpatch: ERROR: Improper whitespace around control block #129 FILE: lib/dpif-netdev-private.h:91: #define NETDEV_FLOW_KEY_FOR_EACH_IN_FLOWMAP(VALUE, KEY, FLOWMAP) \ ERROR: Inappropriate bracing around statement #130 FILE: lib/dpif-netdev-private.h:92: MINIFLOW_FOR_EACH_IN_FLOWMAP (VALUE, &(KEY)->mf, FLOWMAP) Lines checked: 271, Warnings: 0, Errors: 2 Please check this out. If you feel there has been an error, please email acon...@bytheb.org Thanks, 0-day Robot ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] flow: Avoid unsafe comparison of minimasks.
On Wed, Jul 17, 2019 at 7:55 PM Ben Pfaff wrote: > > The following, run inside the OVS sandbox, caused OVS to abort when > Address Sanitizer was used: > > ovs-vsctl add-br br-int > ovs-ofctl add-flow br-int > "table=0,cookie=0x1234,priority=1,icmp,actions=drop" > ovs-ofctl --strict del-flows br-int > "table=0,cookie=0x1234/-1,priority=1" > > Sample report from Address Sanitizer: > > ==3029==ERROR: AddressSanitizer: heap-buffer-overflow on address > 0x60343260 at pc 0x7f6b09c2459b bp 0x7ffcb67e7540 sp 0x7ffcb67e6cf0 > READ of size 40 at 0x60343260 thread T0 > #0 0x7f6b09c2459a (/lib/x86_64-linux-gnu/libasan.so.5+0xb859a) > #1 0x565110a748a5 in minimask_equal ../lib/flow.c:3510 > #2 0x565110a9ea41 in minimatch_equal ../lib/match.c:1821 > #3 0x56511091e864 in collect_rules_strict ../ofproto/ofproto.c:4516 > #4 0x56511093d526 in delete_flow_start_strict ../ofproto/ofproto.c:5959 > #5 0x56511093d526 in ofproto_flow_mod_start ../ofproto/ofproto.c:7949 > #6 0x56511093d77b in handle_flow_mod__ ../ofproto/ofproto.c:6122 > #7 0x56511093db71 in handle_flow_mod ../ofproto/ofproto.c:6099 > #8 0x5651109407f6 in handle_single_part_openflow ../ofproto/ofproto.c:8406 > #9 0x5651109407f6 in handle_openflow ../ofproto/ofproto.c:8587 > #10 0x5651109e40da in ofconn_run ../ofproto/connmgr.c:1318 > #11 0x5651109e40da in connmgr_run ../ofproto/connmgr.c:355 > #12 0x56511092b129 in ofproto_run ../ofproto/ofproto.c:1826 > #13 0x5651108f23cd in bridge_run__ ../vswitchd/bridge.c:2965 > #14 0x565110904887 in bridge_run ../vswitchd/bridge.c:3023 > #15 0x5651108e659c in main ../vswitchd/ovs-vswitchd.c:127 > #16 0x7f6b093b709a in __libc_start_main ../csu/libc-start.c:308 > #17 0x5651108e9009 in _start > (/home/blp/nicira/ovs/_build/vswitchd/ovs-vswitchd+0x11d009) > > This fixes the problem, which although largely theoretical could crop up > with odd implementations of memcmp(), perhaps ones optimized in various > "clever" ways. All in all, it seems best to avoid the theoretical problem. > > Signed-off-by: Ben Pfaff Cool! Looks good (and interesting) to me Acked-by: Dumitru Ceara > --- > lib/flow.c | 19 --- > 1 file changed, 16 insertions(+), 3 deletions(-) > > diff --git a/lib/flow.c b/lib/flow.c > index 95da7d4b1803..e54fd2e522e6 100644 > --- a/lib/flow.c > +++ b/lib/flow.c > @@ -1,5 +1,5 @@ > /* > - * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2017 > Nicira, Inc. > + * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2017, 2019 > Nicira, Inc. > * > * Licensed under the Apache License, Version 2.0 (the "License"); > * you may not use this file except in compliance with the License. > @@ -3506,8 +3506,21 @@ minimask_expand(const struct minimask *mask, struct > flow_wildcards *wc) > bool > minimask_equal(const struct minimask *a, const struct minimask *b) > { > -return !memcmp(a, b, sizeof *a > - + MINIFLOW_VALUES_SIZE(miniflow_n_values(>masks))); > +/* At first glance, it might seem that this can be reasonably optimized > + * into a single memcmp() for the total size of the region. Such an > + * optimization will work OK with most implementations of memcmp() that > + * proceed from the start of the regions to be compared to the end in > + * reasonably sized chunks. However, memcmp() is not required to be > + * implemented that way, and an implementation that, for example, > compares > + * all of the bytes in both regions without early exit when it finds a > + * difference, or one that compares, say, 64 bytes at a time, could > access > + * an unmapped region of memory if minimasks 'a' and 'b' have different > + * lengths. By first checking that the maps are the same with the first > + * memcmp(), we verify that 'a' and 'b' have the same length and > therefore > + * ensure that the second memcmp() is safe. */ > +return (!memcmp(a, b, sizeof *a) > +&& !memcmp(a + 1, b + 1, > + MINIFLOW_VALUES_SIZE(miniflow_n_values(>masks; > } > > /* Returns true if at least one bit matched by 'b' is wildcarded by 'a', > -- > 2.20.1 > > ___ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH] ovsdb-idl: Mark row "parsed" in ovsdb_idl_txn_write__
Once a column is set in a row using ovsdb_idl_txn_write__ we also mark the row "parsed". Otherwise the memory allocated by sbrec__parse_() will never be freed. After marking the row "parsed", the ovsdb_idl_txn_disassemble function will properly free the newly allocated memory for the column (ovsdb_idl_row_unparse). The problem is present only for rows that are inserted by the running application because rows that are loaded from the database will always have row->parsed == true. One way to detect the leak is to start northd with valgrind: valgrind --tool=memcheck --leak-check=yes ./ovn-northd Then create a logical switch and bind a logical port to it: ovn-nbctl ls-add ls1 ovn-nbctl lsp-add ls1 ls1-vm1 The valgrind report: ==9270== Memcheck, a memory error detector ==9270== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al. ==9270== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info ==9270== Command: ./ovn-northd ==9270== ==9270== ==9270== 8 bytes in 1 blocks are definitely lost in loss record 30 of 292 ==9270==at 0x4C29BC3: malloc (vg_replace_malloc.c:299) ==9270==by 0x4D31EF: xmalloc (util.c:138) ==9270==by 0x45CB8E: sbrec_multicast_group_parse_ports (ovn-sb-idl.c:18141) ==9270==by 0x4BB12D: ovsdb_idl_txn_write__ (ovsdb-idl.c:4489) ==9270==by 0x4BB1B5: ovsdb_idl_txn_write (ovsdb-idl.c:4527) ==9270==by 0x45D167: sbrec_multicast_group_set_ports (ovn-sb-idl.c:18561) ==9270==by 0x40F416: ovn_multicast_update_sbrec (ovn-northd.c:2947) ==9270==by 0x41FC55: build_lflows (ovn-northd.c:7981) ==9270==by 0x421830: ovnnb_db_run (ovn-northd.c:8522) ==9270==by 0x422B2D: ovn_db_run (ovn-northd.c:9089) ==9270==by 0x423909: main (ovn-northd.c:9409) ==9270== ==9270== 157 (32 direct, 125 indirect) bytes in 1 blocks are definitely lost in loss record 199 of 292 ==9270==at 0x4C29BC3: malloc (vg_replace_malloc.c:299) ==9270==by 0x4D31EF: xmalloc (util.c:138) ==9270==by 0x471E3D: resize (hmap.c:100) ==9270==by 0x4720C8: hmap_expand_at (hmap.c:175) ==9270==by 0x4C74F1: hmap_insert_at (hmap.h:277) ==9270==by 0x4C825A: smap_add__ (smap.c:392) ==9270==by 0x4C7783: smap_add (smap.c:55) ==9270==by 0x451054: sbrec_datapath_binding_parse_external_ids (ovn-sb-idl.c:7181) ==9270==by 0x4BB12D: ovsdb_idl_txn_write__ (ovsdb-idl.c:4489) ==9270==by 0x4BB1B5: ovsdb_idl_txn_write (ovsdb-idl.c:4527) ==9270==by 0x451436: sbrec_datapath_binding_set_external_ids (ovn-sb-idl.c:7444) ==9270==by 0x4090F1: ovn_datapath_update_external_ids (ovn-northd.c:817) ==9270== ==9270== ==9270== LEAK SUMMARY: ==9270==definitely lost: 1,322 bytes in 47 blocks ==9270==indirectly lost: 4,653 bytes in 240 blocks ==9270== possibly lost: 0 bytes in 0 blocks ==9270==still reachable: 254,004 bytes in 7,780 blocks ==9270== suppressed: 0 bytes in 0 blocks ==9270== Reachable blocks (those to which a pointer was found) are not shown. ==9270== To see them, rerun with: --leak-check=full --show-leak-kinds=all ==9270== ==9270== For counts of detected and suppressed errors, rerun with: -v ==9270== ERROR SUMMARY: 9 errors from 9 contexts (suppressed: 0 from 0) Signed-off-by: Dumitru Ceara --- lib/ovsdb-idl.c | 1 + 1 file changed, 1 insertion(+) diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c index 5c61096..1a6a4af 100644 --- a/lib/ovsdb-idl.c +++ b/lib/ovsdb-idl.c @@ -4487,6 +4487,7 @@ ovsdb_idl_txn_write__(const struct ovsdb_idl_row *row_, } (column->unparse)(row); (column->parse)(row, >new_datum[column_idx]); +row->parsed = true; if (!index_row) { ovsdb_idl_add_to_indexes(row); } -- 1.8.3.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] flow: Avoid unsafe comparison of minimasks.
Bleep bloop. Greetings Ben Pfaff, I am a robot and I have tried out your patch. Thanks for your contribution. I encountered some error that I wasn't expecting. See the details below. checkpatch: WARNING: Line is 88 characters long (recommended limit is 79) #52 FILE: lib/flow.c:2: * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2017, 2019 Nicira, Inc. Lines checked: 82, Warnings: 1, Errors: 0 Please check this out. If you feel there has been an error, please email acon...@bytheb.org Thanks, 0-day Robot ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCHv16 2/2] netdev-afxdp: add new netdev type for AF_XDP.
On Wed, Jul 17, 2019 at 9:56 AM Ben Pfaff wrote: > > On Fri, Jul 12, 2019 at 04:50:56PM -0700, William Tu wrote: > > The patch introduces experimental AF_XDP support for OVS netdev. > > AF_XDP, the Address Family of the eXpress Data Path, is a new Linux socket > > type built upon the eBPF and XDP technology. It is aims to have comparable > > performance to DPDK but cooperate better with existing kernel's networking > > stack. An AF_XDP socket receives and sends packets from an eBPF/XDP program > > attached to the netdev, by-passing a couple of Linux kernel's subsystems > > As a result, AF_XDP socket shows much better performance than AF_PACKET > > For more details about AF_XDP, please see linux kernel's > > Documentation/networking/af_xdp.rst. Note that by default, this feature is > > not compiled in. > > > > Signed-off-by: William Tu > > Thanks a lot! I have a few comments. > > In pmd_perf_stats, is last_tsc always accessed in a thread-local manner > or can it be read cross-thread? If it can be read (or updated) > cross-thread, then it seems unsafe to me on 32-bit architectures, which > might reasonably read or write the two halves of it with separate > instructions. If so, then I'd suggest using atomic_uint64_t instead of > plain uint64_t. I think rdtsc is a per-core register and OVS always reads it, not updated. So it's not possible that while in the middle of 32-bit system reading it, the other half register gets updated. I'm not 100% sure, will let Ilya comment about it. > > Looking at xmalloc_size_align(), I noticed a couple of things. First, > the syntax (runt ? : 0) uses a GCC extension. The portable way to write > it is (runt ? runt : 0), or just "runt" since it's a bool. However, I > notice that the older code was (runt ? CACHE_LINE_SIZE : 0), and by > analogy the newer code should be (runt ? alignment : 0). Are you > convinced that the substitute is correct? Yes, this is a mistake. It should be (runt ? alignment : 0) > > I also have a couple of trivia: > > * One of the conditionals in automake.mk isn't needed because > LIBBPF_LDADD will be defined to empty if !HAVE_AF_XDP. > > * In dpif-netdev-perf.h, we don't need the casts because the standard C > promotions work fine for us in this case. > > I'm appending an incremental to address the trivia: > > diff --git a/lib/automake.mk b/lib/automake.mk > index b07bb01c4ef7..7cb1c3373538 100644 > --- a/lib/automake.mk > +++ b/lib/automake.mk > @@ -9,15 +9,12 @@ lib_LTLIBRARIES += lib/libopenvswitch.la > > lib_libopenvswitch_la_LIBADD = $(SSL_LIBS) > lib_libopenvswitch_la_LIBADD += $(CAPNG_LDADD) > +lib_libopenvswitch_la_LIBADD += $(LIBBPF_LDADD) > > if WIN32 > lib_libopenvswitch_la_LIBADD += ${PTHREAD_LIBS} > endif > > -if HAVE_AF_XDP > -lib_libopenvswitch_la_LIBADD += $(LIBBPF_LDADD) > -endif > - > lib_libopenvswitch_la_LDFLAGS = \ > $(OVS_LTINFO) \ > -Wl,--version-script=$(top_builddir)/lib/libopenvswitch.sym \ > diff --git a/lib/dpif-netdev-perf.h b/lib/dpif-netdev-perf.h > index 6b6dfda7db1c..3a5e90b16a5f 100644 > --- a/lib/dpif-netdev-perf.h > +++ b/lib/dpif-netdev-perf.h > @@ -192,15 +192,11 @@ static inline uint64_t > rdtsc_syscall(struct pmd_perf_stats *s) > { > struct timespec val; > -uint64_t v; > - > if (clock_gettime(CLOCK_MONOTONIC_RAW, ) != 0) { > return s->last_tsc; > } > > -v = (uint64_t) val.tv_sec * 10LL; > -v += (uint64_t) val.tv_nsec; > - > +uint64_t v = val.tv_sec * UINT64_C(10) + val.tv_nsec; > return s->last_tsc = v; > } > #endif Thank you, will add them in v17 patch. William ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v1] doc: Remove experimental tag for SMC cache.
>-Original Message- >From: Stokes, Ian >Sent: Thursday, July 4, 2019 3:40 AM >To: Ilya Maximets ; Wang, Yipeng1 >; d...@openvswitch.org; Ben Pfaff > >Subject: Re: [PATCH v1] doc: Remove experimental tag for SMC cache. > >On 7/4/2019 11:12 AM, Ilya Maximets wrote: >> On 03.07.2019 19:58, Ian Stokes wrote: >>> On 6/27/2019 12:44 PM, Yipeng Wang wrote: SMC cache was introduced in 2.10 with experimental tag. SMC cache is a layer of software cache located after EMC cache. The purpose is to improve the performance of use cases that many flows missing the EMC cache. One can enable SMC cache using smc-enable=true option. >>> >>> Thanks for this Yipeng. >>> >>> I'd raised the topic of removing the experimental tag at the community call >>> previously. >>> >>> I guess two aspects were called out that form part of the litmus test for >>> when the tag should be removed. >>> >>> 1. Is the feature in use in common/commercial use? >>> 2. Are there any to-dos or known bugs to be resolved? >>> >>> WRT point 1, I do believe it has been used/tested in commercial deployments >>> and has been benchmarked against realistic >deployments with positive results over EMC (except when there is only a small >number of flows). I believe this was even presented >by Red Hat at the OVS conference last year, so this is encouraging. >>> >>> WRT to known bugs I'm not aware of any outstanding. There have been a >>> handful of patches upstreamed since 2.10 to address >some minor issues. >>> >>> I think it's ok to remove this tag for the 2.12 release. >>> >>> Would be interested to hear others thoughts? >> >> Hi, Ian and Yipeng. >> >> SMC seems stable enough now. Since fixing one major bug last year I >> didn't noticed about new issues, however I didn't try to use it in >> long-living setups. >> This change doesn't enable SMC by default, so it's totally OK for >> me to remove the experimental tag. This way we'll, probably, have >> more users of this feature and subsequently more testing that will >> allow us to enable it by default for a next releases. >> > >Thanks Ilya, I agree, its seems to be a natural first step before >replacing EMC. > >> So, idea is good. One comment inline for the patch itself. And >> rebase is needed also. >> >> Best regards, Ilya Maximets. >> >>> >>> Regards >>> Ian >>> Signed-off-by: Yipeng Wang --- Documentation/topics/dpdk/bridge.rst | 4 ++-- NEWS | 1 + 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/Documentation/topics/dpdk/bridge.rst b/Documentation/topics/dpdk/bridge.rst index a3ed926..71e1af6 100644 --- a/Documentation/topics/dpdk/bridge.rst +++ b/Documentation/topics/dpdk/bridge.rst @@ -117,7 +117,7 @@ It is also possible to enable/disable EMC on per-port basis using:: For more information on the EMC refer to :doc:`/intro/install/dpdk` . -SMC cache (experimental) +SMC cache - >> >> Above line should be shortened according to the new section name >> to be rST compliant, otherwise it will trigger a warning/error >> while docs generation. > >+1 [Wang, Yipeng] Fixed in V2. > >Regards >Ian >> SMC cache or signature match cache is a new cache level after EMC cache. @@ -125,7 +125,7 @@ The difference between SMC and EMC is SMC only stores a signature of a flow thus it is much more memory efficient. With same memory space, EMC can store 8k flows while SMC can store 1M flows. When traffic flow count is much larger than EMC size, it is generally beneficial to turn off EMC and turn on SMC. It is -currently turned off by default and an experimental feature. +currently turned off by default. To turn on SMC:: diff --git a/NEWS b/NEWS index 2f8171f..bf99295 100644 --- a/NEWS +++ b/NEWS @@ -29,6 +29,7 @@ Post-v2.11.0 * New action "check_pkt_len". * Port configuration with "other-config:priority-tags" now has a mode that retains the 802.1Q header even if VLAN and priority are both zero. + * Removed experimental tag for SMC cache. - OVSDB: * OVSDB clients can now resynchronize with clustered servers much more quickly after a brief disconnection, saving bandwidth and CPU time. [Wang, Yipeng] Thank you both for the comment! I rebased and posted V2. Thanks Yipeng ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH v2] doc: Remove experimental tag for SMC cache.
SMC cache was introduced in 2.10 with experimental tag. SMC cache is a layer of software cache located after EMC cache. The purpose is to improve the performance of use cases that many flows missing the EMC cache. One can enable SMC cache using smc-enable=true option. Signed-off-by: Yipeng Wang --- Documentation/topics/dpdk/bridge.rst | 6 +++--- NEWS | 1 + 2 files changed, 4 insertions(+), 3 deletions(-) diff --git a/Documentation/topics/dpdk/bridge.rst b/Documentation/topics/dpdk/bridge.rst index a3ed926..4ca79a4 100644 --- a/Documentation/topics/dpdk/bridge.rst +++ b/Documentation/topics/dpdk/bridge.rst @@ -117,15 +117,15 @@ It is also possible to enable/disable EMC on per-port basis using:: For more information on the EMC refer to :doc:`/intro/install/dpdk` . -SMC cache (experimental) -- +SMC cache +- SMC cache or signature match cache is a new cache level after EMC cache. The difference between SMC and EMC is SMC only stores a signature of a flow thus it is much more memory efficient. With same memory space, EMC can store 8k flows while SMC can store 1M flows. When traffic flow count is much larger than EMC size, it is generally beneficial to turn off EMC and turn on SMC. It is -currently turned off by default and an experimental feature. +currently turned off by default. To turn on SMC:: diff --git a/NEWS b/NEWS index 806e3c8..feae994 100644 --- a/NEWS +++ b/NEWS @@ -34,6 +34,7 @@ Post-v2.11.0 * 'ovs-appctl exit' now implies cleanup of non-internal ports in userspace datapath regardless of '--cleanup' option. Use '--cleanup' to remove internal ports too. + * Removed experimental tag for SMC cache. - OVSDB: * OVSDB clients can now resynchronize with clustered servers much more quickly after a brief disconnection, saving bandwidth and CPU time. -- 2.7.4 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH v13 5/5] dpif-netdev: Add specialized generic scalar functions
This commit adds a number of specialized functions, that handle common miniflow fingerprints. This enables compiler optimization, resulting in higher performance. Below a quick description of how this optimization actually works; "Specialized functions" are "instances" of the generic implementation, but the compiler is given extra context when compiling. In the case of iterating miniflow datastructures, the most interesting value to enable compile time optimizations is the loop trip count per unit. In order to create a specialized function, there is a generic implementation, which uses a for() loop without the compiler knowing the loop trip count at compile time. The loop trip count is passed in as an argument to the function: uint32_t miniflow_impl_generic(struct miniflow *mf, uint32_t loop_count) { for(uint32_t i = 0; i < loop_count; i++) // do work } In order to "specialize" the function, we call the generic implementation with hard-coded numbers - these are compile time constants! uint32_t miniflow_impl_loop5(struct miniflow *mf, uint32_t loop_count) { // use hard coded constant for compile-time constant-propogation return miniflow_impl_generic(mf, 5); } Given the compiler is aware of the loop trip count at compile time, it can perform an optimization known as "constant propogation". Combined with inlining of the miniflow_impl_generic() function, the compiler is now enabled to *compile time* unroll the loop 5x, and produce "flat" code. The last step to using the specialized functions is to utilize a function-pointer to choose the specialized (or generic) implementation. The selection of the function pointer is performed at subtable creation time, when miniflow fingerprint of the subtable is known. This technique is known as "multiple dispatch" in some literature, as it uses multiple items of information (miniflow bit counts) to select the dispatch function. By pointing the function pointer at the optimized implementation, OvS benefits from the compile time optimizations at runtime. Signed-off-by: Harry van Haaren Tested-by: Malvika Gupta --- v13: - Update macros to new lookup function prototype without blocks scratch v12: - Fix typo (Ian) - Fix missing . after comments (Ian) - Improve return value comments for probe function (Ian) - Spaces after number in optimized lookup declarations (Ilya) - Make VLOG level debug instead of info (Ilya) v11: - Use MACROs to declare and check optimized functions (Ilya) - Use captial letter for commit message (Ilya) - Rebase onto latest patchset changes - Added NEWS entry for data-path subtable specialization (Ian/Harry) - Checkpatch notes an "incorrect bracketing" in the MACROs, however I didn't find a solution that it does like. v10: - Rebase changes from previous patches. - Remove "restrict" keyword as windows CI failed, see here for details: https://ci.appveyor.com/project/istokes/ovs-q8fvv/builds/24398228 v8: - Rework to use blocks_cache from the dpcls instance, to avoid variable lenght arrays in the data-path. --- NEWS | 4 +++ lib/dpif-netdev-lookup-generic.c | 49 lib/dpif-netdev-private.h| 8 ++ lib/dpif-netdev.c| 9 -- 4 files changed, 68 insertions(+), 2 deletions(-) diff --git a/NEWS b/NEWS index 81130e667..4cfffb1bc 100644 --- a/NEWS +++ b/NEWS @@ -34,6 +34,10 @@ Post-v2.11.0 * 'ovs-appctl exit' now implies cleanup of non-internal ports in userspace datapath regardless of '--cleanup' option. Use '--cleanup' to remove internal ports too. + * Datapath classifer code refactored to enable function pointers to select + the lookup implementation at runtime. This enables specialization of + specific subtables based on the miniflow attributes, enhancing the + performance of the subtable search. - OVSDB: * OVSDB clients can now resynchronize with clustered servers much more quickly after a brief disconnection, saving bandwidth and CPU time. diff --git a/lib/dpif-netdev-lookup-generic.c b/lib/dpif-netdev-lookup-generic.c index de45099f8..50edc483c 100644 --- a/lib/dpif-netdev-lookup-generic.c +++ b/lib/dpif-netdev-lookup-generic.c @@ -269,7 +269,56 @@ dpcls_subtable_lookup_generic(struct dpcls_subtable *subtable, const struct netdev_flow_key *keys[], struct dpcls_rule **rules) { +/* Here the runtime subtable->mf_bits counts are used, which forces the + * compiler to iterate normal for() loops. Due to this limitation in the + * compilers available optimizations, this function has lower performance + * than the below specialized functions. + */ return lookup_generic_impl(subtable, keys_map, keys, rules, subtable->mf_bits_set_unit0, subtable->mf_bits_set_unit1); } + +/* Expand out specialized functions with U0 and U1
[ovs-dev] [PATCH v13 4/5] dpif-netdev: Refactor generic implementation
This commit refactors the generic implementation. The goal of this refactor is to simplify the code to enable "specialization" of the functions at compile time. Given compile-time optimizations, the compiler is able to unroll loops, and create optimized code sequences due to compile time knowledge of loop-trip counts. In order to enable these compiler optimizations, we must refactor the code to pass the loop-trip counts to functions as compile time constants. This patch allows the number of miniflow-bits set per "unit" in the miniflow to be passed around as a function argument. Note that this patch does NOT yet take advantage of doing so, this is only a refactor to enable it in the next patches. Signed-off-by: Harry van Haaren Tested-by: Malvika Gupta --- v13: - Moved blocks scratch array to thread local storage. This encapsulates the blocks array better inside the implementation where it is required, and avoids bleeding the details to the dpcls or PMD level. Thanks Ilya for suggesting the DEFINE_STATIC_PER_THREAD_DATA method. - Removed blocks scratch array from struct dpcls - Removed blocks_scratch parameter from lookup_func prototype v12: - Fix Caps and . (Ilya) - Fixed typos (Ilya) - Added mf_bits and mf_masks in this patch (Ilya) - Fixed rebase conflicts v11: - Rebased to previous changes - Fix typo in commit message (Ian) - Fix variable declaration spacing (Ian) - Remove function names from comments (Ian) - Replace magic 8 with sizeof(uint64_t) (Ian) - Captialize and end comments with a stop. (Ian/Ilya) - Add build time assert to validate FLOWMAP_UNITS (Ilya) - Add note on ALWAYS_INLINE operation - Add space after ULLONG_FOR_EACH_1 (Ilya) - Use hash_add_words64() instead of rolling own loop (Ilya) Note that hash_words64_inline() calls hash_finish() with an fixed value, so it was not the right hash function for this usage. Used hash_add_words64() and manual hash_finish() to re-use as much of hashing code as we can. v10: - Rebase updates from previous patches - Fix whitespace indentation of func params - Removed restrict keyword, Windows CI failing when it is used (Ian) - Fix integer 0 used to set NULL pointer (Ilya) - Postpone free() call on cls->blocks_scratch (Ilya) - Fix indentation of a function v9: - Use count_1bits in favour of __builtin_popcount (Ilya) - Use ALWAYS_INLINE instead of __attribute__ synatx (Ilya) v8: - Rework block_cache and mf_masks to avoid variable-lenght array due to compiler issues. Provisioning for worst case is not a good solution due to magnitue of over-provisioning required. - Rework netdev_flatten function removing unused parameter --- lib/dpif-netdev-lookup-generic.c | 241 +++ lib/dpif-netdev-private.h| 12 +- lib/dpif-netdev.c| 51 ++- 3 files changed, 269 insertions(+), 35 deletions(-) diff --git a/lib/dpif-netdev-lookup-generic.c b/lib/dpif-netdev-lookup-generic.c index 8064911b3..de45099f8 100644 --- a/lib/dpif-netdev-lookup-generic.c +++ b/lib/dpif-netdev-lookup-generic.c @@ -27,61 +27,226 @@ #include "dpif-netdev-perf.h" #include "dpif-provider.h" #include "flow.h" +#include "ovs-thread.h" #include "packets.h" #include "pvector.h" -/* Returns a hash value for the bits of 'key' where there are 1-bits in - * 'mask'. */ -static inline uint32_t -netdev_flow_key_hash_in_mask(const struct netdev_flow_key *key, - const struct netdev_flow_key *mask) +VLOG_DEFINE_THIS_MODULE(dpif_lookup_generic); + +/* Lookup functions below depends on the internal structure of flowmap. */ +BUILD_ASSERT_DECL(FLOWMAP_UNITS == 2); + +/* Per thread data to store the blocks cache. The 'blocks_cache_count' variable + * stores the size of the allocated space in uint64_t blocks (so * 8 to get the + * size in bytes). + */ +DEFINE_STATIC_PER_THREAD_DATA(uint64_t *, blocks_scratch_ptr, 0); +DEFINE_STATIC_PER_THREAD_DATA(uint32_t, blocks_scratch_count_ptr, 0); + +/* Given a packet, table and mf_masks, this function iterates over each bit + * set in the subtable, and calculates the appropriate metadata to store in the + * blocks_scratch[]. + * + * The results of the blocks_scratch[] can be used for hashing, and later for + * verification of if a rule matches the given packet. + */ +static inline void +netdev_flow_key_flatten_unit(const uint64_t *pkt_blocks, + const uint64_t *tbl_blocks, + const uint64_t *mf_masks, + uint64_t *blocks_scratch, + const uint64_t pkt_mf_bits, + const uint32_t count) { -const uint64_t *p = miniflow_get_values(>mf); -uint32_t hash = 0; -uint64_t value; +uint32_t i; + +for (i = 0; i < count; i++) { +uint64_t mf_mask = mf_masks[i]; +/* Calculate the block index for the packet metadata. */ +uint64_t idx_bits = mf_mask & pkt_mf_bits; +const uint32_t
[ovs-dev] [PATCH v13 3/5] dpif-netdev: Split out generic lookup function
This commit splits the generic hash-lookup-verify function to its own file, for cleaner seperation between optimized versions. Signed-off-by: Harry van Haaren Tested-by: Malvika Gupta --- v12: - Fix indentation around "keep sparse happy" comment (Ilya) v11: - Rebase fixups from previous patches - Spacing around ULLONG_FOR_EACH_1 (Ilya) v10: - Rebase fixups from previous patch changes v6: - Fixup some checkpatch warnings on whitespace with MACROs (Ilya) - Other MACROs function incorrectly when checkpatch is happy, so using the functional version without space after the ( character. This prints a checkpatch warning, but I see no way to fix it. --- lib/automake.mk | 1 + lib/dpif-netdev-lookup-generic.c | 98 lib/dpif-netdev.c| 69 +- 3 files changed, 100 insertions(+), 68 deletions(-) create mode 100644 lib/dpif-netdev-lookup-generic.c diff --git a/lib/automake.mk b/lib/automake.mk index 6f216efe0..29d3458da 100644 --- a/lib/automake.mk +++ b/lib/automake.mk @@ -78,6 +78,7 @@ lib_libopenvswitch_la_SOURCES = \ lib/dp-packet.h \ lib/dp-packet.c \ lib/dpdk.h \ + lib/dpif-netdev-lookup-generic.c \ lib/dpif-netdev.c \ lib/dpif-netdev.h \ lib/dpif-netdev-private.h \ diff --git a/lib/dpif-netdev-lookup-generic.c b/lib/dpif-netdev-lookup-generic.c new file mode 100644 index 0..8064911b3 --- /dev/null +++ b/lib/dpif-netdev-lookup-generic.c @@ -0,0 +1,98 @@ +/* + * Copyright (c) 2009, 2010, 2011, 2012, 2013, 2014, 2016, 2017 Nicira, Inc. + * Copyright (c) 2019 Intel Corporation. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at: + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#include +#include "dpif-netdev.h" +#include "dpif-netdev-private.h" + +#include "bitmap.h" +#include "cmap.h" + +#include "dp-packet.h" +#include "dpif.h" +#include "dpif-netdev-perf.h" +#include "dpif-provider.h" +#include "flow.h" +#include "packets.h" +#include "pvector.h" + +/* Returns a hash value for the bits of 'key' where there are 1-bits in + * 'mask'. */ +static inline uint32_t +netdev_flow_key_hash_in_mask(const struct netdev_flow_key *key, + const struct netdev_flow_key *mask) +{ +const uint64_t *p = miniflow_get_values(>mf); +uint32_t hash = 0; +uint64_t value; + +NETDEV_FLOW_KEY_FOR_EACH_IN_FLOWMAP (value, key, mask->mf.map) { +hash = hash_add64(hash, value & *p); +p++; +} + +return hash_finish(hash, (p - miniflow_get_values(>mf)) * 8); +} + +uint32_t +dpcls_subtable_lookup_generic(struct dpcls_subtable *subtable, + uint32_t keys_map, + const struct netdev_flow_key *keys[], + struct dpcls_rule **rules) +{ +int i; +uint32_t found_map; + +/* Compute hashes for the remaining keys. Each search-key is + * masked with the subtable's mask to avoid hashing the wildcarded + * bits. */ +uint32_t hashes[NETDEV_MAX_BURST]; +ULLONG_FOR_EACH_1 (i, keys_map) { +hashes[i] = netdev_flow_key_hash_in_mask(keys[i], >mask); +} + +/* Lookup. */ +const struct cmap_node *nodes[NETDEV_MAX_BURST]; +found_map = cmap_find_batch(>rules, keys_map, hashes, nodes); + +/* Check results. When the i-th bit of found_map is set, it means + * that a set of nodes with a matching hash value was found for the + * i-th search-key. Due to possible hash collisions we need to check + * which of the found rules, if any, really matches our masked + * search-key. */ +ULLONG_FOR_EACH_1 (i, found_map) { +struct dpcls_rule *rule; + +CMAP_NODE_FOR_EACH (rule, cmap_node, nodes[i]) { +if (OVS_LIKELY(dpcls_rule_matches_key(rule, keys[i]))) { +rules[i] = rule; +/* Even at 20 Mpps the 32-bit hit_cnt cannot wrap + * within one second optimization interval. */ +subtable->hit_cnt++; +goto next; +} +} + +/* None of the found rules was a match. Reset the i-th bit to + * keep searching this key in the next subtable. */ +ULLONG_SET0(found_map, i); /* Did not match. */ +next: +; /* Keep Sparse happy. */ +} + +return found_map; +} diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 749a478a8..b42ca35e3 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c
[ovs-dev] [PATCH v13 2/5] dpif-netdev: Move dpcls lookup structures to .h
This commit moves some data-structures to be available in the dpif-netdev-private.h header. This allows specific implementations of the subtable lookup function to include just that header file, and not require that the code exists in dpif-netdev.c Signed-off-by: Harry van Haaren Tested-by: Malvika Gupta --- v12: - Fixed Caps (Ilya) - Moved declaration of mf_bits and mf_masks* to future patch (Ilya) v11: - Rebase to latest master - Split netdev private header to own file (Ilya) v10: - Rebase updates from previous patch in code that moved. - Move cmap.h include into alphabetical order (Ian) - Fix comment and typo (Ian) - Restructure function typedef to fit in 80 chars v6: - Fix double * in code comments (Eelco) - Reword comment on lookup_func for clarity (Harry) - Rebase fixups --- lib/automake.mk | 1 + lib/dpif-netdev-private.h | 109 ++ lib/dpif-netdev.c | 68 ++-- 3 files changed, 113 insertions(+), 65 deletions(-) create mode 100644 lib/dpif-netdev-private.h diff --git a/lib/automake.mk b/lib/automake.mk index 1b89cac8c..6f216efe0 100644 --- a/lib/automake.mk +++ b/lib/automake.mk @@ -80,6 +80,7 @@ lib_libopenvswitch_la_SOURCES = \ lib/dpdk.h \ lib/dpif-netdev.c \ lib/dpif-netdev.h \ + lib/dpif-netdev-private.h \ lib/dpif-netdev-perf.c \ lib/dpif-netdev-perf.h \ lib/dpif-provider.h \ diff --git a/lib/dpif-netdev-private.h b/lib/dpif-netdev-private.h new file mode 100644 index 0..555856482 --- /dev/null +++ b/lib/dpif-netdev-private.h @@ -0,0 +1,109 @@ +/* + * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2015 Nicira, Inc. + * Copyright (c) 2019 Intel Corperation. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at: + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#ifndef DPIF_NETDEV_PRIVATE_H +#define DPIF_NETDEV_PRIVATE_H 1 + +#include +#include + +#include "dpif.h" +#include "cmap.h" + +#ifdef __cplusplus +extern "C" { +#endif + +/* Forward declaration for lookup_func typedef. */ +struct dpcls_subtable; +struct dpcls_rule; + +/* Must be public as it is instantiated in subtable struct below. */ +struct netdev_flow_key { +uint32_t hash; /* Hash function differs for different users. */ +uint32_t len;/* Length of the following miniflow (incl. map). */ +struct miniflow mf; +uint64_t buf[FLOW_MAX_PACKET_U64S]; +}; + +/* A rule to be inserted to the classifier. */ +struct dpcls_rule { +struct cmap_node cmap_node; /* Within struct dpcls_subtable 'rules'. */ +struct netdev_flow_key *mask; /* Subtable's mask. */ +struct netdev_flow_key flow; /* Matching key. */ +/* 'flow' must be the last field, additional space is allocated here. */ +}; + +/* Lookup function for a subtable in the dpcls. This function is called + * by each subtable with an array of packets, and a bitmask of packets to + * perform the lookup on. Using a function pointer gives flexibility to + * optimize the lookup function based on subtable properties and the + * CPU instruction set available at runtime. + */ +typedef +uint32_t (*dpcls_subtable_lookup_func)(struct dpcls_subtable *subtable, + uint32_t keys_map, + const struct netdev_flow_key *keys[], + struct dpcls_rule **rules); + +/* Prototype for generic lookup func, using same code path as before. */ +uint32_t +dpcls_subtable_lookup_generic(struct dpcls_subtable *subtable, + uint32_t keys_map, + const struct netdev_flow_key *keys[], + struct dpcls_rule **rules); + +/* A set of rules that all have the same fields wildcarded. */ +struct dpcls_subtable { +/* The fields are only used by writers. */ +struct cmap_node cmap_node OVS_GUARDED; /* Within dpcls 'subtables_map'. */ + +/* These fields are accessed by readers. */ +struct cmap rules; /* Contains "struct dpcls_rule"s. */ +uint32_t hit_cnt;/* Number of match hits in subtable in current +optimization interval. */ + +/* The lookup function to use for this subtable. If there is a known + * property of the subtable (eg: only 3 bits of miniflow metadata is + * used for the lookup) then this can point at an optimized version of + * the lookup function for this particular subtable. */ +
[ovs-dev] [PATCH v13 1/5] dpif-netdev: Implement function pointers/subtable
This allows plugging-in of different subtable hash-lookup-verify routines, and allows special casing of those functions based on known context (eg: # of bits set) of the specific subtable. Signed-off-by: Harry van Haaren Tested-by: Malvika Gupta --- v11: - Rebased to latest master - Added space to ULLONG_FOR_EACH_1 (Ilya) - Use capital letter in commit message (Ilya) v10: - Fix capitalization of comments, and punctuation. (Ian) - Variable declarations up top before use (Ian) - Fix alignment of function parameters, had to newline after typedef (Ian) - Some mailing-list questions relpied to on-list (Ian) v9: - Use count_1bits in favour of __builtin_popcount (Ilya) v6: - Implement subtable effort per packet "lookups_match" counter (Ilya) - Remove double newline (Eelco) - Remove double * before comments (Eelco) - Reword comments in dpcls_lookup() for clarity (Harry) --- lib/dpif-netdev.c | 138 -- 1 file changed, 96 insertions(+), 42 deletions(-) diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 6b99a3c44..123f04577 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -7683,6 +7683,28 @@ dpif_dummy_register(enum dummy_level level) /* Datapath Classifier. */ +/* Forward declaration for lookup_func typedef. */ +struct dpcls_subtable; + +/* Lookup function for a subtable in the dpcls. This function is called + * by each subtable with an array of packets, and a bitmask of packets to + * perform the lookup on. Using a function pointer gives flexibility to + * optimize the lookup function based on subtable properties and the + * CPU instruction set available at runtime. + */ +typedef +uint32_t (*dpcls_subtable_lookup_func)(struct dpcls_subtable *subtable, + uint32_t keys_map, + const struct netdev_flow_key *keys[], + struct dpcls_rule **rules); + +/* Prototype for generic lookup func, using same code path as before. */ +uint32_t +dpcls_subtable_lookup_generic(struct dpcls_subtable *subtable, + uint32_t keys_map, + const struct netdev_flow_key *keys[], + struct dpcls_rule **rules); + /* A set of rules that all have the same fields wildcarded. */ struct dpcls_subtable { /* The fields are only used by writers. */ @@ -7692,6 +7714,13 @@ struct dpcls_subtable { struct cmap rules; /* Contains "struct dpcls_rule"s. */ uint32_t hit_cnt;/* Number of match hits in subtable in current optimization interval. */ + +/* The lookup function to use for this subtable. If there is a known + * property of the subtable (eg: only 3 bits of miniflow metadata is + * used for the lookup) then this can point at an optimized version of + * the lookup function for this particular subtable. */ +dpcls_subtable_lookup_func lookup_func; + struct netdev_flow_key mask; /* Wildcards for fields (const). */ /* 'mask' must be the last field, additional space is allocated here. */ }; @@ -7751,6 +7780,10 @@ dpcls_create_subtable(struct dpcls *cls, const struct netdev_flow_key *mask) cmap_init(>rules); subtable->hit_cnt = 0; netdev_flow_key_clone(>mask, mask); + +/* Decide which hash/lookup/verify function to use. */ +subtable->lookup_func = dpcls_subtable_lookup_generic; + cmap_insert(>subtables_map, >cmap_node, mask->hash); /* Add the new subtable at the end of the pvector (with no hits yet) */ pvector_insert(>subtables, subtable, 0); @@ -7911,6 +7944,55 @@ dpcls_rule_matches_key(const struct dpcls_rule *rule, return true; } +uint32_t +dpcls_subtable_lookup_generic(struct dpcls_subtable *subtable, + uint32_t keys_map, + const struct netdev_flow_key *keys[], + struct dpcls_rule **rules) +{ +int i; +uint32_t found_map; + +/* Compute hashes for the remaining keys. Each search-key is + * masked with the subtable's mask to avoid hashing the wildcarded + * bits. */ +uint32_t hashes[NETDEV_MAX_BURST]; +ULLONG_FOR_EACH_1 (i, keys_map) { +hashes[i] = netdev_flow_key_hash_in_mask(keys[i], + >mask); +} + +/* Lookup. */ +const struct cmap_node *nodes[NETDEV_MAX_BURST]; +found_map = cmap_find_batch(>rules, keys_map, hashes, nodes); + +/* Check results. When the i-th bit of found_map is set, it means + * that a set of nodes with a matching hash value was found for the + * i-th search-key. Due to possible hash collisions we need to check + * which of the found rules, if any, really matches our masked + * search-key. */ +ULLONG_FOR_EACH_1 (i, found_map) { +
[ovs-dev] [PATCH v13 0/5] dpcls func ptrs & optimizations
Hey Folks, Here a v13 of the DPCLS Function Pointer patchset, as has been presented at OVS Conf in Nov '18, and discussed on the ML since then. I'm aware of the soft-freeze for 2.12, I feel this patchset has had enough reviews/versions/testing to be merged in 2.12. Thanks Ilya and Ian for input and suggestions on v12. Only change is that the location of the blocks_scratch array has moved from struct dpcls to a thread local storage internal to the implementation. This avoids leaking the blocks_scratch concept outside the dpcls implementation, while also ensuring that each thread running the dpcls has its own scratch memory. Given the nearing soft-freeze and branch deadlines, I'd like to see this get merged - and any future minor comments / improvements can be handled before release. Regards, -Harry Patchset details; 1) Refactor dpcls_lookup and the subtable for flexibility. In particular, add a function pointer to the subtable structure, which enables "plugging-in" a lookup function at runtime. This enables a number of optimizations in future. 2) and 3) With the function pointer in place, we refactor the existing dpcls_lookup matching code into its own function, and later its own file. To split it to its own file requires making various dpcls data-structures available in dpif-netdev-private.h. 4) Refactor the existing code, to favour compute of flat arrays of miniflows, instead of the MACRO based iteration. This simplifies the code itself, and makes future optimizations possible due to simplified loop structures, and loop trip counts pass in via function arguments. See commit message for more details. 5) This patch implements a select few specialized functions, for handling miniflows with 5-1, 4-1, and 4-0 miniflow unit bit patterns. More of these types of functions can (and should) be added to accelerate other patterns of subtable lookups! See commit message for more details. Harry van Haaren (5): dpif-netdev: Implement function pointers/subtable dpif-netdev: Move dpcls lookup structures to .h dpif-netdev: Split out generic lookup function dpif-netdev: Refactor generic implementation dpif-netdev: Add specialized generic scalar functions NEWS | 4 + lib/automake.mk | 2 + lib/dpif-netdev-lookup-generic.c | 324 +++ lib/dpif-netdev-private.h| 127 lib/dpif-netdev.c| 175 - 5 files changed, 533 insertions(+), 99 deletions(-) create mode 100644 lib/dpif-netdev-lookup-generic.c create mode 100644 lib/dpif-netdev-private.h -- 2.17.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] OVN resource agent - make promotion synchronous
On Tue, Jul 09, 2019 at 09:02:57AM +0200, Michele Baldessari wrote: > Currently inside the ovsdb_server_promote() function we call 'promote_ovnnb' > and 'promote_ovnsb' and then just record the new master state in the > CIB. > > This creates a race because those two promote commands are asynchronous > so when we exit the ovsdb_server_promote() function the underlying DBs > are not guaranteed to be in master state. That means that clients might > connect to an instance that is in read-only mode. > > We add a simple sleep loop where we wait for the underlying DB state to > confirm the master state. We do not need to add a timeout loop because > in case of an issue the resource timeout set within pacemaker will kick > in and the resource agent script will be killed by pacemaker. Thanks, applied to master. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] ofp-flow: Improve error message when cookie cannot be set.
On Tue, Jul 16, 2019 at 03:36:07PM -0700, Gregory Rose wrote: > > On 7/16/2019 10:18 AM, Ben Pfaff wrote: > > The "cookie" value has two meanings in "ovs-ofctl add-flow", etc. With > > a mask, it indicates a match; without a mask, it indicates that the > > cookie should be set. In some case, the cookie cannot be set, which may > > mean that the user meant to indicate a match. The error message for this > > case was poor; this improves it. > > > > Suggested-by: "Yi Yang (杨燚)-云服务集团" > > Signed-off-by: Ben Pfaff > > --- > > lib/ofp-flow.c | 6 -- > > 1 file changed, 4 insertions(+), 2 deletions(-) > > > > diff --git a/lib/ofp-flow.c b/lib/ofp-flow.c > > index c45afd204f80..ff0396845a4e 100644 > > --- a/lib/ofp-flow.c > > +++ b/lib/ofp-flow.c > > @@ -1,5 +1,5 @@ > > /* > > - * Copyright (c) 2008-2017 Nicira, Inc. > > + * Copyright (c) 2008-2017, 2019 Nicira, Inc. > >* > >* Licensed under the Apache License, Version 2.0 (the "License"); > >* you may not use this file except in compliance with the License. > > @@ -1660,7 +1660,9 @@ parse_ofp_str__(struct ofputil_flow_mod *fm, int > > command, char *string, > > /* No mask means that the cookie is being set. */ > > if (command != OFPFC_ADD && command != OFPFC_MODIFY > > && command != OFPFC_MODIFY_STRICT) { > > -return xstrdup("cannot set cookie"); > > +return xasprintf("cannot set cookie (to match on a > > " > > + "cookie, specify a mask, e.g. " > > + "cookie=%s/-1)", value); > > } > > error = str_to_be64(value, >new_cookie); > > fm->modify_cookie = true; > More instructive and helpful error messages are a good thing. > Reviewed-by: Greg Rose Thanks, Greg! I applied this to master. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] Building OVS DPDK debian package
On Wed, Jul 17, 2019 at 10:03:06AM -0700, Ravi Kerur wrote: > I am going through OVS 2.11.90 document to build Debian package on Ubuntu > 16.04.4. > > http://docs.openvswitch.org/en/latest/intro/install/distributions/ > > where it claims to have OVS+DPDK debian package. Snippets below > /** > > 3. For DPDK datapath, Open vSwitch with DPDK support is bundled in the > package openvswitch-switch-dpdk. This seems to be true for Ubuntu but not for Debian or for upstream Open vSwitch. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] Badge inquiry
Hi. We emailed you before. This is company from a manufacture specializing in producing lapel pins in China. We specialize in this field for several years, with good quality and pretty competitive price. We can be free to do an order for you. But we have some questions to ask you. Please, what do you exactly need, quality, low price, or timely delivery? What should I do if you want to do business with us? Have you ever got problem on your products? If has, what is it? Are you willing to find a better partner to do business in this line? Will you agree to contact with us if we can meet your needs? Why not email us to get further information? We are looking forward to establish win-win business with you. Many Thanks & Best Regards. Jim Sales Engineer Coin Badge Industrial Inc TEL:+86-027-83899700 Fax:+86-027-83899708 Website:www.Coin-Badge.com ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] 答复: [ovs-discuss] How can I delete flows which match a given cookie value?
--strict means that only exact matches are deleted, so add 'icmp' to your del-flows command to delete the flow. On Wed, Jul 17, 2019 at 03:35:09AM +, Yi Yang (杨燚)-云服务集团 wrote: > Ben, I found del-flows ran successfully but the flows aren't deleted, what's > wrong? > > [yangyi@localhost ~]$ sudo ovs-ofctl -Oopenflow10 add-flow br-int > "table=0,cookie=0x1234,priority=1,icmp,actions=drop" > [yangyi@localhost ~]$ sudo ovs-ofctl -Oopenflow10 dump-flows br-int > > NXST_FLOW reply (xid=0x4): > cookie=0x1234, duration=3.994s, table=0, n_packets=0, n_bytes=0, idle_age=3, > priority=1,icmp actions=drop > [yangyi@localhost ~]$ sudo ovs-ofctl -Oopenflow10 --strict del-flows br-int > "table=0,cookie=0x1234/-1,priority=1" > [yangyi@localhost ~]$ sudo ovs-ofctl -Oopenflow10 dump-flows br-int > > NXST_FLOW reply (xid=0x4): > cookie=0x1234, duration=49.866s, table=0, n_packets=0, n_bytes=0, > idle_age=49, priority=1,icmp actions=drop > [yangyi@localhost ~]$ > > -邮件原件- > 发件人: Ben Pfaff [mailto:b...@ovn.org] > 发送时间: 2019年7月17日 1:04 > 收件人: Yi Yang (杨燚)-云服务集团 > 抄送: ovs-disc...@openvswitch.org; ovs-dev@openvswitch.org > 主题: Re: [ovs-discuss] How can I delete flows which match a given cookie value? > > On Tue, Jul 16, 2019 at 09:35:06AM +, Yi Yang (杨燚)-云服务集团 wrote: > > I need to add and delete flows according to user operations, I know > > openflowplugin in Opendaylight can do this, but it seems “ovs-ofctl > > del-flows” can’t do this way, why can’t cookie value be used to do > > this for “ovs-ofctl del-flows”? > > > > > > > > sudo ovs-ofctl -Oopenflow13 --strict del-flows br-int "table=2,cookie=12345" > > To match on a cookie, specify a mask, e.g. cookie=12345/-1. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH] flow: Avoid unsafe comparison of minimasks.
The following, run inside the OVS sandbox, caused OVS to abort when Address Sanitizer was used: ovs-vsctl add-br br-int ovs-ofctl add-flow br-int "table=0,cookie=0x1234,priority=1,icmp,actions=drop" ovs-ofctl --strict del-flows br-int "table=0,cookie=0x1234/-1,priority=1" Sample report from Address Sanitizer: ==3029==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60343260 at pc 0x7f6b09c2459b bp 0x7ffcb67e7540 sp 0x7ffcb67e6cf0 READ of size 40 at 0x60343260 thread T0 #0 0x7f6b09c2459a (/lib/x86_64-linux-gnu/libasan.so.5+0xb859a) #1 0x565110a748a5 in minimask_equal ../lib/flow.c:3510 #2 0x565110a9ea41 in minimatch_equal ../lib/match.c:1821 #3 0x56511091e864 in collect_rules_strict ../ofproto/ofproto.c:4516 #4 0x56511093d526 in delete_flow_start_strict ../ofproto/ofproto.c:5959 #5 0x56511093d526 in ofproto_flow_mod_start ../ofproto/ofproto.c:7949 #6 0x56511093d77b in handle_flow_mod__ ../ofproto/ofproto.c:6122 #7 0x56511093db71 in handle_flow_mod ../ofproto/ofproto.c:6099 #8 0x5651109407f6 in handle_single_part_openflow ../ofproto/ofproto.c:8406 #9 0x5651109407f6 in handle_openflow ../ofproto/ofproto.c:8587 #10 0x5651109e40da in ofconn_run ../ofproto/connmgr.c:1318 #11 0x5651109e40da in connmgr_run ../ofproto/connmgr.c:355 #12 0x56511092b129 in ofproto_run ../ofproto/ofproto.c:1826 #13 0x5651108f23cd in bridge_run__ ../vswitchd/bridge.c:2965 #14 0x565110904887 in bridge_run ../vswitchd/bridge.c:3023 #15 0x5651108e659c in main ../vswitchd/ovs-vswitchd.c:127 #16 0x7f6b093b709a in __libc_start_main ../csu/libc-start.c:308 #17 0x5651108e9009 in _start (/home/blp/nicira/ovs/_build/vswitchd/ovs-vswitchd+0x11d009) This fixes the problem, which although largely theoretical could crop up with odd implementations of memcmp(), perhaps ones optimized in various "clever" ways. All in all, it seems best to avoid the theoretical problem. Signed-off-by: Ben Pfaff --- lib/flow.c | 19 --- 1 file changed, 16 insertions(+), 3 deletions(-) diff --git a/lib/flow.c b/lib/flow.c index 95da7d4b1803..e54fd2e522e6 100644 --- a/lib/flow.c +++ b/lib/flow.c @@ -1,5 +1,5 @@ /* - * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2017 Nicira, Inc. + * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2017, 2019 Nicira, Inc. * * Licensed under the Apache License, Version 2.0 (the "License"); * you may not use this file except in compliance with the License. @@ -3506,8 +3506,21 @@ minimask_expand(const struct minimask *mask, struct flow_wildcards *wc) bool minimask_equal(const struct minimask *a, const struct minimask *b) { -return !memcmp(a, b, sizeof *a - + MINIFLOW_VALUES_SIZE(miniflow_n_values(>masks))); +/* At first glance, it might seem that this can be reasonably optimized + * into a single memcmp() for the total size of the region. Such an + * optimization will work OK with most implementations of memcmp() that + * proceed from the start of the regions to be compared to the end in + * reasonably sized chunks. However, memcmp() is not required to be + * implemented that way, and an implementation that, for example, compares + * all of the bytes in both regions without early exit when it finds a + * difference, or one that compares, say, 64 bytes at a time, could access + * an unmapped region of memory if minimasks 'a' and 'b' have different + * lengths. By first checking that the maps are the same with the first + * memcmp(), we verify that 'a' and 'b' have the same length and therefore + * ensure that the second memcmp() is safe. */ +return (!memcmp(a, b, sizeof *a) +&& !memcmp(a + 1, b + 1, + MINIFLOW_VALUES_SIZE(miniflow_n_values(>masks; } /* Returns true if at least one bit matched by 'b' is wildcarded by 'a', -- 2.20.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] NEWS: Mention IGMP support for OVN
On Wed, Jul 17, 2019 at 02:15:24PM +0200, Dumitru Ceara wrote: > NEWS update was missed while adding the IGMP code. > > Fixes: 605535f9adf2 ("OVN: Add ovn-northd IGMP support") > Signed-off-by: Dumitru Ceara Thanks! Applied to master. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] Building OVS DPDK debian package
Hi, I am going through OVS 2.11.90 document to build Debian package on Ubuntu 16.04.4. http://docs.openvswitch.org/en/latest/intro/install/distributions/ where it claims to have OVS+DPDK debian package. Snippets below /** 3. For DPDK datapath, Open vSwitch with DPDK support is bundled in the package openvswitch-switch-dpdk. / I used following commands to build packages fakeroot debian/rules clean fakeroot debian/rules binary I got only following debian packages built but no ovs dpdk package. ls -l *deb -rw-r--r-- 1 root root 1142974 Jul 17 16:53 *libopenvswitch_2.11.90-1_amd64.deb* -rw-r--r-- 1 root root 1550512 Jul 17 16:53 *libopenvswitch-dev_2.11.90-1_amd64.deb* -rw-r--r-- 1 root root 163082 Jul 17 16:53 *openvswitch-common_2.11.90-1_amd64.deb* -rw-r--r-- 1 root root 5206104 Jul 17 16:53 *openvswitch-datapath-dkms_2.11.90-1_all.deb* -rw-r--r-- 1 root root 7965756 Jul 17 16:53 *openvswitch-datapath-source_2.11.90-1_all.deb* -rw-r--r-- 1 root root 6058670 Jul 17 16:53 *openvswitch-dbg_2.11.90-1_amd64.deb* -rw-r--r-- 1 root root 34114 Jul 17 16:53 *openvswitch-ipsec_2.11.90-1_amd64.deb* -rw-r--r-- 1 root root 23392 Jul 17 16:53 *openvswitch-pki_2.11.90-1_all.deb* -rw-r--r-- 1 root root 281158 Jul 17 16:53 *openvswitch-switch_2.11.90-1_amd64.deb* -rw-r--r-- 1 root root 37674 Jul 17 16:53 *openvswitch-test_2.11.90-1_all.deb* -rw-r--r-- 1 root root 40500 Jul 17 16:53 *openvswitch-testcontroller_2.11.90-1_amd64.deb* -rw-r--r-- 1 root root 60412 Jul 17 16:53 *openvswitch-vtep_2.11.90-1_amd64.deb* -rw-r--r-- 1 root root 101232 Jul 17 16:53 *ovn-central_2.11.90-1_amd64.deb* -rw-r--r-- 1 root root 364924 Jul 17 16:53 *ovn-common_2.11.90-1_amd64.deb* -rw-r--r-- 1 root root 35944 Jul 17 16:53 *ovn-controller-vtep_2.11.90-1_amd64.deb* -rw-r--r-- 1 root root 27282 Jul 17 16:53 *ovn-docker_2.11.90-1_amd64.deb* -rw-r--r-- 1 root root 103420 Jul 17 16:53 *ovn-host_2.11.90-1_amd64.deb* -rw-r--r-- 1 root root 103372 Jul 17 16:53 *python-openvswitch_2.11.90-1_all.deb* Thanks. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCHv16 2/2] netdev-afxdp: add new netdev type for AF_XDP.
On Fri, Jul 12, 2019 at 04:50:56PM -0700, William Tu wrote: > The patch introduces experimental AF_XDP support for OVS netdev. > AF_XDP, the Address Family of the eXpress Data Path, is a new Linux socket > type built upon the eBPF and XDP technology. It is aims to have comparable > performance to DPDK but cooperate better with existing kernel's networking > stack. An AF_XDP socket receives and sends packets from an eBPF/XDP program > attached to the netdev, by-passing a couple of Linux kernel's subsystems > As a result, AF_XDP socket shows much better performance than AF_PACKET > For more details about AF_XDP, please see linux kernel's > Documentation/networking/af_xdp.rst. Note that by default, this feature is > not compiled in. > > Signed-off-by: William Tu Thanks a lot! I have a few comments. In pmd_perf_stats, is last_tsc always accessed in a thread-local manner or can it be read cross-thread? If it can be read (or updated) cross-thread, then it seems unsafe to me on 32-bit architectures, which might reasonably read or write the two halves of it with separate instructions. If so, then I'd suggest using atomic_uint64_t instead of plain uint64_t. Looking at xmalloc_size_align(), I noticed a couple of things. First, the syntax (runt ? : 0) uses a GCC extension. The portable way to write it is (runt ? runt : 0), or just "runt" since it's a bool. However, I notice that the older code was (runt ? CACHE_LINE_SIZE : 0), and by analogy the newer code should be (runt ? alignment : 0). Are you convinced that the substitute is correct? I also have a couple of trivia: * One of the conditionals in automake.mk isn't needed because LIBBPF_LDADD will be defined to empty if !HAVE_AF_XDP. * In dpif-netdev-perf.h, we don't need the casts because the standard C promotions work fine for us in this case. I'm appending an incremental to address the trivia: diff --git a/lib/automake.mk b/lib/automake.mk index b07bb01c4ef7..7cb1c3373538 100644 --- a/lib/automake.mk +++ b/lib/automake.mk @@ -9,15 +9,12 @@ lib_LTLIBRARIES += lib/libopenvswitch.la lib_libopenvswitch_la_LIBADD = $(SSL_LIBS) lib_libopenvswitch_la_LIBADD += $(CAPNG_LDADD) +lib_libopenvswitch_la_LIBADD += $(LIBBPF_LDADD) if WIN32 lib_libopenvswitch_la_LIBADD += ${PTHREAD_LIBS} endif -if HAVE_AF_XDP -lib_libopenvswitch_la_LIBADD += $(LIBBPF_LDADD) -endif - lib_libopenvswitch_la_LDFLAGS = \ $(OVS_LTINFO) \ -Wl,--version-script=$(top_builddir)/lib/libopenvswitch.sym \ diff --git a/lib/dpif-netdev-perf.h b/lib/dpif-netdev-perf.h index 6b6dfda7db1c..3a5e90b16a5f 100644 --- a/lib/dpif-netdev-perf.h +++ b/lib/dpif-netdev-perf.h @@ -192,15 +192,11 @@ static inline uint64_t rdtsc_syscall(struct pmd_perf_stats *s) { struct timespec val; -uint64_t v; - if (clock_gettime(CLOCK_MONOTONIC_RAW, ) != 0) { return s->last_tsc; } -v = (uint64_t) val.tv_sec * 10LL; -v += (uint64_t) val.tv_nsec; - +uint64_t v = val.tv_sec * UINT64_C(10) + val.tv_nsec; return s->last_tsc = v; } #endif ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v12 0/5] dpcls func ptrs & optimizations
> -Original Message- > From: Ilya Maximets [mailto:i.maxim...@samsung.com] > Sent: Wednesday, July 17, 2019 5:26 PM > To: Van Haaren, Harry ; d...@openvswitch.org > Cc: echau...@redhat.com; malvika.gu...@arm.com; Stokes, Ian > > Subject: Re: [PATCH v12 0/5] dpcls func ptrs & optimizations > >> I performed a few tests with v11 of this patch-set on my usual setup to > >> check > >> performance of the new generic (non-optimized) implementation. The result > is > >> that > >> new generic implementation is ~5% faster for me than current master (it > was > >> 12% > >> for optimized lookup functions) which is good. > >> The code looks OK for me in general. I still don't really like the fact > that > >> dpcls depends on the internal structure of a flowmap, but we, probably, > >> could > >> deal with that while we have a build time assertion. Hope, we'll have > some > >> better > >> implementation with the same level of performance in the future. > > > > > > Thanks for the feedback on performance Ilya - good to see that we're going > > in the right direction performance wise anyway. > > > > I have one item I'd still like to improve in this patchset, and its > regarding > > where the blocks scratch array is being stored. I'll rework and find a > better > > solution than the current, and post that as the lucky patchset v13 :) > > As an idea, I could suggest DEFINE_STATIC_PER_THREAD_DATA inside the dpif- > netdev-lookup-generic.c. Thanks for the suggestion - that would make the blocks scratch pointer available on a per-thread basis, and be initialized to NULL correct? As a result, we would must a (predictable) branch to check if the pointer is NULL, to allocate the required space on first usage of the subtable_lookup? Or is there a better way to initialize it for all threads that call dpcls_lookup()? My concern here is what if one thread creates a subtable (and allocs blocks_scratch for itself), but then the PMD is polled by another thread - which uses its own blocks_scratch which is NULL. Hence, I think the runtime check + alloc is probably the best/safest way, but I'd appreciate your input on the above threading logic Ilya :) ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v12 0/5] dpcls func ptrs & optimizations
On 17.07.2019 19:16, Van Haaren, Harry wrote: >> -Original Message- >> From: Ilya Maximets [mailto:i.maxim...@samsung.com] >> Sent: Wednesday, July 17, 2019 4:03 PM >> To: Van Haaren, Harry ; d...@openvswitch.org >> Cc: echau...@redhat.com; malvika.gu...@arm.com; Stokes, Ian >> >> Subject: Re: [PATCH v12 0/5] dpcls func ptrs & optimizations >> >> On 17.07.2019 16:00, Harry van Haaren wrote: >>> Hey Folks, >>> >>> Here a v12 of the DPCLS Function Pointer patchset, as has been >>> presented at OVS Conf in Nov '18, and discussed on the ML since then. >>> I'm aware of the soft-freeze for 2.12, I feel this patchset has had >>> enough reviews/versions/testing to be merged in 2.12. > > > >>> Harry van Haaren (5): >>> dpif-netdev: Implement function pointers/subtable >>> dpif-netdev: Move dpcls lookup structures to .h >>> dpif-netdev: Split out generic lookup function >>> dpif-netdev: Refactor generic implementation >>> dpif-netdev: Add specialized generic scalar functions >>> >>> NEWS | 4 + >>> lib/automake.mk | 2 + >>> lib/dpif-netdev-lookup-generic.c | 290 +++ >>> lib/dpif-netdev-private.h| 129 ++ >>> lib/dpif-netdev.c| 197 ++--- >>> 5 files changed, 525 insertions(+), 97 deletions(-) >>> create mode 100644 lib/dpif-netdev-lookup-generic.c >>> create mode 100644 lib/dpif-netdev-private.h >>> >> >> >> I performed a few tests with v11 of this patch-set on my usual setup to >> check >> performance of the new generic (non-optimized) implementation. The result is >> that >> new generic implementation is ~5% faster for me than current master (it was >> 12% >> for optimized lookup functions) which is good. >> The code looks OK for me in general. I still don't really like the fact that >> dpcls depends on the internal structure of a flowmap, but we, probably, >> could >> deal with that while we have a build time assertion. Hope, we'll have some >> better >> implementation with the same level of performance in the future. > > > Thanks for the feedback on performance Ilya - good to see that we're going > in the right direction performance wise anyway. > > I have one item I'd still like to improve in this patchset, and its regarding > where the blocks scratch array is being stored. I'll rework and find a better > solution than the current, and post that as the lucky patchset v13 :) As an idea, I could suggest DEFINE_STATIC_PER_THREAD_DATA inside the dpif-netdev-lookup-generic.c. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCHv16 2/2] netdev-afxdp: add new netdev type for AF_XDP.
On Wed, Jul 17, 2019 at 06:41:48PM +0300, Ilya Maximets wrote: > This patch-set has a long history and the latest versions are in a good shape. > netdev-afxdp works stable for me over the virtual interfaces and passes all > the > system tests that it needs to pass. Eelco is on a PTO now, but his last > report [1] was that v14 of this patch worked fine for him in PVP tests. He had > some issues with large packets, but this is, IMO, most probably due to kernel > driver issues. > > So, I'd like to ask if we're going to merge netdev-afxdp to current release? > For me it's totally fine, especially because netdev-afxdp implemented as a > completely stand-alone feature that could be easily disabled. This way more > users will be able to try and test this new feature with a 2.12 release. I'm really glad to hear that! I was hoping that this would manage to make its way into 2.12. Because it is mostly new code, it is unlikely to cause nonobvious regressions, and so it's great to have it out there for people to try. I honestly haven't reviewed this myself because you've been doing such a great job. I've been looking at it for the last few minutes and I have a few minor comments that I'll send along soon. From what I see so far, my suggestions should be easy to incorporate and should not delay applying this. Thanks, Ben. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v12 0/5] dpcls func ptrs & optimizations
> -Original Message- > From: Ilya Maximets [mailto:i.maxim...@samsung.com] > Sent: Wednesday, July 17, 2019 4:03 PM > To: Van Haaren, Harry ; d...@openvswitch.org > Cc: echau...@redhat.com; malvika.gu...@arm.com; Stokes, Ian > > Subject: Re: [PATCH v12 0/5] dpcls func ptrs & optimizations > > On 17.07.2019 16:00, Harry van Haaren wrote: > > Hey Folks, > > > > Here a v12 of the DPCLS Function Pointer patchset, as has been > > presented at OVS Conf in Nov '18, and discussed on the ML since then. > > I'm aware of the soft-freeze for 2.12, I feel this patchset has had > > enough reviews/versions/testing to be merged in 2.12. > > Harry van Haaren (5): > > dpif-netdev: Implement function pointers/subtable > > dpif-netdev: Move dpcls lookup structures to .h > > dpif-netdev: Split out generic lookup function > > dpif-netdev: Refactor generic implementation > > dpif-netdev: Add specialized generic scalar functions > > > > NEWS | 4 + > > lib/automake.mk | 2 + > > lib/dpif-netdev-lookup-generic.c | 290 +++ > > lib/dpif-netdev-private.h| 129 ++ > > lib/dpif-netdev.c| 197 ++--- > > 5 files changed, 525 insertions(+), 97 deletions(-) > > create mode 100644 lib/dpif-netdev-lookup-generic.c > > create mode 100644 lib/dpif-netdev-private.h > > > > > I performed a few tests with v11 of this patch-set on my usual setup to > check > performance of the new generic (non-optimized) implementation. The result is > that > new generic implementation is ~5% faster for me than current master (it was > 12% > for optimized lookup functions) which is good. > The code looks OK for me in general. I still don't really like the fact that > dpcls depends on the internal structure of a flowmap, but we, probably, > could > deal with that while we have a build time assertion. Hope, we'll have some > better > implementation with the same level of performance in the future. Thanks for the feedback on performance Ilya - good to see that we're going in the right direction performance wise anyway. I have one item I'd still like to improve in this patchset, and its regarding where the blocks scratch array is being stored. I'll rework and find a better solution than the current, and post that as the lucky patchset v13 :) Cheers, -Harry ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] ¿Cómo vender por Instagram?
Buen día El cupo del curso está por llenarse y quise aprovechar la oportunidad de hacerte una última invitación: •Nombre: Cómo vender por instagram •Área: Ventas •¿Cuándo?: Jueves 01 de Agosto •Formato: En línea con interacción en vivo. •Instructor: LUIS PRADO •Lugar: ¡Desde tu sala de juntas o escritorio! Sí el tema del curso no se apega a tus necesidades podemos ayudarte con cursos privados personalizados en línea o presenciales. Solicita informes sin compromiso. ¡Esperamos te animes a acompañarnos! En el webinar interactivo participan empresas de todo el país y nuestro instructor es un experto en el tema que puede atender directamente las dudas y comentarios de tu equipo de trabajo. Solicita información respondiendo a este correo con la palabra Instagram, junto con los siguientes datos: Nombre: Correo electrónico: Número telefónico: Números de Atención: (045) 55 15 54 66 30 (045) 55 85 56 72 93 (045) 55 30 16 70 85 Qué tengas un gran día. Saludos. ¿No eres del área de Ventas? Házmelo saber para asegurarme de solo enviarte cursos del área de tu interés. También puedes solicitar el calendario de los próximos meses. Si desea dejar de recibir nuestra promoción favor de responder con la palabra baja o enviar un correo a ba...@innovalearn.net ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCHv16 2/2] netdev-afxdp: add new netdev type for AF_XDP.
On Wed, Jul 17, 2019 at 8:54 AM Ilya Maximets wrote: > > On 17.07.2019 18:47, William Tu wrote: > > On Wed, Jul 17, 2019 at 8:41 AM Ilya Maximets > > wrote: > >> > >> On 13.07.2019 2:50, William Tu wrote: > >>> The patch introduces experimental AF_XDP support for OVS netdev. > >>> AF_XDP, the Address Family of the eXpress Data Path, is a new Linux socket > >>> type built upon the eBPF and XDP technology. It is aims to have > >>> comparable > >>> performance to DPDK but cooperate better with existing kernel's networking > >>> stack. An AF_XDP socket receives and sends packets from an eBPF/XDP > >>> program > >>> attached to the netdev, by-passing a couple of Linux kernel's subsystems > >>> As a result, AF_XDP socket shows much better performance than AF_PACKET > >>> For more details about AF_XDP, please see linux kernel's > >>> Documentation/networking/af_xdp.rst. Note that by default, this feature is > >>> not compiled in. > >>> > >>> Signed-off-by: William Tu > >>> --- > >> > >> Hi Ben and William. > >> > >> This patch-set has a long history and the latest versions are in a good > >> shape. > >> netdev-afxdp works stable for me over the virtual interfaces and passes > >> all the > >> system tests that it needs to pass. Eelco is on a PTO now, but his last > >> report [1] was that v14 of this patch worked fine for him in PVP tests. He > >> had > >> some issues with large packets, but this is, IMO, most probably due to > >> kernel > >> driver issues. > >> > >> So, I'd like to ask if we're going to merge netdev-afxdp to current > >> release? > >> For me it's totally fine, especially because netdev-afxdp implemented as a > >> completely stand-alone feature that could be easily disabled. This way more > >> users will be able to try and test this new feature with a 2.12 release. > >> > >> I made a review with a few style related comments and 2 small issues on a > >> current version. If William will prepare the v17 today or tomorrow, I'll be > >> able to re-check it and apply while we're in a soft-freeze stage. > >> > >> As an option, I could prepare v17 myself if William has no time for that. > >> > >> > >> Ben, William, what do you think? > >> > > Hi Ilya, > > > > Thanks a lot for your help on reviewing all these versions, and also > > Eelco's review and testing. I will prepare v17 today. > > Cool. Please, also add the NEWS entry to the patch about new experimental > netdev type. > OK! Thank you ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCHv16 2/2] netdev-afxdp: add new netdev type for AF_XDP.
On 17.07.2019 18:47, William Tu wrote: > On Wed, Jul 17, 2019 at 8:41 AM Ilya Maximets wrote: >> >> On 13.07.2019 2:50, William Tu wrote: >>> The patch introduces experimental AF_XDP support for OVS netdev. >>> AF_XDP, the Address Family of the eXpress Data Path, is a new Linux socket >>> type built upon the eBPF and XDP technology. It is aims to have comparable >>> performance to DPDK but cooperate better with existing kernel's networking >>> stack. An AF_XDP socket receives and sends packets from an eBPF/XDP program >>> attached to the netdev, by-passing a couple of Linux kernel's subsystems >>> As a result, AF_XDP socket shows much better performance than AF_PACKET >>> For more details about AF_XDP, please see linux kernel's >>> Documentation/networking/af_xdp.rst. Note that by default, this feature is >>> not compiled in. >>> >>> Signed-off-by: William Tu >>> --- >> >> Hi Ben and William. >> >> This patch-set has a long history and the latest versions are in a good >> shape. >> netdev-afxdp works stable for me over the virtual interfaces and passes all >> the >> system tests that it needs to pass. Eelco is on a PTO now, but his last >> report [1] was that v14 of this patch worked fine for him in PVP tests. He >> had >> some issues with large packets, but this is, IMO, most probably due to kernel >> driver issues. >> >> So, I'd like to ask if we're going to merge netdev-afxdp to current release? >> For me it's totally fine, especially because netdev-afxdp implemented as a >> completely stand-alone feature that could be easily disabled. This way more >> users will be able to try and test this new feature with a 2.12 release. >> >> I made a review with a few style related comments and 2 small issues on a >> current version. If William will prepare the v17 today or tomorrow, I'll be >> able to re-check it and apply while we're in a soft-freeze stage. >> >> As an option, I could prepare v17 myself if William has no time for that. >> >> >> Ben, William, what do you think? >> > Hi Ilya, > > Thanks a lot for your help on reviewing all these versions, and also > Eelco's review and testing. I will prepare v17 today. Cool. Please, also add the NEWS entry to the patch about new experimental netdev type. > > Regards, > William > >> >> CC Ian, as this is a big step for a userspace datapath. >> >> [1] https://mail.openvswitch.org/pipermail/ovs-dev/2019-July/360493.html >> >> Best regards, Ilya Maximets. > > ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCHv16 2/2] netdev-afxdp: add new netdev type for AF_XDP.
On Wed, Jul 17, 2019 at 8:41 AM Ilya Maximets wrote: > > On 13.07.2019 2:50, William Tu wrote: > > The patch introduces experimental AF_XDP support for OVS netdev. > > AF_XDP, the Address Family of the eXpress Data Path, is a new Linux socket > > type built upon the eBPF and XDP technology. It is aims to have comparable > > performance to DPDK but cooperate better with existing kernel's networking > > stack. An AF_XDP socket receives and sends packets from an eBPF/XDP program > > attached to the netdev, by-passing a couple of Linux kernel's subsystems > > As a result, AF_XDP socket shows much better performance than AF_PACKET > > For more details about AF_XDP, please see linux kernel's > > Documentation/networking/af_xdp.rst. Note that by default, this feature is > > not compiled in. > > > > Signed-off-by: William Tu > > --- > > Hi Ben and William. > > This patch-set has a long history and the latest versions are in a good shape. > netdev-afxdp works stable for me over the virtual interfaces and passes all > the > system tests that it needs to pass. Eelco is on a PTO now, but his last > report [1] was that v14 of this patch worked fine for him in PVP tests. He had > some issues with large packets, but this is, IMO, most probably due to kernel > driver issues. > > So, I'd like to ask if we're going to merge netdev-afxdp to current release? > For me it's totally fine, especially because netdev-afxdp implemented as a > completely stand-alone feature that could be easily disabled. This way more > users will be able to try and test this new feature with a 2.12 release. > > I made a review with a few style related comments and 2 small issues on a > current version. If William will prepare the v17 today or tomorrow, I'll be > able to re-check it and apply while we're in a soft-freeze stage. > > As an option, I could prepare v17 myself if William has no time for that. > > > Ben, William, what do you think? > Hi Ilya, Thanks a lot for your help on reviewing all these versions, and also Eelco's review and testing. I will prepare v17 today. Regards, William > > CC Ian, as this is a big step for a userspace datapath. > > [1] https://mail.openvswitch.org/pipermail/ovs-dev/2019-July/360493.html > > Best regards, Ilya Maximets. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCHv16 2/2] netdev-afxdp: add new netdev type for AF_XDP.
On 13.07.2019 2:50, William Tu wrote: > The patch introduces experimental AF_XDP support for OVS netdev. > AF_XDP, the Address Family of the eXpress Data Path, is a new Linux socket > type built upon the eBPF and XDP technology. It is aims to have comparable > performance to DPDK but cooperate better with existing kernel's networking > stack. An AF_XDP socket receives and sends packets from an eBPF/XDP program > attached to the netdev, by-passing a couple of Linux kernel's subsystems > As a result, AF_XDP socket shows much better performance than AF_PACKET > For more details about AF_XDP, please see linux kernel's > Documentation/networking/af_xdp.rst. Note that by default, this feature is > not compiled in. > > Signed-off-by: William Tu > --- Hi Ben and William. This patch-set has a long history and the latest versions are in a good shape. netdev-afxdp works stable for me over the virtual interfaces and passes all the system tests that it needs to pass. Eelco is on a PTO now, but his last report [1] was that v14 of this patch worked fine for him in PVP tests. He had some issues with large packets, but this is, IMO, most probably due to kernel driver issues. So, I'd like to ask if we're going to merge netdev-afxdp to current release? For me it's totally fine, especially because netdev-afxdp implemented as a completely stand-alone feature that could be easily disabled. This way more users will be able to try and test this new feature with a 2.12 release. I made a review with a few style related comments and 2 small issues on a current version. If William will prepare the v17 today or tomorrow, I'll be able to re-check it and apply while we're in a soft-freeze stage. As an option, I could prepare v17 myself if William has no time for that. Ben, William, what do you think? CC Ian, as this is a big step for a userspace datapath. [1] https://mail.openvswitch.org/pipermail/ovs-dev/2019-July/360493.html Best regards, Ilya Maximets. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] Fwd: Re: € 2,000,000.00 Euro
Lieber Freund, Ich bin Mr. Richards Wahl des Mega-Gewinners von '533M In Mega Millions Jackpot Ich spende an 5 zufällige Personen, wenn Sie diese E-Mail erhalten, dann wurde Ihre E-Mail nach einem Spinball ausgewählt. Ich habe den größten Teil meines Vermögens an eine Reihe von Wohltätigkeitsorganisationen und Organisationen verteilt. Ich habe mich freiwillig entschieden, Ihnen einen der ausgewählten 5 zu spenden, um meine Gewinne zu überprüfen, siehe meine You Tube Seite unten. WATCH ME HIER: https://www.youtube.com/watch?v=tne02ExNDrw Dies ist Ihr Spendencode: [DF00430342018] Reagieren Sie mit dem Spendencode auf diese E-Mail: natasha.boyce...@gmail.com grüße Herr Richard Wahl ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v12 0/5] dpcls func ptrs & optimizations
On 17.07.2019 16:00, Harry van Haaren wrote: > Hey Folks, > > Here a v12 of the DPCLS Function Pointer patchset, as has been > presented at OVS Conf in Nov '18, and discussed on the ML since then. > I'm aware of the soft-freeze for 2.12, I feel this patchset has had > enough reviews/versions/testing to be merged in 2.12. > > Thanks Ilya and Ian for review comments on v11, they should all be addressed > in this v12. Patchset details below, summary of v11 changes as follows: > - Reworked various minor comments (Capitals, stops, and whitespace) > - Moved variable declarations to patch they're used in > - Improved function documentation on probe() > - Reduced log level from info to debug > - See per patch --- v12 notes for details. > > Given the nearing soft-freeze and branch deadlines, I'd like to see > this get merged - and any future minor comments / improvements can be > handled before release. > > Regards, -Harry > > > Patchset details; > 1) Refactor dpcls_lookup and the subtable for flexibility. > In particular, add a function pointer to the subtable > structure, which enables "plugging-in" a lookup function > at runtime. This enables a number of optimizations in future. > > 2) and 3) > With the function pointer in place, we refactor the existing > dpcls_lookup matching code into its own function, and later its > own file. To split it to its own file requires making various > dpcls data-structures available in dpif-netdev-private.h. > > 4) > Refactor the existing code, to favour compute of flat arrays of > miniflows, instead of the MACRO based iteration. This simplifies > the code itself, and makes future optimizations possible due to > simplified loop structures, and loop trip counts pass in via > function arguments. See commit message for more details. > > 5) > This patch implements a select few specialized functions, for handling > miniflows with 5-1, 4-1, and 4-0 miniflow unit bit patterns. More of > these types of functions can (and should) be added to accelerate other > patterns of subtable lookups! See commit message for more details. > > > Harry van Haaren (5): > dpif-netdev: Implement function pointers/subtable > dpif-netdev: Move dpcls lookup structures to .h > dpif-netdev: Split out generic lookup function > dpif-netdev: Refactor generic implementation > dpif-netdev: Add specialized generic scalar functions > > NEWS | 4 + > lib/automake.mk | 2 + > lib/dpif-netdev-lookup-generic.c | 290 +++ > lib/dpif-netdev-private.h| 129 ++ > lib/dpif-netdev.c| 197 ++--- > 5 files changed, 525 insertions(+), 97 deletions(-) > create mode 100644 lib/dpif-netdev-lookup-generic.c > create mode 100644 lib/dpif-netdev-private.h > I performed a few tests with v11 of this patch-set on my usual setup to check performance of the new generic (non-optimized) implementation. The result is that new generic implementation is ~5% faster for me than current master (it was 12% for optimized lookup functions) which is good. The code looks OK for me in general. I still don't really like the fact that dpcls depends on the internal structure of a flowmap, but we, probably, could deal with that while we have a build time assertion. Hope, we'll have some better implementation with the same level of performance in the future. Best regards, Ilya Maximets. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] OVN resource agent - make promotion synchronous
On Wed, Jul 17, 2019 at 7:41 PM Aaron Conole wrote: > Michele Baldessari writes: > > > On Wed, Jul 17, 2019 at 06:23:56AM -0500, Terry Wilson wrote: > >> Is this just waiting on a couple of line length issues to be fixed? Or > do > >> those really matter? > > > > Hi Terry, > > > > I kind of ignored it because the ovn/utilities/ovndb-servers.ocf file > > has already a bunch of lines > 79 chars. I can still fix it up if that > > is preferred? > > Those shouldn't be 'blocking' anything. It's to bring attention > (warning vs. error). Maybe it hasn't been looked at by Numan / aginwala > yet. CC'd for more attention. > I looked into this patch and acked it :). I think we can ignore the warning for the shell files. Thanks Numan > > > cheers, > > Michele > > > >> > >> On Tue, Jul 9, 2019 at 3:10 AM 0-day Robot wrote: > >> > >> > Bleep bloop. Greetings Michele Baldessari, I am a robot and I have > tried > >> > out your patch. > >> > Thanks for your contribution. > >> > > >> > I encountered some error that I wasn't expecting. See the details > below. > >> > > >> > > >> > checkpatch: > >> > WARNING: Line is 89 characters long (recommended limit is 79) > >> > #50 FILE: ovn/utilities/ovndb-servers.ocf:545: > >> > ocf_log debug "ovndb_servers: Waiting for promotion $host_name as > >> > master to complete" > >> > > >> > WARNING: Line is 82 characters long (recommended limit is 79) > >> > #58 FILE: ovn/utilities/ovndb-servers.ocf:553: > >> > ocf_log debug "ovndb_servers: Promotion of $host_name as the > master > >> > completed" > >> > > >> > Lines checked: 64, Warnings: 2, Errors: 0 > >> > > >> > > >> > Please check this out. If you feel there has been an error, please > email > >> > acon...@bytheb.org > >> > > >> > Thanks, > >> > 0-day Robot > >> > ___ > >> > dev mailing list > >> > d...@openvswitch.org > >> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > >> > > ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] OVN resource agent - make promotion synchronous
Michele Baldessari writes: > On Wed, Jul 17, 2019 at 06:23:56AM -0500, Terry Wilson wrote: >> Is this just waiting on a couple of line length issues to be fixed? Or do >> those really matter? > > Hi Terry, > > I kind of ignored it because the ovn/utilities/ovndb-servers.ocf file > has already a bunch of lines > 79 chars. I can still fix it up if that > is preferred? Those shouldn't be 'blocking' anything. It's to bring attention (warning vs. error). Maybe it hasn't been looked at by Numan / aginwala yet. CC'd for more attention. > cheers, > Michele > >> >> On Tue, Jul 9, 2019 at 3:10 AM 0-day Robot wrote: >> >> > Bleep bloop. Greetings Michele Baldessari, I am a robot and I have tried >> > out your patch. >> > Thanks for your contribution. >> > >> > I encountered some error that I wasn't expecting. See the details below. >> > >> > >> > checkpatch: >> > WARNING: Line is 89 characters long (recommended limit is 79) >> > #50 FILE: ovn/utilities/ovndb-servers.ocf:545: >> > ocf_log debug "ovndb_servers: Waiting for promotion $host_name as >> > master to complete" >> > >> > WARNING: Line is 82 characters long (recommended limit is 79) >> > #58 FILE: ovn/utilities/ovndb-servers.ocf:553: >> > ocf_log debug "ovndb_servers: Promotion of $host_name as the master >> > completed" >> > >> > Lines checked: 64, Warnings: 2, Errors: 0 >> > >> > >> > Please check this out. If you feel there has been an error, please email >> > acon...@bytheb.org >> > >> > Thanks, >> > 0-day Robot >> > ___ >> > dev mailing list >> > d...@openvswitch.org >> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev >> > ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v12 5/5] dpif-netdev: Add specialized generic scalar functions
Bleep bloop. Greetings Harry van Haaren, I am a robot and I have tried out your patch. Thanks for your contribution. I encountered some error that I wasn't expecting. See the details below. checkpatch: ERROR: Inappropriate bracing around statement #112 FILE: lib/dpif-netdev-lookup-generic.c:266: if (!f && u0_bits == U0 && u1_bits == U1) { \ Lines checked: 178, Warnings: 0, Errors: 1 Please check this out. If you feel there has been an error, please email acon...@bytheb.org Thanks, 0-day Robot ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v12 2/5] dpif-netdev: Move dpcls lookup structures to .h
Bleep bloop. Greetings Harry van Haaren, I am a robot and I have tried out your patch. Thanks for your contribution. I encountered some error that I wasn't expecting. See the details below. checkpatch: ERROR: Improper whitespace around control block #129 FILE: lib/dpif-netdev-private.h:91: #define NETDEV_FLOW_KEY_FOR_EACH_IN_FLOWMAP(VALUE, KEY, FLOWMAP) \ ERROR: Inappropriate bracing around statement #130 FILE: lib/dpif-netdev-private.h:92: MINIFLOW_FOR_EACH_IN_FLOWMAP (VALUE, &(KEY)->mf, FLOWMAP) Lines checked: 271, Warnings: 0, Errors: 2 Please check this out. If you feel there has been an error, please email acon...@bytheb.org Thanks, 0-day Robot ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH v12 5/5] dpif-netdev: Add specialized generic scalar functions
This commit adds a number of specialized functions, that handle common miniflow fingerprints. This enables compiler optimization, resulting in higher performance. Below a quick description of how this optimization actually works; "Specialized functions" are "instances" of the generic implementation, but the compiler is given extra context when compiling. In the case of iterating miniflow datastructures, the most interesting value to enable compile time optimizations is the loop trip count per unit. In order to create a specialized function, there is a generic implementation, which uses a for() loop without the compiler knowing the loop trip count at compile time. The loop trip count is passed in as an argument to the function: uint32_t miniflow_impl_generic(struct miniflow *mf, uint32_t loop_count) { for(uint32_t i = 0; i < loop_count; i++) // do work } In order to "specialize" the function, we call the generic implementation with hard-coded numbers - these are compile time constants! uint32_t miniflow_impl_loop5(struct miniflow *mf, uint32_t loop_count) { // use hard coded constant for compile-time constant-propogation return miniflow_impl_generic(mf, 5); } Given the compiler is aware of the loop trip count at compile time, it can perform an optimization known as "constant propogation". Combined with inlining of the miniflow_impl_generic() function, the compiler is now enabled to *compile time* unroll the loop 5x, and produce "flat" code. The last step to using the specialized functions is to utilize a function-pointer to choose the specialized (or generic) implementation. The selection of the function pointer is performed at subtable creation time, when miniflow fingerprint of the subtable is known. This technique is known as "multiple dispatch" in some literature, as it uses multiple items of information (miniflow bit counts) to select the dispatch function. By pointing the function pointer at the optimized implementation, OvS benefits from the compile time optimizations at runtime. Signed-off-by: Harry van Haaren Tested-by: Malvika Gupta --- v12: - Fix typo (Ian) - Fix missing . after comments (Ian) - Improve return value comments for probe function (Ian) - Spaces after number in optimized lookup declarations (Ilya) - Make VLOG level debug instead of info (Ilya) v11: - Use MACROs to declare and check optimized functions (Ilya) - Use captial letter for commit message (Ilya) - Rebase onto latest patchset changes - Added NEWS entry for data-path subtable specialization (Ian/Harry) - Checkpatch notes an "incorrect bracketing" in the MACROs, however I didn't find a solution that it does like. v10: - Rebase changes from previous patches. - Remove "restrict" keyword as windows CI failed, see here for details: https://ci.appveyor.com/project/istokes/ovs-q8fvv/builds/24398228 v8: - Rework to use blocks_cache from the dpcls instance, to avoid variable lenght arrays in the data-path. --- NEWS | 4 +++ lib/dpif-netdev-lookup-generic.c | 51 lib/dpif-netdev-private.h| 8 + lib/dpif-netdev.c| 9 -- 4 files changed, 70 insertions(+), 2 deletions(-) diff --git a/NEWS b/NEWS index 81130e667..4cfffb1bc 100644 --- a/NEWS +++ b/NEWS @@ -34,6 +34,10 @@ Post-v2.11.0 * 'ovs-appctl exit' now implies cleanup of non-internal ports in userspace datapath regardless of '--cleanup' option. Use '--cleanup' to remove internal ports too. + * Datapath classifer code refactored to enable function pointers to select + the lookup implementation at runtime. This enables specialization of + specific subtables based on the miniflow attributes, enhancing the + performance of the subtable search. - OVSDB: * OVSDB clients can now resynchronize with clustered servers much more quickly after a brief disconnection, saving bandwidth and CPU time. diff --git a/lib/dpif-netdev-lookup-generic.c b/lib/dpif-netdev-lookup-generic.c index b38adc2d5..83925cbaf 100644 --- a/lib/dpif-netdev-lookup-generic.c +++ b/lib/dpif-netdev-lookup-generic.c @@ -233,7 +233,58 @@ dpcls_subtable_lookup_generic(struct dpcls_subtable *subtable, const struct netdev_flow_key *keys[], struct dpcls_rule **rules) { +/* Here the runtime subtable->mf_bits counts are used, which forces the + * compiler to iterate normal for() loops. Due to this limitation in the + * compilers available optimizations, this function has lower performance + * than the below specialized functions. + */ return lookup_generic_impl(subtable, blocks_scratch, keys_map, keys, rules, subtable->mf_bits_set_unit0, subtable->mf_bits_set_unit1); } + +/* Expand out specialized functions with U0 and U1 bit attributes. */ +#define
[ovs-dev] [PATCH v12 4/5] dpif-netdev: Refactor generic implementation
This commit refactors the generic implementation. The goal of this refactor is to simplify the code to enable "specialization" of the functions at compile time. Given compile-time optimizations, the compiler is able to unroll loops, and create optimized code sequences due to compile time knowledge of loop-trip counts. In order to enable these compiler optimizations, we must refactor the code to pass the loop-trip counts to functions as compile time constants. This patch allows the number of miniflow-bits set per "unit" in the miniflow to be passed around as a function argument. Note that this patch does NOT yet take advantage of doing so, this is only a refactor to enable it in the next patches. Signed-off-by: Harry van Haaren Tested-by: Malvika Gupta --- v12: - Fix Caps and . (Ilya) - Fixed typos (Ilya) - Added mf_bits and mf_masks in this patch (Ilya) - Fixed rebase conflicts v11: - Rebased to previous changes - Fix typo in commit message (Ian) - Fix variable declaration spacing (Ian) - Remove function names from comments (Ian) - Replace magic 8 with sizeof(uint64_t) (Ian) - Captialize and end comments with a stop. (Ian/Ilya) - Add build time assert to validate FLOWMAP_UNITS (Ilya) - Add note on ALWAYS_INLINE operation - Add space after ULLONG_FOR_EACH_1 (Ilya) - Use hash_add_words64() instead of rolling own loop (Ilya) Note that hash_words64_inline() calls hash_finish() with an fixed value, so it was not the right hash function for this usage. Used hash_add_words64() and manual hash_finish() to re-use as much of hashing code as we can. v10: - Rebase updates from previous patches - Fix whitespace indentation of func params - Removed restrict keyword, Windows CI failing when it is used (Ian) - Fix integer 0 used to set NULL pointer (Ilya) - Postpone free() call on cls->blocks_scratch (Ilya) - Fix indentation of a function v9: - Use count_1bits in favour of __builtin_popcount (Ilya) - Use ALWAYS_INLINE instead of __attribute__ synatx (Ilya) v8: - Rework block_cache and mf_masks to avoid variable-lenght array due to compiler issues. Provisioning for worst case is not a good solution due to magnitue of over-provisioning required. - Rework netdev_flatten function removing unused parameter --- lib/dpif-netdev-lookup-generic.c | 205 ++- lib/dpif-netdev-private.h| 14 ++- lib/dpif-netdev.c| 77 +++- 3 files changed, 261 insertions(+), 35 deletions(-) diff --git a/lib/dpif-netdev-lookup-generic.c b/lib/dpif-netdev-lookup-generic.c index 8064911b3..b38adc2d5 100644 --- a/lib/dpif-netdev-lookup-generic.c +++ b/lib/dpif-netdev-lookup-generic.c @@ -30,58 +30,186 @@ #include "packets.h" #include "pvector.h" -/* Returns a hash value for the bits of 'key' where there are 1-bits in - * 'mask'. */ -static inline uint32_t -netdev_flow_key_hash_in_mask(const struct netdev_flow_key *key, - const struct netdev_flow_key *mask) +VLOG_DEFINE_THIS_MODULE(dpif_lookup_generic); + +/* Lookup functions below depends on the internal structure of flowmap. */ +BUILD_ASSERT_DECL(FLOWMAP_UNITS == 2); + +/* Given a packet, table and mf_masks, this function iterates over each bit + * set in the subtable, and calculates the appropriate metadata to store in the + * blocks_scratch[]. + * + * The results of the blocks_scratch[] can be used for hashing, and later for + * verification of if a rule matches the given packet. + */ +static inline void +netdev_flow_key_flatten_unit(const uint64_t *pkt_blocks, + const uint64_t *tbl_blocks, + const uint64_t *mf_masks, + uint64_t *blocks_scratch, + const uint64_t pkt_mf_bits, + const uint32_t count) { -const uint64_t *p = miniflow_get_values(>mf); -uint32_t hash = 0; -uint64_t value; +uint32_t i; + +for (i = 0; i < count; i++) { +uint64_t mf_mask = mf_masks[i]; +/* Calculate the block index for the packet metadata. */ +uint64_t idx_bits = mf_mask & pkt_mf_bits; +const uint32_t pkt_idx = count_1bits(idx_bits); + +/* Check if the packet has the subtable miniflow bit set. If yes, the + * block at the above pkt_idx will be stored, otherwise it is masked + * out to be zero. + */ +uint64_t pkt_has_mf_bit = (mf_mask + 1) & pkt_mf_bits; +uint64_t no_bit = ((!pkt_has_mf_bit) > 0) - 1; -NETDEV_FLOW_KEY_FOR_EACH_IN_FLOWMAP (value, key, mask->mf.map) { -hash = hash_add64(hash, value & *p); -p++; +/* Mask packet block by table block, and mask to zero if packet + * doesn't actually contain this block of metadata. + */ +blocks_scratch[i] = pkt_blocks[pkt_idx] & tbl_blocks[i] & no_bit; } +} + +/* This function takes a packet, and subtable and writes an array of uint64_t + * blocks. The blocks
[ovs-dev] [PATCH v12 3/5] dpif-netdev: Split out generic lookup function
This commit splits the generic hash-lookup-verify function to its own file, for cleaner seperation between optimized versions. Signed-off-by: Harry van Haaren Tested-by: Malvika Gupta --- v12: - Fix indentation around "keep sparse happy" comment (Ilya) v11: - Rebase fixups from previous patches - Spacing around ULLONG_FOR_EACH_1 (Ilya) v10: - Rebase fixups from previous patch changes v6: - Fixup some checkpatch warnings on whitespace with MACROs (Ilya) - Other MACROs function incorrectly when checkpatch is happy, so using the functional version without space after the ( character. This prints a checkpatch warning, but I see no way to fix it. --- lib/automake.mk | 1 + lib/dpif-netdev-lookup-generic.c | 98 lib/dpif-netdev.c| 69 +- 3 files changed, 100 insertions(+), 68 deletions(-) create mode 100644 lib/dpif-netdev-lookup-generic.c diff --git a/lib/automake.mk b/lib/automake.mk index 6f216efe0..29d3458da 100644 --- a/lib/automake.mk +++ b/lib/automake.mk @@ -78,6 +78,7 @@ lib_libopenvswitch_la_SOURCES = \ lib/dp-packet.h \ lib/dp-packet.c \ lib/dpdk.h \ + lib/dpif-netdev-lookup-generic.c \ lib/dpif-netdev.c \ lib/dpif-netdev.h \ lib/dpif-netdev-private.h \ diff --git a/lib/dpif-netdev-lookup-generic.c b/lib/dpif-netdev-lookup-generic.c new file mode 100644 index 0..8064911b3 --- /dev/null +++ b/lib/dpif-netdev-lookup-generic.c @@ -0,0 +1,98 @@ +/* + * Copyright (c) 2009, 2010, 2011, 2012, 2013, 2014, 2016, 2017 Nicira, Inc. + * Copyright (c) 2019 Intel Corporation. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at: + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#include +#include "dpif-netdev.h" +#include "dpif-netdev-private.h" + +#include "bitmap.h" +#include "cmap.h" + +#include "dp-packet.h" +#include "dpif.h" +#include "dpif-netdev-perf.h" +#include "dpif-provider.h" +#include "flow.h" +#include "packets.h" +#include "pvector.h" + +/* Returns a hash value for the bits of 'key' where there are 1-bits in + * 'mask'. */ +static inline uint32_t +netdev_flow_key_hash_in_mask(const struct netdev_flow_key *key, + const struct netdev_flow_key *mask) +{ +const uint64_t *p = miniflow_get_values(>mf); +uint32_t hash = 0; +uint64_t value; + +NETDEV_FLOW_KEY_FOR_EACH_IN_FLOWMAP (value, key, mask->mf.map) { +hash = hash_add64(hash, value & *p); +p++; +} + +return hash_finish(hash, (p - miniflow_get_values(>mf)) * 8); +} + +uint32_t +dpcls_subtable_lookup_generic(struct dpcls_subtable *subtable, + uint32_t keys_map, + const struct netdev_flow_key *keys[], + struct dpcls_rule **rules) +{ +int i; +uint32_t found_map; + +/* Compute hashes for the remaining keys. Each search-key is + * masked with the subtable's mask to avoid hashing the wildcarded + * bits. */ +uint32_t hashes[NETDEV_MAX_BURST]; +ULLONG_FOR_EACH_1 (i, keys_map) { +hashes[i] = netdev_flow_key_hash_in_mask(keys[i], >mask); +} + +/* Lookup. */ +const struct cmap_node *nodes[NETDEV_MAX_BURST]; +found_map = cmap_find_batch(>rules, keys_map, hashes, nodes); + +/* Check results. When the i-th bit of found_map is set, it means + * that a set of nodes with a matching hash value was found for the + * i-th search-key. Due to possible hash collisions we need to check + * which of the found rules, if any, really matches our masked + * search-key. */ +ULLONG_FOR_EACH_1 (i, found_map) { +struct dpcls_rule *rule; + +CMAP_NODE_FOR_EACH (rule, cmap_node, nodes[i]) { +if (OVS_LIKELY(dpcls_rule_matches_key(rule, keys[i]))) { +rules[i] = rule; +/* Even at 20 Mpps the 32-bit hit_cnt cannot wrap + * within one second optimization interval. */ +subtable->hit_cnt++; +goto next; +} +} + +/* None of the found rules was a match. Reset the i-th bit to + * keep searching this key in the next subtable. */ +ULLONG_SET0(found_map, i); /* Did not match. */ +next: +; /* Keep Sparse happy. */ +} + +return found_map; +} diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 749a478a8..b42ca35e3 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c
[ovs-dev] [PATCH v12 2/5] dpif-netdev: Move dpcls lookup structures to .h
This commit moves some data-structures to be available in the dpif-netdev-private.h header. This allows specific implementations of the subtable lookup function to include just that header file, and not require that the code exists in dpif-netdev.c Signed-off-by: Harry van Haaren Tested-by: Malvika Gupta --- v12: - Fixed Caps (Ilya) - Moved declaration of mf_bits and mf_masks* to future patch (Ilya) v11: - Rebase to latest master - Split netdev private header to own file (Ilya) v10: - Rebase updates from previous patch in code that moved. - Move cmap.h include into alphabetical order (Ian) - Fix comment and typo (Ian) - Restructure function typedef to fit in 80 chars v6: - Fix double * in code comments (Eelco) - Reword comment on lookup_func for clarity (Harry) - Rebase fixups --- lib/automake.mk | 1 + lib/dpif-netdev-private.h | 109 ++ lib/dpif-netdev.c | 68 ++-- 3 files changed, 113 insertions(+), 65 deletions(-) create mode 100644 lib/dpif-netdev-private.h diff --git a/lib/automake.mk b/lib/automake.mk index 1b89cac8c..6f216efe0 100644 --- a/lib/automake.mk +++ b/lib/automake.mk @@ -80,6 +80,7 @@ lib_libopenvswitch_la_SOURCES = \ lib/dpdk.h \ lib/dpif-netdev.c \ lib/dpif-netdev.h \ + lib/dpif-netdev-private.h \ lib/dpif-netdev-perf.c \ lib/dpif-netdev-perf.h \ lib/dpif-provider.h \ diff --git a/lib/dpif-netdev-private.h b/lib/dpif-netdev-private.h new file mode 100644 index 0..555856482 --- /dev/null +++ b/lib/dpif-netdev-private.h @@ -0,0 +1,109 @@ +/* + * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2015 Nicira, Inc. + * Copyright (c) 2019 Intel Corperation. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at: + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#ifndef DPIF_NETDEV_PRIVATE_H +#define DPIF_NETDEV_PRIVATE_H 1 + +#include +#include + +#include "dpif.h" +#include "cmap.h" + +#ifdef __cplusplus +extern "C" { +#endif + +/* Forward declaration for lookup_func typedef. */ +struct dpcls_subtable; +struct dpcls_rule; + +/* Must be public as it is instantiated in subtable struct below. */ +struct netdev_flow_key { +uint32_t hash; /* Hash function differs for different users. */ +uint32_t len;/* Length of the following miniflow (incl. map). */ +struct miniflow mf; +uint64_t buf[FLOW_MAX_PACKET_U64S]; +}; + +/* A rule to be inserted to the classifier. */ +struct dpcls_rule { +struct cmap_node cmap_node; /* Within struct dpcls_subtable 'rules'. */ +struct netdev_flow_key *mask; /* Subtable's mask. */ +struct netdev_flow_key flow; /* Matching key. */ +/* 'flow' must be the last field, additional space is allocated here. */ +}; + +/* Lookup function for a subtable in the dpcls. This function is called + * by each subtable with an array of packets, and a bitmask of packets to + * perform the lookup on. Using a function pointer gives flexibility to + * optimize the lookup function based on subtable properties and the + * CPU instruction set available at runtime. + */ +typedef +uint32_t (*dpcls_subtable_lookup_func)(struct dpcls_subtable *subtable, + uint32_t keys_map, + const struct netdev_flow_key *keys[], + struct dpcls_rule **rules); + +/* Prototype for generic lookup func, using same code path as before. */ +uint32_t +dpcls_subtable_lookup_generic(struct dpcls_subtable *subtable, + uint32_t keys_map, + const struct netdev_flow_key *keys[], + struct dpcls_rule **rules); + +/* A set of rules that all have the same fields wildcarded. */ +struct dpcls_subtable { +/* The fields are only used by writers. */ +struct cmap_node cmap_node OVS_GUARDED; /* Within dpcls 'subtables_map'. */ + +/* These fields are accessed by readers. */ +struct cmap rules; /* Contains "struct dpcls_rule"s. */ +uint32_t hit_cnt;/* Number of match hits in subtable in current +optimization interval. */ + +/* The lookup function to use for this subtable. If there is a known + * property of the subtable (eg: only 3 bits of miniflow metadata is + * used for the lookup) then this can point at an optimized version of + * the lookup function for this particular subtable. */ +
[ovs-dev] [PATCH v12 0/5] dpcls func ptrs & optimizations
Hey Folks, Here a v12 of the DPCLS Function Pointer patchset, as has been presented at OVS Conf in Nov '18, and discussed on the ML since then. I'm aware of the soft-freeze for 2.12, I feel this patchset has had enough reviews/versions/testing to be merged in 2.12. Thanks Ilya and Ian for review comments on v11, they should all be addressed in this v12. Patchset details below, summary of v11 changes as follows: - Reworked various minor comments (Capitals, stops, and whitespace) - Moved variable declarations to patch they're used in - Improved function documentation on probe() - Reduced log level from info to debug - See per patch --- v12 notes for details. Given the nearing soft-freeze and branch deadlines, I'd like to see this get merged - and any future minor comments / improvements can be handled before release. Regards, -Harry Patchset details; 1) Refactor dpcls_lookup and the subtable for flexibility. In particular, add a function pointer to the subtable structure, which enables "plugging-in" a lookup function at runtime. This enables a number of optimizations in future. 2) and 3) With the function pointer in place, we refactor the existing dpcls_lookup matching code into its own function, and later its own file. To split it to its own file requires making various dpcls data-structures available in dpif-netdev-private.h. 4) Refactor the existing code, to favour compute of flat arrays of miniflows, instead of the MACRO based iteration. This simplifies the code itself, and makes future optimizations possible due to simplified loop structures, and loop trip counts pass in via function arguments. See commit message for more details. 5) This patch implements a select few specialized functions, for handling miniflows with 5-1, 4-1, and 4-0 miniflow unit bit patterns. More of these types of functions can (and should) be added to accelerate other patterns of subtable lookups! See commit message for more details. Harry van Haaren (5): dpif-netdev: Implement function pointers/subtable dpif-netdev: Move dpcls lookup structures to .h dpif-netdev: Split out generic lookup function dpif-netdev: Refactor generic implementation dpif-netdev: Add specialized generic scalar functions NEWS | 4 + lib/automake.mk | 2 + lib/dpif-netdev-lookup-generic.c | 290 +++ lib/dpif-netdev-private.h| 129 ++ lib/dpif-netdev.c| 197 ++--- 5 files changed, 525 insertions(+), 97 deletions(-) create mode 100644 lib/dpif-netdev-lookup-generic.c create mode 100644 lib/dpif-netdev-private.h -- 2.17.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH v12 1/5] dpif-netdev: Implement function pointers/subtable
This allows plugging-in of different subtable hash-lookup-verify routines, and allows special casing of those functions based on known context (eg: # of bits set) of the specific subtable. Signed-off-by: Harry van Haaren Tested-by: Malvika Gupta --- v11: - Rebased to latest master - Added space to ULLONG_FOR_EACH_1 (Ilya) - Use capital letter in commit message (Ilya) v10: - Fix capitalization of comments, and punctuation. (Ian) - Variable declarations up top before use (Ian) - Fix alignment of function parameters, had to newline after typedef (Ian) - Some mailing-list questions relpied to on-list (Ian) v9: - Use count_1bits in favour of __builtin_popcount (Ilya) v6: - Implement subtable effort per packet "lookups_match" counter (Ilya) - Remove double newline (Eelco) - Remove double * before comments (Eelco) - Reword comments in dpcls_lookup() for clarity (Harry) --- lib/dpif-netdev.c | 138 -- 1 file changed, 96 insertions(+), 42 deletions(-) diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 6b99a3c44..123f04577 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -7683,6 +7683,28 @@ dpif_dummy_register(enum dummy_level level) /* Datapath Classifier. */ +/* Forward declaration for lookup_func typedef. */ +struct dpcls_subtable; + +/* Lookup function for a subtable in the dpcls. This function is called + * by each subtable with an array of packets, and a bitmask of packets to + * perform the lookup on. Using a function pointer gives flexibility to + * optimize the lookup function based on subtable properties and the + * CPU instruction set available at runtime. + */ +typedef +uint32_t (*dpcls_subtable_lookup_func)(struct dpcls_subtable *subtable, + uint32_t keys_map, + const struct netdev_flow_key *keys[], + struct dpcls_rule **rules); + +/* Prototype for generic lookup func, using same code path as before. */ +uint32_t +dpcls_subtable_lookup_generic(struct dpcls_subtable *subtable, + uint32_t keys_map, + const struct netdev_flow_key *keys[], + struct dpcls_rule **rules); + /* A set of rules that all have the same fields wildcarded. */ struct dpcls_subtable { /* The fields are only used by writers. */ @@ -7692,6 +7714,13 @@ struct dpcls_subtable { struct cmap rules; /* Contains "struct dpcls_rule"s. */ uint32_t hit_cnt;/* Number of match hits in subtable in current optimization interval. */ + +/* The lookup function to use for this subtable. If there is a known + * property of the subtable (eg: only 3 bits of miniflow metadata is + * used for the lookup) then this can point at an optimized version of + * the lookup function for this particular subtable. */ +dpcls_subtable_lookup_func lookup_func; + struct netdev_flow_key mask; /* Wildcards for fields (const). */ /* 'mask' must be the last field, additional space is allocated here. */ }; @@ -7751,6 +7780,10 @@ dpcls_create_subtable(struct dpcls *cls, const struct netdev_flow_key *mask) cmap_init(>rules); subtable->hit_cnt = 0; netdev_flow_key_clone(>mask, mask); + +/* Decide which hash/lookup/verify function to use. */ +subtable->lookup_func = dpcls_subtable_lookup_generic; + cmap_insert(>subtables_map, >cmap_node, mask->hash); /* Add the new subtable at the end of the pvector (with no hits yet) */ pvector_insert(>subtables, subtable, 0); @@ -7911,6 +7944,55 @@ dpcls_rule_matches_key(const struct dpcls_rule *rule, return true; } +uint32_t +dpcls_subtable_lookup_generic(struct dpcls_subtable *subtable, + uint32_t keys_map, + const struct netdev_flow_key *keys[], + struct dpcls_rule **rules) +{ +int i; +uint32_t found_map; + +/* Compute hashes for the remaining keys. Each search-key is + * masked with the subtable's mask to avoid hashing the wildcarded + * bits. */ +uint32_t hashes[NETDEV_MAX_BURST]; +ULLONG_FOR_EACH_1 (i, keys_map) { +hashes[i] = netdev_flow_key_hash_in_mask(keys[i], + >mask); +} + +/* Lookup. */ +const struct cmap_node *nodes[NETDEV_MAX_BURST]; +found_map = cmap_find_batch(>rules, keys_map, hashes, nodes); + +/* Check results. When the i-th bit of found_map is set, it means + * that a set of nodes with a matching hash value was found for the + * i-th search-key. Due to possible hash collisions we need to check + * which of the found rules, if any, really matches our masked + * search-key. */ +ULLONG_FOR_EACH_1 (i, found_map) { +
Re: [ovs-dev] [PATCH] OVN resource agent - make promotion synchronous
On Wed, Jul 17, 2019 at 06:23:56AM -0500, Terry Wilson wrote: > Is this just waiting on a couple of line length issues to be fixed? Or do > those really matter? Hi Terry, I kind of ignored it because the ovn/utilities/ovndb-servers.ocf file has already a bunch of lines > 79 chars. I can still fix it up if that is preferred? cheers, Michele > > On Tue, Jul 9, 2019 at 3:10 AM 0-day Robot wrote: > > > Bleep bloop. Greetings Michele Baldessari, I am a robot and I have tried > > out your patch. > > Thanks for your contribution. > > > > I encountered some error that I wasn't expecting. See the details below. > > > > > > checkpatch: > > WARNING: Line is 89 characters long (recommended limit is 79) > > #50 FILE: ovn/utilities/ovndb-servers.ocf:545: > > ocf_log debug "ovndb_servers: Waiting for promotion $host_name as > > master to complete" > > > > WARNING: Line is 82 characters long (recommended limit is 79) > > #58 FILE: ovn/utilities/ovndb-servers.ocf:553: > > ocf_log debug "ovndb_servers: Promotion of $host_name as the master > > completed" > > > > Lines checked: 64, Warnings: 2, Errors: 0 > > > > > > Please check this out. If you feel there has been an error, please email > > acon...@bytheb.org > > > > Thanks, > > 0-day Robot > > ___ > > dev mailing list > > d...@openvswitch.org > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > -- Michele Baldessari C2A5 9DA3 9961 4FFB E01B D0BC DDD4 DCCB 7515 5C6D ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] flow: Wildcard UDP ports when using SYMMETRIC_L4 hash for select groups.
> > Applied to master, thanks! Thanks Ben. Warm Regards, Vishal Ajmera ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v4 0/3] OVN: Add IGMP support
On Tue, Jul 16, 2019 at 11:49 PM Ben Pfaff wrote: > > On Mon, Jul 15, 2019 at 10:24:26PM +0200, Dumitru Ceara wrote: > > This series introduces support for IGMP Snooping and IGMP Querier. IGMP > > versions v1-v3 are supported for snooping and IGMP queries originated by > > ovn-controller are general IGMPv3 queries. The rationale behind using v3 for > > querier is that it's backward compatible with v1-v2. > > > > The majority of the code is IP version independent with the thought in mind > > that support for MLD snooping for IPv6 will be added next. > > Thanks, applied to master. > > Please send a follow-up patch that adds an appropriate item to NEWS. > Really I should have asked for that as part of the series, but I didn't > want to delay this any further. > > Thanks again! Thanks for applying the series! Sorry, I had forgotten the NEWS entry. I just sent a follow-up for it: https://patchwork.ozlabs.org/patch/1133287/ Thanks, Dumitru ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH] NEWS: Mention IGMP support for OVN
NEWS update was missed while adding the IGMP code. Fixes: 605535f9adf2 ("OVN: Add ovn-northd IGMP support") Signed-off-by: Dumitru Ceara --- NEWS | 1 + 1 file changed, 1 insertion(+) diff --git a/NEWS b/NEWS index 81130e6..806e3c8 100644 --- a/NEWS +++ b/NEWS @@ -53,6 +53,7 @@ Post-v2.11.0 * Support for Transport Zones, a way to separate chassis into logical groups which results in tunnels only been formed between members of the same transport zone(s). + * Support for IGMP Snooping and IGMP Querier. - New QoS type "linux-netem" on Linux. - Added support for TLS Server Name Indication (SNI). - Linux datapath: -- 1.8.3.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v11 5/5] dpif-netdev: Add specialized generic scalar functions
On 7/17/2019 11:29 AM, Van Haaren, Harry wrote: -Original Message- From: Stokes, Ian Sent: Tuesday, July 16, 2019 10:07 PM To: Van Haaren, Harry ; d...@openvswitch.org Cc: echau...@redhat.com; i.maxim...@samsung.com; malvika.gu...@arm.com Subject: Re: [PATCH v11 5/5] dpif-netdev: Add specialized generic scalar functions On 7/15/2019 5:36 PM, Harry van Haaren wrote: This commit adds a number of specialized functions, that handle Thanks for the v11 Harry, some minor comments inline below. Thanks for review! v11: - Use MACROs to declare and check optimized functions (Ilya) - Use captial letter for commit message (Ilya) - Rebase onto latest patchset changes - Added NEWS entry for data-path subtable specialization (Ian/Harry) - Checkpatch notes an "incorrect bracketing" in the MACROs, however I didn't find a solution that it does like. v10: - Rebase changes from previous patches. - Remove "restrict" keyword as windows CI failed, see here for details: https://ci.appveyor.com/project/istokes/ovs-q8fvv/builds/24398228 v8: - Rework to use blocks_cache from the dpcls instance, to avoid variable lenght arrays in the data-path. --- NEWS | 4 +++ lib/dpif-netdev-lookup-generic.c | 51 lib/dpif-netdev-private.h| 8 + lib/dpif-netdev.c| 9 -- 4 files changed, 70 insertions(+), 2 deletions(-) diff --git a/NEWS b/NEWS index 81130e667..4cfffb1bc 100644 --- a/NEWS +++ b/NEWS @@ -34,6 +34,10 @@ Post-v2.11.0 * 'ovs-appctl exit' now implies cleanup of non-internal ports in userspace datapath regardless of '--cleanup' option. Use '--cleanup' to remove internal ports too. + * Datapath classifer code refactored to enable function pointers to select + the lookup implementation at runtime. This enables specialization of + specific subtables based on the miniflow attributes, enhancing the + performance of the subtable search. - OVSDB: * OVSDB clients can now resynchronize with clustered servers much more quickly after a brief disconnection, saving bandwidth and CPU time. diff --git a/lib/dpif-netdev-lookup-generic.c b/lib/dpif-netdev-lookup- generic.c index abd166fc3..259c36645 100644 --- a/lib/dpif-netdev-lookup-generic.c +++ b/lib/dpif-netdev-lookup-generic.c @@ -233,7 +233,58 @@ dpcls_subtable_lookup_generic(struct dpcls_subtable *subtable, const struct netdev_flow_key *keys[], struct dpcls_rule **rules) { +/* Here the runtime subtable->mf_bits counts are used, which forces the + * compiler to iterate normal for() loops. Due to this limitation in the + * compilers available optimizations, this function has lower performance + * than the below specialized functions. + */ return lookup_generic_impl(subtable, blocks_scratch, keys_map, keys, rules, subtable->mf_bits_set_unit0, subtable->mf_bits_set_unit1); } + +/* Expand out specialized functions with U0 and U1 bit attributes */ Minor, missing period at end of comment (can fix this on commit if there are no other revisions). Fixed. +DECLARE_OPTIMIZED_LOOKUP_FUNCTION(5,1) +DECLARE_OPTIMIZED_LOOKUP_FUNCTION(4,1) +DECLARE_OPTIMIZED_LOOKUP_FUNCTION(4,0) + +/* Check if a speicalized function is valid for the required subtable. */ Minor, speicalized -> specialized, again can be fixed on commit otherwise. Fixed. +#define CHECK_LOOKUP_FUNCTION(U0,U1) \ +if (!f && u0_bits == U0 && u1_bits == U1) { \ +f = dpcls_subtable_lookup_mf_u0w##U0##_u1w##U1; \ +} + +/* Probe function to lookup an available specialized function. + * If capable to run the requested miniflow fingerprint, this function returns + * the most optimal implementation for that miniflow fingerprint. + * @retval FunctionAddress A valid function to handle the miniflow bit pattern + * @retval 0 The requested miniflow is not supported here, NULL is returned + */ +dpcls_subtable_lookup_func +dpcls_subtable_generic_probe(uint32_t u0_bits, uint32_t u1_bits) +{ +dpcls_subtable_lookup_func f = NULL; In the comments you return FunctionAddress but you return f below, should this not be FunctionAddress or maybe another variable name if 'FunctionAddress' is a bit unwieldy? Updated return value descriptions as Non-NULL and NULL, and updated comments to read well. This better describes the code than "FunctionAddress". -/* Assign the generic lookup - this works with any miniflow fingerprint. */ -subtable->lookup_func = dpcls_subtable_lookup_generic; +/* Probe for a specialized generic lookup function. */ +subtable->lookup_func = dpcls_subtable_generic_probe(unit0, unit1); + +/* If not set, assign generic lookup. Generic works for any miniflow. */ +if
Re: [ovs-dev] [PATCH] OVN resource agent - make promotion synchronous
Is this just waiting on a couple of line length issues to be fixed? Or do those really matter? On Tue, Jul 9, 2019 at 3:10 AM 0-day Robot wrote: > Bleep bloop. Greetings Michele Baldessari, I am a robot and I have tried > out your patch. > Thanks for your contribution. > > I encountered some error that I wasn't expecting. See the details below. > > > checkpatch: > WARNING: Line is 89 characters long (recommended limit is 79) > #50 FILE: ovn/utilities/ovndb-servers.ocf:545: > ocf_log debug "ovndb_servers: Waiting for promotion $host_name as > master to complete" > > WARNING: Line is 82 characters long (recommended limit is 79) > #58 FILE: ovn/utilities/ovndb-servers.ocf:553: > ocf_log debug "ovndb_servers: Promotion of $host_name as the master > completed" > > Lines checked: 64, Warnings: 2, Errors: 0 > > > Please check this out. If you feel there has been an error, please email > acon...@bytheb.org > > Thanks, > 0-day Robot > ___ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v11 5/5] dpif-netdev: Add specialized generic scalar functions
> -Original Message- > From: Ilya Maximets [mailto:i.maxim...@samsung.com] > Sent: Wednesday, July 17, 2019 11:24 AM > To: Van Haaren, Harry ; d...@openvswitch.org > Cc: echau...@redhat.com; malvika.gu...@arm.com; Stokes, Ian > > Subject: Re: [PATCH v11 5/5] dpif-netdev: Add specialized generic scalar > functions > > On 17.07.2019 13:18, Van Haaren, Harry wrote: > >>> +if (f) { > >>> +VLOG_INFO("Subtable using Generic Optimized for u0 %d, u1 %d\n", > >>> + u0_bits, u1_bits); > >> > >> Can we move this message out to 'dpcls_create_subtable' and log the > >> non-optimized case too? > >> Also, this needs to be DBG as all other subtable related logs are DBG logs. > >> > >> We could just extend the 'Creating subtable' log message to include this > >> information: > >> > >> VLOG_DBG("Creating %"PRIuSIZE". subtable %p for in_port %d with a %s " > >> "lookup function (units: %"PRIu32", %"PRIu32").", > >> cmap_count(>subtables_map), subtable, cls->in_port, > >> (subtable->lookup_func == dpcls_subtable_lookup_generic) > >> ? "generic" : "specialized", unit0, unit1); > > > > > > This might seem a nice idea now, however in future we can plug in other > optimized > > flavors of lookups, and then the dpcls_create_subtable() function won't know > > the details of the implementation. Hence having the log in the > implementation > > is the better solution. > > If there will be other implementations we'll have to change the prototype of > 'dpcls_subtable_generic_probe' and refactor the code around anyway. So, > changing > the log message would be a minor thing. Your version of > 'dpcls_create_subtable' > already highly depends on the implementation of > 'dpcls_subtable_generic_probe'. If we plug in new implementations, they will also have their own probe() function, so there is no need to modify the existing probe prototype. The log message is implementation specific (the name and details of the implementation), so a print at that level (not an abstraction level above it) makes more sense here. I'll leave this as is for v12, but change to DEBUG as suggested. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v11 5/5] dpif-netdev: Add specialized generic scalar functions
> -Original Message- > From: Stokes, Ian > Sent: Tuesday, July 16, 2019 10:07 PM > To: Van Haaren, Harry ; d...@openvswitch.org > Cc: echau...@redhat.com; i.maxim...@samsung.com; malvika.gu...@arm.com > Subject: Re: [PATCH v11 5/5] dpif-netdev: Add specialized generic scalar > functions > > On 7/15/2019 5:36 PM, Harry van Haaren wrote: > > This commit adds a number of specialized functions, that handle > Thanks for the v11 Harry, some minor comments inline below. Thanks for review! > > v11: > > - Use MACROs to declare and check optimized functions (Ilya) > > - Use captial letter for commit message (Ilya) > > - Rebase onto latest patchset changes > > - Added NEWS entry for data-path subtable specialization (Ian/Harry) > > - Checkpatch notes an "incorrect bracketing" in the MACROs, however I > >didn't find a solution that it does like. > > > > v10: > > - Rebase changes from previous patches. > > - Remove "restrict" keyword as windows CI failed, see here for details: > >https://ci.appveyor.com/project/istokes/ovs-q8fvv/builds/24398228 > > > > v8: > > - Rework to use blocks_cache from the dpcls instance, to avoid variable > >lenght arrays in the data-path. > > --- > > NEWS | 4 +++ > > lib/dpif-netdev-lookup-generic.c | 51 > > lib/dpif-netdev-private.h| 8 + > > lib/dpif-netdev.c| 9 -- > > 4 files changed, 70 insertions(+), 2 deletions(-) > > > > diff --git a/NEWS b/NEWS > > index 81130e667..4cfffb1bc 100644 > > --- a/NEWS > > +++ b/NEWS > > @@ -34,6 +34,10 @@ Post-v2.11.0 > >* 'ovs-appctl exit' now implies cleanup of non-internal ports in > userspace > > datapath regardless of '--cleanup' option. Use '--cleanup' to > remove > > internal ports too. > > + * Datapath classifer code refactored to enable function pointers to > select > > + the lookup implementation at runtime. This enables specialization of > > + specific subtables based on the miniflow attributes, enhancing the > > + performance of the subtable search. > > - OVSDB: > >* OVSDB clients can now resynchronize with clustered servers much > more > > quickly after a brief disconnection, saving bandwidth and CPU time. > > diff --git a/lib/dpif-netdev-lookup-generic.c b/lib/dpif-netdev-lookup- > generic.c > > index abd166fc3..259c36645 100644 > > --- a/lib/dpif-netdev-lookup-generic.c > > +++ b/lib/dpif-netdev-lookup-generic.c > > @@ -233,7 +233,58 @@ dpcls_subtable_lookup_generic(struct dpcls_subtable > *subtable, > > const struct netdev_flow_key *keys[], > > struct dpcls_rule **rules) > > { > > +/* Here the runtime subtable->mf_bits counts are used, which forces the > > + * compiler to iterate normal for() loops. Due to this limitation in > the > > + * compilers available optimizations, this function has lower > performance > > + * than the below specialized functions. > > + */ > > return lookup_generic_impl(subtable, blocks_scratch, keys_map, keys, > rules, > > subtable->mf_bits_set_unit0, > > subtable->mf_bits_set_unit1); > > } > > + > > +/* Expand out specialized functions with U0 and U1 bit attributes */ > > Minor, missing period at end of comment (can fix this on commit if there > are no other revisions). Fixed. > > +DECLARE_OPTIMIZED_LOOKUP_FUNCTION(5,1) > > +DECLARE_OPTIMIZED_LOOKUP_FUNCTION(4,1) > > +DECLARE_OPTIMIZED_LOOKUP_FUNCTION(4,0) > > + > > +/* Check if a speicalized function is valid for the required subtable. */ > Minor, speicalized -> specialized, again can be fixed on commit otherwise. Fixed. > > > +#define CHECK_LOOKUP_FUNCTION(U0,U1) > \ > > +if (!f && u0_bits == U0 && u1_bits == U1) { > \ > > +f = dpcls_subtable_lookup_mf_u0w##U0##_u1w##U1; > \ > > +} > > + > > +/* Probe function to lookup an available specialized function. > > + * If capable to run the requested miniflow fingerprint, this function > returns > > + * the most optimal implementation for that miniflow fingerprint. > > + * @retval FunctionAddress A valid function to handle the miniflow bit > pattern > > + * @retval 0 The requested miniflow is not supported here, NULL is returned > > + */ > > +dpcls_subtable_lookup_func > > +dpcls_subtable_generic_probe(uint32_t u0_bits, uint32_t u1_bits) > > +{ > > +dpcls_subtable_lookup_func f = NULL; > > In the comments you return FunctionAddress but you return f below, > should this not be FunctionAddress or maybe another variable name if > 'FunctionAddress' is a bit unwieldy? Updated return value descriptions as Non-NULL and NULL, and updated comments to read well. This better describes the code than "FunctionAddress". > > -/* Assign the generic lookup - this works with any miniflow > fingerprint. */ > > -subtable->lookup_func =
Re: [ovs-dev] [PATCH v11 4/5] dpif-netdev: Refactor generic implementation
> -Original Message- > From: Ilya Maximets [mailto:i.maxim...@samsung.com] > Sent: Wednesday, July 17, 2019 11:14 AM > To: Van Haaren, Harry ; d...@openvswitch.org > Cc: echau...@redhat.com; malvika.gu...@arm.com; Stokes, Ian > > Subject: Re: [PATCH v11 4/5] dpif-netdev: Refactor generic implementation > >> diff --git a/lib/dpif-netdev-lookup-generic.c b/lib/dpif-netdev-lookup- > generic.c > >> index 833abf54f..abd166fc3 100644 > >> --- a/lib/dpif-netdev-lookup-generic.c > >> +++ b/lib/dpif-netdev-lookup-generic.c > >> @@ -30,68 +30,210 @@ > >> #include "packets.h" > >> #include "pvector.h" > >> > >> -/* Returns a hash value for the bits of 'key' where there are 1-bits in > >> - * 'mask'. */ > >> -static inline uint32_t > >> -netdev_flow_key_hash_in_mask(const struct netdev_flow_key *key, > >> - const struct netdev_flow_key *mask) > >> +VLOG_DEFINE_THIS_MODULE(dpif_lookup_generic); > > > > dpif_netdev_lookup_generic? > > > > BTW, this might be not needed if we'll remove logging in the patch #5. Refer to feedback on 5/5 just now - leaving the logging in the implementation is the better solution IMO. > >> +/* Lookup functions below depends on the internal structure of flowmap. */ > >> +BUILD_ASSERT_DECL(FLOWMAP_UNITS == 2); > >> + > >> +/* Given a packet, table and mf_masks, this function iterates over each > bit > >> + * set in the subtable, and calculates the appropriate metadata to store > in the > >> + * blocks_scratch[]. > >> + * > >> + * The results of the blocks_scratch[] can be used for hashing, and later > for > >> + * verification of if a rule matches the given packet. > >> + */ > >> +static inline void > >> +netdev_flow_key_flatten_unit(const uint64_t *pkt_blocks, > >> + const uint64_t *tbl_blocks, > >> + const uint64_t *mf_masks, > >> + uint64_t *blocks_scratch, > >> + const uint64_t pkt_mf_bits, > >> + const uint32_t count) > >> { > >> -const uint64_t *p = miniflow_get_values(>mf); > >> -uint32_t hash = 0; > >> -uint64_t value; > >> +uint32_t i; > >> + > >> +for (i = 0; i < count; i++) { > >> +uint64_t mf_mask = mf_masks[i]; > >> +/* Calculate the block index for the packet metadata. */ > >> +uint64_t idx_bits = mf_mask & pkt_mf_bits; > >> +const uint32_t pkt_idx = count_1bits(idx_bits); > >> + > >> +/* Check if the packet has the subtable miniflow bit set. If yes, > the > >> + * block at the above pkt_idx will be stored, otherwise it is > masked > >> + * out to be zero. > >> + */ > >> +uint64_t pkt_has_mf_bit = (mf_mask + 1) & pkt_mf_bits; > >> +uint64_t no_bit = ((!pkt_has_mf_bit) > 0) - 1; > >> > >> -NETDEV_FLOW_KEY_FOR_EACH_IN_FLOWMAP (value, key, mask->mf.map) { > >> -hash = hash_add64(hash, value & *p); > >> -p++; > >> +/* Mask packet block by table block, and mask to zero if packet > >> + * doesn't actually contain this block of metadata. > >> + */ > >> +blocks_scratch[i] = pkt_blocks[pkt_idx] & tbl_blocks[i] & no_bit; > >> } > >> +} > >> + > >> +/* This function takes a packet, and subtable and writes an array of > uint64_t > >> + * blocks. The blocks contain the metadata that the subtable matches on, > in > >> + * the same order as the subtable, allowing linear iteration over the > blocks. > >> + * > >> + * To calculate the blocks contents, the netdev_flow_key_flatten_unit > function > >> + * is called twice, once for each "unit" of the miniflow. This call can be > >> + * inlined by the compiler for performance. > >> + * > >> + * Note that the u0_count and u1_count variables can be compile-time > constants, > >> + * allowing the loop in the inlined flatten_unit() function to be compile- > time > >> + * unrolled, or possibly removed totally by unrolling by the loop > iterations. > >> + * The compile time optimizations enabled by this design improves > performance. > >> + */ > >> +static inline void > >> +netdev_flow_key_flatten(const struct netdev_flow_key *key, > >> +const struct netdev_flow_key *mask, > >> +const uint64_t *mf_masks, > >> +uint64_t *blocks_scratch, > >> +const uint32_t u0_count, > >> +const uint32_t u1_count) > >> +{ > >> +/* Load mask from subtable, mask with packet mf, popcount to get idx. > */ > >> +const uint64_t *pkt_blocks = miniflow_get_values(>mf); > >> +const uint64_t *tbl_blocks = miniflow_get_values(>mf); > >> > >> -return hash_finish(hash, (p - miniflow_get_values(>mf)) * 8); > >> +/* Packet miniflow bits to be masked by pre-calculated mf_masks. */ > >> +const uint64_t pkt_bits_u0 = key->mf.map.bits[0]; > >> +const uint32_t pkt_bits_u0_pop = count_1bits(pkt_bits_u0); > >> +
Re: [ovs-dev] [PATCH v11 5/5] dpif-netdev: Add specialized generic scalar functions
On 17.07.2019 13:18, Van Haaren, Harry wrote: >> -Original Message- >> From: Ilya Maximets [mailto:i.maxim...@samsung.com] >> Sent: Wednesday, July 17, 2019 10:52 AM >> To: Van Haaren, Harry ; d...@openvswitch.org >> Cc: echau...@redhat.com; malvika.gu...@arm.com; Stokes, Ian >> >> Subject: Re: [PATCH v11 5/5] dpif-netdev: Add specialized generic scalar >> functions > > > >>> +DECLARE_OPTIMIZED_LOOKUP_FUNCTION(5,1) >>> +DECLARE_OPTIMIZED_LOOKUP_FUNCTION(4,1) >>> +DECLARE_OPTIMIZED_LOOKUP_FUNCTION(4,0) >> >> Some spaces needed after the comma. > > Will fix in v12. > >>> +/* Check if a speicalized function is valid for the required subtable. */ >>> +#define CHECK_LOOKUP_FUNCTION(U0,U1) >> \ >>> +if (!f && u0_bits == U0 && u1_bits == U1) { >> \ >>> +f = dpcls_subtable_lookup_mf_u0w##U0##_u1w##U1; >> \ >>> +} >> >> The reason why I initially placed this macro inside the function is that >> 'u0_bits' and 'u1_bits' only makes sense inside the function. >> IMHO, it's better to move this inside, but it's up to you. > > I'd prefer leave outside the function - this seems to be the standard way of > doing MACRO defines in OVS? Eg in dpif-netdev other macros are also outside > the > functions, so I'll follow that method. > > > >>> +dpcls_subtable_lookup_func >>> +dpcls_subtable_generic_probe(uint32_t u0_bits, uint32_t u1_bits) >>> +{ >>> +dpcls_subtable_lookup_func f = NULL; >>> + >>> +CHECK_LOOKUP_FUNCTION(5, 1); >>> +CHECK_LOOKUP_FUNCTION(4, 1); >>> +CHECK_LOOKUP_FUNCTION(4, 0); >>> + >>> +if (f) { >>> +VLOG_INFO("Subtable using Generic Optimized for u0 %d, u1 %d\n", >>> + u0_bits, u1_bits); >> >> Can we move this message out to 'dpcls_create_subtable' and log the >> non-optimized case too? >> Also, this needs to be DBG as all other subtable related logs are DBG logs. >> >> We could just extend the 'Creating subtable' log message to include this >> information: >> >> VLOG_DBG("Creating %"PRIuSIZE". subtable %p for in_port %d with a %s " >> "lookup function (units: %"PRIu32", %"PRIu32").", >> cmap_count(>subtables_map), subtable, cls->in_port, >> (subtable->lookup_func == dpcls_subtable_lookup_generic) >> ? "generic" : "specialized", unit0, unit1); > > > This might seem a nice idea now, however in future we can plug in other > optimized > flavors of lookups, and then the dpcls_create_subtable() function won't know > the details of the implementation. Hence having the log in the implementation > is the better solution. If there will be other implementations we'll have to change the prototype of 'dpcls_subtable_generic_probe' and refactor the code around anyway. So, changing the log message would be a minor thing. Your version of 'dpcls_create_subtable' already highly depends on the implementation of 'dpcls_subtable_generic_probe'. > > Will change to DEBUG instead of INFO, good idea thanks. > > > ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v11 0/5] dpcls func ptrs & optimizations
On 17.07.2019 0:06, Ian Stokes wrote: > On 7/15/2019 5:36 PM, Harry van Haaren wrote: >> Hey Folks, >> >> Here a v11 of the DPCLS Function Pointer patchset, as has been >> presented at OVS Conf in Nov '18, and discussed on the ML since then. >> I'm aware of the soft-freeze for 2.12, I feel this patchset has had >> enough reviews/versions/testing to be merged in 2.12. >> >> Thanks Ilya and Ian for review comments on v10, they should all be addressed >> in this v11. Patchset details below, summary of v11 changes as follows: >> - Reworked to use hash_add_words64(), unfortunatly hash_words64_inline() did >> not provide the same has due to it calling hash_finish() with a different >> "final" value. Refactored to re-use as much as we can of the existing >> code. >> - Reworked specialized functions to use MACROs as suggested in review. This >> makes the specialization of functions much more tidy - good suggestion! >> - Reworked lots of little fixes, Captials and stops. >> - Added NEWS entry in "Userspace Datapath" section on DPCLS function pointers >> and how specialization of subtables minfilows gains search performance. >> - See per patch --- v11 notes for details :) >> >> Regards, -Harry > > Thanks for working on the v11 Harry. > > Patches 1-4 in the series looks ok to me. I have some minor comments on patch > 5. > > I've ran it through the usual validation (travis, appveyor, read the docs > etc). All passing without issue. > > I'm just running some performance tests now again with vsperf. > > For rfc2544 0% traffic loss for ovs flow scalability (phy2phy_scalability) > I'm seeing a 10% increase in performance for 64 byte packets and in the case > of scalability where loss is allowed (phy2phy_scalability_cont) then I see an > 8% increase in performance on these baselines. > > @Ilya, do you have input on these patches or has Harry addressed your > concerns in the latest revision? I've sent some comments for the patches. BTW, I'm planning to test the performance of generic (non-optimized) implementation today to compare with the previous implementation. Best regards, Ilya Maximets. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v11 5/5] dpif-netdev: Add specialized generic scalar functions
> -Original Message- > From: Ilya Maximets [mailto:i.maxim...@samsung.com] > Sent: Wednesday, July 17, 2019 10:52 AM > To: Van Haaren, Harry ; d...@openvswitch.org > Cc: echau...@redhat.com; malvika.gu...@arm.com; Stokes, Ian > > Subject: Re: [PATCH v11 5/5] dpif-netdev: Add specialized generic scalar > functions > > +DECLARE_OPTIMIZED_LOOKUP_FUNCTION(5,1) > > +DECLARE_OPTIMIZED_LOOKUP_FUNCTION(4,1) > > +DECLARE_OPTIMIZED_LOOKUP_FUNCTION(4,0) > > Some spaces needed after the comma. Will fix in v12. > > +/* Check if a speicalized function is valid for the required subtable. */ > > +#define CHECK_LOOKUP_FUNCTION(U0,U1) > \ > > +if (!f && u0_bits == U0 && u1_bits == U1) { > \ > > +f = dpcls_subtable_lookup_mf_u0w##U0##_u1w##U1; > \ > > +} > > The reason why I initially placed this macro inside the function is that > 'u0_bits' and 'u1_bits' only makes sense inside the function. > IMHO, it's better to move this inside, but it's up to you. I'd prefer leave outside the function - this seems to be the standard way of doing MACRO defines in OVS? Eg in dpif-netdev other macros are also outside the functions, so I'll follow that method. > > +dpcls_subtable_lookup_func > > +dpcls_subtable_generic_probe(uint32_t u0_bits, uint32_t u1_bits) > > +{ > > +dpcls_subtable_lookup_func f = NULL; > > + > > +CHECK_LOOKUP_FUNCTION(5, 1); > > +CHECK_LOOKUP_FUNCTION(4, 1); > > +CHECK_LOOKUP_FUNCTION(4, 0); > > + > > +if (f) { > > +VLOG_INFO("Subtable using Generic Optimized for u0 %d, u1 %d\n", > > + u0_bits, u1_bits); > > Can we move this message out to 'dpcls_create_subtable' and log the > non-optimized case too? > Also, this needs to be DBG as all other subtable related logs are DBG logs. > > We could just extend the 'Creating subtable' log message to include this > information: > > VLOG_DBG("Creating %"PRIuSIZE". subtable %p for in_port %d with a %s " > "lookup function (units: %"PRIu32", %"PRIu32").", > cmap_count(>subtables_map), subtable, cls->in_port, > (subtable->lookup_func == dpcls_subtable_lookup_generic) > ? "generic" : "specialized", unit0, unit1); This might seem a nice idea now, however in future we can plug in other optimized flavors of lookups, and then the dpcls_create_subtable() function won't know the details of the implementation. Hence having the log in the implementation is the better solution. Will change to DEBUG instead of INFO, good idea thanks. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v11 4/5] dpif-netdev: Refactor generic implementation
On 17.07.2019 13:05, Ilya Maximets wrote: > On 15.07.2019 19:36, Harry van Haaren wrote: >> This commit refactors the generic implementation. The >> goal of this refactor is to simplify the code to enable >> "specialization" of the functions at compile time. >> >> Given compile-time optimizations, the compiler is able >> to unroll loops, and create optimized code sequences due >> to compile time knowledge of loop-trip counts. >> >> In order to enable these compiler optimizations, we must >> refactor the code to pass the loop-trip counts to functions >> as compile time constants. >> >> This patch allows the number of miniflow-bits set per "unit" >> in the miniflow to be passed around as a function argument. >> >> Note that this patch does NOT yet take advantage of doing so, >> this is only a refactor to enable it in the next patches. >> >> Signed-off-by: Harry van Haaren >> Tested-by: Malvika Gupta >> >> --- >> >> v11: >> - Rebased to previous changes >> - Fix typo in commit message (Ian) >> - Fix variable declaration spacing (Ian) >> - Remove function names from comments (Ian) >> - Replace magic 8 with sizeof(uint64_t) (Ian) >> - Captialize and end comments with a stop. (Ian/Ilya) >> - Add build time assert to validate FLOWMAP_UNITS (Ilya) >> - Add note on ALWAYS_INLINE operation >> - Add space after ULLONG_FOR_EACH_1 (Ilya) >> - Use hash_add_words64() instead of rolling own loop (Ilya) >> Note that hash_words64_inline() calls hash_finish() with an >> fixed value, so it was not the right hash function for this >> usage. Used hash_add_words64() and manual hash_finish() to >> re-use as much of hashing code as we can. >> >> v10: >> - Rebase updates from previous patches >> - Fix whitespace indentation of func params >> - Removed restrict keyword, Windows CI failing when it is used (Ian) >> - Fix integer 0 used to set NULL pointer (Ilya) >> - Postpone free() call on cls->blocks_scratch (Ilya) >> - Fix indentation of a function >> >> v9: >> - Use count_1bits in favour of __builtin_popcount (Ilya) >> - Use ALWAYS_INLINE instead of __attribute__ synatx (Ilya) >> >> v8: >> - Rework block_cache and mf_masks to avoid variable-lenght array >> due to compiler issues. Provisioning for worst case is not a >> good solution due to magnitue of over-provisioning required. >> - Rework netdev_flatten function removing unused parameter >> --- >> lib/dpif-netdev-lookup-generic.c | 214 +-- >> lib/dpif-netdev-private.h| 8 +- >> lib/dpif-netdev.c| 77 ++- >> 3 files changed, 258 insertions(+), 41 deletions(-) >> >> diff --git a/lib/dpif-netdev-lookup-generic.c >> b/lib/dpif-netdev-lookup-generic.c >> index 833abf54f..abd166fc3 100644 >> --- a/lib/dpif-netdev-lookup-generic.c >> +++ b/lib/dpif-netdev-lookup-generic.c >> @@ -30,68 +30,210 @@ >> #include "packets.h" >> #include "pvector.h" >> >> -/* Returns a hash value for the bits of 'key' where there are 1-bits in >> - * 'mask'. */ >> -static inline uint32_t >> -netdev_flow_key_hash_in_mask(const struct netdev_flow_key *key, >> - const struct netdev_flow_key *mask) >> +VLOG_DEFINE_THIS_MODULE(dpif_lookup_generic); > > dpif_netdev_lookup_generic? > > BTW, this might be not needed if we'll remove logging in the patch #5. > >> + >> +/* Lookup functions below depends on the internal structure of flowmap. */ >> +BUILD_ASSERT_DECL(FLOWMAP_UNITS == 2); >> + >> +/* Given a packet, table and mf_masks, this function iterates over each bit >> + * set in the subtable, and calculates the appropriate metadata to store in >> the >> + * blocks_scratch[]. >> + * >> + * The results of the blocks_scratch[] can be used for hashing, and later >> for >> + * verification of if a rule matches the given packet. >> + */ >> +static inline void >> +netdev_flow_key_flatten_unit(const uint64_t *pkt_blocks, >> + const uint64_t *tbl_blocks, >> + const uint64_t *mf_masks, >> + uint64_t *blocks_scratch, >> + const uint64_t pkt_mf_bits, >> + const uint32_t count) >> { >> -const uint64_t *p = miniflow_get_values(>mf); >> -uint32_t hash = 0; >> -uint64_t value; >> +uint32_t i; >> + >> +for (i = 0; i < count; i++) { >> +uint64_t mf_mask = mf_masks[i]; >> +/* Calculate the block index for the packet metadata. */ >> +uint64_t idx_bits = mf_mask & pkt_mf_bits; >> +const uint32_t pkt_idx = count_1bits(idx_bits); >> + >> +/* Check if the packet has the subtable miniflow bit set. If yes, >> the >> + * block at the above pkt_idx will be stored, otherwise it is masked >> + * out to be zero. >> + */ >> +uint64_t pkt_has_mf_bit = (mf_mask + 1) & pkt_mf_bits; >> +uint64_t no_bit = ((!pkt_has_mf_bit) > 0) - 1; >> >> -NETDEV_FLOW_KEY_FOR_EACH_IN_FLOWMAP (value,
Re: [ovs-dev] [PATCH v11 4/5] dpif-netdev: Refactor generic implementation
On 15.07.2019 19:36, Harry van Haaren wrote: > This commit refactors the generic implementation. The > goal of this refactor is to simplify the code to enable > "specialization" of the functions at compile time. > > Given compile-time optimizations, the compiler is able > to unroll loops, and create optimized code sequences due > to compile time knowledge of loop-trip counts. > > In order to enable these compiler optimizations, we must > refactor the code to pass the loop-trip counts to functions > as compile time constants. > > This patch allows the number of miniflow-bits set per "unit" > in the miniflow to be passed around as a function argument. > > Note that this patch does NOT yet take advantage of doing so, > this is only a refactor to enable it in the next patches. > > Signed-off-by: Harry van Haaren > Tested-by: Malvika Gupta > > --- > > v11: > - Rebased to previous changes > - Fix typo in commit message (Ian) > - Fix variable declaration spacing (Ian) > - Remove function names from comments (Ian) > - Replace magic 8 with sizeof(uint64_t) (Ian) > - Captialize and end comments with a stop. (Ian/Ilya) > - Add build time assert to validate FLOWMAP_UNITS (Ilya) > - Add note on ALWAYS_INLINE operation > - Add space after ULLONG_FOR_EACH_1 (Ilya) > - Use hash_add_words64() instead of rolling own loop (Ilya) > Note that hash_words64_inline() calls hash_finish() with an > fixed value, so it was not the right hash function for this > usage. Used hash_add_words64() and manual hash_finish() to > re-use as much of hashing code as we can. > > v10: > - Rebase updates from previous patches > - Fix whitespace indentation of func params > - Removed restrict keyword, Windows CI failing when it is used (Ian) > - Fix integer 0 used to set NULL pointer (Ilya) > - Postpone free() call on cls->blocks_scratch (Ilya) > - Fix indentation of a function > > v9: > - Use count_1bits in favour of __builtin_popcount (Ilya) > - Use ALWAYS_INLINE instead of __attribute__ synatx (Ilya) > > v8: > - Rework block_cache and mf_masks to avoid variable-lenght array > due to compiler issues. Provisioning for worst case is not a > good solution due to magnitue of over-provisioning required. > - Rework netdev_flatten function removing unused parameter > --- > lib/dpif-netdev-lookup-generic.c | 214 +-- > lib/dpif-netdev-private.h| 8 +- > lib/dpif-netdev.c| 77 ++- > 3 files changed, 258 insertions(+), 41 deletions(-) > > diff --git a/lib/dpif-netdev-lookup-generic.c > b/lib/dpif-netdev-lookup-generic.c > index 833abf54f..abd166fc3 100644 > --- a/lib/dpif-netdev-lookup-generic.c > +++ b/lib/dpif-netdev-lookup-generic.c > @@ -30,68 +30,210 @@ > #include "packets.h" > #include "pvector.h" > > -/* Returns a hash value for the bits of 'key' where there are 1-bits in > - * 'mask'. */ > -static inline uint32_t > -netdev_flow_key_hash_in_mask(const struct netdev_flow_key *key, > - const struct netdev_flow_key *mask) > +VLOG_DEFINE_THIS_MODULE(dpif_lookup_generic); dpif_netdev_lookup_generic? BTW, this might be not needed if we'll remove logging in the patch #5. > + > +/* Lookup functions below depends on the internal structure of flowmap. */ > +BUILD_ASSERT_DECL(FLOWMAP_UNITS == 2); > + > +/* Given a packet, table and mf_masks, this function iterates over each bit > + * set in the subtable, and calculates the appropriate metadata to store in > the > + * blocks_scratch[]. > + * > + * The results of the blocks_scratch[] can be used for hashing, and later for > + * verification of if a rule matches the given packet. > + */ > +static inline void > +netdev_flow_key_flatten_unit(const uint64_t *pkt_blocks, > + const uint64_t *tbl_blocks, > + const uint64_t *mf_masks, > + uint64_t *blocks_scratch, > + const uint64_t pkt_mf_bits, > + const uint32_t count) > { > -const uint64_t *p = miniflow_get_values(>mf); > -uint32_t hash = 0; > -uint64_t value; > +uint32_t i; > + > +for (i = 0; i < count; i++) { > +uint64_t mf_mask = mf_masks[i]; > +/* Calculate the block index for the packet metadata. */ > +uint64_t idx_bits = mf_mask & pkt_mf_bits; > +const uint32_t pkt_idx = count_1bits(idx_bits); > + > +/* Check if the packet has the subtable miniflow bit set. If yes, the > + * block at the above pkt_idx will be stored, otherwise it is masked > + * out to be zero. > + */ > +uint64_t pkt_has_mf_bit = (mf_mask + 1) & pkt_mf_bits; > +uint64_t no_bit = ((!pkt_has_mf_bit) > 0) - 1; > > -NETDEV_FLOW_KEY_FOR_EACH_IN_FLOWMAP (value, key, mask->mf.map) { > -hash = hash_add64(hash, value & *p); > -p++; > +/* Mask packet block by table block, and mask to zero if packet > +
Re: [ovs-dev] [PATCH v11 5/5] dpif-netdev: Add specialized generic scalar functions
On 15.07.2019 19:36, Harry van Haaren wrote: > This commit adds a number of specialized functions, that handle > common miniflow fingerprints. This enables compiler optimization, > resulting in higher performance. Below a quick description of > how this optimization actually works; > > "Specialized functions" are "instances" of the generic implementation, > but the compiler is given extra context when compiling. In the case of > iterating miniflow datastructures, the most interesting value to enable > compile time optimizations is the loop trip count per unit. > > In order to create a specialized function, there is a generic implementation, > which uses a for() loop without the compiler knowing the loop trip count at > compile time. The loop trip count is passed in as an argument to the function: > > uint32_t miniflow_impl_generic(struct miniflow *mf, uint32_t loop_count) > { > for(uint32_t i = 0; i < loop_count; i++) > // do work > } > > In order to "specialize" the function, we call the generic implementation > with hard-coded numbers - these are compile time constants! > > uint32_t miniflow_impl_loop5(struct miniflow *mf, uint32_t loop_count) > { > // use hard coded constant for compile-time constant-propogation > return miniflow_impl_generic(mf, 5); > } > > Given the compiler is aware of the loop trip count at compile time, > it can perform an optimization known as "constant propogation". Combined > with inlining of the miniflow_impl_generic() function, the compiler is > now enabled to *compile time* unroll the loop 5x, and produce "flat" code. > > The last step to using the specialized functions is to utilize a > function-pointer to choose the specialized (or generic) implementation. > The selection of the function pointer is performed at subtable creation > time, when miniflow fingerprint of the subtable is known. This technique > is known as "multiple dispatch" in some literature, as it uses multiple > items of information (miniflow bit counts) to select the dispatch function. > > By pointing the function pointer at the optimized implementation, OvS > benefits from the compile time optimizations at runtime. > > Signed-off-by: Harry van Haaren > Tested-by: Malvika Gupta > > --- > > v11: > - Use MACROs to declare and check optimized functions (Ilya) > - Use captial letter for commit message (Ilya) > - Rebase onto latest patchset changes > - Added NEWS entry for data-path subtable specialization (Ian/Harry) > - Checkpatch notes an "incorrect bracketing" in the MACROs, however I > didn't find a solution that it does like. > > v10: > - Rebase changes from previous patches. > - Remove "restrict" keyword as windows CI failed, see here for details: > https://ci.appveyor.com/project/istokes/ovs-q8fvv/builds/24398228 > > v8: > - Rework to use blocks_cache from the dpcls instance, to avoid variable > lenght arrays in the data-path. > --- > NEWS | 4 +++ > lib/dpif-netdev-lookup-generic.c | 51 > lib/dpif-netdev-private.h| 8 + > lib/dpif-netdev.c| 9 -- > 4 files changed, 70 insertions(+), 2 deletions(-) > > diff --git a/NEWS b/NEWS > index 81130e667..4cfffb1bc 100644 > --- a/NEWS > +++ b/NEWS > @@ -34,6 +34,10 @@ Post-v2.11.0 > * 'ovs-appctl exit' now implies cleanup of non-internal ports in > userspace > datapath regardless of '--cleanup' option. Use '--cleanup' to remove > internal ports too. > + * Datapath classifer code refactored to enable function pointers to > select > + the lookup implementation at runtime. This enables specialization of > + specific subtables based on the miniflow attributes, enhancing the > + performance of the subtable search. > - OVSDB: > * OVSDB clients can now resynchronize with clustered servers much more > quickly after a brief disconnection, saving bandwidth and CPU time. > diff --git a/lib/dpif-netdev-lookup-generic.c > b/lib/dpif-netdev-lookup-generic.c > index abd166fc3..259c36645 100644 > --- a/lib/dpif-netdev-lookup-generic.c > +++ b/lib/dpif-netdev-lookup-generic.c > @@ -233,7 +233,58 @@ dpcls_subtable_lookup_generic(struct dpcls_subtable > *subtable, >const struct netdev_flow_key *keys[], >struct dpcls_rule **rules) > { > +/* Here the runtime subtable->mf_bits counts are used, which forces the > + * compiler to iterate normal for() loops. Due to this limitation in the > + * compilers available optimizations, this function has lower performance > + * than the below specialized functions. > + */ > return lookup_generic_impl(subtable, blocks_scratch, keys_map, keys, > rules, > subtable->mf_bits_set_unit0, > subtable->mf_bits_set_unit1); > } > + > +/* Expand out specialized functions with U0 and U1 bit attributes */ >
Re: [ovs-dev] [PATCH] odp-util: Fix NSH mask parsing.
On 15.07.2019 20:41, William Tu wrote: > On Mon, Jul 15, 2019 at 6:58 AM Ilya Maximets wrote: >> >> 'odp_flow_key_to_flow__' is used to parse both keys and masks. >> However, 'odp_nsh_key_from_attr' expects only 'key' as an argument >> and fails to parse masks with ODP_FIT_ERROR which causes userspace >> system tests failures: >> >> # make check-system-userspace TESTSUITEFLAGS='-k nsh' >> >> |odp_util(revalidator)|WARN| >> OVS_NSH_KEY_ATTR_MD1 present but declared mdtype 0 is not 1 (NSH_M_TYPE1) >> >> |odp_util(revalidator)|WARN| >> the flow mask in error is: >> <...> nsh(flags=0ttl=0,mdtype=0,np=255,spi=0x0,si=0), >> for the following flow key: >> <...> nsh_ttl=8,nsh_mdtype=1,nsh_np=3,nsh_spi=0x64,nsh_si=3, >> nsh_c1=0x1020304,nsh_c2=0x5060708,nsh_c3=0x90a0b0c,nsh_c4=0xd0e0f10 >> <...> >> >> Fix that by passing the additional argument 'is_mask' to make it be >> like all other parsing functions. >> >> Additionally fixed missing comma in the 'format_nsh_key'. >> >> CC: Yi Yang >> Fixes: f59cb331c481 ("nsh: rework NSH netlink keys and actions") >> Signed-off-by: Ilya Maximets >> --- > > LGTM, thanks for the fix. > Acked-by: William Tu Thanks! Applied to master. Best regards, Ilya Maximets. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev