[ovs-dev] [PATCH 1/2] dpif-netlink: Fix a bug that causes duplicate key error in datapath
Kmod tests 122 and 123 failed, and the kernel reported a "Duplicate key of
type 6" error. Further debugging revealed that nl_attr_find__() should start
looking for OVS_KEY_ATTR_ETHERTYPE from the offset returned by a previously
called nl_msg_start_nested(). This patch fixes it.

Tests 122 and 123 were skipped on kernel 4.15 and older versions. Kernel 4.16
and later kernels show this failure.

Signed-off-by: Yifeng Sun
---
 lib/dpif-netlink.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
index dfa9d9199a73..e23a35da4f4e 100644
--- a/lib/dpif-netlink.c
+++ b/lib/dpif-netlink.c
@@ -3925,7 +3925,7 @@ put_exclude_packet_type(struct ofpbuf *buf, uint16_t type,
     ovs_be16 pt = pt_ns_type_be(nl_attr_get_be32(packet_type));
     const struct nlattr *nla;
-    nla = nl_attr_find(buf, NLA_HDRLEN, OVS_KEY_ATTR_ETHERTYPE);
+    nla = nl_attr_find(buf, ofs + NLA_HDRLEN, OVS_KEY_ATTR_ETHERTYPE);
     if (nla) {
         ovs_be16 *ethertype;
-- 
2.7.4

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
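The effect of the starting offset can be sketched with a simplified model of a netlink-style attribute buffer. This is only an illustration, not the OVS implementation: `attr_hdr`, `put_attr()`, `start_nested()`, `end_nested()` and `find_attr()` are hypothetical stand-ins for the real nlattr helpers, and the layout (a first nested block that already holds an attribute of type 6, followed by the block being built) is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HDRLEN 4u                      /* Stand-in for NLA_HDRLEN. */
#define ALIGN4(n) (((n) + 3u) & ~3u)   /* Attributes are padded to 4 bytes. */

/* Hypothetical attribute header: total length (header + payload) and type. */
struct attr_hdr {
    uint16_t len;
    uint16_t type;
};

/* Appends one attribute to 'buf' and returns the offset where it starts. */
static size_t
put_attr(uint8_t *buf, size_t *size, uint16_t type,
         const void *payload, uint16_t payload_len)
{
    size_t ofs = *size;
    struct attr_hdr h = { (uint16_t) (HDRLEN + payload_len), type };

    memcpy(buf + ofs, &h, sizeof h);
    if (payload_len) {
        memcpy(buf + ofs + HDRLEN, payload, payload_len);
    }
    *size = ofs + ALIGN4(HDRLEN + payload_len);
    return ofs;
}

/* Opens a nested attribute; the returned offset is what the fix adds in. */
static size_t
start_nested(uint8_t *buf, size_t *size, uint16_t type)
{
    return put_attr(buf, size, type, NULL, 0);
}

/* Closes the nested attribute that starts at 'ofs' by fixing up its length. */
static void
end_nested(uint8_t *buf, size_t size, size_t ofs)
{
    struct attr_hdr h;

    assert(size >= ofs + HDRLEN);
    memcpy(&h, buf + ofs, sizeof h);
    h.len = (uint16_t) (size - ofs);
    memcpy(buf + ofs, &h, sizeof h);
}

/* Walks attributes from 'start' and returns the offset of the first one of
 * the given type, or -1 if none is found. */
static long
find_attr(const uint8_t *buf, size_t size, size_t start, uint16_t type)
{
    size_t ofs = start;

    while (ofs + HDRLEN <= size) {
        struct attr_hdr h;

        memcpy(&h, buf + ofs, sizeof h);
        if (h.len < HDRLEN || ofs + h.len > size) {
            break;                     /* Malformed or truncated attribute. */
        }
        if (h.type == type) {
            return (long) ofs;
        }
        ofs += ALIGN4(h.len);
    }
    return -1;
}
```

In this model, a search that starts at HDRLEN lands on the attribute inside the first nested block, while a search that starts at the nested offset plus HDRLEN lands on the attribute in the block currently being built, which is the one the patch wants to modify.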
[ovs-dev] [PATCH 2/2] test: Fix failed test "flow resume with geneve tun_metadata"
Test "flow resume with geneve tun_metadata" failed because there is
no controller running to send controller(pause) message. A previous
commit deleted the line that starts ovs-ofctl as a controller in
order to avoid a race condition on monitor log. This patch adds
back this line but omits the log file because this test doesn't
dpends on the log file.

Fixes: e8833217914f9c071c49 ("system-traffic.at: avoid a race condition on monitor log")
CC: David Marchand
Signed-off-by: Yifeng Sun
---
 tests/system-traffic.at | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tests/system-traffic.at b/tests/system-traffic.at
index ffe508dd61f7..84c2af4170a3 100644
--- a/tests/system-traffic.at
+++ b/tests/system-traffic.at
@@ -538,6 +538,8 @@ OVS_CHECK_GENEVE()
 OVS_TRAFFIC_VSWITCHD_START()
 ADD_BR([br-underlay])
 
+AT_CHECK([ovs-ofctl monitor br0 resume --detach --no-chdir --pidfile 2> /dev/null])
+
 ADD_NAMESPACES(at_ns0)
 
 dnl Set up underlay link from host into the namespace using veth pair.
@@ -567,6 +569,7 @@ NS_CHECK_EXEC([at_ns0], [ping -q -c 3 10.1.1.100 | FORMAT_PING], [0], [dnl
 3 packets transmitted, 3 received, 0% packet loss, time 0ms
 ])
 
+OVS_APP_EXIT_AND_WAIT([ovs-ofctl])
 OVS_TRAFFIC_VSWITCHD_STOP
 AT_CLEANUP
-- 
2.7.4
Re: [ovs-dev] [PATCH 2/2] test: Fix failed test "flow resume with geneve tun_metadata"
Thanks Darrell, I will send a new version tomorrow.

On Thu, Jan 31, 2019 at 5:19 PM Darrell Ball wrote:

> Minor comments about the commit message Yifeng
>
> On Thu, Jan 31, 2019 at 3:10 PM Yifeng Sun wrote:
>
>> Test "flow resume with geneve tun_metadata" failed because there is
>> no controller running to send controller(pause) message.
>
> The controller receives a 'pause' related to the continuation.
> The controller sends back a 'resume' related to the continuation.
>
> s/no controller running to send controller(pause) message./no controller
> running to handle the continuation./
>
>> A previous
>> commit deleted the line that starts ovs-ofctl as a controller in
>> order to avoid a race condition on monitor log. This patch adds
>> back this line but omits the log file because this test doesn't
>> dpends on the log file.
>
> s/dpends/depend/
>
>> Fixes: e8833217914f9c071c49 ("system-traffic.at: avoid a race condition on monitor log")
>> CC: David Marchand
>> Signed-off-by: Yifeng Sun
>> ---
>>  tests/system-traffic.at | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/tests/system-traffic.at b/tests/system-traffic.at
>> index ffe508dd61f7..84c2af4170a3 100644
>> --- a/tests/system-traffic.at
>> +++ b/tests/system-traffic.at
>> @@ -538,6 +538,8 @@ OVS_CHECK_GENEVE()
>>  OVS_TRAFFIC_VSWITCHD_START()
>>  ADD_BR([br-underlay])
>>
>> +AT_CHECK([ovs-ofctl monitor br0 resume --detach --no-chdir --pidfile 2> /dev/null])
>> +
>>  ADD_NAMESPACES(at_ns0)
>>
>>  dnl Set up underlay link from host into the namespace using veth pair.
>> @@ -567,6 +569,7 @@ NS_CHECK_EXEC([at_ns0], [ping -q -c 3 10.1.1.100 | FORMAT_PING], [0], [dnl
>>  3 packets transmitted, 3 received, 0% packet loss, time 0ms
>>  ])
>>
>> +OVS_APP_EXIT_AND_WAIT([ovs-ofctl])
>>  OVS_TRAFFIC_VSWITCHD_STOP
>>  AT_CLEANUP
>>
>> --
>> 2.7.4
[ovs-dev] [PATCH v2] test: Fix failed test "flow resume with geneve tun_metadata"
Test "flow resume with geneve tun_metadata" failed because there is
no controller running to handle the continuation message. A previous
commit deleted the line that starts ovs-ofctl as a controller in
order to avoid a race condition on monitor log. This patch adds
back this line but omits the log file because this test doesn't
depend on the log file.

Fixes: e8833217914f9c071c49 ("system-traffic.at: avoid a race condition on monitor log")
CC: David Marchand
Signed-off-by: Yifeng Sun
---
v1->v2: Fixed commit message by Darrell's suggestion, thanks Darrell!

 tests/system-traffic.at | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tests/system-traffic.at b/tests/system-traffic.at
index ffe508dd61f7..84c2af4170a3 100644
--- a/tests/system-traffic.at
+++ b/tests/system-traffic.at
@@ -538,6 +538,8 @@ OVS_CHECK_GENEVE()
 OVS_TRAFFIC_VSWITCHD_START()
 ADD_BR([br-underlay])
 
+AT_CHECK([ovs-ofctl monitor br0 resume --detach --no-chdir --pidfile 2> /dev/null])
+
 ADD_NAMESPACES(at_ns0)
 
 dnl Set up underlay link from host into the namespace using veth pair.
@@ -567,6 +569,7 @@ NS_CHECK_EXEC([at_ns0], [ping -q -c 3 10.1.1.100 | FORMAT_PING], [0], [dnl
 3 packets transmitted, 3 received, 0% packet loss, time 0ms
 ])
 
+OVS_APP_EXIT_AND_WAIT([ovs-ofctl])
 OVS_TRAFFIC_VSWITCHD_STOP
 AT_CLEANUP
-- 
2.7.4
[ovs-dev] [PATCH] odp-util: Stop parse odp actions if nlattr is overflow
`encap = nl_msg_start_nested(key, OVS_KEY_ATTR_ENCAP)` ensures that
key->size >= (encap + NLA_HDRLEN), so the `if` statement is safe.

Reported-at: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=11306
Signed-off-by: Yifeng Sun
---
 lib/odp-util.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/lib/odp-util.c b/lib/odp-util.c
index 778c00ee8876..482a0be2f9d7 100644
--- a/lib/odp-util.c
+++ b/lib/odp-util.c
@@ -5599,6 +5599,10 @@ parse_odp_key_mask_attr(struct parse_odp_context *context, const char *s,
                 context->depth--;
                 return retval;
             }
+
+            if (nl_attr_oversized(key->size - encap - NLA_HDRLEN)) {
+                return -E2BIG;
+            }
             s += retval;
         }
         s++;
-- 
2.7.4
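The size check can be sketched in isolation. The length field of a netlink attribute is 16 bits wide and covers the header as well as the payload, so a nested attribute whose payload no longer fits must be rejected before its length is written back. The helper names `payload_oversized()` and `check_nested_size()` below are hypothetical, and the bound is an assumption modelled on that 16-bit limit rather than a copy of the OVS code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define HDRLEN 4u   /* Stand-in for NLA_HDRLEN. */

/* True if a payload of this size cannot be described by a 16-bit length
 * field that also has to account for the attribute header. */
static bool
payload_oversized(size_t payload_size)
{
    return payload_size > UINT16_MAX - HDRLEN;
}

/* 'buf_size' is the current buffer size and 'start_ofs' the offset returned
 * when the nested attribute was opened.  Opening the attribute appended a
 * header, so buf_size >= start_ofs + HDRLEN and the subtraction below cannot
 * wrap around.  Returns 0 if the nested payload still fits, -E2BIG if not. */
static int
check_nested_size(size_t buf_size, size_t start_ofs)
{
    assert(buf_size >= start_ofs + HDRLEN);
    return payload_oversized(buf_size - start_ofs - HDRLEN) ? -E2BIG : 0;
}
```

The assertion mirrors the argument in the commit message: the subtraction is only safe because the header of the nested attribute is already in the buffer.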
[ovs-dev] [PATCH] ofp-actions: Set an action depth limit to prevent stackoverflow by ofpacts_parse
Reported-at: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=12557
Signed-off-by: Yifeng Sun
---
 include/openvswitch/ofp-actions.h | 4 ++++
 lib/ofp-actions.c                 | 5 +++++
 2 files changed, 9 insertions(+)

diff --git a/include/openvswitch/ofp-actions.h b/include/openvswitch/ofp-actions.h
index caaa37c05a1d..14c5eab74bd3 100644
--- a/include/openvswitch/ofp-actions.h
+++ b/include/openvswitch/ofp-actions.h
@@ -1175,7 +1175,11 @@ struct ofpact_parse_params {
     /* Output. */
     struct ofpbuf *ofpacts;
     enum ofputil_protocol *usable_protocols;
+
+    /* Parse context. */
+    unsigned int depth;
 };
+#define MAX_OFPACT_PARSE_DEPTH 100
 char *ofpacts_parse_actions(const char *, const struct ofpact_parse_params *)
     OVS_WARN_UNUSED_RESULT;
 char *ofpacts_parse_instructions(const char *,
diff --git a/lib/ofp-actions.c b/lib/ofp-actions.c
index f76db6c0f948..6f175186498d 100644
--- a/lib/ofp-actions.c
+++ b/lib/ofp-actions.c
@@ -9062,11 +9062,16 @@ static char * OVS_WARN_UNUSED_RESULT
 ofpacts_parse(char *str, const struct ofpact_parse_params *pp,
               bool allow_instructions, enum ofpact_type outer_action)
 {
+    if (pp->depth >= MAX_OFPACT_PARSE_DEPTH) {
+        return xstrdup("Action nested too deeply");
+    }
+    CONST_CAST(struct ofpact_parse_params *, pp)->depth++;
     uint32_t orig_size = pp->ofpacts->size;
     char *error = ofpacts_parse__(str, pp, allow_instructions, outer_action);
     if (error) {
         pp->ofpacts->size = orig_size;
     }
+    CONST_CAST(struct ofpact_parse_params *, pp)->depth--;
     return error;
 }
-- 
2.7.4
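The depth-limit pattern in this patch can be shown on a toy recursive parser. The sketch below is not the ofpacts parser: the grammar (nested parenthesised groups), `parse_ctx`, `parse_group()` and `parse_string()` are all hypothetical, and only the shape of the guard (check the counter on entry, increment, recurse, decrement on every exit path) follows the patch.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PARSE_DEPTH 100   /* Same bound as MAX_OFPACT_PARSE_DEPTH. */

/* Hypothetical parse context carrying the current recursion depth. */
struct parse_ctx {
    unsigned int depth;
};

/* Parses zero or more "(...)" groups, each of which may contain further
 * groups.  Returns NULL on success or a static error string. */
static const char *
parse_group(const char **sp, struct parse_ctx *ctx)
{
    const char *error = NULL;

    if (ctx->depth >= MAX_PARSE_DEPTH) {
        return "nested too deeply";    /* Refuse before using more stack. */
    }
    ctx->depth++;

    while (**sp == '(') {
        (*sp)++;
        error = parse_group(sp, ctx);
        if (error) {
            break;
        }
        if (**sp != ')') {
            error = "missing ')'";
            break;
        }
        (*sp)++;
    }

    ctx->depth--;                      /* Paired with the increment above. */
    return error;
}

/* Parses a whole string; any leftover input is reported as an error. */
static const char *
parse_string(const char *s)
{
    struct parse_ctx ctx = { 0 };
    const char *error = parse_group(&s, &ctx);

    assert(ctx.depth == 0);
    if (!error && *s != '\0') {
        error = "trailing characters";
    }
    return error;
}
```

With this guard, input nested 99 levels deep is accepted and input nested 100 levels deep is rejected with an error instead of exhausting the stack, which is the class of failure the fuzzer report describes.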
Re: [ovs-dev] [PATCH v2] test: Fix failed test "flow resume with geneve tun_metadata"
Thanks Ben, please backport to 2.11 if possible.

On Mon, Feb 4, 2019 at 1:36 PM Ben Pfaff wrote:

> On Fri, Feb 01, 2019 at 11:11:45AM -0800, Yi-Hung Wei wrote:
> > On Fri, Feb 1, 2019 at 10:02 AM Yifeng Sun wrote:
> > >
> > > Test "flow resume with geneve tun_metadata" failed because there is
> > > no controller running to handle the continuation message. A previous
> > > commit deleted the line that starts ovs-ofctl as a controller in
> > > order to avoid a race condition on monitor log. This patch adds
> > > back this line but omits the log file because this test doesn't
> > > depend on the log file.
> > >
> > > Fixes: e8833217914f9c071c49 ("system-traffic.at: avoid a race condition on monitor log")
> > > CC: David Marchand
> > > Signed-off-by: Yifeng Sun
> > > ---
> > Thanks for the fix.
> >
> > Acked-by: Yi-Hung Wei
>
> I applied this to master. If it should be backported, let me know.
Re: [ovs-dev] [PATCH v1] ofproto-dpif-trace: Fix for the segmentation fault in ofproto_trace().
Thanks for the fix. I am wondering if we can output some useful
information in 'struct ds' for this case?

On Mon, Feb 4, 2019 at 3:45 PM Ashish Varma wrote:

> Added the check for NULL in "next_ct_states" argument passed to the
> "ofproto_trace()" function. Under normal scenario, this is non-NULL. A NULL
> "next_ct_states" argument is passed from the "upcall_xlate()" function on
> encountering XLATE_RECURSION_TOO_DEEP or XLATE_TOO_MANY_RESUBMITS error.
>
> VMware-BZ: #2282287
> Signed-off-by: Ashish Varma
> ---
>  ofproto/ofproto-dpif-trace.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/ofproto/ofproto-dpif-trace.c b/ofproto/ofproto-dpif-trace.c
> index eca61ce..4a981e1 100644
> --- a/ofproto/ofproto-dpif-trace.c
> +++ b/ofproto/ofproto-dpif-trace.c
> @@ -740,7 +740,7 @@ ofproto_trace(struct ofproto_dpif *ofproto, const struct flow *flow,
>          ds_put_format(output, "\nrecirc(%#"PRIx32")",
>                        recirc_node->recirc_id);
>
> -        if (recirc_node->type == OFT_RECIRC_CONNTRACK) {
> +        if (next_ct_states && recirc_node->type == OFT_RECIRC_CONNTRACK) {
>              uint32_t ct_state;
>              if (ovs_list_is_empty(next_ct_states)) {
>                  ct_state = CS_TRACKED | CS_NEW;
> --
> 2.7.4
Re: [ovs-dev] [PATCH] debian: Add libelf-dev dependency for dkms
Looks good to me, thanks.

Reviewed-by: Yifeng Sun

On Tue, Feb 12, 2019 at 12:37 PM Greg Rose wrote:

> Newer kernels define CONFIG_UNWINDER_ORC for their kernel configurations
> and to build this the kernel compilation requires the libelf-dev
> package. Add the dependency to the dkms build requirements.
>
> VMware-BZ: #2287968
> Signed-off-by: Greg Rose
> ---
>  debian/control | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/debian/control b/debian/control
> index cde93f2..c70d2a6 100644
> --- a/debian/control
> +++ b/debian/control
> @@ -41,7 +41,7 @@ Description: Open vSwitch datapath module source - module-assistant version
>
>  Package: openvswitch-datapath-dkms
>  Architecture: all
> -Depends: dkms (>= 1.95), libc6-dev, make, ${misc:Depends}, ${python:Depends}
> +Depends: dkms (>= 1.95), libc6-dev, libelf-dev, make, ${misc:Depends}, ${python:Depends}
>  Description: Open vSwitch datapath module source - DKMS version
>   Open vSwitch is a production quality, multilayer, software-based,
>   Ethernet virtual switch. It is designed to enable massive network
> --
> 1.8.3.1
[ovs-dev] [PATCH] oss: Fix oss build errors because of ovs API change
Reported-at: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=13432
Signed-off-by: Yifeng Sun
---
 tests/oss-fuzz/odp_target.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/tests/oss-fuzz/odp_target.c b/tests/oss-fuzz/odp_target.c
index a7a8bbffcaa7..ae61cdca322f 100644
--- a/tests/oss-fuzz/odp_target.c
+++ b/tests/oss-fuzz/odp_target.c
@@ -27,7 +27,7 @@ parse_keys(bool wc_keys, const char *in)
     ofpbuf_init(&odp_key, 0);
     ofpbuf_init(&odp_mask, 0);
     error = odp_flow_from_string(in, NULL,
-                                 &odp_key, &odp_mask);
+                                 &odp_key, &odp_mask, NULL);
     if (error) {
         printf("odp_flow_from_string: error\n");
         goto next;
@@ -47,7 +47,8 @@ parse_keys(bool wc_keys, const char *in)
     };
     /* Convert odp_key to flow. */
-    fitness = odp_flow_key_to_flow(odp_key.data, odp_key.size, &flow);
+    fitness = odp_flow_key_to_flow(odp_key.data, odp_key.size,
+                                   &flow, NULL);
     switch (fitness) {
     case ODP_FIT_PERFECT:
         break;
-- 
2.7.4
Re: [ovs-dev] [PATCH] rhel: Add option to enable AF_XDP on rpm package
LGTM, thanks.

Tested-by: Yifeng Sun
Reviewed-by: Yifeng Sun

On Thu, Jan 21, 2021 at 10:57 AM Yi-Hung Wei wrote:

> This patch adds an RPMBUILD_OPT so that user can enable
> AF_XDP support in the rpm package by:
>
> $ make rpm-fedora RPMBUILD_OPT="--with afxdp"
>
> Signed-off-by: Yi-Hung Wei
> ---
>  rhel/openvswitch-fedora.spec.in | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/rhel/openvswitch-fedora.spec.in b/rhel/openvswitch-fedora.spec.in
> index 2c0c4fa186a3..e03b26b6af34 100644
> --- a/rhel/openvswitch-fedora.spec.in
> +++ b/rhel/openvswitch-fedora.spec.in
> @@ -28,6 +28,8 @@
>  %bcond_without libcapng
>  # To enable DPDK support, specify '--with dpdk' when building
>  %bcond_with dpdk
> +# To enable AF_XDP support, specify '--with afxdp' when building
> +%bcond_with afxdp
>
>  # If there is a need to automatically enable the package after installation,
>  # specify the "--with autoenable"
> @@ -73,6 +75,9 @@ BuildRequires: libpcap-devel numactl-devel
>  BuildRequires: dpdk-devel >= 17.05.1
>  Provides: %{name}-dpdk = %{version}-%{release}
>  %endif
> +%if %{with afxdp}
> +BuildRequires: libbpf-devel numactl-devel
> +%endif
>  BuildRequires: unbound unbound-devel
>
>  Requires: openssl hostname iproute module-init-tools unbound
> @@ -164,6 +169,9 @@ This package provides IPsec tunneling support for OVS tunnels.
>  %if %{with dpdk}
>      --with-dpdk=$(dirname %{_datadir}/dpdk/*/.config) \
>  %endif
> +%if %{with afxdp}
> +    --enable-afxdp \
> +%endif
>      --enable-ssl \
>      --disable-static \
>      --enable-shared \
> --
> 2.7.4
[ovs-dev] [PATCH] connmgr: Check nullptr inside ofmonitor_report()
ovs-vswitchd could crash under these circumstances: 1. When one bridge is being destroyed, ofproto_destroy() is called and connmgr pointer of its ofproto struct is nullified. This ofproto struct is deallocated through 'ovsrcu_postpone(ofproto_destroy_defer__, p);'. 2. Before RCU enters quiesce state to actually free this ofproto struct, revalidator thread calls udpif_revalidator(), which could handle a learn flow and calls ofproto_flow_mod_learn(), it later calls ofmonitor_report() and ofproto struct's connmgr pointer is accessed. The crash stack trace is shown below: 0 ofmonitor_report (mgr=0x0, rule=rule@entry=0x7fa4ac067c30, event=event@entry=NXFME_ADDED, reason=reason@entry=OFPRR_IDLE_TIMEOUT, abbrev_ofconn=0x0, abbrev_xid=0, old_actions=old_actions@entry=0x0) at ofproto/connmgr.c:2160 1 0x7fa4d6803495 in add_flow_finish (ofproto=0x55d9075d4ab0, ofm=, req=req@entry=0x0) at ofproto/ofproto.c:5221 2 0x7fa4d68036af in modify_flows_finish (req=0x0, ofm=0x7fa4980753f0, ofproto=0x55d9075d4ab0) at ofproto/ofproto.c:5823 3 ofproto_flow_mod_finish (ofproto=0x55d9075d4ab0, ofm=ofm@entry=0x7fa4980753f0, req=req@entry=0x0) at ofproto/ofproto.c:8088 4 0x7fa4d680372d in ofproto_flow_mod_learn_finish (ofm=ofm@entry=0x7fa4980753f0, orig_ofproto=orig_ofproto@entry=0x0) at ofproto/ofproto.c:5439 5 0x7fa4d68072f9 in ofproto_flow_mod_learn (ofm=0x7fa4980753f0, keep_ref=keep_ref@entry=true, limit=, below_limitp=below_limitp@entry=0x0) at ofproto/ofproto.c:5499 6 0x7fa4d6835d33 in xlate_push_stats_entry (entry=0x7fa498012448, stats=stats@entry=0x7fa4d2701a10, offloaded=offloaded@entry=false) at ofproto/ofproto-dpif-xlate-cache.c:127 7 0x7fa4d6835e3a in xlate_push_stats (xcache=, stats=stats@entry=0x7fa4d2701a10, offloaded=offloaded@entry=false) at ofproto/ofproto-dpif-xlate-cache.c:181 8 0x7fa4d6822046 in revalidate_ukey (udpif=udpif@entry=0x55d90760b240, ukey=ukey@entry=0x7fa4b0191660, stats=stats@entry=0x7fa4d2705118, odp_actions=odp_actions@entry=0x7fa4d2701b50, 
reval_seq=reval_seq@entry=5655486242, recircs=recircs@entry=0x7fa4d2701b40, offloaded=false) at ofproto/ofproto-dpif-upcall.c:2294 9 0x7fa4d6825aee in revalidate (revalidator=0x55d90769dd00) at ofproto/ofproto-dpif-upcall.c:2683 10 0x7fa4d6825cf3 in udpif_revalidator (arg=0x55d90769dd00) at ofproto/ofproto-dpif-upcall.c:936 11 0x7fa4d6259c9f in ovsthread_wrapper (aux_=) at lib/ovs-thread.c:423 12 0x7fa4d582cea5 in start_thread () from /usr/lib64/libpthread.so.0 13 0x7fa4d504b96d in clone () from /usr/lib64/libc.so.6 At the time of crash, the involved ofproto was already deallocated: (gdb) print *ofproto $1 = ..., name = 0x55d907602820 "nsx-managed", ..., ports = {..., one = 0x0, mask = 63, n = 0}, ..., connmgr = 0x0, ... This patch fixes it. VMware-BZ: #2700626 Signed-off-by: Yifeng Sun --- ofproto/connmgr.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ofproto/connmgr.c b/ofproto/connmgr.c index 9c5c633b4171..ee07df7a8bc0 100644 --- a/ofproto/connmgr.c +++ b/ofproto/connmgr.c @@ -2140,7 +2140,7 @@ ofmonitor_report(struct connmgr *mgr, struct rule *rule, const struct rule_actions *old_actions) OVS_REQUIRES(ofproto_mutex) { -if (rule_is_hidden(rule)) { +if (!mgr || rule_is_hidden(rule)) { return; } -- 2.7.4 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCHv2] connmgr: Check nullptr inside ofmonitor_report()
ovs-vswitchd could crash under these circumstances: 1. When one bridge is being destroyed, ofproto_destroy() is called and connmgr pointer of its ofproto struct is nullified. This ofproto struct is deallocated through 'ovsrcu_postpone(ofproto_destroy_defer__, p);'. 2. Before RCU enters quiesce state to actually free this ofproto struct, revalidator thread calls udpif_revalidator(), which could handle a learn flow and calls ofproto_flow_mod_learn(), it later calls ofmonitor_report() and ofproto struct's connmgr pointer is accessed. The crash stack trace is shown below: 0 ofmonitor_report (mgr=0x0, rule=rule@entry=0x7fa4ac067c30, event=event@entry=NXFME_ADDED, reason=reason@entry=OFPRR_IDLE_TIMEOUT, abbrev_ofconn=0x0, abbrev_xid=0, old_actions=old_actions@entry=0x0) at ofproto/connmgr.c:2160 1 0x7fa4d6803495 in add_flow_finish (ofproto=0x55d9075d4ab0, ofm=, req=req@entry=0x0) at ofproto/ofproto.c:5221 2 0x7fa4d68036af in modify_flows_finish (req=0x0, ofm=0x7fa4980753f0, ofproto=0x55d9075d4ab0) at ofproto/ofproto.c:5823 3 ofproto_flow_mod_finish (ofproto=0x55d9075d4ab0, ofm=ofm@entry=0x7fa4980753f0, req=req@entry=0x0) at ofproto/ofproto.c:8088 4 0x7fa4d680372d in ofproto_flow_mod_learn_finish (ofm=ofm@entry=0x7fa4980753f0, orig_ofproto=orig_ofproto@entry=0x0) at ofproto/ofproto.c:5439 5 0x7fa4d68072f9 in ofproto_flow_mod_learn (ofm=0x7fa4980753f0, keep_ref=keep_ref@entry=true, limit=, below_limitp=below_limitp@entry=0x0) at ofproto/ofproto.c:5499 6 0x7fa4d6835d33 in xlate_push_stats_entry (entry=0x7fa498012448, stats=stats@entry=0x7fa4d2701a10, offloaded=offloaded@entry=false) at ofproto/ofproto-dpif-xlate-cache.c:127 7 0x7fa4d6835e3a in xlate_push_stats (xcache=, stats=stats@entry=0x7fa4d2701a10, offloaded=offloaded@entry=false) at ofproto/ofproto-dpif-xlate-cache.c:181 8 0x7fa4d6822046 in revalidate_ukey (udpif=udpif@entry=0x55d90760b240, ukey=ukey@entry=0x7fa4b0191660, stats=stats@entry=0x7fa4d2705118, odp_actions=odp_actions@entry=0x7fa4d2701b50, 
reval_seq=reval_seq@entry=5655486242, recircs=recircs@entry=0x7fa4d2701b40, offloaded=false) at ofproto/ofproto-dpif-upcall.c:2294 9 0x7fa4d6825aee in revalidate (revalidator=0x55d90769dd00) at ofproto/ofproto-dpif-upcall.c:2683 10 0x7fa4d6825cf3 in udpif_revalidator (arg=0x55d90769dd00) at ofproto/ofproto-dpif-upcall.c:936 11 0x7fa4d6259c9f in ovsthread_wrapper (aux_=) at lib/ovs-thread.c:423 12 0x7fa4d582cea5 in start_thread () from /usr/lib64/libpthread.so.0 13 0x7fa4d504b96d in clone () from /usr/lib64/libc.so.6 At the time of crash, the involved ofproto was already deallocated: (gdb) print *ofproto $1 = ..., name = 0x55d907602820 "nsx-managed", ..., ports = {..., one = 0x0, mask = 63, n = 0}, ..., connmgr = 0x0, ... This patch fixes it. VMware-BZ: #2700626 Signed-off-by: Yifeng Sun --- v1->v2: Add check for ofmonitor_flush, thanks William. ofproto/connmgr.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/ofproto/connmgr.c b/ofproto/connmgr.c index 9c5c633b4171..fa8f6cd0e83a 100644 --- a/ofproto/connmgr.c +++ b/ofproto/connmgr.c @@ -2140,7 +2140,7 @@ ofmonitor_report(struct connmgr *mgr, struct rule *rule, const struct rule_actions *old_actions) OVS_REQUIRES(ofproto_mutex) { -if (rule_is_hidden(rule)) { +if (!mgr || rule_is_hidden(rule)) { return; } @@ -2244,6 +2244,10 @@ ofmonitor_flush(struct connmgr *mgr) { struct ofconn *ofconn; +if (!mgr) { +return; +} + LIST_FOR_EACH (ofconn, connmgr_node, &mgr->conns) { struct rconn_packet_counter *counter = ofconn->monitor_counter; -- 2.7.4 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
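The guard added in v2 can be illustrated with a stripped-down model of the teardown race. Everything here is hypothetical (`manager`, `conn`, `bridge`, `bridge_destroy_begin()`, `monitor_flush()`); it only shows why a caller that still holds the owning struct during the deferred-free window needs the callee to tolerate a NULL manager.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for ofconn, connmgr and ofproto. */
struct conn {
    struct conn *next;
    int flushed;
};

struct manager {
    struct conn *conns;
};

struct bridge {
    struct manager *mgr;
};

/* First phase of teardown: the manager pointer is cleared right away, while
 * the bridge struct itself is only freed later (deferred, as with RCU). */
static void
bridge_destroy_begin(struct bridge *b)
{
    assert(b != NULL);
    b->mgr = NULL;
}

/* Flushes every connection and returns how many were visited.  A NULL
 * manager means the owner is already being torn down, so do nothing. */
static int
monitor_flush(struct manager *mgr)
{
    int n = 0;

    if (!mgr) {
        return 0;
    }
    for (struct conn *c = mgr->conns; c; c = c->next) {
        c->flushed = 1;
        n++;
    }
    return n;
}
```

Without the early return, a second thread calling `monitor_flush(b->mgr)` after `bridge_destroy_begin()` would dereference NULL, which corresponds to frame 0 of the trace above.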
Re: [ovs-dev] [PATCH] connmgr: Check nullptr inside ofmonitor_report()
Thanks William and YiHung for review, I sent a new version. On Tue, Feb 16, 2021 at 8:45 PM William Tu wrote: > On Tue, Feb 16, 2021 at 1:40 PM Yi-Hung Wei wrote: > > > > On Tue, Feb 16, 2021 at 1:06 PM Yifeng Sun > wrote: > > > > > > ovs-vswitchd could crash under these circumstances: > > > 1. When one bridge is being destroyed, ofproto_destroy() is called and > > > connmgr pointer of its ofproto struct is nullified. This ofproto > struct is > > > deallocated through 'ovsrcu_postpone(ofproto_destroy_defer__, p);'. > > > 2. Before RCU enters quiesce state to actually free this ofproto > struct, > > > revalidator thread calls udpif_revalidator(), which could handle > > > a learn flow and calls ofproto_flow_mod_learn(), it later calls > > > ofmonitor_report() and ofproto struct's connmgr pointer is accessed. > > > > > LGTM, thanks. I guess this is hard to reproduce or create a test. > Do we need to worry about other places that use 'ofproto->connmgr'? > ex: there are a couple of places calling ofmonitor_flush(ofproto->connmgr); > > > > The crash stack trace is shown below: > > > > > > 0 ofmonitor_report (mgr=0x0, rule=rule@entry=0x7fa4ac067c30, > event=event@entry=NXFME_ADDED, > > > reason=reason@entry=OFPRR_IDLE_TIMEOUT, abbrev_ofconn=0x0, > abbrev_xid=0, old_actions=old_actions@entry=0x0) > > > at ofproto/connmgr.c:2160 > > > 1 0x7fa4d6803495 in add_flow_finish (ofproto=0x55d9075d4ab0, > ofm=, req=req@entry=0x0) > > > at ofproto/ofproto.c:5221 > > > 2 0x7fa4d68036af in modify_flows_finish (req=0x0, > ofm=0x7fa4980753f0, ofproto=0x55d9075d4ab0) > > > at ofproto/ofproto.c:5823 > > > 3 ofproto_flow_mod_finish (ofproto=0x55d9075d4ab0, > > > ofm=ofm@entry=0x7fa4980753f0, > req=req@entry=0x0) > > > at ofproto/ofproto.c:8088 > > > 4 0x7fa4d680372d in ofproto_flow_mod_learn_finish (ofm=ofm@entry > =0x7fa4980753f0, > > > orig_ofproto=orig_ofproto@entry=0x0) at ofproto/ofproto.c:5439 > > > 5 0x7fa4d68072f9 in ofproto_flow_mod_learn (ofm=0x7fa4980753f0, > 
keep_ref=keep_ref@entry=true, > > > limit=, below_limitp=below_limitp@entry=0x0) at > ofproto/ofproto.c:5499 > > > 6 0x7fa4d6835d33 in xlate_push_stats_entry (entry=0x7fa498012448, > stats=stats@entry=0x7fa4d2701a10, > > > offloaded=offloaded@entry=false) at > ofproto/ofproto-dpif-xlate-cache.c:127 > > > 7 0x7fa4d6835e3a in xlate_push_stats (xcache=, > stats=stats@entry=0x7fa4d2701a10, > > > offloaded=offloaded@entry=false) at > ofproto/ofproto-dpif-xlate-cache.c:181 > > > 8 0x7fa4d6822046 in revalidate_ukey > > > (udpif=udpif@entry=0x55d90760b240, > ukey=ukey@entry=0x7fa4b0191660, > > > stats=stats@entry=0x7fa4d2705118, odp_actions=odp_actions@entry > =0x7fa4d2701b50, > > > reval_seq=reval_seq@entry=5655486242, > > > recircs=recircs@entry=0x7fa4d2701b40, > offloaded=false) > > > at ofproto/ofproto-dpif-upcall.c:2294 > > > 9 0x7fa4d6825aee in revalidate (revalidator=0x55d90769dd00) at > ofproto/ofproto-dpif-upcall.c:2683 > > > 10 0x7fa4d6825cf3 in udpif_revalidator (arg=0x55d90769dd00) at > ofproto/ofproto-dpif-upcall.c:936 > > > 11 0x7fa4d6259c9f in ovsthread_wrapper (aux_=) at > lib/ovs-thread.c:423 > > > 12 0x7fa4d582cea5 in start_thread () from > /usr/lib64/libpthread.so.0 > > > 13 0x7fa4d504b96d in clone () from /usr/lib64/libc.so.6 > > > > > > At the time of crash, the involved ofproto was already deallocated: > > > > > > (gdb) print *ofproto > > > $1 = ..., name = 0x55d907602820 "nsx-managed", ..., ports = {..., > > > one = 0x0, mask = 63, n = 0}, ..., connmgr = 0x0, ... > > > > > > This patch fixes it. > > > > > > VMware-BZ: #2700626 > > > Signed-off-by: Yifeng Sun > > > --- > > > > LGTM. > > > > Acked-by: Yi-Hung Wei > > ___ > > dev mailing list > > d...@openvswitch.org > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH v2] bfd: Support overlay BFD
Current OVS intercepts and processes all BFD packets, thus VM-2-VM BFD packets get lost and the recipient VM never sees them. This patch fixes it by only intercepting and processing BFD packets destined to a configured BFD instance, and other BFD packets are made available to the OVS flow table for forwarding. This patch adds new test to validate BFD overlay. This patch keeps BFD's backward compatibility. Signed-off-by: Yifeng Sun --- v1->v2: Add test by William's suggestion. lib/bfd.c | 16 +--- tests/system-traffic.at | 42 ++ vswitchd/vswitch.xml| 7 +++ 3 files changed, 62 insertions(+), 3 deletions(-) diff --git a/lib/bfd.c b/lib/bfd.c index cc8c6857afa4..3c965699ace3 100644 --- a/lib/bfd.c +++ b/lib/bfd.c @@ -149,6 +149,9 @@ BUILD_ASSERT_DECL(BFD_PACKET_LEN == sizeof(struct msg)); #define FLAGS_MASK 0x3f #define DEFAULT_MULT 3 +#define BFD_DEFAULT_SRC_IP 0xA9FE0101 /* 169.254.1.1 */ +#define BFD_DEFAULT_DST_IP 0xA9FE0100 /* 169.254.1.0 */ + struct bfd { struct hmap_node node;/* In 'all_bfds'. */ uint32_t disc;/* bfd.LocalDiscr. Key in 'all_bfds' hmap. 
*/ @@ -457,9 +460,9 @@ bfd_configure(struct bfd *bfd, const char *name, const struct smap *cfg, &bfd->rmt_eth_dst); bfd_lookup_ip(smap_get_def(cfg, "bfd_src_ip", ""), - htonl(0xA9FE0101) /* 169.254.1.1 */, &bfd->ip_src); + htonl(BFD_DEFAULT_SRC_IP), &bfd->ip_src); bfd_lookup_ip(smap_get_def(cfg, "bfd_dst_ip", ""), - htonl(0xA9FE0100) /* 169.254.1.0 */, &bfd->ip_dst); + htonl(BFD_DEFAULT_DST_IP), &bfd->ip_dst); forwarding_if_rx = smap_get_bool(cfg, "forwarding_if_rx", false); if (bfd->forwarding_if_rx != forwarding_if_rx) { @@ -674,7 +677,14 @@ bfd_should_process_flow(const struct bfd *bfd_, const struct flow *flow, memset(&wc->masks.nw_proto, 0xff, sizeof wc->masks.nw_proto); if (flow->nw_proto == IPPROTO_UDP && !(flow->nw_frag & FLOW_NW_FRAG_LATER) -&& tp_dst_equals(flow, BFD_DEST_PORT, wc)) { +&& tp_dst_equals(flow, BFD_DEST_PORT, wc) +&& (bfd->ip_src == htonl(BFD_DEFAULT_SRC_IP) +|| bfd->ip_src == flow->nw_dst)) { + +if (bfd->ip_src == flow->nw_dst) { +memset(&wc->masks.nw_dst, 0x, sizeof wc->masks.nw_dst); +} + bool check_tnl_key; atomic_read_relaxed(&bfd->check_tnl_key, &check_tnl_key); diff --git a/tests/system-traffic.at b/tests/system-traffic.at index 2a0fbadff4a1..80b58996d530 100644 --- a/tests/system-traffic.at +++ b/tests/system-traffic.at @@ -6289,3 +6289,45 @@ OVS_WAIT_UNTIL([cat p2.pcap | egrep "0x0050: * * *5002 *2000 *b85e * OVS_TRAFFIC_VSWITCHD_STOP AT_CLEANUP + +AT_SETUP([bfd - BFD overlay]) +OVS_CHECK_GENEVE() + +OVS_TRAFFIC_VSWITCHD_START() + +AT_CHECK([ovs-vsctl -- set bridge br0 other-config:hwaddr=\"f2:ff:00:00:00:01\"]) +ADD_BR([br-underlay], [set bridge br-underlay other-config:hwaddr=\"ee:09:e0:4d:bf:31\"]) + +AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"]) +AT_CHECK([ovs-ofctl add-flow br-underlay "actions=normal"]) + +ADD_NAMESPACES(at_ns0) + +dnl Set up underlay link from host into the namespace using veth pair. 
+ADD_VETH(p0, at_ns0, br-underlay, "172.16.180.105/24", 4e:12:5d:6c:74:3d) +AT_CHECK([ip addr add dev br-underlay "172.16.180.106/24"]) +AT_CHECK([ip link set dev br-underlay up]) + +dnl Set up tunnel endpoints on OVS outside the namespace. +ADD_OVS_TUNNEL([geneve], [br0], [at_gnv0], [172.16.180.105], [192.168.10.100/24], +[options:packet_type=ptap]) + +dnl Certain Linux distributions, like CentOS, have default iptable rules +dnl to reject input traffic from br-underlay. Here we add a rule to walk +dnl around it. +iptables -I INPUT 1 -i br-underlay -j ACCEPT +on_exit 'iptables -D INPUT 1' + +dnl Firstly, test normal BFD packet. +ovs-ofctl -O OpenFlow13 packet-out br-underlay "in_port=1 packet=ee09e04dbf314e125d6c743d080045c0006624d94000401153f9ac10b469ac10b46a739f17c1005247f900806558002320016a0f5a9ed376080045c00034ff11fa03ac10b469ac10b46ac0070ec8002021c003187b3c96ebbc96b962000186af4240 actions=NORMAL" +dnl Next we test overlay BFD packet. +ovs-ofctl -O OpenFlow13 packet-out br-underlay "in_port=1 packet=ee09e04dbf314e125d6c743d0800455b2558400040115445ac10b469ac10b46a6b1017c1004722c10240655801048001001803005254009d0b6d5254000c8984080045210e22400040119688c0a80a68c0a80a6995cd0ec8000dd342746573740a actions=NORMAL" + +ovs-dpctl dump-
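The interception rule added to bfd_should_process_flow() can be sketched as a pure predicate. This is a simplified model, not the OVS function: `bfd_cfg`, `pkt`, `ipv4()` and `should_intercept()` are hypothetical, addresses are kept in host byte order for readability, and the fragment and tunnel-key checks of the real code are left out. The port number 3784 is the standard single-hop BFD control port.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BFD_DEFAULT_SRC_IP 0xA9FE0101u   /* 169.254.1.1, as in the patch. */
#define BFD_DEST_PORT 3784               /* Single-hop BFD control port. */
#define PROTO_UDP 17

/* Hypothetical reduced views of 'struct bfd' and 'struct flow'. */
struct bfd_cfg {
    uint32_t ip_src;     /* Configured bfd_src_ip, or the default. */
};

struct pkt {
    uint8_t nw_proto;
    uint16_t tp_dst;
    uint32_t nw_dst;
};

/* Builds an IPv4 address in host byte order from its four octets. */
static uint32_t
ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t) a << 24) | ((uint32_t) b << 16)
           | ((uint32_t) c << 8) | d;
}

/* A BFD packet is consumed by the local BFD instance only if that instance
 * still uses the default source address (the old behaviour) or the packet is
 * addressed to the instance's configured source address. */
static bool
should_intercept(const struct bfd_cfg *bfd, const struct pkt *p)
{
    return p->nw_proto == PROTO_UDP
           && p->tp_dst == BFD_DEST_PORT
           && (bfd->ip_src == BFD_DEFAULT_SRC_IP
               || bfd->ip_src == p->nw_dst);
}
```

For example, assuming an instance configured with source address 172.16.180.106, a BFD packet sent to 172.16.180.106 is intercepted, while one sent to 192.168.10.105 is left to the flow table. An instance that keeps the default address intercepts both, which is how backward compatibility is preserved.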
Re: [ovs-dev] [PATCH v2] bfd: Support overlay BFD
Thanks for reviewing. For these two packets: dnl outer IP: Source: 172.16.180.105 Destination: 172.16.180.106 This one is normal BFD packet, bfd_should_process_flow should return true, as used to. dnl inner IP: Source: 192.168.10.104 Destination: 192.168.10.105 This one is overlay BFD packet, bfd_should_process_flow should return false so this packet won't be intercepted by OVS's BFD engine. Thanks, Yifeng On Wed, Jul 22, 2020 at 10:54 AM William Tu wrote: > On Wed, Jul 22, 2020 at 01:59:04AM -0700, Yifeng Sun wrote: > > Current OVS intercepts and processes all BFD packets, thus VM-2-VM > > BFD packets get lost and the recipient VM never sees them. > > > > This patch fixes it by only intercepting and processing BFD packets > > destined to a configured BFD instance, and other BFD packets are made > > available to the OVS flow table for forwarding. > > > > This patch adds new test to validate BFD overlay. > > > > This patch keeps BFD's backward compatibility. > > > > Signed-off-by: Yifeng Sun > > --- > > v1->v2: Add test by William's suggestion. > > > > lib/bfd.c | 16 +--- > > tests/system-traffic.at | 42 ++ > > vswitchd/vswitch.xml| 7 +++ > > 3 files changed, 62 insertions(+), 3 deletions(-) > > > > diff --git a/lib/bfd.c b/lib/bfd.c > > index cc8c6857afa4..3c965699ace3 100644 > > --- a/lib/bfd.c > > +++ b/lib/bfd.c > > @@ -149,6 +149,9 @@ BUILD_ASSERT_DECL(BFD_PACKET_LEN == sizeof(struct > msg)); > > #define FLAGS_MASK 0x3f > > #define DEFAULT_MULT 3 > > > > +#define BFD_DEFAULT_SRC_IP 0xA9FE0101 /* 169.254.1.1 */ > > +#define BFD_DEFAULT_DST_IP 0xA9FE0100 /* 169.254.1.0 */ > > + > > struct bfd { > > struct hmap_node node;/* In 'all_bfds'. */ > > uint32_t disc;/* bfd.LocalDiscr. Key in 'all_bfds' > hmap. 
*/ > > @@ -457,9 +460,9 @@ bfd_configure(struct bfd *bfd, const char *name, > const struct smap *cfg, > > &bfd->rmt_eth_dst); > > > > bfd_lookup_ip(smap_get_def(cfg, "bfd_src_ip", ""), > > - htonl(0xA9FE0101) /* 169.254.1.1 */, &bfd->ip_src); > > + htonl(BFD_DEFAULT_SRC_IP), &bfd->ip_src); > > bfd_lookup_ip(smap_get_def(cfg, "bfd_dst_ip", ""), > > - htonl(0xA9FE0100) /* 169.254.1.0 */, &bfd->ip_dst); > > + htonl(BFD_DEFAULT_DST_IP), &bfd->ip_dst); > > > > forwarding_if_rx = smap_get_bool(cfg, "forwarding_if_rx", false); > > if (bfd->forwarding_if_rx != forwarding_if_rx) { > > @@ -674,7 +677,14 @@ bfd_should_process_flow(const struct bfd *bfd_, > const struct flow *flow, > > memset(&wc->masks.nw_proto, 0xff, sizeof wc->masks.nw_proto); > > if (flow->nw_proto == IPPROTO_UDP > > && !(flow->nw_frag & FLOW_NW_FRAG_LATER) > > -&& tp_dst_equals(flow, BFD_DEST_PORT, wc)) { > > +&& tp_dst_equals(flow, BFD_DEST_PORT, wc) > > +&& (bfd->ip_src == htonl(BFD_DEFAULT_SRC_IP) > > +|| bfd->ip_src == flow->nw_dst)) { > > + > > +if (bfd->ip_src == flow->nw_dst) { > > +memset(&wc->masks.nw_dst, 0x, sizeof > wc->masks.nw_dst); > > +} > > + > > bool check_tnl_key; > > > > atomic_read_relaxed(&bfd->check_tnl_key, &check_tnl_key); > > diff --git a/tests/system-traffic.at b/tests/system-traffic.at > > index 2a0fbadff4a1..80b58996d530 100644 > > --- a/tests/system-traffic.at > > +++ b/tests/system-traffic.at > > @@ -6289,3 +6289,45 @@ OVS_WAIT_UNTIL([cat p2.pcap | egrep "0x0050: > * * *5002 *2000 *b85e * > > > > OVS_TRAFFIC_VSWITCHD_STOP > > AT_CLEANUP > > + > > +AT_SETUP([bfd - BFD overlay]) > > +OVS_CHECK_GENEVE() > > + > > +OVS_TRAFFIC_VSWITCHD_START() > > + > > +AT_CHECK([ovs-vsctl -- set bridge br0 > other-config:hwaddr=\"f2:ff:00:00:00:01\"]) > > +ADD_BR([br-underlay], [set bridge br-underlay > other-config:hwaddr=\"ee:09:e0:4d:bf:31\"]) > > + > > +AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"]) > > +AT_CHECK([ovs-ofctl add-flow br-underlay "actions=normal"]) > > + > > 
+ADD_NAMESPACES(at_ns0) > > + > > +dnl Set up underlay li
Re: [ovs-dev] [PATCH v2] bfd: Support overlay BFD
Please discard my previous email, I misunderstood your question. The packet above is dnl outer IP: Source: 172.16.180.105 Destination: 172.16.180.106 dnl inner IP: Source: 192.168.10.104 Destination: 192.168.10.105 So the bfd_should_process_flow returns false. Yes, you are correct and this patch returns false in this case. For the above packet, outer IP is extracted as tunnel info, flow->nw_dst is 192.168.10.105. So bfd_should_process_flow returns false. Thanks, Yifeng On Wed, Jul 22, 2020 at 11:02 AM Yifeng Sun wrote: > Thanks for reviewing. > > For these two packets: > > dnl outer IP: Source: 172.16.180.105 Destination: 172.16.180.106 > This one is normal BFD packet, bfd_should_process_flow should return > true, as used to. > > dnl inner IP: Source: 192.168.10.104 Destination: 192.168.10.105 > This one is overlay BFD packet, bfd_should_process_flow should return > false so this packet won't be intercepted by OVS's BFD engine. > > Thanks, > Yifeng > > On Wed, Jul 22, 2020 at 10:54 AM William Tu wrote: > >> On Wed, Jul 22, 2020 at 01:59:04AM -0700, Yifeng Sun wrote: >> > Current OVS intercepts and processes all BFD packets, thus VM-2-VM >> > BFD packets get lost and the recipient VM never sees them. >> > >> > This patch fixes it by only intercepting and processing BFD packets >> > destined to a configured BFD instance, and other BFD packets are made >> > available to the OVS flow table for forwarding. >> > >> > This patch adds new test to validate BFD overlay. >> > >> > This patch keeps BFD's backward compatibility. >> > >> > Signed-off-by: Yifeng Sun >> > --- >> > v1->v2: Add test by William's suggestion. 
>> > >> > lib/bfd.c | 16 +--- >> > tests/system-traffic.at | 42 >> ++ >> > vswitchd/vswitch.xml| 7 +++ >> > 3 files changed, 62 insertions(+), 3 deletions(-) >> > >> > diff --git a/lib/bfd.c b/lib/bfd.c >> > index cc8c6857afa4..3c965699ace3 100644 >> > --- a/lib/bfd.c >> > +++ b/lib/bfd.c >> > @@ -149,6 +149,9 @@ BUILD_ASSERT_DECL(BFD_PACKET_LEN == sizeof(struct >> msg)); >> > #define FLAGS_MASK 0x3f >> > #define DEFAULT_MULT 3 >> > >> > +#define BFD_DEFAULT_SRC_IP 0xA9FE0101 /* 169.254.1.1 */ >> > +#define BFD_DEFAULT_DST_IP 0xA9FE0100 /* 169.254.1.0 */ >> > + >> > struct bfd { >> > struct hmap_node node;/* In 'all_bfds'. */ >> > uint32_t disc;/* bfd.LocalDiscr. Key in 'all_bfds' >> hmap. */ >> > @@ -457,9 +460,9 @@ bfd_configure(struct bfd *bfd, const char *name, >> const struct smap *cfg, >> > &bfd->rmt_eth_dst); >> > >> > bfd_lookup_ip(smap_get_def(cfg, "bfd_src_ip", ""), >> > - htonl(0xA9FE0101) /* 169.254.1.1 */, &bfd->ip_src); >> > + htonl(BFD_DEFAULT_SRC_IP), &bfd->ip_src); >> > bfd_lookup_ip(smap_get_def(cfg, "bfd_dst_ip", ""), >> > - htonl(0xA9FE0100) /* 169.254.1.0 */, &bfd->ip_dst); >> > + htonl(BFD_DEFAULT_DST_IP), &bfd->ip_dst); >> > >> > forwarding_if_rx = smap_get_bool(cfg, "forwarding_if_rx", false); >> > if (bfd->forwarding_if_rx != forwarding_if_rx) { >> > @@ -674,7 +677,14 @@ bfd_should_process_flow(const struct bfd *bfd_, >> const struct flow *flow, >> > memset(&wc->masks.nw_proto, 0xff, sizeof wc->masks.nw_proto); >> > if (flow->nw_proto == IPPROTO_UDP >> > && !(flow->nw_frag & FLOW_NW_FRAG_LATER) >> > -&& tp_dst_equals(flow, BFD_DEST_PORT, wc)) { >> > +&& tp_dst_equals(flow, BFD_DEST_PORT, wc) >> > +&& (bfd->ip_src == htonl(BFD_DEFAULT_SRC_IP) >> > +|| bfd->ip_src == flow->nw_dst)) { >> > + >> > +if (bfd->ip_src == flow->nw_dst) { >> > +memset(&wc->masks.nw_dst, 0x, sizeof >> wc->masks.nw_dst); >> > +} >> > + >> > bool check_tnl_key; >> > >> > atomic_read_relaxed(&bfd->check_tnl_key, &check_tnl_key); >> > diff --git a/tests/system-traffic.at 
b/tests/system-traffic.at >> > index 2a0fbadff4a1..80b5
Re: [ovs-dev] [PATCH v2] bfd: Support overlay BFD
You are correct, I will fix BFD config in v3. For the overlay BFD packet, we don't set up a port to handle packets targeted at 192.168.10.105. So ovs simply drops them. On Wed, Jul 22, 2020 at 11:26 AM William Tu wrote: > On Wed, Jul 22, 2020 at 11:02:32AM -0700, Yifeng Sun wrote: > > Thanks for reviewing. > > > > For these two packets: > > > > dnl outer IP: Source: 172.16.180.105 Destination: 172.16.180.106 > > This one is normal BFD packet, bfd_should_process_flow should return > > true, as used to. > > > > dnl inner IP: Source: 192.168.10.104 Destination: 192.168.10.105 > > This one is overlay BFD packet, bfd_should_process_flow should return > > false so this packet won't be intercepted by OVS's BFD engine. > > So you add an additional condition here: > (bfd->ip_src == htonl(BFD_DEFAULT_SRC_IP) > || bfd->ip_src == flow->nw_dst)) > > How come the first packt is true, and the second packet is false in > the above condition? > > you didn't set bfd_src_ip in the test, so what's the value of bfd->ip_src? > > another question below > > > > > --- a/tests/system-traffic.at > > > > +++ b/tests/system-traffic.at > > > > @@ -6289,3 +6289,45 @@ OVS_WAIT_UNTIL([cat p2.pcap | egrep "0x0050: > > > * * *5002 *2000 *b85e * > > > > > > > > OVS_TRAFFIC_VSWITCHD_STOP > > > > AT_CLEANUP > > > > + > > > > +AT_SETUP([bfd - BFD overlay]) > > > > +OVS_CHECK_GENEVE() > > > > + > > > > +OVS_TRAFFIC_VSWITCHD_START() > > > > + > > > > +AT_CHECK([ovs-vsctl -- set bridge br0 > > > other-config:hwaddr=\"f2:ff:00:00:00:01\"]) > > > > +ADD_BR([br-underlay], [set bridge br-underlay > > > other-config:hwaddr=\"ee:09:e0:4d:bf:31\"]) > > > > + > > > > +AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"]) > > > > +AT_CHECK([ovs-ofctl add-flow br-underlay "actions=normal"]) > > > > + > > > > +ADD_NAMESPACES(at_ns0) > > > > + > > > > +dnl Set up underlay link from host into the namespace using veth > pair. 
> > > > +ADD_VETH(p0, at_ns0, br-underlay, "172.16.180.105/24", > > > 4e:12:5d:6c:74:3d) > > > > +AT_CHECK([ip addr add dev br-underlay "172.16.180.106/24"]) > > > > +AT_CHECK([ip link set dev br-underlay up]) > > > > + > > > > +dnl Set up tunnel endpoints on OVS outside the namespace. > > > > +ADD_OVS_TUNNEL([geneve], [br0], [at_gnv0], [172.16.180.105], [ > > > 192.168.10.100/24], > > > > +[options:packet_type=ptap]) > > > > + > > > > +dnl Certain Linux distributions, like CentOS, have default iptable > rules > > > > +dnl to reject input traffic from br-underlay. Here we add a rule to > walk > > > > +dnl around it. > > > > +iptables -I INPUT 1 -i br-underlay -j ACCEPT > > > > +on_exit 'iptables -D INPUT 1' > > > > + > > > > +dnl Firstly, test normal BFD packet. > > > > +ovs-ofctl -O OpenFlow13 packet-out br-underlay "in_port=1 > > > > packet=ee09e04dbf314e125d6c743d080045c0006624d94000401153f9ac10b469ac10b46a739f17c1005247f900806558002320016a0f5a9ed376080045c00034ff11fa03ac10b469ac10b46ac0070ec8002021c003187b3c96ebbc96b962000186af4240 > > > actions=NORMAL" > > > The packet above is > > > dnl outer IP: Source: 172.16.180.105 Destination: 172.16.180.106 > > > dnl inner IP: Source: 172.16.180.105 Destination: 172.16.180.106 > > > > > > And why does this trigger ovs to process it ex: > bfd_should_process_flow() > > > return true? > > > In your patch, you're adding extra check > > > bfd->ip_src == flow->nw_dst > > > and here > > > bfd->ip_src is default 169.254.1.1 > > > flow->nw_dst is 172.16.180.106 > > > > > > > > > > +dnl Next we test overlay BFD packet. > > > > +ovs-ofctl -O OpenFlow13 packet-out br-underlay "in_port=1 > > > > packet=ee09e04dbf314e125d6c743d0800455b2558400040115445ac10b469ac10b46a6b1017c1004722c10240655801048001001803005254009d0b6d5254000c8984080045210e22400040119688c0a80a68c0a80a6995cd0ec8000dd342746573740a > > > actions=NORMAL" > > the 2nd packet is NORMAL > > > > > > The packet above is > > > dnl o
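To make the condition under discussion concrete, here is a minimal sketch of just the address test that the patch adds to bfd_should_process_flow(). The helper name bfd_addr_check() is hypothetical and not part of the patch; the sketch also assumes bfd_src_ip is configured as 172.16.180.106, which is the config fix promised for v3.

```c
#include <arpa/inet.h>   /* htonl() */
#include <stdbool.h>
#include <stdint.h>

#define BFD_DEFAULT_SRC_IP 0xA9FE0101 /* 169.254.1.1, as defined in the patch */

/* Hypothetical helper (not in the patch): isolates the extra address test.
 * Both arguments are IPv4 addresses in network byte order.
 *
 * - If the BFD instance still uses the default source IP, every BFD packet
 *   passes, which keeps the old behavior.
 * - Otherwise only packets addressed to the configured source IP pass. */
static bool
bfd_addr_check(uint32_t bfd_ip_src, uint32_t flow_nw_dst)
{
    return bfd_ip_src == htonl(BFD_DEFAULT_SRC_IP)
           || bfd_ip_src == flow_nw_dst;
}
```

With ip_src set to 172.16.180.106 (0xAC10B46A), a flow whose nw_dst is 172.16.180.106 passes the check, while the overlay packet with nw_dst 192.168.10.105 (0xC0A80A69) does not and is left to the flow table. With the default ip_src the check passes for any destination, which is why the v2 test could not tell the two packets apart.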
[ovs-dev] [PATCH v3] bfd: Support overlay BFD
Current OVS intercepts and processes all BFD packets, thus VM-2-VM BFD packets get lost and the recipient VM never sees them. This patch fixes it by only intercepting and processing BFD packets destined to a configured BFD instance, and other BFD packets are made available to the OVS flow table for forwarding. This patch keeps BFD's backward compatibility. VMWare-BZ: 2579326 Signed-off-by: Yifeng Sun --- v1->v2: Add test by William's suggestion. v2->v3: Fix BFD config, thanks William. lib/bfd.c | 16 +--- tests/system-traffic.at | 43 +++ vswitchd/vswitch.xml| 7 +++ 3 files changed, 63 insertions(+), 3 deletions(-) diff --git a/lib/bfd.c b/lib/bfd.c index cc8c6857afa4..3c965699ace3 100644 --- a/lib/bfd.c +++ b/lib/bfd.c @@ -149,6 +149,9 @@ BUILD_ASSERT_DECL(BFD_PACKET_LEN == sizeof(struct msg)); #define FLAGS_MASK 0x3f #define DEFAULT_MULT 3 +#define BFD_DEFAULT_SRC_IP 0xA9FE0101 /* 169.254.1.1 */ +#define BFD_DEFAULT_DST_IP 0xA9FE0100 /* 169.254.1.0 */ + struct bfd { struct hmap_node node;/* In 'all_bfds'. */ uint32_t disc;/* bfd.LocalDiscr. Key in 'all_bfds' hmap. 
*/ @@ -457,9 +460,9 @@ bfd_configure(struct bfd *bfd, const char *name, const struct smap *cfg, &bfd->rmt_eth_dst); bfd_lookup_ip(smap_get_def(cfg, "bfd_src_ip", ""), - htonl(0xA9FE0101) /* 169.254.1.1 */, &bfd->ip_src); + htonl(BFD_DEFAULT_SRC_IP), &bfd->ip_src); bfd_lookup_ip(smap_get_def(cfg, "bfd_dst_ip", ""), - htonl(0xA9FE0100) /* 169.254.1.0 */, &bfd->ip_dst); + htonl(BFD_DEFAULT_DST_IP), &bfd->ip_dst); forwarding_if_rx = smap_get_bool(cfg, "forwarding_if_rx", false); if (bfd->forwarding_if_rx != forwarding_if_rx) { @@ -674,7 +677,14 @@ bfd_should_process_flow(const struct bfd *bfd_, const struct flow *flow, memset(&wc->masks.nw_proto, 0xff, sizeof wc->masks.nw_proto); if (flow->nw_proto == IPPROTO_UDP && !(flow->nw_frag & FLOW_NW_FRAG_LATER) -&& tp_dst_equals(flow, BFD_DEST_PORT, wc)) { +&& tp_dst_equals(flow, BFD_DEST_PORT, wc) +&& (bfd->ip_src == htonl(BFD_DEFAULT_SRC_IP) +|| bfd->ip_src == flow->nw_dst)) { + +if (bfd->ip_src == flow->nw_dst) { +memset(&wc->masks.nw_dst, 0x, sizeof wc->masks.nw_dst); +} + bool check_tnl_key; atomic_read_relaxed(&bfd->check_tnl_key, &check_tnl_key); diff --git a/tests/system-traffic.at b/tests/system-traffic.at index 2a0fbadff4a1..5c1aee49aba2 100644 --- a/tests/system-traffic.at +++ b/tests/system-traffic.at @@ -6289,3 +6289,46 @@ OVS_WAIT_UNTIL([cat p2.pcap | egrep "0x0050: * * *5002 *2000 *b85e * OVS_TRAFFIC_VSWITCHD_STOP AT_CLEANUP + +AT_SETUP([bfd - BFD overlay]) +OVS_CHECK_GENEVE() + +OVS_TRAFFIC_VSWITCHD_START() + +AT_CHECK([ovs-vsctl -- set bridge br0 other-config:hwaddr=\"f2:ff:00:00:00:01\"]) +ADD_BR([br-underlay], [set bridge br-underlay other-config:hwaddr=\"ee:09:e0:4d:bf:31\"]) + +AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"]) +AT_CHECK([ovs-ofctl add-flow br-underlay "actions=normal"]) + +ADD_NAMESPACES(at_ns0) + +dnl Set up underlay link from host into the namespace using veth pair. 
+ADD_VETH(p0, at_ns0, br-underlay, "172.16.180.105/24", 4e:12:5d:6c:74:3d) +AT_CHECK([ip addr add dev br-underlay "172.16.180.106/24"]) +AT_CHECK([ip link set dev br-underlay up]) + +dnl Set up tunnel endpoints on OVS outside the namespace. +ADD_OVS_TUNNEL([geneve], [br0], [at_gnv0], [172.16.180.105], [192.168.10.100/24], +[options:packet_type=ptap]) +AT_CHECK([ovs-vsctl -- set Interface at_gnv0 bfd:enable=true bfd:bfd_src_ip=172.16.180.106]) + +dnl Certain Linux distributions, like CentOS, have default iptable rules +dnl to reject input traffic from br-underlay. Here we add a rule to walk +dnl around it. +iptables -I INPUT 1 -i br-underlay -j ACCEPT +on_exit 'iptables -D INPUT 1' + +dnl Firstly, test normal BFD packet. +ovs-ofctl -O OpenFlow13 packet-out br-underlay "in_port=1 packet=ee09e04dbf314e125d6c743d080045c0006624d94000401153f9ac10b469ac10b46a739f17c1005247f900806558002320016a0f5a9ed376080045c00034ff11fa03ac10b469ac10b46ac0070ec8002021c003187b3c96ebbc96b962000186af4240 actions=NORMAL" +dnl Next we test overlay BFD packet. +ovs-ofctl -O OpenFlow13 packet-out br-underlay "in_port=1 packet=ee09e04dbf314e125d6c743d0800455b2558400040115445ac10b469ac10b46a6b1017c1004722c10240655801048001001803005254009d0b6d525400
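As a quick sanity check on the two new constants, the sketch below compares a host-order hex constant against its dotted-quad form. The helper ipv4_const_matches() is hypothetical and only illustrates why the patch wraps the constants in htonl() before comparing them with bfd->ip_src.

```c
#include <arpa/inet.h>   /* inet_pton(), htonl() */
#include <stdbool.h>
#include <stdint.h>

#define BFD_DEFAULT_SRC_IP 0xA9FE0101 /* 169.254.1.1 */
#define BFD_DEFAULT_DST_IP 0xA9FE0100 /* 169.254.1.0 */

/* Hypothetical helper: returns true if the host-order constant 'host_order'
 * denotes the same IPv4 address as the dotted-quad string 'text'.
 * inet_pton() yields network byte order, so the constant is converted with
 * htonl() before the comparison. */
static bool
ipv4_const_matches(uint32_t host_order, const char *text)
{
    struct in_addr parsed;

    if (inet_pton(AF_INET, text, &parsed) != 1) {
        return false;    /* 'text' is not a valid IPv4 address. */
    }
    return parsed.s_addr == htonl(host_order);
}
```

Calling ipv4_const_matches(BFD_DEFAULT_SRC_IP, "169.254.1.1") should hold on both little-endian and big-endian hosts, since the conversion is done explicitly rather than assumed.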
[ovs-dev] [PATCH v4] bfd: Support overlay BFD
Current OVS intercepts and processes all BFD packets, thus VM-2-VM BFD packets get lost and the recipient VM never sees them. This patch fixes it by only intercepting and processing BFD packets destined to a configured BFD instance, and other BFD packets are made available to the OVS flow table for forwarding. This patch keeps BFD's backward compatibility. VMWare-BZ: 2579326 Signed-off-by: Yifeng Sun --- v1->v2: Add test by William's suggestion. v2->v3: Fix BFD config, thanks William. v3->v4: Test will fail at second run, fixed it. lib/bfd.c | 16 +--- tests/system-traffic.at | 44 vswitchd/vswitch.xml| 7 +++ 3 files changed, 64 insertions(+), 3 deletions(-) diff --git a/lib/bfd.c b/lib/bfd.c index cc8c6857afa4..3c965699ace3 100644 --- a/lib/bfd.c +++ b/lib/bfd.c @@ -149,6 +149,9 @@ BUILD_ASSERT_DECL(BFD_PACKET_LEN == sizeof(struct msg)); #define FLAGS_MASK 0x3f #define DEFAULT_MULT 3 +#define BFD_DEFAULT_SRC_IP 0xA9FE0101 /* 169.254.1.1 */ +#define BFD_DEFAULT_DST_IP 0xA9FE0100 /* 169.254.1.0 */ + struct bfd { struct hmap_node node;/* In 'all_bfds'. */ uint32_t disc;/* bfd.LocalDiscr. Key in 'all_bfds' hmap. 
*/ @@ -457,9 +460,9 @@ bfd_configure(struct bfd *bfd, const char *name, const struct smap *cfg, &bfd->rmt_eth_dst); bfd_lookup_ip(smap_get_def(cfg, "bfd_src_ip", ""), - htonl(0xA9FE0101) /* 169.254.1.1 */, &bfd->ip_src); + htonl(BFD_DEFAULT_SRC_IP), &bfd->ip_src); bfd_lookup_ip(smap_get_def(cfg, "bfd_dst_ip", ""), - htonl(0xA9FE0100) /* 169.254.1.0 */, &bfd->ip_dst); + htonl(BFD_DEFAULT_DST_IP), &bfd->ip_dst); forwarding_if_rx = smap_get_bool(cfg, "forwarding_if_rx", false); if (bfd->forwarding_if_rx != forwarding_if_rx) { @@ -674,7 +677,14 @@ bfd_should_process_flow(const struct bfd *bfd_, const struct flow *flow, memset(&wc->masks.nw_proto, 0xff, sizeof wc->masks.nw_proto); if (flow->nw_proto == IPPROTO_UDP && !(flow->nw_frag & FLOW_NW_FRAG_LATER) -&& tp_dst_equals(flow, BFD_DEST_PORT, wc)) { +&& tp_dst_equals(flow, BFD_DEST_PORT, wc) +&& (bfd->ip_src == htonl(BFD_DEFAULT_SRC_IP) +|| bfd->ip_src == flow->nw_dst)) { + +if (bfd->ip_src == flow->nw_dst) { +memset(&wc->masks.nw_dst, 0x, sizeof wc->masks.nw_dst); +} + bool check_tnl_key; atomic_read_relaxed(&bfd->check_tnl_key, &check_tnl_key); diff --git a/tests/system-traffic.at b/tests/system-traffic.at index 2a0fbadff4a1..ea72f155782f 100644 --- a/tests/system-traffic.at +++ b/tests/system-traffic.at @@ -6289,3 +6289,47 @@ OVS_WAIT_UNTIL([cat p2.pcap | egrep "0x0050: * * *5002 *2000 *b85e * OVS_TRAFFIC_VSWITCHD_STOP AT_CLEANUP + +AT_SETUP([bfd - BFD overlay]) +OVS_CHECK_GENEVE() + +OVS_TRAFFIC_VSWITCHD_START() + +AT_CHECK([ovs-vsctl -- set bridge br0 other-config:hwaddr=\"f2:ff:00:00:00:01\"]) +ADD_BR([br-underlay], [set bridge br-underlay other-config:hwaddr=\"ee:09:e0:4d:bf:31\"]) + +AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"]) +AT_CHECK([ovs-ofctl add-flow br-underlay "actions=normal"]) + +ADD_NAMESPACES(at_ns0) + +dnl Set up underlay link from host into the namespace using veth pair. 
+ADD_VETH(p0, at_ns0, br-underlay, "172.16.180.105/24", 4e:12:5d:6c:74:3d) +AT_CHECK([ip addr add dev br-underlay "172.16.180.106/24"]) +AT_CHECK([ip link set dev br-underlay up]) + +dnl Set up tunnel endpoints on OVS outside the namespace. +ADD_OVS_TUNNEL([geneve], [br0], [at_gnv0], [172.16.180.105], [192.168.10.100/24], +[options:packet_type=ptap]) +AT_CHECK([ovs-vsctl -- set Interface at_gnv0 ofport_request=1]) +AT_CHECK([ovs-vsctl -- set Interface at_gnv0 bfd:enable=true bfd:bfd_src_ip="172.16.180.106"]) + +dnl Certain Linux distributions, like CentOS, have default iptable rules +dnl to reject input traffic from br-underlay. Here we add a rule to walk +dnl around it. +iptables -I INPUT 1 -i br-underlay -j ACCEPT +on_exit 'iptables -D INPUT 1' + +dnl Firstly, test normal BFD packet. +ovs-ofctl -O OpenFlow13 packet-out br-underlay "in_port=1 packet=ee09e04dbf314e125d6c743d080045c00066ea4e400040118e83ac10b469ac10b46a356e17c10052cb5500806558002320018ad232b919c4080045c00034ff11fa03ac10b469ac10b46acec8002020800318035cd00e000f424f4240 actions=NORMAL" +dnl Next we test overlay BFD packet. +ovs-ofctl -O OpenFlow13 packet-out br-underlay "in_port=1 packet=ee09e04dbf
Re: [ovs-dev] [PATCH v4] bfd: Support overlay BFD
Confirmed that the setup is quite unstable. Sometimes bfd flow shows up in datapath-flows.txt but sometimes not. Let me take a look. Thanks, Yifeng On Thu, Jul 23, 2020 at 6:51 AM William Tu wrote: > On Wed, Jul 22, 2020 at 1:41 PM Yifeng Sun wrote: > > > > Current OVS intercepts and processes all BFD packets, thus VM-2-VM > > BFD packets get lost and the recipient VM never sees them. > > > > This patch fixes it by only intercepting and processing BFD packets > > destined to a configured BFD instance, and other BFD packets are made > > available to the OVS flow table for forwarding. > > > > This patch keeps BFD's backward compatibility. > > > > VMWare-BZ: 2579326 > s/VMWare/VMware > s/2579326/#2579326/ > > > Signed-off-by: Yifeng Sun > > --- > > v1->v2: Add test by William's suggestion. > > v2->v3: Fix BFD config, thanks William. > > v3->v4: Test will fail at second run, fixed it. > > > > lib/bfd.c | 16 +--- > > tests/system-traffic.at | 44 > > > vswitchd/vswitch.xml| 7 +++ > > 3 files changed, 64 insertions(+), 3 deletions(-) > > > > diff --git a/lib/bfd.c b/lib/bfd.c > > index cc8c6857afa4..3c965699ace3 100644 > > --- a/lib/bfd.c > > +++ b/lib/bfd.c > > @@ -149,6 +149,9 @@ BUILD_ASSERT_DECL(BFD_PACKET_LEN == sizeof(struct > msg)); > > #define FLAGS_MASK 0x3f > > #define DEFAULT_MULT 3 > > > > +#define BFD_DEFAULT_SRC_IP 0xA9FE0101 /* 169.254.1.1 */ > > +#define BFD_DEFAULT_DST_IP 0xA9FE0100 /* 169.254.1.0 */ > > + > > struct bfd { > > struct hmap_node node;/* In 'all_bfds'. */ > > uint32_t disc;/* bfd.LocalDiscr. Key in 'all_bfds' > hmap. 
*/ > > @@ -457,9 +460,9 @@ bfd_configure(struct bfd *bfd, const char *name, > const struct smap *cfg, > > &bfd->rmt_eth_dst); > > > > bfd_lookup_ip(smap_get_def(cfg, "bfd_src_ip", ""), > > - htonl(0xA9FE0101) /* 169.254.1.1 */, &bfd->ip_src); > > + htonl(BFD_DEFAULT_SRC_IP), &bfd->ip_src); > > bfd_lookup_ip(smap_get_def(cfg, "bfd_dst_ip", ""), > > - htonl(0xA9FE0100) /* 169.254.1.0 */, &bfd->ip_dst); > > + htonl(BFD_DEFAULT_DST_IP), &bfd->ip_dst); > > > > forwarding_if_rx = smap_get_bool(cfg, "forwarding_if_rx", false); > > if (bfd->forwarding_if_rx != forwarding_if_rx) { > > @@ -674,7 +677,14 @@ bfd_should_process_flow(const struct bfd *bfd_, > const struct flow *flow, > > memset(&wc->masks.nw_proto, 0xff, sizeof wc->masks.nw_proto); > > if (flow->nw_proto == IPPROTO_UDP > > && !(flow->nw_frag & FLOW_NW_FRAG_LATER) > > -&& tp_dst_equals(flow, BFD_DEST_PORT, wc)) { > > +&& tp_dst_equals(flow, BFD_DEST_PORT, wc) > > +&& (bfd->ip_src == htonl(BFD_DEFAULT_SRC_IP) > > +|| bfd->ip_src == flow->nw_dst)) { > > + > > +if (bfd->ip_src == flow->nw_dst) { > > +memset(&wc->masks.nw_dst, 0x, sizeof > wc->masks.nw_dst); > > +} > > + > > bool check_tnl_key; > > > > atomic_read_relaxed(&bfd->check_tnl_key, &check_tnl_key); > > diff --git a/tests/system-traffic.at b/tests/system-traffic.at > > index 2a0fbadff4a1..ea72f155782f 100644 > > --- a/tests/system-traffic.at > > +++ b/tests/system-traffic.at > > @@ -6289,3 +6289,47 @@ OVS_WAIT_UNTIL([cat p2.pcap | egrep "0x0050: > * * *5002 *2000 *b85e * > > > > OVS_TRAFFIC_VSWITCHD_STOP > > AT_CLEANUP > > + > > +AT_SETUP([bfd - BFD overlay]) > > +OVS_CHECK_GENEVE() > > + > > +OVS_TRAFFIC_VSWITCHD_START() > > + > > +AT_CHECK([ovs-vsctl -- set bridge br0 > other-config:hwaddr=\"f2:ff:00:00:00:01\"]) > > +ADD_BR([br-underlay], [set bridge br-underlay > other-config:hwaddr=\"ee:09:e0:4d:bf:31\"]) > > + > > +AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"]) > > +AT_CHECK([ovs-ofctl add-flow br-underlay "actions=normal"]) > > + > > 
+ADD_NAMESPACES(at_ns0) > > + > > +dnl Set up underlay link from host into the namespace using veth pair. > > +ADD_VETH(p0, at_ns0, br-underlay, "172.16.180.105/24", > 4e:12:5d:6c:74:3d)

Re: [ovs-dev] [PATCH 05/23] datapath: Print error when ovs_execute_actions() fails
LGTM, thanks. Reviewed-by: Yifeng Sun On Thu, Aug 20, 2020 at 3:50 PM Greg Rose wrote: > From: Yifeng Sun > > Upstream commit: > commit aa733660dbd8d9192b8c528ae0f4b84f3fef74e4 > Author: Yifeng Sun > Date: Sun Aug 4 19:56:11 2019 -0700 > > openvswitch: Print error when ovs_execute_actions() fails > > Currently in function ovs_dp_process_packet(), return values of > ovs_execute_actions() are silently discarded. This patch prints out > an debug message when error happens so as to provide helpful hints > for debugging. > Acked-by: Pravin B Shelar > > Signed-off-by: David S. Miller > > Cc: Yifeng Sun > Signed-off-by: Greg Rose > --- > datapath/datapath.c | 7 +-- > 1 file changed, 5 insertions(+), 2 deletions(-) > > diff --git a/datapath/datapath.c b/datapath/datapath.c > index 2879f24..c8c21d7 100644 > --- a/datapath/datapath.c > +++ b/datapath/datapath.c > @@ -240,6 +240,7 @@ void ovs_dp_process_packet(struct sk_buff *skb, struct > sw_flow_key *key) > struct dp_stats_percpu *stats; > u64 *stats_counter; > u32 n_mask_hit; > + int error; > > stats = this_cpu_ptr(dp->stats_percpu); > > @@ -248,7 +249,6 @@ void ovs_dp_process_packet(struct sk_buff *skb, struct > sw_flow_key *key) > &n_mask_hit); > if (unlikely(!flow)) { > struct dp_upcall_info upcall; > - int error; > > memset(&upcall, 0, sizeof(upcall)); > upcall.cmd = OVS_PACKET_CMD_MISS; > @@ -265,7 +265,10 @@ void ovs_dp_process_packet(struct sk_buff *skb, > struct sw_flow_key *key) > > ovs_flow_stats_update(flow, key->tp.flags, skb); > sf_acts = rcu_dereference(flow->sf_acts); > - ovs_execute_actions(dp, skb, sf_acts, key); > + error = ovs_execute_actions(dp, skb, sf_acts, key); > + if (unlikely(error)) > + net_dbg_ratelimited("ovs: action execution error on > datapath %s: %d\n", > + ovs_dp_name(dp), > error); > > stats_counter = &stats->n_hit; > > -- > 1.8.3.1
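The pattern in this backport (capture the return value instead of discarding it, then log on failure) can be sketched in userspace as follows. This is only an analogue under stated assumptions: execute_fn, process_packet(), the two stub callbacks and the n_errors counter are hypothetical, and fprintf() stands in for the kernel's net_dbg_ratelimited(), so there is no rate limiting here.

```c
#include <stdio.h>

/* Hypothetical stand-in for ovs_execute_actions(): returns 0 on success or
 * a negative errno-style value on failure. */
typedef int (*execute_fn)(void *packet);

/* Runs 'execute' on 'packet' and reports a failure instead of silently
 * dropping the return value.  'n_errors' is a hypothetical counter that
 * exists only so the behavior can be observed by the caller. */
static int
process_packet(execute_fn execute, void *packet, const char *dp_name,
               unsigned int *n_errors)
{
    int error = execute(packet);

    if (error) {
        fprintf(stderr, "ovs: action execution error on datapath %s: %d\n",
                dp_name, error);
        ++*n_errors;
    }
    return error;
}

/* Example callbacks: one that succeeds and one that fails with -22. */
static int execute_ok(void *packet) { (void) packet; return 0; }
static int execute_fail(void *packet) { (void) packet; return -22; }
```

Calling process_packet(execute_fail, NULL, "dp0", &n) writes one message to stderr and increments n, while the execute_ok case stays silent, which mirrors what the patch does with the previously ignored return value.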
Re: [ovs-dev] [PATCH v1 1/1] dns-resolve: Allow unbound's config file to be set through an env var.
Hi Ted, There seems to be an indentation issue with the next-to-last '}'; can you take a look? Thanks, Yifeng On Tue, Sep 8, 2020 at 8:33 PM Ted Elhourani wrote: > When an unbound context is created, check whether OVS_UNBOUND_CONF has been > set. If a valid config file is supplied then use it to configure the > context. The procedure returns if the config file is invalid. If no config > file is found then the default unbound config is used. > > Signed-off-by: Ted Elhourani > --- > Documentation/intro/install/general.rst | 4 +++- > lib/dns-resolve.c | 12 > 2 files changed, 15 insertions(+), 1 deletion(-) > > diff --git a/Documentation/intro/install/general.rst > b/Documentation/intro/install/general.rst > index 09f2c13f1..c4300cd53 100644 > --- a/Documentation/intro/install/general.rst > +++ b/Documentation/intro/install/general.rst > @@ -97,7 +97,9 @@ need the following software: >specifying OpenFlow and OVSDB remotes. If unbound library is already >installed, then Open vSwitch will automatically build with support for > it. >The environment variable OVS_RESOLV_CONF can be used to specify DNS > server > - configuration file (the default file on Linux is /etc/resolv.conf). > + configuration file (the default file on Linux is /etc/resolv.conf), and > + environment variable OVS_UNBOUND_CONF can be used to specify the > + configuration file for unbound. 
> > On Linux, you may choose to compile the kernel module that comes with the > Open > vSwitch distribution or to use the kernel module built into the Linux > kernel > diff --git a/lib/dns-resolve.c b/lib/dns-resolve.c > index 1ff58960f..9b5928e0b 100644 > --- a/lib/dns-resolve.c > +++ b/lib/dns-resolve.c > @@ -82,6 +82,18 @@ dns_resolve_init(bool is_daemon) > return; > } > > +const char *ub_conf_filename = getenv("OVS_UNBOUND_CONF"); > +if (ub_conf_filename != NULL) { > +int retval = ub_ctx_config(ub_ctx__, ub_conf_filename); > +if (retval != 0) { > +VLOG_WARN_RL(&rl, "Failed to set libunbound context config: > %s", > + ub_strerror(retval)); > +ub_ctx_delete(ub_ctx__); > +ub_ctx__ = NULL; > +return; > + } > +} > + > const char *filename = getenv("OVS_RESOLV_CONF"); > if (!filename) { > #ifdef _WIN32 > -- > 2.22.3
Re: [ovs-dev] [PATCH v1 1/1] dns-resolve: Allow unbound's config file to be set through an env var.
Looks good to me, thanks. Reviewed-by: Yifeng Sun On Mon, Sep 28, 2020 at 11:54 AM Ted Elhourani wrote: > When an unbound context is created, check whether OVS_UNBOUND_CONF has been > set. If a valid config file is supplied then use it to configure the > context. The procedure returns if the config file is invalid. If no config > file is found then the default unbound config is used. > > Signed-off-by: Ted Elhourani > --- > Documentation/intro/install/general.rst | 4 +++- > lib/dns-resolve.c | 12 > 2 files changed, 15 insertions(+), 1 deletion(-) > > diff --git a/Documentation/intro/install/general.rst > b/Documentation/intro/install/general.rst > index 09f2c13f1..c4300cd53 100644 > --- a/Documentation/intro/install/general.rst > +++ b/Documentation/intro/install/general.rst > @@ -97,7 +97,9 @@ need the following software: >specifying OpenFlow and OVSDB remotes. If unbound library is already >installed, then Open vSwitch will automatically build with support for > it. >The environment variable OVS_RESOLV_CONF can be used to specify DNS > server > - configuration file (the default file on Linux is /etc/resolv.conf). > + configuration file (the default file on Linux is /etc/resolv.conf), and > + environment variable OVS_UNBOUND_CONF can be used to specify the > + configuration file for unbound. 
> > On Linux, you may choose to compile the kernel module that comes with the > Open > vSwitch distribution or to use the kernel module built into the Linux > kernel > diff --git a/lib/dns-resolve.c b/lib/dns-resolve.c > index 1ff58960f..d34451434 100644 > --- a/lib/dns-resolve.c > +++ b/lib/dns-resolve.c > @@ -82,6 +82,18 @@ dns_resolve_init(bool is_daemon) > return; > } > > +const char *ub_conf_filename = getenv("OVS_UNBOUND_CONF"); > +if (ub_conf_filename != NULL) { > +int retval = ub_ctx_config(ub_ctx__, ub_conf_filename); > +if (retval != 0) { > +VLOG_WARN_RL(&rl, "Failed to set libunbound context config: > %s", > + ub_strerror(retval)); > +ub_ctx_delete(ub_ctx__); > +ub_ctx__ = NULL; > +return; > +} > +} > + > const char *filename = getenv("OVS_RESOLV_CONF"); > if (!filename) { > #ifdef _WIN32 > -- > 2.22.3
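The three outcomes described in the commit message (no variable set, valid config, invalid config) can be sketched without linking against libunbound. Everything below except the OVS_UNBOUND_CONF variable name is an assumption for illustration: config_loader stands in for ub_ctx_config(), and resolver_init(), the enum and the two stub loaders are hypothetical.

```c
#include <stdlib.h>   /* getenv() */

/* Hypothetical stand-in for ub_ctx_config(): returns 0 on success and a
 * nonzero value if the file cannot be used. */
typedef int (*config_loader)(const char *path);

enum init_result {
    INIT_DEFAULT_CONFIG,   /* OVS_UNBOUND_CONF unset: keep library defaults. */
    INIT_CUSTOM_CONFIG,    /* Config file loaded successfully. */
    INIT_FAILED            /* Config file rejected: give up, as the patch
                            * does by deleting the context and returning. */
};

static enum init_result
resolver_init(config_loader load)
{
    const char *conf = getenv("OVS_UNBOUND_CONF");

    if (conf != NULL) {
        if (load(conf) != 0) {
            return INIT_FAILED;
        }
        return INIT_CUSTOM_CONFIG;
    }
    return INIT_DEFAULT_CONFIG;
}

/* Example loaders: one that accepts any path and one that rejects it. */
static int load_ok(const char *path) { (void) path; return 0; }
static int load_bad(const char *path) { (void) path; return -1; }
```

The point of the early return on failure is that a bad config file disables resolution outright rather than silently falling back to defaults, so a misconfiguration is visible instead of masked.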
Re: [ovs-dev] [PATCH] rhel: Add case for RHEL 7.5 major version to kmod manage script
Reviewed-by: Yifeng Sun On Tue, Aug 27, 2019 at 2:06 PM Greg Rose wrote: > > A Centos 7.5 kernel with an unencountered set of minor build numbers > caused an upgrade bug. Adding the case for the rhel 7.5 kmod management > script fixes the problem. > > Signed-off-by: Greg Rose > --- > rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh | 5 + > 1 file changed, 5 insertions(+) > > diff --git a/rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh > b/rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh > index 2cd8e5c..51756ec 100644 > --- a/rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh > +++ b/rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh > @@ -85,6 +85,11 @@ if [ "$mainline_major" = "3" ] && [ "$mainline_minor" = > "10" ]; then > comp_ver=11 > ver_offset=4 > installed_ver="$minor_rev" > +elif [ "$major_rev" = "862" ]; then > +#echo "rhel75" > +comp_ver=11 > +ver_offset=4 > +installed_ver="$minor_rev" > elif [ "$major_rev" = "957" ]; then > #echo "rhel76" > comp_ver=10 > -- > 1.8.3.1
Re: [ovs-dev] [PATCH v2] userspace: Enable non-bridge port as tunnel endpoint.
Hi Ben,

Could you please take a look at this patch if you have time?

Thanks.
Yifeng

On Thu, Jul 18, 2019 at 1:07 PM Yifeng Sun wrote:
>
> For userspace datapath, currently only the bridge itself, the LOCAL port,
> can be the tunnel endpoint to encap/decap tunnel packets. This patch
> enables non-bridge port as tunnel endpoint. One use case is for users to
> create a bridge and a vtep port as tap, and configure underlay IP at vtep
> port as the tunnel endpoint.
>
> Signed-off-by: William Tu
> Co-authored-by: William Tu
> Signed-off-by: Yifeng Sun
> ---
> v1->v2: Fixed an error pointed out by Ben.
>
>  ofproto/ofproto-dpif-xlate.c | 56 +++-
>  tests/tunnel-push-pop.at     | 55 +++
>  2 files changed, 100 insertions(+), 11 deletions(-)
>
> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> index 73966a4e83ca..84c650de19ee 100644
> --- a/ofproto/ofproto-dpif-xlate.c
> +++ b/ofproto/ofproto-dpif-xlate.c
> @@ -3403,6 +3403,19 @@ tnl_route_lookup_flow(const struct xlate_ctx *ctx,
>              }
>          }
>      }
> +
> +    /* If tunnel IP isn't configured on bridges, then we search all ports. */
> +    HMAP_FOR_EACH (xbridge, hmap_node, &ctx->xcfg->xbridges) {
> +        struct xport *port;
> +
> +        HMAP_FOR_EACH (port, ofp_node, &xbridge->xports) {
> +            if (!strncmp(netdev_get_name(port->netdev),
> +                         out_dev, IFNAMSIZ)) {
> +                *out_port = port;
> +                return 0;
> +            }
> +        }
> +    }
>      return -ENOENT;
>  }
>
> @@ -3965,6 +3978,16 @@ is_nd_dst_correct(const struct flow *flow, const struct in6_addr *ipv6_addr)
>              IN6_ARE_ADDR_EQUAL(&flow->ipv6_dst, ipv6_addr);
>  }
>
> +static bool
> +is_neighbor_reply_matched(const struct flow *flow, struct in6_addr *ip_addr)
> +{
> +    return ((IN6_IS_ADDR_V4MAPPED(ip_addr) &&
> +             flow->dl_type == htons(ETH_TYPE_ARP) &&
> +             in6_addr_get_mapped_ipv4(ip_addr) == flow->nw_dst) ||
> +            (!IN6_IS_ADDR_V4MAPPED(ip_addr) &&
> +             is_nd_dst_correct(flow, ip_addr)));
> +}
> +
>  /* Function verifies if the ARP reply or Neighbor Advertisement represented by
>   * 'flow' addresses the 'xbridge' of 'ctx'. Returns true if the ARP TA or
>   * neighbor discovery destination is in the list of configured IP addresses of
> @@ -3979,11 +4002,7 @@ is_neighbor_reply_correct(const struct xlate_ctx *ctx, const struct flow *flow)
>      /* Verify if 'nw_dst' of ARP or 'ipv6_dst' of ICMPV6 is in the list. */
>      for (i = 0; xbridge_addr && i < xbridge_addr->n_addr; i++) {
>          struct in6_addr *ip_addr = &xbridge_addr->addr[i];
> -        if ((IN6_IS_ADDR_V4MAPPED(ip_addr) &&
> -             flow->dl_type == htons(ETH_TYPE_ARP) &&
> -             in6_addr_get_mapped_ipv4(ip_addr) == flow->nw_dst) ||
> -            (!IN6_IS_ADDR_V4MAPPED(ip_addr) &&
> -             is_nd_dst_correct(flow, ip_addr))) {
> +        if (is_neighbor_reply_matched(flow, ip_addr)) {
>              /* Found a match. */
>              ret = true;
>              break;
> @@ -3991,20 +4010,35 @@ is_neighbor_reply_correct(const struct xlate_ctx *ctx, const struct flow *flow)
>      }
>
>      xbridge_addr_unref(xbridge_addr);
> +
> +    /* If not found in bridge's IPs, search in its ports. */
> +    if (!ret) {
> +        struct in6_addr *ip_addr, *mask;
> +        struct xport *port;
> +        int error, n_in6;
> +
> +        HMAP_FOR_EACH (port, ofp_node, &ctx->xbridge->xports) {
> +            error = netdev_get_addr_list(port->netdev, &ip_addr,
> +                                         &mask, &n_in6);
> +            if (!error && is_neighbor_reply_matched(flow, ip_addr)) {
> +                /* Found a match. */
> +                ret = true;
> +                break;
> +            }
> +        }
> +    }
>      return ret;
>  }
>
>  static bool
> -terminate_native_tunnel(struct xlate_ctx *ctx, ofp_port_t ofp_port,
> -                        struct flow *flow, struct flow_wildcards *wc,
> -                        odp_port_t *tnl_port)
> +terminate_native_tunnel(struct xlate_ctx *ctx, struct flow *flow,
> +                        struct flow_wildcards *wc, odp_port_t *tnl_port)
>  {
>      *tnl_port = ODPP_NONE;
>
>      /* XXX: Write better Filter for tunnel port. We can use in_port
>       * in tunnel-po
[ovs-dev] [PATCH v3] userspace: Enable non-bridge port as tunnel endpoint.
For userspace datapath, currently only the bridge itself, the LOCAL port,
can be the tunnel endpoint to encap/decap tunnel packets. This patch
enables non-bridge port as tunnel endpoint. One use case is for users to
create a bridge and a vtep port as tap, and configure underlay IP at vtep
port as the tunnel endpoint.

This patch causes failure for test "ptap - L3 over patch port". This is
because this test is already using non-bridge port gre1 as tunnel endpoint.
In this test, an extra flow is added to support this, as shown below:
ovs-ofctl add-flow br1 in_port=p1,actions=output=gre1

It later generates a datapath flow which matches an extra eth field:
- recirc_id(0),...,eth_type(0x0800),...
+ recirc_id(0),...,eth(dst=1e:2c:e9:2a:66:9e),eth_type(0x0800),...

With this patch, the above flow is no longer needed.

Signed-off-by: William Tu
Co-authored-by: William Tu
Signed-off-by: Yifeng Sun
---
v1->v2: Fixed an error pointed out by Ben.
v2->v3: Fixed a test failure, thanks Ben for review and testing!

 ofproto/ofproto-dpif-xlate.c | 56 +++-
 tests/packet-type-aware.at   |  1 -
 tests/tunnel-push-pop.at     | 55 +++
 3 files changed, 100 insertions(+), 12 deletions(-)

diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index 02a2a4535542..290924634f36 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -3410,6 +3410,19 @@ tnl_route_lookup_flow(const struct xlate_ctx *ctx,
             }
         }
     }
+
+    /* If tunnel IP isn't configured on bridges, then we search all ports. */
+    HMAP_FOR_EACH (xbridge, hmap_node, &ctx->xcfg->xbridges) {
+        struct xport *port;
+
+        HMAP_FOR_EACH (port, ofp_node, &xbridge->xports) {
+            if (!strncmp(netdev_get_name(port->netdev),
+                         out_dev, IFNAMSIZ)) {
+                *out_port = port;
+                return 0;
+            }
+        }
+    }
     return -ENOENT;
 }

@@ -3972,6 +3985,16 @@ is_nd_dst_correct(const struct flow *flow, const struct in6_addr *ipv6_addr)
             IN6_ARE_ADDR_EQUAL(&flow->ipv6_dst, ipv6_addr);
 }

+static bool
+is_neighbor_reply_matched(const struct flow *flow, struct in6_addr *ip_addr)
+{
+    return ((IN6_IS_ADDR_V4MAPPED(ip_addr) &&
+             flow->dl_type == htons(ETH_TYPE_ARP) &&
+             in6_addr_get_mapped_ipv4(ip_addr) == flow->nw_dst) ||
+            (!IN6_IS_ADDR_V4MAPPED(ip_addr) &&
+             is_nd_dst_correct(flow, ip_addr)));
+}
+
 /* Function verifies if the ARP reply or Neighbor Advertisement represented by
  * 'flow' addresses the 'xbridge' of 'ctx'. Returns true if the ARP TA or
  * neighbor discovery destination is in the list of configured IP addresses of
@@ -3986,11 +4009,7 @@ is_neighbor_reply_correct(const struct xlate_ctx *ctx, const struct flow *flow)
     /* Verify if 'nw_dst' of ARP or 'ipv6_dst' of ICMPV6 is in the list. */
     for (i = 0; xbridge_addr && i < xbridge_addr->n_addr; i++) {
         struct in6_addr *ip_addr = &xbridge_addr->addr[i];
-        if ((IN6_IS_ADDR_V4MAPPED(ip_addr) &&
-             flow->dl_type == htons(ETH_TYPE_ARP) &&
-             in6_addr_get_mapped_ipv4(ip_addr) == flow->nw_dst) ||
-            (!IN6_IS_ADDR_V4MAPPED(ip_addr) &&
-             is_nd_dst_correct(flow, ip_addr))) {
+        if (is_neighbor_reply_matched(flow, ip_addr)) {
             /* Found a match. */
             ret = true;
             break;
@@ -3998,20 +4017,35 @@ is_neighbor_reply_correct(const struct xlate_ctx *ctx, const struct flow *flow)
     }

     xbridge_addr_unref(xbridge_addr);
+
+    /* If not found in bridge's IPs, search in its ports. */
+    if (!ret) {
+        struct in6_addr *ip_addr, *mask;
+        struct xport *port;
+        int error, n_in6;
+
+        HMAP_FOR_EACH (port, ofp_node, &ctx->xbridge->xports) {
+            error = netdev_get_addr_list(port->netdev, &ip_addr,
+                                         &mask, &n_in6);
+            if (!error && is_neighbor_reply_matched(flow, ip_addr)) {
+                /* Found a match. */
+                ret = true;
+                break;
+            }
+        }
+    }
     return ret;
 }

 static bool
-terminate_native_tunnel(struct xlate_ctx *ctx, ofp_port_t ofp_port,
-                        struct flow *flow, struct flow_wildcards *wc,
-                        odp_port_t *tnl_port)
+terminate_native_tunnel(struct xlate_ctx *ctx, struct flow *flow,
+                        struct flow_wildcards *wc, odp_port_t *tnl_port)
 {
     *tnl_port = ODPP_NONE;

     /* XXX: Write better Filter for tunnel port. We can use in_port
      * in tunnel-port flow to avoid these checks completely. */
-    if (ofp_port ==
Re: [ovs-dev] [PATCH v3] datapath: compat: Backports bugfixes for nf_conncount
Thanks Yi-Hung for the explanation.

On Wed, Aug 28, 2019 at 4:49 PM Yi-Hung Wei wrote:
>
> On Wed, Aug 28, 2019 at 4:07 PM Ben Pfaff wrote:
> >
> > On Wed, Aug 07, 2019 at 03:25:33PM -0700, Yifeng Sun wrote:
> > > This patch backports several critical bug fixes related to
> > > locking and data consistency in nf_conncount code.
> > >
> > > This backport is based on the following upstream net-next commits.
> > > a007232 ("netfilter: nf_conncount: fix argument order to find_next_bit")
> > > c80f10b ("netfilter: nf_conncount: speculative garbage collection on empty lists")
> > > 2f971a8 ("netfilter: nf_conncount: move all list iterations under spinlock")
> > > df4a902 ("netfilter: nf_conncount: merge lookup and add functions")
> > > e8cfb37 ("netfilter: nf_conncount: restart search when nodes have been erased")
> > > f7fcc98 ("netfilter: nf_conncount: split gc in two phases")
> > > 4cd273b ("netfilter: nf_conncount: don't skip eviction when age is negative")
> > > c78e781 ("netfilter: nf_conncount: replace CONNCOUNT_LOCK_SLOTS with CONNCOUNT_SLOTS")
> > > d4e7df1 ("netfilter: nf_conncount: use rb_link_node_rcu() instead of rb_link_node()")
> > > 53ca0f2 ("netfilter: nf_conncount: remove wrong condition check routine")
> > > 3c5cdb1 ("netfilter: nf_conncount: fix unexpected permanent node of list.")
> > > 31568ec ("netfilter: nf_conncount: fix list_del corruption in conn_free")
> > > fd3e71a ("netfilter: nf_conncount: use spin_lock_bh instead of spin_lock")
> > >
> > > This patch adds additional compat code so that it can build on
> > > all supported kernel versions.
> >
> > I think that our most common approach is to use one OVS commit to
> > backport one Linux kernel commit. This commit combines many Linux
> > kernel commits. Is that an intentional change in this case?
>
> Hi Ben,
>
> Yes, we intend to pull in all of the bug fixes in this case.
> The rationale is as follows.
>
> For the commits in the ovs kernel module, we usually backport one upstream
> net-next commit to one OVS commit. We need these fine-grained
> backports because a single OVS kernel module change can affect OVS
> behavior. For the other type of kernel backports (mainly in
> ./datapath/linux/compat/ ), we try to backport the required missing
> features for the ovs kernel module in the older kernel. The goal is to
> keep the older kernel in sync with the newer kernel on the required
> features, and we may not need much detailed information per upstream
> patch. In this case, it would be easier to pull in multiple patches
> at once.
>
> Some existing examples are,
> c387d8177f20 ("compat: Add ipv6 GRE and IPV6 Tunneling")
> 744964326f6c ("datapath: compat: Backports nf_conncount")
>
> Thanks,
>
> -Yi-Hung
Re: [ovs-dev] [PATCH v3] userspace: Enable non-bridge port as tunnel endpoint.
Thanks Ben and Darrell, let me check it out.

On Wed, Aug 28, 2019 at 12:46 PM Darrell Ball wrote:
>
> Thanks for the patch
>
> How about writing a system test ?
>
> Darrell
Re: [ovs-dev] [PATCH] rhel: Fix ovs-kmod-manage.sh to work with RHEL 7.3
Looks good to me, thanks.

Reviewed-by: Yifeng Sun

On Thu, Aug 29, 2019 at 11:57 AM Greg Rose wrote:
>
> Add case for RHEL 7.3. This also fixes commit 22abff2 where I forgot to
> update the comp_ver variable for RHEL 7.5 and while I was in there I
> updated comp_ver for the RHEL 7.4 case as well.
>
> Fixes: 22abff2 ("rhel: Add case for RHEL 7.5 major version to...")
> Signed-off-by: Greg Rose
> ---
>  rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh | 9 +++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh b/rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh
> index 51756ec..c5b1d2d 100644
> --- a/rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh
> +++ b/rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh
> @@ -80,14 +80,19 @@ if [ "$mainline_major" = "3" ] && [ "$mainline_minor" = "10" ]; then
>          comp_ver=36
>          ver_offset=4
>          installed_ver="$minor_rev"
> +    elif [ "$major_rev" = "514" ]; then
> +        #echo "rhel73"
> +        comp_ver=26
> +        ver_offset=4
> +        installed_ver="$minor_rev"
>      elif [ "$major_rev" = "693" ]; then
>          #echo "rhel74"
> -        comp_ver=11
> +        comp_ver=21
>          ver_offset=4
>          installed_ver="$minor_rev"
>      elif [ "$major_rev" = "862" ]; then
>          #echo "rhel75"
> -        comp_ver=11
> +        comp_ver=20
>          ver_offset=4
>          installed_ver="$minor_rev"
>      elif [ "$major_rev" = "957" ]; then
> --
> 1.8.3.1
[ovs-dev] [PATCH v4] userspace: Enable non-bridge port as tunnel endpoint.
For userspace datapath, currently only the bridge itself, the LOCAL port,
can be the tunnel endpoint to encap/decap tunnel packets. This patch
enables non-bridge port as tunnel endpoint. One use case is for users to
create a bridge and a vtep port as tap, and configure underlay IP at vtep
port as the tunnel endpoint.

This patch causes failure for test "ptap - L3 over patch port". This is
because this test is already using non-bridge port gre1 as tunnel endpoint.
In this test, a flow is added to redirect tunnel packets to gre1 port, as
shown below:
ovs-ofctl add-flow br1 in_port=p1,actions=output=gre1

It later generates a datapath flow which matches an extra eth field:
- recirc_id(0),...,eth_type(0x0800),...
+ recirc_id(0),...,eth(dst=1e:2c:e9:2a:66:9e),eth_type(0x0800),...

With this patch, this flow needs only a NORMAL action.

Signed-off-by: William Tu
Co-authored-by: William Tu
Signed-off-by: Yifeng Sun
---
v1->v2: Fixed an error pointed out by Ben.
v2->v3: Fixed a test failure, thanks Ben for review and testing!
v3->v4: When creating v3, a code rebase led to a test number change and we
        tested the wrong test. Thanks Ben for catching it. In addition to
        this fix, this version also adds a system test, as suggested by
        Darrell.

 ofproto/ofproto-dpif-xlate.c   | 56 +-
 tests/packet-type-aware.at     |  8 +++---
 tests/system-layer3-tunnels.at | 55 +
 3 files changed, 104 insertions(+), 15 deletions(-)

diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index 17800f3c8a3f..9c2f44784161 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -3410,6 +3410,19 @@ tnl_route_lookup_flow(const struct xlate_ctx *ctx,
             }
         }
     }
+
+    /* If tunnel IP isn't configured on bridges, then we search all ports. */
+    HMAP_FOR_EACH (xbridge, hmap_node, &ctx->xcfg->xbridges) {
+        struct xport *port;
+
+        HMAP_FOR_EACH (port, ofp_node, &xbridge->xports) {
+            if (!strncmp(netdev_get_name(port->netdev),
+                         out_dev, IFNAMSIZ)) {
+                *out_port = port;
+                return 0;
+            }
+        }
+    }
     return -ENOENT;
 }

@@ -3972,6 +3985,16 @@ is_nd_dst_correct(const struct flow *flow, const struct in6_addr *ipv6_addr)
             IN6_ARE_ADDR_EQUAL(&flow->ipv6_dst, ipv6_addr);
 }

+static bool
+is_neighbor_reply_matched(const struct flow *flow, struct in6_addr *ip_addr)
+{
+    return ((IN6_IS_ADDR_V4MAPPED(ip_addr) &&
+             flow->dl_type == htons(ETH_TYPE_ARP) &&
+             in6_addr_get_mapped_ipv4(ip_addr) == flow->nw_dst) ||
+            (!IN6_IS_ADDR_V4MAPPED(ip_addr) &&
+             is_nd_dst_correct(flow, ip_addr)));
+}
+
 /* Function verifies if the ARP reply or Neighbor Advertisement represented by
  * 'flow' addresses the 'xbridge' of 'ctx'. Returns true if the ARP TA or
  * neighbor discovery destination is in the list of configured IP addresses of
@@ -3986,11 +4009,7 @@ is_neighbor_reply_correct(const struct xlate_ctx *ctx, const struct flow *flow)
     /* Verify if 'nw_dst' of ARP or 'ipv6_dst' of ICMPV6 is in the list. */
     for (i = 0; xbridge_addr && i < xbridge_addr->n_addr; i++) {
         struct in6_addr *ip_addr = &xbridge_addr->addr[i];
-        if ((IN6_IS_ADDR_V4MAPPED(ip_addr) &&
-             flow->dl_type == htons(ETH_TYPE_ARP) &&
-             in6_addr_get_mapped_ipv4(ip_addr) == flow->nw_dst) ||
-            (!IN6_IS_ADDR_V4MAPPED(ip_addr) &&
-             is_nd_dst_correct(flow, ip_addr))) {
+        if (is_neighbor_reply_matched(flow, ip_addr)) {
             /* Found a match. */
             ret = true;
             break;
@@ -3998,20 +4017,35 @@ is_neighbor_reply_correct(const struct xlate_ctx *ctx, const struct flow *flow)
     }

     xbridge_addr_unref(xbridge_addr);
+
+    /* If not found in bridge's IPs, search in its ports. */
+    if (!ret) {
+        struct in6_addr *ip_addr, *mask;
+        struct xport *port;
+        int error, n_in6;
+
+        HMAP_FOR_EACH (port, ofp_node, &ctx->xbridge->xports) {
+            error = netdev_get_addr_list(port->netdev, &ip_addr,
+                                         &mask, &n_in6);
+            if (!error && is_neighbor_reply_matched(flow, ip_addr)) {
+                /* Found a match. */
+                ret = true;
+                break;
+            }
+        }
+    }
     return ret;
 }

 static bool
-terminate_native_tunnel(struct xlate_ctx *ctx, ofp_port_t ofp_port,
-                        struct flow *flow, struct flow_wildcards *wc,
-                        odp_port_t *tnl_port)
+terminate_native_tunnel(struct xlate_ctx *ctx, struct flow *flow,
+
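
For readers who want to try the address test from is_neighbor_reply_matched() in isolation, here is a minimal standalone sketch. It is not OVS code: 'struct mini_flow', MINI_ETH_TYPE_ARP, mapped_ipv4() and reply_matches() are hypothetical stand-ins, and the IPv6 branch is reduced to a plain equality check, which may be narrower than what is_nd_dst_correct() does.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for OVS 'struct flow'. */
struct mini_flow {
    uint16_t dl_type;          /* Ethernet type, network byte order. */
    uint32_t nw_dst;           /* ARP target IPv4, network byte order. */
    struct in6_addr ipv6_dst;  /* Neighbor discovery destination. */
};

#define MINI_ETH_TYPE_ARP 0x0806

/* Returns the IPv4 address stored in the last four bytes of a v4-mapped
 * IPv6 address, still in network byte order. */
static uint32_t
mapped_ipv4(const struct in6_addr *addr)
{
    uint32_t ip;

    memcpy(&ip, &addr->s6_addr[12], sizeof ip);
    return ip;
}

/* A v4-mapped address has to match the ARP target of an ARP packet; any
 * other address has to equal the IPv6 destination (simplified here). */
static bool
reply_matches(const struct mini_flow *flow, const struct in6_addr *ip_addr)
{
    if (IN6_IS_ADDR_V4MAPPED(ip_addr)) {
        return flow->dl_type == htons(MINI_ETH_TYPE_ARP)
               && mapped_ipv4(ip_addr) == flow->nw_dst;
    }
    return !memcmp(&flow->ipv6_dst, ip_addr, sizeof *ip_addr);
}
```

With this shape, an ARP reply for 10.1.1.100 matches a port address of ::ffff:10.1.1.100 and nothing else, regardless of whether that address was found on the bridge or on one of its ports.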
Re: [ovs-dev] [PATCH] faq: Update list of kernels supported by 2.12.
Looks good to me, thanks.

Reviewed-by: Yifeng Sun

On Fri, Sep 6, 2019 at 4:33 PM Justin Pettit wrote:
>
> Signed-off-by: Justin Pettit
> ---
>  Documentation/faq/releases.rst | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/Documentation/faq/releases.rst b/Documentation/faq/releases.rst
> index 8daa23bb2d0c..8c29e32efa71 100644
> --- a/Documentation/faq/releases.rst
> +++ b/Documentation/faq/releases.rst
> @@ -69,6 +69,7 @@ Q: What Linux kernel versions does each Open vSwitch release work with?
>      2.9.x        3.10 to 4.13
>      2.10.x       3.10 to 4.17
>      2.11.x       3.10 to 4.18
> +    2.12.x       3.10 to 5.0
>      ============ ==============
>
>      Open vSwitch userspace should also work with the Linux kernel module built
> --
> 2.17.1
[ovs-dev] [PATCH 01/10] raft: Free leaked json data
Valgrind reported:

1924: compacting online - cluster

==29312== 2,886 (240 direct, 2,646 indirect) bytes in 6 blocks are definitely lost in loss record 406 of 413
==29312==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==29312==    by 0x44A5F4: xmalloc (util.c:138)
==29312==    by 0x4308EA: json_create (json.c:1451)
==29312==    by 0x4308EA: json_object_create (json.c:254)
==29312==    by 0x430ED0: json_parser_push_object (json.c:1273)
==29312==    by 0x430ED0: json_parser_input (json.c:1371)
==29312==    by 0x431CF1: json_lex_input (json.c:991)
==29312==    by 0x43233B: json_parser_feed (json.c:1149)
==29312==    by 0x41D87F: parse_body.isra.0 (log.c:411)
==29312==    by 0x41E141: ovsdb_log_read (log.c:476)
==29312==    by 0x42646D: raft_read_log (raft.c:866)
==29312==    by 0x42646D: raft_open (raft.c:951)
==29312==    by 0x4151AF: ovsdb_storage_open__ (storage.c:81)
==29312==    by 0x408FFC: open_db (ovsdb-server.c:642)
==29312==    by 0x40657F: main (ovsdb-server.c:358)

This patch fixes it.

Signed-off-by: Yifeng Sun
---
 ovsdb/raft.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/ovsdb/raft.c b/ovsdb/raft.c
index 9eabe2cfeecd..a45c7f8ba998 100644
--- a/ovsdb/raft.c
+++ b/ovsdb/raft.c
@@ -883,6 +883,7 @@ raft_read_log(struct raft *raft)
             error = raft_apply_record(raft, i, &r);
             raft_record_uninit(&r);
         }
+        json_destroy(json);
         if (error) {
             return ovsdb_wrap_error(error, "error reading record %llu from "
                                     "%s log", i, raft->name);
--
2.7.4
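
The shape of this fix, destroying the parsed object once per loop iteration and before the error check, can be sketched outside of OVS roughly as follows. All names here (struct record, record_parse(), record_apply(), read_log(), live_records) are hypothetical; the counter exists only so the sketch can be checked for leaks.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

static int live_records;   /* Parsed records not yet destroyed. */

struct record {
    const char *text;
};

/* Hypothetical parser: returns a record that the caller must destroy. */
static struct record *
record_parse(const char *text)
{
    struct record *r = malloc(sizeof *r);

    if (!r) {
        abort();
    }
    r->text = text;
    live_records++;
    return r;
}

static void
record_destroy(struct record *r)
{
    if (r) {
        live_records--;
        free(r);
    }
}

/* Hypothetical apply step: rejects empty records. */
static bool
record_apply(const struct record *r)
{
    return r->text[0] != '\0';
}

/* Applies 'n' entries in order and stops at the first failure. */
static bool
read_log(const char *const entries[], size_t n)
{
    for (size_t i = 0; i < n; i++) {
        struct record *r = record_parse(entries[i]);
        bool ok = record_apply(r);

        /* Destroy before testing 'ok', so the error return cannot leak. */
        record_destroy(r);
        if (!ok) {
            return false;
        }
    }
    return true;
}
```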
[ovs-dev] [PATCH 02/10] ofproto-dpif: Uninitialize 'xlate_cache' to free resources
Valgrind reported:

1210: ofproto-dpif - continuation after clone

==32205== 4,392 (1,440 direct, 2,952 indirect) bytes in 12 blocks are definitely lost in loss record 359 of 362
==32205==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==32205==    by 0x532574: xmalloc (util.c:138)
==32205==    by 0x4F98CA: ofpbuf_init (ofpbuf.c:123)
==32205==    by 0x42C07B: nxt_resume (ofproto-dpif.c:5110)
==32205==    by 0x41796F: handle_nxt_resume (ofproto.c:3677)
==32205==    by 0x424583: handle_single_part_openflow (ofproto.c:8473)
==32205==    by 0x424583: handle_openflow (ofproto.c:8606)
==32205==    by 0x4579E2: ofconn_run (connmgr.c:1318)
==32205==    by 0x4579E2: connmgr_run (connmgr.c:355)
==32205==    by 0x41E0F5: ofproto_run (ofproto.c:1845)
==32205==    by 0x40BA63: bridge_run__ (bridge.c:2971)
==32205==    by 0x410CF3: bridge_run (bridge.c:3029)
==32205==    by 0x407614: main (ovs-vswitchd.c:127)

This is because 'xcache' was not destroyed properly. This patch fixes it.

Signed-off-by: Yifeng Sun
---
 ofproto/ofproto-dpif.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index 751535249e21..46fa1357163b 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -5148,6 +5148,7 @@ nxt_resume(struct ofproto *ofproto_,
     /* Clean up. */
     ofpbuf_uninit(&odp_actions);
     dp_packet_uninit(&packet);
+    xlate_cache_uninit(&xcache);

     return error;
 }
--
2.7.4
[ovs-dev] [PATCH 03/10] dpif-netdev: Handle uninitialized value error for 'match.wc'
Valgrind reported that match.wc was not initialized, as below:

1176: ofproto-dpif - fragment handling - actions

==21214== Conditional jump or move depends on uninitialised value(s)
==21214==    at 0x4B77C1: odp_flow_key_from_flow__ (odp-util.c:6143)
==21214==    by 0x46DB58: dp_netdev_upcall (dpif-netdev.c:6239)
==21214==    by 0x4774A7: handle_packet_upcall (dpif-netdev.c:6608)
==21214==    by 0x4774A7: fast_path_processing (dpif-netdev.c:6726)
==21214==    by 0x47933C: dp_netdev_input__ (dpif-netdev.c:6814)
==21214==    by 0x479AB8: dp_netdev_input (dpif-netdev.c:6852)
==21214==    by 0x479AB8: dp_netdev_process_rxq_port (dpif-netdev.c:4287)
==21214==    by 0x47A6A9: dpif_netdev_run (dpif-netdev.c:5264)
==21214==    by 0x4324E7: type_run (ofproto-dpif.c:342)
==21214==    by 0x41C5FE: ofproto_type_run (ofproto.c:1734)
==21214==    by 0x40BAAC: bridge_run__ (bridge.c:2965)
==21214==    by 0x410CF3: bridge_run (bridge.c:3029)
==21214==    by 0x407614: main (ovs-vswitchd.c:127)
==21214==  Uninitialised value was created by a stack allocation
==21214==    at 0x4769C3: fast_path_processing (dpif-netdev.c:6672)

'match' is allocated on stack but its 'wc' is accessed in
odp_flow_key_from_flow__ without proper initialization.
This patch fixes it.

Signed-off-by: Yifeng Sun
---
 lib/dpif-netdev.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index a88a78f8a688..6be6e47ed127 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -6600,6 +6600,7 @@ handle_packet_upcall(struct dp_netdev_pmd_thread *pmd,

     match.tun_md.valid = false;
     miniflow_expand(&key->mf, &match.flow);
+    memset(&match.wc, 0, sizeof match.wc);

     ofpbuf_clear(actions);
     ofpbuf_clear(put_actions);
--
2.7.4
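
As a reduced illustration of the problem, consider a stack structure whose 'flow' half is filled in but whose 'wc' half is only read later by a consumer. The types and functions below (struct mini_match, count_masked_fields(), build_and_count()) are hypothetical and much smaller than the OVS ones; the sketch only shows why the memset() is needed before the consumer runs.

```c
#include <stdint.h>
#include <string.h>

struct mini_fields {
    uint32_t in_port;
    uint16_t dl_type;
};

struct mini_match {
    struct mini_fields flow;   /* Field values. */
    struct mini_fields wc;     /* Wildcard mask over the same fields. */
};

/* Hypothetical consumer that branches on the mask, in the way
 * odp_flow_key_from_flow__() is reported to in the trace above. */
static int
count_masked_fields(const struct mini_match *m)
{
    return (m->wc.in_port != 0) + (m->wc.dl_type != 0);
}

static int
build_and_count(uint32_t in_port, uint16_t dl_type)
{
    struct mini_match match;   /* Stack storage starts indeterminate. */

    match.flow.in_port = in_port;
    match.flow.dl_type = dl_type;

    /* Without this, the reads of 'wc' below would use indeterminate
     * values, which is what Memcheck flags as a conditional jump on
     * uninitialised data. */
    memset(&match.wc, 0, sizeof match.wc);

    return count_masked_fields(&match);
}
```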
[ovs-dev] [PATCH 05/10] trigger: Free leaked ovsdb_schema
Valgrind reported:

1925: schema conversion online - standalone

==10884== 689 (56 direct, 633 indirect) bytes in 1 blocks are definitely lost in loss record 384 of 420
==10884==    at 0x4C2FB55: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==10884==    by 0x44A592: xcalloc (util.c:121)
==10884==    by 0x40E2EC: ovsdb_schema_create (ovsdb.c:41)
==10884==    by 0x40E688: ovsdb_schema_from_json (ovsdb.c:217)
==10884==    by 0x416C6F: ovsdb_trigger_try (trigger.c:246)
==10884==    by 0x40D4DE: ovsdb_jsonrpc_trigger_create (jsonrpc-server.c:1119)
==10884==    by 0x40D4DE: ovsdb_jsonrpc_session_got_request (jsonrpc-server.c:986)
==10884==    by 0x40D4DE: ovsdb_jsonrpc_session_run (jsonrpc-server.c:556)
==10884==    by 0x40D4DE: ovsdb_jsonrpc_session_run_all (jsonrpc-server.c:586)
==10884==    by 0x40D4DE: ovsdb_jsonrpc_server_run (jsonrpc-server.c:401)
==10884==    by 0x406A6E: main_loop (ovsdb-server.c:209)
==10884==    by 0x406A6E: main (ovsdb-server.c:460)

'new_schema' should also be freed when there is no error.
This patch fixes it.

Signed-off-by: Yifeng Sun
---
 ovsdb/trigger.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ovsdb/trigger.c b/ovsdb/trigger.c
index 6f4ed96b000b..7e62e90ae381 100644
--- a/ovsdb/trigger.c
+++ b/ovsdb/trigger.c
@@ -254,8 +254,8 @@ ovsdb_trigger_try(struct ovsdb_trigger *t, long long int now)
             if (!error) {
                 error = ovsdb_convert(t->db, new_schema, &newdb);
             }
+            ovsdb_schema_destroy(new_schema);
             if (error) {
-                ovsdb_schema_destroy(new_schema);
                 trigger_convert_error(t, error);
                 return false;
             }
--
2.7.4
[ovs-dev] [PATCH 04/10] ovs-ofctl: Free leaked minimatch
Valgrind reported:

1056: ofproto - bundle with multiple flow mods (OpenFlow 1.4)

==19220== 160 bytes in 2 blocks are definitely lost in loss record 24 of 34
==19220==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==19220==    by 0x4979A4: xmalloc (util.c:138)
==19220==    by 0x42407D: miniflow_alloc (flow.c:3340)
==19220==    by 0x4296CF: minimatch_init (match.c:1758)
==19220==    by 0x46273D: parse_ofp_str__ (ofp-flow.c:1759)
==19220==    by 0x465B9E: parse_ofp_str (ofp-flow.c:1790)
==19220==    by 0x465CE0: parse_ofp_flow_mod_str (ofp-flow.c:1817)
==19220==    by 0x465DF6: parse_ofp_flow_mod_file (ofp-flow.c:1876)
==19220==    by 0x410BA3: ofctl_flow_mod_file.isra.19 (ovs-ofctl.c:1773)
==19220==    by 0x417933: ovs_cmdl_run_command__ (command-line.c:223)
==19220==    by 0x406F68: main (ovs-ofctl.c:179)

This patch fixes it.

Signed-off-by: Yifeng Sun
---
 utilities/ovs-ofctl.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/utilities/ovs-ofctl.c b/utilities/ovs-ofctl.c
index 754629d3dfbb..06289d296573 100644
--- a/utilities/ovs-ofctl.c
+++ b/utilities/ovs-ofctl.c
@@ -1724,6 +1724,7 @@ bundle_flow_mod__(const char *remote, struct ofputil_flow_mod *fms,

         ovs_list_push_back(&requests, &request->list_node);
         free(CONST_CAST(struct ofpact *, fm->ofpacts));
+        minimatch_destroy(&fm->match);
     }

     bundle_transact(vconn, &requests, OFPBF_ORDERED | OFPBF_ATOMIC);
--
2.7.4
[ovs-dev] [PATCH 07/10] dns-resolve: Free 'struct ub_result' when callback returns error results
Valgrind reported:

1074: ofproto - flush flows, groups, and meters for controller change

==5499== 695 (288 direct, 407 indirect) bytes in 3 blocks are definitely lost in loss record 344 of 355
==5499==    at 0x4C2FB55: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==5499==    by 0x5E7F145: ??? (in /usr/lib/x86_64-linux-gnu/libunbound.so.2.4.0)
==5499==    by 0x5E6EBDE: ub_resolve_async (in /usr/lib/x86_64-linux-gnu/libunbound.so.2.4.0)
==5499==    by 0x55C739: resolve_async__.part.5 (dns-resolve.c:233)
==5499==    by 0x55C85C: resolve_async__ (dns-resolve.c:261)
==5499==    by 0x55C85C: resolve_callback__ (dns-resolve.c:262)
==5499==    by 0x5E6FEF1: ub_process (in /usr/lib/x86_64-linux-gnu/libunbound.so.2.4.0)
==5499==    by 0x55CAF3: dns_resolve (dns-resolve.c:153)
==5499==    by 0x523864: parse_sockaddr_components_dns (socket-util.c:438)
==5499==    by 0x523864: parse_sockaddr_components (socket-util.c:504)
==5499==    by 0x524468: inet_parse_active (socket-util.c:541)
==5499==    by 0x524564: inet_open_active (socket-util.c:579)
==5499==    by 0x5959F9: tcp_open (stream-tcp.c:56)
==5499==    by 0x529192: stream_open (stream.c:228)
==5499==    by 0x529910: stream_open_with_default_port (stream.c:724)
==5499==    by 0x595FAE: vconn_stream_open (vconn-stream.c:81)
==5499==    by 0x535C9B: vconn_open (vconn.c:250)
==5499==    by 0x517C59: reconnect (rconn.c:467)
==5499==    by 0x5184C7: run_BACKOFF (rconn.c:492)
==5499==    by 0x5184C7: rconn_run (rconn.c:660)
==5499==    by 0x457FE8: ofservice_run (connmgr.c:1992)
==5499==    by 0x457FE8: connmgr_run (connmgr.c:367)
==5499==    by 0x41E0F5: ofproto_run (ofproto.c:1845)
==5499==    by 0x40BA63: bridge_run__ (bridge.c:2971)

In ub_resolve_async's callback function, 'struct ub_result' should be
finally freed even if there is a resolving error. This patch fixes it.

Signed-off-by: Yifeng Sun
---
 lib/dns-resolve.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/dns-resolve.c b/lib/dns-resolve.c
index e98e65f493ed..1ff58960fe01 100644
--- a/lib/dns-resolve.c
+++ b/lib/dns-resolve.c
@@ -251,6 +251,7 @@ resolve_callback__(void *req_, int err, struct ub_result *result)
     struct resolve_request *req = req_;

     if (err != 0 || (result->qtype == ns_t_ && !result->havedata)) {
+        ub_resolve_free(result);
         req->state = RESOLVE_ERROR;
         VLOG_ERR_RL(&rl, "%s: failed to resolve", req->name);
         return;
@@ -265,6 +266,7 @@ resolve_callback__(void *req_, int err, struct ub_result *result)
     char *addr;

     if (!resolve_result_to_addr__(result, &addr)) {
+        ub_resolve_free(result);
         req->state = RESOLVE_ERROR;
         VLOG_ERR_RL(&rl, "%s: failed to resolve", req->name);
         return;
--
2.7.4
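
The rule this patch applies is that a completion callback which is handed a result object owns it on every exit path, including the error ones. A minimal sketch of that rule, using hypothetical types rather than libunbound's (struct lookup_result, result_new(), result_free(), on_resolved() are all made up for illustration, and result_free() is written to tolerate NULL so that the error case with no result is safe in this sketch):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

static int live_results;   /* Results handed out and not yet freed. */

/* Hypothetical result object; the callback owns it once called. */
struct lookup_result {
    bool havedata;
};

enum req_state { REQ_PENDING, REQ_GOOD, REQ_ERROR };

struct request {
    enum req_state state;
};

static struct lookup_result *
result_new(bool havedata)
{
    struct lookup_result *r = malloc(sizeof *r);

    if (!r) {
        abort();
    }
    r->havedata = havedata;
    live_results++;
    return r;
}

/* Safe to call with NULL. */
static void
result_free(struct lookup_result *r)
{
    if (r) {
        live_results--;
        free(r);
    }
}

/* Completion callback: every return path releases 'result'. */
static void
on_resolved(struct request *req, int err, struct lookup_result *result)
{
    if (err != 0 || !result->havedata) {
        result_free(result);
        req->state = REQ_ERROR;
        return;
    }

    req->state = REQ_GOOD;
    result_free(result);
}
```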
[ovs-dev] [PATCH 06/10] ovsdb-client: Free ovsdb_schema
Valgrind reported:

1925: schema conversion online - standalone

==10727== 689 (56 direct, 633 indirect) bytes in 1 blocks are definitely lost in loss record 64 of 66
==10727==    at 0x4C2FB55: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==10727==    by 0x449D42: xcalloc (util.c:121)
==10727==    by 0x40F45C: ovsdb_schema_create (ovsdb.c:41)
==10727==    by 0x40F7F8: ovsdb_schema_from_json (ovsdb.c:217)
==10727==    by 0x40FB4E: ovsdb_schema_from_file (ovsdb.c:101)
==10727==    by 0x40B156: do_convert (ovsdb-client.c:1639)
==10727==    by 0x4061C6: main (ovsdb-client.c:282)

This patch fixes it.

Signed-off-by: Yifeng Sun
---
 ovsdb/ovsdb-client.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/ovsdb/ovsdb-client.c b/ovsdb/ovsdb-client.c
index 9ae15e557661..bfc90e6f7f85 100644
--- a/ovsdb/ovsdb-client.c
+++ b/ovsdb/ovsdb-client.c
@@ -1654,6 +1654,7 @@ do_convert(struct jsonrpc *rpc, const char *database_ OVS_UNUSED,
                                ovsdb_schema_to_json(new_schema)), NULL);
     check_txn(jsonrpc_transact_block(rpc, request, &reply), &reply);
     jsonrpc_msg_destroy(reply);
+    ovsdb_schema_destroy(new_schema);
 }

 static void
--
2.7.4
[ovs-dev] [PATCH 08/10] ofproto-dpif: Free leaked 'webster'
Valgrind reported:

1122: ofproto-dpif - select group with explicit dp_hash selection method

==16884== 64 bytes in 1 blocks are definitely lost in loss record 320 of 346
==16884==    at 0x4C2FB55: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==16884==    by 0x532512: xcalloc (util.c:121)
==16884==    by 0x4262B9: group_setup_dp_hash_table (ofproto-dpif.c:4846)
==16884==    by 0x4267CB: group_set_selection_method (ofproto-dpif.c:4938)
==16884==    by 0x4267CB: group_construct (ofproto-dpif.c:4984)
==16884==    by 0x417250: init_group (ofproto.c:7286)
==16884==    by 0x41B4FC: add_group_start (ofproto.c:7316)
==16884==    by 0x42247A: ofproto_group_mod_start (ofproto.c:7589)
==16884==    by 0x4250EC: handle_group_mod (ofproto.c:7744)
==16884==    by 0x4250EC: handle_single_part_openflow (ofproto.c:8428)
==16884==    by 0x4250EC: handle_openflow (ofproto.c:8606)
==16884==    by 0x4579E2: ofconn_run (connmgr.c:1318)
==16884==    by 0x4579E2: connmgr_run (connmgr.c:355)
==16884==    by 0x41E0F5: ofproto_run (ofproto.c:1845)
==16884==    by 0x40BA63: bridge_run__ (bridge.c:2971)
==16884==    by 0x410CF3: bridge_run (bridge.c:3029)
==16884==    by 0x407614: main (ovs-vswitchd.c:127)

This patch fixes it.

Signed-off-by: Yifeng Sun
---
 ofproto/ofproto-dpif.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index 46fa1357163b..7bb0f7bdb4f3 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -4871,6 +4871,7 @@ group_setup_dp_hash_table(struct group_dpif *group, size_t max_hash)
     if (n_hash > MAX_SELECT_GROUP_HASH_VALUES ||
         (max_hash != 0 && n_hash > max_hash)) {
         VLOG_DBG(" Too many hash values required: %"PRIu64, n_hash);
+        free(webster);
         return false;
     }

--
2.7.4
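
The leak here comes from a scratch allocation that is still live when a limit check bails out early. A small sketch of the same situation, with made-up names and a made-up cost rule (setup_table(), 'scratch', 'max_cost' and live_buffers are hypothetical and unrelated to how group_setup_dp_hash_table() actually computes n_hash):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

static int live_buffers;   /* Scratch buffers currently allocated. */

/* Fills a scratch table with 'n' entries and rejects the setup if their
 * sum, a hypothetical cost, exceeds 'max_cost'. */
static bool
setup_table(size_t n, unsigned int max_cost)
{
    unsigned int *scratch = calloc(n ? n : 1, sizeof *scratch);
    unsigned int cost = 0;

    if (!scratch) {
        abort();
    }
    live_buffers++;

    for (size_t i = 0; i < n; i++) {
        scratch[i] = (unsigned int) i + 1;
        cost += scratch[i];
    }

    if (cost > max_cost) {
        /* Early exit: the scratch buffer still has to be released. */
        free(scratch);
        live_buffers--;
        return false;
    }

    free(scratch);
    live_buffers--;
    return true;
}
```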
[ovs-dev] [PATCH 09/10] db-ctl-base: Free leaked ovsdb_datum
Valgrind reported:

2491: database commands -- negative checks

==19245== 36 (32 direct, 4 indirect) bytes in 1 blocks are definitely lost in loss record 36 of 53
==19245==    at 0x4C2FD5F: realloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==19245==    by 0x431AB4: xrealloc (util.c:149)
==19245==    by 0x41656D: ovsdb_datum_reallocate (ovsdb-data.c:1883)
==19245==    by 0x41656D: ovsdb_datum_union (ovsdb-data.c:1961)
==19245==    by 0x4107B2: cmd_add (db-ctl-base.c:1494)
==19245==    by 0x406E2E: do_vsctl (ovs-vsctl.c:2626)
==19245==    by 0x406E2E: main (ovs-vsctl.c:183)

==19252== 16 bytes in 1 blocks are definitely lost in loss record 9 of 52
==19252==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==19252==    by 0x430F74: xmalloc (util.c:138)
==19252==    by 0x414D07: clone_atoms (ovsdb-data.c:990)
==19252==    by 0x4153F6: ovsdb_datum_clone (ovsdb-data.c:1012)
==19252==    by 0x4104D3: cmd_remove (db-ctl-base.c:1564)
==19252==    by 0x406E2E: do_vsctl (ovs-vsctl.c:2626)
==19252==    by 0x406E2E: main (ovs-vsctl.c:183)

This patch fixes them.

Signed-off-by: Yifeng Sun
---
 lib/db-ctl-base.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/lib/db-ctl-base.c b/lib/db-ctl-base.c
index 3bd9f006acb1..6878d6326cae 100644
--- a/lib/db-ctl-base.c
+++ b/lib/db-ctl-base.c
@@ -1489,6 +1489,7 @@ cmd_add(struct ctl_context *ctx)
         ctx->error = ovsdb_datum_from_string(&add, &add_type, ctx->argv[i],
                                              ctx->symtab);
         if (ctx->error) {
+            ovsdb_datum_destroy(&old, &column->type);
             return;
         }
         ovsdb_datum_union(&old, &add, type, false);
@@ -1500,6 +1501,7 @@ cmd_add(struct ctl_context *ctx)
                   old.n,
                   type->value.type == OVSDB_TYPE_VOID ? "values" : "pairs",
                   column->name, table->name, type->n_max);
+        ovsdb_datum_destroy(&old, &column->type);
         return;
     }
     ovsdb_idl_txn_verify(row, column);
@@ -1581,10 +1583,12 @@ cmd_remove(struct ctl_context *ctx)
                                                   ctx->argv[i], ctx->symtab);
                 if (ctx->error) {
+                    ovsdb_datum_destroy(&old, &column->type);
                     return;
                 }
             } else {
                 ctx->error = error;
+                ovsdb_datum_destroy(&old, &column->type);
                 return;
             }
         }
@@ -1596,6 +1600,7 @@ cmd_remove(struct ctl_context *ctx)
                   "table %s but the minimum number is %u", old.n,
                   type->value.type == OVSDB_TYPE_VOID ? "values" : "pairs",
                   column->name, table->name, type->n_min);
+        ovsdb_datum_destroy(&old, &column->type);
         return;
     }
     ovsdb_idl_txn_verify(row, column);
-- 
2.7.4
[ovs-dev] [PATCH 10/10] conntrack: Validate accessing of conntrack data in pkt_metadata
Valgrind reported:

1305: ofproto-dpif - conntrack - ipv6

==26942== Conditional jump or move depends on uninitialised value(s)
==26942==    at 0x587C00: check_orig_tuple (conntrack.c:1006)
==26942==    by 0x587C00: process_one (conntrack.c:1141)
==26942==    by 0x587C00: conntrack_execute (conntrack.c:1220)
==26942==    by 0x47B00F: dp_execute_cb (dpif-netdev.c:7305)
==26942==    by 0x4AF756: odp_execute_actions (odp-execute.c:794)
==26942==    by 0x477532: dp_netdev_execute_actions (dpif-netdev.c:7349)
==26942==    by 0x477532: handle_packet_upcall (dpif-netdev.c:6630)
==26942==    by 0x477532: fast_path_processing (dpif-netdev.c:6726)
==26942==    by 0x47933C: dp_netdev_input__ (dpif-netdev.c:6814)
==26942==    by 0x479AB8: dp_netdev_input (dpif-netdev.c:6852)
==26942==    by 0x479AB8: dp_netdev_process_rxq_port (dpif-netdev.c:4287)
==26942==    by 0x47A6A9: dpif_netdev_run (dpif-netdev.c:5264)
==26942==    by 0x4324E7: type_run (ofproto-dpif.c:342)
==26942==    by 0x41C5FE: ofproto_type_run (ofproto.c:1734)
==26942==    by 0x40BAAC: bridge_run__ (bridge.c:2965)
==26942==    by 0x410CF3: bridge_run (bridge.c:3029)
==26942==    by 0x407614: main (ovs-vswitchd.c:127)
==26942== Uninitialised value was created by a heap allocation
==26942==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==26942==    by 0x532574: xmalloc (util.c:138)
==26942==    by 0x46CD62: dp_packet_new (dp-packet.c:153)
==26942==    by 0x4A0431: eth_from_flow_str (netdev-dummy.c:1644)
==26942==    by 0x4A0431: netdev_dummy_receive (netdev-dummy.c:1783)
==26942==    by 0x531990: process_command (unixctl.c:308)
==26942==    by 0x531990: run_connection (unixctl.c:342)
==26942==    by 0x531990: unixctl_server_run (unixctl.c:393)
==26942==    by 0x40761E: main (ovs-vswitchd.c:128)

1316: ofproto-dpif - conntrack - tcp port reuse

==24039== Conditional jump or move depends on uninitialised value(s)
==24039==    at 0x587BF5: check_orig_tuple (conntrack.c:1004)
==24039==    by 0x587BF5: process_one (conntrack.c:1141)
==24039==    by 0x587BF5: conntrack_execute (conntrack.c:1220)
==24039==    by 0x47B02F: dp_execute_cb (dpif-netdev.c:7306)
==24039==    by 0x4AF7A6: odp_execute_actions (odp-execute.c:794)
==24039==    by 0x47755B: dp_netdev_execute_actions (dpif-netdev.c:7350)
==24039==    by 0x47755B: handle_packet_upcall (dpif-netdev.c:6631)
==24039==    by 0x47755B: fast_path_processing (dpif-netdev.c:6727)
==24039==    by 0x47935C: dp_netdev_input__ (dpif-netdev.c:6815)
==24039==    by 0x479AD8: dp_netdev_input (dpif-netdev.c:6853)
==24039==    by 0x479AD8: dp_netdev_process_rxq_port (dpif-netdev.c:4287)
==24039==    by 0x47A6C9: dpif_netdev_run (dpif-netdev.c:5264)
==24039==    by 0x4324F7: type_run (ofproto-dpif.c:342)
==24039==    by 0x41C5FE: ofproto_type_run (ofproto.c:1734)
==24039==    by 0x40BAAC: bridge_run__ (bridge.c:2965)
==24039==    by 0x410CF3: bridge_run (bridge.c:3029)
==24039==    by 0x407614: main (ovs-vswitchd.c:127)
==24039== Uninitialised value was created by a heap allocation
==24039==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==24039==    by 0x5325C4: xmalloc (util.c:138)
==24039==    by 0x46D144: dp_packet_new (dp-packet.c:153)
==24039==    by 0x46D144: dp_packet_new_with_headroom (dp-packet.c:163)
==24039==    by 0x51191E: eth_from_hex (packets.c:498)
==24039==    by 0x4A03B9: eth_from_packet (netdev-dummy.c:1609)
==24039==    by 0x4A03B9: netdev_dummy_receive (netdev-dummy.c:1765)
==24039==    by 0x5319E0: process_command (unixctl.c:308)
==24039==    by 0x5319E0: run_connection (unixctl.c:342)
==24039==    by 0x5319E0: unixctl_server_run (unixctl.c:393)
==24039==    by 0x40761E: main (ovs-vswitchd.c:128)

According to the comments in pkt_metadata_init(), conntrack data is valid
only if pkt_metadata.ct_state != 0. This patch prevents check_orig_tuple()
from being called when the conntrack data is uninitialized.

Signed-off-by: Yifeng Sun
---
 lib/conntrack.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/conntrack.c b/lib/conntrack.c
index e5266e579452..86c16b2fbe77 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -1138,7 +1138,8 @@ process_one(struct conntrack *ct, struct dp_packet *pkt,
             handle_nat(pkt, conn, zone, ctx->reply, ctx->icmp_related);
         }
 
-    } else if (check_orig_tuple(ct, pkt, ctx, now, &conn, nat_action_info)) {
+    } else if (pkt->md.ct_state
+               && check_orig_tuple(ct, pkt, ctx, now, &conn, nat_action_info)) {
         create_new_conn = conn_update_state(ct, pkt, ctx, conn, now);
     } else {
         if (ctx->icmp_related) {
-- 
2.7.4
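The guard added above relies on short-circuit evaluation: a validity flag is tested first, so the fields it protects are never read while they are still uninitialized. A minimal sketch of the idea, assuming a simplified, hypothetical struct pkt_md with a single protected field (the real pkt_metadata layout is different):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified metadata: 'orig_src' is meaningful only when
 * 'ct_state' is nonzero. */
struct pkt_md {
    uint8_t ct_state;
    uint32_t orig_src;
};

/* Mirrors the convention described in the patch: initialization clears only
 * the validity flag and leaves the protected field untouched. */
static void md_init(struct pkt_md *md)
{
    assert(md != NULL);
    md->ct_state = 0;
}

/* Reads the protected field; must only be called when ct_state != 0. */
static bool orig_tuple_matches(const struct pkt_md *md, uint32_t want)
{
    return md->orig_src == want;
}

/* The && short-circuits, so orig_tuple_matches() never runs on metadata
 * whose protected field has not been written. */
static bool lookup_by_orig_tuple(const struct pkt_md *md, uint32_t want)
{
    return md->ct_state && orig_tuple_matches(md, want);
}
```

Placing the flag test on the left of && is what matters here; swapping the operands would reintroduce the uninitialized read.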
Re: [ovs-dev] [PATCH 05/10] trigger: Free leaked ovsdb_schema
Thanks Aginwala, Could you please double check if 'json_destroy(result)' is necessary here? If result != NULL, then it is passed in trigger_success(), which puts result in 't->reply', later, jsonrpc_msg_destroy() will free it. Thanks, Yifeng On Thu, Sep 12, 2019 at 2:00 PM aginwala wrote: > > One minor suggestion here: > Can you also handle freeing result: > diff --git a/ovsdb/trigger.c b/ovsdb/trigger.c > index 6f4ed96b0..0158957d6 100644 > --- a/ovsdb/trigger.c > +++ b/ovsdb/trigger.c > @@ -214,6 +214,7 @@ ovsdb_trigger_try(struct ovsdb_trigger *t, long long int > now) > /* Unsatisfied "wait" condition. Take no action now, > retry > * later. */ > } > +json_destroy(result); > return false; > } > > Else I can handle that in separate patch. Else, acked by for the series. > Acked-by: Aliasgar Ginwala > > On Wed, Sep 11, 2019 at 2:19 PM Yifeng Sun wrote: >> >> Valgrind reported: >> >> 1925: schema conversion online - standalone >> >> ==10884== 689 (56 direct, 633 indirect) bytes in 1 blocks are definitely >> lost in loss record 384 of 420 >> ==10884==at 0x4C2FB55: calloc (in >> /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) >> ==10884==by 0x44A592: xcalloc (util.c:121) >> ==10884==by 0x40E2EC: ovsdb_schema_create (ovsdb.c:41) >> ==10884==by 0x40E688: ovsdb_schema_from_json (ovsdb.c:217) >> ==10884==by 0x416C6F: ovsdb_trigger_try (trigger.c:246) >> ==10884==by 0x40D4DE: ovsdb_jsonrpc_trigger_create >> (jsonrpc-server.c:1119) >> ==10884==by 0x40D4DE: ovsdb_jsonrpc_session_got_request >> (jsonrpc-server.c:986) >> ==10884==by 0x40D4DE: ovsdb_jsonrpc_session_run (jsonrpc-server.c:556) >> ==10884==by 0x40D4DE: ovsdb_jsonrpc_session_run_all >> (jsonrpc-server.c:586) >> ==10884==by 0x40D4DE: ovsdb_jsonrpc_server_run (jsonrpc-server.c:401) >> ==10884==by 0x406A6E: main_loop (ovsdb-server.c:209) >> ==10884==by 0x406A6E: main (ovsdb-server.c:460) >> >> 'new_schema' should also be freed when there is no error. >> This patch fixes it. 
>> >> Signed-off-by: Yifeng Sun >> --- >> ovsdb/trigger.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/ovsdb/trigger.c b/ovsdb/trigger.c >> index 6f4ed96b000b..7e62e90ae381 100644 >> --- a/ovsdb/trigger.c >> +++ b/ovsdb/trigger.c >> @@ -254,8 +254,8 @@ ovsdb_trigger_try(struct ovsdb_trigger *t, long long int >> now) >> if (!error) { >> error = ovsdb_convert(t->db, new_schema, &newdb); >> } >> +ovsdb_schema_destroy(new_schema); >> if (error) { >> -ovsdb_schema_destroy(new_schema); >> trigger_convert_error(t, error); >> return false; >> } >> -- >> 2.7.4 >> >> ___ >> dev mailing list >> d...@openvswitch.org >> https://mail.openvswitch.org/mailman/listinfo/ovs-dev ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] netdev-offload-dpdk : add ipv6 rte flow item support
Hi Timo, Please try to format your patch in raw text. Thanks, Yifeng On Wed, Sep 11, 2019 at 7:55 PM Timo_Liu wrote: > > > > Nowadays some Nics support hw offloading via dpdk rte_flow lib. Many > layer2-layer4 fields can be offloaded to nics, including smac/dmac ipv4 > sip/dip etc. Also some nics(including intel X710) supports ipv6 header > offloading, but when we execute netdev_offload_dpdk_add_flow, there is no > IPV6 rte_flow pattern. > > > > > > This patch adds support for IPV6 header hw-offload rte_flow pattern, > including ipv6_label, nw_ttl, nw_proto, ipv6 sip and dip, also the > corresponding mask filed is added. > > > > > Signed-off-by: Liu Chang > > > > > > > diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c > > index 01e9004..ab3f82b 100644 > > --- a/lib/netdev-offload-dpdk.c > > +++ b/lib/netdev-offload-dpdk.c > > @@ -433,7 +433,10 @@ netdev_offload_dpdk_add_flow(struct netdev *netdev, > > struct flow_items { > > struct rte_flow_item_eth eth; > > struct rte_flow_item_vlan vlan; > > -struct rte_flow_item_ipv4 ipv4; > > +union { > > +struct rte_flow_item_ipv4 ipv4; > > +struct rte_flow_item_ipv6 ipv6; > > +}; > > union { > > struct rte_flow_item_tcp tcp; > > struct rte_flow_item_udp udp; > > @@ -503,6 +506,31 @@ netdev_offload_dpdk_add_flow(struct netdev *netdev, > > mask.ipv4.hdr.next_proto_id; > > } > > > > > +/* IP v6 */ > > +if (match->flow.dl_type == htons(ETH_TYPE_IPV6)) { > > +int i =0; > > + > > +spec.ipv6.hdr.vtc_flow = match->flow.ipv6_label; > > +spec.ipv6.hdr.proto = match->flow.nw_proto; > > +spec.ipv6.hdr.hop_limits = match->flow.nw_ttl; > > + > > +mask.ipv6.hdr.vtc_flow = match->wc.masks.ipv6_label; > > +mask.ipv6.hdr.proto = match->wc.masks.nw_proto; > > +mask.ipv6.hdr.hop_limits = match->wc.masks.nw_ttl; > > + > > > > +for (i = 0; i < 16; i++) { > > +spec.ipv6.hdr.src_addr[i] = match->flow.ipv6_src.s6_addr[i]; > > +spec.ipv6.hdr.dst_addr[i] = match->flow.ipv6_dst.s6_addr[i]; > > + > > +mask.ipv6.hdr.src_addr[i] = 
match->wc.masks.ipv6_src.s6_addr[i]; > > +mask.ipv6.hdr.dst_addr[i] = match->wc.masks.ipv6_dst.s6_addr[i]; > > +} > > + > > +add_flow_pattern(&patterns, RTE_FLOW_ITEM_TYPE_IPV6, > > + &spec.ipv6, &mask.ipv6); > > + > > +} > > + > > if (proto != IPPROTO_ICMP && proto != IPPROTO_UDP && > > proto != IPPROTO_SCTP && proto != IPPROTO_TCP && > > (match->wc.masks.tp_src || > > > > > > > > > Best Regards > > Timo_liu > ___ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH 1/2] sset: New function sset_join().
LGTM, thanks. Reviewed-by: Yifeng Sun On Wed, Sep 18, 2019 at 9:31 AM Ben Pfaff wrote: > > This will acquire its first user in an upcoming commit. > > This function follows the pattern set by svec_join(). > > Signed-off-by: Ben Pfaff > --- > lib/sset.c | 31 +++ > lib/sset.h | 3 +++ > 2 files changed, 34 insertions(+) > > diff --git a/lib/sset.c b/lib/sset.c > index 3deb1f9de9be..b2e3f43ec91b 100644 > --- a/lib/sset.c > +++ b/lib/sset.c > @@ -18,6 +18,7 @@ > > #include "sset.h" > > +#include "openvswitch/dynamic-string.h" > #include "hash.h" > > static uint32_t > @@ -118,6 +119,36 @@ sset_from_delimited_string(struct sset *set, const char > *s_, > free(s); > } > > +/* Returns a malloc()'d string that consists of the concatenation of all of > the > + * strings in 'sset' in lexicographic order, each separated from the next by > + * 'delimiter' and followed by 'terminator'. For example: > + * > + * sset_join(("a", "b", "c"), ", ", ".") -> "a, b, c." > + * sset_join(("xyzzy"), ", ", ".") -> "xyzzy." > + * sset_join((""),", ", ".") -> "." > + * > + * The caller is responsible for freeing the returned string (with free()). > + */ > +char * > +sset_join(const struct sset *sset, > + const char *delimiter, const char *terminator) > +{ > +struct ds s = DS_EMPTY_INITIALIZER; > + > +const char **names = sset_sort(sset); > +for (size_t i = 0; i < sset_count(sset); i++) { > +if (i) { > +ds_put_cstr(&s, delimiter); > +} > +ds_put_cstr(&s, names[i]); > +} > +free(names); > + > +ds_put_cstr(&s, terminator); > + > +return ds_steal_cstr(&s); > +} > + > /* Returns true if 'set' contains no strings, false if it contains at least > one > * string. */ > bool > diff --git a/lib/sset.h b/lib/sset.h > index 768d0cf0a1f3..f0bb8b534496 100644 > --- a/lib/sset.h > +++ b/lib/sset.h > @@ -43,8 +43,11 @@ void sset_clone(struct sset *, const struct sset *); > void sset_swap(struct sset *, struct sset *); > void sset_moved(struct sset *); > > +/* String parsing and formatting. 
*/ > void sset_from_delimited_string(struct sset *, const char *s, > const char *delimiters); > +char *sset_join(const struct sset *, > +const char *delimiter, const char *terminator); > > /* Count. */ > bool sset_is_empty(const struct sset *); > -- > 2.21.0 > > ___ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
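The behavior documented for sset_join() above (sorted members, a delimiter between them, a terminator at the end) can be sketched without the OVS sset and ds types. The following is only an illustration under that assumption: join_sorted() is a hypothetical stand-in that works on a plain array of C strings and is not the function from the patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort() comparator for an array of 'const char *'. */
static int cmp_str(const void *a, const void *b)
{
    return strcmp(*(const char *const *) a, *(const char *const *) b);
}

/* Returns a malloc()'d string holding the 'n' strings in 'items' in
 * lexicographic order, separated by 'delimiter' and followed by
 * 'terminator'.  Returns NULL if allocation fails.  The caller frees the
 * result with free(). */
static char *join_sorted(const char **items, size_t n,
                         const char *delimiter, const char *terminator)
{
    assert(delimiter && terminator);

    /* Sort a private copy so the caller's array keeps its order. */
    const char **sorted = malloc((n ? n : 1) * sizeof *sorted);
    if (!sorted) {
        return NULL;
    }
    if (n) {
        memcpy(sorted, items, n * sizeof *sorted);
    }
    qsort(sorted, n, sizeof *sorted, cmp_str);

    /* Compute the exact output length, including the trailing NUL. */
    size_t dlen = strlen(delimiter);
    size_t len = strlen(terminator) + 1;
    for (size_t i = 0; i < n; i++) {
        len += strlen(sorted[i]) + (i ? dlen : 0);
    }

    char *out = malloc(len);
    if (!out) {
        free(sorted);
        return NULL;
    }

    char *p = out;
    for (size_t i = 0; i < n; i++) {
        if (i) {
            memcpy(p, delimiter, dlen);
            p += dlen;
        }
        size_t l = strlen(sorted[i]);
        memcpy(p, sorted[i], l);
        p += l;
    }
    strcpy(p, terminator);

    free(sorted);
    return out;
}
```

As with the function in the patch, the delimiter is emitted only between members, so a single-member input yields just that member plus the terminator.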
Re: [ovs-dev] [PATCH 2/2] db-ctl-base: Give better error messages for ambiguous abbreviations.
LGTM, thanks. Reviewed-by: Yifeng Sun On Wed, Sep 18, 2019 at 9:32 AM Ben Pfaff wrote: > > Tables and columns may be abbreviated to unique prefixes, but until now > the error messages have just said there's more than one match. This commit > makes the error messages list the possibilities. > > Signed-off-by: Ben Pfaff > --- > lib/db-ctl-base.c | 58 +- > tests/ovs-vsctl.at | 5 +++- > 2 files changed, 41 insertions(+), 22 deletions(-) > > diff --git a/lib/db-ctl-base.c b/lib/db-ctl-base.c > index 3bd9f006acb1..6ae638be5a2b 100644 > --- a/lib/db-ctl-base.c > +++ b/lib/db-ctl-base.c > @@ -430,31 +430,39 @@ static char * > get_column(const struct ovsdb_idl_table_class *table, const char > *column_name, > const struct ovsdb_idl_column **columnp) > { > +struct sset best_matches = SSET_INITIALIZER(&best_matches); > const struct ovsdb_idl_column *best_match = NULL; > unsigned int best_score = 0; > -size_t i; > > -for (i = 0; i < table->n_columns; i++) { > +for (size_t i = 0; i < table->n_columns; i++) { > const struct ovsdb_idl_column *column = &table->columns[i]; > unsigned int score = score_partial_match(column->name, column_name); > -if (score > best_score) { > +if (score && score >= best_score) { > +if (score > best_score) { > +sset_clear(&best_matches); > +} > +sset_add(&best_matches, column->name); > best_match = column; > best_score = score; > -} else if (score == best_score) { > -best_match = NULL; > } > } > > -*columnp = best_match; > -if (best_match) { > -return NULL; > -} else if (best_score) { > -return xasprintf("%s contains more than one column whose name " > - "matches \"%s\"", table->name, column_name); > +char *error = NULL; > +*columnp = NULL; > +if (!best_match) { > +error = xasprintf("%s does not contain a column whose name matches " > + "\"%s\"", table->name, column_name); > +} else if (sset_count(&best_matches) == 1) { > +*columnp = best_match; > } else { > -return xasprintf("%s does not contain a column whose name matches " > - "\"%s\"", table->name, 
column_name); > +char *matches = sset_join(&best_matches, ", ", ""); > +error = xasprintf("%s contains more than one column " > + "whose name matches \"%s\": %s", > + table->name, column_name, matches); > +free(matches); > } > +sset_destroy(&best_matches); > +return error; > } > > static char * OVS_WARN_UNUSED_RESULT > @@ -1207,27 +1215,35 @@ cmd_list(struct ctl_context *ctx) > static char * OVS_WARN_UNUSED_RESULT > get_table(const char *table_name, const struct ovsdb_idl_table_class > **tablep) > { > +struct sset best_matches = SSET_INITIALIZER(&best_matches); > const struct ovsdb_idl_table_class *best_match = NULL; > unsigned int best_score = 0; > -char *error = NULL; > > for (const struct ovsdb_idl_table_class *table = idl_classes; > table < &idl_classes[n_classes]; table++) { > unsigned int score = score_partial_match(table->name, table_name); > -if (score > best_score) { > +if (score && score >= best_score) { > +if (score > best_score) { > +sset_clear(&best_matches); > +} > +sset_add(&best_matches, table->name); > best_match = table; > best_score = score; > -} else if (score == best_score) { > -best_match = NULL; > } > } > -if (best_match) { > + > +char *error = NULL; > +if (!best_match) { > +error = xasprintf("unknown table \"%s\"", table_name); > +} else if (sset_count(&best_matches) == 1) { > *tablep = best_match; > -} else if (best_score) { > -error = xasprintf("multiple table names match \"%s\"", table_name); > } else { > -error = xasprintf("unknown table \"%s\"", table_name); > +char *matches = sset_join(&best_matches, ", ", ""); > +error = xasprintf("\"
Re: [ovs-dev] [PATCH v3] datapath: compat: Backports bugfixes for nf_conncount
Hi Ben, Could you please backport this patch to 2.12? Thanks. Yifeng On Thu, Aug 29, 2019 at 8:39 AM Yifeng Sun wrote: > > Thanks Yi-Hung for the explanation. > > On Wed, Aug 28, 2019 at 4:49 PM Yi-Hung Wei wrote: > > > > On Wed, Aug 28, 2019 at 4:07 PM Ben Pfaff wrote: > > > > > > On Wed, Aug 07, 2019 at 03:25:33PM -0700, Yifeng Sun wrote: > > > > This patch backports several critical bug fixes related to > > > > locking and data consistency in nf_conncount code. > > > > > > > > This backport is based on the following upstream net-next upstream > > > > commits. > > > > a007232 ("netfilter: nf_conncount: fix argument order to find_next_bit") > > > > c80f10b ("netfilter: nf_conncount: speculative garbage collection on > > > > empty lists") > > > > 2f971a8 ("netfilter: nf_conncount: move all list iterations under > > > > spinlock") > > > > df4a902 ("netfilter: nf_conncount: merge lookup and add functions") > > > > e8cfb37 ("netfilter: nf_conncount: restart search when nodes have been > > > > erased") > > > > f7fcc98 ("netfilter: nf_conncount: split gc in two phases") > > > > 4cd273b ("netfilter: nf_conncount: don't skip eviction when age is > > > > negative") > > > > c78e781 ("netfilter: nf_conncount: replace CONNCOUNT_LOCK_SLOTS with > > > > CONNCOUNT_SLOTS") > > > > d4e7df1 ("netfilter: nf_conncount: use rb_link_node_rcu() instead of > > > > rb_link_node()") > > > > 53ca0f2 ("netfilter: nf_conncount: remove wrong condition check > > > > routine") > > > > 3c5cdb1 ("netfilter: nf_conncount: fix unexpected permanent node of > > > > list.") > > > > 31568ec ("netfilter: nf_conncount: fix list_del corruption in > > > > conn_free") > > > > fd3e71a ("netfilter: nf_conncount: use spin_lock_bh instead of > > > > spin_lock") > > > > > > > > This patch adds additional compat code so that it can build on > > > > all supported kernel versions. > > > > > > I think that our most common approach is to use one OVS commit to > > > backport one Linux kernel commit. 
This commit combines many Linux > > > kernel commits. Is that an intentional change in this case? > > > > Hi Ben, > > > > Yes, we are intended to pull in all of the bug fixes in this case. > > The rationale is as following. > > > > For the commits in ovs kernel module, we usually backport one upstream > > net-next commit to one OVS commit. We need this fine granularity > > backports because a single OVS kernel module changes can affect OVS > > behavior. For the other type of kernel backports (mainly in > > ./datapath/linux/compat/ ), we try to backport the required missing > > features for ovs kernel module in the older kernel. The goal is to > > keep the older kernel in sync with the newer kernel on the required > > features, and we may not need much detailed information per upstream > > patch. In this case, it would be easier to pull in multiple patches > > at once. > > > > Some existing examples are, > > c387d8177f20 ("compat: Add ipv6 GRE and IPV6 Tunneling") > > 744964326f6c ("datapath: compat: Backports nf_conncount") > > > > Thanks, > > > > -Yi-Hung ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] acinclude: Fix false positive search for prandom_u32
LGTM, thanks Greg for the fix. Reviewed-by: Yifeng Sun On Tue, Oct 8, 2019 at 9:21 AM Greg Rose wrote: > > Searching random.h for prandom_u32 will also match when prandom_u32_max > is present and cause a false positive HAVE_PRANDOM_U32. Fix this up > by looking for the parenthesis following prandom_u32 so it won't > match on prandom_u32_max. > > Passes Travis: > https://travis-ci.org/gvrose8192/ovs-experimental/builds/595171808 > > Signed-off-by: Greg Rose > --- > acinclude.m4 | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/acinclude.m4 b/acinclude.m4 > index c729266..066c134 100644 > --- a/acinclude.m4 > +++ b/acinclude.m4 > @@ -723,7 +723,9 @@ AC_DEFUN([OVS_CHECK_LINUX_COMPAT], [ > [\(*nf_ct_timeout_find_get_hook\)], [net], > [OVS_DEFINE([HAVE_NF_CT_TIMEOUT_FIND_GET_HOOK_NET])]) > > - OVS_GREP_IFELSE([$KSRC/include/linux/random.h], [prandom_u32]) > + OVS_GREP_IFELSE([$KSRC/include/linux/random.h], > + [prandom_u32[[\(]]], > + [OVS_DEFINE([HAVE_PRANDOM_U32])]) >OVS_GREP_IFELSE([$KSRC/include/linux/random.h], [prandom_u32_max]) > >OVS_GREP_IFELSE([$KSRC/include/net/rtnetlink.h], [get_link_net]) > -- > 1.8.3.1 > > ___ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
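The false positive described above is a plain substring problem: any search for prandom_u32 also hits prandom_u32_max. A small C sketch of the two checks, assuming hypothetical helper names (declares_symbol_loose, declares_function) and a made-up header line; the real check is the m4 grep in acinclude.m4:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Loose check: true if 'symbol' appears anywhere in the text.  This is the
 * behavior that produced the false positive. */
static bool declares_symbol_loose(const char *header_text, const char *symbol)
{
    return strstr(header_text, symbol) != NULL;
}

/* Stricter check: true only if some occurrence of 'symbol' is immediately
 * followed by '(', as in the fixed pattern. */
static bool declares_function(const char *header_text, const char *symbol)
{
    size_t len = strlen(symbol);
    assert(len > 0);

    for (const char *p = strstr(header_text, symbol); p;
         p = strstr(p + 1, symbol)) {
        if (p[len] == '(') {
            return true;
        }
    }
    return false;
}
```

Like the pattern in the patch, the stricter check only looks at the character after the name; it does not examine what precedes it.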
[ovs-dev] [PATCH] rhel: Support RHEL7.7 build and packaging
This patch provides essential fixes for OVS to support RHEL7.7's new kernel. make rpm-fedora-kmod \ RPMBUILD_OPT='-D "kversion 3.10.0-1062.1.2.el7.x86_64"' Signed-off-by: Yifeng Sun --- rhel/openvswitch-kmod-fedora.spec.in | 9 + rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh | 14 ++ 2 files changed, 15 insertions(+), 8 deletions(-) diff --git a/rhel/openvswitch-kmod-fedora.spec.in b/rhel/openvswitch-kmod-fedora.spec.in index b3588982ef7a..fbb8366990f1 100644 --- a/rhel/openvswitch-kmod-fedora.spec.in +++ b/rhel/openvswitch-kmod-fedora.spec.in @@ -12,8 +12,9 @@ # Use the kversion macro such as # RPMBUILD_OPT='-D "kversion 3.10.0-693.1.1.el7.x86_64 3.10.0-693.17.1.el7.x86_64"' # to build package for mulitple kernel versions in the same package -# This only works for kernel 3.10.0 major revision 957 (RHEL 7.6), -# major revision 693 (RHEL 7.4) and major revision 327 (RHEL 7.2). +# This only works for kernel 3.10.0 major revision 1062 (RHEL 7.7), +# major revision 957 (RHEL 7.6), major revision 693 (RHEL 7.4) and +# major revision 327 (RHEL 7.2). 
# By default, build against the current running kernel version #%define kernel 3.1.5-1.fc16.x86_64 #define kernel %{kernel_source} @@ -92,8 +93,8 @@ if grep -qs "suse" /etc/os-release; then fi elif [ "$mainline_major" = "3" ] && [ "$mainline_minor" = "10" ] && { [ "$major_rev" = "327" ] || [ "$major_rev" = "693" ] || \ - [ "$major_rev" = "957" ]; }; then -# For RHEL 7.2, 7.4 and 7.6 + [ "$major_rev" = "957" ] || [ "$major_rev" == "1062" ]; }; then +# For RHEL 7.2, 7.4, 7.6 and 7.7 if [ -x "%{_datadir}/openvswitch/scripts/ovs-kmod-manage.sh" ]; then %{_datadir}/openvswitch/scripts/ovs-kmod-manage.sh fi diff --git a/rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh b/rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh index 693fb0b744b3..a643b55ff0f8 100644 --- a/rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh +++ b/rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh @@ -15,9 +15,10 @@ # limitations under the License. # This script is intended to be used on the following kernels. -# - 3.10.0 major revision 327 (RHEL 7.2) -# - 3.10.0 major revision 693 (RHEL 7.4) -# - 3.10.0 major revision 957 (RHEL 7.6) +# - 3.10.0 major revision 327 (RHEL 7.2) +# - 3.10.0 major revision 693 (RHEL 7.4) +# - 3.10.0 major revision 957 (RHEL 7.6) +# - 3.10.0 major revision 1062 (RHEL 7.7) # - 4.4.x, x >= 73 (SLES 12 SP3) # - 4.12.x, x >= 14 (SLES 12 SP4). 
# It is packaged in the openvswitch kmod RPM and run in the post-install @@ -100,6 +101,11 @@ if [ "$mainline_major" = "3" ] && [ "$mainline_minor" = "10" ]; then comp_ver=10 ver_offset=4 installed_ver="$minor_rev" +elif [ "$major_rev" = "1062" ]; then +#echo "rhel77" +comp_ver=10 +ver_offset=4 +installed_ver="$minor_rev" fi elif [ "$mainline_major" = "4" ] && [ "$mainline_minor" = "4" ]; then if [ "$mainline_patch" -ge "73" ]; then @@ -111,7 +117,7 @@ elif [ "$mainline_major" = "4" ] && [ "$mainline_minor" = "4" ]; then elif [ "$mainline_major" = "4" ] && [ "$mainline_minor" = "12" ]; then if [ "$mainline_patch" -ge "14" ]; then #echo "sles12sp4" -comp_ver=14 +comp_ver=1 ver_offset=2 installed_ver="$mainline_patch" fi -- 2.7.4 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH] dpif-netlink: Free leaked nl_sock
Valgrind reports:

20 bytes in 1 blocks are definitely lost in loss record 94 of 353
   by 0x532594: xmalloc (util.c:138)
   by 0x553EAD: nl_sock_create (netlink-socket.c:146)
   by 0x54331D: create_nl_sock (dpif-netlink.c:255)
   by 0x54331D: dpif_netlink_port_add__ (dpif-netlink.c:756)
   by 0x5435F6: dpif_netlink_port_add_compat (dpif-netlink.c:876)
   by 0x5435F6: dpif_netlink_port_add (dpif-netlink.c:922)
   by 0x47EC1D: dpif_port_add (dpif.c:584)
   by 0x42B35F: port_add (ofproto-dpif.c:3721)
   by 0x41E64A: ofproto_port_add (ofproto.c:2032)
   by 0x40B3FE: iface_do_create (bridge.c:1817)
   by 0x40B3FE: iface_create (bridge.c:1855)
   by 0x40B3FE: bridge_add_ports__ (bridge.c:943)
   by 0x40D14A: bridge_add_ports (bridge.c:959)
   by 0x40D14A: bridge_reconfigure (bridge.c:673)
   by 0x410D75: bridge_run (bridge.c:3050)
   by 0x407614: main (ovs-vswitchd.c:127)

The leak occurs because vport_add_channel() is expected to take ownership
of 'socksp' when it returns 0. This patch fixes the issue.

Signed-off-by: Yifeng Sun
---
 lib/dpif-netlink.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
index ceb39cde1ab1..ebe22106e0fc 100644
--- a/lib/dpif-netlink.c
+++ b/lib/dpif-netlink.c
@@ -458,6 +458,7 @@ vport_add_channel(struct dpif_netlink *dpif, odp_port_t port_no,
     int error;
 
     if (dpif->handlers == NULL) {
+        close_nl_sock(socksp);
         return 0;
     }
 
-- 
2.7.4
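The ownership rule in this commit message can be illustrated in isolation: if a function takes ownership of an object whenever it returns 0, then every path that returns 0 must either store the object or release it. A rough sketch under that reading, where struct sock, struct channel_table, and add_channel() are all hypothetical stand-ins rather than the dpif-netlink types:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical socket object with a counter that makes leaks observable. */
struct sock {
    int fd;
};

static int open_socks = 0;

static struct sock *sock_create(void)
{
    struct sock *s = malloc(sizeof *s);
    if (s) {
        s->fd = -1;
        open_socks++;
    }
    return s;
}

static void sock_destroy(struct sock *s)
{
    if (s) {
        open_socks--;
        free(s);
    }
}

/* Hypothetical channel table with a single slot. */
struct channel_table {
    bool enabled;          /* Plays the role of "handlers are configured". */
    struct sock *slot;
};

/* Takes ownership of 'sock' if and only if it returns 0.  On a nonzero
 * return the caller still owns 'sock'. */
static int add_channel(struct channel_table *t, struct sock *sock)
{
    assert(t != NULL);

    if (!t->enabled) {
        /* Success, but nothing will ever reference 'sock': release it here,
         * because the caller has already handed over ownership. */
        sock_destroy(sock);
        return 0;
    }
    if (t->slot) {
        return EBUSY;      /* Failure: ownership stays with the caller. */
    }
    t->slot = sock;        /* Success: the table now owns 'sock'. */
    return 0;
}
```

Writing the ownership contract in the function comment, as above, makes it easier to audit each return statement against it.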
Re: [ovs-dev] [PATCH] dpif-netlink: Fix some variable naming.
LGTM, thanks. Reviewed-by: Yifeng Sun On Mon, Oct 14, 2019 at 11:28 AM Ben Pfaff wrote: > > Usually a plural name refers to an array, but 'socks' and 'socksp' were > only single objects, so this changes their names to 'sock' and 'sockp'. > > Usually a 'p' suffix means that a variable is an output argument, but > that was only true in one place here, so this changes the names of the > other variables to plain 'sock'. > > Signed-off-by: Ben Pfaff > --- > lib/dpif-netlink.c | 48 +++--- > 1 file changed, 24 insertions(+), 24 deletions(-) > > diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c > index ebe22106e0fc..d1f9b81db84f 100644 > --- a/lib/dpif-netlink.c > +++ b/lib/dpif-netlink.c > @@ -249,11 +249,11 @@ static int dpif_netlink_port_query__(const struct > dpif_netlink *dpif, > struct dpif_port *dpif_port); > > static int > -create_nl_sock(struct dpif_netlink *dpif OVS_UNUSED, struct nl_sock **socksp) > +create_nl_sock(struct dpif_netlink *dpif OVS_UNUSED, struct nl_sock **sockp) > OVS_REQ_WRLOCK(dpif->upcall_lock) > { > #ifndef _WIN32 > -return nl_sock_create(NETLINK_GENERIC, socksp); > +return nl_sock_create(NETLINK_GENERIC, sockp); > #else > /* Pick netlink sockets to use in a round-robin fashion from each > * handler's pool of sockets. */ > @@ -263,13 +263,13 @@ create_nl_sock(struct dpif_netlink *dpif OVS_UNUSED, > struct nl_sock **socksp) > > /* A pool of sockets is allocated when the handler is initialized. */ > if (sock_pool == NULL) { > -*socksp = NULL; > +*sockp = NULL; > return EINVAL; > } > > ovs_assert(index < VPORT_SOCK_POOL_SIZE); > -*socksp = sock_pool[index].nl_sock; > -ovs_assert(*socksp); > +*sockp = sock_pool[index].nl_sock; > +ovs_assert(*sockp); > index = (index == VPORT_SOCK_POOL_SIZE - 1) ? 
0 : index + 1; > handler->last_used_pool_idx = index; > return 0; > @@ -277,10 +277,10 @@ create_nl_sock(struct dpif_netlink *dpif OVS_UNUSED, > struct nl_sock **socksp) > } > > static void > -close_nl_sock(struct nl_sock *socksp) > +close_nl_sock(struct nl_sock *sock) > { > #ifndef _WIN32 > -nl_sock_destroy(socksp); > +nl_sock_destroy(sock); > #endif > } > > @@ -450,7 +450,7 @@ vport_get_pid(struct dpif_netlink *dpif, uint32_t > port_idx, > > static int > vport_add_channel(struct dpif_netlink *dpif, odp_port_t port_no, > - struct nl_sock *socksp) > + struct nl_sock *sock) > { > struct epoll_event event; > uint32_t port_idx = odp_to_u32(port_no); > @@ -458,7 +458,7 @@ vport_add_channel(struct dpif_netlink *dpif, odp_port_t > port_no, > int error; > > if (dpif->handlers == NULL) { > -close_nl_sock(socksp); > +close_nl_sock(sock); > return 0; > } > > @@ -499,14 +499,14 @@ vport_add_channel(struct dpif_netlink *dpif, odp_port_t > port_no, > struct dpif_handler *handler = &dpif->handlers[i]; > > #ifndef _WIN32 > -if (epoll_ctl(handler->epoll_fd, EPOLL_CTL_ADD, nl_sock_fd(socksp), > +if (epoll_ctl(handler->epoll_fd, EPOLL_CTL_ADD, nl_sock_fd(sock), >&event) < 0) { > error = errno; > goto error; > } > #endif > } > -dpif->channels[port_idx].sock = socksp; > +dpif->channels[port_idx].sock = sock; > dpif->channels[port_idx].last_poll = LLONG_MIN; > > return 0; > @@ -515,7 +515,7 @@ error: > #ifndef _WIN32 > while (i--) { > epoll_ctl(dpif->handlers[i].epoll_fd, EPOLL_CTL_DEL, > - nl_sock_fd(socksp), NULL); > + nl_sock_fd(sock), NULL); > } > #endif > dpif->channels[port_idx].sock = NULL; > @@ -750,12 +750,12 @@ dpif_netlink_port_add__(struct dpif_netlink *dpif, > const char *name, > { > struct dpif_netlink_vport request, reply; > struct ofpbuf *buf; > -struct nl_sock *socksp = NULL; > +struct nl_sock *sock = NULL; > uint32_t upcall_pids = 0; > int error = 0; > > if (dpif->handlers) { > -error = create_nl_sock(dpif, &socksp); > +error = create_nl_sock(dpif, &sock); > if 
(error) { > return error; > } > @@ -768,8 +768,8 @@ dpif_netlink_port_add__(struct dpif_netlink *dpif, const > char *name, > request.name = name; > > request.port_no = *port_nop;
Re: [ovs-dev] [PATCH 01/11] datapath: Replace nf_ct_invert_tuplepr() with nf_ct_invert_tuple()
A minor issue in commit message:
pl_nf_ct_invert_tuple => rpl_nf_ct_invert_tuple

Other than that, LGTM, thanks.

Reviewed-by: Yifeng Sun

On Mon, Oct 14, 2019 at 10:50 AM Yi-Hung Wei wrote:
>
> After upstream net-next commit 303e0c558959 ("netfilter: conntrack:
> avoid unneeded nf_conntrack_l4proto lookups") nf_ct_invert_tuplepr()
> is no longer available in the kernel.
>
> Ideally, we should be in sync with upstream kernel by calling
> nf_ct_invert_tuple() directly in conntrack.c. However,
> nf_ct_invert_tuple() has different function signature in older kernel,
> and it would be hard to replace that in the compat layer. Thus, we
> use pl_nf_ct_invert_tuple() in conntrack.c and maintain compatibility
> in the compat layer so that ovs kernel module runs smoothly in both
> new and old kernel.
>
> Signed-off-by: Yi-Hung Wei
> ---
>  acinclude.m4 | 2 ++
>  datapath/conntrack.c | 2 +-
>  .../linux/compat/include/net/netfilter/nf_conntrack_core.h | 14 ++
>  3 files changed, 17 insertions(+), 1 deletion(-)
>
> diff --git a/acinclude.m4 b/acinclude.m4
> index 52f92870eaaa..4072a7c8f58a 100644
> --- a/acinclude.m4
> +++ b/acinclude.m4
> @@ -697,6 +697,8 @@ AC_DEFUN([OVS_CHECK_LINUX_COMPAT], [
>                    [nf_ct_set])
>    OVS_GREP_IFELSE([$KSRC/include/net/netfilter/nf_conntrack.h],
>                    [nf_ct_is_untracked])
> +  OVS_GREP_IFELSE([$KSRC/include/net/netfilter/nf_conntrack.h],
> +                  [nf_ct_invert_tuplepr])
>    OVS_GREP_IFELSE([$KSRC/include/net/netfilter/nf_conntrack_zones.h],
>                    [nf_ct_zone_init])
>    OVS_GREP_IFELSE([$KSRC/include/net/netfilter/nf_conntrack_l3proto.h],
> diff --git a/datapath/conntrack.c b/datapath/conntrack.c
> index e328afe1ad15..afdd65b4cb7c 100644
> --- a/datapath/conntrack.c
> +++ b/datapath/conntrack.c
> @@ -668,7 +668,7 @@ ovs_ct_find_existing(struct net *net, const struct nf_conntrack_zone *zone,
>          if (natted) {
>                  struct nf_conntrack_tuple inverse;
>
> -                if (!nf_ct_invert_tuplepr(&inverse, &tuple)) {
> +                if (!rpl_nf_ct_invert_tuple(&inverse, &tuple)) {
>                          pr_debug("ovs_ct_find_existing: Inversion failed!\n");
>                          return NULL;
>                  }
> diff --git a/datapath/linux/compat/include/net/netfilter/nf_conntrack_core.h b/datapath/linux/compat/include/net/netfilter/nf_conntrack_core.h
> index 10158011fd4d..ad52bc9412d8 100644
> --- a/datapath/linux/compat/include/net/netfilter/nf_conntrack_core.h
> +++ b/datapath/linux/compat/include/net/netfilter/nf_conntrack_core.h
> @@ -113,4 +113,18 @@ rpl_nf_conntrack_in(struct sk_buff *skb, const struct nf_hook_state *state)
>  #define nf_conntrack_in rpl_nf_conntrack_in
>  #endif /* HAVE_NF_CONNTRACK_IN_TAKES_NF_HOOK_STATE */
>
> +#ifdef HAVE_NF_CT_INVERT_TUPLEPR
> +static inline bool rpl_nf_ct_invert_tuple(struct nf_conntrack_tuple *inverse,
> +                                          const struct nf_conntrack_tuple *orig)
> +{
> +        return nf_ct_invert_tuplepr(inverse, orig);
> +}
> +#else
> +static inline bool rpl_nf_ct_invert_tuple(struct nf_conntrack_tuple *inverse,
> +                                          const struct nf_conntrack_tuple *orig)
> +{
> +        return nf_ct_invert_tuple(inverse, orig);
> +}
> +#endif /* HAVE_NF_CT_INVERT_TUPLEPR */
> +
>  #endif /* _NF_CONNTRACK_CORE_WRAPPER_H */
> --
> 2.7.4
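For readers following the thread, the compat-wrapper idea in this patch can be illustrated outside the kernel. The following is only a minimal userspace sketch, under stated assumptions: `struct tuple` and the bodies of the two "kernel" functions are hypothetical stand-ins, and the feature macro is toggled by hand rather than by the acinclude.m4 check. It shows the one thing the patch relies on: callers use a single `rpl_` name, and the preprocessor picks whichever underlying function the build found.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct nf_conntrack_tuple. */
struct tuple {
    unsigned int src;
    unsigned int dst;
};

/* Uncomment to simulate an older kernel that still provides the
 * "tuplepr" variant. In the real tree this macro would come from the
 * configure-time grep, not from a manual define. */
/* #define HAVE_NF_CT_INVERT_TUPLEPR 1 */

#ifdef HAVE_NF_CT_INVERT_TUPLEPR
/* Stand-in for the older kernel function (body is illustrative only). */
static bool nf_ct_invert_tuplepr(struct tuple *inverse,
                                 const struct tuple *orig)
{
    inverse->src = orig->dst;
    inverse->dst = orig->src;
    return true;
}

static inline bool rpl_nf_ct_invert_tuple(struct tuple *inverse,
                                          const struct tuple *orig)
{
    return nf_ct_invert_tuplepr(inverse, orig);
}
#else
/* Stand-in for the newer kernel function (body is illustrative only). */
static bool nf_ct_invert_tuple(struct tuple *inverse,
                               const struct tuple *orig)
{
    inverse->src = orig->dst;
    inverse->dst = orig->src;
    return true;
}

static inline bool rpl_nf_ct_invert_tuple(struct tuple *inverse,
                                          const struct tuple *orig)
{
    return nf_ct_invert_tuple(inverse, orig);
}
#endif /* HAVE_NF_CT_INVERT_TUPLEPR */
```

Either way the macro is set, a caller only ever writes `rpl_nf_ct_invert_tuple(&inverse, &tuple)`, which is why conntrack.c needs a single one-line change.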
Re: [ovs-dev] [PATCH 03/11] datapath: add seqadj extension when NAT is used.
LGTM, thanks.

Reviewed-by: Yifeng Sun

On Mon, Oct 14, 2019 at 10:51 AM Yi-Hung Wei wrote:
>
> From: Flavio Leitner
>
> upstream patch:
>
> commit fa7e428c6b7ed3281610511a2b2ec716d9894be8
> Author: Flavio Leitner
> Date: Mon Mar 25 15:58:31 2019 -0300
>
>     openvswitch: add seqadj extension when NAT is used.
>
>     When the conntrack is initialized, there is no helper attached
>     yet so the nat info initialization (nf_nat_setup_info) skips
>     adding the seqadj ext.
>
>     A helper is attached later when the conntrack is not confirmed
>     but is going to be committed. In this case, if NAT is needed then
>     adds the seqadj ext as well.
>
>     Fixes: 16ec3d4fbb96 ("openvswitch: Fix cached ct with helper.")
>     Signed-off-by: Flavio Leitner
>     Acked-by: Pravin B Shelar
>     Signed-off-by: David S. Miller
>
> Signed-off-by: Yi-Hung Wei
> ---
>  datapath/conntrack.c | 6 ++
>  1 file changed, 6 insertions(+)
>
> diff --git a/datapath/conntrack.c b/datapath/conntrack.c
> index 291d4f4723d9..1b345a03e704 100644
> --- a/datapath/conntrack.c
> +++ b/datapath/conntrack.c
> @@ -1063,6 +1063,12 @@ static int __ovs_ct_lookup(struct net *net, struct sw_flow_key *key,
>                                                      GFP_ATOMIC);
>                  if (err)
>                          return err;
> +
> +                /* helper installed, add seqadj if NAT is required */
> +                if (info->nat && !nfct_seqadj(ct)) {
> +                        if (!nfct_seqadj_ext_add(ct))
> +                                return -EINVAL;
> +                }
>          }
>
>          /* Call the helper only if:
> --
> 2.7.4
Re: [ovs-dev] [PATCH 02/11] datapath: Detect upstream nf_nat change
LGTM, thanks.

Reviewed-by: Yifeng Sun

On Mon, Oct 14, 2019 at 10:51 AM Yi-Hung Wei wrote:
>
> The following two upstream commits merge nf_nat_ipv4 and nf_nat_ipv6
> into nf_nat core, and move some header files around. To handle
> these modifications, this patch detects the upstream changes, uses
> the header files and config symbols properly.
>
> Ideally, we should replace CONFIG_NF_NAT_IPV4 and CONFIG_NF_NAT_IPV6 with
> CONFIG_NF_NAT and CONFIG_IPV6. In order to keep backward compatibility,
> we keep the checking of CONFIG_NF_NAT_IPV4/6 as is for the old kernel,
> and replace them with a macro for the new kernel.
>
> upstream commits:
> 3bf195ae6037 ("netfilter: nat: merge nf_nat_ipv4,6 into nat core")
> d2c5c103b133 ("netfilter: nat: remove nf_nat_l3proto.h and nf_nat_core.h")
>
> Signed-off-by: Yi-Hung Wei
> ---
>  acinclude.m4 | 2 ++
>  datapath/conntrack.c | 13 -
>  2 files changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/acinclude.m4 b/acinclude.m4
> index 4072a7c8f58a..cc80026f2127 100644
> --- a/acinclude.m4
> +++ b/acinclude.m4
> @@ -713,6 +713,8 @@ AC_DEFUN([OVS_CHECK_LINUX_COMPAT], [
>    OVS_GREP_IFELSE([$KSRC/include/net/netfilter/nf_nat.h], [nf_ct_nat_ext_add])
>    OVS_GREP_IFELSE([$KSRC/include/net/netfilter/nf_nat.h], [nf_nat_alloc_null_binding])
>    OVS_GREP_IFELSE([$KSRC/include/net/netfilter/nf_nat.h], [nf_nat_range2])
> +  OVS_GREP_IFELSE([$KSRC/include/net/netfilter/nf_nat.h], [nf_nat_packet],
> +                  [OVS_DEFINE([HAVE_UPSTREAM_NF_NAT])])
>    OVS_GREP_IFELSE([$KSRC/include/net/netfilter/nf_conntrack_seqadj.h], [nf_ct_seq_adjust])
>    OVS_GREP_IFELSE([$KSRC/include/net/netfilter/nf_conntrack_count.h], [nf_conncount_gc_list],
>                    [OVS_GREP_IFELSE([$KSRC/include/net/netfilter/nf_conntrack_count.h],
> diff --git a/datapath/conntrack.c b/datapath/conntrack.c
> index afdd65b4cb7c..291d4f4723d9 100644
> --- a/datapath/conntrack.c
> +++ b/datapath/conntrack.c
> @@ -35,10 +35,21 @@
>  #include
>
>  #ifdef CONFIG_NF_NAT_NEEDED
> +/* Starting from upstream commit 3bf195ae6037 ("netfilter: nat: merge
> + * nf_nat_ipv4,6 into nat core") in kernel 5.1. nf_nat_ipv4,6 are merged
> + * into nf_nat. In order to keep backward compatibility, we keep the config
> + * checking as is for the old kernel, and replace them with marco for the
> + * new kernel. */
> +#ifdef HAVE_UPSTREAM_NF_NAT
> +#include
> +#define CONFIG_NF_NAT_IPV4 CONFIG_NF_NAT
> +#define CONFIG_NF_NAT_IPV6 CONFIG_IPV6
> +#else
>  #include
>  #include
>  #include
> -#endif
> +#endif /* HAVE_UPSTREAM_NF_NAT */
> +#endif /* CONFIG_NF_NAT_NEEDED */
>
>  #include "datapath.h"
>  #include "conntrack.h"
> --
> 2.7.4
Re: [ovs-dev] [PATCH 04/11] datapath: Handle NF_NAT_NEEDED replacement
LGTM, thanks. Reviewed-by: Yifeng Sun On Mon, Oct 14, 2019 at 10:52 AM Yi-Hung Wei wrote: > > Starting from the following upstream commit, NF_NAT_NEEDED is replaced > by IS_ENABLED(CONFIG_NF_NAT) in the upstream kernel. This patch makes > some changes so that our in tree ovs kernel module is compatible to > both old and new kernels. > > Upstream commit: > commit 4806e975729f99c7908d1688a143f1e16d464e6c > Author: Florian Westphal > Date: Wed Mar 27 09:22:26 2019 +0100 > > netfilter: replace NF_NAT_NEEDED with IS_ENABLED(CONFIG_NF_NAT) > > NF_NAT_NEEDED is true whenever nat support for either ipv4 or ipv6 is > enabled. Now that the af-specific nat configuration switches have been > removed, IS_ENABLED(CONFIG_NF_NAT) has the same effect. > > Signed-off-by: Florian Westphal > Signed-off-by: Pablo Neira Ayuso > > Signed-off-by: Yi-Hung Wei > --- > acinclude.m4 | 1 + > datapath/conntrack.c | 25 + > 2 files changed, 18 insertions(+), 8 deletions(-) > > diff --git a/acinclude.m4 b/acinclude.m4 > index cc80026f2127..dca09abefa96 100644 > --- a/acinclude.m4 > +++ b/acinclude.m4 > @@ -676,6 +676,7 @@ AC_DEFUN([OVS_CHECK_LINUX_COMPAT], [ >OVS_FIND_FIELD_IFELSE([$KSRC/include/linux/netfilter.h], [nf_hook_ops], > [owner], [OVS_DEFINE([HAVE_NF_HOOKS_OPS_OWNER])]) >OVS_GREP_IFELSE([$KSRC/include/linux/netfilter.h], [NFPROTO_INET]) > + OVS_GREP_IFELSE([$KSRC/include/linux/netfilter.h], [CONFIG_NF_NAT_NEEDED]) > > >OVS_FIND_FIELD_IFELSE([$KSRC/include/linux/netfilter_ipv6.h], > [nf_ipv6_ops], > diff --git a/datapath/conntrack.c b/datapath/conntrack.c > index 1b345a03e704..010f9af5ffd2 100644 > --- a/datapath/conntrack.c > +++ b/datapath/conntrack.c > @@ -34,7 +34,16 @@ > #include > #include > > -#ifdef CONFIG_NF_NAT_NEEDED > +/* Upstream commit 4806e975729f ("netfilter: replace NF_NAT_NEEDED with > + * IS_ENABLED(CONFIG_NF_NAT)") replaces the config checking on NF_NAT_NEEDED > + * with CONFIG_NF_NAT. 
We will replace the checking on NF_NAT_NEEDED for the > + * newer kernel with the marco in order to keep backward compatiblity. > + */ > +#ifndef HAVE_CONFIG_NF_NAT_NEEDED > +#define CONFIG_NF_NAT_NEEDED CONFIG_NF_NAT > +#endif > + > +#if IS_ENABLED(CONFIG_NF_NAT_NEEDED) > /* Starting from upstream commit 3bf195ae6037 ("netfilter: nat: merge > * nf_nat_ipv4,6 into nat core") in kernel 5.1. nf_nat_ipv4,6 are merged > * into nf_nat. In order to keep backward compatibility, we keep the config > @@ -100,7 +109,7 @@ struct ovs_conntrack_info { > struct md_labels labels; > char timeout[CTNL_TIMEOUT_NAME_MAX]; > struct nf_ct_timeout *nf_ct_timeout; > -#ifdef CONFIG_NF_NAT_NEEDED > +#if IS_ENABLED(CONFIG_NF_NAT_NEEDED) > struct nf_nat_range2 range; /* Only present for SRC NAT and DST NAT. > */ > #endif > }; > @@ -786,7 +795,7 @@ static bool skb_nfct_cached(struct net *net, > return ct_executed; > } > > -#ifdef CONFIG_NF_NAT_NEEDED > +#if IS_ENABLED(CONFIG_NF_NAT_NEEDED) > /* Modelled after nf_nat_ipv[46]_fn(). > * range is only used for new, uninitialized NAT state. > * Returns either NF_ACCEPT or NF_DROP. > @@ -1405,7 +1414,7 @@ static int ovs_ct_add_helper(struct ovs_conntrack_info > *info, const char *name, > return 0; > } > > -#ifdef CONFIG_NF_NAT_NEEDED > +#if IS_ENABLED(CONFIG_NF_NAT_NEEDED) > static int parse_nat(const struct nlattr *attr, > struct ovs_conntrack_info *info, bool log) > { > @@ -1547,7 +1556,7 @@ static const struct ovs_ct_len_tbl > ovs_ct_attr_lens[OVS_CT_ATTR_MAX + 1] = { > .maxlen = sizeof(struct md_labels) }, > [OVS_CT_ATTR_HELPER]= { .minlen = 1, > .maxlen = NF_CT_HELPER_NAME_LEN }, > -#ifdef CONFIG_NF_NAT_NEEDED > +#if IS_ENABLED(CONFIG_NF_NAT_NEEDED) > /* NAT length is checked when parsing the nested attributes. 
*/ > [OVS_CT_ATTR_NAT] = { .minlen = 0, .maxlen = INT_MAX }, > #endif > @@ -1627,7 +1636,7 @@ static int parse_ct(const struct nlattr *attr, struct > ovs_conntrack_info *info, > return -EINVAL; > } > break; > -#ifdef CONFIG_NF_NAT_NEEDED > +#if IS_ENABLED(CONFIG_NF_NAT_NEEDED) > case OVS_CT_ATTR_NAT: { > int err = parse_nat(a, info, log); > > @@ -1761,7 +1770,7 @@ err_free_ct: > return err; > } > > -#ifdef CONFIG_NF_NAT_NEEDED > +#if IS_ENABLED(CONFIG_NF_NAT_NEEDED) > static b
Re: [ovs-dev] [PATCH 05/11] datapath: Use nla_nest_start_noflag()
LGTM, thanks. Reviewed-by: Yifeng Sun On Mon, Oct 14, 2019 at 10:53 AM Yi-Hung Wei wrote: > > This patch backports the openvswitch changes and update the compat layer > for the following upstream patch. > > commit ae0be8de9a53cda3505865c11826d8ff0640237c > Author: Michal Kubecek > Date: Fri Apr 26 11:13:06 2019 +0200 > > netlink: make nla_nest_start() add NLA_F_NESTED flag > > Even if the NLA_F_NESTED flag was introduced more than 11 years ago, most > netlink based interfaces (including recently added ones) are still not > setting it in kernel generated messages. Without the flag, message parsers > not aware of attribute semantics (e.g. wireshark dissector or libmnl's > mnl_nlmsg_fprintf()) cannot recognize nested attributes and won't display > the structure of their contents. > > Unfortunately we cannot just add the flag everywhere as there may be > userspace applications which check nlattr::nla_type directly rather than > through a helper masking out the flags. Therefore the patch renames > nla_nest_start() to nla_nest_start_noflag() and introduces > nla_nest_start() > as a wrapper adding NLA_F_NESTED. The calls which add NLA_F_NESTED > manually > are rewritten to use nla_nest_start(). > > Except for changes in include/net/netlink.h, the patch was generated using > this semantic patch: > > @@ expression E1, E2; @@ > -nla_nest_start(E1, E2) > +nla_nest_start_noflag(E1, E2) > > @@ expression E1, E2; @@ > -nla_nest_start_noflag(E1, E2 | NLA_F_NESTED) > +nla_nest_start(E1, E2) > > Signed-off-by: Michal Kubecek > Acked-by: Jiri Pirko > Acked-by: David Ahern > Signed-off-by: David S. 
Miller > > Signed-off-by: Yi-Hung Wei > --- > acinclude.m4| 1 + > datapath/conntrack.c| 6 +++--- > datapath/datapath.c | 7 +++--- > datapath/flow_netlink.c | 33 > +++-- > datapath/linux/compat/include/net/netlink.h | 9 > datapath/meter.c| 8 +++ > datapath/vport-vxlan.c | 2 +- > datapath/vport.c| 2 +- > 8 files changed, 40 insertions(+), 28 deletions(-) > > diff --git a/acinclude.m4 b/acinclude.m4 > index dca09abefa96..fe121ab9126d 100644 > --- a/acinclude.m4 > +++ b/acinclude.m4 > @@ -844,6 +844,7 @@ AC_DEFUN([OVS_CHECK_LINUX_COMPAT], [ >OVS_GREP_IFELSE([$KSRC/include/net/netlink.h], [nla_put_in_addr]) >OVS_GREP_IFELSE([$KSRC/include/net/netlink.h], [nla_find_nested]) >OVS_GREP_IFELSE([$KSRC/include/net/netlink.h], [nla_is_last]) > + OVS_GREP_IFELSE([$KSRC/include/net/netlink.h], [nla_nest_start_noflag]) >OVS_GREP_IFELSE([$KSRC/include/linux/netlink.h], [void.*netlink_set_err], >[OVS_DEFINE([HAVE_VOID_NETLINK_SET_ERR])]) >OVS_FIND_PARAM_IFELSE([$KSRC/include/net/netlink.h], > diff --git a/datapath/conntrack.c b/datapath/conntrack.c > index 010f9af5ffd2..b11a30965147 100644 > --- a/datapath/conntrack.c > +++ b/datapath/conntrack.c > @@ -1776,7 +1776,7 @@ static bool ovs_ct_nat_to_attr(const struct > ovs_conntrack_info *info, > { > struct nlattr *start; > > - start = nla_nest_start(skb, OVS_CT_ATTR_NAT); > + start = nla_nest_start_noflag(skb, OVS_CT_ATTR_NAT); > if (!start) > return false; > > @@ -1847,7 +1847,7 @@ int ovs_ct_action_to_attr(const struct > ovs_conntrack_info *ct_info, > { > struct nlattr *start; > > - start = nla_nest_start(skb, OVS_ACTION_ATTR_CT); > + start = nla_nest_start_noflag(skb, OVS_ACTION_ATTR_CT); > if (!start) > return -EMSGSIZE; > > @@ -2257,7 +2257,7 @@ static int ovs_ct_limit_cmd_get(struct sk_buff *skb, > struct genl_info *info) > if (IS_ERR(reply)) > return PTR_ERR(reply); > > - nla_reply = nla_nest_start(reply, OVS_CT_LIMIT_ATTR_ZONE_LIMIT); > + nla_reply = nla_nest_start_noflag(reply, > OVS_CT_LIMIT_ATTR_ZONE_LIMIT); > > if 
(a[OVS_CT_LIMIT_ATTR_ZONE_LIMIT]) { > err = ovs_ct_limit_get_zone_limit( > diff --git a/datapath/datapath.c b/datapath/datapath.c > index 94e4f6ffd6e9..78e2e6310529 100644 > --- a/datapath/datapath.c > +++ b/datapath/datapath.c > @@ -475,7 +475,8 @@ static int queue_userspace_packet(struct datapath *dp, > struct sk_buff *skb, > > > if (upcall_info->egress_tun_info) { > - nla = nla_nest_start(user_skb, > OVS_PACKET_ATTR_EGRESS_TUN_KEY); > + nla = nla_n
Re: [ovs-dev] [PATCH 06/11] datapath: genetlink: optionally validate strictly/dumps
LGTM, thanks. Reviewed-by: Yifeng Sun On Mon, Oct 14, 2019 at 10:53 AM Yi-Hung Wei wrote: > > This patch backports the following upstream commit within the > openvswitch kernel module with some checks so that it also works > in the older kernel. > > Upstream commit: > commit ef6243acb4782df587a4d7d6c310fa5b5d82684b > Author: Johannes Berg > Date: Fri Apr 26 14:07:31 2019 +0200 > > genetlink: optionally validate strictly/dumps > > Add options to strictly validate messages and dump messages, > sometimes perhaps validating dump messages non-strictly may > be required, so add an option for that as well. > > Since none of this can really be applied to existing commands, > set the options everwhere using the following spatch: > > @@ > identifier ops; > expression X; > @@ > struct genl_ops ops[] = { > ..., > { > .cmd = X, > + .validate = GENL_DONT_VALIDATE_STRICT | > GENL_DONT_VALIDATE_DUMP, > ... > }, > ... > }; > > For new commands one should just not copy the .validate 'opt-out' > flags and thus get strict validation. > > Signed-off-by: Johannes Berg > Signed-off-by: David S. 
Miller > > Signed-off-by: Yi-Hung Wei > --- > acinclude.m4 | 1 + > datapath/conntrack.c | 9 + > datapath/datapath.c | 39 +++ > datapath/meter.c | 12 > 4 files changed, 61 insertions(+) > > diff --git a/acinclude.m4 b/acinclude.m4 > index fe121ab9126d..055f5387db19 100644 > --- a/acinclude.m4 > +++ b/acinclude.m4 > @@ -817,6 +817,7 @@ AC_DEFUN([OVS_CHECK_LINUX_COMPAT], [ >OVS_GREP_IFELSE([$KSRC/include/net/genetlink.h], [genlmsg_parse]) >OVS_GREP_IFELSE([$KSRC/include/net/genetlink.h], [genl_notify.*family], >[OVS_DEFINE([HAVE_GENL_NOTIFY_TAKES_FAMILY])]) > + OVS_GREP_IFELSE([$KSRC/include/net/genetlink.h], [genl_validate_flags]) >OVS_FIND_PARAM_IFELSE([$KSRC/include/net/genetlink.h], > [genl_notify], [net], > [OVS_DEFINE([HAVE_GENL_NOTIFY_TAKES_NET])]) > diff --git a/datapath/conntrack.c b/datapath/conntrack.c > index b11a30965147..0c0d43bec2e5 100644 > --- a/datapath/conntrack.c > +++ b/datapath/conntrack.c > @@ -2283,18 +2283,27 @@ exit_err: > > static struct genl_ops ct_limit_genl_ops[] = { > { .cmd = OVS_CT_LIMIT_CMD_SET, > +#ifdef HAVE_GENL_VALIDATE_FLAGS > + .validate = GENL_DONT_VALIDATE_STRICT | > GENL_DONT_VALIDATE_DUMP, > +#endif > .flags = GENL_ADMIN_PERM, /* Requires CAP_NET_ADMIN >* privilege. */ > .policy = ct_limit_policy, > .doit = ovs_ct_limit_cmd_set, > }, > { .cmd = OVS_CT_LIMIT_CMD_DEL, > +#ifdef HAVE_GENL_VALIDATE_FLAGS > + .validate = GENL_DONT_VALIDATE_STRICT | > GENL_DONT_VALIDATE_DUMP, > +#endif > .flags = GENL_ADMIN_PERM, /* Requires CAP_NET_ADMIN >* privilege. */ > .policy = ct_limit_policy, > .doit = ovs_ct_limit_cmd_del, > }, > { .cmd = OVS_CT_LIMIT_CMD_GET, > +#ifdef HAVE_GENL_VALIDATE_FLAGS > + .validate = GENL_DONT_VALIDATE_STRICT | > GENL_DONT_VALIDATE_DUMP, > +#endif > .flags = 0, /* OK for unprivileged users. 
*/ > .policy = ct_limit_policy, > .doit = ovs_ct_limit_cmd_get, > diff --git a/datapath/datapath.c b/datapath/datapath.c > index 78e2e6310529..f4244ea09869 100644 > --- a/datapath/datapath.c > +++ b/datapath/datapath.c > @@ -652,6 +652,9 @@ static const struct nla_policy > packet_policy[OVS_PACKET_ATTR_MAX + 1] = { > > static struct genl_ops dp_packet_genl_ops[] = { > { .cmd = OVS_PACKET_CMD_EXECUTE, > +#ifdef HAVE_GENL_VALIDATE_FLAGS > + .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, > +#endif > .flags = GENL_UNS_ADMIN_PERM, /* Requires CAP_NET_ADMIN privilege. > */ > .policy = packet_policy, > .doit = ovs_packet_cmd_execute > @@ -1440,22 +1443,34 @@ static const struct nla_policy > flow_policy[OVS_FLOW_ATTR_MAX + 1] = { > > static struct genl_ops dp_flow_genl_ops[] = { > { .cmd = OVS_FLOW_CMD_NEW, > +#ifdef HAVE_GENL_VALIDATE_FLAGS > + .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, > +#endif > .flags = GENL_UNS_ADMIN_PERM, /* Requires CAP_NET_ADMIN privilege
Re: [ovs-dev] [PATCH 07/11] datapath: Load and reference the NAT helper.
LGTM, thanks. Reviewed-by: Yifeng Sun On Mon, Oct 14, 2019 at 10:54 AM Yi-Hung Wei wrote: > > This commit backports the following upstream commit, and two functions > in nf_conntrack_helper.h. > > Upstream commit: > commit fec9c271b8f1bde1086be5aa415cdb586e0dc800 > Author: Flavio Leitner > Date: Wed Apr 17 11:46:17 2019 -0300 > > openvswitch: load and reference the NAT helper. > > This improves the original commit 17c357efe5ec ("openvswitch: load > NAT helper") where it unconditionally tries to load the module for > every flow using NAT, so not efficient when loading multiple flows. > It also doesn't hold any references to the NAT module while the > flow is active. > > This change fixes those problems. It will try to load the module > only if it's not present. It grabs a reference to the NAT module > and holds it while the flow is active. Finally, an error message > shows up if either actions above fails. > > Fixes: 17c357efe5ec ("openvswitch: load NAT helper") > Signed-off-by: Flavio Leitner > Signed-off-by: Pablo Neira Ayuso > > Signed-off-by: Yi-Hung Wei > --- > acinclude.m4 | 4 > datapath/conntrack.c | 27 > +- > .../include/net/netfilter/nf_conntrack_helper.h| 17 ++ > 3 files changed, 42 insertions(+), 6 deletions(-) > > diff --git a/acinclude.m4 b/acinclude.m4 > index 055f5387db19..22f92723b00d 100644 > --- a/acinclude.m4 > +++ b/acinclude.m4 > @@ -904,6 +904,10 @@ AC_DEFUN([OVS_CHECK_LINUX_COMPAT], [ >OVS_GREP_IFELSE([$KSRC/include/net/netfilter/nf_conntrack_helper.h], >[nf_conntrack_helper_put], >[OVS_DEFINE(HAVE_NF_CONNTRACK_HELPER_PUT)]) > + OVS_GREP_IFELSE([$KSRC/include/net/netfilter/nf_conntrack_helper.h], > + [nf_nat_helper_try_module_get]) > + OVS_GREP_IFELSE([$KSRC/include/net/netfilter/nf_conntrack_helper.h], > + [nf_nat_helper_put]) > > OVS_GREP_IFELSE([$KSRC/include/linux/skbuff.h],:space:]]]SKB_GSO_UDP[[[:space:, >[OVS_DEFINE([HAVE_SKB_GSO_UDP])]) >OVS_GREP_IFELSE([$KSRC/include/net/dst.h],[DST_NOCACHE], > diff --git a/datapath/conntrack.c 
b/datapath/conntrack.c > index 0c0d43bec2e5..9a7eab655142 100644 > --- a/datapath/conntrack.c > +++ b/datapath/conntrack.c > @@ -1391,6 +1391,7 @@ static int ovs_ct_add_helper(struct ovs_conntrack_info > *info, const char *name, > { > struct nf_conntrack_helper *helper; > struct nf_conn_help *help; > + int ret = 0; > > helper = nf_conntrack_helper_try_module_get(name, info->family, > key->ip.proto); > @@ -1405,13 +1406,22 @@ static int ovs_ct_add_helper(struct > ovs_conntrack_info *info, const char *name, > return -ENOMEM; > } > > +#ifdef CONFIG_NF_NAT_NEEDED > + if (info->nat) { > + ret = nf_nat_helper_try_module_get(name, info->family, > + key->ip.proto); > + if (ret) { > + nf_conntrack_helper_put(helper); > + OVS_NLERR(log, "Failed to load \"%s\" NAT helper, > error: %d", > + name, ret); > + return ret; > + } > + } > +#endif > + > rcu_assign_pointer(help->helper, helper); > info->helper = helper; > - > - if (info->nat) > - request_module("ip_nat_%s", name); > - > - return 0; > + return ret; > } > > #if IS_ENABLED(CONFIG_NF_NAT_NEEDED) > @@ -1898,8 +1908,13 @@ void ovs_ct_free_action(const struct nlattr *a) > > static void __ovs_ct_free_action(struct ovs_conntrack_info *ct_info) > { > - if (ct_info->helper) > + if (ct_info->helper) { > +#ifdef CONFIG_NF_NAT_NEEDED > + if (ct_info->nat) > + nf_nat_helper_put(ct_info->helper); > +#endif > nf_conntrack_helper_put(ct_info->helper); > + } > if (ct_info->ct) { > if (ct_info->timeout[0]) > nf_ct_destroy_timeout(ct_info->ct); > diff --git > a/datapath/linux/compat/include/net/netfilter/nf_conntrack_helper.h > b/datapath/linux/compat/include/net/netfilter/nf_conntrack_helper.h > index b6a3d0bf75b3..78f97375b66e 100644 > --- a/datapath/linux/compat/include/net/netfilter/nf_conntrack_helper.h &g
Re: [ovs-dev] [PATCH 08/11] datapath: Check for null pointer return from nla_nest_start_noflag
LGTM, thanks.

Reviewed-by: Yifeng Sun

On Mon, Oct 14, 2019 at 10:54 AM Yi-Hung Wei wrote:
>
> From: Colin Ian King
>
> upstream commit:
>
> commit ca96534630e2edfd73121c487c957b17eca3b7d7
> Author: Colin Ian King
> Date: Wed May 1 14:41:58 2019 +0100
>
>     openvswitch: check for null pointer return from nla_nest_start_noflag
>
>     The call to nla_nest_start_noflag can return null in the unlikely
>     event that nla_put returns -EMSGSIZE. Check for this condition to
>     avoid a null pointer dereference on pointer nla_reply.
>
>     Addresses-Coverity: ("Dereference null return value")
>     Fixes: 11efd5cb04a1 ("openvswitch: Support conntrack zone limit")
>     Signed-off-by: Colin Ian King
>     Acked-by: Yi-Hung Wei
>     Signed-off-by: David S. Miller
>
> Signed-off-by: Yi-Hung Wei
> ---
>  datapath/conntrack.c | 4
>  1 file changed, 4 insertions(+)
>
> diff --git a/datapath/conntrack.c b/datapath/conntrack.c
> index 9a7eab655142..86e7dd24bb9b 100644
> --- a/datapath/conntrack.c
> +++ b/datapath/conntrack.c
> @@ -2273,6 +2273,10 @@ static int ovs_ct_limit_cmd_get(struct sk_buff *skb, struct genl_info *info)
>                  return PTR_ERR(reply);
>
>          nla_reply = nla_nest_start_noflag(reply, OVS_CT_LIMIT_ATTR_ZONE_LIMIT);
> +        if (!nla_reply) {
> +                err = -EMSGSIZE;
> +                goto exit_err;
> +        }
>
>          if (a[OVS_CT_LIMIT_ATTR_ZONE_LIMIT]) {
>                  err = ovs_ct_limit_get_zone_limit(
> --
> 2.7.4
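The shape of this fix can be shown in plain userspace C. This is a rough sketch, not the kernel code: `struct msg_buf`, `nest_start()` and `build_reply()` are hypothetical names, with `nest_start()` standing in for nla_nest_start_noflag() and a fixed 16-byte array standing in for the reply skb. The point is only that a "start nested attribute" call can fail when the buffer has no room, so its return value needs checking before it is used.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical fixed-size message buffer standing in for a reply skb. */
struct msg_buf {
    unsigned char data[16];
    size_t used;
};

/* Hypothetical stand-in for nla_nest_start_noflag(): reserves a 4-byte
 * attribute header and returns a pointer to it, or NULL if the buffer
 * has no room left. */
static void *nest_start(struct msg_buf *buf)
{
    const size_t hdr_len = 4;
    void *start;

    if (sizeof buf->data - buf->used < hdr_len) {
        return NULL;
    }
    start = buf->data + buf->used;
    buf->used += hdr_len;
    return start;
}

/* Mirrors the patched control flow: report -EMSGSIZE instead of
 * carrying on with a NULL nest pointer. */
static int build_reply(struct msg_buf *buf)
{
    void *nest = nest_start(buf);

    if (!nest) {
        return -EMSGSIZE;
    }
    /* Nested attributes would be appended here. */
    return 0;
}
```

With an empty buffer `build_reply()` succeeds and consumes the header bytes; with a full buffer it returns `-EMSGSIZE` without touching the NULL pointer.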
Re: [ovs-dev] [PATCH 09/11] datapath: Replace removed NF_NAT_NEEDED with IS_ENABLED(CONFIG_NF_NAT)
LGTM, thanks.

Reviewed-by: Yifeng Sun

On Mon, Oct 14, 2019 at 10:55 AM Yi-Hung Wei wrote:
>
> Backports the following upstream commit with some backward compatibility
> change.
>
> commit f319ca6557c10a711facc4dd60197470796d3ec1
> Author: Geert Uytterhoeven
> Date: Wed May 8 08:52:32 2019 +0200
>
>     openvswitch: Replace removed NF_NAT_NEEDED with IS_ENABLED(CONFIG_NF_NAT)
>
>     Commit 4806e975729f99c7 ("netfilter: replace NF_NAT_NEEDED with
>     IS_ENABLED(CONFIG_NF_NAT)") removed CONFIG_NF_NAT_NEEDED, but a new user
>     popped up afterwards.
>
>     Fixes: fec9c271b8f1bde1 ("openvswitch: load and reference the NAT helper.")
>     Signed-off-by: Geert Uytterhoeven
>     Acked-by: Florian Westphal
>     Acked-by: Flavio Leitner
>     Signed-off-by: David S. Miller
>
> Signed-off-by: Yi-Hung Wei
> ---
>  datapath/conntrack.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/datapath/conntrack.c b/datapath/conntrack.c
> index 86e7dd24bb9b..ba73962b2214 100644
> --- a/datapath/conntrack.c
> +++ b/datapath/conntrack.c
> @@ -1406,7 +1406,7 @@ static int ovs_ct_add_helper(struct ovs_conntrack_info *info, const char *name,
>                  return -ENOMEM;
>          }
>
> -#ifdef CONFIG_NF_NAT_NEEDED
> +#if IS_ENABLED(CONFIG_NF_NAT_NEEDED)
>          if (info->nat) {
>                  ret = nf_nat_helper_try_module_get(name, info->family,
>                                                     key->ip.proto);
> @@ -1909,7 +1909,7 @@ void ovs_ct_free_action(const struct nlattr *a)
>  static void __ovs_ct_free_action(struct ovs_conntrack_info *ct_info)
>  {
>          if (ct_info->helper) {
> -#ifdef CONFIG_NF_NAT_NEEDED
> +#if IS_ENABLED(CONFIG_NF_NAT_NEEDED)
>                  if (ct_info->nat)
>                          nf_nat_helper_put(ct_info->helper);
>  #endif
> --
> 2.7.4
Re: [ovs-dev] [PATCH 10/11] datapath: Fix log message in ovs conntrack
LGTM, thanks.

Reviewed-by: Yifeng Sun

On Mon, Oct 14, 2019 at 10:55 AM Yi-Hung Wei wrote:
>
> Upstream commit:
> commit 12c6bc38f99bb168b7f16bdb5e855a51a23ee9ec
> Author: Yi-Hung Wei
> Date: Wed Aug 21 17:16:10 2019 -0700
>
>     openvswitch: Fix log message in ovs conntrack
>
>     Fixes: 06bd2bdf19d2 ("openvswitch: Add timeout support to ct action")
>     Signed-off-by: Yi-Hung Wei
>     Signed-off-by: David S. Miller
>
> Signed-off-by: Yi-Hung Wei
> ---
>  datapath/conntrack.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/datapath/conntrack.c b/datapath/conntrack.c
> index ba73962b2214..f6e9386f4707 100644
> --- a/datapath/conntrack.c
> +++ b/datapath/conntrack.c
> @@ -1663,7 +1663,7 @@ static int parse_ct(const struct nlattr *attr, struct ovs_conntrack_info *info,
>                  case OVS_CT_ATTR_TIMEOUT:
>                          memcpy(info->timeout, nla_data(a), nla_len(a));
>                          if (!memchr(info->timeout, '\0', nla_len(a))) {
> -                                OVS_NLERR(log, "Invalid conntrack helper");
> +                                OVS_NLERR(log, "Invalid conntrack timeout");
>                                  return -EINVAL;
>                          }
>                          break;
> --
> 2.7.4
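The check whose log message is corrected here is a common pattern: copy an attribute payload into a fixed buffer, then use memchr() to confirm the copied bytes contain a terminating NUL. A small standalone sketch follows. `TIMEOUT_NAME_MAX` and `copy_timeout_name()` are hypothetical names chosen for illustration, and the explicit length bound is an assumption added to keep the sketch safe on its own; the quoted hunk does not show where the kernel code bounds the length.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical size, standing in for CTNL_TIMEOUT_NAME_MAX. */
#define TIMEOUT_NAME_MAX 32

/* Copies 'len' payload bytes into 'dst' and accepts them only if they
 * include a NUL terminator, mirroring the memcpy()/memchr() pair in the
 * quoted hunk. The bounds check is an assumption of this sketch. */
static int copy_timeout_name(char dst[TIMEOUT_NAME_MAX],
                             const void *payload, size_t len)
{
    if (len == 0 || len > TIMEOUT_NAME_MAX) {
        return -EINVAL;
    }
    memcpy(dst, payload, len);
    if (!memchr(dst, '\0', len)) {
        /* This is the case the corrected message reports as an
         * invalid conntrack timeout. */
        return -EINVAL;
    }
    return 0;
}
```

A payload of `"tcp-slow"` sent with length 9 (including the NUL) is accepted, while the same bytes sent with length 8 are rejected because no terminator was copied.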
Re: [ovs-dev] [PATCH 11/11] datapath: Allow attaching helper in later commit
LGTM, thanks.

Reviewed-by: Yifeng Sun

On Mon, Oct 14, 2019 at 10:56 AM Yi-Hung Wei wrote:
>
> Upstream commit:
> commit 248d45f1e1934f7849fbdc35ef1e57151cf063eb
> Author: Yi-Hung Wei
> Date: Fri Oct 4 09:26:44 2019 -0700
>
>     openvswitch: Allow attaching helper in later commit
>
>     This patch allows to attach conntrack helper to a confirmed conntrack
>     entry. Currently, we can only attach alg helper to a conntrack entry
>     when it is in the unconfirmed state. This patch enables an use case
>     that we can firstly commit a conntrack entry after it passed some
>     initial conditions. After that the processing pipeline will further
>     check a couple of packets to determine if the connection belongs to
>     a particular application, and attach alg helper to the connection
>     in a later stage.
>
>     Signed-off-by: Yi-Hung Wei
>     Signed-off-by: David S. Miller
>
> Signed-off-by: Yi-Hung Wei
> ---
>  datapath/conntrack.c | 21 +
>  1 file changed, 13 insertions(+), 8 deletions(-)
>
> diff --git a/datapath/conntrack.c b/datapath/conntrack.c
> index f6e9386f4707..838cf63c908f 100644
> --- a/datapath/conntrack.c
> +++ b/datapath/conntrack.c
> @@ -1045,6 +1045,8 @@ static int __ovs_ct_lookup(struct net *net, struct sw_flow_key *key,
>
>          ct = nf_ct_get(skb, &ctinfo);
>          if (ct) {
> +                bool add_helper = false;
> +
>                  /* Packets starting a new connection must be NATted before the
>                   * helper, so that the helper knows about the NAT. We enforce
>                   * this by delaying both NAT and helper calls for unconfirmed
> @@ -1062,16 +1064,17 @@ static int __ovs_ct_lookup(struct net *net, struct sw_flow_key *key,
>                  }
>
>                  /* Userspace may decide to perform a ct lookup without a helper
> -                 * specified followed by a (recirculate and) commit with one.
> -                 * Therefore, for unconfirmed connections which we will commit,
> -                 * we need to attach the helper here.
> +                 * specified followed by a (recirculate and) commit with one,
> +                 * or attach a helper in a later commit. Therefore, for
> +                 * connections which we will commit, we may need to attach
> +                 * the helper here.
>                   */
> -                if (!nf_ct_is_confirmed(ct) && info->commit &&
> -                    info->helper && !nfct_help(ct)) {
> +                if (info->commit && info->helper && !nfct_help(ct)) {
>                          int err = __nf_ct_try_assign_helper(ct, info->ct,
>                                                              GFP_ATOMIC);
>                          if (err)
>                                  return err;
> +                        add_helper = true;
>
>                          /* helper installed, add seqadj if NAT is required */
>                          if (info->nat && !nfct_seqadj(ct)) {
> @@ -1081,11 +1084,13 @@ static int __ovs_ct_lookup(struct net *net, struct sw_flow_key *key,
>                  }
>
>                  /* Call the helper only if:
> -                 * - nf_conntrack_in() was executed above ("!cached") for a
> -                 *   confirmed connection, or
> +                 * - nf_conntrack_in() was executed above ("!cached") or a
> +                 *   helper was just attached ("add_helper") for a confirmed
> +                 *   connection, or
>                   * - When committing an unconfirmed connection.
>                   */
> -                if ((nf_ct_is_confirmed(ct) ? !cached : info->commit) &&
> +                if ((nf_ct_is_confirmed(ct) ? !cached || add_helper :
> +                                              info->commit) &&
>                      ovs_ct_helper(skb, info->family) != NF_ACCEPT) {
>                          return -EINVAL;
>                  }
> --
> 2.7.4
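The updated "call the helper only if" condition is easier to check once it is pulled out into a pure function. The sketch below is an illustration only: `should_call_helper()` is a hypothetical name and its four boolean parameters stand in for nf_ct_is_confirmed(ct), the local `cached` flag, the new `add_helper` flag, and info->commit. It reproduces just the ternary from the patch, relying on `||` binding tighter than `?:`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper mirroring the patched predicate:
 *   confirmed connection   -> call the helper if conntrack was just run
 *                             (!cached) or a helper was just attached;
 *   unconfirmed connection -> call the helper only when committing. */
static bool should_call_helper(bool confirmed, bool cached,
                               bool add_helper, bool commit)
{
    return confirmed ? (!cached || add_helper) : commit;
}
```

The case the patch newly enables is a confirmed, cached connection that has just had a helper attached: before the change the predicate reduced to `!cached` and was false there; with `add_helper` it is true.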
Re: [ovs-dev] [PATCH] tests: Get rid of timeout options for control utilities.
Looks good to me, thanks. Reviewed-by: Yifeng Sun On Tue, Oct 15, 2019 at 9:11 AM Ilya Maximets wrote: > > 'OVS_CTL_TIMEOUT' environment variable is exported in tests/atlocal.in > and controls timeouts for all OVS utilities in testsuite. > > There should be no manual tweaks for each single command. > > This helps with running tests under valgrind where commands could > take really long time as you only need to change 'OVS_CTL_TIMEOUT' > in a single place. > > Few manual timeouts were left in places where they make sense. > > Signed-off-by: Ilya Maximets > --- > tests/daemon.at | 2 +- > tests/ofproto-dpif.at | 18 +- > tests/ofproto-macros.at | 2 +- > tests/ovs-macros.at | 8 > tests/ovs-vswitchd.at | 8 > tests/ovsdb-cluster.at | 8 > tests/ovsdb-macros.at | 2 +- > tests/pmd.at| 4 ++-- > tests/vtep-ctl.at | 8 > 9 files changed, 30 insertions(+), 30 deletions(-) > > diff --git a/tests/daemon.at b/tests/daemon.at > index bdc8910f9..a7982de38 100644 > --- a/tests/daemon.at > +++ b/tests/daemon.at > @@ -97,7 +97,7 @@ check_process_name $child ovsdb-server > > # Avoid a race between pidfile creation and notifying the parent, > # which can easily trigger if ovsdb-server is slow (e.g. due to valgrind). > -OVS_WAIT_UNTIL([ovs-appctl --timeout=10 -t ovsdb-server version]) > +OVS_WAIT_UNTIL([ovs-appctl -t ovsdb-server version]) > > # Kill the daemon process, making it look like a segfault, > # and wait for a new child process to get spawned. 
> diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at > index 8d9908858..49326c533 100644 > --- a/tests/ofproto-dpif.at > +++ b/tests/ofproto-dpif.at > @@ -10615,35 +10615,35 @@ AT_CHECK([ovs-vsctl get Interface p1 mtu], [0], [dnl > AT_CHECK([ovs-vsctl set Interface p1 mtu_request=1600]) > > # Check that the new MTU is applied > -AT_CHECK([ovs-vsctl --timeout=10 wait-until Interface p1 mtu=1600]) > +AT_CHECK([ovs-vsctl wait-until Interface p1 mtu=1600]) > # The internal port 'br0' should have the same MTU value as p1, becase it's > # the new bridge minimum. > -AT_CHECK([ovs-vsctl --timeout=10 wait-until Interface br0 mtu=1600]) > +AT_CHECK([ovs-vsctl wait-until Interface br0 mtu=1600]) > > AT_CHECK([ovs-vsctl del-port br0 p1]) > > # When 'p1' is deleted, the internal port should return to the default MTU > -AT_CHECK([ovs-vsctl --timeout=10 wait-until Interface br0 mtu=1500]) > +AT_CHECK([ovs-vsctl wait-until Interface br0 mtu=1500]) > > # New port with 'mtu_request' in the same transaction. > AT_CHECK([ovs-vsctl add-port br0 p2 -- set int p2 type=dummy > mtu_request=1600]) > -AT_CHECK([ovs-vsctl --timeout=10 wait-until Interface p2 mtu=1600]) > -AT_CHECK([ovs-vsctl --timeout=10 wait-until Interface br0 mtu=1600]) > +AT_CHECK([ovs-vsctl wait-until Interface p2 mtu=1600]) > +AT_CHECK([ovs-vsctl wait-until Interface br0 mtu=1600]) > > # Explicitly set mtu_request on the internal interface. This should prevent > # the MTU from being overriden. > AT_CHECK([ovs-vsctl set int br0 mtu_request=1700]) > -AT_CHECK([ovs-vsctl --timeout=10 wait-until Interface br0 mtu=1700]) > +AT_CHECK([ovs-vsctl wait-until Interface br0 mtu=1700]) > > # The new MTU on p2 should not affect br0. 
> AT_CHECK([ovs-vsctl set int p2 mtu_request=1400]) > -AT_CHECK([ovs-vsctl --timeout=10 wait-until Interface p2 mtu=1400]) > -AT_CHECK([ovs-vsctl --timeout=10 wait-until Interface br0 mtu=1700]) > +AT_CHECK([ovs-vsctl wait-until Interface p2 mtu=1400]) > +AT_CHECK([ovs-vsctl wait-until Interface br0 mtu=1700]) > > # Remove explicit mtu_request from br0. Now it should track the bridge > # minimum again. > AT_CHECK([ovs-vsctl set int br0 mtu_request=[[]]]) > -AT_CHECK([ovs-vsctl --timeout=10 wait-until Interface br0 mtu=1400]) > +AT_CHECK([ovs-vsctl wait-until Interface br0 mtu=1400]) > > OVS_VSWITCHD_STOP > AT_CLEANUP > diff --git a/tests/ofproto-macros.at b/tests/ofproto-macros.at > index 04d4ed7e2..b2b17eed3 100644 > --- a/tests/ofproto-macros.at > +++ b/tests/ofproto-macros.at > @@ -249,7 +249,7 @@ add_of_br () { > local br=br$brnum > local dpid=fedcba987654321$brnum > local mac=aa:55:aa:55:00:0$brnum > -ovs-vsctl --timeout=20 \ > +ovs-vsctl \ > -- add-br $br \ > -- set bridge $br datapath-type=dummy \ >fail-mode=secure \ > diff --git a/tests/ovs-macros.at b/tests/ovs-macros.at > index e07c4b908..8e512f4e7 100644 > --- a/tests/ovs-macros.at > +++ b/tests/ovs-macros.at > @@ -155,7 +155,7 @@ kill_ovs_vswitchd () { > fi > >
Re: [ovs-dev] [PATCH v2 01/12] datapath: Fix linking without CONFIG_NF_CONNTRACK_LABELS
LGTM. Reviewed-by: Yifeng Sun On Tue, Oct 15, 2019 at 10:40 AM Yi-Hung Wei wrote: > > From: Arnd Bergmann > > upstream commit: > commit a277d516de5f498c91d91189717ef7e01102ad27 > Author: Arnd Bergmann > Date: Fri Nov 2 16:36:55 2018 +0100 > > openvswitch: fix linking without CONFIG_NF_CONNTRACK_LABELS > > When CONFIG_CC_OPTIMIZE_FOR_DEBUGGING is enabled, the compiler > fails to optimize out a dead code path, which leads to a link failure: > > net/openvswitch/conntrack.o: In function `ovs_ct_set_labels': > conntrack.c:(.text+0x2e60): undefined reference to `nf_connlabels_replace' > > In this configuration, we can take a shortcut, and completely > remove the contrack label code. This may also help the regular > optimization. > > Signed-off-by: Arnd Bergmann > Signed-off-by: David S. Miller > > Signed-off-by: Yi-Hung Wei > --- > datapath/conntrack.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/datapath/conntrack.c b/datapath/conntrack.c > index c6d523758ff1..e328afe1ad15 100644 > --- a/datapath/conntrack.c > +++ b/datapath/conntrack.c > @@ -1263,7 +1263,8 @@ static int ovs_ct_commit(struct net *net, struct > sw_flow_key *key, > &info->labels.mask); > if (err) > return err; > - } else if (labels_nonzero(&info->labels.mask)) { > + } else if (IS_ENABLED(CONFIG_NF_CONNTRACK_LABELS) && > + labels_nonzero(&info->labels.mask)) { > err = ovs_ct_set_labels(ct, key, &info->labels.value, > &info->labels.mask); > if (err) > -- > 2.7.4 > > ___ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] rhel: Support RHEL7.7 build and packaging
Hi Greg,

Please try the following:

# yum install scl-utils scl-utils-build
# yum install rh-python36
# scl enable rh-python36 bash
# pip install sphinx

Thanks,
Yifeng

On Tue, Oct 15, 2019 at 12:28 PM Gregory Rose wrote:
>
>
> On 10/11/2019 2:49 PM, Yifeng Sun wrote:
> > This patch provides essential fixes for OVS to support
> > RHEL7.7's new kernel.
> >
> > make rpm-fedora-kmod \
> > RPMBUILD_OPT='-D "kversion 3.10.0-1062.1.2.el7.x86_64"'
> >
> > Signed-off-by: Yifeng Sun
>
> Hi Yifeng,
>
> I'm trying to test this patch on a RHEL 7.7 system but it requires
> python sphinx and I can't
> seem to find out how to install it. I've found a bunch of RPMs but all
> of them have
> broken dependencies.
>
> I've never installed the sphinx stuff before but now it's required. Did
> you find a way
> to install it?
>
> Here is the error:
> error: Failed build dependencies:
> /usr/bin/sphinx-build-3 is needed by openvswitch-2.12.90-1.el7.x86_64
>
> Thanks,
>
> - Greg
>
> > ---
> >  rhel/openvswitch-kmod-fedora.spec.in                  |  9 +
> >  rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh | 14 ++
> >  2 files changed, 15 insertions(+), 8 deletions(-)
> >
> > diff --git a/rhel/openvswitch-kmod-fedora.spec.in
> > b/rhel/openvswitch-kmod-fedora.spec.in
> > index b3588982ef7a..fbb8366990f1 100644
> > --- a/rhel/openvswitch-kmod-fedora.spec.in
> > +++ b/rhel/openvswitch-kmod-fedora.spec.in
> > @@ -12,8 +12,9 @@
> >  # Use the kversion macro such as
> >  # RPMBUILD_OPT='-D "kversion 3.10.0-693.1.1.el7.x86_64
> > 3.10.0-693.17.1.el7.x86_64"'
> >  # to build package for mulitple kernel versions in the same package
> > -# This only works for kernel 3.10.0 major revision 957 (RHEL 7.6),
> > -# major revision 693 (RHEL 7.4) and major revision 327 (RHEL 7.2).
> > +# This only works for kernel 3.10.0 major revision 1062 (RHEL 7.7),
> > +# major revision 957 (RHEL 7.6), major revision 693 (RHEL 7.4) and
> > +# major revision 327 (RHEL 7.2).
> >  # By default, build against the current running kernel version
> >  #%define kernel 3.1.5-1.fc16.x86_64
> >  #define kernel %{kernel_source}
> > @@ -92,8 +93,8 @@ if grep -qs "suse" /etc/os-release; then
> >      fi
> >  elif [ "$mainline_major" = "3" ] && [ "$mainline_minor" = "10" ] &&
> >       { [ "$major_rev" = "327" ] || [ "$major_rev" = "693" ] || \
> > -       [ "$major_rev" = "957" ]; }; then
> > -    # For RHEL 7.2, 7.4 and 7.6
> > +       [ "$major_rev" = "957" ] || [ "$major_rev" == "1062" ]; }; then
> > +    # For RHEL 7.2, 7.4, 7.6 and 7.7
> >      if [ -x "%{_datadir}/openvswitch/scripts/ovs-kmod-manage.sh" ]; then
> >          %{_datadir}/openvswitch/scripts/ovs-kmod-manage.sh
> >      fi
> > diff --git a/rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh
> > b/rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh
> > index 693fb0b744b3..a643b55ff0f8 100644
> > --- a/rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh
> > +++ b/rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh
> > @@ -15,9 +15,10 @@
> >  # limitations under the License.
> >
> >  # This script is intended to be used on the following kernels.
> > -# - 3.10.0 major revision 327 (RHEL 7.2)
> > -# - 3.10.0 major revision 693 (RHEL 7.4)
> > -# - 3.10.0 major revision 957 (RHEL 7.6)
> > +# - 3.10.0 major revision 327 (RHEL 7.2)
> > +# - 3.10.0 major revision 693 (RHEL 7.4)
> > +# - 3.10.0 major revision 957 (RHEL 7.6)
> > +# - 3.10.0 major revision 1062 (RHEL 7.7)
> >  # - 4.4.x, x >= 73 (SLES 12 SP3)
> >  # - 4.12.x, x >= 14 (SLES 12 SP4).
> >  # It is packaged in the openvswitch kmod RPM and run in the post-install
> > @@ -100,6 +101,11 @@ if [ "$mainline_major" = "3" ] && [ "$mainline_minor"
> > = "10" ]; then
> >          comp_ver=10
> >          ver_offset=4
> >          installed_ver="$minor_rev"
> > +    elif [ "$major_rev" = "1062" ]; then
> > +        #echo "rhel77"
> > +        comp_ver=10
> > +        ver_offset=4
> > +        installed_ver="$minor_rev"
> >      fi
> >  elif [ "$mainline_major" = "4" ] && [ "$mainline_minor" = "4" ]; then
> >      if [ "$mainline_patch" -ge "73" ]; then
> > @@ -111,7 +117,7 @@ elif [ "$mainline_major" = "4" ] && [ "$mainline_minor"
> > = "4" ]; then
> >  elif [ "$mainline_major" = "4" ] && [ "$mainline_minor" = "12" ]; then
> >      if [ "$mainline_patch" -ge "14" ]; then
> >          #echo "sles12sp4"
> > -        comp_ver=14
> > +        comp_ver=1
> >          ver_offset=2
> >          installed_ver="$mainline_patch"
> >      fi
>
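For readers who want to see the version gate from the quoted spec file in isolation, here is a rough C sketch of the same check: mainline 3.10 plus a "major revision" from the supported list. How the real scripts derive $major_rev is not shown in this thread, so the sscanf-based parsing and the function names (rhel7_major_rev, rhel7_kmod_supported) are assumptions made for illustration; only the list 327, 693, 957, 1062 comes from the patch.

```c
#include <stddef.h>
#include <stdio.h>

/* Parse a release string such as "3.10.0-1062.1.2.el7.x86_64" and return
 * the first number after the dash (the "major revision"), or -1 if the
 * string is not a 3.10.x kernel in that format.  The parsing approach is
 * an assumption; the packaging scripts use shell, not this code. */
static int rhel7_major_rev(const char *krel)
{
    int major, minor, patch, rev;

    if (sscanf(krel, "%d.%d.%d-%d", &major, &minor, &patch, &rev) != 4)
        return -1;
    if (major != 3 || minor != 10)
        return -1;
    return rev;
}

/* Return 1 if the kernel is one of the RHEL 7 kernels named in the patch
 * (7.2, 7.4, 7.6 and 7.7), 0 otherwise. */
static int rhel7_kmod_supported(const char *krel)
{
    static const int supported[] = { 327, 693, 957, 1062 };
    int rev = rhel7_major_rev(krel);

    for (size_t i = 0; i < sizeof supported / sizeof supported[0]; i++) {
        if (rev == supported[i])
            return 1;
    }
    return 0;
}
```

With the kversion from the commit message, rhel7_kmod_supported("3.10.0-1062.1.2.el7.x86_64") returns 1, while a kernel outside the list (for example a 4.15 Ubuntu kernel) returns 0.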
Re: [ovs-dev] [PATCH] rhel: Support RHEL7.7 build and packaging
Hi Greg, I added this repo: subscription-manager repos --enable rhel-server-rhscl-7-rpms Thanks, Yifeng On Tue, Oct 15, 2019 at 2:58 PM Gregory Rose wrote: > > > On 10/15/2019 2:27 PM, Yifeng Sun wrote: > > Hi Greg, > > > > Please try the following: > > > > # yum install scl-utils scl-utils-build > > # yum install rh-python36 > > Hi Yifeng, > > I tried this step but no luck. > > [root@localhost ~]# yum install rh-python36 -y > Loaded plugins: product-id, search-disabled-repos, subscription-manager > No package rh-python36 available. > Error: Nothing to do > > Could you list your repos? Here's what I've got: > [root@localhost ~]# yum repolist > Loaded plugins: product-id, search-disabled-repos, subscription-manager > repo idrepo > name status > rhel-7-server-extras-rpms/x86_64 Red Hat Enterprise Linux 7 > Ser 1,182 > rhel-7-server-optional-rpms/7Server/x86_64 Red Hat Enterprise Linux 7 > Ser 19,307 > rhel-7-server-rpms/7Server/x86_64 Red Hat Enterprise Linux 7 > Ser 26,484 > repolist: 46,973 > > > Thanks, > > - Greg > > > # scl enable rh-python36 bash > > # pip install sphinx > > > > Thanks, > > Yifeng > > > > On Tue, Oct 15, 2019 at 12:28 PM Gregory Rose wrote: > >> > >> On 10/11/2019 2:49 PM, Yifeng Sun wrote: > >>> This patch provides essential fixes for OVS to support > >>> RHEL7.7's new kernel. > >>> > >>> make rpm-fedora-kmod \ > >>> RPMBUILD_OPT='-D "kversion 3.10.0-1062.1.2.el7.x86_64"' > >>> > >>> Signed-off-by: Yifeng Sun > >> Hi Yifeng, > >> > >> I'm trying to test this patch on a RHEL 7.7 system but it requires > >> python sphinx and I can't > >> seem to find out how to install it. I've found a bunch of RPMs but all > >> of them have > >> broken dependencies. > >> > >> I've never installed the sphinx stuff before but now it's required. Did > >> you find a way > >> to install it? 
> >> > >> Here is the error: > >> error: Failed build dependencies: > >> /usr/bin/sphinx-build-3 is needed by openvswitch-2.12.90-1.el7.x86_64 > >> > >> Thanks, > >> > >> - Greg > >> > >>> --- > >>>rhel/openvswitch-kmod-fedora.spec.in | 9 + > >>>rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh | 14 > >>> ++ > >>>2 files changed, 15 insertions(+), 8 deletions(-) > >>> > >>> diff --git a/rhel/openvswitch-kmod-fedora.spec.in > >>> b/rhel/openvswitch-kmod-fedora.spec.in > >>> index b3588982ef7a..fbb8366990f1 100644 > >>> --- a/rhel/openvswitch-kmod-fedora.spec.in > >>> +++ b/rhel/openvswitch-kmod-fedora.spec.in > >>> @@ -12,8 +12,9 @@ > >>># Use the kversion macro such as > >>># RPMBUILD_OPT='-D "kversion 3.10.0-693.1.1.el7.x86_64 > >>> 3.10.0-693.17.1.el7.x86_64"' > >>># to build package for mulitple kernel versions in the same package > >>> -# This only works for kernel 3.10.0 major revision 957 (RHEL 7.6), > >>> -# major revision 693 (RHEL 7.4) and major revision 327 (RHEL 7.2). > >>> +# This only works for kernel 3.10.0 major revision 1062 (RHEL 7.7), > >>> +# major revision 957 (RHEL 7.6), major revision 693 (RHEL 7.4) and > >>> +# major revision 327 (RHEL 7.2). > >>># By default, build against the current running kernel version > >>>#%define kernel 3.1.5-1.fc16.x86_64 > >>>#define kernel %{kernel_source} > >>> @@ -92,8 +93,8 @@ if grep -qs "suse" /etc/os-release; then > >>>fi > >>>elif [ "$mainline_major" = "3" ] && [ "$mainline_minor" = "10" ] && > >>> { [ "$major_rev" = "327" ] || [ "$major_rev" = "693" ] || \ > >>> - [ "$major_rev" = "957" ]; }; then > >>> -# For RHEL 7.2, 7.4 and 7.6 > >>> + [ "$major_rev" = "957" ] || [ "$major_rev" == "1062" ]; }; then > >>> +# For RHEL 7.2, 7.4, 7.6 and 7.7 > >>>if [ -x "%{_datadir}/openvswitch/scripts/ovs-kmod-manage.sh" ]; > >>> then > >>>
Re: [ovs-dev] [PATCH] compat: Fix small naming issue
LGTM, thanks Greg. Reviewed-by: Yifeng Sun On Wed, Oct 16, 2019 at 1:21 PM Greg Rose wrote: > > In commit 057772cf2477 the function is missing the rpl_ prefix > and the define that replaces the original function should come > after the function definition. > > Fixes: 057772cf2477 ("compat: Backport nf_ct_tmpl_alloc().") > Signed-off-by: Greg Rose > --- > datapath/linux/compat/include/net/netfilter/nf_conntrack_core.h | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/datapath/linux/compat/include/net/netfilter/nf_conntrack_core.h > b/datapath/linux/compat/include/net/netfilter/nf_conntrack_core.h > index 1015801..84cb09e 100644 > --- a/datapath/linux/compat/include/net/netfilter/nf_conntrack_core.h > +++ b/datapath/linux/compat/include/net/netfilter/nf_conntrack_core.h > @@ -7,11 +7,10 @@ > > #include > > -#define nf_ct_tmpl_alloc rpl_nf_ct_tmpl_alloc > /* Released via destroy_conntrack() */ > static inline struct nf_conn * > -nf_ct_tmpl_alloc(struct net *net, const struct nf_conntrack_zone *zone, > -gfp_t flags) > +rpl_nf_ct_tmpl_alloc(struct net *net, const struct nf_conntrack_zone *zone, > +gfp_t flags) > { > struct nf_conn *tmpl; > > @@ -32,6 +31,7 @@ out_free: > kfree(tmpl); > return NULL; > } > +#define nf_ct_tmpl_alloc rpl_nf_ct_tmpl_alloc > > static void rpl_nf_ct_tmpl_free(struct nf_conn *tmpl) > { > -- > 1.8.3.1 > > ___ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] compat: Fix small naming issue
No problem, thanks!

Yifeng

On Thu, Oct 17, 2019 at 10:30 AM William Tu wrote:
>
> On Wed, Oct 16, 2019 at 01:26:39PM -0700, Yifeng Sun wrote:
> > LGTM, thanks Greg.
> >
> > Reviewed-by: Yifeng Sun
> >
>
> Thanks! Applied to master.
>
> Sorry Yifeng, I forgot to add your Reviewed-by tag.
>
> William
>
> > On Wed, Oct 16, 2019 at 1:21 PM Greg Rose wrote:
> > >
> > > In commit 057772cf2477 the function is missing the rpl_ prefix
> > > and the define that replaces the original function should come
> > > after the function definition.
> > >
> > > Fixes: 057772cf2477 ("compat: Backport nf_ct_tmpl_alloc().")
> > > Signed-off-by: Greg Rose
> > > ---
>
Re: [ovs-dev] [PATCH V2] Update scripts to support RHEL 7.9
LGTM. Reviewed-by: Yifeng Sun On Tue, Nov 17, 2020 at 3:26 PM Greg Rose wrote: > Add support for RHEL7.9 GA release with kernel 3.10.0-1160 > > Signed-off-by: Greg Rose > > --- > V2 - Correct the author > --- > rhel/openvswitch-kmod-fedora.spec.in | 6 -- > rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh | 6 ++ > 2 files changed, 10 insertions(+), 2 deletions(-) > > diff --git a/rhel/openvswitch-kmod-fedora.spec.in b/rhel/ > openvswitch-kmod-fedora.spec.in > index 15eec6d4c..ff190064f 100644 > --- a/rhel/openvswitch-kmod-fedora.spec.in > +++ b/rhel/openvswitch-kmod-fedora.spec.in > @@ -19,6 +19,7 @@ > # - 3.10.0 major revision 1062 (RHEL 7.7) > # - 3.10.0 major revision 1101 (RHEL 7.8 Beta) > # - 3.10.0 major revision 1127 (RHEL 7.8 GA) > +# - 3.10.0 major revision 1160 (RHEL 7.9 GA) > # By default, build against the current running kernel version > #%define kernel 3.1.5-1.fc16.x86_64 > #define kernel %{kernel_source} > @@ -98,8 +99,9 @@ if grep -qs "suse" /etc/os-release; then > elif [ "$mainline_major" = "3" ] && [ "$mainline_minor" = "10" ] && > { [ "$major_rev" = "327" ] || [ "$major_rev" = "693" ] || \ > [ "$major_rev" = "957" ] || [ "$major_rev" == "1062" ] || \ > - [ "$major_rev" = "1101" ] || [ "$major_rev" = "1127" ] ; }; then > -# For RHEL 7.2, 7.4, 7.6, 7.7, and 7.8 > + [ "$major_rev" = "1101" ] || [ "$major_rev" = "1127" ] || \ > + [ "$major_rev" = "1160" ] ; }; then > +# For RHEL 7.2, 7.4, 7.6, 7.7, 7.8 and 7.9 > if [ -x "%{_datadir}/openvswitch/scripts/ovs-kmod-manage.sh" ]; then > %{_datadir}/openvswitch/scripts/ovs-kmod-manage.sh > fi > diff --git a/rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh > b/rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh > index c70e135cd..9bf25a46b 100644 > --- a/rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh > +++ b/rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh > @@ -21,6 +21,7 @@ > # - 3.10.0 major revision 1062 (RHEL 7.7) > # - 3.10.0 major revision 1101 (RHEL 7.8 Beta) > # - 3.10.0 
major revision 1127 (RHEL 7.8 GA) > +# - 3.10.0 major revision 1160 (RHEL 7.9) > # - 4.4.x, x >= 73 (SLES 12 SP3) > # - 4.12.x, x >= 14 (SLES 12 SP4). > # It is packaged in the openvswitch kmod RPM and run in the post-install > @@ -118,6 +119,11 @@ if [ "$mainline_major" = "3" ] && [ "$mainline_minor" > = "10" ]; then > comp_ver=10 > ver_offset=4 > installed_ver="$minor_rev" > +elif [ "$major_rev" = "1160" ]; then > +#echo "rhel79" > +comp_ver=10 > +ver_offset=4 > +installed_ver="$minor_rev" > fi > elif [ "$mainline_major" = "4" ] && [ "$mainline_minor" = "4" ]; then > if [ "$mainline_patch" -ge "73" ]; then > -- > 2.17.1 > > ___ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v4 0/3] Add support for TSO with DPDK
Hi Flavio,

I am testing your patch using iperf between 2 VMs on the same host.
But it seems that TCP connection can't be created between these 2 VMs.
When inspecting further, I found that TCP packets have invalid checksums.
This might be the reason.

I am wondering if I missed something in the setup? Thanks a lot.

Best,
Yifeng

On Thu, Jan 16, 2020 at 9:01 AM Flavio Leitner wrote:
>
> Abbreviated as TSO, TCP Segmentation Offload is a feature which enables
> the network stack to delegate the TCP segmentation to the NIC reducing
> the per packet CPU overhead.
>
> A guest using vhost-user interface with TSO enabled can send TCP packets
> much bigger than the MTU, which saves CPU cycles normally used to break
> the packets down to MTU size and to calculate checksums.
>
> It also saves CPU cycles used to parse multiple packets/headers during
> the packet processing inside virtual switch.
>
> If the destination of the packet is another guest in the same host, then
> the same big packet can be sent through a vhost-user interface skipping
> the segmentation completely. However, if the destination is not local,
> the NIC hardware is instructed to do the TCP segmentation and checksum
> calculation.
>
> The first 2 patches are not really part of TSO support, but they are
> required to make sure everything works.
>
> There are good improvements sending to or receiving from veth pairs or
> tap devices as well. See the iperf3 results below:
>
> [*] veth with ethtool tx off.
>
> VM sending to:          Default            Enabled          Enabled/Default
>    Local BR             3 Gbits/sec        23 Gbits/sec     7x
>    Net NS (veth)        3 Gbits/sec[*]     22 Gbits/sec     7x
>    VM (same host)       2.5 Gbits/sec      24 Gbits/sec     9x
>    Ext Host             10 Gbits/sec       35 Gbits/sec     3x
>    Ext Host (vxlan)     8.8 Gbits/sec      (not supported)
>
> Using VLAN:
>    Local BR             3 Gbits/sec        23 Gbits/sec     7x
>    VM (same host)       2.5 Gbits/sec      21 Gbits/sec     8x
>    Ext Host             6.4 Gbits/sec      34 Gbits/sec     5x
>
> Using IPv6:
>    Net NS (veth)        2.7 Gbits/sec[*]   22 Gbits/sec     8x
>    VM (same host)       2.6 Gbits/sec      21 Gbits/sec     8x
>    Ext Host             8.7 Gbits/sec      34 Gbits/sec     4x
>
> Conntrack:
>    No packet changes:   1.41 Gbits/sec     33 Gbits/sec     23x
>
> VM receiving from:
>    Local BR             2.5 Gbits/sec      2.4 Gbits/sec    1x
>    Net NS (veth)        2.5 Gbits/sec[*]   9.3 Gbits/sec    3x
>    VM (same host)       4.9 Gbits/sec      25 Gbits/sec     5x
>    Ext Host             9.7 Gbits/sec      9.4 Gbits/sec    1x
>    Ext Host (vxlan)     5.5 Gbits/sec      (not supported)
>
> Using VLAN:
>    Local BR             2.4 Gbits/sec      2.4 Gbits/sec    1x
>    VM (same host)       3.8 Gbits/sec      24 Gbits/sec     8x
>    Ext Host             9.5 Gbits/sec      9.5 Gbits/sec    1x
>
> Using IPv6:
>    Net NS (veth)        2.2 Gbits/sec[*]   9 Gbits/sec      4x
>    VM (same host)       4.5 Gbits/sec      24 Gbits/sec     5x
>    Ext Host             8.9 Gbits/sec      8.9 Gbits/sec    1x
>
> Used iperf3 -u to test UDP traffic limited at default 1Mbits/sec
> and noticed no change with the exception for tunneled packets (not
> supported).
>
> Travis, AppVeyor, and Cirrus-ci passed.
>
> Flavio Leitner (3):
>   dp-packet: preserve headroom when cloning a pkt batch
>   vhost: Disable multi-segmented buffers
>   netdev-dpdk: Add TCP Segmentation Offload support
>
>  Documentation/automake.mk              |   1 +
>  Documentation/topics/index.rst         |   1 +
>  Documentation/topics/userspace-tso.rst |  98 +++
>  NEWS                                   |   1 +
>  lib/automake.mk                        |   2 +
>  lib/conntrack.c                        |  29 +-
>  lib/dp-packet.h                        | 192 +++-
>  lib/ipf.c                              |  32 +-
>  lib/netdev-dpdk.c                      | 355 ---
>  lib/netdev-linux-private.h             |   5 +
>  lib/netdev-linux.c                     | 386 ++---
>  lib/netdev-provider.h                  |   9 +
>  lib/netdev.c                           |  78 -
>  lib/userspace-tso.c                    |  48 +++
>  lib/userspace-tso.h                    |  23 ++
>  vswitchd/bridge.c                      |   2 +
>  vswitchd/vswitch.xml                   |  17 ++
>  17 files changed, 1154 insertions(+), 125 deletions(-)
>  create mode 100644 Documentation/topics/userspace-tso.rst
>  create mode 100644 lib/userspace-tso.c
>  create mode 100644 lib/userspace-tso.h
>
> --
> 2.24.1
>
Re: [ovs-dev] [PATCH v4 0/3] Add support for TSO with DPDK
Hi Ilya,

Thanks for your reply.

The thing is, if checksum offloading is enabled in both VMs, then
the sender VM will send a packet with an invalid TCP checksum, and later
OVS will send this packet to the receiver VM directly without calculating
a valid checksum. As a result, the receiver VM will drop this packet
because it contains an invalid checksum. This is what happened when I
tried this patch.

Best,
Yifeng

On Mon, Jan 27, 2020 at 12:09 PM Ilya Maximets wrote:
>
> On 27.01.2020 18:24, Yifeng Sun wrote:
> > Hi Flavio,
> >
> > I am testing your patch using iperf between 2 VMs on the same host.
> > But it seems that TCP connection can't be created between these 2 VMs.
> > When inspecting further, I found that TCP packets have invalid checksums.
> > This might be the reason.
> >
> > I am wondering if I missed something in the setup? Thanks a lot.
>
> I didn't test myself, but according to current design, checksum offloading
> (rx and tx) should be enabled in both VMs. Otherwise all the packets will
> be dropped by the guest kernel.
>
> Best regards, Ilya Maximets.
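To make the "invalid checksum" point in the exchange above concrete, here is a small sketch of the ones'-complement Internet checksum (RFC 1071) and the check a receiver performs in software. It is a simplification: a real TCP checksum also covers a pseudo-header, and the function names (inet_fold_sum, inet_checksum, inet_checksum_ok) are made up for this note and are not OVS or kernel APIs.

```c
#include <stddef.h>
#include <stdint.h>

/* Ones'-complement sum of a buffer taken as big-endian 16-bit words,
 * folded down to 16 bits (RFC 1071). */
static uint16_t inet_fold_sum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;

    for (size_t i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (len & 1)
        sum += (uint32_t)data[len - 1] << 8;   /* pad odd byte with zero */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);    /* fold carries back in */
    return (uint16_t)sum;
}

/* Value the sender (or the offloading device) stores in the checksum field. */
static uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    return (uint16_t)~inet_fold_sum(data, len);
}

/* Receiver-side validation: summing the data together with a correct
 * checksum field yields 0xffff. */
static int inet_checksum_ok(const uint8_t *data, size_t len)
{
    return inet_fold_sum(data, len) == 0xffff;
}
```

If the sender leaves the final checksum to an offload stage and nothing on the path completes it, a receiver that runs this kind of validation in software rejects the segment, which would be consistent with the TcpInCsumErr counter mentioned later in the thread.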
Re: [ovs-dev] [PATCH v4 0/3] Add support for TSO with DPDK
Hi Flavio, Thanks for the explanation. I followed the steps in the document but TCP connection still failed to build between 2 VMs. I finally modified VM's kernel directly to disable TCP checksum validation to get it working properly. I got 30.0Gbps for 'iperf' between 2 VMs. Best, Yifeng On Tue, Jan 28, 2020 at 4:00 AM Flavio Leitner wrote: > > On Mon, Jan 27, 2020 at 05:17:01PM -0800, Yifeng Sun wrote: > > Hi Ilya, > > > > Thanks for your reply. > > > > The thing is, if checksum offloading is enabled in both VMs, then > > sender VM will send > > a packet with invalid TCP checksum, and later OVS will send this > > packet to receiver > > VM directly without calculating a valid checksum. As a result, > > receiver VM will drop > > this packet because it contains invalid checksum. This is what > > happened when I tried > > this patch. > > > > When TSO is enabled, the TX checksumming offloading is required, > then you will see invalid checksum. This is well documented here: > > https://github.com/openvswitch/ovs/blob/master/Documentation/topics/userspace-tso.rst#userspace-datapath---tso > > "Additionally, if the traffic is headed to a VM within the same host > further optimization can be expected. As the traffic never leaves > the machine, no MTU needs to be accounted for, and thus no > segmentation and checksum calculations are required, which saves yet > more cycles." > > Therefore, it's expected to see bad csum in the traffic dumps. > > To use the feature, you need few steps: enable the feature in OvS > enable in qemu and inside the VM. The linux guest usually enable > the feature by default if qemu offers it. > > HTH, > fbl > > > > Best, > > Yifeng > > > > On Mon, Jan 27, 2020 at 12:09 PM Ilya Maximets wrote: > > > > > > On 27.01.2020 18:24, Yifeng Sun wrote: > > > > Hi Flavio, > > > > > > > > I am testing your patch using iperf between 2 VMs on the same host. > > > > But it seems that TCP connection can't be created between these 2 VMs. 
> > > > When inspecting further, I found that TCP packets have invalid > > > > checksums. > > > > This might be the reason. > > > > > > > > I am wondering if I missed something in the setup? Thanks a lot. > > > > > > I didn't test myself, but according to current design, checksum offloading > > > (rx and tx) shuld be enabled in both VMs. Otherwise all the packets will > > > be dropped by the guest kernel. > > > > > > Best regards, Ilya Maximets. > > -- > fbl ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v4 0/3] Add support for TSO with DPDK
Sure.

Firstly, make sure userspace-tso-enable is true
# ovs-vsctl get Open_vSwitch . other_config
{dpdk-init="true", enable-statistics="true", pmd-cpu-mask="0xf",
userspace-tso-enable="true"}

Next, create 2 VMs with vhostuser-type interface on the same KVM host:

When VM boots up, turn on tx, tso and sg
# ethtool -K ens6 tx on
# ethtool -K ens6 tso on
# ethtool -K ens6 sg on

Then run 'iperf -s' on one VM and 'iperf -c xx.xx.xx.xx' on another VM.
Iperf doesn't work if there is no change to VM's kernel. `tcpdump` shows
that iperf server received packets with invalid TCP checksum.
`nstat -a` shows that TcpInCsumErr number is accumulating.

After adding changes to VM's kernel as below, iperf works properly.
in tcp_v4_rcv()
-       if (skb_checksum_init(skb, IPPROTO_TCP, inet_compute_pseudo))
+       if (skb_checksum_init(skb, IPPROTO_TCP, inet_compute_pseudo))

static inline bool tcp_checksum_complete(struct sk_buff *skb)
{
        return 0;
}

Best,
Yifeng

On Tue, Jan 28, 2020 at 2:52 PM Flavio Leitner wrote:
>
> On Tue, Jan 28, 2020 at 02:21:30PM -0800, Yifeng Sun wrote:
> > Hi Flavio,
> >
> > Thanks for the explanation. I followed the steps in the document but
> > TCP connection still failed to build between 2 VMs.
> >
> > I finally modified VM's kernel directly to disable TCP checksum validation
> > to get it working properly. I got 30.0Gbps for 'iperf' between 2 VMs.
>
> Could you provide more details on how you did that? What's running
> inside the VM?
>
> I don't change anything inside of the VMs (Linux) in my testbed.
>
> fbl
>
> >
> >
> > Best,
> > Yifeng
> >
> >
> > On Tue, Jan 28, 2020 at 4:00 AM Flavio Leitner wrote:
> > >
> > > On Mon, Jan 27, 2020 at 05:17:01PM -0800, Yifeng Sun wrote:
> > > > Hi Ilya,
> > > >
> > > > Thanks for your reply.
> > > >
> > > > The thing is, if checksum offloading is enabled in both VMs, then
> > > > sender VM will send
> > > > a packet with invalid TCP checksum, and later OVS will send this
> > > > packet to receiver
> > > > VM directly without calculating a valid checksum. As a result,
> > > > receiver VM will drop
> > > > this packet because it contains invalid checksum. This is what
> > > > happened when I tried
> > > > this patch.
> > > >
> > >
> > > When TSO is enabled, the TX checksumming offloading is required,
> > > then you will see invalid checksum. This is well documented here:
> > >
> > > https://github.com/openvswitch/ovs/blob/master/Documentation/topics/userspace-tso.rst#userspace-datapath---tso
> > >
> > > "Additionally, if the traffic is headed to a VM within the same host
> > > further optimization can be expected. As the traffic never leaves
> > > the machine, no MTU needs to be accounted for, and thus no
> > > segmentation and checksum calculations are required, which saves yet
> > > more cycles."
> > >
> > > Therefore, it's expected to see bad csum in the traffic dumps.
> > >
> > > To use the feature, you need few steps: enable the feature in OvS
> > > enable in qemu and inside the VM. The linux guest usually enable
> > > the feature by default if qemu offers it.
> > >
> > > HTH,
> > > fbl
> > >
> > >
> > > > Best,
> > > > Yifeng
> > > >
> > > > On Mon, Jan 27, 2020 at 12:09 PM Ilya Maximets
> > > > wrote:
> > > > >
> > > > > On 27.01.2020 18:24, Yifeng Sun wrote:
> > > > > > Hi Flavio,
> > > > > >
> > > > > > I am testing your patch using iperf between 2 VMs on the same host.
> > > > > > But it seems that TCP connection can't be created between these 2
> > > > > > VMs.
> > > > > > When inspecting further, I found that TCP packets have invalid
> > > > > > checksums.
> > > > > > This might be the reason.
> > > > > >
> > > > > > I am wondering if I missed something in the setup? Thanks a lot.
> > > > >
> > > > > I didn't test myself, but according to current design, checksum
> > > > > offloading
> > > > > (rx and tx) should be enabled in both VMs. Otherwise all the packets
> > > > > will
> > > > > be dropped by the guest kernel.
> > > > >
> > > > > Best regards, Ilya Maximets.
> > >
> > > --
> > > fbl
>
> --
> fbl
Re: [ovs-dev] [PATCH v4 0/3] Add support for TSO with DPDK
Hi Flavio, Sorry in my last email, one change is incorrect. it should be: in tcp_v4_rcv() - if (skb_checksum_init(skb, IPPROTO_TCP, inet_compute_pseudo)) + if (0) The kernel version I am using is ubuntu 18.04's default kernel: $ uname -r 4.15.0-76-generic Thanks, Yifeng On Wed, Jan 29, 2020 at 3:25 AM Flavio Leitner wrote: > > On Tue, Jan 28, 2020 at 03:23:02PM -0800, Yifeng Sun wrote: > > Sure. > > > > Firstly, make sure userspace-tso-enable is true > > # ovs-vsctl get Open_vSwitch . other_config > > {dpdk-init="true", enable-statistics="true", pmd-cpu-mask="0xf", > > userspace-tso-enable="true"} > > > > Next, create 2 VMs with vhostuser-type interface on the same KVM host: > > > > > >> path='/tmp/041afca0-6e11-4eab-a62f-1ccf5cd318fd' mode='server'/> > > > > > > > > > > I have other options set, but I don't think they are related: > ufo='off' mrg_rxbuf='on'/> > > > > > > > > >> function='0x0'/> > > > > > > When VM boots up, turn on tx, tso and sg > > # ethtool -K ens6 tx on > > # ethtool -K ens6 tso on > > # ethtool -K ens6 sg on > > All the needed offloading features are turned on by default, > so I don't change anything in my testbed. > > > Then run 'iperf -s' on one VM and 'iperf -c xx.xx.xx.xx' on another VM. > > Iperf doesn't work if there is no chage to VM's kernel. `tcpdump` shows > > that iperf server received packets with invalid TCP checksum. > > `nstat -a` shows that TcpInCsumErr number is accumulating. > > > > After adding changes to VM's kernel as below, iperf works properly. > > in tcp_v4_rcv() > > - if (skb_checksum_init(skb, IPPROTO_TCP, inet_compute_pseudo)) > > + if (skb_checksum_init(skb, IPPROTO_TCP, inet_compute_pseudo)) > > > > static inline bool tcp_checksum_complete(struct sk_buff *skb) > > { > > return 0; > > } > > That's odd. Which kernel is that? Maybe I can try the same version. > I am using 5.2.14-200.fc30.x86_64. 
> > Looks like somehow the packet lost its offloading flags, then kernel > has to check the csum and since it wasn't calculated before, it's > just random garbage. > > fbl > > > > > > > > > > Best, > > Yifeng > > > > On Tue, Jan 28, 2020 at 2:52 PM Flavio Leitner wrote: > > > > > > On Tue, Jan 28, 2020 at 02:21:30PM -0800, Yifeng Sun wrote: > > > > Hi Flavio, > > > > > > > > Thanks for the explanation. I followed the steps in the document but > > > > TCP connection still failed to build between 2 VMs. > > > > > > > > I finally modified VM's kernel directly to disable TCP checksum > > > > validation > > > > to get it working properly. I got 30.0Gbps for 'iperf' between 2 VMs. > > > > > > Could you provide more details on how you did that? What's running > > > inside the VM? > > > > > > I don't change anything inside of the VMs (Linux) in my testbed. > > > > > > fbl > > > > > > > > > > > > > > Best, > > > > Yifeng > > > > > > > > > > > > On Tue, Jan 28, 2020 at 4:00 AM Flavio Leitner > > > > wrote: > > > > > > > > > > On Mon, Jan 27, 2020 at 05:17:01PM -0800, Yifeng Sun wrote: > > > > > > Hi Ilya, > > > > > > > > > > > > Thanks for your reply. > > > > > > > > > > > > The thing is, if checksum offloading is enabled in both VMs, then > > > > > > sender VM will send > > > > > > a packet with invalid TCP checksum, and later OVS will send this > > > > > > packet to receiver > > > > > > VM directly without calculating a valid checksum. As a result, > > > > > > receiver VM will drop > > > > > > this packet because it contains invalid checksum. This is what > > > > > > happened when I tried > > > > > > this patch. > > > > > > > > > > > > > > > > When TSO is enabled, the TX checksumming offloading is required, > > > > > then you will see invalid checksum. This is well documented here: > > > > &
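For readers following the checksum discussion in this thread, here is a minimal sketch of the one's-complement Internet checksum that a receiver falls back to when a packet arrives without offload flags. It is not the kernel's implementation: the function name inet_csum is hypothetical and the TCP pseudo-header is left out for brevity. It only illustrates why a checksum field that the sender left for hardware to fill in fails software validation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not a kernel or OVS function.  Computes the 16-bit
 * one's-complement checksum over 'len' bytes, treating the data as
 * big-endian 16-bit words and padding an odd trailing byte with zero. */
static uint16_t
inet_csum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    assert(data || len == 0);

    for (i = 0; i + 1 < len; i += 2) {
        sum += ((uint32_t) data[i] << 8) | data[i + 1];
    }
    if (len & 1) {
        sum += (uint32_t) data[len - 1] << 8;
    }

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16) {
        sum = (sum & 0xffff) + (sum >> 16);
    }
    return (uint16_t) ~sum;
}
```

A receiver that validates in software runs the same sum over the data with the checksum field included and expects zero. If the field was never completed by the sender, the result is nonzero and the segment is counted as a checksum error.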
Re: [ovs-dev] [PATCH v4 0/3] Add support for TSO with DPDK
Hi Ilya, The whole output of 'ethtool -k ens6' is here: $ ethtool -k ens6 Features for ens6: rx-checksumming: on [fixed] tx-checksumming: on tx-checksum-ipv4: off [fixed] tx-checksum-ip-generic: on tx-checksum-ipv6: off [fixed] tx-checksum-fcoe-crc: off [fixed] tx-checksum-sctp: off [fixed] scatter-gather: on tx-scatter-gather: on tx-scatter-gather-fraglist: off [fixed] tcp-segmentation-offload: on tx-tcp-segmentation: on tx-tcp-ecn-segmentation: on tx-tcp-mangleid-segmentation: on tx-tcp6-segmentation: on udp-fragmentation-offload: off generic-segmentation-offload: on generic-receive-offload: on large-receive-offload: off [fixed] rx-vlan-offload: off [fixed] tx-vlan-offload: off [fixed] ntuple-filters: off [fixed] receive-hashing: off [fixed] highdma: on [fixed] rx-vlan-filter: on [fixed] vlan-challenged: off [fixed] tx-lockless: off [fixed] netns-local: off [fixed] tx-gso-robust: on [fixed] tx-fcoe-segmentation: off [fixed] tx-gre-segmentation: off [fixed] tx-gre-csum-segmentation: off [fixed] tx-ipxip4-segmentation: off [fixed] tx-ipxip6-segmentation: off [fixed] tx-udp_tnl-segmentation: off [fixed] tx-udp_tnl-csum-segmentation: off [fixed] tx-gso-partial: off [fixed] tx-sctp-segmentation: off [fixed] tx-esp-segmentation: off [fixed] fcoe-mtu: off [fixed] tx-nocache-copy: off loopback: off [fixed] rx-fcs: off [fixed] rx-all: off [fixed] tx-vlan-stag-hw-insert: off [fixed] rx-vlan-stag-hw-parse: off [fixed] rx-vlan-stag-filter: off [fixed] l2-fwd-offload: off [fixed] hw-tc-offload: off [fixed] esp-hw-offload: off [fixed] esp-tx-csum-hw-offload: off [fixed] rx-udp_tunnel-port-offload: off [fixed] yfs@ubuntu:~$ ethtool -k ens6 | grep rx rx-checksumming: on [fixed] rx-vlan-offload: off [fixed] rx-vlan-filter: on [fixed] rx-fcs: off [fixed] rx-all: off [fixed] rx-vlan-stag-hw-parse: off [fixed] rx-vlan-stag-filter: off [fixed] rx-udp_tunnel-port-offload: off [fixed] yfs@ubuntu:~$ ethtool -k ens6 Features for ens6: rx-checksumming: on [fixed] tx-checksumming: on 
tx-checksum-ipv4: off [fixed] tx-checksum-ip-generic: on tx-checksum-ipv6: off [fixed] tx-checksum-fcoe-crc: off [fixed] tx-checksum-sctp: off [fixed] scatter-gather: on tx-scatter-gather: on tx-scatter-gather-fraglist: off [fixed] tcp-segmentation-offload: on tx-tcp-segmentation: on tx-tcp-ecn-segmentation: on tx-tcp-mangleid-segmentation: on tx-tcp6-segmentation: on udp-fragmentation-offload: off generic-segmentation-offload: on generic-receive-offload: on large-receive-offload: off [fixed] rx-vlan-offload: off [fixed] tx-vlan-offload: off [fixed] ntuple-filters: off [fixed] receive-hashing: off [fixed] highdma: on [fixed] rx-vlan-filter: on [fixed] vlan-challenged: off [fixed] tx-lockless: off [fixed] netns-local: off [fixed] tx-gso-robust: on [fixed] tx-fcoe-segmentation: off [fixed] tx-gre-segmentation: off [fixed] tx-gre-csum-segmentation: off [fixed] tx-ipxip4-segmentation: off [fixed] tx-ipxip6-segmentation: off [fixed] tx-udp_tnl-segmentation: off [fixed] tx-udp_tnl-csum-segmentation: off [fixed] tx-gso-partial: off [fixed] tx-sctp-segmentation: off [fixed] tx-esp-segmentation: off [fixed] fcoe-mtu: off [fixed] tx-nocache-copy: off loopback: off [fixed] rx-fcs: off [fixed] rx-all: off [fixed] tx-vlan-stag-hw-insert: off [fixed] rx-vlan-stag-hw-parse: off [fixed] rx-vlan-stag-filter: off [fixed] l2-fwd-offload: off [fixed] hw-tc-offload: off [fixed] esp-hw-offload: off [fixed] esp-tx-csum-hw-offload: off [fixed] rx-udp_tunnel-port-offload: off [fixed] Thanks, Yifeng On Wed, Jan 29, 2020 at 4:07 AM Ilya Maximets wrote: > > On 29.01.2020 12:25, Flavio Leitner wrote: > > On Tue, Jan 28, 2020 at 03:23:02PM -0800, Yifeng Sun wrote: > >> Sure. > >> > >> Firstly, make sure userspace-tso-enable is true > >> # ovs-vsctl get Open_vSwitch . 
other_config > >> {dpdk-init="true", enable-statistics="true", pmd-cpu-mask="0xf", > >> userspace-tso-enable="true"} > >> > >> Next, create 2 VMs with vhostuser-type interface on the same KVM host: > >> > >> > >>>> path='/tmp/041afca0-6e11-4eab-a62f-1ccf5cd318fd' mode='server'/> > >> > >> > >> > >> > > > > I have other options set, but I don't think they are related: > > > ufo='off' mrg_rxbuf='on'/>> > ecn='off' ufo='off'/> > > > > > >> > >> > >>>> function='0x0'/> > >> > >> > >> When VM boots up, turn on tx, tso and sg > >> # ethtool -K ens6 tx on &
Re: [ovs-dev] [PATCH v4 0/3] Add support for TSO with DPDK
Hi Flavio, I found this piece of code in kernel's drivers/net/virtio_net.c and its function receive_buf(): if (hdr->hdr.flags & VIRTIO_NET_HDR_F_DATA_VALID) skb->ip_summed = CHECKSUM_UNNECESSARY; My understanding is that vhost_user needs to set flag VIRTIO_NET_HDR_F_DATA_VALID so that guest's kernel will skip packet's checksum validation. Then I looked through dpdk's source code but didn't find any place that sets this flag. So I made some changes as below, and TCP starts working between 2 VMs without any kernel change. diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c index 73bf98bd9..5e45db655 100644 --- a/lib/librte_vhost/virtio_net.c +++ b/lib/librte_vhost/virtio_net.c @@ -437,6 +437,7 @@ virtio_enqueue_offload(struct rte_mbuf *m_buf, struct virtio_net_hdr *net_hdr) ASSIGN_UNLESS_EQUAL(net_hdr->csum_start, 0); ASSIGN_UNLESS_EQUAL(net_hdr->csum_offset, 0); - ASSIGN_UNLESS_EQUAL(net_hdr->flags, 0); + net_hdr->flags = VIRTIO_NET_HDR_F_DATA_VALID; } /* IP cksum verification cannot be bypassed, then calculate here */ Any comments will be appreciated! Thanks a lot, Yifeng On Wed, Jan 29, 2020 at 1:21 PM Flavio Leitner wrote: > > On Wed, Jan 29, 2020 at 11:19:47AM -0800, William Tu wrote: > > On Wed, Jan 29, 2020 at 3:25 AM Flavio Leitner wrote: > > > > > > On Tue, Jan 28, 2020 at 03:23:02PM -0800, Yifeng Sun wrote: > > > > Sure. > > > > > > > > Firstly, make sure userspace-tso-enable is true > > > > # ovs-vsctl get Open_vSwitch . other_config > > > > {dpdk-init="true", enable-statistics="true", pmd-cpu-mask="0xf", > > > > userspace-tso-enable="true"} > > > > > > > > Next, create 2 VMs with vhostuser-type interface on the same KVM host: > > > > > > > > > > > >> > > path='/tmp/041afca0-6e11-4eab-a62f-1ccf5cd318fd' mode='server'/> > > > > > > > > > > > > > > > > > > > > > > I have other options set, but I don't think they are related: > > > > > ufo='off' mrg_rxbuf='on'/> > > > > > > > Is mrg_rxbuf required to be on? > > No. 
>
> > I saw when enable userspace tso, we are setting external buffer
> > RTE_VHOST_USER_EXTBUF_SUPPORT
>
> Yes.
>
> > Is this the same thing?
>
> No.
>
> mrg_rxbuf says that we want the virtio ring to support chained ring
> entries. If that is disabled, the virtio ring will be populated with
> entries of maximum buffer length. If that is enabled, a packet will
> use one entry or chain more entries in the virtio ring, so each entry
> can be of smaller length. That is not visible to OvS.
>
> The RTE_VHOST_USER_EXTBUF_SUPPORT tells how a packet is provided
> to OvS after it has been pulled out of the virtio rings. We have three
> options currently:
>
> 1) LINEARBUF
> It supports data length up to the packet provided (~MTU size).
>
> 2) EXTBUF
> If the packet is too big for #1, allocate a buffer large enough
> to fit the data. We get a big packet, but instead of data being
> along with the packet's metadata, it's in an external buffer.
>
>        +---> [ big buffer]
>
> Well, actually we make partial use of unused buffer to store
> struct rte_mbuf_ext_shared_info.
>
> 3) If neither LINEARBUF nor EXTBUF is provided (default),
> vhost lib can provide large packets as a chain of mbufs, which
> OvS doesn't support today.
>
> HTH,
> --
> fbl
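The three delivery options described above can be sketched as a small decision function. This is only an illustration under stated assumptions: the enum, the function classify_rx and its parameters are hypothetical names, not DPDK or OVS definitions, and the "drop" outcome for an oversized packet when only LINEARBUF is requested is an assumption about the library's behaviour rather than something stated in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcomes; these are not DPDK identifiers. */
enum rx_delivery {
    DELIVER_LINEAR,     /* Data sits next to the packet metadata. */
    DELIVER_EXTERNAL,   /* Data sits in a separately allocated big buffer. */
    DELIVER_CHAINED,    /* Data is spread over a chain of mbufs. */
    DELIVER_DROP        /* Assumed: packet cannot be delivered as requested. */
};

/* Decides how a packet of 'pkt_len' bytes would be handed over, given the
 * data room of a single mbuf and which of the two support flags were
 * requested when the vhost device was registered. */
static enum rx_delivery
classify_rx(size_t pkt_len, size_t mbuf_room, bool linearbuf, bool extbuf)
{
    assert(mbuf_room > 0);

    if (pkt_len <= mbuf_room) {
        return DELIVER_LINEAR;      /* Option 1: fits in one mbuf. */
    }
    if (extbuf) {
        return DELIVER_EXTERNAL;    /* Option 2: allocate a big buffer. */
    }
    if (linearbuf) {
        return DELIVER_DROP;        /* Assumption: linear-only cannot chain. */
    }
    return DELIVER_CHAINED;         /* Option 3: default, chain of mbufs. */
}
```

With both flags requested, as in the userspace TSO case discussed here, a 64 kB TSO packet would take the external-buffer path and an MTU-sized packet would stay linear.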
Re: [ovs-dev] [PATCH v4 0/3] Add support for TSO with DPDK
Got it. Thanks. Yifeng On Wed, Jan 29, 2020 at 3:04 PM Flavio Leitner wrote: > > On Wed, Jan 29, 2020 at 02:42:27PM -0800, Yifeng Sun wrote: > > Hi Flavio, > > Hi Yifend, thanks for looking into this. > > > I found this piece of code in kernel's drivers/net/virtio_net.c and > > its function receive_buf(): > > if (hdr->hdr.flags & VIRTIO_NET_HDR_F_DATA_VALID) > > skb->ip_summed = CHECKSUM_UNNECESSARY; > > My understanding is that vhost_user needs to set flag > > VIRTIO_NET_HDR_F_DATA_VALID so that > > guest's kernel will skip packet's checksum validation. > > > > Then I looked through dpdk's source code but didn't find any place > > that sets this flag. So I made > > some changes as below, and TCP starts working between 2 VMs without > > any kernel change. > > > > diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c > > index 73bf98bd9..5e45db655 100644 > > --- a/lib/librte_vhost/virtio_net.c > > +++ b/lib/librte_vhost/virtio_net.c > > @@ -437,6 +437,7 @@ virtio_enqueue_offload(struct rte_mbuf *m_buf, > > struct virtio_net_hdr *net_hdr) > > ASSIGN_UNLESS_EQUAL(net_hdr->csum_start, 0); > > ASSIGN_UNLESS_EQUAL(net_hdr->csum_offset, 0); > > - ASSIGN_UNLESS_EQUAL(net_hdr->flags, 0); > > + net_hdr->flags = VIRTIO_NET_HDR_F_DATA_VALID; > > } > > > > /* IP cksum verification cannot be bypassed, then calculate here */ > > No, it actually uses ->flags to pass VIRTIO_NET_HDR_F_NEEDS_CSUM and > then we pass the start and offset. > > HTH, > fbl > > > > > > > Any comments will be appreciated! > > > > Thanks a lot, > > Yifeng > > > > On Wed, Jan 29, 2020 at 1:21 PM Flavio Leitner wrote: > > > > > > On Wed, Jan 29, 2020 at 11:19:47AM -0800, William Tu wrote: > > > > On Wed, Jan 29, 2020 at 3:25 AM Flavio Leitner > > > > wrote: > > > > > > > > > > On Tue, Jan 28, 2020 at 03:23:02PM -0800, Yifeng Sun wrote: > > > > > > Sure. > > > > > > > > > > > > Firstly, make sure userspace-tso-enable is true > > > > > > # ovs-vsctl get Open_vSwitch . 
other_config > > > > > > {dpdk-init="true", enable-statistics="true", pmd-cpu-mask="0xf", > > > > > > userspace-tso-enable="true"} > > > > > > > > > > > > Next, create 2 VMs with vhostuser-type interface on the same KVM > > > > > > host: > > > > > > > > > > > > > > > > > >> > > > > path='/tmp/041afca0-6e11-4eab-a62f-1ccf5cd318fd' mode='server'/> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I have other options set, but I don't think they are related: > > > > > > > > > ufo='off' mrg_rxbuf='on'/> > > > > > > > > > > > > > Is mrg_rxbuf required to be on? > > > > > > No. > > > > > > > I saw when enable userspace tso, we are setting external buffer > > > > RTE_VHOST_USER_EXTBUF_SUPPORT > > > > > > Yes. > > > > > > > Is this the same thing? > > > > > > No. > > > > > > mrg_rxbuf says that we want the virtio ring to support chained ring > > > entries. If that is disabled, the virtio ring will be populated with > > > entries of maximum buffer length. If that is enabled, a packet will > > > use one or chain more entries in the virtio ring, so each entry can > > > be of smaller lengths. That is not visible to OvS. > > > > > > The RTE_VHOST_USER_EXTBUF_SUPPORT tells how a packet is provided > > > after have been pulled out of virtio rings to OvS. We have three > > > options currently: > > > > > > 1) LINEARBUF > > > It supports data length up to the packet provided (~MTU size). > > > > > > 2) EXTBUF > > > If the packet is too big for #1, allocate a buffer large enough > > > to fit the data. We get a big packet, but instead of data being > > > along with the packet's metadata, it's in an external buffer. > > > > > > > > >+---> [ big buffer] > > > > > > Well, actually we make partial use of unused buffer to store > > > struct rte_mbuf_ext_shared_info. > > > > > > 3) If neither LINEARBUF nor EXTBUF is not provided (default), > > > vhost lib can provide large packets as a chain of mbufs, which > > > OvS doesn't support today. 
> > > > > > HTH, > > > -- > > > fbl > > -- > fbl ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
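The exchange above turns on two virtio-net header flags. The following is a simplified sketch of how a receiver could map them to a checksum state; it is not the kernel's receive_buf(). The struct, enum and function names are hypothetical, and the precedence given to NEEDS_CSUM is a simplification. The two flag values are the ones defined by the virtio specification.

```c
#include <assert.h>
#include <stdint.h>

/* Flag values as defined by the virtio specification. */
#define VIRTIO_NET_HDR_F_NEEDS_CSUM 1
#define VIRTIO_NET_HDR_F_DATA_VALID 2

/* Hypothetical, simplified stand-ins for the receiver's checksum states. */
enum csum_state {
    CSUM_NONE,          /* Receiver must verify the checksum in software. */
    CSUM_UNNECESSARY,   /* Sender vouches that the checksum is valid. */
    CSUM_PARTIAL        /* Checksum still to be completed from start/offset. */
};

/* Hypothetical cut-down header; only the fields discussed in the thread. */
struct vnet_hdr_sketch {
    uint8_t flags;
    uint16_t csum_start;
    uint16_t csum_offset;
};

static enum csum_state
rx_csum_state(const struct vnet_hdr_sketch *hdr)
{
    assert(hdr);

    if (hdr->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
        /* csum_start and csum_offset say where the checksum belongs. */
        return CSUM_PARTIAL;
    }
    if (hdr->flags & VIRTIO_NET_HDR_F_DATA_VALID) {
        return CSUM_UNNECESSARY;
    }
    return CSUM_NONE;
}
```

In this model a header with flags set to 0 lands in CSUM_NONE, which corresponds to the symptom reported earlier: the guest verifies a checksum that was never computed. Passing NEEDS_CSUM together with the start and offset, as fbl describes, avoids that without claiming the data is already valid.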
Re: [ovs-dev] [PATCH v4 0/3] Add support for TSO with DPDK
Hi Flavio, Can you please confirm the kernel versions you are using? Host KVM: 5.2.14-200.fc30.x86_64. VM: 4.15.0 from upstream ubuntu. Thanks, Yifeng On Thu, Feb 13, 2020 at 12:05 PM Flavio Leitner wrote: > > > Hi Yifeng, > > Sorry the late response. > > On Wed, Jan 29, 2020 at 09:04:39AM -0800, Yifeng Sun wrote: > > Hi Flavio, > > > > Sorry in my last email, one change is incorrect. it should be: > > in tcp_v4_rcv() > > - if (skb_checksum_init(skb, IPPROTO_TCP, inet_compute_pseudo)) > > + if (0) > > > > The kernel version I am using is ubuntu 18.04's default kernel: > > $ uname -r > > 4.15.0-76-generic > > I deployed a VM with 4.15.0 from upstream and I can ssh, scp (back > and forth), iperf3 (direct, reverse, with TCP or UDP) between that > VM and another VM, veth, bridge and another host without issues. > > Any chance for you to try with the same upstream kernel version? > > Thanks, > fbl > > > > > Thanks, > > Yifeng > > > > On Wed, Jan 29, 2020 at 3:25 AM Flavio Leitner wrote: > > > > > > On Tue, Jan 28, 2020 at 03:23:02PM -0800, Yifeng Sun wrote: > > > > Sure. > > > > > > > > Firstly, make sure userspace-tso-enable is true > > > > # ovs-vsctl get Open_vSwitch . other_config > > > > {dpdk-init="true", enable-statistics="true", pmd-cpu-mask="0xf", > > > > userspace-tso-enable="true"} > > > > > > > > Next, create 2 VMs with vhostuser-type interface on the same KVM host: > > > > > > > > > > > >> > > path='/tmp/041afca0-6e11-4eab-a62f-1ccf5cd318fd' mode='server'/> > > > > > > > > > > > > > > > > > > > > > > I have other options set, but I don't think they are related: > > > > > ufo='off' mrg_rxbuf='on'/> > > > > > > > > > > > > > > > > > > > > >> > > function='0x0'/> > > > > > > > > > > > > When VM boots up, turn on tx, tso and sg > > > > # ethtool -K ens6 tx on > > > > # ethtool -K ens6 tso on > > > > # ethtool -K ens6 sg on > > > > > > All the needed offloading features are turned on by default, > > > so I don't change anything in my testbed. 
> > > > > > > Then run 'iperf -s' on one VM and 'iperf -c xx.xx.xx.xx' on another VM. > > > > Iperf doesn't work if there is no chage to VM's kernel. `tcpdump` shows > > > > that iperf server received packets with invalid TCP checksum. > > > > `nstat -a` shows that TcpInCsumErr number is accumulating. > > > > > > > > After adding changes to VM's kernel as below, iperf works properly. > > > > in tcp_v4_rcv() > > > > - if (skb_checksum_init(skb, IPPROTO_TCP, inet_compute_pseudo)) > > > > + if (skb_checksum_init(skb, IPPROTO_TCP, inet_compute_pseudo)) > > > > > > > > static inline bool tcp_checksum_complete(struct sk_buff *skb) > > > > { > > > > return 0; > > > > } > > > > > > That's odd. Which kernel is that? Maybe I can try the same version. > > > I am using 5.2.14-200.fc30.x86_64. > > > > > > Looks like somehow the packet lost its offloading flags, then kernel > > > has to check the csum and since it wasn't calculated before, it's > > > just random garbage. > > > > > > fbl > > > > > > > > > > > > > > > > > > > > > > Best, > > > > Yifeng > > > > > > > > On Tue, Jan 28, 2020 at 2:52 PM Flavio Leitner > > > > wrote: > > > > > > > > > > On Tue, Jan 28, 2020 at 02:21:30PM -0800, Yifeng Sun wrote: > > > > > > Hi Flavio, > > > > > > > > > > > > Thanks for the explanation. I followed the steps in the document but > > > > > > TCP connection still failed to build between 2 VMs. > > > > > > > > > > > > I finally modified VM's kernel directly to disable TCP checksum > > > > > > validation > > > > > > to get it working properly. I got 30.0Gbps for 'iperf' between 2 > > > >
Re: [ovs-dev] [PATCH v4 0/3] Add support for TSO with DPDK
Got it, thanks! Yifeng On Fri, Feb 14, 2020 at 11:29 AM Flavio Leitner wrote: > > On Fri, Feb 14, 2020 at 09:44:52AM -0800, Yifeng Sun wrote: > > Hi Flavio, > > > > Can you please confirm the kernel versions you are using? > > > > Host KVM: 5.2.14-200.fc30.x86_64. > > Host KVM: 5.5.0+ > > > VM: 4.15.0 from upstream ubuntu. > > VM: 4.15.0 from Linus git tree. > > fbl > > > > > Thanks, > > Yifeng > > > > On Thu, Feb 13, 2020 at 12:05 PM Flavio Leitner wrote: > > > > > > > > > Hi Yifeng, > > > > > > Sorry the late response. > > > > > > On Wed, Jan 29, 2020 at 09:04:39AM -0800, Yifeng Sun wrote: > > > > Hi Flavio, > > > > > > > > Sorry in my last email, one change is incorrect. it should be: > > > > in tcp_v4_rcv() > > > > - if (skb_checksum_init(skb, IPPROTO_TCP, inet_compute_pseudo)) > > > > + if (0) > > > > > > > > The kernel version I am using is ubuntu 18.04's default kernel: > > > > $ uname -r > > > > 4.15.0-76-generic > > > > > > I deployed a VM with 4.15.0 from upstream and I can ssh, scp (back > > > and forth), iperf3 (direct, reverse, with TCP or UDP) between that > > > VM and another VM, veth, bridge and another host without issues. > > > > > > Any chance for you to try with the same upstream kernel version? > > > > > > Thanks, > > > fbl > > > > > > > > > > > Thanks, > > > > Yifeng > > > > > > > > On Wed, Jan 29, 2020 at 3:25 AM Flavio Leitner > > > > wrote: > > > > > > > > > > On Tue, Jan 28, 2020 at 03:23:02PM -0800, Yifeng Sun wrote: > > > > > > Sure. > > > > > > > > > > > > Firstly, make sure userspace-tso-enable is true > > > > > > # ovs-vsctl get Open_vSwitch . 
other_config > > > > > > {dpdk-init="true", enable-statistics="true", pmd-cpu-mask="0xf", > > > > > > userspace-tso-enable="true"} > > > > > > > > > > > > Next, create 2 VMs with vhostuser-type interface on the same KVM > > > > > > host: > > > > > > > > > > > > > > > > > >> > > > > path='/tmp/041afca0-6e11-4eab-a62f-1ccf5cd318fd' mode='server'/> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I have other options set, but I don't think they are related: > > > > > > > > > ufo='off' mrg_rxbuf='on'/> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > function='0x0'/> > > > > > > > > > > > > > > > > > > When VM boots up, turn on tx, tso and sg > > > > > > # ethtool -K ens6 tx on > > > > > > # ethtool -K ens6 tso on > > > > > > # ethtool -K ens6 sg on > > > > > > > > > > All the needed offloading features are turned on by default, > > > > > so I don't change anything in my testbed. > > > > > > > > > > > Then run 'iperf -s' on one VM and 'iperf -c xx.xx.xx.xx' on another > > > > > > VM. > > > > > > Iperf doesn't work if there is no chage to VM's kernel. `tcpdump` > > > > > > shows > > > > > > that iperf server received packets with invalid TCP checksum. > > > > > > `nstat -a` shows that TcpInCsumErr number is accumulating. > > > > > > > > > > > > After adding changes to VM's kernel as below, iperf works properly. > > > > > > in tcp_v4_rcv() > > > > > > - if (skb_checksum_init(skb, IPPROTO_TCP, > > > > > > inet_compute_pseudo)) > > > > > > + if (skb_checksum_init(skb, IPPROTO_TCP, > > > > > > inet_compute_pseudo)) > > > > > > > > > > > > static inline bool tcp_checksum_complete(struct sk_buff *skb) > > > > > > { > > > &
[ovs-dev] [PATCH 2/2] system-traffic: Check frozen state handling with TLV map change
This patch enhances a system traffic test to prevent regression on
the tunnel metadata table (tun_table) handling with frozen state.
Without a proper fix this test can crash ovs-vswitchd due to a
use-after-free bug on tun_table.

This is the timed sequence of how this bug is triggered:

- Add an OpenFlow rule in OVS that matches Geneve tunnel metadata and
  contains a controller action.

- When the first packet matches the aforementioned OpenFlow rule,
  during the miss upcall, OVS stores a pointer to the tun_table (that
  decodes the Geneve tunnel metadata) in a frozen state and pushes down
  a datapath flow into the kernel datapath.

- Issue an add-tlv-map command to reprogram the tun_table on OVS.
  OVS frees the old tun_table and creates a new tun_table.

- A subsequent packet hits the kernel datapath flow again. Since
  there is a controller action associated with that flow, it triggers
  a slow path controller upcall.

- In the slow path controller upcall, OVS derives the tun_table
  from the frozen state, which points to the old tun_table that has
  already been freed at this point.

- In order to access the tunnel metadata, OVS uses the invalid
  pointer that points to the old tun_table and triggers the core dump.

Signed-off-by: Yi-Hung Wei
Signed-off-by: Yifeng Sun
Co-authored-by: Yi-Hung Wei
---
 tests/system-traffic.at | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/tests/system-traffic.at b/tests/system-traffic.at
index 4a39c929c207..992de8546c41 100644
--- a/tests/system-traffic.at
+++ b/tests/system-traffic.at
@@ -611,6 +611,20 @@
 NS_CHECK_EXEC([at_ns0], [ping -q -c 3 10.1.1.100 | FORMAT_PING], [0], [dnl
 3 packets transmitted, 3 received, 0% packet loss, time 0ms
 ])
+dnl Test OVS handles TLV map modifications properly when restores frozen state.
+NS_CHECK_EXEC([at_ns0], [ping 10.1.1.100 > ping.out &])
+
+sleep 2
+
+AT_CHECK([ovs-ofctl add-tlv-map br0 "{class=0x,type=0x88,len=4}->tun_metadata1"])
+sleep 1
+AT_CHECK([ovs-ofctl add-tlv-map br0 "{class=0x,type=0x99,len=4}->tun_metadata2"])
+sleep 1
+AT_CHECK([ovs-ofctl add-tlv-map br0 "{class=0x,type=0xaa,len=4}->tun_metadata3"])
+sleep 1
+
+dnl At this point, ovs-vswitchd will either crash or everything is OK.
+
 OVS_APP_EXIT_AND_WAIT([ovs-ofctl])
 OVS_TRAFFIC_VSWITCHD_STOP
 AT_CLEANUP
--
2.7.4
[ovs-dev] [PATCH 1/2] tun_metadata: Fix coredump caused by use-after-free bug
Tun_metadata can be referenced by flow and frozen_state at the same
time. When ovs-vswitchd handles a TLV table mod message, the involved
tun_metadata gets freed. The call trace to free tun_metadata is
shown below:

ofproto_run
- handle_openflow
- handle_single_part_openflow
- handle_tlv_table_mod
- tun_metadata_table_mod
- tun_metadata_postpone_free

Unfortunately, this tun_metadata can still be used by some frozen_state,
and later on when frozen_state tries to access its tun_metadata table,
ovs-vswitchd crashes. The call trace to access tun_metadata from
frozen_state is shown below:

udpif_upcall_handler
- recv_upcalls
- process_upcall
- frozen_metadata_to_flow

This patch fixes it by introducing a reference count to tun_metadata.
Whenever a pointer to tun_metadata is passed between flow and
frozen_state, we increase its reference count. The reference count
is decreased at deallocation.

In the present code, a pointer to tun_metadata can be passed between
flows. It is safe because of the RCU mechanism.

VMware-BZ: #2526222
Signed-off-by: Yifeng Sun
---
 lib/tun-metadata.c | 29 -
 lib/tun-metadata.h | 2 ++
 ofproto/ofproto-dpif-rid.c | 8
 ofproto/ofproto-dpif-rid.h | 2 ++
 4 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/lib/tun-metadata.c b/lib/tun-metadata.c
index f8a0e19524e9..c4218a034a92 100644
--- a/lib/tun-metadata.c
+++ b/lib/tun-metadata.c
@@ -25,6 +25,7 @@
 #include "nx-match.h"
 #include "odp-netlink.h"
 #include "openvswitch/ofp-match.h"
+#include "ovs-atomic.h"
 #include "ovs-rcu.h"
 #include "packets.h"
 #include "tun-metadata.h"
@@ -40,6 +41,11 @@ struct tun_meta_entry {
 /* Maps from TLV option class+type to positions in a struct tun_metadata's
  * 'opts' array. */
 struct tun_table {
+    /* Struct tun_table can be referenced by struct frozen_state for a long
+     * time. This ref_cnt protects tun_table from being freed if it is still
+     * being used somewhere. */
+    struct ovs_refcount ref_cnt;
+
     /* TUN_METADATA<i> is stored in element <i>. */
     struct tun_meta_entry entries[TUN_METADATA_NUM_OPTS];
 
@@ -79,6 +85,24 @@ tun_key_type(uint32_t key)
     return key & 0xff;
 }
 
+void
+tun_metadata_ref(const struct tun_table *tab)
+{
+    if (tab) {
+        ovs_refcount_ref(&CONST_CAST(struct tun_table *, tab)->ref_cnt);
+    }
+}
+
+unsigned int
+tun_metadata_unref(const struct tun_table *tab)
+{
+    if (tab) {
+        return ovs_refcount_unref_relaxed(
+            &CONST_CAST(struct tun_table *, tab)->ref_cnt);
+    }
+    return -1;
+}
+
 /* Returns a newly allocated tun_table. If 'old_map' is nonnull then the new
  * tun_table is a deep copy of the old one. */
 struct tun_table *
@@ -111,6 +135,7 @@ tun_metadata_alloc(const struct tun_table *old_map)
         hmap_init(&new_map->key_hmap);
     }
 
+    ovs_refcount_init(&new_map->ref_cnt);
     return new_map;
 }
 
@@ -135,7 +160,9 @@ tun_metadata_free(struct tun_table *map)
 void
 tun_metadata_postpone_free(struct tun_table *tab)
 {
-    ovsrcu_postpone(tun_metadata_free, tab);
+    if (tun_metadata_unref(tab) == 1) {
+        ovsrcu_postpone(tun_metadata_free, tab);
+    }
 }
 
 enum ofperr
diff --git a/lib/tun-metadata.h b/lib/tun-metadata.h
index 7dad9504b8da..933021a0f679 100644
--- a/lib/tun-metadata.h
+++ b/lib/tun-metadata.h
@@ -33,6 +33,8 @@ struct ofputil_tlv_table_mod;
 struct ofputil_tlv_table_reply;
 struct tun_table;
 
+void tun_metadata_ref(const struct tun_table *tab);
+unsigned int tun_metadata_unref(const struct tun_table *tab);
 struct tun_table *tun_metadata_alloc(const struct tun_table *old_map);
 void tun_metadata_free(struct tun_table *);
 void tun_metadata_postpone_free(struct tun_table *);
diff --git a/ofproto/ofproto-dpif-rid.c b/ofproto/ofproto-dpif-rid.c
index 29aafc2c0b40..d479e53d9b2d 100644
--- a/ofproto/ofproto-dpif-rid.c
+++ b/ofproto/ofproto-dpif-rid.c
@@ -201,6 +201,7 @@ static void
 frozen_state_clone(struct frozen_state *new, const struct frozen_state *old)
 {
     *new = *old;
+    tun_metadata_ref(old->metadata.tunnel.metadata.tab);
     new->stack = (new->stack_size
                   ? xmemdup(new->stack, new->stack_size)
                   : NULL);
@@ -218,10 +219,17 @@ frozen_state_clone(struct frozen_state *new, const struct frozen_state *old)
 static void
 frozen_state_free(struct frozen_state *state)
 {
+    struct tun_table *tab;
+
     free(state->stack);
     free(state->ofpacts);
     free(state->action_set);
     free(state->userdata);
+
+    tab = CONST_CAST(struct tun_table *, state->metadata.tunnel.metadata.tab);
+    if (tun_metadata_unref(tab) == 1) {
+        tun_metadata_free(tab);
+    }
 }
 
 /* Allocate a unique recirculation id for the given set of flow metadata.
diff --git a/ofproto/ofproto-dpif-
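The ownership rule this patch relies on can be sketched outside of OVS with C11 atomics. This is a minimal illustration, not the patch itself: struct table_sketch and the table_* functions are hypothetical names, plain free() stands in for the RCU-deferred free, and the table contents are reduced to a single field.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a table shared by several holders. */
struct table_sketch {
    atomic_uint ref_cnt;   /* Number of holders that may still use it. */
    int n_entries;
};

/* Allocates a table owned by one holder (the creator). */
static struct table_sketch *
table_alloc(void)
{
    struct table_sketch *t = malloc(sizeof *t);

    assert(t);
    atomic_init(&t->ref_cnt, 1);
    t->n_entries = 0;
    return t;
}

/* Called whenever another holder keeps a pointer to 't'. */
static void
table_ref(struct table_sketch *t)
{
    if (t) {
        atomic_fetch_add(&t->ref_cnt, 1);
    }
}

/* Drops one reference.  Frees 't' and returns true only when the caller
 * held the last reference, i.e. the count was 1 before the decrement. */
static bool
table_release(struct table_sketch *t)
{
    if (t && atomic_fetch_sub(&t->ref_cnt, 1) == 1) {
        free(t);
        return true;
    }
    return false;
}
```

The comparison against 1 mirrors the patch's "tun_metadata_unref(tab) == 1" checks: the value returned is the count before the decrement, so 1 means the caller was the last holder.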
Re: [ovs-dev] [PATCH 1/2] tun_metadata: Fix coredump caused by use-after-free bug
Thanks Tonghao.

tun_metadata_ref/unref follows the practice of ovs_refcount_ref/unref.
In this patch, we need the return value of tun_metadata_unref to decide
how to free it.

Thanks,
Yifeng

On Fri, Mar 27, 2020 at 2:45 AM Tonghao Zhang wrote:
>
> On Fri, Mar 27, 2020 at 3:58 AM Yifeng Sun wrote:
> >
> > Tun_metadata can be referenced by flow and frozen_state at the same
> > time. When ovs-vswitchd handles TLV table mod message, the involved
> > tun_metadata gets freed. The call trace to free tun_metadata is
> > shown as below:
> >
> > ofproto_run
> > - handle_openflow
> > - handle_single_part_openflow
> > - handle_tlv_table_mod
> > - tun_metadata_table_mod
> > - tun_metadata_postpone_free
> >
> > Unfortunately, this tun_metadata can be still used by some frozen_state,
> > and later on when frozen_state tries to access its tun_metadata table,
> > ovs-vswitchd crashes. The call trace to access tun_metadata from
> > frozen_state is shown as below:
> >
> > udpif_upcall_handler
> > - recv_upcalls
> > - process_upcall
> > - frozen_metadata_to_flow
> >
> > This patch fixes it by introducing a reference count to tun_metadata.
> > Whenever a pointer of tun_metadata is passed between flow and
> > frozen_state, we increase its reference count. Reference count
> > is decreased at deallocation.
> >
> > In present code, pointer of tun_metadata can be passed between flows.
> > It is safe because of RCU mechanism.
> > > > VMware-BZ: #2526222 > > Signed-off-by: Yifeng Sun > > --- > > lib/tun-metadata.c | 29 - > > lib/tun-metadata.h | 2 ++ > > ofproto/ofproto-dpif-rid.c | 8 > > ofproto/ofproto-dpif-rid.h | 2 ++ > > 4 files changed, 40 insertions(+), 1 deletion(-) > > > > diff --git a/lib/tun-metadata.c b/lib/tun-metadata.c > > index f8a0e19524e9..c4218a034a92 100644 > > --- a/lib/tun-metadata.c > > +++ b/lib/tun-metadata.c > > @@ -25,6 +25,7 @@ > > #include "nx-match.h" > > #include "odp-netlink.h" > > #include "openvswitch/ofp-match.h" > > +#include "ovs-atomic.h" > > #include "ovs-rcu.h" > > #include "packets.h" > > #include "tun-metadata.h" > > @@ -40,6 +41,11 @@ struct tun_meta_entry { > > /* Maps from TLV option class+type to positions in a struct tun_metadata's > > * 'opts' array. */ > > struct tun_table { > > + /* Struct tun_table can be referenced by struct frozen_state for a > > long > > + * time. This ref_cnt protects tun_table from being freed if it is > > still > > + * being used somewhere. */ > > +struct ovs_refcount ref_cnt; > > + > > /* TUN_METADATA is stored in element . */ > > struct tun_meta_entry entries[TUN_METADATA_NUM_OPTS]; > > > > @@ -79,6 +85,24 @@ tun_key_type(uint32_t key) > > return key & 0xff; > > } > > > > +void > > +tun_metadata_ref(const struct tun_table *tab) > > +{ > > +if (tab) { > > +ovs_refcount_ref(&CONST_CAST(struct tun_table *, tab)->ref_cnt); > > +} > > +} > > + > > +unsigned int > > +tun_metadata_unref(const struct tun_table *tab) > > +{ > > +if (tab) { > > +return ovs_refcount_unref_relaxed( > > +&CONST_CAST(struct tun_table *, tab)->ref_cnt); > In general, xxx_unref will free the struct data, for example, > stp_unref/netdev_unref/lldp_unref. > > > +} > > +return -1; > > +} > > + > > /* Returns a newly allocated tun_table. If 'old_map' is nonnull then the > > new > > * tun_table is a deep copy of the old one. 
*/ > > struct tun_table * > > @@ -111,6 +135,7 @@ tun_metadata_alloc(const struct tun_table *old_map) > > hmap_init(&new_map->key_hmap); > > } > > > > +ovs_refcount_init(&new_map->ref_cnt); > > return new_map; > > } > > > > @@ -135,7 +160,9 @@ tun_metadata_free(struct tun_table *map) > > void > > tun_metadata_postpone_free(struct tun_table *tab) > > { > > -ovsrcu_postpone(tun_metadata_free, tab); > > +if (tun_metadata_unref(tab) == 1) { > > +ovsrcu_postpone(tun_metadata_free, tab); > > +} > > } > > > > enum ofperr > > diff --git a/lib/tun-metadata.h b/lib/tun-metadata.h > > index 7dad9504b8da..933021a0f679 100644 &g
Re: [ovs-dev] [PATCH 2/2] system-traffic: Check frozen state handling with TLV map change
Thanks William for the comments. I will submit a new version. Yifeng On Mon, Apr 6, 2020 at 8:56 AM William Tu wrote: > On Mon, Apr 6, 2020 at 8:29 AM William Tu wrote: > > > > Hi Yifeng, > > > > Thanks for the patch, I can reproduce the issue using > > $ make check-system-userspace TESTSUITEFLAGS='-k resume' > > ASAN reports > > ==127707==ERROR: AddressSanitizer: heap-use-after-free on address > > 0x61f20690 at pc 0x0089cecf bp 0x7fff38f95690 sp > > 0x7fff38f95688 > > READ of size 4 at 0x61f20690 thread T0 > > #0 0x89cece in tun_metadata_get_fmd > /root/ovs/lib/tun-metadata.c:394:52 > > #1 0x66a3f9 in flow_get_metadata /root/ovs/lib/flow.c:1236:5 > > #2 0x58a9ca in process_upcall > > /root/ovs/ofproto/ofproto-dpif-upcall.c:1538:13 > > #3 0x57a723 in upcall_cb > /root/ovs/ofproto/ofproto-dpif-upcall.c:1311:13 > > > > However, applying your fix (patch 1/2) and run > > $ make check-system-userspace TESTSUITEFLAGS='-k resume' > > fix the crash but trigger other fail. > > > FYI, with your patch, the failed log shows: > +2020-04-06T15:43:08.716Z|1|dpif(revalidator7)|WARN|netdev@ovs-netdev: > failed to put[modify] (No such file or directory) > ufid:4c0cc511-5dfd-4afe-a43b-86889dcd3600 > > skb_priority(0/0),tunnel(tun_id=0x0,src=172.31.1.1,dst=172.31.1.100,ttl=64/0,tp_src=62880/0,tp_dst=6081/0,flags(-df-csum+key)),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(4),packet_type(ns=0,id=0),eth(src=ee:b9:ef:c8:ab:57/00:00:00:00:00:00,dst=33:33:00:00:00:16/00:00:00:00:00:00),eth_type(0x86dd),ipv6(src=::/::,dst=ff02::16/::,label=0/0,proto=58/0,tclass=0/0,hlimit=1/0,frag=no),icmpv6(type=143/0,code=0/0), > > actions:userspace(pid=0,controller(reason=7,dont_send=0,continuation=0,recirc_id=7,rule_cookie=0,controller_id=0,max_len=65535)) > +2020-04-06T15:43:08.716Z|2|dpif(revalidator7)|WARN|netdev@ovs-netdev: > failed to put[modify] (No such file or directory) > ufid:e7d78a37-7dd8-4641-a71d-1c57e4b47329 > > 
skb_priority(0/0),tunnel(tun_id=0x0,src=172.31.1.1,dst=172.31.1.100,ttl=64/0,tp_src=22243/0,tp_dst=6081/0,flags(-df-csum+key)),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(4),packet_type(ns=0,id=0),eth(src=ee:b9:ef:c8:ab:57/00:00:00:00:00:00,dst=f2:7d:a0:68:ae:4a/00:00:00:00:00:00),eth_type(0x0800),ipv4(src= > 10.1.1.1/0.0.0.0,dst=10.1.1.100/0.0.0.0,proto=1,tos=0/0,ttl=64/0,frag=no > ),icmp(type=8/0,code=0/0), > > actions:userspace(pid=0,controller(reason=1,dont_send=0,continuation=1,recirc_id=8,rule_cookie=0,controller_id=0,max_len=65535)) > ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH 1/2] tun_metadata: Fix coredump caused by use-after-free bug
Thanks William for the review.

We need to use ovsrcu_postpone(tun_metadata_free, tab) because tun_table is protected by RCU and we can only free tun_table after all threads have quiesced. It turns out that the fix based on a reference count is not clean: it doesn't fit well with the current RCU mechanism.

A previous commit appears to fix a similar issue: 254878c18874f6 (ofproto-dpif-xlate: Fix segmentation fault caused by tun_table)

It is unsafe for frozen_state to reference tun_table because tun_table is protected by RCU while the lifecycle of frozen_state can span several RCU quiescent states. As we discussed offline, we can simply nullify tun_table in frozen_state. If frozen_state needs tun_table, the latest valid tun_table can be found efficiently through ofproto_get_tun_tab().

I will submit a new version.

Yifeng

On Mon, Apr 6, 2020 at 8:47 AM William Tu wrote: > > On Thu, Mar 26, 2020 at 12:58:21PM -0700, Yifeng Sun wrote: > > Tun_metadata can be referenced by flow and frozen_state at the same > > time. When ovs-vswitchd handles TLV table mod message, the involved > > tun_metadata gets freed. The call trace to free tun_metadata is > > shown as below: > > > > ofproto_run > > - handle_openflow > > - handle_single_part_openflow > > - handle_tlv_table_mod > > - tun_metadata_table_mod > > - tun_metadata_postpone_free > > > > Unfortunately, this tun_metadata can be still used by some frozen_state, > > and later on when frozen_state tries to access its tun_metadata table, > > ovs-vswitchd crashes. The call trace to access tun_metadata from > > frozen_state is shown as below: > > > > udpif_upcall_handler > > - recv_upcalls > > - process_upcall > > - frozen_metadata_to_flow > > > > This patch fixes it by introducing a reference count to tun_metadata. > > Whenever a pointer of tun_metadata is passed between flow and > > frozen_state, we increase its reference count. Reference count > > is decreased at deallocation.
> > > > In present code, pointer of tun_metadata can be passed between flows. > > It is safe because of RCU mechanism. > > > > VMware-BZ: #2526222 > > Signed-off-by: Yifeng Sun > > --- > > lib/tun-metadata.c | 29 - > > lib/tun-metadata.h | 2 ++ > > ofproto/ofproto-dpif-rid.c | 8 > > ofproto/ofproto-dpif-rid.h | 2 ++ > > 4 files changed, 40 insertions(+), 1 deletion(-) > > > > diff --git a/lib/tun-metadata.c b/lib/tun-metadata.c > > index f8a0e19524e9..c4218a034a92 100644 > > --- a/lib/tun-metadata.c > > +++ b/lib/tun-metadata.c > > @@ -25,6 +25,7 @@ > > #include "nx-match.h" > > #include "odp-netlink.h" > > #include "openvswitch/ofp-match.h" > > +#include "ovs-atomic.h" > > #include "ovs-rcu.h" > > #include "packets.h" > > #include "tun-metadata.h" > > @@ -40,6 +41,11 @@ struct tun_meta_entry { > > /* Maps from TLV option class+type to positions in a struct tun_metadata's > > * 'opts' array. */ > > struct tun_table { > > + /* Struct tun_table can be referenced by struct frozen_state for a long > > + * time. This ref_cnt protects tun_table from being freed if it is still > > + * being used somewhere. */ > > +struct ovs_refcount ref_cnt; > > + > > /* TUN_METADATA is stored in element . */ > > struct tun_meta_entry entries[TUN_METADATA_NUM_OPTS]; > > > > @@ -79,6 +85,24 @@ tun_key_type(uint32_t key) > > return key & 0xff; > > } > > > > +void > > +tun_metadata_ref(const struct tun_table *tab) > > +{ > > +if (tab) { > > +ovs_refcount_ref(&CONST_CAST(struct tun_table *, tab)->ref_cnt); > > +} > > +} > > + > > +unsigned int > > +tun_metadata_unref(const struct tun_table *tab) > > +{ > > +if (tab) { > > +return ovs_refcount_unref_relaxed( > > +&CONST_CAST(struct tun_table *, tab)->ref_cnt); > > +} > > +return -1; > return -1 looks weird since it's unsigned int. > > > +} > > + > > /* Returns a newly allocated tun_table. If 'old_map' is nonnull then the new > > * tun_table is a deep copy of the old one. 
*/ > > struct tun_table * > > @@ -111,6 +135,7 @@ tun_metadata_alloc(const struct tun_table *old_map) > > hmap_init(&new_map->key_hmap); > > } > > > > +ovs_refcount_init(&new_map->ref_cnt); > &g
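On the review comment above about `return -1` in a function declared to return `unsigned int`: in C the value is converted modulo 2^N, so the caller sees `UINT_MAX` rather than a negative number. A minimal sketch of that behavior follows. `unref_or_sentinel()` is a hypothetical stand-in written for this note, not the function from the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for a function that returns the previous
 * reference count, or -1 when it is given a null pointer. */
static unsigned int
unref_or_sentinel(unsigned int *count)
{
    if (count) {
        return (*count)--;      /* Yield the old value, then decrement. */
    }
    return -1;                  /* Converted to UINT_MAX on return. */
}

/* The "-1" sentinel can only be tested for as UINT_MAX by callers. */
static void
check_sentinel(void)
{
    assert(unref_or_sentinel(NULL) == UINT_MAX);
}
```

A caller that writes `if (ret < 0)` would never take that branch, which is probably why the return value looked odd to the reviewer.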
[ovs-dev] [PATCH v2 1/2] tun_metadata: Fix coredump caused by use-after-free bug
Tun_metadata can be referenced by flow and frozen_state at the same time. When ovs-vswitchd handles a TLV table mod message, the involved tun_metadata gets freed. The call trace that frees tun_metadata is shown below:

ofproto_run
- handle_openflow
- handle_single_part_openflow
- handle_tlv_table_mod
- tun_metadata_table_mod
- tun_metadata_postpone_free

Unfortunately, this tun_metadata can still be used by some frozen_state, and later on, when frozen_state tries to access its tun_metadata table, ovs-vswitchd crashes. The call trace that accesses tun_metadata from frozen_state is shown below:

udpif_upcall_handler
- recv_upcalls
- process_upcall
- frozen_metadata_to_flow

It is unsafe for frozen_state to reference tun_table because tun_table is protected by RCU while the lifecycle of frozen_state can span several RCU quiescent states. The current code violates OVS's RCU protection mechanism.

This patch fixes it by simply stopping frozen_state from referencing tun_table. If frozen_state needs tun_table, we can find the latest valid tun_table efficiently through ofproto_get_tun_tab().

A previous commit appears to fix a similar issue: 254878c18874f6 (ofproto-dpif-xlate: Fix segmentation fault caused by tun_table)

VMware-BZ: #2526222
Signed-off-by: Yifeng Sun
---
v1->v2: Drop the fix based on reference count. It doesn't fit well with the RCU mechanism. Thanks William and YiHung for the offline discussion.

ofproto/ofproto-dpif-rid.h| 7 +++ ofproto/ofproto-dpif-upcall.c | 2 ++ 2 files changed, 9 insertions(+) diff --git a/ofproto/ofproto-dpif-rid.h b/ofproto/ofproto-dpif-rid.h index e5d02caf28a3..5235764a9885 100644 --- a/ofproto/ofproto-dpif-rid.h +++ b/ofproto/ofproto-dpif-rid.h @@ -115,6 +115,13 @@ frozen_metadata_from_flow(struct frozen_metadata *md, { memset(md, 0, sizeof *md); md->tunnel = flow->tunnel; +/* It is unsafe for frozen_state to reference tun_table because + * tun_table is protected by RCU while the lifecycle of frozen_state + * can span several RCU quiesce states.
+ * + * The latest valid tun_table can be found by ofproto_get_tun_tab() + * efficiently. */ +md->tunnel.metadata.tab = NULL; md->metadata = flow->metadata; memcpy(md->regs, flow->regs, sizeof md->regs); md->in_port = flow->in_port.ofp_port; diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c index 8dfa05b71df4..949cd4dbaf6f 100644 --- a/ofproto/ofproto-dpif-upcall.c +++ b/ofproto/ofproto-dpif-upcall.c @@ -1535,6 +1535,8 @@ process_upcall(struct udpif *udpif, struct upcall *upcall, } frozen_metadata_to_flow(&state->metadata, &frozen_flow); +frozen_flow.tunnel.metadata.tab = ofproto_get_tun_tab( +&upcall->ofproto->up); flow_get_metadata(&frozen_flow, &am->pin.up.base.flow_metadata); ofproto_dpif_send_async_msg(upcall->ofproto, am); -- 2.7.4 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
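The idea behind this patch can be sketched outside OVS as follows. This is only an illustration under stated assumptions, not OVS code: `struct table`, `struct tunnel_md`, `current_table`, `freeze()` and `thaw()` are hypothetical stand-ins for tun_table, the tunnel metadata, the value returned by ofproto_get_tun_tab(), and the frozen_metadata_from_flow()/frozen_metadata_to_flow() pair. A plain global stands in for the RCU-protected pointer, and thaw() performs the lookup itself for brevity, whereas this version of the patch does it in the caller.

```c
#include <assert.h>
#include <stddef.h>

/* All names here are hypothetical; they only mirror the shape of the fix. */
struct table {
    int version;
};

struct tunnel_md {
    const struct table *tab;    /* Borrowed; the owner may replace it. */
    int opt;
};

/* Stands in for the bridge's current, published table. */
static const struct table *current_table;

/* Freeze: copy the metadata but drop the borrowed table pointer,
 * because the snapshot may outlive the table it pointed to. */
static void
freeze(struct tunnel_md *snap, const struct tunnel_md *live)
{
    *snap = *live;
    snap->tab = NULL;
}

/* Thaw: copy the metadata back, then attach whichever table is
 * current at this moment. */
static void
thaw(struct tunnel_md *live, const struct tunnel_md *snap)
{
    *live = *snap;
    live->tab = current_table;
}

/* Walks through freeze, table replacement, and thaw; returns the
 * version of the table seen after thawing. */
static int
demo(void)
{
    struct table old_tab = { 1 }, new_tab = { 2 };
    struct tunnel_md live = { &old_tab, 42 }, snap, restored;

    current_table = &old_tab;
    freeze(&snap, &live);
    assert(snap.tab == NULL);
    assert(snap.opt == 42);

    current_table = &new_tab;   /* Table replaced; old one may be freed. */
    thaw(&restored, &snap);
    return restored.tab->version;
}
```

The snapshot never holds a pointer that can dangle, so the lifetime of the table no longer has to cover the lifetime of the frozen state.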
[ovs-dev] [PATCH v2 2/2] system-traffic: Check frozen state handling with TLV map change
This patch enhances a system traffic test to prevent regressions in tunnel metadata table (tun_table) handling with frozen state. Without a proper fix, this test can crash ovs-vswitchd due to a use-after-free bug on tun_table.

The bug is triggered by the following sequence of events:

- Add an OpenFlow rule to OVS that matches Geneve tunnel metadata and contains a controller action.
- When the first packet matches that OpenFlow rule, during the miss upcall, OVS stores a pointer to the tun_table (which decodes the Geneve tunnel metadata) in a frozen state and pushes a datapath flow down into the kernel datapath.
- Issue an add-tlv-map command to reprogram the tun_table in OVS. OVS frees the old tun_table and creates a new one.
- A subsequent packet hits the kernel datapath flow again. Since there is a controller action associated with that flow, it triggers a slow path controller upcall.
- In the slow path controller upcall, OVS derives the tun_table from the frozen state, which still points to the old tun_table that has already been freed by this point.
- To access the tunnel metadata, OVS uses the invalid pointer to the old tun_table and triggers the core dump.

Signed-off-by: Yi-Hung Wei
Signed-off-by: Yifeng Sun
Co-authored-by: Yi-Hung Wei
---
v1->v2: Improve the test based on William's review, thanks.

tests/system-traffic.at | 10 ++ 1 file changed, 10 insertions(+) diff --git a/tests/system-traffic.at b/tests/system-traffic.at index 4a39c929c207..3ed03d92b566 100644 --- a/tests/system-traffic.at +++ b/tests/system-traffic.at @@ -611,6 +611,16 @@ NS_CHECK_EXEC([at_ns0], [ping -q -c 3 10.1.1.100 | FORMAT_PING], [0], [dnl 3 packets transmitted, 3 received, 0% packet loss, time 0ms ]) +dnl Test that OVS handles TLV map modifications properly when restoring frozen state.
+NS_CHECK_EXEC([at_ns0], [ping 10.1.1.100 > /dev/null &]) + +AT_CHECK([ovs-ofctl add-tlv-map br0 "{class=0x,type=0x88,len=4}->tun_metadata1"]) +sleep 1 +AT_CHECK([ovs-ofctl add-tlv-map br0 "{class=0x,type=0x99,len=4}->tun_metadata2"]) +sleep 1 +AT_CHECK([ovs-ofctl add-tlv-map br0 "{class=0x,type=0xaa,len=4}->tun_metadata3"]) +sleep 1 + OVS_APP_EXIT_AND_WAIT([ovs-ofctl]) OVS_TRAFFIC_VSWITCHD_STOP AT_CLEANUP -- 2.7.4 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v2 1/2] tun_metadata: Fix coredump caused by use-after-free bug
Thanks Yi-Hung. I like this idea. The code will be cleaner. Ben, could you give some suggestion? Thanks, Yifeng On Tue, Apr 7, 2020 at 2:57 PM Yi-Hung Wei wrote: > On Tue, Apr 7, 2020 at 2:50 PM Yifeng Sun wrote: > > > > Tun_metadata can be referened by flow and frozen_state at the same > > time. When ovs-vswitchd handles TLV table mod message, the involved > > tun_metadata gets freed. The call trace to free tun_metadata is > > shown as below: > > > > ofproto_run > > - handle_openflow > > - handle_single_part_openflow > > - handle_tlv_table_mod > > - tun_metadata_table_mod > > - tun_metadata_postpone_free > > > > Unfortunately, this tun_metadata can be still used by some frozen_state, > > and later on when frozen_state tries to access its tun_metadata table, > > ovs-vswitchd crashes. The call trace to access tun_metadata from > > frozen_state is shown as below: > > > > udpif_upcall_handler > > - recv_upcalls > > - process_upcall > > - frozen_metadata_to_flow > > > > It is unsafe for frozen_state to reference tun_table because tun_table > > is protected by RCU while the lifecycle of frozen_state can span several > > RCU quiesce states. Current code violates OVS's RCU protection mechanism. > > > > This patch fixes it by simply stopping frozen_state from referencing > > tun_table. If frozen_state needs tun_table, we can find the latest valid > > tun_table through ofproto_get_tun_tab() efficiently. > > > > A previous commit seems fixing the samiliar issue: > > 254878c18874f6 (ofproto-dpif-xlate: Fix segmentation fault caused by > tun_table) > > > > VMware-BZ: #2526222 > > Signed-off-by: Yifeng Sun > > --- > > v1->v2: Drop the fix based on reference count. It doesn't fit well with > RCU > > mechanism. Thanks William and YiHung for the offline discussion. 
> > > > ofproto/ofproto-dpif-rid.h| 7 +++ > > ofproto/ofproto-dpif-upcall.c | 2 ++ > > 2 files changed, 9 insertions(+) > > > > diff --git a/ofproto/ofproto-dpif-rid.h b/ofproto/ofproto-dpif-rid.h > > index e5d02caf28a3..5235764a9885 100644 > > --- a/ofproto/ofproto-dpif-rid.h > > +++ b/ofproto/ofproto-dpif-rid.h > > @@ -115,6 +115,13 @@ frozen_metadata_from_flow(struct frozen_metadata > *md, > > { > > memset(md, 0, sizeof *md); > > md->tunnel = flow->tunnel; > > +/* It is unsafe for frozen_state to reference tun_table because > > + * tun_table is protected by RCU while the lifecycle of frozen_state > > + * can span several RCU quiesce states. > > + * > > + * The latest valid tun_table can be found by ofproto_get_tun_tab() > > + * efficiently. */ > > +md->tunnel.metadata.tab = NULL; > > md->metadata = flow->metadata; > > memcpy(md->regs, flow->regs, sizeof md->regs); > > md->in_port = flow->in_port.ofp_port; > > diff --git a/ofproto/ofproto-dpif-upcall.c > b/ofproto/ofproto-dpif-upcall.c > > index 8dfa05b71df4..949cd4dbaf6f 100644 > > --- a/ofproto/ofproto-dpif-upcall.c > > +++ b/ofproto/ofproto-dpif-upcall.c > > @@ -1535,6 +1535,8 @@ process_upcall(struct udpif *udpif, struct upcall > *upcall, > > } > > > > frozen_metadata_to_flow(&state->metadata, &frozen_flow); > > +frozen_flow.tunnel.metadata.tab = ofproto_get_tun_tab( > > +&upcall->ofproto->up); > > > Thanks for the fix. I wonder if it makes sense to move > ofproto_get_tun_tab() into frozen_metadata_to_flow()? Therefore, we > do not need to call ofproto_get_tun_tab() to reset the tun_table for > other frozen state use case. > > Thanks, > > -Yi-Hung > > > > flow_get_metadata(&frozen_flow, > &am->pin.up.base.flow_metadata); > > > > ofproto_dpif_send_async_msg(upcall->ofproto, am); > > -- > ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v2 1/2] tun_metadata: Fix coredump caused by use-after-free bug
ovsrcu_set() is not necessary here because frozen_flow is on stack. On Wed, Apr 8, 2020 at 6:48 AM William Tu wrote: > On Tue, Apr 07, 2020 at 02:57:03PM -0700, Yi-Hung Wei wrote: > > On Tue, Apr 7, 2020 at 2:50 PM Yifeng Sun > wrote: > > > > > > Tun_metadata can be referened by flow and frozen_state at the same > > > time. When ovs-vswitchd handles TLV table mod message, the involved > > > tun_metadata gets freed. The call trace to free tun_metadata is > > > shown as below: > > > > > > ofproto_run > > > - handle_openflow > > > - handle_single_part_openflow > > > - handle_tlv_table_mod > > > - tun_metadata_table_mod > > > - tun_metadata_postpone_free > > > > > > Unfortunately, this tun_metadata can be still used by some > frozen_state, > > > and later on when frozen_state tries to access its tun_metadata table, > > > ovs-vswitchd crashes. The call trace to access tun_metadata from > > > frozen_state is shown as below: > > > > > > udpif_upcall_handler > > > - recv_upcalls > > > - process_upcall > > > - frozen_metadata_to_flow > > > > > > It is unsafe for frozen_state to reference tun_table because tun_table > > > is protected by RCU while the lifecycle of frozen_state can span > several > > > RCU quiesce states. Current code violates OVS's RCU protection > mechanism. > > > > > > This patch fixes it by simply stopping frozen_state from referencing > > > tun_table. If frozen_state needs tun_table, we can find the latest > valid > > > tun_table through ofproto_get_tun_tab() efficiently. > > > > > > A previous commit seems fixing the samiliar issue: > > > 254878c18874f6 (ofproto-dpif-xlate: Fix segmentation fault caused by > tun_table) > > > > > > VMware-BZ: #2526222 > > > Signed-off-by: Yifeng Sun > > > --- > > > v1->v2: Drop the fix based on reference count. It doesn't fit well > with RCU > > > mechanism. Thanks William and YiHung for the offline discussion. 
> > > > > > ofproto/ofproto-dpif-rid.h| 7 +++ > > > ofproto/ofproto-dpif-upcall.c | 2 ++ > > > 2 files changed, 9 insertions(+) > > > > > > diff --git a/ofproto/ofproto-dpif-rid.h b/ofproto/ofproto-dpif-rid.h > > > index e5d02caf28a3..5235764a9885 100644 > > > --- a/ofproto/ofproto-dpif-rid.h > > > +++ b/ofproto/ofproto-dpif-rid.h > > > @@ -115,6 +115,13 @@ frozen_metadata_from_flow(struct frozen_metadata > *md, > > > { > > > memset(md, 0, sizeof *md); > > > md->tunnel = flow->tunnel; > > > +/* It is unsafe for frozen_state to reference tun_table because > > > + * tun_table is protected by RCU while the lifecycle of > frozen_state > > > + * can span several RCU quiesce states. > > > + * > > > + * The latest valid tun_table can be found by > ofproto_get_tun_tab() > > > + * efficiently. */ > > > +md->tunnel.metadata.tab = NULL; > > tun_table is RCU-protected, should we use ovsrcu_set? > > > > md->metadata = flow->metadata; > > > memcpy(md->regs, flow->regs, sizeof md->regs); > > > md->in_port = flow->in_port.ofp_port; > > > diff --git a/ofproto/ofproto-dpif-upcall.c > b/ofproto/ofproto-dpif-upcall.c > > > index 8dfa05b71df4..949cd4dbaf6f 100644 > > > --- a/ofproto/ofproto-dpif-upcall.c > > > +++ b/ofproto/ofproto-dpif-upcall.c > > > @@ -1535,6 +1535,8 @@ process_upcall(struct udpif *udpif, struct > upcall *upcall, > > > } > > > > > > frozen_metadata_to_flow(&state->metadata, &frozen_flow); > > > +frozen_flow.tunnel.metadata.tab = ofproto_get_tun_tab( > > > +&upcall->ofproto->up); > > > > > > Thanks for the fix. I wonder if it makes sense to move > > ofproto_get_tun_tab() into frozen_metadata_to_flow()? Therefore, we > > do not need to call ofproto_get_tun_tab() to reset the tun_table for > > other frozen state use case. 
> > > > Thanks, > > > > -Yi-Hung > > > > > > > flow_get_metadata(&frozen_flow, > &am->pin.up.base.flow_metadata); > > > > > > ofproto_dpif_send_async_msg(upcall->ofproto, am); > > > -- > > ___ > > dev mailing list > > d...@openvswitch.org > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v2 1/2] tun_metadata: Fix coredump caused by use-after-free bug
ofproto_get_tun_tab() calls ovsrcu_get(). This ensures that our read of tun_table stays valid until the thread next quiesces. On Wed, Apr 8, 2020 at 8:36 AM William Tu wrote: > On Wed, Apr 8, 2020 at 8:16 AM Yifeng Sun wrote: > > > > ovsrcu_set() is not necessary here because frozen_flow is on stack. > > > Ok, thanks. > ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH v3 2/2] system-traffic: Check frozen state handling with TLV map change
This patch enhances a system traffic test to prevent regressions in tunnel metadata table (tun_table) handling with frozen state. Without a proper fix, this test can crash ovs-vswitchd due to a use-after-free bug on tun_table.

The bug is triggered by the following sequence of events:

- Add an OpenFlow rule to OVS that matches Geneve tunnel metadata and contains a controller action.
- When the first packet matches that OpenFlow rule, during the miss upcall, OVS stores a pointer to the tun_table (which decodes the Geneve tunnel metadata) in a frozen state and pushes a datapath flow down into the kernel datapath.
- Issue an add-tlv-map command to reprogram the tun_table in OVS. OVS frees the old tun_table and creates a new one.
- A subsequent packet hits the kernel datapath flow again. Since there is a controller action associated with that flow, it triggers a slow path controller upcall.
- In the slow path controller upcall, OVS derives the tun_table from the frozen state, which still points to the old tun_table that has already been freed by this point.
- To access the tunnel metadata, OVS uses the invalid pointer to the old tun_table and triggers the core dump.

Signed-off-by: Yi-Hung Wei
Signed-off-by: Yifeng Sun
Co-authored-by: Yi-Hung Wei
---
v1->v2: Improve the test based on William's review, thanks.

tests/system-traffic.at | 10 ++ 1 file changed, 10 insertions(+) diff --git a/tests/system-traffic.at b/tests/system-traffic.at index 4a39c929c207..3ed03d92b566 100644 --- a/tests/system-traffic.at +++ b/tests/system-traffic.at @@ -611,6 +611,16 @@ NS_CHECK_EXEC([at_ns0], [ping -q -c 3 10.1.1.100 | FORMAT_PING], [0], [dnl 3 packets transmitted, 3 received, 0% packet loss, time 0ms ]) +dnl Test that OVS handles TLV map modifications properly when restoring frozen state.
+NS_CHECK_EXEC([at_ns0], [ping 10.1.1.100 > /dev/null &]) + +AT_CHECK([ovs-ofctl add-tlv-map br0 "{class=0x,type=0x88,len=4}->tun_metadata1"]) +sleep 1 +AT_CHECK([ovs-ofctl add-tlv-map br0 "{class=0x,type=0x99,len=4}->tun_metadata2"]) +sleep 1 +AT_CHECK([ovs-ofctl add-tlv-map br0 "{class=0x,type=0xaa,len=4}->tun_metadata3"]) +sleep 1 + OVS_APP_EXIT_AND_WAIT([ovs-ofctl]) OVS_TRAFFIC_VSWITCHD_STOP AT_CLEANUP -- 2.7.4 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH v3 1/2] tun_metadata: Fix coredump caused by use-after-free bug
Tun_metadata can be referenced by flow and frozen_state at the same time. When ovs-vswitchd handles a TLV table mod message, the involved tun_metadata gets freed. The call trace that frees tun_metadata is shown below:

ofproto_run
- handle_openflow
- handle_single_part_openflow
- handle_tlv_table_mod
- tun_metadata_table_mod
- tun_metadata_postpone_free

Unfortunately, this tun_metadata can still be used by some frozen_state, and later on, when frozen_state tries to access its tun_metadata table, ovs-vswitchd crashes. The call trace that accesses tun_metadata from frozen_state is shown below:

udpif_upcall_handler
- recv_upcalls
- process_upcall
- frozen_metadata_to_flow

It is unsafe for frozen_state to reference tun_table because tun_table is protected by RCU while the lifecycle of frozen_state can span several RCU quiescent states. The current code violates OVS's RCU protection mechanism.

This patch fixes it by simply stopping frozen_state from referencing tun_table. If frozen_state needs tun_table, the latest valid tun_table can be found efficiently through ofproto_get_tun_tab().

A previous commit appears to fix a similar issue: 254878c18874f6 (ofproto-dpif-xlate: Fix segmentation fault caused by tun_table)

VMware-BZ: #2526222
Signed-off-by: Yifeng Sun
---
v1->v2: Drop the fix based on reference count. It doesn't fit well with the RCU mechanism. Thanks William and YiHung for the offline discussion.
v2->v3: frozen_metadata_to_flow() looks up and assigns the flow's tun_table. Thanks to YiHung for the suggestion.
ofproto/ofproto-dpif-rid.h| 12 +++- ofproto/ofproto-dpif-upcall.c | 3 ++- ofproto/ofproto-dpif-xlate.c | 9 +++-- 3 files changed, 16 insertions(+), 8 deletions(-) diff --git a/ofproto/ofproto-dpif-rid.h b/ofproto/ofproto-dpif-rid.h index e5d02caf28a3..30cd5275f24c 100644 --- a/ofproto/ofproto-dpif-rid.h +++ b/ofproto/ofproto-dpif-rid.h @@ -22,6 +22,7 @@ #include "cmap.h" #include "ofproto-dpif-mirror.h" +#include "ofproto/ofproto-provider.h" #include "openvswitch/list.h" #include "openvswitch/ofp-actions.h" #include "ovs-thread.h" @@ -115,16 +116,25 @@ frozen_metadata_from_flow(struct frozen_metadata *md, { memset(md, 0, sizeof *md); md->tunnel = flow->tunnel; +/* It is unsafe for frozen_state to reference tun_table because + * tun_table is protected by RCU while the lifecycle of frozen_state + * can span several RCU quiesce states. + * + * The latest valid tun_table can be found by ofproto_get_tun_tab() + * efficiently. */ +md->tunnel.metadata.tab = NULL; md->metadata = flow->metadata; memcpy(md->regs, flow->regs, sizeof md->regs); md->in_port = flow->in_port.ofp_port; } static inline void -frozen_metadata_to_flow(const struct frozen_metadata *md, +frozen_metadata_to_flow(struct ofproto *ofproto, +const struct frozen_metadata *md, struct flow *flow) { flow->tunnel = md->tunnel; +flow->tunnel.metadata.tab = ofproto_get_tun_tab(ofproto); flow->metadata = md->metadata; memcpy(flow->regs, md->regs, sizeof flow->regs); flow->in_port.ofp_port = md->in_port; diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c index 8dfa05b71df4..5e08ef10dad6 100644 --- a/ofproto/ofproto-dpif-upcall.c +++ b/ofproto/ofproto-dpif-upcall.c @@ -1534,7 +1534,8 @@ process_upcall(struct udpif *udpif, struct upcall *upcall, flow_clear_conntrack(&frozen_flow); } -frozen_metadata_to_flow(&state->metadata, &frozen_flow); +frozen_metadata_to_flow(&upcall->ofproto->up, &state->metadata, +&frozen_flow); flow_get_metadata(&frozen_flow, &am->pin.up.base.flow_metadata); 
ofproto_dpif_send_async_msg(upcall->ofproto, am); diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c index 042c50a6346c..abce976c6c2f 100644 --- a/ofproto/ofproto-dpif-xlate.c +++ b/ofproto/ofproto-dpif-xlate.c @@ -7544,7 +7544,8 @@ xlate_actions(struct xlate_in *xin, struct xlate_out *xout) /* Restore pipeline metadata. May change flow's in_port and other * metadata to the values that existed when freezing was triggered. */ -frozen_metadata_to_flow(&state->metadata, flow); +frozen_metadata_to_flow(&ctx.xbridge->ofproto->up, +&state->metadata, flow); /* Restore stack, if any. */ if (state->stack) { @@ -7596,14 +7597,10 @@ xlate_actions(struct xlate_in *xin, struct xlate_out *xout) ctx.error = XLATE_INVALID_TUNNEL_METADATA; goto exit; } -} else if (!flow->tunnel.metadata.tab || xin->froze
Re: [ovs-dev] [PATCH] ofp-actions: Fix memory leak on error path.
Looks good to me, thanks. Reviewed-by: Yifeng Sun On Mon, Apr 13, 2020 at 8:43 AM William Tu wrote: > Need to free the memory before return. Detected by gcc10. > > Signed-off-by: William Tu > --- > lib/ofp-actions.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/lib/ofp-actions.c b/lib/ofp-actions.c > index ef8b2b4527f9..a94d1a7ca918 100644 > --- a/lib/ofp-actions.c > +++ b/lib/ofp-actions.c > @@ -5966,6 +5966,7 @@ parse_CLONE(char *arg, const struct > ofpact_parse_params *pp) > clone = pp->ofpacts->header; > > if (ofpbuf_oversized(pp->ofpacts)) { > +free(error); > return xasprintf("input too big"); > } > > -- > 2.7.4 > > ___ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
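The leak fixed above follows a common pattern: a function already owns a heap-allocated error string and then returns a different one without releasing the first. A minimal sketch of the corrected shape is shown here. `make_error()` and `finish_parse()` are hypothetical names invented for this note and are not part of lib/ofp-actions.c.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated copy of 'msg'. */
static char *
make_error(const char *msg)
{
    size_t n = strlen(msg) + 1;
    char *copy = malloc(n);

    assert(copy != NULL);
    memcpy(copy, msg, n);
    return copy;
}

/* Hypothetical parser step.  'inner_error' is an owned string (or NULL)
 * produced earlier; the caller owns whatever is returned. */
static char *
finish_parse(char *inner_error, int oversized)
{
    if (oversized) {
        free(inner_error);              /* Release before replacing it. */
        return make_error("input too big");
    }
    return inner_error;
}
```

Because free(NULL) is a no-op, the extra call is safe whether or not an earlier error was recorded.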
Re: [ovs-dev] [PATCHv2] fatal-signal: Remove snprintf.
Thanks, looks good to me. Tested-by: Yifeng Sun Reviewed-by: Yifeng Sun On Wed, Mar 25, 2020 at 11:51 AM William Tu wrote: > Function snprintf is not async-signal-safe. Replace it with > our own implementation. Example ovs-vswitchd.log output: > 2020-03-25T01:08:19.673Z|00050|memory|INFO|handlers:2 ports:3 > SIGSEGV detected, backtrace: > 0x4872d9 > 0x7f4e2ab974b0 > 0x7f4e2ac5d74d <__poll+0x2d> > 0x531098 > 0x51aefc > 0x445ca9 > 0x5056fd > 0x7f4e2b65f6ba > 0x7f4e2ac6941d > 0x0 <+0x0> > > Tested-at: > https://travis-ci.org/github/williamtu/ovs-travis/builds/666875084 > Signed-off-by: William Tu > --- > v2: > - avoid strcat overflow buffer, switch to use strncat > - some code refactor > > --- > lib/fatal-signal.c | 45 + > 1 file changed, 37 insertions(+), 8 deletions(-) > > diff --git a/lib/fatal-signal.c b/lib/fatal-signal.c > index 51cf628d994e..e033f1ec59ec 100644 > --- a/lib/fatal-signal.c > +++ b/lib/fatal-signal.c > @@ -158,6 +158,23 @@ fatal_signal_add_hook(void (*hook_cb)(void *aux), > void (*cancel_cb)(void *aux), > } > > #ifdef HAVE_UNWIND > +/* Convert unsigned long long to string. This is needed because > + * using snprintf() is not async signal safe. */ > +static inline int > +llong_to_hex_str(unsigned long long value, char *str) > +{ > +int i = 0, res; > + > +if (value / 16 > 0) { > +i = llong_to_hex_str(value / 16, str); > +} > + > +res = value % 16; > +str[i] = "0123456789abcdef"[res]; > + > +return i + 1; > +} > + > /* Send the backtrace buffer to monitor thread. > * > * Note that this runs in the signal handling context, any system > @@ -192,20 +209,32 @@ send_backtrace_to_monitor(void) { > dep * sizeof(struct unw_backtrace))); > } else { > /* Since there is no monitor daemon running, write backtrace > - * in current process. This is not asyn-signal-safe due to > - * use of snprintf(). > + * in current process. 
> */ > char str[] = "SIGSEGV detected, backtrace:\n"; > +char ip_str[16], offset_str[6]; > +char _line[64]; > +char *line = (char *)_line; > > vlog_direct_write_to_log_file_unsafe(str); > > for (int i = 0; i < dep; i++) { > -char line[64]; > - > -snprintf(line, 64, "0x%016"PRIxPTR" <%s+0x%"PRIxPTR">\n", > - unw_bt[i].ip, > - unw_bt[i].func, > - unw_bt[i].offset); > +memset(line, 0, sizeof _line); > +memset(ip_str, ' ', sizeof ip_str); > +memset(offset_str, 0, sizeof offset_str); > +ip_str[sizeof(ip_str) - 1] = 0; > +offset_str[sizeof(offset_str) - 1] = 0; > + > +llong_to_hex_str(unw_bt[i].ip, ip_str); > +llong_to_hex_str(unw_bt[i].offset, offset_str); > + > +strcat(line, "0x"); > +strcat(line, ip_str); > +strcat(line, "<"); > +strncat(line, unw_bt[i].func, UNW_MAX_FUNCN); > +strcat(line, "+0x"); > +strcat(line, offset_str); > +strcat(line, ">\n"); > vlog_direct_write_to_log_file_unsafe(line); > } > } > -- > 2.7.4 > > ___ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCHv3] fatal-signal: Remove snprintf.
Thanks for fixing gcc10, looks good to me. Tested-by: Yifeng Sun Reviewed-by: Yifeng Sun On Tue, Apr 14, 2020 at 8:17 AM William Tu wrote: > Function snprintf is not async-signal-safe. Replace it with > our own implementation. Example ovs-vswitchd.log output: > 2020-03-25T01:08:19.673Z|00050|memory|INFO|handlers:2 ports:3 > SIGSEGV detected, backtrace: > 0x4872d9 > 0x7f4e2ab974b0 > 0x7f4e2ac5d74d <__poll+0x2d> > 0x531098 > 0x51aefc > 0x445ca9 > 0x5056fd > 0x7f4e2b65f6ba > 0x7f4e2ac6941d > 0x0 <+0x0> > > Tested-at: > https://travis-ci.org/github/williamtu/ovs-travis/builds/674901331 > Signed-off-by: William Tu > --- > v3: > - use memcpy to avoid gcc10 warning below > lib/fatal-signal.c: In function ‘send_backtrace_to_monitor’: > lib/fatal-signal.c:234:13: warning: ‘__builtin_strncat’ output may be > truncated copying 32 bytes from a string of length 1535 > [-Wstringop-truncation] > 234 | strncat(line, unw_bt[i].func, UNW_MAX_FUNCN); > | ^~~ > --- > lib/fatal-signal.c | 45 + > 1 file changed, 37 insertions(+), 8 deletions(-) > > diff --git a/lib/fatal-signal.c b/lib/fatal-signal.c > index 51cf628d994e..bbb31ef27517 100644 > --- a/lib/fatal-signal.c > +++ b/lib/fatal-signal.c > @@ -158,6 +158,23 @@ fatal_signal_add_hook(void (*hook_cb)(void *aux), > void (*cancel_cb)(void *aux), > } > > #ifdef HAVE_UNWIND > +/* Convert unsigned long long to string. This is needed because > + * using snprintf() is not async signal safe. */ > +static inline int > +llong_to_hex_str(unsigned long long value, char *str) > +{ > +int i = 0, res; > + > +if (value / 16 > 0) { > +i = llong_to_hex_str(value / 16, str); > +} > + > +res = value % 16; > +str[i] = "0123456789abcdef"[res]; > + > +return i + 1; > +} > + > /* Send the backtrace buffer to monitor thread.
> * > * Note that this runs in the signal handling context, any system > @@ -192,20 +209,32 @@ send_backtrace_to_monitor(void) { > dep * sizeof(struct unw_backtrace))); > } else { > /* Since there is no monitor daemon running, write backtrace > - * in current process. This is not asyn-signal-safe due to > - * use of snprintf(). > + * in current process. > */ > char str[] = "SIGSEGV detected, backtrace:\n"; > +char ip_str[16], offset_str[6]; > +char line[64], fn_name[UNW_MAX_FUNCN]; > > vlog_direct_write_to_log_file_unsafe(str); > > for (int i = 0; i < dep; i++) { > -char line[64]; > - > -snprintf(line, 64, "0x%016"PRIxPTR" <%s+0x%"PRIxPTR">\n", > - unw_bt[i].ip, > - unw_bt[i].func, > - unw_bt[i].offset); > +memset(line, 0, sizeof line); > +memset(fn_name, 0, sizeof fn_name); > +memset(offset_str, 0, sizeof offset_str); > +memset(ip_str, ' ', sizeof ip_str); > +ip_str[sizeof(ip_str) - 1] = 0; > + > +llong_to_hex_str(unw_bt[i].ip, ip_str); > +llong_to_hex_str(unw_bt[i].offset, offset_str); > + > +strcat(line, "0x"); > +strcat(line, ip_str); > +strcat(line, "<"); > +memcpy(fn_name, unw_bt[i].func, UNW_MAX_FUNCN - 1); > +strcat(line, fn_name); > +strcat(line, "+0x"); > +strcat(line, offset_str); > +strcat(line, ">\n"); > vlog_direct_write_to_log_file_unsafe(line); > } > } > -- > 2.7.4 > > ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
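As far as the quoted diff shows, llong_to_hex_str() writes digits only and relies on the caller to pre-fill and size the destination buffer. One possible alternative, sketched here purely as an illustration, takes the buffer capacity explicitly and writes the terminator itself while still avoiding library calls. `u64_to_hex()` is a hypothetical name and is not part of lib/fatal-signal.c.

```c
#include <stddef.h>

/* Hypothetical bounded variant: writes the hexadecimal digits of 'value'
 * into 'buf' (capacity 'size' bytes) and NUL-terminates the result.
 * Returns the number of digits written, or 0 if 'buf' is too small.
 * Calls no library functions, so it does not depend on any of them
 * being async-signal-safe. */
static size_t
u64_to_hex(unsigned long long value, char *buf, size_t size)
{
    static const char digits[] = "0123456789abcdef";
    char tmp[sizeof value * 2];         /* Two hex digits per byte. */
    size_t n = 0;

    /* Collect digits from least significant to most significant. */
    do {
        tmp[n++] = digits[value % 16];
        value /= 16;
    } while (value);

    if (n + 1 > size) {
        return 0;                       /* No room for digits plus NUL. */
    }

    /* Copy them out in reverse so the most significant digit is first. */
    for (size_t i = 0; i < n; i++) {
        buf[i] = tmp[n - 1 - i];
    }
    buf[n] = '\0';
    return n;
}
```

With a capacity argument, a value that needs more digits than the buffer can hold is reported to the caller instead of being written past the end of the array.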
Re: [ovs-dev] [PATCH V2] compat: Fix broken partial backport of extack op parameter
LGTM, thanks Greg.

Reviewed-by: Yifeng Sun

On Tue, Apr 14, 2020 at 11:42 AM Greg Rose wrote:
> A series of commits added support for the extended ack
> parameter to the newlink, changelink and validate ops in
> the rtnl_link_ops structure:
> a8b8a889e369d ("net: add netlink_ext_ack argument to rtnl_link_ops.validate")
> 7a3f4a185169b ("net: add netlink_ext_ack argument to rtnl_link_ops.newlink")
> ad744b223c521 ("net: add netlink_ext_ack argument to rtnl_link_ops.changelink")
>
> These commits were all added at the same time and present since the
> Linux kernel 4.13 release. In our compatibility layer we have a
> define HAVE_EXT_ACK_IN_RTNL_LINKOPS that indicates the presence of
> the extended ack parameter for these three link operations.
>
> At least one distro has only backported two of the three patches,
> for newlink and changelink, while not backporting patch a8b8a889e369d
> for the validate op. Our compatibility layer code in acinclude.m4
> is able to find the presence of the extack within the rtnl_link_ops
> structure so it defines HAVE_EXT_ACK_IN_RTNL_LINKOPS but since the
> validate link op does not have the extack parameter the compilation
> fails on recent kernels for that particular distro. Other kernel
> distributions based upon this distro will presumably also encounter
> the compile errors.
>
> Introduce a new function in acinclude.m4 that will find function
> op definitions and then search for the required parameter. Then
> use this function to define HAVE_RTNLOP_VALIDATE_WITH_EXTACK so
> that we can detect and enable correct compilation on kernels
> which have not backported the entire set of patches. This function
> is generic to any function op - it need not be in a structure.
>
> In places where HAVE_EXT_ACK_IN_RTNL_LINKOPS wraps validate functions
> replace it with the new HAVE_RTNLOP_VALIDATE_WITH_EXTACK define.
>
> Passes Travis here:
> https://travis-ci.org/github/gvrose8192/ovs-experimental/builds/674599698
>
> Passes a kernel check-kmod test on several systems, including
> sles12 sp4 4.12.14-95.48-default kernel, without any regressions.
>
> VMWare-BZ: #2544032
>
> Signed-off-by: Greg Rose
>
> ---
> V2 - Fix comment for OVS_FIND_OP_PARAM_IFELSE function and don't
>      forget the VMWare BZ #
> ---
>  acinclude.m4                       | 34 ++
>  datapath/linux/compat/geneve.c     |  2 +-
>  datapath/linux/compat/ip6_gre.c    | 10 +-
>  datapath/linux/compat/ip6_tunnel.c |  2 +-
>  datapath/linux/compat/ip_gre.c     | 10 +-
>  datapath/linux/compat/lisp.c       |  2 +-
>  datapath/linux/compat/stt.c        |  2 +-
>  datapath/linux/compat/vxlan.c      |  2 +-
>  8 files changed, 49 insertions(+), 15 deletions(-)
>
> diff --git a/acinclude.m4 b/acinclude.m4
> index 02efea6..0901f28 100644
> --- a/acinclude.m4
> +++ b/acinclude.m4
> @@ -520,6 +520,37 @@ AC_DEFUN([OVS_FIND_PARAM_IFELSE], [
>    fi
>  ])
>
> +dnl OVS_FIND_OP_PARAM_IFELSE(FILE, OP, REGEX, [IF-MATCH], [IF-NO-MATCH])
> +dnl
> +dnl Looks for OP in FILE. If it is found, greps for REGEX within the
> +dnl OP definition. If this is successful, runs IF-MATCH, otherwise
> +dnl IF_NO_MATCH. If IF-MATCH is empty then it defines to
> +dnl OVS_DEFINE(HAVE_<OP>_WITH_<REGEX>), with <OP> and <REGEX>
> +dnl translated to uppercase.
> +AC_DEFUN([OVS_FIND_OP_PARAM_IFELSE], [
> +  AC_MSG_CHECKING([whether $2 has member $3 in $1])
> +  if test -f $1; then
> +    awk '/$2[[ \t\n]]*\)\(/,/;/' $1 2>/dev/null | grep '$3' >/dev/null
> +    status=$?
> +    case $status in
> +      0)
> +        AC_MSG_RESULT([yes])
> +        m4_if([$4], [], [OVS_DEFINE([HAVE_]m4_toupper([$2])[_WITH_]m4_toupper([$3]))], [$4])
> +        ;;
> +      1)
> +        AC_MSG_RESULT([no])
> +        $5
> +        ;;
> +      *)
> +        AC_MSG_ERROR([grep exited with status $status])
> +        ;;
> +    esac
> +  else
> +    AC_MSG_RESULT([file not found])
> +    $5
> +  fi
> +])
> +
>  dnl OVS_DEFINE(NAME)
>  dnl
>  dnl Defines NAME to 1 in kcompat.h.
> @@ -1056,6 +1087,9 @@ AC_DEFUN([OVS_CHECK_LINUX_COMPAT], [
>    OVS_GREP_IFELSE([$KSRC/include/net/netlink.h],
>                    [nla_parse_deprecated_strict],
>                    [OVS_DEFINE([HAVE_NLA_PARSE_DEPRECATED_STRICT])])
> +  OVS_FIND_OP_PARAM_IFELSE([$KSRC/include/net/rtnetlink.h],
> +                           [validate], [extack],
> +                           [OVS_DEFINE([HAVE_RTNLOP_VALIDATE_WITH_EXTACK])])
>
>    if cmp -s datapath/linux/kcompat.h.new \
>              datapath/linux/kcompat.h >/dev/null 2>&1; then
> diff --git a/datapath/linux/compat/geneve
[ovs-dev] [PATCH] rhel: Support RHEL8.0 build and packaging
This patch provides essential fixes for OVS to build and package
on RHEL8.0.

The required package python3-sphinx can be installed by:
$ ARCH=$( /bin/arch )
$ subscription-manager repos --enable "codeready-builder-for-rhel-8-${ARCH}-rpms"
$ yum install python3-sphinx

Signed-off-by: Yifeng Sun
---
 rhel/openvswitch-fedora.spec.in                       | 10 --
 rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh |  5 +
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/rhel/openvswitch-fedora.spec.in b/rhel/openvswitch-fedora.spec.in
index 7bc8c34b80af..02504f05f9b7 100644
--- a/rhel/openvswitch-fedora.spec.in
+++ b/rhel/openvswitch-fedora.spec.in
@@ -60,9 +60,15 @@ BuildRequires: autoconf automake libtool
 BuildRequires: systemd-units openssl openssl-devel
 BuildRequires: python3-devel
 BuildRequires: desktop-file-utils
-BuildRequires: groff graphviz
-BuildRequires: checkpolicy, selinux-policy-devel
+%if 0%{?rhel} >= 8
+BuildRequires: groff-base
+BuildRequires: python3-sphinx
+%else
+BuildRequires: groff
 BuildRequires: /usr/bin/sphinx-build-3
+%endif
+BuildRequires: graphviz
+BuildRequires: checkpolicy, selinux-policy-devel
 # make check dependencies
 BuildRequires: procps-ng
 %if %{with libcapng}
diff --git a/rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh b/rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh
index a9b5cdd817da..43dcc73fd3c5 100644
--- a/rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh
+++ b/rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh
@@ -75,6 +75,11 @@ IFS='.\|-' read mainline_major mainline_minor mainline_patch major_rev \
 # echo mainline_major=$mainline_major mainline_minor=$mainline_minor \
 # mainline_patch=$mainline_patch major_rev=$major_rev minor_rev=$minor_rev
 
+if [ "$mainline_major" = "4" ]; then
+    # Skip this script on rhel8
+    exit 0
+fi
+
 expected_rhel_base_minor="el7"
 if [ "$mainline_major" = "3" ] && [ "$mainline_minor" = "10" ]; then
     if [ "$major_rev" = "327" ]; then
--
2.7.4
Re: [ovs-dev] [PATCH] rhel: Fix dual kernel rpm install for RHEL 8.4
LGTM, thanks Greg.

Reviewed-by: Yifeng Sun

On Mon, Aug 23, 2021 at 9:33 AM Greg Rose wrote:
> RHEL 8.4 is the first of the RHEL 8.x kernels that has broken ABI so
> it requires the same sort of fix as we did for several RHEL 7.x kernels
> that needed two kernel rpms to work for all minor revisions of the
> baseline kernel module.
>
> Signed-off-by: Greg Rose
> ---
>  rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh | 8
>  1 file changed, 8 insertions(+)
>
> diff --git a/rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh b/rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh
> index 22bebaa58..01d31a216 100644
> --- a/rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh
> +++ b/rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh
> @@ -24,6 +24,7 @@
>  # - 3.10.0 major revision 1160 (RHEL 7.9)
>  # - 4.4.x, x >= 73 (SLES 12 SP3)
>  # - 4.12.x, x >= 14 (SLES 12 SP4).
> +# - 4.18.x major revision 305 (RHEL 8.4)
>  # It is packaged in the openvswitch kmod RPM and run in the post-install
>  # scripts.
>  #
> @@ -139,6 +140,13 @@ elif [ "$mainline_major" = "4" ] && [ "$mainline_minor" = "12" ]; then
>          ver_offset=2
>          installed_ver="$mainline_patch"
>      fi
> +elif [ "$mainline_major" = "4" ] && [ "$mainline_minor" = "18" ]; then
> +    if [ "$major_rev" = "305" ]; then
> +        echo "rhel84"
> +        comp_ver=9
> +        ver_offset=4
> +        installed_ver="$minor_rev"
> +    fi
>  fi
>
>  if [ X"$ver_offset" = X ]; then
> --
> 2.17.1
[ovs-dev] [PATCH] conntrack: Support packets/bytes stats
Userspace conntrack doesn't support conntrack stats for packets and
bytes. This patch implements it.

Signed-off-by: Yifeng Sun
---
 lib/conntrack-private.h       |  9 +
 lib/conntrack.c               | 28
 tests/system-common-macros.at |  2 +-
 tests/system-traffic.at       | 30 ++
 4 files changed, 68 insertions(+), 1 deletion(-)

diff --git a/lib/conntrack-private.h b/lib/conntrack-private.h
index dfdf4e676..7f21d3772 100644
--- a/lib/conntrack-private.h
+++ b/lib/conntrack-private.h
@@ -91,6 +91,11 @@ enum OVS_PACKED_ENUM ct_conn_type {
     CT_CONN_TYPE_UN_NAT,
 };
 
+struct conn_counter {
+    atomic_uint64_t packets;
+    atomic_uint64_t bytes;
+};
+
 struct conn {
     /* Immutable data. */
     struct conn_key key;
@@ -123,6 +128,10 @@ struct conn {
 
     enum ct_conn_type conn_type;
     uint32_t tp_id; /* Timeout policy ID. */
+
+    /* Counters. */
+    struct conn_counter counters_orig;
+    struct conn_counter counters_reply;
 };
 
 enum ct_update_res {
diff --git a/lib/conntrack.c b/lib/conntrack.c
index 33a1a9295..177154cd8 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -1245,6 +1245,21 @@ conn_update_state_alg(struct conntrack *ct, struct dp_packet *pkt,
     return false;
 }
 
+static void
+conn_update_counters(struct conn *conn,
+                     const struct dp_packet *pkt, bool reply)
+{
+    if (conn) {
+        struct conn_counter *counter = (reply
+                                       ? &conn->counters_reply
+                                       : &conn->counters_orig);
+        uint64_t old;
+
+        atomic_count_inc64(&counter->packets);
+        atomic_add(&counter->bytes, dp_packet_size(pkt), &old);
+    }
+}
+
 static void
 set_cached_conn(const struct nat_action_info_t *nat_action_info,
                 const struct conn_lookup_ctx *ctx, struct conn *conn,
@@ -1283,6 +1298,8 @@ process_one_fast(uint16_t zone, const uint32_t *setmark,
     if (setlabel) {
         set_label(pkt, conn, &setlabel[0], &setlabel[1]);
     }
+
+    conn_update_counters(conn, pkt, pkt->md.reply);
 }
 
 static void
@@ -1420,6 +1437,8 @@ process_one(struct conntrack *ct, struct dp_packet *pkt,
         set_label(pkt, conn, &setlabel[0], &setlabel[1]);
     }
 
+    conn_update_counters(conn, pkt, ctx->reply);
+
     handle_alg_ctl(ct, ctx, pkt, ct_alg_ctl, conn, now, !!nat_action_info);
 
     set_cached_conn(nat_action_info, ctx, conn, pkt);
@@ -2641,6 +2660,15 @@ conn_to_ct_dpif_entry(const struct conn *conn, struct ct_dpif_entry *entry,
     }
     ovs_mutex_unlock(&conn->lock);
 
+    entry->counters_orig.packets = atomic_count_get64(
+            (atomic_uint64_t *)&conn->counters_orig.packets);
+    entry->counters_orig.bytes = atomic_count_get64(
+            (atomic_uint64_t *)&conn->counters_orig.bytes);
+    entry->counters_reply.packets = atomic_count_get64(
+            (atomic_uint64_t *)&conn->counters_reply.packets);
+    entry->counters_reply.bytes = atomic_count_get64(
+            (atomic_uint64_t *)&conn->counters_reply.bytes);
+
     entry->timeout = (expiration > 0) ? expiration / 1000 : 0;
 
     if (conn->alg) {
diff --git a/tests/system-common-macros.at b/tests/system-common-macros.at
index 19a0b125b..89cd7b83c 100644
--- a/tests/system-common-macros.at
+++ b/tests/system-common-macros.at
@@ -240,7 +240,7 @@ m4_define([STRIP_MONITOR_CSUM], [grep "csum:" | sed 's/csum:.*/csum: /'])
 # and limit the output to the rows containing 'ip-addr'.
 #
 m4_define([FORMAT_CT],
-[[grep "dst=$1" | sed -e 's/port=[0-9]*/port=/g' -e 's/id=[0-9]*/id=/g' -e 's/state=[0-9_A-Z]*/state=/g' | sort | uniq]])
+[[grep "dst=$1" | sed -e 's/port=[0-9]*/port=/g' -e 's/id=[0-9]*/id=/g' -e 's/state=[0-9_A-Z]*/state=/g' -e 's/timeout=[0-9]*/timeout=/g' | sort | uniq]])
 
 # NETNS_DAEMONIZE([namespace], [command], [pidfile])
 #
diff --git a/tests/system-traffic.at b/tests/system-traffic.at
index f22d86e46..15b2c288c 100644
--- a/tests/system-traffic.at
+++ b/tests/system-traffic.at
@@ -6743,6 +6743,36 @@ AT_CHECK([ovs-ofctl dump-flows br0 | grep table=2, | OFPROTO_CLEAR_DURATION_IDLE
 OVS_TRAFFIC_VSWITCHD_STOP
 AT_CLEANUP
 
+AT_SETUP([conntrack - stats])
+CHECK_CONNTRACK()
+OVS_TRAFFIC_VSWITCHD_START()
+
+ADD_NAMESPACES(at_ns0, at_ns1)
+
+ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24")
+ADD_VETH(p1, at_ns1, br0, "10.1.1.2/24")
+
+AT_DATA([flows.txt], [dnl
+priority=1,action=drop
+priority=10,arp,action=normal
+priority=100,in_port=1,icmp,action=ct(commit),2
+priority=100,in_port=2,ct_state=-trk,icmp,action=ct(table=0)
+priority=100,in_port=2,ct_state=+trk+est-new,icmp,action=1
+])
+
+AT_CHECK([ovs-ofctl --bundle add-flows br0 flows.txt])
+
+NS_CHECK_EXEC([at_ns0], [
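The per-direction packet and byte counters that this patch adds can be sketched in portable C11 roughly as follows. This is an illustration only, not the patch's code: it uses <stdatomic.h> rather than the OVS atomic wrappers, and the names dir_counter, flow_stats, flow_stats_init(), flow_stats_update() and counter_read() are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the patch's 'struct conn_counter'. */
struct dir_counter {
    _Atomic uint64_t packets;
    _Atomic uint64_t bytes;
};

/* One counter pair per direction, as in 'counters_orig'/'counters_reply'. */
struct flow_stats {
    struct dir_counter orig;
    struct dir_counter reply;
};

static void
flow_stats_init(struct flow_stats *s)
{
    atomic_init(&s->orig.packets, 0);
    atomic_init(&s->orig.bytes, 0);
    atomic_init(&s->reply.packets, 0);
    atomic_init(&s->reply.bytes, 0);
}

/* Accounts one packet of 'pkt_len' bytes to the original or reply
 * direction.  Relaxed ordering is assumed to be enough here because the
 * counters are statistics and do not order any other memory accesses. */
static void
flow_stats_update(struct flow_stats *s, size_t pkt_len, bool reply)
{
    struct dir_counter *c = reply ? &s->reply : &s->orig;

    atomic_fetch_add_explicit(&c->packets, 1, memory_order_relaxed);
    atomic_fetch_add_explicit(&c->bytes, (uint64_t) pkt_len,
                              memory_order_relaxed);
}

/* Reads one counter without taking any lock. */
static uint64_t
counter_read(_Atomic uint64_t *v)
{
    return atomic_load_explicit(v, memory_order_relaxed);
}
```

Keeping the counters atomic lets the fast path update them without holding the connection lock, at the cost that a reader may see the packet and byte counts of one direction from slightly different instants.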
Re: [ovs-dev] [PATCH] ofproto-dpif-xlate: Do not use zero-weight buckets in select groups.
Hi Ben,

group_first_live_bucket() can still return a zero-weighted bucket, and
that bucket will then be used via pick_ff_group() and
xlate_group_action__(). I am wondering whether this is an issue.

Thanks,
Yifeng

On Fri, Jun 7, 2019 at 12:04 PM 0-day Robot wrote:
>
> Bleep bloop. Greetings Ben Pfaff, I am a robot and I have tried out your patch.
> Thanks for your contribution.
>
> I encountered some error that I wasn't expecting. See the details below.
>
>
> checkpatch:
> WARNING: Line is 88 characters long (recommended limit is 79)
> #23 FILE: ofproto/ofproto-dpif-xlate.c:1:
> /* Copyright (c) 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2019 Nicira, Inc.
>
> Lines checked: 47, Warnings: 1, Errors: 0
>
>
> Please check this out. If you feel there has been an error, please email acon...@bytheb.org
>
> Thanks,
> 0-day Robot
Re: [ovs-dev] [PATCH] ofproto-dpif-xlate: Report DHCP output actions in trace.
Looks good to me, thanks.

Reviewed-by: Yifeng Sun

On Fri, Jun 7, 2019 at 1:03 PM 0-day Robot wrote:
>
> Bleep bloop. Greetings Ben Pfaff, I am a robot and I have tried out your patch.
> Thanks for your contribution.
>
> I encountered some error that I wasn't expecting. See the details below.
>
>
> checkpatch:
> WARNING: Line is 88 characters long (recommended limit is 79)
> #17 FILE: ofproto/ofproto-dpif-xlate.c:1:
> /* Copyright (c) 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2019 Nicira, Inc.
>
> Lines checked: 32, Warnings: 1, Errors: 0
>
>
> Please check this out. If you feel there has been an error, please email acon...@bytheb.org
>
> Thanks,
> 0-day Robot