Re: [ovs-discuss] Core dumps generated when running ovs tests in parallel

2019-03-15 Thread Ben Pfaff
On Sat, Mar 16, 2019 at 09:56:49AM +0530, Numan Siddique wrote:
> On Sat, Mar 16, 2019, 2:28 AM Ben Pfaff  wrote:
> 
> > On Sat, Mar 16, 2019 at 12:28:31AM +0530, Numan Siddique wrote:
> > > On my Fedora 29, whenever I run all the ovs tests with "-j5", I see a few
> > > core dumps generated for ovsdb-server and python2.
> >
> > There are a couple of tests that intentionally do "kill -SEGV", to test
> > that things work properly in that case.  I suspect you're seeing those.
> 
> Thanks Ben. Sorry for my ignorance.

It surprises everyone at least once.  If you can think of a way to make
it less surprising, let us know--it would be helpful.
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] Core dumps generated when running ovs tests in parallel

2019-03-15 Thread Numan Siddique
On Sat, Mar 16, 2019, 2:28 AM Ben Pfaff  wrote:

> On Sat, Mar 16, 2019 at 12:28:31AM +0530, Numan Siddique wrote:
> > On my Fedora 29, whenever I run all the ovs tests with "-j5", I see a few
> > core dumps generated for ovsdb-server and python2.
>
> There are a couple of tests that intentionally do "kill -SEGV", to test
> that things work properly in that case.  I suspect you're seeing those.
>

Thanks Ben. Sorry for my ignorance.

Numan


Re: [ovs-discuss] Reply: ping6 delay shaking with nd_target

2019-03-15 Thread Ben Pfaff
OK.

We haven't characterized OVS latency in such fine detail.  I encourage
you to investigate, but I will not look into this myself.

On Sat, Mar 16, 2019 at 02:48:37AM +, sunquanying wrote:
> "Delay shaking" means that the ping6 delay is between 0.05 ms and 0.1 ms most
> of the time. After a few minutes, the ping6 delay suddenly rises above 0.3 ms
> and, after one or two packets, returns to between 0.05 ms and 0.1 ms. The
> interval between these sudden increases follows no regular pattern.
> 
> -Original Message-
> From: Ben Pfaff [mailto:b...@ovn.org]
> Sent: March 16, 2019 7:21
> To: sunquanying
> Cc: ovs-discuss@openvswitch.org; ja...@ovn.org; jpet...@ovn.org; gaoxiaoqiu;
> Zhoujingbin (Robin, Cloud Networking); chenchanghu
> Subject: Re: [ovs-discuss] ping6 delay shaking with nd_target
> 
> What is "delay shaking"?
> 
> On Tue, Mar 12, 2019 at 12:52:25PM +, sunquanying wrote:
> > Hello:
> > 
> > We have a problem with occasional delay shaking on IPv6 when we add the
> > following flows:
> > 
> > ovs-ofctl add-flow ply1-1-0 "table=0,priority=2,in_port=3,icmp6,dl_src=fa:16:3e:01:01:40,icmp_type=136,nd_target=2002:1::a84c:b41d:3e1:140 actions=output:4"
> > ovs-ofctl add-flow ply1-1-0 "table=0,priority=2,in_port=3,dl_src=fa:16:3e:01:01:40,ipv6 actions=output:4"
> > ovs-ofctl add-flow ply1-1-0 "table=0,priority=10,ipv6,in_port=4 actions=output:3"
> > 
> > To make the problem reproducible, we forced "need_revalidate = true" in
> > revalidate_ukey() as follows:
> > static enum reval_result revalidate_ukey( .. ) {
> >     ..
> >     need_revalidate = true;
> >     if (need_revalidate) {
> >         ..
> >     }
> > }
> > 
> > With "need_revalidate = true" in place, we ran the following tests:
> > 
> > 1.   Add flows (packets only hit the second and third flows and never
> > hit the first one):
> > 
> > ovs-ofctl add-flow ply1-1-0 "table=0,priority=2,in_port=3,icmp6,dl_src=fa:16:3e:01:01:40,icmp_type=136,nd_target=2002:1::a84c:b41d:3e1:140 actions=output:4"
> > ovs-ofctl add-flow ply1-1-0 "table=0,priority=2,in_port=3,dl_src=fa:16:3e:01:01:40,ipv6 actions=output:4"
> > ovs-ofctl add-flow ply1-1-0 "table=0,priority=10,ipv6,in_port=4 actions=output:3"
> > 
> > average ping6 delay > 0.3ms
> > 
> > 2.   Modify dl_src of the first flow in experiment 1:
> > 
> > ovs-ofctl add-flow ply1-1-0 "table=0,priority=2,in_port=3,icmp6,dl_src=fa:16:3e:11:11:41,icmp_type=136,nd_target=2002:1::a84c:b41d:3e1:140 actions=output:4"
> > ovs-ofctl add-flow ply1-1-0 "table=0,priority=2,in_port=3,dl_src=fa:16:3e:01:01:40,ipv6 actions=output:4"
> > ovs-ofctl add-flow ply1-1-0 "table=0,priority=10,ipv6,in_port=4 actions=output:3"
> > 
> > average ping6 delay < 0.1ms
> > 
> > 3.   Delete the first flow in experiment 1:
> > 
> > ovs-ofctl add-flow ply1-1-0 "table=0,priority=2,in_port=3,dl_src=fa:16:3e:01:01:40,ipv6 actions=output:4"
> > ovs-ofctl add-flow ply1-1-0 "table=0,priority=10,ipv6,in_port=4 actions=output:3"
> > 
> > average ping6 delay < 0.1ms
> > 
> > It seems that the nd_target mask is set to 0x during the upcall in
> > flow_wildcards_fold_minimask_in_map(), but changes to 0x when the
> > dpcls_rule is generated, which leads to this behavior.
> > TCP and UDP packets show the same results as ICMPv6 ping6 packets when
> > the first flow in experiment 1 is added.
> > 
> > Could you please give us some suggestions on how to solve this problem?
> > Will you fix it?
> > 
> > Thank you.
> > 
> > # ovs-ofctl --version
> > ovs-ofctl (Open vSwitch) 2.7.3
> > OpenFlow versions 0x1:0x4
> > Release version: R5.RC6.018
> > 
> > #  uname -a
> > Linux linux-xWqfOT 3.10.0-862.14.1.6_8.x86_64 #1 SMP Thu Dec 20 
> > 00:00:00 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
> > 
> > # ovs-ofctl show ply1-1-0
> > OFPT_FEATURES_REPLY (xid=0x2): dpid:4a6cdb36e948 n_tables:254, 
> > n_buffers:0
> > capabilities: FLOW_STATS TABLE_STATS PORT_STATS QUEUE_STATS 
> > ARP_MATCH_IP
> > actions: output enqueue ext_action set_vlan_vid set_vlan_pcp 
> > strip_vlan mod_dl_src mod_dl_dst mod_nw_src mod_nw_dst mod_nw_tos 
> > mod_tp_src mod_tp_dst
> > 3(tap1-1-0): addr:d2:90:f6:f9:e6:21
> >  config: 0
> >  state:  0
> >  speed: 0 Mbps now, 0 Mbps max
> > 4(pvi1-1-0): addr:ba:2d:6c:d6:b7:ce
> >  config: 0
> >  state:  0
> >  speed: 0 Mbps now, 0 Mbps max
> > LOCAL(ply1-1-0): addr:4a:6c:db:36:e9:48
> >  config: 0
> >  state:  0
> >  current:10MB-FD COPPER
> >  speed: 10 Mbps now, 0 Mbps max
> > OFPT_GET_CONFIG_REPLY (xid=0x4): frags=normal miss_send_len=0
> > 
> > # ovs-vsctl show
> > Bridge "ply1-1-0"
> > fail_mode: secure
> > Port "ply1-1-0"
> > Interface "ply1-1-0"
> > type: internal
> > Port "pvi1-1-0"
> > tag: 4093
> > Interface "pvi1-1-0"
> >  

[ovs-discuss] Reply: ping6 delay shaking with nd_target

2019-03-15 Thread sunquanying
"Delay shaking" means that the ping6 delay is between 0.05 ms and 0.1 ms most of
the time. After a few minutes, the ping6 delay suddenly rises above 0.3 ms and,
after one or two packets, returns to between 0.05 ms and 0.1 ms. The interval
between these sudden increases follows no regular pattern.
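
The description above can be made concrete with a small sketch that flags the spikes in a series of ping6 round-trip times and measures the gaps between them. The sample values, the function names and the 0.3 ms threshold parameter are illustrative assumptions and not measurements from this report.

```python
def find_spikes(rtts_ms, threshold_ms=0.3):
    """Return (index, rtt) pairs for samples above the spike threshold."""
    return [(i, rtt) for i, rtt in enumerate(rtts_ms) if rtt > threshold_ms]


def spike_gaps(spikes):
    """Return the distance, in samples, between consecutive spikes."""
    indexes = [i for i, _ in spikes]
    return [later - earlier for earlier, later in zip(indexes, indexes[1:])]


# Hypothetical RTT samples in milliseconds (made up for illustration).
samples = [0.06, 0.08, 0.07, 0.35, 0.31, 0.06, 0.09, 0.05, 0.07, 0.42, 0.08]

spikes = find_spikes(samples)
gaps = spike_gaps(spikes)
print("spikes:", spikes)
print("gaps between spikes (in samples):", gaps)
```

Irregular gaps in real data would match the "no regular pattern" observation; evenly spaced gaps would instead point at a periodic cause.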

-Original Message-
From: Ben Pfaff [mailto:b...@ovn.org]
Sent: March 16, 2019 7:21
To: sunquanying
Cc: ovs-discuss@openvswitch.org; ja...@ovn.org; jpet...@ovn.org; gaoxiaoqiu;
Zhoujingbin (Robin, Cloud Networking); chenchanghu
Subject: Re: [ovs-discuss] ping6 delay shaking with nd_target

What is "delay shaking"?

On Tue, Mar 12, 2019 at 12:52:25PM +, sunquanying wrote:
> Hello:
> 
> We have a problem with occasional delay shaking on IPv6 when we add the
> following flows:
> 
> ovs-ofctl add-flow ply1-1-0 "table=0,priority=2,in_port=3,icmp6,dl_src=fa:16:3e:01:01:40,icmp_type=136,nd_target=2002:1::a84c:b41d:3e1:140 actions=output:4"
> ovs-ofctl add-flow ply1-1-0 "table=0,priority=2,in_port=3,dl_src=fa:16:3e:01:01:40,ipv6 actions=output:4"
> ovs-ofctl add-flow ply1-1-0 "table=0,priority=10,ipv6,in_port=4 actions=output:3"
> 
> To make the problem reproducible, we forced "need_revalidate = true" in
> revalidate_ukey() as follows:
> static enum reval_result revalidate_ukey( .. ) {
>     ..
>     need_revalidate = true;
>     if (need_revalidate) {
>         ..
>     }
> }
> 
> With "need_revalidate = true" in place, we ran the following tests:
> 
> 1.   Add flows (packets only hit the second and third flows and never
> hit the first one):
> 
> ovs-ofctl add-flow ply1-1-0 "table=0,priority=2,in_port=3,icmp6,dl_src=fa:16:3e:01:01:40,icmp_type=136,nd_target=2002:1::a84c:b41d:3e1:140 actions=output:4"
> ovs-ofctl add-flow ply1-1-0 "table=0,priority=2,in_port=3,dl_src=fa:16:3e:01:01:40,ipv6 actions=output:4"
> ovs-ofctl add-flow ply1-1-0 "table=0,priority=10,ipv6,in_port=4 actions=output:3"
> 
> average ping6 delay > 0.3ms
> 
> 2.   Modify dl_src of the first flow in experiment 1:
> 
> ovs-ofctl add-flow ply1-1-0 "table=0,priority=2,in_port=3,icmp6,dl_src=fa:16:3e:11:11:41,icmp_type=136,nd_target=2002:1::a84c:b41d:3e1:140 actions=output:4"
> ovs-ofctl add-flow ply1-1-0 "table=0,priority=2,in_port=3,dl_src=fa:16:3e:01:01:40,ipv6 actions=output:4"
> ovs-ofctl add-flow ply1-1-0 "table=0,priority=10,ipv6,in_port=4 actions=output:3"
> 
> average ping6 delay < 0.1ms
> 
> 3.   Delete the first flow in experiment 1:
> 
> ovs-ofctl add-flow ply1-1-0 "table=0,priority=2,in_port=3,dl_src=fa:16:3e:01:01:40,ipv6 actions=output:4"
> ovs-ofctl add-flow ply1-1-0 "table=0,priority=10,ipv6,in_port=4 actions=output:3"
> 
> average ping6 delay < 0.1ms
> 
> It seems that the nd_target mask is set to 0x during the upcall in
> flow_wildcards_fold_minimask_in_map(), but changes to 0x when the
> dpcls_rule is generated, which leads to this behavior.
> TCP and UDP packets show the same results as ICMPv6 ping6 packets when
> the first flow in experiment 1 is added.
> 
> Could you please give us some suggestions on how to solve this problem?
> Will you fix it?
> 
> Thank you.
> 
> # ovs-ofctl --version
> ovs-ofctl (Open vSwitch) 2.7.3
> OpenFlow versions 0x1:0x4
> Release version: R5.RC6.018
> 
> #  uname -a
> Linux linux-xWqfOT 3.10.0-862.14.1.6_8.x86_64 #1 SMP Thu Dec 20 
> 00:00:00 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
> 
> # ovs-ofctl show ply1-1-0
> OFPT_FEATURES_REPLY (xid=0x2): dpid:4a6cdb36e948 n_tables:254, 
> n_buffers:0
> capabilities: FLOW_STATS TABLE_STATS PORT_STATS QUEUE_STATS 
> ARP_MATCH_IP
> actions: output enqueue ext_action set_vlan_vid set_vlan_pcp 
> strip_vlan mod_dl_src mod_dl_dst mod_nw_src mod_nw_dst mod_nw_tos 
> mod_tp_src mod_tp_dst
> 3(tap1-1-0): addr:d2:90:f6:f9:e6:21
>  config: 0
>  state:  0
>  speed: 0 Mbps now, 0 Mbps max
> 4(pvi1-1-0): addr:ba:2d:6c:d6:b7:ce
>  config: 0
>  state:  0
>  speed: 0 Mbps now, 0 Mbps max
> LOCAL(ply1-1-0): addr:4a:6c:db:36:e9:48
>  config: 0
>  state:  0
>  current:10MB-FD COPPER
>  speed: 10 Mbps now, 0 Mbps max
> OFPT_GET_CONFIG_REPLY (xid=0x4): frags=normal miss_send_len=0
> 
> # ovs-vsctl show
> Bridge "ply1-1-0"
> fail_mode: secure
> Port "ply1-1-0"
> Interface "ply1-1-0"
> type: internal
> Port "pvi1-1-0"
> tag: 4093
> Interface "pvi1-1-0"
> type: patch
> options: {peer="pvo1-1-0"}
> Port "tap1-1-0"
> tag: 4092
> Interface "tap1-1-0"
> type: virtio
> 
> 


Re: [ovs-discuss] openvswitch-2.11.0 single MPLS_POP

2019-03-15 Thread Ben Pfaff
You should not need to match on both labels to pop a single label.

If you use ofproto/trace for this packet (see ovs-vswitchd(8)), what
does it say?
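
As background for the question quoted below, here is a minimal sketch of how a two-label MPLS stack is laid out on the wire (20-bit label, 3-bit traffic class, 1-bit bottom-of-stack flag, 8-bit TTL per entry). It uses the label values from the quoted script (164 and 182); the helper names are hypothetical, and the sketch only illustrates the encoding. It does not diagnose the "invalid Ethertype 0" error.

```python
import struct

MPLS_ETHERTYPE = 0x8847
IPV4_ETHERTYPE = 0x0800


def encode_mpls(label, tc=0, bos=0, ttl=255):
    """Pack one 32-bit MPLS label stack entry in network byte order."""
    word = (label << 12) | (tc << 9) | (bos << 8) | ttl
    return struct.pack("!I", word)


def decode_mpls(data):
    """Unpack the first MPLS label stack entry from data."""
    (word,) = struct.unpack("!I", data[:4])
    return {
        "label": word >> 12,
        "tc": (word >> 9) & 0x7,
        "bos": (word >> 8) & 0x1,
        "ttl": word & 0xFF,
    }


def ethertype_after_pop(popped, payload_ethertype=IPV4_ETHERTYPE):
    """EtherType of what remains once the given entry has been popped."""
    # If the popped entry was not bottom-of-stack, another label follows.
    return payload_ethertype if popped["bos"] else MPLS_ETHERTYPE


stack = encode_mpls(164, bos=0) + encode_mpls(182, bos=1)
outer = decode_mpls(stack)
inner = decode_mpls(stack[4:])
print(outer, inner, hex(ethertype_after_pop(outer)))
```

Because the outer entry has its bottom-of-stack bit clear, what remains after one pop is still MPLS, which is consistent with the flow's pop_mpls:0x8847 action.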

On Tue, Mar 12, 2019 at 08:34:38AM -0400, Thomas Crowley wrote:
> I am attempting to pop the outer MPLS label on a packet that has two MPLS
> labels and send the result to a patch port that then sends the packet back
> to the same bridge.  I am using a combination of ONOS and scapy(see bottom
> of email for script) to do my test.  I do not get the packet back to my
> bridge via the patch port.  I do get the following error:
> 
> 2019-03-12T12:00:51.460Z|00019|odp_util(handler7)|ERR|invalid Ethertype 0
> in flow key
> 
> when I dump the flow OVS tells me that it is:
> 
> cookie=0xba8259b570, duration=30.601s, table=0, n_packets=27,
> n_bytes=1350, priority=60001,mpls,in_port=input,mpls_label=164,mpls_bos=0
> actions=pop_mpls:0x8847,output:"4PatchIn"
> 
> If I run the same test but with a packet with only a single MPLS label it
> works fine.  Do I have to match both labels in order to pop a single label?
> 
> Thank you,
> Tom
> 
> Scapy script:
> #!/usr/bin/env python
> 
> import sys
> from scapy.all import *
> 
> load_contrib("mpls")
> mpls_eth = Ether(src="ca:09:11:11:11:1b", dst="ca:01:07:fc:00:1c",
> type=0x8847)
> mpls_lables=MPLS(label=164, s=0, ttl=255)/MPLS(label=182, s=1, ttl=255)
> mpls_ip = IP(src='10.0.255.2', dst='10.0.255.4')
> mpls_icmp = ICMP(type="echo-request")
> mpls_raw = Raw(load="Fok!")
> mpls_frame=mpls_eth/mpls_lables/mpls_ip/ICMP()
> #mpls_icmp/mpls_raw
> 
> #>>> Ether(str(mpls_frame))
> # label=16 cos=0 s=0 ttl=255 | label=18 cos=0 s=0 ttl=255 | os=0 s=1 ttl=255 | ttl=64 proto=icmp chksum=0x68c7 src=10.0.255.2 dst=10.0.255.2 options=[]
> | =0xcaf3 id=0x0 seq=0x0 |>>>
> 
> sendp(mpls_frame, iface="input")
> 
> sendp(mpls_frame, iface="input", loop=1, inter=1.1)



Re: [ovs-discuss] ping6 delay shaking with nd_target

2019-03-15 Thread Ben Pfaff
What is "delay shaking"?

On Tue, Mar 12, 2019 at 12:52:25PM +, sunquanying wrote:
> Hello:
> 
> We have a problem with occasional delay shaking on IPv6 when we add the
> following flows:
> 
> ovs-ofctl add-flow ply1-1-0 "table=0,priority=2,in_port=3,icmp6,dl_src=fa:16:3e:01:01:40,icmp_type=136,nd_target=2002:1::a84c:b41d:3e1:140 actions=output:4"
> ovs-ofctl add-flow ply1-1-0 "table=0,priority=2,in_port=3,dl_src=fa:16:3e:01:01:40,ipv6 actions=output:4"
> ovs-ofctl add-flow ply1-1-0 "table=0,priority=10,ipv6,in_port=4 actions=output:3"
> 
> To make the problem reproducible, we forced "need_revalidate = true" in
> revalidate_ukey() as follows:
> static enum reval_result revalidate_ukey( .. )
> {
>     ..
>     need_revalidate = true;
>     if (need_revalidate) {
>         ..
>     }
> }
> 
> With "need_revalidate = true" in place, we ran the following tests:
> 
> 1.   Add flows (packets only hit the second and third flows and never
> hit the first one):
> 
> ovs-ofctl add-flow ply1-1-0 "table=0,priority=2,in_port=3,icmp6,dl_src=fa:16:3e:01:01:40,icmp_type=136,nd_target=2002:1::a84c:b41d:3e1:140 actions=output:4"
> ovs-ofctl add-flow ply1-1-0 "table=0,priority=2,in_port=3,dl_src=fa:16:3e:01:01:40,ipv6 actions=output:4"
> ovs-ofctl add-flow ply1-1-0 "table=0,priority=10,ipv6,in_port=4 actions=output:3"
> 
> average ping6 delay > 0.3ms
> 
> 2.   Modify dl_src of the first flow in experiment 1:
> 
> ovs-ofctl add-flow ply1-1-0 "table=0,priority=2,in_port=3,icmp6,dl_src=fa:16:3e:11:11:41,icmp_type=136,nd_target=2002:1::a84c:b41d:3e1:140 actions=output:4"
> ovs-ofctl add-flow ply1-1-0 "table=0,priority=2,in_port=3,dl_src=fa:16:3e:01:01:40,ipv6 actions=output:4"
> ovs-ofctl add-flow ply1-1-0 "table=0,priority=10,ipv6,in_port=4 actions=output:3"
> 
> average ping6 delay < 0.1ms
> 
> 3.   Delete the first flow in experiment 1:
> 
> ovs-ofctl add-flow ply1-1-0 "table=0,priority=2,in_port=3,dl_src=fa:16:3e:01:01:40,ipv6 actions=output:4"
> ovs-ofctl add-flow ply1-1-0 "table=0,priority=10,ipv6,in_port=4 actions=output:3"
> 
> average ping6 delay < 0.1ms
> 
> It seems that the nd_target mask is set to 0x during the upcall in
> flow_wildcards_fold_minimask_in_map(), but changes to 0x when the
> dpcls_rule is generated, which leads to this behavior.
> TCP and UDP packets show the same results as ICMPv6 ping6 packets when
> the first flow in experiment 1 is added.
> 
> Could you please give us some suggestions on how to solve this problem?
> Will you fix it?
> 
> Thank you.
> 
> # ovs-ofctl --version
> ovs-ofctl (Open vSwitch) 2.7.3
> OpenFlow versions 0x1:0x4
> Release version: R5.RC6.018
> 
> #  uname -a
> Linux linux-xWqfOT 3.10.0-862.14.1.6_8.x86_64 #1 SMP Thu Dec 20 00:00:00 UTC 
> 2018 x86_64 x86_64 x86_64 GNU/Linux
> 
> # ovs-ofctl show ply1-1-0
> OFPT_FEATURES_REPLY (xid=0x2): dpid:4a6cdb36e948
> n_tables:254, n_buffers:0
> capabilities: FLOW_STATS TABLE_STATS PORT_STATS QUEUE_STATS ARP_MATCH_IP
> actions: output enqueue ext_action set_vlan_vid set_vlan_pcp strip_vlan 
> mod_dl_src mod_dl_dst mod_nw_src mod_nw_dst mod_nw_tos mod_tp_src mod_tp_dst
> 3(tap1-1-0): addr:d2:90:f6:f9:e6:21
>  config: 0
>  state:  0
>  speed: 0 Mbps now, 0 Mbps max
> 4(pvi1-1-0): addr:ba:2d:6c:d6:b7:ce
>  config: 0
>  state:  0
>  speed: 0 Mbps now, 0 Mbps max
> LOCAL(ply1-1-0): addr:4a:6c:db:36:e9:48
>  config: 0
>  state:  0
>  current:10MB-FD COPPER
>  speed: 10 Mbps now, 0 Mbps max
> OFPT_GET_CONFIG_REPLY (xid=0x4): frags=normal miss_send_len=0
> 
> # ovs-vsctl show
> Bridge "ply1-1-0"
> fail_mode: secure
> Port "ply1-1-0"
> Interface "ply1-1-0"
> type: internal
> Port "pvi1-1-0"
> tag: 4093
> Interface "pvi1-1-0"
> type: patch
> options: {peer="pvo1-1-0"}
> Port "tap1-1-0"
> tag: 4092
> Interface "tap1-1-0"
> type: virtio
> 
> 



Re: [ovs-discuss] When the controller is disconnected, if you update the port immediately, the ovs will crash.

2019-03-15 Thread Ben Pfaff
On Tue, Mar 12, 2019 at 08:34:39PM +0800, 贾乘 wrote:
> Hi All,
> 
> This is  my bridge configuration:
> 
> Bridge br-int
> Controller "tcp:127.0.0.1:6653"
> is_connected: true
> fail_mode: secure
> Port br-int
> Interface br-int
> type: internal
> Port vxlan-vtp
> Interface vxlan-vtp
> type: vxlan
> options: {dst_port="4789", key=flow, 
> local_ip="10.23.127.129", remote_ip=flow}
> Port br-ex-patch
> Interface br-ex-patch
> type: patch
> options: {peer=br-int-patch} 
> 
> I did the following:
> 1.  Disconnected the controller.
> 2.  Changed the vxlan port's interface options so that local_ip is set to "flow".
> 
> ovs-vswitchd crashed, and I got the following core dump:
> 
> #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
> #1  0x7f855f89542a in __GI_abort () at abort.c:89
> #2  0x55e14c6f7fae in ofputil_protocol_to_ofp_version 
> (protocol=) at lib/ofp-protocol.c:123
> #3  0x55e14c6f32ce in ofputil_encode_port_status 
> (ps=ps@entry=0x7ffdfceab670, protocol=) at lib/ofp-port.c:938
> #4  0x55e14c662dcb in connmgr_send_port_status (mgr=0x55e14da54370, 
> source=source@entry=0x0, pp=pp@entry=0x55e14db9af50, reason=reason@entry=2 
> '\002')
> at ofproto/connmgr.c:1654
> #5  0x55e14c62bf26 in update_port (ofproto=ofproto@entry=0x55e14db778e0, 
> name=name@entry=0x55e14dbe01a0 "vxlan-vtp") at ofproto/ofproto.c:2652
> #6  0x55e14c62c477 in ofproto_run (p=0x55e14db778e0) at 
> ofproto/ofproto.c:1818
> #7  0x55e14c61a8bc in bridge_run__ () at vswitchd/bridge.c:2944
> #8  0x55e14c61bc71 in bridge_reconfigure 
> (ovs_cfg=ovs_cfg@entry=0x55e14da5ab10) at vswitchd/bridge.c:721
> #9  0x55e14c61fba9 in bridge_run () at vswitchd/bridge.c:3023
> #10 0x55e14c2bdbdd in main (argc=, argv=) 
> at vswitchd/ovs-vswitchd.c:125
> 
> 
> From the code, connmgr_send_port_status() tries to send the port status
> update to every controller connection. Because I disconnected the
> controller, the rconn state is S_BACKOFF, so ofconn_get_protocol() returns
> OFPUTIL_P_NONE. When ofputil_encode_port_status() is called with that
> protocol, ovs-vswitchd crashes.
> 
> connmgr_send_port_status() -> ofconn_get_protocol() (connection is
> disconnected, so it returns OFPUTIL_P_NONE)
>  -> ofputil_encode_port_status() (called with OFPUTIL_P_NONE, ovs-vswitchd
> crashes)
> 
> So I suggest checking whether the protocol is OFPUTIL_P_NONE before calling
> ofputil_encode_port_status().

Thanks for the report.

What version of OVS are you testing?
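
The guard suggested in the report can be sketched as a small model. This is Python rather than the OVS C code, and every name in it (Protocol, to_ofp_version, send_port_status) is a hypothetical stand-in chosen for illustration. It only shows the idea of skipping connections that have no negotiated protocol. It is not the fix that was applied upstream.

```python
from enum import Enum


class Protocol(Enum):
    """Hypothetical stand-in for the negotiated OpenFlow protocol."""
    NONE = 0   # connection is down, nothing negotiated
    OF10 = 1
    OF13 = 2


def to_ofp_version(protocol):
    """Map a protocol to its OpenFlow wire version; NONE is a fatal error."""
    versions = {Protocol.OF10: 0x01, Protocol.OF13: 0x04}
    if protocol not in versions:
        # Models the abort() seen in the backtrace.
        raise RuntimeError("no OpenFlow version for %s" % protocol.name)
    return versions[protocol]


def send_port_status(connections, port_name):
    """Encode a port status message for each connection with a protocol."""
    sent = []
    for name, protocol in connections:
        if protocol is Protocol.NONE:
            continue  # proposed guard: skip disconnected controllers
        sent.append((name, to_ofp_version(protocol), port_name))
    return sent


connections = [("tcp:127.0.0.1:6653", Protocol.NONE), ("other", Protocol.OF13)]
print(send_port_status(connections, "vxlan-vtp"))
```

Without the guard, the first connection in this model would reach to_ofp_version() with Protocol.NONE and raise, mirroring the abort reported above.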


Re: [ovs-discuss] Core dumps generated when running ovs tests in parallel

2019-03-15 Thread Ben Pfaff
On Sat, Mar 16, 2019 at 12:28:31AM +0530, Numan Siddique wrote:
> On my Fedora 29, whenever I run all the ovs tests with "-j5", I see a few
> core dumps generated for ovsdb-server and python2.

There are a couple of tests that intentionally do "kill -SEGV", to test
that things work properly in that case.  I suspect you're seeing those.
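
For readers who want to see this effect in isolation, here is a minimal sketch that sends SIGSEGV to an otherwise healthy, idle child process. It assumes a POSIX system and a Python interpreter available as sys.executable; the helper names are hypothetical. A process killed this way is typically blocked in a call such as poll() or select() at the time, which is what the backtraces in the original report show.

```python
import signal
import subprocess
import sys
import time


def kill_idle_child_with_segv():
    """Start an idle child, send it SIGSEGV, and return its return code."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"])
    time.sleep(0.5)  # give the child a moment to start sleeping
    child.send_signal(signal.SIGSEGV)
    child.wait()
    return child.returncode


def describe(returncode):
    """Explain a subprocess return code in words."""
    if returncode < 0:
        return "terminated by " + signal.Signals(-returncode).name
    return "exited with status %d" % returncode


if __name__ == "__main__":
    print(describe(kill_idle_child_with_segv()))
```

Whether a core file is actually written depends on the system's core dump settings (for example the core size limit), which this sketch does not change.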


[ovs-discuss] Core dumps generated when running ovs tests in parallel

2019-03-15 Thread Numan Siddique
Hi,

On my Fedora 29, whenever I run all the ovs tests with "-j5", I see a few
core dumps generated for ovsdb-server and python2.

Here's the back trace

[root@nusiddiq ovsdb]# gdb ./ovsdb-server
/opt/core_dumps/core.ovsdb-server.24604
GNU gdb (GDB) Fedora 8.2-6.fc29
...
...
Core was generated by `ovsdb-server --monitor --pidfile --no-db'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x7f813cdfe3e8 in poll () from /lib64/libc.so.6
Missing separate debuginfos, use: dnf debuginfo-install
glibc-2.28-26.fc29.x86_64 libatomic-8.3.1-2.fc29.x86_64
libcap-ng-0.7.9-5.fc29.x86_64 libevent-2.1.8-3.fc29.x86_64
openssl-libs-1.1.1b-2.fc29.x86_64 python3-libs-3.7.2-4.fc29.x86_64
unbound-libs-1.8.3-2.fc29.x86_64 zlib-1.2.11-14.fc29.x86_64
(gdb) bt
#0  0x7f813cdfe3e8 in poll () from /lib64/libc.so.6
#1  0x004453c4 in time_poll (pollfds=pollfds@entry=0xfbb320,
n_pollfds=2, handles=handles@entry=0x0, timeout_when=24471727,
elapsed=elapsed@entry=0x7fff612d09ac) at ../lib/timeval.c:326
#2  0x0043b1c4 in poll_block () at ../include/openvswitch/hmap.h:232
#3  0x00406dd7 in main_loop (is_backup=0x7fff612d0a5e,
exiting=0x7fff612d0a5f, run_process=0x0, remotes=0x7fff612d0ab0,
unixctl=0xfb1420, all_dbs=0x7fff612d0af0, jsonrpc=0xf7eec0,
config=0x7fff612d0b10) at ../ovsdb/ovsdb-server.c:280
#4  main (argc=, argv=) at
../ovsdb/ovsdb-server.c:460


[root@nusiddiq ovsdb]# gdb /usr/bin/python2
/opt/core_dumps/core.python2.26288
GNU gdb (GDB) Fedora 8.2-6.fc29
..
..
Core was generated by `/usr/bin/python2 ../../../../tests/test-daemon.py
--pidfile --monitor'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x7f7074fbeccb in select () from /lib64/libc.so.6
Missing separate debuginfos, use: dnf debuginfo-install
python2-2.7.15-11.fc29.x86_64
(gdb) bt
#0  0x7f7074fbeccb in select () from /lib64/libc.so.6
#1  0x7f7074c50e10 in ?? () from
/usr/lib64/python2.7/lib-dynload/timemodule.so
#2  0x7f70753b114b in PyEval_EvalFrameEx () from
/lib64/libpython2.7.so.1.0
#3  0x7f70753b01ac in PyEval_EvalFrameEx () from
/lib64/libpython2.7.so.1.0
#4  0x7f70753b1902 in PyEval_EvalCodeEx () from
/lib64/libpython2.7.so.1.0
#5  0x7f70753b1b9d in PyEval_EvalCode () from /lib64/libpython2.7.so.1.0
#6  0x7f70753b7b4f in ?? () from /lib64/libpython2.7.so.1.0
#7  0x7f70753b7af8 in PyRun_FileExFlags () from
/lib64/libpython2.7.so.1.0
#8  0x7f70753b790c in PyRun_SimpleFileExFlags () from
/lib64/libpython2.7.so.1.0
#9  0x7f70753bd5ba in Py_Main () from /lib64/libpython2.7.so.1.0
#10 0x7f7074eee413 in __libc_start_main () from /lib64/libc.so.6
#11 0x55c873ddc0ae in _start ()


The glibc version is 2.28 (glibc-2.28-26.fc29.x86_64)

We have seen similar crashes with ovn-controller and ovs-vswitchd in this
BZ - https://bugzilla.redhat.com/show_bug.cgi?id=1685058

There, too, the backtraces end in libc.

Is anyone aware of this, or does anyone have pointers on what could be going
on here?

Thanks
Numan


Re: [ovs-discuss] OVS/OVN troubleshooting: where's my packet?

2019-03-15 Thread Daniel Alvarez Sanchez
Sounds like a great plan, Ben! Thanks for that. It'd be great if
people could chime in this thread to help identify those gaps.

As for the anecdotes, we were recently involved in a case where OVN
was used and packets were dropped at conntrack:

Two VMs on different Logical Switches (externally routed), running on
the same hypervisor were communicating between each other and packet
loss was observed. The packet loss was observed only on small (<64B)
packets. These packets were padded by the NIC before being put on the
wire and when they came back, due to the ACLs, they were put into
conntrack and dropped there. We determined this by inspecting DP flows
via 'ovs-dpctl dump-flows' and then we enabled logging on netfilter
which showed that there was an error with the checksum calculation. It
happened to be a bug on the OVS kernel side which was already fixed in
newer kernels but it took quite a while to figure out and a good
understanding on what was going on. In this scenario, if OVN ACLs were
removed, traffic worked so OVN was the first to be blamed. And
sometimes, the OVN user/engineer is not an OVS expert to be able to
tell effectively what happened to a packet.

Maybe the example is not the best as it was resolved using just the
'ovs-dpctl' tool and some logging but support engineers may loop in
OVN engineers which may loop in OVS engineers which may loop in kernel
engineers. It'd be great to improve the experience somehow so that the
initial assessment doesn't have to go always all the way down.

I'm curious about other folks' experiences here as well with more pure
OVS experience.

Thanks a lot!
Daniel

On Thu, Mar 14, 2019 at 5:55 PM Ben Pfaff  wrote:
>
> On Thu, Mar 14, 2019 at 04:55:56PM +0100, Daniel Alvarez Sanchez wrote:
> > Hi folks,
> >
> > Lately I'm getting the question in the subject line more and more
> > frequently and facing it myself, especially in the context of
> > OpenStack.
> >
> > The shift to OVN in OpenStack involves a totally different approach
> > when it comes to tracing packet drops. Before OVN, there were a bunch
> > of network namespaces and devices where you could hook a tcpdump on
> > and inspect the traffic. People are used to those troubleshooting
> > techniques and OVS was merely used for normal action switches.
> >
> > It's clear that there are tools and techniques to analyze this (trace
> > tool, port mirroring, etc.), but they often require quite a deep
> > knowledge and understanding of the pipeline and OVS itself to
> > effectively trace where a packet got dropped. Furthermore, there could
> > be some scenarios where the packet can be silently dropped.
> >
> > I came across this patch [0] and a presentation about it [1], which aim
> > to partly tackle the problem described here (focusing on the DPDK
> > datapath).
> >
> > The intent of this email is to gather some feedback on how to provide
> > efficient tools and techniques to troubleshoot OVS/OVN issues and on what
> > you think is immediately missing in this context.
>
> I guess that there are multiple things to do here:
>
> - Better document the tools that are available.
>
> - Implement improvements, especially UX-wise, to the existing tools.
>
> - Identify gaps in the available tools (and then fill them).
>
> Do you have any good anecdotes about user/admin frustration?  They might
> be helpful for figuring out how to help.  A lot of us here designed and
> built this stuff and so the gaps are not always obvious to us.


Re: [ovs-discuss] Re: [HELP] Question about Performance issues in high concurrency scenarios

2019-03-15 Thread Yanqin Wei (Arm Technology China)
Hi,
As far as I know, EMC needs more memory. EMC has 8k entries by default, and the
same memory can store 1000K SMC entries. EMC should perform better with the same
number of cache entries, but SMC performs better with the same amount of memory.
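
The trade-off above can be put into rough numbers. The sketch below uses only the two figures from this message (8k EMC entries by default, 1000K SMC entries in the same memory) plus the assumption, consistent with the "EM_FLOW_HASH_SHIFT == 17 gives 130K+ entries" remark quoted below, that the EMC entry count is 2 to the power of EM_FLOW_HASH_SHIFT. The constant and function names are illustrative, and "k"/"K" is taken as 1024.

```python
EMC_DEFAULT_ENTRIES = 8 * 1024          # "8k entries by default"
SMC_ENTRIES_SAME_MEMORY = 1000 * 1024   # "1000K SMC entries" in that memory


def emc_entries(hash_shift):
    """Number of EMC entries, assuming entries = 2 ** EM_FLOW_HASH_SHIFT."""
    return 1 << hash_shift


# Implied size ratio: how many SMC entries fit in the space of one EMC entry.
smc_per_emc_entry = SMC_ENTRIES_SAME_MEMORY / EMC_DEFAULT_ENTRIES

# Memory growth when raising the shift to 17, relative to the 8k default.
growth_at_shift_17 = emc_entries(17) / EMC_DEFAULT_ENTRIES

print("SMC entries per EMC entry:", smc_per_emc_entry)
print("EMC entries at shift 17:", emc_entries(17))
print("EMC memory growth vs default:", growth_at_shift_17)
```

Under these assumptions, an EMC large enough for roughly 130K flows needs about 16 times the memory of the default EMC per PMD thread, which is the cost to weigh against its faster lookups.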

Best Regards,
Wei Yanqin

-Original Message-
From: txfh2007 
Sent: Friday, March 15, 2019 3:23 PM
To: Yanqin Wei (Arm Technology China) ; Flavio Leitner 

Cc: ovs-discuss 
Subject: Re: [HELP] Question about Performance issues in high concurrency 
scenarios

Hi Yanqin && all:
I have studied the slides and the video, and tried this in my own
environment. In the 100K mac-ip PVP test scenario, with EM_FLOW_HASH_SHIFT == 17
(more than 130K EMC entries), performance is maintained (slightly lower than in
the 10K mac-ip scenario). My former conclusion was wrong because I had not
noticed that, when I generate 100K mac-ip flows, the MAC table size of the OVS
bridge has to be enlarged as well (I use automatically learned flow entries).

   I also tested the scenario with SMC enabled and EMC disabled. Pure SMC
performs slightly worse than EMC (100K mac-ip flows, 64-byte packets, 4 PMD
lcores, 10G NIC, PVP test: SMC reaches 3.7G throughput with 56 us latency; EMC
reaches 4G throughput with 51 us latency).
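
To quantify "slightly", the reported figures can be compared directly. This is simple arithmetic on the four numbers in the paragraph above; the variable names are illustrative, and the unit of the throughput figures ("3.7G", "4G") is left as reported.

```python
def relative_change(value, baseline):
    """Percentage change of value relative to baseline."""
    return (value - baseline) / baseline * 100.0


# Figures as reported in the message (100K mac-ip flows, 64-byte packets).
smc = {"throughput_g": 3.7, "latency_us": 56}
emc = {"throughput_g": 4.0, "latency_us": 51}

throughput_delta = relative_change(smc["throughput_g"], emc["throughput_g"])
latency_delta = relative_change(smc["latency_us"], emc["latency_us"])

print("SMC throughput vs EMC: %.1f%%" % throughput_delta)
print("SMC latency vs EMC: %+.1f%%" % latency_delta)
```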

   My question is: should the number of EMC entries be enlarged? Are there any
drawbacks if the EMC has more than 100K entries?







--
From: Yanqin Wei (Arm Technology China)
Sent: Thursday, March 14, 2019 10:05
To: txfh2007; Flavio Leitner
Cc: ovs-discuss
Subject: RE: [ovs-discuss] Re: [HELP] Question about Performance issues in high
concurrency scenarios


Hi,

There is an interesting talk from the Open vSwitch 2018 Fall Conference that may
be helpful for you. It shows that EMC and SMC perform differently in
high-concurrency cases.
For your case, SMC may be more helpful.

http://www.openvswitch.org/support/ovscon2018/5/1330-theurer.pdf



From: ovs-discuss-boun...@openvswitch.org 
On Behalf Of txfh2007 via discuss
Sent: Thursday, March 14, 2019 9:00 AM
To: Flavio Leitner 
Cc: ovs-discuss 
Subject: [ovs-discuss] Re: [HELP] Question about Performance issues in high 
concurrency scenarios

Thanks, Flavio.

So there are two questions:

1. Is dpcls lookup more efficient than EMC?

2. If I just want to increase the number of EMC entries by changing the
EM_FLOW_HASH_SHIFT value, are there any drawbacks?





--

From: Flavio Leitner

Sent: Thursday, March 14, 2019 04:05

To: txfh2007

Cc: ovs-discuss

Subject: Re: [ovs-discuss] [HELP] Question about Performance issues in high
concurrency scenarios



On Tue, Mar 05, 2019 at 03:38:43PM +0800, txfh2007 via discuss wrote:
> Hi everyone:
>  I have been testing ovs-dpdk PVP performance for several days, and I
>  have found that in high-concurrency scenarios (e.g. 100K mac-ip
>  streams) ovs-dpdk performance declines significantly.
>  One cause of the decline is the limit on the number of EMC entries,
>  is that right? If I just increase the number of EMC entries,
>  will other problems occur? Are there any suggestions for
>  improving performance in high-concurrency scenarios?


Perhaps you could consider disabling EMC if its size is not enough.
fbl




IMPORTANT NOTICE: The contents of this email and any attachments are 
confidential and may also be privileged. If you are not the intended recipient, 
please notify the sender immediately and do not disclose the contents to any 
other person, use it for any purpose, or store or copy the information in any 
medium. Thank you.


[ovs-discuss] Re: [HELP] Question about Performance issues in high concurrency scenarios

2019-03-15 Thread txfh2007 via discuss
Hi Yanqin and all,
I have studied the slides and the video, and tried this in my own environment. 
The result: in the 100K mac-ip PVP test scenario, with EM_FLOW_HASH_SHIFT == 17 
(130K+ EMC entries), performance holds up (only a small decline compared with 
the 10K mac-ip scenario). My former conclusion was wrong: I had not noticed 
that when I generate 100K mac-ip flows, the MAC table size of the OVS bridge 
must be enlarged as well (I use automatically learned flow entries).
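
The relation between the shift value and the "130K+" entry count can be 
checked with a small sketch. It assumes that the number of EMC entries is 
2 to the power of EM_FLOW_HASH_SHIFT; the function names are hypothetical 
and only model the arithmetic, they are not OVS code.

```python
# Minimal model of the EMC sizing arithmetic.
# Assumption: entry count = 2 ** EM_FLOW_HASH_SHIFT.
# emc_entries() and min_shift_for() are illustrative names, not OVS APIs.


def emc_entries(shift: int) -> int:
    """Number of EMC entries for a given shift value."""
    return 1 << shift


def min_shift_for(flows: int) -> int:
    """Smallest shift whose entry count is at least `flows`."""
    shift = 0
    while emc_entries(shift) < flows:
        shift += 1
    return shift


if __name__ == "__main__":
    # A shift of 17 gives 131072 entries, i.e. the "130K+" mentioned above.
    print(emc_entries(17))
    # 100K flows do not fit in 2**16 = 65536 entries, so 17 is the minimum.
    print(min_shift_for(100_000))
```

Under this assumption, a shift of 17 is the smallest value whose table can 
hold 100K flows at all, before any hash collisions are taken into account.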

   I also tested the scenario with SMC enabled and EMC disabled. My result is 
that SMC alone performs slightly worse than EMC. (100K mac-ip flows, 64-byte 
packets, 4 PMD lcores, 10G NIC, PVP test: SMC reaches 3.7G throughput with 
56us latency; EMC reaches 4G throughput with 51us latency.)
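
To put the two measurements side by side, the relative differences can be 
worked out as below. This is only arithmetic on the four numbers reported 
above; the helper name is hypothetical.

```python
# Relative difference of the SMC-only run against the EMC run,
# using only the figures quoted in this message.


def relative_change(baseline: float, value: float) -> float:
    """Change of `value` relative to `baseline`, in percent."""
    return (value - baseline) / baseline * 100.0


emc_throughput, smc_throughput = 4.0, 3.7   # "G", as reported
emc_latency_us, smc_latency_us = 51.0, 56.0

throughput_delta = relative_change(emc_throughput, smc_throughput)
latency_delta = relative_change(emc_latency_us, smc_latency_us)

if __name__ == "__main__":
    print(f"throughput: {throughput_delta:+.1f}%")
    print(f"latency:    {latency_delta:+.1f}%")
```

That is, in this test SMC alone gives roughly 7.5% less throughput and 
about 9.8% more latency than EMC.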

   My question is: should the number of EMC entries be enlarged? Are there any 
drawbacks if the EMC holds more than 100K entries?
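
One possible drawback can be sketched under assumptions. Suppose the EMC 
picks each candidate slot by masking a 32-bit flow hash with 
(entries - 1) and then shifting the hash right by EM_FLOW_HASH_SHIFT for 
the next candidate, with two candidates per lookup. Both the 32-bit hash 
width and the two-probe scheme are assumptions here, and the names below 
are hypothetical, so please check them against lib/dpif-netdev.c.

```python
# Model of deriving EMC probe positions from one flow hash.
# Assumptions (not confirmed in this thread): 32-bit hash, 2 probes,
# each probe uses the low `shift` bits and then shifts them away.

HASH_BITS = 32


def probe_positions(flow_hash: int, shift: int, segs: int = 2) -> list:
    """Candidate slot indexes for one flow hash."""
    mask = (1 << shift) - 1
    h = flow_hash & ((1 << HASH_BITS) - 1)
    positions = []
    for _ in range(segs):
        positions.append(h & mask)
        h >>= shift
    return positions


def usable_bits_per_probe(shift: int, segs: int = 2) -> list:
    """How many real hash bits each probe position is built from."""
    remaining = HASH_BITS
    bits = []
    for _ in range(segs):
        bits.append(max(0, min(shift, remaining)))
        remaining -= shift
    return bits


if __name__ == "__main__":
    print(usable_bits_per_probe(13))
    print(usable_bits_per_probe(17))
    print(probe_positions(0xFFFFFFFF, 17))
```

If those assumptions hold, a shift of 17 leaves only 15 hash bits for the 
second probe, so the second candidate can only land in the lower quarter 
of the table. A larger table also takes more memory per PMD thread, which 
may cost CPU cache hits, so it seems worth measuring rather than assuming 
that bigger is always better.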







--
From: Yanqin Wei (Arm Technology China) 
Sent: Thursday, March 14, 2019 10:05
To: txfh2007 ; Flavio Leitner 
Cc: ovs-discuss 
Subject: RE: [ovs-discuss] Re: [HELP] Question about Performance issues in high 
concurrency scenarios


Hi,
 
There is an interesting talk from the Open vSwitch 2018 Fall Conference that 
may be helpful for you. It shows that EMC and SMC perform differently in 
high-concurrency cases. For your case, SMC may be more helpful.
 
http://www.openvswitch.org/support/ovscon2018/5/1330-theurer.pdf
 
 
 
From: ovs-discuss-boun...@openvswitch.org 
On Behalf Of txfh2007 via discuss
Sent: Thursday, March 14, 2019 9:00 AM
To: Flavio Leitner 
Cc: ovs-discuss 
Subject: [ovs-discuss] Re: [HELP] Question about Performance issues in high 
concurrency scenarios
 
Thanks, Flavio.

So there are two questions:

1. Is dpcls lookup more efficient than EMC?

2. If I just want to enlarge the number of EMC entries by changing the 
EM_FLOW_HASH_SHIFT value, are there any drawbacks?

 

 

--

From: Flavio Leitner 
Sent: Thursday, March 14, 2019 04:05
To: txfh2007 
Cc: ovs-discuss 
Subject: Re: [ovs-discuss] [HELP] Question about Performance issues in high 
concurrency scenarios

 

On Tue, Mar 05, 2019 at 03:38:43PM +0800, txfh2007 via discuss wrote:
> Hi everyone:
>  I have been testing ovs-dpdk PVP performance for several days, and I
>  have found that in high-concurrency scenarios (e.g. 100K mac-ip
>  streams) ovs-dpdk performance declines significantly.
>  One cause of the decline is the limit on the number of EMC entries,
>  is that right? If I just increase the number of EMC entries,
>  will other problems occur? Are there any suggestions that would help
>  improve performance in high-concurrency scenarios?
 

Perhaps you could consider disabling EMC if its size is not enough.
fbl
 




___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss