[ovs-discuss] quad port X710 rNDC (Dell) make KVM host br0 OVS (2.5.0) port lose connection

2017-08-31 Thread Jayakumar, Muthurajan
Dear team,
Could you please let me know if any of you have seen a similar issue? Any
suggestions would be much appreciated.

Following is the observation:

quad port X710 rNDC (Dell) make KVM host br0 OVS (2.5.0) port lose connection

Background:

We are introducing the quad-port Intel X710 rNDC on all Dell 14G platforms. We
use all four X710 ports, eth0-eth3 (i40e driver 2.0.23, FW 6.00), as the br0
uplink bond of the KVM OVS, and we have observed periodic network connection
loss (ping unreachable) on the br0 interface.


Attached is one node's OVS config and network port info. The AHV KVM host is:
CentOS release 6.8 (Final)
4.4.26-1.el6.nutanix.20160925.83.x86_64
From ovs-vsctl, eth0 is the active OVS upstream port and eth1 is the standby OVS port.
Bridge "br0"
    Port "br0"
        Interface "br0"
            type: internal
    Port "br0-dhcp"
        Interface "br0-dhcp"
            type: vxlan
            options: {key="1", remote_ip="10.211.56.93"}
    Port "br0-up"
        Interface "eth1"
        Interface "eth3"
        Interface "eth0"
        Interface "eth2"
    Port "tap0"
        tag: 0
        Interface "tap0"
    Port "vnet0"
        Interface "vnet0"
    Port "br0-arp"
        Interface "br0-arp"
            type: vxlan
            options: {key="1", remote_ip="192.168.5.2"}
ovs_version: "2.5.0"
---- br0-up ----
bond_mode: active-backup
bond may use recirculation: no, Recirc-ID : -1
bond-hash-basis: 0
updelay: 0 ms
downdelay: 0 ms
lacp_status: off
active slave mac: 24:6e:96:47:6d:0c(eth1)

slave eth0: enabled
    may_enable: true

slave eth1: enabled
    active slave
    may_enable: true

slave eth2: disabled
    may_enable: false

slave eth3: disabled
    may_enable: false
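
For reference, the active slave can be extracted from the bond state
mechanically; a minimal sketch using the exact output above (on a live host you
would pipe `ovs-appctl bond/show br0-up` instead of the here-doc):

```shell
# Sketch: pull the active slave out of `ovs-appctl bond/show` output.
# The output below is the one pasted in this mail; on the host, use:
#   ovs-appctl bond/show br0-up | sed -n '...'
bond_show() {
cat <<'EOF'
bond_mode: active-backup
lacp_status: off
active slave mac: 24:6e:96:47:6d:0c(eth1)
slave eth0: enabled
slave eth1: enabled
    active slave
EOF
}
# The interface name sits in parentheses on the "active slave mac" line.
active=$(bond_show | sed -n 's/^active slave mac: .*(\(.*\))$/\1/p')
echo "active slave: $active"   # -> active slave: eth1
```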

However, eth1 (port 1) carries much more tx/rx traffic than eth0 (port 3) per
the ovs-ofctl dump. Why? Packet drops also appear on the br0 uplink bond port
and on the tap0 VM vNIC port:
ovs-ofctl dump-ports-desc br0
-
OFPST_PORT_DESC reply (xid=0x2):
1(eth1): addr:24:6e:96:47:6d:0c config: 0 state: 0 current: 10GB-FD advertised: 
FIBER supported: 10GB-FD FIBER AUTO_PAUSE speed: 1 Mbps now, 1 Mbps max
2(eth3): addr:24:6e:96:47:6d:10 config: 0 state: LINK_DOWN advertised: 1GB-FD 
10GB-FD AUTO_NEG supported: 1GB-FD 10GB-FD AUTO_NEG AUTO_PAUSE speed: 0 Mbps 
now, 1 Mbps max
3(eth0): addr:24:6e:96:47:6d:0a config: 0 state: 0 current: 10GB-FD advertised: 
FIBER supported: 10GB-FD FIBER AUTO_PAUSE speed: 1 Mbps now, 1 Mbps max
4(eth2): addr:24:6e:96:47:6d:0e config: 0 state: LINK_DOWN advertised: 1GB-FD 
10GB-FD AUTO_NEG supported: 1GB-FD 10GB-FD AUTO_NEG AUTO_PAUSE speed: 0 Mbps 
now, 1 Mbps max
5(vnet0): addr:fe:6b:8d:80:5c:a8 config: 0 state: 0 current: 10MB-FD COPPER 
speed: 10 Mbps now, 0 Mbps max
6(br0-arp): addr:a6:3e:f0:db:76:c6 config: NO_FLOOD state: 0 speed: 0 Mbps now, 
0 Mbps max
7(br0-dhcp): addr:62:4d:cc:2f:33:b4 config: NO_FLOOD state: 0 speed: 0 Mbps 
now, 0 Mbps max
37(tap0): addr:4a:37:5e:99:48:b7 config: 0 state: 0 current: 10MB-FD COPPER 
speed: 10 Mbps now, 0 Mbps max
LOCAL(br0): addr:24:6e:96:47:6d:0a config: 0 state: 0 speed: 0 Mbps now, 0 Mbps 
max
ovs-ofctl dump-ports br0

OFPST_PORT reply (xid=0x2): 9 ports
port LOCAL: rx pkts=2908711, bytes=942823314, drop=3031, errs=0, frame=0, 
over=0, crc=0 tx pkts=2527350, bytes=909830850, drop=0, errs=0, coll=0
port 37: rx pkts=7, bytes=412, drop=0, errs=0, frame=0, over=0, crc=0 tx 
pkts=16, bytes=1666, drop=50414, errs=0, coll=0
port 5: rx pkts=69363971, bytes=32991240945, drop=0, errs=0, frame=0, over=0, 
crc=0 tx pkts=69705117, bytes=36274599187, drop=0, errs=0, coll=0
port 1: rx pkts=85268869, bytes=37284065534, drop=0, errs=0, frame=0, over=0, 
crc=0 tx pkts=83538302, bytes=33877468309, drop=0, errs=0, coll=0
port 4: rx pkts=0, bytes=0, drop=0, errs=0, frame=0, over=0, crc=0 tx pkts=0, 
bytes=0, drop=0, errs=0, coll=0
port 6: rx pkts=0, bytes=0, drop=0, errs=0, frame=0, over=0, crc=0 tx pkts=18, 
bytes=756, drop=0, errs=0, coll=0
port 7: rx pkts=0, bytes=0, drop=0, errs=0, frame=0, over=0, crc=0 tx pkts=0, 
bytes=0, drop=0, errs=0, coll=0
port 2: rx pkts=0, bytes=0, drop=0, errs=0, frame=0, over=0, crc=0 tx pkts=0, 
bytes=0, drop=0, errs=0, coll=0
port 3: rx pkts=483827, bytes=47188768, drop=0, errs=0, frame=0, over=0, crc=0 
tx pkts=16, bytes=1296, drop=0, errs=0, coll=0
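
To spot the dropping ports at a glance, the dump-ports output can be filtered
for non-zero drop counters; a sketch using a few of the stat lines above (on
the live host, replace the here-doc with `ovs-ofctl dump-ports br0`):

```shell
# Sketch: flag ports whose rx/tx drop counters are non-zero.
# Sample stats are the ones pasted in this mail.
dump_ports() {
cat <<'EOF'
  port LOCAL: rx pkts=2908711, bytes=942823314, drop=3031, errs=0, frame=0, over=0, crc=0
           tx pkts=2527350, bytes=909830850, drop=0, errs=0, coll=0
  port 37: rx pkts=7, bytes=412, drop=0, errs=0, frame=0, over=0, crc=0
           tx pkts=16, bytes=1666, drop=50414, errs=0, coll=0
  port 1: rx pkts=85268869, bytes=37284065534, drop=0, errs=0, frame=0, over=0, crc=0
           tx pkts=83538302, bytes=33877468309, drop=0, errs=0, coll=0
EOF
}
drops=$(dump_ports | awk '
  /^ *port / { port = $2; sub(/:$/, "", port) }   # remember current port label
  /drop=/ {
    n = $0; sub(/.*drop=/, "", n); sub(/,.*/, "", n)
    if (n + 0 > 0) printf "port %s: drops %s\n", port, n
  }')
echo "$drops"
# -> port LOCAL: drops 3031
#    port 37: drops 50414
```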

Please advise on the next steps to root-cause this blocking issue.
Thanks!


___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] quad port X710 rNDC (Dell) make KVM host br0 OVS (2.5.0) port lose connection

2017-09-04 Thread Flavio Leitner
On Thu, 31 Aug 2017 07:12:43 +
"Jayakumar, Muthurajan"  wrote:

> Dear team,
> Could you please let me know if any of you have seen a similar issue? Any
> suggestions would be much appreciated.
> 
> Following is the observation:
> 
> quad port X710 rNDC (Dell) make KVM host br0 OVS (2.5.0) port lose connection
> 
> Background:
> 
> We are introducing the quad-port Intel X710 rNDC on all Dell 14G platforms.
> We use all four X710 ports, eth0-eth3 (i40e driver 2.0.23, FW 6.00), as the
> br0 uplink bond of the KVM OVS, and we have observed periodic network
> connection loss (ping unreachable) on the br0 interface.
> 
> 
> Attached is one node's OVS config and network port info. The AHV KVM host is:
> CentOS release 6.8 (Final)
> 4.4.26-1.el6.nutanix.20160925.83.x86_64
> From ovs-vsctl, eth0 is active OVS upstream port, eth1 is standby OVS port.
> Bridge "br0"
>     Port "br0"
>         Interface "br0"
>             type: internal
>     Port "br0-dhcp"
>         Interface "br0-dhcp"
>             type: vxlan
>             options: {key="1", remote_ip="10.211.56.93"}
>     Port "br0-up"
>         Interface "eth1"
>         Interface "eth3"
>         Interface "eth0"
>         Interface "eth2"
>     Port "tap0"
>         tag: 0
>         Interface "tap0"
>     Port "vnet0"
>         Interface "vnet0"
>     Port "br0-arp"
>         Interface "br0-arp"
>             type: vxlan
>             options: {key="1", remote_ip="192.168.5.2"}
> ovs_version: "2.5.0"
> ---- br0-up ----
> bond_mode: active-backup
> bond may use recirculation: no, Recirc-ID : -1
> bond-hash-basis: 0
> updelay: 0 ms
> downdelay: 0 ms
> lacp_status: off
> active slave mac: 24:6e:96:47:6d:0c(eth1)
>
> slave eth0: enabled
>     may_enable: true
>
> slave eth1: enabled
>     active slave
>     may_enable: true
>
> slave eth2: disabled
>     may_enable: false
>
> slave eth3: disabled
>     may_enable: false
> 
> However, eth1 (port 1) carries much more tx/rx traffic than eth0 (port 3)
> per the ovs-ofctl dump. Why?

Because eth1 is the active slave, it handles all the TX/RX for the
br0-up port.
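
If the intent is for eth0 to be the active slave, it can be selected at runtime
(a sketch of the commands; these have to be run on the affected host):

```shell
# Show the current bond state and active slave
ovs-appctl bond/show br0-up

# Make eth0 the active slave of the bond (runtime change)
ovs-appctl bond/set-active-slave br0-up eth0
```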


> Packet drops also appear on the br0 uplink bond port and on the tap0 VM vNIC port:
> ovs-ofctl dump-ports-desc br0
> -
> OFPST_PORT_DESC reply (xid=0x2):
> 1(eth1): addr:24:6e:96:47:6d:0c config: 0 state: 0 current: 10GB-FD 
> advertised: FIBER supported: 10GB-FD FIBER AUTO_PAUSE speed: 1 Mbps now, 
> 1 Mbps max
> 2(eth3): addr:24:6e:96:47:6d:10 config: 0 state: LINK_DOWN advertised: 1GB-FD 
> 10GB-FD AUTO_NEG supported: 1GB-FD 10GB-FD AUTO_NEG AUTO_PAUSE speed: 0 Mbps 
> now, 1 Mbps max
> 3(eth0): addr:24:6e:96:47:6d:0a config: 0 state: 0 current: 10GB-FD 
> advertised: FIBER supported: 10GB-FD FIBER AUTO_PAUSE speed: 1 Mbps now, 
> 1 Mbps max
> 4(eth2): addr:24:6e:96:47:6d:0e config: 0 state: LINK_DOWN advertised: 1GB-FD 
> 10GB-FD AUTO_NEG supported: 1GB-FD 10GB-FD AUTO_NEG AUTO_PAUSE speed: 0 Mbps 
> now, 1 Mbps max
> 5(vnet0): addr:fe:6b:8d:80:5c:a8 config: 0 state: 0 current: 10MB-FD COPPER 
> speed: 10 Mbps now, 0 Mbps max
> 6(br0-arp): addr:a6:3e:f0:db:76:c6 config: NO_FLOOD state: 0 speed: 0 Mbps 
> now, 0 Mbps max
> 7(br0-dhcp): addr:62:4d:cc:2f:33:b4 config: NO_FLOOD state: 0 speed: 0 Mbps 
> now, 0 Mbps max
> 37(tap0): addr:4a:37:5e:99:48:b7 config: 0 state: 0 current: 10MB-FD COPPER 
> speed: 10 Mbps now, 0 Mbps max
> LOCAL(br0): addr:24:6e:96:47:6d:0a config: 0 state: 0 speed: 0 Mbps now, 0 
> Mbps max
> ovs-ofctl dump-ports br0
> 
> OFPST_PORT reply (xid=0x2): 9 ports
> port LOCAL: rx pkts=2908711, bytes=942823314, drop=3031, errs=0, frame=0, 
> over=0, crc=0 tx pkts=2527350, bytes=909830850, drop=0, errs=0, coll=0

This is br0. Is it up or down? If it is down, broadcasts for that port
will be dropped and counted here.
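
A quick way to check on the host (sketch):

```shell
ip link show br0    # look for state UP / LOWER_UP
ip addr show br0    # does br0 still hold its IP address?
```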


> port 37: rx pkts=7, bytes=412, drop=0, errs=0, frame=0, over=0, crc=0 tx 
> pkts=16, bytes=1666, drop=50414, errs=0, coll=0

Hard to tell without more info, such as the flow table and the traffic. You
could run tcpdump on the host and in the guest to see more.
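
For example (a sketch of commands to run during a loss window; interface names
follow the report above):

```shell
# Flow table and installed datapath flows at the time of a loss event
ovs-ofctl dump-flows br0
ovs-appctl dpif/dump-flows br0

# Capture on the active uplink and on the VM's tap at the same time (host side)
tcpdump -ni eth1 -w /tmp/eth1.pcap &
tcpdump -ni tap0 -w /tmp/tap0.pcap
```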

-- 
Flavio
