Hi everyone,

We are seeing link flapping on igb (5.3.4.4) NICs that are bonded using Open vSwitch. I am on XenServer, which is based on CentOS, and I am not sure whether the problem lies in the driver or in Open vSwitch. We have six NICs bonded into two groups of three. The issue appears some time after the NICs have been bonded. The *kern.log* shows (excerpt):

Nov 1 15:39:56 AxenD1 kernel: [1017001.797897] igb 0000:15:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Nov 1 15:39:56 AxenD1 kernel: [1017001.887470] igb 0000:15:00.0 eth0: igb: eth0 NIC Link is Down
Nov 1 15:40:00 AxenD1 kernel: [1017005.577876] igb 0000:15:00.1 eth1: igb: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Nov 1 15:40:00 AxenD1 kernel: [1017005.781865] igb 0000:15:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Nov 1 15:40:10 AxenD1 kernel: [1017015.861861] igb 0000:15:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Nov 1 15:40:10 AxenD1 kernel: [1017015.959454] igb 0000:15:00.0 eth0: igb: eth0 NIC Link is Down
Nov 1 15:40:14 AxenD1 kernel: [1017019.689864] igb 0000:15:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Nov 1 15:40:14 AxenD1 kernel: [1017020.157869] igb 0000:15:00.1 eth1: igb: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Nov 1 15:40:25 AxenD1 kernel: [1017030.837873] igb 0000:15:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Nov 1 15:40:25 AxenD1 kernel: [1017030.919331] igb 0000:15:00.0 eth0: igb: eth0 NIC Link is Down
Nov 1 15:40:28 AxenD1 kernel: [1017034.205864] igb 0000:15:00.1 eth1: igb: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Nov 1 15:40:29 AxenD1 kernel: [1017034.829864] igb 0000:15:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Nov 1 15:40:39 AxenD1 kernel: [1017044.837856] igb 0000:15:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Nov 1 15:40:39 AxenD1 kernel: [1017044.931369] igb 0000:15:00.0 eth0: igb: eth0 NIC Link is Down
Nov 1 15:40:43 AxenD1 kernel: [1017048.649860] igb 0000:15:00.1 eth1: igb: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Nov 1 15:40:43 AxenD1 kernel: [1017048.841867] igb 0000:15:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Nov 1 15:40:53 AxenD1 kernel: [1017058.853860] igb 0000:15:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Nov 1 15:40:53 AxenD1 kernel: [1017058.943421] igb 0000:15:00.0 eth0: igb: eth0 NIC Link is Down
Nov 1 15:40:56 AxenD1 kernel: [1017062.329849] igb 0000:15:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Nov 1 15:40:57 AxenD1 kernel: [1017063.217873] igb 0000:15:00.1 eth1: igb: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Nov 1 15:41:00 AxenD1 kernel: [1017066.032048] vif vif-537-0 vif537.0: Guest Rx stalled
Nov 1 15:41:08 AxenD1 kernel: [1017073.845863] igb 0000:15:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Nov 1 15:41:08 AxenD1 kernel: [1017073.935291] igb 0000:15:00.0 eth0: igb: eth0 NIC Link is Down
Nov 1 15:41:11 AxenD1 kernel: [1017077.133868] igb 0000:15:00.1 eth1: igb: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Nov 1 15:41:12 AxenD1 kernel: [1017077.841902] igb 0000:15:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Nov 1 15:41:22 AxenD1 kernel: [1017087.845881] igb 0000:15:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Nov 1 15:41:22 AxenD1 kernel: [1017087.935415] igb 0000:15:00.0 eth0: igb: eth0 NIC Link is Down
Nov 1 15:41:26 AxenD1 kernel: [1017091.785868] igb 0000:15:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Nov 1 15:41:26 AxenD1 kernel: [1017092.105875] igb 0000:15:00.1 eth1: igb: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Nov 1 15:41:37 AxenD1 kernel: [1017102.773865] igb 0000:15:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Nov 1 15:41:37 AxenD1 kernel: [1017102.863320] igb 0000:15:00.0 eth0: igb: eth0 NIC Link is Down
Nov 1 15:41:40 AxenD1 kernel: [1017106.101863] igb 0000:15:00.1 eth1: igb: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Nov 1 15:41:41 AxenD1 kernel: [1017107.241867] igb 0000:15:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Nov 1 15:41:51 AxenD1 kernel: [1017116.725860] igb 0000:15:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Nov 1 15:41:51 AxenD1 kernel: [1017116.815206] igb 0000:15:00.0 eth0: igb: eth0 NIC Link is Down
Nov 1 15:41:55 AxenD1 kernel: [1017120.521882] igb 0000:15:00.1 eth1: igb: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Nov 1 15:41:55 AxenD1 kernel: [1017120.725854] igb 0000:15:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Nov 1 15:42:05 AxenD1 kernel: [1017130.780242] igb 0000:15:00.1 eth1: igb: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Nov 1 15:42:05 AxenD1 kernel: [1017130.821934] igb 0000:15:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Nov 1 15:42:05 AxenD1 kernel: [1017130.876058] igb 0000:15:00.1 eth1: igb: eth1 NIC Link is Down
Nov 1 15:42:05 AxenD1 kernel: [1017130.911424] igb 0000:15:00.0 eth0: igb: eth0 NIC Link is Down
Nov 1 15:42:08 AxenD1 kernel: [1017134.322050] igb 0000:15:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Nov 1 15:42:09 AxenD1 kernel: [1017134.629876] igb 0000:15:00.1 eth1: igb: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Nov 1 15:42:19 AxenD1 kernel: [1017144.725857] igb 0000:15:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Nov 1 15:42:19 AxenD1 kernel: [1017144.811352] igb 0000:15:00.0 eth0: igb: eth0 NIC Link is Down
Nov 1 15:42:23 AxenD1 kernel: [1017148.533871] igb 0000:15:00.1 eth1: igb: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Nov 1 15:42:23 AxenD1 kernel: [1017149.189892] igb 0000:15:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None

At the same time, *daemon.log* shows:

Nov 1 15:39:58 AxenD1 ovsdb-server: ovs|277473|reconnect|INFO|ssl: 192.168.254.60:6632: connecting...
Nov 1 15:40:00 AxenD1 ovs-vswitchd: ovs|245685|bond|INFO|interface eth1: link state up
Nov 1 15:40:00 AxenD1 ovs-vswitchd: ovs|245686|bond|INFO|interface eth1: enabled
Nov 1 15:40:00 AxenD1 ovs-vswitchd: ovs|245687|bond|INFO|bond bond1: active interface is now eth1
Nov 1 15:40:00 AxenD1 ovs-vswitchd: ovs|245688|bond|INFO|interface eth1: link state down
Nov 1 15:40:00 AxenD1 ovs-vswitchd: ovs|245689|bond|INFO|interface eth1: disabled
Nov 1 15:40:00 AxenD1 ovs-vswitchd: ovs|245690|bond|INFO|bond bond1: all interfaces disabled
Nov 1 15:40:00 AxenD1 ovs-vswitchd: ovs|245691|bond|INFO|interface eth0: link state up
Nov 1 15:40:00 AxenD1 ovs-vswitchd: ovs|245692|bond|INFO|interface eth0: will be enabled if it stays up for 31000 ms

The NICs stop going down only when the bond is destroyed. A bond that contains a single NIC also works fine. In addition, we observed a lot of overruns, but I am not sure whether they are related to the main issue.

Please don't hesitate to ask me for more information.

Thanks,
Matias.

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev