I joined this list recently, and encountered something very similar to this user:
On 8 January 2016 at 04:52, Benoît <benoitne at gmail.com> wrote:

> I have an issue where ovs-vswitchd is starting too early.
> I got a persistent name for an interface (pnic_wwan) but it is happening
> after ovs-vswitchd starts so it makes an error as it does'nt find the
> interface name!
>
> Bridge vswitch_wwan
> Port pnic_wwan
> Interface pnic_wwan
> error: "could not open network device pnic_wwan (No such device)"

I am testing with Fedora 23. It seems that with openvswitch.service enabled, openvswitch-nonetwork.service starts too early, before any of the physical network interfaces have been detected.

If the OVS bridge is configured using /etc/sysconfig/network/* with TYPE=OVSBridge, the bridge is normally removed during a "clean" shutdown. This leaves the system in an acceptable state: when openvswitch-nonetwork.service starts early, no bridge exists yet, so there is no problem.

However, if shutdown is unclean (that is, if ifdown-ovs was not executed properly for any reason), then the system comes up with the physical network interface ports already associated with the bridge. Because the bridge is started before networking exists, this leads to "could not open network device ens2f0 (No such device)" (in my case, the persistent naming is the default selected by the udev configuration). This error persists, and the physical ports are unusable in this state.

In some cases, ifup-ovs will delete and re-add the port, so apart from the errors during startup, the bridge becomes healthy once the port is re-added. In the failing cases, "ovs-vsctl show" shows the physical interfaces with the "No such device" error, even though the interfaces clearly do exist by this point.

In my case, I am trying to use TYPE=OVSBond.
I have dual 10 GbE and I wanted to use an OVS bridge instead of a Linux bridge for my host networking, with several VLANs configured as TYPE=OVSIntPort on the bridge. If I configure the physical interfaces as TYPE=OVSPort, and have TYPE=OVSBond list them with BOND_IFACES, then I get a different problem at startup: the TYPE=OVSPort initialization tries to re-add the port with:

    ovs-vsctl -t 10 -- --if-exists del-port ens2f0 -- add-port ens2f0

But this fails with "cannot create a port named ens2f0 because an interface named ens2f0 already exists on bridge br-ext". In this case, the port is part of the bond, not directly part of the bridge, and the re-add code isn't able to work around this problem.

During further investigation, I found that after the system is up (and particularly after network.service has been run), I could "systemctl restart openvswitch" and "ovs-vsctl show" would no longer list "No such device" for the physical interface ports.

After trying to understand and disentangle all the cause and effect, I finally realized that ifup-ovs will start OVS on demand, after the physical interfaces have been detected and assigned names (including possible renames ... eth0 => ens2f0, ...), and that I could avoid starting OVS too early, simply by *not* enabling the openvswitch.service.

This is now working. By *not* enabling openvswitch.service, and letting ifup-ovs start up openvswitch on demand, the system is coming up reliably whether after a clean shutdown or a forced reset (I want the server to be crash-safe, so I explicitly test this case).

But I'm now concerned about the direction of Fedora and openvswitch-nonetwork.service. Does my work-around of not enabling openvswitch.service make sense, and is it part of the design of ifup-ovs that will be supported going forwards? Or is it just lucky that it works, and could it break with a future openvswitch update or a future version of Fedora?
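
For anyone who wants to check for the stale state described above, here is a rough sketch of how I would look for it. The helper name ovs_missing_devices is my own invention for illustration, not something shipped with Open vSwitch, and it assumes the error text in "ovs-vsctl show" appears on a single line in the form quoted earlier in this message.

```shell
# Hypothetical helper (not part of Open vSwitch): reads "ovs-vsctl show"
# output on stdin and prints the name of each interface that is reported
# with the "could not open network device ... (No such device)" error.
ovs_missing_devices() {
    sed -n 's/.*error: "could not open network device \([^ ]*\) (No such device)".*/\1/p'
}

# Example use once network.service has run (needs root and a running OVS):
#   if [ -n "$(ovs-vsctl show | ovs_missing_devices)" ]; then
#       systemctl restart openvswitch
#   fi
```

The restart in the commented example is only the manual recovery step I described above; it does not address the underlying ordering problem.
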
I think the openvswitch-nonetwork.service starting early, and presuming that physical interfaces can actually be used that early, is a defect in openvswitch. I think the intent is to make OVS bridges and internal ports available for use with the rest of the networking support, but this currently only works properly for virtual bridges that are not connected to physical interfaces. By "works properly", I mean that it comes up clean whether shutdown was "clean" or "dirty", doesn't have errors about "No such device", and does not need the port to be re-added to clear this error state.

Without any real understanding of the complexity here, I am thinking that when Open vSwitch starts early, before the physical network interfaces exist according to the kernel, Open vSwitch should delay initialization of those ports or bonds until the physical network interfaces actually do exist. The "No such device" issue should automatically clear as soon as the device actually does come into existence.

In my case, I would like the "bond0" (TYPE=OVSBond) to be re-initialized as soon as one or both of "ens2f0" (TYPE=OVSPort) or "ens2f1" (TYPE=OVSPort) become real, similar to what would happen when the link state for the real interfaces goes up or down. I think this should also apply to regular ports on the bridge. There should be no need for ifup-ovs to re-create the port if it already exists; it just needs to be properly initialized *after* the physical interface comes into existence in the kernel.

Is this something that is already understood, or already being worked on? I found very little information on this with Google searching, which is how I stumbled upon this original thread...

Other work-arounds that I tried, which may help people understand exactly how it fails and how it behaves:

1) I tried to use regular TYPE=Ethernet (instead of TYPE=OVSPort) network interfaces, and "ifup" the physical interfaces as a "Pre" command to the openvswitch-nonetwork.service.
This gave a warning about "Delaying initialization" from "ifup". I believe it *did* fix the problem, but only because the "ifup" failed, so the openvswitch-nonetwork.service startup was aborted early, and it happened later due to ifup-ovs. As even "/bin/false" would have had the same effect here, I considered this an invalid work-around, and this helped lead me to the conclusion that disabling openvswitch.service altogether is the more sensible work-around.

2) I tried to "modprobe ixgbe" (the network driver for the Intel cards I have) as a "Pre" command to the openvswitch-nonetwork.service. This had similar behaviour to the "ifup" above. Also not a very good solution.

--
Mark Mielke <mark.mie...@gmail.com>