Aaron Conole <acon...@redhat.com> writes:

> Flavio Leitner <f...@sysclose.org> writes:
>
>> On Thu, Apr 18, 2019 at 01:46:22PM -0600, Alex Williamson wrote:
>>> On Thu, 18 Apr 2019 15:50:43 -0300
>>> Flavio Leitner <f...@sysclose.org> wrote:
>>>
>>> > On Thu, Apr 18, 2019 at 12:06:57PM -0600, Alex Williamson wrote:
>>> > > On Thu, 18 Apr 2019 13:56:23 -0300
>>> > > Flavio Leitner <f...@sysclose.org> wrote:
>>> > >
>>> > > > On Thu, Apr 18, 2019 at 10:43:11AM -0600, Alex Williamson wrote:
>>> > > > > On Thu, 18 Apr 2019 13:23:54 -0300
>>> > > > > Flavio Leitner <f...@sysclose.org> wrote:
>>> > > > Another thing is that when the module is ready and the event is sent
>>> > > > out, what holds OVS for not trying to open and get EACCESS before
>>> > > > udev is triggered to fix the device permission?
>>> > >
>>> > > If there were a race, could ovs ever run before udev on system
>>> > > startup? Probably not.
>>> >
>>> > It does wait, but only for the udev to settle, which means if the
>>> > module has not triggered an event until that time, OVS will not wait
>>> > and we still have a race.
>>>
>>> But udev isn't waiting on the module to trigger an event, the module
>>> contains a MODULE_ALIAS, so I believe it's just the static processing
>>> of the modules.alias that triggers the event.
>>
>> What I am saying is that driverctl will trigger load the module and
>> bind the device, later on systemd will trigger OVS service which
>> waits udev to settle, but none of that guarantees that the permissions
>> are updated when OVS is initializing, see below.
>>
>>> > > Ideally perhaps a cleaner solution might be an
>>> > > explicit dependency on the vfio module specific to ovs startup rather
>>> > > than changing a system policy, but it really depends on the context and
>>> > > use cases. Thanks,
>>> >
>>> > It does have. The driverctl will bind the devices to vfio-pci but
>>> > the problem is that which signal we should rely on to know when
>>> > the vfio module is still initializing, or failed or finished.
>>>
>>> What signal/mechanism is being used currently? If driverctl is asked
>>> to set a driver override it does:
>>>
>>> 1) if module is not loaded, modprobe
>>> 2) unbinds device from existing driver, if any
>>> 3) sets driver_override
>>> 4) triggers drivers_probe
>>> 5) tests if device is bound to a driver, any driver
>>>
>>> There are certainly some deficiencies here, unbinding the device before
>>> setting the driver_override leaves the device open to getting bound by
>>> the wrong driver, and the verification in the last step could be more
>>> specific in testing for binding to the correct driver, but step #1 is
>>> the modprobe of the driver, which should be a synchronous operation.
>>> We shouldn't be able to complete a 'driverctl set-override $DEV
>>> vfio-pci' without vfio being initialized, afaict. Thanks,
>>
>> Right, sounds like systemd is starting openvswitch service before
>> the driverctl is done with the devices.
>
> I'm not sure. The ordering could be a problem.
>
> Perhaps we could try adding:
>
>   After=basic.target
>
> for the ovs-vswitchd.service if we have a machine that exhibits this
> behavior, but I don't know if it will resolve the race. There is some
> kind of strange ordering looking at:
>
>   https://www.freedesktop.org/software/systemd/man/systemd.special.html
> and
>   https://www.freedesktop.org/software/systemd/man/bootup.html#
>
> I can't find how network.target dependency really works w.r.t. ordering
> and the driverctl+basic.target services.
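
For discussion, here is a minimal sketch of the stricter check mentioned
in the quoted thread: rather than testing that the device is bound to
any driver, test that it is bound to the expected one, and that the
device node is openable, before a consumer such as OVS starts. This is
not what driverctl or OVS do today. The function names, the polling
approach, the timeout values, and the choice of /dev/vfio/vfio as the
node to probe are all assumptions made for illustration. The only thing
it relies on is that sysfs exposes a "driver" symlink under
/sys/bus/pci/devices/<address> when a device is bound.

```python
import os
import time
from typing import Optional


def bound_driver(pci_addr: str, sysfs_root: str = "/sys") -> Optional[str]:
    """Return the name of the driver bound to a PCI device, or None.

    sysfs exposes a 'driver' symlink in the device directory that points
    at the driver's directory; its basename is the driver name.
    """
    link = os.path.join(sysfs_root, "bus", "pci", "devices", pci_addr, "driver")
    if not os.path.islink(link):
        return None  # unbound, or no such device
    return os.path.basename(os.readlink(link))


def wait_until_ready(
    pci_addr: str,
    expected: str = "vfio-pci",
    dev_node: str = "/dev/vfio/vfio",  # assumption: the node whose mode udev fixes
    timeout: float = 10.0,             # hypothetical default
    interval: float = 0.1,             # hypothetical default
    sysfs_root: str = "/sys",
) -> bool:
    """Poll until the device is bound to `expected` and `dev_node` is
    readable and writable by the calling user, or until `timeout` expires.

    Note: os.access() checks against the real uid/gid, so when run as
    root it only tells us that the node exists.
    """
    deadline = time.monotonic() + timeout
    while True:
        bound_ok = bound_driver(pci_addr, sysfs_root) == expected
        perms_ok = os.access(dev_node, os.R_OK | os.W_OK)
        if bound_ok and perms_ok:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Something like wait_until_ready("0000:01:00.0") would have to run as
the user OVS runs as for the permission half of the check to mean
anything. It only narrows the window by polling; it is not a substitute
for a real ordering dependency between the services.
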

Ping? Any thoughts? Do you have an alternative approach you'd rather
see?

I can ask the customer to test the After=basic.target change I proposed,
but I'm not positive it will resolve anything. And if it doesn't, I want
to be able to say "well, here's a follow-up."

>> fbl