Aaron Conole <acon...@redhat.com> writes:

> Flavio Leitner <f...@sysclose.org> writes:
>
>> On Thu, Apr 18, 2019 at 01:46:22PM -0600, Alex Williamson wrote:
>>> On Thu, 18 Apr 2019 15:50:43 -0300
>>> Flavio Leitner <f...@sysclose.org> wrote:
>>> 
>>> > On Thu, Apr 18, 2019 at 12:06:57PM -0600, Alex Williamson wrote:
>>> > > On Thu, 18 Apr 2019 13:56:23 -0300
>>> > > Flavio Leitner <f...@sysclose.org> wrote:
>>> > >   
>>> > > > On Thu, Apr 18, 2019 at 10:43:11AM -0600, Alex Williamson wrote:  
>>> > > > > On Thu, 18 Apr 2019 13:23:54 -0300
>>> > > > > Flavio Leitner <f...@sysclose.org> wrote:
>>> > > > Another thing is that when the module is ready and the event is sent
>>> > > > out, what keeps OVS from trying to open the device and getting EACCES
>>> > > > before udev is triggered to fix the device permissions?
>>> > > 
>>> > > If there were a race, could ovs ever run before udev on system
>>> > > startup?  Probably not.  
>>> > 
>>> > It does wait, but only for udev to settle, which means that if the
>>> > module has not triggered an event by that time, OVS will not wait
>>> > and we still have a race.
>>> 
>>> But udev isn't waiting on the module to trigger an event, the module
>>> contains a MODULE_ALIAS, so I believe it's just the static processing
>>> of the modules.alias that triggers the event.
>>
>> What I am saying is that driverctl will load the module and bind the
>> device; later on, systemd will start the OVS service, which waits for
>> udev to settle, but none of that guarantees that the permissions
>> are updated when OVS is initializing, see below.
>>
>>> > >  Ideally perhaps a cleaner solution might be an
>>> > > explicit dependency on the vfio module specific to ovs startup rather
>>> > > than changing a system policy, but it really depends on the context and
>>> > > use cases.  Thanks,  
>>> > 
>>> > It does have one. driverctl will bind the devices to vfio-pci, but
>>> > the problem is which signal we should rely on to know whether
>>> > the vfio module is still initializing, has failed, or has finished.
>>> 
>>> What signal/mechanism is being used currently?  If driverctl is asked
>>> to set a driver override it does:
>>> 
>>>  1) if module is not loaded, modprobe
>>>  2) unbinds device from existing driver, if any
>>>  3) sets driver_override
>>>  4) triggers drivers_probe
>>>  5) tests if device is bound to a driver, any driver
>>> 
>>> There are certainly some deficiencies here, unbinding the device before
>>> setting the driver_override leaves the device open to getting bound by
>>> the wrong driver, and the verification in the last step could be more
>>> specific in testing for binding to the correct driver, but step #1 is
>>> the modprobe of the driver, which should be a synchronous operation.
>>> We shouldn't be able to complete a 'driverctl set-override $DEV
>>> vfio-pci' without vfio being initialized, afaict.  Thanks,
>>
>> Right, sounds like systemd is starting the openvswitch service before
>> driverctl is done with the devices.
>
> I'm not sure.  The ordering could be a problem.
>
> Perhaps we could try adding:
>
>   After=basic.target
>
> for the ovs-vswitchd.service if we have a machine that exhibits this
> behavior, but I don't know if it will resolve the race.  There is some
> kind of strange ordering looking at:
>
> https://www.freedesktop.org/software/systemd/man/systemd.special.html
> and
> https://www.freedesktop.org/software/systemd/man/bootup.html#
>
> I can't find how network.target dependency really works w.r.t. ordering
> and the driverctl+basic.target services.

Ping?  Any thoughts?  Do you have an alternative approach you'd rather
see?  I can try asking the customer if they can test out the
After=basic.target change I proposed, but I'm not positive it will
resolve anything.  And if it doesn't, I want to be able to say "well,
here's a follow-up."
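
One possible follow-up would be to check readiness directly instead of
inferring it from unit ordering.  The snippet below is only a rough sketch
of that idea, not anything that driverctl or OVS does today.  It assumes
the usual sysfs layout (the "driver" and "iommu_group" symlinks under
/sys/bus/pci/devices/<addr>) and the /dev/vfio/<group> node.  The function
names and the sysfs_root/dev_root parameters are made up for illustration.

```python
import os
from pathlib import Path


def _link_basename(link):
    """Return the last path component a symlink points to, or None."""
    if not link.is_symlink():
        return None
    return os.path.basename(os.readlink(link))


def bound_driver(pci_addr, sysfs_root="/sys"):
    """Name of the driver the PCI device is bound to, or None if unbound."""
    dev = Path(sysfs_root) / "bus" / "pci" / "devices" / pci_addr
    return _link_basename(dev / "driver")


def iommu_group(pci_addr, sysfs_root="/sys"):
    """IOMMU group number (as a string) of the PCI device, or None."""
    dev = Path(sysfs_root) / "bus" / "pci" / "devices" / pci_addr
    return _link_basename(dev / "iommu_group")


def vfio_ready(pci_addr, sysfs_root="/sys", dev_root="/dev"):
    """True only if the device is bound to vfio-pci (not just any driver)
    and the caller can already read and write its VFIO group node."""
    if bound_driver(pci_addr, sysfs_root) != "vfio-pci":
        return False
    group = iommu_group(pci_addr, sysfs_root)
    if group is None:
        return False
    node = Path(dev_root) / "vfio" / group
    # os.access() returns False for a missing node and for a node whose
    # permissions have not been fixed up yet for the calling user.
    return os.access(node, os.R_OK | os.W_OK)
```

Something like vfio_ready("0000:01:00.0") could be polled with a timeout by
the user that runs ovs-vswitchd before the ports are opened.  Unlike the
last driverctl step, it tests for the specific driver, and it also covers
the permission window discussed earlier in the thread.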

>> fbl
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
