On Mon, Feb 07, 2022 at 01:53:34PM +0000, Oleksandr Andrushchenko wrote:
> 
> 
> On 07.02.22 14:46, Roger Pau Monné wrote:
> > On Mon, Feb 07, 2022 at 11:08:39AM +0000, Oleksandr Andrushchenko wrote:
> >> ======================================
> >>
> >> Bottom line:
> >> ======================================
> >>
> >> 1. vpci_{read|write} are not protected with pcidevs_lock and can run in
> >> parallel with pci_remove_device which can remove pdev after 
> >> vpci_{read|write}
> >> acquired the pdev pointer. This may lead to a fail due to pdev dereference.
> >>
> >> So, to protect pdev dereference vpci_{read|write} must also use pdevs_lock.
> > We would like to take the pcidevs_lock only while fetching the device
> > (ie: pci_get_pdev_by_domain), afterwards it should be fine to lock the
> > device using a vpci specific lock so calls to vpci_{read,write} can be
> > partially concurrent across multiple domains.
> This means this can't be done a pre-req patch, but as a part of the
> patch which changes locking.
> >
> > In fact I think Jan had already pointed out that the pci lock would
> > need taking while searching for the device in vpci_{read,write}.
> I was referring to the time after we found pdev and it is currently
> possible to free pdev while using it after the search
> >
> > It seems to me that if you implement option 3 below taking the
> > per-domain rwlock in read mode in vpci_{read|write} will already
> > protect you from the device being removed if the same per-domain lock
> > is taken in write mode in vpci_remove_device.
> Yes, it should. Again this can't be done as a pre-req patch because
> this relies on pdev->vpci_lock

Hm, no, I don't think so. You could introduce this per-domain rwlock
in a prepatch, and then move the vpci lock outside of the vpci struct.
I see no problem with that.

> >
> >> 2. The only offending place which is in the way of pci_dev->vpci_lock is
> >> modify_bars. If it can be re-worked to track already mapped and unmapped
> >> regions then we can avoid having a possible deadlock and can use
> >> pci_dev->vpci_lock (rangesets won't help here as we also need refcounting 
> >> be
> >> implemented).
> > I think a refcounting based solution will be very complex to
> > implement. I'm however happy to be proven wrong.
> I can't estimate, but I have a feeling that all these plays around locking
> is just because of this single piece of code. No other place suffer from
> pdev->vpci_lock and no d->lock
> >
> >> If pcidevs_lock is used for vpci_{read|write} then no deadlock is possible,
> >> but modify_bars code must be re-worked not to lock itself (pdev->vpci_lock 
> >> and
> >> tmp->vpci_lock when pdev == tmp, this is minor).
> > Taking the pcidevs lock (a global lock) is out of the picture IMO, as
> > it's going to serialize all calls of vpci_{read|write}, and would
> > create too much contention on the pcidevs lock.
> I understand that. But if we would like to fix the existing code I see
> no other alternative.
> >
> >> 3. We may think about a per-domain rwlock and pdev->vpci_lock, so this 
> >> solves
> >> modify_bars's two pdevs access. But this doesn't solve possible pdev
> >> de-reference in vpci_{read|write} vs pci_remove_device.
> > pci_remove device will call vpci_remove_device, so as long as
> > vpci_remove_device taken the per-domain lock in write (exclusive) mode
> > it should be fine.
> I think I need to see if there are any other places which similarly
> require the write lock
> >
> >> @Roger, @Jan, I would like to hear what do you think about the above 
> >> analysis
> >> and how can we proceed with locking re-work?
> > I think the per-domain rwlock seems like a good option. I would do
> > that as a pre-patch.
> It is. But it seems it won't solve the thing we started this adventure for:
> 
> With per-domain read lock and still ABBA in modify_bars (hope the below
> is correctly seen with a monospace font):
> 
> cpu0: vpci_write-> d->RLock -> pdev1->lock ->                                 
>                  rom_write -> modify_bars: tmp (pdev2) ->lock
> cpu1:        vpci_write-> d->RLock pdev2->lock -> cmd_write -> modify_bars: 
> tmp (pdev1) ->lock
> 
> There is no API to upgrade read lock to write lock in modify_bars which could 
> help,
> so in both cases vpci_write should take write lock.

I've thought more than once that it would be nice to have a
write_{upgrade,downgrade} (read_downgrade maybe?) or similar helper.

I think you could also drop the read lock, take the write lock and
check that &pdev->vpci->header == header in order to be sure
pdev->vpci hasn't been recreated. You would have to do similar in
order to get back again from a write lock into a read one.

We should avoid taking the rwlock in write mode in vpci_write
unconditionally.

Thanks, Roger.

Reply via email to