On Tue, Feb 22, 2022 at 09:18:23PM +0000, Robin Murphy wrote:

> > Still not sure I see what you are thinking though..
> 
> What part of "How hard is it to hold group->mutex when reading or writing
> group->owner?" sounded like "complex lockless algorithm", exactly?

group->owner is not the issue, this series is already using the group
lock to protect the group_cnt and owner.

It is how you inspect the struct device it iterates over to decide if
it is still using the DMA API or not that is the problem. Hint: this
is why I keep mentioning the device_lock() as it is the only locking we
for this today.

> To spell it out, the scheme I'm proposing looks like this:

Well, I already got this, it is what is in driver_or_DMA_API_token()
that matters

I think you are suggesting to do something like:

   if (!READ_ONCE(dev->driver) ||  ???)
       return NULL;
   return group;  // A DMA_API 'token'

Which is locklessly reading dev->driver, and why you are talking about
races, I guess.

> remove:
>        bool still_owned = false;
>        mutex_lock(group->mutex);
>        list_for_each_entry(tmp, &group->devices, list) {
>                void *owner = driver_or_DMA_API_token(tmp);
>                if (tmp == dev || !owner || owner != group->owner)

And here you expect this will never be called if a group is owned by
VFIO? Which bakes in that weird behavior of really_probe() that only
some errors deserve to get a notifier?

How does the next series work? The iommu_attach_device() work relies
on the owner_cnt too. I don't think this list can replace it there.

> always serialised, even if the list walk in the remove hook sees "future"
> information WRT other devices' drivers, at worst it should merely short-cut
> to a corresponding pending reclaim of ownership.

Depending on what the actual logic is I could accept this argument,
assuming it came with a WRITE_ONCE on the store side and we all
thought carefully about how all this is ordered.

> Because the current alternative to adding a few simple lines to dd.c is
> adding loads of lines all over the place to end up calling back into common
> IOMMU code, to do something I'm 99% certain the common IOMMU code
> could do

*shrug* both Christoph and I tried to convince Greg. He never really
explained why, but currently he thinks this is the right way to design
it, and so here we are.

> for itself in private. That said, having worked through the above, it does
> start looking like a bit of a big change for this series at this point, so
> I'm happy to keep it on the back burner for when I have to rip
> .dma_configure to pieces anyway.

OK, thanks.

> According to lockdep, I think I've solved the VFIO locking issue provided
> vfio_group_viable() goes away, so I'm certainly keen not to delay that for
> another cycle!

Indeed.

Keep in mind that lockdep is disabled on the device_lock()..

> paragraph quoted above again. I'm not talking about automatic DMA API
> claiming, that clearly happens per-device; I'm talking about explicit
> callers of iommu_group_claim_dma_owner(). Does VFIO call that multiple times
> for individual devices? No. Should it? No. Is it reasonable that any other
> future callers should need to? I don't think so. Would things be easier to
> reason about if we just disallowed it outright? For sure.

iommufd is device centric and the current draft does call
iommu_group_claim_dma_owner() once for each device. It doesn't have
any reason to track groups, so it has no way to know if it is
"nesting" or not.

I hope the iommu_attach_device() work will progress and iommufd can
eventualy call a cleaner device API, it is setup to work this way at
least.

So, yes, currently future calls need the owner_cnt to work right.
(and we are doing all this to allow iommufd and vfio to share the
ownership logic - adding VFIO-like group tracking complexity to
iommufd to save a few bus callbacks is not a win, IMHO)

> > It has already been 8 weeks on this point, lets just move on please.
> 
> Sure, if it was rc7 with the merge window looming I'd be saying "this is
> close enough, let's get it in now and fix the small stuff next cycle".
> However while there's still potentially time to get things right first time,
> I for one am going to continue to point them out because I'm not a fan of
> avoidable churn. I'm sorry I haven't had a chance to look properly at this
> series between v1 and v6, but that's just how things have been.

Well, it is understandable, but this was supposed to be a smallish
cleanup series. All the improvments from the discussion are welcomed
and certainly did improve it, but this started in November and is
dragging on..

Sometimes we need churn to bring everyone along the journey.

Jason
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Reply via email to