Hello, Robin! Thank you for yours work!
Some drivers, like this Tegra VDE (Video Decoder Engine) driver for
example, do not want to use implicit IOMMU domain.
That isn't (intentionally) changing here - the only difference should be
that instead of having the ARM-special implicit domain, which you have
to kick out of the way with the ARM-specific API before you're able to
attach your own domain, the implicit domain is now a proper IOMMU API
default domain, which automatically gets bumped by your attach. The
default domains should still only be created in the same cases that the
ARM dma_iommu_mappings were.
Tegra VDE driver
relies on explicit IOMMU domain in a case of Tegra SMMU because VDE
hardware can't access last page of the AS and because driver wants to
reserve some fixed addresses [1].
[1]
https://elixir.bootlin.com/linux/v5.9-rc1/source/drivers/staging/media/tegra-vde/iommu.c#L100
Tegra30 SoC supports up to 4 domains, hence it's not possible to afford
wasting unused implicit domains. I think this needs to be addressed
before this patch could be applied.
Yeah, there is one subtle change in behaviour from removing the ARM
layer on top of the core API, in that the IOMMU driver will no longer
see an explicit detach call. Thus it does stand to benefit from being a
bit cleverer about noticing devices being moved from one domain to
another by an attach call, either by releasing the hardware context for
the inactive domain once the device(s) are moved across to the new one,
or by simply reprogramming the hardware context in-place for the new
domain's address space without allocating a new one at all (most of the
drivers that don't have multiple contexts already handle the latter
approach quite well).
Would it be possible for IOMMU drivers to gain support for filtering out
devices in iommu_domain_alloc(dev, type)? Then perhaps Tegra SMMU driver
could simply return NULL in a case of type=IOMMU_DOMAIN_DMA and
dev=tegra-vde.
If you can implement IOMMU_DOMAIN_IDENTITY by allowing the relevant
devices to bypass translation entirely without needing a hardware
context (or at worst, can spare one context which all identity-mapped
logical domains can share), then you could certainly do that kind of
filtering with the .def_domain_type callback if you really wanted to. As
above, the intent is that that shouldn't be necessary for this
particular case, since only one of a group's default domain and
explicitly attached domain can be live at any given time, so the driver
should be able to take advantage of that.
If you simply have more active devices (groups) than available contexts
then yes, you probably would want to do some filtering to decide who
deserves a translation domain and who doesn't, but in that case you
should already have had a long-standing problem with the ARM implicit
domains.
Alternatively, the Tegra SMMU could be changed such that the devices
will be attached to a domain at the time of a first IOMMU mapping
invocation instead of attaching at the time of attach_dev() callback
invocation.
Or maybe even IOMMU core could be changed to attach devices at the time
of the first IOMMU mapping invocation? This could be a universal
solution for all drivers.
I suppose technically you could do that within an IOMMU driver already
(similar to how some defer most of setup that logically belongs to
->domain_alloc until the first ->attach_dev). It's a bit grim from the
caller's PoV though, in terms of the failure mode being non-obvious and
having no real way to recover. Again, you'd be better off simply making
decisions up-front at domain_alloc or attach time based on the domain type.