On 23.09.25 14:15, Jason Gunthorpe wrote:
> On Tue, Sep 23, 2025 at 09:52:04AM +0200, Christian König wrote:
>> For example the ISP driver part of amdgpu provides the V4L2
>> interface and when we interchange a DMA-buf with it we recognize that
>> it is actually the same device we work with.
> 
> One of the issues here is the mis-use of dma_map_resource() to create
> dma_addr_t for PCI devices. This was never correct.

That is not a mis-use at all but rather exactly what dma_map_resource() was 
created for.

If dma_map_resource() is not ACS-aware, then we should add that.
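
For reference, this is roughly the kind of use dma_map_resource() was added
for; a minimal sketch with made-up names (map_peer_bar, bar_phys), not a
claim about how any particular driver does it:

#include <linux/dma-mapping.h>

/*
 * Sketch only: map a peer device's BAR/MMIO physical address for DMA.
 * "bar_phys" and "size" stand in for whatever the exporter hands out.
 */
static dma_addr_t map_peer_bar(struct device *dma_dev, phys_addr_t bar_phys,
			       size_t size)
{
	dma_addr_t addr;

	addr = dma_map_resource(dma_dev, bar_phys, size,
				DMA_BIDIRECTIONAL, 0);
	if (dma_mapping_error(dma_dev, addr))
		return DMA_MAPPING_ERROR;

	return addr;
}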

> VFIO is using a new correct ACS aware DMA mapping API that I would
> expect all the DMABUF world to slowly migrate to. This API prevents
> mappings in cases that don't work in HW.
> 
> So a design where you have to DMA map something then throw away the
> DMA map after doing some "shortcut" check isn't going to work.
> 
> We need some way for the importer/exporter to negotiate what kind of
> address they want to exchange without forcing a dma mapping.

That is already in place. We don't DMA map anything in those use cases.
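
To illustrate what such a shortcut can look like on the exporter side (a
rough sketch, not the actual amdgpu code; example_bo and its dev field are
made up):

#include <linux/dma-buf.h>

/*
 * Sketch: the exporter looks at the importer's struct device in its
 * attach callback and, when it is its own device (e.g. the ISP part
 * of amdgpu), skips DMA mapping entirely and works with its internal
 * addresses instead.
 */
static int example_attach(struct dma_buf *dmabuf,
			  struct dma_buf_attachment *attach)
{
	struct example_bo *bo = dmabuf->priv;	/* hypothetical exporter object */

	if (attach->dev == bo->dev) {
		/* Same device: no DMA mapping needed later on. */
		attach->priv = bo;
		return 0;
	}

	/* Foreign importer: use the normal P2P/DMA mapping path. */
	return 0;
}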

>>>> I've read through this thread—Jason, correct me if I'm wrong—but I
>>>> believe what you're suggesting is that instead of using PCIe P2P
>>>> (dma_map_resource) to communicate the VF's VRAM offset to the PF, we
>>>> should teach dma-buf to natively understand a VF's VRAM offset. I don't
>>>> think this is currently built into dma-buf, but it probably should be,
>>>> as it could benefit other use cases as well (e.g., UALink, NVLink,
>>>> etc.).
>>>>
>>>> In both examples above, the PCIe P2P fabric is used for communication,
>>>> whereas in the VF→PF case, it's only using the PCIe P2P address to
>>>> extract the VF's VRAM offset, rather than serving as a communication
>>>> path. I believe that's Jason's objection. Again, Jason, correct me if
>>>> I'm misunderstanding here.
> 
> Yes, this is my point.
> 
> We have many cases now where a dma_addr_t is not the appropriate way
> to exchange addressing information from importer/exporter and we need
> more flexibility.
> 
> I also consider the KVM and iommufd use cases that must have a
> phys_addr_t in this statement.

Abusing phys_addr_t is also completely the wrong approach here.

When you want to communicate addresses in a device-specific address space, you 
need a device-specific type for that instead of abusing phys_addr_t.
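
Purely to illustrate the point, something along these lines (nothing like it
exists today, all names are made up) would be more honest than overloading
phys_addr_t:

#include <linux/device.h>

/*
 * Illustration only: an address that is only meaningful in the address
 * space of a specific provider device, carried as its own type instead
 * of being squeezed into phys_addr_t or dma_addr_t.
 */
struct dma_buf_device_addr {
	struct device	*provider;	/* device whose address space this is */
	u64		offset;		/* offset inside that address space */
	u64		size;
};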

>> What you can do is to either export the DMA-buf from the driver who
>> feels responsible for the PF directly (that's what we do in amdgpu
>> because the VRAM is actually not fully accessible through the BAR).
> 
> Again, considering security somehow as there should not be uAPI to
> just give uncontrolled access to VRAM.
> 
> From a security side having the VF create the DMABUF is better as you
> get that security proof that it is permitted to access the VRAM.

Well, the VF is basically just a window into the HW of the PF.

The real question is where does VFIO get the necessary information about which 
parts of the BAR to expose?
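
For the export-from-the-PF-driver variant mentioned above, the flow is roughly
the following sketch; pf_obj and pf_vram_dmabuf_ops are made-up names, and the
ops still have to provide map/unmap/release as usual:

#include <linux/dma-buf.h>

/*
 * Sketch: the driver that owns the VRAM builds the dma-buf itself, so
 * it controls exactly which range becomes visible to importers.
 */
static struct dma_buf *pf_export_vram_range(struct pf_obj *obj, size_t size)
{
	DEFINE_DMA_BUF_EXPORT_INFO(exp_info);

	exp_info.ops = &pf_vram_dmabuf_ops;	/* made-up ops */
	exp_info.size = size;
	exp_info.flags = O_RDWR | O_CLOEXEC;
	exp_info.priv = obj;

	return dma_buf_export(&exp_info);
}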

> From this thread I think if VFIO had the negotiated option to export a
> CPU phys_addr_t then the Xe PF driver can reliably convert that to a
> VRAM offset.
> 
> We need to add a CPU phys_addr_t option for VFIO to iommufd and KVM
> anyhow, those cases can't use dma_addr_t.

Clear NAK to using CPU phys_addr_t. This is just a horrible idea.

Regards,
Christian.

> 
>>>> I'd prefer to leave the provisioning data to the PF if possible. I
>>>> haven't fully wrapped my head around the flow yet, but it should be
>>>> feasible for the VF → VFIO → PF path to pass along the initial VF
>>>> scatter-gather (SG) list in the dma-buf, which includes VF-specific
>>>> PFNs. The PF can then use this, along with its provisioning information,
>>>> to resolve the physical address.
>>
>> Well don't put that into the sg_table but rather into an xarray or
>> similar, but in general that's the correct idea.
> 
> Yes, please lets move away from re-using dma_addr_t to represent
> things that are not created by the DMA API.
> 
> Jason
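
For completeness, a sketch of the xarray variant discussed above (names are
made up; the PF side would translate the stored PFNs using its provisioning
data):

#include <linux/xarray.h>

/*
 * Sketch: keep the VF specific PFNs in an xarray indexed by page index
 * instead of abusing the dma_addr_t slots of an sg_table.
 */
static int vf_record_pfns(struct xarray *pfns, unsigned long first_idx,
			  const unsigned long *pfn, unsigned long count)
{
	unsigned long i;

	for (i = 0; i < count; i++) {
		int ret = xa_err(xa_store(pfns, first_idx + i,
					  xa_mk_value(pfn[i]), GFP_KERNEL));

		if (ret)
			return ret;
	}

	return 0;
}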
