On Thu, Dec 4, 2025 at 7:35 PM Tianyu Lan <[email protected]> wrote:
>
> On Thu, Dec 4, 2025 at 11:35 AM Michael Kelley <[email protected]> wrote:
> >
> > From: Tianyu Lan <[email protected]> Sent: Wednesday, December 3, 2025 6:21 AM
> > >
> > > On Sat, Nov 29, 2025 at 1:47 AM Michael Kelley <[email protected]> wrote:
> > > >
> > > > From: Tianyu Lan <[email protected]> Sent: Monday, November 24, 2025 10:29 AM
> >
> > [snip]
> >
> > > >
> > > > Here's my idea for an alternate approach. The goal is to allow
> > > > use of the swiotlb to be disabled on a per-device basis. A device
> > > > is initialized for swiotlb usage by swiotlb_dev_init(), which sets
> > > > dev->dma_io_tlb_mem to point to the default swiotlb memory. For
> > > > VMBus devices, the calling sequence is vmbus_device_register() ->
> > > > device_register() -> device_initialize() -> swiotlb_dev_init().
> > > > But if vmbus_device_register() could override the
> > > > dev->dma_io_tlb_mem value and put it back to NULL, swiotlb
> > > > operations would be disabled on the device. Furthermore,
> > > > is_swiotlb_force_bounce() would return "false", and the normal DMA
> > > > functions would not force the use of bounce buffers. The entire
> > > > code change looks like this:
> > > >
> > > > --- a/drivers/hv/vmbus_drv.c
> > > > +++ b/drivers/hv/vmbus_drv.c
> > > > @@ -2133,11 +2133,15 @@ int vmbus_device_register(struct hv_device *child_device_obj)
> > > >         child_device_obj->device.dma_mask = &child_device_obj->dma_mask;
> > > >         dma_set_mask(&child_device_obj->device, DMA_BIT_MASK(64));
> > > >
> > > > +       device_initialize(&child_device_obj->device);
> > > > +       if (child_device_obj->channel->co_external_memory)
> > > > +               child_device_obj->device.dma_io_tlb_mem = NULL;
> > > > +
> > > >         /*
> > > >          * Register with the LDM. This will kick off the driver/device
> > > >          * binding...which will eventually call vmbus_match() and vmbus_probe()
> > > >          */
> > > > -       ret = device_register(&child_device_obj->device);
> > > > +       ret = device_add(&child_device_obj->device);
> > > >         if (ret) {
> > > >                 pr_err("Unable to register child device\n");
> > > >                 put_device(&child_device_obj->device);
> > > >
> > > > I've only compile tested the above since I don't have an
> > > > environment where I can test Confidential VMBus. You would need to
> > > > verify whether my thinking is correct and this produces the
> > > > intended result.
> > >
> > > Thanks Michael. I tested it and it seems to hit an issue. Will
> > > double check with the HCL/paravisor team.
> > >
> > > We considered such a change before. From Roman's previous patch, it
> > > seems we would need to change phys_to_dma() and
> > > force_dma_unencrypted().
> >
> > In a Hyper-V SEV-SNP VM with a paravisor, I assert that phys_to_dma()
> > and __phys_to_dma() do the same thing. phys_to_dma() calls
> > dma_addr_encrypted(), which does __sme_set(). But in a Hyper-V VM
> > using vTOM, sme_me_mask is always 0, so dma_addr_encrypted() is a
> > no-op. dma_addr_unencrypted() and dma_addr_canonical() are also
> > no-ops. See include/linux/mem_encrypt.h. So in a Hyper-V SEV-SNP VM,
> > the DMA layer doesn't change anything related to encryption when
> > translating between a physical address and a DMA address. The same is
> > true for a Hyper-V TDX VM with a paravisor.
> >
> > force_dma_unencrypted() will indeed return "true", and it is used in
> > phys_to_dma_direct(). But both return paths in phys_to_dma_direct()
> > return the same result because dma_addr_unencrypted() and
> > dma_addr_encrypted() are no-ops. Other uses of force_dma_unencrypted()
> > are only in the dma_alloc_*() paths, but dma_alloc_*() isn't used by
> > VMBus devices because the device control structures are in the ring
> > buffer, which, as you have noted, is already handled separately. So
> > for the moment, I don't think the return value from
> > force_dma_unencrypted() matters.
> >

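For reference, my reading of that path (paraphrasing
include/linux/dma-direct.h and kernel/dma/direct.c, trimmed; not
verbatim kernel code) is:

        /*
         * With vTOM, sme_me_mask == 0, so dma_addr_encrypted() is an
         * identity and both branches of phys_to_dma_direct() compute
         * the same DMA address.
         */
        static inline dma_addr_t phys_to_dma(struct device *dev,
                        phys_addr_t paddr)
        {
                return dma_addr_encrypted(phys_to_dma_unencrypted(dev, paddr));
        }

        static inline dma_addr_t phys_to_dma_direct(struct device *dev,
                        phys_addr_t phys)
        {
                if (force_dma_unencrypted(dev))
                        return phys_to_dma_unencrypted(dev, phys);
                return phys_to_dma(dev, phys);
        }
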
dma_alloc_*() is used by PCI device drivers (e.g., the Mana NIC
driver), and if we need to support TDISP devices, the change to
force_dma_unencrypted() is still necessary: the DMA address handed to
a TDISP device should refer to encrypted memory, i.e., have
sme_me_mask set.

From this point of view, Hyper-V specific dma ops may resolve this
without changes in the DMA core code. Otherwise, we still need to add
a callback or other flag so that platforms can check whether they
should return an encrypted or decrypted address to drivers.
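
As a rough sketch of the dma ops idea (hypothetical code only:
hv_device_is_tdisp() and hv_dma_ops don't exist, and a real
implementation would still need swiotlb and offset handling), the
per-device decision could live in a Hyper-V map_page callback:

        /*
         * Hypothetical sketch -- illustrates where a per-device
         * encrypted/decrypted decision could be made without touching
         * the DMA core.
         */
        static dma_addr_t hv_dma_map_page(struct device *dev,
                        struct page *page, unsigned long offset,
                        size_t size, enum dma_data_direction dir,
                        unsigned long attrs)
        {
                phys_addr_t paddr = page_to_phys(page) + offset;

                if (hv_device_is_tdisp(dev))    /* hypothetical helper */
                        return __sme_set(paddr); /* encrypted address */
                return paddr;                    /* shared/decrypted */
        }

        static const struct dma_map_ops hv_dma_ops = {
                .map_page = hv_dma_map_page,
                /* .unmap_page, .map_sg, ... as needed */
        };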

> > So I'm guessing something else unexpected is happening such that just
> > disabling the swiotlb on a per-device basis doesn't work. Assuming
> > that Roman's original patch actually worked, I'm trying to figure out
> > how my idea is different in a way that has a material effect on
> > things. And if your patch works by going directly to __phys_to_dma(),
> > it should also work when using phys_to_dma() instead.
> >

The issue I hit should not be related to disabling the bounce buffer.
I don't see any failure to map DMA memory with the bounce buffer. For
disabling per-device swiotlb, setting dma_io_tlb_mem to NULL appears
to work, and it makes is_swiotlb_force_bounce() return false.
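
That matches the helper in include/linux/swiotlb.h (paraphrased, not
verbatim):

        /*
         * With dev->dma_io_tlb_mem == NULL, the check below returns
         * false, so forced bouncing no longer applies to the device.
         */
        static inline bool is_swiotlb_force_bounce(struct device *dev)
        {
                struct io_tlb_mem *mem = dev->dma_io_tlb_mem;

                return mem && mem->force_bounce;
        }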

--
Thanks
Tianyu Lan
