[virtio-dev] Re: [PATCH v3 1/4] dma-buf: add support for virtio exported objects

2020-05-14 Thread David Stevens
On Thu, May 14, 2020 at 9:30 PM Daniel Vetter  wrote:
> On Thu, May 14, 2020 at 05:19:40PM +0900, David Stevens wrote:
> > Sorry for the duplicate reply, didn't notice this until now.
> >
> > > Just storing
> > > the uuid should be doable (assuming this doesn't change during the
> > > lifetime of the buffer), so no need for a callback.
> >
> > Directly storing the uuid doesn't work that well because of
> > synchronization issues. The uuid needs to be shared between multiple
> > virtio devices with independent command streams, so to prevent races
> > between importing and exporting, the exporting driver can't share the
> > uuid with other drivers until it knows that the device has finished
> > registering the uuid. That requires a round trip to and then back from
> > the device. Using a callback allows the latency from that round trip
> > registration to be hidden.
>
> Uh, that means you actually do something and there's locking involved.
> Makes stuff more complicated, invariant attributes are a lot easier
> generally. Registering that uuid just always doesn't work, and blocking
> when you're exporting?

Registering the id at creation and blocking in gem export is feasible,
but it doesn't work well for systems with a centralized buffer
allocator that doesn't support batch allocations (e.g. gralloc). In
such a system, the round trip latency would almost certainly be
included in the buffer allocation time. At least on the system I'm
working on, I suspect that would add tens of milliseconds of startup
latency to video pipelines (although I haven't benchmarked the
difference). Doing the blocking as late as possible means most or all
of the latency can be hidden behind other pipeline setup work.

In terms of complexity, I think the synchronization would be basically
the same in either approach, just in different locations. All it would
do is alleviate the need for a callback to fetch the UUID.
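
For reference, a minimal sketch of what such a callback-based interface
could look like (names and signatures here are illustrative, not the
exact patch): the exporter resolves the UUID lazily, so the round trip
that registers the UUID can complete after export, and an importer only
blocks if and when it actually needs the value.

#include <linux/dma-buf.h>
#include <linux/uuid.h>

/* Illustrative sketch only - not the exact interface proposed in this series. */
struct virtio_dma_buf_ops {
        struct dma_buf_ops ops;
        /* May sleep until the device has acknowledged the UUID registration. */
        int (*get_uuid)(struct dma_buf *dma_buf, uuid_t *uuid);
};

static int virtio_dma_buf_get_uuid(struct dma_buf *dma_buf, uuid_t *uuid)
{
        const struct virtio_dma_buf_ops *ops =
                container_of(dma_buf->ops, const struct virtio_dma_buf_ops, ops);

        if (!ops->get_uuid)
                return -EOPNOTSUPP;
        return ops->get_uuid(dma_buf, uuid);
}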

-David

-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org



Re: [virtio-dev] [PATCH v3 00/15] virtio-mem: paravirtualized memory

2020-05-14 Thread teawater



> On May 14, 2020 at 20:19, David Hildenbrand wrote:
> 
> On 14.05.20 13:47, David Hildenbrand wrote:
>> On 14.05.20 13:10, David Hildenbrand wrote:
>>> On 14.05.20 12:12, David Hildenbrand wrote:
 On 14.05.20 12:02, teawater wrote:
> 
> 
>> On May 14, 2020 at 16:48, David Hildenbrand wrote:
>> 
>> On 14.05.20 08:44, teawater wrote:
>>> Hi David,
>>> 
>>> I got a kernel warning with v2 and v3.
>> 
>> Hi Hui,
>> 
>> thanks for playing with the latest versions. Surprisingly, I can
>> reproduce even by hotplugging a DIMM instead as well - that's good, so
>> it's not related to virtio-mem, lol. Seems to be some QEMU setup issue
>> with older machine types.
>> 
>> Can you switch to a newer qemu machine version, especially
>> pc-i440fx-5.0? Both, hotplugging DIMMs and virtio-mem works for me with
>> that QEMU machine just fine.
> 
> I still could reproduce this issue with pc-i440fx-5.0 or pc.  Did I miss 
> anything?
> 
 
 Below I don't even see virtio_mem. I had to repair the image (filesystem
 fsck) because it was broken, can you try that as well?
 
 Also, it would be great if you could test with v4.
 
>>> 
>>> Correction, something seems to be broken either in QEMU or the kernel. Once 
>>> I
>>> define a DIMM so it's added and online during boot, I get these issues:
>>> 
>>> (I have virtio-mem v4 installed in the guest)
>>> 
>>> #! /bin/bash
>>> sudo x86_64-softmmu/qemu-system-x86_64 \
>>>-machine pc-i440fx-5.0,accel=kvm,usb=off \
>>>-cpu host \
>>>-no-reboot \
>>>-nographic \
>>>-device ide-hd,drive=hd \
>>>-drive 
>>> if=none,id=hd,file=/home/dhildenb/git/Fedora-Cloud-Base-31-1.9.x86_64.qcow2,format=qcow2
>>>  \
>>>-m 1g,slots=10,maxmem=2G \
>>>-smp 1 \
>>>-object memory-backend-ram,id=mem0,size=256m \
>>>-device pc-dimm,id=dimm0,memdev=mem0 \
>>>-s \
>>>-monitor unix:/var/tmp/monitor,server,nowait
>>> 
>>> 
>>> Without the DIMM it seems to work just fine.
>>> 
>> 
>> And another correction. 
>> 
>> Using QEMU v5.0.0, Linux 5.7-rc5, untouched
>> Fedora-Cloud-Base-32-1.6.x86_64.qcow2, I get even without any memory hotplug:
>> 
>> #! /bin/bash
>> sudo x86_64-softmmu/qemu-system-x86_64 \
>>-machine pc-i440fx-5.0,accel=kvm,usb=off \
>>-cpu host \
>>-no-reboot \
>>-nographic \
>>-device ide-hd,drive=hd \
>>-drive 
>> if=none,id=hd,file=/home/dhildenb/git/Fedora-Cloud-Base-32-1.6.x86_64.qcow2,format=qcow2
>>  \
>>-m 5g,slots=10,maxmem=6G \
>>-smp 1 \
>>-s \
>>-kernel /home/dhildenb/git/linux/arch/x86/boot/bzImage \
>>-append "console=ttyS0 rd.shell nokaslr swiotlb=noforce" \
>>-monitor unix:/var/tmp/monitor,server,nowait
>> 
>> 
>> Observe how big the initial RAM even is!
>> 
>> 
>> So this is no DIMM/hotplug/virtio_mem issue. With memory hotplug, it seems 
>> to get
>> more likely to trigger if "swiotlb=noforce" is not specified.
>> 
>> "swiotlb=noforce" seems to trigger some pre-existing issue here. Without
>> "swiotlb=noforce", I was only able to observe this via pc-i440fx-2.1,
>> 
> 
> (talking to myself :) )
> 
> I think I finally understood why using "swiotlb=noforce" with hotplugged
> memory is wrong - or with memory > 3GB. Via "swiotlb=noforce" you tell
> the system to "Never use bounce buffers (for debugging)". This works as
> long as all memory is DMA memory (e.g., < 3GB) AFAIK.
> 
> "If specified, trying to map memory that cannot be used with DMA will
> fail, and a rate-limited warning will be printed."
> 
> Hotplugged memory (under QEMU) is never added below 4GB, because of the
> PCI hole. So both, memory from DIMMs and from virtio-mem will end up at
> or above 4GB. To make a device use that memory, you need bounce buffers.
> 
> Hotplugged memory is never DMA memory.
> 


Hi David,

It is fixed when I remove "swiotlb=noforce".

Thanks for your help.

Best,
Hui

> -- 
> Thanks,
> 
> David / dhildenb


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org



Re: [virtio-dev] [PATCH v3 00/15] virtio-mem: paravirtualized memory

2020-05-14 Thread David Hildenbrand
On 14.05.20 13:47, David Hildenbrand wrote:
> On 14.05.20 13:10, David Hildenbrand wrote:
>> On 14.05.20 12:12, David Hildenbrand wrote:
>>> On 14.05.20 12:02, teawater wrote:


> On May 14, 2020 at 16:48, David Hildenbrand wrote:
>
> On 14.05.20 08:44, teawater wrote:
>> Hi David,
>>
>> I got a kernel warning with v2 and v3.
>
> Hi Hui,
>
> thanks for playing with the latest versions. Surprisingly, I can
> reproduce even by hotplugging a DIMM instead as well - that's good, so
> it's not related to virtio-mem, lol. Seems to be some QEMU setup issue
> with older machine types.
>
> Can you switch to a newer qemu machine version, especially
> pc-i440fx-5.0? Both, hotplugging DIMMs and virtio-mem works for me with
> that QEMU machine just fine.

 I still could reproduce this issue with pc-i440fx-5.0 or pc.  Did I miss 
 anything?

>>>
>>> Below I don't even see virtio_mem. I had to repair the image (filesystem
>>> fsck) because it was broken, can you try that as well?
>>>
>>> Also, it would be great if you could test with v4.
>>>
>>
>> Correction, something seems to be broken either in QEMU or the kernel. Once I
>> define a DIMM so it's added and online during boot, I get these issues:
>>
>> (I have virtio-mem v4 installed in the guest)
>>
>> #! /bin/bash
>> sudo x86_64-softmmu/qemu-system-x86_64 \
>> -machine pc-i440fx-5.0,accel=kvm,usb=off \
>> -cpu host \
>> -no-reboot \
>> -nographic \
>> -device ide-hd,drive=hd \
>> -drive 
>> if=none,id=hd,file=/home/dhildenb/git/Fedora-Cloud-Base-31-1.9.x86_64.qcow2,format=qcow2
>>  \
>> -m 1g,slots=10,maxmem=2G \
>> -smp 1 \
>> -object memory-backend-ram,id=mem0,size=256m \
>> -device pc-dimm,id=dimm0,memdev=mem0 \
>> -s \
>> -monitor unix:/var/tmp/monitor,server,nowait
>>
>>
>> Without the DIMM it seems to work just fine.
>>
> 
> And another correction. 
> 
> Using QEMU v5.0.0, Linux 5.7-rc5, untouched
> Fedora-Cloud-Base-32-1.6.x86_64.qcow2, I get even without any memory hotplug:
> 
> #! /bin/bash
> sudo x86_64-softmmu/qemu-system-x86_64 \
> -machine pc-i440fx-5.0,accel=kvm,usb=off \
> -cpu host \
> -no-reboot \
> -nographic \
> -device ide-hd,drive=hd \
> -drive 
> if=none,id=hd,file=/home/dhildenb/git/Fedora-Cloud-Base-32-1.6.x86_64.qcow2,format=qcow2
>  \
> -m 5g,slots=10,maxmem=6G \
> -smp 1 \
> -s \
> -kernel /home/dhildenb/git/linux/arch/x86/boot/bzImage \
> -append "console=ttyS0 rd.shell nokaslr swiotlb=noforce" \
> -monitor unix:/var/tmp/monitor,server,nowait
> 
> 
> Observe how big the initial RAM even is!
> 
> 
> So this is no DIMM/hotplug/virtio_mem issue. With memory hotplug, it seems to 
> get
> more likely to trigger if "swiotlb=noforce" is not specified.
> 
> "swiotlb=noforce" seems to trigger some pre-existing issue here. Without
> "swiotlb=noforce", I was only able to observe this via pc-i440fx-2.1,
> 

(talking to myself :) )

I think I finally understood why using "swiotlb=noforce" with hotplugged
memory is wrong - or with memory > 3GB. Via "swiotlb=noforce" you tell
the system to "Never use bounce buffers (for debugging)". This works as
long as all memory is DMA memory (e.g., < 3GB) AFAIK.

"If specified, trying to map memory that cannot be used with DMA will
fail, and a rate-limited warning will be printed."

Hotplugged memory (under QEMU) is never added below 4GB, because of the
PCI hole. So memory from both DIMMs and virtio-mem will end up at
or above 4GB. To make a device use that memory, you need bounce buffers.

Hotplugged memory is never DMA memory.
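
To illustrate, here is a simplified, self-contained sketch (not the
actual kernel code) of the address check that matters here: a device
with a 32-bit DMA mask, such as the emulated ata_piix controller from
the warning reported earlier in this thread, cannot reach memory at or
above 4GB. Normally swiotlb would bounce-buffer such pages; with
"swiotlb=noforce" the mapping simply fails with the rate-limited warning.

#include <stdint.h>
#include <stddef.h>

/* 32-bit DMA mask, as used by a legacy IDE controller like ata_piix. */
#define DMA_MASK_32BIT 0xffffffffULL

/* Returns 1 if the whole buffer is addressable without bounce buffers. */
static int buffer_dma_capable(uint64_t dma_mask, uint64_t addr, size_t size)
{
        return addr + size - 1 <= dma_mask;
}

/* e.g. buffer_dma_capable(DMA_MASK_32BIT, 0x10fc14000ULL, 49152) == 0 */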

-- 
Thanks,

David / dhildenb


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org



Re: [virtio-dev] [PATCH v3 00/15] virtio-mem: paravirtualized memory

2020-05-14 Thread David Hildenbrand
On 14.05.20 13:10, David Hildenbrand wrote:
> On 14.05.20 12:12, David Hildenbrand wrote:
>> On 14.05.20 12:02, teawater wrote:
>>>
>>>
 On May 14, 2020 at 16:48, David Hildenbrand wrote:

 On 14.05.20 08:44, teawater wrote:
> Hi David,
>
> I got a kernel warning with v2 and v3.

 Hi Hui,

 thanks for playing with the latest versions. Surprisingly, I can
 reproduce even by hotplugging a DIMM instead as well - that's good, so
 it's not related to virtio-mem, lol. Seems to be some QEMU setup issue
 with older machine types.

 Can you switch to a newer qemu machine version, especially
 pc-i440fx-5.0? Both, hotplugging DIMMs and virtio-mem works for me with
 that QEMU machine just fine.
>>>
>>> I still could reproduce this issue with pc-i440fx-5.0 or pc.  Did I miss 
>>> anything?
>>>
>>
>> Below I don't even see virtio_mem. I had to repair the image (filesystem
>> fsck) because it was broken, can you try that as well?
>>
>> Also, it would be great if you could test with v4.
>>
> 
> Correction, something seems to be broken either in QEMU or the kernel. Once I
> define a DIMM so it's added and online during boot, I get these issues:
> 
> (I have virtio-mem v4 installed in the guest)
> 
> #! /bin/bash
> sudo x86_64-softmmu/qemu-system-x86_64 \
> -machine pc-i440fx-5.0,accel=kvm,usb=off \
> -cpu host \
> -no-reboot \
> -nographic \
> -device ide-hd,drive=hd \
> -drive 
> if=none,id=hd,file=/home/dhildenb/git/Fedora-Cloud-Base-31-1.9.x86_64.qcow2,format=qcow2
>  \
> -m 1g,slots=10,maxmem=2G \
> -smp 1 \
> -object memory-backend-ram,id=mem0,size=256m \
> -device pc-dimm,id=dimm0,memdev=mem0 \
> -s \
> -monitor unix:/var/tmp/monitor,server,nowait
> 
> 
> Without the DIMM it seems to work just fine.
> 

And another correction. 

Using QEMU v5.0.0, Linux 5.7-rc5, untouched
Fedora-Cloud-Base-32-1.6.x86_64.qcow2, I get even without any memory hotplug:

#! /bin/bash
sudo x86_64-softmmu/qemu-system-x86_64 \
-machine pc-i440fx-5.0,accel=kvm,usb=off \
-cpu host \
-no-reboot \
-nographic \
-device ide-hd,drive=hd \
-drive 
if=none,id=hd,file=/home/dhildenb/git/Fedora-Cloud-Base-32-1.6.x86_64.qcow2,format=qcow2
 \
-m 5g,slots=10,maxmem=6G \
-smp 1 \
-s \
-kernel /home/dhildenb/git/linux/arch/x86/boot/bzImage \
-append "console=ttyS0 rd.shell nokaslr swiotlb=noforce" \
-monitor unix:/var/tmp/monitor,server,nowait


Observe how big the initial RAM even is!


So this is no DIMM/hotplug/virtio_mem issue. With memory hotplug, it seems to 
get
more likely to trigger if "swiotlb=noforce" is not specified.

"swiotlb=noforce" seems to trigger some pre-existing issue here. Without
"swiotlb=noforce", I was only able to observe this via pc-i440fx-2.1,

-- 
Thanks,

David / dhildenb


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org



Re: [virtio-dev] [PATCH v3 00/15] virtio-mem: paravirtualized memory

2020-05-14 Thread David Hildenbrand
On 14.05.20 12:12, David Hildenbrand wrote:
> On 14.05.20 12:02, teawater wrote:
>>
>>
>>> On May 14, 2020 at 16:48, David Hildenbrand wrote:
>>>
>>> On 14.05.20 08:44, teawater wrote:
 Hi David,

 I got a kernel warning with v2 and v3.
>>>
>>> Hi Hui,
>>>
>>> thanks for playing with the latest versions. Surprisingly, I can
>>> reproduce even by hotplugging a DIMM instead as well - that's good, so
>>> it's not related to virtio-mem, lol. Seems to be some QEMU setup issue
>>> with older machine types.
>>>
>>> Can you switch to a newer qemu machine version, especially
>>> pc-i440fx-5.0? Both, hotplugging DIMMs and virtio-mem works for me with
>>> that QEMU machine just fine.
>>
>> I still could reproduce this issue with pc-i440fx-5.0 or pc.  Did I miss 
>> anything?
>>
> 
> Below I don't even see virtio_mem. I had to repair the image (filesystem
> fsck) because it was broken, can you try that as well?
> 
> Also, it would be great if you could test with v4.
> 

Correction, something seems to be broken either in QEMU or the kernel. Once I
define a DIMM so it's added and online during boot, I get these issues:

(I have virtio-mem v4 installed in the guest)

#! /bin/bash
sudo x86_64-softmmu/qemu-system-x86_64 \
-machine pc-i440fx-5.0,accel=kvm,usb=off \
-cpu host \
-no-reboot \
-nographic \
-device ide-hd,drive=hd \
-drive 
if=none,id=hd,file=/home/dhildenb/git/Fedora-Cloud-Base-31-1.9.x86_64.qcow2,format=qcow2
 \
-m 1g,slots=10,maxmem=2G \
-smp 1 \
-object memory-backend-ram,id=mem0,size=256m \
-device pc-dimm,id=dimm0,memdev=mem0 \
-s \
-monitor unix:/var/tmp/monitor,server,nowait


Without the DIMM it seems to work just fine.

-- 
Thanks,

David / dhildenb


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org



[virtio-dev] Re: [PATCH v6] iommu/virtio: Use page size bitmap supported by endpoint

2020-05-14 Thread Michael S. Tsirkin
On Thu, May 14, 2020 at 12:50:16PM +0200, Jean-Philippe Brucker wrote:
> On Thu, May 14, 2020 at 05:31:00AM -0400, Michael S. Tsirkin wrote:
> > On Thu, May 14, 2020 at 01:22:37PM +0530, Bharat Bhushan wrote:
> > > Different endpoint can support different page size, probe
> > > endpoint if it supports specific page size otherwise use
> > > global page sizes.
> > > 
> > > Device attached to domain should support a minimum of
> > > domain supported page sizes. If device supports more
> > > than domain supported page sizes then device is limited
> > > to use domain supported page sizes only.
> > 
> > OK so I am just trying to figure it out.
> > Before the patch, we always use the domain supported page sizes
> > right?
> > 
> > With the patch, we still do, but we also probe and
> > validate that device supports all domain page sizes,
> > if it does not then we fail to attach the device.
> 
> Generally there is one endpoint per domain. Linux creates the domains and
> decides which endpoint goes in which domain. It puts multiple endpoints in
> a domain in two cases:
> 
> * If endpoints cannot be isolated from each others by the IOMMU, for
>   example if ACS isolation isn't enabled in PCIe. In that case endpoints
>   are in the same IOMMU group, and therefore contained in the same domain.
>   This is more of a quirk for broken hardware, and isn't much of a concern
>   for virtualization because it's easy for the hypervisor to present
>   endpoints isolated from each others.

Unless they aren't isolated on real hardware :)

> * If userspace wants to put endpoints in the same VFIO container, then
>   VFIO first attempts to put them in the same IOMMU domain, and falls back
>   to multiple domains if that fails. That case is just a luxury and we
>   shouldn't over-complicate the driver to cater for this.
> 
> So the attach checks don't need to be that complicated. Checking that the
> page masks are exactly the same should be enough.
> 
> > This seems like a lot of effort for little benefit, can't
> > hypervisor simply make sure endpoints support the
> > iommu page sizes for us?
> 
> I tend to agree, it's not very likely that we'll have a configuration with
> different page sizes between physical and virtual endpoints.
> 
> If there is a way for QEMU to simply reject VFIO devices that don't use
> the same page mask as what's configured globally, let's do that instead of
> introducing the page_size_mask property.

Or we can even do the subset thing in QEMU. Can be transparent to
guests.


So I guess this patch isn't really needed then.

> 
> > > @@ -615,7 +636,7 @@ static int viommu_domain_finalise(struct 
> > > viommu_endpoint *vdev,
> > >   struct viommu_dev *viommu = vdev->viommu;
> > >   struct viommu_domain *vdomain = to_viommu_domain(domain);
> > >  
> > > - viommu_page_size = 1UL << __ffs(viommu->pgsize_bitmap);
> > > + viommu_page_size = 1UL << __ffs(vdev->pgsize_bitmap);
> > >   if (viommu_page_size > PAGE_SIZE) {
> > >   dev_err(vdev->dev,
> > >   "granule 0x%lx larger than system page size 0x%lx\n",
> > 
> > 
> > Looks like this is messed up on 32 bit: e.g. 0x1 will try to do
> > 1UL << -1, which is undefined behaviour. Which is btw already messed up
> > wrt viommu->pgsize_bitmap, but that's not a reason to propagate
> > the error.
> 
> Realistically we're not going to have a page granule larger than 4G, it's
> going to be 4k or 64k. But we can add a check that truncates the
> page_size_mask to 32-bit and makes sure that it's non-null.

... on 32 bit

> 
> > > +struct virtio_iommu_probe_pgsize_mask {
> > > + struct virtio_iommu_probe_property  head;
> > > + __u8reserved[4];
> > > + /* Same format as virtio_iommu_config::page_size_mask */
> > 
> > It's actually slightly different in that
> > this must be a superset of domain page size mask, right?
> 
> No it overrides the global mask

OK, so I'd copy the comment and tweak it rather than
refer to virtio_iommu_config::page_size_mask
(besides, virtio_iommu_config::page_size_mask isn't legal C;
I know C++ so I figured out what's meant, but it's
better to just say "page_size_mask in struct virtio_iommu_config").
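
Something along these lines, perhaps - just a sketch combining this
comment tweak with the page_size_mask rename requested below; the
"overrides the global mask" wording comes from Jean-Philippe's reply
quoted above:

struct virtio_iommu_probe_pgsize_mask {
        struct virtio_iommu_probe_property      head;
        __u8                                    reserved[4];
        /*
         * Bitmap of page sizes supported by the endpoint, same encoding
         * as page_size_mask in struct virtio_iommu_config; overrides the
         * global mask for this endpoint.
         */
        __le64                                  page_size_mask;
};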


> 
> > > + __le64  pgsize_bitmap;
> 
> Bharat, please rename this to page_size_mask for consistency
> 
> Thanks,
> Jean
> 
> > > +};
> > > +
> > >  #define VIRTIO_IOMMU_RESV_MEM_T_RESERVED 0
> > >  #define VIRTIO_IOMMU_RESV_MEM_T_MSI  1
> > >  
> > > -- 
> > > 2.17.1
> > 


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org



[virtio-dev] Re: [PATCH v6] iommu/virtio: Use page size bitmap supported by endpoint

2020-05-14 Thread Jean-Philippe Brucker
On Thu, May 14, 2020 at 05:31:00AM -0400, Michael S. Tsirkin wrote:
> On Thu, May 14, 2020 at 01:22:37PM +0530, Bharat Bhushan wrote:
> > Different endpoint can support different page size, probe
> > endpoint if it supports specific page size otherwise use
> > global page sizes.
> > 
> > Device attached to domain should support a minimum of
> > domain supported page sizes. If device supports more
> > than domain supported page sizes then device is limited
> > to use domain supported page sizes only.
> 
> OK so I am just trying to figure it out.
> Before the patch, we always use the domain supported page sizes
> right?
> 
> With the patch, we still do, but we also probe and
> validate that device supports all domain page sizes,
> if it does not then we fail to attach the device.

Generally there is one endpoint per domain. Linux creates the domains and
decides which endpoint goes in which domain. It puts multiple endpoints in
a domain in two cases:

* If endpoints cannot be isolated from each other by the IOMMU, for
  example if ACS isolation isn't enabled in PCIe. In that case endpoints
  are in the same IOMMU group, and therefore contained in the same domain.
  This is more of a quirk for broken hardware, and isn't much of a concern
  for virtualization because it's easy for the hypervisor to present
  endpoints isolated from each other.

* If userspace wants to put endpoints in the same VFIO container, then
  VFIO first attempts to put them in the same IOMMU domain, and falls back
  to multiple domains if that fails. That case is just a luxury and we
  shouldn't over-complicate the driver to cater for this.

So the attach checks don't need to be that complicated. Checking that the
page masks are exactly the same should be enough.
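
(i.e. something as simple as the sketch below - assuming the endpoint
mask has already been probed into vdev->pgsize_bitmap; this shows the
simpler check being suggested, not the logic from the posted patch:)

static bool viommu_endpoint_is_compatible(struct viommu_endpoint *vdev,
                                          struct viommu_domain *vdomain)
{
        if (vdomain->viommu != vdev->viommu)
                return false;

        /* Require exactly the same page masks, nothing fancier. */
        return vdomain->domain.pgsize_bitmap == vdev->pgsize_bitmap;
}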

> This seems like a lot of effort for little benefit, can't
> hypervisor simply make sure endpoints support the
> iommu page sizes for us?

I tend to agree, it's not very likely that we'll have a configuration with
different page sizes between physical and virtual endpoints.

If there is a way for QEMU to simply reject VFIO devices that don't use
the same page mask as what's configured globally, let's do that instead of
introducing the page_size_mask property.

> > @@ -615,7 +636,7 @@ static int viommu_domain_finalise(struct 
> > viommu_endpoint *vdev,
> > struct viommu_dev *viommu = vdev->viommu;
> > struct viommu_domain *vdomain = to_viommu_domain(domain);
> >  
> > -   viommu_page_size = 1UL << __ffs(viommu->pgsize_bitmap);
> > +   viommu_page_size = 1UL << __ffs(vdev->pgsize_bitmap);
> > if (viommu_page_size > PAGE_SIZE) {
> > dev_err(vdev->dev,
> > "granule 0x%lx larger than system page size 0x%lx\n",
> 
> 
> Looks like this is messed up on 32 bit: e.g. 0x1 will try to do
> 1UL << -1, which is undefined behaviour. Which is btw already messed up
> wrt viommu->pgsize_bitmap, but that's not a reason to propagate
> the error.

Realistically we're not going to have a page granule larger than 4G, it's
going to be 4k or 64k. But we can add a check that truncates the
page_size_mask to 32-bit and makes sure that it's non-null.
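
Roughly something like this, i.e. the posted viommu_set_pgsize_bitmap()
with the truncate-and-check added - just a sketch of the idea:

static int viommu_set_pgsize_bitmap(struct viommu_endpoint *vdev,
                                    struct virtio_iommu_probe_pgsize_mask *mask,
                                    size_t len)
{
        u64 pgsize_bitmap;

        if (len < sizeof(*mask))
                return -EINVAL;

        pgsize_bitmap = le64_to_cpu(mask->pgsize_bitmap);

        /*
         * On 32-bit, unsigned long is 32 bits wide, so only page sizes
         * that fit in 32 bits are usable; reject an empty mask so that
         * __ffs() is never called on 0.
         */
        if (sizeof(unsigned long) < sizeof(u64))
                pgsize_bitmap = (u32)pgsize_bitmap;
        if (!pgsize_bitmap)
                return -EINVAL;

        vdev->pgsize_bitmap = pgsize_bitmap;
        return 0;
}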

> > +struct virtio_iommu_probe_pgsize_mask {
> > +   struct virtio_iommu_probe_property  head;
> > +   __u8reserved[4];
> > +   /* Same format as virtio_iommu_config::page_size_mask */
> 
> It's actually slightly different in that
> this must be a superset of domain page size mask, right?

No it overrides the global mask

> > +   __le64  pgsize_bitmap;

Bharat, please rename this to page_size_mask for consistency

Thanks,
Jean

> > +};
> > +
> >  #define VIRTIO_IOMMU_RESV_MEM_T_RESERVED   0
> >  #define VIRTIO_IOMMU_RESV_MEM_T_MSI1
> >  
> > -- 
> > 2.17.1
> 

-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org



Re: [virtio-dev] [PATCH v3 00/15] virtio-mem: paravirtualized memory

2020-05-14 Thread David Hildenbrand
On 14.05.20 12:02, teawater wrote:
> 
> 
>> On May 14, 2020 at 16:48, David Hildenbrand wrote:
>>
>> On 14.05.20 08:44, teawater wrote:
>>> Hi David,
>>>
>>> I got a kernel warning with v2 and v3.
>>
>> Hi Hui,
>>
>> thanks for playing with the latest versions. Surprisingly, I can
>> reproduce even by hotplugging a DIMM instead as well - that's good, so
>> it's not related to virtio-mem, lol. Seems to be some QEMU setup issue
>> with older machine types.
>>
>> Can you switch to a newer qemu machine version, especially
>> pc-i440fx-5.0? Both, hotplugging DIMMs and virtio-mem works for me with
>> that QEMU machine just fine.
> 
> I still could reproduce this issue with pc-i440fx-5.0 or pc.  Did I miss 
> anything?
> 

Below I don't even see virtio_mem. I had to repair the image (filesystem
fsck) because it was broken, can you try that as well?

Also, it would be great if you could test with v4.

-- 
Thanks,

David / dhildenb


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org



Re: [virtio-dev] [PATCH v3 00/15] virtio-mem: paravirtualized memory

2020-05-14 Thread teawater



> On May 14, 2020 at 16:48, David Hildenbrand wrote:
> 
> On 14.05.20 08:44, teawater wrote:
>> Hi David,
>> 
>> I got a kernel warning with v2 and v3.
> 
> Hi Hui,
> 
> thanks for playing with the latest versions. Surprisingly, I can
> reproduce even by hotplugging a DIMM instead as well - that's good, so
> it's not related to virtio-mem, lol. Seems to be some QEMU setup issue
> with older machine types.
> 
> Can you switch to a newer qemu machine version, especially
> pc-i440fx-5.0? Both, hotplugging DIMMs and virtio-mem works for me with
> that QEMU machine just fine.

I still could reproduce this issue with pc-i440fx-5.0 or pc.  Did I miss 
anything?

> 
> What also seems to make it work with pc-i440fx-2.1, is giving the
> machine 4G of initial memory (-m 4g,slots=10,maxmem=5G).

After I changed the command, I got an error during guest kernel boot:
sudo /home/teawater/qemu/qemu/x86_64-softmmu/qemu-system-x86_64 -machine 
pc,accel=kvm,usb=off -cpu host -no-reboot -nographic -device ide-hd,drive=hd 
-drive if=none,id=hd,file=/home/teawater/old.img,format=raw -kernel 
/home/teawater/kernel/bk2/arch/x86/boot/bzImage -append "console=ttyS0 
root=/dev/sda nokaslr swiotlb=noforce" -m 4g,slots=10,maxmem=5G -smp 1 -s 
-monitor unix:/home/teawater/qemu/m,server,nowait
SeaBIOS (version rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org)


iPXE (http://ipxe.org) 00:03.0 CA00 PCI2.10 PnP PMM+BFF905B0+BFEF05B0 CA00



Booting from ROM..
[0.00] Linux version 5.7.0-rc4-next-20200507+ 
(teawater@iZbp1bv6xzjxt9vmytkdnmZ) (gcc (GCC) 4.8.5 20150623 (Red Hat 
4.8.5-39), GNU ld version0
[0.00] Command line: console=ttyS0 root=/dev/sda nokaslr swiotlb=noforce
[0.00] KERNEL supported cpus:
[0.00]   Intel GenuineIntel
[0.00]   AMD AuthenticAMD
[0.00]   Hygon HygonGenuine
[0.00]   Centaur CentaurHauls
[0.00]   zhaoxin   Shanghai
[0.00] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point 
registers'
[0.00] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
[0.00] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
[0.00] x86/fpu: Supporting XSAVE feature 0x008: 'MPX bounds registers'
[0.00] x86/fpu: Supporting XSAVE feature 0x010: 'MPX CSR'
[0.00] x86/fpu: Supporting XSAVE feature 0x020: 'AVX-512 opmask'
[0.00] x86/fpu: Supporting XSAVE feature 0x040: 'AVX-512 Hi256'
[0.00] x86/fpu: Supporting XSAVE feature 0x080: 'AVX-512 ZMM_Hi256'
[0.00] x86/fpu: xstate_offset[2]:  576, xstate_sizes[2]:  256
[0.00] x86/fpu: xstate_offset[3]:  832, xstate_sizes[3]:   64
[0.00] x86/fpu: xstate_offset[4]:  896, xstate_sizes[4]:   64
[0.00] x86/fpu: xstate_offset[5]:  960, xstate_sizes[5]:   64
[0.00] x86/fpu: xstate_offset[6]: 1024, xstate_sizes[6]:  512
[0.00] x86/fpu: xstate_offset[7]: 1536, xstate_sizes[7]: 1024
[0.00] x86/fpu: Enabled xstate features 0xff, context size is 2560 
bytes, using 'compacted' format.
[0.00] BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x-0x0009fbff] usable
[0.00] BIOS-e820: [mem 0x0009fc00-0x0009] reserved
[0.00] BIOS-e820: [mem 0x000f-0x000f] reserved
[0.00] BIOS-e820: [mem 0x0010-0xbffd] usable
[0.00] BIOS-e820: [mem 0xbffe-0xbfff] reserved
[0.00] BIOS-e820: [mem 0xfeffc000-0xfeff] reserved
[0.00] BIOS-e820: [mem 0xfffc-0x] reserved
[0.00] BIOS-e820: [mem 0x0001-0x00013fff] usable
[0.00] NX (Execute Disable) protection: active
[0.00] SMBIOS 2.8 present.
[0.00] DMI: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[0.00] Hypervisor detected: KVM
[0.00] kvm-clock: Using msrs 4b564d01 and 4b564d00
[0.00] kvm-clock: cpu 0, msr 2f12001, primary cpu clock
[0.00] kvm-clock: using sched offset of 227993437 cycles
[0.02] clocksource: kvm-clock: mask: 0x max_cycles: 
0x1cd42e4dffb, max_idle_ns: 881590591483 ns
[0.03] tsc: Detected 2499.998 MHz processor
[0.001618] last_pfn = 0x14 max_arch_pfn = 0x4
[0.001659] x86/PAT: Configuration [0-7]: WB  WC  UC- UC  WB  WP  UC- WT
[0.001666] last_pfn = 0xbffe0 max_arch_pfn = 0x4
[0.005474] found SMP MP-table at [mem 0x000f5ab0-0x000f5abf]
[0.005649] check: Scanning 1 areas for low memory corruption
[0.005676] Using GB pages for direct mapping
[0.006325] ACPI: Early table checksum verification disabled
[0.006328] ACPI: RSDP 0x000F58B0 14 (v00 BOCHS )
[0.006330] ACPI: RSDT 0xBFFE1F1E 38 (v01 BOCHS  BXPCRSDT 
0001 BXPC 0001)
[0.006334] ACPI: FACP 0xBFFE1CF2 74 (v01 BOCHS  BXPCFACP 
0001 BXPC 0001)
[  

[virtio-dev] Re: [PATCH v6] iommu/virtio: Use page size bitmap supported by endpoint

2020-05-14 Thread Michael S. Tsirkin
On Thu, May 14, 2020 at 01:22:37PM +0530, Bharat Bhushan wrote:
> Different endpoint can support different page size, probe
> endpoint if it supports specific page size otherwise use
> global page sizes.
> 
> Device attached to domain should support a minimum of
> domain supported page sizes. If device supports more
> than domain supported page sizes then device is limited
> to use domain supported page sizes only.

OK so I am just trying to figure it out.
Before the patch, we always use the domain supported page sizes
right? With the patch, we still do, but we also probe and
validate that device supports all domain page sizes,
if it does not then we fail to attach the device.

This seems like a lot of effort for little benefit, can't
hypervisor simply make sure endpoints support the
iommu page sizes for us?



> 
> Signed-off-by: Bharat Bhushan 
> ---
> v5->v6
>  - property length before dereference
>  - Error out on no supported page sizes (page-size-mask is zero)
>  - Allow device to attach to domain even it supports
>minimum of domain supported page sizes. In that case device
>will use domain page sizes only.
>  - added format of pgsize_bitmap
> 
> v4->v5:
>  - Rebase to Linux v5.7-rc4
> 
> v3->v4:
>  - Fix whitespace error
> 
> v2->v3:
>  - Fixed error return for incompatible endpoint
>  - __u64 changed to __le64 in header file
> 
>  drivers/iommu/virtio-iommu.c  | 63 ---
>  include/uapi/linux/virtio_iommu.h | 14 ++-
>  2 files changed, 71 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
> index 4e1d11af23c8..cbac3047a781 100644
> --- a/drivers/iommu/virtio-iommu.c
> +++ b/drivers/iommu/virtio-iommu.c
> @@ -78,6 +78,7 @@ struct viommu_endpoint {
>   struct viommu_dev   *viommu;
>   struct viommu_domain*vdomain;
>   struct list_headresv_regions;
> + u64 pgsize_bitmap;
>  };
>  
>  struct viommu_request {
> @@ -415,6 +416,23 @@ static int viommu_replay_mappings(struct viommu_domain 
> *vdomain)
>   return ret;
>  }
>  
> +static int viommu_set_pgsize_bitmap(struct viommu_endpoint *vdev,
> + struct virtio_iommu_probe_pgsize_mask *mask,
> + size_t len)
> +{
> + u64 pgsize_bitmap;
> +
> + if (len < sizeof(*mask))
> + return -EINVAL;
> +
> + pgsize_bitmap = le64_to_cpu(mask->pgsize_bitmap);
> + if (!pgsize_bitmap)
> + return -EINVAL;
> +
> + vdev->pgsize_bitmap = pgsize_bitmap;
> + return 0;
> +}
> +
>  static int viommu_add_resv_mem(struct viommu_endpoint *vdev,
>  struct virtio_iommu_probe_resv_mem *mem,
>  size_t len)
> @@ -499,6 +517,9 @@ static int viommu_probe_endpoint(struct viommu_dev 
> *viommu, struct device *dev)
>   case VIRTIO_IOMMU_PROBE_T_RESV_MEM:
>   ret = viommu_add_resv_mem(vdev, (void *)prop, len);
>   break;
> + case VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK:
> + ret = viommu_set_pgsize_bitmap(vdev, (void *)prop, len);
> + break;
>   default:
>   dev_err(dev, "unknown viommu prop 0x%x\n", type);
>   }
> @@ -615,7 +636,7 @@ static int viommu_domain_finalise(struct viommu_endpoint 
> *vdev,
>   struct viommu_dev *viommu = vdev->viommu;
>   struct viommu_domain *vdomain = to_viommu_domain(domain);
>  
> - viommu_page_size = 1UL << __ffs(viommu->pgsize_bitmap);
> + viommu_page_size = 1UL << __ffs(vdev->pgsize_bitmap);
>   if (viommu_page_size > PAGE_SIZE) {
>   dev_err(vdev->dev,
>   "granule 0x%lx larger than system page size 0x%lx\n",


Looks like this is messed up on 32 bit: e.g. 0x1 will try to do
1UL << -1, which is undefined behaviour. Which is btw already messed up
wrt viommu->pgsize_bitmap, but that's not a reason to propagate
the error.


> @@ -630,7 +651,7 @@ static int viommu_domain_finalise(struct viommu_endpoint 
> *vdev,
>  
>   vdomain->id = (unsigned int)ret;
>  
> - domain->pgsize_bitmap   = viommu->pgsize_bitmap;
> + domain->pgsize_bitmap   = vdev->pgsize_bitmap;
>   domain->geometry= viommu->geometry;
>  
>   vdomain->map_flags  = viommu->map_flags;
> @@ -654,6 +675,38 @@ static void viommu_domain_free(struct iommu_domain 
> *domain)
>   kfree(vdomain);
>  }
>  
> +/*
> + * Check whether the endpoint's capabilities are compatible with other
> + * endpoints in the domain. Report any inconsistency.

This actually has side effects, so _is_ isn't a good name for it.
viommu_endpoint_compatible?

> + */
> +static bool viommu_endpoint_is_compatible(struct viommu_endpoint *vdev,
> +   struct viommu_domain *vdomain)
> +{
> + struct device *dev 

Re: [virtio-dev] [PATCH v3 00/15] virtio-mem: paravirtualized memory

2020-05-14 Thread David Hildenbrand
On 14.05.20 08:44, teawater wrote:
> Hi David,
> 
> I got a kernel warning with v2 and v3.

Hi Hui,

thanks for playing with the latest versions. Surprisingly, I can
reproduce even by hotplugging a DIMM instead - that's good, so
it's not related to virtio-mem, lol. Seems to be some QEMU setup issue
with older machine types.

Can you switch to a newer QEMU machine version, especially
pc-i440fx-5.0? Both hotplugging DIMMs and virtio-mem work for me with
that QEMU machine just fine.

What also seems to make it work with pc-i440fx-2.1 is giving the
machine 4G of initial memory (-m 4g,slots=10,maxmem=5G).

Cheers!


> // start a QEMU that is get from 
> https://github.com/davidhildenbrand/qemu/tree/virtio-mem-v2 and setup a file 
> as a ide disk.
> /home/teawater/qemu/qemu/x86_64-softmmu/qemu-system-x86_64 -machine 
> pc-i440fx-2.1,accel=kvm,usb=off -cpu host -no-reboot -nographic -device 
> ide-hd,drive=hd -drive if=none,id=hd,file=/home/teawater/old.img,format=raw 
> -kernel /home/teawater/kernel/bk2/arch/x86/boot/bzImage -append 
> "console=ttyS0 root=/dev/sda nokaslr swiotlb=noforce" -m 
> 1g,slots=10,maxmem=2G -smp 1 -s -monitor 
> unix:/home/teawater/qemu/m,server,nowait
> 
> // Setup virtio-mem and plug 256m memory in qemu monitor:
> object_add memory-backend-ram,id=mem1,size=256m
> device_add virtio-mem-pci,id=vm0,memdev=mem1
> qom-set vm0 requested-size 256M
> 
> // Go back to the terminal and access file system will got following kernel 
> warning.
> [   19.515549] pci :00:04.0: [1af4:1015] type 00 class 0x00ff00
> [   19.516227] pci :00:04.0: reg 0x10: [io  0x-0x007f]
> [   19.517196] pci :00:04.0: BAR 0: assigned [io  0x1000-0x107f]
> [   19.517843] virtio-pci :00:04.0: enabling device ( -> 0001)
> [   19.535957] PCI Interrupt Link [LNKD] enabled at IRQ 11
> [   19.536507] virtio-pci :00:04.0: virtio_pci: leaving for legacy driver
> [   19.537528] virtio_mem virtio0: start address: 0x1
> [   19.538094] virtio_mem virtio0: region size: 0x1000
> [   19.538621] virtio_mem virtio0: device block size: 0x20
> [   19.539186] virtio_mem virtio0: memory block size: 0x800
> [   19.539752] virtio_mem virtio0: subblock size: 0x40
> [   19.540357] virtio_mem virtio0: plugged size: 0x0
> [   19.540834] virtio_mem virtio0: requested size: 0x0
> [   20.170441] virtio_mem virtio0: plugged size: 0x0
> [   20.170933] virtio_mem virtio0: requested size: 0x1000
> [   20.172247] Built 1 zonelists, mobility grouping on.  Total pages: 266012
> [   20.172955] Policy zone: Normal
> 
> / # ls
> [   26.724565] [ cut here ]
> [   26.725047] ata_piix :00:01.1: DMA addr 0x00010fc14000+49152 
> overflow (mask , bus limit 0).
> [   26.726024] WARNING: CPU: 0 PID: 179 at 
> /home/teawater/kernel/linux2/kernel/dma/direct.c:364 
> dma_direct_map_page+0x118/0x130
> [   26.727141] Modules linked in:
> [   26.727456] CPU: 0 PID: 179 Comm: ls Not tainted 5.6.0-rc5-next-20200311+ 
> #9
> [   26.728163] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
> rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
> [   26.729305] RIP: 0010:dma_direct_map_page+0x118/0x130
> [   26.729825] Code: 8b 1f e8 3b 70 59 00 48 8d 4c 24 08 48 89 c6 4c 89 2c 24 
> 4d 89 e1 49 89 e8 48 89 da 48 c7 c7 08 6c 34 82 31 c0 e8 d8 8e f7 ff <00
> [   26.731683] RSP: :c9213838 EFLAGS: 00010082
> [   26.732205] RAX:  RBX: 88803ebeb1b0 RCX: 
> 82665148
> [   26.732913] RDX: 0001 RSI: 0092 RDI: 
> 0046
> [   26.733621] RBP: c000 R08: 01df R09: 
> 01df
> [   26.734338] R10:  R11: c92135a8 R12: 
> 
> [   26.735054] R13:  R14:  R15: 
> 88803d55f5b0
> [   26.735772] FS:  024e9880() GS:88803ec0() 
> knlGS:
> [   26.736579] CS:  0010 DS:  ES:  CR0: 80050033
> [   26.737162] CR2: 005bfc7f CR3: 000107e12004 CR4: 
> 00360ef0
> [   26.737879] DR0:  DR1:  DR2: 
> 
> [   26.738591] DR3:  DR6: fffe0ff0 DR7: 
> 0400
> [   26.739307] Call Trace:
> [   26.739564]  dma_direct_map_sg+0x64/0xb0
> [   26.739969]  ? ata_scsi_write_same_xlat+0x350/0x350
> [   26.740461]  ata_qc_issue+0x214/0x260
> [   26.740839]  ata_scsi_queuecmd+0x16a/0x490
> [   26.741255]  scsi_queue_rq+0x679/0xa60
> [   26.741639]  blk_mq_dispatch_rq_list+0x90/0x510
> [   26.742099]  ? elv_rb_del+0x1f/0x30
> [   26.742456]  ? deadline_remove_request+0x6a/0xb0
> [   26.742926]  blk_mq_do_dispatch_sched+0x78/0x100
> [   26.743397]  blk_mq_sched_dispatch_requests+0xf9/0x170
> [   26.743924]  __blk_mq_run_hw_queue+0x7e/0x130
> [   26.744365]  __blk_mq_delay_run_hw_queue+0x107/0x150
> [   26.744874]  blk_mq_run_hw_queue+0x61/0x100
> [   26.745299]  blk_mq_sched_inser

[virtio-dev] Re: [PATCH v3 1/4] dma-buf: add support for virtio exported objects

2020-05-14 Thread David Stevens
Sorry for the duplicate reply, didn't notice this until now.

> Just storing
> the uuid should be doable (assuming this doesn't change during the
> lifetime of the buffer), so no need for a callback.

Directly storing the uuid doesn't work that well because of
synchronization issues. The uuid needs to be shared between multiple
virtio devices with independent command streams, so to prevent races
between importing and exporting, the exporting driver can't share the
uuid with other drivers until it knows that the device has finished
registering the uuid. That requires a round trip to and then back from
the device. Using a callback allows the latency from that round trip
registration to be hidden.

-David

-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org



[virtio-dev] Re: [PATCH v3 1/4] dma-buf: add support for virtio exported objects

2020-05-14 Thread Gerd Hoffmann
  Hi,

> - for the runtime upcasting the usual approach is to check the ->ops
> pointer. Which means that would need to be the same for all virtio
> dma_bufs, which might get a bit awkward. But I'd really prefer we not
> add allocator specific stuff like this to dma-buf.

This is exactly the problem: it gets messy quickly, also when it comes
to using the drm_prime.c helpers ...
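
For context, the ->ops pointer check mentioned above usually looks
something like the sketch below; virtio_dmabuf_ops as a single ops
table shared by all virtio-exported dma-bufs is an assumption here,
not something taken from the posted series:

#include <linux/dma-buf.h>

/* Hypothetical shared ops table used by every virtio-exported dma-buf. */
extern const struct dma_buf_ops virtio_dmabuf_ops;

/* Runtime "upcast" test: is this dma-buf one of ours? */
static bool is_virtio_dma_buf(struct dma_buf *dma_buf)
{
        return dma_buf->ops == &virtio_dmabuf_ops;
}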

take care,
  Gerd


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org



[virtio-dev] [PATCH v6] iommu/virtio: Use page size bitmap supported by endpoint

2020-05-14 Thread Bharat Bhushan
Different endpoints can support different page sizes. Probe the
endpoint to see whether it supports a specific page size; otherwise
use the global page sizes.

A device attached to a domain should support at least the domain's
supported page sizes. If the device supports more page sizes than the
domain does, the device is limited to using the domain's supported
page sizes only.

Signed-off-by: Bharat Bhushan 
---
v5->v6
 - Check the property length before dereferencing it
 - Error out when no page sizes are supported (page-size-mask is zero)
 - Allow a device to attach to a domain as long as it supports at least
   the domain's supported page sizes. In that case the device will use
   the domain page sizes only.
 - Added a description of the pgsize_bitmap format

v4->v5:
 - Rebase to Linux v5.7-rc4

v3->v4:
 - Fix whitespace error

v2->v3:
 - Fixed error return for incompatible endpoint
 - __u64 changed to __le64 in header file

 drivers/iommu/virtio-iommu.c  | 63 ---
 include/uapi/linux/virtio_iommu.h | 14 ++-
 2 files changed, 71 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index 4e1d11af23c8..cbac3047a781 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -78,6 +78,7 @@ struct viommu_endpoint {
struct viommu_dev   *viommu;
struct viommu_domain*vdomain;
struct list_headresv_regions;
+   u64 pgsize_bitmap;
 };
 
 struct viommu_request {
@@ -415,6 +416,23 @@ static int viommu_replay_mappings(struct viommu_domain 
*vdomain)
return ret;
 }
 
+static int viommu_set_pgsize_bitmap(struct viommu_endpoint *vdev,
+   struct virtio_iommu_probe_pgsize_mask *mask,
+   size_t len)
+{
+   u64 pgsize_bitmap;
+
+   if (len < sizeof(*mask))
+   return -EINVAL;
+
+   pgsize_bitmap = le64_to_cpu(mask->pgsize_bitmap);
+   if (!pgsize_bitmap)
+   return -EINVAL;
+
+   vdev->pgsize_bitmap = pgsize_bitmap;
+   return 0;
+}
+
 static int viommu_add_resv_mem(struct viommu_endpoint *vdev,
   struct virtio_iommu_probe_resv_mem *mem,
   size_t len)
@@ -499,6 +517,9 @@ static int viommu_probe_endpoint(struct viommu_dev *viommu, 
struct device *dev)
case VIRTIO_IOMMU_PROBE_T_RESV_MEM:
ret = viommu_add_resv_mem(vdev, (void *)prop, len);
break;
+   case VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK:
+   ret = viommu_set_pgsize_bitmap(vdev, (void *)prop, len);
+   break;
default:
dev_err(dev, "unknown viommu prop 0x%x\n", type);
}
@@ -615,7 +636,7 @@ static int viommu_domain_finalise(struct viommu_endpoint 
*vdev,
struct viommu_dev *viommu = vdev->viommu;
struct viommu_domain *vdomain = to_viommu_domain(domain);
 
-   viommu_page_size = 1UL << __ffs(viommu->pgsize_bitmap);
+   viommu_page_size = 1UL << __ffs(vdev->pgsize_bitmap);
if (viommu_page_size > PAGE_SIZE) {
dev_err(vdev->dev,
"granule 0x%lx larger than system page size 0x%lx\n",
@@ -630,7 +651,7 @@ static int viommu_domain_finalise(struct viommu_endpoint 
*vdev,
 
vdomain->id = (unsigned int)ret;
 
-   domain->pgsize_bitmap   = viommu->pgsize_bitmap;
+   domain->pgsize_bitmap   = vdev->pgsize_bitmap;
domain->geometry= viommu->geometry;
 
vdomain->map_flags  = viommu->map_flags;
@@ -654,6 +675,38 @@ static void viommu_domain_free(struct iommu_domain *domain)
kfree(vdomain);
 }
 
+/*
+ * Check whether the endpoint's capabilities are compatible with other
+ * endpoints in the domain. Report any inconsistency.
+ */
+static bool viommu_endpoint_is_compatible(struct viommu_endpoint *vdev,
+ struct viommu_domain *vdomain)
+{
+   struct device *dev = vdev->dev;
+   u64 pgsize_bitmap;
+
+   if (vdomain->viommu != vdev->viommu) {
+   dev_err(dev, "cannot attach to foreign vIOMMU\n");
+   return false;
+   }
+
+   pgsize_bitmap = vdomain->domain.pgsize_bitmap & vdev->pgsize_bitmap;
+
+   if (pgsize_bitmap != vdomain->domain.pgsize_bitmap) {
+   dev_err(dev, "incompatible domain bitmap 0x%lx != 0x%llx\n",
+   vdomain->domain.pgsize_bitmap, vdev->pgsize_bitmap);
+   return false;
+   }
+
+   /* Domain pagesize bitmap is subset of device pagesize bitmap */
+   if (pgsize_bitmap != vdev->pgsize_bitmap) {
+   dev_info(dev, "page size bitmap used %llx, supported %llx\n",
+pgsize_bitmap, vdev->pgsize_bitmap);
+   vdev->pgsize_bitmap = pgsize_bitmap;
+   }
+   return true;
+}
+
 static int viommu_attach_dev(struct iommu_do
