Re: [PATCH 2/4] vhost-vdpa: reset vendor specific mapping to initial state in .release

2023-10-19 Thread Si-Wei Liu



On 10/19/2023 9:11 PM, Jason Wang wrote:

On Fri, Oct 20, 2023 at 6:28 AM Si-Wei Liu  wrote:



On 10/19/2023 7:39 AM, Eugenio Perez Martin wrote:

On Thu, Oct 19, 2023 at 10:27 AM Jason Wang  wrote:

On Thu, Oct 19, 2023 at 2:47 PM Si-Wei Liu  wrote:


On 10/18/2023 7:53 PM, Jason Wang wrote:

On Wed, Oct 18, 2023 at 4:49 PM Si-Wei Liu  wrote:

On 10/18/2023 12:00 AM, Jason Wang wrote:

Unfortunately, it's a must to stick to ABI. I agree it's a mess but we
don't have a better choice. Or we can fail the probe if userspace
doesn't ack this feature.

Another idea we can just do the following in vhost_vdpa reset?

config->reset()
if (IOTLB_PERSIST is not set) {
config->reset_map()
}

Then we don't have the burden to maintain them in the parent?

Thanks

Please see my earlier response in the other email, thanks.

%<%<

First, the ideal fix would be to leave this reset_vendor_mappings()
emulation code on the individual driver itself, which already has the
broken behavior.

So the point is, not about whether the existing behavior is "broken"
or not.

Hold on, I thought earlier we all agreed upon that the existing behavior
of vendor driver self-clearing maps during .reset violates the vhost
iotlb abstraction and also breaks the .set_map/.dma_map API. This is
100% buggy driver implementation itself that we should discourage or
eliminate as much as possible (that's part of the goal for this series),

I'm not saying it's not an issue, what I'm saying is, if the fix
breaks another userspace, it's a new bug in the kernel. See what Linus
said in [1]

"If a change results in user programs breaking, it's a bug in the kernel."


but here you seem to go existentialism and suggests the very opposite
that every .set_map/.dma_map driver implementation, regardless being the
current or the new/upcoming, should unconditionally try to emulate the
broken reset behavior for the sake of not breaking older userspace.

Such "emulation" is not done at the parent level. New parents just
need to implement reset_map() or not. everything could be done inside
vhost-vDPA as pseudo code that is shown above.


Set
aside the criteria and definition for how userspace can be broken, can
we step back to the original question why we think it's broken, and what
we can do to promote good driver implementation instead of discuss the
implementation details?

I'm not sure I get the point of this question. I'm not saying we don't
need to fix, what I am saying is that such a fix must be done in a
negotiable way. And it's better if parents won't get any burden. It
can just decide to implement reset_map() or not.


Reading the below response I found my major
points are not heard even if written for quite a few times.

I try my best to not ignore any important things, but I can't promise
I will not miss any. I hope the above clarifies my points.


It's not
that I don't understand the importance of not breaking old userspace, I
appreciate your questions and extra patience, however I do feel the
"broken" part is very relevant to our discussion here.
If it's broken (in the sense of vhost IOTLB API) that you agree, I think
we should at least allow good driver implementations; and when you think
about the possibility of those valid good driver cases
(.set_map/.dma_map implementations that do not clear maps in .reset),
you might be able to see why it's coded the way as it is now.


It's about whether we could stick to the old behaviour without
too much cost. And I believe we could.

And just to clarify here, reset_vendor_mappings() = config->reset_map()


But today there's no backend feature negotiation
between vhost-vdpa and the parent driver. Do we want to send down the
acked_backend_features to parent drivers?

There's no need to do that with the above code, or anything I missed here?

config->reset()
if (IOTLB_PERSIST is not set) {
 config->reset_map()
}

Implementation issue: this implies reset_map() has to be there for every
.set_map implementations, but vendor driver implementation for custom
IOMMU could well implement DMA ops by itself instead of .reset_map. This
won't work for every set_map driver (think about the vduse case).

Well let me do it once again, reset_map() is not mandated:

config->reset()
if (IOTLB_PERSIST is not set) {
    if (config->reset_map)
        config->reset_map()

To avoid new parent drivers

I am afraid it's not just new parent drivers, but any well behaved
driver today may well break userspace if go with this forced emulation
code, if they have to implement reset_map for some reason (e.g. restored
to 1:1 passthrough mapping or other default state in mapping). For new
userspace and user driver we can guard against it using the
IOTLB_PERSIST flag, but the above code would get a big chance to break
setup with good driver and older userspace in practice.

And .reset_map implementation doesn't necessarily need to clear maps.
For e.g. IOMMU API compliant 

Re: [PATCH 2/4] vhost-vdpa: reset vendor specific mapping to initial state in .release

2023-10-19 Thread Jason Wang
On Fri, Oct 20, 2023 at 6:28 AM Si-Wei Liu  wrote:
>
>
>
> On 10/19/2023 7:39 AM, Eugenio Perez Martin wrote:
> > On Thu, Oct 19, 2023 at 10:27 AM Jason Wang  wrote:
> >> On Thu, Oct 19, 2023 at 2:47 PM Si-Wei Liu  wrote:
> >>>
> >>>
> >>> On 10/18/2023 7:53 PM, Jason Wang wrote:
> >>>> On Wed, Oct 18, 2023 at 4:49 PM Si-Wei Liu  wrote:
> >>>>>
> >>>>> On 10/18/2023 12:00 AM, Jason Wang wrote:
> >>>>>>> Unfortunately, it's a must to stick to ABI. I agree it's a mess but we
> >>>>>>> don't have a better choice. Or we can fail the probe if userspace
> >>>>>>> doesn't ack this feature.
> >>>>>> Another idea we can just do the following in vhost_vdpa reset?
> >>>>>>
> >>>>>> config->reset()
> >>>>>> if (IOTLB_PERSIST is not set) {
> >>>>>>     config->reset_map()
> >>>>>> }
> >>>>>>
> >>>>>> Then we don't have the burden to maintain them in the parent?
> >>>>>>
> >>>>>> Thanks
> >>>>> Please see my earlier response in the other email, thanks.
> >>>>>
> >>>>> %<%<
> >>>>>
> >>>>> First, the ideal fix would be to leave this reset_vendor_mappings()
> >>>>> emulation code on the individual driver itself, which already has the
> >>>>> broken behavior.
> >>>> So the point is, not about whether the existing behavior is "broken"
> >>>> or not.
> >>> Hold on, I thought earlier we all agreed upon that the existing behavior
> >>> of vendor driver self-clearing maps during .reset violates the vhost
> >>> iotlb abstraction and also breaks the .set_map/.dma_map API. This is
> >>> 100% buggy driver implementation itself that we should discourage or
> >>> eliminate as much as possible (that's part of the goal for this series),
> >> I'm not saying it's not an issue, what I'm saying is, if the fix
> >> breaks another userspace, it's a new bug in the kernel. See what Linus
> >> said in [1]
> >>
> >> "If a change results in user programs breaking, it's a bug in the kernel."
> >>
> >>> but here you seem to go existentialism and suggests the very opposite
> >>> that every .set_map/.dma_map driver implementation, regardless being the
> >>> current or the new/upcoming, should unconditionally try to emulate the
> >>> broken reset behavior for the sake of not breaking older userspace.
> >> Such "emulation" is not done at the parent level. New parents just
> >> need to implement reset_map() or not. everything could be done inside
> >> vhost-vDPA as pseudo code that is shown above.
> >>
> >>> Set
> >>> aside the criteria and definition for how userspace can be broken, can
> >>> we step back to the original question why we think it's broken, and what
> >>> we can do to promote good driver implementation instead of discuss the
> >>> implementation details?
> >> I'm not sure I get the point of this question. I'm not saying we don't
> >> need to fix, what I am saying is that such a fix must be done in a
> >> negotiable way. And it's better if parents won't get any burden. It
> >> can just decide to implement reset_map() or not.
> >>
> >>> Reading the below response I found my major
> >>> points are not heard even if written for quite a few times.
> >> I try my best to not ignore any important things, but I can't promise
> >> I will not miss any. I hope the above clarifies my points.
> >>
> >>> It's not
> >>> that I don't understand the importance of not breaking old userspace, I
> >>> appreciate your questions and extra patience, however I do feel the
> >>> "broken" part is very relevant to our discussion here.
> >>> If it's broken (in the sense of vhost IOTLB API) that you agree, I think
> >>> we should at least allow good driver implementations; and when you think
> >>> about the possibility of those valid good driver cases
> >>> (.set_map/.dma_map implementations that do not clear maps in .reset),
> >>> you might be able to see why it's coded the way as it is now.
> >>>
> >>>> It's about whether we could stick to the old behaviour without
> >>>> too much cost. And I believe we could.
> >>>>
> >>>> And just to clarify here, reset_vendor_mappings() = config->reset_map()
> >>>>
> >>>>> But today there's no backend feature negotiation
> >>>>> between vhost-vdpa and the parent driver. Do we want to send down the
> >>>>> acked_backend_features to parent drivers?
> >>>> There's no need to do that with the above code, or anything I missed here?
> >>>>
> >>>> config->reset()
> >>>> if (IOTLB_PERSIST is not set) {
> >>>>     config->reset_map()
> >>>> }
> >>> Implementation issue: this implies reset_map() has to be there for every
> >>> .set_map implementations, but vendor driver implementation for custom
> >>> IOMMU could well implement DMA ops by itself instead of .reset_map. This
> >>> won't work for every set_map driver (think about the vduse case).
> >> Well let me do it once again, reset_map() is not mandated:
> >>
> >> config->reset()
> >> if (IOTLB_PERSIST is not set) {
> >>  if (config->reset_map)
> >>config->reset_map()
> > To avoid new parent drivers
> I 

Re: [PATCH 2/4] vhost-vdpa: reset vendor specific mapping to initial state in .release

2023-10-19 Thread Si-Wei Liu




On 10/17/2023 10:27 PM, Jason Wang wrote:

   If we do
this without a negotiation, IOTLB will not be clear but the Qemu will
try to re-program the IOTLB after reset. Which will break?

1) stick the exact old behaviour with just one line of check

It's not just one line of check here, the old behavior emulation has to
be done as Eugenio illustrated in the other email.

For vhost-vDPA it's just

if (IOTLB_PERSIST is acked by userspace)
 reset_map()
... and this reset_map in vhost_vdpa_cleanup can't be negotiable 
depending on IOTLB_PERSIST. Consider the case where user switches to 
virtio-vdpa after an older userspace using vhost-vdpa finished running. 
Even with buggy_virtio_reset_map in place it's unwarranted the vendor 
IOMMU can get back to the default state, e.g. ending with 1:1 
passthrough mapping. If not doing this unconditionally it will get a big 
chance to break userspace.
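
[Editor's sketch] The difference between a negotiated and an unconditional
reset_map in the cleanup path can be modeled in a few lines of plain C. This
is only an illustration of the argument above, not the patch itself: the
types demo_dev and demo_map_state and both functions are hypothetical, and
the model assumes a well-behaved driver whose .reset does not touch its
mappings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a vendor IOMMU's mapping state. */
enum demo_map_state {
    DEMO_MAP_DEFAULT_PASSTHROUGH, /* initial 1:1 state, usable by virtio-vdpa */
    DEMO_MAP_USER_PROGRAMMED      /* mappings installed by vhost-vdpa userspace */
};

/* Hypothetical device model; none of these fields exist in the kernel. */
struct demo_dev {
    enum demo_map_state state;
    bool persist_acked;  /* userspace acked IOTLB_PERSIST */
    bool has_reset_map;  /* parent driver implements .reset_map */
};

/* Cleanup that restores the default mapping only if IOTLB_PERSIST was acked. */
static void demo_cleanup_negotiated(struct demo_dev *d)
{
    assert(d != NULL);
    if (d->persist_acked && d->has_reset_map)
        d->state = DEMO_MAP_DEFAULT_PASSTHROUGH;
}

/* Cleanup that always restores the default mapping when .reset_map exists. */
static void demo_cleanup_unconditional(struct demo_dev *d)
{
    assert(d != NULL);
    if (d->has_reset_map)
        d->state = DEMO_MAP_DEFAULT_PASSTHROUGH;
}
```

Under these assumptions, an older userspace that never acks IOTLB_PERSIST
leaves the device in the user-programmed state after the negotiated variant,
while the unconditional variant returns it to the default state before a
later virtio-vdpa bind.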


-Siwei



For parent, it's somehow similar:

during .reset()

if (IOTLB_PERSIST is not acked by userspace)
 reset_vendor_mappings()

Anything I missed here?


___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 2/4] vhost-vdpa: reset vendor specific mapping to initial state in .release

2023-10-19 Thread Si-Wei Liu



On 10/19/2023 7:39 AM, Eugenio Perez Martin wrote:

On Thu, Oct 19, 2023 at 10:27 AM Jason Wang  wrote:

On Thu, Oct 19, 2023 at 2:47 PM Si-Wei Liu  wrote:



On 10/18/2023 7:53 PM, Jason Wang wrote:

On Wed, Oct 18, 2023 at 4:49 PM Si-Wei Liu  wrote:


On 10/18/2023 12:00 AM, Jason Wang wrote:

Unfortunately, it's a must to stick to ABI. I agree it's a mess but we
don't have a better choice. Or we can fail the probe if userspace
doesn't ack this feature.

Another idea we can just do the following in vhost_vdpa reset?

config->reset()
if (IOTLB_PERSIST is not set) {
   config->reset_map()
}

Then we don't have the burden to maintain them in the parent?

Thanks

Please see my earlier response in the other email, thanks.

%<%<

First, the ideal fix would be to leave this reset_vendor_mappings()
emulation code on the individual driver itself, which already has the
broken behavior.

So the point is, not about whether the existing behavior is "broken"
or not.

Hold on, I thought earlier we all agreed upon that the existing behavior
of vendor driver self-clearing maps during .reset violates the vhost
iotlb abstraction and also breaks the .set_map/.dma_map API. This is
100% buggy driver implementation itself that we should discourage or
eliminate as much as possible (that's part of the goal for this series),

I'm not saying it's not an issue, what I'm saying is, if the fix
breaks another userspace, it's a new bug in the kernel. See what Linus
said in [1]

"If a change results in user programs breaking, it's a bug in the kernel."


but here you seem to go existentialism and suggests the very opposite
that every .set_map/.dma_map driver implementation, regardless being the
current or the new/upcoming, should unconditionally try to emulate the
broken reset behavior for the sake of not breaking older userspace.

Such "emulation" is not done at the parent level. New parents just
need to implement reset_map() or not. everything could be done inside
vhost-vDPA as pseudo code that is shown above.


Set
aside the criteria and definition for how userspace can be broken, can
we step back to the original question why we think it's broken, and what
we can do to promote good driver implementation instead of discuss the
implementation details?

I'm not sure I get the point of this question. I'm not saying we don't
need to fix, what I am saying is that such a fix must be done in a
negotiable way. And it's better if parents won't get any burden. It
can just decide to implement reset_map() or not.


Reading the below response I found my major
points are not heard even if written for quite a few times.

I try my best to not ignore any important things, but I can't promise
I will not miss any. I hope the above clarifies my points.


It's not
that I don't understand the importance of not breaking old userspace, I
appreciate your questions and extra patience, however I do feel the
"broken" part is very relevant to our discussion here.
If it's broken (in the sense of vhost IOTLB API) that you agree, I think
we should at least allow good driver implementations; and when you think
about the possibility of those valid good driver cases
(.set_map/.dma_map implementations that do not clear maps in .reset),
you might be able to see why it's coded the way as it is now.


   It's about whether we could stick to the old behaviour without
too much cost. And I believe we could.

And just to clarify here, reset_vendor_mappings() = config->reset_map()


But today there's no backend feature negotiation
between vhost-vdpa and the parent driver. Do we want to send down the
acked_backend_features to parent drivers?

There's no need to do that with the above code, or anything I missed here?

config->reset()
if (IOTLB_PERSIST is not set) {
config->reset_map()
}

Implementation issue: this implies reset_map() has to be there for every
.set_map implementations, but vendor driver implementation for custom
IOMMU could well implement DMA ops by itself instead of .reset_map. This
won't work for every set_map driver (think about the vduse case).

Well let me do it once again, reset_map() is not mandated:

config->reset()
if (IOTLB_PERSIST is not set) {
 if (config->reset_map)
   config->reset_map()

To avoid new parent drivers
I am afraid it's not just new parent drivers, but any well behaved 
driver today may well break userspace if go with this forced emulation 
code, if they have to implement reset_map for some reason (e.g. restored 
to 1:1 passthrough mapping or other default state in mapping). For new 
userspace and user driver we can guard against it using the 
IOTLB_PERSIST flag, but the above code would get a big chance to break 
setup with good driver and older userspace in practice.


And .reset_map implementation doesn't necessarily need to clear maps. 
For e.g. IOMMU API compliant driver that only needs simple DMA model for 
passthrough, all .reset_map has to do is toggle to 

Re: [PATCH 2/4] vhost-vdpa: reset vendor specific mapping to initial state in .release

2023-10-19 Thread Jason Wang
On Thu, Oct 19, 2023 at 2:47 PM Si-Wei Liu  wrote:
>
>
>
> On 10/18/2023 7:53 PM, Jason Wang wrote:
> > On Wed, Oct 18, 2023 at 4:49 PM Si-Wei Liu  wrote:
> >>
> >>
> >> On 10/18/2023 12:00 AM, Jason Wang wrote:
> >>>> Unfortunately, it's a must to stick to ABI. I agree it's a mess but we
> >>>> don't have a better choice. Or we can fail the probe if userspace
> >>>> doesn't ack this feature.
> >>> Another idea we can just do the following in vhost_vdpa reset?
> >>>
> >>> config->reset()
> >>> if (IOTLB_PERSIST is not set) {
> >>>   config->reset_map()
> >>> }
> >>>
> >>> Then we don't have the burden to maintain them in the parent?
> >>>
> >>> Thanks
> >> Please see my earlier response in the other email, thanks.
> >>
> >> %<%<
> >>
> >> First, the ideal fix would be to leave this reset_vendor_mappings()
> >> emulation code on the individual driver itself, which already has the
> >> broken behavior.
> > So the point is, not about whether the existing behavior is "broken"
> > or not.
> Hold on, I thought earlier we all agreed upon that the existing behavior
> of vendor driver self-clearing maps during .reset violates the vhost
> iotlb abstraction and also breaks the .set_map/.dma_map API. This is
> 100% buggy driver implementation itself that we should discourage or
> eliminate as much as possible (that's part of the goal for this series),

I'm not saying it's not an issue, what I'm saying is, if the fix
breaks another userspace, it's a new bug in the kernel. See what Linus
said in [1]

"If a change results in user programs breaking, it's a bug in the kernel."

> but here you seem to go existentialism and suggests the very opposite
> that every .set_map/.dma_map driver implementation, regardless being the
> current or the new/upcoming, should unconditionally try to emulate the
> broken reset behavior for the sake of not breaking older userspace.

Such "emulation" is not done at the parent level. New parents just
need to implement reset_map() or not. everything could be done inside
vhost-vDPA as pseudo code that is shown above.

> Set
> aside the criteria and definition for how userspace can be broken, can
> we step back to the original question why we think it's broken, and what
> we can do to promote good driver implementation instead of discuss the
> implementation details?

I'm not sure I get the point of this question. I'm not saying we don't
need to fix, what I am saying is that such a fix must be done in a
negotiable way. And it's better if parents won't get any burden. It
can just decide to implement reset_map() or not.

> Reading the below response I found my major
> points are not heard even if written for quite a few times.

I try my best to not ignore any important things, but I can't promise
I will not miss any. I hope the above clarifies my points.

> It's not
> that I don't understand the importance of not breaking old userspace, I
> appreciate your questions and extra patience, however I do feel the
> "broken" part is very relevant to our discussion here.
> If it's broken (in the sense of vhost IOTLB API) that you agree, I think
> we should at least allow good driver implementations; and when you think
> about the possibility of those valid good driver cases
> (.set_map/.dma_map implementations that do not clear maps in .reset),
> you might be able to see why it's coded the way as it is now.
>
> >   It's about whether we could stick to the old behaviour without
> > too much cost. And I believe we could.
> >
> > And just to clarify here, reset_vendor_mappings() = config->reset_map()
> >
> >> But today there's no backend feature negotiation
> >> between vhost-vdpa and the parent driver. Do we want to send down the
> >> acked_backend_features to parent drivers?
> > There's no need to do that with the above code, or anything I missed here?
> >
> > config->reset()
> > if (IOTLB_PERSIST is not set) {
> >config->reset_map()
> > }
> Implementation issue: this implies reset_map() has to be there for every
> .set_map implementations, but vendor driver implementation for custom
> IOMMU could well implement DMA ops by itself instead of .reset_map. This
> won't work for every set_map driver (think about the vduse case).

Well let me do it once again, reset_map() is not mandated:

config->reset()
if (IOTLB_PERSIST is not set) {
    if (config->reset_map)
        config->reset_map()
}

Did you see any issue with VDUSE in this case?
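
[Editor's sketch] For readers who want to trace the pseudo code above, here
is a self-contained C model of it. It is a sketch under stated assumptions,
not kernel code: struct mock_vdpa, struct mock_config_ops and
sketch_vhost_vdpa_reset() are hypothetical stand-ins for the real vdpa
structures, and the iotlb_persist_acked field stands in for "IOTLB_PERSIST
is set".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct mock_vdpa;

/* Hypothetical subset of a parent driver's config ops. */
struct mock_config_ops {
    int (*reset)(struct mock_vdpa *dev);
    int (*reset_map)(struct mock_vdpa *dev, unsigned int asid); /* optional */
};

/* Hypothetical device model with counters so the calls can be observed. */
struct mock_vdpa {
    const struct mock_config_ops *config;
    bool iotlb_persist_acked; /* userspace acked IOTLB_PERSIST */
    int nr_maps;              /* number of mappings currently installed */
    int reset_calls;
    int reset_map_calls;
};

/* A well-behaved .reset: resets device state, leaves mappings alone. */
static int mock_reset(struct mock_vdpa *dev)
{
    dev->reset_calls++;
    return 0;
}

/* A .reset_map that restores the initial (here: empty) mapping state. */
static int mock_reset_map(struct mock_vdpa *dev, unsigned int asid)
{
    (void)asid;
    dev->nr_maps = 0;
    dev->reset_map_calls++;
    return 0;
}

static const struct mock_config_ops ops_with_reset_map = {
    .reset = mock_reset,
    .reset_map = mock_reset_map,
};

static const struct mock_config_ops ops_without_reset_map = {
    .reset = mock_reset,
    .reset_map = NULL,
};

/* Reset the device; emulate the old map-clearing only without IOTLB_PERSIST. */
static int sketch_vhost_vdpa_reset(struct mock_vdpa *dev)
{
    int ret;

    assert(dev != NULL && dev->config != NULL && dev->config->reset != NULL);

    ret = dev->config->reset(dev);
    if (ret)
        return ret;

    /* reset_map is optional, so check the pointer before calling it. */
    if (!dev->iotlb_persist_acked && dev->config->reset_map)
        ret = dev->config->reset_map(dev, 0);

    return ret;
}
```

In this model a parent without reset_map keeps its mappings across reset
whether or not IOTLB_PERSIST was acked, which is the case the VDUSE question
is about.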

>
> But this is not the point I was making. I think if you agree this is
> purely buggy driver implementation of its own, we should try to isolate
> this buggy behavior to individual driver rather than overload vhost-vdpa
> or vdpa core's role to help implement the emulation of broken driver
> behavior.

As I pointed out, if it is not noticeable in the userspace, that's
fine but it's not.

> I don't get why .reset is special here, the abuse of .reset to
> manipulate mapping could also happen in other IOMMU 

Re: [PATCH 2/4] vhost-vdpa: reset vendor specific mapping to initial state in .release

2023-10-19 Thread Si-Wei Liu



On 10/18/2023 7:53 PM, Jason Wang wrote:

On Wed, Oct 18, 2023 at 4:49 PM Si-Wei Liu  wrote:



On 10/18/2023 12:00 AM, Jason Wang wrote:

Unfortunately, it's a must to stick to ABI. I agree it's a mess but we
don't have a better choice. Or we can fail the probe if userspace
doesn't ack this feature.

Another idea we can just do the following in vhost_vdpa reset?

config->reset()
if (IOTLB_PERSIST is not set) {
  config->reset_map()
}

Then we don't have the burden to maintain them in the parent?

Thanks

Please see my earlier response in the other email, thanks.

%<%<

First, the ideal fix would be to leave this reset_vendor_mappings()
emulation code on the individual driver itself, which already has the
broken behavior.

So the point is, not about whether the existing behavior is "broken"
or not.
Hold on, I thought earlier we all agreed upon that the existing behavior 
of vendor driver self-clearing maps during .reset violates the vhost 
iotlb abstraction and also breaks the .set_map/.dma_map API. This is 
100% buggy driver implementation itself that we should discourage or 
eliminate as much as possible (that's part of the goal for this series), 
but here you seem to go existentialism and suggests the very opposite 
that every .set_map/.dma_map driver implementation, regardless being the 
current or the new/upcoming, should unconditionally try to emulate the 
broken reset behavior for the sake of not breaking older userspace. Set 
aside the criteria and definition for how userspace can be broken, can 
we step back to the original question why we think it's broken, and what 
we can do to promote good driver implementation instead of discuss the 
implementation details? Reading the below response I found my major 
points are not heard even if written for quite a few times. It's not 
that I don't understand the importance of not breaking old userspace, I 
appreciate your questions and extra patience, however I do feel the 
"broken" part is very relevant to our discussion here.


If it's broken (in the sense of vhost IOTLB API) that you agree, I think 
we should at least allow good driver implementations; and when you think 
about the possibility of those valid good driver cases 
(.set_map/.dma_map implementations that do not clear maps in .reset),  
you might be able to see why it's coded the way as it is now.



  It's about whether we could stick to the old behaviour without
too much cost. And I believe we could.

And just to clarify here, reset_vendor_mappings() = config->reset_map()


But today there's no backend feature negotiation
between vhost-vdpa and the parent driver. Do we want to send down the
acked_backend_features to parent drivers?

There's no need to do that with the above code, or anything I missed here?

config->reset()
if (IOTLB_PERSIST is not set) {
   config->reset_map()
}
Implementation issue: this implies reset_map() has to be there for every 
.set_map implementations, but vendor driver implementation for custom 
IOMMU could well implement DMA ops by itself instead of .reset_map. This 
won't work for every set_map driver (think about the vduse case).


But this is not the point I was making. I think if you agree this is
purely buggy driver implementation of its own, we should try to isolate 
this buggy behavior to individual driver rather than overload vhost-vdpa 
or vdpa core's role to help implement the emulation of broken driver 
behavior. I don't get why .reset is special here, the abuse of .reset to 
manipulate mapping could also happen in other IOMMU unrelated driver 
entries like in .suspend, or in queue_reset. If someday userspace is 
found coded around similar buggy driver implementation in other driver 
ops, do we want to follow and duplicate the same emulation in vdpa core 
as the precedent is already set here around .reset?
The buggy driver can fail in a lot of other ways indefinitely during 
reset, if there's a buggy driver that's already broken the way as how it 
is and happens to survive with all userspace apps, we just don't care 
and let it be. There's no way we can enumerate all those buggy behaviors 
in .reset_map itself, it's overloading that driver API too much.

Second, IOTLB_PERSIST is needed but not sufficient. Due to lack of
backend feature negotiation in parent driver, if vhost-vdpa has to
provide the old-behaviour emulation for compatibility on driver's
behalf, it needs to be done per-driver basis. There could be good
on-chip or vendor IOMMU implementation which doesn't clear the IOTLB in
.reset, and vendor specific IOMMU doesn't have to provide .reset_map,

Then we just don't offer IOTLB_PERSIST, isn't this by design?
Think about the vduse case, it can work with DMA ops directly so doesn't 
have to implement .reset_map, unless for some specific good reason. 
Because it's a conforming and valid/good driver implementation, we may 
still allow it to advertise IOTLB_PERSIST to userspace. Which 

Re: [PATCH 2/4] vhost-vdpa: reset vendor specific mapping to initial state in .release

2023-10-18 Thread Jason Wang
On Wed, Oct 18, 2023 at 4:49 PM Si-Wei Liu  wrote:
>
>
>
> On 10/18/2023 12:00 AM, Jason Wang wrote:
> >> Unfortunately, it's a must to stick to ABI. I agree it's a mess but we
> >> don't have a better choice. Or we can fail the probe if userspace
> >> doesn't ack this feature.
> > Another idea we can just do the following in vhost_vdpa reset?
> >
> > config->reset()
> > if (IOTLB_PERSIST is not set) {
> >  config->reset_map()
> > }
> >
> > Then we don't have the burden to maintain them in the parent?
> >
> > Thanks
> Please see my earlier response in the other email, thanks.
>
> %<%<
>
> First, the ideal fix would be to leave this reset_vendor_mappings()
> emulation code on the individual driver itself, which already has the
> broken behavior.

So the point is, not about whether the existing behavior is "broken"
or not. It's about whether we could stick to the old behaviour without
too much cost. And I believe we could.

And just to clarify here, reset_vendor_mappings() = config->reset_map()

> But today there's no backend feature negotiation
> between vhost-vdpa and the parent driver. Do we want to send down the
> acked_backend_features to parent drivers?

There's no need to do that with the above code, or anything I missed here?

config->reset()
if (IOTLB_PERSIST is not set) {
  config->reset_map()
}

>
> Second, IOTLB_PERSIST is needed but not sufficient. Due to lack of
> backend feature negotiation in parent driver, if vhost-vdpa has to
> provide the old-behaviour emulation for compatibility on driver's
> behalf, it needs to be done per-driver basis. There could be good
> on-chip or vendor IOMMU implementation which doesn't clear the IOTLB in
> .reset, and vendor specific IOMMU doesn't have to provide .reset_map,

Then we just don't offer IOTLB_PERSIST, isn't this by design?

> we
> should allow these good driver implementations rather than
> unconditionally stick to some specific problematic behavior for every
> other good driver.

Then you can force reset_map() with set_map() that is what I suggest
in another thread, no?
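
[Editor's sketch] One possible reading of "force reset_map() with set_map()"
is a sanity check on the parent's ops table, so that a driver providing
.set_map must also provide .reset_map. The snippet below is a hypothetical
illustration of that reading only; struct sketch_map_ops,
sketch_validate_map_ops() and the dummy callbacks are invented for this
example and are not taken from the patch series or from the vDPA core.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical subset of a parent driver's mapping-related ops. */
struct sketch_map_ops {
    int (*set_map)(void *dev, unsigned int asid, void *iotlb);
    int (*reset_map)(void *dev, unsigned int asid);
};

/* Dummy callbacks so an ops table can be populated for the check. */
static int sketch_dummy_set_map(void *dev, unsigned int asid, void *iotlb)
{
    (void)dev;
    (void)asid;
    (void)iotlb;
    return 0;
}

static int sketch_dummy_reset_map(void *dev, unsigned int asid)
{
    (void)dev;
    (void)asid;
    return 0;
}

/* Reject an ops table that has .set_map but lacks .reset_map. */
static int sketch_validate_map_ops(const struct sketch_map_ops *ops)
{
    assert(ops != NULL);

    if (ops->set_map && !ops->reset_map)
        return -EINVAL;

    return 0;
}
```

A driver with neither op (for example one relying on the platform IOMMU)
passes the check, since the pairing is only required when .set_map is
present.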

> Then we need a set of device flags (backend_features
> bit again?) to indicate the specific driver needs upper layer's help on
> old-behaviour emulation.
>
> Last but not least, I'm not sure how to properly emulate
> reset_vendor_mappings() from vhost-vdpa layer. If a vendor driver has no
> .reset_map op implemented, or if .reset_map has a slightly different
> implementation than what it used to reset the iotlb in the .reset op,

See above, for reset_vendor_mappings() I meant config->reset_map() exactly.

Thanks

> then this either becomes effectively dead code if no one ends up using,
> or the vhost-vdpa emulation is helpless and limited in scope, unable to
> cover all the cases.
>
> %<%<
>


Re: [PATCH 2/4] vhost-vdpa: reset vendor specific mapping to initial state in .release

2023-10-18 Thread Jason Wang
On Thu, Oct 19, 2023 at 7:21 AM Si-Wei Liu  wrote:
>
>
>
> On 10/18/2023 4:14 AM, Eugenio Perez Martin wrote:
> > On Wed, Oct 18, 2023 at 10:44 AM Si-Wei Liu  wrote:
> >>
> >>
> >> On 10/17/2023 10:27 PM, Jason Wang wrote:
> >>> On Wed, Oct 18, 2023 at 12:36 PM Si-Wei Liu  wrote:
> 
>  On 10/16/2023 7:35 PM, Jason Wang wrote:
> > On Tue, Oct 17, 2023 at 4:30 AM Si-Wei Liu  
> > wrote:
> >> On 10/16/2023 4:28 AM, Eugenio Perez Martin wrote:
> >>> On Mon, Oct 16, 2023 at 8:33 AM Jason Wang  
> >>> wrote:
>  On Fri, Oct 13, 2023 at 3:36 PM Si-Wei Liu  
>  wrote:
> > On 10/12/2023 8:01 PM, Jason Wang wrote:
> >> On Tue, Oct 10, 2023 at 5:05 PM Si-Wei Liu  
> >> wrote:
> >>> Devices with on-chip IOMMU or vendor specific IOTLB implementation
> >>> may need to restore iotlb mapping to the initial or default state
> >>> using the .reset_map op, as it's desirable for some parent devices
> >>> to solely manipulate mappings by its own, independent of virtio 
> >>> device
> >>> state. For instance, device reset does not cause mapping go away 
> >>> on
> >>> such IOTLB model in need of persistent mapping. Before vhost-vdpa
> >>> is going away, give them a chance to reset iotlb back to the 
> >>> initial
> >>> state in vhost_vdpa_cleanup().
> >>>
> >>> Signed-off-by: Si-Wei Liu 
> >>> ---
> >>>   drivers/vhost/vdpa.c | 16 ++++++++++++++++
> >>>   1 file changed, 16 insertions(+)
> >>>
> >>> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> >>> index 851535f..a3f8160 100644
> >>> --- a/drivers/vhost/vdpa.c
> >>> +++ b/drivers/vhost/vdpa.c
> >>> @@ -131,6 +131,15 @@ static struct vhost_vdpa_as 
> >>> *vhost_vdpa_find_alloc_as(struct vhost_vdpa *v,
> >>>  return vhost_vdpa_alloc_as(v, asid);
> >>>   }
> >>>
> >>> +static void vhost_vdpa_reset_map(struct vhost_vdpa *v, u32 asid)
> >>> +{
> >>> +   struct vdpa_device *vdpa = v->vdpa;
> >>> +   const struct vdpa_config_ops *ops = vdpa->config;
> >>> +
> >>> +   if (ops->reset_map)
> >>> +   ops->reset_map(vdpa, asid);
> >>> +}
> >>> +
> >>>   static int vhost_vdpa_remove_as(struct vhost_vdpa *v, u32 
> >>> asid)
> >>>   {
> >>>  struct vhost_vdpa_as *as = asid_to_as(v, asid);
> >>> @@ -140,6 +149,13 @@ static int vhost_vdpa_remove_as(struct 
> >>> vhost_vdpa *v, u32 asid)
> >>>
> >>>  hlist_del(&as->hash_link);
> >>>  vhost_vdpa_iotlb_unmap(v, &as->iotlb, 0ULL, 0ULL - 1, asid);
> >>> +   /*
> >>> +* Devices with vendor specific IOMMU may need to restore
> >>> +* iotlb to the initial or default state which is not done
> >>> +* through device reset, as the IOTLB mapping manipulation
> >>> +* could be decoupled from the virtio device life cycle.
> >>> +*/
> >> Should we do this according to whether IOTLB_PERSIST is set?
> > Well, in theory this seems like so but it's unnecessary code change
> > actually, as that is the way how vDPA parent behind platform IOMMU 
> > works
> > today, and userspace doesn't break as of today. :)
>  Well, this is a question I've asked before. You have explained
>  that one of the reasons we don't break userspace is that it may
>  couple IOTLB reset with vDPA reset as well. One example is QEMU.
> 
> > As explained in previous threads [1][2], when IOTLB_PERSIST is not 
> > set
> > it doesn't necessarily mean the iotlb will definitely be destroyed
> > across reset (think about the platform IOMMU case), so userspace 
> > today
> > is already tolerating enough with either good or bad IOMMU.
> > I'm confused, how to define tolerating here?
>  Tolerating means that QEMU has to proactively unmap before reset just to
>  work around the driver bug (on-chip maps out of sync), unconditionally,
>  for both platform and on-chip IOMMU. We all know it doesn't have to do so
>  for platform IOMMU, but userspace has no means to distinguish. In effect,
>  userspace is sacrificing reset-time performance on platform IOMMU setups
>  just to work around a buggy implementation in the other setup.
> >>> Ok, so what you actually mean is that userspace can tolerate the "bug"
> >>> with the performance penalty.
> >> Right.
> >>>
> > For example, if it has tolerance, why bother?
>  I'm not sure I get the question. But I think userspace is compromising
>  because of buggy implementation in a few 

Re: [PATCH 2/4] vhost-vdpa: reset vendor specific mapping to initial state in .release

2023-10-18 Thread Si-Wei Liu



On 10/18/2023 4:14 AM, Eugenio Perez Martin wrote:

On Wed, Oct 18, 2023 at 10:44 AM Si-Wei Liu  wrote:



On 10/17/2023 10:27 PM, Jason Wang wrote:

On Wed, Oct 18, 2023 at 12:36 PM Si-Wei Liu  wrote:


On 10/16/2023 7:35 PM, Jason Wang wrote:

On Tue, Oct 17, 2023 at 4:30 AM Si-Wei Liu  wrote:

On 10/16/2023 4:28 AM, Eugenio Perez Martin wrote:

On Mon, Oct 16, 2023 at 8:33 AM Jason Wang  wrote:

On Fri, Oct 13, 2023 at 3:36 PM Si-Wei Liu  wrote:

On 10/12/2023 8:01 PM, Jason Wang wrote:

On Tue, Oct 10, 2023 at 5:05 PM Si-Wei Liu  wrote:

Devices with on-chip IOMMU or vendor specific IOTLB implementation
may need to restore iotlb mapping to the initial or default state
using the .reset_map op, as it's desirable for some parent devices
to solely manipulate mappings by its own, independent of virtio device
state. For instance, device reset does not cause mapping go away on
such IOTLB model in need of persistent mapping. Before vhost-vdpa
is going away, give them a chance to reset iotlb back to the initial
state in vhost_vdpa_cleanup().

Signed-off-by: Si-Wei Liu 
---
  drivers/vhost/vdpa.c | 16 
  1 file changed, 16 insertions(+)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index 851535f..a3f8160 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -131,6 +131,15 @@ static struct vhost_vdpa_as 
*vhost_vdpa_find_alloc_as(struct vhost_vdpa *v,
 return vhost_vdpa_alloc_as(v, asid);
  }

+static void vhost_vdpa_reset_map(struct vhost_vdpa *v, u32 asid)
+{
+   struct vdpa_device *vdpa = v->vdpa;
+   const struct vdpa_config_ops *ops = vdpa->config;
+
+   if (ops->reset_map)
+   ops->reset_map(vdpa, asid);
+}
+
  static int vhost_vdpa_remove_as(struct vhost_vdpa *v, u32 asid)
  {
 struct vhost_vdpa_as *as = asid_to_as(v, asid);
@@ -140,6 +149,13 @@ static int vhost_vdpa_remove_as(struct vhost_vdpa *v, u32 
asid)

 hlist_del(&as->hash_link);
 vhost_vdpa_iotlb_unmap(v, &as->iotlb, 0ULL, 0ULL - 1, asid);
+   /*
+* Devices with vendor specific IOMMU may need to restore
+* iotlb to the initial or default state which is not done
+* through device reset, as the IOTLB mapping manipulation
+* could be decoupled from the virtio device life cycle.
+*/

Should we do this according to whether IOTLB_PERSIST is set?

Well, in theory this seems like so but it's unnecessary code change
actually, as that is the way how vDPA parent behind platform IOMMU works
today, and userspace doesn't break as of today. :)

Well, this is a question I've asked before. You have explained
that one of the reasons we don't break userspace is that it may
couple IOTLB reset with vDPA reset as well. One example is QEMU.


As explained in previous threads [1][2], when IOTLB_PERSIST is not set
it doesn't necessarily mean the iotlb will definitely be destroyed
across reset (think about the platform IOMMU case), so userspace today
is already tolerating enough with either good or bad IOMMU.

I'm confused, how to define tolerating here?

Tolerating means that QEMU has to proactively unmap before reset just to
work around the driver bug (on-chip maps out of sync), unconditionally,
for both platform and on-chip IOMMU. We all know it doesn't have to do so
for platform IOMMU, but userspace has no means to distinguish. In effect,
userspace is sacrificing reset-time performance on platform IOMMU setups
just to work around a buggy implementation in the other setup.

Ok, so what you actually mean is that userspace can tolerate the "bug"
with the performance penalty.

Right.



For example, if it has tolerance, why bother?

I'm not sure I get the question. But I think userspace is compromising
because of buggy implementation in a few drivers doesn't mean we should
uniformly enforce such behavior for all set_map/dma_map implementations.

This is not my point. I meant that we can fix it, but we need a negotiation
in order to let some "buggy" old userspace survive the changes.

Userspace is not buggy today, so how do we define "buggy"? Userspace with
tolerance survives just fine whether or not this negotiation or the
buggy-driver behavior emulation is around. Userspace without tolerance
still works fine on a good on-chip IOMMU or a platform IOMMU, again
whether or not the negotiation is around.

This code of

not checking IOTLB_PERSIST being set is intentional, there's no point to
emulate bad IOMMU behavior even for older userspace (with improper
emulation to be done it would result in even worse performance).

I can easily imagine a case:

The old Qemu that works only with a setup like mlx5_vdpa.

Noted, seems to me there's no such case of a userspace implementation
that only works with mlx5_vdpa or its friends, but doesn't work with the
others e.g. platform IOMMU, or well behaving on-chip IOMMU
implementations.

It's not hard to think of a case where:

1) the environment has 

Re: [PATCH 2/4] vhost-vdpa: reset vendor specific mapping to initial state in .release

2023-10-18 Thread Si-Wei Liu




On 10/18/2023 12:00 AM, Jason Wang wrote:

Unfortunately, it's a must to stick to ABI. I agree it's a mess but we
don't have a better choice. Or we can fail the probe if userspace
doesn't ack this feature.

Another idea: can we just do the following in vhost_vdpa reset?

config->reset()
if (IOTLB_PERSIST is not set) {
 config->reset_map()
}

Then we don't have the burden to maintain them in the parent?

Thanks

Please see my earlier response in the other email, thanks.

%<%<

First, the ideal fix would be to leave this reset_vendor_mappings() 
emulation code on the individual driver itself, which already has the 
broken behavior. But today there's no backend feature negotiation 
between vhost-vdpa and the parent driver. Do we want to send down the 
acked_backend_features to parent drivers?


Second, IOTLB_PERSIST is needed but not sufficient. Due to the lack of
backend feature negotiation in the parent driver, if vhost-vdpa has to
provide the old-behaviour emulation for compatibility on the driver's
behalf, it needs to be done on a per-driver basis. There could be a good
on-chip or vendor IOMMU implementation which doesn't clear the IOTLB in
.reset, and a vendor specific IOMMU doesn't have to provide .reset_map; we
should allow these good driver implementations rather than
unconditionally stick to some specific problematic behavior for every
other good driver. Then we need a set of device flags (backend_features
bit again?) to indicate that the specific driver needs the upper layer's
help with old-behaviour emulation.


Last but not least, I'm not sure how to properly emulate 
reset_vendor_mappings() from vhost-vdpa layer. If a vendor driver has no 
.reset_map op implemented, or if .reset_map has a slightly different 
implementation than what it used to reset the iotlb in the .reset op, 
then this either becomes effectively dead code if no one ends up using, 
or the vhost-vdpa emulation is helpless and limited in scope, unable to 
cover all the cases.
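
To make the second point a bit more concrete, below is a minimal user-space
sketch, not kernel code, of gating the old-behaviour emulation on two
conditions at once: userspace did not ack IOTLB_PERSIST, and the specific
driver is flagged as needing the emulation. Everything here is a mock made up
for illustration: the mock_* structs, the needs_reset_map_emulation flag, and
the bit value chosen for MOCK_BACKEND_F_IOTLB_PERSIST are assumptions; only
the .reset and .reset_map op names come from this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit value, chosen for this sketch only. */
#define MOCK_BACKEND_F_IOTLB_PERSIST (1ULL << 0)

struct mock_vdpa;

/* Mock of the parent driver ops discussed in the thread. */
struct mock_config_ops {
	int (*reset)(struct mock_vdpa *vdpa);                        /* mandatory */
	int (*reset_map)(struct mock_vdpa *vdpa, unsigned int asid); /* optional */
};

struct mock_vdpa {
	const struct mock_config_ops *config;
	bool needs_reset_map_emulation; /* hypothetical per-driver flag */
	int nr_maps;                    /* stand-in for on-chip IOTLB state */
};

struct mock_vhost_vdpa {
	struct mock_vdpa *vdpa;
	uint64_t acked_backend_features;
};

/* A well-behaved .reset: it leaves the mappings alone. */
static int mock_reset(struct mock_vdpa *vdpa)
{
	(void)vdpa;
	return 0;
}

/* Restore the IOTLB to its initial (empty) state. */
static int mock_reset_map(struct mock_vdpa *vdpa, unsigned int asid)
{
	(void)asid;
	vdpa->nr_maps = 0;
	return 0;
}

static const struct mock_config_ops mock_ops = {
	.reset = mock_reset,
	.reset_map = mock_reset_map,
};

/*
 * Reset the device, then emulate the old "reset clears the maps" behaviour
 * only when userspace did not ack IOTLB_PERSIST AND this particular driver
 * is flagged as needing the emulation AND it provides .reset_map.
 */
static int mock_vhost_vdpa_reset(struct mock_vhost_vdpa *v)
{
	struct mock_vdpa *vdpa = v->vdpa;
	const struct mock_config_ops *ops = vdpa->config;
	int ret;

	assert(ops->reset);
	ret = ops->reset(vdpa);
	if (ret)
		return ret;

	if (!(v->acked_backend_features & MOCK_BACKEND_F_IOTLB_PERSIST) &&
	    vdpa->needs_reset_map_emulation && ops->reset_map)
		ret = ops->reset_map(vdpa, 0); /* only ASID 0 in this sketch */

	return ret;
}
```

With the flag clear, a well-behaved driver keeps its mappings across reset
even for older userspace, which is the case a bare IOTLB_PERSIST check would
not allow.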


%<%<
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 2/4] vhost-vdpa: reset vendor specific mapping to initial state in .release

2023-10-18 Thread Si-Wei Liu



On 10/17/2023 10:27 PM, Jason Wang wrote:

On Wed, Oct 18, 2023 at 12:36 PM Si-Wei Liu  wrote:



On 10/16/2023 7:35 PM, Jason Wang wrote:

On Tue, Oct 17, 2023 at 4:30 AM Si-Wei Liu  wrote:


On 10/16/2023 4:28 AM, Eugenio Perez Martin wrote:

On Mon, Oct 16, 2023 at 8:33 AM Jason Wang  wrote:

On Fri, Oct 13, 2023 at 3:36 PM Si-Wei Liu  wrote:

On 10/12/2023 8:01 PM, Jason Wang wrote:

On Tue, Oct 10, 2023 at 5:05 PM Si-Wei Liu  wrote:

Devices with on-chip IOMMU or vendor specific IOTLB implementation
may need to restore iotlb mapping to the initial or default state
using the .reset_map op, as it's desirable for some parent devices
to solely manipulate mappings by its own, independent of virtio device
state. For instance, device reset does not cause mapping go away on
such IOTLB model in need of persistent mapping. Before vhost-vdpa
is going away, give them a chance to reset iotlb back to the initial
state in vhost_vdpa_cleanup().

Signed-off-by: Si-Wei Liu 
---
 drivers/vhost/vdpa.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index 851535f..a3f8160 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -131,6 +131,15 @@ static struct vhost_vdpa_as 
*vhost_vdpa_find_alloc_as(struct vhost_vdpa *v,
return vhost_vdpa_alloc_as(v, asid);
 }

+static void vhost_vdpa_reset_map(struct vhost_vdpa *v, u32 asid)
+{
+   struct vdpa_device *vdpa = v->vdpa;
+   const struct vdpa_config_ops *ops = vdpa->config;
+
+   if (ops->reset_map)
+   ops->reset_map(vdpa, asid);
+}
+
 static int vhost_vdpa_remove_as(struct vhost_vdpa *v, u32 asid)
 {
struct vhost_vdpa_as *as = asid_to_as(v, asid);
@@ -140,6 +149,13 @@ static int vhost_vdpa_remove_as(struct vhost_vdpa *v, u32 
asid)

hlist_del(&as->hash_link);
vhost_vdpa_iotlb_unmap(v, &as->iotlb, 0ULL, 0ULL - 1, asid);
+   /*
+* Devices with vendor specific IOMMU may need to restore
+* iotlb to the initial or default state which is not done
+* through device reset, as the IOTLB mapping manipulation
+* could be decoupled from the virtio device life cycle.
+*/

Should we do this according to whether IOTLB_PERSIST is set?

Well, in theory this seems like so but it's unnecessary code change
actually, as that is the way how vDPA parent behind platform IOMMU works
today, and userspace doesn't break as of today. :)

Well, this is a question I've asked before. You have explained
that one of the reasons we don't break userspace is that it may
couple IOTLB reset with vDPA reset as well. One example is QEMU.


As explained in previous threads [1][2], when IOTLB_PERSIST is not set
it doesn't necessarily mean the iotlb will definitely be destroyed
across reset (think about the platform IOMMU case), so userspace today
is already tolerating enough with either good or bad IOMMU.

I'm confused, how to define tolerating here?

Tolerating means that QEMU has to proactively unmap before reset just to
work around the driver bug (on-chip maps out of sync), unconditionally,
for both platform and on-chip IOMMU. We all know it doesn't have to do so
for platform IOMMU, but userspace has no means to distinguish. In effect,
userspace is sacrificing reset-time performance on platform IOMMU setups
just to work around a buggy implementation in the other setup.

Ok, so what you actually mean is that userspace can tolerate the "bug"
with the performance penalty.

Right.




For example, if it has tolerance, why bother?

I'm not sure I get the question. But I think userspace is compromising
because of buggy implementation in a few drivers doesn't mean we should
uniformly enforce such behavior for all set_map/dma_map implementations.

This is not my point. I meant that we can fix it, but we need a negotiation
in order to let some "buggy" old userspace survive the changes.

Userspace is not buggy today, so how do we define "buggy"? Userspace with
tolerance survives just fine whether or not this negotiation or the
buggy-driver behavior emulation is around. Userspace without tolerance
still works fine on a good on-chip IOMMU or a platform IOMMU, again
whether or not the negotiation is around.



This code of

not checking IOTLB_PERSIST being set is intentional, there's no point to
emulate bad IOMMU behavior even for older userspace (with improper
emulation to be done it would result in even worse performance).

I can easily imagine a case:

The old Qemu that works only with a setup like mlx5_vdpa.

Noted, seems to me there's no such case of a userspace implementation
that only works with mlx5_vdpa or its friends, but doesn't work with the
others e.g. platform IOMMU, or well behaving on-chip IOMMU
implementations.

It's not hard to think of a case where:

1) the environment has mlx5_vdpa only
2) kernel doc can't have endless details, so when developing
application, the author notice IOTLB 

Re: [PATCH 2/4] vhost-vdpa: reset vendor specific mapping to initial state in .release

2023-10-18 Thread Jason Wang
On Wed, Oct 18, 2023 at 1:27 PM Jason Wang  wrote:
>
> On Wed, Oct 18, 2023 at 12:36 PM Si-Wei Liu  wrote:
> >
> >
> >
> > On 10/16/2023 7:35 PM, Jason Wang wrote:
> > > On Tue, Oct 17, 2023 at 4:30 AM Si-Wei Liu  wrote:
> > >>
> > >>
> > >> On 10/16/2023 4:28 AM, Eugenio Perez Martin wrote:
> > >>> On Mon, Oct 16, 2023 at 8:33 AM Jason Wang  wrote:
> >  On Fri, Oct 13, 2023 at 3:36 PM Si-Wei Liu  
> >  wrote:
> > >
> > > On 10/12/2023 8:01 PM, Jason Wang wrote:
> > >> On Tue, Oct 10, 2023 at 5:05 PM Si-Wei Liu  
> > >> wrote:
> > >>> Devices with on-chip IOMMU or vendor specific IOTLB implementation
> > >>> may need to restore iotlb mapping to the initial or default state
> > >>> using the .reset_map op, as it's desirable for some parent devices
> > >>> to solely manipulate mappings by its own, independent of virtio 
> > >>> device
> > >>> state. For instance, device reset does not cause mapping go away on
> > >>> such IOTLB model in need of persistent mapping. Before vhost-vdpa
> > >>> is going away, give them a chance to reset iotlb back to the initial
> > >>> state in vhost_vdpa_cleanup().
> > >>>
> > >>> Signed-off-by: Si-Wei Liu 
> > >>> ---
> > >>> drivers/vhost/vdpa.c | 16 
> > >>> 1 file changed, 16 insertions(+)
> > >>>
> > >>> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> > >>> index 851535f..a3f8160 100644
> > >>> --- a/drivers/vhost/vdpa.c
> > >>> +++ b/drivers/vhost/vdpa.c
> > >>> @@ -131,6 +131,15 @@ static struct vhost_vdpa_as 
> > >>> *vhost_vdpa_find_alloc_as(struct vhost_vdpa *v,
> > >>>return vhost_vdpa_alloc_as(v, asid);
> > >>> }
> > >>>
> > >>> +static void vhost_vdpa_reset_map(struct vhost_vdpa *v, u32 asid)
> > >>> +{
> > >>> +   struct vdpa_device *vdpa = v->vdpa;
> > >>> +   const struct vdpa_config_ops *ops = vdpa->config;
> > >>> +
> > >>> +   if (ops->reset_map)
> > >>> +   ops->reset_map(vdpa, asid);
> > >>> +}
> > >>> +
> > >>> static int vhost_vdpa_remove_as(struct vhost_vdpa *v, u32 asid)
> > >>> {
> > >>>struct vhost_vdpa_as *as = asid_to_as(v, asid);
> > >>> @@ -140,6 +149,13 @@ static int vhost_vdpa_remove_as(struct 
> > >>> vhost_vdpa *v, u32 asid)
> > >>>
> > >>>hlist_del(&as->hash_link);
> > >>>vhost_vdpa_iotlb_unmap(v, &as->iotlb, 0ULL, 0ULL - 1, asid);
> > >>> +   /*
> > >>> +* Devices with vendor specific IOMMU may need to restore
> > >>> +* iotlb to the initial or default state which is not done
> > >>> +* through device reset, as the IOTLB mapping manipulation
> > >>> +* could be decoupled from the virtio device life cycle.
> > >>> +*/
> > >> Should we do this according to whether IOTLB_PERSIST is set?
> > > Well, in theory this seems like so but it's unnecessary code change
> > > actually, as that is the way how vDPA parent behind platform IOMMU 
> > > works
> > > today, and userspace doesn't break as of today. :)
> >  Well, this is a question I've asked before. You have explained
> >  that one of the reasons we don't break userspace is that it may
> >  couple IOTLB reset with vDPA reset as well. One example is QEMU.
> > 
> > > As explained in previous threads [1][2], when IOTLB_PERSIST is not set
> > > it doesn't necessarily mean the iotlb will definitely be destroyed
> > > across reset (think about the platform IOMMU case), so userspace today
> > > is already tolerating enough with either good or bad IOMMU.
> > > I'm confused, how to define tolerating here?
> >
> > Tolerating means that QEMU has to proactively unmap before reset just to
> > work around the driver bug (on-chip maps out of sync), unconditionally,
> > for both platform and on-chip IOMMU. We all know it doesn't have to do so
> > for platform IOMMU, but userspace has no means to distinguish. In effect,
> > userspace is sacrificing reset-time performance on platform IOMMU setups
> > just to work around a buggy implementation in the other setup.
>
> Ok, so what you actually mean is that userspace can tolerate the "bug"
> with the performance penalty.
>
>
> >
> > > For example, if it has tolerance, why bother?
> > I'm not sure I get the question. But I think userspace is compromising
> > because of buggy implementation in a few drivers doesn't mean we should
> > uniformly enforce such behavior for all set_map/dma_map implementations.
>
> This is not my point. I meant that we can fix it, but we need a negotiation
> in order to let some "buggy" old userspace survive the changes.
>
> >
> > >
> >  This code of
> > > not checking IOTLB_PERSIST being set is intentional, there's no point 
> > > to
> > > emulate bad IOMMU behavior even for older userspace 

Re: [PATCH 2/4] vhost-vdpa: reset vendor specific mapping to initial state in .release

2023-10-17 Thread Jason Wang
On Wed, Oct 18, 2023 at 12:36 PM Si-Wei Liu  wrote:
>
>
>
> On 10/16/2023 7:35 PM, Jason Wang wrote:
> > On Tue, Oct 17, 2023 at 4:30 AM Si-Wei Liu  wrote:
> >>
> >>
> >> On 10/16/2023 4:28 AM, Eugenio Perez Martin wrote:
> >>> On Mon, Oct 16, 2023 at 8:33 AM Jason Wang  wrote:
>  On Fri, Oct 13, 2023 at 3:36 PM Si-Wei Liu  wrote:
> >
> > On 10/12/2023 8:01 PM, Jason Wang wrote:
> >> On Tue, Oct 10, 2023 at 5:05 PM Si-Wei Liu  
> >> wrote:
> >>> Devices with on-chip IOMMU or vendor specific IOTLB implementation
> >>> may need to restore iotlb mapping to the initial or default state
> >>> using the .reset_map op, as it's desirable for some parent devices
> >>> to solely manipulate mappings by its own, independent of virtio device
> >>> state. For instance, device reset does not cause mapping go away on
> >>> such IOTLB model in need of persistent mapping. Before vhost-vdpa
> >>> is going away, give them a chance to reset iotlb back to the initial
> >>> state in vhost_vdpa_cleanup().
> >>>
> >>> Signed-off-by: Si-Wei Liu 
> >>> ---
> >>> drivers/vhost/vdpa.c | 16 
> >>> 1 file changed, 16 insertions(+)
> >>>
> >>> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> >>> index 851535f..a3f8160 100644
> >>> --- a/drivers/vhost/vdpa.c
> >>> +++ b/drivers/vhost/vdpa.c
> >>> @@ -131,6 +131,15 @@ static struct vhost_vdpa_as 
> >>> *vhost_vdpa_find_alloc_as(struct vhost_vdpa *v,
> >>>return vhost_vdpa_alloc_as(v, asid);
> >>> }
> >>>
> >>> +static void vhost_vdpa_reset_map(struct vhost_vdpa *v, u32 asid)
> >>> +{
> >>> +   struct vdpa_device *vdpa = v->vdpa;
> >>> +   const struct vdpa_config_ops *ops = vdpa->config;
> >>> +
> >>> +   if (ops->reset_map)
> >>> +   ops->reset_map(vdpa, asid);
> >>> +}
> >>> +
> >>> static int vhost_vdpa_remove_as(struct vhost_vdpa *v, u32 asid)
> >>> {
> >>>struct vhost_vdpa_as *as = asid_to_as(v, asid);
> >>> @@ -140,6 +149,13 @@ static int vhost_vdpa_remove_as(struct 
> >>> vhost_vdpa *v, u32 asid)
> >>>
> >>>hlist_del(&as->hash_link);
> >>>vhost_vdpa_iotlb_unmap(v, &as->iotlb, 0ULL, 0ULL - 1, asid);
> >>> +   /*
> >>> +* Devices with vendor specific IOMMU may need to restore
> >>> +* iotlb to the initial or default state which is not done
> >>> +* through device reset, as the IOTLB mapping manipulation
> >>> +* could be decoupled from the virtio device life cycle.
> >>> +*/
> >> Should we do this according to whether IOTLB_PERSIST is set?
> > Well, in theory this seems like so but it's unnecessary code change
> > actually, as that is the way how vDPA parent behind platform IOMMU works
> > today, and userspace doesn't break as of today. :)
>  Well, this is a question I've asked before. You have explained
>  that one of the reasons we don't break userspace is that it may
>  couple IOTLB reset with vDPA reset as well. One example is QEMU.
> 
> > As explained in previous threads [1][2], when IOTLB_PERSIST is not set
> > it doesn't necessarily mean the iotlb will definitely be destroyed
> > across reset (think about the platform IOMMU case), so userspace today
> > is already tolerating enough with either good or bad IOMMU.
> > I'm confused, how to define tolerating here?
>
> Tolerating means that QEMU has to proactively unmap before reset just to
> work around the driver bug (on-chip maps out of sync), unconditionally,
> for both platform and on-chip IOMMU. We all know it doesn't have to do so
> for platform IOMMU, but userspace has no means to distinguish. In effect,
> userspace is sacrificing reset-time performance on platform IOMMU setups
> just to work around a buggy implementation in the other setup.

Ok, so what you actually mean is that userspace can tolerate the "bug"
with the performance penalty.


>
> > For example, if it has tolerance, why bother?
> I'm not sure I get the question. But I think userspace is compromising
> because of buggy implementation in a few drivers doesn't mean we should
> uniformly enforce such behavior for all set_map/dma_map implementations.

This is not my point. I meant that we can fix it, but we need a negotiation
in order to let some "buggy" old userspace survive the changes.

>
> >
>  This code of
> > not checking IOTLB_PERSIST being set is intentional, there's no point to
> > emulate bad IOMMU behavior even for older userspace (with improper
> > emulation to be done it would result in even worse performance).
> > I can easily imagine a case:
> >
> > The old Qemu that works only with a setup like mlx5_vdpa.
> Noted, seems to me there's no such case of a userspace implementation
> that only works with mlx5_vdpa or 

Re: [PATCH 2/4] vhost-vdpa: reset vendor specific mapping to initial state in .release

2023-10-17 Thread Si-Wei Liu



On 10/16/2023 7:35 PM, Jason Wang wrote:

On Tue, Oct 17, 2023 at 4:30 AM Si-Wei Liu  wrote:



On 10/16/2023 4:28 AM, Eugenio Perez Martin wrote:

On Mon, Oct 16, 2023 at 8:33 AM Jason Wang  wrote:

On Fri, Oct 13, 2023 at 3:36 PM Si-Wei Liu  wrote:


On 10/12/2023 8:01 PM, Jason Wang wrote:

On Tue, Oct 10, 2023 at 5:05 PM Si-Wei Liu  wrote:

Devices with on-chip IOMMU or vendor specific IOTLB implementation
may need to restore iotlb mapping to the initial or default state
using the .reset_map op, as it's desirable for some parent devices
to solely manipulate mappings by its own, independent of virtio device
state. For instance, device reset does not cause mapping go away on
such IOTLB model in need of persistent mapping. Before vhost-vdpa
is going away, give them a chance to reset iotlb back to the initial
state in vhost_vdpa_cleanup().

Signed-off-by: Si-Wei Liu 
---
drivers/vhost/vdpa.c | 16 
1 file changed, 16 insertions(+)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index 851535f..a3f8160 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -131,6 +131,15 @@ static struct vhost_vdpa_as 
*vhost_vdpa_find_alloc_as(struct vhost_vdpa *v,
   return vhost_vdpa_alloc_as(v, asid);
}

+static void vhost_vdpa_reset_map(struct vhost_vdpa *v, u32 asid)
+{
+   struct vdpa_device *vdpa = v->vdpa;
+   const struct vdpa_config_ops *ops = vdpa->config;
+
+   if (ops->reset_map)
+   ops->reset_map(vdpa, asid);
+}
+
static int vhost_vdpa_remove_as(struct vhost_vdpa *v, u32 asid)
{
   struct vhost_vdpa_as *as = asid_to_as(v, asid);
@@ -140,6 +149,13 @@ static int vhost_vdpa_remove_as(struct vhost_vdpa *v, u32 
asid)

   hlist_del(&as->hash_link);
   vhost_vdpa_iotlb_unmap(v, &as->iotlb, 0ULL, 0ULL - 1, asid);
+   /*
+* Devices with vendor specific IOMMU may need to restore
+* iotlb to the initial or default state which is not done
+* through device reset, as the IOTLB mapping manipulation
+* could be decoupled from the virtio device life cycle.
+*/

Should we do this according to whether IOTLB_PERSIST is set?

Well, in theory this seems like so but it's unnecessary code change
actually, as that is the way how vDPA parent behind platform IOMMU works
today, and userspace doesn't break as of today. :)

Well, this is a question I've asked before. You have explained
that one of the reasons we don't break userspace is that it may
couple IOTLB reset with vDPA reset as well. One example is QEMU.


As explained in previous threads [1][2], when IOTLB_PERSIST is not set
it doesn't necessarily mean the iotlb will definitely be destroyed
across reset (think about the platform IOMMU case), so userspace today
is already tolerating enough with either good or bad IOMMU.

I'm confused, how to define tolerating here?


Tolerating means that QEMU has to proactively unmap before reset just to
work around the driver bug (on-chip maps out of sync), unconditionally,
for both platform and on-chip IOMMU. We all know it doesn't have to do so
for platform IOMMU, but userspace has no means to distinguish. In effect,
userspace is sacrificing reset-time performance on platform IOMMU setups
just to work around a buggy implementation in the other setup.



For example, if it has tolerance, why bother?
I'm not sure I get the question. But I think userspace is compromising 
because of buggy implementation in a few drivers doesn't mean we should 
uniformly enforce such behavior for all set_map/dma_map implementations.





This code of

not checking IOTLB_PERSIST being set is intentional, there's no point to
emulate bad IOMMU behavior even for older userspace (with improper
emulation to be done it would result in even worse performance).

I can easily imagine a case:

The old Qemu that works only with a setup like mlx5_vdpa.
Noted, seems to me there's no such case of a userspace implementation 
that only works with mlx5_vdpa or its friends, but doesn't work with the 
others e.g. platform IOMMU, or well behaving on-chip IOMMU 
implementations. The Unmap+remap trick around vdpa reset works totally 
fine for platform IOMMU, except with sub-optimal performance. Other than 
this trick, I cannot easily think of other means or iotlb message 
sequence for userspace to recover the bogus state and make iotlb back to 
work again after reset. Are we talking about a hypothesis that has no
basis in the real world?
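
As a rough illustration of the unmap+remap trick, here is a small user-space
model (not kernel or QEMU code; mock_dev and its helpers are hypothetical
names invented for this sketch). It shows that userspace which unmaps before
reset and remaps afterwards ends up with the same mappings whether or not the
device clears its IOTLB in reset, and issues the same number of map/unmap
requests either way.

```c
#include <assert.h>
#include <stdbool.h>

struct mock_dev {
	bool reset_clears_maps; /* true: on-chip IOMMU that self-clears in reset;
	                         * false: platform IOMMU, maps survive reset */
	int nr_maps;            /* live mappings in the device IOTLB */
	int nr_ops;             /* map/unmap requests issued, a crude cost measure */
};

static void mock_map(struct mock_dev *d)
{
	d->nr_maps++;
	d->nr_ops++;
}

static void mock_unmap(struct mock_dev *d)
{
	if (d->nr_maps > 0)
		d->nr_maps--;
	d->nr_ops++;
}

static void mock_reset(struct mock_dev *d)
{
	if (d->reset_clears_maps)
		d->nr_maps = 0;
}

/*
 * The workaround described above: unmap everything, reset, then map
 * everything again. Returns the number of mappings after the sequence.
 */
static int reset_with_unmap_remap(struct mock_dev *d)
{
	int n = d->nr_maps;
	int i;

	for (i = 0; i < n; i++)
		mock_unmap(d);
	assert(d->nr_maps == 0); /* nothing left for reset to desync */

	mock_reset(d);

	for (i = 0; i < n; i++)
		mock_map(d);

	return d->nr_maps;
}
```

In this model the platform IOMMU device pays for 2n requests it never needed,
which is the sub-optimal performance mentioned above.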



  If we do
this without a negotiation, the IOTLB will not be cleared but QEMU will
try to re-program the IOTLB after reset. Which will break?

1) stick to the exact old behaviour with just a one-line check
It's not just one line of check here, the old behavior emulation has to 
be done as Eugenio illustrated in the other email. In addition, the 
emulation has to limit to those buggy drivers as I don't feel this 
emulation should apply uniformly to all future set_map/dma_map 

Re: [PATCH 2/4] vhost-vdpa: reset vendor specific mapping to initial state in .release

2023-10-16 Thread Jason Wang
On Tue, Oct 17, 2023 at 4:30 AM Si-Wei Liu  wrote:
>
>
>
> On 10/16/2023 4:28 AM, Eugenio Perez Martin wrote:
> > On Mon, Oct 16, 2023 at 8:33 AM Jason Wang  wrote:
> >> On Fri, Oct 13, 2023 at 3:36 PM Si-Wei Liu  wrote:
> >>>
> >>>
> >>> On 10/12/2023 8:01 PM, Jason Wang wrote:
>  On Tue, Oct 10, 2023 at 5:05 PM Si-Wei Liu  wrote:
> > Devices with on-chip IOMMU or vendor specific IOTLB implementation
> > may need to restore iotlb mapping to the initial or default state
> > using the .reset_map op, as it's desirable for some parent devices
> > to solely manipulate mappings by its own, independent of virtio device
> > state. For instance, device reset does not cause mapping go away on
> > such IOTLB model in need of persistent mapping. Before vhost-vdpa
> > is going away, give them a chance to reset iotlb back to the initial
> > state in vhost_vdpa_cleanup().
> >
> > Signed-off-by: Si-Wei Liu 
> > ---
> >drivers/vhost/vdpa.c | 16 
> >1 file changed, 16 insertions(+)
> >
> > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> > index 851535f..a3f8160 100644
> > --- a/drivers/vhost/vdpa.c
> > +++ b/drivers/vhost/vdpa.c
> > @@ -131,6 +131,15 @@ static struct vhost_vdpa_as 
> > *vhost_vdpa_find_alloc_as(struct vhost_vdpa *v,
> >   return vhost_vdpa_alloc_as(v, asid);
> >}
> >
> > +static void vhost_vdpa_reset_map(struct vhost_vdpa *v, u32 asid)
> > +{
> > +   struct vdpa_device *vdpa = v->vdpa;
> > +   const struct vdpa_config_ops *ops = vdpa->config;
> > +
> > +   if (ops->reset_map)
> > +   ops->reset_map(vdpa, asid);
> > +}
> > +
> >static int vhost_vdpa_remove_as(struct vhost_vdpa *v, u32 asid)
> >{
> >   struct vhost_vdpa_as *as = asid_to_as(v, asid);
> > @@ -140,6 +149,13 @@ static int vhost_vdpa_remove_as(struct vhost_vdpa 
> > *v, u32 asid)
> >
> >   hlist_del(&as->hash_link);
> >   vhost_vdpa_iotlb_unmap(v, &as->iotlb, 0ULL, 0ULL - 1, asid);
> > +   /*
> > +* Devices with vendor specific IOMMU may need to restore
> > +* iotlb to the initial or default state which is not done
> > +* through device reset, as the IOTLB mapping manipulation
> > +* could be decoupled from the virtio device life cycle.
> > +*/
>  Should we do this according to whether IOTLB_PERSIST is set?
> >>> Well, in theory this seems like so but it's unnecessary code change
> >>> actually, as that is the way how vDPA parent behind platform IOMMU works
> >>> today, and userspace doesn't break as of today. :)
> >> Well, this is a question I've asked before. You have explained
> >> that one of the reasons we don't break userspace is that it may
> >> couple IOTLB reset with vDPA reset as well. One example is QEMU.
> >>
> >>> As explained in previous threads [1][2], when IOTLB_PERSIST is not set
> >>> it doesn't necessarily mean the iotlb will definitely be destroyed
> >>> across reset (think about the platform IOMMU case), so userspace today
> >>> is already tolerating enough with either good or bad IOMMU.

I'm confused, how to define tolerating here? For example, if it has
tolerance, why bother?

> >>This code of
> >>> not checking IOTLB_PERSIST being set is intentional, there's no point to
> >>> emulate bad IOMMU behavior even for older userspace (with improper
> >>> emulation to be done it would result in even worse performance).

I can easily imagine a case:

An old QEMU that works only with a setup like mlx5_vdpa. If we do
this without a negotiation, the IOTLB will not be cleared but QEMU will
try to re-program the IOTLB after reset. Which will break?

1) stick to the exact old behaviour with just a one-line check
2) audit all the possible cases to avoid one line of code

1) seems much easier than 2)

> >> For two reasons:
> >>
> >> 1) backend features need to be acked by userspace; this is by design
> >> 2) keeping the odd behaviour seems safer, as we can't audit
> >> every userspace program
> >>
> > The old behavior (without flag ack) cannot be trusted already, as:

Possibly, but the point is to not break userspace, no matter how weird
the behaviour we have had.

> > * Devices using platform IOMMU (in other words, implementing
> > neither .set_map nor .dma_map) do not unmap memory at virtio reset.
> > * Devices that implement .set_map or .dma_map (vdpa_sim, mlx5) do
> > reset IOTLB, but in their parent ops (vdpasim_do_reset, prune_iotlb
> > called from mlx5_vdpa_reset). With the vdpa_sim patch removing the reset,
> > all backends now work the same as far as I know, which was (and is)
> > the way devices using the platform IOMMU work.
> >
> > The difference in behavior did not matter as QEMU unmaps all the
> > memory unregistering the memory listener at vhost_vdpa_dev_start(...,
> > started = 

Re: [PATCH 2/4] vhost-vdpa: reset vendor specific mapping to initial state in .release

2023-10-16 Thread Si-Wei Liu



On 10/16/2023 4:28 AM, Eugenio Perez Martin wrote:

On Mon, Oct 16, 2023 at 8:33 AM Jason Wang  wrote:

On Fri, Oct 13, 2023 at 3:36 PM Si-Wei Liu  wrote:



On 10/12/2023 8:01 PM, Jason Wang wrote:

On Tue, Oct 10, 2023 at 5:05 PM Si-Wei Liu  wrote:

Devices with on-chip IOMMU or vendor specific IOTLB implementation
may need to restore iotlb mapping to the initial or default state
using the .reset_map op, as it's desirable for some parent devices
to solely manipulate mappings by its own, independent of virtio device
state. For instance, device reset does not cause mapping go away on
such IOTLB model in need of persistent mapping. Before vhost-vdpa
is going away, give them a chance to reset iotlb back to the initial
state in vhost_vdpa_cleanup().

Signed-off-by: Si-Wei Liu 
---
   drivers/vhost/vdpa.c | 16 
   1 file changed, 16 insertions(+)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index 851535f..a3f8160 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -131,6 +131,15 @@ static struct vhost_vdpa_as 
*vhost_vdpa_find_alloc_as(struct vhost_vdpa *v,
  return vhost_vdpa_alloc_as(v, asid);
   }

+static void vhost_vdpa_reset_map(struct vhost_vdpa *v, u32 asid)
+{
+   struct vdpa_device *vdpa = v->vdpa;
+   const struct vdpa_config_ops *ops = vdpa->config;
+
+   if (ops->reset_map)
+   ops->reset_map(vdpa, asid);
+}
+
   static int vhost_vdpa_remove_as(struct vhost_vdpa *v, u32 asid)
   {
  struct vhost_vdpa_as *as = asid_to_as(v, asid);
@@ -140,6 +149,13 @@ static int vhost_vdpa_remove_as(struct vhost_vdpa *v, u32 
asid)

  hlist_del(&as->hash_link);
  vhost_vdpa_iotlb_unmap(v, &as->iotlb, 0ULL, 0ULL - 1, asid);
+   /*
+* Devices with vendor specific IOMMU may need to restore
+* iotlb to the initial or default state which is not done
+* through device reset, as the IOTLB mapping manipulation
+* could be decoupled from the virtio device life cycle.
+*/

Should we do this according to whether IOTLB_PERSIST is set?

Well, in theory this seems like so but it's unnecessary code change
actually, as that is the way how vDPA parent behind platform IOMMU works
today, and userspace doesn't break as of today. :)

Well, this is one question I've ever asked before. You have explained
that one of the reason that we don't break userspace is that they may
couple IOTLB reset with vDPA reset as well. One example is the Qemu.


As explained in previous threads [1][2], when IOTLB_PERSIST is not set
it doesn't necessarily mean the iotlb will definitely be destroyed
across reset (think about the platform IOMMU case), so userspace today
is already tolerating enough with either good or bad IOMMU. This code of
not checking IOTLB_PERSIST being set is intentional, there's no point to
emulate bad IOMMU behavior even for older userspace (with improper
emulation to be done it would result in even worse performance).

For two reasons:

1) backend features need acked by userspace this is by design
2) keep the odd behaviour seems to be more safe as we can't audit
every userspace program


The old behavior (without flag ack) cannot be trusted already, as:
* Devices using platform IOMMU (in other words, implementing
neither .set_map nor .dma_map) do not unmap memory at virtio reset.
* Devices that implement .set_map or .dma_map (vdpa_sim, mlx5) do
reset IOTLB, but in their parent ops (vdpasim_do_reset, prune_iotlb
called from mlx5_vdpa_reset). With the vdpa_sim patch removing the reset,
all backends now work the same as far as I know, which was (and is)
the way devices using the platform IOMMU work.

The difference in behavior did not matter as QEMU unmaps all the
memory unregistering the memory listener at vhost_vdpa_dev_start(...,
started = false),
Exactly. It's not just QEMU: any (older) userspace that manipulates 
mappings through the vhost-vdpa iotlb interface has to unmap all 
mappings to work around the vdpa parent driver bug. If it doesn't do an 
explicit unmap, the state becomes inconsistent between vhost-vdpa 
and the parent driver, then old mappings can't be restored, and new mapping 
can be added to iotlb after vDPA reset. There's no point in preserving 
this broken and inconsistent behavior between vhost-vdpa and the parent 
driver, as userspace doesn't care at all!



but the backend acknowledging this feature flag
allows QEMU to make sure it is safe to skip this unmap & map in the
case of vhost stop & start cycle.

In that sense, this feature flag is actually a signal for userspace to
know that the bug has been solved.
Right, I couldn't have said it better, thanks! The feature flag is 
more of an unusual means of indicating that a kernel bug has been fixed, 
rather than of introducing a new feature or new kernel behavior that 
changes userspace's expectations.



Not offering it indicates that
userspace cannot trust the kernel will retain the maps.

Si-Wei or Dragos, please correct me if 

Re: [PATCH 2/4] vhost-vdpa: reset vendor specific mapping to initial state in .release

2023-10-16 Thread Si-Wei Liu



On 10/15/2023 11:32 PM, Jason Wang wrote:

On Fri, Oct 13, 2023 at 3:36 PM Si-Wei Liu  wrote:



On 10/12/2023 8:01 PM, Jason Wang wrote:

On Tue, Oct 10, 2023 at 5:05 PM Si-Wei Liu  wrote:

Devices with on-chip IOMMU or vendor specific IOTLB implementation
may need to restore iotlb mapping to the initial or default state
using the .reset_map op, as it's desirable for some parent devices
to solely manipulate mappings by its own, independent of virtio device
state. For instance, device reset does not cause mapping go away on
such IOTLB model in need of persistent mapping. Before vhost-vdpa
is going away, give them a chance to reset iotlb back to the initial
state in vhost_vdpa_cleanup().

Signed-off-by: Si-Wei Liu 
---
   drivers/vhost/vdpa.c | 16 
   1 file changed, 16 insertions(+)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index 851535f..a3f8160 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -131,6 +131,15 @@ static struct vhost_vdpa_as 
*vhost_vdpa_find_alloc_as(struct vhost_vdpa *v,
  return vhost_vdpa_alloc_as(v, asid);
   }

+static void vhost_vdpa_reset_map(struct vhost_vdpa *v, u32 asid)
+{
+   struct vdpa_device *vdpa = v->vdpa;
+   const struct vdpa_config_ops *ops = vdpa->config;
+
+   if (ops->reset_map)
+   ops->reset_map(vdpa, asid);
+}
+
   static int vhost_vdpa_remove_as(struct vhost_vdpa *v, u32 asid)
   {
  struct vhost_vdpa_as *as = asid_to_as(v, asid);
@@ -140,6 +149,13 @@ static int vhost_vdpa_remove_as(struct vhost_vdpa *v, u32 
asid)

  hlist_del(&as->hash_link);
  vhost_vdpa_iotlb_unmap(v, &as->iotlb, 0ULL, 0ULL - 1, asid);
+   /*
+* Devices with vendor specific IOMMU may need to restore
+* iotlb to the initial or default state which is not done
+* through device reset, as the IOTLB mapping manipulation
+* could be decoupled from the virtio device life cycle.
+*/

Should we do this according to whether IOTLB_PERSIST is set?

Well, in theory this seems like so but it's unnecessary code change
actually, as that is the way how vDPA parent behind platform IOMMU works
today, and userspace doesn't break as of today. :)

Well, this is one question I've ever asked before. You have explained
that one of the reason that we don't break userspace is that they may
couple IOTLB reset with vDPA reset as well. One example is the Qemu.
Nope, it was the opposite. Maybe it was not clear enough, so let me try 
once more: userspace CANNOT decouple IOTLB reset from vDPA reset today. 
This is because the bug/discrepancy in mlx5_vdpa and vdpa_sim already 
breaks userspace's expectation, preventing the vhost-vdpa mapping 
interface from behaving as it promised and should have done. Only when 
the IOTLB_PERSIST flag is seen can userspace trust the vhost-vdpa kernel 
interface to *reliably* decouple IOTLB reset from vDPA reset. Without 
seeing this flag, no matter how the code in QEMU was written, today's 
older userspace could never assume that the mappings will *definitely* 
be cleared by vDPA reset. If any userspace implementation wants 
consistent behavior across all vDPA parent devices, it still has to 
*explicitly* clear all existing mappings on its own by sending a bunch 
of unmap (iotlb invalidate) requests to the vhost-vdpa kernel interface 
before resetting the vDPA backend.


In brief, userspace is already broken by the kernel implementation today, 
and new userspace needs some device flag to know for sure whether the 
kernel bug has already been fixed; older userspace doesn't care about 
preserving the broken kernel behavior at all, regardless of whether or 
not it wants to decouple IOTLB from vDPA reset.
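
A minimal sketch of the userspace-side logic described above, as a toy model rather than QEMU or kernel code. Everything named sketch_* is hypothetical: the backend is reduced to a mapping count, a counter of map/unmap requests, and a clears_on_reset quirk standing in for the legacy parent behavior.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a vhost-vdpa backend as seen from userspace. */
struct sketch_backend {
	bool clears_on_reset; /* legacy quirk: parent drops mappings in reset */
	int nr_maps;          /* mappings currently installed */
	int map_ops;          /* map/unmap requests issued by userspace */
};

static void sketch_map(struct sketch_backend *b, int n)
{
	b->nr_maps += n;
	b->map_ops += n;
}

static void sketch_unmap_all(struct sketch_backend *b)
{
	b->map_ops += b->nr_maps; /* one invalidate request per mapping */
	b->nr_maps = 0;
}

static void sketch_reset(struct sketch_backend *b)
{
	if (b->clears_on_reset)
		b->nr_maps = 0;
}

/* One stop & start cycle for a guest that needs n mappings. */
static void sketch_stop_start(struct sketch_backend *b, bool persist_acked, int n)
{
	if (!persist_acked) {
		/* No guarantee either way: unmap explicitly, then re-map. */
		sketch_unmap_all(b);
		sketch_reset(b);
		sketch_map(b, n);
	} else {
		/* Flag acked: mappings are trusted to survive the reset. */
		assert(!b->clears_on_reset);
		sketch_reset(b);
	}
	assert(b->nr_maps == n);
}
```

In this model the explicit unmap path ends in the same state whether or not the parent clears mappings on reset, which is why the quirk goes unnoticed by such userspace; the acked path reaches that state without issuing any extra map or unmap requests.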





As explained in previous threads [1][2], when IOTLB_PERSIST is not set
it doesn't necessarily mean the iotlb will definitely be destroyed
across reset (think about the platform IOMMU case), so userspace today
is already tolerating enough with either good or bad IOMMU. This code of
not checking IOTLB_PERSIST being set is intentional, there's no point to
emulate bad IOMMU behavior even for older userspace (with improper
emulation to be done it would result in even worse performance).

For two reasons:

1) backend features need acked by userspace this is by design
There's no breakage on this part. Backend feature IOTLB_PERSIST won't be 
set if userspace doesn't ack.

2) keep the odd behaviour seems to be more safe as we can't audit
every userspace program
We definitely don't have to audit every userspace program, but I cannot 
think of a case where a sane userspace program can be broken. Can you 
elaborate on one or two potential userspace usages that may break because 
of this? As said, platform IOMMU already did it this way.


Regards,
-Siwei


Thanks


I think
the purpose of the IOTLB_PERSIST flag is just to give userspace 100%
certainty of persistent iotlb mapping not getting lost across vdpa reset.

Thanks,
-Siwei

[1]

Re: [PATCH 2/4] vhost-vdpa: reset vendor specific mapping to initial state in .release

2023-10-16 Thread Jason Wang
On Fri, Oct 13, 2023 at 3:36 PM Si-Wei Liu  wrote:
>
>
>
> On 10/12/2023 8:01 PM, Jason Wang wrote:
> > On Tue, Oct 10, 2023 at 5:05 PM Si-Wei Liu  wrote:
> >> Devices with on-chip IOMMU or vendor specific IOTLB implementation
> >> may need to restore iotlb mapping to the initial or default state
> >> using the .reset_map op, as it's desirable for some parent devices
> >> to solely manipulate mappings by its own, independent of virtio device
> >> state. For instance, device reset does not cause mapping go away on
> >> such IOTLB model in need of persistent mapping. Before vhost-vdpa
> >> is going away, give them a chance to reset iotlb back to the initial
> >> state in vhost_vdpa_cleanup().
> >>
> >> Signed-off-by: Si-Wei Liu 
> >> ---
> >>   drivers/vhost/vdpa.c | 16 
> >>   1 file changed, 16 insertions(+)
> >>
> >> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> >> index 851535f..a3f8160 100644
> >> --- a/drivers/vhost/vdpa.c
> >> +++ b/drivers/vhost/vdpa.c
> >> @@ -131,6 +131,15 @@ static struct vhost_vdpa_as 
> >> *vhost_vdpa_find_alloc_as(struct vhost_vdpa *v,
> >>  return vhost_vdpa_alloc_as(v, asid);
> >>   }
> >>
> >> +static void vhost_vdpa_reset_map(struct vhost_vdpa *v, u32 asid)
> >> +{
> >> +   struct vdpa_device *vdpa = v->vdpa;
> >> +   const struct vdpa_config_ops *ops = vdpa->config;
> >> +
> >> +   if (ops->reset_map)
> >> +   ops->reset_map(vdpa, asid);
> >> +}
> >> +
> >>   static int vhost_vdpa_remove_as(struct vhost_vdpa *v, u32 asid)
> >>   {
> >>  struct vhost_vdpa_as *as = asid_to_as(v, asid);
> >> @@ -140,6 +149,13 @@ static int vhost_vdpa_remove_as(struct vhost_vdpa *v, 
> >> u32 asid)
> >>
> >>  hlist_del(&as->hash_link);
> >>  vhost_vdpa_iotlb_unmap(v, &as->iotlb, 0ULL, 0ULL - 1, asid);
> >> +   /*
> >> +* Devices with vendor specific IOMMU may need to restore
> >> +* iotlb to the initial or default state which is not done
> >> +* through device reset, as the IOTLB mapping manipulation
> >> +* could be decoupled from the virtio device life cycle.
> >> +*/
> > Should we do this according to whether IOTLB_PERSIST is set?
> Well, in theory this seems like so but it's unnecessary code change
> actually, as that is the way how vDPA parent behind platform IOMMU works
> today, and userspace doesn't break as of today. :)

Well, this is a question I've asked before. You explained that one of
the reasons we don't break userspace is that it may couple IOTLB reset
with vDPA reset as well. One example is QEMU.

>
> As explained in previous threads [1][2], when IOTLB_PERSIST is not set
> it doesn't necessarily mean the iotlb will definitely be destroyed
> across reset (think about the platform IOMMU case), so userspace today
> is already tolerating enough with either good or bad IOMMU. This code of
> not checking IOTLB_PERSIST being set is intentional, there's no point to
> emulate bad IOMMU behavior even for older userspace (with improper
> emulation to be done it would result in even worse performance).

For two reasons:

1) backend features need to be acked by userspace; this is by design
2) keeping the odd behaviour seems safer, as we can't audit
every userspace program

Thanks

> I think
> the purpose of the IOTLB_PERSIST flag is just to give userspace 100%
> certainty of persistent iotlb mapping not getting lost across vdpa reset.
>
> Thanks,
> -Siwei
>
> [1]
> https://lore.kernel.org/virtualization/9f118fc9-4f6f-dd67-a291-be78152e4...@oracle.com/
> [2]
> https://lore.kernel.org/virtualization/3364adfd-1eb7-8bce-41f9-bfe5473f1...@oracle.com/
> >   Otherwise
> > we may break old userspace.
> >
> > Thanks
> >
> >> +   vhost_vdpa_reset_map(v, asid);
> >>  kfree(as);
> >>
> >>  return 0;
> >> --
> >> 1.8.3.1
> >>
>

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH 2/4] vhost-vdpa: reset vendor specific mapping to initial state in .release

2023-10-13 Thread Si-Wei Liu



On 10/12/2023 8:01 PM, Jason Wang wrote:

On Tue, Oct 10, 2023 at 5:05 PM Si-Wei Liu  wrote:

Devices with on-chip IOMMU or vendor specific IOTLB implementation
may need to restore iotlb mapping to the initial or default state
using the .reset_map op, as it's desirable for some parent devices
to solely manipulate mappings by its own, independent of virtio device
state. For instance, device reset does not cause mapping go away on
such IOTLB model in need of persistent mapping. Before vhost-vdpa
is going away, give them a chance to reset iotlb back to the initial
state in vhost_vdpa_cleanup().

Signed-off-by: Si-Wei Liu 
---
  drivers/vhost/vdpa.c | 16 
  1 file changed, 16 insertions(+)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index 851535f..a3f8160 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -131,6 +131,15 @@ static struct vhost_vdpa_as 
*vhost_vdpa_find_alloc_as(struct vhost_vdpa *v,
 return vhost_vdpa_alloc_as(v, asid);
  }

+static void vhost_vdpa_reset_map(struct vhost_vdpa *v, u32 asid)
+{
+   struct vdpa_device *vdpa = v->vdpa;
+   const struct vdpa_config_ops *ops = vdpa->config;
+
+   if (ops->reset_map)
+   ops->reset_map(vdpa, asid);
+}
+
  static int vhost_vdpa_remove_as(struct vhost_vdpa *v, u32 asid)
  {
 struct vhost_vdpa_as *as = asid_to_as(v, asid);
@@ -140,6 +149,13 @@ static int vhost_vdpa_remove_as(struct vhost_vdpa *v, u32 
asid)

 hlist_del(&as->hash_link);
 vhost_vdpa_iotlb_unmap(v, &as->iotlb, 0ULL, 0ULL - 1, asid);
+   /*
+* Devices with vendor specific IOMMU may need to restore
+* iotlb to the initial or default state which is not done
+* through device reset, as the IOTLB mapping manipulation
+* could be decoupled from the virtio device life cycle.
+*/

Should we do this according to whether IOTLB_PERSIST is set?

Well, in theory it seems so, but it's actually an unnecessary code 
change, as that is how a vDPA parent behind the platform IOMMU works 
today, and userspace doesn't break as of today. :)


As explained in previous threads [1][2], when IOTLB_PERSIST is not set 
it doesn't necessarily mean the iotlb will definitely be destroyed 
across reset (think about the platform IOMMU case), so userspace today 
is already tolerant of either good or bad IOMMU behavior. Not checking 
whether IOTLB_PERSIST is set here is intentional: there's no point in 
emulating bad IOMMU behavior even for older userspace (and with improper 
emulation it would result in even worse performance). I think 
the purpose of the IOTLB_PERSIST flag is just to give userspace 100% 
certainty that persistent iotlb mappings do not get lost across vdpa reset.


Thanks,
-Siwei

[1] 
https://lore.kernel.org/virtualization/9f118fc9-4f6f-dd67-a291-be78152e4...@oracle.com/
[2] 
https://lore.kernel.org/virtualization/3364adfd-1eb7-8bce-41f9-bfe5473f1...@oracle.com/

  Otherwise
we may break old userspace.

Thanks


+   vhost_vdpa_reset_map(v, asid);
 kfree(as);

 return 0;
--
1.8.3.1




Re: [PATCH 2/4] vhost-vdpa: reset vendor specific mapping to initial state in .release

2023-10-12 Thread Jason Wang
On Tue, Oct 10, 2023 at 5:05 PM Si-Wei Liu  wrote:
>
> Devices with on-chip IOMMU or vendor specific IOTLB implementation
> may need to restore iotlb mapping to the initial or default state
> using the .reset_map op, as it's desirable for some parent devices
> to solely manipulate mappings by its own, independent of virtio device
> state. For instance, device reset does not cause mapping go away on
> such IOTLB model in need of persistent mapping. Before vhost-vdpa
> is going away, give them a chance to reset iotlb back to the initial
> state in vhost_vdpa_cleanup().
>
> Signed-off-by: Si-Wei Liu 
> ---
>  drivers/vhost/vdpa.c | 16 
>  1 file changed, 16 insertions(+)
>
> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> index 851535f..a3f8160 100644
> --- a/drivers/vhost/vdpa.c
> +++ b/drivers/vhost/vdpa.c
> @@ -131,6 +131,15 @@ static struct vhost_vdpa_as 
> *vhost_vdpa_find_alloc_as(struct vhost_vdpa *v,
> return vhost_vdpa_alloc_as(v, asid);
>  }
>
> +static void vhost_vdpa_reset_map(struct vhost_vdpa *v, u32 asid)
> +{
> +   struct vdpa_device *vdpa = v->vdpa;
> +   const struct vdpa_config_ops *ops = vdpa->config;
> +
> +   if (ops->reset_map)
> +   ops->reset_map(vdpa, asid);
> +}
> +
>  static int vhost_vdpa_remove_as(struct vhost_vdpa *v, u32 asid)
>  {
> struct vhost_vdpa_as *as = asid_to_as(v, asid);
> @@ -140,6 +149,13 @@ static int vhost_vdpa_remove_as(struct vhost_vdpa *v, 
> u32 asid)
>
> hlist_del(&as->hash_link);
> vhost_vdpa_iotlb_unmap(v, &as->iotlb, 0ULL, 0ULL - 1, asid);
> +   /*
> +* Devices with vendor specific IOMMU may need to restore
> +* iotlb to the initial or default state which is not done
> +* through device reset, as the IOTLB mapping manipulation
> +* could be decoupled from the virtio device life cycle.
> +*/

Should we do this according to whether IOTLB_PERSIST is set? Otherwise
we may break old userspace.

Thanks

> +   vhost_vdpa_reset_map(v, asid);
> kfree(as);
>
> return 0;
> --
> 1.8.3.1
>


Re: [PATCH 2/4] vhost-vdpa: reset vendor specific mapping to initial state in .release

2023-10-12 Thread Si-Wei Liu



On 10/11/2023 4:21 AM, Eugenio Perez Martin wrote:

On Tue, Oct 10, 2023 at 11:05 AM Si-Wei Liu  wrote:

Devices with on-chip IOMMU or vendor specific IOTLB implementation
may need to restore iotlb mapping to the initial or default state
using the .reset_map op, as it's desirable for some parent devices
to solely manipulate mappings by its own, independent of virtio device
state. For instance, device reset does not cause mapping go away on
such IOTLB model in need of persistent mapping. Before vhost-vdpa
is going away, give them a chance to reset iotlb back to the initial
state in vhost_vdpa_cleanup().

Signed-off-by: Si-Wei Liu 
---
  drivers/vhost/vdpa.c | 16 
  1 file changed, 16 insertions(+)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index 851535f..a3f8160 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -131,6 +131,15 @@ static struct vhost_vdpa_as 
*vhost_vdpa_find_alloc_as(struct vhost_vdpa *v,
 return vhost_vdpa_alloc_as(v, asid);
  }

+static void vhost_vdpa_reset_map(struct vhost_vdpa *v, u32 asid)
+{
+   struct vdpa_device *vdpa = v->vdpa;
+   const struct vdpa_config_ops *ops = vdpa->config;
+
+   if (ops->reset_map)
+   ops->reset_map(vdpa, asid);
+}
+
  static int vhost_vdpa_remove_as(struct vhost_vdpa *v, u32 asid)
  {
 struct vhost_vdpa_as *as = asid_to_as(v, asid);
@@ -140,6 +149,13 @@ static int vhost_vdpa_remove_as(struct vhost_vdpa *v, u32 
asid)

 hlist_del(&as->hash_link);
 vhost_vdpa_iotlb_unmap(v, &as->iotlb, 0ULL, 0ULL - 1, asid);

Now I'm wondering, does this call to vhost_vdpa_iotlb_unmap sets a
different map (via .set_map) per element of the vhost_iotlb_itree?
Yes and no. Effectively, this vhost_vdpa_iotlb_unmap call will pass an 
empty iotlb with zero map entries down to the driver via .set_map, so for 
the .set_map interface it's always a different map no matter what. For 
this special case, the internal implementation of mlx5_vdpa .set_map may 
choose either to destroy the MR and recreate a new one, or to remove all 
mappings on the existing MR (currently it uses destroy+recreate for 
simplicity, without having to special-case it). But .reset_map is 
different: the 1:1 DMA MR has to be recreated explicitly after destroying 
the regular MR, so you see this is driver/device implementation specific.
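
As a rough illustration of that difference, here is a toy model only, not the mlx5_vdpa code. The sketch_* names, the generation counter and the idea of tagging the MR with a kind are all assumptions made for the example: .set_map always hands down a whole table (possibly empty) and rebuilds a regular MR, while .reset_map rebuilds the initial 1:1 DMA MR.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_mr_kind { SKETCH_MR_NONE, SKETCH_MR_REGULAR, SKETCH_MR_IDENTITY };

/* Hypothetical model of the memory region a parent driver maintains. */
struct sketch_mr {
	enum sketch_mr_kind kind;
	size_t nr_entries;       /* entries from the iotlb handed down */
	unsigned int generation; /* bumped on every destroy+recreate */
};

/* Destroy whatever exists and create a fresh MR of the given kind. */
static void sketch_mr_recreate(struct sketch_mr *mr, enum sketch_mr_kind kind,
			       size_t nr_entries)
{
	mr->kind = kind;
	mr->nr_entries = nr_entries;
	mr->generation++;
}

/* .set_map analogue: the whole table is passed down, even when empty. */
static int sketch_parent_set_map(struct sketch_mr *mr, size_t nr_entries)
{
	assert(mr);
	sketch_mr_recreate(mr, SKETCH_MR_REGULAR, nr_entries);
	return 0;
}

/* .reset_map analogue: drop the regular MR, restore the 1:1 DMA MR. */
static int sketch_parent_reset_map(struct sketch_mr *mr)
{
	assert(mr);
	sketch_mr_recreate(mr, SKETCH_MR_IDENTITY, 0);
	return 0;
}
```

Calling sketch_parent_set_map() with zero entries leaves an empty regular MR in this model, which is not the same end state as sketch_parent_reset_map(); that gap is what the explicit reset call in the cleanup path is meant to cover.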



  Not
a big deal since we're in the cleanup path, but it could be a nice
optimization on top as we're going to reset the map of the asid
anyway.
You mean wrapping up what's done in vhost_vdpa_iotlb_unmap and 
vhost_vdpa_reset_map into a new call, say vhost_vdpa_iotlb_reset? Yes, this 
is possible, but note that vhost_vdpa_iotlb_unmap also takes 
charge of pinning accounting in addition to mapping, and it also has to 
keep its own vhost_iotlb copy in sync. There's not much code 
that can be consolidated or generalized at this point, as 
vhost_vdpa_reset_map() is very specific to some device implementations, 
and I don't see a common need to optimize this further in the map/unmap 
hot path rather than in this cleanup slow path, just as you alluded to.


Regards,
-Siwei



+   /*
+* Devices with vendor specific IOMMU may need to restore
+* iotlb to the initial or default state which is not done
+* through device reset, as the IOTLB mapping manipulation
+* could be decoupled from the virtio device life cycle.
+*/
+   vhost_vdpa_reset_map(v, asid);
 kfree(as);

 return 0;
--
1.8.3.1


