date:20230920

Hi Michael,

> From: Michael S. Tsirkin 
> Sent: Thursday, September 21, 2023 11:12 AM

> I was attempting to have each of you see other's point of view.
> It seems clear I was right, at least one way communication was not getting
> through. Let me try to help.
> 
> 
> First, clearly Zhu Lingshan cares about the mediation use-case, not the un-
> mediated one.  Mediation is clearly heavier but also more powerful in many
> use-cases - is that obvious or do I need to list the reasons?

I agree to it.

> To mention one example, it supports cross-vendor migration. Which the
> unmediated variant maybe can in theory support too, and when it does maybe
> in a better and more structured way - but that will require standartization 
> effort
> that didn't happen yet. With mediation it was already demonstrated more than
> once.
>
We should be enhancing the device context so that more and more items can be 
annotated.
I started with small to get the design and idea through and will expand the 
device context so that cross vendors can migrate.

> 1. For mediation something that works within existing mediation framework -
> e.g. reusing as he does feature bits - will require less support than a 
> completely
> separate facility.
> I think Zhu Lingshan also believes that since there will be less code -> less
> security issues.
>
With approach of [1], there is less code in the core device migration flow 
because none of those fields etc are parsed/read/written by the driver software.
 
> 2. With or without mediation, the mapping of commands to VFs is simpler,
> allowing more control - for example, let's say you want to reset a VF - you 
> do not
> need to flush the queue of existing commands, which might potentially take a
> long time because some other VFs are very busy - you just reset the VF which
> any unmap flow will already do.
> 
If I understand you right, to reset a VF, no need to flush the queues without 
mediation too.
Just do VF FLR or do device reset, both will be fine.

> 
> 
> But Zhu Lingshan, all this will be pointless if you also do not try to do 
> this and list
> what are reasonable points that Parav made. Please do not mistake what I'm
> doing here for taking sides I just want the communication to start working. 
> And
> that means everyone tries to take all use-cases into account even if working 
> for
> a vendor that does not care about this use-case. Otherwise we will just keep
> getting into these flamewars.

Right. I like to take this opportunity to ask again, lets sit together and see 
if we can utilize common framework between two methods.
For example,
1. device context for mediation and without mediation can provide common 
structures that both solutions can use
2. device provisioning (not in any of our series, that can find common ways)

May be more can be merged once we are open to collaborate.
If we face technical issues in unifying the methods, it will be self-explained 
why both methods to exist or create something different for two different use 
cases.

-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state

On Thu, Sep 21, 2023 at 03:43:12AM +, Parav Pandit wrote:
> 
> 
> > From: Michael S. Tsirkin 
> > Sent: Thursday, September 21, 2023 1:34 AM
> > 
> > On Wed, Sep 20, 2023 at 05:21:52PM +, Parav Pandit wrote:
> > > > OK so with this summary in mind, can you find any advantages to
> > > > inband+mediation that are real or do you just see disadvantages? And
> > > > it's a tricky question because I can see some advantages ;)
> > >
> > > inband + mediation may be useful for nested case.
> > 
> > Hint: there's more.
> 
> Can you please list down?
> 
> The starting point of discussion is, there is passthrough member device 
> without mediation in virtio interface layers.
> How shall device migration should work for it?

I was attempting to have each of you see other's point of view.
It seems clear I was right, at least one way communication was
not getting through. Let me try to help.

First, clearly Zhu Lingshan cares about the mediation use-case, not the
un-mediated one.  Mediation is clearly heavier but also more powerful
in many use-cases - is that obvious or do I need to list the reasons?
To mention one example, it supports cross-vendor migration. Which the unmediated
variant maybe can in theory support too, and when it does maybe in a better and
more structured way - but that will require standartization effort that
didn't happen yet. With mediation it was already demonstrated more than
once.

1. For mediation something that works within existing mediation framework -
e.g. reusing as he does feature bits - will require less support
than a completely separate facility.
I think Zhu Lingshan also believes that since there will be less code ->
less security issues.

2. With or without mediation, the mapping of commands to VFs is simpler,
allowing more control - for example, let's say you want to reset a VF -
you do not need to flush the queue of existing commands, which might
potentially take a long time because some other VFs are very busy - you
just reset the VF which any unmap flow will already do.

But Zhu Lingshan, all this will be pointless if you also do not try to
do this and list what are reasonable points that Parav made. Please do
not mistake what I'm doing here for taking sides I just want the
communication to start working. And that means everyone tries to take
all use-cases into account even if working for a vendor that does not
care about this use-case. Otherwise we will just keep getting into these
flamewars.
-- 
MST

-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state

On Wed, Sep 20, 2023 at 07:27:05PM +0800, Zhu, Lingshan wrote:
> What if a malicious SW dump guest memory through admin vq LM facility?

What if malicious SW misconfigures vq through the SUSPEND bit facility?

-- 
MST


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

RE: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state



> From: Jason Wang 
> Sent: Thursday, September 21, 2023 9:50 AM

> Parav, I think I've clarified several times:
> 
> migration using the admin command is probably fine in some use cases.
>
This is definitely, was not clear to me.
I am 100% clear now.
 
> What's not fine, is:
> 
> Mandate the admin command to be the only way for migration.
>
For sure, my series did not mandate that either.
I kept asking if we both can converge it will be really good to merge two use 
cases, we should.
If we cannot because of technical issues, than both methods to exists to 
address two different use cases.

> Are we on the same page for my concern now?
> 
Yes.

> > What is the advantage of descriptor posting using virtqueue. It is the way 
> > of
> virtio spec...
> >
> > > > Well, it is one way to achieve it.
> > > > There may be different way to do all bulk data transfer without
> > > > admin
> > > commands.
> > >
> > > Why is virtqueue the only way to do bulk data transferring? Can't
> > > DMA be initiated by other-way?
> > >
> >
> > Sure, what is the disadvantage of existing mechanism of virtqueue that can 
> > do
> following.
> > 1. Ability to do DMA
> > 2. agnostic of the DMA who do not want to do DMA
> 
> I don't understand this.
>
Admin commands can work without DMA, right because they are transported using 
admin queue.
 
> > 3. Ability to multiple command executions in parallel
> 
> Each device has their self-contained interface, why can't the commands be
> executed in parallel.
> 
Within the device it cannot if the interface is synchronous.

> > 4. Non blocking interface for driver that does not require any kind of
> > polling
> 
> Are you saying the interrupt can only work for virtqueue?
>
No. I am saying if one has to invent a interface that satisfy above needs, it 
will become a virtqueue.
And if it is not, one should list those disadvantages and cost of new 
interface, and explain its benefits.
Such interface should be generic one too.
 
> >
> > Why to invent new DMA scheme which at does all the 4 tasks?
> 
> It's simply because admin virtqueue can not work for all the cases. I think 
> you've
> agreed on this, no?
>
I think it may work for nested case as well at the cost of replicating it on 
each device and adding special plumbing to isolate it, so that guest cannot 
issue driver notifications to it.

> > First please list down disadvantages of admin queue + show all 4 things are
> achieved using new DMA interface.
> > That will help to understand why new dma interface is needed.
> 
> I can give you a simple example. For example, what happens if we want to
> migrate the owner? Having another owner for this owner is not a good answer.

That is the nesting, don’t see any difference with other nesting.

[virtio-dev] Re: [VIRTIO PCI PATCH v5 1/1] transport-pci: Add freeze_mode to virtio_pci_common_cfg

On Tue, Sep 19, 2023 at 7:43 PM Jiqian Chen  wrote:
>
> When guest vm does S3, Qemu will reset and clear some things of virtio
> devices, but guest can't aware that, so that may cause some problems.
> For excample, Qemu calls virtio_reset->virtio_gpu_gl_reset when guest
> resume, that function will destroy render resources of virtio-gpu. As
> a result, after guest resume, the display can't come back and we only
> saw a black screen. Due to guest can't re-create all the resources, so
> we need to let Qemu not to destroy them when S3.
>
> For above purpose, we need a mechanism that allows guests and QEMU to
> negotiate their reset behavior. So this patch add a new parameter
> named freeze_mode to struct virtio_pci_common_cfg. And when guest
> suspends, it can write freeze_mode to be FREEZE_S3, and then virtio
> devices can change their reset behavior on Qemu side according to
> freeze_mode. What's more, freeze_mode can be used for all virtio
> devices to affect the behavior of Qemu, not just virtio gpu device.

A simple question, why is this issue specific to pci?

Thanks


>
> Signed-off-by: Jiqian Chen 
> ---
>  transport-pci.tex | 7 +++
>  1 file changed, 7 insertions(+)
>
> diff --git a/transport-pci.tex b/transport-pci.tex
> index a5c6719..2543536 100644
> --- a/transport-pci.tex
> +++ b/transport-pci.tex
> @@ -319,6 +319,7 @@ \subsubsection{Common configuration structure 
> layout}\label{sec:Virtio Transport
>  le64 queue_desc;/* read-write */
>  le64 queue_driver;  /* read-write */
>  le64 queue_device;  /* read-write */
> +le16 freeze_mode;   /* read-write */
>  le16 queue_notif_config_data;   /* read-only for driver */
>  le16 queue_reset;   /* read-write */
>
> @@ -393,6 +394,12 @@ \subsubsection{Common configuration structure 
> layout}\label{sec:Virtio Transport
>  \item[\field{queue_device}]
>  The driver writes the physical address of Device Area here.  See 
> section \ref{sec:Basic Facilities of a Virtio Device / Virtqueues}.
>
> +\item[\field{freeze_mode}]
> +The driver writes this to set the freeze mode of virtio pci.
> +VIRTIO_PCI_FREEZE_MODE_UNFREEZE - virtio-pci is running;
> +VIRTIO_PCI_FREEZE_MODE_FREEZE_S3 - guest vm is doing S3, and 
> virtio-pci enters S3 suspension;
> +Other values are reserved for future use, like S4, etc.
> +
>  \item[\field{queue_notif_config_data}]
>  This field exists only if VIRTIO_F_NOTIF_CONFIG_DATA has been 
> negotiated.
>  The driver will use this value when driver sends available buffer
> --
> 2.34.1
>


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

RE: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state


> From: Jason Wang 
> Sent: Thursday, September 21, 2023 8:56 AM


> My understanding is it might be better that each side do a summary of the both
> proposals.

I will summarize it soon in reply to [1].
Thanks.

[1] 
https://lore.kernel.org/virtio-comment/20230909142911.524407-7-pa...@nvidia.com/T/#md9fcfa1ba997463de8c7fb8c6d1786b224b0bead

Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state

On Thu, Sep 21, 2023 at 12:11 PM Parav Pandit  wrote:
>
>
>
> > From: Jason Wang 
> > Sent: Thursday, September 21, 2023 9:32 AM
> >
> > On Thu, Sep 21, 2023 at 11:51 AM Parav Pandit  wrote:
> > >
> > >
> > >
> > > > From: Jason Wang 
> > > > Sent: Thursday, September 21, 2023 8:45 AM The main issue I see so
> > > > far is that you want to couple migration with admin commands but I
> > > > don't see much advantages to doing this.
> > > >
> > > The way I read above comment is, to draw a parallel line: descriptor 
> > > posting in
> > virtio spec is tied to virtqueues. What is the advantage of it?
> >
> > Are you saying virtio can't live without admin commands? Again, let's not 
> > shift
> > concepts.
> >
> No, I did not say that.
> I just don’t see how functionalities proposed in [1] can be done without 
> admin commands by the _device_ for member device passthrough requirement.
>
> You made point as "don’t see much advantage with migration done using admin 
> commands".

Parav, I think I've clarified several times:

migration using the admin command is probably fine in some use cases.

What's not fine, is:

Mandate the admin command to be the only way for migration.

Are we on the same page for my concern now?

> What is the advantage of descriptor posting using virtqueue. It is the way of 
> virtio spec...
>
> > > Well, it is one way to achieve it.
> > > There may be different way to do all bulk data transfer without admin
> > commands.
> >
> > Why is virtqueue the only way to do bulk data transferring? Can't DMA be
> > initiated by other-way?
> >
>
> Sure, what is the disadvantage of existing mechanism of virtqueue that can do 
> following.
> 1. Ability to do DMA
> 2. agnostic of the DMA who do not want to do DMA

I don't understand this.

> 3. Ability to multiple command executions in parallel

Each device has their self-contained interface, why can't the commands
be executed in parallel.

> 4. Non blocking interface for driver that does not require any kind of polling

Are you saying the interrupt can only work for virtqueue?

>
> Why to invent new DMA scheme which at does all the 4 tasks?

It's simply because admin virtqueue can not work for all the cases. I
think you've agreed on this, no?

> First please list down disadvantages of admin queue + show all 4 things are 
> achieved using new DMA interface.
> That will help to understand why new dma interface is needed.

I can give you a simple example. For example, what happens if we want
to migrate the owner? Having another owner for this owner is not a
good answer.

Thanks


>
>
>
> > Thanks
> >
> > > What it the advantage of it, please list down them in [1] for the command
> > where you can find alternative.
> > >
> > > [1]
> > > https://lore.kernel.org/virtio-comment/20230909142911.524407-7-parav@n
> > > vidia.com/T/#md9fcfa1ba997463de8c7fb8c6d1786b224b0bead
>


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

RE: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state


> From: Jason Wang 
> Sent: Thursday, September 21, 2023 9:39 AM
> 
> On Thu, Sep 21, 2023 at 12:01 PM Parav Pandit  wrote:
> >
> >
> >
> > > From: Jason Wang 
> > > Sent: Thursday, September 21, 2023 8:48 AM
> >
> > > As replied in another thread, the issues for BAR are:
> > >
> > > 1) Not sure it can have an efficient interface, it would be
> > > something like VIRTIO_PCI_CAP_PCI_CFG which is very slow compared to
> > > single register accessing
> > > 2) There's no owner/group/member for MMIO, most of the time, we only
> > > need a single MMIO device. If we want the owner to manage itself, it
> > > seems redundant as is implied in all the existing transports (without 
> > > admin
> commands).
> > > Even if we had, it might still suffer from bootstrap issues.
> > > 3) For live migration, it means the admin commands needs to start
> > > from duplicating every existing transport specific interface it can
> > > give us. One example is that we may end up with two interfaces to
> > > access virtqueue addresses etc. This results in extra complicity and
> > > it is actually a full transport (driver can just use admin commands to 
> > > drive
> the device).
> > In [1] there is no duplication. The live migration driver never parses the 
> > device
> context either while reading or write.
> > Hence no code and no complexity in driver and no duplicate work.
> > Therefore, those admin commands are not to drive the guest device either.
> 
> I'm not sure how this is related to the duplication issue.
> 
You commented that admin virtqueue duplicates somethings.
And I explained above that it does not.

> >
> > > 4) Admin commands itself may not be capable of doing things like
> > > dirty page logging, it requires the assistance from the transport
> > >
> > Admin command in [1] is capable of dirty page logging.
> 
> In your design, the logging is done via DMA not the virtqueue.
> 
No. it is done via admin command, not DMA in [2].

[2] 
https://lore.kernel.org/virtio-comment/20230909142911.524407-7-pa...@nvidia.com/T/#m17b09acd8c73d374e98ad84764b315afa94f59c9

> The only job for virtqueue is to initiate the DMA. But if DMA can be 
> initiated via
> virtqueue, it can be done in other ways.
> 
Lets first establish 4 things in alternative way, any motivation to do so with 
5th point without giant registers need in device.

> >
> > > 1) Parav's proposal does several couplings: couple basic build
> > > blocks (suspend, dirty page tracking) with live migration, couple
> > > live migration with admin commands.
> > In which use case you find dirty page tracking useful without migration for
> which you like to see it detached from device migration flow?
> 
> Is it only the dirty page tracking? It's the combinations of
> 
> 1) suspending
> 2) device states
> 3) dirty page tracking
> 
> Eeah of those will have use cases other than live migration: VM stop, power
> management in VM, profiling and monitoring, failover etc.
> 
Suspend/resume with different power state is driven by the guest directly.
So it may find some overlap.

Device context has no overlap.

Dirty page tracking has no overlap. What do you want to profile and monitor? In 
case if you want to profile, it can be used without migration command anyway?
If you describe, may be we I can split "device migration" chapter to two 
pieces, 
Device management and device migration.

Device migration will use these basic facility.
Would that help you?

Also those can be split later when the actual use case can be described.


> > One can always use these commands if they wish to as_is.
> >
> > > 2) It's still not clear that Parav's proposal can work, a lot of
> > > corner cases needs to be examined
> > >
> > Please let me know which part can be improved in [1].
> 
> I will do that but it may take time. It's near to the public holiday.

I understand. No problem. Take your time.
I will proceed with v1 enhancements regardless for [1].

RE: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state

> From: Jason Wang 
> Sent: Thursday, September 21, 2023 9:32 AM
> 
> On Thu, Sep 21, 2023 at 11:51 AM Parav Pandit  wrote:
> >
> >
> >
> > > From: Jason Wang 
> > > Sent: Thursday, September 21, 2023 8:45 AM The main issue I see so
> > > far is that you want to couple migration with admin commands but I
> > > don't see much advantages to doing this.
> > >
> > The way I read above comment is, to draw a parallel line: descriptor 
> > posting in
> virtio spec is tied to virtqueues. What is the advantage of it?
> 
> Are you saying virtio can't live without admin commands? Again, let's not 
> shift
> concepts.
>
No, I did not say that.
I just don’t see how functionalities proposed in [1] can be done without admin 
commands by the _device_ for member device passthrough requirement.

You made point as "don’t see much advantage with migration done using admin 
commands".
What is the advantage of descriptor posting using virtqueue. It is the way of 
virtio spec...

> > Well, it is one way to achieve it.
> > There may be different way to do all bulk data transfer without admin
> commands.
> 
> Why is virtqueue the only way to do bulk data transferring? Can't DMA be
> initiated by other-way?
>

Sure, what is the disadvantage of existing mechanism of virtqueue that can do 
following.
1. Ability to do DMA
2. agnostic of the DMA who do not want to do DMA
3. Ability to multiple command executions in parallel
4. Non blocking interface for driver that does not require any kind of polling

Why to invent new DMA scheme which at does all the 4 tasks?
First please list down disadvantages of admin queue + show all 4 things are 
achieved using new DMA interface.
That will help to understand why new dma interface is needed.

> Thanks
> 
> > What it the advantage of it, please list down them in [1] for the command
> where you can find alternative.
> >
> > [1]
> > https://lore.kernel.org/virtio-comment/20230909142911.524407-7-parav@n
> > vidia.com/T/#md9fcfa1ba997463de8c7fb8c6d1786b224b0bead

Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state

On Thu, Sep 21, 2023 at 12:01 PM Parav Pandit  wrote:
>
>
>
> > From: Jason Wang 
> > Sent: Thursday, September 21, 2023 8:48 AM
>
> > As replied in another thread, the issues for BAR are:
> >
> > 1) Not sure it can have an efficient interface, it would be something like
> > VIRTIO_PCI_CAP_PCI_CFG which is very slow compared to single register
> > accessing
> > 2) There's no owner/group/member for MMIO, most of the time, we only need
> > a single MMIO device. If we want the owner to manage itself, it seems
> > redundant as is implied in all the existing transports (without admin 
> > commands).
> > Even if we had, it might still suffer from bootstrap issues.
> > 3) For live migration, it means the admin commands needs to start from
> > duplicating every existing transport specific interface it can give us. One
> > example is that we may end up with two interfaces to access virtqueue
> > addresses etc. This results in extra complicity and it is actually a full 
> > transport
> > (driver can just use admin commands to drive the device).
> In [1] there is no duplication. The live migration driver never parses the 
> device context either while reading or write.
> Hence no code and no complexity in driver and no duplicate work.
> Therefore, those admin commands are not to drive the guest device either.

I'm not sure how this is related to the duplication issue.

>
> > 4) Admin commands itself may not be capable of doing things like dirty page
> > logging, it requires the assistance from the transport
> >
> Admin command in [1] is capable of dirty page logging.

In your design, the logging is done via DMA not the virtqueue.

The only job for virtqueue is to initiate the DMA. But if DMA can be
initiated via virtqueue, it can be done in other ways.

>
> > 1) Parav's proposal does several couplings: couple basic build blocks 
> > (suspend,
> > dirty page tracking) with live migration, couple live migration with admin
> > commands.
> In which use case you find dirty page tracking useful without migration for 
> which you like to see it detached from device migration flow?

Is it only the dirty page tracking? It's the combinations of

1) suspending
2) device states
3) dirty page tracking

Eeah of those will have use cases other than live migration: VM stop,
power management in VM, profiling and monitoring, failover etc.

> One can always use these commands if they wish to as_is.
>
> > 2) It's still not clear that Parav's proposal can work, a lot of corner 
> > cases needs
> > to be examined
> >
> Please let me know which part can be improved in [1].

I will do that but it may take time. It's near to the public holiday.

Thanks

>
> [1] 
> https://lore.kernel.org/virtio-comment/20230909142911.524407-7-pa...@nvidia.com/T/#md9fcfa1ba997463de8c7fb8c6d1786b224b0bead
>
>
> > >
> > > > the bar is only a proxy, doesn't fix anything. and even larger side
> > > > channel attacking surface: vf-->pf-->vf
> > >
> > > In this model there's no pf. BAR belongs to vf itself and you submit
> > > commands for the VF through its BAR.
> > > Just separate from the pci config space.
> > >
> > > The whole attacking surface discussion is also puzzling.  We either
> > > are or are not discussing confidential computing/TDI.  I couldn't
> > > figure it out. This needs a separate thread I think.
> >
> > Anyhow, it's not bad to take it into consideration. But we can do it 
> > elsewhere for
> > sure.
> Thanks.
> Please have comments in [1].
>
> [1] 
> https://lore.kernel.org/virtio-comment/20230909142911.524407-7-pa...@nvidia.com/T/#md9fcfa1ba997463de8c7fb8c6d1786b224b0bead
>


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

RE: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state



> From: Jason Wang 
> Sent: Thursday, September 21, 2023 8:48 AM


> I'm fine. But TDISP is something that needs to be considered. The earlier we
> realize the possible issue the better.

[1] has considered this in the design.

[1] 
https://lore.kernel.org/virtio-comment/20230909142911.524407-7-pa...@nvidia.com/T/#md9fcfa1ba997463de8c7fb8c6d1786b224b0bead

Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state

On Thu, Sep 21, 2023 at 11:51 AM Parav Pandit  wrote:
>
>
>
> > From: Jason Wang 
> > Sent: Thursday, September 21, 2023 8:45 AM
> > The main issue I see so far is that
> > you want to couple migration with admin commands but I don't see much
> > advantages to doing this.
> >
> The way I read above comment is, to draw a parallel line: descriptor posting 
> in virtio spec is tied to virtqueues. What is the advantage of it?

Are you saying virtio can't live without admin commands? Again, let's
not shift concepts.

> Well, it is one way to achieve it.
> There may be different way to do all bulk data transfer without admin 
> commands.

Why is virtqueue the only way to do bulk data transferring? Can't DMA
be initiated by other-way?

Thanks

> What it the advantage of it, please list down them in [1] for the command 
> where you can find alternative.
>
> [1] 
> https://lore.kernel.org/virtio-comment/20230909142911.524407-7-pa...@nvidia.com/T/#md9fcfa1ba997463de8c7fb8c6d1786b224b0bead


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

RE: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state



> From: Jason Wang 
> Sent: Thursday, September 21, 2023 8:48 AM

> As replied in another thread, the issues for BAR are:
> 
> 1) Not sure it can have an efficient interface, it would be something like
> VIRTIO_PCI_CAP_PCI_CFG which is very slow compared to single register
> accessing
> 2) There's no owner/group/member for MMIO, most of the time, we only need
> a single MMIO device. If we want the owner to manage itself, it seems
> redundant as is implied in all the existing transports (without admin 
> commands).
> Even if we had, it might still suffer from bootstrap issues.
> 3) For live migration, it means the admin commands needs to start from
> duplicating every existing transport specific interface it can give us. One
> example is that we may end up with two interfaces to access virtqueue
> addresses etc. This results in extra complicity and it is actually a full 
> transport
> (driver can just use admin commands to drive the device).
In [1] there is no duplication. The live migration driver never parses the 
device context either while reading or write.
Hence no code and no complexity in driver and no duplicate work.
Therefore, those admin commands are not to drive the guest device either.

> 4) Admin commands itself may not be capable of doing things like dirty page
> logging, it requires the assistance from the transport
>
Admin command in [1] is capable of dirty page logging.
 
> 1) Parav's proposal does several couplings: couple basic build blocks 
> (suspend,
> dirty page tracking) with live migration, couple live migration with admin
> commands. 
In which use case you find dirty page tracking useful without migration for 
which you like to see it detached from device migration flow?
One can always use these commands if they wish to as_is.

> 2) It's still not clear that Parav's proposal can work, a lot of corner cases 
> needs
> to be examined
> 
Please let me know which part can be improved in [1].

[1] 
https://lore.kernel.org/virtio-comment/20230909142911.524407-7-pa...@nvidia.com/T/#md9fcfa1ba997463de8c7fb8c6d1786b224b0bead


> >
> > > the bar is only a proxy, doesn't fix anything. and even larger side
> > > channel attacking surface: vf-->pf-->vf
> >
> > In this model there's no pf. BAR belongs to vf itself and you submit
> > commands for the VF through its BAR.
> > Just separate from the pci config space.
> >
> > The whole attacking surface discussion is also puzzling.  We either
> > are or are not discussing confidential computing/TDI.  I couldn't
> > figure it out. This needs a separate thread I think.
> 
> Anyhow, it's not bad to take it into consideration. But we can do it 
> elsewhere for
> sure.
Thanks.
Please have comments in [1].

[1] 
https://lore.kernel.org/virtio-comment/20230909142911.524407-7-pa...@nvidia.com/T/#md9fcfa1ba997463de8c7fb8c6d1786b224b0bead

RE: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state



> From: Jason Wang 
> Sent: Thursday, September 21, 2023 8:45 AM
> The main issue I see so far is that
> you want to couple migration with admin commands but I don't see much
> advantages to doing this.
> 
The way I read above comment is, to draw a parallel line: descriptor posting in 
virtio spec is tied to virtqueues. What is the advantage of it?
Well, it is one way to achieve it.
There may be different way to do all bulk data transfer without admin commands.
What it the advantage of it, please list down them in [1] for the command where 
you can find alternative.

[1] 
https://lore.kernel.org/virtio-comment/20230909142911.524407-7-pa...@nvidia.com/T/#md9fcfa1ba997463de8c7fb8c6d1786b224b0bead

RE: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state




> From: Michael S. Tsirkin 
> Sent: Thursday, September 21, 2023 1:34 AM
> 
> On Wed, Sep 20, 2023 at 05:21:52PM +, Parav Pandit wrote:
> > > OK so with this summary in mind, can you find any advantages to
> > > inband+mediation that are real or do you just see disadvantages? And
> > > it's a tricky question because I can see some advantages ;)
> >
> > inband + mediation may be useful for nested case.
> 
> Hint: there's more.

Can you please list down?

The starting point of discussion is, there is passthrough member device without 
mediation in virtio interface layers.
How shall device migration should work for it?


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state

On Wed, Sep 20, 2023 at 8:41 PM Michael S. Tsirkin  wrote:
>
> On Wed, Sep 20, 2023 at 08:05:49AM -0400, Michael S. Tsirkin wrote:
> > On Wed, Sep 20, 2023 at 07:22:32PM +0800, Zhu, Lingshan wrote:
> > >
> > >
> > > On 9/20/2023 6:36 PM, Michael S. Tsirkin wrote:
> > > > On Wed, Sep 20, 2023 at 02:06:13PM +0800, Zhu, Lingshan wrote:
> > > > >
> > > > > On 9/19/2023 2:49 AM, Michael S. Tsirkin wrote:
> > > > > > On Mon, Sep 18, 2023 at 06:41:55PM +, Parav Pandit wrote:
> > > > > > > > Please refer to the code for setting FEATURES_OK.
> > > > > > > It wont work when one needs to suspend the device.
> > > > > > > There is no point of doing such work over registers as 
> > > > > > > fundamental framework is over the AQ.
> > > > > > Well not really. It's over admin commands. When these were built the
> > > > > > intent always was that it's possible to use admin commands through
> > > > > > another interface, other than admin queue. Is there a problem
> > > > > > implementing admin commands over a memory BAR? For example, I can 
> > > > > > see
> > > > > > an "admin command" capability pointing at a BAR where
> > > > > > commands are supplied, and using a new group type referring to
> > > > > > device itself.
> > > > > I am not sure, if a bar cap would be implemented as a proxy for the 
> > > > > admin vq
> > > > > based live migration.
> > > > Not a proxy for a vq in that there's no vq then.
> > > I think if the driver sends admin commands through a VF's bar, then
> > > VF forwards the admin commands to the PF, it acts like a proxy,
> > > or an agent. Anyway it takes admin commands.
> >
> > Why send them to the PF? They are controlling the VF anyway.
> >
> > > So the problems we have discussed still exist.
> > > >
> > > > > then the problems of admin vq LM that we have
> > > > > discussed
> > > > > still exist.
> > > > I freely admit the finer points of this extended flamewar have been lost
> > > > on me, and I wager I'm not the only one. I thought you wanted to migrate
> > > > the device just by accessing the device itself (e.g. the VF) without
> > > > accessing other devices (e.g. the PF), while Parav wants it in a
> > > > separate device so the whole of the device itself can passed through to
> > > > guest. Isn't this, fundamentally, the issue?
> > > we are implementing basic facilities for live migration.
> > >
> > > We have pointed out lots of issues, there are many discussions with
> > > Jason and Parav about the problems in migration by admin vq, for example:
> > > security, QOS and nested.
> >
> > /me shrugs
> > Thanks for the summary I guess. Same applies to almost any proposal.
> > What would help make progress is an explanation why this has grown into
> > a megathread.  Do you understand Parav's thoughts well enough to
> > summarize them?
>
>
> And Parav same goes for you - can you summarize Zhu Lingshan's position?

The root cause for the long debate is that there are a lot of
misunderstandings between each other. This can be seen from Parav's
reply.

My understanding is it might be better that each side do a summary of
the both proposals.

Thanks


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state

On Wed, Sep 20, 2023 at 8:05 PM Michael S. Tsirkin  wrote:
>
> On Wed, Sep 20, 2023 at 07:22:32PM +0800, Zhu, Lingshan wrote:
> >
> >
> > On 9/20/2023 6:36 PM, Michael S. Tsirkin wrote:
> > > On Wed, Sep 20, 2023 at 02:06:13PM +0800, Zhu, Lingshan wrote:
> > > >
> > > > On 9/19/2023 2:49 AM, Michael S. Tsirkin wrote:
> > > > > On Mon, Sep 18, 2023 at 06:41:55PM +, Parav Pandit wrote:
> > > > > > > Please refer to the code for setting FEATURES_OK.
> > > > > > It wont work when one needs to suspend the device.
> > > > > > There is no point of doing such work over registers as fundamental 
> > > > > > framework is over the AQ.
> > > > > Well not really. It's over admin commands. When these were built the
> > > > > intent always was that it's possible to use admin commands through
> > > > > another interface, other than admin queue. Is there a problem
> > > > > implementing admin commands over a memory BAR? For example, I can see
> > > > > an "admin command" capability pointing at a BAR where
> > > > > commands are supplied, and using a new group type referring to
> > > > > device itself.
> > > > I am not sure, if a bar cap would be implemented as a proxy for the 
> > > > admin vq
> > > > based live migration.
> > > Not a proxy for a vq in that there's no vq then.
> > I think if the driver sends admin commands through a VF's bar, then
> > VF forwards the admin commands to the PF, it acts like a proxy,
> > or an agent. Anyway it takes admin commands.
>
> Why send them to the PF? They are controlling the VF anyway.
>
> > So the problems we have discussed still exist.
> > >
> > > > then the problems of admin vq LM that we have
> > > > discussed
> > > > still exist.
> > > I freely admit the finer points of this extended flamewar have been lost
> > > on me, and I wager I'm not the only one. I thought you wanted to migrate
> > > the device just by accessing the device itself (e.g. the VF) without
> > > accessing other devices (e.g. the PF), while Parav wants it in a
> > > separate device so the whole of the device itself can passed through to
> > > guest. Isn't this, fundamentally, the issue?
> > we are implementing basic facilities for live migration.
> >
> > We have pointed out lots of issues, there are many discussions with
> > Jason and Parav about the problems in migration by admin vq, for example:
> > security, QOS and nested.
>
> /me shrugs
> Thanks for the summary I guess. Same applies to almost any proposal.

So it's something we need to consider in the virtio as well and it is
raised by different people. (Correct me if I was wrong)

Security, Parav.
Nest, me
QOS, LingShan

> What would help make progress is an explanation why this has grown into
> a megathread.  Do you understand Parav's thoughts well enough to
> summarize them?
>
> > >
> > > > the bar is only a proxy, doesn't fix anything. and even larger
> > > > side channel attacking surface: vf-->pf-->vf
> > > In this model there's no pf. BAR belongs to vf itself
> > > and you submit commands for the VF through its BAR.
> > > Just separate from the pci config space.
> > If using the bar to process admin commands,
> > is this solution too heavy compared to my proposal in this series?
>
> somewhat - because it's more comprehensive - you can actually
> migrate a device using it.

But it's not a must. And there will be a lot of duplication where it
will become a transport.

> this series just begins to define how to poke at some
> of the vq state - it's a subset of the necessary functionality.

It defines the minimal set of the functionality. We can have more for sure.

>
> And it will give you a bunch of side benefits, such as
> support for legacy compat commands that were merged.

Legacy has too many corner cases and why do we need to do such
revinenting of wheels? We had already had transitional devices for
years.

>
>
>
> > >
> > > The whole attacking surface discussion is also puzzling.  We either are
> > > or are not discussing confidential computing/TDI.  I couldn't figure
> > > it out. This needs a separate thread I think.
> > I agree confidential computing is out of spec. Parva mentioned TDISP and
> > even
> > in TDISP spec, it explicitly defined some attacking model, and PF is an
> > example.
> >
> > It is out of spec anyway.
>
> OK so we are ignoring TDISP applications for now? Everyone agrees on
> that?

I'm fine. But TDISP is something that needs to be considered. The
earlier we realize the possible issue the better.

Thanks




>
> --
> MST
>


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state

On Wed, Sep 20, 2023 at 6:36 PM Michael S. Tsirkin  wrote:
>
> On Wed, Sep 20, 2023 at 02:06:13PM +0800, Zhu, Lingshan wrote:
> >
> >
> > On 9/19/2023 2:49 AM, Michael S. Tsirkin wrote:
> > > On Mon, Sep 18, 2023 at 06:41:55PM +, Parav Pandit wrote:
> > > > > Please refer to the code for setting FEATURES_OK.
> > > > It wont work when one needs to suspend the device.
> > > > There is no point of doing such work over registers as fundamental 
> > > > framework is over the AQ.
> > > Well not really. It's over admin commands. When these were built the
> > > intent always was that it's possible to use admin commands through
> > > another interface, other than admin queue. Is there a problem
> > > implementing admin commands over a memory BAR? For example, I can see
> > > an "admin command" capability pointing at a BAR where
> > > commands are supplied, and using a new group type referring to
> > > device itself.
> > I am not sure, if a bar cap would be implemented as a proxy for the admin vq
> > based live migration.
>
> Not a proxy for a vq in that there's no vq then.

As replied in another thread, the issues for BAR are:

1) Not sure it can have an efficient interface, it would be something
like VIRTIO_PCI_CAP_PCI_CFG which is very slow compared to single
register accessing
2) There's no owner/group/member for MMIO, most of the time, we only
need a single MMIO device. If we want the owner to manage itself, it
seems redundant as is implied in all the existing transports (without
admin commands). Even if we had, it might still suffer from bootstrap
issues.
3) For live migration, it means the admin commands needs to start from
duplicating every existing transport specific interface it can give
us. One example is that we may end up with two interfaces to access
virtqueue addresses etc. This results in extra complicity and it is
actually a full transport (driver can just use admin commands to drive
the device).
4) Admin commands itself may not be capable of doing things like dirty
page logging, it requires the assistance from the transport

>
> > then the problems of admin vq LM that we have
> > discussed
> > still exist.
>
> I freely admit the finer points of this extended flamewar have been lost
> on me, and I wager I'm not the only one. I thought you wanted to migrate
> the device just by accessing the device itself (e.g. the VF) without
> accessing other devices (e.g. the PF), while Parav wants it in a
> separate device so the whole of the device itself can passed through to
> guest.

If accessing the device itself, anything prevents us from passing it
through to the guest? It is how all the existing devices are built.

> Isn't this, fundamentally, the issue?

For me it's not. The fundamental issues are:

1) Parav's proposal does several couplings: couple basic build blocks
(suspend, dirty page tracking) with live migration, couple live
migration with admin commands. This proposal doesn't do such coupling,
and admin commands can be built on top.
2) It's still not clear that Parav's proposal can work, a lot of
corner cases needs to be examined

>
> > the bar is only a proxy, doesn't fix anything. and even larger
> > side channel attacking surface: vf-->pf-->vf
>
> In this model there's no pf. BAR belongs to vf itself
> and you submit commands for the VF through its BAR.
> Just separate from the pci config space.
>
> The whole attacking surface discussion is also puzzling.  We either are
> or are not discussing confidential computing/TDI.  I couldn't figure
> it out. This needs a separate thread I think.

Anyhow, it's not bad to take it into consideration. But we can do it
elsewhere for sure.

Thanks

>
> --
> MST
>

-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state

On Wed, Sep 20, 2023 at 8:40 PM Michael S. Tsirkin  wrote:
>
> On Wed, Sep 20, 2023 at 08:16:13PM +0800, Zhu, Lingshan wrote:
> >
> >
> > On 9/20/2023 8:05 PM, Michael S. Tsirkin wrote:
> > > On Wed, Sep 20, 2023 at 07:22:32PM +0800, Zhu, Lingshan wrote:
> > > >
> > > > On 9/20/2023 6:36 PM, Michael S. Tsirkin wrote:
> > > > > On Wed, Sep 20, 2023 at 02:06:13PM +0800, Zhu, Lingshan wrote:
> > > > > > On 9/19/2023 2:49 AM, Michael S. Tsirkin wrote:
> > > > > > > On Mon, Sep 18, 2023 at 06:41:55PM +, Parav Pandit wrote:
> > > > > > > > > Please refer to the code for setting FEATURES_OK.
> > > > > > > > It wont work when one needs to suspend the device.
> > > > > > > > There is no point of doing such work over registers as 
> > > > > > > > fundamental framework is over the AQ.
> > > > > > > Well not really. It's over admin commands. When these were built 
> > > > > > > the
> > > > > > > intent always was that it's possible to use admin commands through
> > > > > > > another interface, other than admin queue. Is there a problem
> > > > > > > implementing admin commands over a memory BAR? For example, I can 
> > > > > > > see
> > > > > > > an "admin command" capability pointing at a BAR where
> > > > > > > commands are supplied, and using a new group type referring to
> > > > > > > device itself.
> > > > > > I am not sure, if a bar cap would be implemented as a proxy for the 
> > > > > > admin vq
> > > > > > based live migration.
> > > > > Not a proxy for a vq in that there's no vq then.
> > > > I think if the driver sends admin commands through a VF's bar, then
> > > > VF forwards the admin commands to the PF, it acts like a proxy,
> > > > or an agent. Anyway it takes admin commands.
> > > Why send them to the PF? They are controlling the VF anyway.
> > I think its still too heavy compared to this series proposal
>
> it will be on you to prove all the complexity is unnecessary though.
>
> > >
> > > > So the problems we have discussed still exist.
> > > > > > then the problems of admin vq LM that we have
> > > > > > discussed
> > > > > > still exist.
> > > > > I freely admit the finer points of this extended flamewar have been 
> > > > > lost
> > > > > on me, and I wager I'm not the only one. I thought you wanted to 
> > > > > migrate
> > > > > the device just by accessing the device itself (e.g. the VF) without
> > > > > accessing other devices (e.g. the PF), while Parav wants it in a
> > > > > separate device so the whole of the device itself can passed through 
> > > > > to
> > > > > guest. Isn't this, fundamentally, the issue?
> > > > we are implementing basic facilities for live migration.
> > > >
> > > > We have pointed out lots of issues, there are many discussions with
> > > > Jason and Parav about the problems in migration by admin vq, for 
> > > > example:
> > > > security, QOS and nested.
> > > /me shrugs
> > > Thanks for the summary I guess. Same applies to almost any proposal.
> > > What would help make progress is an explanation why this has grown into
> > > a megathread.  Do you understand Parav's thoughts well enough to
> > > summarize them?
> > as far as I see, I don't see admin vq as must for live migration.
> > and it does not serve nested for sure.
> > >
> > > > > > the bar is only a proxy, doesn't fix anything. and even larger
> > > > > > side channel attacking surface: vf-->pf-->vf
> > > > > In this model there's no pf. BAR belongs to vf itself
> > > > > and you submit commands for the VF through its BAR.
> > > > > Just separate from the pci config space.
> > > > If using the bar to process admin commands,
> > > > is this solution too heavy compared to my proposal in this series?
> > > somewhat - because it's more comprehensive - you can actually
> > > migrate a device using it.
> > > this series just begins to define how to poke at some
> > > of the vq state - it's a subset of the necessary functionality.
> > >
> > > And it will give you a bunch of side benefits, such as
> > > support for legacy compat commands that were merged.
> > next version will include in-flight descriptors and dirty page tracking.
>
> what we don't need is another version of this megathread.
> which it sounds like you intend to restart?
> nor do I cherish maintaining two independent mechanisms for doing
> the same thing in the spec.

I'm not sure how to define "the same thing" but we had different
transports for accessing basic facilities like virtqueue, device
status, etc. I don't get why live migration is different. Especially
considering migration is actually a combination of several independent
functions.

And we have already had different ways to transport legacy devices as
well. It's really hard to say one mechanism can work for all use
cases.

> all of the above is already in parav's patchset so you guys should find
> a way to work together rather than compete?

There're things that are missed in Parav's series for sure, that is
the migrating for owner and nest.

>From my point of view there's no

Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state

On Wed, Sep 20, 2023 at 05:21:52PM +, Parav Pandit wrote:
> > OK so with this summary in mind, can you find any advantages to
> > inband+mediation that are real or do you just see disadvantages? And
> > it's a tricky question because I can see some advantages ;)
> 
> inband + mediation may be useful for nested case.

Hint: there's more.

-- 
MST


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

RE: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state



> From: Michael S. Tsirkin 
> Sent: Wednesday, September 20, 2023 7:46 PM
> 
> > Details of his position in my view:
> >
> > 1. Device migration must be done through VF itself by suspending specific 
> > vqs
> and the VF device both.
> > 2. When device migration is done using #1, it must be done using mediation
> approach in hypervisor.
> >
> > 3. When migration is done using inband mediation it is more secure than AQ
> approach.
> > (as opposed to AQ of the owner device who enables/disables SR-IOV).
> >
> > 4. AQ is not secure.
> > But,
> > 5. AQ and admin commands can be built on top of his proposal #1, even if AQ
> is less secure. Opposing statements...
> >
> > 6. Dirty page tracking and inflight descriptors tracking to be done in his 
> > v1. but
> he does not want to review such coverage in [1].
> >
> > 8. Since his series does not cover any device context migration and
> > does not talk anything about it, I deduce that he plans to use cvq for 
> > setting
> ups RSS and other fields using inband CVQ of the VF.
> > This further limit the solution to only net device, ignoring rest of the 
> > other
> 20+ device types, where all may not have the CVQ.
> >
> > 9. trapping and emulation of following objects: AQ, CVQ, virtio config 
> > space,
> PCI FLR flow in hypervisor is secure, but when if AQ of the PF do far small 
> work
> of it, AQ is not secure.
> >
> > 10. Any traps proposed in #9 mostly do not work with future TDISP as TDISP 
> > do
> not bifurcate the device, so ignore them for now to promote inband migration.
> >
> > 11. He do not show interest in collaboration (even after requesting few 
> > times)
> to see if we can produce common commands that may work for both
> passthrough (without mediation) and using mediation for nested case.
> >
> > 12. Some how register access on single physical card for the PFs and VFs 
> > gives
> better QoS guarantee than virtqueue as registers can scale infinitely no 
> matter
> how many VFs or for multiple VQs because it is per VF.
> >
> > [1]
> > https://lore.kernel.org/virtio-comment/20230909142911.524407-7-parav@n
> > vidia.com/T/#md9fcfa1ba997463de8c7fb8c6d1786b224b0bead
> 
> 
> OK so with this summary in mind, can you find any advantages to
> inband+mediation that are real or do you just see disadvantages? And
> it's a tricky question because I can see some advantages ;)

inband + mediation may be useful for nested case.

In attempting inband + mediation, there are many critical pieces are let go.
It may be fine to let go for some cases but not for passthrough.

The fundamental advantages of owner-based approach I see are:
1. Nesting use case usually involve large number of VMs to be hosted in one VM
For this purpose, better to hand over a PF to level 0 VM that hosts VFs and 
level 1 VMs and avoid two level device nesting.

2. Support P2P natively

3. Single non replicated resource (AQ) manages less frequent work of device 
migration
No need to replicate AQs to thousands of VFs who rarely do the migration work.
Overall gains system, device, and memory efficiency.

5. Passthrough simply do not work at all ever without owner device.
This is because dirty page tracking, device context management, CVQ, FLR, 
config space, device status, MSIX config all must be trapped.
Many systems do not prefer this involvement of hypervisor even if the 
hypervisor is trusted. (to avoid moving parts).
New generation TEE, TPM devices are on horizon, and they would not like things 
not trapped either.
The security audit surface is very large for them.

6. Any new basic functionality added to device must always also require 
constant software updates at few layers in mediation entities

7. TDISP is inherently covered. Without owner device TDISP is broken as device 
cannot be bifurcated.

To me #2, #5, #7 are critical piece that a device migration must support/work 
with.
Rest is second level of importance.

-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state

On Wed, Sep 20, 2023 at 01:41:00PM +, Parav Pandit wrote:
> 
> > From: Michael S. Tsirkin 
> > Sent: Wednesday, September 20, 2023 6:12 PM
> 
> > And Parav same goes for you - can you summarize Zhu Lingshan's position?
> 
> Below is my summary about Zhu Lingshan's position:
> 
> One line summary of his position in my view:
> 
> 0. Use inband device migration only, use mediation, mediation is secure, but 
> AQ is not secure.
> 
> Details of his position in my view:
> 
> 1. Device migration must be done through VF itself by suspending specific vqs 
> and the VF device both.
> 2. When device migration is done using #1, it must be done using mediation 
> approach in hypervisor.
> 
> 3. When migration is done using inband mediation it is more secure than AQ 
> approach.
> (as opposed to AQ of the owner device who enables/disables SR-IOV).
> 
> 4. AQ is not secure.
> But,
> 5. AQ and admin commands can be built on top of his proposal #1, even if AQ 
> is less secure. Opposing statements...
> 
> 6. Dirty page tracking and inflight descriptors tracking to be done in his 
> v1. but he does not want to review such coverage in [1].
> 
> 8. Since his series does not cover any device context migration and does not 
> talk anything about it, 
> I deduce that he plans to use cvq for setting ups RSS and other fields using 
> inband CVQ of the VF.
> This further limit the solution to only net device, ignoring rest of the 
> other 20+ device types, where all may not have the CVQ.
> 
> 9. trapping and emulation of following objects: AQ, CVQ, virtio config space, 
> PCI FLR flow in hypervisor is secure, but when if AQ of the PF do far small 
> work of it, AQ is not secure.
> 
> 10. Any traps proposed in #9 mostly do not work with future TDISP as TDISP do 
> not bifurcate the device, so ignore them for now to promote inband migration.
> 
> 11. He do not show interest in collaboration (even after requesting few 
> times) to see if we can produce common commands that may work for both 
> passthrough (without mediation) and using mediation for nested case.
> 
> 12. Some how register access on single physical card for the PFs and VFs 
> gives better QoS guarantee than virtqueue as registers can scale infinitely 
> no matter how many VFs or for multiple VQs because it is per VF.
> 
> [1] 
> https://lore.kernel.org/virtio-comment/20230909142911.524407-7-pa...@nvidia.com/T/#md9fcfa1ba997463de8c7fb8c6d1786b224b0bead


OK so with this summary in mind, can you find any advantages to
inband+mediation that are real or do you just see disadvantages? And
it's a tricky question because I can see some advantages ;)


-- 
MST


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state

On Wed, Sep 20, 2023 at 01:41:00PM +, Parav Pandit wrote:
> 12. Some how register access on single physical card for the PFs and VFs 
> gives better QoS guarantee than virtqueue as registers can scale infinitely 
> no matter how many VFs or for multiple VQs because it is per VF.
>

This makes some sense as memory accesses to independent devices do not
need to be ordered. AQ's answer is multiple queues and out of order
execution within the queue. Not sure whether this is good enough.

-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

RE: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state

> From: Michael S. Tsirkin
> Sent: Wednesday, September 20, 2023 6:12 PM

> And Parav same goes for you - can you summarize Zhu Lingshan's position?

Below is my summary about Zhu Lingshan's position:

One line summary of his position in my view:

0. Use inband device migration only, use mediation, mediation is secure, but AQ
is not secure.

Details of his position in my view:

1. Device migration must be done through VF itself by suspending specific vqs
and the VF device both.
2. When device migration is done using #1, it must be done using mediation
approach in hypervisor.

3. When migration is done using inband mediation it is more secure than AQ
approach.
(as opposed to AQ of the owner device who enables/disables SR-IOV).

4. AQ is not secure.
But,
5. AQ and admin commands can be built on top of his proposal #1, even if AQ is
less secure. Opposing statements...

6. Dirty page tracking and inflight descriptors tracking to be done in his v1.
but he does not want to review such coverage in [1].

8. Since his series does not cover any device context migration and does not
talk anything about it,
I deduce that he plans to use cvq for setting ups RSS and other fields using
inband CVQ of the VF.
This further limit the solution to only net device, ignoring rest of the other
20+ device types, where all may not have the CVQ.

9. trapping and emulation of following objects: AQ, CVQ, virtio config space,
PCI FLR flow in hypervisor is secure, but when if AQ of the PF do far small
work of it, AQ is not secure.

10. Any traps proposed in #9 mostly do not work with future TDISP as TDISP do
not bifurcate the device, so ignore them for now to promote inband migration.

11. He do not show interest in collaboration (even after requesting few times)
to see if we can produce common commands that may work for both passthrough
(without mediation) and using mediation for nested case.

12. Some how register access on single physical card for the PFs and VFs gives
better QoS guarantee than virtqueue as registers can scale infinitely no matter
how many VFs or for multiple VQs because it is per VF.

[1]
https://lore.kernel.org/virtio-comment/20230909142911.524407-7-pa...@nvidia.com/T/#md9fcfa1ba997463de8c7fb8c6d1786b224b0bead

-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state

On Wed, Sep 20, 2023 at 08:05:49AM -0400, Michael S. Tsirkin wrote:
> On Wed, Sep 20, 2023 at 07:22:32PM +0800, Zhu, Lingshan wrote:
> > 
> > 
> > On 9/20/2023 6:36 PM, Michael S. Tsirkin wrote:
> > > On Wed, Sep 20, 2023 at 02:06:13PM +0800, Zhu, Lingshan wrote:
> > > > 
> > > > On 9/19/2023 2:49 AM, Michael S. Tsirkin wrote:
> > > > > On Mon, Sep 18, 2023 at 06:41:55PM +, Parav Pandit wrote:
> > > > > > > Please refer to the code for setting FEATURES_OK.
> > > > > > It wont work when one needs to suspend the device.
> > > > > > There is no point of doing such work over registers as fundamental 
> > > > > > framework is over the AQ.
> > > > > Well not really. It's over admin commands. When these were built the
> > > > > intent always was that it's possible to use admin commands through
> > > > > another interface, other than admin queue. Is there a problem
> > > > > implementing admin commands over a memory BAR? For example, I can see
> > > > > an "admin command" capability pointing at a BAR where
> > > > > commands are supplied, and using a new group type referring to
> > > > > device itself.
> > > > I am not sure, if a bar cap would be implemented as a proxy for the 
> > > > admin vq
> > > > based live migration.
> > > Not a proxy for a vq in that there's no vq then.
> > I think if the driver sends admin commands through a VF's bar, then
> > VF forwards the admin commands to the PF, it acts like a proxy,
> > or an agent. Anyway it takes admin commands.
> 
> Why send them to the PF? They are controlling the VF anyway.
> 
> > So the problems we have discussed still exist.
> > > 
> > > > then the problems of admin vq LM that we have
> > > > discussed
> > > > still exist.
> > > I freely admit the finer points of this extended flamewar have been lost
> > > on me, and I wager I'm not the only one. I thought you wanted to migrate
> > > the device just by accessing the device itself (e.g. the VF) without
> > > accessing other devices (e.g. the PF), while Parav wants it in a
> > > separate device so the whole of the device itself can passed through to
> > > guest. Isn't this, fundamentally, the issue?
> > we are implementing basic facilities for live migration.
> > 
> > We have pointed out lots of issues, there are many discussions with
> > Jason and Parav about the problems in migration by admin vq, for example:
> > security, QOS and nested.
> 
> /me shrugs
> Thanks for the summary I guess. Same applies to almost any proposal.
> What would help make progress is an explanation why this has grown into
> a megathread.  Do you understand Parav's thoughts well enough to
> summarize them?


And Parav same goes for you - can you summarize Zhu Lingshan's position?

> > > 
> > > > the bar is only a proxy, doesn't fix anything. and even larger
> > > > side channel attacking surface: vf-->pf-->vf
> > > In this model there's no pf. BAR belongs to vf itself
> > > and you submit commands for the VF through its BAR.
> > > Just separate from the pci config space.
> > If using the bar to process admin commands,
> > is this solution too heavy compared to my proposal in this series?
> 
> somewhat - because it's more comprehensive - you can actually
> migrate a device using it.
> this series just begins to define how to poke at some
> of the vq state - it's a subset of the necessary functionality.
> 
> And it will give you a bunch of side benefits, such as
> support for legacy compat commands that were merged.
> 
> 
> 
> > > 
> > > The whole attacking surface discussion is also puzzling.  We either are
> > > or are not discussing confidential computing/TDI.  I couldn't figure
> > > it out. This needs a separate thread I think.
> > I agree confidential computing is out of spec. Parva mentioned TDISP and
> > even
> > in TDISP spec, it explicitly defined some attacking model, and PF is an
> > example.
> > 
> > It is out of spec anyway.
> 
> OK so we are ignoring TDISP applications for now? Everyone agrees on
> that?
> 
> -- 
> MST


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state

On Wed, Sep 20, 2023 at 08:16:13PM +0800, Zhu, Lingshan wrote:
> 
> 
> On 9/20/2023 8:05 PM, Michael S. Tsirkin wrote:
> > On Wed, Sep 20, 2023 at 07:22:32PM +0800, Zhu, Lingshan wrote:
> > > 
> > > On 9/20/2023 6:36 PM, Michael S. Tsirkin wrote:
> > > > On Wed, Sep 20, 2023 at 02:06:13PM +0800, Zhu, Lingshan wrote:
> > > > > On 9/19/2023 2:49 AM, Michael S. Tsirkin wrote:
> > > > > > On Mon, Sep 18, 2023 at 06:41:55PM +, Parav Pandit wrote:
> > > > > > > > Please refer to the code for setting FEATURES_OK.
> > > > > > > It wont work when one needs to suspend the device.
> > > > > > > There is no point of doing such work over registers as 
> > > > > > > fundamental framework is over the AQ.
> > > > > > Well not really. It's over admin commands. When these were built the
> > > > > > intent always was that it's possible to use admin commands through
> > > > > > another interface, other than admin queue. Is there a problem
> > > > > > implementing admin commands over a memory BAR? For example, I can 
> > > > > > see
> > > > > > an "admin command" capability pointing at a BAR where
> > > > > > commands are supplied, and using a new group type referring to
> > > > > > device itself.
> > > > > I am not sure, if a bar cap would be implemented as a proxy for the 
> > > > > admin vq
> > > > > based live migration.
> > > > Not a proxy for a vq in that there's no vq then.
> > > I think if the driver sends admin commands through a VF's bar, then
> > > VF forwards the admin commands to the PF, it acts like a proxy,
> > > or an agent. Anyway it takes admin commands.
> > Why send them to the PF? They are controlling the VF anyway.
> I think its still too heavy compared to this series proposal

it will be on you to prove all the complexity is unnecessary though.

> > 
> > > So the problems we have discussed still exist.
> > > > > then the problems of admin vq LM that we have
> > > > > discussed
> > > > > still exist.
> > > > I freely admit the finer points of this extended flamewar have been lost
> > > > on me, and I wager I'm not the only one. I thought you wanted to migrate
> > > > the device just by accessing the device itself (e.g. the VF) without
> > > > accessing other devices (e.g. the PF), while Parav wants it in a
> > > > separate device so the whole of the device itself can passed through to
> > > > guest. Isn't this, fundamentally, the issue?
> > > we are implementing basic facilities for live migration.
> > > 
> > > We have pointed out lots of issues, there are many discussions with
> > > Jason and Parav about the problems in migration by admin vq, for example:
> > > security, QOS and nested.
> > /me shrugs
> > Thanks for the summary I guess. Same applies to almost any proposal.
> > What would help make progress is an explanation why this has grown into
> > a megathread.  Do you understand Parav's thoughts well enough to
> > summarize them?
> as far as I see, I don't see admin vq as must for live migration.
> and it does not serve nested for sure.
> > 
> > > > > the bar is only a proxy, doesn't fix anything. and even larger
> > > > > side channel attacking surface: vf-->pf-->vf
> > > > In this model there's no pf. BAR belongs to vf itself
> > > > and you submit commands for the VF through its BAR.
> > > > Just separate from the pci config space.
> > > If using the bar to process admin commands,
> > > is this solution too heavy compared to my proposal in this series?
> > somewhat - because it's more comprehensive - you can actually
> > migrate a device using it.
> > this series just begins to define how to poke at some
> > of the vq state - it's a subset of the necessary functionality.
> > 
> > And it will give you a bunch of side benefits, such as
> > support for legacy compat commands that were merged.
> next version will include in-flight descriptors and dirty page tracking.

what we don't need is another version of this megathread.
which it sounds like you intend to restart?
nor do I cherish maintaining two independent mechanisms for doing
the same thing in the spec.
all of the above is already in parav's patchset so you guys should find
a way to work together rather than compete?

> I failed to process the comments for legacy.
> legacy devices are defined by code than the spec, if wants to migrate legacy
> devices, maybe working on QEMU first

that is not much in the way of addressing, just saying go pound sand.
the functionality has already been accepted by the TC though I don't
know what you are trying to say here. that we should drop it from spec?

> > 
> > 
> > 
> > > > The whole attacking surface discussion is also puzzling.  We either are
> > > > or are not discussing confidential computing/TDI.  I couldn't figure
> > > > it out. This needs a separate thread I think.
> > > I agree confidential computing is out of spec. Parva mentioned TDISP and
> > > even
> > > in TDISP spec, it explicitly defined some attacking model, and PF is an
> > > example.
> > > 
> > > It is out of spec anyway.
> > OK so

Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state

On Wed, Sep 20, 2023 at 08:05:24PM +0800, Zhu, Lingshan wrote:
> 
> 
> On 9/20/2023 7:52 PM, Michael S. Tsirkin wrote:
> > On Wed, Sep 20, 2023 at 07:28:39PM +0800, Zhu, Lingshan wrote:
> > > 
> > > On 9/20/2023 6:55 PM, Parav Pandit wrote:
> > > > > From: Michael S. Tsirkin 
> > > > > Sent: Wednesday, September 20, 2023 4:06 PM
> > > > > I freely admit the finer points of this extended flamewar have been 
> > > > > lost on me,
> > > > > and I wager I'm not the only one. I thought you wanted to migrate the 
> > > > > device
> > > > > just by accessing the device itself (e.g. the VF) without accessing 
> > > > > other devices
> > > > > (e.g. the PF), while Parav wants it in a separate device so the whole 
> > > > > of the
> > > > > device itself can passed through to guest. Isn't this, fundamentally, 
> > > > > the issue?
> > > > Right. An admin device doing the work of device migration. Today it is 
> > > > the owner PF.
> > > > In future it can be other admin device who is deleted this task of 
> > > > migration, who can be group owner.
> > > > All the admin commands that we plumb here just works great in that 
> > > > CC/TDI future, because only thing changes is the admin device issuing 
> > > > this command.
> > > > 
> > > > > > the bar is only a proxy, doesn't fix anything. and even larger side
> > > > > > channel attacking surface: vf-->pf-->vf
> > > > > In this model there's no pf. BAR belongs to vf itself and you submit 
> > > > > commands
> > > > > for the VF through its BAR.
> > > > > Just separate from the pci config space.
> > > > > 
> > > > > The whole attacking surface discussion is also puzzling.  We either 
> > > > > are or are
> > > > > not discussing confidential computing/TDI.  I couldn't figure it out. 
> > > > > This needs a
> > > > > separate thread I think.
> > > > True. Many of Lingshan thoughts/comments gets mixed I feel.
> > > > Because he proposes trap+emulation/mediation-based solution by 
> > > > hypervisor and none of that is secure anyway in CC/TDI concept.
> > > > He keeps attacking AQ as some side channel attack, while somehow 
> > > > trap+emulation also done by hypervisor is secure, which obviously does 
> > > > not make sense in CC/TDI concept.
> > > > Both scores equal where hypervisor trust is of concern.
> > > Please answer directly:
> > And here you go discussing this in the same thread. I feel you guys are
> > wasting bytes copying the list with this most people lost track
> > if not interest.
> I agree, although I have to reply because Parav said I am "attacking" AQ
> which is
> not a good wording.
> And I need to show its not attacking, this is discussions on
> LM topics, there may be some debating, and I surely need to provide proof.

I will be very surprised if something costructive comes out of debates
in this style.

Some sure ways to detect flamewars:
- deep thread
- repeating same claims
- ignoring questions/comments

This passes with flying colors and I'm not going to point fingers
but really please start seeing each other's point of view.


> > 
> > > What if a malicious SW suspend the guest when it is running through admin 
> > > vq
> > > live migration facility
> > I doubt suspend is a problem - looks like a denial of service to me
> > and that is not considered part of the threat model at least going by
> > the documents confidential computing guys are posting on lkml.
> Yes this is a denial of service and it can be a problem if the service is a
> critical service
> like a remote attestation server.
> 
> So suspending the VM by admin vq LM commands can attack the system.

um did you read the coco threat model posts? they are educational.
ability to deny service to a VM is currently fundamental to building
PaaS platforms on top of it.


> > 
> > 
> > > What if a malicious SW dump guest memory by tracking guest dirty pages by
> > > admin vq live migration faclity
> > All this does is tell you which pages did device access though.
> > It looks like on many architectures this information is readily
> > available anyway due to host page tables being under the hypervisor
> > control, since this is how it's migrated. Problem? How is memory
> > migrated otherwise?
> It tracks dirty pages, may record them in a bitmap.
> 
> Without CoCo, procdump or qemu dump-guest-memory can dump the guest memory
> pages,
> but it does not know which part of memory is guest secrets.
> 
> For example, if a malicious SW wants to sniff the guest networking, the
> bitmap
> of the dirty pages can help to locate the network DMA pages. This also apply
> to disk IOs
> 
> So this enlarges the attacking surface.
> 
> Current live migration solution does not use any PFs tracking dirty pages,
> so no such side channel.


You must be joking. Look into the virtio ring, there are DMA addresses
right there.


> > 
> > > > And admin command approach [1] has clear direction for CC to delete 
> > > > those admin commands to a dedicated trusted entity instead of 
> > > > hypervisor.
> > > > 
> > >

Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state





On 9/20/2023 8:05 PM, Michael S. Tsirkin wrote:

On Wed, Sep 20, 2023 at 07:22:32PM +0800, Zhu, Lingshan wrote:


On 9/20/2023 6:36 PM, Michael S. Tsirkin wrote:

On Wed, Sep 20, 2023 at 02:06:13PM +0800, Zhu, Lingshan wrote:

On 9/19/2023 2:49 AM, Michael S. Tsirkin wrote:

On Mon, Sep 18, 2023 at 06:41:55PM +, Parav Pandit wrote:

Please refer to the code for setting FEATURES_OK.

It wont work when one needs to suspend the device.
There is no point of doing such work over registers as fundamental framework is 
over the AQ.

Well not really. It's over admin commands. When these were built the
intent always was that it's possible to use admin commands through
another interface, other than admin queue. Is there a problem
implementing admin commands over a memory BAR? For example, I can see
an "admin command" capability pointing at a BAR where
commands are supplied, and using a new group type referring to
device itself.

I am not sure, if a bar cap would be implemented as a proxy for the admin vq
based live migration.

Not a proxy for a vq in that there's no vq then.

I think if the driver sends admin commands through a VF's bar, then
VF forwards the admin commands to the PF, it acts like a proxy,
or an agent. Anyway it takes admin commands.

Why send them to the PF? They are controlling the VF anyway.

I think its still too heavy compared to this series proposal



So the problems we have discussed still exist.

then the problems of admin vq LM that we have
discussed
still exist.

I freely admit the finer points of this extended flamewar have been lost
on me, and I wager I'm not the only one. I thought you wanted to migrate
the device just by accessing the device itself (e.g. the VF) without
accessing other devices (e.g. the PF), while Parav wants it in a
separate device so the whole of the device itself can passed through to
guest. Isn't this, fundamentally, the issue?

we are implementing basic facilities for live migration.

We have pointed out lots of issues, there are many discussions with
Jason and Parav about the problems in migration by admin vq, for example:
security, QOS and nested.

/me shrugs
Thanks for the summary I guess. Same applies to almost any proposal.
What would help make progress is an explanation why this has grown into
a megathread.  Do you understand Parav's thoughts well enough to
summarize them?

as far as I see, I don't see admin vq as must for live migration.
and it does not serve nested for sure.



the bar is only a proxy, doesn't fix anything. and even larger
side channel attacking surface: vf-->pf-->vf

In this model there's no pf. BAR belongs to vf itself
and you submit commands for the VF through its BAR.
Just separate from the pci config space.

If using the bar to process admin commands,
is this solution too heavy compared to my proposal in this series?

somewhat - because it's more comprehensive - you can actually
migrate a device using it.
this series just begins to define how to poke at some
of the vq state - it's a subset of the necessary functionality.

And it will give you a bunch of side benefits, such as
support for legacy compat commands that were merged.

next version will include in-flight descriptors and dirty page tracking.
I failed to process the comments for legacy.
legacy devices are defined by code than the spec, if wants to migrate legacy
devices, maybe working on QEMU first





The whole attacking surface discussion is also puzzling.  We either are
or are not discussing confidential computing/TDI.  I couldn't figure
it out. This needs a separate thread I think.

I agree confidential computing is out of spec. Parva mentioned TDISP and
even
in TDISP spec, it explicitly defined some attacking model, and PF is an
example.

It is out of spec anyway.

OK so we are ignoring TDISP applications for now? Everyone agrees on
that?

sure





-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

RE: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state



> From: Michael S. Tsirkin 
> Sent: Wednesday, September 20, 2023 5:36 PM
> 
> OK so we are ignoring TDISP applications for now? Everyone agrees on that?

We are actively considering TDISP applications to support in (unknown) future 
in a way that new spec additions for new features we do, does not block the 
future TDISP support.

-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state





On 9/20/2023 8:05 PM, Zhu, Lingshan wrote:



On 9/20/2023 7:52 PM, Michael S. Tsirkin wrote:

On Wed, Sep 20, 2023 at 07:28:39PM +0800, Zhu, Lingshan wrote:


On 9/20/2023 6:55 PM, Parav Pandit wrote:

From: Michael S. Tsirkin 
Sent: Wednesday, September 20, 2023 4:06 PM
I freely admit the finer points of this extended flamewar have 
been lost on me,
and I wager I'm not the only one. I thought you wanted to migrate 
the device
just by accessing the device itself (e.g. the VF) without 
accessing other devices
(e.g. the PF), while Parav wants it in a separate device so the 
whole of the
device itself can passed through to guest. Isn't this, 
fundamentally, the issue?
Right. An admin device doing the work of device migration. Today it 
is the owner PF.
In future it can be other admin device who is deleted this task of 
migration, who can be group owner.
All the admin commands that we plumb here just works great in that 
CC/TDI future, because only thing changes is the admin device 
issuing this command.



the bar is only a proxy, doesn't fix anything. and even larger side
channel attacking surface: vf-->pf-->vf
In this model there's no pf. BAR belongs to vf itself and you 
submit commands

for the VF through its BAR.
Just separate from the pci config space.

The whole attacking surface discussion is also puzzling.  We 
either are or are
not discussing confidential computing/TDI.  I couldn't figure it 
out. This needs a

separate thread I think.

True. Many of Lingshan thoughts/comments gets mixed I feel.
Because he proposes trap+emulation/mediation-based solution by 
hypervisor and none of that is secure anyway in CC/TDI concept.
He keeps attacking AQ as some side channel attack, while somehow 
trap+emulation also done by hypervisor is secure, which obviously 
does not make sense in CC/TDI concept.

Both scores equal where hypervisor trust is of concern.

Please answer directly:

And here you go discussing this in the same thread. I feel you guys are
wasting bytes copying the list with this most people lost track
if not interest.
I agree, although I have to reply because Parav said I am "attacking" 
AQ which is

not a good wording.

And I need to show its not attacking, this is discussions on
LM topics, there may be some debating, and I surely need to provide 
proof.


What if a malicious SW suspend the guest when it is running through 
admin vq

live migration facility

I doubt suspend is a problem - looks like a denial of service to me
and that is not considered part of the threat model at least going by
the documents confidential computing guys are posting on lkml.
Yes this is a denial of service and it can be a problem if the service 
is a critical service

like a remote attestation server.

So suspending the VM by admin vq LM commands can attack the system.



What if a malicious SW dump guest memory by tracking guest dirty 
pages by

admin vq live migration faclity

All this does is tell you which pages did device access though.
It looks like on many architectures this information is readily
available anyway due to host page tables being under the hypervisor
control, since this is how it's migrated. Problem? How is memory
migrated otherwise?

It tracks dirty pages, may record them in a bitmap.

Without CoCo, procdump or qemu dump-guest-memory can dump the guest 
memory pages,

but it does not know which part of memory is guest secrets.

For example, if a malicious SW wants to sniff the guest networking, 
the bitmap
of the dirty pages can help to locate the network DMA pages. This also 
apply

to disk IOs

So this enlarges the attacking surface.

Current live migration solution does not use any PFs tracking dirty 
pages,

so no such side channel.

supplementary comments for my own reply:

Confidential computing is still out of the spec, and I think we should focus
on current solution



And admin command approach [1] has clear direction for CC to delete 
those admin commands to a dedicated trusted entity instead of 
hypervisor.


I try to explain these few times, but..

Anyways, if AQ has some comments better to reply in its thread at [1].

[1] 
https://lore.kernel.org/virtio-comment/20230909142911.524407-7-pa...@nvidia.com/T/#md9fcfa1ba997463de8c7fb8c6d1786b224b0bead


I will post v1 for [1] with more mature device context this week 
along with future provisioning item note.



-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org




-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state

On Wed, Sep 20, 2023 at 07:22:32PM +0800, Zhu, Lingshan wrote:
> 
> 
> On 9/20/2023 6:36 PM, Michael S. Tsirkin wrote:
> > On Wed, Sep 20, 2023 at 02:06:13PM +0800, Zhu, Lingshan wrote:
> > > 
> > > On 9/19/2023 2:49 AM, Michael S. Tsirkin wrote:
> > > > On Mon, Sep 18, 2023 at 06:41:55PM +, Parav Pandit wrote:
> > > > > > Please refer to the code for setting FEATURES_OK.
> > > > > It wont work when one needs to suspend the device.
> > > > > There is no point of doing such work over registers as fundamental 
> > > > > framework is over the AQ.
> > > > Well not really. It's over admin commands. When these were built the
> > > > intent always was that it's possible to use admin commands through
> > > > another interface, other than admin queue. Is there a problem
> > > > implementing admin commands over a memory BAR? For example, I can see
> > > > an "admin command" capability pointing at a BAR where
> > > > commands are supplied, and using a new group type referring to
> > > > device itself.
> > > I am not sure, if a bar cap would be implemented as a proxy for the admin 
> > > vq
> > > based live migration.
> > Not a proxy for a vq in that there's no vq then.
> I think if the driver sends admin commands through a VF's bar, then
> VF forwards the admin commands to the PF, it acts like a proxy,
> or an agent. Anyway it takes admin commands.

Why send them to the PF? They are controlling the VF anyway.

> So the problems we have discussed still exist.
> > 
> > > then the problems of admin vq LM that we have
> > > discussed
> > > still exist.
> > I freely admit the finer points of this extended flamewar have been lost
> > on me, and I wager I'm not the only one. I thought you wanted to migrate
> > the device just by accessing the device itself (e.g. the VF) without
> > accessing other devices (e.g. the PF), while Parav wants it in a
> > separate device so the whole of the device itself can passed through to
> > guest. Isn't this, fundamentally, the issue?
> we are implementing basic facilities for live migration.
> 
> We have pointed out lots of issues, there are many discussions with
> Jason and Parav about the problems in migration by admin vq, for example:
> security, QOS and nested.

/me shrugs
Thanks for the summary I guess. Same applies to almost any proposal.
What would help make progress is an explanation why this has grown into
a megathread.  Do you understand Parav's thoughts well enough to
summarize them?

> > 
> > > the bar is only a proxy, doesn't fix anything. and even larger
> > > side channel attacking surface: vf-->pf-->vf
> > In this model there's no pf. BAR belongs to vf itself
> > and you submit commands for the VF through its BAR.
> > Just separate from the pci config space.
> If using the bar to process admin commands,
> is this solution too heavy compared to my proposal in this series?

somewhat - because it's more comprehensive - you can actually
migrate a device using it.
this series just begins to define how to poke at some
of the vq state - it's a subset of the necessary functionality.

And it will give you a bunch of side benefits, such as
support for legacy compat commands that were merged.



> > 
> > The whole attacking surface discussion is also puzzling.  We either are
> > or are not discussing confidential computing/TDI.  I couldn't figure
> > it out. This needs a separate thread I think.
> I agree confidential computing is out of spec. Parva mentioned TDISP and
> even
> in TDISP spec, it explicitly defined some attacking model, and PF is an
> example.
> 
> It is out of spec anyway.

OK so we are ignoring TDISP applications for now? Everyone agrees on
that?

-- 
MST


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state





On 9/20/2023 7:52 PM, Michael S. Tsirkin wrote:

On Wed, Sep 20, 2023 at 07:28:39PM +0800, Zhu, Lingshan wrote:


On 9/20/2023 6:55 PM, Parav Pandit wrote:

From: Michael S. Tsirkin 
Sent: Wednesday, September 20, 2023 4:06 PM
I freely admit the finer points of this extended flamewar have been lost on me,
and I wager I'm not the only one. I thought you wanted to migrate the device
just by accessing the device itself (e.g. the VF) without accessing other 
devices
(e.g. the PF), while Parav wants it in a separate device so the whole of the
device itself can passed through to guest. Isn't this, fundamentally, the issue?

Right. An admin device doing the work of device migration. Today it is the 
owner PF.
In future it can be other admin device who is deleted this task of migration, 
who can be group owner.
All the admin commands that we plumb here just works great in that CC/TDI 
future, because only thing changes is the admin device issuing this command.


the bar is only a proxy, doesn't fix anything. and even larger side
channel attacking surface: vf-->pf-->vf

In this model there's no pf. BAR belongs to vf itself and you submit commands
for the VF through its BAR.
Just separate from the pci config space.

The whole attacking surface discussion is also puzzling.  We either are or are
not discussing confidential computing/TDI.  I couldn't figure it out. This 
needs a
separate thread I think.

True. Many of Lingshan thoughts/comments gets mixed I feel.
Because he proposes trap+emulation/mediation-based solution by hypervisor and 
none of that is secure anyway in CC/TDI concept.
He keeps attacking AQ as some side channel attack, while somehow trap+emulation 
also done by hypervisor is secure, which obviously does not make sense in 
CC/TDI concept.
Both scores equal where hypervisor trust is of concern.

Please answer directly:

And here you go discussing this in the same thread. I feel you guys are
wasting bytes copying the list with this most people lost track
if not interest.
I agree, although I have to reply because Parav said I am "attacking" AQ 
which is

not a good wording.

And I need to show its not attacking, this is discussions on
LM topics, there may be some debating, and I surely need to provide proof.



What if a malicious SW suspend the guest when it is running through admin vq
live migration facility

I doubt suspend is a problem - looks like a denial of service to me
and that is not considered part of the threat model at least going by
the documents confidential computing guys are posting on lkml.
Yes this is a denial of service and it can be a problem if the service 
is a critical service

like a remote attestation server.

So suspending the VM by admin vq LM commands can attack the system.




What if a malicious SW dump guest memory by tracking guest dirty pages by
admin vq live migration faclity

All this does is tell you which pages did device access though.
It looks like on many architectures this information is readily
available anyway due to host page tables being under the hypervisor
control, since this is how it's migrated. Problem? How is memory
migrated otherwise?

It tracks dirty pages, may record them in a bitmap.

Without CoCo, procdump or qemu dump-guest-memory can dump the guest 
memory pages,

but it does not know which part of memory is guest secrets.

For example, if a malicious SW wants to sniff the guest networking, the 
bitmap

of the dirty pages can help to locate the network DMA pages. This also apply
to disk IOs

So this enlarges the attacking surface.

Current live migration solution does not use any PFs tracking dirty pages,
so no such side channel.



And admin command approach [1] has clear direction for CC to delete those admin 
commands to a dedicated trusted entity instead of hypervisor.

I try to explain these few times, but..

Anyways, if AQ has some comments better to reply in its thread at [1].

[1] 
https://lore.kernel.org/virtio-comment/20230909142911.524407-7-pa...@nvidia.com/T/#md9fcfa1ba997463de8c7fb8c6d1786b224b0bead

I will post v1 for [1] with more mature device context this week along with 
future provisioning item note.



-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state

On Wed, Sep 20, 2023 at 07:28:39PM +0800, Zhu, Lingshan wrote:
> 
> 
> On 9/20/2023 6:55 PM, Parav Pandit wrote:
> > > From: Michael S. Tsirkin 
> > > Sent: Wednesday, September 20, 2023 4:06 PM
> > > I freely admit the finer points of this extended flamewar have been lost 
> > > on me,
> > > and I wager I'm not the only one. I thought you wanted to migrate the 
> > > device
> > > just by accessing the device itself (e.g. the VF) without accessing other 
> > > devices
> > > (e.g. the PF), while Parav wants it in a separate device so the whole of 
> > > the
> > > device itself can passed through to guest. Isn't this, fundamentally, the 
> > > issue?
> > Right. An admin device doing the work of device migration. Today it is the 
> > owner PF.
> > In future it can be other admin device who is deleted this task of 
> > migration, who can be group owner.
> > All the admin commands that we plumb here just works great in that CC/TDI 
> > future, because only thing changes is the admin device issuing this command.
> > 
> > > > the bar is only a proxy, doesn't fix anything. and even larger side
> > > > channel attacking surface: vf-->pf-->vf
> > > In this model there's no pf. BAR belongs to vf itself and you submit 
> > > commands
> > > for the VF through its BAR.
> > > Just separate from the pci config space.
> > > 
> > > The whole attacking surface discussion is also puzzling.  We either are 
> > > or are
> > > not discussing confidential computing/TDI.  I couldn't figure it out. 
> > > This needs a
> > > separate thread I think.
> > True. Many of Lingshan thoughts/comments gets mixed I feel.
> > Because he proposes trap+emulation/mediation-based solution by hypervisor 
> > and none of that is secure anyway in CC/TDI concept.
> > He keeps attacking AQ as some side channel attack, while somehow 
> > trap+emulation also done by hypervisor is secure, which obviously does not 
> > make sense in CC/TDI concept.
> > Both scores equal where hypervisor trust is of concern.
> Please answer directly:

And here you go discussing this in the same thread. I feel you guys are
wasting bytes copying the list with this most people lost track
if not interest.

> What if a malicious SW suspend the guest when it is running through admin vq
> live migration facility

I doubt suspend is a problem - looks like a denial of service to me
and that is not considered part of the threat model at least going by
the documents confidential computing guys are posting on lkml.


> What if a malicious SW dump guest memory by tracking guest dirty pages by
> admin vq live migration faclity

All this does is tell you which pages did device access though.
It looks like on many architectures this information is readily
available anyway due to host page tables being under the hypervisor
control, since this is how it's migrated. Problem? How is memory
migrated otherwise?

> > 
> > And admin command approach [1] has clear direction for CC to delete those 
> > admin commands to a dedicated trusted entity instead of hypervisor.
> > 
> > I try to explain these few times, but..
> > 
> > Anyways, if AQ has some comments better to reply in its thread at [1].
> > 
> > [1] 
> > https://lore.kernel.org/virtio-comment/20230909142911.524407-7-pa...@nvidia.com/T/#md9fcfa1ba997463de8c7fb8c6d1786b224b0bead
> > 
> > I will post v1 for [1] with more mature device context this week along with 
> > future provisioning item note.


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state





On 9/20/2023 6:55 PM, Parav Pandit wrote:

From: Michael S. Tsirkin 
Sent: Wednesday, September 20, 2023 4:06 PM
I freely admit the finer points of this extended flamewar have been lost on me,
and I wager I'm not the only one. I thought you wanted to migrate the device
just by accessing the device itself (e.g. the VF) without accessing other 
devices
(e.g. the PF), while Parav wants it in a separate device so the whole of the
device itself can passed through to guest. Isn't this, fundamentally, the issue?

Right. An admin device doing the work of device migration. Today it is the 
owner PF.
In future it can be other admin device who is deleted this task of migration, 
who can be group owner.
All the admin commands that we plumb here just works great in that CC/TDI 
future, because only thing changes is the admin device issuing this command.


the bar is only a proxy, doesn't fix anything. and even larger side
channel attacking surface: vf-->pf-->vf

In this model there's no pf. BAR belongs to vf itself and you submit commands
for the VF through its BAR.
Just separate from the pci config space.

The whole attacking surface discussion is also puzzling.  We either are or are
not discussing confidential computing/TDI.  I couldn't figure it out. This 
needs a
separate thread I think.

True. Many of Lingshan thoughts/comments gets mixed I feel.
Because he proposes trap+emulation/mediation-based solution by hypervisor and 
none of that is secure anyway in CC/TDI concept.
He keeps attacking AQ as some side channel attack, while somehow trap+emulation 
also done by hypervisor is secure, which obviously does not make sense in 
CC/TDI concept.
Both scores equal where hypervisor trust is of concern.

Please answer directly:

What if a malicious SW suspend the guest when it is running through 
admin vq live migration facility


What if a malicious SW dump guest memory by tracking guest dirty pages 
by admin vq live migration faclity


And admin command approach [1] has clear direction for CC to delete those admin 
commands to a dedicated trusted entity instead of hypervisor.

I try to explain these few times, but..

Anyways, if AQ has some comments better to reply in its thread at [1].

[1] 
https://lore.kernel.org/virtio-comment/20230909142911.524407-7-pa...@nvidia.com/T/#md9fcfa1ba997463de8c7fb8c6d1786b224b0bead

I will post v1 for [1] with more mature device context this week along with 
future provisioning item note.



-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state

On 9/20/2023 7:15 PM, Parav Pandit wrote:

Random words like malicious SW to describe an attack do not make sense.

this is not random wording, "malicious" used a lot in the papers,
you can search in google scholar

Refer the patches and series and usage model to describe the sw attack 
if any.

I disagree and I will not repeat all the points anymore.

don't only say you disagree, please provide why it is not an attacking 
surface.

The problem still exist:

What if a malicious SW dump guest memory through admin vq LM facility?

I think this is end of this discussion.

If you have comments in [1], please reply in [1].

Series [1] clearly describes the usage model at least for one widely 
used OS = Linux.

[1] 
https://lore.kernel.org/virtio-comment/20230909142911.524407-7-pa...@nvidia.com/T/#md9fcfa1ba997463de8c7fb8c6d1786b224b0bead

*From:* Zhu, Lingshan 
*Sent:* Wednesday, September 20, 2023 4:41 PM
*To:* Parav Pandit ; Michael S. Tsirkin 
*Cc:* virtio-dev@lists.oasis-open.org; Jason Wang 
*Subject:* Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND 
bit and vq state

On 9/20/2023 5:52 PM, Parav Pandit wrote:

Hi Lingshan,

Last two email replies in non-next format are getting hard to follow.

Can you please revert back to have text-based emails?

When one wants to use PF for the live migration in trusted
hypervisor, PF is in the trust zone.

even without live migration, it can be an attacking surface while 
guest running.
As repeated for many times, it can be used by malicious SW to dump 
guest memory

In future when hypervisor is not trusted, the task of LM will be
delegated to other infrastructure TVM.

Ravi at Intel already explained this a year ago using migration TD.

This fits very well without bifurcating the member device which is
extremely hard.

TD, TDX or TDX-IO are more complex topics, and we should
focus on our live migration solution, not CC.

My point is: using bar cap as a proxy for admin vq based LM is still 
problematic.

Maybe we can close this.

Parav

*From:* Zhu, Lingshan 

*Sent:* Wednesday, September 20, 2023 3:15 PM
*To:* Parav Pandit  ;
Michael S. Tsirkin  
*Cc:* virtio-dev@lists.oasis-open.org; Jason Wang

*Subject:* Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce
SUSPEND bit and vq state

On 9/20/2023 4:34 PM, Parav Pandit wrote:

@font-face {font-family:"Cambria Math"; panose-1:2 4 5 3 5 4 6
3 2 4;}@font-face {font-family:Calibri; panose-1:2 15 5 2 2 2
4 3 2 4;}@font-face {font-family:Consolas; panose-1:2 11 6 9 2
2 4 3 2 4;}p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in; font-size:11.0pt;
font-family:"Calibri",sans-serif;}a:link, span.MsoHyperlink
{mso-style-priority:99; color:blue;
text-decoration:underline;}pre {mso-style-priority:99;
mso-style-link:"HTML Preformatted Char"; margin:0in;
margin-bottom:.0001pt; font-size:10.0pt; font-family:"Courier
New";}span.HTMLPreformattedChar {mso-style-name:"HTML
Preformatted Char"; mso-style-priority:99;
mso-style-link:"HTML Preformatted";
font-family:Consolas;}span.fontstyle0
{mso-style-name:fontstyle0;}span.EmailStyle21
{mso-style-type:personal-reply;
font-family:"Calibri",sans-serif;
color:windowtext;}.MsoChpDefault {mso-style-type:export-only;
font-size:10.0pt; mso-ligatures:none;}div.WordSection1
{page:WordSection1;}

> There can be malicious SW on the host, and the host may be
hacked and compromised.
> For example:
> 1) SUSPEND the a running guest by admin vq
> 2) dumping guest memory through admin vq dirty page tracking.

No. hypervisor is trusted entity who is hosting the VM.

The PF may not owned by the hypervisor and the host can be hacked
and computerized.

The device migration is initiated by the hypervisor.

I am omitting the TDISP question for now as talked before.

TDISP spec will evolve for hypercalls when we get there.

Confidential computing is out of the spec, as we discussed and agreed.

This is to demonstrate why even using a bar cap as proxy for admin
vq LM is still problematic.
TDISP gives examples of the attacking models, and admin vq based LM
conforms to the models.

*From:*virtio-dev@lists.oasis-open.org

 *On Behalf Of *Zhu,
Lingshan
*Sent:* Wednesday, September 20, 2023 12:01 PM
*To:* Parav Pandit 
; Michael S. Tsirkin 

*Cc:* virtio-dev@lists.oasis-open.org; Jason Wang

*Subject:* Re: [virtio-dev] Re: [PATCH 0/5] virtio:

Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state





On 9/20/2023 6:36 PM, Michael S. Tsirkin wrote:

On Wed, Sep 20, 2023 at 02:06:13PM +0800, Zhu, Lingshan wrote:


On 9/19/2023 2:49 AM, Michael S. Tsirkin wrote:

On Mon, Sep 18, 2023 at 06:41:55PM +, Parav Pandit wrote:

Please refer to the code for setting FEATURES_OK.

It wont work when one needs to suspend the device.
There is no point of doing such work over registers as fundamental framework is 
over the AQ.

Well not really. It's over admin commands. When these were built the
intent always was that it's possible to use admin commands through
another interface, other than admin queue. Is there a problem
implementing admin commands over a memory BAR? For example, I can see
an "admin command" capability pointing at a BAR where
commands are supplied, and using a new group type referring to
device itself.

I am not sure, if a bar cap would be implemented as a proxy for the admin vq
based live migration.

Not a proxy for a vq in that there's no vq then.

I think if the driver sends admin commands through a VF's bar, then
VF forwards the admin commands to the PF, it acts like a proxy,
or an agent. Anyway it takes admin commands.

So the problems we have discussed still exist.



then the problems of admin vq LM that we have
discussed
still exist.

I freely admit the finer points of this extended flamewar have been lost
on me, and I wager I'm not the only one. I thought you wanted to migrate
the device just by accessing the device itself (e.g. the VF) without
accessing other devices (e.g. the PF), while Parav wants it in a
separate device so the whole of the device itself can passed through to
guest. Isn't this, fundamentally, the issue?

we are implementing basic facilities for live migration.

We have pointed out lots of issues, there are many discussions with
Jason and Parav about the problems in migration by admin vq, for example:
security, QOS and nested.



the bar is only a proxy, doesn't fix anything. and even larger
side channel attacking surface: vf-->pf-->vf

In this model there's no pf. BAR belongs to vf itself
and you submit commands for the VF through its BAR.
Just separate from the pci config space.

If using the bar to process admin commands,
is this solution too heavy compared to my proposal in this series?


The whole attacking surface discussion is also puzzling.  We either are
or are not discussing confidential computing/TDI.  I couldn't figure
it out. This needs a separate thread I think.
I agree confidential computing is out of spec. Parva mentioned TDISP and 
even
in TDISP spec, it explicitly defined some attacking model, and PF is an 
example.


It is out of spec anyway.





-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

RE: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state

Random words like malicious SW to describe an attack do not make sense.

Refer the patches and series and usage model to describe the sw attack if any.
I disagree and I will not repeat all the points anymore.
If you have comments in [1], please reply in [1].

Series [1] clearly describes the usage model at least for one widely used OS = 
Linux.

[1] 
https://lore.kernel.org/virtio-comment/20230909142911.524407-7-pa...@nvidia.com/T/#md9fcfa1ba997463de8c7fb8c6d1786b224b0bead

From: Zhu, Lingshan 
Sent: Wednesday, September 20, 2023 4:41 PM
To: Parav Pandit ; Michael S. Tsirkin 
Cc: virtio-dev@lists.oasis-open.org; Jason Wang 
Subject: Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq 
state

On 9/20/2023 5:52 PM, Parav Pandit wrote:

Hi Lingshan,

Last two email replies in non-next format are getting hard to follow.
Can you please revert back to have text-based emails?

When one wants to use PF for the live migration in trusted hypervisor, PF is in 
the trust zone.
even without live migration, it can be an attacking surface while guest running.
As repeated for many times, it can be used by malicious SW to dump guest memory

In future when hypervisor is not trusted, the task of LM will be delegated to 
other infrastructure TVM.
Ravi at Intel already explained this a year ago using migration TD.
This fits very well without bifurcating the member device which is extremely 
hard.
TD, TDX or TDX-IO are more complex topics, and we should
focus on our live migration solution, not CC.

My point is: using bar cap as a proxy for admin vq based LM is still 
problematic.

Maybe we can close this.

Parav

From: Zhu, Lingshan 
Sent: Wednesday, September 20, 2023 3:15 PM
To: Parav Pandit ; Michael S. 
Tsirkin 
Cc: virtio-dev@lists.oasis-open.org; 
Jason Wang 
Subject: Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq 
state

On 9/20/2023 4:34 PM, Parav Pandit wrote:
@font-face {font-family:"Cambria Math"; panose-1:2 4 5 3 5 4 6 3 2 
4;}@font-face {font-family:Calibri; panose-1:2 15 5 2 2 2 4 3 2 4;}@font-face 
{font-family:Consolas; panose-1:2 11 6 9 2 2 4 3 2 4;}p.MsoNormal, 
li.MsoNormal, div.MsoNormal {margin:0in; font-size:11.0pt; 
font-family:"Calibri",sans-serif;}a:link, span.MsoHyperlink 
{mso-style-priority:99; color:blue; text-decoration:underline;}pre 
{mso-style-priority:99; mso-style-link:"HTML Preformatted Char"; margin:0in; 
margin-bottom:.0001pt; font-size:10.0pt; font-family:"Courier 
New";}span.HTMLPreformattedChar {mso-style-name:"HTML Preformatted Char"; 
mso-style-priority:99; mso-style-link:"HTML Preformatted"; 
font-family:Consolas;}span.fontstyle0 
{mso-style-name:fontstyle0;}span.EmailStyle21 {mso-style-type:personal-reply; 
font-family:"Calibri",sans-serif; color:windowtext;}.MsoChpDefault 
{mso-style-type:export-only; font-size:10.0pt; 
mso-ligatures:none;}div.WordSection1 {page:WordSection1;}
> There can be malicious SW on the host, and the host may be hacked and 
> compromised.
> For example:
> 1) SUSPEND the a running guest by admin vq
> 2) dumping guest memory through admin vq dirty page tracking.

No. hypervisor is trusted entity who is hosting the VM.
The PF may not owned by the hypervisor and the host can be hacked and 
computerized.

The device migration is initiated by the hypervisor.

I am omitting the TDISP question for now as talked before.
TDISP spec will evolve for hypercalls when we get there.
Confidential computing is out of the spec, as we discussed and agreed.

This is to demonstrate why even using a bar cap as proxy for admin vq LM is 
still problematic.
TDISP gives examples of the attacking models, and admin vq based LM
conforms to the models.

From: virtio-dev@lists.oasis-open.org 
 On 
Behalf Of Zhu, Lingshan
Sent: Wednesday, September 20, 2023 12:01 PM
To: Parav Pandit ; Michael S. 
Tsirkin 
Cc: virtio-dev@lists.oasis-open.org; 
Jason Wang 
Subject: Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq 
state

On 9/20/2023 2:08 PM, Parav Pandit wrote:

From: Zhu, Lingshan 

Sent: Wednesday, September 20, 2023 11:36 AM

On 9/19/2023 2:49 AM, Michael S. Tsirkin wrote:

On Mon, Sep 18, 2023 at 06:41:55PM +, Parav Pandit wrote:

Please refer to the code for setting FEATURES_OK.

It wont work when one needs to suspend the device.

There is no point of doing such work over registers as fundamental

framework is over the AQ.

Well not really. It's over admin commands. When these were built the

intent always was that it's possible to use admin commands through

another interface, other than admin queue. Is there a problem

implementing

Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state

On 9/20/2023 5:52 PM, Parav Pandit wrote:

Hi Lingshan,

Last two email replies in non-next format are getting hard to follow.

Can you please revert back to have text-based emails?

When one wants to use PF for the live migration in trusted hypervisor, 
PF is in the trust zone.

even without live migration, it can be an attacking surface while guest 
running.
As repeated for many times, it can be used by malicious SW to dump guest 
memory

In future when hypervisor is not trusted, the task of LM will be 
delegated to other infrastructure TVM.

Ravi at Intel already explained this a year ago using migration TD.

This fits very well without bifurcating the member device which is 
extremely hard.

TD, TDX or TDX-IO are more complex topics, and we should
focus on our live migration solution, not CC.

My point is: using bar cap as a proxy for admin vq based LM is still 
problematic.

Maybe we can close this.

Parav

*From:* Zhu, Lingshan 
*Sent:* Wednesday, September 20, 2023 3:15 PM
*To:* Parav Pandit ; Michael S. Tsirkin 
*Cc:* virtio-dev@lists.oasis-open.org; Jason Wang 
*Subject:* Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND 
bit and vq state

On 9/20/2023 4:34 PM, Parav Pandit wrote:

@font-face {font-family:"Cambria Math"; panose-1:2 4 5 3 5 4 6 3 2
4;}@font-face {font-family:Calibri; panose-1:2 15 5 2 2 2 4 3 2
4;}@font-face {font-family:Consolas; panose-1:2 11 6 9 2 2 4 3 2
4;}p.MsoNormal, li.MsoNormal, div.MsoNormal {margin:0in;
font-size:11.0pt; font-family:"Calibri",sans-serif;}a:link,
span.MsoHyperlink {mso-style-priority:99; color:blue;
text-decoration:underline;}pre {mso-style-priority:99;
mso-style-link:"HTML Preformatted Char"; margin:0in;
margin-bottom:.0001pt; font-size:10.0pt; font-family:"Courier
New";}span.HTMLPreformattedChar {mso-style-name:"HTML Preformatted
Char"; mso-style-priority:99; mso-style-link:"HTML Preformatted";
font-family:Consolas;}span.fontstyle0
{mso-style-name:fontstyle0;}span.EmailStyle21
{mso-style-type:personal-reply; font-family:"Calibri",sans-serif;
color:windowtext;}.MsoChpDefault {mso-style-type:export-only;
font-size:10.0pt; mso-ligatures:none;}div.WordSection1
{page:WordSection1;}

> There can be malicious SW on the host, and the host may be hacked
and compromised.
> For example:
> 1) SUSPEND the a running guest by admin vq
> 2) dumping guest memory through admin vq dirty page tracking.

No. hypervisor is trusted entity who is hosting the VM.

The PF may not owned by the hypervisor and the host can be hacked and 
computerized.

The device migration is initiated by the hypervisor.

I am omitting the TDISP question for now as talked before.

TDISP spec will evolve for hypercalls when we get there.

Confidential computing is out of the spec, as we discussed and agreed.

This is to demonstrate why even using a bar cap as proxy for admin vq 
LM is still problematic.

TDISP gives examples of the attacking models, and admin vq based LM
conforms to the models.

*From:*virtio-dev@lists.oasis-open.org

 *On Behalf Of *Zhu, Lingshan
*Sent:* Wednesday, September 20, 2023 12:01 PM
*To:* Parav Pandit  ;
Michael S. Tsirkin  
*Cc:* virtio-dev@lists.oasis-open.org; Jason Wang

*Subject:* Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce
SUSPEND bit and vq state

On 9/20/2023 2:08 PM, Parav Pandit wrote:

From: Zhu, Lingshan  

Sent: Wednesday, September 20, 2023 11:36 AM

On 9/19/2023 2:49 AM, Michael S. Tsirkin wrote:

On Mon, Sep 18, 2023 at 06:41:55PM +, Parav Pandit wrote:

Please refer to the code for setting FEATURES_OK.

It wont work when one needs to suspend the device.

There is no point of doing such work over registers as 
fundamental

framework is over the AQ.

Well not really. It's over admin commands. When these were 
built the

intent always was that it's possible to use admin commands 
through

another interface, other than admin queue. Is there a problem

implementing admin commands over a memory BAR? For example, I 
can see

an "admin command" capability pointing at a BAR where commands 
are

supplied, and using a new group type referring to device itself.

I am not sure, if a bar cap would be implemented as a proxy for the 
admin vq

based live migration. then the problems of admin vq LM that we have 
discussed

still exist. the bar is only a proxy, doesn't fix anything. and 
even larger side

channel attacking surface: vf-->pf-->vf

RE: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state



> From: Michael S. Tsirkin 
> Sent: Wednesday, September 20, 2023 4:06 PM

> 
> I freely admit the finer points of this extended flamewar have been lost on 
> me,
> and I wager I'm not the only one. I thought you wanted to migrate the device
> just by accessing the device itself (e.g. the VF) without accessing other 
> devices
> (e.g. the PF), while Parav wants it in a separate device so the whole of the
> device itself can passed through to guest. Isn't this, fundamentally, the 
> issue?
Right. An admin device doing the work of device migration. Today it is the 
owner PF.
In future it can be other admin device who is deleted this task of migration, 
who can be group owner.
All the admin commands that we plumb here just works great in that CC/TDI 
future, because only thing changes is the admin device issuing this command.

> 
> > the bar is only a proxy, doesn't fix anything. and even larger side
> > channel attacking surface: vf-->pf-->vf
> 
> In this model there's no pf. BAR belongs to vf itself and you submit commands
> for the VF through its BAR.
> Just separate from the pci config space.
> 
> The whole attacking surface discussion is also puzzling.  We either are or are
> not discussing confidential computing/TDI.  I couldn't figure it out. This 
> needs a
> separate thread I think.

True. Many of Lingshan thoughts/comments gets mixed I feel.
Because he proposes trap+emulation/mediation-based solution by hypervisor and 
none of that is secure anyway in CC/TDI concept.
He keeps attacking AQ as some side channel attack, while somehow trap+emulation 
also done by hypervisor is secure, which obviously does not make sense in 
CC/TDI concept.
Both scores equal where hypervisor trust is of concern.

And admin command approach [1] has clear direction for CC to delete those admin 
commands to a dedicated trusted entity instead of hypervisor.

I try to explain these few times, but..

Anyways, if AQ has some comments better to reply in its thread at [1].

[1] 
https://lore.kernel.org/virtio-comment/20230909142911.524407-7-pa...@nvidia.com/T/#md9fcfa1ba997463de8c7fb8c6d1786b224b0bead

I will post v1 for [1] with more mature device context this week along with 
future provisioning item note.

-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state

On Wed, Sep 20, 2023 at 02:06:13PM +0800, Zhu, Lingshan wrote:
> 
> 
> On 9/19/2023 2:49 AM, Michael S. Tsirkin wrote:
> > On Mon, Sep 18, 2023 at 06:41:55PM +, Parav Pandit wrote:
> > > > Please refer to the code for setting FEATURES_OK.
> > > It wont work when one needs to suspend the device.
> > > There is no point of doing such work over registers as fundamental 
> > > framework is over the AQ.
> > Well not really. It's over admin commands. When these were built the
> > intent always was that it's possible to use admin commands through
> > another interface, other than admin queue. Is there a problem
> > implementing admin commands over a memory BAR? For example, I can see
> > an "admin command" capability pointing at a BAR where
> > commands are supplied, and using a new group type referring to
> > device itself.
> I am not sure, if a bar cap would be implemented as a proxy for the admin vq
> based live migration.

Not a proxy for a vq in that there's no vq then.

> then the problems of admin vq LM that we have
> discussed
> still exist.

I freely admit the finer points of this extended flamewar have been lost
on me, and I wager I'm not the only one. I thought you wanted to migrate
the device just by accessing the device itself (e.g. the VF) without
accessing other devices (e.g. the PF), while Parav wants it in a
separate device so the whole of the device itself can passed through to
guest. Isn't this, fundamentally, the issue?

> the bar is only a proxy, doesn't fix anything. and even larger
> side channel attacking surface: vf-->pf-->vf

In this model there's no pf. BAR belongs to vf itself
and you submit commands for the VF through its BAR.
Just separate from the pci config space.

The whole attacking surface discussion is also puzzling.  We either are
or are not discussing confidential computing/TDI.  I couldn't figure
it out. This needs a separate thread I think.

-- 
MST

-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

RE: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state

Hi Lingshan,

Last two email replies in non-next format are getting hard to follow.
Can you please revert back to have text-based emails?

When one wants to use PF for the live migration in trusted hypervisor, PF is in 
the trust zone.

In future when hypervisor is not trusted, the task of LM will be delegated to 
other infrastructure TVM.
Ravi at Intel already explained this a year ago using migration TD.
This fits very well without bifurcating the member device which is extremely 
hard.

Parav

From: Zhu, Lingshan 
Sent: Wednesday, September 20, 2023 3:15 PM
To: Parav Pandit ; Michael S. Tsirkin 
Cc: virtio-dev@lists.oasis-open.org; Jason Wang 
Subject: Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq 
state

On 9/20/2023 4:34 PM, Parav Pandit wrote:
@font-face {font-family:"Cambria Math"; panose-1:2 4 5 3 5 4 6 3 2 
4;}@font-face {font-family:Calibri; panose-1:2 15 5 2 2 2 4 3 2 4;}@font-face 
{font-family:Consolas; panose-1:2 11 6 9 2 2 4 3 2 4;}p.MsoNormal, 
li.MsoNormal, div.MsoNormal {margin:0in; font-size:11.0pt; 
font-family:"Calibri",sans-serif;}a:link, span.MsoHyperlink 
{mso-style-priority:99; color:blue; text-decoration:underline;}pre 
{mso-style-priority:99; mso-style-link:"HTML Preformatted Char"; margin:0in; 
margin-bottom:.0001pt; font-size:10.0pt; font-family:"Courier 
New";}span.HTMLPreformattedChar {mso-style-name:"HTML Preformatted Char"; 
mso-style-priority:99; mso-style-link:"HTML Preformatted"; 
font-family:Consolas;}span.fontstyle0 
{mso-style-name:fontstyle0;}span.EmailStyle21 {mso-style-type:personal-reply; 
font-family:"Calibri",sans-serif; color:windowtext;}.MsoChpDefault 
{mso-style-type:export-only; font-size:10.0pt; 
mso-ligatures:none;}div.WordSection1 {page:WordSection1;}
> There can be malicious SW on the host, and the host may be hacked and 
> compromised.
> For example:
> 1) SUSPEND the a running guest by admin vq
> 2) dumping guest memory through admin vq dirty page tracking.

No. hypervisor is trusted entity who is hosting the VM.
The PF may not owned by the hypervisor and the host can be hacked and 
computerized.

The device migration is initiated by the hypervisor.

I am omitting the TDISP question for now as talked before.
TDISP spec will evolve for hypercalls when we get there.
Confidential computing is out of the spec, as we discussed and agreed.

This is to demonstrate why even using a bar cap as proxy for admin vq LM is 
still problematic.
TDISP gives examples of the attacking models, and admin vq based LM
conforms to the models.

From: virtio-dev@lists.oasis-open.org 
 On 
Behalf Of Zhu, Lingshan
Sent: Wednesday, September 20, 2023 12:01 PM
To: Parav Pandit ; Michael S. 
Tsirkin 
Cc: virtio-dev@lists.oasis-open.org; 
Jason Wang 
Subject: Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq 
state

On 9/20/2023 2:08 PM, Parav Pandit wrote:

From: Zhu, Lingshan 

Sent: Wednesday, September 20, 2023 11:36 AM

On 9/19/2023 2:49 AM, Michael S. Tsirkin wrote:

On Mon, Sep 18, 2023 at 06:41:55PM +, Parav Pandit wrote:

Please refer to the code for setting FEATURES_OK.

It wont work when one needs to suspend the device.

There is no point of doing such work over registers as fundamental

framework is over the AQ.

Well not really. It's over admin commands. When these were built the

intent always was that it's possible to use admin commands through

another interface, other than admin queue. Is there a problem

implementing admin commands over a memory BAR? For example, I can see

an "admin command" capability pointing at a BAR where commands are

supplied, and using a new group type referring to device itself.

I am not sure, if a bar cap would be implemented as a proxy for the admin vq

based live migration. then the problems of admin vq LM that we have discussed

still exist. the bar is only a proxy, doesn't fix anything. and even larger side

channel attacking surface: vf-->pf-->vf

AQ LM using PF has no side channel attack as hypervisor and owner device is 
trusted entity as already discussed.
I believe we have discussed this for many times, and I even provide you some 
examples.

Let me repeat for the last time.

There can be malicious SW on the host, and the host may be hacked and 
compromised.
For example:
1) SUSPEND the a running guest by admin vq
2) dumping guest memory through admin vq dirty page tracking.

These above can happen right?

You made TDISP as an example, but have you really read the TDISP spec?
In the spec:

Device Security Architecture - Administrative interfaces (e.g., a PF) may be
used to influence the security properties of the TDI used by the TVM.

TEE-I/O requires the device to organize its hardware/software interfaces such 
that the PF cannot
be used to affect

RE: [virtio-dev] [PATCH] virtio-tee: Reserve device ID 46 for TEE device

Hi,

> From: virtio-dev@lists.oasis-open.org  On
> Behalf Of jeshwank
> Sent: Wednesday, September 6, 2023 3:13 PM
> 
> The virtio-tee device allows guest OS to access the TEE present in host 
> system,
> to perform secure operations.
> 
> This patch is to reserve a device ID 46 for virtio-tee device.
> 

For those who breath TEE its may be obvious thing. 
For me, a humble request, 
can you please expand this patch to contain short description of basic 
functionality of the TEE device to encompass under this device-id?

> Signed-off-by: jeshwank 
> ---
>  content.tex | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/content.tex b/content.tex
> index 0a62dce..644aa4a 100644
> --- a/content.tex
> +++ b/content.tex
> @@ -739,6 +739,8 @@ \chapter{Device Types}\label{sec:Device Types}  \hline
>  45 &   SPI master \\
>  \hline
> +46 &   TEE device \\
> +\hline
>  \end{tabular}
> 
>  Some of the devices above are unspecified by this document,
> --
> 2.25.1
> 
> 
> -
> To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state

On 9/20/2023 4:34 PM, Parav Pandit wrote:

> There can be malicious SW on the host, and the host may be hacked 
and compromised.

> For example:
> 1) SUSPEND the a running guest by admin vq
> 2) dumping guest memory through admin vq dirty page tracking.

No. hypervisor is trusted entity who is hosting the VM.

The PF may not owned by the hypervisor and the host can be hacked and 
computerized.

The device migration is initiated by the hypervisor.

I am omitting the TDISP question for now as talked before.

TDISP spec will evolve for hypercalls when we get there.

Confidential computing is out of the spec, as we discussed and agreed.

This is to demonstrate why even using a bar cap as proxy for admin vq LM 
is still problematic.

TDISP gives examples of the attacking models, and admin vq based LM
conforms to the models.

*From:* virtio-dev@lists.oasis-open.org 
 *On Behalf Of *Zhu, Lingshan

*Sent:* Wednesday, September 20, 2023 12:01 PM
*To:* Parav Pandit ; Michael S. Tsirkin 
*Cc:* virtio-dev@lists.oasis-open.org; Jason Wang 
*Subject:* Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND 
bit and vq state

On 9/20/2023 2:08 PM, Parav Pandit wrote:

From: Zhu, Lingshan  

Sent: Wednesday, September 20, 2023 11:36 AM

On 9/19/2023 2:49 AM, Michael S. Tsirkin wrote:

On Mon, Sep 18, 2023 at 06:41:55PM +, Parav Pandit wrote:

Please refer to the code for setting FEATURES_OK.

It wont work when one needs to suspend the device.

There is no point of doing such work over registers as 
fundamental

framework is over the AQ.

Well not really. It's over admin commands. When these were built the

intent always was that it's possible to use admin commands through

another interface, other than admin queue. Is there a problem

implementing admin commands over a memory BAR? For example, I can 
see

an "admin command" capability pointing at a BAR where commands are

supplied, and using a new group type referring to device itself.

I am not sure, if a bar cap would be implemented as a proxy for the 
admin vq

based live migration. then the problems of admin vq LM that we have 
discussed

still exist. the bar is only a proxy, doesn't fix anything. and even 
larger side

channel attacking surface: vf-->pf-->vf

AQ LM using PF has no side channel attack as hypervisor and owner device is 
trusted entity as already discussed.

I believe we have discussed this for many times, and I even provide 
you some examples.

Let me repeat for the last time.

There can be malicious SW on the host, and the host may be hacked and 
compromised.

For example:
1) SUSPEND the a running guest by admin vq
2) dumping guest memory through admin vq dirty page tracking.

These above can happen right?

You made TDISP as an example, but have you really read the TDISP spec?
In the spec:

Device Security Architecture - Administrative interfaces (e.g., a PF) 
may be

used to influence the security properties of the TDI used by the TVM.

TEE-I/O requires the device to organize its hardware/software 
interfaces such that the PF cannot

be used to affect the security of a TDI when it is in use by a TVM

Clear?

RE: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state

> There can be malicious SW on the host, and the host may be hacked and 
> compromised.
> For example:
> 1) SUSPEND the a running guest by admin vq
> 2) dumping guest memory through admin vq dirty page tracking.

No. hypervisor is trusted entity who is hosting the VM.
The device migration is initiated by the hypervisor.

I am omitting the TDISP question for now as talked before.
TDISP spec will evolve for hypercalls when we get there.

From: virtio-dev@lists.oasis-open.org  On 
Behalf Of Zhu, Lingshan
Sent: Wednesday, September 20, 2023 12:01 PM
To: Parav Pandit ; Michael S. Tsirkin 
Cc: virtio-dev@lists.oasis-open.org; Jason Wang 
Subject: Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq 
state

On 9/20/2023 2:08 PM, Parav Pandit wrote:

From: Zhu, Lingshan 

Sent: Wednesday, September 20, 2023 11:36 AM

On 9/19/2023 2:49 AM, Michael S. Tsirkin wrote:

On Mon, Sep 18, 2023 at 06:41:55PM +, Parav Pandit wrote:

Please refer to the code for setting FEATURES_OK.

It wont work when one needs to suspend the device.

There is no point of doing such work over registers as fundamental

framework is over the AQ.

Well not really. It's over admin commands. When these were built the

intent always was that it's possible to use admin commands through

another interface, other than admin queue. Is there a problem

implementing admin commands over a memory BAR? For example, I can see

an "admin command" capability pointing at a BAR where commands are

supplied, and using a new group type referring to device itself.

I am not sure, if a bar cap would be implemented as a proxy for the admin vq

based live migration. then the problems of admin vq LM that we have discussed

still exist. the bar is only a proxy, doesn't fix anything. and even larger side

channel attacking surface: vf-->pf-->vf

AQ LM using PF has no side channel attack as hypervisor and owner device is 
trusted entity as already discussed.
I believe we have discussed this for many times, and I even provide you some 
examples.

Let me repeat for the last time.

There can be malicious SW on the host, and the host may be hacked and 
compromised.
For example:
1) SUSPEND the a running guest by admin vq
2) dumping guest memory through admin vq dirty page tracking.

These above can happen right?

You made TDISP as an example, but have you really read the TDISP spec?
In the spec:

Device Security Architecture - Administrative interfaces (e.g., a PF) may be
used to influence the security properties of the TDI used by the TVM.

TEE-I/O requires the device to organize its hardware/software interfaces such 
that the PF cannot
be used to affect the security of a TDI when it is in use by a TVM

Clear?

[virtio-dev] Re: [virtio-comment] RE: [virtio-dev] Re: [virtio-comment] Re: [VIRTIO PCI PATCH v5 1/1] transport-pci: Add freeze_mode to virtio_pci_common_cfg



On 2023/9/20 15:56, Parav Pandit wrote:
> Hi Jiquian,
> 
>> From: Chen, Jiqian 
>> Sent: Wednesday, September 20, 2023 1:24 PM
>>
>> Hi Lingshan,
>> Please reply to your own email thread, below are not related to my patches.
>> Thanks a lot.
> 
> They are related to your patch.
>  Both the patches have overlapping functionalities.
But Lingshan's patch is not meet my need. It clears the SUSPEND bit during 
reset.

> 
> You probably missed my previous response explaining the same at [1].
> 
> [1] https://lists.oasis-open.org/archives/virtio-comment/202309/msg00255.html
Yes, I didn't receive this response. 
After reading your responses, I think your concerns are below:
> The point is when driver tells to freeze, it is freeze command and not reset.
> So resume() should not invoke device_reset() when FREEZE+RESUME supported.
The modifications in my Qemu and kernel patches, I just set freeze_mode to be 
S3, and then when guest resume I can change the reset behavior according the S3 
mode, see below callstack:
pci_pm_resume
virtio_pci_restore
virtio_device_restore
virtio_reset_device(set 0 the device status and then 
call virtio_reset to reset device)
drv->restore(virtio_gpu_restore)
So, reset will be done during the restore process(resume invoke the reset), and 
I want to affect the reset behavior during the restore process.

> 

-- 
Best regards,
Jiqian Chen.

RE: [virtio-dev] Re: [virtio-comment] Re: [VIRTIO PCI PATCH v5 1/1] transport-pci: Add freeze_mode to virtio_pci_common_cfg

Hi Jiquian,

> From: Chen, Jiqian 
> Sent: Wednesday, September 20, 2023 1:24 PM
> 
> Hi Lingshan,
> Please reply to your own email thread, below are not related to my patches.
> Thanks a lot.

They are related to your patch.
 Both the patches have overlapping functionalities.

You probably missed my previous response explaining the same at [1].

[1] https://lists.oasis-open.org/archives/virtio-comment/202309/msg00255.html

Re: [virtio-dev] Re: [virtio-comment] Re: [VIRTIO PCI PATCH v5 1/1] transport-pci: Add freeze_mode to virtio_pci_common_cfg





On 9/20/2023 3:51 PM, Parav Pandit wrote:

From: Zhu, Lingshan 
Sent: Wednesday, September 20, 2023 1:17 PM

This is not live or device migration. This is restoring the device context

initiated by the driver owning the device.
restore the device context should be done by the hypervisor before setting
DRIVER_OK and waking up the guest, not a concern here, out of spec

The framework is generic for the PCI devices hence, there may not be any 
hypervisor at all. Hence restore operation to trigger on DRIVER_OK setting, 
when previously suspended.

Since we already agree in previous email that re-read until device sets the 
DRIVER_OK, its fine to me. It covers the above restore condition.

OK


Thanks.




-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

Re: [virtio-dev] Re: [virtio-comment] Re: [VIRTIO PCI PATCH v5 1/1] transport-pci: Add freeze_mode to virtio_pci_common_cfg

Hi Lingshan,
Please reply to your own email thread, below are not related to my patches. 
Thanks a lot.

On 2023/9/20 15:47, Zhu, Lingshan wrote:
> 
> 
> On 9/20/2023 3:35 PM, Parav Pandit wrote:
>>> From: Zhu, Lingshan 
>>> Sent: Wednesday, September 20, 2023 1:00 PM
>>>
>>> On 9/20/2023 3:24 PM, Chen, Jiqian wrote:
 Hi Lingshan,
 It seems you reply to the wrong email thread. They are not related to my
>>> patch.
>>> These reply to Parva's comments.
>>> @Parva, if you want to discuss more about live migration, please reply in my
>>> thread, lets don't flood here.
>> You made the point that "this is not live migration".
>> I am not discussing live migration in this thread.
>>
>> I replied for the point that device restoration from suspend for physical 
>> and hypevisor based device is not a 40nsec worth of work to restore by just 
>> doing a register write.
>> This is not live or device migration. This is restoring the device context 
>> initiated by the driver owning the device.
> restore the device context should be done by the hypervisor before setting 
> DRIVER_OK and waking up the guest, not a concern here, out of spec
> 

-- 
Best regards,
Jiqian Chen.

RE: [virtio-dev] Re: [virtio-comment] Re: [VIRTIO PCI PATCH v5 1/1] transport-pci: Add freeze_mode to virtio_pci_common_cfg


> From: Zhu, Lingshan 
> Sent: Wednesday, September 20, 2023 1:17 PM

> > This is not live or device migration. This is restoring the device context
> initiated by the driver owning the device.
> restore the device context should be done by the hypervisor before setting
> DRIVER_OK and waking up the guest, not a concern here, out of spec

The framework is generic for the PCI devices hence, there may not be any 
hypervisor at all. Hence restore operation to trigger on DRIVER_OK setting, 
when previously suspended.

Since we already agree in previous email that re-read until device sets the 
DRIVER_OK, its fine to me. It covers the above restore condition.

Thanks.

RE: [virtio-dev] Re: [virtio-comment] Re: [VIRTIO PCI PATCH v5 1/1] transport-pci: Add freeze_mode to virtio_pci_common_cfg


> From: Zhu, Lingshan 
> Sent: Wednesday, September 20, 2023 1:16 PM

[..]
> > In my view, setting the DRIVER_OK is the signal regardless of hypervisor or
> physical device.
> > Hence the re-read is must.
> Yes, as I said below, should verify by re-read.
> >
Thanks.

Re: [virtio-dev] Re: [virtio-comment] Re: [VIRTIO PCI PATCH v5 1/1] transport-pci: Add freeze_mode to virtio_pci_common_cfg





On 9/20/2023 3:35 PM, Parav Pandit wrote:

From: Zhu, Lingshan 
Sent: Wednesday, September 20, 2023 1:00 PM

On 9/20/2023 3:24 PM, Chen, Jiqian wrote:

Hi Lingshan,
It seems you reply to the wrong email thread. They are not related to my

patch.
These reply to Parva's comments.
@Parva, if you want to discuss more about live migration, please reply in my
thread, lets don't flood here.

You made the point that "this is not live migration".
I am not discussing live migration in this thread.

I replied for the point that device restoration from suspend for physical and 
hypevisor based device is not a 40nsec worth of work to restore by just doing a 
register write.
This is not live or device migration. This is restoring the device context 
initiated by the driver owning the device.
restore the device context should be done by the hypervisor before 
setting DRIVER_OK and waking up the guest, not a concern here, out of spec



-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

Re: [virtio-dev] Re: [virtio-comment] Re: [VIRTIO PCI PATCH v5 1/1] transport-pci: Add freeze_mode to virtio_pci_common_cfg





On 9/20/2023 3:32 PM, Parav Pandit wrote:



From: Zhu, Lingshan 
Sent: Wednesday, September 20, 2023 12:58 PM

On 9/20/2023 3:10 PM, Parav Pandit wrote:

From: Zhu, Lingshan 
Sent: Wednesday, September 20, 2023 12:37 PM

The problem to overcome in [1] is, resume operation needs to be
synchronous

as it involves large part of context to resume back, and hence just
asynchronously setting DRIVER_OK is not enough.

The sw must verify back that device has resumed the operation and
ready to

answer requests.
this is not live migration, all device status and other information
still stay in the device, no need to "resume" context, just resume running.


I am aware that it is not live migration. :)

"Just resuming" involves lot of device setup task. The device implementation

does not know for how long a device is suspended.

So for example, a VM is suspended for 6 hours, hence the device context

could be saved in a slow disk.

Hence, when the resume is done, it needs to setup things again and driver got

to verify before accessing more from the device.
The restore procedures should perform by the hypervisor and done before set
DRIVER_OK and wake up the guest.

Which is the signal to trigger the restore? Which is the trigger in physical 
device when there is no hypervisor?

In my view, setting the DRIVER_OK is the signal regardless of hypervisor or 
physical device.
Hence the re-read is must.

Yes, as I said below, should verify by re-read.



And the hypervisor/driver needs to check the device status by re-reading.

Like resume from a failed LM.

This is slightly different flow than setting the DRIVER_OK for the
first time

device initialization sequence as it does not involve large restoration.

So, to merge two ideas, instead of doing DRIVER_OK to resume, the
driver

should clear the SUSPEND bit and verify that it is out of SUSPEND.

Because driver is still in _OK_ driving the device flipping the SUSPEND bit.

Please read the spec, it says:
The driver MUST NOT clear a device status bit


Yes, this is why either DRIER_OK validation by the driver is needed or Jiqian's

synchronous new register..
so re-read

Yes. re-read until set, Thanks.




-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

[virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] Re: [VIRTIO PCI PATCH v5 1/1] transport-pci: Add freeze_mode to virtio_pci_common_cfg





On 9/20/2023 3:17 PM, Chen, Jiqian wrote:

Hi Lingshan,

On 2023/9/20 14:58, Zhu, Lingshan wrote:


On 9/20/2023 2:33 PM, Chen, Jiqian wrote:

Hi Lingshan,

On 2023/9/20 13:59, Zhu, Lingshan wrote:

On 9/19/2023 8:31 PM, Michael S. Tsirkin wrote:

On Tue, Sep 19, 2023 at 07:42:42PM +0800, Jiqian Chen wrote:

When guest vm does S3, Qemu will reset and clear some things of virtio
devices, but guest can't aware that, so that may cause some problems.
For excample, Qemu calls virtio_reset->virtio_gpu_gl_reset when guest
resume, that function will destroy render resources of virtio-gpu. As
a result, after guest resume, the display can't come back and we only
saw a black screen. Due to guest can't re-create all the resources, so
we need to let Qemu not to destroy them when S3.

For above purpose, we need a mechanism that allows guests and QEMU to
negotiate their reset behavior. So this patch add a new parameter
named freeze_mode to struct virtio_pci_common_cfg. And when guest
suspends, it can write freeze_mode to be FREEZE_S3, and then virtio
devices can change their reset behavior on Qemu side according to
freeze_mode. What's more, freeze_mode can be used for all virtio
devices to affect the behavior of Qemu, not just virtio gpu device.

Hi Jiqian,

Have you seen this series: [PATCH 0/5] virtio: introduce SUSPEND bit and vq 
state
https://lore.kernel.org/all/3f4cbf84-010c-cffa-0b70-33c449b55...@intel.com/T/

We introduced a bit in the device status SUSPEND, when VIRTIO_F_SUSPEND is
negotiated, the driver can set SUSPEND in the device status to suspend the
device.

When SUSPEND, the device should pause its operations and preserve its 
configurations
in its configuration space.

The driver re-write DRIVER_OK to clear SUSPEND, so the device resumes running.

This is originally to serve live migration, but I think it can also meet your 
needs.

I noticed your series, but I am not sure they are also meet my needs.
If driver write 0 to reset device, can the SUSPEND bit be cleared? 
(pci_pm_resume->virtio_pci_restore->virtio_device_restore->virtio_reset_device)

if the driver writes 0, it resets all virtio functionalities. So SUSPEND is 
cleared.

Then your patches are not meet my needs. In my scene, it needs to keep the 
SUSPEND bit util the resume process is complete.
Because in my virtio-gpu scene, when guest resume, it call virtio_reset_device 
to clear all device status bits, and then reset virtio-gpu in Qemu, and then 
destroy render resources, I don't want the resources are destroyed during the 
resume process. So, I add freeze_mode to tell Qemu that guest is doing S3 and 
resources need to be kept.
When a guest set to S3, the hypervisor suspend the guest to RAM, and the 
passthrough-ed device are in low power state.
I am not sure the device can keep its status/context/data, maybe need to 
recover from RAM anyway.


So I suggest not reset the device, there need some code changes in QEMU
When set to S3, SUSPEND the device, then suspend to RAM
When resume from S3, restore from RAM and set DRIVER_OK to resume running.



device reset can also be used to recover the device from fatal errors, so it 
should reset everything in virtio.

If SUSPEND is cleared, then during the reset process in Qemu, I can't judge if 
the reset request is from guest restore process or not, and then I can't change 
the reset behavior.

I think when enter S3, the hypervisor/driver should set SUSPEND to the device. 
And when resume from S3, the hypervisor/driver should
re-write DRIVER_OK to clear SUSPEND, then the device resume running.

Can you send me your patch link on kernel and qemu side? I will take a deep 
look.

There are no patches for qemu/kernel yet, spec first.

Thanks,
Zhu Lingshan

Signed-off-by: Jiqian Chen 
---
    transport-pci.tex | 7 +++
    1 file changed, 7 insertions(+)

diff --git a/transport-pci.tex b/transport-pci.tex
index a5c6719..2543536 100644
--- a/transport-pci.tex
+++ b/transport-pci.tex
@@ -319,6 +319,7 @@ \subsubsection{Common configuration structure 
layout}\label{sec:Virtio Transport
    le64 queue_desc;    /* read-write */
    le64 queue_driver;  /* read-write */
    le64 queue_device;  /* read-write */
+    le16 freeze_mode;   /* read-write */
    le16 queue_notif_config_data;   /* read-only for driver */
    le16 queue_reset;   /* read-write */


we can't add fields in the middle of the structure like this -
offset of queue_notif_config_data and queue_reset changes.

   

@@ -393,6 +394,12 @@ \subsubsection{Common configuration structure 
layout}\label{sec:Virtio Transport
    \item[\field{queue_device}]
    The driver writes the physical address of Device Area here.  See 
section \ref{sec:Basic Facilities of a Virtio Device / Virtqueues}.
    +\item[\field{freeze_mode}]
+    The driver writes this to set the freeze mode of virtio pci.
+    VIRTIO_PCI_FREEZE_MODE_UNFREEZE -

RE: [virtio-dev] Re: [virtio-comment] Re: [VIRTIO PCI PATCH v5 1/1] transport-pci: Add freeze_mode to virtio_pci_common_cfg

> From: Zhu, Lingshan 
> Sent: Wednesday, September 20, 2023 1:00 PM
> 
> On 9/20/2023 3:24 PM, Chen, Jiqian wrote:
> > Hi Lingshan,
> > It seems you reply to the wrong email thread. They are not related to my
> patch.
> These reply to Parva's comments.
> @Parva, if you want to discuss more about live migration, please reply in my
> thread, lets don't flood here.
You made the point that "this is not live migration".
I am not discussing live migration in this thread.

I replied for the point that device restoration from suspend for physical and 
hypevisor based device is not a 40nsec worth of work to restore by just doing a 
register write.
This is not live or device migration. This is restoring the device context 
initiated by the driver owning the device.

RE: [virtio-dev] Re: [virtio-comment] Re: [VIRTIO PCI PATCH v5 1/1] transport-pci: Add freeze_mode to virtio_pci_common_cfg



> From: Zhu, Lingshan 
> Sent: Wednesday, September 20, 2023 12:58 PM
> 
> On 9/20/2023 3:10 PM, Parav Pandit wrote:
> >> From: Zhu, Lingshan 
> >> Sent: Wednesday, September 20, 2023 12:37 PM
> >>> The problem to overcome in [1] is, resume operation needs to be
> >>> synchronous
> >> as it involves large part of context to resume back, and hence just
> >> asynchronously setting DRIVER_OK is not enough.
> >>> The sw must verify back that device has resumed the operation and
> >>> ready to
> >> answer requests.
> >> this is not live migration, all device status and other information
> >> still stay in the device, no need to "resume" context, just resume running.
> >>
> > I am aware that it is not live migration. :)
> >
> > "Just resuming" involves lot of device setup task. The device implementation
> does not know for how long a device is suspended.
> > So for example, a VM is suspended for 6 hours, hence the device context
> could be saved in a slow disk.
> > Hence, when the resume is done, it needs to setup things again and driver 
> > got
> to verify before accessing more from the device.
> The restore procedures should perform by the hypervisor and done before set
> DRIVER_OK and wake up the guest.
Which is the signal to trigger the restore? Which is the trigger in physical 
device when there is no hypervisor?

In my view, setting the DRIVER_OK is the signal regardless of hypervisor or 
physical device.
Hence the re-read is must.

> And the hypervisor/driver needs to check the device status by re-reading.
> >
> >> Like resume from a failed LM.
> >>> This is slightly different flow than setting the DRIVER_OK for the
> >>> first time
> >> device initialization sequence as it does not involve large restoration.
> >>> So, to merge two ideas, instead of doing DRIVER_OK to resume, the
> >>> driver
> >> should clear the SUSPEND bit and verify that it is out of SUSPEND.
> >>> Because driver is still in _OK_ driving the device flipping the SUSPEND 
> >>> bit.
> >> Please read the spec, it says:
> >> The driver MUST NOT clear a device status bit
> >>
> > Yes, this is why either DRIER_OK validation by the driver is needed or 
> > Jiqian's
> synchronous new register..
> so re-read
Yes. re-read until set, Thanks.

Re: [virtio-dev] Re: [virtio-comment] Re: [VIRTIO PCI PATCH v5 1/1] transport-pci: Add freeze_mode to virtio_pci_common_cfg





On 9/20/2023 3:24 PM, Chen, Jiqian wrote:

Hi Lingshan,
It seems you reply to the wrong email thread. They are not related to my patch.

These reply to Parva's comments.
@Parva, if you want to discuss more about live migration, please reply 
in my thread, lets don't flood here.


On 2023/9/20 15:06, Zhu, Lingshan wrote:


On 9/20/2023 2:58 PM, Parav Pandit wrote:

From: Chen, Jiqian 
Sent: Wednesday, September 20, 2023 12:03 PM
If driver write 0 to reset device, can the SUSPEND bit be cleared?

It must as reset operation, resets everything else and so the suspend too.


(pci_pm_resume->virtio_pci_restore->virtio_device_restore-

virtio_reset_device)

If SUSPEND is cleared, then during the reset process in Qemu, I can't judge if
the reset request is from guest restore process or not, and then I can't change
the reset behavior.

Reset should not be influenced by suspend.
Suspend should do the work of suspend and reset to do the reset.

The problem to overcome in [1] is, resume operation needs to be synchronous as 
it involves large part of context to resume back, and hence just asynchronously 
setting DRIVER_OK is not enough.
The sw must verify back that device has resumed the operation and ready to 
answer requests.

this is not live migration, all device status and other information still stay in the 
device, no need to "resume" context, just resume running.

Like resume from a failed LM.

This is slightly different flow than setting the DRIVER_OK for the first time 
device initialization sequence as it does not involve large restoration.

So, to merge two ideas, instead of doing DRIVER_OK to resume, the driver should 
clear the SUSPEND bit and verify that it is out of SUSPEND.

Because driver is still in _OK_ driving the device flipping the SUSPEND bit.

Please read the spec, it says:
The driver MUST NOT clear a device status bit





-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

Re: [virtio-dev] Re: [virtio-comment] Re: [VIRTIO PCI PATCH v5 1/1] transport-pci: Add freeze_mode to virtio_pci_common_cfg





On 9/20/2023 3:10 PM, Parav Pandit wrote:

From: Zhu, Lingshan 
Sent: Wednesday, September 20, 2023 12:37 PM

The problem to overcome in [1] is, resume operation needs to be synchronous

as it involves large part of context to resume back, and hence just
asynchronously setting DRIVER_OK is not enough.

The sw must verify back that device has resumed the operation and ready to

answer requests.
this is not live migration, all device status and other information still stay 
in the
device, no need to "resume" context, just resume running.


I am aware that it is not live migration. :)

"Just resuming" involves lot of device setup task. The device implementation 
does not know for how long a device is suspended.
So for example, a VM is suspended for 6 hours, hence the device context could 
be saved in a slow disk.
Hence, when the resume is done, it needs to setup things again and driver got 
to verify before accessing more from the device.
The restore procedures should perform by the hypervisor and done before 
set DRIVER_OK and wake up the guest.

And the hypervisor/driver needs to check the device status by re-reading.
  

Like resume from a failed LM.

This is slightly different flow than setting the DRIVER_OK for the first time

device initialization sequence as it does not involve large restoration.

So, to merge two ideas, instead of doing DRIVER_OK to resume, the driver

should clear the SUSPEND bit and verify that it is out of SUSPEND.

Because driver is still in _OK_ driving the device flipping the SUSPEND bit.

Please read the spec, it says:
The driver MUST NOT clear a device status bit


Yes, this is why either DRIER_OK validation by the driver is needed or Jiqian's 
synchronous new register..

so re-read





-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

Re: [virtio-dev] Re: [virtio-comment] Re: [VIRTIO PCI PATCH v5 1/1] transport-pci: Add freeze_mode to virtio_pci_common_cfg

Hi Lingshan,
It seems you reply to the wrong email thread. They are not related to my patch.

On 2023/9/20 15:06, Zhu, Lingshan wrote:
> 
> 
> On 9/20/2023 2:58 PM, Parav Pandit wrote:
>>> From: Chen, Jiqian 
>>> Sent: Wednesday, September 20, 2023 12:03 PM
>>> If driver write 0 to reset device, can the SUSPEND bit be cleared?
>> It must as reset operation, resets everything else and so the suspend too.
>>
>>> (pci_pm_resume->virtio_pci_restore->virtio_device_restore-
 virtio_reset_device)
>>> If SUSPEND is cleared, then during the reset process in Qemu, I can't judge 
>>> if
>>> the reset request is from guest restore process or not, and then I can't 
>>> change
>>> the reset behavior.
>> Reset should not be influenced by suspend.
>> Suspend should do the work of suspend and reset to do the reset.
>>
>> The problem to overcome in [1] is, resume operation needs to be synchronous 
>> as it involves large part of context to resume back, and hence just 
>> asynchronously setting DRIVER_OK is not enough.
>> The sw must verify back that device has resumed the operation and ready to 
>> answer requests.
> this is not live migration, all device status and other information still 
> stay in the device, no need to "resume" context, just resume running.
> 
> Like resume from a failed LM.
>>
>> This is slightly different flow than setting the DRIVER_OK for the first 
>> time device initialization sequence as it does not involve large restoration.
>>
>> So, to merge two ideas, instead of doing DRIVER_OK to resume, the driver 
>> should clear the SUSPEND bit and verify that it is out of SUSPEND.
>>
>> Because driver is still in _OK_ driving the device flipping the SUSPEND bit.
> Please read the spec, it says:
> The driver MUST NOT clear a device status bit
> 
> 

-- 
Best regards,
Jiqian Chen.

[virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] Re: [VIRTIO PCI PATCH v5 1/1] transport-pci: Add freeze_mode to virtio_pci_common_cfg

Hi Lingshan,

On 2023/9/20 14:58, Zhu, Lingshan wrote:
> 
> 
> On 9/20/2023 2:33 PM, Chen, Jiqian wrote:
>> Hi Lingshan,
>>
>> On 2023/9/20 13:59, Zhu, Lingshan wrote:
>>>
>>> On 9/19/2023 8:31 PM, Michael S. Tsirkin wrote:
 On Tue, Sep 19, 2023 at 07:42:42PM +0800, Jiqian Chen wrote:
> When guest vm does S3, Qemu will reset and clear some things of virtio
> devices, but guest can't aware that, so that may cause some problems.
> For excample, Qemu calls virtio_reset->virtio_gpu_gl_reset when guest
> resume, that function will destroy render resources of virtio-gpu. As
> a result, after guest resume, the display can't come back and we only
> saw a black screen. Due to guest can't re-create all the resources, so
> we need to let Qemu not to destroy them when S3.
>
> For above purpose, we need a mechanism that allows guests and QEMU to
> negotiate their reset behavior. So this patch add a new parameter
> named freeze_mode to struct virtio_pci_common_cfg. And when guest
> suspends, it can write freeze_mode to be FREEZE_S3, and then virtio
> devices can change their reset behavior on Qemu side according to
> freeze_mode. What's more, freeze_mode can be used for all virtio
> devices to affect the behavior of Qemu, not just virtio gpu device.
>>> Hi Jiqian,
>>>
>>> Have you seen this series: [PATCH 0/5] virtio: introduce SUSPEND bit and vq 
>>> state
>>> https://lore.kernel.org/all/3f4cbf84-010c-cffa-0b70-33c449b55...@intel.com/T/
>>>
>>> We introduced a bit in the device status SUSPEND, when VIRTIO_F_SUSPEND is
>>> negotiated, the driver can set SUSPEND in the device status to suspend the
>>> device.
>>>
>>> When SUSPEND, the device should pause its operations and preserve its 
>>> configurations
>>> in its configuration space.
>>>
>>> The driver re-write DRIVER_OK to clear SUSPEND, so the device resumes 
>>> running.
>>>
>>> This is originally to serve live migration, but I think it can also meet 
>>> your needs.
>> I noticed your series, but I am not sure they are also meet my needs.
>> If driver write 0 to reset device, can the SUSPEND bit be cleared? 
>> (pci_pm_resume->virtio_pci_restore->virtio_device_restore->virtio_reset_device)
> if the driver writes 0, it resets all virtio functionalities. So SUSPEND is 
> cleared.
Then your patches are not meet my needs. In my scene, it needs to keep the 
SUSPEND bit util the resume process is complete.
Because in my virtio-gpu scene, when guest resume, it call virtio_reset_device 
to clear all device status bits, and then reset virtio-gpu in Qemu, and then 
destroy render resources, I don't want the resources are destroyed during the 
resume process. So, I add freeze_mode to tell Qemu that guest is doing S3 and 
resources need to be kept.

> device reset can also be used to recover the device from fatal errors, so it 
> should reset everything in virtio.
>> If SUSPEND is cleared, then during the reset process in Qemu, I can't judge 
>> if the reset request is from guest restore process or not, and then I can't 
>> change the reset behavior.
> I think when enter S3, the hypervisor/driver should set SUSPEND to the 
> device. And when resume from S3, the hypervisor/driver should
> re-write DRIVER_OK to clear SUSPEND, then the device resume running.
>> Can you send me your patch link on kernel and qemu side? I will take a deep 
>> look.
> There are no patches for qemu/kernel yet, spec first.
>>
>>> Thanks,
>>> Zhu Lingshan
> Signed-off-by: Jiqian Chen 
> ---
>    transport-pci.tex | 7 +++
>    1 file changed, 7 insertions(+)
>
> diff --git a/transport-pci.tex b/transport-pci.tex
> index a5c6719..2543536 100644
> --- a/transport-pci.tex
> +++ b/transport-pci.tex
> @@ -319,6 +319,7 @@ \subsubsection{Common configuration structure 
> layout}\label{sec:Virtio Transport
>    le64 queue_desc;    /* read-write */
>    le64 queue_driver;  /* read-write */
>    le64 queue_device;  /* read-write */
> +    le16 freeze_mode;   /* read-write */
>    le16 queue_notif_config_data;   /* read-only for driver */
>    le16 queue_reset;   /* read-write */
>
 we can't add fields in the middle of the structure like this -
 offset of queue_notif_config_data and queue_reset changes.

   
> @@ -393,6 +394,12 @@ \subsubsection{Common configuration structure 
> layout}\label{sec:Virtio Transport
>    \item[\field{queue_device}]
>    The driver writes the physical address of Device Area here.  
> See section \ref{sec:Basic Facilities of a Virtio Device / Virtqueues}.
>    +\item[\field{freeze_mode}]
> +    The driver writes this to set the freeze mode of virtio pci.
> +    VIRTIO_PCI_FREEZE_MODE_UNFREEZE - virtio-pci is running;
> +    VIRTIO_PCI_FREEZE_MODE_FREEZE_S3 - guest vm is doing

RE: [virtio-dev] Re: [virtio-comment] Re: [VIRTIO PCI PATCH v5 1/1] transport-pci: Add freeze_mode to virtio_pci_common_cfg


> From: Zhu, Lingshan 
> Sent: Wednesday, September 20, 2023 12:37 PM

> > The problem to overcome in [1] is, resume operation needs to be synchronous
> as it involves large part of context to resume back, and hence just
> asynchronously setting DRIVER_OK is not enough.
> > The sw must verify back that device has resumed the operation and ready to
> answer requests.
> this is not live migration, all device status and other information still 
> stay in the
> device, no need to "resume" context, just resume running.
> 
I am aware that it is not live migration. :)

"Just resuming" involves lot of device setup task. The device implementation 
does not know for how long a device is suspended.
So for example, a VM is suspended for 6 hours, hence the device context could 
be saved in a slow disk.
Hence, when the resume is done, it needs to setup things again and driver got 
to verify before accessing more from the device.
 
> Like resume from a failed LM.
> >
> > This is slightly different flow than setting the DRIVER_OK for the first 
> > time
> device initialization sequence as it does not involve large restoration.
> >
> > So, to merge two ideas, instead of doing DRIVER_OK to resume, the driver
> should clear the SUSPEND bit and verify that it is out of SUSPEND.
> >
> > Because driver is still in _OK_ driving the device flipping the SUSPEND bit.
> Please read the spec, it says:
> The driver MUST NOT clear a device status bit
> 
Yes, this is why either DRIER_OK validation by the driver is needed or Jiqian's 
synchronous new register..

Re: [virtio-dev] Re: [virtio-comment] Re: [VIRTIO PCI PATCH v5 1/1] transport-pci: Add freeze_mode to virtio_pci_common_cfg





On 9/20/2023 2:58 PM, Parav Pandit wrote:

From: Chen, Jiqian 
Sent: Wednesday, September 20, 2023 12:03 PM
If driver write 0 to reset device, can the SUSPEND bit be cleared?

It must as reset operation, resets everything else and so the suspend too.


(pci_pm_resume->virtio_pci_restore->virtio_device_restore-

virtio_reset_device)

If SUSPEND is cleared, then during the reset process in Qemu, I can't judge if
the reset request is from guest restore process or not, and then I can't change
the reset behavior.

Reset should not be influenced by suspend.
Suspend should do the work of suspend and reset to do the reset.

The problem to overcome in [1] is, resume operation needs to be synchronous as 
it involves large part of context to resume back, and hence just asynchronously 
setting DRIVER_OK is not enough.
The sw must verify back that device has resumed the operation and ready to 
answer requests.
this is not live migration, all device status and other information 
still stay in the device, no need to "resume" context, just resume running.


Like resume from a failed LM.


This is slightly different flow than setting the DRIVER_OK for the first time 
device initialization sequence as it does not involve large restoration.

So, to merge two ideas, instead of doing DRIVER_OK to resume, the driver should 
clear the SUSPEND bit and verify that it is out of SUSPEND.

Because driver is still in _OK_ driving the device flipping the SUSPEND bit.

Please read the spec, it says:
The driver MUST NOT clear a device status bit



-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

Re: [virtio-dev] Re: [virtio-comment] Re: [VIRTIO PCI PATCH v5 1/1] transport-pci: Add freeze_mode to virtio_pci_common_cfg





On 9/20/2023 2:33 PM, Chen, Jiqian wrote:

Hi Lingshan,

On 2023/9/20 13:59, Zhu, Lingshan wrote:


On 9/19/2023 8:31 PM, Michael S. Tsirkin wrote:

On Tue, Sep 19, 2023 at 07:42:42PM +0800, Jiqian Chen wrote:

When guest vm does S3, Qemu will reset and clear some things of virtio
devices, but guest can't aware that, so that may cause some problems.
For excample, Qemu calls virtio_reset->virtio_gpu_gl_reset when guest
resume, that function will destroy render resources of virtio-gpu. As
a result, after guest resume, the display can't come back and we only
saw a black screen. Due to guest can't re-create all the resources, so
we need to let Qemu not to destroy them when S3.

For above purpose, we need a mechanism that allows guests and QEMU to
negotiate their reset behavior. So this patch add a new parameter
named freeze_mode to struct virtio_pci_common_cfg. And when guest
suspends, it can write freeze_mode to be FREEZE_S3, and then virtio
devices can change their reset behavior on Qemu side according to
freeze_mode. What's more, freeze_mode can be used for all virtio
devices to affect the behavior of Qemu, not just virtio gpu device.

Hi Jiqian,

Have you seen this series: [PATCH 0/5] virtio: introduce SUSPEND bit and vq 
state
https://lore.kernel.org/all/3f4cbf84-010c-cffa-0b70-33c449b55...@intel.com/T/

We introduced a bit in the device status SUSPEND, when VIRTIO_F_SUSPEND is
negotiated, the driver can set SUSPEND in the device status to suspend the
device.

When SUSPEND, the device should pause its operations and preserve its 
configurations
in its configuration space.

The driver re-write DRIVER_OK to clear SUSPEND, so the device resumes running.

This is originally to serve live migration, but I think it can also meet your 
needs.

I noticed your series, but I am not sure they are also meet my needs.
If driver write 0 to reset device, can the SUSPEND bit be cleared? 
(pci_pm_resume->virtio_pci_restore->virtio_device_restore->virtio_reset_device)
if the driver writes 0, it resets all virtio functionalities. So SUSPEND 
is cleared.
device reset can also be used to recover the device from fatal errors, 
so it should reset everything in virtio.

If SUSPEND is cleared, then during the reset process in Qemu, I can't judge if 
the reset request is from guest restore process or not, and then I can't change 
the reset behavior.
I think when enter S3, the hypervisor/driver should set SUSPEND to the 
device. And when resume from S3, the hypervisor/driver should

re-write DRIVER_OK to clear SUSPEND, then the device resume running.

Can you send me your patch link on kernel and qemu side? I will take a deep 
look.

There are no patches for qemu/kernel yet, spec first.



Thanks,
Zhu Lingshan

Signed-off-by: Jiqian Chen 
---
   transport-pci.tex | 7 +++
   1 file changed, 7 insertions(+)

diff --git a/transport-pci.tex b/transport-pci.tex
index a5c6719..2543536 100644
--- a/transport-pci.tex
+++ b/transport-pci.tex
@@ -319,6 +319,7 @@ \subsubsection{Common configuration structure 
layout}\label{sec:Virtio Transport
   le64 queue_desc;    /* read-write */
   le64 queue_driver;  /* read-write */
   le64 queue_device;  /* read-write */
+    le16 freeze_mode;   /* read-write */
   le16 queue_notif_config_data;   /* read-only for driver */
   le16 queue_reset;   /* read-write */


we can't add fields in the middle of the structure like this -
offset of queue_notif_config_data and queue_reset changes.

   

@@ -393,6 +394,12 @@ \subsubsection{Common configuration structure 
layout}\label{sec:Virtio Transport
   \item[\field{queue_device}]
   The driver writes the physical address of Device Area here.  See 
section \ref{sec:Basic Facilities of a Virtio Device / Virtqueues}.
   +\item[\field{freeze_mode}]
+    The driver writes this to set the freeze mode of virtio pci.
+    VIRTIO_PCI_FREEZE_MODE_UNFREEZE - virtio-pci is running;
+    VIRTIO_PCI_FREEZE_MODE_FREEZE_S3 - guest vm is doing S3, and 
virtio-pci enters S3 suspension;
+    Other values are reserved for future use, like S4, etc.
+

we need to specify these values then.

we also need
- feature bit to detect support for S3
- conformance statements documenting behavious under S3



   \item[\field{queue_notif_config_data}]
   This field exists only if VIRTIO_F_NOTIF_CONFIG_DATA has been 
negotiated.
   The driver will use this value when driver sends available buffer
--
2.34.1

This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscr...@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscr...@lists.oasis-open.org
List help: virtio-comment-h...@lists.oasis-open.org
List archive:

RE: [virtio-dev] Re: [virtio-comment] Re: [VIRTIO PCI PATCH v5 1/1] transport-pci: Add freeze_mode to virtio_pci_common_cfg


> From: Chen, Jiqian 
> Sent: Wednesday, September 20, 2023 12:03 PM

> If driver write 0 to reset device, can the SUSPEND bit be cleared?
It must as reset operation, resets everything else and so the suspend too.

> (pci_pm_resume->virtio_pci_restore->virtio_device_restore-
> >virtio_reset_device)
> If SUSPEND is cleared, then during the reset process in Qemu, I can't judge if
> the reset request is from guest restore process or not, and then I can't 
> change
> the reset behavior.
Reset should not be influenced by suspend.
Suspend should do the work of suspend and reset to do the reset.

The problem to overcome in [1] is, resume operation needs to be synchronous as 
it involves large part of context to resume back, and hence just asynchronously 
setting DRIVER_OK is not enough.
The sw must verify back that device has resumed the operation and ready to 
answer requests.

This is slightly different flow than setting the DRIVER_OK for the first time 
device initialization sequence as it does not involve large restoration.

So, to merge two ideas, instead of doing DRIVER_OK to resume, the driver should 
clear the SUSPEND bit and verify that it is out of SUSPEND.

Because driver is still in _OK_ driving the device flipping the SUSPEND bit.

Re: [virtio-dev] Re: [virtio-comment] Re: [VIRTIO PCI PATCH v5 1/1] transport-pci: Add freeze_mode to virtio_pci_common_cfg

Hi Lingshan,

On 2023/9/20 13:59, Zhu, Lingshan wrote:
> 
> 
> On 9/19/2023 8:31 PM, Michael S. Tsirkin wrote:
>> On Tue, Sep 19, 2023 at 07:42:42PM +0800, Jiqian Chen wrote:
>>> When guest vm does S3, Qemu will reset and clear some things of virtio
>>> devices, but guest can't aware that, so that may cause some problems.
>>> For excample, Qemu calls virtio_reset->virtio_gpu_gl_reset when guest
>>> resume, that function will destroy render resources of virtio-gpu. As
>>> a result, after guest resume, the display can't come back and we only
>>> saw a black screen. Due to guest can't re-create all the resources, so
>>> we need to let Qemu not to destroy them when S3.
>>>
>>> For above purpose, we need a mechanism that allows guests and QEMU to
>>> negotiate their reset behavior. So this patch add a new parameter
>>> named freeze_mode to struct virtio_pci_common_cfg. And when guest
>>> suspends, it can write freeze_mode to be FREEZE_S3, and then virtio
>>> devices can change their reset behavior on Qemu side according to
>>> freeze_mode. What's more, freeze_mode can be used for all virtio
>>> devices to affect the behavior of Qemu, not just virtio gpu device.
> Hi Jiqian,
> 
> Have you seen this series: [PATCH 0/5] virtio: introduce SUSPEND bit and vq 
> state
> https://lore.kernel.org/all/3f4cbf84-010c-cffa-0b70-33c449b55...@intel.com/T/
> 
> We introduced a bit in the device status SUSPEND, when VIRTIO_F_SUSPEND is
> negotiated, the driver can set SUSPEND in the device status to suspend the
> device.
> 
> When SUSPEND, the device should pause its operations and preserve its 
> configurations
> in its configuration space.
> 
> The driver re-write DRIVER_OK to clear SUSPEND, so the device resumes running.
> 
> This is originally to serve live migration, but I think it can also meet your 
> needs.
I noticed your series, but I am not sure they are also meet my needs.
If driver write 0 to reset device, can the SUSPEND bit be cleared? 
(pci_pm_resume->virtio_pci_restore->virtio_device_restore->virtio_reset_device)
If SUSPEND is cleared, then during the reset process in Qemu, I can't judge if 
the reset request is from guest restore process or not, and then I can't change 
the reset behavior.
Can you send me your patch link on kernel and qemu side? I will take a deep 
look.

> 
> Thanks,
> Zhu Lingshan
>>>
>>> Signed-off-by: Jiqian Chen 
>>> ---
>>>   transport-pci.tex | 7 +++
>>>   1 file changed, 7 insertions(+)
>>>
>>> diff --git a/transport-pci.tex b/transport-pci.tex
>>> index a5c6719..2543536 100644
>>> --- a/transport-pci.tex
>>> +++ b/transport-pci.tex
>>> @@ -319,6 +319,7 @@ \subsubsection{Common configuration structure 
>>> layout}\label{sec:Virtio Transport
>>>   le64 queue_desc;    /* read-write */
>>>   le64 queue_driver;  /* read-write */
>>>   le64 queue_device;  /* read-write */
>>> +    le16 freeze_mode;   /* read-write */
>>>   le16 queue_notif_config_data;   /* read-only for driver */
>>>   le16 queue_reset;   /* read-write */
>>>
>> we can't add fields in the middle of the structure like this -
>> offset of queue_notif_config_data and queue_reset changes.
>>
>>   
>>> @@ -393,6 +394,12 @@ \subsubsection{Common configuration structure 
>>> layout}\label{sec:Virtio Transport
>>>   \item[\field{queue_device}]
>>>   The driver writes the physical address of Device Area here.  See 
>>> section \ref{sec:Basic Facilities of a Virtio Device / Virtqueues}.
>>>   +\item[\field{freeze_mode}]
>>> +    The driver writes this to set the freeze mode of virtio pci.
>>> +    VIRTIO_PCI_FREEZE_MODE_UNFREEZE - virtio-pci is running;
>>> +    VIRTIO_PCI_FREEZE_MODE_FREEZE_S3 - guest vm is doing S3, and 
>>> virtio-pci enters S3 suspension;
>>> +    Other values are reserved for future use, like S4, etc.
>>> +
>> we need to specify these values then.
>>
>> we also need
>> - feature bit to detect support for S3
>> - conformance statements documenting behavious under S3
>>
>>
>>>   \item[\field{queue_notif_config_data}]
>>>   This field exists only if VIRTIO_F_NOTIF_CONFIG_DATA has been 
>>> negotiated.
>>>   The driver will use this value when driver sends available buffer
>>> -- 
>>> 2.34.1
>>
>> This publicly archived list offers a means to provide input to the
>> OASIS Virtual I/O Device (VIRTIO) TC.
>>
>> In order to verify user consent to the Feedback License terms and
>> to minimize spam in the list archive, subscription is required
>> before posting.
>>
>> Subscribe: virtio-comment-subscr...@lists.oasis-open.org
>> Unsubscribe: virtio-comment-unsubscr...@lists.oasis-open.org
>> List help: virtio-comment-h...@lists.oasis-open.org
>> List archive: https://lists.oasis-open.org/archives/virtio-comment/
>> Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
>> List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
>>

Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state




On 9/20/2023 2:08 PM, Parav Pandit wrote:

From: Zhu, Lingshan
Sent: Wednesday, September 20, 2023 11:36 AM

On 9/19/2023 2:49 AM, Michael S. Tsirkin wrote:

On Mon, Sep 18, 2023 at 06:41:55PM +, Parav Pandit wrote:

Please refer to the code for setting FEATURES_OK.

It wont work when one needs to suspend the device.
There is no point of doing such work over registers as fundamental

framework is over the AQ.

Well not really. It's over admin commands. When these were built the
intent always was that it's possible to use admin commands through
another interface, other than admin queue. Is there a problem
implementing admin commands over a memory BAR? For example, I can see
an "admin command" capability pointing at a BAR where commands are
supplied, and using a new group type referring to device itself.

I am not sure, if a bar cap would be implemented as a proxy for the admin vq
based live migration. then the problems of admin vq LM that we have discussed
still exist. the bar is only a proxy, doesn't fix anything. and even larger side
channel attacking surface: vf-->pf-->vf

AQ LM using PF has no side channel attack as hypervisor and owner device is 
trusted entity as already discussed.
I believe we have discussed this for many times, and I even provide you 
some examples.


Let me repeat for the last time.

There can be malicious SW on the host, and the host may be hacked and 
compromised.

For example:
1) SUSPEND the a running guest by admin vq
2) dumping guest memory through admin vq dirty page tracking.

These above can happen right?

You made TDISP as an example, but have you really read the TDISP spec?
In the spec:

Device Security Architecture - Administrative interfaces (e.g., a PF) may be
used to influence the security properties of the TDI used by the TVM.

TEE-I/O requires the device to organize its hardware/software interfaces 
such that the PF cannot

be used to affect the security of a TDI when it is in use by a TVM

Clear?

RE: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state


> From: Zhu, Lingshan 
> Sent: Wednesday, September 20, 2023 11:36 AM
> 
> On 9/19/2023 2:49 AM, Michael S. Tsirkin wrote:
> > On Mon, Sep 18, 2023 at 06:41:55PM +, Parav Pandit wrote:
> >>> Please refer to the code for setting FEATURES_OK.
> >> It wont work when one needs to suspend the device.
> >> There is no point of doing such work over registers as fundamental
> framework is over the AQ.
> > Well not really. It's over admin commands. When these were built the
> > intent always was that it's possible to use admin commands through
> > another interface, other than admin queue. Is there a problem
> > implementing admin commands over a memory BAR? For example, I can see
> > an "admin command" capability pointing at a BAR where commands are
> > supplied, and using a new group type referring to device itself.
> I am not sure, if a bar cap would be implemented as a proxy for the admin vq
> based live migration. then the problems of admin vq LM that we have discussed
> still exist. the bar is only a proxy, doesn't fix anything. and even larger 
> side
> channel attacking surface: vf-->pf-->vf

AQ LM using PF has no side channel attack as hypervisor and owner device is 
trusted entity as already discussed.

Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state