date:20230308

[virtio-dev] Re: [virtio-comment] [PATCH v10] virtio-net: support the virtqueue coalescing moderation

2023-03-08 Thread Heng Qi





在 2023/3/7 上午6:57, Michael S. Tsirkin 写道:

On Thu, Mar 02, 2023 at 11:36:18AM +, David Edmondson wrote:

+for an enabled transmit/receive 
virtqueue whose number is \field{vqn}.

Should this now be "whose index is \field{vqn}"?

Ugh.  I guess we'll have to fix the number/index mess in the spec
first. Parav you said you are looking into it?




# Where virtqueue number and virtqueue index are used.
  1. In the Virtqueue Configuration Section, use the virtqueue index: 
"Write the **virtqueue index** (first queue is 0) to queue_select."

  2. Both descriptions are used separately in the Notification Section.
  2.1 Here vqn is called virtqueue index:
    "When VIRTIO_F_NOTIFICATION_DATA has not been negotiated, 
the driver sends an available buffer notification
  to the device by writing the **16-bit virtqueue index** 
of this virtqueue to the Queue Notify address.

  ...

  le32 {
 vqn: 16;
 next_off : 15;
 next_wrap: 1;
 };
 ...

 If VIRTIO_F_NOTIFICATION_DATA has not been negotiated, the 
driver MUST use the queue_notify_data value instead of the **virtqueue 
index**."


  2.2 Here vqn is called virtqueue number:
    "When VIRTIO_F_NOTIFICATION_DATA has not been negotiated, 
the Notification data contains the **Virtqueue number**.

    ...

    be32 {
    vqn: 16;
    next_off : 15;
    next_wrap: 1;
 };
    ...

    When VIRTIO_F_NOTIFICATION_DATA has not been negotiated, 
this notification involves sending the **virtqueue number** to the 
device (method depending on the transport).

    ...

    vqn -- **VQ number** to be notified."

# 0-based index and 0-based number are used respectively in the RSS Section:
1. "Field unclassified_queue contains the **0-based index** of the 
receive virtqueue to place unclassified packets in. Index 0 corresponds 
to receiveq1."
2. "use the result as the index in the indirection table to get 
**0-based number** of destination receiveq (value of 0 corresponds to 
receiveq1)."




\field{vqn} has been called '0-based virtqueue index' or '0-based 
virtqueue number',

I think both seem to be friendly to readers, so what are your options?

Thanks.






-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

Re: [virtio-dev] [PATCH v10] virtio-net: support inner header hash

2023-03-08 Thread Michael S. Tsirkin

On Wed, Mar 08, 2023 at 10:27:13PM +0800, Heng Qi wrote:
> Do you have any comments, please?
> 
> Thanks. :)
> 
> 在 2023/3/6 下午11:48, Heng Qi 写道:

It's been just 2 partial days. Sit tight pls.

-- 
MST


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

[virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [PATCH v9] virtio-net: support inner header hash

2023-03-08 Thread Michael S. Tsirkin

On Wed, Mar 01, 2023 at 10:56:31AM +0800, Heng Qi wrote:
> 
> 
> 在 2023/2/28 下午7:16, Michael S. Tsirkin 写道:
> > On Sat, Feb 18, 2023 at 10:37:15PM +0800, Heng Qi wrote:
> > > If the tunnel is used to encapsulate the packets, the hash calculated
> > > using the outer header of the receive packets is always fixed for the
> > > same flow packets, i.e. they will be steered to the same receive queue.
> > Wait a second. How is this true? Does not everyone stick the
> > inner header hash in the outer source port to solve this?
> 
> Yes, you are right. That's what we did before the inner header hash, but it
> has a performance penalty, which I'll explain below.
> 
> > For example geneve spec says:
> > 
> > it is necessary for entropy from encapsulated packets to be
> > exposed in the tunnel header.  The most common technique for this is
> > to use the UDP source port
> 
> The end point of the tunnel called the gateway (with DPDK on top of it).
> 
> 1. When there is no inner header hash, entropy can be inserted into the udp
> src port of the outer header of the tunnel,
> and then the tunnel packet is handed over to the host. The host needs to
> take out a part of the CPUs to parse the outer headers (but not drop them)
> to calculate the inner hash for the inner payloads,
> and then use the inner
> hash to forward them to another part of the CPUs that are responsible for
> processing.

I don't get this part. Leave inner hashes to the guest inside the
tunnel, why is your host doing this?

> 1). During this process, the CPUs on the host is divided into two parts, one
> part is used as a forwarding node to parse the outer header,
>  and the CPU utilization is low. Another part handles packets.

Some overhead is clearly involved in *sending* packets -
to calculate the hash and stick it in the port number.
This is, however, a separate problem and if you want to
solve it then my suggestion would be to teach the *transmit*
side about GRE offloads, so it can fill the source port in the card.

> 2). The entropy of the source udp src port is not enough, that is, the queue
> is not widely distributed.

how isn't it enough? 16 bit is enough to cover all vqs ...

> 2. When there is an inner header hash, the gateway will directly help parse
> the outer header, and use the inner 5 tuples to calculate the inner hash.
> The tunneled packet is then handed over to the host.
> 1) All the CPUs of the host are used to process data packets, and there is
> no need to use some CPUs to forward and parse the outer header.

You really have to parse the outer header anyway,
otherwise there's no tunneling.
Unless you want to teach virtio to implement tunneling
in hardware, which is something I'd find it easier to
get behind.

> 2) The entropy of the original quintuple is sufficient, and the queue is
> widely distributed.

It's exactly the same entropy, why would it be better? In fact you
are taking out the outer hash entropy making things worse.

> 
> Thanks.
> > 
> > same goes for vxlan did not check further.
> > 
> > so what is the problem?  and which tunnel types actually suffer from the
> > problem?
> > 
> 
> 
> This publicly archived list offers a means to provide input to the
> OASIS Virtual I/O Device (VIRTIO) TC.
> 
> In order to verify user consent to the Feedback License terms and
> to minimize spam in the list archive, subscription is required
> before posting.
> 
> Subscribe: virtio-comment-subscr...@lists.oasis-open.org
> Unsubscribe: virtio-comment-unsubscr...@lists.oasis-open.org
> List help: virtio-comment-h...@lists.oasis-open.org
> List archive: https://lists.oasis-open.org/archives/virtio-comment/
> Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
> List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
> Committee: https://www.oasis-open.org/committees/virtio/
> Join OASIS: https://www.oasis-open.org/join/


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

[virtio-dev] Re: [PATCH v10 03/10] admin: introduce group administration commands

2023-03-08 Thread Cornelia Huck

On Wed, Mar 08 2023, Max Gurtovoy  wrote:

> On 08/03/2023 14:07, Michael S. Tsirkin wrote:
>> On Wed, Mar 08, 2023 at 12:55:37PM +0200, Max Gurtovoy wrote:
> 5. VIRTIO_ADMIN_STATUS_COMMAND_SPECIFIC_ERR (for more info read the
> command_specific_error field).

 I don't think it's a good idea, we'll have to agree to disagree.
>>>
>>> Ok.
>>> can you explain why isn't this a good idea please ?
>> 
>> Pointless complexity for spec, devices and drivers. In the end drivers
>> don't really do anything with all this detailed info.
>> 
>
> Pointless according to your own opinion. As I mentioned, other 
> specifications and devices work this way. I didn't invent it.
>
> All drivers in SW world ? All drivers in Linux ? or Virtio drivers ?

FWIW, I've spent enough time translating
detailed-but-not-really-that-useful error status codes to -EINVAL for
s390-specific things...

The only thing that might make sense to transmit to a consumer would be
the difference between "this thing won't work" and "this thing might
work if you retry later" (if the consumer is even in a place to retry.)


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

[virtio-dev] Re: [PATCH v10 03/10] admin: introduce group administration commands

2023-03-08 Thread Michael S. Tsirkin

On Wed, Mar 08, 2023 at 03:43:14PM +0100, Cornelia Huck wrote:
> On Wed, Mar 08 2023, Max Gurtovoy  wrote:
> 
> > On 08/03/2023 14:07, Michael S. Tsirkin wrote:
> >> On Wed, Mar 08, 2023 at 12:55:37PM +0200, Max Gurtovoy wrote:
> > 5. VIRTIO_ADMIN_STATUS_COMMAND_SPECIFIC_ERR (for more info read the
> > command_specific_error field).
> 
>  I don't think it's a good idea, we'll have to agree to disagree.
> >>>
> >>> Ok.
> >>> can you explain why isn't this a good idea please ?
> >> 
> >> Pointless complexity for spec, devices and drivers. In the end drivers
> >> don't really do anything with all this detailed info.
> >> 
> >
> > Pointless according to your own opinion. As I mentioned, other 
> > specifications and devices work this way. I didn't invent it.
> >
> > All drivers in SW world ? All drivers in Linux ? or Virtio drivers ?
> 
> FWIW, I've spent enough time translating
> detailed-but-not-really-that-useful error status codes to -EINVAL for
> s390-specific things...
> 
> The only thing that might make sense to transmit to a consumer would be
> the difference between "this thing won't work" and "this thing might
> work if you retry later" (if the consumer is even in a place to retry.)

One place where that extra detail might be helpful is device debugging.
But there an error code out of list is too restrictive anyway, you want
"Turbo Encabulator 42 logarithmic casing failure".  I am still looking
for something that drivers are not likely to abuse. Maybe an ASCII
string?

-- 
MST


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

[virtio-dev] Re: [virtio] [PATCH v10 01/10] virtio: document forward compatibility guarantees

2023-03-08 Thread Michael S. Tsirkin

On Wed, Mar 08, 2023 at 03:16:41PM +0100, Cornelia Huck wrote:
> On Tue, Mar 07 2023, David Edmondson  wrote:
> 
> > "Michael S. Tsirkin"  writes:
> >
> >> On Mon, Mar 06, 2023 at 01:53:50PM +, David Edmondson wrote:
> >>> "Michael S. Tsirkin"  writes:
> >>> 
> >>> > Feature negotiation forms the basis of forward compatibility
> >>> > guarantees of virtio but has never been properly documented.
> >>> > Do it now.
> >>> >
> >>> > Suggested-by: Halil Pasic 
> >>> > Signed-off-by: Michael S. Tsirkin 
> >>> > ---
> >>> >  content.tex | 42 ++
> >>> >  1 file changed, 42 insertions(+)
> >>> >
> >>> > diff --git a/content.tex b/content.tex
> >>> > index 0e474dd..0c2d917 100644
> >>> > --- a/content.tex
> >>> > +++ b/content.tex
> >>> > @@ -114,21 +114,63 @@ \section{Feature Bits}\label{sec:Basic Facilities 
> >>> > of a Virtio Device / Feature B
> >>> >  In particular, new fields in the device configuration space are
> >>> >  indicated by offering a new feature bit.
> >>> >  
> >>> > +To keep the feature negotiation mechanism extensible, it is important
> >>> > +that devices \em{do not} offer any feature bits that they would not be
> >>> > +able to handle if the driver accepted them (even though drivers are not
> >>> > +supposed to accept them in the first place even if offered, according 
> >>> > to
> >>> > +this version of the specification.) 
> >>> 
> >>> I find this (the bit in parenthesis) confusing.
> >>> 
> >>> Why are drivers not supposed to accept features that they have been
> >>> offered, given that they can't know that the device cannot handle the
> >>> feature that it just offered?
> >>> 
> >>> Is this alluding to the later section:
> >>> 
> >>> > feature bits not described in this specification, reserved feature
> >>> > bits and feature bits reserved or not supported for the specific
> >>> > transport or the specific device type
> >>> 
> >>> ?
> >>
> >> exactly. how would you put this better? given an example?
> >
> > Perhaps it would be enough to say:
> >
> >> (even though drivers are not supposed to accept unrecognised features in
> >> the first place even if offered, according to the specification)
> >
> > "Unrecognised" is intended as a shorthand for the whole "not described,
> > reserved, ...". Maybe "unrecognised or reserved"?
> 
> Hm, what about
> 
> "even though drivers are not supposed to accept any unspecified,
> reserved, or unsupported features even if offered..."
> 
> ?

ok

> I'm not sure how we can make this both short and descriptive enough...



-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

[virtio-dev] Re: [PATCH v10 03/10] admin: introduce group administration commands

2023-03-08 Thread Michael S. Tsirkin

On Wed, Mar 08, 2023 at 03:05:50PM +0200, Max Gurtovoy wrote:
> All drivers in SW world ? All drivers in Linux ? or Virtio drivers ?

Most of the above.

-- 
MST


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

[virtio-dev] RE: [virtio-comment] Re: [virtio] Re: [PATCH v10 04/10] admin: introduce virtio admin virtqueues

2023-03-08 Thread Parav Pandit



> From: Stefan Hajnoczi 
> Sent: Wednesday, March 8, 2023 9:13 AM

> Therefore I think the admin queue must be designed under the assumption that
> some commands take a very long time.

+1

[virtio-dev] Re: [virtio-comment] RE: [PATCH v10 06/10] mmio: document ADMIN_VQ as reserved

2023-03-08 Thread Cornelia Huck

On Tue, Mar 07 2023, Parav Pandit  wrote:

>> From: Michael S. Tsirkin 
>> Sent: Friday, March 3, 2023 3:34 AM
>> 
>> On Thu, Mar 02, 2023 at 06:40:55PM +, Parav Pandit wrote:
>> > Did you miss reviewed-by from [1] or this is an old series reposted?
>> > [1]
>> > https://lists.oasis-open.org/archives/virtio-dev/202302/msg00242.html
>> 
>> As a general rule, we don't strictly need to track reviewed by since there's 
>> a
>> ballot (and presumably people review before voting).
>> 
>> People also tack on Signed-off-by: (and I do it too) but as long as we don't
>> document what it means it's kind of vague, and the process of subscribing to
>> the mailing list is a kind of replacement.
>> 
>> If you think everyone needs to follow practices like netdev does, we really 
>> need
>> something written up, and agree on it.
>> 
>> E.g.  I work on the linux kernel too, so I can copy practices from there, 
>> but even
>> linux isn't uniform.
>> 
>> And I wonder whether it's worth it - it definitely makes contributing to 
>> Linux
>> harder, and even within Linux it pushes contributors away. 
> The number of virtio spec contributors are in order of magnitude less than 
> Linux kernel or just the Linux netdev.
> Patch split up reduces lot of time author and reviewers.
> For example, interrupt moderation at v10 can be very easily split up where 
> prep patch can have RB tag and v11 doesn't need reviewers and author's time.
> Your work here with 10 patches is yet another good example of it.
> I remember Max and I started with 6 patches... and current 10 are more useful 
> way to split them.

I'd argue that splitting changes in a way that makes it easy for
reviewers is a good thing for any project (although practices on many
forges seem to go into another direction...)

>
>> At least for Linux
>> tracking history in a precise way is extremely important since it's helpful 
>> with
>> stability. Spec is very different.
>> 
>> Until we have a good contribution documentation I think we should not ask
>> people to follow a pseudo Linux work flow with requests like "please split 
>> this
>> patchset up" or "track changes across patch versions"
>> simply because there's no good docs to teach people what exactly is the best
>> practice.
>
> Yes, I wanted to update the contributing document. It is in my to-do list.
> But given the small crowd of contributors right now, almost everyone is 
> familiar with the RB, ack tag.
> At moment it has two main reasons.
>
> 1. Acknowledge the work and efforts that go in the review

I agree, and I try to include tags when applying.

> 2. When in question, reach out later to the spec author or reviewers to know 
> about context/design etc.
> 3. Same reviewer doesn't need to go through the patch which already has RB 
> tag of him/her.

I tend to use R-b in that way as well, but it relies on the patch author
not doing substantial changes between versions...

I guess the usage of R-b et al in the virtio spec stems from the fact
that many contributors are also Linux and/or QEMU contributors  -- not
sure if it makes sense to enshrine their usage, but

4. We can get an R-b from a non-TC member who is an expert on the topic
(and follows standard Linux/QEMU/... practices.)

In the end, how the TC members are voting is the only thing that matters
for inclusion.


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

[virtio-dev] Re: [PATCH v10 06/10] mmio: document ADMIN_VQ as reserved

2023-03-08 Thread Michael S. Tsirkin

On Tue, Mar 07, 2023 at 06:52:03PM +, Parav Pandit wrote:
> > And I wonder whether it's worth it - it definitely makes contributing to 
> > Linux
> > harder, and even within Linux it pushes contributors away. 
> The number of virtio spec contributors are in order of magnitude less
> than Linux kernel or just the Linux netdev.  Patch split up reduces
> lot of time author and reviewers.  For example, interrupt moderation
> at v10 can be very easily split up where prep patch can have RB tag
> and v11 doesn't need reviewers and author's time.  Your work here with
> 10 patches is yet another good example of it.  I remember Max and I
> started with 6 patches... and current 10 are more useful way to split
> them.

It's a big project though. All I ask is that we try to be gentler
than netdev to new contributors. If there's a 10 line patch it is
not necessary to be pedantic and ask for it to be split up to
a 3 patch series.

-- 
MST


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

[virtio-dev] Re: [virtio-comment] [PATCH v10 00/10] Introduce device group and device management

2023-03-08 Thread Cornelia Huck

On Mon, Mar 06 2023, "Michael S. Tsirkin"  wrote:

> On Mon, Mar 06, 2023 at 01:29:30PM +0100, Jiri Pirko wrote:
>> Thu, Mar 02, 2023 at 02:04:48PM CET, m...@redhat.com wrote:
>> 
>> [...]
>> 
>> >
>> >TODO (maybe?) - probably ok to defer until this part is upstream:
>> >
>> >Add "all members" member id.
>> >
>> >Add commands for MSI, feature discovery.
>> >
>> >Add commands for transport vq.
>> >
>> >
>> >My intent is to try and support both SR-IOV and SIOV
>> >usecases with the same structure and maybe even the same
>> >VQ.
>> 
>> Sorry to be late to the party, I'm trying to catch up. Is it common to
>> have cover letter for features this brief? I mean, from the cover
>> letter, I'm totally unable to understand what you are introducing here.
>> 
>> Could you elaborate about what you are aiming to achive with this?
>> Could you shed some usecases perhaps?
>> 
>> I have to be missing something obvious, but I don't get why any notion
>> of SR-IOV could be beneficial for virtio.
>> 
>
> Good point, I'll add a bit of motivation.
>
> For SR-IOV, it is not unusual for PFs to excercise control over VFs.
> There is interest in the community to include an interface to allow this
> in the virtio spec, when the PF is a virtio device.  This is what this
> patch does.

Unfortunately, information in the cover letter does not make it into
git -- should things like that go into the github issue (and into the
ballot?) It's useful both for reviewing (cover letter) and understanding
the rationale later (github/ballot).


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

[virtio-dev] RE: [virtio-comment] RE: [PATCH v10 06/10] mmio: document ADMIN_VQ as reserved

2023-03-08 Thread Parav Pandit




> From: Cornelia Huck 
> Sent: Wednesday, March 8, 2023 11:24 AM

> >> And I wonder whether it's worth it - it definitely makes contributing
> >> to Linux harder, and even within Linux it pushes contributors away.
> > The number of virtio spec contributors are in order of magnitude less than
> Linux kernel or just the Linux netdev.
> > Patch split up reduces lot of time author and reviewers.
> > For example, interrupt moderation at v10 can be very easily split up where
> prep patch can have RB tag and v11 doesn't need reviewers and author's time.
> > Your work here with 10 patches is yet another good example of it.
> > I remember Max and I started with 6 patches... and current 10 are more
> useful way to split them.
> 
> I'd argue that splitting changes in a way that makes it easy for reviewers is 
> a
> good thing for any project (although practices on many forges seem to go into
> another direction...)
> 
In my experience in dpdk, Linux netdev, rdma, nvme, tells me that patch 
splitting is common practice.

> >
> >> At least for Linux
> >> tracking history in a precise way is extremely important since it's
> >> helpful with stability. Spec is very different.
> >>
> >> Until we have a good contribution documentation I think we should not
> >> ask people to follow a pseudo Linux work flow with requests like
> >> "please split this patchset up" or "track changes across patch versions"
> >> simply because there's no good docs to teach people what exactly is
> >> the best practice.
> >
> > Yes, I wanted to update the contributing document. It is in my to-do list.
> > But given the small crowd of contributors right now, almost everyone is
> familiar with the RB, ack tag.
> > At moment it has two main reasons.
> >
> > 1. Acknowledge the work and efforts that go in the review
> 
> I agree, and I try to include tags when applying.
> 
Sure. It is not about you.
It is about the patch author when he/she sends v9 to v10, adding R-b tag in v10 
(for the already reviewed work of v9) helps to ack and speed up the v10 review.

> > 2. When in question, reach out later to the spec author or reviewers to know
> about context/design etc.
> > 3. Same reviewer doesn't need to go through the patch which already has RB
> tag of him/her.
> 
> I tend to use R-b in that way as well, but it relies on the patch author not 
> doing
> substantial changes between versions...
> 
> I guess the usage of R-b et al in the virtio spec stems from the fact that 
> many
> contributors are also Linux and/or QEMU contributors  -- not sure if it makes
> sense to enshrine their usage, but
> 
> 4. We can get an R-b from a non-TC member who is an expert on the topic (and
> follows standard Linux/QEMU/... practices.)
> 
The intent here was between vN to v(N+1).
Virtio spec maintainers are likely adding R-b when they merge anyway. No issue 
there.

> In the end, how the TC members are voting is the only thing that matters for
> inclusion.

-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

[virtio-dev] RE: [virtio-comment] [PATCH v10] virtio-net: support the virtqueue coalescing moderation

2023-03-08 Thread Parav Pandit

> From: Michael S. Tsirkin 
> Sent: Monday, March 6, 2023 5:57 PM
> 
> On Thu, Mar 02, 2023 at 11:36:18AM +, David Edmondson wrote:
> > > +for an enabled transmit/receive 
> > > virtqueue whose
> number is \field{vqn}.
> >
> > Should this now be "whose index is \field{vqn}"?
> 
> Ugh.  I guess we'll have to fix the number/index mess in the spec first. Parav
> you said you are looking into it?

Yes, I want to send v1 for " Rename queue index to queue number" series.
V1 will include,

a. the vqn changes for net device rss and 
b. other ccw changes that Cornelia requested.
c. extra note that you asked to add to mmio for referring the old naming 
convention

I guess we agree that it should be vqn.
Therefore, vq coalescing and aq series should progress with vqn terminology.
This way all 3 series can progress in parallel (aq, vq coalescing, current spec 
cleanup).

-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

[virtio-dev] Re: [virtio-comment] Re: [virtio] Re: [PATCH v10 04/10] admin: introduce virtio admin virtqueues

2023-03-08 Thread Stefan Hajnoczi

On Wed, Mar 08, 2023 at 06:19:41PM +0200, Max Gurtovoy wrote:
> 
> 
> On 08/03/2023 16:13, Stefan Hajnoczi wrote:
> > On Wed, Mar 08, 2023 at 01:17:33PM +0200, Max Gurtovoy wrote:
> > > 
> > > 
> > > On 06/03/2023 18:25, Stefan Hajnoczi wrote:
> > > > On Mon, Mar 06, 2023 at 05:28:03PM +0200, Max Gurtovoy wrote:
> > > > > 
> > > > > 
> > > > > On 06/03/2023 13:20, Stefan Hajnoczi wrote:
> > > > > > On Mon, Mar 06, 2023 at 04:00:50PM +0800, Jason Wang wrote:
> > > > > > > 
> > > > > > > 在 2023/3/6 08:03, Stefan Hajnoczi 写道:
> > > > > > > > On Sun, Mar 05, 2023 at 04:38:59AM -0500, Michael S. Tsirkin 
> > > > > > > > wrote:
> > > > > > > > > On Fri, Mar 03, 2023 at 03:21:33PM -0500, Stefan Hajnoczi 
> > > > > > > > > wrote:
> > > > > > > > > > What happens if a command takes 1 second to complete, is 
> > > > > > > > > > the device
> > > > > > > > > > allowed to process the next command from the virtqueue 
> > > > > > > > > > during this time,
> > > > > > > > > > possibly completing it before the first command?
> > > > > > > > > > 
> > > > > > > > > > This requires additional clarification in the spec because 
> > > > > > > > > > "they are
> > > > > > > > > > processed by the device in the order in which they are 
> > > > > > > > > > queued" does not
> > > > > > > > > > explain whether commands block the virtqueue (in order 
> > > > > > > > > > completion) or
> > > > > > > > > > not (out of order completion).
> > > > > > > > > Oh I begin to see. Hmm how does e.g. virtio scsi handle this?
> > > > > > > > virtio-scsi, virtio-blk, and NVMe requests may complete out of 
> > > > > > > > order.
> > > > > > > > Several may be processed by the device at the same time.
> > > > > > > > 
> > > > > > > > They rely on multi-queue for abort operations:
> > > > > > > > 
> > > > > > > > In virtio-scsi the abort requests 
> > > > > > > > (VIRTIO_SCSI_T_TMF_ABORT_TASK) are
> > > > > > > > sent on the control virtqueue. The the request identifier 
> > > > > > > > namespace is
> > > > > > > > shared across all virtqueues so it's possible to abort a 
> > > > > > > > request that
> > > > > > > > was submitted to any command virtqueue.
> > > > > > > > 
> > > > > > > > NVMe also follows the same design where abort commands are sent 
> > > > > > > > on the
> > > > > > > > Admin Submission Queue instead of an I/O Submission Queue. It's 
> > > > > > > > possible
> > > > > > > > to identify NVMe requests by  > > > > > > > Identifier>.
> > > > > > > > 
> > > > > > > > virtio-blk doesn't support aborting requests.
> > > > > > > > 
> > > > > > > > I think the logic behind this design is that if a queue gets 
> > > > > > > > stuck
> > > > > > > > processing long-running requests, then the device should not be 
> > > > > > > > forced
> > > > > > > > to perform lookahead in the queue to find abort commands. A 
> > > > > > > > separate
> > > > > > > > control/admin queue is used for the abort requests.
> > > > > > > 
> > > > > > > 
> > > > > > > Or device need mandate some kind of QOS here, e.g a request must 
> > > > > > > be complete
> > > > > > > in some time. Otherwise we don't have sufficient reliability for 
> > > > > > > using it as
> > > > > > > management task?
> > > > > > 
> > > > > > Yes, if all commands can be executed in bounded time then a 
> > > > > > guarantee is
> > > > > > possible.
> > > > > > 
> > > > > > Here is an example where that's hard: imagine a virtio-blk device 
> > > > > > backed
> > > > > > by network storage. When an admin queue command is used to delete a
> > > > > > group member, any of the group member's in-flight I/O requests need 
> > > > > > to
> > > > > > be aborted. If the network hangs while the group member is being
> > > > > > deleted, then the device can't complete an orderly shutdown of I/O
> > > > > > requests in a reasonable time.
> > > > > > 
> > > > > > That example shows a basic group admin command that I think Michael 
> > > > > > is
> > > > > > about to propose. We can't avoid this problem by not making it a 
> > > > > > group
> > > > > > admin command - it needs to be a group admin command. So I think 
> > > > > > it's
> > > > > > likely that there will be admin commands that take an unbounded 
> > > > > > amount
> > > > > > of time to complete. One way to achieve what you mentioned is 
> > > > > > timeouts.
> > > > > 
> > > > > I think that you're getting into device specific implementation 
> > > > > details and
> > > > > I'm not sure it's necessary.
> > > > > 
> > > > > I don't think we need to abort admin commands. Admin commands can be
> > > > > flushed/aborted during the device reset phase.
> > > > > Only IO commands should have the possibility to being aborted as you
> > > > > mentioned in NVMe and SCSI (and potentially in virtio-blk).
> > > > 
> > > > It's a general design issue that should be clarified now rather than
> > > > being left unspecified.
> > > > 
> > > > I'm not saying that it must be possible to abort admin commands. There
> > > > are other options, like requiring the d

[virtio-dev] Re: [virtio] Re: [virtio-comment] Re: [virtio] Re: [PATCH v10 04/10] admin: introduce virtio admin virtqueues

2023-03-08 Thread Stefan Hajnoczi

On Wed, Mar 08, 2023 at 01:57:43PM +0100, Jiri Pirko wrote:
> Wed, Mar 08, 2023 at 01:44:18PM CET, stefa...@redhat.com wrote:
> >On Wed, Mar 08, 2023 at 11:17:35AM +0100, Jiri Pirko wrote:
> >> Tue, Mar 07, 2023 at 08:03:47PM CET, stefa...@redhat.com wrote:
> >> >On Tue, Mar 07, 2023 at 04:07:54PM +0100, Jiri Pirko wrote:
> >> >> Tue, Mar 07, 2023 at 03:39:11PM CET, stefa...@redhat.com wrote:
> >> >> >On Tue, Mar 07, 2023 at 09:03:18AM +0100, Jiri Pirko wrote:
> >> >> >> Mon, Mar 06, 2023 at 07:37:31PM CET, m...@redhat.com wrote:
> >> >> >> >On Mon, Mar 06, 2023 at 06:03:40AM -0500, Stefan Hajnoczi wrote:
> >> >> >> >> On Sun, Mar 05, 2023 at 07:18:24PM -0500, Michael S. Tsirkin 
> >> >> >> >> wrote:
> >> >> >> >> > On Sun, Mar 05, 2023 at 07:03:02PM -0500, Stefan Hajnoczi wrote:
> >> >> >> >> > > On Sun, Mar 05, 2023 at 04:38:59AM -0500, Michael S. Tsirkin 
> >> >> >> >> > > wrote:
> >> >> >> >> > > > On Fri, Mar 03, 2023 at 03:21:33PM -0500, Stefan Hajnoczi 
> >> >> >> >> > > > wrote:
> >> >> >> >> > > > > What happens if a command takes 1 second to complete, is 
> >> >> >> >> > > > > the device
> >> >> >> >> > > > > allowed to process the next command from the virtqueue 
> >> >> >> >> > > > > during this time,
> >> >> >> >> > > > > possibly completing it before the first command?
> >> >> >> >> > > > > 
> >> >> >> >> > > > > This requires additional clarification in the spec 
> >> >> >> >> > > > > because "they are
> >> >> >> >> > > > > processed by the device in the order in which they are 
> >> >> >> >> > > > > queued" does not
> >> >> >> >> > > > > explain whether commands block the virtqueue (in order 
> >> >> >> >> > > > > completion) or
> >> >> >> >> > > > > not (out of order completion).
> >> >> >> >> > > > 
> >> >> >> >> > > > Oh I begin to see. Hmm how does e.g. virtio scsi handle 
> >> >> >> >> > > > this?
> >> >> >> >> > > 
> >> >> >> >> > > virtio-scsi, virtio-blk, and NVMe requests may complete out 
> >> >> >> >> > > of order.
> >> >> >> >> > > Several may be processed by the device at the same time.
> >> >> >> >> > 
> >> >> >> >> > Let's say I submit a write followed by read - is read
> >> >> >> >> > guaranteed to return an up to date info?
> >> >> >> >> 
> >> >> >> >> In general, no. The driver must wait for the write completion 
> >> >> >> >> before
> >> >> >> >> submitting the read if it wants consistency.
> >> >> >> >> 
> >> >> >> >> Stefan
> >> >> >> >
> >> >> >> >I see.  I think it's a good design to follow then.
> >> >> >> 
> >> >> >> Hmm, is it suitable to have this approach for configuration 
> >> >> >> interface?
> >> >> >> Storage device is a different beast, having parallel reads and writes
> >> >> >> makes complete sense for performance.
> >> >> >> 
> >> >> >> ->read a req
> >> >> >> ->read b req
> >> >> >> ->read c req
> >> >> >> <-read a rep
> >> >> >> <-read b rep
> >> >> >> <-read c rep
> >> >> >> 
> >> >> >> There is no dependency, even between writes.
> >> >> >> 
> >> >> >> But in case of configuration, does not make any sense to me.
> >> >> >> Why is it needed? To pass the burden of consistency of
> >> >> >> configuration to driver sounds odd at least.
> >> >> >> 
> >> >> >> I sense there is no concete idea about what the "admin virtqueue" 
> >> >> >> should
> >> >> >> serve for exactly.
> >> >> >
> >> >> >It's useful for long-running commands because they prevent other
> >> >> >commands from executing.
> >> >> >
> >> >> >An example I've given is that deleting a group member might require
> >> >> >waiting for the group member's I/O activity to finish. If that I/O
> >> >> >activity cannot be cancelled instantaneously, then it could take an
> >> >> >unbounded amount of time to delete the group member. The device would 
> >> >> >be
> >> >> >unable to process futher admin commands.
> >> >> 
> >> >> I see. Then I believe that the device should handle the dependencies.
> >> >> Example 1:
> >> >> -> REQ cmd to create group member A
> >> >> -> REQ cmd to create group member B
> >> >> <- REP cmd to create group member A
> >> >> <- REP cmd to create group member B
> >> >> 
> >> >> The device according to internal implementation can either serialize the
> >> >> 2 group member creations or do it in parallel, if it supports it.
> >> >> 
> >> >> Example 2:
> >> >> -> REQ cmd to create group member A
> >> >> -> REQ cmd config group member A
> >> >> <- REP cmd to create group member A
> >> >> <- REP cmd config group member A
> >> >> 
> >> >> Here the serialization is necessary and the device is the one to take
> >> >> care of it.
> >> >> 
> >> >> Makes sense?
> >> >
> >> >Yes, I understand. The spec would need to define ordering rules for
> >> >specific commands and the device must implement them. It allows the
> >> >driver to pipeline commands while also allowing out-of-order completion
> >> >(parallelism) in some cases. The disadvantage of this approach is
> >> >complexity in the spec and implementations.
> >> >
> >> >An alternative is unconditional out-of-order completion, where ther

[virtio-dev] Re: [virtio-comment] Re: [virtio] Re: [PATCH v10 04/10] admin: introduce virtio admin virtqueues

2023-03-08 Thread Michael S. Tsirkin

On Wed, Mar 08, 2023 at 12:15:23PM -0500, Stefan Hajnoczi wrote:
> > > > > Or we could say that admin commands must complete within bounded time,
> > > > > but I'm not sure that is implementable for some device types like
> > > > > virtio-blk, virtio-scsi, and virtiofs.
> > > > 
> > > > No we can't.
> > > > Some commands, for example FW upgrade can take 10 minutes and it's 
> > > > perfectly
> > > > fine. Other commands like setting feature bit will take 1 millisec.
> > > > Each device implements commands in a different internal logic so we 
> > > > can't
> > > > expect to complete after X time.
> > > 
> > > When I say bounded time, I mean that it finishes in a finite amount of
> > > time. I'm not saying there is a specific time X that all device
> > > implementations must satisfy. Unbounded means it might never finish.
> > 
> > There might be a chance that any command for any virtio device type will
> > never finish. Nothing new here in the adminq.
> > 
> > what one can do is to set a timeout for himself and if this timeout expire -
> > check the device status. If it needs_reset - do a reset. if status is ok,
> > then wait some more time.
> > After X retries, unmap buffers or reset the adminq.
> 
> Michael: What effect does resetting the group owner device have on group
> member devices?

virtio level reset? It's a good question. I'd expect them all to be
reset no?

> I'm concerned that this approach disrupts all group member devices. For
> example, you try to add a new device but the command hangs. In order to
> recover you now have to reset the group owner device and this breaks all
> the group member devices.


I agree. How about a VQ level reset though? Seems like exactly
what's needed here?


> > > 
> > > > Device can go to so FATAL state in case a command is stuck and causing
> > > > internal errors in it.
> > > > 
> > > > > 
> > > > > > For your example, stopping a member is possible even it there are 
> > > > > > some
> > > > > > errors in the network. You can for example destroy all the 
> > > > > > connections to
> > > > > > the remote target and complete all the BIOS with some error.
> > > > > 
> > > > > Forgetting about in-flight requests doesn't necessarily make them go
> > > > > away. It creates a race between forgotten requests and reconnection. 
> > > > > In
> > > > > the worst case a forgotten write request takes effect after
> > > > > reconnection, causing data corruption.
> > > > 
> > > > For making it work without data corruption we need a cooperation of the
> > > > target side for sure. But this is fine since the target in that case is 
> > > > part
> > > > of the "virtio-blk backend".
> > > > One solution is that the target can decide it will flush all the 
> > > > requests to
> > > > the storage device before accepting new connections.
> > > 
> > > This solution shifts the unbounded time from disconnection to
> > > connection. The Group Member Delete command will complete quickly but a
> > > subsequent Group Member Create command for the same underlying storage
> > > device would need to wait until the requests are done.
> > > 
> > > Therefore I think the admin queue must be designed under the assumption
> > > that some commands take a very long time.
> > 
> > For sure an admin command may take long time. FW upgrade can take 10 minutes
> > for example.
> > But each device is free to implement internal logic as he choose.
> > 
> > Same for live migration, when we stop/quiesce a device we must make sure it
> > doesn't master any DMA operations. Thus, in some implementations we need to
> > wait for all inflights to end fast. In others, we can invalidate the access
> > to host/guest memory and wait for completions until the freeze state.
> > 
> > Bottom line, this is device implementation specific consideration.
> 
> What I'm asking is that the spec clarifies the command completion order
> semantics (in-order or out-of-order), whether there is a mechanism to
> abort commands, etc.
> 
> Device implementers can then take advantage of those aspects to
> implement devices that don't hang (e.g. health monitoring becomes
> unavailable when there is a long running command).
> 
> If the spec doesn't cover this, then device implementers will not be
> able to work around it when implementing standard commands like
> create/delete group member.
> 
> Does that make sense?
> 
> Stefan

It does, to me.



-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

[virtio-dev] Re: [virtio-comment] [PATCH v10 00/10] Introduce device group and device management

2023-03-08 Thread Michael S. Tsirkin

On Wed, Mar 08, 2023 at 05:30:48PM +0100, Cornelia Huck wrote:
> On Mon, Mar 06 2023, "Michael S. Tsirkin"  wrote:
> 
> > On Mon, Mar 06, 2023 at 01:29:30PM +0100, Jiri Pirko wrote:
> >> Thu, Mar 02, 2023 at 02:04:48PM CET, m...@redhat.com wrote:
> >> 
> >> [...]
> >> 
> >> >
> >> >TODO (maybe?) - probably ok to defer until this part is upstream:
> >> >
> >> >  Add "all members" member id.
> >> >
> >> >  Add commands for MSI, feature discovery.
> >> >
> >> >  Add commands for transport vq.
> >> >
> >> >
> >> >My intent is to try and support both SR-IOV and SIOV
> >> >usecases with the same structure and maybe even the same
> >> >VQ.
> >> 
> >> Sorry to be late to the party, I'm trying to catch up. Is it common to
> >> have cover letter for features this brief? I mean, from the cover
> >> letter, I'm totally unable to understand what you are introducing here.
> >> 
> >> Could you elaborate about what you are aiming to achive with this?
> >> Could you shed some usecases perhaps?
> >> 
> >> I have to be missing something obvious, but I don't get why any notion
> >> of SR-IOV could be beneficial for virtio.
> >> 
> >
> > Good point, I'll add a bit of motivation.
> >
> > For SR-IOV, it is not unusual for PFs to excercise control over VFs.
> > There is interest in the community to include an interface to allow this
> > in the virtio spec, when the PF is a virtio device.  This is what this
> > patch does.
> 
> Unfortunately, information in the cover letter does not make it into
> git -- should things like that go into the github issue (and into the
> ballot?) It's useful both for reviewing (cover letter) and understanding
> the rationale later (github/ballot).

Well the github issue has a link to the archives, but sure it's
possible to copy that.

-- 
MST


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

[virtio-dev] Re: [virtio-comment] [PATCH v10] virtio-net: support the virtqueue coalescing moderation

2023-03-08 Thread Michael S. Tsirkin

On Wed, Mar 08, 2023 at 04:42:25PM +, Parav Pandit wrote:
> 
> > From: Michael S. Tsirkin 
> > Sent: Monday, March 6, 2023 5:57 PM
> > 
> > On Thu, Mar 02, 2023 at 11:36:18AM +, David Edmondson wrote:
> > > > +for an enabled 
> > > > transmit/receive virtqueue whose
> > number is \field{vqn}.
> > >
> > > Should this now be "whose index is \field{vqn}"?
> > 
> > Ugh.  I guess we'll have to fix the number/index mess in the spec first. 
> > Parav
> > you said you are looking into it?
> 
> Yes, I want to send v1 for " Rename queue index to queue number" series.
> V1 will include,
>  
> a. the vqn changes for net device rss and 
> b. other ccw changes that Cornelia requested.
> c. extra note that you asked to add to mmio for referring the old naming 
> convention
> 
> I guess we agree that it should be vqn.
> Therefore, vq coalescing and aq series should progress with vqn terminology.
> This way all 3 series can progress in parallel (aq, vq coalescing, current 
> spec cleanup).

Don't much care at this point but I think when I thought about this
deeply it seemed that yes, it is marginally better. Don't remember why
though ;)

-- 
MST


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

[virtio-dev] Re: [virtio-comment] Re: [virtio] [PATCH v10 04/10] admin: introduce virtio admin virtqueues

2023-03-08 Thread David Edmondson

Max Gurtovoy  writes:

> On 08/03/2023 14:08, Jiri Pirko wrote:
>> Wed, Mar 08, 2023 at 12:50:48PM CET, m...@redhat.com wrote:
>>> On Wed, Mar 08, 2023 at 11:05:00AM +0100, Jiri Pirko wrote:
 Tue, Mar 07, 2023 at 05:30:18PM CET, m...@redhat.com wrote:
> On Tue, Mar 07, 2023 at 08:36:41AM +0100, Jiri Pirko wrote:
>> Hmm, if not for now, the future exension would not be so simple, I fear.
>
> Without knowing what it is I can't say.

 Yep, so basically you say, for other things if they appear,
 let's introduce another queue type? If yes, sounds fair to me.
>>>
>>> Yes. For example I find it likely that live migration/failover support
>>> will require a queue where driver pre-adds buffers and then device
>>> supplies information as state changes.
>> 
>> I see. So there would be a queue called for example "child state virtqueue"
>> or something like that for the sole purpose of getting the state of VF?
>> Hmm, wouldn't it make more sense to have this done as a part of "group
>> administrarion queue"? I mean, there is already notion of addresing
>> child/VF here. So from my perspective, it is just another "group
>> administration" command.
>
> For sure VF Live Migration, MSIX config of VF, VF feature bits config 
> and others should be admin commands on admin vq.
> I don't see any reason introducing another type of admin-like vq.
> Also we don't need to have multiple admin vqs. This AQ is not aimed for 
> performance.

In support of live migration, might we end up moving large amounts of
device state through the admin queue?

If so, that would seem to have some performance requirements, though I
don't know if it would justify multiple admin queues.
-- 
Tonight I think I'll walk alone, I'll find my soul as I go home.

-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

[virtio-dev] Re: [virtio-comment] [PATCH v10 00/10] Introduce device group and device management

2023-03-08 Thread Jiri Pirko

Wed, Mar 08, 2023 at 05:30:48PM CET, coh...@redhat.com wrote:
>On Mon, Mar 06 2023, "Michael S. Tsirkin"  wrote:
>
>> On Mon, Mar 06, 2023 at 01:29:30PM +0100, Jiri Pirko wrote:
>>> Thu, Mar 02, 2023 at 02:04:48PM CET, m...@redhat.com wrote:
>>> 
>>> [...]
>>> 
>>> >
>>> >TODO (maybe?) - probably ok to defer until this part is upstream:
>>> >
>>> >   Add "all members" member id.
>>> >
>>> >   Add commands for MSI, feature discovery.
>>> >
>>> >   Add commands for transport vq.
>>> >
>>> >
>>> >My intent is to try and support both SR-IOV and SIOV
>>> >usecases with the same structure and maybe even the same
>>> >VQ.
>>> 
>>> Sorry to be late to the party, I'm trying to catch up. Is it common to
>>> have cover letter for features this brief? I mean, from the cover
>>> letter, I'm totally unable to understand what you are introducing here.
>>> 
>>> Could you elaborate about what you are aiming to achive with this?
>>> Could you shed some usecases perhaps?
>>> 
>>> I have to be missing something obvious, but I don't get why any notion
>>> of SR-IOV could be beneficial for virtio.
>>> 
>>
>> Good point, I'll add a bit of motivation.
>>
>> For SR-IOV, it is not unusual for PFs to excercise control over VFs.
>> There is interest in the community to include an interface to allow this
>> in the virtio spec, when the PF is a virtio device.  This is what this
>> patch does.
>
>Unfortunately, information in the cover letter does not make it into
>git -- should things like that go into the github issue (and into the

Why's that? Can't you have a merge commit to include the cover letter,
as we have it in netdev for example?


>ballot?) It's useful both for reviewing (cover letter) and understanding
>the rationale later (github/ballot).
>

-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

[virtio-dev] Re: [virtio-comment] Re: [virtio] [PATCH v10 04/10] admin: introduce virtio admin virtqueues

2023-03-08 Thread Jiri Pirko

Wed, Mar 08, 2023 at 07:01:02PM CET, david.edmond...@oracle.com wrote:
>Max Gurtovoy  writes:
>
>> On 08/03/2023 14:08, Jiri Pirko wrote:
>>> Wed, Mar 08, 2023 at 12:50:48PM CET, m...@redhat.com wrote:
 On Wed, Mar 08, 2023 at 11:05:00AM +0100, Jiri Pirko wrote:
> Tue, Mar 07, 2023 at 05:30:18PM CET, m...@redhat.com wrote:
>> On Tue, Mar 07, 2023 at 08:36:41AM +0100, Jiri Pirko wrote:
>>> Hmm, if not for now, the future exension would not be so simple, I fear.
>>
>> Without knowing what it is I can't say.
>
> Yep, so basically you say, for other things if they appear,
> let's introduce another queue type? If yes, sounds fair to me.

 Yes. For example I find it likely that live migration/failover support
 will require a queue where driver pre-adds buffers and then device
 supplies information as state changes.
>>> 
>>> I see. So there would be a queue called for example "child state virtqueue"
>>> or something like that for the sole purpose of getting the state of VF?
>>> Hmm, wouldn't it make more sense to have this done as a part of "group
>>> administrarion queue"? I mean, there is already notion of addresing
>>> child/VF here. So from my perspective, it is just another "group
>>> administration" command.
>>
>> For sure VF Live Migration, MSIX config of VF, VF feature bits config 
>> and others should be admin commands on admin vq.
>> I don't see any reason introducing another type of admin-like vq.
>> Also we don't need to have multiple admin vqs. This AQ is not aimed for 
>> performance.
>
>In support of live migration, might we end up moving large amounts of
>device state through the admin queue?

Sounds to me that would be the case, yes. Depends on device
implementation, but could be potentially big. We should make sure it
scales good.


>
>If so, that would seem to have some performance requirements, though I
>don't know if it would justify multiple admin queues.

Well, I can imagine when migrating a lots of VFs at the same time (host
shutdown), multiqueue might definitelly help.



>-- 
>Tonight I think I'll walk alone, I'll find my soul as I go home.

-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

[virtio-dev] Re: [virtio] Re: [PATCH v10 06/10] mmio: document ADMIN_VQ as reserved

2023-03-08 Thread Jiri Pirko

Wed, Mar 08, 2023 at 05:30:21PM CET, m...@redhat.com wrote:
>On Tue, Mar 07, 2023 at 06:52:03PM +, Parav Pandit wrote:
>> > And I wonder whether it's worth it - it definitely makes contributing to 
>> > Linux
>> > harder, and even within Linux it pushes contributors away. 
>> The number of virtio spec contributors are in order of magnitude less
>> than Linux kernel or just the Linux netdev.  Patch split up reduces
>> lot of time author and reviewers.  For example, interrupt moderation
>> at v10 can be very easily split up where prep patch can have RB tag
>> and v11 doesn't need reviewers and author's time.  Your work here with
>> 10 patches is yet another good example of it.  I remember Max and I
>> started with 6 patches... and current 10 are more useful way to split
>> them.
>
>It's a big project though. All I ask is that we try to be gentler
>than netdev to new contributors. If there's a 10 line patch it is
>not necessary to be pedantic and ask for it to be split up to
>a 3 patch series.

One patch per change quite common everywhere. IDK, I don't see why that
would be a problem.


>
>-- 
>MST
>
>
>-
>To unsubscribe from this mail list, you must leave the OASIS TC that 
>generates this mail.  Follow this link to all your TCs in OASIS at:
>https://www.oasis-open.org/apps/org/workgroup/portal/my_workgroups.php 
>

-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

[virtio-dev] Re: [virtio-comment] Re: [PATCH v10 04/10] admin: introduce virtio admin virtqueues

2023-03-08 Thread Michael S. Tsirkin

On Wed, Mar 08, 2023 at 10:58:59PM +0200, Max Gurtovoy wrote:
> We have a non-spec prototypes for LM so please use our experience in this
> area.

Don't see what the fuss is about.  Nothing will prevent your device from
using a single queue if that is what you want.

-- 
MST


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

[virtio-dev] RE: [virtio-comment] Re: [PATCH v10 04/10] admin: introduce virtio admin virtqueues

2023-03-08 Thread Parav Pandit

> From: Michael S. Tsirkin 
> Sent: Wednesday, March 8, 2023 4:09 PM
> 
> On Wed, Mar 08, 2023 at 10:58:59PM +0200, Max Gurtovoy wrote:
> > We have a non-spec prototypes for LM so please use our experience in
> > this area.
> 
> Don't see what the fuss is about.  Nothing will prevent your device from 
> using a
> single queue if that is what you want.

I think the use case is not clear in the cover letter on how/when multiple aq 
are useful or will be used.

I agree with you on this single queue should be enough provided a single queue 
suffices below design aspects:

The design:
a. to have one aq to support multiple outstanding commands
b. ability to execute unrelated aq commands by device in parallel
c. driver to not issue related commands to the AQ (because they are executed 
out of order by the device)
(Assumption is IN_ORDER is not negotiated).

Above the actual use case requirements/expectations from the AQ.

The current version indicates that the device will execute commands in-order, 
hence multi-queue is needed to achieve parallelism - is not serving the purpose 
of "q" notion fully.
Or do I miss reading v10?

-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

[virtio-dev] RE: [virtio-comment] Re: [virtio] [PATCH v10 04/10] admin: introduce virtio admin virtqueues

2023-03-08 Thread Parav Pandit



> From: virtio-comm...@lists.oasis-open.org  open.org> On Behalf Of David Edmondson

> In support of live migration, might we end up moving large amounts of device
> state through the admin queue?
> 
Correct.

> If so, that would seem to have some performance requirements, though I don't
> know if it would justify multiple admin queues.
DMA of the data through the proposed AQ is supported.

If I understood Max correctly when Max said " This AQ is not aimed for 
performance ",
he means that AQ doesn't have performance requirements as io/network queues to 
complete millions of ops/sec.

it is several hundred to maybe (on the higher side) thousand ops/sec during LM, 
provisioning use case.

DMA perspective as you mentioned, AQ still has the same perf requirements as 
that of regular nw/blk io queues.

-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

[virtio-dev] RE: [virtio-comment] Re: [virtio] [PATCH v10 04/10] admin: introduce virtio admin virtqueues

2023-03-08 Thread Parav Pandit



> From: Jiri Pirko 
> Sent: Wednesday, March 8, 2023 5:05 AM

> >For example a common feature is to program a vlan and have device put a
> >given VF inside this vlan.
> 
> I don't follow entirely. The way how the VF is connected to network should be
> ouf of the scope of this interface. The eswitch manager should take care. What
> you say sounds awfully like the "ip vf" legacy interface, which should not be
> considered here I believe.
> 
> If PF would be the eswitch manager, there are other means to do network
> programming, using eswitch port representors. But I don't think this is the 
> can
> of worms we want to open now. I don't think we have a usecase for it 
> currently.
> Am I wrong Parav?
> 
> 
We want the ability to program/provision the virtio feature bits and virtio 
config space parameters of the VF through PF.
These are host-facing parameters that usually migrate from one to another host 
when a VF migrates.

A future device may even do this out of band, but we are far from it.
At that point in the future virtio management device can take birth and do 
things over it.

ip vf was legacy interface and it is not applicable here.
virtio net doesn't implement it either and I don't see any user asking for it 
either.
All users have migrated to using tc mechanism for long time now.

-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

RE: [virtio-dev] RE: [PATCH v10] virtio-net: support the virtqueue coalescing moderation

2023-03-08 Thread Parav Pandit


> From: virtio-dev@lists.oasis-open.org  On
> Behalf Of Heng Qi
> Sent: Thursday, March 2, 2023 10:27 PM
> 
> > I remember we discussed that instead of mentioning each individual field,
> better to describe the whole structure being read-only or write-only.
> 
> Consider the following scenarios:
> 1. A read-only field of the structure virtio_net_ctrl_coal is extended for
> CTRL_NOTF_COAL_RX/TX_SET to get some extra information 
A set command cannot extend the struct virtio_net_ctrl_coal, particularly for 
read-only and partially for write-only.
This would mean that for the tiny number of bytes, an additional dma descriptor 
is to be allocated with read/write-only permissions.
It would be inefficient for the driver to do so for the SET command to have vqn 
as write-only, reserved as read-only, rest fields as write-only dma attributes.

As I think more, I think the whole set command fields should be read-only for 
device. Better to describe it this way instead of splitting individual fields.
This way driver can just do a single DMA allocation with read-only attributes 
for set cmd.

Get command doesn’t have any choice today the way CVQ is structured to it lives 
with the limitation.

> > Looks good, however you have well covered in the device normative
> statements.
> > So possibly it can be removed from here.
> 
> I tend to keep this, as we have done in the past, we can have normative
> descriptions and the corresponding non-normative descriptions.
> 
Ok. but please revisit if the description can be simpler text than the 
normative lines.

[virtio-dev] Re: [virtio-comment] RE: [virtio-dev] RE: [PATCH v10] virtio-net: support the virtqueue coalescing moderation

2023-03-08 Thread Heng Qi





在 2023/3/9 上午6:30, Parav Pandit 写道:

From: virtio-dev@lists.oasis-open.org  On
Behalf Of Heng Qi
Sent: Thursday, March 2, 2023 10:27 PM


I remember we discussed that instead of mentioning each individual field,

better to describe the whole structure being read-only or write-only.

Consider the following scenarios:
1. A read-only field of the structure virtio_net_ctrl_coal is extended for
CTRL_NOTF_COAL_RX/TX_SET to get some extra information

A set command cannot extend the struct virtio_net_ctrl_coal, particularly for 
read-only and partially for write-only.
This would mean that for the tiny number of bytes, an additional dma descriptor 
is to be allocated with read/write-only permissions.
It would be inefficient for the driver to do so for the SET command to have vqn 
as write-only, reserved as read-only, rest fields as write-only dma attributes.

As I think more, I think the whole set command fields should be read-only for 
device. Better to describe it this way instead of splitting individual fields.
This way driver can just do a single DMA allocation with read-only attributes 
for set cmd.

Get command doesn’t have any choice today the way CVQ is structured to it lives 
with the limitation.


I think it is reasonable and will be revised in the next version.




Looks good, however you have well covered in the device normative

statements.

So possibly it can be removed from here.

I tend to keep this, as we have done in the past, we can have normative
descriptions and the corresponding non-normative descriptions.


Ok. but please revisit if the description can be simpler text than the 
normative lines.


Ok.

Thanks.



-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

Re: [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [PATCH v9] virtio-net: support inner header hash

2023-03-08 Thread Heng Qi





在 2023/3/8 下午10:39, Michael S. Tsirkin 写道:

On Wed, Mar 01, 2023 at 10:56:31AM +0800, Heng Qi wrote:


在 2023/2/28 下午7:16, Michael S. Tsirkin 写道:

On Sat, Feb 18, 2023 at 10:37:15PM +0800, Heng Qi wrote:

If the tunnel is used to encapsulate the packets, the hash calculated
using the outer header of the receive packets is always fixed for the
same flow packets, i.e. they will be steered to the same receive queue.

Wait a second. How is this true? Does not everyone stick the
inner header hash in the outer source port to solve this?

Yes, you are right. That's what we did before the inner header hash, but it
has a performance penalty, which I'll explain below.


For example geneve spec says:

 it is necessary for entropy from encapsulated packets to be
 exposed in the tunnel header.  The most common technique for this is
 to use the UDP source port

The end point of the tunnel called the gateway (with DPDK on top of it).

1. When there is no inner header hash, entropy can be inserted into the udp
src port of the outer header of the tunnel,
and then the tunnel packet is handed over to the host. The host needs to
take out a part of the CPUs to parse the outer headers (but not drop them)
to calculate the inner hash for the inner payloads,
and then use the inner
hash to forward them to another part of the CPUs that are responsible for
processing.

I don't get this part. Leave inner hashes to the guest inside the
tunnel, why is your host doing this?


Assuming that the same flow includes a unidirectional flow a->b, or a 
bidirectional flow a->b and b->a,

such flow may be out of order when processed by the gateway(DPDK):

1. In unidirectional mode, if the same flow is switched to another 
gateway for some reason, resulting in different outer IP address,
    then this flow may be processed by different CPUs after reaching 
the host if there is no inner hash. So after the host receives the
    flow, first use the forwarding CPUs to parse the inner hash, and 
then use the hash to ensure that the flow is processed by the

    same CPU.
2. In bidirectional mode, a->b flow may go to gateway 1, and b->a flow 
may go to gateway 2. In order to ensure that the same flow is
    processed by the same CPU, we still need the forwarding CPUs to 
parse the real inner hash(here, the hash key needs to be replaced with a 
symmetric hash key).





1). During this process, the CPUs on the host is divided into two parts, one
part is used as a forwarding node to parse the outer header,
  and the CPU utilization is low. Another part handles packets.

Some overhead is clearly involved in *sending* packets -
to calculate the hash and stick it in the port number.
This is, however, a separate problem and if you want to
solve it then my suggestion would be to teach the *transmit*
side about GRE offloads, so it can fill the source port in the card.


2). The entropy of the source udp src port is not enough, that is, the queue
is not widely distributed.

how isn't it enough? 16 bit is enough to cover all vqs ...


A 5-tuple brings more entropy than a single port, doesn't it? In fact, 
the inner hash of the physical network card used by
the business team is indeed better than the udp port number of the outer 
header we modify now, but they did not give me the data.



2. When there is an inner header hash, the gateway will directly help parse
the outer header, and use the inner 5 tuples to calculate the inner hash.
The tunneled packet is then handed over to the host.
1) All the CPUs of the host are used to process data packets, and there is
no need to use some CPUs to forward and parse the outer header.

You really have to parse the outer header anyway,
otherwise there's no tunneling.
Unless you want to teach virtio to implement tunneling
in hardware, which is something I'd find it easier to
get behind.


There is no need to parse the outer header twice, because we use shared 
memory.



2) The entropy of the original quintuple is sufficient, and the queue is
widely distributed.

It's exactly the same entropy, why would it be better? In fact you
are taking out the outer hash entropy making things worse.


I don't get the point, why the entropy of the inner 5-tuple and the 
outer tunnel header is the same,

multiple streams have the same outer header.

Thanks.



Thanks.

same goes for vxlan did not check further.

so what is the problem?  and which tunnel types actually suffer from the
problem?



This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscr...@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscr...@lists.oasis-open.org
List help: virtio-comment-h...@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/w

[virtio-dev] Re: [virtio-comment] Re: [PATCH v10 03/10] admin: introduce group administration commands

2023-03-08 Thread Michael S. Tsirkin

On Thu, Mar 09, 2023 at 02:29:31AM +0200, Max Gurtovoy wrote:
> please use "Co-developed-by:" for this patch

Sure.

-- 
MST


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

Re: [virtio-dev] Re: [virtio-comment] Re: [PATCH v10 04/10] admin: introduce virtio admin virtqueues

2023-03-08 Thread Jiri Pirko

Wed, Mar 08, 2023 at 10:09:22PM CET, m...@redhat.com wrote:
>On Wed, Mar 08, 2023 at 10:58:59PM +0200, Max Gurtovoy wrote:
>> We have a non-spec prototypes for LM so please use our experience in this
>> area.
>
>Don't see what the fuss is about.  Nothing will prevent your device from
>using a single queue if that is what you want.

Yeah. Plus designing the thing multiq in the spec from start is
future proof. I second that.


>
>-- 
>MST
>
>
>-
>To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
>For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org
>

-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

[virtio-dev] Re: [virtio-comment] Re: [virtio] [PATCH v10 04/10] admin: introduce virtio admin virtqueues

2023-03-08 Thread Jiri Pirko

Wed, Mar 08, 2023 at 10:25:32PM CET, pa...@nvidia.com wrote:
>
>> From: virtio-comm...@lists.oasis-open.org > open.org> On Behalf Of David Edmondson
>
>> In support of live migration, might we end up moving large amounts of device
>> state through the admin queue?
>> 
>Correct.
>
>> If so, that would seem to have some performance requirements, though I don't
>> know if it would justify multiple admin queues.
>DMA of the data through the proposed AQ is supported.
>
>If I understood Max correctly when Max said " This AQ is not aimed for 
>performance ",
>he means that AQ doesn't have performance requirements as io/network queues to 
>complete millions of ops/sec.
>
>it is several hundred to maybe (on the higher side) thousand ops/sec during 
>LM, provisioning use case.

But isn't it good to design it for performance from start? I mean, state
transfer of thousands of VFs at a time is definitelly performance
related, isn't it?


>
>DMA perspective as you mentioned, AQ still has the same perf requirements as 
>that of regular nw/blk io queues.

-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

[virtio-dev] Re: [virtio-comment] Re: [virtio] [PATCH v10 04/10] admin: introduce virtio admin virtqueues

2023-03-08 Thread Jiri Pirko

Wed, Mar 08, 2023 at 10:45:18PM CET, pa...@nvidia.com wrote:
>
>> From: Jiri Pirko 
>> Sent: Wednesday, March 8, 2023 5:05 AM
>
>> >For example a common feature is to program a vlan and have device put a
>> >given VF inside this vlan.
>> 
>> I don't follow entirely. The way how the VF is connected to network should be
>> ouf of the scope of this interface. The eswitch manager should take care. 
>> What
>> you say sounds awfully like the "ip vf" legacy interface, which should not be
>> considered here I believe.
>> 
>> If PF would be the eswitch manager, there are other means to do network
>> programming, using eswitch port representors. But I don't think this is the 
>> can
>> of worms we want to open now. I don't think we have a usecase for it 
>> currently.
>> Am I wrong Parav?
>> 
>> 
>We want the ability to program/provision the virtio feature bits and virtio 
>config space parameters of the VF through PF.
>These are host-facing parameters that usually migrate from one to another host 
>when a VF migrates.

Here is where we lack to have separate "devlink function" entity,
independend of "devlink port" :/ I mean, for MAC config and irq vectors.
What other host facing parmeters you have in mind. VLAN of "ip vf" is
definitelly one of them, I have to emphasize.


>
>A future device may even do this out of band, but we are far from it.
>At that point in the future virtio management device can take birth and do 
>things over it.
>
>ip vf was legacy interface and it is not applicable here.

Agreed, unacceptable to implement "ip vf" legacy interfaces in
virtio_net from my perspective.


>virtio net doesn't implement it either and I don't see any user asking for it 
>either.
>All users have migrated to using tc mechanism for long time now.

Okay, so the bottom line is, we should take care of host facing config
of VF, but we should not care about network plug of VF - that stays out
of the scope.

-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

[virtio-dev] Re: [virtio] Re: [virtio-comment] [PATCH v10 00/10] Introduce device group and device management

2023-03-08 Thread Jiri Pirko

Tue, Mar 07, 2023 at 06:20:03PM CET, m...@redhat.com wrote:
>On Tue, Mar 07, 2023 at 08:21:54AM +0100, Jiri Pirko wrote:
>> Mon, Mar 06, 2023 at 11:54:45PM CET, m...@redhat.com wrote:
>> >On Mon, Mar 06, 2023 at 01:29:30PM +0100, Jiri Pirko wrote:
>> >> Thu, Mar 02, 2023 at 02:04:48PM CET, m...@redhat.com wrote:
>> >> 
>> >> [...]
>> >> 
>> >> >
>> >> >TODO (maybe?) - probably ok to defer until this part is upstream:
>> >> >
>> >> > Add "all members" member id.
>> >> >
>> >> > Add commands for MSI, feature discovery.
>> >> >
>> >> > Add commands for transport vq.
>> >> >
>> >> >
>> >> >My intent is to try and support both SR-IOV and SIOV
>> >> >usecases with the same structure and maybe even the same
>> >> >VQ.
>> >> 
>> >> Sorry to be late to the party, I'm trying to catch up. Is it common to
>> >> have cover letter for features this brief? I mean, from the cover
>> >> letter, I'm totally unable to understand what you are introducing here.
>> >> 
>> >> Could you elaborate about what you are aiming to achive with this?
>> >> Could you shed some usecases perhaps?
>> >> 
>> >> I have to be missing something obvious, but I don't get why any notion
>> >> of SR-IOV could be beneficial for virtio.
>> >> 
>> >
>> >Good point, I'll add a bit of motivation.
>> >
>> >For SR-IOV, it is not unusual for PFs to excercise control over VFs.
>> 
>> I understand the concepts of SR-IOV. Yet I fail to see the need of such
>> concept in virtio world. SR-IOV is very specific solution for PCI
>> functions instantiation, and I believe that it is already considered
>> quite limiting in many aspects. Does not make sense to me to introduce
>> it for virtio. But again, I may be missing something crucial, I just
>> would like to see the motivation, needs, usecases for this crystal
>> clear, which is opposite to the current cover letter I'm afraid :/
>
>First people are asking for it because it's out there, however limiting
>it is. In fact Nvidia is - why don't you talk to Parav here and tell
>him that SR-IOV is legacy and there's no need to support.

Yep, I talked to Parav, makes much more sense now for me as I now
understand the motivation. I was not able to reach there just by reading
the patchset :/


>
>
>
>
>> 
>> >There is interest in the community to include an interface to allow this
>> >in the virtio spec, when the PF is a virtio device.  This is what this
>> >patch does.
>> 
>> Yeah, but why? As I asked before, what are the usecases? The fact there
>> is interest in the community does not mean it makes sense to have it :)
>> 
>
>If people want to build such hardware it will need some interface.
>Far better to have it standard.
>
>
>But generally e.g. intel already said they will reuse this
>same structure with a different group type for SIOV support.
>I'll mention this in the cover letter.

Cool.


>
>> >
>> >
>> >
>> >
>> >> >
>> >> >For example, it might make sense to split creating/destroying
>> >> >SIOV devices from the transport passing data from the guest - the
>> >> >driver would then not negotiate VIRTIO_F_SR_IOV (which
>> >> >then means auto-provisioning).
>> >> >
>> >> >This is out of RFC, since we have two commands which are useful
>> >> >to discover supported group types (ATM can be none or SR-IOV).
>> >> >
>> >> >
>> >> >Michael S. Tsirkin (10):
>> >> >  virtio: document forward compatibility guarantees
>> >> >  admin: introduce device group and related concepts
>> >> >  admin: introduce group administration commands
>> >> >  admin: introduce virtio admin virtqueues
>> >> >  pci: add admin vq registers to virtio over pci
>> >> >  mmio: document ADMIN_VQ as reserved
>> >> >  ccw: document ADMIN_VQ as reserved
>> >> >  admin: command list discovery
>> >> >  admin: conformance clauses
>> >> >  ccw: document more reserved features
>> >> >
>> >> > admin.tex| 540 +++
>> >> > content.tex  | 112 +-
>> >> > introduction.tex |   3 +
>> >> > 3 files changed, 653 insertions(+), 2 deletions(-)
>> >> > create mode 100644 admin.tex
>> >> >
>> >> >-- 
>> >> >MST
>> >> >
>> >> >
>> >> >This publicly archived list offers a means to provide input to the
>> >> >OASIS Virtual I/O Device (VIRTIO) TC.
>> >> >
>> >> >In order to verify user consent to the Feedback License terms and
>> >> >to minimize spam in the list archive, subscription is required
>> >> >before posting.
>> >> >
>> >> >Subscribe: virtio-comment-subscr...@lists.oasis-open.org
>> >> >Unsubscribe: virtio-comment-unsubscr...@lists.oasis-open.org
>> >> >List help: virtio-comment-h...@lists.oasis-open.org
>> >> >List archive: https://lists.oasis-open.org/archives/virtio-comment/
>> >> >Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
>> >> >List Guidelines: 
>> >> >https://www.oasis-open.org/policies-guidelines/mailing-lists
>> >> >Committee: https://www.oasis-open.org/committees/virtio/
>> >> >Join OASIS: https://www.oasis-open.org/join/
>> >> >
>> >> 
>> >> This publicly archived lis

[virtio-dev] Re: [virtio-comment] Re: [virtio] [PATCH v10 04/10] admin: introduce virtio admin virtqueues

2023-03-08 Thread Jiri Pirko

Tue, Mar 07, 2023 at 05:30:18PM CET, m...@redhat.com wrote:
>On Tue, Mar 07, 2023 at 08:36:41AM +0100, Jiri Pirko wrote:
>> Hmm, if not for now, the future exension would not be so simple, I fear.
>
>Without knowing what it is I can't say.

Yep, so basically you say, for other things if they appear,
let's introduce another queue type? If yes, sounds fair to me.


>
>> 
>> >
>> >Passing commands to devices themselves is already covered in spec
>> >reasonably well though not in a generic way.
>> 
>> You mean using the control queue, correct?
>
>Depends on the device type. network devices have a control queue, yes.
>
>> >From one of the patch description of this patchset I understand that you
>> cannot use control queue for this because control queue is
>> device-specific, yet group control is device-agnostic.
>> 
>> My undestanding therefore was, that the admin queue you are introducing
>> serves as a generic carrier for device-agnostic commands, in parallel
>> for having control queue serving as a generic carrier of device-specific
>> commands. If this is not the case, I think it would be nice to describe
>> the exact monivation and scope of admin queue.
>
>Nope unfortunately.  This queue is just a carrier for admin commands.
>admin commands are commands that talk to one device about other
>devices. There's clearly no mechanism in the spec to do that,
>so we plug this hole.

Okay, in that case "admin" sounds a bit misleading as for me it
implicates that this is for "administration" of the device. Yet is is
for the administration of other devices (slaves).

Perhaps there could be different term used to clarify?
Group leader virtqueue?
Group owner virtqueue?
Group master virtqueue?

>
>
>
>> 
>> >
>> >What we lack is passing commands about one device to another device.
>> >E.g. control VFs through PFs.
>> 
>> Could you provide examples of such commands please?
>
>For example a common feature is to program a vlan and have device
>put a given VF inside this vlan.

I don't follow entirely. The way how the VF is connected to network
should be ouf of the scope of this interface. The eswitch manager should
take care. What you say sounds awfully like the "ip vf" legacy
interface, which should not be considered here I believe.

If PF would be the eswitch manager, there are other means to do network
programming, using eswitch port representors. But I don't think this is
the can of worms we want to open now. I don't think we have a usecase
for it currently. Am I wrong Parav?



>
>In a virtualization scenario host controls this vlan programming giving
>the network a measure of protection from VFs.  If a VF is passed through
>to a VM, IOMMU limits VFs to only access guest memory so host has to do
>this programming through a PF.

Understood. This really looks like "ip vf" legacy. I strongly believe
it should not be supported.

Any other commands you have in mind?


>
>
>
>> 
>> >This is what groups do.
>> >But if we see more uses we can always add them.
>> >
>> >
>> >I'd rather avoid being too generic though.
>> 
>> In that case, why not to avoid using generic terms and stay
>> "group-centric"? What I mean is:
>> "Administration Virtqueues" -> "Group Administration Virtqueues"
>> "struct virtio_admin_cmd" -> "struct virtio_group_admin_cmd"
>> 
>> Etc. Helps to avoid confusion.
>
>Sure, I tried to do that but missed some opportunities.
>Will address.

Cool.


>
>> 
>> >
>> >
>> >
>> >
>> >> 
>> >> >+than one administration virtqueue.
>> 
>> 
>> [...]
>> 
>
>
>This publicly archived list offers a means to provide input to the
>OASIS Virtual I/O Device (VIRTIO) TC.
>
>In order to verify user consent to the Feedback License terms and
>to minimize spam in the list archive, subscription is required
>before posting.
>
>Subscribe: virtio-comment-subscr...@lists.oasis-open.org
>Unsubscribe: virtio-comment-unsubscr...@lists.oasis-open.org
>List help: virtio-comment-h...@lists.oasis-open.org
>List archive: https://lists.oasis-open.org/archives/virtio-comment/
>Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
>List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
>Committee: https://www.oasis-open.org/committees/virtio/
>Join OASIS: https://www.oasis-open.org/join/
>

-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

[virtio-dev] Re: [virtio-comment] RE: [PATCH v3 0/2] virtio-net: Improve dev config layout

2023-03-08 Thread Cornelia Huck

On Tue, Mar 07 2023, Parav Pandit  wrote:

> Hi Cornelia,
>
>> -Original Message-
>> From: Cornelia Huck 
>> Sent: Monday, March 6, 2023 9:32 AM
>> To: Parav Pandit ; m...@redhat.com; virtio-dev@lists.oasis-
>> open.org
>> Cc: virtio-comm...@lists.oasis-open.org; Shahaf Shuler 
>> Subject: RE: [PATCH v3 0/2] virtio-net: Improve dev config layout
>> 
>> On Tue, Feb 21 2023, Parav Pandit  wrote:
>> 
>> >> From: Parav Pandit 
>> >> Sent: Friday, February 17, 2023 10:45 AM
>> >
>> >> Patch summary:
>> >> patch-1: consolidate read only field at one place in driver
>> >> requirements
>> >> patch-2: define device configuration layout before describing its fields.
>> >>
>> >> changelog:
>> >> v2->v3:
>> >> - split into two patches
>> >> - move read only description to driver requirements section
>> >
>> > These 2 patches are reviewed.
>> > Can you please open a ballot for it?
>> > Github issue:
>> > https://lists.oasis-open.org/archives/virtio-dev/202302/msg00405.html
>> > And the github description has the latest link this time. :)
>> 
>> I think something went wrong here... the link above points to this series 
>> (i.e. not
>> the latest one), and so does the github issue, and therefore the ballot. I 
>> think we
>> should withdraw the ballot?
>
> Please withdraw.
>
> Right link is: [1].
> Please create the new ballot with link [1].
>
> [1] https://lists.oasis-open.org/archives/virtio-dev/202302/msg00556.html

Done.


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

[virtio-dev] Re: [virtio-comment] Re: [virtio] Re: [PATCH v10 04/10] admin: introduce virtio admin virtqueues

2023-03-08 Thread Jiri Pirko

Tue, Mar 07, 2023 at 05:13:01PM CET, m...@redhat.com wrote:
>On Tue, Mar 07, 2023 at 09:03:18AM +0100, Jiri Pirko wrote:
>> I sense there is no concete idea about what the "admin virtqueue" should
>> serve for exactly.
>
>Because the virtqueue is just a virtqueue - a way to pass around
>commands.  Admin commands are used whenever one needs to talk to one
>device about another one.

Yeah, I got that now. Please check my other reply in this thread
suggesting a name change of "admin". Perhaps it would make sense and
makes thing a bit clearer for slower people like myself :)


>-- 
>MST
>
>
>This publicly archived list offers a means to provide input to the
>OASIS Virtual I/O Device (VIRTIO) TC.
>
>In order to verify user consent to the Feedback License terms and
>to minimize spam in the list archive, subscription is required
>before posting.
>
>Subscribe: virtio-comment-subscr...@lists.oasis-open.org
>Unsubscribe: virtio-comment-unsubscr...@lists.oasis-open.org
>List help: virtio-comment-h...@lists.oasis-open.org
>List archive: https://lists.oasis-open.org/archives/virtio-comment/
>Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
>List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
>Committee: https://www.oasis-open.org/committees/virtio/
>Join OASIS: https://www.oasis-open.org/join/
>

-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

[virtio-dev] Re: [virtio-comment] Re: [virtio] Re: [PATCH v10 04/10] admin: introduce virtio admin virtqueues

2023-03-08 Thread Jiri Pirko

Tue, Mar 07, 2023 at 08:03:47PM CET, stefa...@redhat.com wrote:
>On Tue, Mar 07, 2023 at 04:07:54PM +0100, Jiri Pirko wrote:
>> Tue, Mar 07, 2023 at 03:39:11PM CET, stefa...@redhat.com wrote:
>> >On Tue, Mar 07, 2023 at 09:03:18AM +0100, Jiri Pirko wrote:
>> >> Mon, Mar 06, 2023 at 07:37:31PM CET, m...@redhat.com wrote:
>> >> >On Mon, Mar 06, 2023 at 06:03:40AM -0500, Stefan Hajnoczi wrote:
>> >> >> On Sun, Mar 05, 2023 at 07:18:24PM -0500, Michael S. Tsirkin wrote:
>> >> >> > On Sun, Mar 05, 2023 at 07:03:02PM -0500, Stefan Hajnoczi wrote:
>> >> >> > > On Sun, Mar 05, 2023 at 04:38:59AM -0500, Michael S. Tsirkin wrote:
>> >> >> > > > On Fri, Mar 03, 2023 at 03:21:33PM -0500, Stefan Hajnoczi wrote:
>> >> >> > > > > What happens if a command takes 1 second to complete, is the 
>> >> >> > > > > device
>> >> >> > > > > allowed to process the next command from the virtqueue during 
>> >> >> > > > > this time,
>> >> >> > > > > possibly completing it before the first command?
>> >> >> > > > > 
>> >> >> > > > > This requires additional clarification in the spec because 
>> >> >> > > > > "they are
>> >> >> > > > > processed by the device in the order in which they are queued" 
>> >> >> > > > > does not
>> >> >> > > > > explain whether commands block the virtqueue (in order 
>> >> >> > > > > completion) or
>> >> >> > > > > not (out of order completion).
>> >> >> > > > 
>> >> >> > > > Oh I begin to see. Hmm how does e.g. virtio scsi handle this?
>> >> >> > > 
>> >> >> > > virtio-scsi, virtio-blk, and NVMe requests may complete out of 
>> >> >> > > order.
>> >> >> > > Several may be processed by the device at the same time.
>> >> >> > 
>> >> >> > Let's say I submit a write followed by read - is read
>> >> >> > guaranteed to return an up to date info?
>> >> >> 
>> >> >> In general, no. The driver must wait for the write completion before
>> >> >> submitting the read if it wants consistency.
>> >> >> 
>> >> >> Stefan
>> >> >
>> >> >I see.  I think it's a good design to follow then.
>> >> 
>> >> Hmm, is it suitable to have this approach for configuration interface?
>> >> Storage device is a different beast, having parallel reads and writes
>> >> makes complete sense for performance.
>> >> 
>> >> ->read a req
>> >> ->read b req
>> >> ->read c req
>> >> <-read a rep
>> >> <-read b rep
>> >> <-read c rep
>> >> 
>> >> There is no dependency, even between writes.
>> >> 
>> >> But in case of configuration, does not make any sense to me.
>> >> Why is it needed? To pass the burden of consistency of
>> >> configuration to driver sounds odd at least.
>> >> 
>> >> I sense there is no concete idea about what the "admin virtqueue" should
>> >> serve for exactly.
>> >
>> >It's useful for long-running commands because they prevent other
>> >commands from executing.
>> >
>> >An example I've given is that deleting a group member might require
>> >waiting for the group member's I/O activity to finish. If that I/O
>> >activity cannot be cancelled instantaneously, then it could take an
>> >unbounded amount of time to delete the group member. The device would be
>> >unable to process futher admin commands.
>> 
>> I see. Then I believe that the device should handle the dependencies.
>> Example 1:
>> -> REQ cmd to create group member A
>> -> REQ cmd to create group member B
>> <- REP cmd to create group member A
>> <- REP cmd to create group member B
>> 
>> The device according to internal implementation can either serialize the
>> 2 group member creations or do it in parallel, if it supports it.
>> 
>> Example 2:
>> -> REQ cmd to create group member A
>> -> REQ cmd config group member A
>> <- REP cmd to create group member A
>> <- REP cmd config group member A
>> 
>> Here the serialization is necessary and the device is the one to take
>> care of it.
>> 
>> Makes sense?
>
>Yes, I understand. The spec would need to define ordering rules for
>specific commands and the device must implement them. It allows the
>driver to pipeline commands while also allowing out-of-order completion
>(parallelism) in some cases. The disadvantage of this approach is
>complexity in the spec and implementations.
>
>An alternative is unconditional out-of-order completion, where there are
>no per-command ordering rules. The driver must wait for a command to
>complete if it relies on the results of that command for its next
>command. I like this approach because it's less complex in the spec and
>for device implementers, while the burden on the driver implementer is
>still reasonable.

But isn't this duplicating the burden of maintaining dependencies to
both driver and device? I mean, device should not depend on driver doing
the right thing, that means it has to check the dependencies for every
incoming command anyway. The only difference would be to wait instead of
returning "-EBUSY" in case the dependency is not satisfied yet.

Device knows exactly what are the dependencies. And I believe, those are
device implementation specific. For example, some implementation co

[virtio-dev] Re: [virtio-comment] Re: [virtio] Re: [PATCH v10 04/10] admin: introduce virtio admin virtqueues

2023-03-08 Thread Michael S. Tsirkin

On Wed, Mar 08, 2023 at 11:08:56AM +0100, Jiri Pirko wrote:
> Tue, Mar 07, 2023 at 05:13:01PM CET, m...@redhat.com wrote:
> >On Tue, Mar 07, 2023 at 09:03:18AM +0100, Jiri Pirko wrote:
> >> I sense there is no concete idea about what the "admin virtqueue" should
> >> serve for exactly.
> >
> >Because the virtqueue is just a virtqueue - a way to pass around
> >commands.  Admin commands are used whenever one needs to talk to one
> >device about another one.
> 
> Yeah, I got that now. Please check my other reply in this thread
> suggesting a name change of "admin". Perhaps it would make sense and
> makes thing a bit clearer for slower people like myself :)

Yes I think it's a good idea. Replied there too.

> 
> >-- 
> >MST
> >
> >
> >This publicly archived list offers a means to provide input to the
> >OASIS Virtual I/O Device (VIRTIO) TC.
> >
> >In order to verify user consent to the Feedback License terms and
> >to minimize spam in the list archive, subscription is required
> >before posting.
> >
> >Subscribe: virtio-comment-subscr...@lists.oasis-open.org
> >Unsubscribe: virtio-comment-unsubscr...@lists.oasis-open.org
> >List help: virtio-comment-h...@lists.oasis-open.org
> >List archive: https://lists.oasis-open.org/archives/virtio-comment/
> >Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
> >List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
> >Committee: https://www.oasis-open.org/committees/virtio/
> >Join OASIS: https://www.oasis-open.org/join/
> >


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

[virtio-dev] Re: [virtio-comment] Re: [virtio] [PATCH v10 04/10] admin: introduce virtio admin virtqueues

2023-03-08 Thread Michael S. Tsirkin

On Wed, Mar 08, 2023 at 11:05:00AM +0100, Jiri Pirko wrote:
> Tue, Mar 07, 2023 at 05:30:18PM CET, m...@redhat.com wrote:
> >On Tue, Mar 07, 2023 at 08:36:41AM +0100, Jiri Pirko wrote:
> >> Hmm, if not for now, the future exension would not be so simple, I fear.
> >
> >Without knowing what it is I can't say.
> 
> Yep, so basically you say, for other things if they appear,
> let's introduce another queue type? If yes, sounds fair to me.

Yes. For example I find it likely that live migration/failover support
will require a queue where driver pre-adds buffers and then device
supplies information as state changes.

> 
> >
> >> 
> >> >
> >> >Passing commands to devices themselves is already covered in spec
> >> >reasonably well though not in a generic way.
> >> 
> >> You mean using the control queue, correct?
> >
> >Depends on the device type. network devices have a control queue, yes.
> >
> >> >From one of the patch description of this patchset I understand that you
> >> cannot use control queue for this because control queue is
> >> device-specific, yet group control is device-agnostic.
> >> 
> >> My undestanding therefore was, that the admin queue you are introducing
> >> serves as a generic carrier for device-agnostic commands, in parallel
> >> for having control queue serving as a generic carrier of device-specific
> >> commands. If this is not the case, I think it would be nice to describe
> >> the exact monivation and scope of admin queue.
> >
> >Nope unfortunately.  This queue is just a carrier for admin commands.
> >admin commands are commands that talk to one device about other
> >devices. There's clearly no mechanism in the spec to do that,
> >so we plug this hole.
> 
> Okay, in that case "admin" sounds a bit misleading as for me it
> implicates that this is for "administration" of the device. Yet is is
> for the administration of other devices (slaves).
> 
> Perhaps there could be different term used to clarify?
> Group leader virtqueue?
> Group owner virtqueue?
> Group master virtqueue?

I used group administration virtqueue in a couple of places,
just inconsistently. Good enough?


> >
> >
> >
> >> 
> >> >
> >> >What we lack is passing commands about one device to another device.
> >> >E.g. control VFs through PFs.
> >> 
> >> Could you provide examples of such commands please?
> >
> >For example a common feature is to program a vlan and have device
> >put a given VF inside this vlan.
> 
> I don't follow entirely. The way how the VF is connected to network
> should be ouf of the scope of this interface. The eswitch manager should
> take care. What you say sounds awfully like the "ip vf" legacy
> interface, which should not be considered here I believe.
> 
> If PF would be the eswitch manager, there are other means to do network
> programming, using eswitch port representors. But I don't think this is
> the can of worms we want to open now. I don't think we have a usecase
> for it currently. Am I wrong Parav?
> 
> 
> 
> >
> >In a virtualization scenario host controls this vlan programming giving
> >the network a measure of protection from VFs.  If a VF is passed through
> >to a VM, IOMMU limits VFs to only access guest memory so host has to do
> >this programming through a PF.
> 
> Understood. This really looks like "ip vf" legacy. I strongly believe
> it should not be supported.
> 
> Any other commands you have in mind?
> 
> 
> >
> >
> >
> >> 
> >> >This is what groups do.
> >> >But if we see more uses we can always add them.
> >> >
> >> >
> >> >I'd rather avoid being too generic though.
> >> 
> >> In that case, why not to avoid using generic terms and stay
> >> "group-centric"? What I mean is:
> >> "Administration Virtqueues" -> "Group Administration Virtqueues"
> >> "struct virtio_admin_cmd" -> "struct virtio_group_admin_cmd"
> >> 
> >> Etc. Helps to avoid confusion.
> >
> >Sure, I tried to do that but missed some opportunities.
> >Will address.
> 
> Cool.
> 
> 
> >
> >> 
> >> >
> >> >
> >> >
> >> >
> >> >> 
> >> >> >+than one administration virtqueue.
> >> 
> >> 
> >> [...]
> >> 
> >
> >
> >This publicly archived list offers a means to provide input to the
> >OASIS Virtual I/O Device (VIRTIO) TC.
> >
> >In order to verify user consent to the Feedback License terms and
> >to minimize spam in the list archive, subscription is required
> >before posting.
> >
> >Subscribe: virtio-comment-subscr...@lists.oasis-open.org
> >Unsubscribe: virtio-comment-unsubscr...@lists.oasis-open.org
> >List help: virtio-comment-h...@lists.oasis-open.org
> >List archive: https://lists.oasis-open.org/archives/virtio-comment/
> >Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
> >List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
> >Committee: https://www.oasis-open.org/committees/virtio/
> >Join OASIS: https://www.oasis-open.org/join/
> >


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-o

Re: [virtio-dev] [PATCH v10 08/10] admin: command list discovery

2023-03-08 Thread Michael S. Tsirkin

On Mon, Mar 06, 2023 at 01:22:30PM +0100, Jiri Pirko wrote:
> Thu, Mar 02, 2023 at 02:05:22PM CET, m...@redhat.com wrote:
> >Add commands to find out which commands does each group support,
> >as well as enable their use by driver.
> >This will be especially useful once we have multiple group types.
> >
> >An alternative is per-type VQs. This is possible but will
> >require more per-transport work. Discovery through the vq
> >helps keep things contained.
> >
> >Signed-off-by: Max Gurtovoy 
> >Signed-off-by: Michael S. Tsirkin 
> 
> [...]
> 
> 
> >+
> >+The driver issues the command VIRTIO_ADMIN_CMD_LIST_QUERY to
> >+query the list of commands valid for this group and before sending
> >+any commands for any member of a group.
> >+
> >+The driver then enables use of some of the opcodes by sending to
> >+the device the command VIRTIO_ADMIN_CMD_LIST_USE with a subset
> >+of the list returned by VIRTIO_ADMIN_CMD_LIST_QUERY that is
> >+both understood and used by the driver.
> 
> To my untrained ear, this sounds somewhat similar to the feature
> negotiantion mechanism. Why the fact that device/driver supports some
> command can't be covered by just another feature? Looks like unnecassary
> complexicity to negotiate supported commands like this.

Absolutely, it is similar. The issue is that a single device can
be an owner for multiple group types.
For example, I think an SRIOV PF itself can have subfunctions with SIOV
as well as virtual functions with SRIOV.
And the set of supported commands might differ.

We thus need a command that is kind of like features but per group type.

As a small bonus as we are building this from scratch, I added
something which was requested for a long time from features
which is blocking access to commands not allowed by features.
With standard virtio features we trust the driver not to send
commands before enabling a feature and this was considered
a maintainance/security problem.

-- 
MST

-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

[virtio-dev] Re: [virtio-comment] [PATCH v10 09/10] admin: conformance clauses

2023-03-08 Thread Michael S. Tsirkin

On Tue, Mar 07, 2023 at 11:04:33AM +, David Edmondson wrote:
> "Michael S. Tsirkin"  writes:
> 
> > Add conformance clauses for admin commands and admin virtqueues.
> >
> > Signed-off-by: Michael S. Tsirkin 
> > ---
> >  admin.tex | 216 +-
> >  1 file changed, 215 insertions(+), 1 deletion(-)
> >
> > diff --git a/admin.tex b/admin.tex
> > index 1172054..6c4f79c 100644
> > --- a/admin.tex
> > +++ b/admin.tex
> > @@ -251,6 +251,145 @@ \subsection{Group administration 
> > commands}\label{sec:Basic Facilities of a Virti
> >  supporting multiple group types, the list of supported commands
> >  might differ between different group types.
> >  
> > +\devicenormative{\paragraph}{Group administration commands}{Basic 
> > Facilities of a Virtio Device / Device groups / Group administration 
> > commands}
> > +
> > +The device MUST validate \field{opcode}, \field{group_type} and
> > +\field{group_member_id}, and if any of these has an invalid or
> > +unsupported value, set \field{status} to
> > +VIRTIO_ADMIN_STATUS_EINVAL and set \field{status_qualifier}
> > +accordingly:
> > +\begin{itemize}
> > +\item if \field{group_type} is invalid, \field{status_qualifier}
> > +   MUST be set to VIRTIO_ADMIN_STATUS_Q_INVALID_GROUP;
> > +\item otherwise, if \field{opcode} is invalid,
> > +   \field{status_qualifier} MUST be set to
> > +   VIRTIO_ADMIN_STATUS_Q_INVALID_OPCODE;
> > +\item otherwise, if \field{group_member_id} is used by the
> > +   specific command and is invalid, \field{status_qualifier} MUST be
> > +   set to VIRTIO_ADMIN_STATUS_Q_INVALID_MEMBER.
> > +\end{itemize}
> > +
> > +If a command completes successfully, the device MUST set
> > +\field{status} to VIRTIO_ADMIN_STATUS_OK.
> > +
> > +If a command fails, the device MUST set
> > +\field{status} to a value different from VIRTIO_ADMIN_STATUS_OK.
> > +
> > +If \field{status} is set to VIRTIO_ADMIN_STATUS_EINVAL, the
> > +device state MUST NOT change, that is the command MUST NOT have
> > +any side effects on the device, in particular the device MUST not
> 
> MUST NOT

/me nods

> > +enter an error state as a result of this command.
> > +
> > +If a command fails, the device state generally SHOULD NOT change,
> > +as far as possible.
> > +
> > +The device MAY enforce additional restrictions and dependencies on
> > +opcodes used by the driver and MAY fail the command
> > +VIRTIO_ADMIN_CMD_LIST_USE with \field{status} set to 
> > VIRTIO_ADMIN_STATUS_EINVAL
> > +and \field{status_qualifier} set to VIRTIO_ADMIN_STATUS_Q_INVALID_FIELD
> > +if the list of commands used violate internal device dependencies.
> > +
> > +If the device supports multiple group types, commands for each group
> > +type MUST operate independently of each other, in particular,
> > +the device MAY return different results for VIRTIO_ADMIN_CMD_LIST_QUERY
> > +for different group types.
> > +
> > +After reset, if the device supports a given group type
> > +and before receiving VIRTIO_ADMIN_CMD_LIST_USE for this group type
> > +the device MUST assume
> > +that the list of legal commands used by the driver consists of
> > +the two commands VIRTIO_ADMIN_CMD_LIST_QUERY and VIRTIO_ADMIN_CMD_LIST_USE.
> > +
> > +After completing VIRTIO_ADMIN_CMD_LIST_USE successfully,
> > +the device MUST set the list of legal commands used by the driver
> > +to the one supplied in \field{command_specific_data}.
> > +
> > +The device MUST set the list of legal commands used by the driver
> > +to the one supplied in VIRTIO_ADMIN_CMD_LIST_USE.
> 
> Are these last two paragraphs not saying the same thing?


They are!

> > +The device MUST validate commands against the list used by
> > +the driver and MUST fail any commands not in the list with
> > +\field{status} set to VIRTIO_ADMIN_STATUS_EINVAL
> > +and \field{status_qualifier} set to
> > +VIRTIO_ADMIN_STATUS_Q_INVALID_OPCODE.
> > +
> > +The list of supported commands MUST NOT shrink (but MAY expand):
> > +after reporting a given command as supported through
> > +VIRTIO_ADMIN_CMD_LIST_QUERY the device MUST NOT later report it
> > +as unsupported.  Further, after a given set of commands has been
> > +used (via a successful VIRTIO_ADMIN_CMD_LIST_USE), then after a
> > +device or system reset the device SHOULD complete successfully
> > +any following calls to VIRTIO_ADMIN_CMD_LIST_USE with the same
> > +list of commands; if this command VIRTIO_ADMIN_CMD_LIST_USE fails
> > +after a device or system reset, the device MUST not fail it
> > +solely because of the command list used.  Failure to do so would
> > +interfere with resuming from suspend and error recovery.
> > +
> > +When processing a command with the SR-IOV group type,
> > +if the device does not have an SR-IOV Extended Capability or
> > +if \field{VF Enable} is clear
> > +then the device MUST fail all commands with
> > +\field{status} set to VIRTIO_ADMIN_STATUS_EINVAL and
> > +\field{status_qualifier} set to
> > +VIRTIO_ADMIN_STATUS_Q_INVALID_GROUP;
> > +otherwise

[virtio-dev] Re: [virtio] RE: [virtio-comment] Re: [virtio] Re: [PATCH v10 04/10] admin: introduce virtio admin virtqueues

2023-03-08 Thread Stefan Hajnoczi

On Wed, Mar 08, 2023 at 01:17:44PM +0800, Jason Wang wrote:
> On Wed, Mar 8, 2023 at 3:09 AM Parav Pandit  wrote:
> >
> >
> >
> > > From: Stefan Hajnoczi 
> > > Sent: Tuesday, March 7, 2023 2:04 PM
> >
> > > An alternative is unconditional out-of-order completion, where there are 
> > > no
> > > per-command ordering rules. The driver must wait for a command to complete
> > > if it relies on the results of that command for its next command. I like 
> > > this
> > > approach because it's less complex in the spec and for device 
> > > implementers,
> > > while the burden on the driver implementer is still reasonable.
> > +1.
> 
> Note that this is the way current ctrl virtqueue works.
> 
> > This has best of both.
> > 1. Command ordering knowledge and hence the decision left to the one which 
> > issues them. (driver).
> > 2. Ability to execute multiple unrelated commands using a single AQ.
> > 3. stateless device in AQ command execution
> 
> Does this mean if we want to migrate AQ (not use AQ to migrate), we
> need to wait for the AQ command to be completed?

It depends. Today, if the AQ is emulated by the VMM then it might be
possible to migrate while there is a command in-flight. If the AQ is
handled by the device then no because VIRTIO currently does not support
device state migration.

In the future, device state migration will probably be added to
vhost-user and vDPA. In that case a device can migrate in-flight AQ
activity - assuming it's possible to pause it, serialize state,
deserialize it, and resume it. Otherwise it's still necessary to wait.

To be clear, I'm just taking about AQ commands that are currently in the
process of executing. Commands that are in the virtqueue but have not
yet been started by the device can be easily migrated using the existing
virtqueue migration infrastructure.

Stefan

signature.asc
Description: PGP signature

[virtio-dev] Re: [virtio] RE: [virtio-comment] Re: [virtio] Re: [PATCH v10 04/10] admin: introduce virtio admin virtqueues

2023-03-08 Thread Michael S. Tsirkin

On Wed, Mar 08, 2023 at 01:17:44PM +0800, Jason Wang wrote:
> On Wed, Mar 8, 2023 at 3:09 AM Parav Pandit  wrote:
> >
> >
> >
> > > From: Stefan Hajnoczi 
> > > Sent: Tuesday, March 7, 2023 2:04 PM
> >
> > > An alternative is unconditional out-of-order completion, where there are 
> > > no
> > > per-command ordering rules. The driver must wait for a command to complete
> > > if it relies on the results of that command for its next command. I like 
> > > this
> > > approach because it's less complex in the spec and for device 
> > > implementers,
> > > while the burden on the driver implementer is still reasonable.
> > +1.
> 
> Note that this is the way current ctrl virtqueue works.
> 
> > This has best of both.
> > 1. Command ordering knowledge and hence the decision left to the one which 
> > issues them. (driver).
> > 2. Ability to execute multiple unrelated commands using a single AQ.
> > 3. stateless device in AQ command execution
> 
> Does this mean if we want to migrate AQ (not use AQ to migrate), we
> need to wait for the AQ command to be completed?
> 
> Thanks

As with any VQ, if you migrate while an operation is in progress
you need to figure out the internal device state.

> >
> >
> > -
> > To unsubscribe from this mail list, you must leave the OASIS TC that
> > generates this mail.  Follow this link to all your TCs in OASIS at:
> > https://www.oasis-open.org/apps/org/workgroup/portal/my_workgroups.php
> >


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

[virtio-dev] RE: [virtio-comment] [PATCH v10 00/10] Introduce device group and device management

2023-03-08 Thread Parav Pandit



> From: Jason Wang 
> Sent: Wednesday, March 8, 2023 12:12 AM
> > In other words, can AQ command be useful tomorrow for doing SIOV device


> add/remove, and provisioning from non-owning PF?
> > The way AQ is crafted today is not there yet, but in the future, it can be
> extended.
> 
> There's already a proposal to use the queue as a transport. Any device that
> can't use a transport specific configuration or provisioning can benefit from
> that. SIOV is only one of the users for that.
>
o.k. Better to include that link in the cover letter.
 
> > >
> > > >
> > > > >There is interest in the community to include an interface to
> > > > >allow this in the virtio spec, when the PF is a virtio device.
> > > > >This is what this patch does.
> > > >
> > > > Yeah, but why? As I asked before, what are the usecases? The fact
> > > > there is interest in the community does not mean it makes sense to
> > > > have it :)
> > > >
> > >
> > > If people want to build such hardware it will need some interface.
> > > Far better to have it standard.
> > >
> > >
> > > But generally e.g. intel already said they will reuse this same
> > > structure with a different group type for SIOV support.
> > > I'll mention this in the cover letter.
> > >
> >
> > We have the following already identified use cases of a device agnostic VQ.
> >
> > 1. SR-IOV VF device provisioning at virtio layer (features and config)
> > 2. SR IOV (SR-PCIM interface of the PCI spec) for VF provisioning, for
> > example, MSI-X vectors
> >
> 
> The virtio layer device provisioning is transport independent. 
Yes. I listed known use cases above. If there are more that can use, sure.

> I don't see
> anything that is PCI specific there. Even the MSI-X could be reused by other
> transport.
>
Maybe yes.
I just don’t know which other transport of virtio has defined MSI-X.
If there is, sure there are more users of AQ.
 
> > 3. SR-IOV VF Live migration
> 
> I think we don't want a partial solution for live migration that only works 
> for PCI
> VF devices and L1. Instead, there needs to be a general device facility 
> section
> and the admin virtqueue could be one of the interfaces for the driver to use 
> the
> facility.
> 
May be there are more use cases of AQ than what I listed.

> > 4. Above #1 to #3 for SIOV devices instead of SRIOV devices
> 
> If we make the above as a general facility, it would not be hard to re-use 
> them
> for transport virtqueue or SIOV.
>
Ok. thanks.

Re: [virtio-dev] Re: [PATCH v10 03/10] admin: introduce group administration commands

2023-03-08 Thread Michael S. Tsirkin

On Tue, Mar 07, 2023 at 12:31:51PM +0100, Jiri Pirko wrote:
> Fri, Mar 03, 2023 at 09:23:14PM CET, stefa...@redhat.com wrote:
> >On Fri, Mar 03, 2023 at 08:18:43AM -0500, Michael S. Tsirkin wrote:
> >> On Fri, Mar 03, 2023 at 08:13:37AM -0500, Stefan Hajnoczi wrote:
> >> > On Thu, Mar 02, 2023 at 06:57:24PM -0500, Michael S. Tsirkin wrote:
> >> > > On Thu, Mar 02, 2023 at 03:10:11PM -0500, Stefan Hajnoczi wrote:
> >> > > > On Thu, Mar 02, 2023 at 08:05:02AM -0500, Michael S. Tsirkin wrote:
> >> > > > > This introduces a general structure for group administration 
> >> > > > > commands,
> >> > > > > used to control device groups through their owner.
> >> > > > > 
> >> > > > > Following patches will introduce specific commands and an 
> >> > > > > interface for
> >> > > > > submitting these commands to the owner.
> >> > > > > 
> >> > > > > Signed-off-by: Max Gurtovoy 
> >> > > > > Signed-off-by: Michael S. Tsirkin 
> >> > > > > ---
> >> > > > >  admin.tex| 108 
> >> > > > > +++
> >> > > > >  introduction.tex |   3 ++
> >> > > > >  2 files changed, 111 insertions(+)
> >> > > > > 
> >> > > > > diff --git a/admin.tex b/admin.tex
> >> > > > > index 3dc47be..7e28b77 100644
> >> > > > > --- a/admin.tex
> >> > > > > +++ b/admin.tex
> >> > > > > @@ -46,4 +46,112 @@ \section{Device groups}\label{sec:Basic 
> >> > > > > Facilities of a Virtio Device / Device g
> >> > > > >  PCI transport (see \ref{sec:Virtio Transport Options / Virtio 
> >> > > > > Over PCI Bus}).
> >> > > > >  \end{description}
> >> > > > >  
> >> > > > > +\subsection{Group administration commands}\label{sec:Basic 
> >> > > > > Facilities of a Virtio Device / Device groups / Group 
> >> > > > > administration commands}
> >> > > > >  
> >> > > > > +The driver sends group administration commands to the owner 
> >> > > > > device of
> >> > > > 
> >> > > > I notice that the terminology is simply "the driver". "Owner driver"
> >> > > > and "group member driver" might be clearer because there will be two
> >> > > > (possibly different) drivers involved.
> >> > > 
> >> > > Hmm I don't really want to repeat owner everywhere.
> >> > > I will clarify that in this section simple "driver" and "device" are
> >> > > owner, "member device" and "member driver" is always called explicitly.
> >> > 
> >> > Sounds good.
> >> > 
> >> > > > > +a group to control member devices of the group.
> >> > > > > +This mechanism can
> >> > > > > +be used, for example, to configure a member device before it is
> >> > > > > +initialized by its driver.
> >> > > > > +\footnote{The term "administration" is intended to be interpreted
> >> > > > > +widely to include any kind of control. See specific commands
> >> > > > > +for detail.}
> >> > > > > +
> >> > > > > +All the group administration commands are of the following form:
> >> > > > > +
> >> > > > > +\begin{lstlisting}
> >> > > > > +struct virtio_admin_cmd {
> >> > > > > +/* Device-readable part */
> >> > > > > +le16 opcode;
> >> > > > > +/*
> >> > > > > + * 1 - SR-IOV
> >> > > > > + * 2 - 65535 reserved
> >> > > > > + */
> >> > > > > +le16 group_type;
> >> > > > > +/* unused, reserved for future extensions */
> >> > > > > +u8 reserved1[12];
> >> > > > > +le64 group_member_id;
> >> > > > > +u8 command_specific_data[];
> >> > > > > +
> >> > > > > +/* Device-writable part */
> >> > > > > +le16 status;
> >> > > > > +le16 status_qualifier;
> >> > > > > +/* unused, reserved for future extensions */
> >> > > > > +u8 reserved2[4];
> >> > > > > +u8 command_specific_result[];
> >> > > > > +};
> >> > > > > +\end{lstlisting}
> >> > > > > +
> >> > > > > +For all commands, \field{opcode}, \field{group_type} and if
> >> > > > > +necessary \field{group_member_id} and 
> >> > > > > \field{command_specific_data} are
> >> > > > > +set by the driver, and the owner device sets \field{status} and if
> >> > > > > +needed \field{status_qualifier} and
> >> > > > > +\field{command_specific_result}.
> >> > > > > +
> >> > > > > +Generally, any unused device-readable fields are set to zero by 
> >> > > > > the driver
> >> > > > > +and ignored by the device.  Any unused device-writeable fields 
> >> > > > > are set to zero
> >> > > > > +by the device and ignored by the driver.
> >> > > > > +
> >> > > > > +\field{opcode} specifies the command. The valid
> >> > > > > +values for \field{opcode} can be found in the following table:
> >> > > > > +
> >> > > > > +\begin{tabular}{|l|l|}
> >> > > > > +\hline
> >> > > > > +opcode & Name & Command Description \\
> >> > > > > +\hline \hline
> >> > > > > +0x - 0x7FFF & - &  Group administration commands\\
> >> > > > > +\hline
> >> > > > > +0x8000 - 0x & - & Reserved\\
> >> > > > > +\hline
> >> > > > > +\end{tabular}
> >> > > > 
> >> > > > I thought all commands are "group administration commands" but this
> >> > > > table ma

[virtio-dev] Re: [PATCH v10 03/10] admin: introduce group administration commands

2023-03-08 Thread Michael S. Tsirkin

On Wed, Mar 08, 2023 at 12:55:37PM +0200, Max Gurtovoy wrote:
> > > 5. VIRTIO_ADMIN_STATUS_COMMAND_SPECIFIC_ERR (for more info read the
> > > command_specific_error field).
> > 
> > I don't think it's a good idea, we'll have to agree to disagree.
> 
> Ok.
> can you explain why isn't this a good idea please ?

Pointless complexity for spec, devices and drivers. In the end drivers
don't really do anything with all this detailed info.

-- 
MST


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

[virtio-dev] Re: [virtio-comment] Re: [virtio] [PATCH v10 04/10] admin: introduce virtio admin virtqueues

2023-03-08 Thread Jiri Pirko

Wed, Mar 08, 2023 at 12:50:48PM CET, m...@redhat.com wrote:
>On Wed, Mar 08, 2023 at 11:05:00AM +0100, Jiri Pirko wrote:
>> Tue, Mar 07, 2023 at 05:30:18PM CET, m...@redhat.com wrote:
>> >On Tue, Mar 07, 2023 at 08:36:41AM +0100, Jiri Pirko wrote:
>> >> Hmm, if not for now, the future exension would not be so simple, I fear.
>> >
>> >Without knowing what it is I can't say.
>> 
>> Yep, so basically you say, for other things if they appear,
>> let's introduce another queue type? If yes, sounds fair to me.
>
>Yes. For example I find it likely that live migration/failover support
>will require a queue where driver pre-adds buffers and then device
>supplies information as state changes.

I see. So there would be a queue called for example "child state virtqueue"
or something like that for the sole purpose of getting the state of VF?
Hmm, wouldn't it make more sense to have this done as a part of "group
administrarion queue"? I mean, there is already notion of addresing
child/VF here. So from my perspective, it is just another "group
administration" command.

We would have to prealloc buffers and use it for this purpose as
well. I mean, the cmd payload for other group commands
could be put in this buffers as well, making all group command
infrastructure consistent.

Something like:

opcode X
group_member_id 1
cmd_payload-> buffer A

opcode Y
group_member_id 1
cmd_payload-> buffer B

Basically this would replace command_specific_data/command_specific_result


>
>> 
>> >
>> >> 
>> >> >
>> >> >Passing commands to devices themselves is already covered in spec
>> >> >reasonably well though not in a generic way.
>> >> 
>> >> You mean using the control queue, correct?
>> >
>> >Depends on the device type. network devices have a control queue, yes.
>> >
>> >> >From one of the patch description of this patchset I understand that you
>> >> cannot use control queue for this because control queue is
>> >> device-specific, yet group control is device-agnostic.
>> >> 
>> >> My undestanding therefore was, that the admin queue you are introducing
>> >> serves as a generic carrier for device-agnostic commands, in parallel
>> >> for having control queue serving as a generic carrier of device-specific
>> >> commands. If this is not the case, I think it would be nice to describe
>> >> the exact monivation and scope of admin queue.
>> >
>> >Nope unfortunately.  This queue is just a carrier for admin commands.
>> >admin commands are commands that talk to one device about other
>> >devices. There's clearly no mechanism in the spec to do that,
>> >so we plug this hole.
>> 
>> Okay, in that case "admin" sounds a bit misleading as for me it
>> implicates that this is for "administration" of the device. Yet is is
>> for the administration of other devices (slaves).
>> 
>> Perhaps there could be different term used to clarify?
>> Group leader virtqueue?
>> Group owner virtqueue?
>> Group master virtqueue?
>
>I used group administration virtqueue in a couple of places,
>just inconsistently. Good enough?

Yep, sounds good. Thanks!


>
>
>> >
>> >
>> >
>> >> 
>> >> >
>> >> >What we lack is passing commands about one device to another device.
>> >> >E.g. control VFs through PFs.
>> >> 
>> >> Could you provide examples of such commands please?
>> >
>> >For example a common feature is to program a vlan and have device
>> >put a given VF inside this vlan.
>> 
>> I don't follow entirely. The way how the VF is connected to network
>> should be ouf of the scope of this interface. The eswitch manager should
>> take care. What you say sounds awfully like the "ip vf" legacy
>> interface, which should not be considered here I believe.
>> 
>> If PF would be the eswitch manager, there are other means to do network
>> programming, using eswitch port representors. But I don't think this is
>> the can of worms we want to open now. I don't think we have a usecase
>> for it currently. Am I wrong Parav?
>> 
>> 
>> 
>> >
>> >In a virtualization scenario host controls this vlan programming giving
>> >the network a measure of protection from VFs.  If a VF is passed through
>> >to a VM, IOMMU limits VFs to only access guest memory so host has to do
>> >this programming through a PF.
>> 
>> Understood. This really looks like "ip vf" legacy. I strongly believe
>> it should not be supported.
>> 
>> Any other commands you have in mind?
>> 
>> 
>> >
>> >
>> >
>> >> 
>> >> >This is what groups do.
>> >> >But if we see more uses we can always add them.
>> >> >
>> >> >
>> >> >I'd rather avoid being too generic though.
>> >> 
>> >> In that case, why not to avoid using generic terms and stay
>> >> "group-centric"? What I mean is:
>> >> "Administration Virtqueues" -> "Group Administration Virtqueues"
>> >> "struct virtio_admin_cmd" -> "struct virtio_group_admin_cmd"
>> >> 
>> >> Etc. Helps to avoid confusion.
>> >
>> >Sure, I tried to do that but missed some opportunities.
>> >Will address.
>> 
>> Cool.
>> 
>> 
>> >
>> >> 
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >> 
>> >> >> >+t

Re: [virtio-dev] [PATCH v10 08/10] admin: command list discovery

2023-03-08 Thread Jiri Pirko

Thu, Mar 02, 2023 at 02:05:22PM CET, m...@redhat.com wrote:
>Add commands to find out which commands does each group support,
>as well as enable their use by driver.
>This will be especially useful once we have multiple group types.
>
>An alternative is per-type VQs. This is possible but will
>require more per-transport work. Discovery through the vq
>helps keep things contained.
>
>Signed-off-by: Max Gurtovoy 
>Signed-off-by: Michael S. Tsirkin 
>---
> admin.tex | 97 ++-
> 1 file changed, 96 insertions(+), 1 deletion(-)
>
>diff --git a/admin.tex b/admin.tex
>index 3ffac2e..1172054 100644
>--- a/admin.tex
>+++ b/admin.tex
>@@ -99,7 +99,9 @@ \subsection{Group administration commands}\label{sec:Basic 
>Facilities of a Virti
> \hline
> opcode & Name & Command Description \\
> \hline \hline
>-0x - 0x7FFF & - &  Group administration commands\\
>+0x & VIRTIO_ADMIN_CMD_LIST_QUERY & Provides to driver list of commands 
>supported for this group type\\
>+0x0001 & VIRTIO_ADMIN_CMD_LIST_USE & Provides to device list of commands used 
>for this group type \\
>+0x0002 - 0x7FFF & - &  Group administration commands\\
> \hline
> 0x8000 - 0x & - & Reserved\\

This table would not be needed at all, as all the commands here are
"group administration commands". Could you remove it?


> \hline
>@@ -156,6 +158,99 @@ \subsection{Group administration 
>commands}\label{sec:Basic Facilities of a Virti
> depends on these structures and is described separately or is
> implicit in the structure description.
> 
>+Before sending any administration commands to the device, the driver
>+needs to communicate to the device which commands it is going to
>+use. Initially (after reset), only two commands are assumed to be used:
>+VIRTIO_ADMIN_CMD_LIST_QUERY and VIRTIO_ADMIN_CMD_LIST_USE.
>+
>+Before sending any other commands for any member of a specific group to
>+the device, the driver queries the supported commands via
>+VIRTIO_ADMIN_CMD_LIST_QUERY and sends the commands it will use via
>+VIRTIO_ADMIN_CMD_LIST_USE.
>+
>+Commands VIRTIO_ADMIN_CMD_LIST_QUERY and
>+VIRTIO_ADMIN_CMD_LIST_USE
>+both use the following structure describing the
>+command opcodes:
>+
>+\begin{lstlisting}
>+struct virtio_admin_cmd_list {
>+   /* Indicates which of the below fields were returned
>+   le64 device_admin_cmds[];
>+};
>+\end{lstlisting}
>+
>+This structure is an array of 64 bit values in little-endian byte
>+order, in which a bit is set if the specific command opcode

Ah wait, you are having this as bitfield. How many commands do you
expect to have? I mean, could this be rather just plain array of
opcodes.

If you want to save some space, the opcode certainly does not need to be
64 bit. I think that 16 bits is more than enough for this :)

Then to make this consistent with struct virtio_admin_cmd,
where you use term "opcode" and with the rest of the text where you use
"CMD" and "cmd", perhaps "cmd_opcode" in struct virtio_admin_cmd
and "cmd_opcodes[]" here would work better?




>+is supported. Thus, \field{device_admin_cmds[0]} refers to the first 32-bit 
>value

You say "32", it is "64".


>+in this array corresponding to opcodes 0 to 63,
>+\field{device_admin_cmds[1]} is the second 64-bit value
>+corresponding to opcodes 64 to 127, etc.
>+For example, the array of size 2 including
>+the values 0x3 in \field{device_admin_cmds[0]}
>+and 0x1 in \field{device_admin_cmds[1]} indicates that only
>+opcodes 0, 1 and 64 are supported.
>+The length of the array depends on the supported opcodes - it is
>+large enough to include bits set for all supported opcodes,
>+that is the length can be calculated by starting with the largest
>+supported opcode adding one, dividing by 64 and rounding up.
>+In other words, for
>+VIRTIO_ADMIN_CMD_LIST_QUERY and VIRTIO_ADMIN_CMD_LIST_USE the
>+length of \field{command_specific_result} and
>+\field{command_specific_data} respectively will be
>+$DIV_ROUND_UP(max_cmd, 64) * 8$ where DIV_ROUND_UP is integer division
>+with round up and \field{max_cmd} is the largest available command opcode.
>+
>+The array is also allowed to be larger and to additionally
>+include an arbitrary number of all-zero entries.
>+
>+Accordingly, bits 0 and 1 corresponding to opcode 0
>+(VIRTIO_ADMIN_CMD_LIST_QUERY) and 1
>+(VIRTIO_ADMIN_CMD_LIST_USE) are
>+always set in \field{device_admin_cmds[0]} returned by 
>VIRTIO_ADMIN_CMD_LIST_QUERY.
>+
>+For the command VIRTIO_ADMIN_CMD_LIST_QUERY, \field{opcode} is set to 0x0.
>+The \field{group_member_id} is unused. It is set to zero by driver.
>+This command has no command specific data.
>+The device, upon success, returns a result in
>+\field{command_specific_result} in the format
>+\field{struct virtio_admin_cmd_list} describing the
>+list of administration commands supported for the given group.

Hmm, IIUC, the list is per group type (not group). If yes, could
you please consistently use "group type"?

Also, I think you are

Re: [virtio-dev] [PATCH v10 08/10] admin: command list discovery

2023-03-08 Thread Jiri Pirko

Wed, Mar 08, 2023 at 12:54:52PM CET, m...@redhat.com wrote:
>On Mon, Mar 06, 2023 at 01:22:30PM +0100, Jiri Pirko wrote:
>> Thu, Mar 02, 2023 at 02:05:22PM CET, m...@redhat.com wrote:
>> >Add commands to find out which commands does each group support,
>> >as well as enable their use by driver.
>> >This will be especially useful once we have multiple group types.
>> >
>> >An alternative is per-type VQs. This is possible but will
>> >require more per-transport work. Discovery through the vq
>> >helps keep things contained.
>> >
>> >Signed-off-by: Max Gurtovoy 
>> >Signed-off-by: Michael S. Tsirkin 
>> 
>> [...]
>> 
>> 
>> >+
>> >+The driver issues the command VIRTIO_ADMIN_CMD_LIST_QUERY to
>> >+query the list of commands valid for this group and before sending
>> >+any commands for any member of a group.
>> >+
>> >+The driver then enables use of some of the opcodes by sending to
>> >+the device the command VIRTIO_ADMIN_CMD_LIST_USE with a subset
>> >+of the list returned by VIRTIO_ADMIN_CMD_LIST_QUERY that is
>> >+both understood and used by the driver.
>> 
>> To my untrained ear, this sounds somewhat similar to the feature
>> negotiantion mechanism. Why the fact that device/driver supports some
>> command can't be covered by just another feature? Looks like unnecassary
>> complexicity to negotiate supported commands like this.
>
>Absolutely, it is similar. The issue is that a single device can
>be an owner for multiple group types.
>For example, I think an SRIOV PF itself can have subfunctions with SIOV
>as well as virtual functions with SRIOV.
>And the set of supported commands might differ.
>
>We thus need a command that is kind of like features but per group type.

I see. But that does not mean it can't be done using existing features:

_F_SRIOV_CMD1
_F_SRIOV_CMD2
..
_F_SIOV_CMD1
_F_SIOV_CMD2
...

>
>As a small bonus as we are building this from scratch, I added
>something which was requested for a long time from features
>which is blocking access to commands not allowed by features.
>With standard virtio features we trust the driver not to send
>commands before enabling a feature and this was considered
>a maintainance/security problem.

I understand. Is it unfixable for features not to break buggy drivers?


>
>
>-- 
>MST
>

-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

[virtio-dev] Re: [virtio-comment] Re: [virtio] Re: [PATCH v10 04/10] admin: introduce virtio admin virtqueues

2023-03-08 Thread Stefan Hajnoczi

On Wed, Mar 08, 2023 at 11:17:35AM +0100, Jiri Pirko wrote:
> Tue, Mar 07, 2023 at 08:03:47PM CET, stefa...@redhat.com wrote:
> >On Tue, Mar 07, 2023 at 04:07:54PM +0100, Jiri Pirko wrote:
> >> Tue, Mar 07, 2023 at 03:39:11PM CET, stefa...@redhat.com wrote:
> >> >On Tue, Mar 07, 2023 at 09:03:18AM +0100, Jiri Pirko wrote:
> >> >> Mon, Mar 06, 2023 at 07:37:31PM CET, m...@redhat.com wrote:
> >> >> >On Mon, Mar 06, 2023 at 06:03:40AM -0500, Stefan Hajnoczi wrote:
> >> >> >> On Sun, Mar 05, 2023 at 07:18:24PM -0500, Michael S. Tsirkin wrote:
> >> >> >> > On Sun, Mar 05, 2023 at 07:03:02PM -0500, Stefan Hajnoczi wrote:
> >> >> >> > > On Sun, Mar 05, 2023 at 04:38:59AM -0500, Michael S. Tsirkin 
> >> >> >> > > wrote:
> >> >> >> > > > On Fri, Mar 03, 2023 at 03:21:33PM -0500, Stefan Hajnoczi 
> >> >> >> > > > wrote:
> >> >> >> > > > > What happens if a command takes 1 second to complete, is the 
> >> >> >> > > > > device
> >> >> >> > > > > allowed to process the next command from the virtqueue 
> >> >> >> > > > > during this time,
> >> >> >> > > > > possibly completing it before the first command?
> >> >> >> > > > > 
> >> >> >> > > > > This requires additional clarification in the spec because 
> >> >> >> > > > > "they are
> >> >> >> > > > > processed by the device in the order in which they are 
> >> >> >> > > > > queued" does not
> >> >> >> > > > > explain whether commands block the virtqueue (in order 
> >> >> >> > > > > completion) or
> >> >> >> > > > > not (out of order completion).
> >> >> >> > > > 
> >> >> >> > > > Oh I begin to see. Hmm how does e.g. virtio scsi handle this?
> >> >> >> > > 
> >> >> >> > > virtio-scsi, virtio-blk, and NVMe requests may complete out of 
> >> >> >> > > order.
> >> >> >> > > Several may be processed by the device at the same time.
> >> >> >> > 
> >> >> >> > Let's say I submit a write followed by read - is read
> >> >> >> > guaranteed to return an up to date info?
> >> >> >> 
> >> >> >> In general, no. The driver must wait for the write completion before
> >> >> >> submitting the read if it wants consistency.
> >> >> >> 
> >> >> >> Stefan
> >> >> >
> >> >> >I see.  I think it's a good design to follow then.
> >> >> 
> >> >> Hmm, is it suitable to have this approach for configuration interface?
> >> >> Storage device is a different beast, having parallel reads and writes
> >> >> makes complete sense for performance.
> >> >> 
> >> >> ->read a req
> >> >> ->read b req
> >> >> ->read c req
> >> >> <-read a rep
> >> >> <-read b rep
> >> >> <-read c rep
> >> >> 
> >> >> There is no dependency, even between writes.
> >> >> 
> >> >> But in case of configuration, does not make any sense to me.
> >> >> Why is it needed? To pass the burden of consistency of
> >> >> configuration to driver sounds odd at least.
> >> >> 
> >> >> I sense there is no concete idea about what the "admin virtqueue" should
> >> >> serve for exactly.
> >> >
> >> >It's useful for long-running commands because they prevent other
> >> >commands from executing.
> >> >
> >> >An example I've given is that deleting a group member might require
> >> >waiting for the group member's I/O activity to finish. If that I/O
> >> >activity cannot be cancelled instantaneously, then it could take an
> >> >unbounded amount of time to delete the group member. The device would be
> >> >unable to process futher admin commands.
> >> 
> >> I see. Then I believe that the device should handle the dependencies.
> >> Example 1:
> >> -> REQ cmd to create group member A
> >> -> REQ cmd to create group member B
> >> <- REP cmd to create group member A
> >> <- REP cmd to create group member B
> >> 
> >> The device according to internal implementation can either serialize the
> >> 2 group member creations or do it in parallel, if it supports it.
> >> 
> >> Example 2:
> >> -> REQ cmd to create group member A
> >> -> REQ cmd config group member A
> >> <- REP cmd to create group member A
> >> <- REP cmd config group member A
> >> 
> >> Here the serialization is necessary and the device is the one to take
> >> care of it.
> >> 
> >> Makes sense?
> >
> >Yes, I understand. The spec would need to define ordering rules for
> >specific commands and the device must implement them. It allows the
> >driver to pipeline commands while also allowing out-of-order completion
> >(parallelism) in some cases. The disadvantage of this approach is
> >complexity in the spec and implementations.
> >
> >An alternative is unconditional out-of-order completion, where there are
> >no per-command ordering rules. The driver must wait for a command to
> >complete if it relies on the results of that command for its next
> >command. I like this approach because it's less complex in the spec and
> >for device implementers, while the burden on the driver implementer is
> >still reasonable.
> 
> But isn't this duplicating the burden of maintaining dependencies to
> both driver and device? I mean, device should not depend on driver doing
> the right thing, that means it has to check

[virtio-dev] Re: [virtio] Re: [virtio-comment] Re: [virtio] Re: [PATCH v10 04/10] admin: introduce virtio admin virtqueues

2023-03-08 Thread Jiri Pirko

Wed, Mar 08, 2023 at 01:44:18PM CET, stefa...@redhat.com wrote:
>On Wed, Mar 08, 2023 at 11:17:35AM +0100, Jiri Pirko wrote:
>> Tue, Mar 07, 2023 at 08:03:47PM CET, stefa...@redhat.com wrote:
>> >On Tue, Mar 07, 2023 at 04:07:54PM +0100, Jiri Pirko wrote:
>> >> Tue, Mar 07, 2023 at 03:39:11PM CET, stefa...@redhat.com wrote:
>> >> >On Tue, Mar 07, 2023 at 09:03:18AM +0100, Jiri Pirko wrote:
>> >> >> Mon, Mar 06, 2023 at 07:37:31PM CET, m...@redhat.com wrote:
>> >> >> >On Mon, Mar 06, 2023 at 06:03:40AM -0500, Stefan Hajnoczi wrote:
>> >> >> >> On Sun, Mar 05, 2023 at 07:18:24PM -0500, Michael S. Tsirkin wrote:
>> >> >> >> > On Sun, Mar 05, 2023 at 07:03:02PM -0500, Stefan Hajnoczi wrote:
>> >> >> >> > > On Sun, Mar 05, 2023 at 04:38:59AM -0500, Michael S. Tsirkin 
>> >> >> >> > > wrote:
>> >> >> >> > > > On Fri, Mar 03, 2023 at 03:21:33PM -0500, Stefan Hajnoczi 
>> >> >> >> > > > wrote:
>> >> >> >> > > > > What happens if a command takes 1 second to complete, is 
>> >> >> >> > > > > the device
>> >> >> >> > > > > allowed to process the next command from the virtqueue 
>> >> >> >> > > > > during this time,
>> >> >> >> > > > > possibly completing it before the first command?
>> >> >> >> > > > > 
>> >> >> >> > > > > This requires additional clarification in the spec because 
>> >> >> >> > > > > "they are
>> >> >> >> > > > > processed by the device in the order in which they are 
>> >> >> >> > > > > queued" does not
>> >> >> >> > > > > explain whether commands block the virtqueue (in order 
>> >> >> >> > > > > completion) or
>> >> >> >> > > > > not (out of order completion).
>> >> >> >> > > > 
>> >> >> >> > > > Oh I begin to see. Hmm how does e.g. virtio scsi handle this?
>> >> >> >> > > 
>> >> >> >> > > virtio-scsi, virtio-blk, and NVMe requests may complete out of 
>> >> >> >> > > order.
>> >> >> >> > > Several may be processed by the device at the same time.
>> >> >> >> > 
>> >> >> >> > Let's say I submit a write followed by read - is read
>> >> >> >> > guaranteed to return an up to date info?
>> >> >> >> 
>> >> >> >> In general, no. The driver must wait for the write completion before
>> >> >> >> submitting the read if it wants consistency.
>> >> >> >> 
>> >> >> >> Stefan
>> >> >> >
>> >> >> >I see.  I think it's a good design to follow then.
>> >> >> 
>> >> >> Hmm, is it suitable to have this approach for configuration interface?
>> >> >> Storage device is a different beast, having parallel reads and writes
>> >> >> makes complete sense for performance.
>> >> >> 
>> >> >> ->read a req
>> >> >> ->read b req
>> >> >> ->read c req
>> >> >> <-read a rep
>> >> >> <-read b rep
>> >> >> <-read c rep
>> >> >> 
>> >> >> There is no dependency, even between writes.
>> >> >> 
>> >> >> But in case of configuration, does not make any sense to me.
>> >> >> Why is it needed? To pass the burden of consistency of
>> >> >> configuration to driver sounds odd at least.
>> >> >> 
>> >> >> I sense there is no concete idea about what the "admin virtqueue" 
>> >> >> should
>> >> >> serve for exactly.
>> >> >
>> >> >It's useful for long-running commands because they prevent other
>> >> >commands from executing.
>> >> >
>> >> >An example I've given is that deleting a group member might require
>> >> >waiting for the group member's I/O activity to finish. If that I/O
>> >> >activity cannot be cancelled instantaneously, then it could take an
>> >> >unbounded amount of time to delete the group member. The device would be
>> >> >unable to process futher admin commands.
>> >> 
>> >> I see. Then I believe that the device should handle the dependencies.
>> >> Example 1:
>> >> -> REQ cmd to create group member A
>> >> -> REQ cmd to create group member B
>> >> <- REP cmd to create group member A
>> >> <- REP cmd to create group member B
>> >> 
>> >> The device according to internal implementation can either serialize the
>> >> 2 group member creations or do it in parallel, if it supports it.
>> >> 
>> >> Example 2:
>> >> -> REQ cmd to create group member A
>> >> -> REQ cmd config group member A
>> >> <- REP cmd to create group member A
>> >> <- REP cmd config group member A
>> >> 
>> >> Here the serialization is necessary and the device is the one to take
>> >> care of it.
>> >> 
>> >> Makes sense?
>> >
>> >Yes, I understand. The spec would need to define ordering rules for
>> >specific commands and the device must implement them. It allows the
>> >driver to pipeline commands while also allowing out-of-order completion
>> >(parallelism) in some cases. The disadvantage of this approach is
>> >complexity in the spec and implementations.
>> >
>> >An alternative is unconditional out-of-order completion, where there are
>> >no per-command ordering rules. The driver must wait for a command to
>> >complete if it relies on the results of that command for its next
>> >command. I like this approach because it's less complex in the spec and
>> >for device implementers, while the burden on the driver implementer is
>> >still reasonable.
>> 
>> But isn'

[virtio-dev] Re: [virtio-comment] [PATCH v10 09/10] admin: conformance clauses

2023-03-08 Thread David Edmondson

"Michael S. Tsirkin"  writes:

> On Tue, Mar 07, 2023 at 11:04:33AM +, David Edmondson wrote:
>> "Michael S. Tsirkin"  writes:
>> > +The driver MAY block use of VIRTIO_ADMIN_CMD_LIST_QUERY and
>> > +VIRTIO_ADMIN_CMD_LIST_USE by issuing VIRTIO_ADMIN_CMD_LIST_USE
>> > +with respective bits cleared in \field{command_specific_data}.
>> 
>> This runs contrary to the assertion "The list of supported commands MUST
>> NOT shrink", given that a driver is told to assume that
>> VIRTIO_ADMIN_CMD_LIST_QUERY and VIRTIO_ADMIN_CMD_LIST_USE are the only
>> commands initially available.
>
> Commands are still available just disabled, the meaning of
> "MUST NOT shrink" is clarified by the following:
>
>   > > +after reporting a given command as supported through
>   > > +VIRTIO_ADMIN_CMD_LIST_QUERY the device MUST NOT later report it
>   > > +as unsupported.
> I will stick an "i.e." there to make it hopefully clearer.

This puzzles me.

I can see the point of blocking LIST_USE, but not of blocking
LIST_QUERY. What's the purpose of this?
-- 
Time is waiting to explain, why refuse?

-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

[virtio-dev] Re: [virtio] Re: [virtio-comment] [PATCH v10 09/10] admin: conformance clauses

2023-03-08 Thread Jiri Pirko

Wed, Mar 08, 2023 at 01:59:10PM CET, david.edmond...@oracle.com wrote:
>"Michael S. Tsirkin"  writes:
>
>> On Tue, Mar 07, 2023 at 11:04:33AM +, David Edmondson wrote:
>>> "Michael S. Tsirkin"  writes:
>>> > +The driver MAY block use of VIRTIO_ADMIN_CMD_LIST_QUERY and
>>> > +VIRTIO_ADMIN_CMD_LIST_USE by issuing VIRTIO_ADMIN_CMD_LIST_USE
>>> > +with respective bits cleared in \field{command_specific_data}.
>>> 
>>> This runs contrary to the assertion "The list of supported commands MUST
>>> NOT shrink", given that a driver is told to assume that
>>> VIRTIO_ADMIN_CMD_LIST_QUERY and VIRTIO_ADMIN_CMD_LIST_USE are the only
>>> commands initially available.
>>
>> Commands are still available just disabled, the meaning of
>> "MUST NOT shrink" is clarified by the following:
>>
>>  > > +after reporting a given command as supported through
>>  > > +VIRTIO_ADMIN_CMD_LIST_QUERY the device MUST NOT later report it
>>  > > +as unsupported.
>> I will stick an "i.e." there to make it hopefully clearer.
>
>This puzzles me.
>
>I can see the point of blocking LIST_USE, but not of blocking

What is that point, I'm curious.


>LIST_QUERY. What's the purpose of this?

I think that this might be just for the sake of treating all
commands equally.


>-- 
>Time is waiting to explain, why refuse?
>
>-
>To unsubscribe from this mail list, you must leave the OASIS TC that 
>generates this mail.  Follow this link to all your TCs in OASIS at:
>https://www.oasis-open.org/apps/org/workgroup/portal/my_workgroups.php 
>

-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

[virtio-dev] Re: [virtio] Re: [virtio-comment] [PATCH v10 09/10] admin: conformance clauses

2023-03-08 Thread Michael S. Tsirkin

On Wed, Mar 08, 2023 at 02:05:01PM +0100, Jiri Pirko wrote:
> Wed, Mar 08, 2023 at 01:59:10PM CET, david.edmond...@oracle.com wrote:
> >"Michael S. Tsirkin"  writes:
> >
> >> On Tue, Mar 07, 2023 at 11:04:33AM +, David Edmondson wrote:
> >>> "Michael S. Tsirkin"  writes:
> >>> > +The driver MAY block use of VIRTIO_ADMIN_CMD_LIST_QUERY and
> >>> > +VIRTIO_ADMIN_CMD_LIST_USE by issuing VIRTIO_ADMIN_CMD_LIST_USE
> >>> > +with respective bits cleared in \field{command_specific_data}.
> >>> 
> >>> This runs contrary to the assertion "The list of supported commands MUST
> >>> NOT shrink", given that a driver is told to assume that
> >>> VIRTIO_ADMIN_CMD_LIST_QUERY and VIRTIO_ADMIN_CMD_LIST_USE are the only
> >>> commands initially available.
> >>
> >> Commands are still available just disabled, the meaning of
> >> "MUST NOT shrink" is clarified by the following:
> >>
> >>> > +after reporting a given command as supported through
> >>> > +VIRTIO_ADMIN_CMD_LIST_QUERY the device MUST NOT later report it
> >>> > +as unsupported.
> >> I will stick an "i.e." there to make it hopefully clearer.
> >
> >This puzzles me.
> >
> >I can see the point of blocking LIST_USE, but not of blocking
> 
> What is that point, I'm curious.
> 
> 
> >LIST_QUERY. What's the purpose of this?
> 
> I think that this might be just for the sake of treating all
> commands equally.
> 

Right. There was this vague idea that blocking all commands
is good to have. E.g. if you are making some drastic
changes to the device?


> >-- 
> >Time is waiting to explain, why refuse?
> >
> >-
> >To unsubscribe from this mail list, you must leave the OASIS TC that 
> >generates this mail.  Follow this link to all your TCs in OASIS at:
> >https://www.oasis-open.org/apps/org/workgroup/portal/my_workgroups.php 
> >


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

[virtio-dev] Re: [virtio] Re: [virtio-comment] [PATCH v10 09/10] admin: conformance clauses

2023-03-08 Thread David Edmondson

Jiri Pirko  writes:

> Wed, Mar 08, 2023 at 01:59:10PM CET, david.edmond...@oracle.com wrote:
>>"Michael S. Tsirkin"  writes:
>>
>>> On Tue, Mar 07, 2023 at 11:04:33AM +, David Edmondson wrote:
 "Michael S. Tsirkin"  writes:
 > +The driver MAY block use of VIRTIO_ADMIN_CMD_LIST_QUERY and
 > +VIRTIO_ADMIN_CMD_LIST_USE by issuing VIRTIO_ADMIN_CMD_LIST_USE
 > +with respective bits cleared in \field{command_specific_data}.
 
 This runs contrary to the assertion "The list of supported commands MUST
 NOT shrink", given that a driver is told to assume that
 VIRTIO_ADMIN_CMD_LIST_QUERY and VIRTIO_ADMIN_CMD_LIST_USE are the only
 commands initially available.
>>>
>>> Commands are still available just disabled, the meaning of
>>> "MUST NOT shrink" is clarified by the following:
>>>
>>> > > +after reporting a given command as supported through
>>> > > +VIRTIO_ADMIN_CMD_LIST_QUERY the device MUST NOT later report it
>>> > > +as unsupported.
>>> I will stick an "i.e." there to make it hopefully clearer.
>>
>>This puzzles me.
>>
>>I can see the point of blocking LIST_USE, but not of blocking
>
> What is that point, I'm curious.

I might restrict the commands permitted by a device before handing it
off to some subordinate less trusted code.

>
>>LIST_QUERY. What's the purpose of this?
>
> I think that this might be just for the sake of treating all
> commands equally.
>
>
>>-- 
>>Time is waiting to explain, why refuse?
>>
>>-
>>To unsubscribe from this mail list, you must leave the OASIS TC that 
>>generates this mail.  Follow this link to all your TCs in OASIS at:
>>https://www.oasis-open.org/apps/org/workgroup/portal/my_workgroups.php 
>>
-- 
I used to get mad at my school, the teachers who taught me weren't cool.

-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

[virtio-dev] Re: [virtio] Re: [virtio-comment] [PATCH v10 09/10] admin: conformance clauses

2023-03-08 Thread Jiri Pirko

Wed, Mar 08, 2023 at 02:44:08PM CET, david.edmond...@oracle.com wrote:
>Jiri Pirko  writes:
>
>> Wed, Mar 08, 2023 at 01:59:10PM CET, david.edmond...@oracle.com wrote:
>>>"Michael S. Tsirkin"  writes:
>>>
 On Tue, Mar 07, 2023 at 11:04:33AM +, David Edmondson wrote:
> "Michael S. Tsirkin"  writes:
> > +The driver MAY block use of VIRTIO_ADMIN_CMD_LIST_QUERY and
> > +VIRTIO_ADMIN_CMD_LIST_USE by issuing VIRTIO_ADMIN_CMD_LIST_USE
> > +with respective bits cleared in \field{command_specific_data}.
> 
> This runs contrary to the assertion "The list of supported commands MUST
> NOT shrink", given that a driver is told to assume that
> VIRTIO_ADMIN_CMD_LIST_QUERY and VIRTIO_ADMIN_CMD_LIST_USE are the only
> commands initially available.

 Commands are still available just disabled, the meaning of
 "MUST NOT shrink" is clarified by the following:

> > +after reporting a given command as supported through
> > +VIRTIO_ADMIN_CMD_LIST_QUERY the device MUST NOT later report it
> > +as unsupported.
 I will stick an "i.e." there to make it hopefully clearer.
>>>
>>>This puzzles me.
>>>
>>>I can see the point of blocking LIST_USE, but not of blocking
>>
>> What is that point, I'm curious.
>
>I might restrict the commands permitted by a device before handing it
>off to some subordinate less trusted code.

Hmm, that is theoretical, or is there a usecase for that?


>
>>
>>>LIST_QUERY. What's the purpose of this?
>>
>> I think that this might be just for the sake of treating all
>> commands equally.
>>
>>
>>>-- 
>>>Time is waiting to explain, why refuse?
>>>
>>>-
>>>To unsubscribe from this mail list, you must leave the OASIS TC that 
>>>generates this mail.  Follow this link to all your TCs in OASIS at:
>>>https://www.oasis-open.org/apps/org/workgroup/portal/my_workgroups.php 
>>>
>-- 
>I used to get mad at my school, the teachers who taught me weren't cool.

-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

[virtio-dev] Re: [virtio] Re: [virtio-comment] [PATCH v10 09/10] admin: conformance clauses

2023-03-08 Thread Michael S. Tsirkin

On Wed, Mar 08, 2023 at 03:02:05PM +0100, Jiri Pirko wrote:
> Wed, Mar 08, 2023 at 02:44:08PM CET, david.edmond...@oracle.com wrote:
> >Jiri Pirko  writes:
> >
> >> Wed, Mar 08, 2023 at 01:59:10PM CET, david.edmond...@oracle.com wrote:
> >>>"Michael S. Tsirkin"  writes:
> >>>
>  On Tue, Mar 07, 2023 at 11:04:33AM +, David Edmondson wrote:
> > "Michael S. Tsirkin"  writes:
> > > +The driver MAY block use of VIRTIO_ADMIN_CMD_LIST_QUERY and
> > > +VIRTIO_ADMIN_CMD_LIST_USE by issuing VIRTIO_ADMIN_CMD_LIST_USE
> > > +with respective bits cleared in \field{command_specific_data}.
> > 
> > This runs contrary to the assertion "The list of supported commands MUST
> > NOT shrink", given that a driver is told to assume that
> > VIRTIO_ADMIN_CMD_LIST_QUERY and VIRTIO_ADMIN_CMD_LIST_USE are the only
> > commands initially available.
> 
>  Commands are still available just disabled, the meaning of
>  "MUST NOT shrink" is clarified by the following:
> 
>   > > +after reporting a given command as supported through
>   > > +VIRTIO_ADMIN_CMD_LIST_QUERY the device MUST NOT later report it
>   > > +as unsupported.
>  I will stick an "i.e." there to make it hopefully clearer.
> >>>
> >>>This puzzles me.
> >>>
> >>>I can see the point of blocking LIST_USE, but not of blocking
> >>
> >> What is that point, I'm curious.
> >
> >I might restrict the commands permitted by a device before handing it
> >off to some subordinate less trusted code.
> 
> Hmm, that is theoretical, or is there a usecase for that?
> 

Maybe there will be a future version replacing VIRTIO_ADMIN_CMD_LIST_QUERY
with a different command. Then we start with a device supporting
both VIRTIO_ADMIN_CMD_LIST_QUERY and the new VIRTIO_ADMIN_FOO,
and we block VIRTIO_ADMIN_CMD_LIST_QUERY for testing, to make sure driver
is able to handle future devices without VIRTIO_ADMIN_CMD_LIST_QUERY.


> >
> >>
> >>>LIST_QUERY. What's the purpose of this?
> >>
> >> I think that this might be just for the sake of treating all
> >> commands equally.
> >>
> >>
> >>>-- 
> >>>Time is waiting to explain, why refuse?
> >>>
> >>>-
> >>>To unsubscribe from this mail list, you must leave the OASIS TC that 
> >>>generates this mail.  Follow this link to all your TCs in OASIS at:
> >>>https://www.oasis-open.org/apps/org/workgroup/portal/my_workgroups.php 
> >>>
> >-- 
> >I used to get mad at my school, the teachers who taught me weren't cool.


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

[virtio-dev] Re: [virtio-comment] Re: [virtio] Re: [PATCH v10 04/10] admin: introduce virtio admin virtqueues

2023-03-08 Thread Stefan Hajnoczi

On Wed, Mar 08, 2023 at 01:17:33PM +0200, Max Gurtovoy wrote:
> 
> 
> On 06/03/2023 18:25, Stefan Hajnoczi wrote:
> > On Mon, Mar 06, 2023 at 05:28:03PM +0200, Max Gurtovoy wrote:
> > > 
> > > 
> > > On 06/03/2023 13:20, Stefan Hajnoczi wrote:
> > > > On Mon, Mar 06, 2023 at 04:00:50PM +0800, Jason Wang wrote:
> > > > > 
> > > > > 在 2023/3/6 08:03, Stefan Hajnoczi 写道:
> > > > > > On Sun, Mar 05, 2023 at 04:38:59AM -0500, Michael S. Tsirkin wrote:
> > > > > > > On Fri, Mar 03, 2023 at 03:21:33PM -0500, Stefan Hajnoczi wrote:
> > > > > > > > What happens if a command takes 1 second to complete, is the 
> > > > > > > > device
> > > > > > > > allowed to process the next command from the virtqueue during 
> > > > > > > > this time,
> > > > > > > > possibly completing it before the first command?
> > > > > > > > 
> > > > > > > > This requires additional clarification in the spec because 
> > > > > > > > "they are
> > > > > > > > processed by the device in the order in which they are queued" 
> > > > > > > > does not
> > > > > > > > explain whether commands block the virtqueue (in order 
> > > > > > > > completion) or
> > > > > > > > not (out of order completion).
> > > > > > > Oh I begin to see. Hmm how does e.g. virtio scsi handle this?
> > > > > > virtio-scsi, virtio-blk, and NVMe requests may complete out of 
> > > > > > order.
> > > > > > Several may be processed by the device at the same time.
> > > > > > 
> > > > > > They rely on multi-queue for abort operations:
> > > > > > 
> > > > > > In virtio-scsi the abort requests (VIRTIO_SCSI_T_TMF_ABORT_TASK) are
> > > > > > sent on the control virtqueue. The the request identifier namespace 
> > > > > > is
> > > > > > shared across all virtqueues so it's possible to abort a request 
> > > > > > that
> > > > > > was submitted to any command virtqueue.
> > > > > > 
> > > > > > NVMe also follows the same design where abort commands are sent on 
> > > > > > the
> > > > > > Admin Submission Queue instead of an I/O Submission Queue. It's 
> > > > > > possible
> > > > > > to identify NVMe requests by  > > > > > Identifier>.
> > > > > > 
> > > > > > virtio-blk doesn't support aborting requests.
> > > > > > 
> > > > > > I think the logic behind this design is that if a queue gets stuck
> > > > > > processing long-running requests, then the device should not be 
> > > > > > forced
> > > > > > to perform lookahead in the queue to find abort commands. A separate
> > > > > > control/admin queue is used for the abort requests.
> > > > > 
> > > > > 
> > > > > Or device need mandate some kind of QOS here, e.g a request must be 
> > > > > complete
> > > > > in some time. Otherwise we don't have sufficient reliability for 
> > > > > using it as
> > > > > management task?
> > > > 
> > > > Yes, if all commands can be executed in bounded time then a guarantee is
> > > > possible.
> > > > 
> > > > Here is an example where that's hard: imagine a virtio-blk device backed
> > > > by network storage. When an admin queue command is used to delete a
> > > > group member, any of the group member's in-flight I/O requests need to
> > > > be aborted. If the network hangs while the group member is being
> > > > deleted, then the device can't complete an orderly shutdown of I/O
> > > > requests in a reasonable time.
> > > > 
> > > > That example shows a basic group admin command that I think Michael is
> > > > about to propose. We can't avoid this problem by not making it a group
> > > > admin command - it needs to be a group admin command. So I think it's
> > > > likely that there will be admin commands that take an unbounded amount
> > > > of time to complete. One way to achieve what you mentioned is timeouts.
> > > 
> > > I think that you're getting into device specific implementation details 
> > > and
> > > I'm not sure it's necessary.
> > > 
> > > I don't think we need to abort admin commands. Admin commands can be
> > > flushed/aborted during the device reset phase.
> > > Only IO commands should have the possibility to being aborted as you
> > > mentioned in NVMe and SCSI (and potentially in virtio-blk).
> > 
> > It's a general design issue that should be clarified now rather than
> > being left unspecified.
> > 
> > I'm not saying that it must be possible to abort admin commands. There
> > are other options, like requiring the device itself to fail a command
> > after a timeout.
> 
> do you have an example of timeout today for control vq ?

Do you mean the virtio-net control virtqueue? I don't think it has any
commands with an unbounded execution time.

> > 
> > Or we could say that admin commands must complete within bounded time,
> > but I'm not sure that is implementable for some device types like
> > virtio-blk, virtio-scsi, and virtiofs.
> 
> No we can't.
> Some commands, for example FW upgrade can take 10 minutes and it's perfectly
> fine. Other commands like setting feature bit will take 1 millisec.
> Each device implements commands in a different internal logic so we can't
> expec

[virtio-dev] Re: [virtio] [PATCH v10 01/10] virtio: document forward compatibility guarantees

2023-03-08 Thread Cornelia Huck

On Tue, Mar 07 2023, David Edmondson  wrote:

> "Michael S. Tsirkin"  writes:
>
>> On Mon, Mar 06, 2023 at 01:53:50PM +, David Edmondson wrote:
>>> "Michael S. Tsirkin"  writes:
>>> 
>>> > Feature negotiation forms the basis of forward compatibility
>>> > guarantees of virtio but has never been properly documented.
>>> > Do it now.
>>> >
>>> > Suggested-by: Halil Pasic 
>>> > Signed-off-by: Michael S. Tsirkin 
>>> > ---
>>> >  content.tex | 42 ++
>>> >  1 file changed, 42 insertions(+)
>>> >
>>> > diff --git a/content.tex b/content.tex
>>> > index 0e474dd..0c2d917 100644
>>> > --- a/content.tex
>>> > +++ b/content.tex
>>> > @@ -114,21 +114,63 @@ \section{Feature Bits}\label{sec:Basic Facilities 
>>> > of a Virtio Device / Feature B
>>> >  In particular, new fields in the device configuration space are
>>> >  indicated by offering a new feature bit.
>>> >  
>>> > +To keep the feature negotiation mechanism extensible, it is important
>>> > +that devices \em{do not} offer any feature bits that they would not be
>>> > +able to handle if the driver accepted them (even though drivers are not
>>> > +supposed to accept them in the first place even if offered, according to
>>> > +this version of the specification.) 
>>> 
>>> I find this (the bit in parenthesis) confusing.
>>> 
>>> Why are drivers not supposed to accept features that they have been
>>> offered, given that they can't know that the device cannot handle the
>>> feature that it just offered?
>>> 
>>> Is this alluding to the later section:
>>> 
>>> > feature bits not described in this specification, reserved feature
>>> > bits and feature bits reserved or not supported for the specific
>>> > transport or the specific device type
>>> 
>>> ?
>>
>> exactly. how would you put this better? given an example?
>
> Perhaps it would be enough to say:
>
>> (even though drivers are not supposed to accept unrecognised features in
>> the first place even if offered, according to the specification)
>
> "Unrecognised" is intended as a shorthand for the whole "not described,
> reserved, ...". Maybe "unrecognised or reserved"?

Hm, what about

"even though drivers are not supposed to accept any unspecified,
reserved, or unsupported features even if offered..."

?

I'm not sure how we can make this both short and descriptive enough...


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

60 matches

Mail list logo