[virtio-dev] Re: [PATCH v10 1/8] content: Add vq number text

2023-03-30 Thread Cornelia Huck
On Thu, Mar 30 2023, Parav Pandit  wrote:

> Introduce vq number and its range so that subsequent patches can refer
> to it.
>
> Fixes: https://github.com/oasis-tcs/virtio-spec/issues/163
> Signed-off-by: Parav Pandit 
> ---
> changelog:
> v9->v10:
> - added braces around vq number wording
> - added vqn as another term for vq number
> v8->v9:
> - added inclusive when describing the vq number range
> - skipped comment to put virtqueue number wording first because we
>   prefer to use shorter vq number as much as possible
> v5->v6:
> - moved description close to introduction, it was in middle of
>   queue data transfer description
> v2->v3:
> - new patch
> ---
>  content.tex | 4 
>  1 file changed, 4 insertions(+)

Reviewed-by: Cornelia Huck 


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org



[virtio-dev] Re: [PATCH v10 6/8] transport-ccw: Refer to the vq by its number

2023-03-30 Thread Cornelia Huck
On Thu, Mar 30 2023, Parav Pandit  wrote:

> Currently specification uses virtqueue index and
> number interchangeably to refer to the virtqueue.
>
> Instead refer to it by its number.
>
> Fixes: https://github.com/oasis-tcs/virtio-spec/issues/163
> Signed-off-by: Parav Pandit 
>
> ---
> changelog:
> v9->v10:
> - replaced queue_select with vqn
> - used lower case letter for first word in comment
> v8->v9:
> - replaced 'named' with 'known'
> - replaced 'queue number' with 'vq number'
> v3->v4:
> - moved note to comment
> v2->v3:
> - added comment note for queue_select similar to max_queue_size
> v0->v1:
> - new patch
> ---
>  transport-ccw.tex | 14 --
>  1 file changed, 8 insertions(+), 6 deletions(-)

Reviewed-by: Cornelia Huck 


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org



[virtio-dev] Re: [PATCH v10 7/8] virtio-net: Avoid duplicate receive queue example

2023-03-30 Thread Cornelia Huck
On Thu, Mar 30 2023, Parav Pandit  wrote:

> Receive queue number/index example is duplicate which is already defined
> in the Setting RSS parameters section.
>
> Hence, avoid such duplicate example and prepare it for the subsequent
> patch to describe using receive queue handle.
>
> Fixes: https://github.com/oasis-tcs/virtio-spec/issues/163
> Signed-off-by: Parav Pandit 
> ---
>  device-types/net/description.tex | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/device-types/net/description.tex 
> b/device-types/net/description.tex
> index d7c8b1b..435c1fc 100644
> --- a/device-types/net/description.tex
> +++ b/device-types/net/description.tex
> @@ -1467,8 +1467,8 @@ \subsubsection{Control Virtqueue}\label{sec:Device 
> Types / Network Device / Devi
>  The device MUST determine the destination queue for a network packet as 
> follows:
>  \begin{itemize}
>  \item Calculate the hash of the packet as defined in \ref{sec:Device Types / 
> Network Device / Device Operation / Processing of Incoming Packets / Hash 
> calculation for incoming packets}.
> -\item If the device did not calculate the hash for the specific packet, the 
> device directs the packet to the receiveq specified by 
> \field{unclassified_queue} of virtio_net_rss_config structure (value of 0 
> corresponds to receiveq1).
> -\item Apply \field{indirection_table_mask} to the calculated hash and use 
> the result as the index in the indirection table to get 0-based number of 
> destination receiveq (value of 0 corresponds to receiveq1).
> +\item If the device did not calculate the hash for the specific packet, the 
> device directs the packet to the receiveq specified by 
> \field{unclassified_queue} of virtio_net_rss_config structure.
> +\item Apply \field{indirection_table_mask} to the calculated hash and use 
> the result as the index in the indirection table to get 0-based number of 
> destination receiveq.

As you touch this sentence anyway, maybe make it "the 0-based number"?

>  \item If the destination receive queue is being reset (See \ref{sec:Basic 
> Facilities of a Virtio Device / Virtqueues / Virtqueue Reset}), the device 
> MUST drop the packet.
>  \end{itemize}
>  

Otherwise,

Reviewed-by: Cornelia Huck 


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org



[virtio-dev] Re: [PATCH v10 8/8] virtio-net: Describe RSS using rss rq id

2023-03-30 Thread Cornelia Huck
On Thu, Mar 30 2023, Parav Pandit  wrote:

> The content of indirection table and unclassified_queue which are
> based on math calculation historically. To better describe this, to
> avoid intermixing array index with virtqueue index and to use virtqueue
> number
>
> introduce a field rq_handle (receive queue handle) and refer them
> to describe unclassified_queue and indirection_table fields.

This description is a bit confusing (and you renamed rq_handle as well.)
Does

"The content of the indirection table and unclassified_queue were
originally described based on mathematical operations. In order to make
it easier to understand and to avoid intermixing the array index with
the virtqueue number, introduce a structure rss_rq_id (RSS receive queue
ID) and use it to describe the unclassified_queue and indirection_table
fields."

capture the intent correctly?

>
> As part of it, have the example that uses non zero virtqueue

"have the example use a non-zero..." ?

> number which helps to have better mapping between receiveX
> object with virtqueue number and the actual value in the
> indirection table.
>
> Fixes: https://github.com/oasis-tcs/virtio-spec/issues/163
> Signed-off-by: Parav Pandit 
>
> ---
> changelog:
> v8->v9:
> - reword rss_rq_id and unclassified_packets description
> - use article
> - use 'vq number' instead of 'virtqueue number'
> v4->v5:
> - change subject to reflect rss_rq_id
> - fixed accidental removal of "unclassifed packets".
> - simplfied text around indirection_table mask to use the term
>   destination receive virtqueue. This aligns with next line about queue
>   reset.
> - removed rss_rq_id references as indirection table and
>   unclassified_queue data type is self explanatory
> v3->v4:
> - renamed rq_handle to rss_rq_id
> - moved rss_rq_id definition close to its usage in rss_config struct
> v2->v3:
> - moved rq_handle definition before using it
> - removed duplicate example as rq_handle is now described first
> v0->v1:
> - new patch suggested by Michael Tsirkin
> ---
>  device-types/net/description.tex | 27 ---
>  1 file changed, 20 insertions(+), 7 deletions(-)
>
> diff --git a/device-types/net/description.tex 
> b/device-types/net/description.tex
> index 435c1fc..c64335d 100644
> --- a/device-types/net/description.tex
> +++ b/device-types/net/description.tex
> @@ -1423,11 +1423,16 @@ \subsubsection{Control Virtqueue}\label{sec:Device 
> Types / Network Device / Devi
>  
>  Driver sends a VIRTIO_NET_CTRL_MQ_RSS_CONFIG command using the following 
> format for \field{command-specific-data}:
>  \begin{lstlisting}
> +struct rss_rq_id {
> +   le16 vqn_1_16: 15; /* Bits 1 to 16 of the vq number */
> +   le16 reserved: 1; /* Set to zero */
> +};
> +
>  struct virtio_net_rss_config {
>  le32 hash_types;
>  le16 indirection_table_mask;
> -le16 unclassified_queue;
> -le16 indirection_table[indirection_table_length];
> +struct rss_rq_id unclassified_queue;
> +struct rss_rq_id indirection_table[indirection_table_length];
>  le16 max_tx_vq;
>  u8 hash_key_length;
>  u8 hash_key_data[hash_key_length];
> @@ -1442,10 +1447,15 @@ \subsubsection{Control Virtqueue}\label{sec:Device 
> Types / Network Device / Devi
>  \field{indirection_table} array.
>  Number of entries in \field{indirection_table} is 
> (\field{indirection_table_mask} + 1).
>  
> -Field \field{unclassified_queue} contains the 0-based index of
> -the receive virtqueue to place unclassified packets in. Index 0 corresponds 
> to receiveq1.
> +\field{rss_rq_id} is a receive virtqueue id. \field{vqn_1_16}
> +consists of bits 1 to 16 of a vq number. For example, a
> +\field{vqn_1_16} value of 3 corresponds to vq number 6,
> +which maps to receiveq4.
> +
> +Field \field{unclassified_queue} contains the receive virtqueue
> +in which to place unclassified packets.
>  
> -Field \field{indirection_table} contains an array of 0-based indices of 
> receive virtqueues. Index 0 corresponds to receiveq1.
> +Field \field{indirection_table} is an array of receive virtqueues.

"an array of receive virtqueues identified via their rss_rq_id" ?

>  
>  A driver sets \field{max_tx_vq} to inform a device how many transmit 
> virtqueues it may use (transmitq1\ldots transmitq \field{max_tx_vq}).
>  
> @@ -1455,7 +1465,8 @@ \subsubsection{Control Virtqueue}\label{sec:Device 
> Types / Network Device / Devi
>  
>  A driver MUST NOT send the VIRTIO_NET_CTRL_MQ_RSS_CONFIG command if the 
> feature VIRTIO_NET_F_RSS has not been negotiated.
>  
> -A driver MUST fill the \field{indirection_table} array only with indices of 
> enabled queues. Index 0 corresponds to receiveq1.
> +A driver MUST fill the \field{indirection_table} array only with
> +enabled receive virtqueues.

"only with rss_rq_id references to enabled receive virtqueues" ?

>  
>  The number of entries in \field{indirection_table} 
> (\field{indirection_table_mask} + 1) MUST be a power of two.
>  
> @@ -1468,7 +1479,9 @@ \subsubsection{

[virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [PATCH v9] virtio-net: support inner header hash

2023-03-30 Thread Heng Qi




在 2023/3/21 上午3:45, Michael S. Tsirkin 写道:

On Thu, Mar 16, 2023 at 09:17:26PM +0800, Heng Qi wrote:

On Wed, Mar 15, 2023 at 10:57:40AM -0400, Michael S. Tsirkin wrote:

On Wed, Mar 15, 2023 at 08:55:45PM +0800, Heng Qi wrote:


在 2023/3/15 下午7:58, Michael S. Tsirkin 写道:

On Sat, Mar 11, 2023 at 11:23:08AM +0800, Heng Qi wrote:


在 2023/3/10 上午3:36, Michael S. Tsirkin 写道:

On Thu, Mar 09, 2023 at 12:55:02PM +0800, Heng Qi wrote:

在 2023/3/8 下午10:39, Michael S. Tsirkin 写道:

On Wed, Mar 01, 2023 at 10:56:31AM +0800, Heng Qi wrote:

在 2023/2/28 下午7:16, Michael S. Tsirkin 写道:

On Sat, Feb 18, 2023 at 10:37:15PM +0800, Heng Qi wrote:

If the tunnel is used to encapsulate the packets, the hash calculated
using the outer header of the receive packets is always fixed for the
same flow packets, i.e. they will be steered to the same receive queue.

Wait a second. How is this true? Does not everyone stick the
inner header hash in the outer source port to solve this?

Yes, you are right. That's what we did before the inner header hash, but it
has a performance penalty, which I'll explain below.


For example geneve spec says:

it is necessary for entropy from encapsulated packets to be
exposed in the tunnel header.  The most common technique for this is
to use the UDP source port

The end point of the tunnel called the gateway (with DPDK on top of it).

1. When there is no inner header hash, entropy can be inserted into the udp
src port of the outer header of the tunnel,
and then the tunnel packet is handed over to the host. The host needs to
take out a part of the CPUs to parse the outer headers (but not drop them)
to calculate the inner hash for the inner payloads,
and then use the inner
hash to forward them to another part of the CPUs that are responsible for
processing.

I don't get this part. Leave inner hashes to the guest inside the
tunnel, why is your host doing this?

Let's simplify some details and take a fresh look at two different
scenarios: VXLAN and GENEVE (Scenario1) and GRE (Scenario2).

1. In Scenario1, we can improve the processing performance of the same flow
by implementing inner symmetric hashing.

This is because even though client1 and client2 communicate bidirectionally
through the same flow, their data may pass

through and be encapsulated by different tunnels, resulting in the same flow
being hashed to different queues and processed by different CPUs.

To ensure consistency and optimized processing, we need to parse out the
inner header and compute a symmetric hash on it using a special rss key.

Sorry for not mentioning the inner symmetric hash before, in order to
prevent the introduction of more concepts, but it is indeed a kind of inner
hash.

If parts of a flow go through different tunnels won't this cause
reordering at the network level? Why is it so important to prevent it at
the nic then?  Or, since you are stressing symmetric hash, are you
talking about TX and RX side going through different tunnels?

Yes, the directions client1->client2 and client2->client1 may go through
different tunnels.
Using inner symmetric hashing can satisfy the same CPU to process two
directions of the same flow to improve performance.

Well sure but ... are you just doing forwarding or inner processing too?

When there is an inner hash, there is no forwarding anymore.


If forwarding why do you care about matching TX and RX queues? If e2e

In fact, we are just matching on the same rx queue. The network topology
is roughly as follows. The processing host will receive the packets
sent from client1 and client2 respectively, then make some action judgments,
and return them to client2 and client1 respectively.

client1   client2
| |
|  __ |
+->| tunnel |<+
   ||
  |  |
  |  |
  |  |
  v  v
+-+
| processing host |
+-+

Thanks.

monotoring host would be a better term


Sure.

I'm so sorry I didn't realize I missed this until I checked my emails. 😮 :(





processing can't you just store the incoming hash in the flow and reuse
on TX? This is what Linux is doing...






2. In Scenario2 with GRE, the lack of outer transport headers means that
flows between multiple communication pairs encapsulated by the same tunnel

will all be hashed to the same queue. To address this, we need to implement
inner hashing to improve the performance of RSS. By parsing and calculating

the inner hash, different flows can be hashed to different queues.

Thanks.



Well 2 is at least inexact, there's flowID there. It's just 8 bit

We use the most basic GRE header fields (not NVGRE), not even optional
fields.
There is also no flow id in the GRE header, should you be referring to
NVGRE?

Thanks.


so not sufficient if there are more than 512 queues. Still 512 queues
is quite a lot. Are you trying to solve for c

[virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [PATCH v9] virtio-net: support inner header hash

2023-03-30 Thread Heng Qi




在 2023/3/21 上午3:48, Michael S. Tsirkin 写道:

On Wed, Mar 15, 2023 at 08:55:45PM +0800, Heng Qi wrote:

We use the most basic GRE header fields (not NVGRE), not even optional
fields.

I'd say yes, the most convincing usecase is with legacy GRE.


Yes. But we still have a strong need for VXLAN and GENEVE to do 
symmetric hashing. Please consider this.



Given that, do you need the rest of protocols there?


I would say that I checked the current tunneling protocols used for 
overlay networks and their respective RFC versions compared to each other.


They are:

1. GRE_rfc2784 :This protocol is only specified for IPv4 and used as 
either the payload or delivery protocol.

    link : https://datatracker.ietf.org/doc/rfc2784/

2. GRE_rfc2890: This protocol describes extensions by which two fields, 
Key and Sequence Number, can be optionally carried in the GRE Header.

    link: https://www.rfc-editor.org/rfc/rfc2890

3. GRE_rfc7676: IPv6 Support for Generic Routing Encapsulation (GRE). 
This protocol is specified for IPv6 and used as either the payload or 
delivery protocol.
    Note that this does not change the GRE header format or any 
behaviors specified by RFC 2784 or RFC 2890.

    link: https://datatracker.ietf.org/doc/rfc7676/

4. GRE-in-UDP: GRE-in-UDP Encapsulation. This specifies a method of 
encapsulating network protocol packets within GRE and UDP headers.
    This GRE-in-UDP encapsulation allows the UDP source port field to 
be used as an entropy field. This protocol is specified for IPv4 and 
IPv6, and used as either the payload or delivery protocol.

    link: https://www.rfc-editor.org/rfc/rfc8086

5. VXLAN: Virtual eXtensible Local Area Network.
    link: https://datatracker.ietf.org/doc/rfc7348/

6. VXLAN-GPE: Generic Protocol Extension for VXLAN. This protocol 
describes extending Virtual eXtensible Local Area Network (VXLAN) via 
changes to the VXLAN header.

    link: https://www.ietf.org/archive/id/draft-ietf-nvo3-vxlan-gpe-12.txt

7. GENEVE: Generic Network Virtualization Encapsulation.
    link: https://datatracker.ietf.org/doc/rfc8926/

8. IPIP: IP Encapsulation within IP.
    link: https://www.rfc-editor.org/rfc/rfc2003

9. NVGRE: Network Virtualization Using Generic Routing Encapsulation
    link: https://www.rfc-editor.org/rfc/rfc7637.html

10. STT: Stateless Transport Tunneling. STT is particularly useful when 
some tunnel endpoints are in end-systems, as it utilizes the 
capabilities of the network interface card to improve performance.

  link: https://www.ietf.org/archive/id/draft-davie-stt-08.txt

Among them, GRE_rfc2784, VXLAN and GENEVE are our internal requirements 
for inner header hashing.

GRE_rfc2784 requires RSS hashing to different queues.
For the monitoring scenario I mentioned, VXLAN or GRE_rfc2890 also needs 
to use inner symmetric hashing.


I know you mean to want this feature to only support GRE_rfc2784, since 
it's the most convincing for RSS.

But RSS hashes packets to different queues for different streams.
For the same flow, it needs to hash it to the same queue.
So this doesn't distort the role of RSS, and I believe that for modern 
protocols like VXLAN and others, inner symmetric hashing is still a 
common requirement for other vendors using virtio devices.


So, can we make this feature support all the protocols I have checked 
above, so that vendors can choose to support the protocols they want. 
And this can avoid the addition of new tunnel protocols

in the near future as much as possible.

Do you think it's ok?

Again: I'm so sorry I didn't realize I missed this until I checked my 
emails. 🙁😮



We can start with just legacy GRE (think about including IPv6 or not).
Given how narrow this usecase is, I'd be fine with focusing
just on this, and addressing more protocols down the road
with something programmable like BPF. WDYT?




-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org



[virtio-dev] Re: [virtio-comment] [PATCH v10 2/8] transport-pci: Refer to the vq by its number

2023-03-30 Thread Halil Pasic
On Thu, 30 Mar 2023 00:23:35 +0300
Parav Pandit  wrote:

> Currently specification uses virtqueue index and
> number interchangeably to refer to the virtqueue.
> 
> Instead refer to it by its number.
> 
> Fixes: https://github.com/oasis-tcs/virtio-spec/issues/163
> Reviewed-by: Cornelia Huck 
> Reviewed-by: Max Gurtovoy 
> Reviewed-by: Jiri Pirko 
> Signed-off-by: Parav Pandit 
> ---

[..]

>  transport-pci.tex | 13 +++--
>  1 file changed, 7 insertions(+), 6 deletions(-)
> 
> diff --git a/transport-pci.tex b/transport-pci.tex
> index b07a822..0f3a48b 100644
> --- a/transport-pci.tex
> +++ b/transport-pci.tex
> @@ -390,13 +390,14 @@ \subsubsection{Common configuration structure 
> layout}\label{sec:Virtio Transport
>  
>  \item[\field{queue_notify_data}]
>  This field exists only if VIRTIO_F_NOTIF_CONFIG_DATA has been 
> negotiated.
> -The driver will use this value to put it in the 'virtqueue number' 
> field
> +The driver uses this value in the field \field{vqn}

This is one of the problems with this approach... 

>  in the available buffer notification structure.
>  See section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus 
> / PCI-specific Initialization And Device Operation / Available Buffer 
> Notifications}.
>  \begin{note}
>  This field provides the device with flexibility to determine how 
> virtqueues
>  will be referred to in available buffer notifications.
> -In a trivial case the device can set \field{queue_notify_data}=vqn. 
> Some devices
> +In a trivial case the device can set
> +\field{queue_notify_data} to the vq number. Some devices

... the 'vq' number here is not the same vqn above which renders the usage
of vqn ambiguous. 

> @@ -1053,7 +1054,7 @@ \subsubsection{Available Buffer 
> Notifications}\label{sec:Virtio Transport Option
>  If VIRTIO_F_NOTIF_CONFIG_DATA has been negotiated:
>  \begin{itemize}
>  \item If VIRTIO_F_NOTIFICATION_DATA has not been negotiated, the driver MUST 
> use the
> -\field{queue_notify_data} value instead of the virtqueue index.
> +\field{queue_notify_data} value instead of the vq number.
>  \item If VIRTIO_F_NOTIFICATION_DATA has been negotiated, the driver MUST set 
> the
>  \field{vqn} field to the \field{queue_notify_data} value.

And here it is really obvious: if VIRTIO_F_NOTIF_CONFIG_DATA has been negotiated
the field \field{vqn} does not hold a virtqueue number/vq  nuber/vqn but
a device supplied identifier which may or may not be same as the vqn.

So we went 
from:
if !VIRTIO_F_NOTIF_CONFIG_DATA then vqn is the virtqueue index
if !!VIRTIO_F_NOTIF_CONFIG_DATA then vqn is queue_notify_data which may
or may not be the same as the virtqueue index
to
if !VIRTIO_F_NOTIF_CONFIG_DATA then vqn is the vq number
if !!VIRTIO_F_NOTIF_CONFIG_DATA then vqn is queue_notify_data which may
or may not be the same as the vq number.

Which is my opinion less clear that what we had before. And in my opinion
calling the very same thing virtqueue und vq number interchangeably is
not helpful either -- makes grepping harder.

I don't see the benefit of replacing virtqueue index with virtqueue
number. AFAIR the supposed benefit was to:
a) harmonize the terminology in the notifications part with the rest of
the spec
b) to resolve the RSS problematic with its receive virtqueue indexes.

For b) we are going down a different route calling those receive queue
ids (AFAIR), and for the notifications part see my comments above.

I'm about to reply to the cover letter, and make my argument against
standardizing virtqueue nuber (and in favor of standardizing on the
on virtqueue index) along with a diff that is supposed to act as
a counter proposal. If that doesn't work I will give up and make peace
with vq number and vqn.

Regards,
Halil 


>  \end{itemize}


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org



Re: [virtio-dev] [PATCH v10 0/8] Rename queue index to queue number

2023-03-30 Thread Halil Pasic
On Thu, 30 Mar 2023 00:23:33 +0300
Parav Pandit  wrote:

> 1. Currently, virtqueue is identified between driver and device
> interchangeably using either number or index terminology.
> 
> 2. Between PCI and MMIO transport the queue size (depth) is
> defined as queue_size and QueueNum respectively.
> 
> To avoid confusion and to have consistency, unify them to use Number.
> 
> Solution:
> a. Use virtqueue number description, and rename MMIO register as QueueSize.

I'm in favor of replacing number with size where appropriate.

> b. Replace virtqueue index with virtqueue number

I don't see the benefit of replacing virtqueue index with virtqueue
number.

Currently virtqueue number is only used in the parts that describe
notifications (Guest->Host), the rest of the spec uses virtqueue index.

I argue that using a different term in that context than in the rest
of the specification makes sense, because in the context of notifications
the virtqueue isn't always identified by its index.

More precisely: if VIRTIO_F_NOTIF_CONFIG_DATA has been negotiated in the
context of notifications the virtqueue is identified by the
so called "queue_notify_data"; if VIRTIO_F_NOTIF_CONFIG_DATA has been
negotiated in the context of notifications the virtqueue is identified by
the virtqueue index (as usual, for example in queue_select, or in
the ccws).

As I've pointed out in my comment to patch 2, I believe replacing
virtqueue index with virtqueue number is detrimental to clarity.

Thus please find a counter-proposal below. If there is interest
I can make a series out of it, and prettify it. If I can't convince
you guys, then I will have to get used to vqn and virtqueue number.

AFAIR the other problem with index was the RSS for virtio-net. But there
we are currently heading down a direction of introducing a new
abstraction. This approach avoids confusion around the term 'virtqueue
index' as much as it avoids confusion around the term 'virtqueue nuber'.


> c. RSS area of virtio net has inherited some logic, describe it
> using abstract rss_rq_id.

-8<--
From: Halil Pasic 
Date: Thu, 30 Mar 2023 17:57:53 +0200
Subject: [PATCH 1/1] content: clarify how virtques are identified

Clarify how virtqueues are identified in the context of
available notifications and in the context of RSS for
virtio-net .

Signed-off-by: Halil Pasic 
---
 content.tex  | 15 ++-
 device-types/net/description.tex | 30 ++
 transport-ccw.tex|  2 +-
 transport-pci.tex|  7 ---
 4 files changed, 37 insertions(+), 17 deletions(-)

diff --git a/content.tex b/content.tex
index cff548a..a376232 100644
--- a/content.tex
+++ b/content.tex
@@ -298,6 +298,9 @@ \section{Virtqueues}\label{sec:Basic Facilities of a Virtio 
Device / Virtqueues}
 virtqueues\footnote{For example, the simplest network device has one virtqueue 
for
 transmit and one for receive.}.
 
+If not otherwise stated, each virtqueue is identified by a so called virtqueue
+index. Valid virtqueue indexes range from 0 up to 65535.
+
 Driver makes requests available to device by adding
 an available buffer to the queue, i.e., adding a buffer
 describing the request to a virtqueue, and optionally triggering
@@ -402,7 +405,7 @@ \section{Driver Notifications} \label{sec:Basic Facilities 
of a Virtio Device /
 
 When VIRTIO_F_NOTIFICATION_DATA has not been negotiated,
 this notification involves sending the
-virtqueue number to the device (method depending on the transport).
+virtqueue index to the device (method depending on the transport).
 
 However, some devices benefit from the ability to find out the
 amount of available data in the queue without accessing the virtqueue in 
memory:
@@ -413,7 +416,9 @@ \section{Driver Notifications} \label{sec:Basic Facilities 
of a Virtio Device /
 the following information:
 
 \begin{description}
-\item [vqn] VQ number to be notified.
+\item [vqn] Identifies the virtqueueue to be notified. If 
VIRTIO_F_NOTIF_CONFIG_DATA has been
+  negotiateted, the value of \field{vqn} is the notify data suplied by the 
device
+  (by transport specific means). Otherwise it is the virtqueue index.
 \item [next_off] Offset
   within the ring where the next available ring entry
   will be written.
@@ -786,13 +791,13 @@ \chapter{Reserved Feature Bits}\label{sec:Reserved 
Feature Bits}
   buffer notifications.
   As mentioned in section \ref{sec:Basic Facilities of a Virtio Device / 
Driver notifications}, when the
   driver is required to send an available buffer notification to the device, it
-  sends the virtqueue number to be notified. The method of delivering
+  sends the virtqueue index of the virtqueue to be notified. The method of 
delivering
   notifications is transport specific.
   With the PCI transport, the device can optionally provide a per-virtqueue 
value
-  for the driver to use in driver notifications, ins

[virtio-dev] RE: [virtio-comment] [PATCH v10 2/8] transport-pci: Refer to the vq by its number

2023-03-30 Thread Parav Pandit


> From: Halil Pasic 
> Sent: Thursday, March 30, 2023 1:10 PM

> > -The driver will use this value to put it in the 'virtqueue number' 
> > field
> > +The driver uses this value in the field \field{vqn}
> 
> This is one of the problems with this approach...
> 
Instead of vqn, it is just q identifier.
Sometimes is vqn, sometimes its q config data.
Currently wording has taken care of it.
But a better field name can also do it in future.

> >  in the available buffer notification structure.
> >  See section \ref{sec:Virtio Transport Options / Virtio Over PCI 
> > Bus / PCI-
> specific Initialization And Device Operation / Available Buffer 
> Notifications}.
> >  \begin{note}
> >  This field provides the device with flexibility to determine how
> virtqueues
> >  will be referred to in available buffer notifications.
> > -In a trivial case the device can set 
> > \field{queue_notify_data}=vqn. Some
> devices
> > +In a trivial case the device can set
> > +\field{queue_notify_data} to the vq number. Some devices
> 
> ... the 'vq' number here is not the same vqn above which renders the usage of
> vqn ambiguous.
> 
Not sure I follow. It is the vqn of above case in trivial case.

> > @@ -1053,7 +1054,7 @@ \subsubsection{Available Buffer
> > Notifications}\label{sec:Virtio Transport Option  If
> VIRTIO_F_NOTIF_CONFIG_DATA has been negotiated:
> >  \begin{itemize}
> >  \item If VIRTIO_F_NOTIFICATION_DATA has not been negotiated, the
> > driver MUST use the -\field{queue_notify_data} value instead of the 
> > virtqueue
> index.
> > +\field{queue_notify_data} value instead of the vq number.
> >  \item If VIRTIO_F_NOTIFICATION_DATA has been negotiated, the driver
> > MUST set the  \field{vqn} field to the \field{queue_notify_data} value.
> 
> And here it is really obvious: if VIRTIO_F_NOTIF_CONFIG_DATA has been
> negotiated the field \field{vqn} does not hold a virtqueue number/vq
> nuber/vqn but a device supplied identifier which may or may not be same as
> the vqn.
> 
> So we went
> from:
> if !VIRTIO_F_NOTIF_CONFIG_DATA then vqn is the virtqueue index if
> !!VIRTIO_F_NOTIF_CONFIG_DATA then vqn is queue_notify_data which may or
> may not be the same as the virtqueue index to if
> !VIRTIO_F_NOTIF_CONFIG_DATA then vqn is the vq number if
> !!VIRTIO_F_NOTIF_CONFIG_DATA then vqn is queue_notify_data which may or
> may not be the same as the vq number.
> 
The current wording is written bit simpler than above multiple if-else. :)

> Which is my opinion less clear that what we had before. And in my opinion
> calling the very same thing virtqueue und vq number interchangeably is not
> helpful either -- makes grepping harder.
> 
> I don't see the benefit of replacing virtqueue index with virtqueue number.
> AFAIR the supposed benefit was to:
> a) harmonize the terminology in the notifications part with the rest of the 
> spec
> b) to resolve the RSS problematic with its receive virtqueue indexes.
> 
And also, the upcoming patches to use either one of them that uses vqn at 
multiple places.
And you have already voted for vqn in the new patches ballot yesterday.

> For b) we are going down a different route calling those receive queue ids
> (AFAIR), and for the notifications part see my comments above.
> 
This would be done anyway regardless of picking "index" vs "number".

> I'm about to reply to the cover letter, and make my argument against
> standardizing virtqueue nuber (and in favor of standardizing on the on
> virtqueue index) along with a diff that is supposed to act as a counter 
> proposal.
> If that doesn't work I will give up and make peace with vq number and vqn.

I think the filed name "vqn" in the notification area can be improved.
The main reason I didn't touch it much, because CONFIG_DATA has rare usage in 
most virtualized world, real devices will not implement it.
After 2+ years of introduction in the spec, no open-source user has find it 
useful yet.
May be some use it there.
So if there is motivation, we can rename the vqn field and further improve the 
language.

I will surely read through other proposal anyway.
Thanks for taking time into this.

-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org



RE: [virtio-dev] [PATCH v10 0/8] Rename queue index to queue number

2023-03-30 Thread Parav Pandit


> From: Halil Pasic 
> Sent: Thursday, March 30, 2023 1:11 PM

> I argue that using a different term in that context than in the rest of the
> specification makes sense, because in the context of notifications the 
> virtqueue
> isn't always identified by its index.
Using single term as number is possible, so lets use single term. 

> 
> More precisely: if VIRTIO_F_NOTIF_CONFIG_DATA has been negotiated in the
> context of notifications the virtqueue is identified by the so called
> "queue_notify_data"; 
> if VIRTIO_F_NOTIF_CONFIG_DATA has been negotiated in

You missed "not" before negotiation. :)

> the context of notifications the virtqueue is identified by the virtqueue 
> index (as
> usual, for example in queue_select, or in the ccws).
> 
> As I've pointed out in my comment to patch 2, I believe replacing virtqueue
> index with virtqueue number is detrimental to clarity.
> 
> Thus please find a counter-proposal below. If there is interest I can make a
> series out of it, and prettify it. If I can't convince you guys, then I will 
> have to
> get used to vqn and virtqueue number.
> 
> AFAIR the other problem with index was the RSS for virtio-net. But there we 
> are
> currently heading down a direction of introducing a new abstraction. This
> approach avoids confusion around the term 'virtqueue index' as much as it
> avoids confusion around the term 'virtqueue nuber'.
> 
> 
> > c. RSS area of virtio net has inherited some logic, describe it using
> > abstract rss_rq_id.
> 
> -8<--
> From: Halil Pasic 
> Date: Thu, 30 Mar 2023 17:57:53 +0200
> Subject: [PATCH 1/1] content: clarify how virtques are identified
> 
> Clarify how virtqueues are identified in the context of available 
> notifications
> and in the context of RSS for virtio-net .
> 
> Signed-off-by: Halil Pasic 
> ---
>  content.tex  | 15 ++-
>  device-types/net/description.tex | 30 ++
>  transport-ccw.tex|  2 +-
>  transport-pci.tex|  7 ---
>  4 files changed, 37 insertions(+), 17 deletions(-)
> 
> diff --git a/content.tex b/content.tex
> index cff548a..a376232 100644
> --- a/content.tex
> +++ b/content.tex
> @@ -298,6 +298,9 @@ \section{Virtqueues}\label{sec:Basic Facilities of a
> Virtio Device / Virtqueues}  virtqueues\footnote{For example, the simplest
> network device has one virtqueue for  transmit and one for receive.}.
> 
> +If not otherwise stated, each virtqueue is identified by a so called
> +virtqueue index. Valid virtqueue indexes range from 0 up to 65535.
> +
Missing inclusive after 65535.
Covered in v10.

>  Driver makes requests available to device by adding  an available buffer to 
> the
> queue, i.e., adding a buffer  describing the request to a virtqueue, and
> optionally triggering @@ -402,7 +405,7 @@ \section{Driver Notifications}
> \label{sec:Basic Facilities of a Virtio Device /
> 
>  When VIRTIO_F_NOTIFICATION_DATA has not been negotiated,  this
> notification involves sending the -virtqueue number to the device (method
> depending on the transport).
> +virtqueue index to the device (method depending on the transport).
> 
>  However, some devices benefit from the ability to find out the  amount of
> available data in the queue without accessing the virtqueue in memory:
> @@ -413,7 +416,9 @@ \section{Driver Notifications} \label{sec:Basic Facilities
> of a Virtio Device /  the following information:
> 
>  \begin{description}
> -\item [vqn] VQ number to be notified.
> +\item [vqn] Identifies the virtqueueue to be notified. If
> VIRTIO_F_NOTIF_CONFIG_DATA has been
> +  negotiateted, the value of \field{vqn} is the notify data suplied by 
> the
> device
There is still vqn here.  So no better than vq number.
To be truly clear, it should be renamed to vq_identifier, that either contains 
vqn or vq_notification_data.

As I replied I can take up that cleanup at later point if we find some 
motivation for CONFIG_DATA use.

> +  (by transport specific means). Otherwise it is the virtqueue index.
>  \item [next_off] Offset
>within the ring where the next available ring entry
>will be written.
> @@ -786,13 +791,13 @@ \chapter{Reserved Feature Bits}\label{sec:Reserved
> Feature Bits}
>buffer notifications.
>As mentioned in section \ref{sec:Basic Facilities of a Virtio Device / 
> Driver
> notifications}, when the
>driver is required to send an available buffer notification to the device, 
> it
> -  sends the virtqueue number to be notified. The method of delivering
> +  sends the virtqueue index of the virtqueue to be notified. The method
> + of delivering
>notifications is transport specific.
>With the PCI transport, the device can optionally provide a per-virtqueue
> value
> -  for the driver to use in driver notifications, instead of the virtqueue 
> number.
> +  for the driver to use in driver notifications, instead of the virtqueu

[virtio-dev] [PATCH v2 0/2] virtio-blk: fix a few problems in ZBD-related code

2023-03-30 Thread Dmitry Fomichev
This two-part series contains code fixes that are related to the
recently added support for zoned block devices in virtio-blk
driver.

The original ZBD patchset that was merged to the kernel -next was not
the latest and greatest version. The first patch of this series is
essentially a diff between the current code and the final revision of
the original patchset.

The second patch fixes a problem that is observed when a host-managed
zoned device is virtualized via virtio-blk and the driver kernel is
configured without ZBD support (CONFIG_BLK_DEV_ZONED not set). In this
case, the driver must not allow the device scan to succeed in order to
avoid erroneous device operation that may cause data loss. The
second patch adds this currently missing functionality.

v1 -> v2:

 - address nits from Damien
 - add Review-by tags from Stefan and Damien

Dmitry Fomichev (2):
  virtio-blk: migrate to the latest patchset version
  virtio-blk: fix ZBD probe in kernels without ZBD support

 drivers/block/virtio_blk.c  | 269 
 include/uapi/linux/virtio_blk.h |  18 +--
 2 files changed, 182 insertions(+), 105 deletions(-)

-- 
2.34.1


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org



[virtio-dev] [PATCH v2 1/2] virtio-blk: migrate to the latest patchset version

2023-03-30 Thread Dmitry Fomichev
The merged patch series to support zoned block devices in virtio-blk
is not the most up to date version. The merged patch can be found at

https://lore.kernel.org/linux-block/20221016034127.330942-3-dmitry.fomic...@wdc.com/

, but the latest and reviewed version is

https://lore.kernel.org/linux-block/20221110053952.3378990-3-dmitry.fomic...@wdc.com/

The differences between the two are mostly cleanups, but there is one
change that is very important in terms of compatibility with the
approved virtio-zbd specification.

Before it was approved, the OASIS virtio spec had a change in
VIRTIO_BLK_T_ZONE_APPEND request layout that is not reflected in the
current virtio-blk driver code. In the running code, the status is
the first byte of the in-header that is followed by some pad bytes
and the u64 that carries the sector at which the data has been written
to the zone back to the driver, aka the append sector.

This layout turned out to be problematic for implementing in QEMU and
the request status byte has been eventually made the last byte of the
in-header. The current code doesn't expect that and this causes the
append sector value always come as zero to the block layer. This needs
to be fixed ASAP.

Fixes: 95bfec41bd3d ("virtio-blk: add support for zoned block devices")
Cc: sta...@vger.kernel.org
Signed-off-by: Dmitry Fomichev 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Damien Le Moal 
---
 drivers/block/virtio_blk.c  | 238 +---
 include/uapi/linux/virtio_blk.h |  18 +--
 2 files changed, 166 insertions(+), 90 deletions(-)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 2723eede6f21..4f0dbbb3d4a5 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -96,16 +96,14 @@ struct virtblk_req {
 
/*
 * The zone append command has an extended in header.
-* The status field in zone_append_in_hdr must have
-* the same offset in virtblk_req as the non-zoned
-* status field above.
+* The status field in zone_append_in_hdr must always
+* be the last byte.
 */
struct {
+   __virtio64 sector;
u8 status;
-   u8 reserved[7];
-   __le64 append_sector;
-   } zone_append_in_hdr;
-   };
+   } zone_append;
+   } in_hdr;
 
size_t in_hdr_len;
 
@@ -154,7 +152,7 @@ static int virtblk_add_req(struct virtqueue *vq, struct 
virtblk_req *vbr)
sgs[num_out + num_in++] = vbr->sg_table.sgl;
}
 
-   sg_init_one(&in_hdr, &vbr->status, vbr->in_hdr_len);
+   sg_init_one(&in_hdr, &vbr->in_hdr.status, vbr->in_hdr_len);
sgs[num_out + num_in++] = &in_hdr;
 
return virtqueue_add_sgs(vq, sgs, num_out, num_in, vbr, GFP_ATOMIC);
@@ -242,11 +240,14 @@ static blk_status_t virtblk_setup_cmd(struct 
virtio_device *vdev,
  struct request *req,
  struct virtblk_req *vbr)
 {
-   size_t in_hdr_len = sizeof(vbr->status);
+   size_t in_hdr_len = sizeof(vbr->in_hdr.status);
bool unmap = false;
u32 type;
u64 sector = 0;
 
+   if (!IS_ENABLED(CONFIG_BLK_DEV_ZONED) && op_is_zone_mgmt(req_op(req)))
+   return BLK_STS_NOTSUPP;
+
/* Set fields for all request types */
vbr->out_hdr.ioprio = cpu_to_virtio32(vdev, req_get_ioprio(req));
 
@@ -287,7 +288,7 @@ static blk_status_t virtblk_setup_cmd(struct virtio_device 
*vdev,
case REQ_OP_ZONE_APPEND:
type = VIRTIO_BLK_T_ZONE_APPEND;
sector = blk_rq_pos(req);
-   in_hdr_len = sizeof(vbr->zone_append_in_hdr);
+   in_hdr_len = sizeof(vbr->in_hdr.zone_append);
break;
case REQ_OP_ZONE_RESET:
type = VIRTIO_BLK_T_ZONE_RESET;
@@ -297,7 +298,10 @@ static blk_status_t virtblk_setup_cmd(struct virtio_device 
*vdev,
type = VIRTIO_BLK_T_ZONE_RESET_ALL;
break;
case REQ_OP_DRV_IN:
-   /* Out header already filled in, nothing to do */
+   /*
+* Out header has already been prepared by the caller 
(virtblk_get_id()
+* or virtblk_submit_zone_report()), nothing to do here.
+*/
return 0;
default:
WARN_ON_ONCE(1);
@@ -318,16 +322,28 @@ static blk_status_t virtblk_setup_cmd(struct 
virtio_device *vdev,
return 0;
 }
 
+/*
+ * The status byte is always the last byte of the virtblk request
+ * in-header. This helper fetches its value for all in-header formats
+ * that are currently defined.
+ */
+static inline u8 virtblk_vbr_status(struct virtblk_req *vbr)
+{
+   return *((u8 *)&vbr->in_hdr + vbr->in_hdr_len - 1);
+}
+
 static inline void virtblk_request_done

[virtio-dev] [PATCH v2 2/2] virtio-blk: fix ZBD probe in kernels without ZBD support

2023-03-30 Thread Dmitry Fomichev
When the kernel is built without support for zoned block devices,
virtio-blk probe needs to error out any host-managed device scans
to prevent such devices from appearing in the system as non-zoned.
The current virtio-blk code simply bypasses all ZBD checks if
CONFIG_BLK_DEV_ZONED is not defined and this leads to host-managed
block devices being presented as non-zoned in the OS. This is one of
the main problems this patch series is aimed to fix.

In this patch, make VIRTIO_BLK_F_ZONED feature defined even when
CONFIG_BLK_DEV_ZONED is not. This change makes the code compliant with
the voted revision of virtio-blk ZBD spec. Modify the probe code to
look at the situation when VIRTIO_BLK_F_ZONED is negotiated in a kernel
that is built without ZBD support. In this case, the code checks
the zoned model of the device and fails the probe is the device
is host-managed.

The patch also adds the comment to clarify that the call to perform
the zoned device probe is correctly placed after virtio_device ready().

Fixes: 95bfec41bd3d ("virtio-blk: add support for zoned block devices")
Cc: sta...@vger.kernel.org
Signed-off-by: Dmitry Fomichev 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Damien Le Moal 
---
 drivers/block/virtio_blk.c | 33 +
 1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 4f0dbbb3d4a5..2b918e28acaa 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -866,16 +866,12 @@ static int virtblk_probe_zoned_device(struct 
virtio_device *vdev,
return ret;
 }
 
-static inline bool virtblk_has_zoned_feature(struct virtio_device *vdev)
-{
-   return virtio_has_feature(vdev, VIRTIO_BLK_F_ZONED);
-}
-
 #else
 
 /*
  * Zoned block device support is not configured in this kernel.
- * We only need to define a few symbols to avoid compilation errors.
+ * Host-managed zoned devices can't be supported, but others are
+ * good to go as regular block devices.
  */
 #define virtblk_report_zones   NULL
 
@@ -886,12 +882,16 @@ static inline void virtblk_revalidate_zones(struct 
virtio_blk *vblk)
 static inline int virtblk_probe_zoned_device(struct virtio_device *vdev,
struct virtio_blk *vblk, struct request_queue *q)
 {
-   return -EOPNOTSUPP;
-}
+   u8 model;
 
-static inline bool virtblk_has_zoned_feature(struct virtio_device *vdev)
-{
-   return false;
+   virtio_cread(vdev, struct virtio_blk_config, zoned.model, &model);
+   if (model == VIRTIO_BLK_Z_HM) {
+   dev_err(&vdev->dev,
+   "virtio_blk: zoned devices are not supported");
+   return -EOPNOTSUPP;
+   }
+
+   return 0;
 }
 #endif /* CONFIG_BLK_DEV_ZONED */
 
@@ -1577,7 +1577,11 @@ static int virtblk_probe(struct virtio_device *vdev)
virtblk_update_capacity(vblk, false);
virtio_device_ready(vdev);
 
-   if (virtblk_has_zoned_feature(vdev)) {
+   /*
+* All steps that follow use the VQs therefore they need to be
+* placed after the virtio_device_ready() call above.
+*/
+   if (virtio_has_feature(vdev, VIRTIO_BLK_F_ZONED)) {
err = virtblk_probe_zoned_device(vdev, vblk, q);
if (err)
goto out_cleanup_disk;
@@ -1683,10 +1687,7 @@ static unsigned int features[] = {
VIRTIO_BLK_F_RO, VIRTIO_BLK_F_BLK_SIZE,
VIRTIO_BLK_F_FLUSH, VIRTIO_BLK_F_TOPOLOGY, VIRTIO_BLK_F_CONFIG_WCE,
VIRTIO_BLK_F_MQ, VIRTIO_BLK_F_DISCARD, VIRTIO_BLK_F_WRITE_ZEROES,
-   VIRTIO_BLK_F_SECURE_ERASE,
-#ifdef CONFIG_BLK_DEV_ZONED
-   VIRTIO_BLK_F_ZONED,
-#endif /* CONFIG_BLK_DEV_ZONED */
+   VIRTIO_BLK_F_SECURE_ERASE, VIRTIO_BLK_F_ZONED,
 };
 
 static struct virtio_driver virtio_blk = {
-- 
2.34.1


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org



[virtio-dev] [PATCH 00/11] Introduce transitional mmr pci device

2023-03-30 Thread Parav Pandit
Overview:
-
The Transitional MMR device is a variant of the transitional PCI device.
It has its own small Device ID range. It does not have I/O
region BAR; instead it exposes legacy configuration and device
specific registers at an offset in the memory region BAR.

Such transitional MMR devices will be used at the scale of
thousands of devices using PCI SR-IOV and/or future scalable
virtualization technology to provide backward
compatibility (for legacy devices) and also future
compatibility with new features.

Usecase:

1. A hypervisor/system needs to provide transitional
   virtio devices to the guest VM at scale of thousands,
   typically, one to eight devices per VM.

2. A hypervisor/system needs to provide such devices using a
   vendor agnostic driver in the hypervisor system.

3. A hypervisor system prefers to have single stack regardless of
   virtio device type (net/blk) and be future compatible with a
   single vfio stack using SR-IOV or other scalable device
   virtualization technology to map PCI devices to the guest VM.
   (as transitional or otherwise)

Motivation/Background:
--
The existing transitional PCI device is missing support for
PCI SR-IOV based devices. Currently it does not work beyond
PCI PF, or as software emulated device in reality. It currently
has below cited system level limitations:

[a] PCIe spec citation:
VFs do not support I/O Space and thus VF BARs shall not
indicate I/O Space.

[b] cpu arch citiation:
Intel 64 and IA-32 Architectures Software Developer’s Manual:
The processor’s I/O address space is separate and distinct from
the physical-memory address space. The I/O address space consists
of 64K individually addressable 8-bit I/O ports, numbered 0 through H.

[c] PCIe spec citation:
If a bridge implements an I/O address range,...I/O address range
will be aligned to a 4 KB boundary.

[d] I/O region accesses at PCI system level is slow as they are non-posted
operations in PCIe fabric.

The usecase requirements and limitations above can be solved by
extending the transitional device, mapping legacy and device
specific configuration registers in a memory PCI BAR instead
of using non composable I/O region.

Please review.

Patch summary:
--
patch 1 to 5 prepares the spec
patch 6 to 11 defines transitional mmr device

patch-1 uses lower case alphabets to name device id
patch-2 move transitional device id in legay section along with
revision id
patch-3 splits legacy feature bits description from device id
patch-4 rename and moves virtio config registers next to 1.x
registers section
patch-5 Adds missing helper verb in terminology definitions
patch-6 introduces transitional mmr device
patch-7 introduces transitional mmr device pci device ids
patch-8 introduces virtio extended pci capability
patch-9 describes new pci capability to locate legacy mmr
registers
patch-10 extended usage of driver notification capability for
 the transitional mmr device
patch-11 adds conformance section of the transitional mmr device

This design and details further described below.

Design:
---
Below picture captures the main small difference between current
transitional PCI SR-IOV VF and transitional MMR SR-IOV VF.

+--+ ++ ++
|virtio 1.x| |Transitional| |Transitional|
|SRIOV VF  | |SRIOV VF| |MMR SRIOV VF|
|  | || ||
++---+ | ++---+   | ++---+   |
||dev_id =   | | ||dev_id =   |   | ||dev_id =   |   |
||{0x1040-0x106C}| | ||{0x1000-0x103f}|   | ||{0x10f9-0x10ff}|   |
|+---+ | |+---+   | |+---+   |
|  | || ||
|++| |++  | |+-+ |
||Memory BAR  || ||Memory BAR  |  | ||Memory BAR   | |
|++| |++  | || | |
|  | || || +--+| |
|  | |+-+ | || |legacy virtio || |
|  | ||IOBAR impossible | | || |+ dev cfg || |
|  | |+-+ | || |registers || |
|  | || || +--+| |
|  | || |+-+ |
+--+ ++ ++

Here transitional MMR SR-IOV VF has legacy configuration and
legacy device specific registers located at an offset in the memory
region BAR.

A memory region can be dedicated at BAR0 or it can be in an
existing BAR, allowing flexibility when implementing support
in a hardware device.

Transitional MMR SR-IOV VFs use a distinct device ID range to that
of existing virtio SR-IOV VFs to allow flexibility in driver
binding.

A more zoom-in version of trans

[virtio-dev] [PATCH 04/11] transport-pci: Rename and move legacy PCI Device layout section

2023-03-30 Thread Parav Pandit
Current 'Legacy Interfaces: A Note on PCI Device Layout' explains virtio
legacy registers which consist of common configuration structure and
device specific registers. It has little to do with the existing
PCI Device layout section.

For example, non transitional device's common configuration layout is
described in 'Common configuration structure layout' section outside of
'PCI Device Layout' section.

Hence, to keep Legacy common configuration registers section consistent
with 1.x, rename and move it adjacent to 1.x common configuration
structure.

Reviewed-by: Satananda Burla 
Signed-off-by: Parav Pandit 
---
 conformance.tex   |   2 +-
 transport-pci.tex | 162 +++---
 2 files changed, 83 insertions(+), 81 deletions(-)

diff --git a/conformance.tex b/conformance.tex
index 4f724c2..ccbc9bf 100644
--- a/conformance.tex
+++ b/conformance.tex
@@ -262,7 +262,7 @@ \section{Conformance Targets}\label{sec:Conformance / 
Conformance Targets}
 \item Section \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / 
Message Framing / Legacy Interface: Message Framing}
 \item Section \ref{sec:General Initialization And Device Operation / Device 
Initialization / Legacy Interface: Device Initialization}
 \item Section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI 
Device Discovery / Legacy Interfaces: A Note on PCI Device Discovery}
-\item Section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI 
Device Layout / Legacy Interfaces: A Note on PCI Device Layout}
+\item Section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio 
Structure PCI Capabilities / Common configuration structure layout / Legacy 
Interfaces: A Note on Configuration Registers}
 \item Section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio 
Structure PCI Capabilities / Legacy Interface: A Note on Feature Bits}
 \item Section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / 
PCI-specific Initialization And Device Operation / Device Initialization / 
Virtio Device Configuration Layout Detection / Legacy Interface: A Note on 
Device Layout Detection}
 \item Section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / 
PCI-specific Initialization And Device Operation / Device Initialization / 
Virtqueue Configuration / Legacy Interface: A Note on Virtqueue Configuration}
diff --git a/transport-pci.tex b/transport-pci.tex
index 65d9748..ee11ba5 100644
--- a/transport-pci.tex
+++ b/transport-pci.tex
@@ -497,6 +497,88 @@ \subsubsection{Common configuration structure 
layout}\label{sec:Virtio Transport
 were used before the queue reset.
 (see \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / Virtqueue 
Reset}).
 
+\paragraph{Legacy Interfaces: A Note on Configuration Registers}
+\label{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure 
PCI Capabilities / Common configuration structure layout / Legacy Interfaces: A 
Note on Configuration Registers}
+
+Transitional devices MUST present part of configuration
+registers in a legacy configuration structure in BAR0 in the first I/O
+region of the PCI device, as documented below.
+When using the legacy interface, transitional drivers
+MUST use the legacy configuration structure in BAR0 in the first
+I/O region of the PCI device, as documented below.
+
+When using the legacy interface the driver MAY access
+the device-specific configuration region using any width accesses, and
+a transitional device MUST present driver with the same results as
+when accessed using the ``natural'' access method (i.e.
+32-bit accesses for 32-bit fields, etc).
+
+Note that this is possible because while the virtio common configuration 
structure is PCI
+(i.e. little) endian, when using the legacy interface the device-specific
+configuration region is encoded in the native endian of the guest (where such 
distinction is
+applicable).
+
+When used through the legacy interface, the virtio common configuration 
structure looks as follows:
+
+\begin{tabularx}{\textwidth}{ |X||X|X|X|X|X|X|X|X| }
+\hline
+ Bits & 32 & 32 & 32 & 16 & 16 & 16 & 8 & 8 \\
+\hline
+ Read / Write & R & R+W & R+W & R & R+W & R+W & R+W & R \\
+\hline
+ Purpose & Device Features bits 0:31 & Driver Features bits 0:31 &
+  Queue Address & \field{queue_size} & \field{queue_select} & Queue Notify &
+  Device Status & ISR \newline Status \\
+\hline
+\end{tabularx}
+
+If MSI-X is enabled for the device, two additional fields
+immediately follow this header:
+
+\begin{tabular}{ |l||l|l| }
+\hline
+Bits   & 16 & 16 \\
+\hline
+Read/Write & R+W& R+W\\
+\hline
+Purpose (MSI-X) & \field{config_msix_vector}  & \field{queue_msix_vector} \\
+\hline
+\end{tabular}
+
+Note: When MSI-X capability is enabled, device-specific configuration starts at
+byte offset 24 in virtio common configuration structure. When MSI-X capability 
is not
+enabled, device-specific configuration starts at byte offset 20 in vi

[virtio-dev] [PATCH 07/11] transport-pci: Introduce transitional MMR device id

2023-03-30 Thread Parav Pandit
Transitional MMR device PCI Device IDs are unique. Hence,
any of the existing drivers do not bind to it.
This further maintains the backward compatibility with
existing drivers.

Co-developed-by: Satananda Burla 
Signed-off-by: Parav Pandit 
---
 transport-pci.tex | 45 +
 1 file changed, 41 insertions(+), 4 deletions(-)

diff --git a/transport-pci.tex b/transport-pci.tex
index ee11ba5..665448e 100644
--- a/transport-pci.tex
+++ b/transport-pci.tex
@@ -19,12 +19,14 @@ \section{Virtio Over PCI Bus}\label{sec:Virtio Transport 
Options / Virtio Over P
 \subsection{PCI Device Discovery}\label{sec:Virtio Transport Options / Virtio 
Over PCI Bus / PCI Device Discovery}
 
 Any PCI device with PCI Vendor ID 0x1af4, and PCI Device ID 0x1000 through
-0x107f inclusive is a virtio device. The actual value within this range
-indicates which virtio device is supported by the device.
+0x107f inclusive and DeviceID 0x10f9 through 0x10ff is a virtio device.
+The actual value within this range indicates which virtio device
+type it is.
 The PCI Device ID is calculated by adding 0x1040 to the Virtio Device ID,
 as indicated in section \ref{sec:Device Types}.
-Additionally, devices MAY utilize a Transitional PCI Device ID range,
-0x1000 to 0x103f depending on the device type.
+Additionally, devices MAY utilize a Transitional PCI Device ID range
+0x1000 to 0x103f inclusive or a Transitional MMR PCI Device ID range
+0x10f9 to 0x10ff inclusive, depending on the device type.
 
 \devicenormative{\subsubsection}{PCI Device Discovery}{Virtio Transport 
Options / Virtio Over PCI Bus / PCI Device Discovery}
 
@@ -95,6 +97,41 @@ \subsubsection{Legacy Interfaces: A Note on PCI Device 
Discovery}\label{sec:Virt
 
 This is to match legacy drivers.
 
+\subsubsection{Transitional MMR Interface: A Note on PCI Device
+Discovery}\label{sec:Virtio Transport Options / Virtio Over PCI
+Bus / PCI Device Discovery / Transitional MMR Interface: A Note on PCI Device 
Discovery}
+
+The transitional MMR device has one of the following PCI Device ID
+depending on the device type:
+
+\begin{tabular}{|l|c|}
+\hline
+Transitional PCI Device ID  &  Virtio Device\\
+\hline \hline
+0x10f9  &   network device \\
+\hline
+0x10fa &   block device \\
+\hline
+0x10fb & memory ballooning (traditional)  \\
+\hline
+0x10fc &  console   \\
+\hline
+0x10fd & SCSI host  \\
+\hline
+0x10fe &  entropy source\\
+\hline
+0x10ff &   9P transport \\
+\hline
+\end{tabular}
+
+The PCI Subsystem Vendor ID and the PCI Subsystem Device ID MAY
+reflect the PCI Vendor and Device ID of the environment.
+
+The transitional MMR driver MUST match any PCI Revision ID value.
+
+The transitional MMR driver MAY match any PCI Subsystem Vendor ID and
+any PCI Subsystem Device ID value.
+
 \subsection{PCI Device Layout}\label{sec:Virtio Transport Options / Virtio 
Over PCI Bus / PCI Device Layout}
 
 The device is configured via I/O and/or memory regions (though see
-- 
2.26.2


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org



[virtio-dev] [PATCH 02/11] transport-pci: Move transitional device id to legacy section

2023-03-30 Thread Parav Pandit
Currently PCI device discovery details for the transitional device
are documented in two different sections.

For example, PCI device and vendor ID registers are documented in
'Device Requirements: PCI Device Discovery' section, while PCI
revision id is documented in 'Legacy Interfaces: A Note on PCI
Device Discovery' section.

Transitional devices requirements should be documented in "legacy
interfaces" section as clearly mentioned in
'Legacy Interface: A Note on Feature Bits'.

Hence,
1. Move transitional device requirements to its designated Legacy
   interface section
2. Describe regular device requirements without quoting it as "non
   transitional device"

While at it, write the description using a singular object definition.

Reviewed-by: Satananda Burla 
Signed-off-by: Parav Pandit 
---
 transport-pci.tex | 70 ---
 1 file changed, 36 insertions(+), 34 deletions(-)

diff --git a/transport-pci.tex b/transport-pci.tex
index 7f27107..1f74c6f 100644
--- a/transport-pci.tex
+++ b/transport-pci.tex
@@ -28,46 +28,24 @@ \subsection{PCI Device Discovery}\label{sec:Virtio 
Transport Options / Virtio Ov
 
 \devicenormative{\subsubsection}{PCI Device Discovery}{Virtio Transport 
Options / Virtio Over PCI Bus / PCI Device Discovery}
 
-Devices MUST have the PCI Vendor ID 0x1af4.
-Devices MUST either have the PCI Device ID calculated by adding 0x1040
+The device MUST have the PCI Vendor ID 0x1af4.
+The device MUST calculate PCI Device ID by adding 0x1040
 to the Virtio Device ID, as indicated in section \ref{sec:Device
-Types} or have the Transitional PCI Device ID depending on the device type,
-as follows:
-
-\begin{tabular}{|l|c|}
-\hline
-Transitional PCI Device ID  &  Virtio Device\\
-\hline \hline
-0x1000  &   network device \\
-\hline
-0x1001 &   block device \\
-\hline
-0x1002 & memory ballooning (traditional)  \\
-\hline
-0x1003 &  console   \\
-\hline
-0x1004 & SCSI host  \\
-\hline
-0x1005 &  entropy source\\
-\hline
-0x1009 &   9P transport \\
-\hline
-\end{tabular}
+Types}.
 
 For example, the network device with the Virtio Device ID 1
-has the PCI Device ID 0x1041 or the Transitional PCI Device ID 0x1000.
-
-The PCI Subsystem Vendor ID and the PCI Subsystem Device ID MAY reflect
-the PCI Vendor and Device ID of the environment (for informational purposes by 
the driver).
+has the PCI Device ID 0x1041.
 
-Non-transitional devices SHOULD have a PCI Device ID in the range
-0x1040 to 0x107f.
-Non-transitional devices SHOULD have a PCI Revision ID of 1 or higher.
-Non-transitional devices SHOULD have a PCI Subsystem Device ID of 0x40 or 
higher.
+The device SHOULD have a PCI Device ID in the range 0x1040 to 0x107f.
+The device SHOULD have a PCI Revision ID of 1 or higher.
+The device SHOULD have a PCI Subsystem Device ID of 0x40 or higher.
 
 This is to reduce the chance of a legacy driver attempting
 to drive the device.
 
+The PCI Subsystem Vendor ID and the PCI Subsystem Device ID MAY reflect
+the PCI Vendor and Device ID of the environment (for informational purposes by 
the driver).
+
 \drivernormative{\subsubsection}{PCI Device Discovery}{Virtio Transport 
Options / Virtio Over PCI Bus / PCI Device Discovery}
 Drivers MUST match devices with the PCI Vendor ID 0x1af4 and
 the PCI Device ID in the range 0x1040 to 0x107f,
@@ -85,8 +63,32 @@ \subsection{PCI Device Discovery}\label{sec:Virtio Transport 
Options / Virtio Ov
 PCI Subsystem Device ID value.
 
 \subsubsection{Legacy Interfaces: A Note on PCI Device 
Discovery}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI 
Device Discovery / Legacy Interfaces: A Note on PCI Device Discovery}
-Transitional devices MUST have a PCI Revision ID of 0.
-Transitional devices MUST have the PCI Subsystem Device ID
+
+The transitional device has one of the following PCI Device ID
+depending on the device type:
+
+\begin{tabular}{|l|c|}
+\hline
+Transitional PCI Device ID  &  Virtio Device\\
+\hline \hline
+0x1000  &   network device \\
+\hline
+0x1001 &   block device \\
+\hline
+0x1002 & memory ballooning (traditional)  \\
+\hline
+0x1003 &  console   \\
+\hline
+0x1004 & SCSI host  \\
+\hline
+0x1005 &  entropy source\\
+\hline
+0x1009 &   9P transport \\
+\hline
+\end{tabular}
+
+The transitional device MUST have a PCI Revision ID of 0.
+The transitional device MUST have the PCI Subsystem Device ID
 matching the Virtio Device ID, as indicated in section \ref{sec:Device Types}.
 Transitional devices MUST have the Transitional PCI Device ID in
 the range 0x1000 to 0x103f.
-- 
2.26.2


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org



[virtio-dev] [PATCH 01/11] transport-pci: Use lowecase alphabets

2023-03-30 Thread Parav Pandit
Use uniformly lowercase alphabets to write PCI vendor id and device id.

Signed-off-by: Parav Pandit 
---
 transport-pci.tex | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/transport-pci.tex b/transport-pci.tex
index 85b2dae..7f27107 100644
--- a/transport-pci.tex
+++ b/transport-pci.tex
@@ -18,17 +18,17 @@ \section{Virtio Over PCI Bus}\label{sec:Virtio Transport 
Options / Virtio Over P
 
 \subsection{PCI Device Discovery}\label{sec:Virtio Transport Options / Virtio 
Over PCI Bus / PCI Device Discovery}
 
-Any PCI device with PCI Vendor ID 0x1AF4, and PCI Device ID 0x1000 through
-0x107F inclusive is a virtio device. The actual value within this range
+Any PCI device with PCI Vendor ID 0x1af4, and PCI Device ID 0x1000 through
+0x107f inclusive is a virtio device. The actual value within this range
 indicates which virtio device is supported by the device.
 The PCI Device ID is calculated by adding 0x1040 to the Virtio Device ID,
 as indicated in section \ref{sec:Device Types}.
 Additionally, devices MAY utilize a Transitional PCI Device ID range,
-0x1000 to 0x103F depending on the device type.
+0x1000 to 0x103f depending on the device type.
 
 \devicenormative{\subsubsection}{PCI Device Discovery}{Virtio Transport 
Options / Virtio Over PCI Bus / PCI Device Discovery}
 
-Devices MUST have the PCI Vendor ID 0x1AF4.
+Devices MUST have the PCI Vendor ID 0x1af4.
 Devices MUST either have the PCI Device ID calculated by adding 0x1040
 to the Virtio Device ID, as indicated in section \ref{sec:Device
 Types} or have the Transitional PCI Device ID depending on the device type,
@@ -69,13 +69,13 @@ \subsection{PCI Device Discovery}\label{sec:Virtio 
Transport Options / Virtio Ov
 to drive the device.
 
 \drivernormative{\subsubsection}{PCI Device Discovery}{Virtio Transport 
Options / Virtio Over PCI Bus / PCI Device Discovery}
-Drivers MUST match devices with the PCI Vendor ID 0x1AF4 and
+Drivers MUST match devices with the PCI Vendor ID 0x1af4 and
 the PCI Device ID in the range 0x1040 to 0x107f,
 calculated by adding 0x1040 to the Virtio Device ID,
 as indicated in section \ref{sec:Device Types}.
 Drivers for device types listed in section \ref{sec:Virtio
 Transport Options / Virtio Over PCI Bus / PCI Device Discovery}
-MUST match devices with the PCI Vendor ID 0x1AF4 and
+MUST match devices with the PCI Vendor ID 0x1af4 and
 the Transitional PCI Device ID indicated in section
  \ref{sec:Virtio
 Transport Options / Virtio Over PCI Bus / PCI Device Discovery}.
@@ -691,7 +691,7 @@ \subsubsection{Vendor data capability}\label{sec:Virtio
 Devices CAN present multiple Vendor data capabilities with
 either different or identical \field{vendor_id} values.
 
-The value \field{vendor_id} MUST NOT equal 0x1AF4.
+The value \field{vendor_id} MUST NOT equal 0x1af4.
 
 The size of the Vendor data capability MUST be a multiple of 4 bytes.
 
-- 
2.26.2


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org



[virtio-dev] [PATCH 03/11] transport-pci: Split notes of PCI Device Layout

2023-03-30 Thread Parav Pandit
Currently single legacy interface section describes PCI common
configuration layout and feature bits operation for the
legacy interface.

Hence, split PCI Device Layout legacy interface section into two
parts. First subsection for common configuration and second
subsection for feature bits.

Reviewed-by: Satananda Burla 
Signed-off-by: Parav Pandit 
---
 conformance.tex   |  1 +
 transport-pci.tex | 14 +-
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/conformance.tex b/conformance.tex
index 01ccd69..4f724c2 100644
--- a/conformance.tex
+++ b/conformance.tex
@@ -263,6 +263,7 @@ \section{Conformance Targets}\label{sec:Conformance / 
Conformance Targets}
 \item Section \ref{sec:General Initialization And Device Operation / Device 
Initialization / Legacy Interface: Device Initialization}
 \item Section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI 
Device Discovery / Legacy Interfaces: A Note on PCI Device Discovery}
 \item Section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI 
Device Layout / Legacy Interfaces: A Note on PCI Device Layout}
+\item Section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio 
Structure PCI Capabilities / Legacy Interface: A Note on Feature Bits}
 \item Section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / 
PCI-specific Initialization And Device Operation / Device Initialization / 
Virtio Device Configuration Layout Detection / Legacy Interface: A Note on 
Device Layout Detection}
 \item Section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / 
PCI-specific Initialization And Device Operation / Device Initialization / 
Virtqueue Configuration / Legacy Interface: A Note on Virtqueue Configuration}
 \item Section \ref{sec:Virtio Transport Options / Virtio Over MMIO / Legacy 
interface}
diff --git a/transport-pci.tex b/transport-pci.tex
index 1f74c6f..65d9748 100644
--- a/transport-pci.tex
+++ b/transport-pci.tex
@@ -845,16 +845,20 @@ \subsubsection{Legacy Interfaces: A Note on PCI Device 
Layout}\label{sec:Virtio
 devices MUST present the device-specific configuration space
 if any at an offset immediately following the general headers.
 
-Note that only Feature Bits 0 to 31 are accessible through the
-Legacy Interface. When used through the Legacy Interface,
-Transitional Devices MUST assume that Feature Bits 32 to 63
-are not acknowledged by Driver.
-
 As legacy devices had no \field{config_generation} field,
 see \ref{sec:Basic Facilities of a Virtio Device / Device
 Configuration Space / Legacy Interface: Device Configuration
 Space}~\nameref{sec:Basic Facilities of a Virtio Device / Device Configuration 
Space / Legacy Interface: Device Configuration Space} for workarounds.
 
+\subsubsection{Legacy Interface: A Note on Feature Bits}
+\label{sec:Virtio Transport Options / Virtio Over PCI Bus /
+Virtio Structure PCI Capabilities / Legacy Interface: A Note on Feature Bits}
+
+Only Feature Bits 0 to 31 are accessible through the
+Legacy Interface. When used through the Legacy Interface,
+Transitional Devices MUST assume that Feature Bits 32 to 63
+are not acknowledged by Driver.
+
 \subsubsection{Non-transitional Device With Legacy Driver: A Note
 on PCI Device Layout}\label{sec:Virtio Transport Options / Virtio
 Over PCI Bus / PCI Device Layout / Non-transitional Device With
-- 
2.26.2


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org



[virtio-dev] [PATCH 08/11] transport-pci: Introduce virtio extended capability

2023-03-30 Thread Parav Pandit
PCI device configuration space for capabilities is limited to only 192
bytes shared by many PCI capabilities of generic PCI device and virtio
specific.

Hence, introduce virtio extended capability that uses PCI Express
extended capability.
Subsequent patch uses this virtio extended capability.

Co-developed-by: Satananda Burla 
Signed-off-by: Parav Pandit 
---
 transport-pci.tex | 69 ++-
 1 file changed, 68 insertions(+), 1 deletion(-)

diff --git a/transport-pci.tex b/transport-pci.tex
index 665448e..aeda4a1 100644
--- a/transport-pci.tex
+++ b/transport-pci.tex
@@ -174,7 +174,8 @@ \subsection{Virtio Structure PCI 
Capabilities}\label{sec:Virtio Transport Option
 the function, or accessed via the special VIRTIO_PCI_CAP_PCI_CFG field in the 
PCI configuration space.
 
 The location of each structure is specified using a vendor-specific PCI 
capability located
-on the capability list in PCI configuration space of the device.
+on the capability list in PCI configuration space of the device
+unless stated otherwise.
 This virtio structure capability uses little-endian format; all fields are
 read-only for the driver unless stated otherwise:
 
@@ -301,6 +302,72 @@ \subsection{Virtio Structure PCI 
Capabilities}\label{sec:Virtio Transport Option
 fields provide the most significant 32 bits of a total 64 bit offset and
 length within the BAR specified by \field{cap.bar}.
 
+Virtio extended PCI Express capability structure defines
+the location of certain virtio device configuration related
+structures using PCI Express extended capability. Virtio
+extended PCI Express capability structure uses PCI Express
+vendor specific extended capability (VSEC). It has a below
+layout:
+
+\begin{lstlisting}
+struct pcie_ext_cap {
+le16 cap_vendor_id; /* Generic PCI field: 0xB */
+le16 cap_version : 2; /* Generic PCI field: 0 */
+le16 next_cap_offset : 14; /* Generic PCI field: next cap or 0 */
+};
+
+struct virtio_pcie_ext_cap {
+struct pcie_ext_cap pcie_ecap;
+u8 cfg_type; /* Identifies the structure. */
+u8 bar; /* Index of the BAR where its located */
+u8 id; /* Multiple capabilities of the same type */
+u8 zero_padding[1];
+le64 offset; /* Offset with the bar */
+le64 length; /* Length of the structure, in bytes. */
+u8 data[]; /* Optional variable length data */
+};
+\end{lstlisting}
+
+This structure contains optional data, depending on
+\field{cfg_type}. The fields are interpreted as follows:
+
+\begin{description}
+\item[\field{cap_vendor_id}]
+ 0x0B; identifies a vendor-specific extended capability.
+
+\item[\field{cap_version}]
+ contains a value of 0.
+
+\item[\field{next_cap_offset}]
+Offset to the next capability.
+
+\item[\field{cfg_type}]
+follows the same definition as \field{cfg_type}
+from the \field{struct virtio_pci_cap}.
+
+\item[\field{bar}]
+follows the same  same definition as  \field{bar}
+from the \field{struct virtio_pci_cap}.
+
+\item[\field{id}]
+follows the same  same definition as  \field{id}
+from the \field{struct virtio_pci_cap}.
+
+\item[\field{offset}]
+indicates where the structure begins relative to the
+base address associated with the BAR. The alignment
+requirements of offset are indicated in each
+structure-specific section that uses
+\field{struct virtio_pcie_ext_cap}.
+
+\item[\field{length}]
+indicates the length of the structure indicated by this
+capability.
+
+\item[\field{data}]
+optional data of this capability.
+\end{description}
+
 \drivernormative{\subsubsection}{Virtio Structure PCI Capabilities}{Virtio 
Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Capabilities}
 
 The driver MUST ignore any vendor-specific capability structure which has
-- 
2.26.2


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org



[virtio-dev] [PATCH 05/11] introduction: Add missing helping verb

2023-03-30 Thread Parav Pandit
The terminology of Transitional device and driver is missing the helping
verb 'is' similar to other terminologies.

Hence, add them to complete the sentence.

Signed-off-by: Parav Pandit 
---
 introduction.tex | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/introduction.tex b/introduction.tex
index 287c5fc..e8b34e3 100644
--- a/introduction.tex
+++ b/introduction.tex
@@ -145,14 +145,14 @@ \subsection{Legacy Interface: 
Terminology}\label{intro:Legacy
 
 \begin{description}
 \item[Transitional Device]
-a device supporting both drivers conforming to this
+is a device supporting both drivers conforming to this
 specification, and allowing legacy drivers.
 \end{description}
 
 Similarly, a driver MAY implement:
 \begin{description}
 \item[Transitional Driver]
-a driver supporting both devices conforming to this
+is a driver supporting both devices conforming to this
 specification, and legacy devices.
 \end{description}
 
-- 
2.26.2


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org



[virtio-dev] [PATCH 06/11] introduction: Introduce transitional MMR interface

2023-03-30 Thread Parav Pandit
Introduce terminology for the transitional MMR device and transitional
MMR driver.

Add description of the transitional MMR device. It is a PCI
device that implements legacy virtio common configuration registers
followed by legacy device specific registers in a memory region at
an offset.

This enables hypervisor such as vfio driver to emulate
I/O region towards the guest at BAR0. By doing so VFIO driver can
translate read/write accesses on I/O region from the guest
to the device memory region.

High level comparison of 1.x, transitional & transitional MMR
sriov vf device:

+--+ ++ ++
|virtio 1.x| |Transitional| |Transitional|
|SRIOV VF  | |SRIOV VF| |MMR SRIOV VF|
|  | || ||
++---+ | ++---+   | ++---+   |
||dev_id =   | | ||dev_id =   |   | ||dev_id =   |   |
||{0x1040-0x106C}| | ||{0x1000-0x103f}|   | ||{0x10f9-0x10ff}|   |
|+---+ | |+---+   | |+---+   |
|  | || ||
|++| |++  | |+-+ |
||Memory BAR  || ||Memory BAR  |  | ||Memory BAR   | |
|++| |++  | || | |
|  | || || +--+| |
|  | |+-+ | || |legacy virtio || |
|  | ||IOBAR impossible | | || |+ dev cfg || |
|  | |+-+ | || |registers || |
|  | || || +--+| |
|  | || |+-+ |
+--+ ++ ++

Motivation and background:
PCIe and system limitations:
1. PCIe VFs do not support IOBAR cited at [1].

Perhaps the PCIe spec could be extended, however it would be only
useful for virtio transitional devices. Even if such an extension
is present, there are other system limitations described below in (2)
and (3).

2. cpu io port space limit and fragmentation
x86_64 is limited to only 64K worth of IO port space at [2],
which is shared with many other onboard system peripherals which
are behind PCIe bridge; such I/O region also needs to be aligned
to 4KB at PCIe bridge level cited at [3]. This can lead to a I/O space
fragmentation. Due to this fragmentation and alignment need,
actual usable range is small.

3. IO space access of PCI device is done through non-posted message
 which requires higher completion time in the PCIe fabric for
round trip travel.

[1] PCIe spec citation:
VFs do not support I/O Space and thus VF BARs shall not indicate I/O Space.

[2] cpu arch citiation:
Intel 64 and IA-32 Architectures Software Developer’s Manual
The processor’s I/O address space is separate and distinct from
the physical-memory address space. The I/O address space consists
of 64K individually addressable 8-bit I/O ports, numbered 0 through H.

[3] PCIe spec citation:
If a bridge implements an I/O address range,...I/O address range
will be aligned to a 4 KB boundary.

Co-developed-by: Satananda Burla 
Signed-off-by: Parav Pandit 
---
 introduction.tex | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/introduction.tex b/introduction.tex
index e8b34e3..9a0f96a 100644
--- a/introduction.tex
+++ b/introduction.tex
@@ -161,6 +161,20 @@ \subsection{Legacy Interface: 
Terminology}\label{intro:Legacy
   have a need for backwards compatibility!
 \end{note}
 
+\begin{description}
+\item[Transitional MMR Device]
+   is a PCI device which exposes legacy virtio configuration
+   registers followed by legacy device configuration registers as
+   memory mapped registers (MMR) at an offset in a memory region
+   BAR, has no I/O region BAR, having its own PCI Device ID range,
+   and follows the rest of the functionalities of the transitional device.
+\end{description}
+
+\begin{description}
+\item[Transitional MMR Driver]
+   is a PCI device driver that supports the Transitional MMR device.
+\end{description}
+
 Devices or drivers with no legacy compatibility are referred to as
 non-transitional devices and drivers, respectively.
 
@@ -174,6 +188,22 @@ \subsection{Transition from earlier specification 
drafts}\label{sec:Transition f
 sections tagged "Legacy Interface" in the section title.
 These highlight the changes made since the earlier drafts.
 
+\subsection{Transitional MMR interface: specification 
drafts}\label{sec:Transitional MMR interface: specification drafts}
+
+The transitional MMR device and driver differs from the
+transitional device and driver respectively in few areas. Such
+differences are contained in sections named
+'Transitional MMR interface', like this one. When no differences
+are mentioned explicitly, the transitional MMR devic

[virtio-dev] [PATCH 10/11] transport-pci: Use driver notification PCI capability

2023-03-30 Thread Parav Pandit
PCI devices support memory BAR regions for performant driver
notifications using the notification capability.
Enable transitional MMR devices to use it in simpler manner.

Co-developed-by: Satananda Burla 
Signed-off-by: Parav Pandit 
---
 transport-pci.tex | 28 
 1 file changed, 28 insertions(+)

diff --git a/transport-pci.tex b/transport-pci.tex
index 55a6aa0..4fd9898 100644
--- a/transport-pci.tex
+++ b/transport-pci.tex
@@ -763,6 +763,34 @@ \subsubsection{Notification structure 
layout}\label{sec:Virtio Transport Options
 cap.length >= queue_notify_off * notify_off_multiplier + 4
 \end{lstlisting}
 
+\paragraph{Transitional MMR Interface: A note on Notification Capability}
+\label{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure 
PCI Capabilities / Notification capability / Transitional MMR Interface}
+
+The transitional MMR device benefits from receiving driver
+notifications at the Queue Notification address offered using
+the notification capability, rather than via the memory mapped
+legacy QueueNotify configuration register.
+
+Transitional MMR device uses same Queue Notification address
+within a BAR for all virtqueues:
+\begin{lstlisting}
+cap.offset
+\end{lstlisting}
+
+The transitional MMR device MUST support Queue Notification
+address within a BAR for all virtqueues at:
+\begin{lstlisting}
+cap.offset
+\end{lstlisting}
+
+The transitional MMR driver that wants to use driver
+notifications offered using notification capability MUST use
+same Queue Notification address within a BAR for all virtqueues at:
+
+\begin{lstlisting}
+cap.offset
+\end{lstlisting}
+
 \subsubsection{ISR status capability}\label{sec:Virtio Transport Options / 
Virtio Over PCI Bus / PCI Device Layout / ISR status capability}
 
 The VIRTIO_PCI_CAP_ISR_CFG capability
-- 
2.26.2


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org



[virtio-dev] [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers

2023-03-30 Thread Parav Pandit
Legacy virtio configuration registers and adjacent
device configuration registers are located somewhere
in a memory BAR.

A new capability supplies the location of these registers
which a driver can use to map I/O access to legacy
memory mapped registers.

This gives the ability to locate legacy registers in either
the existing memory BAR or as completely new BAR at BAR 0.

A below example diagram attempts to depicts it in an existing
memory BAR.

+--+
|Transitional  |
|MMR SRIOV VF  |
|  |
++---+ |
||dev_id =   | |
||{0x10f9-0x10ff}| |
|+---+ |
|  |
+++|
|| PCIe ext cap = 0xB ||
|| cfg_type = 10  ||
|| offset   = 0x1000  ||
|| bar  = A {0..5}||
|+--|-+|
|   |  |
|   |  |
|   |+---+ |
|   || Memory BAR = A| |
|   ||   | |
|   +-->+--+ | |
||  |legacy virtio | | |
||  |+ dev cfg | | |
||  |registers | | |
||  +--+ | |
|+-+ | |
+--+

Co-developed-by: Satananda Burla 
Signed-off-by: Parav Pandit 
---
 transport-pci.tex | 33 +++--
 1 file changed, 31 insertions(+), 2 deletions(-)

diff --git a/transport-pci.tex b/transport-pci.tex
index aeda4a1..55a6aa0 100644
--- a/transport-pci.tex
+++ b/transport-pci.tex
@@ -168,6 +168,7 @@ \subsection{Virtio Structure PCI 
Capabilities}\label{sec:Virtio Transport Option
 \item ISR Status
 \item Device-specific configuration (optional)
 \item PCI configuration access
+\item Legacy memory mapped configuration registers (optional)
 \end{itemize}
 
 Each structure can be mapped by a Base Address register (BAR) belonging to
@@ -228,6 +229,8 @@ \subsection{Virtio Structure PCI 
Capabilities}\label{sec:Virtio Transport Option
 #define VIRTIO_PCI_CAP_SHARED_MEMORY_CFG 8
 /* Vendor-specific data */
 #define VIRTIO_PCI_CAP_VENDOR_CFG9
+/* Legacy configuration registers capability */
+#define VIRTIO_PCI_CAP_LEGACY_MMR_CFG10
 \end{lstlisting}
 
 Any other value is reserved for future use.
@@ -682,6 +685,18 @@ \subsubsection{Common configuration structure 
layout}\label{sec:Virtio Transport
 Configuration Space / Legacy Interface: Device Configuration
 Space}~\nameref{sec:Basic Facilities of a Virtio Device / Device Configuration 
Space / Legacy Interface: Device Configuration Space} for workarounds.
 
+\paragraph{Transitional MMR Interface: A Note on Configuration Registers}
+\label{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure 
PCI Capabilities / Common configuration structure layout / Transitional MMR 
Interface: A Note on Configuration Registers}
+
+The transitional MMR device MUST present legacy virtio registers
+consisting of legacy common configuration registers followed by
+legacy device specific configuration registers described in section
+\ref{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI 
Capabilities / Common configuration structure layout / Legacy Interfaces: A 
Note on Configuration Registers}
+in a memory region PCI BAR.
+
+The transitional MMR device MUST provide the location of the
+legacy virtio configuration registers using a legacy memory mapped
+registers capability described in section \ref{sec:Virtio Transport Options / 
Virtio Over PCI Bus / Virtio Structure PCI Capabilities / Transitional MMR 
Interface: Legacy Memory Mapped Configuration Registers Capability}.
 
 \subsubsection{Notification structure layout}\label{sec:Virtio Transport 
Options / Virtio Over PCI Bus / PCI Device Layout / Notification capability}
 
@@ -956,9 +971,23 @@ \subsubsection{PCI configuration access 
capability}\label{sec:Virtio Transport O
 specified by some other Virtio Structure PCI Capability
 of type other than \field{VIRTIO_PCI_CAP_PCI_CFG}.
 
+\subsubsection{Transitional MMR Interface: Legacy Memory Mapped Configuration 
Registers Capability}
+\label{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure 
PCI Capabilities / Transitional MMR Interface: Legacy Memory Mapped 
Configuration Registers Capability}
+
+The optional VIRTIO_PCI_CAP_LEGACY_MMR_CFG capability defines
+the location of the legacy virtio configuration registers
+followed by legacy device specific configuration registers in
+the memory region BAR for the transitional MMR device.
+
+The \field{cap.offset} MUST be 4-byte aligned.
+The \field{cap.offset} SHOULD be 4KBytes aligned and
+\field{cap.length} SHOULD be 4KBytes.
+
+The transitional MMR device MUST present a legacy configuration
+memory mapped registers capability using \field{virtio_pcie_ext_cap}.
+
 \subsubsection{Legacy Interface: A Note on Feature 

[virtio-dev] [PATCH 11/11] conformance: Add transitional MMR interface conformance

2023-03-30 Thread Parav Pandit
Add conformance section for the transitional MMR interface.

Basically transitional MMR interface follows same requirements as that
of the transitional device with few exceptions to it. List such delta
requirements in the conformance section.

Co-developed-by: Satananda Burla 
Signed-off-by: Parav Pandit 
---
 conformance.tex  |  8 ++--
 tmmr-conformance.tex | 27 +++
 2 files changed, 33 insertions(+), 2 deletions(-)
 create mode 100644 tmmr-conformance.tex

diff --git a/conformance.tex b/conformance.tex
index ccbc9bf..4cccba7 100644
--- a/conformance.tex
+++ b/conformance.tex
@@ -11,7 +11,7 @@ \section{Conformance Targets}\label{sec:Conformance / 
Conformance Targets}
 
 Conformance targets:
 \begin{description}
-\item[Driver] A driver MUST conform to four conformance clauses:
+\item[Driver] A driver MUST conform to five conformance clauses:
   \begin{itemize}
 \item Clause \ref{sec:Conformance / Driver Conformance}.
 \item One of clauses \ref{sec:Conformance / Driver Conformance / PCI 
Driver Conformance}, \ref{sec:Conformance / Driver Conformance / MMIO Driver 
Conformance} or \ref{sec:Conformance / Driver Conformance / Channel I/O Driver 
Conformance}.
@@ -36,8 +36,9 @@ \section{Conformance Targets}\label{sec:Conformance / 
Conformance Targets}
 \ref{sec:Conformance / Driver Conformance / PMEM Driver Conformance}.
 
 \item Clause \ref{sec:Conformance / Legacy Interface: Transitional Device 
and Transitional Driver Conformance}.
+\item Clause \ref{sec:Conformance / Transitional MMR Interface: 
Transitional MMR Device and Transitional MMR Driver Conformance}.
   \end{itemize}
-\item[Device] A device MUST conform to four conformance clauses:
+\item[Device] A device MUST conform to five conformance clauses:
   \begin{itemize}
 \item Clause \ref{sec:Conformance / Device Conformance}.
 \item One of clauses \ref{sec:Conformance / Device Conformance / PCI 
Device Conformance}, \ref{sec:Conformance / Device Conformance / MMIO Device 
Conformance} or \ref{sec:Conformance / Device Conformance / Channel I/O Device 
Conformance}.
@@ -63,6 +64,7 @@ \section{Conformance Targets}\label{sec:Conformance / 
Conformance Targets}
 \ref{sec:Conformance / Device Conformance / PMEM Device Conformance}.
 
 \item Clause \ref{sec:Conformance / Legacy Interface: Transitional Device 
and Transitional Driver Conformance}.
+\item Clause \ref{sec:Conformance / Transitional MMR Interface: 
Transitional MMR Device and Transitional MMR Driver Conformance}.
   \end{itemize}
 \end{description}
 
@@ -294,3 +296,5 @@ \section{Conformance Targets}\label{sec:Conformance / 
Conformance Targets}
 \item Section \ref{sec:Device Types / SCSI Host Device / Device Operation / 
Device Operation: eventq / Legacy Interface: Device Operation: eventq}
 \item Section \ref{sec:Reserved Feature Bits / Legacy Interface: Reserved 
Feature Bits}
 \end{itemize}
+
+\input{tmmr-conformance.tex}
diff --git a/tmmr-conformance.tex b/tmmr-conformance.tex
new file mode 100644
index 000..ad3489b
--- /dev/null
+++ b/tmmr-conformance.tex
@@ -0,0 +1,27 @@
+\conformance{\section}{Transitional MMR Interface: Transitional MMR Device and 
Transitional MMR Driver Conformance}\label{sec:Conformance / Transitional MMR 
Interface: Transitional MMR Device and Transitional MMR Driver Conformance}
+
+An implementation MAY choose to implement OPTIONAL support for the
+transitional MMR interface, by conforming to all of the MUST
+level requirements for the transitional MMR interface for the
+transitional devices and drivers.
+
+The requirements for the transitional MMR interface follows all
+the legacy interface requirements listed in section
+\ref{sec:Conformance / Legacy Interface: Transitional Device and
+Transitional Driver Conformance} with the following exceptions.
+
+Following requirements MUST NOT be implemented:
+
+\begin{itemize}
+\item Section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI 
Device Discovery / Legacy Interfaces: A Note on PCI Device Discovery}
+\item Section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio 
Structure PCI Capabilities / Common configuration structure layout / Legacy 
Interfaces: A Note on Configuration Registers}
+\end{itemize}
+
+Instead following requirements MUST be implemented:
+
+\begin{itemize}
+\item Section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI 
Device Discovery / Transitional MMR Interface: A Note on PCI Device Discovery}
+\item Section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio 
Structure PCI Capabilities / Common configuration structure layout / 
Transitional MMR Interface: A Note on Configuration Registers}
+\item Section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio 
Structure PCI Capabilities / Notification capability / Transitional MMR 
Interface}
+\item Section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio 
Structure PCI Capabilities / Transitiona

[virtio-dev] Re: [PATCH 02/11] transport-pci: Move transitional device id to legacy section

2023-03-30 Thread Michael S. Tsirkin
On Fri, Mar 31, 2023 at 01:58:25AM +0300, Parav Pandit wrote:
> Currently PCI device discovery details for the transitional device
> are documented in two different sections.
> 
> For example, PCI device and vendor ID registers are documented in
> 'Device Requirements: PCI Device Discovery' section, while PCI
> revision id is documented in 'Legacy Interfaces: A Note on PCI
> Device Discovery' section.
> 
> Transitional devices requirements should be documented in "legacy
> interfaces" section as clearly mentioned in
> 'Legacy Interface: A Note on Feature Bits'.

I already commented on this, I disagree.
Modern drivers must be able
to completely ignore legacy interface sections, but they
do bind to transitional device IDs.
This change breaks this assumption.


> Hence,
> 1. Move transitional device requirements to its designated Legacy
>interface section
> 2. Describe regular device requirements without quoting it as "non
>transitional device"
> 
> While at it, write the description using a singular object definition.
> 
> Reviewed-by: Satananda Burla 
> Signed-off-by: Parav Pandit 
> ---
>  transport-pci.tex | 70 ---
>  1 file changed, 36 insertions(+), 34 deletions(-)
> 
> diff --git a/transport-pci.tex b/transport-pci.tex
> index 7f27107..1f74c6f 100644
> --- a/transport-pci.tex
> +++ b/transport-pci.tex
> @@ -28,46 +28,24 @@ \subsection{PCI Device Discovery}\label{sec:Virtio 
> Transport Options / Virtio Ov
>  
>  \devicenormative{\subsubsection}{PCI Device Discovery}{Virtio Transport 
> Options / Virtio Over PCI Bus / PCI Device Discovery}
>  
> -Devices MUST have the PCI Vendor ID 0x1af4.
> -Devices MUST either have the PCI Device ID calculated by adding 0x1040
> +The device MUST have the PCI Vendor ID 0x1af4.
> +The device MUST calculate PCI Device ID by adding 0x1040
>  to the Virtio Device ID, as indicated in section \ref{sec:Device
> -Types} or have the Transitional PCI Device ID depending on the device type,
> -as follows:
> -
> -\begin{tabular}{|l|c|}
> -\hline
> -Transitional PCI Device ID  &  Virtio Device\\
> -\hline \hline
> -0x1000  &   network device \\
> -\hline
> -0x1001 &   block device \\
> -\hline
> -0x1002 & memory ballooning (traditional)  \\
> -\hline
> -0x1003 &  console   \\
> -\hline
> -0x1004 & SCSI host  \\
> -\hline
> -0x1005 &  entropy source\\
> -\hline
> -0x1009 &   9P transport \\
> -\hline
> -\end{tabular}
> +Types}.
>  
>  For example, the network device with the Virtio Device ID 1
> -has the PCI Device ID 0x1041 or the Transitional PCI Device ID 0x1000.
> -
> -The PCI Subsystem Vendor ID and the PCI Subsystem Device ID MAY reflect
> -the PCI Vendor and Device ID of the environment (for informational purposes 
> by the driver).
> +has the PCI Device ID 0x1041.
>  
> -Non-transitional devices SHOULD have a PCI Device ID in the range
> -0x1040 to 0x107f.
> -Non-transitional devices SHOULD have a PCI Revision ID of 1 or higher.
> -Non-transitional devices SHOULD have a PCI Subsystem Device ID of 0x40 or 
> higher.
> +The device SHOULD have a PCI Device ID in the range 0x1040 to 0x107f.
> +The device SHOULD have a PCI Revision ID of 1 or higher.
> +The device SHOULD have a PCI Subsystem Device ID of 0x40 or higher.
>  
>  This is to reduce the chance of a legacy driver attempting
>  to drive the device.
>  
> +The PCI Subsystem Vendor ID and the PCI Subsystem Device ID MAY reflect
> +the PCI Vendor and Device ID of the environment (for informational purposes 
> by the driver).
> +
>  \drivernormative{\subsubsection}{PCI Device Discovery}{Virtio Transport 
> Options / Virtio Over PCI Bus / PCI Device Discovery}
>  Drivers MUST match devices with the PCI Vendor ID 0x1af4 and
>  the PCI Device ID in the range 0x1040 to 0x107f,
> @@ -85,8 +63,32 @@ \subsection{PCI Device Discovery}\label{sec:Virtio 
> Transport Options / Virtio Ov
>  PCI Subsystem Device ID value.
>  
>  \subsubsection{Legacy Interfaces: A Note on PCI Device 
> Discovery}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI 
> Device Discovery / Legacy Interfaces: A Note on PCI Device Discovery}
> -Transitional devices MUST have a PCI Revision ID of 0.
> -Transitional devices MUST have the PCI Subsystem Device ID
> +
> +The transitional device has one of the following PCI Device ID
> +depending on the device type:
> +
> +\begin{tabular}{|l|c|}
> +\hline
> +Transitional PCI Device ID  &  Virtio Device\\
> +\hline \hline
> +0x1000  &   network device \\
> +\hline
> +0x1001 &   block device \\
> +\hline
> +0x1002 & memory ballooning (traditional)  \\
> +\hline
> +0x1003 &  console   \\
> +\hline
> +0x1004 & SCSI host  \\
> +\hline
> +0x1005 &  entropy source\\
> +\hline
> +0x1009 &   9P transport \\
> +\hline
> +\end{tabular}
> +
> +The transitional device MUST have a PCI Revision ID of 0.
> +The transitional device MUST have th