[PATCH 2/7] qapi: Fix comment indentation

2022-04-29 Thread Andrea Bolognani
The comment should start on the very first column.

Signed-off-by: Andrea Bolognani 
---
 qapi/ui.json | 30 +++---
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/qapi/ui.json b/qapi/ui.json
index 059302a5ef..43e62efd76 100644
--- a/qapi/ui.json
+++ b/qapi/ui.json
@@ -1250,21 +1250,21 @@
 '*p2p': 'bool',
 '*audiodev': 'str' } }
 
- ##
- # @DisplayGLMode:
- #
- # Display OpenGL mode.
- #
- # @off: Disable OpenGL (default).
- # @on: Use OpenGL, pick context type automatically.
- #  Would better be named 'auto' but is called 'on' for backward
- #  compatibility with bool type.
- # @core: Use OpenGL with Core (desktop) Context.
- # @es: Use OpenGL with ES (embedded systems) Context.
- #
- # Since: 3.0
- #
- ##
+##
+# @DisplayGLMode:
+#
+# Display OpenGL mode.
+#
+# @off: Disable OpenGL (default).
+# @on: Use OpenGL, pick context type automatically.
+#  Would better be named 'auto' but is called 'on' for backward
+#  compatibility with bool type.
+# @core: Use OpenGL with Core (desktop) Context.
+# @es: Use OpenGL with ES (embedded systems) Context.
+#
+# Since: 3.0
+#
+##
 { 'enum': 'DisplayGLMode',
   'data': [ 'off', 'on', 'core', 'es' ] }
 
-- 
2.35.1




Re: [RFC PATCH v1 0/8] qapi: add generator for Golang interface

2022-04-29 Thread Andrea Bolognani
On Thu, Apr 28, 2022 at 03:50:55PM +0200, Markus Armbruster wrote:
> Andrea Bolognani  writes:
> > One concern that I have is about naming struct members: things like
> > SpiceInfo.MouseMode and most others are translated from the QAPI
> > schema exactly the way you'd expect them, but for example
> > ChardevCommon.Logappend doesn't look quite right.
>
> It doesn't look quite right in the QAPI schema, either: @logappend.  If
> it was @log-append, as it should be, then it would get translated to
> LogAppend, I guess.
>
> Fixing up style isn't a code generator's job.

I agree that the generator shouldn't take too many liberties when
translating names, and specifically should never attempt to figure
out that @logappend should have been @log-append instead.

What I was thinking of was more along the lines of, can we change the
schema so that the proper name is available to the generator without
breaking the wire protocol? Maybe something like

  ##
  # ChardevCommon:
  #
  # @logappend (rename @log-append): ...
  ##

That way the generator would have access to both names, and
would thus be able to generate

  type ChardevCommon struct {
LogAppend *bool `json:"logappend,omitempty"`
  }

The wire protocol would still retain the unappealing name, but at
least client libraries could hide the ugliness from users.

> > Same for the various
> > structs or members that have unexpectedly-capitalized "Tls" or "Vnc"
> > in them.
>
> Examples?

A perfect one is TlsCredsProperties, whose endpoint member is of type
QCryptoTLSCredsEndpoint.

On the VNC front, we have SetPasswordOptionsVnc but also
DisplayReloadOptionsVNC.

There's plenty more, but this should be illustrative enough already.
Capitalization of these acronyms is inconsistent across the schema,
with one of the two forms disagreeing with Go naming expectations.

In this case we might be able to just change the schema without
introducing backwards compatibility issues, though? Type names are
not actually transmitted on the wire IIUC.

> > but maybe
> > there's room for adding this kind of information in the form of
> > additional annotations or something like that?
>
> We did for enumeration types: 'prefix' overrides the TYPE_NAME prefix.
> I fear this was a mistake.

This might be an oversimplification, but I think we might be able to
solve all of these issues with a single annotation in the form

  namespace:word-MLA-other-words

where "MLA" stands for a multi-letter acronym such as TLS or VNC.

So for example QCryptoTLSCredsEndpoint would be annotated with

  q:crypto-TLS-creds-endpoint

and each generator would have enough information to produce
identifiers that fit into the corresponding language, such as

  qcrypto_tls_creds_endpoint
  CryptoTlsCredsEndpoint

or whatever else.
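
To make this concrete, here's a quick sketch of how a generator could
turn such an annotation into a language-appropriate identifier. This
is entirely hypothetical toy code, not part of the series, and the Go
convention of keeping initialisms uppercase is just one possible
policy:

  package main

  import (
          "fmt"
          "strings"
          "unicode"
  )

  // goName turns an annotated QAPI name such as
  // "q:crypto-TLS-creds-endpoint" into a Go-style identifier: words
  // spelled in all uppercase (the MLAs) keep their capitalization,
  // every other word gets its first letter capitalized.
  func goName(annotated string) string {
          name := annotated
          if i := strings.IndexByte(annotated, ':'); i >= 0 {
                  name = annotated[i+1:]
          }
          var out strings.Builder
          for _, word := range strings.Split(name, "-") {
                  if word == "" {
                          continue
                  }
                  if word == strings.ToUpper(word) {
                          // MLA: keep the capitalization as-is
                          out.WriteString(word)
                          continue
                  }
                  runes := []rune(word)
                  runes[0] = unicode.ToUpper(runes[0])
                  out.WriteString(string(runes))
          }
          return out.String()
  }

  func main() {
          fmt.Println(goName("q:crypto-TLS-creds-endpoint"))
          // Prints: CryptoTLSCredsEndpoint
  }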

Of course such annotations would only be necessary to deal with
identifiers that are not already following the expected naming
conventions and when MLAs are involved.

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [PATCH] qapi: Fix malformed "Since:" section tags

2022-04-22 Thread Andrea Bolognani
On Fri, Apr 22, 2022 at 03:28:07PM +0200, Markus Armbruster wrote:
> "Since X.Y" is not recognized as a tagged section, and therefore not
> formatted as such in generated documentation.  Fix by adding the
> required colon.
>
> Signed-off-by: Markus Armbruster 
> ---
>  qapi/crypto.json | 3 +--
>  qapi/machine.json| 2 +-
>  qapi/misc.json   | 2 +-
>  qga/qapi-schema.json | 2 +-
>  4 files changed, 4 insertions(+), 5 deletions(-)

Reviewed-by: Andrea Bolognani 

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: Create qemu-project/py-qemu.qmp repo

2022-04-22 Thread Andrea Bolognani
On Thu, Apr 21, 2022 at 05:00:16PM -0400, John Snow wrote:
> On Thu, Apr 21, 2022, 2:00 PM Andrea Bolognani  wrote:
> > I think I would go with "python-qemu-qmp". Having a dot in the name
> > of a git repo is not very common AFAICT, and I wouldn't rule out the
> > possibility of some GitLab feature or other tooling breaking or
> > misbehaving because of it.
>
> The idea is to have the repo name resemble the Python package name, which
> is "qemu.qmp". For Python, it's customary to have the package name match
> the import name. The import name is "qemu.qmp".
>
> I tested this name on GitLab and it appears to work just fine.

I'm concerned about issues that you'd only trigger when using
certain, perhaps less common, features.

Here's an example of such an issue from just a year ago:

  https://gitlab.com/gitlab-org/gitlab/-/issues/224669

There's an epic tracking more issues of the same kind, though
admittedly most were addressed four years ago:

  https://gitlab.com/groups/gitlab-org/-/epics/3740

Up to you whether you feel confident enough that you're not going to
run into issues later.

> > If you're really keen on saving those few extra keystrokes, maybe
> > "pyqemu" is a better prefix than "py-qemu"? I don't know, it just
> > looks more natural to me.
>
> I'd add "py:" as a prefix, but the colon doesn't work as a filename in many
> places, so I suggested "py-".
>
> Thus, all together, "py-qemu.qmp".
>
> (I could spell out "python", I just prefer the shorter prefix because it's
> explanatory enough as-is and I like keeping git checkout names short. My
> favorite color of bike shed is blue.)

You can absolutely have short names locally even when things are
spelled out in GitLab.

Anyway, in this case my taste in names is clearly simply different
from yours and you should absolutely feel free to ignore my opinion
on the matter :)

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: Create qemu-project/py-qemu.qmp repo

2022-04-21 Thread Andrea Bolognani
On Thu, Apr 21, 2022 at 12:46:48PM -0400, John Snow wrote:
> Hi Alex: do you have the ability to create a blank/empty "py-qemu.qmp"
> repo under the qemu-project grouping, and add me and Cleber as
> maintainers for it? There weren't any objections when I floated the
> idea [1].
>
> (Though I suggested "py-qemu.qmp" and Dan suggested "python-qemu.qmp".
> I don't think we explicitly reconciled the difference. I like the
> shorter one.)

Since you CC'd me on this message, I'm going to assume you're
explicitly welcoming my input on what specific shade this bikeshed
should be painted ;)

I think I would go with "python-qemu-qmp". Having a dot in the name
of a git repo is not very common AFAICT, and I wouldn't rule out the
possibility of some GitLab feature or other tooling breaking or
misbehaving because of it.

If you're really keen on saving those few extra keystrokes, maybe
"pyqemu" is a better prefix than "py-qemu"? I don't know, it just
looks more natural to me.

-- 
Andrea Bolognani / Red Hat / Virtualization




[PATCH 2/2] docs: build-platforms: Clarify stance on minor releases and backports

2022-04-20 Thread Andrea Bolognani
These changes match those made in the following libvirt commits:

  2ac78307af docs: Clarify our stance on backported packages
  78cffd450a docs: Spell out our policy concerning minor releases

Since QEMU's platform support policy is based on libvirt's, it
makes sense to mirror these recent changes made to the latter.

The policy is not altered significantly - we're simply spelling
out some rules that were likely already being implicitly
enforced.

Signed-off-by: Andrea Bolognani 
---
 docs/about/build-platforms.rst | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/docs/about/build-platforms.rst b/docs/about/build-platforms.rst
index e9163ba556..dce6557bd4 100644
--- a/docs/about/build-platforms.rst
+++ b/docs/about/build-platforms.rst
@@ -71,7 +71,10 @@ The project aims to support the most recent major version at all times. Support
 for the previous major version will be dropped 2 years after the new major
 version is released or when the vendor itself drops support, whichever comes
 first. In this context, third-party efforts to extend the lifetime of a distro
-are not considered, even when they are endorsed by the vendor (eg. Debian LTS).
+are not considered, even when they are endorsed by the vendor (eg. Debian LTS);
+the same is true of repositories that contain packages backported from later
+releases (e.g. Debian backports). Within each major release, only the most
+recent minor release is considered.
 
 For the purposes of identifying supported software versions available on Linux,
 the project will look at CentOS, Debian, Fedora, openSUSE, RHEL, SLES and
-- 
2.35.1




Re: [PATCH 2/2] docs: build-platforms: Clarify stance on minor releases and backports

2022-04-20 Thread Andrea Bolognani
On Wed, Apr 20, 2022 at 05:15:08PM +0100, Daniel P. Berrangé wrote:
> On Wed, Apr 20, 2022 at 06:03:11PM +0200, Andrea Bolognani wrote:
> > These changes match those made in the following libvirt commits:
> >
> >   2ac78307af docs: Clarify our stance on backported packages
> >   78cffd450a docs: Spell out our policy concerning minor releases
> >
> > Since QEMU's platform support policy is based on libvirt's, it
> > makes sense to mirror these recent changes made to the latter.
> >
> > The policy is not altered significantly - we're simply spelling
> > out some rules that were likely already being implicitly
> > enforced.
>
> Indeed, I think that's basically de facto the case already.
>
> Reviewed-by: Daniel P. Berrangé 

Thanks! Are you going to bring these in through one of your trees, or
do I need to bug someone else so that they will pick them up? :)

-- 
Andrea Bolognani / Red Hat / Virtualization




[PATCH 1/2] docs: build-platforms: Fix spelling for Homebrew

2022-04-20 Thread Andrea Bolognani
Signed-off-by: Andrea Bolognani 
---
 docs/about/build-platforms.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/about/build-platforms.rst b/docs/about/build-platforms.rst
index c29a4b8fe6..e9163ba556 100644
--- a/docs/about/build-platforms.rst
+++ b/docs/about/build-platforms.rst
@@ -80,7 +80,7 @@ Ubuntu LTS. Other distros will be assumed to ship similar software versions.
 For FreeBSD and OpenBSD, decisions will be made based on the contents of the
 respective ports repository, while NetBSD will use the pkgsrc repository.
 
-For macOS, `HomeBrew`_ will be used, although `MacPorts`_ is expected to carry
+For macOS, `Homebrew`_ will be used, although `MacPorts`_ is expected to carry
 similar versions.
 
 Windows
@@ -92,6 +92,6 @@ hosted on Linux (Debian/Fedora).
 The version of the Windows API that's currently targeted is Vista / Server
 2008.
 
-.. _HomeBrew: https://brew.sh/
+.. _Homebrew: https://brew.sh/
 .. _MacPorts: https://www.macports.org/
 .. _Repology: https://repology.org/
-- 
2.35.1




[PATCH 0/2] docs: build-platforms: Fix and clarify

2022-04-20 Thread Andrea Bolognani


Andrea Bolognani (2):
  docs: build-platforms: Fix spelling for Homebrew
  docs: build-platforms: Clarify stance on minor releases and backports

 docs/about/build-platforms.rst | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

-- 
2.35.1





[PATCH 3/3] qapi: Fix typo

2022-04-20 Thread Andrea Bolognani
Signed-off-by: Andrea Bolognani 
---
 qapi/sockets.json | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/qapi/sockets.json b/qapi/sockets.json
index 5773d9fcc4..fccc38584b 100644
--- a/qapi/sockets.json
+++ b/qapi/sockets.json
@@ -149,7 +149,7 @@
 #
 # Note: This type is deprecated in favor of SocketAddress.  The
 #   difference between SocketAddressLegacy and SocketAddress is that the
-#   latter is has fewer {} on the wire.
+#   latter has fewer {} on the wire.
 #
 # Since: 1.3
 ##
-- 
2.35.1




[PATCH 2/3] qapi: Fix documentation for query-xen-replication-status

2022-04-20 Thread Andrea Bolognani
The correct return type is ReplicationStatus, not
ReplicationResult.

Signed-off-by: Andrea Bolognani 
---
 qapi/migration.json | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/qapi/migration.json b/qapi/migration.json
index 27d7b28158..409eb086a2 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -1619,7 +1619,7 @@
 #
 # Query replication status while the vm is running.
 #
-# Returns: A @ReplicationResult object showing the status.
+# Returns: A @ReplicationStatus object showing the status.
 #
 # Example:
 #
-- 
2.35.1




[PATCH 1/3] docs: qapi: Remove outdated reference to simple unions

2022-04-20 Thread Andrea Bolognani
Commit 4e99f4b12c0e dropped simple unions and updated most
documentation accordingly, but in one case we still claim that
there are "two flavors of unions".

Signed-off-by: Andrea Bolognani 
---
 docs/devel/qapi-code-gen.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/devel/qapi-code-gen.rst b/docs/devel/qapi-code-gen.rst
index 246709ede8..7b968433a6 100644
--- a/docs/devel/qapi-code-gen.rst
+++ b/docs/devel/qapi-code-gen.rst
@@ -41,8 +41,8 @@ used internally.
 
 There are several kinds of types: simple types (a number of built-in
 types, such as ``int`` and ``str``; as well as enumerations), arrays,
-complex types (structs and two flavors of unions), and alternate types
-(a choice between other types).
+complex types (structs and unions), and alternate types (a choice
+between other types).
 
 
 Schema syntax
-- 
2.35.1




[PATCH 0/3] qapi: Random small fixes

2022-04-20 Thread Andrea Bolognani


Andrea Bolognani (3):
  docs: qapi: Remove outdated reference to simple unions
  qapi: Fix documentation for query-xen-replication-status
  qapi: Fix typo

 docs/devel/qapi-code-gen.rst | 4 ++--
 qapi/migration.json  | 2 +-
 qapi/sockets.json| 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

-- 
2.35.1





Re: [RFC PATCH v1 0/8] qapi: add generator for Golang interface

2022-04-19 Thread Andrea Bolognani
On Tue, Apr 19, 2022 at 11:12:28AM -0700, Andrea Bolognani wrote:
> Dealing with errors and commands that don't have a return value might
> require us to have generic CommandResult wrapper after all, but we
> should really try as hard as we can to stick to type safe interfaces.

On second thought, this wouldn't actually need to be generic: we
could have something like

  type TraceEventGetStateResult struct {
  Result TraceEventInfo `json:"return"`
  Error  *Error `json:"error"`
  }

and the caller would check that res.Error is nil before accessing
res.Result.

Commands for which a return value is not expected would just have the
Error part in their corresponding Result struct, and those that can
either return an object or nothing (are there actually any like
that?) could have a pointer as the Result member.
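
To make the shape of that concrete, here's a self-contained sketch of
what checking for an error could look like. Note that the exact
fields of the Error type are an assumption on my part, loosely
modeled on what QMP error responses carry on the wire:

  package main

  import (
          "encoding/json"
          "fmt"
  )

  type TraceEventInfo struct {
          Name  string `json:"name"`
          State string `json:"state"`
  }

  // Hypothetical error type; the field names are assumptions.
  type Error struct {
          Class       string `json:"class"`
          Description string `json:"desc"`
  }

  type TraceEventGetStateResult struct {
          Result TraceEventInfo `json:"return"`
          Error  *Error         `json:"error"`
  }

  func main() {
          qmp_output :=
            ([]byte)(`{"error":{"class":"GenericError","desc":"unknown event"}}`)

          res := TraceEventGetStateResult{}
          if err := json.Unmarshal(qmp_output, &res); err != nil {
                  panic(err)
          }
          if res.Error != nil {
                  fmt.Println("command failed:", res.Error.Description)
                  return
          }
          fmt.Println(res.Result.Name, res.Result.State)
  }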

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [RFC PATCH v1 0/8] qapi: add generator for Golang interface

2022-04-19 Thread Andrea Bolognani
On Sat, Apr 02, 2022 at 12:40:56AM +0200, Victor Toso wrote:
> Thanks for taking a look, let me know if you have questions, ideas
> or suggestions.

Full disclosure: I have only given the actual implementation a very
cursory look so far, and I've focused on the generated Go API
instead.

Overall things look pretty good.

One concern that I have is about naming struct members: things like
SpiceInfo.MouseMode and most others are translated from the QAPI
schema exactly the way you'd expect them, but for example
ChardevCommon.Logappend doesn't look quite right. Of course there's
no way to programmatically figure out what to capitalize, but maybe
there's room for adding this kind of information in the form of
additional annotations or something like that? Same for the various
structs or members that have unexpectedly-capitalized "Tls" or "Vnc"
in them.

To be clear, I don't think the above is a blocker - just something to
be aware of, and think about.

My biggest concern is about the interface offered for commands.

Based on the example you have in the README and how commands are
defined, invoking (a simplified version of) the trace-event-get-state
command would look like

  cmd := Command{
  Name: "trace-event-get-state",
  Arg: TraceEventGetStateCommand{
  Name: "qemu_memalign",
  },
  }
  qmp_input, _ := json.Marshal(&cmd)
  // qmp_input now contains
  //   {"execute":"trace-event-get-state","arguments":{"name":"qemu_memalign"}}
  // do something with it

  qmp_output :=
([]byte)(`{"return":{"name":"qemu_memalign","state":"disabled"}}`)
  ret := cmd.GetReturnType()
  _ = json.Unmarshal(qmp_output, &ret)
  // ret is a CommandResult instance whose Value member can be cast
  // to a TraceEventInfo struct

First of all, from an application's point of view there are way too
many steps involved: performing this operation should really be as
simple as

  ret, _ := qmp.TraceEventGetState("qemu_memalign")
  // ret is a TraceEventInfo instance

That's the end state we should be working towards.

Of course that assumes that the "qmp" object knows where the QMP
socket is, knows how to talk the QMP protocol, transparently deals
with serializing and deserializing data... Plus, in some cases you
might want to deal with the wire transfer yourself in an
application-specific manner. So it makes sense to have the basic
building blocks available and then build the more ergonomic SDK on
top of that - with only the first part being in scope for this
series.
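
Just to illustrate the layering, here's roughly what I imagine that
more ergonomic layer could look like once the building blocks exist.
Everything below is hypothetical: in particular, the Monitor type and
its roundTrip member are mere stand-ins for whatever ends up
performing the actual wire transfer:

  package main

  import (
          "encoding/json"
          "fmt"
  )

  type TraceEventGetStateCommand struct {
          Name string `json:"name"`
  }

  type TraceEventInfo struct {
          Name  string `json:"name"`
          State string `json:"state"`
  }

  // Monitor stands in for the object that knows where the QMP socket
  // is and how to talk the protocol; it's stubbed out so that the
  // example is self-contained.
  type Monitor struct {
          roundTrip func(input []byte) ([]byte, error)
  }

  func (m *Monitor) TraceEventGetState(name string) (*TraceEventInfo, error) {
          cmd := struct {
                  Execute   string                     `json:"execute"`
                  Arguments *TraceEventGetStateCommand `json:"arguments"`
          }{
                  Execute:   "trace-event-get-state",
                  Arguments: &TraceEventGetStateCommand{Name: name},
          }
          input, err := json.Marshal(&cmd)
          if err != nil {
                  return nil, err
          }
          output, err := m.roundTrip(input)
          if err != nil {
                  return nil, err
          }
          ret := struct {
                  Return TraceEventInfo `json:"return"`
          }{}
          if err := json.Unmarshal(output, &ret); err != nil {
                  return nil, err
          }
          return &ret.Return, nil
  }

  func main() {
          mon := &Monitor{
                  roundTrip: func(input []byte) ([]byte, error) {
                          // Pretend the command went over the wire and
                          // this is the reply that came back.
                          reply := `{"return":{"name":"qemu_memalign","state":"disabled"}}`
                          return ([]byte)(reply), nil
                  },
          }
          info, _ := mon.TraceEventGetState("qemu_memalign")
          fmt.Println(info.Name, info.State)
          // Prints: qemu_memalign disabled
  }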

Even with that in mind, the current interface is IMO problematic
because of its almost complete lack of type safety. Both Command.Arg
and CommandResult.Value are of type Any, and CommandBase.Name, which
drives the JSON unmarshal logic and also ends up on the wire when
executing a command, is just a plain string.

I think the low-level interface should look more like

  cmd := TraceEventGetStateCommand{
  Name: "qemu_memalign",
  }
  qmp_input, _ := json.Marshal(&cmd)
  // qmp_input looks the same as before

  qmp_output :=
([]byte)(`{"return":{"name":"qemu_memalign","state":"disabled"}}`)
  ret := TraceEventInfo{}
  _ = json.Unmarshal(qmp_output, &ret)
  // ret is a TraceEventInfo instance

The advantages over the current implementation are that the compiler
will prevent you from doing something silly like passing the wrong
set of arguments to a command, and that the application has to
explicitly spell out what kind of object it expects to get as output.

I'm attaching an incomplete implementation that I used for playing
around. It's obviously too simplistic, but hopefully it will help
illustrate my point.

Dealing with errors and commands that don't have a return value might
require us to have generic CommandResult wrapper after all, but we
should really try as hard as we can to stick to type safe interfaces.

-- 
Andrea Bolognani / Red Hat / Virtualization
package main

import (
"encoding/json"
"fmt"
)

type TraceEventGetStateCommand struct {
Name string `json:"name"`
}

func (self *TraceEventGetStateCommand) MarshalJSON() ([]byte, error) {
type Arguments TraceEventGetStateCommand
return json.Marshal(&struct {
Execute   string `json:"execute"`
Arguments *Arguments `json:"arguments"`
}{
Execute:   "trace-event-get-state",
Arguments: (*Arguments)(self),
})
}

type TraceEventInfo struct {
Name  string `json:"name"`
State string `json:"state"`
}

func (self *TraceEventInfo) UnmarshalJSON(data []byte) error {
type Return TraceEventInfo
ret := struct {
        Return Return `json:"return"`
    }{}
    err := json.Unmarshal(data, &ret)
    if err != nil {
        return err
    }
    *self = TraceEventInfo(ret.Return)
    return nil
}

Re: [PATCH 01/10] python/aqmp: add explicit GPLv2 license to legacy.py

2022-03-24 Thread Andrea Bolognani
On Thu, Mar 24, 2022 at 09:03:07AM +, Daniel P. Berrangé wrote:
> > Overall making it *all* GPLv2+ compat is going to be important if you
> > want people to be comfortable using it. If it has a mix of GPLv2+
> > and GPLv2-only code in the source tarball, then the overall combined
> > work will have to be considered GPLv2-only and that will put people
> > off using it. Even if they could theoretically restrict their usage
> > to only the GPLv2+ parts, many won't get that far before moving on.

Agreed.

> Actually I'll go furthuer and suggest that if we're going to do a
> relicensing at all, and your goal is to encourage usage, then GPLv2+
> is the wrong choice. Use LGPLv2+ if you want to facilitate usage, while
> retaining a copyleft license.

Does LGPL make sense in the context of Python, where there is no
linking?

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: qemu crash 100% CPU with Ubuntu10.04 guest (solved)

2022-02-17 Thread Andrea Bolognani
On Thu, Feb 17, 2022 at 12:07:15PM +1100, Ben Smith wrote:
> Hi All,
>
> I'm cross-posting this from Reddit qemu_kvm, in case it helps in some
> way. I know my setup is ancient and unique; let me know if you would
> like more info.
>
> Symptoms:
> 1. Ubuntu10.04 32-bit guest locks up randomly between 0 and 30 days.
> 2. The console shows a CPU trace dump, nothing else logged on the guest or 
> host.
> 3. Host system (Ubuntu20.04) 100% CPU for qemu process.
>
> Solution:
> When using virt-install, always use the "--os-variant" parameter!
> e.g. --os-variant ubuntu10.04

FWIW, the --os-variant / --osinfo argument is going to be mandatory
starting with the upcoming virt-manager release.

https://listman.redhat.com/archives/virt-tools-list/2022-February/msg00021.html

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [PATCH 2/2] tests: Update CentOS 8 container to CentOS Stream 8

2022-02-01 Thread Andrea Bolognani
On Tue, Feb 01, 2022 at 06:47:14PM +0100, Philippe Mathieu-Daudé wrote:
> Andrea, do you think it is acceptable to merge this and fix on top, or
> we should do it properly from start?

My preference is always to avoid follow-up tweaks if possible :) but
ultimately the decision is up to the QEMU developers and maintainers.

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [PATCH 2/2] tests: Update CentOS 8 container to CentOS Stream 8

2022-02-01 Thread Andrea Bolognani
On Tue, Feb 01, 2022 at 06:02:21PM +, Daniel P. Berrangé wrote:
> On Tue, Feb 01, 2022 at 09:08:22AM -0800, Andrea Bolognani wrote:
> > CentOS 8 and CentOS Stream 8 are two pretty distinct operating
> > systems in terms of update cadence and stability expectations, so I
> > think that using the label "centos8" for containers and CI jobs that
> > are actually consuming CentOS Stream 8 is going to be a source of
> > confusion.
>
> Given the EOL of what I call the "traditional" CentOS model at the
> end of 2021, both "CentOS" and "CentOS Stream" terms effectively
> mean the same thing now.

Note that while CentOS 7 is no longer targeted by QEMU, it still is
very much relevant as an OS and will keep receiving updates for ~2.5
more years.

> The caveat is that aside from this dockerfile, we also have a
> VM config in test/vm/ that historically used traditional CentOS.
> That may also need updating to point to Stream unless it has
> seemlessly transitioned to using Stream content without us
> needing to change anything. I've not looked closely at that yet.

This example IMO only highlights the need for names to match reality
and why overloading existing names is a bad idea.

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [PATCH 2/2] tests: Update CentOS 8 container to CentOS Stream 8

2022-02-01 Thread Andrea Bolognani
On Tue, Feb 01, 2022 at 11:19:11AM +0100, Thomas Huth wrote:
> +++ b/tests/docker/dockerfiles/centos8.docker
> @@ -1,10 +1,10 @@
>  # THIS FILE WAS AUTO-GENERATED
>  #
> -#  $ lcitool dockerfile centos-8 qemu
> +#  $ lcitool dockerfile centos-stream-8 qemu
>  #
>  # https://gitlab.com/libvirt/libvirt-ci
>
> -FROM docker.io/library/centos:8
> +FROM quay.io/centos/centos:stream8
[...]
> +++ b/tests/lcitool/refresh
> @@ -77,7 +77,7 @@ ubuntu2004_tsanhack = [
>  ]
>
>  try:
> -   generate_dockerfile("centos8", "centos-8")
> +   generate_dockerfile("centos8", "centos-stream-8")

I'm not convinced this is a good idea.

CentOS 8 and CentOS Stream 8 are two pretty distinct operating
systems in terms of update cadence and stability expectations, so I
think that using the label "centos8" for containers and CI jobs that
are actually consuming CentOS Stream 8 is going to be a source of
confusion.

I recommend instead proceeding with a thorough rename, after which
the string "centos8" no longer shows up anywhere.

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [PATCH 0/3] meson: Don't pass 'method' to dependency()

2022-01-19 Thread Andrea Bolognani
On Wed, Jan 19, 2022 at 05:38:38PM +, Daniel P. Berrangé wrote:
> On Wed, Jan 19, 2022 at 06:17:57PM +0100, Andrea Bolognani wrote:
> > See [1] for recent discussion about libgcrypt specifically, which the
> > first patch is about.
> >
> > After writing that one, I realized that there is no point in
> > explicitly passing 'method' to dependency() because Meson will do the
> > right thing by default - hence the next two patches.
>
> This whole series is effectively reverting
>
>   commit 1a94933fcc3d641bda9988244cde61769baae2e5
>   Author: Paolo Bonzini 
>   Date:   Mon Aug 31 06:27:00 2020 -0400
>
> meson: use pkg-config method to find dependencies
>
> We do not need to ask cmake for the dependencies, so just use the
> pkg-config mechanism.  Keep "auto" for SDL so that it tries using
> sdl-config too.
>
> The documentation is adjusted to use SDL2_image as the example,
> rather than SDL which does not use the "pkg-config" method.
>
> Signed-off-by: Paolo Bonzini 
>
> which IIRC was done to get rid of mesons' confusing/misleading
> attempts to probe for things via cmake when the pkg-config file
> is not present.

I guess I stand corrected on Meson doing the right thing by default
then :)

The first patch should still make sense though: libgcrypt is like
SDL in that Meson implements special handling for it, and we should
allow the pkg-config file to be used if available.

-- 
Andrea Bolognani / Red Hat / Virtualization




[PATCH 1/3] meson: Don't force use of libgcrypt-config

2022-01-19 Thread Andrea Bolognani
libgcrypt 1.9.0 (released in 2021-01) ships with a proper
pkg-config file, which Meson's libgcrypt detection code can use
if available.

Passing 'config-tool' as 'method' when calling dependency(),
however, forces Meson to ignore the pkg-config file and always
use libgcrypt-config instead.

Signed-off-by: Andrea Bolognani 
---
 meson.build | 1 -
 1 file changed, 1 deletion(-)

diff --git a/meson.build b/meson.build
index 762d7cee85..bc17ba67fd 100644
--- a/meson.build
+++ b/meson.build
@@ -1036,7 +1036,6 @@ endif
 if not gnutls_crypto.found()
  if (not get_option('gcrypt').auto() or have_system) and not get_option('nettle').enabled()
 gcrypt = dependency('libgcrypt', version: '>=1.8',
-method: 'config-tool',
 required: get_option('gcrypt'),
 kwargs: static_kwargs)
 # Debian has removed -lgpg-error from libgcrypt-config
-- 
2.34.1




[PATCH 3/3] docs: Don't recommend passing 'method' to dependency()

2022-01-19 Thread Andrea Bolognani
Meson will do the right thing by default.

Signed-off-by: Andrea Bolognani 
---
 docs/devel/build-system.rst | 1 -
 1 file changed, 1 deletion(-)

diff --git a/docs/devel/build-system.rst b/docs/devel/build-system.rst
index 431caba7aa..fcdc0cd187 100644
--- a/docs/devel/build-system.rst
+++ b/docs/devel/build-system.rst
@@ -316,7 +316,6 @@ dependency will be used::
   sdl_image = not_found
   if not get_option('sdl_image').auto() or have_system
 sdl_image = dependency('SDL2_image', required: get_option('sdl_image'),
-   method: 'pkg-config',
static: enable_static)
   endif
 
-- 
2.34.1




[PATCH 0/3] meson: Don't pass 'method' to dependency()

2022-01-19 Thread Andrea Bolognani
See [1] for recent discussion about libgcrypt specifically, which the
first patch is about.

After writing that one, I realized that there is no point in
explicitly passing 'method' to dependency() because Meson will do the
right thing by default - hence the next two patches.

[1] https://lists.gnu.org/archive/html/qemu-devel/2022-01/msg01224.html

Andrea Bolognani (3):
  meson: Don't force use of libgcrypt-config
  meson: Don't pass 'method' to dependency()
  docs: Don't recommend passing 'method' to dependency()

 docs/devel/build-system.rst |  1 -
 meson.build | 75 +++--
 tcg/meson.build |  2 +-
 3 files changed, 31 insertions(+), 47 deletions(-)

-- 
2.34.1





[PATCH 2/3] meson: Don't pass 'method' to dependency()

2022-01-19 Thread Andrea Bolognani
Meson will do the right thing by default.

Signed-off-by: Andrea Bolognani 
---
 meson.build | 74 -
 tcg/meson.build |  2 +-
 2 files changed, 31 insertions(+), 45 deletions(-)

diff --git a/meson.build b/meson.build
index bc17ba67fd..b807ad9fbb 100644
--- a/meson.build
+++ b/meson.build
@@ -427,13 +427,13 @@ if 'CONFIG_GIO' in config_host
 endif
 lttng = not_found
 if 'ust' in get_option('trace_backends')
-  lttng = dependency('lttng-ust', required: true, method: 'pkg-config',
+  lttng = dependency('lttng-ust', required: true,
  kwargs: static_kwargs)
 endif
 pixman = not_found
 if have_system or have_tools
   pixman = dependency('pixman-1', required: have_system, version:'>=0.21.8',
-  method: 'pkg-config', kwargs: static_kwargs)
+  kwargs: static_kwargs)
 endif
 zlib = dependency('zlib', required: true, kwargs: static_kwargs)
 
@@ -446,18 +446,18 @@ endif
 linux_io_uring = not_found
 if not get_option('linux_io_uring').auto() or have_block
  linux_io_uring = dependency('liburing', required: get_option('linux_io_uring'),
-  method: 'pkg-config', kwargs: static_kwargs)
+  kwargs: static_kwargs)
 endif
 libxml2 = not_found
 if not get_option('libxml2').auto() or have_block
   libxml2 = dependency('libxml-2.0', required: get_option('libxml2'),
-   method: 'pkg-config', kwargs: static_kwargs)
+   kwargs: static_kwargs)
 endif
 libnfs = not_found
 if not get_option('libnfs').auto() or have_block
   libnfs = dependency('libnfs', version: '>=1.9.3',
   required: get_option('libnfs'),
-  method: 'pkg-config', kwargs: static_kwargs)
+  kwargs: static_kwargs)
 endif
 
 libattr_test = '''
@@ -505,7 +505,7 @@ seccomp = not_found
 if not get_option('seccomp').auto() or have_system or have_tools
   seccomp = dependency('libseccomp', version: '>=2.3.0',
required: get_option('seccomp'),
-   method: 'pkg-config', kwargs: static_kwargs)
+   kwargs: static_kwargs)
 endif
 
 libcap_ng = not_found
@@ -533,7 +533,7 @@ if get_option('xkbcommon').auto() and not have_system and not have_tools
   xkbcommon = not_found
 else
   xkbcommon = dependency('xkbcommon', required: get_option('xkbcommon'),
- method: 'pkg-config', kwargs: static_kwargs)
+ kwargs: static_kwargs)
 endif
 
 vde = not_found
@@ -562,30 +562,30 @@ endif
 pulse = not_found
 if not get_option('pa').auto() or (targetos == 'linux' and have_system)
   pulse = dependency('libpulse', required: get_option('pa'),
- method: 'pkg-config', kwargs: static_kwargs)
+ kwargs: static_kwargs)
 endif
 alsa = not_found
 if not get_option('alsa').auto() or (targetos == 'linux' and have_system)
   alsa = dependency('alsa', required: get_option('alsa'),
-method: 'pkg-config', kwargs: static_kwargs)
+kwargs: static_kwargs)
 endif
 jack = not_found
 if not get_option('jack').auto() or have_system
   jack = dependency('jack', required: get_option('jack'),
-method: 'pkg-config', kwargs: static_kwargs)
+kwargs: static_kwargs)
 endif
 
 spice_protocol = not_found
 if not get_option('spice_protocol').auto() or have_system
   spice_protocol = dependency('spice-protocol', version: '>=0.12.3',
   required: get_option('spice_protocol'),
-  method: 'pkg-config', kwargs: static_kwargs)
+  kwargs: static_kwargs)
 endif
 spice = not_found
 if not get_option('spice').auto() or have_system
   spice = dependency('spice-server', version: '>=0.12.5',
  required: get_option('spice'),
- method: 'pkg-config', kwargs: static_kwargs)
+ kwargs: static_kwargs)
 endif
 spice_headers = spice.partial_dependency(compile_args: true, includes: true)
 
@@ -595,32 +595,29 @@ libiscsi = not_found
 if not get_option('libiscsi').auto() or have_block
   libiscsi = dependency('libiscsi', version: '>=1.9.0',
  required: get_option('libiscsi'),
- method: 'pkg-config', kwargs: static_kwargs)
+ kwargs: static_kwargs)
 endif
 zstd = not_found
 if not get_option('zstd').auto() or have_block
   zstd = dependency('libzstd', version: '>=1.4.0',
 required: get_option('zstd'),
-method: 'pkg-config', kwargs: static_kwargs)
+kwargs: static_kwargs)
 endif
 virgl = not_found
 if not get_option('virglrenderer').auto() or have_system
   virgl = dependency('virglrenderer',
- method: 'pkg-config',
  required: get_

Re: [Qemu-devel] [PATCH] configure: Add pkg-config handling for libgcrypt

2022-01-07 Thread Andrea Bolognani
On Fri, Jan 07, 2022 at 12:06:47PM +, Daniel P. Berrangé wrote:
> On Fri, Jan 07, 2022 at 12:55:42PM +0100, Thomas Huth wrote:
> > On 07/01/2022 12.43, Andrea Bolognani wrote:
> > > On Thu, Aug 29, 2019 at 10:15:05AM +0100, Daniel P. Berrangé wrote:
> > > > Where are you seeing pkg-config files for libgcrypt ?
> > > >
> > > > The upstream project has (frustratingly) been hostile to any proposal to
> > > > add pkg-config support saying people should stick with their custom
> > > > libgcrypt-config tool
> > > >
> > > > https://dev.gnupg.org/T2037
> > >
> > > Resurrecting an old thread to point out that the upstream stance
> > > seems to have changed since that discussion:
> > >
> > >
> > > https://git.gnupg.org/cgi-bin/gitweb.cgi?p=libgcrypt.git;a=commit;h=97194b422bc89a6137f4e218d4cdee118c63e96e
> > >
> > > libgcrypt 1.9.0, released almost exactly a year ago, comes with a
> > > pkg-config file out of the box. With that in mind, I think it would
> > > make sense to re-evaluate this patch for inclusion.
> >
> > Maybe ... but we switched to Meson in between, so the patch needs to be
> > rewritten to use meson.build instead, I think.

Right. I didn't mean that the patch should be merged as-is, and I
wouldn't expect it to even apply two years later :)

> > Also it seems like version
> > 1.9 is not available in all distros yet, so someone needs to do an
> > assessment whether the distros that use an older version of libgcrypt provide
> > a .pc file or not...

The original approach used the .pc file where present and used
libgcrypt-config as a fallback. Once QEMU stopped targeting all
operating systems that have libgcrypt < 1.9, the fallback path could
have been dropped.

> Comes back to my question of what is the benefit of using the .pc file
> when what we have already works and is known to be portable.
>
> When using meson there is essentially zero burden to using the
> libgcrypt-config approach, because that's all handled transparently
> by meson.  This is different from the situation with configure,
> where libgcrypt-config required extra work on our side.

I didn't realize that was the case! I see now that Meson implements
special handling for libgcrypt, which is very nice indeed :)

> Overall I don't see any need to change it.

IIUC, the fact that the call currently looks like

  gcrypt = dependency('libgcrypt', version: '>=1.8',
  method: 'config-tool', ...)

means that Meson will not use the .pc file even when it's present. I
think changing method to "auto" would cause it to use the .pc file
when it's available, which is likely a better behavior.

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [Qemu-devel] [PATCH] configure: Add pkg-config handling for libgcrypt

2022-01-07 Thread Andrea Bolognani
On Thu, Aug 29, 2019 at 10:15:05AM +0100, Daniel P. Berrangé wrote:
> On Thu, Aug 29, 2019 at 04:53:02PM +0800, zhe...@windriver.com wrote:
> > libgcrypt may also be controlled by pkg-config, this patch adds pkg-config
> > handling for libgcrypt.
>
> Where are you seeing pkg-config files for libgcrypt ?
>
> The upstream project has (frustratingly) been hostile to any proposal to
> add pkg-config support saying people should stick with their custom
> libgcrypt-config tool
>
>https://dev.gnupg.org/T2037
>
> Even if this is something added by some distro downstream, what is the
> benefit in using it, compared with libgcrypt-confg which should already
> work & is portable.

Resurrecting an old thread to point out that the upstream stance
seems to have changed since that discussion:

  
https://git.gnupg.org/cgi-bin/gitweb.cgi?p=libgcrypt.git;a=commit;h=97194b422bc89a6137f4e218d4cdee118c63e96e

libgcrypt 1.9.0, released almost exactly a year ago, comes with a
pkg-config file out of the box. With that in mind, I think it would
make sense to re-evaluate this patch for inclusion.

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [qemu-web PATCH] remove deployment phase from CI

2021-11-03 Thread Andrea Bolognani
On Wed, Nov 03, 2021 at 09:14:59AM +0100, Paolo Bonzini wrote:
> qemu.org is now served via a reverse proxy from qemu-project.gitlab.io; it 
> does
> not need anymore the rsync step to the QEMU project\s shell server.

*project's

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [PATCH RESEND v2 0/6] target/arm: Add nested virtualization support

2021-04-01 Thread Andrea Bolognani
On Thu, 2021-04-01 at 12:55 +, Haibo Xu wrote:
> v2:
>   - Move the NV to a CPU feature flag(Andrea)
>   - Add CPU feature 'el2' test(Andrew)
> 
> Many thanks to Andrea and Andrew for their comments!
> 
> This series add support for ARMv8.3/8.4 nested virtualization support
> in KVM mode. It's based on Marc Zyngier's kernel KVM patches[1], and 
> has been tested on a FVP model to run a L2 guest with Qemu. Now the 
> feature can be enabled by "-M virt,accel=kvm -cpu host,el2=on" when
> starting a VM. 

This looks good from the user interface point of view, thanks for
addressing the concerns that were raised!

I'll leave Drew to review the actual code changes :)

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: Serious doubts about Gitlab CI

2021-03-31 Thread Andrea Bolognani
On Wed, 2021-03-31 at 09:54 +0200, Thomas Huth wrote:
> On 30/03/2021 15.19, Daniel P. Berrangé wrote:
> > Yep, that looks similar to what we do in libvirt, though we don't override
> > the compiler at the job level. Instead we just ensure the dir containing
> > ccache symlinks appears first in $PATH.
> 
> That's of course a nice idea, too, but it does not seem to work with the 
> clang builds:
> 
> https://gitlab.com/thuth/libvirt/-/jobs/1142239591#L3780
> https://gitlab.com/thuth/libvirt/-/jobs/1142239581#L3738

That's because the corresponding Dockerfile doesn't contain
instructions to create a symlink for clang:

  
https://gitlab.com/libvirt/libvirt/-/blob/master/ci/containers/fedora-rawhide.Dockerfile#L105-107

It's simply a bug in lcitool that needs to be addressed.

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [PATCH 0/3] target/arm: Add nested virtualization support

2021-03-22 Thread Andrea Bolognani
On Mon, 2021-03-22 at 10:07 +, Haibo Xu wrote:
> This series add support for ARMv8.3/8.4 nested virtualization support
> in KVM mode. It's based on Marc Zyngier's kernel KVM patches[1], and
> has been tested on a FVP model to run a L2 guest with Qemu. Now the
> feature can be enabled by "-M virt,accel=kvm,virtualization=on" when
> starting a VM.

Why the need to enable this explicitly? AFAIK, that's not necessary
for any other architecture: on x86, you just need to make sure you're
using '-cpu host' and pass a parameter to the kernel module.

Even assuming this can't be enabled transparently, wouldn't its
availability be controlled by a CPU feature flag, similar to what
already happens for SVE and PMU, rather than a machine type option?

That would also address the discoverability issue: unless I'm
mistaken (which I very well might be :), with the current
implementation there's no way to tell whether nested KVM will be
usable short of trying and seeing whether QEMU errors out.

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: minimal "zero conf" build dockerfiles for fedora:latest and alpine:latest

2021-01-14 Thread Andrea Bolognani
On Wed, 2021-01-13 at 13:31 -0500, John Snow wrote:
> On 1/13/21 5:09 AM, Gerd Hoffmann wrote:
> > > I don't like Perl really, but there's a chicken-and-egg problem between
> > > detecting Python and using it to print the configure help script.  For
> > > configure-time tasks, Perl has the advantage that "#! /usr/bin/env perl"
> > > just works.
> > 
> > Assuming perl is actually installed, the world seems to shift to python.
> > On a minimal fedora install python is present but perl is not ...
> > 
> > On the other hand git depends on perl, so it is probably pretty hard to
> > find a developer workstation without perl installed, so maybe that
> > doesn't matter much for the time being.
> 
> I agree that it doesn't matter much right now, Though I don't always 
> have git installed in containers when I am doing builds. It will become 
> more common to encounter environments that are missing "obvious" 
> dependencies.

Note that Fedora has a git-core package that doesn't depend on Perl
while still providing more than enough git for something like a CI
build job.

As a data point, the libvirt project has made it an explicit goal[1]
to remove all usage of Perl in favor of Python. We're not quite there
yet, but at this point there are only a very tiny handful of Perl
scripts remaining in the repository.


[1] https://libvirt.org/strategy.html
-- 
Andrea Bolognani / Red Hat / Virtualization




Re: Command line QAPIfication and -readconfig

2020-11-12 Thread Andrea Bolognani
On Wed, 2020-11-11 at 11:35 +0100, Kevin Wolf wrote:
> Am 11.11.2020 um 11:14 hat Daniel P. Berrangé geschrieben:
> > Normally we would not mark something as deprecated unless its replacement
> > is ready, because telling people "stop using this but the replacement
> > doesn't exist yet" is not a nice message as there is no action users can
> > take to deal with the deprecation.
> 
> But there is a replacement: Put everything back into the command line
> and keep it in a shell script. Config files with -readconfig were never
> complete enough to fully describe a VM, so it's not too unlikely that
> you'll already have that script anyway.

This is correct: in fact...

> > We might question whether -readconfig has any users but I would note
> > that our own documentation illustrates its usage, and provides several
> > example configs
> > 
[...]
> > config/mach-virt-graphical.cfg:# -readconfig mach-virt-graphical.cfg \
> > config/mach-virt-serial.cfg:# -readconfig mach-virt-serial.cfg \
> > config/q35-emulated.cfg:# -readconfig q35-emulated.cfg
> > config/q35-virtio-graphical.cfg:# -readconfig q35-virtio-graphical.cfg
> > config/q35-virtio-serial.cfg:# -readconfig q35-virtio-serial.cfg \
[...]

... these configuration files, which I contributed some time ago, all
start with something along the lines of

  # Usage:
  #
  #   $ qemu-system-aarch64 \
  # -nodefaults \
  # -readconfig mach-virt-serial.cfg \
  # -display none -serial mon:stdio \
  # -cpu host

because it was simply impossible to provide QEMU with all the
settings necessary to obtain the desired virtual hardware using the
configuration file only.

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [PATCH v4 2/4] Jobs based on custom runners: build environment docs and playbook

2020-11-10 Thread Andrea Bolognani
On Mon, 2020-11-09 at 11:37 -0500, Cleber Rosa wrote:
> On Tue, Oct 20, 2020 at 07:52:43PM +0200, Andrea Bolognani wrote:
> > On Sun, 2020-10-18 at 21:50 -0400, Cleber Rosa wrote:
> > > +- name: Install basic packages to build QEMU on FreeBSD 12.x
> > > +  pkgng:
> > > +# Originally from packages on .cirrus.yml under the freebsd_12_task
> > > +name: bash,curl,cyrus-sasl,git,glib,gmake,gnutls,gsed,nettle,ninja,perl5,pixman,pkgconf,png,usbredir
> > 
> > See above for 'pkgng' vs 'package', but at the very least this should
> > be
> > 
> >   pkgng:
> > name:
> >   - bash
> >   - curl
> > ...
> > 
> > or each time the list is touched the resulting diff is going to be
> > unmanageable.
> 
> The documentation suggests a comma separated list of package names:
> 
>https://docs.ansible.com/ansible/2.8/modules/pkgng_module.html#pkgng-module
> 
> And the reason is that this module is not as smart as others, and will
> run one separate command for each individual package name value:
> 
>
> https://github.com/ansible/ansible/blob/v2.10.3/test/support/integration/plugins/modules/pkgng.py#L214
> 
> It's a tradeoff indeed, but I think we're aligned with the docs.

I would probably take the performance hit over having to deal with an
unmaintainable blob. As a rule of thumb, reviewer time is more
valuable than machine time.

If the performance hit turns out to be unacceptably big, you could at
least do something along the lines of

  - set_fact:
  freebsd_packages:
- bash
- curl
  ...
  
  - pkgng:
      name: "{{ freebsd_packages | join(',') }}"

to keep things reviewable.

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [PATCH v4 3/4] Jobs based on custom runners: docs and gitlab-runner setup playbook

2020-10-20 Thread Andrea Bolognani
On Tue, 2020-10-20 at 09:29 +0100, Daniel P. Berrangé wrote:
> I think the obvious and easy place is start using lcitool is for the
> tests/docker/dockerfiles/*.  All that's required is to add mappings
> to lcitool for the various deps that QEMU has which libvirt does not
> already have. Then we should be able to start auto-generating the
> dockerfiles without too much difficulty. This will be a significant
> step forward because it will help us keep te package lists in sync
> across all the dockerfiles which is a major fail in QEMU right now.

I took a quick look at the contents of those dockerfiles and while I
agree that it's the obvious place to start, I'm not sure I would say
that generating them using lcitool is going to be easy :)

Basically there seem to be a number of assumptions both in lcitool
and in the QEMU dockerfiles, and getting the two to agree on those
will require quite some work from what I can tell.

But again, I agree it's something that should be worked on.

> Dealing with tests/vm replacement or these ansible recipes is likely
> to be a significantly more challenging proposition. Perhaps we can
> again start by just automating creation of the package lists that
> the tests/vm and ansibile recipes need, as again those are all
> inconsistent.

Perhaps 'lcitool variables' could be of use here.

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [PATCH v4 2/4] Jobs based on custom runners: build environment docs and playbook

2020-10-20 Thread Andrea Bolognani
On Sun, 2020-10-18 at 21:50 -0400, Cleber Rosa wrote:
> +++ b/scripts/ci/setup/build-environment.yml
> @@ -0,0 +1,233 @@
> +---
> +- name: Installation of basic packages to build QEMU
> +  hosts: all
> +  tasks:

My Ansible-fu is a bit rusty at the moment, so please double-check my
claims and apologies in advance for the ones that I will get wrong O:-)

> +- name: Install basic packages to build QEMU on Ubuntu 18.04/20.04
> +  apt:
> +update_cache: yes
> +# Originally from tests/docker/dockerfiles/ubuntu1804.docker
> +pkg:
> +  - ccache
> +  - clang

Instead of using the 'apt' module here, and the equivalent module
further down, you could just do

  package:
name:
  - pkg1
  - pkg2
...
state: present

every single time and let Ansible take care of the differences for
you.

> +  when: "ansible_facts['distribution'] == 'Ubuntu'"

Quoting the entire condition is not necessary, you can just have

  when: ansible_facts['distribution'] == 'Ubuntu'

or, my preference,

  when:
- ansible_facts['distribution'] == 'Ubuntu'

which results in a nicer diff when you add/remove conditions.

> +- name: Install packages to build QEMU on Ubuntu 18.04/20.04 on non-s390x
> +  apt:
> +update_cache: yes
> +pkg:
> + - libspice-server-dev
> + - libxen-dev

Indentation of list items is inconsistent here.

> +- name: Install basic packages to build QEMU on FreeBSD 12.x
> +  pkgng:
> +# Originally from packages on .cirrus.yml under the freebsd_12_task
> +name: bash,curl,cyrus-sasl,git,glib,gmake,gnutls,gsed,nettle,ninja,perl5,pixman,pkgconf,png,usbredir

See above for 'pkgng' vs 'package', but at the very least this should
be

  pkgng:
name:
  - bash
  - curl
...

or each time the list is touched the resulting diff is going to be
unmanageable.

> +- name: Enable PowerTools repo on CentOS 8
> +  ini_file:
> +path: /etc/yum.repos.d/CentOS-PowerTools.repo
> +section: PowerTools
> +option: enabled
> +value: "1"
> +  when:
> +- "ansible_facts['distribution'] == 'CentOS'"
> +- "ansible_facts['distribution_major_version'] == '8'"

Another option would be to use

  command: 'dnf config-manager --set-enabled Stream-PowerTools -y'
  args:
warn: no

but I have to admit the way you're doing it is very clever ;)

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [PATCH v2 03/15] python: add VERSION file

2020-10-20 Thread Andrea Bolognani
On Tue, 2020-10-20 at 10:06 +0100, Daniel P. Berrangé wrote:
> The QEMU python modules are not like other python modules though,
> precisely because they are talking to QEMU. If we are shipping
> QEMU python releases on the same schedule as QEMU, then we can
> expect the normal ase to be updating both QEMU & Python together.

Once you start uploading the Python packages to PyPi, you really have
no way to ensure this will be the case.

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [PATCH v2 03/15] python: add VERSION file

2020-10-20 Thread Andrea Bolognani
On Mon, 2020-10-19 at 11:02 +0100, Daniel P. Berrangé wrote:
> On Mon, Oct 19, 2020 at 11:45:09AM +0200, Andrea Bolognani wrote:
> > I think this need to be considered very carefully.
> > 
> > I'm not overly familiar with the Python ecosystem but it would appear
> > that, despite PEP 440 not mandating this, many (most?) of the
> > packages uploaded to PyPi are using semantic versioning.
> 
>   
> https://packaging.python.org/guides/distributing-packages-using-setuptools/#choosing-a-versioning-scheme
> 
> Semver is the recommended approach, but they explicitly list date
> based versioning as a valid alternative
> 
>   "Semantic versioning is not a suitable choice for all projects, 
>such as those with a regular time based release cadence and a 
>deprecation process that provides warnings for a number of 
>releases prior to removal of a feature."
> 
> That paragraph describes QEMU's scenario.

The section on date based versioning continues with

  A key advantage of date based versioning is that it is
  straightforward to tell how old the base feature set of a
  particular release is given just the version number.

  Version numbers for date based projects typically take the form of
  YEAR.MONTH (for example, 12.04, 15.10).

The problem with QEMU's version numbers is that, while they are date
based, they still *look* like semver, so it wouldn't be at all
unreasonable for the user to expect that they also *behave* like
semver.

This is not much of a problem when it comes to the main binary, but
it is certainly much more confusing when you start using the same
version number for a Python library.

> > With that in mind, I think it would be unwise for qemu.* not to do
> > the same; in particular, using a version number that's not <1.0.0 for
> > a package that is very much in flux will almost certainly break
> > people's expectations, and is also not something that you can easily
> > take back at a later time.
> 
> I don't think it is that big a deal, and there is clear benefit to
> having the python code version match the QEMU version that it is
> the companioon to.
> 
> Ultimately the versioning scheme just impacts on the version string
> conditionals people list for their dependancies. Apps consuming QEMU
> can handle any of the version schemes without much difference.

The problem comes from expectations: a Python programmer, used to
semver due to its prominence on PyPi, might expect moving from
qemu.core 4.2.0 to 5.0.0 to require code changes to cope with
API-breaking changes, when in fact there are none; at the same time,
they might expect upgrading from 5.0.0 to 5.2.0 to be completely
straightforward, when in reality a feature their application depends
on might have been removed after the usual deprecation period.

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [PATCH v2 03/15] python: add VERSION file

2020-10-19 Thread Andrea Bolognani
On Wed, 2020-10-14 at 10:29 -0400, John Snow wrote:
> Python infrastructure as it exists today is not capable reliably of
> single-sourcing a package version from a parent directory. The authors
> of pip are working to correct this, but as of today this is not possible
> to my knowledge.
> 
> The problem is that when using pip to build and install a python
> package, it copies files over to a temporary directory and performs its
> build there. This loses access to any information in the parent
> directory, including git itself.
> 
> Further, Python versions have a standard (PEP 440) that may or may not
> follow QEMU's versioning. In general, it does; but naturally QEMU does
> not follow PEP 440. To avoid any automatically-generated conflict, a
> manual version file is preferred.
> 
> 
> I am proposing:
> 
> - Python core tooling synchronizes with the QEMU version directly
>   (5.2.0, 5.1.1, 5.3.0, etc.)
> 
> - In the event that a Python package needs to be updated independently
>   of the QEMU version, a pre-release alpha version should be preferred,
>   but *only* after inclusion to the qemu development or stable branches.
> 
>   e.g. 5.2.0a1, 5.2.0a2, and so on should be preferred prior to 5.2.0's
>   release.
> 
> - The Python core tooling makes absolutely no version compatibility
>   checks or constraints. It *may* work with releases of QEMU from the
>   past or future, but it is not required to.
> 
>   i.e., "qemu.core" will always remain in lock-step with QEMU.
> 
> - We reserve the right to split out e.g. qemu.core.qmp to qemu.qmp
>   and begin independently versioning such a package separately from the
>   QEMU version it accompanies.

I think this need to be considered very carefully.

I'm not overly familiar with the Python ecosystem but it would appear
that, despite PEP 440 not mandating this, many (most?) of the
packages uploaded to PyPi are using semantic versioning.

With that in mind, I think it would be unwise for qemu.* not to do
the same; in particular, using a version number that's not <1.0.0 for
a package that is very much in flux will almost certainly break
people's expectations, and is also not something that you can easily
take back at a later time.

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [RFC DOCUMENT 00/12] kubevirt-and-kvm: Add documents

2020-09-24 Thread Andrea Bolognani
On Tue, 2020-09-22 at 11:29 +0200, Philippe Mathieu-Daudé wrote:
> Hi Andrea,

Hi Philippe, and sorry for the delay in answering!

First of all, thanks for taking the time to go through the documents
and posting your thoughts. More comments below.

> Thanks a lot for this documentation, I could learn new things,
> use cases out of my interest area. Useful as a developer to
> better understand how are used the areas I'm coding. This
> shorten a bit that gap between developers and users.
> 
> What would be more valuable than a developer review/feedback is
> having feedback from users and technical writers.
> Suggestion: also share it on qemu-disc...@nongnu.org which is
> less technical (maybe simply repost the cover and link to the
> Wiki).

More eyes would obviously be good, but note that these are really
intended to improve the interactions between QEMU/libvirt and
KubeVirt, so the audience is ultimately developers. Of course you
could say that KubeVirt developers *are* users when it comes to
QEMU/libvirt, and you wouldn't be wrong ;) Still, qemu-devel seems
like the proper venue.

> What is not obvious in this cover (and the documents pasted on
> the list) is there are schema pictures on the Wiki pages which
> are not viewable and appreciable via an email post.

You're right! I was pretty sure I had a line about that somewhere in
there but I guess it got lost during editing. Hopefully the URL at
the very beginning of each document caused people to browse the HTML
version.

> I had zero knowledge on Kubernetes. I have been confused by their
> use in the introduction...
> 
> From Index:
> 
> "The intended audience is people who are familiar with the traditional
> virtualization stack (QEMU plus libvirt), and in order to make it
> more approachable to them comparisons, are included and little to no
> knowledge of KubeVirt or Kubernetes is assumed."
> 
> Then in Architecture's {Goals and Components} there is an assumption
> Kubernetes is known. Entering in Components, Kubernetes is briefly
> but enough explained.
> 
> Then KubeVirt is very well explained.

I guess the sections in the Index you're referring to assume that you
know that Kubernetes is somehow connected to containers, and that
it's a clustered environment. Anything else I missed?

Perhaps we could move the contents of

  
https://gitlab.cee.redhat.com/abologna/kubevirt-and-kvm/-/blob/master/Components.md#kubernetes

to a small document that's linked to near the very top. Would that
improve things, in your opinion?

> Sometimes the "Other topics" category is confusing, it seems out
> of the scope of the "better understanding and documenting the
> interactions between KubeVirt and KVM" and looks like left over
> notes.

That's probably because they absolutely are O:-)

> Maybe renaming the "Other topics" section would help.
> "Unanswered questions", "Other possibilities to investigate",...

This sounds sensible :)

Thanks again for your feedback!

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: Python 3.5 EOL; when can require 3.6?

2020-09-17 Thread Andrea Bolognani
On Thu, 2020-09-17 at 17:35 +0100, Daniel P. Berrangé wrote:
> On Thu, Sep 17, 2020 at 06:07:00PM +0200, Andrea Bolognani wrote:
> > It would be *fantastic* if we could keep the platform support policy
> > used by QEMU and libvirt as aligned as reasonably possible.
> 
> The current QEMU policy was a copy+paste of the same policy I wrote for
> libvirt originally, just adding OpenBSD/NetBSD.
> 
> I've just posted an update for QEMU which matches the latest libvirt
> policy, again just adding the extra BSDs.
> 
> https://lists.gnu.org/archive/html/qemu-devel/2020-09/msg06371.html

That's even better than I hoped for! Thank you so much :)

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: Python 3.5 EOL; when can require 3.6?

2020-09-17 Thread Andrea Bolognani
On Thu, 2020-09-17 at 17:30 +0200, Markus Armbruster wrote:
> Thomas Huth  writes:
> > Sorry, I forgot to check Debian. If I got that right, Debian 9 still
> > uses Python 3.5 by default. So I guess that means we can not deprecate
> > Python 3.5 yet?
> 
> Discussed before:
> 
> Subject: Re: [PATCH] qapi: Fix code generation with Python 3.5
> Date: Sat, 18 Jan 2020 07:54:18 +0100
> Message-ID: <87lfq5s19h@dusky.pond.sub.org>
> https://lists.nongnu.org/archive/html/qemu-devel/2020-01/msg03855.html
> 
> Short version: Debian != Debian LTS.  We support Debian until EOL, not
> LTS.  Debian 9 reached EOL in July.

FWIW, this is the same policy the libvirt project follows, and we
have formalized it at

  https://libvirt.org/platforms.html

with Debian LTS being called out explicitly as not supported.

It would be *fantastic* if we could keep the platform support policy
used by QEMU and libvirt as aligned as reasonably possible.

-- 
Andrea Bolognani / Red Hat / Virtualization




[RFC DOCUMENT 11/12] kubevirt-and-kvm: Add Backpropagation page

2020-09-16 Thread Andrea Bolognani
https://gitlab.com/abologna/kubevirt-and-kvm/-/blob/master/Backpropagation.md

# Backpropagation

Whenever a partial VM configuration is submitted to libvirt, any missing
information is automatically filled in to obtain a configuration that's
complete enough to guarantee long-term guest ABI stability.

PCI addresses are perhaps the most prominent example of this: most management
applications don't include this information at all in the XML they submit to
libvirt, and rely on libvirt building a reasonable PCI topology to support the
requested devices.

For example, using a made-up YAML syntax for brevity, the input could look like

```yaml
devices:
  disks:
  - image: /path/to/image.qcow2
```

and the output could be augmented by libvirt to look like

```yaml
devices:
  controllers:
  - model: pcie-root-port
address:
  type: pci
      domain: 0x0000
  bus: 0x00
  slot: 0x01
  function: 0x0
  disks:
  - image: /path/to/image.qcow2
model: virtio-blk
address:
  type: pci
      domain: 0x0000
  bus: 0x01
  slot: 0x00
  function: 0x0
```

This is where backpropagation comes in: the only version of the VM
configuration that is complete enough to guarantee a stable guest ABI is the
one that includes all information added by libvirt, so if the management
application wants to be able to make further changes to the VM it needs to
reflect the additional information back into its understanding of the VM
configuration somehow.

For applications like virsh and virt-manager, this is easy: they don't have
their own configuration format or even store the VM configuration, and
simply fetch it from libvirt and operate on it directly every single time.

oVirt, to the best of my knowledge, generates an initial VM configuration based
on the settings provided by the user, submits it to libvirt and then parses
back the augmented version, figuring out what information was added and
updating its database to match: if the VM configuration needs to be generated
again later, it will include all information present in the database, including
those that originated from libvirt rather than the user.

KubeVirt does not currently perform any backpropagation. There are two ways a
user can influence PCI address allocation (both sketched below):

* explicitly add a `pciAddress` attribute for the device, which will cause
  KubeVirt to pass the corresponding address to libvirt, which in turn will
  attempt to comply with the user's request;
* add the `kubevirt.io/placePCIDevicesOnRootComplex` annotation to the VM
  configuration, which will cause KubeVirt to provide libvirt with a
  fully-specified PCI topology where all devices live on the PCIe Root Bus.
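
For illustration, the two mechanisms could look roughly like this in a
VMI manifest (a minimal sketch based on the KubeVirt API; the interface
name and the address value are made up, and the two mechanisms would
normally be used as alternatives rather than together):

```yaml
apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachineInstance
metadata:
  name: example-vmi
  annotations:
    # Second mechanism: ask KubeVirt to generate a fully-specified
    # PCI topology with all devices on the PCIe Root Bus
    kubevirt.io/placePCIDevicesOnRootComplex: "true"
spec:
  domain:
    devices:
      interfaces:
      - name: default
        masquerade: {}
        # First mechanism: explicitly request a PCI address for this
        # device, which libvirt will attempt to comply with
        pciAddress: "0000:02:00.0"
```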

In all cases but the one where KubeVirt defines the full PCI topology itself,
it's implicitly relying on libvirt always building the PCI topology in the
exact same way every single time in order to have a stable guest ABI. While
this works in practice, it's not something that libvirt actually guarantees:
once a VM has been defined, libvirt will never change its PCI topology, but
submitting the same partial VM configuration to different libvirt versions can
result in different PCI topologies.




[RFC DOCUMENT 12/12] kubevirt-and-kvm: Add Contacts page

2020-09-16 Thread Andrea Bolognani
https://gitlab.com/abologna/kubevirt-and-kvm/-/blob/master/Contacts.md

# Contacts and credits

# Contacts

The following people have agreed to serve as points of contact for
follow-up discussion around the topics included in these documents.

## Overall

* Andrea Bolognani <> (KVM user space)
* Cole Robinson <> (KVM user space)
* Roman Mohr <> (KubeVirt)
* Vladik Romanovsky <> (KubeVirt)

## Networking

* Alona Paz <> (KubeVirt)
* Stefano Brivio <> (KVM user space)

## Storage

* Adam Litke <> (KubeVirt)
* Stefan Hajnoczi <> (KVM user space)

# Credits

In addition to those listed above, the following people have also
contributed to the documents or the discussion around them.

Ademar Reis, Adrian Moreno Zapata, Alice Frosi, Amnon Ilan, Ariel
Adam, Christophe de Dinechin, Dan Kenigsberg, David Gilbert, Eduardo
Habkost, Fabian Deutsch, Gerd Hoffmann, Jason Wang, John Snow, Kevin
Wolf, Marc-André Lureau, Michael Henriksen, Michael Tsirkin, Paolo
Bonzini, Peter Krempa, Petr Horacek, Richard Jones, Sergio Lopez,
Steve Gordon, Victor Toso, Vivek Goyal.

If your name should be in the list above but is not, please know that
was an honest mistake and not a way to downplay your contribution!
Get in touch and we'll get it sorted out :)




[RFC DOCUMENT 10/12] kubevirt-and-kvm: Add Upgrades page

2020-09-16 Thread Andrea Bolognani
https://gitlab.com/abologna/kubevirt-and-kvm/-/blob/master/Upgrades.md

# Upgrades

The KubeVirt installation and upgrade processes are entirely controlled
by an [operator][], which is a common pattern in the Kubernetes
world. The operator is a piece of software running in the cluster and
managing the lifecycle of other components, in this case KubeVirt.

## The operator

What it does:

* Manages the whole KubeVirt installation
* Keeps the cluster actively in sync with the desired state
* Upgrades KubeVirt with zero downtime

## Installation

Install the operator:

```bash
$ LATEST=$(curl -L https://storage.googleapis.com/kubevirt-prow/devel/nightly/release/kubevirt/kubevirt/latest)
$ kubectl apply -f https://storage.googleapis.com/kubevirt-prow/devel/nightly/release/kubevirt/kubevirt/${LATEST}/kubevirt-operator.yaml
$ kubectl get pods -n kubevirt
NAME  READY   STATUSRESTARTS   AGE
virt-operator-58cf9d6648-c7qph1/1 Running   0  69s
virt-operator-58cf9d6648-pvzw21/1 Running   0  69s
```

Trigger the installation of KubeVirt:

```bash
$ LATEST=$(curl -L https://storage.googleapis.com/kubevirt-prow/devel/nightly/release/kubevirt/kubevirt/latest)
$ kubectl apply -f https://storage.googleapis.com/kubevirt-prow/devel/nightly/release/kubevirt/kubevirt/${LATEST}/kubevirt-cr.yaml
$ kubectl get pods -n kubevirt
NAME  READY   STATUSRESTARTS   AGE
virt-api-8bdd88557-fllhr  1/1 Running   0  59s
virt-controller-55ccb8cdcb-5rtp6  1/1 Running   0  43s
virt-controller-55ccb8cdcb-v8kr9  1/1 Running   0  43s
virt-handler-67pns1/1 Running   0  43s
```

The process happens in two steps because the operator relies on the
KubeVirt [custom resource][] for information on the desired
installation, and will not do anything until that resource exists in
the cluster.

## Upgrade

The upgrading process is similar:

* Install the latest operator
* Reference the new version in the KubeVirt CustomResource to trigger
  the actual upgrade

```bash
$ kubectl.sh get kubevirt -n kubevirt kubevirt -o yaml
apiVersion: kubevirt.io/v1alpha3
kind: KubeVirt
metadata:
  name: kubevirt
spec:
  imageTag: v0.30
  certificateRotateStrategy: {}
  configuration: {}
  imagePullPolicy: IfNotPresent
```

Note the `imageTag` attribute: when present, the KubeVirt operator
will take steps to ensure that the version of KubeVirt that's
deployed on the cluster matches its value, which in our case will
trigger an upgrade.

The following chart explains the upgrade flow in more detail and shows
how the various components are affected:

![KubeVirt upgrade flow][Upgrades-Kubevirt]

KubeVirt is released as a complete suite: no individual
`virt-launcher` releases are planned. Everything is tested together,
everything is released together.

## QEMU and libvirt

The versions of QEMU and libvirt used for VMs are also tied to the
version of KubeVirt and are upgraded along with everything else.

* Migrations from old libvirt/QEMU to new libvirt/QEMU pairs are
  possible
* As soon as the new `virt-handler` and the new controller are rolled
  out, the cluster will only start VMIs with the new QEMU/libvirt
  versions

## Version compatibility

The virt stack is updated along with KubeVirt, which mitigates
compatibility concerns. As a rule of thumb, versions of QEMU and
libvirt older than a year or so are not taken into consideration.

Currently, the ability to perform backward migration (e.g. from a
newer version of QEMU to an older one) is not necessary, but that
could very well change as KubeVirt becomes more widely used.

[Upgrades-Kubevirt]: https://gitlab.com/abologna/kubevirt-and-kvm/-/blob/master/Images/Upgrades-Kubevirt.png
[custom resource]: https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/
[operator]: https://kubernetes.io/docs/concepts/extend-kubernetes/operator/




[RFC DOCUMENT 09/12] kubevirt-and-kvm: Add Isolation page

2020-09-16 Thread Andrea Bolognani
https://gitlab.com/abologna/kubevirt-and-kvm/-/blob/master/Isolation.md

# Isolation

How is the QEMU process isolated from the host and from other VMs?

## Traditional virtualization

cgroups

* managed by libvirt

SELinux

* libvirt is privileged and QEMU is protected by SELinux policies set
  by libvirt (SVirt)
* QEMU runs with SELinux type `svirt_t`

## KubeVirt

cgroups

* Managed by kubelet
  * No involvement from libvirt
* Memory limits
  * When using hard limits, the entire VM can be killed by Kubernetes
  * Memory consumption estimates are based on heuristics

SELinux

* KubeVirt is not using SVirt and there are no plans to do so
* At the moment, the custom [KubeVirt SELinux policy][] is used to
  ensure libvirt has sufficient privilege to perform its own setup
  steps
  * The standard SELinux type used by containers is `container_t`
  * KubeVirt would like to eventually use the same for VMs as well

Capabilities

* The default set of capabilities is fairly conservative
  * Privileged operations should happen outside of the pod: in
    KubeVirt's case, a good candidate is `virt-handler`, the
    privileged component that runs at the node level
* Additional capabilities can be requested for a pod
  * However, this is frowned upon and considered a liability from the
security point of view
  * The cluster admin may even set a security policy that prevents
    pods from using certain capabilities
* In such a scenario, KubeVirt workloads may be entirely unable
  to run

## Specific examples

The following is a list of examples, either historical or current, of
scenarios where libvirt's approach to isolation clashed with
Kubernetes' and changes on either component were necessary.

SELinux

* libvirt use of hugetlbfs for hugepages config is disallowed by
  `container_t`
  * Possibly fixable by using memfd
* [libvirt memoryBacking docs][]
* [KubeVirt memfd issue][]
* Use of libvirt+QEMU multiqueue tap support is disallowed by
  `container_t`
  * And there’s no way to pass in this setup from outside the
existing stack
  * [KubeVirt multiqueue workaround][] extending their SELinux policy to allow
`attach_queue`
* Passing precreated tap devices to libvirt triggers
  relabelfrom+relabelto `tun_socket` SELinux access
  * This may not be the virt stack's fault; it seems to happen
    automatically when permissions aren't correct

Capabilities

* libvirt performs memory locking for VFIO devices unconditionally
  * Previously KubeVirt had to grant `CAP_SYS_RESOURCE` to pods.
KubeVirt worked around it by duplicating libvirt’s memory pinning
calculations so the libvirt action would be a no-op, but that is
fragile and may cause the issue to resurface if libvirt
calculation logic changes.
  * References: [libvir-list memlock thread][], [KubeVirt memlock
PR][], [libvirt qemuDomainGetMemLockLimitBytes][], [KubeVirt
VMI.getMemlockSize][]
* virtiofsd requires `CAP_SYS_ADMIN` capability to perform
  `unshare(CLONE_NEWPID|CLONE_NEWNS)`
  * This is required for certain use cases like running overlayfs in
the VM on top of virtiofs, but is not a requirement for all
    use cases.
  * References: [KubeVirt virtiofs PR][], [RHEL virtiofs bug][]
* KubeVirt uses libvirt for CPU pinning, which requires the pod to
  have `CAP_SYS_NICE`.
  * Long term, KubeVirt would like to handle that pinning in their
privileged component virt-handler, so `CAP_SYS_NICE` can be
dropped.
    * Sidenote: libvirt unconditionally requires `CAP_SYS_NICE` when
      any other running VM is using CPU pinning; however, this sounds
      like a plain old bug.
  * References: [KubeVirt CPU pinning PR][], [KubeVirt CPU pinning
workaround PR][], [RHEL CPU pinning bug][]
* libvirt bridge usage used to require `CAP_NET_ADMIN`
  * This is a historical example for reference. libvirt usage of a
bridge device always implied tap device creation, which required
`CAP_NET_ADMIN` privileges for the pod
  * The fix was to teach libvirt to accept a precreated tap device
and skip some setup operations on it
    * Example XML: `<target dev='tap0' managed='no'/>`
  * KubeVirt still hasn't fully managed to drop `CAP_NET_ADMIN`
    though
  * References: [RHEL precreated TAP bug][], [libvirt precreated TAP
    patches][], [KubeVirt precreated TAP PR][], [KubeVirt NET_ADMIN
    PR][], [KubeVirt NET_ADMIN issue][]

[KubeVirt CPU pinning PR]: https://github.com/kubevirt/kubevirt/pull/1381
[KubeVirt CPU pinning workaround PR]: https://github.com/kubevirt/kubevirt/pull/1648
[KubeVirt NET_ADMIN PR]: https://github.com/kubevirt/kubevirt/pull/3290
[KubeVirt NET_ADMIN issue]: https://github.com/kubevirt/kubevirt/issues/3085
[KubeVirt SELinux policy]: https://github.com/kubevirt/kubevirt/blob/master/cmd/virt-handler/virt_launcher.cil
[KubeVirt VMI.getMemlockSize]: https://github.com/kubevirt/kubevirt/blob/f5ffba5f84365155c81d0e2cda4aa709da062230/pkg/virt-handler/isolation/isolation.go#L206
[KubeVirt memfd issue]: 

[RFC DOCUMENT 08/12] kubevirt-and-kvm: Add NUMA Pinning page

2020-09-16 Thread Andrea Bolognani
https://gitlab.com/abologna/kubevirt-and-kvm/-/blob/master/NUMA-Pinning.md

# NUMA pinning

KubeVirt doesn't currently implement NUMA pinning due to Kubernetes
limitations.

## Kubernetes Topology Manager

Allows aligning CPU and peripheral device allocations by NUMA node;
a configuration sketch follows the list below.

Many limitations:

* Not scheduler aware.
* Doesn’t allow memory alignment.
* etc...
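
For reference, the Topology Manager is enabled and tuned per node
through the kubelet configuration; a minimal sketch (the policy value
is just an example) might look like:

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Require CPU and device allocations for a container to come
# from a single NUMA node, failing pod admission otherwise
topologyManagerPolicy: single-numa-node
# The Topology Manager only receives CPU hints when the
# CPU Manager static policy is enabled as well
cpuManagerPolicy: static
```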




[RFC DOCUMENT 07/12] kubevirt-and-kvm: Add CPU Pinning page

2020-09-16 Thread Andrea Bolognani
https://gitlab.com/abologna/kubevirt-and-kvm/-/blob/master/CPU-Pinning.md

# CPU pinning

As is the case for many of KubeVirt's features, CPU pinning is
partially achieved using standard Kubernetes components: this both
reduces the amount of new code that has to be written and guarantees
better integration with containers running side by side with the VMs.

## Kubernetes CPU Manager

The Static policy allocates exclusive CPUs to containers of pods in the
Guaranteed QoS class that request an integer number of CPUs (an example
pod is sketched after the list below). On a best-effort basis, the
Static policy tries to allocate CPUs topologically in the following
order:

* Allocate all the CPUs in the same processor socket if available and
  the container requests at least an entire socket worth of CPUs.
* Allocate all the logical CPUs (hyperthreads) from the same physical
  CPU core if available and the container requests an entire core
  worth of CPUs.
* Allocate any available logical CPU, preferring to acquire CPUs from
  the same socket.
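
Concretely, a container only receives exclusive CPUs when its pod is in
the Guaranteed QoS class (requests equal to limits) and the CPU count
is a whole number; a minimal sketch, with made-up names:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pinned-workload
spec:
  containers:
  - name: app
    image: registry.example.com/app:latest
    resources:
      # requests == limits and cpu is an integer, so the pod lands
      # in the Guaranteed QoS class and the Static policy assigns
      # it exclusive CPUs
      requests:
        cpu: "4"
        memory: 4Gi
      limits:
        cpu: "4"
        memory: 4Gi
```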

## KubeVirt dedicated CPU placement

KubeVirt relies on the Kubernetes CPU Manager to allocate dedicated
CPUs to the `virt-launcher` container.

When `virt-launcher` starts, it reads
`/sys/fs/cgroup/cpuset/cpuset.cpus` and generates `<cputune>`
configuration for libvirt based on the information found within.
However, affinity changes require `CAP_SYS_NICE`, so this additional
capability has to be granted to the VM pod.

Going forward, we would like to perform the affinity change in
`virt-handler` (the privileged component running at the node level),
which would allow the VM pod to work without additional capabilities.
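
For completeness, here is roughly how a user requests dedicated CPUs on
the KubeVirt side (a minimal sketch assuming the `dedicatedCpuPlacement`
field of the VMI CPU configuration):

```yaml
apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachineInstance
metadata:
  name: pinned-vmi
spec:
  domain:
    cpu:
      cores: 4
      # Makes virt-launcher request exclusive CPUs from the
      # CPU Manager and generate matching <cputune> pinning
      dedicatedCpuPlacement: true
```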




[RFC DOCUMENT 06/12] kubevirt-and-kvm: Add Live Migration page

2020-09-16 Thread Andrea Bolognani
https://gitlab.com/abologna/kubevirt-and-kvm/-/blob/master/Live-Migration.md

# Live Migration

There are two scenarios where live migration is triggered in KubeVirt

* As per user request, by posting a `VirtualMachineInstanceMigration`
  object to the cluster (see the sketch below)
* As per cluster request, for instance on a Node eviction (due to
  lack of resources or maintenance of the given Node)
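
The user-triggered variant boils down to creating an object along these
lines (a minimal sketch; `vmi-example` stands in for the name of an
existing VMI):

```yaml
apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachineInstanceMigration
metadata:
  name: migrate-vmi-example
spec:
  # The running VMI that should be moved to another node
  vmiName: vmi-example
```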

In both situations, KubeVirt will use libvirt to handle logic and
coordination with QEMU while KubeVirt's components manage the
Kubernetes control plane and the cluster's limitations.

In short, KubeVirt:

* Checks whether the target host is capable of accepting the migration
  of the given VM;
* Handles the single network namespace per Pod by proxying migration
  data (more at [Networking][]);
* Handles cluster resource usage (e.g. bandwidth);
* Handles cross-version migrations.

![Live migration between two nodes][Live-Migration-Flow]

## Limitations

Live migration is not possible if:

* The VM is configured with cpu-passthrough;
* The VM has local or non-shared volumes;
* The Pod is using bridge binding for network access (right side of
  image below)

![Kubevirt's Pod][Live-Migration-Network]

## More on KubeVirt's Live migration

This blog [post on live migration][] explains how to have live
migration enabled in KubeVirt's VMs and describes some of its
caveats.

[Live-Migration-Flow]: https://gitlab.com/abologna/kubevirt-and-kvm/-/blob/master/Images/Live-Migration-Flow.png
[Live-Migration-Network]: https://gitlab.com/abologna/kubevirt-and-kvm/-/blob/master/Images/Live-Migration-Network.png
[Networking]: https://gitlab.com/abologna/kubevirt-and-kvm/-/blob/master/Networking.md
[post on live migration]: https://kubevirt.io/2020/Live-migration.html




[RFC DOCUMENT 02/12] kubevirt-and-kvm: Add Components page

2020-09-16 Thread Andrea Bolognani
https://gitlab.com/abologna/kubevirt-and-kvm/-/blob/master/Components.md

# Components

This document describes the various components of the KubeVirt
architecture, how they fit together, and how they compare to the
traditional virtualization architecture (QEMU + libvirt).

## Traditional architecture

For the comparison to make sense, let's start by reviewing the
architecture used for traditional virtualization.

![libvirt architecture][Components-Libvirt]

(Image taken from the "[Look into libvirt][]" presentation by Osier
Yang, which is a bit old but still mostly accurate from a high-level
perspective.)

In particular, the `libvirtd` process runs with high privileges on
the host and is responsible for managing all VMs.

When asked to start a VM, the management process will

* Prepare the environment by performing a number of privileged
  operations upfront
* Set up CGroups
* Set up kernel namespaces
* Apply SELinux labels
* Configure network devices
* Open host files
* ...
* Start a non-privileged QEMU process in that environment

## Kubernetes

To understand how KubeVirt works, it's first necessary to have some
knowledge of Kubernetes.

In Kubernetes, every user workload runs inside [Pods][]. The pod is
the smallest unit of work that Kubernetes will schedule.

Some facts about pods:

* They consist of one or more containers
* The containers share a network namespace
* The containers have their own PID and mount namespace
* The containers have their own CGroups for CPU, memory, devices and
  so forth. They are controlled by k8s and can’t be modified from
  outside.
* Pods can be started with extended privileges (`CAP_SYS_NICE`,
  `CAP_NET_RAW`, root user, ...)
* The app in the pod can drop the privileges, but the pod cannot
  drop them (`kubectl exec` gives you a shell with full
  privileges; see the sketch below).
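
To illustrate the last two points, extra privileges are requested
declaratively in the pod manifest and apply to the container as a whole
(a minimal sketch with made-up names):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: extra-privileges
spec:
  containers:
  - name: app
    image: registry.example.com/app:latest
    securityContext:
      capabilities:
        # Granted to the container itself: even if the application
        # drops them, `kubectl exec` gets them back
        add: ["SYS_NICE", "NET_RAW"]
```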

Creating pods with elevated privileges is generally frowned upon, and
depending on the policy decided by the cluster administrator it might
be outright impossible.

## KubeVirt architecture

Let's now discuss how KubeVirt is structured.

![KubeVirt architecture][Components-Kubevirt]

The main components are:

* `virt-launcher`, a copy of which runs inside each pod alongside QEMU
  and libvirt, is the unprivileged component responsible for
  receiving commands from other KubeVirt components and reporting
  back events such as VM crashes;
* `virt-handler` runs at the node level via a DaemonSet, and is the
  privileged component which takes care of the VM setup by reaching
  into the corresponding pod and modifying its namespaces;
* `virt-controller` runs at the cluster level and monitors the API
  server so that it can react to user requests and VM events;
* `virt-api`, also running at the cluster level, exposes a few
  additional APIs that only apply to VMs, such as the "console" and
  "vnc" actions.

When a KubeVirt VM is started:

* We request a Pod with certain privileges and resources from
  Kubernetes.
* The kubelet (the node daemon of Kubernetes) prepares the
  environment with the help of a container runtime.
* A shim process (virt-launcher) is our main entrypoint in the pod,
  which starts libvirt
* Once our node-daemon (virt-handler) can reach our shim process, it
  does privileged setup from outside. It reaches into the namespaces
  and modifies their content as needed. We mostly have to modify the
  mount and network namespaces.
* Once the environment is prepared, virt-handler asks virt-launcher
  to start a VM via its libvirt component.

More information can be found in the [KubeVirt architecture][] page.

## Comparison

The two architectures are quite similar from the high-level point of
view: in both cases there are a number of privileged components which
take care of preparing an environment suitable for running an
unprivileged QEMU process in.

The difference, however, is that while libvirtd takes care of all
this setup itself, in the case of KubeVirt several smaller components
are involved: some of these components are privileged just as libvirtd
is, but others are not, and some of the tasks are not even performed
by KubeVirt itself but rather delegated to the existing Kubernetes
infrastructure.

## Use of libvirtd in KubeVirt

In the traditional virtualization scenario, `libvirtd` provides a
number of useful features on top of those available with plain QEMU,
including

* support for multiple clients connecting at the same time
* management of multiple VMs through a single entry point
* remote API access

KubeVirt interacts with libvirt under certain conditions that make
the features described above irrelevant:

* there's only one client talking to libvirt: `virt-handler`
* libvirt is only asked to manage a single VM
* client and libvirt are running in the same pod, no remote libvirt
  access

[Components-Kubevirt]: https://gitlab.com/abologna/kubevirt-and-kvm/-/blob/master/Images/Components-Kubevirt.png
[Components-Libvirt]: 

[RFC DOCUMENT 05/12] kubevirt-and-kvm: Add Networking page

2020-09-16 Thread Andrea Bolognani
https://gitlab.com/abologna/kubevirt-and-kvm/-/blob/master/Networking.md

# Networking

## Problem description

Service meshes (such as [Istio][], [Linkerd][]) typically expect
application processes to run on the same physical host, usually in a
separate user namespace. Network namespaces might be used too, for
additional isolation. Network traffic to and from local processes is
monitored and proxied by redirection and observation of local
sockets. `iptables` and `nftables` (collectively referred to as the
`netfilter` framework) are the typical Linux facilities providing
classification and redirection of packets.

![containers][Networking-Containers]
*Service meshes with containers. Typical ingress path:
**1.** NIC driver queues buffers for IP processing
**2.** `netfilter` rules installed by *service mesh* redirect packets
   to proxy
**3.** IP receive path completes, L4 protocol handler invoked
**4.** TCP socket of proxy receives packets
**5.** proxy opens TCP socket towards application service
**6.** packets get TCP header, ready for classification
**7.** `netfilter` rules installed by service mesh forward request to
   service
**8.** local IP routing queues packets for TCP protocol handler
**9.** application process receives packets and handles request.
Egress path is conceptually symmetrical.*

If we move application processes to VMs, sockets and processes are
not visible anymore. All the traffic is typically forwarded via
interfaces operating at data link level. Socket redirection and port
mapping to local processes don't work.

![and now?][Networking-Challenge]
*Application process moved to VM:
**8.** IP layer enqueues packets to L2 interface towards application
**9.** `tap` driver forwards L2 packets to guest
**10.** packets are received on `virtio-net` ring buffer
**11.** guest driver queues buffers for IP processing
**12.** IP receive path completes, L4 protocol handler invoked
**13.** TCP socket of application receives packets and handles request.
**:warning: Proxy challenge**: the service mesh can't forward packets
to local sockets via `netfilter` rules. *Add-on* NAT rules might
conflict, as service meshes expect full control of the ruleset.
Socket monitoring and PID/UID classification isn't possible.*

## Existing solutions

Existing solutions typically implement a full TCP/IP stack, replaying
traffic on sockets local to the Pod of the service mesh. This creates
the illusion of application processes running on the same host,
possibly separated by user namespaces.

![slirp][Networking-Slirp]
*Existing solutions introduce a third TCP/IP stack:
**8.** local IP routing queues packets for TCP protocol handler
**9.** userspace implementation of TCP/IP stack receives packets on
   local socket, and
**10.** forwards L2 encapsulation to `tap` *QEMU* interface (socket
back-end).*

While being almost transparent to the service mesh infrastructure,
this kind of solution comes with a number of downsides:

* three different TCP/IP stacks (guest, adaptation and host) need to
  be traversed for every service request. There is no opportunity to
  implement zero-copy mechanisms, and the number of context switches
  increases dramatically
* addressing needs to be coordinated to create the pretense of
  consistent addresses and routes between guest and host
  environments. This typically needs a NAT with masquerading, or some
  form of packet bridging
* the traffic seen by the service mesh and observable externally is a
  distant replica of the packets forwarded to and from the guest
  environment:
  * TCP congestion windows and network buffering mechanisms in
general operate differently from what would be naturally expected
by the application
  * protocols carrying addressing information might pose additional
challenges, as the applications don't see the same set of
addresses and routes as they would if deployed with regular
containers

## Experiments

![experiments: thin layer][Networking-Experiments-Thin-Layer]
*How can we improve on the existing solutions while maintaining
drop-in compatibility? A thin layer implements a TCP adaptation
and IP services.*

These are some directions we have been exploring so far:

* a thinner layer between guest and host, that only implements what's
  strictly needed to pretend processes are running locally. A further
  TCP/IP stack is not necessarily needed. Some sort of TCP adaptation
  is needed, however, if this layer (currently implemented as
  userspace process) runs without the `CAP_NET_RAW` capability: we
  can't create raw IP sockets on the Pod, and therefore need to map
  packets at layer 2 to layer 4 sockets offered by the host kernel
  * to avoid implementing an actual TCP/IP stack like the one
offered by *libslirp*, we can align TCP parameters advertised
towards the guest (MSS, congestion window) to the socket
parameters provided by the host kernel, probing them via the
`TCP_INFO` socket option (introduced in 

[RFC DOCUMENT 03/12] kubevirt-and-kvm: Add Hotplug page

2020-09-16 Thread Andrea Bolognani
https://gitlab.com/abologna/kubevirt-and-kvm/-/blob/master/Hotplug.md

# Hotplug

In Kubernetes, pods are defined to be immutable, so it's not possible
to perform hotplug of devices in the same way as with the traditional
virtualization stack.

This limitation is a result of KubeVirt's guiding principle of
integrating with Kubernetes as much as possible and making VMs appear
the same as containers from the user's point of view.

There have been several attempts at lifting this restriction in
Kubernetes over the years, but they have all been unsuccessful so
far.

## Existing hotplug support

When working with containers, changing the amount of resources
associated with a pod will result in it being destroyed and a new
pod with the updated resource allocation being created in its place.

This works fine for containers, which are designed to be clonable and
disposable, but when it comes to VMs they usually can't be destroyed
on a whim and running multiple instances in parallel is generally not
wise even when possible.

## Possible workarounds

Until a proper hotplug API makes its way into Kubernetes, one
possible way to implement hotplug could be to perform migration to a
container compliant with the new allocation request, and only then
perform the QEMU-level hotplug operation.




[RFC DOCUMENT 04/12] kubevirt-and-kvm: Add Storage page

2020-09-16 Thread Andrea Bolognani
https://gitlab.com/abologna/kubevirt-and-kvm/-/blob/master/Storage.md

# Storage

This document describes the known use-cases and architecture options
we have for Linux Virtualization storage in [KubeVirt][].

## Problem description

The main goal of KubeVirt is to leverage the storage subsystem of
Kubernetes (built around [CSI][] and [Persistent Volumes][] aka PVs),
so that both kinds of workload (VMs and containers) can consume the
same storage. As a consequence, KubeVirt is limited in its use of the
QEMU storage subsystem and its features. That means:

* Storage solutions should be implemented in k8s in a way that can be
  consumed by both containers and VMs.
* VMs can only consume (and provide) storage features which are
  available in the pod, through k8s APIs. For example, a VM will not
  support disk snapshots if it’s attached to a storage provider that
  doesn’t support it. Ditto for incremental backup, block jobs,
  encryption, etc.

## Current situation

### Storage handled outside of QEMU

In this scenario, the VM pod uses a [Persistent Volume Claim
(PVC)][Persistent Volumes] to give QEMU access to a raw storage
device or fs mount, which is provided by a [CSI][] driver. QEMU
**doesn’t** handle any of the storage use-cases such as thin
provisioning, snapshots, change block tracking, block jobs, etc.

This is how things work today in KubeVirt.

![Storage handled outside of QEMU][Storage-Current]

Devices and interfaces:

* PVC: block or fs
* QEMU backend: raw device or raw image
* QEMU frontend: virtio-blk
  * alternative: emulated device for wider compatibility and Windows
installations
  * CDROM (sata)
  * disk (sata)
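
Tying these together, the VMI side of this setup is simply a disk
backed by a PVC volume (a minimal sketch of the relevant fields;
`my-pvc` is a placeholder for an existing claim):

```yaml
apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachineInstance
metadata:
  name: vmi-pvc-disk
spec:
  domain:
    devices:
      disks:
      - name: datadisk
        disk:
          bus: virtio  # virtio-blk frontend
  volumes:
  - name: datadisk
    persistentVolumeClaim:
      claimName: my-pvc  # block or fs PVC provided by a CSI driver
```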

Pros:

* Simplicity
* Sharing the same storage model with other pods/containers

Cons:

* Limited feature-set (fully off-loaded to the storage provider from
  CSI).
* No VM snapshots (disk + memory)
* Limited opportunities for fine-tuning and optimizations for
  high-performance.
* Hotplug is challenging, because the set of PVCs in a pod is
  immutable.

Questions and comments:

* How to optimize this in QEMU?
  * Can we bypass the block layer for this use-case? Like having SPDK
inside the VM pod?
* Rust-based storage daemon (e.g. [vhost_user_block][]) running
  inside the VM pod alongside QEMU (bypassing the block layer)
  * We should be able to achieve high-performance with local NVME
storage here, with multiple polling IOThreads and multi queue.
* See [this blog post][PVC resize blog] for information about the PVC
  resize feature.  To implement this for VMs we could have KubeVirt
  watch PVCs and respond to capacity changes with a corresponding
  call to resize the image file (if applicable) and to notify QEMU of
  the enlarged device (a sketch follows this list).
* Features such as incremental backup (CBT) and snapshots could be
  implemented through a generic CSI backend... Device mapper?
  Stratis? (See [Other Topics](#other-topics))
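
To make the PVC resize idea concrete: expansion is requested by simply
raising the storage request on an existing claim (a minimal sketch;
assumes the backing StorageClass sets `allowVolumeExpansion: true`):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: vm-disk
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      # Raised from an earlier, smaller value: KubeVirt could watch
      # for this change, grow the image file and notify QEMU
      storage: 20Gi
```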

## Possible alternatives

### Storage device passthrough (highest performance)

Device passthrough via PCI VFIO, SCSI, or vDPA. No storage use-cases
and no CSI, as the device is passed directly to the guest.

![Storage device passthrough][Storage-Passthrough]

Devices and interfaces:

* N/A (hardware passthrough)

Pros:

* Highest possible performance (same as host)

Cons:

* No storage features anywhere outside of the guest.
* No live-migration for most cases.

### File-system passthrough (virtio-fs)

File mount volumes (directories, actually) can be exposed to QEMU via
[virtio-fs][] so that VMs have access to files and directories.

![File-system passthrough (virtio-fs)][Storage-Virtiofs]

Devices and interfaces:

* PVC: file-system
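
A hedged sketch of the VMI side, assuming the `filesystems` device type
through which KubeVirt exposes virtio-fs shares (names are
placeholders):

```yaml
apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachineInstance
metadata:
  name: vmi-virtiofs
spec:
  domain:
    devices:
      filesystems:
      - name: shared-data
        virtiofs: {}  # exposed to the guest as a virtio-fs mount tag
  volumes:
  - name: shared-data
    persistentVolumeClaim:
      claimName: shared-pvc
```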

Pros:

* Simplicity from the user-perspective
* Flexibility
* Great for heterogeneous workloads that share data between
  containers and VMs (e.g. OpenShift Pipelines)

Cons:

* Performance when compared to block device passthrough

Questions and comments:

* Feature is still quite new (The Windows driver is fresh out of the
  oven)

### QEMU storage daemon in CSI for local storage

The qemu-storage-daemon is a user-space daemon that exposes QEMU’s
block layer to external users. It’s similar to [SPDK][], but includes
the implementation of QEMU block layer features such as snapshots and
bitmap tracking for incremental backup (CBT). It also allows
splitting a single NVMe device so that multiple QEMU VMs can
share one NVMe disk.

In this architecture, the storage daemon runs as part of CSI (control
plane), with the data-plane being either a vhost-user-blk interface
for QEMU or a fs-mount export for containers.

![QEMU storage daemon in CSI for local storage][Storage-QSD]

Devices and interfaces:

* CSI:
  * fs mount with a vhost-user-blk socket for QEMU to open
  * (OR) fs mount via NBD or FUSE with the actual file-system
contents
* qemu-storage-daemon backend: NVMe local device w/ raw or qcow2
  * alternative: any driver supported by QEMU, such as file-posix.
* QEMU frontend: virtio-blk
  * alternative: any emulated 

[RFC DOCUMENT 01/12] kubevirt-and-kvm: Add Index page

2020-09-16 Thread Andrea Bolognani
https://gitlab.com/abologna/kubevirt-and-kvm/-/blob/master/Index.md

# KubeVirt and the KVM user space

This is the entry point to a series of documents which, together,
detail the current status of KubeVirt and how it interacts with the
KVM user space.

The intended audience is people who are familiar with the traditional
virtualization stack (QEMU plus libvirt); in order to make it
more approachable to them, comparisons are included and little to no
knowledge of KubeVirt or Kubernetes is assumed.

Each section contains a short summary as well as a link to a separate
document discussing the topic in more detail, with the intention that
readers will be able to quickly get a high-level understanding of the
various topics by reading this overview document and then dig further
into the specific areas they're interested in.

## Architecture

### Goals

* KubeVirt aims to feel completely native to Kubernetes users
  * VMs should behave like containers whenever possible
  * There should be no features that are limited to VMs when it would
make sense to implement them for containers too
* KubeVirt also aims to support all the workloads that traditional
  virtualization can handle
  * Windows support, device assignment etc. are all fair game
* When these two goals clash, integration with Kubernetes usually
  wins

### Components

* KubeVirt is made up of various discrete components that interact
  with Kubernetes and the KVM user space
  * The overall design is somewhat similar to that of libvirt, except
with a much higher granularity and many of the tasks offloaded to
Kubernetes
  * Some of the components run at the cluster level or host level
with very high privileges, others run at the pod level with
significantly reduced privileges

Additional information: [Components][]

### Runtime environment

* QEMU expects its environment to be set up in advance, something
  that is typically taken care of by libvirt
* libvirtd, when not running in session mode, assumes that it has
  root-level access to the system and can perform pretty much any
  privileged operation
* In Kubernetes, the runtime environment is usually heavily locked
  down and many privileged operations are not permitted
  * Requiring additional permissions for VMs goes against the goal,
mentioned earlier, to have VMs behave the same as containers
whenever possible

## Specific areas

### Hotplug

* QEMU supports hotplug (and hot-unplug) of most devices, and its use
  is extremely common
* Conversely, resources associated with containers such as storage
  volumes, network interfaces and CPU shares are allocated upfront
  and do not change throughout the life of the workload
  * If the container needs more (or less) resources, the Kubernetes
approach is to destroy the existing one and schedule a new one to
take over its role

Additional information: [Hotplug][]

### Storage

* Handled through the same Kubernetes APIs used for containers
  * QEMU / libvirt only see an image file and don't have direct
access to the underlying storage implementation
  * This makes certain scenarios that are common in the
virtualization world very challenging: examples include hotplug
and full VM snapshots (storage plus memory)
* It might be possible to remove some of these limitations by
  changing the way storage is exposed to QEMU, or even take advantage
  of the storage technologies that QEMU already implements and make
  them available to containers in addition to VMs.

Additional information: [Storage][]

### Networking

* Application processes running in VMs are hidden behind a network
  interface as opposed to local sockets and processes running in
  a separate user namespace
  * Service meshes proxy and monitor applications by means of
socket redirection and classification on local ports and
process identifiers. We need to aim for generic compatibility
  * Existing solutions replicate a full TCP/IP stack to pretend
applications running in a QEMU instance are local. No chances
for zero-copy and context switching avoidance
* Network connectivity is shared between control plane and workload
  itself. Addressing and port mapping need particular attention
* Linux capabilities granted to the pod might be minimal, or none
  at all. Live migration presents further challenges in terms of
  network addressing and port mapping

Additional information: [Networking][]

### Live migration

* QEMU supports live migration between hosts, usually coordinated by
  libvirt
* Kubernetes expects containers to be disposable, so the equivalent
  of live migration would be to simply destroy the ones running on
  the source node and schedule replacements on the destination node
* For KubeVirt, a hybrid approach is used: a new container is created
  on the target node, then the VM is migrated from the old container,
  running on the source node, to the newly-created one

Additional information: [Live migration][]

### CPU 

[RFC DOCUMENT 00/12] kubevirt-and-kvm: Add documents

2020-09-16 Thread Andrea Bolognani
Hello there!

Several weeks ago, a group of Red Hatters working on the
virtualization stack (primarily QEMU and libvirt) started a
conversation with developers from the KubeVirt project with the goal
of better understanding and documenting the interactions between the
two.

Specifically, we were interested in integration pain points, with the
underlying ideas being that only once those issues are understood it
becomes possible to look for solutions, and that better communication
would naturally lead to improvements on both sides.

This series of documents was born out of that conversation. We're
sharing them with the QEMU and libvirt communities in the hope that
they can be a valuable resource for understanding how the projects
they're working on are consumed by higher-level tools, and what
challenges are encountered in the process.

Note that, while the documents describe a number of potential
directions for things like development of new components, that's all
just brainstorming that naturally occurred as we were learning new
things: the actual design process should, and will, happen on the
upstream lists.

Right now the documents live in their own little git repository[1],
but the expectation is that eventually they will find a suitable
long-term home. The most likely candidate right now is the main
KubeVirt repository, but if you have other locations in mind please
do speak up!

I'm also aware of the fact that this delivery mechanism is fairly
unconventional, but I thought it would be the best way to spark a
discussion around these topics with the QEMU and libvirt developers.

Last but not least, please keep in mind that the documents are a work
in progress, and polish has been applied to them unevenly: while the
information presented is, to the best of our knowledge, all accurate,
some parts are in a rougher state than others. Improvements will
hopefully come over time - and if you feel like helping out in making
that happen, it would certainly be appreciated!

Looking forward to your feedback :)


[1] https://gitlab.com/abologna/kubevirt-and-kvm
-- 
Andrea Bolognani / Red Hat / Virtualization




Re: Python 3.5 EOL; when can require 3.6?

2020-09-16 Thread Andrea Bolognani
On Wed, 2020-09-16 at 09:53 +0200, Philippe Mathieu-Daudé wrote:
> On 9/16/20 9:43 AM, Markus Armbruster wrote:
> > We require Python 3.5.  It will reach its "end of life" at the end of
> > September 2020[*].  Any reason not to require 3.6 for 5.2?  qemu-iotests
> > already does for its Python parts.
> 
> Not answering your question, but it would help to start a table
> of "oldest package released" versions, with our supported distributions
> as columns and package names as row.
> 
> This way when new distributions are released (and oldest dropped from
> our side) we can add/remove a column and see the oldest version we aim
> to support.

In case you're not already aware of it, https://repology.org/ is a
very valuable tool when it comes to figuring out minimum versions for
dependencies.

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [PATCH v2 2/2] GitLab Gating CI: initial set of jobs, documentation and scripts

2020-09-04 Thread Andrea Bolognani
On Thu, 2020-09-03 at 17:12 -0400, Cleber Rosa wrote:
> On Thu, Jul 09, 2020 at 10:55:07AM +0200, Erik Skultety wrote:
> > On Wed, Jul 08, 2020 at 10:46:57PM -0400, Cleber Rosa wrote:
> > > +.. note:: there are currently limitations to gitlab-runner itself when
> > > +  setting up a service under FreeBSD systems.  You will need to
> > > +  perform steps 4 to 10 manually, as described at
> > > +  https://docs.gitlab.com/runner/install/freebsd.html
> > 
> > What kinds of limitations? Is it architecture constrained maybe? I'm asking
> > because we have all of the steps covered by an Ansible playbook, so I kinda 
> > got
> > triggered by the word "manually". Also, the document only mentions 9 steps
> > overall.
> 
> FreeBSD's "service management" (systemd/sys-v like) is not covered by
> the GO library[1] used on gitlab-runner.  It's not ideal, and the
> second best solution would be to script the equivalent handling within
> the playbook, but I remember trying and finding some inconsistencies.
> Then, I had to give it up and defer to whenever a FreeBSD job is
> actually added.
> 
> [1] - https://github.com/ayufan/golang-kardianos-service

Note that this is a fork of

  https://github.com/kardianos/service

where FreeBSD support was added recently with

  
https://github.com/kardianos/service/commit/14b2cc59a290407a6f1cb3daba59069429d9665b

I'm not sure why gitlab-runner would use a fork rather than the
primary repository, but perhaps they can be convinced to switch and
gain better FreeBSD support in the process.

-- 
Andrea Bolognani / Red Hat / Virtualization




[PATCH] schemas: Add vim modeline

2020-07-29 Thread Andrea Bolognani
The various schemas included in QEMU use a JSON-based format which
is, however, strictly speaking not valid JSON.

As a consequence, when vim tries to apply syntax highlight rules
for JSON (as guessed from the file name), the result is an unreadable
mess which mostly consist of red markers pointing out supposed errors
in, well, pretty much everything.

Using Python syntax highlighting produces much better results, and
in fact these files already start with specially-formatted comments
that instruct Emacs to process them as if they were Python files.

This commit adds the equivalent special comments for vim.

Signed-off-by: Andrea Bolognani 
---
 docs/interop/firmware.json| 1 +
 docs/interop/vhost-user.json  | 1 +
 qapi/authz.json   | 1 +
 qapi/block-core.json  | 1 +
 qapi/block.json   | 1 +
 qapi/char.json| 1 +
 qapi/common.json  | 1 +
 qapi/control.json | 1 +
 qapi/crypto.json  | 1 +
 qapi/dump.json| 1 +
 qapi/error.json   | 1 +
 qapi/introspect.json  | 1 +
 qapi/job.json | 1 +
 qapi/machine-target.json  | 1 +
 qapi/machine.json | 1 +
 qapi/migration.json   | 1 +
 qapi/misc-target.json | 1 +
 qapi/misc.json| 1 +
 qapi/net.json | 1 +
 qapi/qapi-schema.json | 1 +
 qapi/qdev.json| 1 +
 qapi/qom.json | 1 +
 qapi/rdma.json| 1 +
 qapi/rocker.json  | 1 +
 qapi/run-state.json   | 1 +
 qapi/sockets.json | 1 +
 qapi/tpm.json | 1 +
 qapi/transaction.json | 1 +
 qapi/ui.json  | 1 +
 qga/qapi-schema.json  | 1 +
 storage-daemon/qapi/qapi-schema.json  | 1 +
 tests/qapi-schema/doc-good.json   | 2 ++
 tests/qapi-schema/include/sub-module.json | 1 +
 tests/qapi-schema/qapi-schema-test.json   | 1 +
 tests/qapi-schema/sub-sub-module.json | 1 +
 35 files changed, 36 insertions(+)

diff --git a/docs/interop/firmware.json b/docs/interop/firmware.json
index 240f565397..989f10b626 100644
--- a/docs/interop/firmware.json
+++ b/docs/interop/firmware.json
@@ -1,4 +1,5 @@
 # -*- Mode: Python -*-
+# vim: filetype=python
 #
 # Copyright (C) 2018 Red Hat, Inc.
 #
diff --git a/docs/interop/vhost-user.json b/docs/interop/vhost-user.json
index ef8ac5941f..feb5fe58ca 100644
--- a/docs/interop/vhost-user.json
+++ b/docs/interop/vhost-user.json
@@ -1,4 +1,5 @@
 # -*- Mode: Python -*-
+# vim: filetype=python
 #
 # Copyright (C) 2018 Red Hat, Inc.
 #
diff --git a/qapi/authz.json b/qapi/authz.json
index 1c836a3abd..f3e9745426 100644
--- a/qapi/authz.json
+++ b/qapi/authz.json
@@ -1,4 +1,5 @@
 # -*- Mode: Python -*-
+# vim: filetype=python
 #
 # QAPI authz definitions
 
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 943df1926a..5f72b50149 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1,4 +1,5 @@
 # -*- Mode: Python -*-
+# vim: filetype=python
 
 ##
 # == Block core (VM unrelated)
diff --git a/qapi/block.json b/qapi/block.json
index 2ddbfa8306..c54a393cf3 100644
--- a/qapi/block.json
+++ b/qapi/block.json
@@ -1,4 +1,5 @@
 # -*- Mode: Python -*-
+# vim: filetype=python
 
 ##
 # = Block devices
diff --git a/qapi/char.json b/qapi/char.json
index daceb20f84..8aeedf96b2 100644
--- a/qapi/char.json
+++ b/qapi/char.json
@@ -1,4 +1,5 @@
 # -*- Mode: Python -*-
+# vim: filetype=python
 #
 
 ##
diff --git a/qapi/common.json b/qapi/common.json
index 7b9cbcd97b..716712d4b3 100644
--- a/qapi/common.json
+++ b/qapi/common.json
@@ -1,4 +1,5 @@
 # -*- Mode: Python -*-
+# vim: filetype=python
 
 ##
 # = Common data types
diff --git a/qapi/control.json b/qapi/control.json
index 6b816bb61f..de51e9916c 100644
--- a/qapi/control.json
+++ b/qapi/control.json
@@ -1,4 +1,5 @@
 # -*- Mode: Python -*-
+# vim: filetype=python
 #
 
 ##
diff --git a/qapi/crypto.json b/qapi/crypto.json
index b2a4cff683..c41e869e31 100644
--- a/qapi/crypto.json
+++ b/qapi/crypto.json
@@ -1,4 +1,5 @@
 # -*- Mode: Python -*-
+# vim: filetype=python
 #
 
 ##
diff --git a/qapi/dump.json b/qapi/dump.json
index a1eed7b15c..f7c4267e3f 100644
--- a/qapi/dump.json
+++ b/qapi/dump.json
@@ -1,4 +1,5 @@
 # -*- Mode: Python -*-
+# vim: filetype=python
 #
 # This work is licensed under the terms of the GNU GPL, version 2 or later.
 # See the COPYING file in the top-level directory.
diff --git a/qapi/error.json b/qapi/error.json
index 3fad08f506..94a6502de9 100644
--- a/qapi/error.json
+++ b/qapi/error.json
@@ -1,4 +1,5 @@
 # -*- Mode: Python -*-
+# vim: filetype=python

Re: [PATCH v2 2/2] GitLab Gating CI: initial set of jobs, documentation and scripts

2020-07-09 Thread Andrea Bolognani
On Thu, 2020-07-09 at 11:30 +0100, Daniel P. Berrangé wrote:
> On Wed, Jul 08, 2020 at 10:46:57PM -0400, Cleber Rosa wrote:
> > +- name: Installation of basic packages to build QEMU
> > +  hosts: all
> > +  vars_files:
> > +- vars.yml
> > +  tasks:
> > +- name: Install basic packages to build QEMU on Ubuntu 18.04/20.04
> > +  apt:
> > +update_cache: yes
> > +# This matches the packages on 
> > tests/docker/Dockerfiles/ubuntu1804.docker
> 
> I'd be inclined to actually use docker on the custom runners.
> 
> eg. instead of having separate physical machines or VMs for each
> (distro, arch) pair, have a single host distro for the arch. Then
> use docker to provide the build environment against each distro.
> 
> IOW, a RHEL-8 aarch64 host, running docker for ubuntu18.04, fedora30
> etc.
> 
> That way we don't end up duplicating all these packages, and instead
> can use  tests/docker/Dockerfiles/ubuntu1804.docker.  This ensures
> that if a user needs to reproduce a build failure on their own local
> aarch64 machine, they can run docker and get the exact same build
> architecture.
> 
> It also has the benefit that we don't need to worry about how to
> setup gitlab runners for every distro we care about. We only need to
> do gitlab runner for the standard host distro, which spawns a pristine
> throwaway docker env.
> 
> I appreciate this is a big change from what you've done in this patch
> though, so don't consider this comment a blocker for initial merge.
> I think we should do this as the long term strategy though. Essentially
> for Linux builds, everything should always be container based.

Agreed. You should be able to set up a fairly minimal environment,
which consists of Docker, gitlab-runner and not much else, using a
long-term supported distro such as CentOS and then just schedule
whatever container builds on it. No need to provision a new machine
every time a new Fedora release comes out, just create a container
image for it and add it to the mix.

Additionally, the gitlab-runner Docker executor provides more
isolation than the shell executor, so running untrusted builds
becomes a more reasonable proposition - this is how the shared
runners on gitlab.com work - and you don't have to worry about your
jobs cleaning up properly after themselves nearly as much.

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [PATCH 0/5] QEMU Gating CI

2020-04-27 Thread Andrea Bolognani
On Mon, 2020-04-27 at 01:24 -0400, Cleber Rosa wrote:
> On Fri, 24 Apr 2020 08:57:46 +0200
> Erik Skultety  wrote:
> > On Thu, Apr 23, 2020 at 11:28:21PM +0200, Philippe Mathieu-Daudé
> > wrote:
> > > Thanks to insist with that point Daniel. I'd rather see every
> > > configuration reproducible, so if we loose a hardware sponsor, we
> > > can find another one and start another runner.
> > > Also note, if it is not easy to reproduce a runner, it will be very
> > > hard to debug a reported build/test error.
> > 
> > (Thanks for bringing ^this point up Philippe)
> > 
> > ...However, what we've been actively working on in libvirt is to
> > extend the lcitool we have (which can spawn local test VMs) to the
> > point where we're able to generate machines that would be the
> > reproducible. Right now I'm playing with cloud-init integration with
> > lcitool (patches coming soon) that would allow us to use the same
> > machines locally as we'd want to have in, say, OpenStack and share
> > them as compressed images, so even when updated/managed by lcitool
> > locally, you'd get the same environment.
> 
> This is great, and it aligns with my previous point that reproducibility
> it's not *just* about the hardware, but about diligently documenting
> and automating the environments, be them mundane or super specialized.
> And IMO that should score some points when it comes to being
> promoted/demoted as a Gating machine/job.

I think there's room to extend and rework lcitool so that it can be
used for QEMU CI as well, and we should definitely look into that.

Right now it only really covers VMs and containers, but there's
already one situation (FreeBSD) where the expectation is that you'd
import an existing VM image and then apply CI-related customizations
on top, so it's not too much of a stretch to imagine doing the same
for a bare metal machine. We use Ansible, so as long as we can
connect to the machine via ssh we're pretty much good to go.

Installation of VMs we already perform in an unattended fashion using
preseed/kickstart, and it should be relatively straightforward to
adapt those configurations to also work on real hardware. This way
we'd both be able to rely on having a sane OS as the base, and
relieve the administrator from the duty of manually installing the
machines.

-- 
Andrea Bolognani / Red Hat / Virtualization




[PATCH] qapi: Expand documentation for LostTickPolicy

2020-02-11 Thread Andrea Bolognani
The current documentation is fairly terse and not easy to decode
for someone who's not intimately familiar with the inner workings
of timer devices. Expand on it by providing a somewhat verbose
description of what behavior each policy will result in, as seen
from both the guest OS and host point of view.

Signed-off-by: Andrea Bolognani 
---
This information is reported pretty much word by word in

  https://libvirt.org/formatdomain.html#elementsTime

so I'm hoping I can get the QEMU documentation updated and then just
merge back the changes.
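
For context, on the QEMU command line a policy is typically selected
along these lines (a sketch; the device and property names assume the
x86 RTC and KVM PIT):

  -global mc146818rtc.lost_tick_policy=slew
  -global kvm-pit.lost_tick_policy=discard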

 qapi/misc.json | 34 +++---
 1 file changed, 23 insertions(+), 11 deletions(-)

diff --git a/qapi/misc.json b/qapi/misc.json
index 33b94e3589..cd7445d29f 100644
--- a/qapi/misc.json
+++ b/qapi/misc.json
@@ -163,17 +163,29 @@
 ##
 # @LostTickPolicy:
 #
-# Policy for handling lost ticks in timer devices.
-#
-# @discard: throw away the missed tick(s) and continue with future injection
-#   normally.  Guest time may be delayed, unless the OS has explicit
-#   handling of lost ticks
-#
-# @delay: continue to deliver ticks at the normal rate.  Guest time will be
-# delayed due to the late tick
-#
-# @slew: deliver ticks at a higher rate to catch up with the missed tick. The
-#guest time should not be delayed once catchup is complete.
+# Policy for handling lost ticks in timer devices.  Ticks end up getting
+# lost when, for example, the guest is paused.
+#
+# @discard: throw away the missed ticks and continue with future injection
+#   normally.  The guest OS will see the timer jump ahead by a
+#   potentially quite significant amount all at once, as if the
+#   intervening chunk of time had simply not existed; needless to
+#   say, such a sudden jump can easily confuse a guest OS which is
+#   not specifically prepared to deal with it.  Assuming the guest
+#   OS can deal correctly with the time jump, the time in the guest
+#   and in the host should now match.
+#
+# @delay: continue to deliver ticks at the normal rate.  The guest OS will
+# not notice anything is amiss, as from its point of view time will
+# have continued to flow normally.  The time in the guest should now
+# be behind the time in the host by exactly the amount of time during
+# which ticks have been missed.
+#
+# @slew: deliver ticks at a higher rate to catch up with the missed ticks.
+#The guest OS will not notice anything is amiss, as from its point
+#of view time will have continued to flow normally.  Once the timer
+#has managed to catch up with all the missing ticks, the time in
+#the guest and in the host should match.
 #
 # Since: 2.0
 ##
-- 
2.24.1




Re: Making QEMU easier for management tools and applications

2020-02-03 Thread Andrea Bolognani
h its JSON-based API.


Obviously QEMU developers, for their own use, could still benefit
from having access to a user interface that doesn't require either
rigging up libvirt support or messing with JSON directly, and such
an interface could even look very similarly to the current CLI, but
it could simply not be made user-facing and thus not subject to any
compatibility concerns.


[1] https://lists.nongnu.org/archive/html/qemu-devel/2020-01/msg06034.html
-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [RFC PATCH v3 0/6] target/arm/kvm: Adjust virtual time

2020-01-31 Thread Andrea Bolognani
On Mon, 2020-01-20 at 11:10 +0100, Andrew Jones wrote:
> v3:
>  - Added a target/arm/kvm_arm.h comment cleanup patch (1/6)
>  - Minor refactoring of assert_has_feature_enabled/disabled in 4/6,
>kept Richard's r-b.
>  - Rewrote kvm-no-adjvtime documentation in 6/6.
>  - Reworked approach in 5/6 to properly deal with migration and to
>track running vs. !running, rather than running vs. paused states.

Probably too late since Peter already queued the series, but FWIW

  Tested-by: Andrea Bolognani 

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [PATCH qemu v5] spapr: Kill SLOF

2020-01-28 Thread Andrea Bolognani
On Thu, 2020-01-23 at 16:11 +1100, David Gibson wrote:
> On Wed, Jan 22, 2020 at 06:14:37PM +1100, Alexey Kardashevskiy wrote:
> > On 22/01/2020 17:32, David Gibson wrote:
> > > I'm not thinking of "grub" as a separate option - that would be the
> > > same as "vof".  Using vof + no -kernel we'd need to scan the disks in
> > > the same way SLOF does, and look for a boot partition, which will
> > > probably contain a GRUB image. 
> > 
> > I was hoping we can avoid that by allowing
> > "-kernel grub" and let grub do filesystems and MBR/GPT.
> 
> I don't want that to be the only way, because I want the GRUB
> installed by the OS installer to be the GRUB we use.

Agreed, the bootloader and the kernel should live inside the guest
image and not on the host's filesystem.

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [RFC PATCH qemu] spapr: Kill SLOF

2020-01-13 Thread Andrea Bolognani
On Wed, 2020-01-08 at 13:34 +1100, Alexey Kardashevskiy wrote:
> On 07/01/2020 20:39, Andrea Bolognani wrote:
> > On Tue, 2020-01-07 at 12:55 +1100, Alexey Kardashevskiy wrote:
> > > Petitboot kernel+initramdisk almost replaces SLOF + GRUB.
> > 
> > Is this necessarily a good thing?
> 
> The bare metal host and the powernv machine in QEMU do not use grub,
> they use petitboot which parses all grub configs and supports quite a lot.

How well does the distro integration work? Eg. if I change something
in /etc/default/grub and then run grub2-mkconfig, can I expect my
changes to be picked up? In which scenarios will that *not* work?

> Using Linux for a boot loader is not powerpc-only thing really, some
> folks do this too (forgot who, just heard this at the KVM forum).

While other options are available and some architectures use
something else entirely, GRUB is the de-facto standard across most
of the non-obscure architectures.

I guess the question is whether it's more important to be consistent
within the architecture or across them. I think the latter might be
preferable, especially when we consider what I think is the most
common scenario, that is, someone who's used to having GRUB on their
x86 machine running a ppc64 guest on the cloud. The more skills they
can automatically transfer over, the better.

> > Personally I quite like the fact
> > that I can use the same bootloader across x86, ppc64 and aarch64.
> 
> I am not suggesting removing SLOF soon

Perhaps the patch subject shouldn't be "kill SLOF" then? ;)

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [RFC PATCH qemu] spapr: Kill SLOF

2020-01-07 Thread Andrea Bolognani
On Tue, 2020-01-07 at 12:55 +1100, Alexey Kardashevskiy wrote:
> On 07/01/2020 01:15, Daniel Henrique Barboza wrote:
> > Question: does Petitboot already replaces SLOF in every possible
> > scenario for all
> > the spapr machine features?
> 
> Petitboot kernel+initramdisk almost replaces SLOF + GRUB.

Is this necessarily a good thing? Personally I quite like the fact
that I can use the same bootloader across x86, ppc64 and aarch64.

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [PATCH v1 0/5] target/arm/kvm: Provide an option to adjust virtual time

2019-12-10 Thread Andrea Bolognani
On Tue, 2019-12-10 at 17:08 +0100, Andrew Jones wrote:
> I don't have a strong enough opinion about kvm-adjvtime vs.
> kvm-no-adjvtime to insist one way or another. I agree double inversions
> are easier to mess up, but I also like the way the '-no-' better
> communicates that the default is [probably] 'yes'.
> 
> All interested parties, please vote. I'll be sending v2 soon and I can
> call this thing anything the majority (or the dominate minority) prefer.

I like kvm-adjvtime better because it avoids the double negative,
but on the other hand if the new default is be to adjust the time
then I don't expect many people will actually need to use the
parameter, so the name doesn't matter that much after all :)

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [PATCH v1 0/5] target/arm/kvm: Provide an option to adjust virtual time

2019-12-10 Thread Andrea Bolognani
On Tue, 2019-12-10 at 15:33 +0100, Andrew Jones wrote:
> On Tue, Dec 10, 2019 at 03:21:02PM +0100, Andrea Bolognani wrote:
> > I agree with everything except the naming: why would
> > 
> >   kvm-no-adjvtime=off  vtime is adjusted  (default)
> >   kvm-no-adjvtime=on   vtime is not adjusted
> > 
> > be better than
> > 
> >   kvm-adjvtime=on   vtime is adjusted  (default)
> >   kvm-adjvtime=off  vtime is not adjusted
> > 
> > ? Both offer the exact same amount of flexibility, but the latter has
> > the advantage of not introducing any unwieldy double negatives.
> 
> A default of 'off' == 0 means not setting anything at all. There's
> already precedent for 'kvm-no*' prefixed cpu features,
> 
> kvm-no-smi-migration
> kvm-nopiodelay

Sorry, I'm not sure I understand. Do you mean that the array where
you store CPU features is 0-inizialized, so it's more convenient to
have off (0) as the default because it means you don't have to touch
it beforehand? Or something different?

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [PATCH v1 0/5] target/arm/kvm: Provide an option to adjust virtual time

2019-12-10 Thread Andrea Bolognani
On Tue, 2019-12-10 at 14:32 +0100, Andrew Jones wrote:
> After having done this git mining, it looks more and more like we should
> at least consider naming this feature 'kvm-no-adjvtime' and probably
> also change arm's default.

I agree with everything except the naming: why would

  kvm-no-adjvtime=off  vtime is adjusted  (default)
  kvm-no-adjvtime=on   vtime is not adjusted

be better than

  kvm-adjvtime=on   vtime is adjusted  (default)
  kvm-adjvtime=off  vtime is not adjusted

? Both offer the exact same amount of flexibility, but the latter has
the advantage of not introducing any unwieldy double negatives.

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [PATCH v1 0/5] target/arm/kvm: Provide an option to adjust virtual time

2019-12-10 Thread Andrea Bolognani
On Fri, 2019-12-06 at 16:53 +0100, Andrew Jones wrote:
> On Fri, Dec 06, 2019 at 03:22:58PM +, Peter Maydell wrote:
> > On Wed, 16 Oct 2019 at 15:34, Andrew Jones  wrote:
> > > This series is inspired by a series[1] posted by Bijan Mottahedeh about
> > > a year ago.  The problem described in the cover letter of [1] is easily
> > > reproducible and some users would like to have the option to avoid it.
> > > However the solution, which is to adjust the virtual counter offset each
> > > time the VM transitions to the running state, introduces a different
> > > problem, which is that the virtual and physical counters diverge.  As
> > > described in the cover letter of [1] this divergence is easily observed
> > > when comparing the output of `date` and `hwclock` after suspending the
> > > guest, waiting a while, and then resuming it.  Because this different
> > > problem may actually be worse for some users, unlike [1], the series
> > > posted here makes the virtual counter offset adjustment optional and not
> > > even enabled by default.  Besides the adjustment being optional, this
> > > series approaches the needed changes differently to apply them in more
> > > appropriate locations.  Finally, unlike [1], this series doesn't attempt
> > > to measure "pause time" itself.  Simply using QEMU_CLOCK_VIRTUAL, which
> > > only ticks when the VM is not stopped, is sufficient.
> > 
> > So I guess my overall question is "what is the x86 solution to
> > this problem, and why is this all arm-specific?" It would also
> 
> x86 adjusts the counter offset by default, and I don't think there's any
> way to turn that behavior off. I think it's too late to follow that
> default for arm, but this series provides a way to opt into the same
> behavior.

My understanding is that turning kvm-adjvtime either on or off
results in a different set of advantages and drawbacks, with neither
being a one-size-fits-all solution. So it's good that we offer a way
for the user to pick one or the other based on their specific needs.

The main difference is that kvm-adjvtime=on matches x86's behavior,
which is fairly well understood and thoroughly documented, along with
the corresponding workarounds, dos and don'ts.

With that in mind, in my opinion it would make sense to make
kvm-adjvtime=on the behavior for newer machine types so that people
coming from an x86 background and/or having to manage both at the
same time will get a consistent experience and will be able to draw,
even for aarch64, on the considerable amount of existing x86-centric
literature on the subject.

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [PATCH-for-5.0] roms/edk2-funcs.sh: Use available GCC for ARM/Aarch64 targets

2019-12-05 Thread Andrea Bolognani
On Thu, 2019-12-05 at 17:27 +0100, Philippe Mathieu-Daudé wrote:
> On 12/5/19 5:13 PM, Laszlo Ersek wrote:
> > If that rules out CentOS 7 as a QEMU project build / CI platform for the
> > bundled ArmVirtQemu binaries, then we need a more recent platform
> > (perhaps CentOS 8, not sure).
> 
> Unfortunately CentOS 8 is not available as a Docker image, which is a 
> convenient way to build EDK2 in a CI.

Uh?

  https://hub.docker.com/_/centos

seems to disagree with you. Is the 'centos:8' image not suitable
for some non-obvious reason?

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [Qemu-devel] git.qemu.org gitweb misrenders on git/ URLs

2019-08-29 Thread Andrea Bolognani
On Thu, 2019-08-29 at 16:48 +0100, Stefan Hajnoczi wrote:
> Hi Jeff,
> Philippe noticed that the git HTTPS clone URL
> https://git.qemu.org/git/libslirp.git renders a gitweb page that looks
> right but has broken links.  The correct gitweb URL listed on
> https://git.qemu.org/ is https://git.qemu.org/?p=libslirp.git;a=summary,
> but there's a chance that people will open the HTTPS clone URL in their
> browser and expect to see gitweb working.
> 
> Is it possible to tweak the Apache configuration so that
> https://git.qemu.org/git/libslirp.git[/] redirects to the working gitweb
> URL?
> 
> The tricky part is not breaking HTTPS git clone, which accesses URLs
> below https://git.qemu.org/git/libslirp.git/ :).

I know that's not quite the answer to your question, but if you look
for example at

  https://git.zx2c4.com/cgit

you'll see that the same URL can be used both for viewing with a
browser *and* cloning.

Basically with cgit all requests go through the CGI script, and an
advantage of that is that you don't even need to call

  git update-server-info

to make the repository accessible via HTTPS. It's also pretty fast
and extremely easy to set up. Maybe consider switching from gitweb
to it?
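
To illustrate, the cgit side of it amounts to a few lines of
configuration - a sketch, with paths and URL being illustrative:

  # /etc/cgitrc
  enable-http-clone=1
  clone-url=https://git.example.org/$CGIT_REPO_URL
  scan-path=/srv/git/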

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [Qemu-devel] [PATCH v3 00/15] target/arm/kvm: enable SVE in guests

2019-08-20 Thread Andrea Bolognani
On Tue, 2019-08-20 at 09:53 +0200, Andrew Jones wrote:
> On Tue, Aug 20, 2019 at 06:08:04AM +, Zhang, Lei wrote:
> > Hi Andrew,
> > 
> > I have tested your patch on kernel- 5.2.7 + QEMU (4.0.94 + patch).
> 
> Thanks for the testing! I guess it's time for me to get back to this
> series and spin a v4 (so we can test some more :-)
> 
> > This patch series works fine for my tests when use qemu-system-aarch64 
> > directly.
> > But I can't startup kvm when I use virsh[1].
> > 
> > Command I executed.
> > # virsh start test1
> > 
> > The error message is [internal error: CPU features not supported by 
> > hypervisor for aarch64 architecture.]
> > Do you have any ideas for this error? 
> 
> I've CC'ed Andrea.

I've specifically patched out that check in my series... Are you
sure you're using the modified libvirt version, and that your guest
is configured to use the modified QEMU binary?

Anyway, once v4 has been posted I'll respin the libvirt series as
well, since in the meantime conflicts have popped up and it no longer
applies cleanly on top of master.

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [Qemu-devel] [PATCH 2/4] configure: Avoid using libssh deprecated API

2019-08-15 Thread Andrea Bolognani
On Wed, 2019-08-14 at 17:14 +0200, Philippe Mathieu-Daudé wrote:
> On 8/14/19 4:51 PM, Andrea Bolognani wrote:
> > On Wed, 2019-08-14 at 16:15 +0200, Philippe Mathieu-Daudé wrote:
> > > On 8/14/19 3:27 PM, Andrea Bolognani wrote:
> > > > On Wed, 2019-08-14 at 14:15 +0200, Philippe Mathieu-Daudé wrote:
> > > > > Suggested-by: Andrea Bolognani 
> > > > > Signed-off-by: Philippe Mathieu-Daudé 
> > > > > ---
> > > > >  block/ssh.c | 2 +-
> > > > >  configure   | 7 +++
> > > > >  2 files changed, 8 insertions(+), 1 deletion(-)
> > > > 
> > > > Did I really suggest this? I have no recollection of doing so, or
> > > > even getting involved with libssh support in QEMU at all for that
> > > > matter.
> > > 
> > > I took this suggestion from
> > > https://www.redhat.com/archives/libvir-list/2018-May/msg00597.html
> > 
> > I see :)
> > 
> > I feel like adding a Suggested-by because of something that was
> > posted on an unrelated project's mailing list is stretching the
> > definition of the tag a bit, so if you end up having to respin I
> > think it would be reasonable to drop it, but honestly it's not a
> > big deal either way: I was just curious.
> 
> Understood, sorry.

Nothing to apologize for! :)

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [Qemu-devel] [PATCH 2/4] configure: Avoid using libssh deprecated API

2019-08-14 Thread Andrea Bolognani
On Wed, 2019-08-14 at 16:15 +0200, Philippe Mathieu-Daudé wrote:
> On 8/14/19 3:27 PM, Andrea Bolognani wrote:
> > On Wed, 2019-08-14 at 14:15 +0200, Philippe Mathieu-Daudé wrote:
> > > The libssh packaged by a distribution can predate version 0.8,
> > > but still provides the newer API introduced after version 0.7.
> > > 
> > > Using the deprecated API leads to build failure, as on Ubuntu 18.04:
> > > 
> > > CC  block/ssh.o
> > >   block/ssh.c: In function 'check_host_key_hash':
> > >   block/ssh.c:444:5: error: 'ssh_get_publickey' is deprecated 
> > > [-Werror=deprecated-declarations]
> > > >r = ssh_get_publickey(s->session, &key);
> > >^
> > >   In file included from block/ssh.c:27:0:
> > >   /usr/include/libssh/libssh.h:489:31: note: declared here
> > >SSH_DEPRECATED LIBSSH_API int ssh_get_publickey(ssh_session session, 
> > > ssh_key *key);
> > >  ^
> > >   rules.mak:69: recipe for target 'block/ssh.o' failed
> > >   make: *** [block/ssh.o] Error 1
> > > 
> > > Fix by using the newer API if available.
> > > 
> > > Suggested-by: Andrea Bolognani 
> > > Signed-off-by: Philippe Mathieu-Daudé 
> > > ---
> > >  block/ssh.c | 2 +-
> > >  configure   | 7 +++
> > >  2 files changed, 8 insertions(+), 1 deletion(-)
> > 
> > Did I really suggest this? I have no recollection of doing so, or
> > even getting involved with libssh support in QEMU at all for that
> > matter.
> 
> I took this suggestion from
> https://www.redhat.com/archives/libvir-list/2018-May/msg00597.html

I see :)

I feel like adding a Suggested-by because of something that was
posted on an unrelated project's mailing list is stretching the
definition of the tag a bit, so if you end up having to respin I
think it would be reasonable to drop it, but honestly it's not a
big deal either way: I was just curious.

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [Qemu-devel] [PATCH 2/4] configure: Avoid using libssh deprecated API

2019-08-14 Thread Andrea Bolognani
On Wed, 2019-08-14 at 14:15 +0200, Philippe Mathieu-Daudé wrote:
> The libssh packaged by a distribution can predate version 0.8,
> but still provides the newer API introduced after version 0.7.
> 
> Using the deprecated API leads to build failure, as on Ubuntu 18.04:
> 
> CC  block/ssh.o
>   block/ssh.c: In function 'check_host_key_hash':
>   block/ssh.c:444:5: error: 'ssh_get_publickey' is deprecated 
> [-Werror=deprecated-declarations]
>r = ssh_get_publickey(s->session, &key);
>^
>   In file included from block/ssh.c:27:0:
>   /usr/include/libssh/libssh.h:489:31: note: declared here
>SSH_DEPRECATED LIBSSH_API int ssh_get_publickey(ssh_session session, 
> ssh_key *key);
>  ^
>   rules.mak:69: recipe for target 'block/ssh.o' failed
>   make: *** [block/ssh.o] Error 1
> 
> Fix by using the newer API if available.
> 
> Suggested-by: Andrea Bolognani 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  block/ssh.c | 2 +-
>  configure   | 7 +++
>  2 files changed, 8 insertions(+), 1 deletion(-)

Did I really suggest this? I have no recollection of doing so, or
even getting involved with libssh support in QEMU at all for that
matter.

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [Qemu-devel] [RFC] virtio-mmio: implement modern (v2) personality (virtio-1)

2019-07-30 Thread Andrea Bolognani
On Tue, 2019-07-30 at 13:35 +0200, Cornelia Huck wrote:
> On Tue, 30 Jul 2019 12:25:30 +0200
> Andrea Bolognani  wrote:
> > Can you please make sure virtio-mmio uses the existing interface
> > instead of introducing a new one?
> 
> FWIW, I really hate virtio-pci's disable-modern/disable-legacy... for a
> starter, what is 'modern'? Will we have 'ultra-modern' in the future?

AIUI the modern/legacy terminology is part of the VirtIO spec, so
while I agree that it's not necessarily the least prone to ambiguity
at least it's well defined.

> It is also quite backwards with the 'disable' terminology.

That's also true. I never claimed the way virtio-pci does it is
perfect!

> We also have a different mechanism for virtio-ccw ('max_revision',
> which covers a bit more than virtio-1; it doesn't have a 'min_revision',
> as negotiating the revision down is fine), so I don't see why
> virtio-mmio should replicate the virtio-pci mechanism.
> 
> Also, IIUC, virtio-mmio does not have transitional devices, but either
> version 1 (legacy) or version 2 (virtio-1). It probably makes more
> sense to expose the device version instead; either as an exact version
> (especially if it isn't supposed to go up without incompatible
> changes), or with some min/max concept (where version 1 would stand a
> bit alone, so that would probably be a bit awkward.)

I think that if reinventing the wheel is generally agreed not to be
a good idea, then it stands to reason that reinventing it twice can
only be described as absolute madness :)

We should have a single way to control the VirtIO protocol version
that works for all VirtIO devices, regardless of transport. We might
even want to have virtio-*-{device,ccw}-non-transitional to mirror
the existing virtio-*-pci-non-transitional.

FWIW, libvirt already implements support for (non)-transitional
virtio-pci devices using either the dedicated devices or the base
virtio-pci plus the disable-{modern,legacy} attributes.
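
For instance, with recent libvirt a guest can ask for a 1.x-only
network device along these lines (a sketch of the XML involved):

  <interface type='network'>
    <source network='default'/>
    <model type='virtio-non-transitional'/>
  </interface>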

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [Qemu-devel] [RFC] virtio-mmio: implement modern (v2) personality (virtio-1)

2019-07-30 Thread Andrea Bolognani
On Mon, 2019-07-29 at 14:57 +0200, Sergio Lopez wrote:
[...]
>  /* virtio-mmio device */
>  
>  static Property virtio_mmio_properties[] = {
>  DEFINE_PROP_BOOL("format_transport_address", VirtIOMMIOProxy,
>   format_transport_address, true),
> +DEFINE_PROP_BOOL("modern", VirtIOMMIOProxy, modern, false),
>  DEFINE_PROP_END_OF_LIST(),
>  };

Not a QEMU developer so forgive me if I say something silly, but IIUC
you'd be able to opt into the new feature by using eg.

  -device virtio-net-device,modern=on

However, virtio-pci devices already have a mechanism to control the
VirtIO protocol version, where you use

  -device virtio-net-pci,disable-modern=no,disable-legacy=yes

to get a VirtIO 1.x-only device and

  -device virtio-net-pci,disable-modern=no,disable-legacy=no

for a transitional device.

Can you please make sure virtio-mmio uses the existing interface
instead of introducing a new one?

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [Qemu-devel] [Qemu-riscv] [RFC v1 0/5] RISC-V: Add firmware loading support and default

2019-06-27 Thread Andrea Bolognani
On Fri, 2019-06-21 at 14:35 +0200, Andrea Bolognani wrote:
> On Thu, 2019-06-20 at 21:43 +0300, David Abdurachmanov wrote:
> > On Thu, Jun 20, 2019 at 9:18 PM Alistair Francis  
> > wrote:
> > > OE-Core already packages OpenSBI by default, Fedora and Debian are
> > > moving to OpenSBI for RISC-V targets as well.
> > > 
> > > Any distro that supports the RISC-V toolchain (which is all
> > > upstreamed) can build OpenSBI.
> > 
> > Fedora uses OpenSBI for the last 2 or 3 months now. I don't plan to update
> > BBL builds. OpenSBI packages in Fedora/RISCV isn't finalized, but it does
> > ship *.elf and *.bin files.
> 
> Sounds good to me, thanks for confirming!

From further off-list discussion with David, I have learned that
recent Fedora images include an OpenSBI build with embedded U-Boot
payload, such that you only need to have that single file on the host
and pass it to QEMU via -kernel[1] for RISC-V guest boot to work. I
played with it over the past few days, and it works very nicely.

I think this is the result that we want to ultimately reach: a single
RISC-V "firmware" binary installed on the host through an appropriate
distro package, shared among guests, with everything else that is
guest-specific being contained in the corresponding disk image.
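
In other words, booting a guest would ideally look something like
this (a sketch: the firmware path is illustrative and, per [1], the
exact option might end up being -bios rather than -kernel):

  $ qemu-system-riscv64 \
      -M virt \
      -bios /usr/share/opensbi/fw_payload.bin \
      -drive file=guest.qcow2,if=virtio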

This is what other architectures are already doing, with SeaBIOS and
OVMF on x86_64, AAVMF on aarch64 and SLOF on ppc64 all being handled
this way: RISC-V should, where it makes sense, follow suit.

QEMU also recently introduced a JSON-based specification that can be
used to advertise guest firmwares and libvirt already supports it,
which makes firmware configuration either extremely convenient or
entirely automatic for the user: the OpenSBI support should also be
advertised this way.


[1] I guess that'd be -bios after these patches?
-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [Qemu-devel] [Qemu-arm] [RFC v2 PATCH] hw/arm/virt: makes virt a default machine type

2019-06-24 Thread Andrea Bolognani
On Sat, 2019-06-22 at 16:58 +0100, Peter Maydell wrote:
> On Fri, 21 Jun 2019 at 20:04, Cleber Rosa  wrote:
> > You can consider me biased (I do consider myself), but trying to wear
> > the hat of a user first interacting with QEMU, I would expect a (any)
> > reasonably capable environment that can represent the given target.
> > That will probably be a different environment than the one I may need,
> > and I think that's fine.
> 
> I'm really not sure what you're trying to suggest here; maybe
> you could clarify? If you specify a target (ie a machine type),
> you get that machine type. If you don't specify a target, then
> we can't really guess what you were hoping to run and
> magically pick something that works.
> 
> The main problem here is that users expect "all the world is a PC"
> type behaviour, ie they can just provide qemu-system-arm or
> qemu-system-aarch64 with no command line arguments except
> a guest kernel (which is half the time something they found under
> a rock or extracted from some firmware image) or a guest CDROM
> image and have it boot, because that generally works for x86. It
> doesn't and can't work for Arm, because of the much greater
> diversity of machine types and the way that kernels are often
> only compiled to work on a specific subset of machines.
> Making the user specify a machine type means they do at least
> get prompted that the world is more complicated than they
> think it is and there are decisions that have to be made.
> 
> In any case even if we did default to "virt" the user still
> has to specify a CPU type, may well also want to provide
> a GIC version (gicv3 being better than the default v2),
> likely more RAM than the very small default, they need to provide
> all the virtio devices, and so on and so on. So giving
> them one option they no longer need to specify doesn't
> really make it any easier IMHO.

Additional note on GIC: most server-grade machines you can buy today
do *not* support GICv2, so you will need to opt in to GICv3 if you
want your guest to even start.
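
In practice that means something along the lines of (a sketch):

  $ qemu-system-aarch64 -M virt,gic-version=3 -cpu host -enable-kvm ...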

More generally, as someone who has worked on supporting non-x86
guests in libvirt for the past few years, I can tell you from
experience that you're always going to need some arch-specific logic
to deal with the small (and not so small :) differences in behavior
between QEMU targets: as Peter correctly says, machine type is just
a single example among many.

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [Qemu-devel] [Qemu-riscv] [RFC v1 0/5] RISC-V: Add firmware loading support and default

2019-06-21 Thread Andrea Bolognani
On Thu, 2019-06-20 at 21:43 +0300, David Abdurachmanov wrote:
> On Thu, Jun 20, 2019 at 9:18 PM Alistair Francis  wrote:
> > On Thu, Jun 20, 2019 at 1:16 AM Andrea Bolognani  
> > wrote:
> > > but one question comes to mind: once this is in, we will probably
> > > want to have OpenSBI packaged separately in distributions, the same
> > > way it already happens for SeaBIOS, SLOF and edk2-based firmwares.
> > > 
> > > Will using either of the formats prevent that from happening?
> > 
> > Both options allow this.
> > 
> > OE-Core already packages OpenSBI by default, Fedora and Debian are
> > moving to OpenSBI for RISC-V targets as well.
> >
> > Any distro that supports the RISC-V toolchain (which is all
> > upstreamed) can build OpenSBI.
> 
> Fedora uses OpenSBI for the last 2 or 3 months now. I don't plan to update
> BBL builds. OpenSBI packages in Fedora/RISCV isn't finalized, but it does
> ship *.elf and *.bin files.

Sounds good to me, thanks for confirming!

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [Qemu-devel] [Qemu-riscv] [RFC v1 0/5] RISC-V: Add firmware loading support and default

2019-06-20 Thread Andrea Bolognani
On Wed, 2019-06-19 at 11:23 -0700, Alistair Francis wrote:
> On Wed, Jun 19, 2019 at 7:42 AM Bin Meng  wrote:
> > On Wed, Jun 19, 2019 at 10:30 PM Alistair Francis  
> > wrote:
> > > On Wed, Jun 19, 2019 at 7:26 AM Bin Meng  wrote:
> > > > >  pc-bios/opensbi-riscv32-fw_jump.elf | Bin 0 -> 197988 bytes
> > > > >  pc-bios/opensbi-riscv64-fw_jump.elf | Bin 0 -> 200192 bytes
> > > > 
> > > > Since we are considering adding "bios" images, I prefer to add the
> > > > pure binary images instead of ELF images here.
> > > 
> > > I didn't think about that. Can we just boot them in QEMU like we do
> > > with the ELFs?
> > 
> > Yes, use load_image_targphys() instead of load_elf().
> 
> Ah, that is obvious. I'll update it to use the bin files then.

I'm unclear on the advantages of using one format over the other,
but one question comes to mind: once this is in, we will probably
want to have OpenSBI packaged separately in distributions, the same
way it already happens for SeaBIOS, SLOF and edk2-based firmwares.

Will using either of the formats prevent that from happening?

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [Qemu-devel] How do we do user input bitmap properties?

2019-05-27 Thread Andrea Bolognani
On Fri, 2019-05-24 at 15:24 -0300, Eduardo Habkost wrote:
> On Thu, May 23, 2019 at 10:35:24AM +0200, Andrea Bolognani wrote:
> > [...] the above looks good to
> > me as a general direction, but note that you'll have to implement at
> > the very least the query-cpu-model-expansion QMP command for the
> > introspection to work.
> 
> Why is query-cpu-model-expansion needed?  Isn't
> device-list-properties enough?

Good question.

I'll have to check with Jirka, but from playing with both commands
it looks like the latter returns a superset of what the former does,
so for the purpose of figuring out which vector lengths the QEMU
binary recognizes it should be good enough; I suspect, however, that
query-cpu-model-expansion might be (made to be) smarter and for
example not report vector lengths that the underlying hardware
doesn't support, which would be valuable for the purpose of user
friendly error reporting and allowing applications to decide which
vector lengths to request when creating guests.
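
Concretely, I'm thinking of a query along these lines (a sketch of
the QMP exchange, with the response abridged):

  -> { "execute": "query-cpu-model-expansion",
       "arguments": { "type": "full", "model": { "name": "host" } } }
  <- { "return": { "model": { "name": "host", "props": { ... } } } }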

I'll try to come back with more than theories soon :)

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [Qemu-devel] How do we do user input bitmap properties?

2019-05-23 Thread Andrea Bolognani
On Wed, 2019-05-15 at 13:54 +0200, Andrew Jones wrote:
> On Wed, May 15, 2019 at 12:52:29PM +0200, Igor Mammedov wrote:
> > since using magic numbers is not very descriptive
> > (but if there is some spec where they come from that we could point users to
> > it might be acceptable too, but I'd reserve number approach for values 
> > only).
> 
> The numbers aren't magic, they're part of the name. '1' in the above
> 'sve1' means one quadword. It would probably have been better to use bits
> instead in the example, i.e.
> 
>   -cpu host,sve128=on,sve256=on,sve384=off,sve512=on
> 
> where it's now clear that "sve512" has an analogy with x86's "avx512".
> 
[...]
> 
> So I set off to convince Igor of the wide word idea (he sits next to me,
> so I didn't have go far), but he has convinced me of the above property
> idea. He used the magic phrase: "less code would be needed". If we use
> the properties like above then we get introspection for free (cpu property
> listing which libvirt already knows how to do) - so no QMP query needed.
> The cost is adding several properties (16 to handle the current 2048-bit
> limit), but I guess that's cheap enough. The command line is verbose, but
> also much easier for a human to construct and read. I'm pretty sold on
> this path, but adding Andrea and Eduardo for their input as well.

Sorry for taking a while to respond. Anyway, the above looks good to
me as a general direction, but note that you'll have to implement at
the very least the query-cpu-model-expansion QMP command for the
introspection to work.

query-cpu-model-baseline and query-cpu-model-comparison are two more
QMP commands which, while perhaps not immediately applicable, we will
want to implement at some point; more in general, what s390x is doing
with respect to CPU models is a good blueprint, according to the
libvirt developer who's the most involved with that specific area of
the project.

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [Qemu-devel] [PATCH 00/13] target/arm/kvm: enable SVE in guests

2019-05-15 Thread Andrea Bolognani
On Wed, 2019-05-15 at 12:14 +0100, Dave Martin wrote:
> On Wed, May 15, 2019 at 09:03:58AM +0100, Andrea Bolognani wrote:
> > On Tue, 2019-05-14 at 13:14 -0700, Richard Henderson wrote:
> > > Why is =4 less user-friendly than =512?
> > > 
> > > I don't actually see "total bits in vector" as more user-friendly than 
> > > "number
> > > of quadwords" when it comes to non-powers-of-2 like =7 vs =896 or =13 vs 
> > > =1664.
> > 
> > I would wager most people are intimately familiar with bits, bytes
> > and multiples due to having to work with them daily. Quadwords, not
> > so much.
> 
> Generally I tend to agree.  For kvmtool I leaned torward quadwords
> purely because
> 
>   16,32,48,64,80,96,112,128,144,160,176,192,208
> 
> is a big pain to type compared with
> 
>   1,2,3,4,5,6,7,8,9,10,11,12,13
> 
> Even though I prefer to specify vector lengths in bytes everywhere else
> in the Linux user API (precisely to avoid the confusion you object to).
> 
> This isn't finalised yet for kvmtool -- I need to rework the patches
> and may not include it at all initially: kvmtool doesn't support
> migration, which is the main usecase for being able to specify an exact
> set of vector lengths AFAICT.
> 
> Since this is otherwise only useful for migration, experimentation or
> machine-driven configuration, a bitmask
> 
>   0x1fff
> 
> as some have suggested may well be a pragmatic alternative for kvmtool.

Just to be clear, I have suggested using bits (or bytes or megabytes
depending on the exact value) only for the command-line-user-oriented
sve-vl-max option, which would take a single value.

For interoperation with the management layer, on the other hand,
using a bitmap is perfectly fine, and whether the values encoded
within are expressed in quadwords or whatever other format is largely
irrelevant, so long as it's properly documented, of course.

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [Qemu-devel] [PATCH 00/13] target/arm/kvm: enable SVE in guests

2019-05-15 Thread Andrea Bolognani
On Tue, 2019-05-14 at 13:14 -0700, Richard Henderson wrote:
> On 5/14/19 9:03 AM, Andrea Bolognani wrote:
> > On Tue, 2019-05-14 at 14:53 +0200, Andrew Jones wrote:
> > > We already have sve-max-vq, so I'm not sure we want to rename it.
> > 
> > Oh, I didn't realize that was the case. And of course it already
> > takes a number of quadwords as argument, I suppose? That's pretty
> > unfortunate :(
> > 
> > Perhaps we could consider deprecating it in favor of a user-friendly
> > variant that's actually suitable for regular humans, like the one I
> > suggest above?
> 
> Why is =4 less user-friendly than =512?
> 
> I don't actually see "total bits in vector" as more user-friendly than "number
> of quadwords" when it comes to non-powers-of-2 like =7 vs =896 or =13 vs 
> =1664.

I would wager most people are intimately familiar with bits, bytes
and multiples due to having to work with them daily. Quadwords, not
so much.

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [Qemu-devel] [PATCH 00/13] target/arm/kvm: enable SVE in guests

2019-05-14 Thread Andrea Bolognani
On Tue, 2019-05-14 at 14:53 +0200, Andrew Jones wrote:
> On Tue, May 14, 2019 at 02:29:51PM +0200, Andrea Bolognani wrote:
> > Since we expect management applications to use QMP to discover what
> > vector lengths are supported and then provide an explicit map, I
> > think it's fair to say that the ability to specify a single maximum
> > vector length is going to be exclusively used as a convenience for
> > command line users.
> > 
> > In that light, I think it would be reasonable for the usage to look
> > along the lines of
> > 
> >   -cpu host,sve-vl-map=0xd # machine-friendly variant
> >   -cpu max,sve-vl-max=512  # user-friendly variant
> 
> We already have sve-max-vq, so I'm not sure we want to rename it.

Oh, I didn't realize that was the case. And of course it already
takes a number of quadwords as argument, I suppose? That's pretty
unfortunate :(

Perhaps we could consider deprecating it in favor of a user-friendly
variant that's actually suitable for regular humans, like the one I
suggest above?

[...]
> > > Dave points out
> > > we may want to reduce the list to a single set and then add flags
> > > to indicate what can be done with it in order to derive other sets.
> > > What do you think about that?
> > 
> > So right now all that can be done is truncating the list by removing
> > an arbitrary number of elements from the end, right? Eg. if you have
> > [ 1, 2, 4 ] you can use either that or [ 1, 2 ] or [ 1 ]. But in the
> > future you might also be able to mask single elements in the middle
> > of the list, thus enabling things like [ 1, 4 ].
> > 
> > That doesn't sound very complicated to support in libvirt, though I
> > have to say that I'm not a big fan of this proposal because as far as
> > I can see it basically means implementing the very same logic twice,
> > once in QEMU and then once more in libvirt.
> 
> So if big tables of bits aren't a problem for QMP queries, then I'll
> just leave the design as it is.

I thought about it a bit more and perhaps the simplified design is
better after all.

Whatever the interface looks like on the QEMU side, we're going to
want to offer libvirt users two options for configuring vector
lengths: listing all desired vector lengths explicitly (basically
sve-vl-map but more verbose and readable) and providing just the
biggest desired vector length (like in sve-max-vq).

In the latter case, we'll want to expand the user-provided value
into an explicit list anyway in order to guarantee guest ABI
stability, and doing so when a single bitmap has been obtained via
QMP seems like it would be more manageable.
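
In rough C terms, the expansion I have in mind is as simple as this
(a sketch, assuming 128-bit quadwords and the 64-bit map discussed
earlier in the thread):

  unsigned max_vl = 512;              /* user-provided maximum, in bits */
  unsigned max_vq = max_vl / 128;     /* same value, in quadwords */
  uint64_t vls_map =                  /* enable all vqs from 1 to max_vq */
      max_vq < 64 ? (1ULL << max_vq) - 1 : ~0ULL;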

Sorry for the flip-flop, but after all isn't this exactly what
upstream design discussion is supposed to be all about? :)

[...]
> > If the size of the bitmap on the KVM side is 512 bits, why don't we
> > just make it that size on the QEMU side too from the start?
> 
> I'd still only want to input 64-bits on the command line, otherwise
> we get into inputting arrays, which isn't easy. KVM's interface is
> meant for expansion, but it won't be using most of those bits for
> quite some time either.

I'm probably missing something entirely obvious, but couldn't you
just have a single, possibly fairly massive (up to 128 hex digits if
I'm not mistaken) value on the command line and just work with that
one, no arrays necessary?

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [Qemu-devel] [PATCH 00/13] target/arm/kvm: enable SVE in guests

2019-05-14 Thread Andrea Bolognani
On Mon, 2019-05-13 at 14:36 +0200, Andrew Jones wrote:
> On Mon, May 13, 2019 at 11:32:46AM +0200, Andrea Bolognani wrote:
> > On Sun, 2019-05-12 at 10:36 +0200, Andrew Jones wrote:
> > [...]
> > >CPU type | accel | sve-max-vq | sve-vls-map
> > >---
> > >  1) max | tcg   |  $MAX_VQ   |  $VLS_MAP
> > >  2) max | kvm   |  $MAX_VQ   |  $VLS_MAP
> > >  3)host | kvm   |  N/A   |  $VLS_MAP
> > > 
> > > Where for (1) $MAX_VQ sets the maximum vq and smaller vqs are
> > > all supported when $VLS_MAP is zero, or when the vqs are selected
> > > in $VLS_MAP.
> > 
> > I'm a bit confused by the nomenclature here. VL clearly stands for
> > Vector Length, but what does VQ stand for? You seem to be using the
> > two terms pretty much interchangeably throughout the cover letter.
> 
> As Dave pointed out, they're both lengths, but VQ specifically points
> out that the unit is 'Q'uadwords. We could use VQS instead of VLS,
> "Vector Lengths" sounds better.

Alright, it makes sense - once you've managed to figure out what
exactly a "quadword" is, at least :)

Since we expect management applications to use QMP to discover what
vector lengths are supported and then provide an explicit map, I
think it's fair to say that the ability to specify a single maximum
vector length is going to be exclusively used as a convenience for
command line users.

In that light, I think it would be reasonable for the usage to look
along the lines of

  -cpu host,sve-vl-map=0xd # machine-friendly variant
  -cpu max,sve-vl-max=512  # user-friendly variant

with documentation clearly pointing out that the two options expect
completely different formats - but that was the case even before,
so we would have had to document that anyway.

> > > The QMP query returns a list of valid vq lists. For example, if
> > > a guest can use vqs 1, 2, 3, and 4, then the following list will
> > > be returned
> > > 
> > >  [ [ 1 ], [ 1, 2 ], [ 1, 2, 3 ], [ 1, 2, 3, 4 ] ]
> > > 
> > > Another example might be 1, 2, 4, as the architecture states 3
> > > is optional. In that case the list would be
> > > 
> > >  [ [ 1 ], [ 1, 2 ], [ 1, 2, 4 ] ]
> > 
> > I think the proposed QMP command is problematic, because it reports
> > the valid vector lengths for either KVM or TCG based on which
> > accelerator is currently enabled: it should report all information
> > at all times instead, similarly to how query-gic-capabilities does
> > it.
> 
> OK, and then with a flag stating which is when then.

Correct.

> Dave points out
> we may want to reduce the list to a single set and then add flags
> to indicate what can be done with it in order to derive other sets.
> What do you think about that?

So right now all that can be done is truncating the list by removing
an arbitrary number of elements from the end, right? Eg. if you have
[ 1, 2, 4 ] you can use either that or [ 1, 2 ] or [ 1 ]. But in the
future you might also be able to mask single elements in the middle
of the list, thus enabling things like [ 1, 4 ].

That doesn't sound very complicated to support in libvirt, though I
have to say that I'm not a big fan of this proposal because as far as
I can see it basically means implementing the very same logic twice,
once in QEMU and then once more in libvirt.

> > [...]
> > > And now for what might be a bit more controversial; how we input
> > > the valid vector set with sve-vls-map. Well, sve-vls-map is a
> > > 64-bit bitmap, which is admittedly not user friendly and also not
> > > the same size as KVM's vls bitmap (which is 8 64-bit words). Here's
> > > the justification:
> > > 
> > >  1) We want to use the QEMU command line in order for the information
> > > to be migrated without needing to add more VM state.
> > >  2) It's not easy/pretty to input arrays on the QEMU command line.
> > >  3) We already need to use the QMP query to get a valid set, which
> > > isn't user friendly either, meaning we're already in libvirt
> > > territory.
> > >  4) A 64-bit map (supporting up to 8192-bit vectors) is likely big
> > > enough for quite some time (currently KVM and TCG only support
> > > 2048-bit vectors).
> > >  5) If user friendliness is needed more than migratability then
> > > the 'max' cpu type can be used with the sve-max-vq property.
> > >  6) It's possible to probe the full valid vector set from the
> > > command line by using something like sve-vls-map=0xffffffffffffffff and
> > > then, when it fails, the error message will state the correct
> > > map, e.g. 0xb.

Re: [Qemu-devel] [PATCH 00/13] target/arm/kvm: enable SVE in guests

2019-05-13 Thread Andrea Bolognani
On Sun, 2019-05-12 at 10:36 +0200, Andrew Jones wrote:
[...]
>CPU type | accel | sve-max-vq | sve-vls-map
>---
>  1) max | tcg   |  $MAX_VQ   |  $VLS_MAP
>  2) max | kvm   |  $MAX_VQ   |  $VLS_MAP
>  3)host | kvm   |  N/A   |  $VLS_MAP
> 
> Where for (1) $MAX_VQ sets the maximum vq and smaller vqs are
> all supported when $VLS_MAP is zero, or when the vqs are selected
> in $VLS_MAP.

I'm a bit confused by the nomenclature here. VL clearly stands for
Vector Length, but what does VQ stand for? You seem to be using the
two terms pretty much interchangeably throughout the cover letter.

[...]
> There is never any need to provide both properties, but if both
> are provided then they are checked for consistency.

I would personally just error out when both are provided.

> The QMP query returns a list of valid vq lists. For example, if
> a guest can use vqs 1, 2, 3, and 4, then the following list will
> be returned
> 
>  [ [ 1 ], [ 1, 2 ], [ 1, 2, 3 ], [ 1, 2, 3, 4 ] ]
> 
> Another example might be 1, 2, 4, as the architecture states 3
> is optional. In that case the list would be
> 
>  [ [ 1 ], [ 1, 2 ], [ 1, 2, 4 ] ]

I think the proposed QMP command is problematic, because it reports
the valid vector lengths for either KVM or TCG based on which
accelerator is currently enabled: it should report all information
at all times instead, similarly to how query-gic-capabilities does
it.

[...]
> And now for what might be a bit more controversial; how we input
> the valid vector set with sve-vls-map. Well, sve-vls-map is a
> 64-bit bitmap, which is admittedly not user friendly and also not
> the same size as KVM's vls bitmap (which is 8 64-bit words). Here's
> the justification:
> 
>  1) We want to use the QEMU command line in order for the information
> to be migrated without needing to add more VM state.
>  2) It's not easy/pretty to input arrays on the QEMU command line.
>  3) We already need to use the QMP query to get a valid set, which
> isn't user friendly either, meaning we're already in libvirt
> territory.
>  4) A 64-bit map (supporting up to 8192-bit vectors) is likely big
> enough for quite some time (currently KVM and TCG only support
> 2048-bit vectors).
>  5) If user friendliness is needed more than migratability then
> the 'max' cpu type can be used with the sve-max-vq property.
>  6) It's possible to probe the full valid vector set from the
> command line by using something like sve-vls-map=0xffffffffffffffff and
> then, when it fails, the error message will state the correct
> map, e.g. 0xb.

I don't have a problem with having to use a bitmap internally,
though libvirt will clearly want to expose a more approachable
interface to users.

However, QMP reporting the information in the current format means
we'd have to build an additional parser on top of the bitmap handling
and conversion routines we'll clearly need to make this work; plus it
just feels weird that the information reported by QMP can't be used
on the command line without going through some tranformation first.

Wouldn't it make more sense to both accept and report bitmaps?

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [Qemu-devel] [PATCH] vl: fix -sandbox parsing crash when seccomp support is disabled

2019-04-29 Thread Andrea Bolognani
On Mon, 2019-04-29 at 15:47 +0200, Marc-André Lureau wrote:
> $ ./x86_64-softmmu/qemu-system-x86_64 -sandbox off
> qemu-system-x86_64: -sandbox off: There is no option group 'sandbox'
> Segmentation fault
> 
> Commit 5780760f5e ("seccomp: check TSYNC host capability") wrapped one
> use of the sandbox option group to produce a sensible error, it didn't
> do the same for another call to qemu_opts_parse_noisily():
> 
> (gdb) bt
> at util/qemu-option.c:829
>  #0  0x105b36d8 in opts_parse (list=0x0, params=0x3ab5 "off", 
> permit_abbrev=true, defaults=false, errp=0x3080)
>  at util/qemu-option.c:829
>  #1  0x105b3b74 in qemu_opts_parse_noisily (list=, 
> params=, permit_abbrev=) at 
> util/qemu-option.c:890
>  #2  0x10024964 in main (argc=, argv=, 
> envp=) at vl.c:3589
> 
> Fixes: 5780760f5ea6163939a5dabe7427318b4f07d1a2
> Cc: da...@gibson.dropbear.id.au
> Cc: ot...@redhat.com
> Signed-off-by: Marc-André Lureau 
> ---
>  vl.c | 18 ++
>  1 file changed, 10 insertions(+), 8 deletions(-)

This fixes the crash for me, so

  Tested-by: Andrea Bolognani 

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [Qemu-devel] [Qemu-ppc] [PATCH v7 0/2] spapr-rtas: add ibm, get-vpd RTAS interface

2019-04-09 Thread Andrea Bolognani
Apologies for taking this long to respond.

On Mon, 2019-04-08 at 14:27 +1000, David Gibson wrote:
> On Tue, Apr 02, 2019 at 12:28:07PM +0200, Greg Kurz wrote:
> > The recent fixes around "host-serial" and "host-model" simply moved
> > the decision to expose host data to the upper layer, ie. libvirt
> > which should be involved in this discussion.
> 
> Right, that's deliberate.  Note that roughly-equivalent information on
> x86 is currently supplied via the SMBIOS.  OpenStack Nova sets that,
> rather than qemu, and I'd like to move towards a common configuration
> model with x86, though it's a fairly long path to there.
> 
> OpenStack had an equivalent security problem to our one, which it
> addressed by taking the host serial from /etc/machine-id if present
> rather than the real host info.

IIUC the situation is a bit different between x86 and ppc64, because
while for the latter SPAPR defines a way for the guest to access
information about the host it's running on, that's not the case for
the former, at least to the best of my knowledge.

What OpenStack is doing is reading the machine-id (if explicitly
configured to do so: the default is to use the guest's own UUID[1])
and exposing that as the *guest* serial, not as the *host* serial.

From libvirt's point of view, the entire mechanism is entirely
optional, so unless the management layer explicitly asks it to set
a certain value for the serial, libvirt will simply pass no
information down to QEMU.

The relevant XML elements[2] are clearly modeled after x86, so I
wonder if Nova is setting them also on ppc64 and if so, what the
guest will ultimately see...
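
For reference, the configuration in question looks roughly like
this, a sketch with a made-up serial:

  <os>
    ...
    <smbios mode='sysinfo'/>
  </os>
  <sysinfo type='smbios'>
    <system>
      <entry name='serial'>0123-4567-89ab</entry>
    </system>
  </sysinfo>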

> > Cc'ing Andrea for expertise. Problem exposed below.
> > 
> > The pseries machine used to expose the content of the host's
> > /proc/device-tree/system-id and /proc/device-tree/model in the guest
> > DT. This led to a CVE and QEMU doesn't do that anymore for new machine
> > types. Instead, two new properties where added to the pseries machine:
> > 
> > pseries-4.0.host-serial=string (Host serial number to advertise in guest 
> > device tree)
> > pseries-4.0.host-model=string (Host model to advertise in guest device tree)
> > 
> > It is up to the caller to pass something... which may be anything,
> > including something like $(cat /proc/device-tree/system-id) or
> > randomly generated.

What happens if the caller doesn't provide any value? Will QEMU come
up with something itself?

Adding a few extra knobs in the vein as the existing ones sounds like
a fairly reasonable idea. It will still be up to the management layer
to actually provide the values.
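
That could look something like this (a sketch, with an obviously
made-up value):

  $ qemu-system-ppc64 -M pseries-4.0,host-serial=0123-4567-89ab ...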

> > Is there a chance libvirt can be taught to pass a different string
> > to the target QEMU in case of migration ?

libvirt already supports providing a different XML to the target
host, so changing a couple values should be no big deal.


As a final note, unless I've gotten it wrong and x86 actually *does*
provide a way for the guest to figure out its host's serial, then any
software relying on the attributes defined by SPAPR is ultimately not
portable to non-ppc64 hardware and should probably be rearchitected
to go through the management layer, as Daniel was also suggesting
earlier in the thread.


[1] 
https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L364-L372
[2] https://libvirt.org/formatdomain.html#elementsSysinfo
-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [Qemu-devel] [Qemu-riscv] [PATCH v1 1/1] riscv: plic: Set msi_nonbroken as true

2019-03-21 Thread Andrea Bolognani
On Mon, 2019-03-18 at 02:31 -0700, Palmer Dabbelt wrote:
> On Mon, 18 Mar 2019 01:39:46 PDT (-0700), pbonz...@redhat.com wrote:
> > On 15/03/19 21:05, Alistair Francis wrote:
> > > Set msi_nonbroken as true for the PLIC.
> > > 
> > > According to the comment located here:
> > > https://git.qemu.org/?p=qemu.git;a=blob;f=hw/pci/msi.c;h=47d2b0f33c664533b8dbd5cb17faa8e6a01afe1f;hb=HEAD#l38
> > > the msi_nonbroken variable should be set to true even if they don't
> > > support MSI. In this case that is what we are doing as we don't support
> > > MSI.
> > 
> > I can queue this patch, and add the "select MSI" to CONFIG_SIFIVE.
> 
> Works for me.  Thanks!

Just so we're on the same page, are you targeting this at 4.0.0?
If it gets merged in the next few days I can probably get the
corresponding libvirt patches in before our own freeze starts.

It would be great if we could make it so guests created with
QEMU 4.0.0 + libvirt 5.2.0 get PCI by default :)

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [Qemu-devel] [PATCH v1 1/1] riscv: plic: Set msi_nonbroken as true

2019-03-18 Thread Andrea Bolognani
On Fri, 2019-03-15 at 20:05 +, Alistair Francis wrote:
> Set msi_nonbroken as true for the PLIC.
> 
> According to the comment located here:
> https://git.qemu.org/?p=qemu.git;a=blob;f=hw/pci/msi.c;h=47d2b0f33c664533b8dbd5cb17faa8e6a01afe1f;hb=HEAD#l38
> the msi_nonbroken variable should be set to true even if they don't
> support MSI. In this case that is what we are doing as we don't support
> MSI.
> 
> Signed-off-by: Alistair Francis 
> Reported-by: Andrea Bolognani 
> Reported-by: David Abdurachmanov 
> ---
> This should allow working pcie-root-ports in QEMU and allow libvirt
> to start using PCIe by default for RISC-V guests.
> 
> hw/riscv/sifive_plic.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/hw/riscv/sifive_plic.c b/hw/riscv/sifive_plic.c
> index d12ec3fc9a..4b0537c912 100644
> --- a/hw/riscv/sifive_plic.c
> +++ b/hw/riscv/sifive_plic.c
> @@ -22,6 +22,7 @@
>  #include "qemu/log.h"
>  #include "qemu/error-report.h"
>  #include "hw/sysbus.h"
> +#include "hw/pci/msi.h"
>  #include "target/riscv/cpu.h"
>  #include "hw/riscv/sifive_plic.h"
>  
> @@ -443,6 +444,8 @@ static void sifive_plic_realize(DeviceState *dev, Error 
> **errp)
>  plic->enable = g_new0(uint32_t, plic->bitfield_words * plic->num_addrs);
>  sysbus_init_mmio(SYS_BUS_DEVICE(dev), &plic->mmio);
>  qdev_init_gpio_in(dev, sifive_plic_irq_request, plic->num_sources);
> +
> +msi_nonbroken = true;
>  }
>  
>  static void sifive_plic_class_init(ObjectClass *klass, void *data)

With this patch applied, I was able to bring up a riscv64/virt guest
with graphics, using PCIe devices only:

  https://imgur.com/a/taN06hE

Tested-by: Andrea Bolognani 

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [Qemu-devel] [PATCH v1 1/1] riscv: plic: Set msi_nonbroken as true

2019-03-18 Thread Andrea Bolognani
On Mon, 2019-03-18 at 09:39 +0100, Paolo Bonzini wrote:
> On 15/03/19 21:05, Alistair Francis wrote:
> > Set msi_nonbroken as true for the PLIC.
> > 
> > According to the comment located here:
> > https://git.qemu.org/?p=qemu.git;a=blob;f=hw/pci/msi.c;h=47d2b0f33c664533b8dbd5cb17faa8e6a01afe1f;hb=HEAD#l38
> > the msi_nonbroken variable should be set to true even if they don't
> > support MSI. In this case that is what we are doing as we don't support
> > MSI.
> > 
> > Signed-off-by: Alistair Francis 
> > Reported-by: Andrea Bolognani 
> > Reported-by: David Abdurachmanov 
> > ---
> > This should allow working pcie-root-ports in QEMU and allow libvirt
> > to start using PCIe by default for RISC-V guests.
> > 
> > hw/riscv/sifive_plic.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/hw/riscv/sifive_plic.c b/hw/riscv/sifive_plic.c
> > index d12ec3fc9a..4b0537c912 100644
> > --- a/hw/riscv/sifive_plic.c
> > +++ b/hw/riscv/sifive_plic.c
> > @@ -22,6 +22,7 @@
> >  #include "qemu/log.h"
> >  #include "qemu/error-report.h"
> >  #include "hw/sysbus.h"
> > +#include "hw/pci/msi.h"
> >  #include "target/riscv/cpu.h"
> >  #include "hw/riscv/sifive_plic.h"
> >  
> > @@ -443,6 +444,8 @@ static void sifive_plic_realize(DeviceState *dev, Error **errp)
> >      plic->enable = g_new0(uint32_t, plic->bitfield_words * plic->num_addrs);
> >      sysbus_init_mmio(SYS_BUS_DEVICE(dev), &plic->mmio);
> >      qdev_init_gpio_in(dev, sifive_plic_irq_request, plic->num_sources);
> > +
> > +    msi_nonbroken = true;
> >  }
> >  
> >  static void sifive_plic_class_init(ObjectClass *klass, void *data)
> 
> I can queue this patch, and add the "select MSI" to CONFIG_SIFIVE.

The interrupt controller is used by the virt machine type too IIUC,
so the same should be added to CONFIG_RISCV_VIRT I think.
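
Concretely, I'd expect something along these lines in hw/riscv/Kconfig
(a sketch only - the exact symbol names might differ):

  config SIFIVE
      bool
      select MSI

  config RISCV_VIRT
      bool
      select MSI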

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [Qemu-devel] [PATCH] ci: store Patchew configuration in the tree

2019-03-15 Thread Andrea Bolognani
On Fri, 2019-03-15 at 10:19 +0100, Paolo Bonzini wrote:
[...]
> +s390x:
> +  enabled: false
> +  requirements: ppcle

This doesn't look right :)

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [Qemu-devel] [PATCH 0/2] kconfig: add fine-grained dependencies for MSI

2019-03-15 Thread Andrea Bolognani
On Thu, 2019-03-14 at 15:30 +0100, Paolo Bonzini wrote:
> RISC-V targets did not include PCIe ports before the Kconfig transition,
> and grew them afterwards, but they are nonfunctional because the interrupt
> controller does not support MSI.  This series restores the situation prior
> to the introduction of Kconfig; in fact, it will automatically drop devices
> that require MSI unless the binary includes an MSI-enabled board.
> 
> Paolo
> 
> Paolo Bonzini (2):
>   kconfig: add CONFIG_MSI
>   kconfig: add dependencies on CONFIG_MSI
> 
>  Kconfig.host  | 3 +++
>  Makefile  | 3 ++-
>  hw/Kconfig| 1 +
>  hw/intc/Kconfig   | 3 +++
>  hw/misc/Kconfig   | 4 ++--
>  hw/net/Kconfig| 4 ++--
>  hw/pci-bridge/Kconfig | 6 +++---
>  hw/pci-host/Kconfig   | 1 +
>  hw/pci/Kconfig| 4 
>  hw/ppc/Kconfig| 1 +
>  hw/rdma/Kconfig   | 3 +++
>  hw/rdma/Makefile.objs | 6 ++
>  hw/s390x/Kconfig  | 1 +
>  13 files changed, 28 insertions(+), 12 deletions(-)
>  create mode 100644 hw/rdma/Kconfig

FWIW

  Tested-by: Andrea Bolognani 

Thanks for addressing this :)
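
For the record, my understanding of the mechanism (a sketch based on
the cover letter, not the actual hunks) is that devices needing MSI
now carry a dependency such as

  config PCIE_PORT
      bool
      default y if PCI_DEVICES
      depends on MSI

so binaries whose boards never select MSI automatically lose
pcie-root-port and friends, matching the pre-Kconfig behavior.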

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [Qemu-devel] [PULL 15/54] build: convert pci.mak to Kconfig

2019-03-14 Thread Andrea Bolognani
On Mon, 2019-03-04 at 19:19 +0100, Paolo Bonzini wrote:
> Instead of including the same list of devices for each target,
> set CONFIG_PCI to true, and make the devices default to present
> whenever PCI is available.  However, s390x does not want all the
> PCI devices, so there is a separate symbol to enable them.
[...]
> +++ b/default-configs/riscv32-softmmu.mak
> @@ -1,8 +1,8 @@
>  # Default configuration for riscv-softmmu
>  
> -include pci.mak
>  include usb.mak
> -
> +CONFIG_PCI=y
> +CONFIG_PCI_DEVICES=y
>  CONFIG_SERIAL=y
>  CONFIG_VIRTIO_MMIO=y

I *think* this might have caused some unwanted changes for RISC-V.

pcie-root-port is built into qemu-system-riscv64 by default as of
dbbc277510aa (along with ioh3420!), but if you actually try to use it
you'll get:

  $ ./riscv64-softmmu/qemu-system-riscv64 \
      -M virt \
      -device pcie-root-port
  qemu-system-riscv64: -device pcie-root-port: MSI-X is not
    supported by interrupt controller

This is a limitation we have been aware of, and the plan was to
enable the device in QEMU only once it had been addressed: from the
libvirt side, the availability of the device would have meant that it
was safe to use. If the device is enabled in QEMU before it can
actually be used, detection on the libvirt side becomes problematic.
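
(For context, libvirt-style probing asks QMP which device types exist,
along the lines of

  { "execute": "qom-list-types",
    "arguments": { "implements": "pci-device", "abstract": false } }

so a device that is compiled in but unusable on the target machine is
indistinguishable from a working one.)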

I haven't spent time digging further - and I'm not familiar enough
with the QEMU build system anyway O:-) - but I wouldn't be surprised
if the same happened for other architectures, too.

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [Qemu-devel] [libvirt] [PATCH for-4.0 v4 0/2] virtio: Provide version-specific variants of virtio PCI devices

2019-03-06 Thread Andrea Bolognani
On Wed, 2019-03-06 at 09:30 +0100, Ján Tomko wrote:
> On Wed, Mar 06, 2019 at 08:41:48AM +0100, Peter Krempa wrote:
> > On Tue, Mar 05, 2019 at 16:56:43 +0100, Andrea Bolognani wrote:
> > > So I agree neither scenario is exactly perfect, but I still think
> > > adding non-transitional alias devices would overall be more
> > > user-friendly.
> > 
> > I don't think it makes sense to add it at the qemu level. From libvirt's
> > point of view users should be shielded from any qemu impl detail or
> > inconsistency as libvirt is the 'user friendly'[1] layer. In qemu the
> > devices would be the same and thus does not make sense to do that
> > because it would be more confusing.
> > 
> > You can argue that we should add the alias at the libvirt level though.
> 
> You can, but please don't.

It would seem nobody except me thinks this is a good idea, so I
guess I'll just drop it now.

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [Qemu-devel] [PATCH for-4.0 v4 0/2] virtio: Provide version-specific variants of virtio PCI devices

2019-03-05 Thread Andrea Bolognani
On Tue, 2019-03-05 at 15:38 +0100, Gerd Hoffmann wrote:
>   Hi,
> 
> >   -device virtio-blk-pci-non-transitional \
> >   -device virtio-net-pci-non-transitional \
> >   -device virtio-gpu-pci-non-transitional \
> > 
> > and you wouldn't have to question why you can use the
> > non-transitional variant for pretty much everything, except for the
> > few cases where you can't - for no apparent reason...
> 
> Well, there are no variants, only a single virtio-$foo-pci* device.  So
> you don't have to worry about picking one of the available variants,
> there is no choice in the first place.
> 
> When adding a virtio-gpu-pci-non-transitional variant we'll create
> confusion too, because it wouldn't be a real variant.  We would have two
> 100% identical devices then, and people will probably wonder why they
> exist and what the difference is ...

When looking at a single device, I mostly agree with your assessment;
however, when looking at the overall situation with VirtIO devices,
one might quite reasonably infer the following rules:

  * devices marked as (non-)transitional are going to show up as
    (non-)transitional;

  * unmarked devices might show up as either one, depending on some
    factor which is not immediately obvious.

So if you knew you wanted non-transitional devices you would expect
to just use the non-transitional variant for *all* VirtIO devices,
including virtio-gpu, without necessarily caring whether the unmarked
devices behave any differently; if you tried to use the transitional
device, you'd get an error message telling you that device doesn't
exist, which is pretty reasonable and easy to research / understand.
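
Something like this (illustrative invocation - the exact wording of
the error may differ):

  $ qemu-system-x86_64 -device virtio-gpu-pci-transitional
  qemu-system-x86_64: -device virtio-gpu-pci-transitional:
    'virtio-gpu-pci-transitional' is not a valid device model name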

With the current situation, once you've tried using non-transitional
virtio-gpu and gotten back an error message, there's quite a bit more
digging required to figure out *why* the device is not there in the
first place.

So I agree neither scenario is exactly perfect, but I still think
adding non-transitional alias devices would overall be more
user-friendly.
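
(IIUC the named variants are essentially presets over the existing
disable-legacy / disable-modern properties, i.e. the non-transitional
case boils down to something like

  -device virtio-blk-pci,disable-legacy=on,disable-modern=off

so an alias for virtio-gpu really would be 100% identical to the plain
device - which is exactly why I'd expect it to be cheap to provide.)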

> So I can't see how this would be so much better.  We have to document
> the mess no matter what.

We have some documentation in libvirt:

  https://libvirt.org/formatdomain.html#elementsVirtioTransitional

Not that more / improved documentation is ever a bad idea :)

-- 
Andrea Bolognani / Red Hat / Virtualization




Re: [Qemu-devel] [PATCH for-4.0 v4 0/2] virtio: Provide version-specific variants of virtio PCI devices

2019-03-05 Thread Andrea Bolognani
Sorry to resurrect such an old thread, but I have been wondering...

On Wed, 2018-12-05 at 17:57 -0200, Eduardo Habkost wrote:
[...]
> Changes v1 -> v2:
> * Removed *-0.9 devices.  Nobody will want to use them, if
>   transitional devices work with legacy drivers
>   (Gerd Hoffmann, Michael S. Tsirkin)
> * Drop virtio version from name: rename -1.0-transitional to
>   -transitional (Michael S. Tsirkin)
> * Renamed -1.0 to -non-transitional
> * Don't add any extra variants to modern-only device types
>   (they don't need it)

... if doing this was a good idea after all?

While I understand that something like virtio-gpu, which supports
the 1.0 specification exclusively, only really needs a single
device associated with it from a functionality point of view, from
a user's perspective it seems to me that providing an explicit
non-transitional variant would be appropriate for consistency
reasons, so that your guest configuration could look like

  -device virtio-blk-pci-non-transitional \
  -device virtio-net-pci-non-transitional \
  -device virtio-gpu-pci-non-transitional \

and you wouldn't have to question why you can use the
non-transitional variant for pretty much everything, except for the
few cases where you can't - for no apparent reason...

It would also signal quite clearly which devices support both
transitional and non-transitional variants and which ones don't,
without users having to infer that the complete lack of
(non-)transitional variants means only non-transitional behavior is
available - and that they have to use the suffix-less device name to
get it.

tl;dr providing the non-transitional variant for virtio 1.0-only
  devices would make using them much more user-friendly.

-- 
Andrea Bolognani / Red Hat / Virtualization



