[PATCH 2/7] qapi: Fix comment indentation
It should start on the very first column.

Signed-off-by: Andrea Bolognani
---
 qapi/ui.json | 30 +++++++++++++++---------------
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/qapi/ui.json b/qapi/ui.json
index 059302a5ef..43e62efd76 100644
--- a/qapi/ui.json
+++ b/qapi/ui.json
@@ -1250,21 +1250,21 @@
              '*p2p': 'bool',
              '*audiodev': 'str' } }
 
-  ##
-  # @DisplayGLMode:
-  #
-  # Display OpenGL mode.
-  #
-  # @off: Disable OpenGL (default).
-  # @on: Use OpenGL, pick context type automatically.
-  #      Would better be named 'auto' but is called 'on' for backward
-  #      compatibility with bool type.
-  # @core: Use OpenGL with Core (desktop) Context.
-  # @es: Use OpenGL with ES (embedded systems) Context.
-  #
-  # Since: 3.0
-  #
-  ##
+##
+# @DisplayGLMode:
+#
+# Display OpenGL mode.
+#
+# @off: Disable OpenGL (default).
+# @on: Use OpenGL, pick context type automatically.
+#      Would better be named 'auto' but is called 'on' for backward
+#      compatibility with bool type.
+# @core: Use OpenGL with Core (desktop) Context.
+# @es: Use OpenGL with ES (embedded systems) Context.
+#
+# Since: 3.0
+#
+##
 { 'enum': 'DisplayGLMode',
   'data': [ 'off', 'on', 'core', 'es' ] }
--
2.35.1
Re: [RFC PATCH v1 0/8] qapi: add generator for Golang interface
On Thu, Apr 28, 2022 at 03:50:55PM +0200, Markus Armbruster wrote:
> Andrea Bolognani writes:
> > One concern that I have is about naming struct members: things like
> > SpiceInfo.MouseMode and most others are translated from the QAPI
> > schema exactly the way you'd expect them, but for example
> > ChardevCommon.Logappend doesn't look quite right.
>
> It doesn't look quite right in the QAPI schema, either: @logappend. If
> it was @log-append, as it should, then it would get translated to
> LogAppend, I guess.
>
> Fixing up style isn't a code generator's job.

I agree that the generator shouldn't take too many liberties when
translating names, and specifically should never attempt to figure out
that @logappend should have been @log-append instead.

What I was thinking of was more along the lines of: can we change the
schema so that the proper name is available to the generator without
breaking the wire protocol? Maybe something like

    ##
    # ChardevCommon:
    #
    # @logappend (rename @log-append): ...
    ##

That way the generator would have access to both pieces of
information, and would thus be able to generate

    type ChardevCommon struct {
        LogAppend *bool `json:"logappend,omitempty"`
    }

The wire protocol would still retain the unappealing name, but at
least client libraries could hide the ugliness from users.

> > Same for the various
> > structs or members that have unexpectedly-capitalized "Tls" or "Vnc"
> > in them.
>
> Examples?

A perfect one is TlsCredsProperties, whose endpoint member is of type
QCryptoTLSCredsEndpoint. On the VNC front, we have
SetPasswordOptionsVnc but also DisplayReloadOptionsVNC. There's plenty
more, but this should be illustrative enough already.

Capitalization of these acronyms is inconsistent across the schema,
with one of the two forms disagreeing with Go naming expectations. In
this case we might be able to just change the schema without
introducing backwards compatibility issues, though? Type names are not
actually transmitted on the wire IIUC.
> > but maybe
> > there's room for adding this kind of information in the form of
> > additional annotations or something like that?
>
> We did for enumeration types: 'prefix' overrides the TYPE_NAME prefix.
> I fear this was a mistake.

This might be an oversimplification, but I think we might be able to
solve all of these issues with a single annotation in the form

    namespace:word-MLA-other-words

So for example QCryptoTLSCredsEndpoint would be annotated with

    q:crypto-TLS-creds-endpoint

and each generator would have enough information to produce
identifiers that fit into the corresponding language, such as

    qcrypto_tls_creds_endpoint
    CryptoTlsCredsEndpoint

or whatever else.

Of course such annotations would only be necessary to deal with
identifiers that are not already following the expected naming
conventions and when MLAs are involved.

--
Andrea Bolognani / Red Hat / Virtualization
Re: [PATCH] qapi: Fix malformed "Since:" section tags
On Fri, Apr 22, 2022 at 03:28:07PM +0200, Markus Armbruster wrote:
> "Since X.Y" is not recognized as a tagged section, and therefore not
> formatted as such in generated documentation. Fix by adding the
> required colon.
>
> Signed-off-by: Markus Armbruster
> ---
>  qapi/crypto.json     | 3 +--
>  qapi/machine.json    | 2 +-
>  qapi/misc.json       | 2 +-
>  qga/qapi-schema.json | 2 +-
>  4 files changed, 4 insertions(+), 5 deletions(-)

Reviewed-by: Andrea Bolognani

--
Andrea Bolognani / Red Hat / Virtualization
Re: Create qemu-project/py-qemu.qmp repo
On Thu, Apr 21, 2022 at 05:00:16PM -0400, John Snow wrote:
> On Thu, Apr 21, 2022, 2:00 PM Andrea Bolognani wrote:
> > I think I would go with "python-qemu-qmp". Having a dot in the name
> > of a git repo is not very common AFAICT, and I wouldn't rule out the
> > possibility of some GitLab feature or other tooling breaking or
> > misbehaving because of it.
>
> The idea is to have the repo name resemble the Python package name,
> which is "qemu.qmp". For Python, it's customary to have the package
> name match the import name. The import name is "qemu.qmp".
>
> I tested this name on GitLab and it appears to work just fine.

I'm concerned about issues that you'd only trigger when using certain,
perhaps less common, features. Here's an example of such an issue from
just a year ago:

  https://gitlab.com/gitlab-org/gitlab/-/issues/224669

There's an epic tracking more issues of the same kind, though
admittedly most were addressed four years ago:

  https://gitlab.com/groups/gitlab-org/-/epics/3740

Up to you whether you feel confident enough that you're not going to
run into issues later.

> > If you're really keen on saving those few extra keystrokes, maybe
> > "pyqemu" is a better prefix than "py-qemu"? I don't know, it just
> > looks more natural to me.
>
> I'd add "py:" as a prefix, but the colon doesn't work as a filename in
> many places, so I suggested "py-".
>
> Thus, all together, "py-qemu.qmp".
>
> (I could spell out "python", I just prefer the shorter prefix because
> it's explanatory enough as-is and I like keeping git checkout names
> short. My favorite color of bike shed is blue.)

You can absolutely have short names locally even when things are
spelled out in GitLab.

Anyway, in this case my taste in names is clearly simply different
from yours and you should absolutely feel free to ignore my opinion
on the matter :)

--
Andrea Bolognani / Red Hat / Virtualization
Re: Create qemu-project/py-qemu.qmp repo
On Thu, Apr 21, 2022 at 12:46:48PM -0400, John Snow wrote:
> Hi Alex: do you have the ability to create a blank/empty "py-qemu.qmp"
> repo under the qemu-project grouping, and add me and Cleber as
> maintainers for it? There weren't any objections when I floated the
> idea [1].
>
> (Though I suggested "py-qemu.qmp" and Dan suggested "python-qemu.qmp".
> I don't think we explicitly reconciled the difference. I like the
> shorter one.)

Since you CC'd me to this message, I'm going to assume you're
explicitly welcoming my input on what specific shade this bikeshed
should be painted ;)

I think I would go with "python-qemu-qmp". Having a dot in the name
of a git repo is not very common AFAICT, and I wouldn't rule out the
possibility of some GitLab feature or other tooling breaking or
misbehaving because of it.

If you're really keen on saving those few extra keystrokes, maybe
"pyqemu" is a better prefix than "py-qemu"? I don't know, it just
looks more natural to me.

--
Andrea Bolognani / Red Hat / Virtualization
[PATCH 2/2] docs: build-platforms: Clarify stance on minor releases and backports
These changes match those made in the following libvirt commits:

  2ac78307af docs: Clarify our stance on backported packages
  78cffd450a docs: Spell out our policy concerning minor releases

Since QEMU's platform support policy is based on libvirt's, it
makes sense to mirror these recent changes made to the latter.

The policy is not altered significantly - we're simply spelling
out some rules that were likely already being implicitly
enforced.

Signed-off-by: Andrea Bolognani
---
 docs/about/build-platforms.rst | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/docs/about/build-platforms.rst b/docs/about/build-platforms.rst
index e9163ba556..dce6557bd4 100644
--- a/docs/about/build-platforms.rst
+++ b/docs/about/build-platforms.rst
@@ -71,7 +71,10 @@ The project aims to support the most recent major version at all times. Support
 for the previous major version will be dropped 2 years after the new major
 version is released or when the vendor itself drops support, whichever comes
 first. In this context, third-party efforts to extend the lifetime of a distro
-are not considered, even when they are endorsed by the vendor (eg. Debian LTS).
+are not considered, even when they are endorsed by the vendor (eg. Debian LTS);
+the same is true of repositories that contain packages backported from later
+releases (e.g. Debian backports). Within each major release, only the most
+recent minor release is considered.
 
 For the purposes of identifying supported software versions available on Linux,
 the project will look at CentOS, Debian, Fedora, openSUSE, RHEL, SLES and
--
2.35.1
Re: [PATCH 2/2] docs: build-platforms: Clarify stance on minor releases and backports
On Wed, Apr 20, 2022 at 05:15:08PM +0100, Daniel P. Berrangé wrote:
> On Wed, Apr 20, 2022 at 06:03:11PM +0200, Andrea Bolognani wrote:
> > These changes match those made in the following libvirt commits:
> >
> >   2ac78307af docs: Clarify our stance on backported packages
> >   78cffd450a docs: Spell out our policy concerning minor releases
> >
> > Since QEMU's platform support policy is based on libvirt's, it
> > makes sense to mirror these recent changes made to the latter.
> >
> > The policy is not altered significantly - we're simply spelling
> > out some rules that were likely already being implicitly
> > enforced.
>
> Indeed, I think that's basically de facto the case already.
>
> Reviewed-by: Daniel P. Berrangé

Thanks! Are you going to bring these in through one of your trees, or
do I need to bug someone else so that they will pick them up? :)

--
Andrea Bolognani / Red Hat / Virtualization
[PATCH 1/2] docs: build-platforms: Fix spelling for Homebrew
Signed-off-by: Andrea Bolognani
---
 docs/about/build-platforms.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/about/build-platforms.rst b/docs/about/build-platforms.rst
index c29a4b8fe6..e9163ba556 100644
--- a/docs/about/build-platforms.rst
+++ b/docs/about/build-platforms.rst
@@ -80,7 +80,7 @@ Ubuntu LTS. Other distros will be assumed to ship similar software versions.
 For FreeBSD and OpenBSD, decisions will be made based on the contents of the
 respective ports repository, while NetBSD will use the pkgsrc repository.
 
-For macOS, `HomeBrew`_ will be used, although `MacPorts`_ is expected to carry
+For macOS, `Homebrew`_ will be used, although `MacPorts`_ is expected to carry
 similar versions.
 
 Windows
@@ -92,6 +92,6 @@ hosted on Linux (Debian/Fedora).
 The version of the Windows API that's currently targeted is Vista / Server
 2008.
 
-.. _HomeBrew: https://brew.sh/
+.. _Homebrew: https://brew.sh/
 .. _MacPorts: https://www.macports.org/
 .. _Repology: https://repology.org/
--
2.35.1
[PATCH 0/2] docs: build-platforms: Fix and clarify
Andrea Bolognani (2):
  docs: build-platforms: Fix spelling for Homebrew
  docs: build-platforms: Clarify stance on minor releases and backports

 docs/about/build-platforms.rst | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

--
2.35.1
[PATCH 3/3] qapi: Fix typo
Signed-off-by: Andrea Bolognani
---
 qapi/sockets.json | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/qapi/sockets.json b/qapi/sockets.json
index 5773d9fcc4..fccc38584b 100644
--- a/qapi/sockets.json
+++ b/qapi/sockets.json
@@ -149,7 +149,7 @@
 #
 # Note: This type is deprecated in favor of SocketAddress. The
 # difference between SocketAddressLegacy and SocketAddress is that the
-# latter is has fewer {} on the wire.
+# latter has fewer {} on the wire.
 #
 # Since: 1.3
 ##
--
2.35.1
[PATCH 2/3] qapi: Fix documentation for query-xen-replication-status
The correct return type is ReplicationStatus, not ReplicationResult.

Signed-off-by: Andrea Bolognani
---
 qapi/migration.json | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/qapi/migration.json b/qapi/migration.json
index 27d7b28158..409eb086a2 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -1619,7 +1619,7 @@
 #
 # Query replication status while the vm is running.
 #
-# Returns: A @ReplicationResult object showing the status.
+# Returns: A @ReplicationStatus object showing the status.
 #
 # Example:
 #
--
2.35.1
[PATCH 1/3] docs: qapi: Remove outdated reference to simple unions
Commit 4e99f4b12c0e dropped simple unions and updated most
documentation accordingly, but in one case we still claim that
there are "two flavors of unions".

Signed-off-by: Andrea Bolognani
---
 docs/devel/qapi-code-gen.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/devel/qapi-code-gen.rst b/docs/devel/qapi-code-gen.rst
index 246709ede8..7b968433a6 100644
--- a/docs/devel/qapi-code-gen.rst
+++ b/docs/devel/qapi-code-gen.rst
@@ -41,8 +41,8 @@ used internally.
 
 There are several kinds of types: simple types (a number of built-in
 types, such as ``int`` and ``str``; as well as enumerations), arrays,
-complex types (structs and two flavors of unions), and alternate types
-(a choice between other types).
+complex types (structs and unions), and alternate types (a choice
+between other types).
 
 Schema syntax
--
2.35.1
[PATCH 0/3] qapi: Random small fixes
Andrea Bolognani (3):
  docs: qapi: Remove outdated reference to simple unions
  qapi: Fix documentation for query-xen-replication-status
  qapi: Fix typo

 docs/devel/qapi-code-gen.rst | 4 ++--
 qapi/migration.json          | 2 +-
 qapi/sockets.json            | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

--
2.35.1
Re: [RFC PATCH v1 0/8] qapi: add generator for Golang interface
On Tue, Apr 19, 2022 at 11:12:28AM -0700, Andrea Bolognani wrote:
> Dealing with errors and commands that don't have a return value might
> require us to have a generic CommandResult wrapper after all, but we
> should really try as hard as we can to stick to type safe interfaces.

On second thought, this wouldn't actually need to be generic: we could
have something like

    type TraceEventGetStateResult struct {
        Result TraceEventInfo `json:"return"`
        Error  *Error         `json:"error"`
    }

and the caller would check that res.Error is nil before accessing
res.Result.

Commands for which a return value is not expected would just have the
Error part in their corresponding Result struct, and those that can
either return an object or nothing (are there actually any like that?)
could have a pointer as the Result member.

--
Andrea Bolognani / Red Hat / Virtualization
Re: [RFC PATCH v1 0/8] qapi: add generator for Golang interface
On Sat, Apr 02, 2022 at 12:40:56AM +0200, Victor Toso wrote:
> Thanks for taking a look, let me know if you have questions, ideas
> or suggestions.

Full disclosure: I have only given the actual implementation a very
cursory look so far, and I've focused on the generated Go API instead.

Overall things look pretty good.

One concern that I have is about naming struct members: things like
SpiceInfo.MouseMode and most others are translated from the QAPI
schema exactly the way you'd expect them, but for example
ChardevCommon.Logappend doesn't look quite right. Of course there's no
way to programmatically figure out what to capitalize, but maybe
there's room for adding this kind of information in the form of
additional annotations or something like that? Same for the various
structs or members that have unexpectedly-capitalized "Tls" or "Vnc"
in them.

To be clear, I don't think the above is a blocker - just something to
be aware of, and think about.

My biggest concern is about the interface offered for commands. Based
on the example you have in the README and how commands are defined,
invoking (a simplified version of) the trace-event-get-state command
would look like

    cmd := Command{
        Name: "trace-event-get-state",
        Arg: TraceEventGetStateCommand{
            Name: "qemu_memalign",
        },
    }

    qmp_input, _ := json.Marshal(&cmd)
    // qmp_input now contains
    //   {"execute":"trace-event-get-state","arguments":{"name":"qemu_memalign"}}
    // do something with it

    qmp_output := ([]byte)(`{"return":{"name":"qemu_memalign","state":"disabled"}}`)
    ret := cmd.GetReturnType()
    _ = json.Unmarshal(qmp_output, &ret)
    // ret is a CommandResult instance whose Value member can be cast
    // to a TraceEventInfo struct

First of all, from an application's point of view there are way too
many steps involved: performing this operation should really be as
simple as

    ret, _ := qmp.TraceEventGetState("qemu_memalign")
    // ret is a TraceEventInfo instance

That's the end state we should be working towards.
Of course that assumes that the "qmp" object knows where the QMP
socket is, knows how to talk the QMP protocol, transparently deals
with serializing and deserializing data... Plus, in some cases you
might want to deal with the wire transfer yourself in an
application-specific manner. So it makes sense to have the basic
building blocks available and then build the more ergonomic SDK on
top of that - with only the first part being in scope for this
series.

Even with that in mind, the current interface is IMO problematic
because of its almost complete lack of type safety. Both Command.Arg
and CommandResult.Value are of type Any, and CommandBase.Name, which
is used to drive the JSON unmarshal logic as well as ending up on the
wire when executing a command, is just a plain string.

I think the low-level interface should look more like

    cmd := TraceEventGetStateCommand{
        Name: "qemu_memalign",
    }

    qmp_input, _ := json.Marshal(&cmd)
    // qmp_input looks the same as before

    qmp_output := ([]byte)(`{"return":{"name":"qemu_memalign","state":"disabled"}}`)
    ret := TraceEventInfo{}
    _ = json.Unmarshal(qmp_output, &ret)
    // ret is a TraceEventInfo instance

The advantages over the current implementation are that the compiler
will prevent you from doing something silly like passing the wrong
set of arguments to a command, and that the application has to
explicitly spell out what kind of object it expects to get as output.

I'm attaching an incomplete implementation that I used for playing
around. It's obviously too simplistic, but hopefully it will help
illustrate my point.

Dealing with errors and commands that don't have a return value might
require us to have a generic CommandResult wrapper after all, but we
should really try as hard as we can to stick to type safe interfaces.
--
Andrea Bolognani / Red Hat / Virtualization

package main

import (
    "encoding/json"
    "fmt"
)

type TraceEventGetStateCommand struct {
    Name string `json:"name"`
}

func (self *TraceEventGetStateCommand) MarshalJSON() ([]byte, error) {
    type Arguments TraceEventGetStateCommand
    return json.Marshal(&struct {
        Execute   string     `json:"execute"`
        Arguments *Arguments `json:"arguments"`
    }{
        Execute:   "trace-event-get-state",
        Arguments: (*Arguments)(self),
    })
}

type TraceEventInfo struct {
    Name  string `json:"name"`
    State string `json:"state"`
}

func (self *TraceEventInfo) UnmarshalJSON(data []byte) error {
    type Return TraceEventInfo
    ret := struct {
        Return Return `json:"return"
Re: [PATCH 01/10] python/aqmp: add explicit GPLv2 license to legacy.py
On Thu, Mar 24, 2022 at 09:03:07AM +0000, Daniel P. Berrangé wrote:
> > Overall making it *all* GPLv2+ compat is going to be important if you
> > want people to be comfortable using it. If it has a mix of GPLv2+
> > and GPLv2-only code in the source tarball, then the overall combined
> > work will have to be considered GPLv2-only and that will put people
> > off using it. Even if they could theoretically restrict their usage
> > to only the GPLv2+ parts, many won't get that far before moving on.

Agreed.

> Actually I'll go further and suggest that if we're going to do a
> relicensing at all, and your goal is to encourage usage, then GPLv2+
> is the wrong choice. Use LGPLv2+ if you want to facilitate usage,
> while retaining a copyleft license.

Does LGPL make sense in the context of Python, where there is no
linking?

--
Andrea Bolognani / Red Hat / Virtualization
Re: qemu crash 100% CPU with Ubuntu10.04 guest (solved)
On Thu, Feb 17, 2022 at 12:07:15PM +1100, Ben Smith wrote:
> Hi All,
>
> I'm cross-posting this from Reddit qemu_kvm, in case it helps in some
> way. I know my setup is ancient and unique; let me know if you would
> like more info.
>
> Symptoms:
> 1. Ubuntu10.04 32-bit guest locks up randomly between 0 and 30 days.
> 2. The console shows a CPU trace dump, nothing else logged on the
>    guest or host.
> 3. Host system (Ubuntu20.04) 100% CPU for qemu process.
>
> Solution:
> When using virt-install, always use the "--os-variant" parameter!
> e.g. --os-variant ubuntu10.04

FWIW, the --os-variant / --osinfo argument is going to be mandatory
starting with the upcoming virt-manager release.

  https://listman.redhat.com/archives/virt-tools-list/2022-February/msg00021.html

--
Andrea Bolognani / Red Hat / Virtualization
Re: [PATCH 2/2] tests: Update CentOS 8 container to CentOS Stream 8
On Tue, Feb 01, 2022 at 06:47:14PM +0100, Philippe Mathieu-Daudé wrote:
> Andrea, do you think it is acceptable to merge this and fix on top, or
> we should do it properly from start?

My preference is always to avoid follow-up tweaks if possible :) but
ultimately the decision is up to the QEMU developers and maintainers.

--
Andrea Bolognani / Red Hat / Virtualization
Re: [PATCH 2/2] tests: Update CentOS 8 container to CentOS Stream 8
On Tue, Feb 01, 2022 at 06:02:21PM +0000, Daniel P. Berrangé wrote:
> On Tue, Feb 01, 2022 at 09:08:22AM -0800, Andrea Bolognani wrote:
> > CentOS 8 and CentOS Stream 8 are two pretty distinct operating
> > systems in terms of update cadence and stability expectations, so I
> > think that using the label "centos8" for containers and CI jobs that
> > are actually consuming CentOS Stream 8 is going to be a source of
> > confusion.
>
> Given the EOL of what I call the "traditional" CentOS model at the
> end of 2021, both "CentOS" and "CentOS Stream" terms effectively
> mean the same thing now.

Note that while CentOS 7 is no longer targeted by QEMU, it still is
very much relevant as an OS and will keep receiving updates for ~2.5
more years.

> The caveat is that aside from this dockerfile, we also have a
> VM config in test/vm/ that historically used traditional CentOS.
> That may also need updating to point to Stream unless it has
> seamlessly transitioned to using Stream content without us
> needing to change anything. I've not looked closely at that yet.

This example IMO only highlights the need for names to match reality,
and why overloading existing names is a bad idea.

--
Andrea Bolognani / Red Hat / Virtualization
Re: [PATCH 2/2] tests: Update CentOS 8 container to CentOS Stream 8
On Tue, Feb 01, 2022 at 11:19:11AM +0100, Thomas Huth wrote:
> +++ b/tests/docker/dockerfiles/centos8.docker
> @@ -1,10 +1,10 @@
>  # THIS FILE WAS AUTO-GENERATED
>  #
> -#  $ lcitool dockerfile centos-8 qemu
> +#  $ lcitool dockerfile centos-stream-8 qemu
>  #
>  # https://gitlab.com/libvirt/libvirt-ci
>
> -FROM docker.io/library/centos:8
> +FROM quay.io/centos/centos:stream8
[...]
> +++ b/tests/lcitool/refresh
> @@ -77,7 +77,7 @@ ubuntu2004_tsanhack = [
>  ]
>
>  try:
> -    generate_dockerfile("centos8", "centos-8")
> +    generate_dockerfile("centos8", "centos-stream-8")

I'm not convinced this is a good idea.

CentOS 8 and CentOS Stream 8 are two pretty distinct operating
systems in terms of update cadence and stability expectations, so I
think that using the label "centos8" for containers and CI jobs that
are actually consuming CentOS Stream 8 is going to be a source of
confusion.

I recommend proceeding with a thorough rename, after which the string
"centos8" no longer shows up anywhere, instead.

--
Andrea Bolognani / Red Hat / Virtualization
Re: [PATCH 0/3] meson: Don't pass 'method' to dependency()
On Wed, Jan 19, 2022 at 05:38:38PM +0000, Daniel P. Berrangé wrote:
> On Wed, Jan 19, 2022 at 06:17:57PM +0100, Andrea Bolognani wrote:
> > See [1] for recent discussion about libgcrypt specifically, which
> > the first patch is about.
> >
> > After writing that one, I realized that there is no point in
> > explicitly passing 'method' to dependency() because Meson will do
> > the right thing by default - hence the next two patches.
>
> This whole series is effectively reverting
>
>   commit 1a94933fcc3d641bda9988244cde61769baae2e5
>   Author: Paolo Bonzini
>   Date:   Mon Aug 31 06:27:00 2020 -0400
>
>       meson: use pkg-config method to find dependencies
>
>       We do not need to ask cmake for the dependencies, so just use the
>       pkg-config mechanism. Keep "auto" for SDL so that it tries using
>       sdl-config too.
>
>       The documentation is adjusted to use SDL2_image as the example,
>       rather than SDL which does not use the "pkg-config" method.
>
>       Signed-off-by: Paolo Bonzini
>
> which IIRC was done to get rid of meson's confusing/misleading
> attempts to probe for things via cmake when the pkg-config file
> is not present.

I guess I stand corrected on Meson doing the right thing by default
then :)

The first patch should still make sense though: libgcrypt is like SDL
in that Meson implements special handling for it, and we should allow
the pkg-config file to be used if available.

--
Andrea Bolognani / Red Hat / Virtualization
[PATCH 1/3] meson: Don't force use of libgcrypt-config
libgcrypt 1.9.0 (released in 2021-01) ships with a proper pkg-config
file, which Meson's libgcrypt detection code can use if available.

Passing 'config-tool' as 'method' when calling dependency(), however,
forces Meson to ignore the pkg-config file and always use
libgcrypt-config instead.

Signed-off-by: Andrea Bolognani
---
 meson.build | 1 -
 1 file changed, 1 deletion(-)

diff --git a/meson.build b/meson.build
index 762d7cee85..bc17ba67fd 100644
--- a/meson.build
+++ b/meson.build
@@ -1036,7 +1036,6 @@ endif
 if not gnutls_crypto.found()
   if (not get_option('gcrypt').auto() or have_system) and not get_option('nettle').enabled()
     gcrypt = dependency('libgcrypt', version: '>=1.8',
-                        method: 'config-tool',
                         required: get_option('gcrypt'),
                         kwargs: static_kwargs)
     # Debian has removed -lgpg-error from libgcrypt-config
--
2.34.1
[PATCH 3/3] docs: Don't recommend passing 'method' to dependency()
Meson will do the right thing by default.

Signed-off-by: Andrea Bolognani
---
 docs/devel/build-system.rst | 1 -
 1 file changed, 1 deletion(-)

diff --git a/docs/devel/build-system.rst b/docs/devel/build-system.rst
index 431caba7aa..fcdc0cd187 100644
--- a/docs/devel/build-system.rst
+++ b/docs/devel/build-system.rst
@@ -316,7 +316,6 @@ dependency will be used::
 
   sdl_image = not_found
   if not get_option('sdl_image').auto() or have_system
     sdl_image = dependency('SDL2_image', required: get_option('sdl_image'),
-                           method: 'pkg-config',
                            static: enable_static)
   endif
--
2.34.1
[PATCH 0/3] meson: Don't pass 'method' to dependency()
See [1] for recent discussion about libgcrypt specifically, which the
first patch is about.

After writing that one, I realized that there is no point in
explicitly passing 'method' to dependency() because Meson will do the
right thing by default - hence the next two patches.

[1] https://lists.gnu.org/archive/html/qemu-devel/2022-01/msg01224.html

Andrea Bolognani (3):
  meson: Don't force use of libgcrypt-config
  meson: Don't pass 'method' to dependency()
  docs: Don't recommend passing 'method' to dependency()

 docs/devel/build-system.rst |  1 -
 meson.build                 | 75 ++++++++++++++-----------------------
 tcg/meson.build             |  2 +-
 3 files changed, 31 insertions(+), 47 deletions(-)

--
2.34.1
[PATCH 2/3] meson: Don't pass 'method' to dependency()
Meson will do the right thing by default.

Signed-off-by: Andrea Bolognani
---
 meson.build     | 74 ++++++++++++++++++-------------------------------
 tcg/meson.build |  2 +-
 2 files changed, 31 insertions(+), 45 deletions(-)

diff --git a/meson.build b/meson.build
index bc17ba67fd..b807ad9fbb 100644
--- a/meson.build
+++ b/meson.build
@@ -427,13 +427,13 @@ if 'CONFIG_GIO' in config_host
 endif
 lttng = not_found
 if 'ust' in get_option('trace_backends')
-  lttng = dependency('lttng-ust', required: true, method: 'pkg-config',
+  lttng = dependency('lttng-ust', required: true,
                      kwargs: static_kwargs)
 endif
 pixman = not_found
 if have_system or have_tools
   pixman = dependency('pixman-1', required: have_system, version:'>=0.21.8',
-                      method: 'pkg-config', kwargs: static_kwargs)
+                      kwargs: static_kwargs)
 endif
 zlib = dependency('zlib', required: true, kwargs: static_kwargs)
@@ -446,18 +446,18 @@ endif
 linux_io_uring = not_found
 if not get_option('linux_io_uring').auto() or have_block
   linux_io_uring = dependency('liburing',
                               required: get_option('linux_io_uring'),
-                              method: 'pkg-config', kwargs: static_kwargs)
+                              kwargs: static_kwargs)
 endif
 libxml2 = not_found
 if not get_option('libxml2').auto() or have_block
   libxml2 = dependency('libxml-2.0', required: get_option('libxml2'),
-                       method: 'pkg-config', kwargs: static_kwargs)
+                       kwargs: static_kwargs)
 endif
 libnfs = not_found
 if not get_option('libnfs').auto() or have_block
   libnfs = dependency('libnfs', version: '>=1.9.3',
                       required: get_option('libnfs'),
-                      method: 'pkg-config', kwargs: static_kwargs)
+                      kwargs: static_kwargs)
 endif
 
 libattr_test = '''
@@ -505,7 +505,7 @@ seccomp = not_found
 if not get_option('seccomp').auto() or have_system or have_tools
   seccomp = dependency('libseccomp', version: '>=2.3.0',
                        required: get_option('seccomp'),
-                       method: 'pkg-config', kwargs: static_kwargs)
+                       kwargs: static_kwargs)
 endif
 
 libcap_ng = not_found
@@ -533,7 +533,7 @@ if get_option('xkbcommon').auto() and not have_system and not have_tools
   xkbcommon = not_found
 else
   xkbcommon = dependency('xkbcommon',
                          required: get_option('xkbcommon'),
-                         method: 'pkg-config', kwargs: static_kwargs)
+                         kwargs: static_kwargs)
 endif
 
 vde = not_found
@@ -562,30 +562,30 @@ endif
 pulse = not_found
 if not get_option('pa').auto() or (targetos == 'linux' and have_system)
   pulse = dependency('libpulse', required: get_option('pa'),
-                     method: 'pkg-config', kwargs: static_kwargs)
+                     kwargs: static_kwargs)
 endif
 alsa = not_found
 if not get_option('alsa').auto() or (targetos == 'linux' and have_system)
   alsa = dependency('alsa', required: get_option('alsa'),
-                    method: 'pkg-config', kwargs: static_kwargs)
+                    kwargs: static_kwargs)
 endif
 jack = not_found
 if not get_option('jack').auto() or have_system
   jack = dependency('jack', required: get_option('jack'),
-                    method: 'pkg-config', kwargs: static_kwargs)
+                    kwargs: static_kwargs)
 endif
 spice_protocol = not_found
 if not get_option('spice_protocol').auto() or have_system
   spice_protocol = dependency('spice-protocol', version: '>=0.12.3',
                               required: get_option('spice_protocol'),
-                              method: 'pkg-config', kwargs: static_kwargs)
+                              kwargs: static_kwargs)
 endif
 spice = not_found
 if not get_option('spice').auto() or have_system
   spice = dependency('spice-server', version: '>=0.12.5',
                      required: get_option('spice'),
-                     method: 'pkg-config', kwargs: static_kwargs)
+                     kwargs: static_kwargs)
 endif
 spice_headers = spice.partial_dependency(compile_args: true, includes: true)
@@ -595,32 +595,29 @@ libiscsi = not_found
 if not get_option('libiscsi').auto() or have_block
   libiscsi = dependency('libiscsi', version: '>=1.9.0',
                         required: get_option('libiscsi'),
-                        method: 'pkg-config', kwargs: static_kwargs)
+                        kwargs: static_kwargs)
 endif
 zstd = not_found
 if not get_option('zstd').auto() or have_block
   zstd = dependency('libzstd', version: '>=1.4.0',
                     required: get_option('zstd'),
-                    method: 'pkg-config', kwargs: static_kwargs)
+                    kwargs: static_kwargs)
 endif
 virgl = not_found
 if not get_option('virglrenderer').auto() or have_system
   virgl = dependency('virglrenderer',
-                     method: 'pkg-config',
                      required: get_
Re: [Qemu-devel] [PATCH] configure: Add pkg-config handling for libgcrypt
On Fri, Jan 07, 2022 at 12:06:47PM +, Daniel P. Berrangé wrote:
> On Fri, Jan 07, 2022 at 12:55:42PM +0100, Thomas Huth wrote:
> > On 07/01/2022 12.43, Andrea Bolognani wrote:
> > > On Thu, Aug 29, 2019 at 10:15:05AM +0100, Daniel P. Berrangé wrote:
> > > > Where are you seeing pkg-config files for libgcrypt ?
> > > >
> > > > The upstream project has (frustratingly) been hostile to any proposal to
> > > > add pkg-config support saying people should stick with their custom
> > > > libgcrypt-config tool
> > > >
> > > >   https://dev.gnupg.org/T2037
> > >
> > > Resurrecting an old thread to point out that the upstream stance
> > > seems to have changed since that discussion:
> > >
> > >   https://git.gnupg.org/cgi-bin/gitweb.cgi?p=libgcrypt.git;a=commit;h=97194b422bc89a6137f4e218d4cdee118c63e96e
> > >
> > > libgcrypt 1.9.0, released almost exactly a year ago, comes with a
> > > pkg-config file out of the box. With that in mind, I think it would
> > > make sense to re-evaluate this patch for inclusion.
> >
> > Maybe ... but we switched to Meson in between, so the patch needs to be
> > rewritten to use meson.build instead, I think.

Right. I didn't mean that the patch should be merged as-is, and I
wouldn't expect it to even apply two years later :)

> > Also it seems like version 1.9 is not available in all distros yet,
> > so someone needs to do an assessment whether the distros that use an
> > older version of libgcrypt provide a .pc file or not...

The original approach used the .pc file where present and used
libgcrypt-config as a fallback. Once QEMU stopped targeting all
operating systems that have libgcrypt < 1.9, the fallback path could
have been dropped.

> Comes back to my question of what is the benefit of using the .pc file
> when what we have already works and is known to be portable.
>
> When using meson there is essentially zero burden to using the
> libgcrypt-config approach, because that's all handled transparently
> by meson. This is different from the situation with configure,
> where libgcrypt-config required extra work on our side.

I didn't realize that was the case! I see now that Meson implements
special handling for libgcrypt, which is very nice indeed :)

> Overall I don't see any need to change it.

IIUC, the fact that the call currently looks like

  gcrypt = dependency('libgcrypt', version: '>=1.8',
                      method: 'config-tool',
                      ...)

means that Meson will not use the .pc file even when it's present. I
think changing method to "auto" would cause it to use the .pc file
when it's available, which is likely a better behavior.

-- 
Andrea Bolognani / Red Hat / Virtualization
Re: [Qemu-devel] [PATCH] configure: Add pkg-config handling for libgcrypt
On Thu, Aug 29, 2019 at 10:15:05AM +0100, Daniel P. Berrangé wrote:
> On Thu, Aug 29, 2019 at 04:53:02PM +0800, zhe...@windriver.com wrote:
> > libgcrypt may also be controlled by pkg-config, this patch adds pkg-config
> > handling for libgcrypt.
>
> Where are you seeing pkg-config files for libgcrypt ?
>
> The upstream project has (frustratingly) been hostile to any proposal to
> add pkg-config support saying people should stick with their custom
> libgcrypt-config tool
>
>   https://dev.gnupg.org/T2037
>
> Even if this is something added by some distro downstream, what is the
> benefit in using it, compared with libgcrypt-config which should already
> work & is portable.

Resurrecting an old thread to point out that the upstream stance
seems to have changed since that discussion:

  https://git.gnupg.org/cgi-bin/gitweb.cgi?p=libgcrypt.git;a=commit;h=97194b422bc89a6137f4e218d4cdee118c63e96e

libgcrypt 1.9.0, released almost exactly a year ago, comes with a
pkg-config file out of the box. With that in mind, I think it would
make sense to re-evaluate this patch for inclusion.

-- 
Andrea Bolognani / Red Hat / Virtualization
Re: [qemu-web PATCH] remove deployment phase from CI
On Wed, Nov 03, 2021 at 09:14:59AM +0100, Paolo Bonzini wrote:
> qemu.org is now served via a reverse proxy from qemu-project.gitlab.io;
> it does not need anymore the rsync step to the QEMU project\s shell
> server.

*project's

-- 
Andrea Bolognani / Red Hat / Virtualization
Re: [PATCH RESEND v2 0/6] target/arm: Add nested virtualization support
On Thu, 2021-04-01 at 12:55 +, Haibo Xu wrote:
> v2:
> - Move the NV to a CPU feature flag (Andrea)
> - Add CPU feature 'el2' test (Andrew)
>
> Many thanks to Andrea and Andrew for their comments!
>
> This series add support for ARMv8.3/8.4 nested virtualization support
> in KVM mode. It's based on Marc Zyngier's kernel KVM patches[1], and
> has been tested on a FVP model to run a L2 guest with Qemu. Now the
> feature can be enabled by "-M virt,accel=kvm -cpu host,el2=on" when
> starting a VM.

This looks good from the user interface point of view, thanks for
addressing the concerns that were raised! I'll leave Drew to review
the actual code changes :)

-- 
Andrea Bolognani / Red Hat / Virtualization
Re: Serious doubts about Gitlab CI
On Wed, 2021-03-31 at 09:54 +0200, Thomas Huth wrote:
> On 30/03/2021 15.19, Daniel P. Berrangé wrote:
> > Yep, that looks similar to what we do in libvirt, though we don't override
> > the compiler at the job level. Instead we just ensure the dir containing
> > ccache symlinks appears first in $PATH.
>
> That's of course a nice idea, too, but it does not seem to work with the
> clang builds:
>
>   https://gitlab.com/thuth/libvirt/-/jobs/1142239591#L3780
>   https://gitlab.com/thuth/libvirt/-/jobs/1142239581#L3738

That's because the corresponding Dockerfile doesn't contain
instructions to create a symlink for clang:

  https://gitlab.com/libvirt/libvirt/-/blob/master/ci/containers/fedora-rawhide.Dockerfile#L105-107

It's simply a bug in lcitool that needs to be addressed.

-- 
Andrea Bolognani / Red Hat / Virtualization
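The "ccache symlinks first in $PATH" trick Daniel describes can be sketched as follows. This is purely illustrative (a temporary directory stands in for the wrapper directory baked into the real CI container images, and the compiler list is an example): each compiler name becomes a symlink to ccache, and prepending the directory to $PATH makes the symlinks win over the real compilers.

```python
import os
import tempfile

# Directory that will hold the ccache wrapper symlinks; in the real CI
# setup this path is fixed and created when the container image is built.
wrapper_dir = tempfile.mkdtemp(prefix="ccache-wrappers-")

# One symlink per compiler name. The bug discussed in the linked jobs was
# simply that no such link had been created for "clang".
for tool in ("cc", "gcc", "clang"):
    os.symlink("/usr/bin/ccache", os.path.join(wrapper_dir, tool))

# Prepend the wrapper directory so that plain "cc"/"gcc"/"clang"
# invocations resolve to ccache instead of the real compilers.
os.environ["PATH"] = wrapper_dir + os.pathsep + os.environ.get("PATH", "")
```

ccache then inspects the name it was invoked under to decide which real compiler to run, which is why no per-job compiler override is needed.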
Re: [PATCH 0/3] target/arm: Add nested virtualization support
On Mon, 2021-03-22 at 10:07 +, Haibo Xu wrote:
> This series add support for ARMv8.3/8.4 nested virtualization support
> in KVM mode. It's based on Marc Zyngier's kernel KVM patches[1], and
> has been tested on a FVP model to run a L2 guest with Qemu. Now the
> feature can be enabled by "-M virt,accel=kvm,virtualization=on" when
> starting a VM.

Why the need to enable this explicitly? AFAIK, that's not necessary
for any other architecture: on x86, you just need to make sure you're
using '-cpu host' and pass a parameter to the kernel module.

Even assuming this can't be enabled transparently, wouldn't its
availability be controlled by a CPU feature flag, similar to what
already happens for SVE and PMU, rather than a machine type option?

That would also address the discoverability issue: unless I'm
mistaken (which I very well might be :), with the current
implementation there's no way to tell whether nested KVM will be
usable short of trying and seeing whether QEMU errors out.

-- 
Andrea Bolognani / Red Hat / Virtualization
Re: minimal "zero conf" build dockerfiles for fedora:latest and alpine:latest
On Wed, 2021-01-13 at 13:31 -0500, John Snow wrote:
> On 1/13/21 5:09 AM, Gerd Hoffmann wrote:
> > > I don't like Perl really, but there's a chicken-and-egg problem between
> > > detecting Python and using it to print the configure help script. For
> > > configure-time tasks, Perl has the advantage that "#! /usr/bin/env perl"
> > > just works.
> >
> > Assuming perl is actually installed, the world seems to shift to python.
> > On a minimal fedora install python is present but perl is not ...
> >
> > On the other hand git depends on perl, so it is probably pretty hard to
> > find a developer workstation without perl installed, so maybe that
> > doesn't matter much for the time being.
>
> I agree that it doesn't matter much right now, Though I don't always
> have git installed in containers when I am doing builds. It will become
> more common to encounter environments that are missing "obvious"
> dependencies.

Note that Fedora has a git-core package that doesn't depend on Perl
while still providing more than enough git for something like a CI
build job.

As a data point, the libvirt project has made it an explicit goal[1]
to remove all usage of Perl in favor of Python. We're not quite there
yet, but at this point there are only a very tiny handful of Perl
scripts remaining in the repository.

[1] https://libvirt.org/strategy.html

-- 
Andrea Bolognani / Red Hat / Virtualization
Re: Command line QAPIfication and -readconfig
On Wed, 2020-11-11 at 11:35 +0100, Kevin Wolf wrote:
> Am 11.11.2020 um 11:14 hat Daniel P. Berrangé geschrieben:
> > Normally we would not mark something as deprecated unless its replacement
> > is ready, because telling people "stop using this but the replacement
> > doesn't exist yet" is not a nice message as there is no action users can
> > take to deal with the deprecation.
>
> But there is a replacement: Put everything back into the command line
> and keep it in a shell script. Config files with -readconfig were never
> complete enough to fully describe a VM, so it's not too unlikely that
> you'll already have that script anyway.

This is correct: in fact...

> > We might question whether -readconfig has any users but I would note
> > that our own documentation illustrates its usage, and provides several
> > example configs
> > [...]
> > config/mach-virt-graphical.cfg:#   -readconfig mach-virt-graphical.cfg \
> > config/mach-virt-serial.cfg:#   -readconfig mach-virt-serial.cfg \
> > config/q35-emulated.cfg:#   -readconfig q35-emulated.cfg
> > config/q35-virtio-graphical.cfg:#   -readconfig q35-virtio-graphical.cfg
> > config/q35-virtio-serial.cfg:#   -readconfig q35-virtio-serial.cfg \

[...]

... these configuration files, which I contributed some time ago, all
start with something along the lines of

  # Usage:
  #
  #   $ qemu-system-aarch64 \
  #       -nodefaults \
  #       -readconfig mach-virt-serial.cfg \
  #       -display none -serial mon:stdio \
  #       -cpu host

because it was simply impossible to provide QEMU with all the
settings necessary to obtain the desired virtual hardware using the
configuration file only.

-- 
Andrea Bolognani / Red Hat / Virtualization
Re: [PATCH v4 2/4] Jobs based on custom runners: build environment docs and playbook
On Mon, 2020-11-09 at 11:37 -0500, Cleber Rosa wrote:
> On Tue, Oct 20, 2020 at 07:52:43PM +0200, Andrea Bolognani wrote:
> > On Sun, 2020-10-18 at 21:50 -0400, Cleber Rosa wrote:
> > > +    - name: Install basic packages to build QEMU on FreeBSD 12.x
> > > +      pkgng:
> > > +        # Originally from packages on .cirrus.yml under the freebsd_12_task
> > > +        name: bash,curl,cyrus-sasl,git,glib,gmake,gnutls,gsed,nettle,ninja,perl5,pixman,pkgconf,png,usbredir
> >
> > See above for 'pkgng' vs 'package', but at the very least this should
> > be
> >
> >   pkgng:
> >     name:
> >       - bash
> >       - curl
> >       ...
> >
> > or each time the list is touched the resulting diff is going to be
> > unmanageable.
>
> The documentation suggests a comma separated list of package names:
>
>   https://docs.ansible.com/ansible/2.8/modules/pkgng_module.html#pkgng-module
>
> And the reason is that this module is not as smart as others, and will
> run one separate command for each individual package name value:
>
>   https://github.com/ansible/ansible/blob/v2.10.3/test/support/integration/plugins/modules/pkgng.py#L214
>
> It's a tradeoff indeed, but I think we're aligned with the docs.

I would probably take the performance hit over having to deal with an
unmaintainable blob. As a rule of thumb, reviewer time is more
valuable than machine time.

If the performance hit turns out to be unacceptably big, you could at
least do something along the lines of

  - set_fact:
      freebsd_packages:
        - bash
        - curl
        ...

  - pkgng:
      name: "{{ freebsd_packages | join(',') }}"

to keep things reviewable.

-- 
Andrea Bolognani / Red Hat / Virtualization
Re: [PATCH v4 3/4] Jobs based on custom runners: docs and gitlab-runner setup playbook
On Tue, 2020-10-20 at 09:29 +0100, Daniel P. Berrangé wrote:
> I think the obvious and easy place to start using lcitool is for the
> tests/docker/dockerfiles/*. All that's required is to add mappings
> to lcitool for the various deps that QEMU has which libvirt does not
> already have. Then we should be able to start auto-generating the
> dockerfiles without too much difficulty. This will be a significant
> step forward because it will help us keep the package lists in sync
> across all the dockerfiles which is a major fail in QEMU right now.

I took a quick look at the contents of those dockerfiles and while I
agree that it's the obvious place to start, I'm not sure I would say
that generating them using lcitool is going to be easy :)

Basically there seem to be a number of assumptions both in lcitool
and in the QEMU dockerfiles, and getting the two to agree on those
will require quite some work from what I can tell. But again, I agree
it's something that should be worked on.

> Dealing with tests/vm replacement or these ansible recipes is likely
> to be a significantly more challenging proposition. Perhaps we can
> again start by just automating creation of the package lists that
> the tests/vm and ansible recipes need, as again those are all
> inconsistent.

Perhaps 'lcitool variables' could be of use here.

-- 
Andrea Bolognani / Red Hat / Virtualization
Re: [PATCH v4 2/4] Jobs based on custom runners: build environment docs and playbook
On Sun, 2020-10-18 at 21:50 -0400, Cleber Rosa wrote:
> +++ b/scripts/ci/setup/build-environment.yml
> @@ -0,0 +1,233 @@
> +---
> +- name: Installation of basic packages to build QEMU
> +  hosts: all
> +  tasks:

My Ansible-fu is a bit rusty at the moment, so please double-check my
claims and apologies in advance for the ones that I will get wrong O:-)

> +    - name: Install basic packages to build QEMU on Ubuntu 18.04/20.04
> +      apt:
> +        update_cache: yes
> +        # Originally from tests/docker/dockerfiles/ubuntu1804.docker
> +        pkg:
> +          - ccache
> +          - clang

Instead of using the 'apt' module here, and the equivalent module
further down, you could just do

  package:
    name:
      - pkg1
      - pkg2
      ...
    state: present

every single time and let Ansible take care of the differences for
you.

> +      when: "ansible_facts['distribution'] == 'Ubuntu'"

Quoting the entire condition is not necessary, you can just have

  when: ansible_facts['distribution'] == 'Ubuntu'

or, my preference,

  when:
    - ansible_facts['distribution'] == 'Ubuntu'

which results in a nicer diff when you add/remove conditions.

> +    - name: Install packages to build QEMU on Ubuntu 18.04/20.04 on non-s390x
> +      apt:
> +        update_cache: yes
> +        pkg:
> +            - libspice-server-dev
> +            - libxen-dev

Indentation of list items is inconsistent here.

> +    - name: Install basic packages to build QEMU on FreeBSD 12.x
> +      pkgng:
> +        # Originally from packages on .cirrus.yml under the freebsd_12_task
> +        name: bash,curl,cyrus-sasl,git,glib,gmake,gnutls,gsed,nettle,ninja,perl5,pixman,pkgconf,png,usbredir

See above for 'pkgng' vs 'package', but at the very least this should
be

  pkgng:
    name:
      - bash
      - curl
      ...

or each time the list is touched the resulting diff is going to be
unmanageable.

> +    - name: Enable PowerTools repo on CentOS 8
> +      ini_file:
> +        path: /etc/yum.repos.d/CentOS-PowerTools.repo
> +        section: PowerTools
> +        option: enabled
> +        value: "1"
> +      when:
> +        - "ansible_facts['distribution'] == 'CentOS'"
> +        - "ansible_facts['distribution_major_version'] == '8'"

Another option would be to use

  command: 'dnf config-manager --set-enabled Stream-PowerTools -y'
  args:
    warn: no

but I have to admit the way you're doing it is very clever ;)

-- 
Andrea Bolognani / Red Hat / Virtualization
Re: [PATCH v2 03/15] python: add VERSION file
On Tue, 2020-10-20 at 10:06 +0100, Daniel P. Berrangé wrote:
> The QEMU python modules are not like other python modules though,
> precisely because they are talking to QEMU. If we are shipping
> QEMU python releases on the same schedule as QEMU, then we can
> expect the normal case to be updating both QEMU & Python together.

Once you start uploading the Python packages to PyPi, you really have
no way to ensure this will be the case.

-- 
Andrea Bolognani / Red Hat / Virtualization
Re: [PATCH v2 03/15] python: add VERSION file
On Mon, 2020-10-19 at 11:02 +0100, Daniel P. Berrangé wrote:
> On Mon, Oct 19, 2020 at 11:45:09AM +0200, Andrea Bolognani wrote:
> > I think this needs to be considered very carefully.
> >
> > I'm not overly familiar with the Python ecosystem but it would appear
> > that, despite PEP 440 not mandating this, many (most?) of the
> > packages uploaded to PyPi are using semantic versioning.
> >
> >   https://packaging.python.org/guides/distributing-packages-using-setuptools/#choosing-a-versioning-scheme
>
> Semver is the recommended approach, but they explicitly list date
> based versioning as a valid alternative
>
>   "Semantic versioning is not a suitable choice for all projects,
>    such as those with a regular time based release cadence and a
>    deprecation process that provides warnings for a number of
>    releases prior to removal of a feature."
>
> That paragraph describes QEMU's scenario.

The section on date based versioning continues with

  A key advantage of date based versioning is that it is
  straightforward to tell how old the base feature set of a
  particular release is given just the version number. Version
  numbers for date based projects typically take the form of
  YEAR.MONTH (for example, 12.04, 15.10).

The problem with QEMU's version numbers is that, while they are date
based, they still *look* like semver, so it wouldn't be at all
unreasonable for the user to expect that they also *behave* like
semver.

This is not much of a problem when it comes to the main binary, but
it is certainly much more confusing when you start using the same
version number for a Python library.

> > With that in mind, I think it would be unwise for qemu.* not to do
> > the same; in particular, using a version number that's not <1.0.0 for
> > a package that is very much in flux will almost certainly break
> > people's expectations, and is also not something that you can easily
> > take back at a later time.
>
> I don't think it is that big a deal, and there is clear benefit to
> having the python code version match the QEMU version that it is
> the companion to.
>
> Ultimately the versioning scheme just impacts on the version string
> conditionals people list for their dependencies. Apps consuming QEMU
> can handle any of the version schemes without much difference.

The problem comes from the expectations: a Python programmer, who is
used to semver due to its prominence on PyPi, when deciding whether
to move from qemu.core 4.2.0 to 5.0.0 might expect to need code
changes to cope with API-breaking changes - where in fact there are
none, and at the same time might expect upgrading to 5.2.0 from 5.0.0
to be completely straightforward when in reality a feature their
application depends on might have been removed after the usual
deprecation period.

-- 
Andrea Bolognani / Red Hat / Virtualization
Re: [PATCH v2 03/15] python: add VERSION file
On Wed, 2020-10-14 at 10:29 -0400, John Snow wrote:
> Python infrastructure as it exists today is not reliably capable of
> single-sourcing a package version from a parent directory. The authors
> of pip are working to correct this, but as of today this is not possible
> to my knowledge.
>
> The problem is that when using pip to build and install a python
> package, it copies files over to a temporary directory and performs its
> build there. This loses access to any information in the parent
> directory, including git itself.
>
> Further, Python versions have a standard (PEP 440) that may or may not
> follow QEMU's versioning. In general, it does; but naturally QEMU does
> not follow PEP 440. To avoid any automatically-generated conflict, a
> manual version file is preferred.
>
>
> I am proposing:
>
> - Python core tooling synchronizes with the QEMU version directly
>   (5.2.0, 5.1.1, 5.3.0, etc.)
>
> - In the event that a Python package needs to be updated independently
>   of the QEMU version, a pre-release alpha version should be preferred,
>   but *only* after inclusion to the qemu development or stable branches.
>
>   e.g. 5.2.0a1, 5.2.0a2, and so on should be preferred prior to 5.2.0's
>   release.
>
> - The Python core tooling makes absolutely no version compatibility
>   checks or constraints. It *may* work with releases of QEMU from the
>   past or future, but it is not required to.
>
>   i.e., "qemu.core" will always remain in lock-step with QEMU.
>
> - We reserve the right to split out e.g. qemu.core.qmp to qemu.qmp
>   and begin independently versioning such a package separately from the
>   QEMU version it accompanies.

I think this needs to be considered very carefully.

I'm not overly familiar with the Python ecosystem but it would appear
that, despite PEP 440 not mandating this, many (most?) of the
packages uploaded to PyPi are using semantic versioning.

With that in mind, I think it would be unwise for qemu.* not to do
the same; in particular, using a version number that's not <1.0.0 for
a package that is very much in flux will almost certainly break
people's expectations, and is also not something that you can easily
take back at a later time.

-- 
Andrea Bolognani / Red Hat / Virtualization
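The proposal in this thread leans on PEP 440 pre-release ordering: 5.2.0a1 and 5.2.0a2 sort before the final 5.2.0. A toy Python sketch of that ordering, covering only the simple version forms discussed here (plain X.Y.Z releases and "aN" alpha suffixes); it is not a full PEP 440 implementation, for which the third-party `packaging` library would be the real tool:

```python
# Minimal ordering sketch for the version forms discussed in the thread:
# plain releases like "5.2.0" and pre-release alphas like "5.2.0a1".

def version_key(version: str):
    release, sep, alpha = version.partition("a")
    parts = tuple(int(p) for p in release.split("."))
    # A pre-release ("aN" suffix) sorts before the corresponding final
    # release; plain releases get a higher sentinel so they come last.
    pre = (0, int(alpha)) if sep else (1,)
    return parts + pre

versions = ["5.2.0", "5.2.0a2", "5.1.1", "5.2.0a1"]
print(sorted(versions, key=version_key))
# → ['5.1.1', '5.2.0a1', '5.2.0a2', '5.2.0']
```

This is exactly the property the alpha-version scheme relies on: a package published as 5.2.0a1 is understood by tooling as older than the eventual 5.2.0 release.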
Re: [RFC DOCUMENT 00/12] kubevirt-and-kvm: Add documents
On Tue, 2020-09-22 at 11:29 +0200, Philippe Mathieu-Daudé wrote:
> Hi Andrea,

Hi Philippe, and sorry for the delay in answering! First of all,
thanks for taking the time to go through the documents and posting
your thoughts. More comments below.

> Thanks a lot for this documentation, I could learn new things,
> use cases out of my interest area. Useful as a developer to
> better understand how are used the areas I'm coding. This
> shortens a bit the gap between developers and users.
>
> What would be more valuable than a developer review/feedback is
> having feedback from users and technical writers.
> Suggestion: also share it on qemu-disc...@nongnu.org which is
> less technical (maybe simply repost the cover and link to the
> Wiki).

More eyes would obviously be good, but note that these are really
intended to improve the interactions between QEMU/libvirt and
KubeVirt, so the audience is ultimately developers.

Of course you could say that KubeVirt developers *are* users when it
comes to QEMU/libvirt, and you wouldn't be wrong ;) Still, qemu-devel
seems like the proper venue.

> What is not obvious in this cover (and the documents pasted on
> the list) is there are schema pictures on the Wiki pages which
> are not viewable and appreciable via an email post.

You're right! I was pretty sure I had a line about that somewhere in
there but I guess it got lost during editing. Hopefully the URL at
the very beginning of each document caused people to browse the HTML
version.

> I had zero knowledge on Kubernetes. I have been confused by their
> use in the introduction...
>
> From Index:
>
>   "The intended audience is people who are familiar with the
>   traditional virtualization stack (QEMU plus libvirt), and in order
>   to make it more approachable to them, comparisons are included and
>   little to no knowledge of KubeVirt or Kubernetes is assumed."
>
> Then in Architecture's {Goals and Components} there is an assumption
> Kubernetes is known. Entering in Components, Kubernetes is briefly
> but sufficiently explained.
>
> Then KubeVirt is very well explained.

I guess the sections in the Index you're referring to assume that you
know that Kubernetes is somehow connected to containers, and that
it's a clustered environment. Anything else I missed?

Perhaps we could move the contents of

  https://gitlab.cee.redhat.com/abologna/kubevirt-and-kvm/-/blob/master/Components.md#kubernetes

to a small document that's linked to near the very top. Would that
improve things, in your opinion?

> Sometimes the "Other topics" category is confusing, it seems out
> of the scope of the "better understanding and documenting the
> interactions between KubeVirt and KVM" and looks like left over
> notes.

That's probably because they absolutely are O:-)

> Maybe renaming the "Other topics" section would help.
> "Unanswered questions", "Other possibilities to investigate",...

This sounds sensible :)

Thanks again for your feedback!

-- 
Andrea Bolognani / Red Hat / Virtualization
Re: Python 3.5 EOL; when can require 3.6?
On Thu, 2020-09-17 at 17:35 +0100, Daniel P. Berrangé wrote:
> On Thu, Sep 17, 2020 at 06:07:00PM +0200, Andrea Bolognani wrote:
> > It would be *fantastic* if we could keep the platform support policy
> > used by QEMU and libvirt as aligned as reasonably possible.
>
> The current QEMU policy was a copy+paste of the same policy I wrote for
> libvirt originally, just adding OpenBSD/NetBSD.
>
> I've just posted an update for QEMU which matches the latest libvirt
> policy, again just adding the extra BSDs.
>
>   https://lists.gnu.org/archive/html/qemu-devel/2020-09/msg06371.html

That's even better than I hoped for! Thank you so much :)

-- 
Andrea Bolognani / Red Hat / Virtualization
Re: Python 3.5 EOL; when can require 3.6?
On Thu, 2020-09-17 at 17:30 +0200, Markus Armbruster wrote:
> Thomas Huth writes:
> > Sorry, I forgot to check Debian. If I got that right, Debian 9 still
> > uses Python 3.5 by default. So I guess that means we can not deprecate
> > Python 3.5 yet?
>
> Discussed before:
>
>   Subject: Re: [PATCH] qapi: Fix code generation with Python 3.5
>   Date: Sat, 18 Jan 2020 07:54:18 +0100
>   Message-ID: <87lfq5s19h@dusky.pond.sub.org>
>   https://lists.nongnu.org/archive/html/qemu-devel/2020-01/msg03855.html
>
> Short version: Debian != Debian LTS. We support Debian until EOL, not
> LTS. Debian 9 reached EOL in July.

FWIW, this is the same policy the libvirt project follows, and we
have formalized it at

  https://libvirt.org/platforms.html

with Debian LTS being called out explicitly as not supported.

It would be *fantastic* if we could keep the platform support policy
used by QEMU and libvirt as aligned as reasonably possible.

-- 
Andrea Bolognani / Red Hat / Virtualization
[RFC DOCUMENT 11/12] kubevirt-and-kvm: Add Backpropagation page
https://gitlab.com/abologna/kubevirt-and-kvm/-/blob/master/Backpropagation.md

# Backpropagation

Whenever a partial VM configuration is submitted to libvirt, any missing
information is automatically filled in to obtain a configuration that's
complete enough to guarantee long-term guest ABI stability.

PCI addresses are perhaps the most prominent example of this: most
management applications don't include this information at all in the XML
they submit to libvirt, and rely on libvirt building a reasonable PCI
topology to support the requested devices.

For example, using a made-up YAML syntax for brevity, the input could
look like

```yaml
devices:
  disks:
    - image: /path/to/image.qcow2
```

and the output could be augmented by libvirt to look like

```yaml
devices:
  controllers:
    - model: pcie-root-port
      address:
        type: pci
        domain: 0x
        bus: 0x00
        slot: 0x01
        function: 0x0
  disks:
    - image: /path/to/image.qcow2
      model: virtio-blk
      address:
        type: pci
        domain: 0x
        bus: 0x01
        slot: 0x00
        function: 0x0
```

This is where backpropagation comes in: the only version of the VM
configuration that is complete enough to guarantee a stable guest ABI is
the one that includes all information added by libvirt, so if the
management application wants to be able to make further changes to the
VM it needs to reflect the additional information back into its
understanding of the VM configuration somehow.

For applications like virsh and virt-manager, this is easy: they don't
have their own configuration format or even store the VM configuration,
and simply fetch it from libvirt and operate on it directly every single
time.
oVirt, to the best of my knowledge, generates an initial VM
configuration based on the settings provided by the user, submits it to
libvirt and then parses back the augmented version, figuring out what
information was added and updating its database to match: if the VM
configuration needs to be generated again later, it will include all
information present in the database, including those that originated
from libvirt rather than the user.

KubeVirt does not currently perform any backpropagation. There are two
ways a user can influence PCI address allocation:

* explicitly add a `pciAddress` attribute for the device, which will
  cause KubeVirt to pass the corresponding address to libvirt, which in
  turn will attempt to comply with the user's request;

* add the `kubevirt.io/placePCIDevicesOnRootComplex` annotation to the
  VM configuration, which will cause KubeVirt to provide libvirt with a
  fully-specified PCI topology where all devices live on the PCIe Root
  Bus.

In all cases but the one where KubeVirt defines the full PCI topology
itself, it's implicitly relying on libvirt always building the PCI
topology in the exact same way every single time in order to have a
stable guest ABI. While this works in practice, it's not something that
libvirt actually guarantees: once a VM has been defined, libvirt will
never change its PCI topology, but submitting the same partial VM
configuration to different libvirt versions can result in different PCI
topologies.
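The merge step described above can be sketched in a few lines. This is a hypothetical illustration of the general idea, not actual oVirt or KubeVirt code: the application keeps every value it set itself and folds in anything libvirt added on top.

```python
# Toy model of backpropagation: compare the configuration the application
# submitted with the augmented one libvirt returns, and merge the
# additions into the application's own records. Dicts stand in for the
# real XML/YAML configuration formats.

def backpropagate(submitted: dict, augmented: dict) -> dict:
    """Return submitted, updated with everything libvirt filled in."""
    merged = dict(submitted)
    for key, value in augmented.items():
        if key not in merged:
            # Entirely new information, e.g. an auto-assigned PCI address.
            merged[key] = value
        elif isinstance(value, dict) and isinstance(merged[key], dict):
            # Recurse into nested sections of the configuration.
            merged[key] = backpropagate(merged[key], value)
        # Existing scalar values are kept as the application specified them.
    return merged

submitted = {"disk": {"image": "/path/to/image.qcow2"}}
augmented = {"disk": {"image": "/path/to/image.qcow2",
                      "model": "virtio-blk",
                      "address": {"type": "pci", "bus": "0x01"}}}

print(backpropagate(submitted, augmented))
```

Storing the merged result, rather than only the originally submitted values, is what lets the application regenerate a configuration later that still matches the guest ABI libvirt committed to.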
[RFC DOCUMENT 12/12] kubevirt-and-kvm: Add Contacts page
https://gitlab.com/abologna/kubevirt-and-kvm/-/blob/master/Contacts.md

# Contacts and credits

# Contacts

The following people have agreed to serve as points of contact for
follow-up discussion around the topics included in these documents.

## Overall

* Andrea Bolognani <> (KVM user space)
* Cole Robinson <> (KVM user space)
* Roman Mohr <> (KubeVirt)
* Vladik Romanovsky <> (KubeVirt)

## Networking

* Alona Paz <> (KubeVirt)
* Stefano Brivio <> (KVM user space)

## Storage

* Adam Litke <> (KubeVirt)
* Stefan Hajnoczi <> (KVM user space)

# Credits

In addition to those listed above, the following people have also
contributed to the documents or the discussion around them.

Ademar Reis, Adrian Moreno Zapata, Alice Frosi, Amnon Ilan, Ariel Adam,
Christophe de Dinechin, Dan Kenigsberg, David Gilbert, Eduardo Habkost,
Fabian Deutsch, Gerd Hoffmann, Jason Wang, John Snow, Kevin Wolf,
Marc-André Lureau, Michael Henriksen, Michael Tsirkin, Paolo Bonzini,
Peter Krempa, Petr Horacek, Richard Jones, Sergio Lopez, Steve Gordon,
Victor Toso, Vivek Goyal.

If your name should be in the list above but is not, please know that
was an honest mistake and not a way to downplay your contribution! Get
in touch and we'll get it sorted out :)
[RFC DOCUMENT 10/12] kubevirt-and-kvm: Add Upgrades page
https://gitlab.com/abologna/kubevirt-and-kvm/-/blob/master/Upgrades.md

# Upgrades

The KubeVirt installation and upgrade process are entirely controlled
by an [operator][], which is a common pattern in the Kubernetes world.
The operator is a piece of software running in the cluster and managing
the lifecycle of other components, in this case KubeVirt.

## The operator

What it does:

* Manages the whole KubeVirt installation
* Keeps the cluster actively in sync with the desired state
* Upgrades KubeVirt with zero downtime

## Installation

Install the operator:

```bash
$ LATEST=$(curl -L https://storage.googleapis.com/kubevirt-prow/devel/nightly/release/kubevirt/kubevirt/latest)
$ kubectl apply -f https://storage.googleapis.com/kubevirt-prow/devel/nightly/release/kubevirt/kubevirt/${LATEST}/kubevirt-operator.yaml
$ kubectl get pods -n kubevirt
NAME                             READY   STATUS    RESTARTS   AGE
virt-operator-58cf9d6648-c7qph   1/1     Running   0          69s
virt-operator-58cf9d6648-pvzw2   1/1     Running   0          69s
```

Trigger the installation of KubeVirt:

```bash
$ LATEST=$(curl -L https://storage.googleapis.com/kubevirt-prow/devel/nightly/release/kubevirt/kubevirt/latest)
$ kubectl apply -f https://storage.googleapis.com/kubevirt-prow/devel/nightly/release/kubevirt/kubevirt/${LATEST}/kubevirt-cr.yaml
$ kubectl get pods -n kubevirt
NAME                               READY   STATUS    RESTARTS   AGE
virt-api-8bdd88557-fllhr           1/1     Running   0          59s
virt-controller-55ccb8cdcb-5rtp6   1/1     Running   0          43s
virt-controller-55ccb8cdcb-v8kr9   1/1     Running   0          43s
virt-handler-67pns                 1/1     Running   0          43s
```

The process happens in two steps because the operator relies on the
KubeVirt [custom resource][] for information on the desired
installation, and will not do anything until that resource exists in
the cluster.
## Upgrade

The upgrade process is similar:

* Install the latest operator
* Reference the new version in the KubeVirt CustomResource to trigger the actual upgrade

```bash
$ kubectl.sh get kubevirt -n kubevirt kubevirt -o yaml
apiVersion: kubevirt.io/v1alpha3
kind: KubeVirt
metadata:
  name: kubevirt
spec:
  imageTag: v0.30
  certificateRotateStrategy: {}
  configuration: {}
  imagePullPolicy: IfNotPresent
```

Note the `imageTag` attribute: when present, the KubeVirt operator will take steps to ensure that the version of KubeVirt that's deployed on the cluster matches its value, which in our case will trigger an upgrade.

The following chart explains the upgrade flow in more detail and shows how the various components are affected:

![KubeVirt upgrade flow][Upgrades-Kubevirt]

KubeVirt is released as a complete suite: no individual `virt-launcher` releases are planned. Everything is tested together, everything is released together.

## QEMU and libvirt

The versions of QEMU and libvirt used for VMs are also tied to the version of KubeVirt and are upgraded along with everything else.

* Migrations from old libvirt/QEMU to new libvirt/QEMU pairs are possible
* As soon as the new `virt-handler` and the new controller are rolled out, the cluster will only start VMIs with the new QEMU/libvirt versions

## Version compatibility

The virt stack is updated along with KubeVirt, which mitigates compatibility concerns. As a rule of thumb, versions of QEMU and libvirt older than a year or so are not taken into consideration.

Currently, the ability to perform backward migration (e.g. from a newer version of QEMU to an older one) is not necessary, but that could very well change as KubeVirt becomes more widely used.
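Mechanically, triggering the upgrade comes down to updating `spec.imageTag` on the KubeVirt custom resource and letting the operator converge on it. A minimal sketch of building the corresponding JSON merge patch (the helper name is ours, not KubeVirt's actual code):

```python
import json

def kubevirt_version_patch(tag):
    """Build the JSON merge patch that asks the KubeVirt operator
    to move the deployment to the given version."""
    return {"spec": {"imageTag": tag}}

print(json.dumps(kubevirt_version_patch("v0.30")))
```

The resulting patch can then be applied with something like `kubectl patch kubevirt kubevirt -n kubevirt --type merge -p '{"spec":{"imageTag":"v0.30"}}'`, after which the operator performs the rollout described above.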
[Upgrades-Kubevirt]: https://gitlab.com/abologna/kubevirt-and-kvm/-/blob/master/Images/Upgrades-Kubevirt.png
[custom resource]: https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/
[operator]: https://kubernetes.io/docs/concepts/extend-kubernetes/operator/
[RFC DOCUMENT 09/12] kubevirt-and-kvm: Add Isolation page
https://gitlab.com/abologna/kubevirt-and-kvm/-/blob/master/Isolation.md

# Isolation

How is the QEMU process isolated from the host and from other VMs?

## Traditional virtualization

cgroups

* managed by libvirt

SELinux

* libvirt is privileged and QEMU is protected by SELinux policies set by libvirt (SVirt)
* QEMU runs with SELinux type `svirt_t`

## KubeVirt

cgroups

* Managed by kubelet
* No involvement from libvirt
* Memory limits
  * When using hard limits, the entire VM can be killed by Kubernetes
  * Memory consumption estimates are based on heuristics

SELinux

* KubeVirt is not using SVirt and there are no plans to do so
* At the moment, the custom [KubeVirt SELinux policy][] is used to ensure libvirt has sufficient privilege to perform its own setup steps
* The standard SELinux type used by containers is `container_t`
* KubeVirt would like to eventually use the same for VMs as well

Capabilities

* The default set of capabilities is fairly conservative
* Privileged operations should happen outside of the pod: in KubeVirt's case, a good candidate is `virt-handler`, the privileged component that runs at the node level
* Additional capabilities can be requested for a pod
  * However, this is frowned upon and considered a liability from the security point of view
  * The cluster admin may even set a security policy that prevents pods from using certain capabilities
  * In such a scenario, KubeVirt workloads may be entirely unable to run

## Specific examples

The following is a list of examples, either historical or current, of scenarios where libvirt's approach to isolation clashed with Kubernetes' and changes on either component were necessary.
SELinux

* libvirt's use of hugetlbfs for hugepages configuration is disallowed by `container_t`
  * Possibly fixable by using memfd
  * [libvirt memoryBacking docs][]
  * [KubeVirt memfd issue][]
* Use of libvirt+QEMU multiqueue tap support is disallowed by `container_t`
  * And there's no way to pass in this setup from outside the existing stack
  * [KubeVirt multiqueue workaround][] extending their SELinux policy to allow `attach_queue`
* Passing precreated tap devices to libvirt triggers relabelfrom+relabelto `tun_socket` SELinux access
  * This may not be the virt stack's fault; it seems to happen automatically when permissions aren't correct

Capabilities

* libvirt performs memory locking for VFIO devices unconditionally
  * Previously KubeVirt had to grant `CAP_SYS_RESOURCE` to pods. KubeVirt worked around it by duplicating libvirt's memory pinning calculations so the libvirt action would be a no-op, but that is fragile and may cause the issue to resurface if libvirt's calculation logic changes.
  * References: [libvir-list memlock thread][], [KubeVirt memlock PR][], [libvirt qemuDomainGetMemLockLimitBytes][], [KubeVirt VMI.getMemlockSize][]
* virtiofsd requires the `CAP_SYS_ADMIN` capability to perform `unshare(CLONE_NEWPID|CLONE_NEWNS)`
  * This is required for certain use cases like running overlayfs in the VM on top of virtiofs, but is not a requirement for all use cases.
  * References: [KubeVirt virtiofs PR][], [RHEL virtiofs bug][]
* KubeVirt uses libvirt for CPU pinning, which requires the pod to have `CAP_SYS_NICE`.
  * Long term, KubeVirt would like to handle the pinning in their privileged component virt-handler, so `CAP_SYS_NICE` can be dropped.
  * Side note: libvirt unconditionally requires `CAP_SYS_NICE` when any other running VM is using CPU pinning; however, this sounds like a plain old bug.
  * References: [KubeVirt CPU pinning PR][], [KubeVirt CPU pinning workaround PR][], [RHEL CPU pinning bug][]
* libvirt bridge usage used to require `CAP_NET_ADMIN`
  * This is a historical example for reference. libvirt usage of a bridge device always implied tap device creation, which required `CAP_NET_ADMIN` privileges for the pod
  * The fix was to teach libvirt to accept a precreated tap device and skip some setup operations on it
  * Example XML: ``
  * KubeVirt still hasn't fully managed to drop `CAP_NET_ADMIN` though
  * References: [RHEL precreated TAP bug][], [libvirt precreated TAP patches][], [KubeVirt precreated TAP PR][], [KubeVirt NET_ADMIN PR][], [KubeVirt NET_ADMIN issue][]

[KubeVirt CPU pinning PR]: https://github.com/kubevirt/kubevirt/pull/1381
[KubeVirt CPU pinning workaround PR]: https://github.com/kubevirt/kubevirt/pull/1648
[KubeVirt NET_ADMIN PR]: https://github.com/kubevirt/kubevirt/pull/3290
[KubeVirt NET_ADMIN issue]: https://github.com/kubevirt/kubevirt/issues/3085
[KubeVirt SELinux policy]: https://github.com/kubevirt/kubevirt/blob/master/cmd/virt-handler/virt_launcher.cil
[KubeVirt VMI.getMemlockSize]: https://github.com/kubevirt/kubevirt/blob/f5ffba5f84365155c81d0e2cda4aa709da062230/pkg/virt-handler/isolation/isolation.go#L206
[KubeVirt memfd issue]:
[RFC DOCUMENT 08/12] kubevirt-and-kvm: Add NUMA Pinning page
https://gitlab.com/abologna/kubevirt-and-kvm/-/blob/master/NUMA-Pinning.md

# NUMA pinning

KubeVirt doesn't currently implement NUMA pinning due to Kubernetes limitations.

## Kubernetes Topology Manager

Allows aligning CPU and peripheral device allocations by NUMA node.

Many limitations:

* Not scheduler aware.
* Doesn't allow memory alignment.
* etc...
[RFC DOCUMENT 07/12] kubevirt-and-kvm: Add CPU Pinning page
https://gitlab.com/abologna/kubevirt-and-kvm/-/blob/master/CPU-Pinning.md

# CPU pinning

As is the case for many of KubeVirt's features, CPU pinning is partially achieved using standard Kubernetes components: this both reduces the amount of new code that has to be written and guarantees better integration with containers running side by side with the VMs.

## Kubernetes CPU Manager

The Static policy allocates exclusive CPUs to pod containers in the Guaranteed QoS class which request integer CPUs. On a best-effort basis, the Static policy tries to allocate CPUs topologically in the following order:

* Allocate all the CPUs in the same processor socket if available and the container requests at least an entire socket worth of CPUs.
* Allocate all the logical CPUs (hyperthreads) from the same physical CPU core if available and the container requests an entire core worth of CPUs.
* Allocate any available logical CPU, preferring to acquire CPUs from the same socket.

## KubeVirt dedicated CPU placement

KubeVirt relies on the Kubernetes CPU Manager to allocate dedicated CPUs to the `virt-launcher` container. When `virt-launcher` starts, it reads `/sys/fs/cgroup/cpuset/cpuset.cpus` and generates `` configuration for libvirt based on the information found within.

However, affinity changes require `CAP_SYS_NICE`, so this additional capability has to be granted to the VM pod. Going forward, we would like to perform the affinity change in `virt-handler` (the privileged component running at the node level), which would allow the VM pod to work without additional capabilities.
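The cpuset file uses the kernel's list format (e.g. `0-3,8,10-11`). The parsing step described above can be sketched as follows — the function name is ours, for illustration, not KubeVirt's actual code:

```python
def parse_cpuset(cpus: str) -> list[int]:
    """Expand a kernel cpuset list like '0-3,8,10-11' into CPU numbers."""
    result = []
    for part in cpus.strip().split(","):
        if "-" in part:
            lo, hi = part.split("-")
            result.extend(range(int(lo), int(hi) + 1))
        else:
            result.append(int(part))
    return result

# The resulting list is what would be turned into per-vCPU
# pinning configuration for libvirt.
print(parse_cpuset("0-3,8,10-11"))  # → [0, 1, 2, 3, 8, 10, 11]
```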
[RFC DOCUMENT 06/12] kubevirt-and-kvm: Add Live Migration page
https://gitlab.com/abologna/kubevirt-and-kvm/-/blob/master/Live-Migration.md

# Live Migration

There are two scenarios where live migration is triggered in KubeVirt:

* Per user request, by posting a `VirtualMachineInstanceMigration` to the cluster
* Per cluster request, for instance on a Node eviction (due to lack of resources or maintenance of a given Node)

In both situations, KubeVirt will use libvirt to handle logic and coordination with QEMU while KubeVirt's components manage the Kubernetes control plane and the cluster's limitations. In short, KubeVirt:

* Checks if the target host is capable of migration of the given VM;
* Handles the single network namespace per Pod by proxying migration data (more at [Networking][]);
* Handles cluster resource usage (e.g. bandwidth usage);
* Handles cross-version migration.

![Live migration between two nodes][Live-Migration-Flow]

## Limitations

Live migration is not possible if:

* The VM is configured with cpu-passthrough;
* The VM has local or non-shared volumes;
* The Pod is using bridge binding for network access (right side of image below).

![Kubevirt's Pod][Live-Migration-Network]

## More on KubeVirt's Live migration

This blog [post on live migration][] explains how to enable live migration for KubeVirt's VMs and describes some of its caveats.

[Live-Migration-Flow]: https://gitlab.com/abologna/kubevirt-and-kvm/-/blob/master/Images/Live-Migration-Flow.png
[Live-Migration-Network]: https://gitlab.com/abologna/kubevirt-and-kvm/-/blob/master/Images/Live-Migration-Network.png
[Networking]: https://gitlab.com/abologna/kubevirt-and-kvm/-/blob/master/Networking.md
[post on live migration]: https://kubevirt.io/2020/Live-migration.html
[RFC DOCUMENT 02/12] kubevirt-and-kvm: Add Components page
https://gitlab.com/abologna/kubevirt-and-kvm/-/blob/master/Components.md

# Components

This document describes the various components of the KubeVirt architecture, how they fit together, and how they compare to the traditional virtualization architecture (QEMU + libvirt).

## Traditional architecture

For the comparison to make sense, let's start by reviewing the architecture used for traditional virtualization.

![libvirt architecture][Components-Libvirt]

(Image taken from the "[Look into libvirt][]" presentation by Osier Yang, which is a bit old but still mostly accurate from a high-level perspective.)

In particular, the `libvirtd` process runs with high privileges on the host and is responsible for managing all VMs. When asked to start a VM, the management process will

* Prepare the environment by performing a number of privileged operations upfront
  * Set up CGroups
  * Set up kernel namespaces
  * Apply SELinux labels
  * Configure network devices
  * Open host files
  * ...
* Start a non-privileged QEMU process in that environment

## Kubernetes

To understand how KubeVirt works, it's first necessary to have some knowledge of Kubernetes.

In Kubernetes, every user workload runs inside [Pods][]. The pod is the smallest unit of work that Kubernetes will schedule. Some facts about pods:

* They consist of multiple containers
* The containers share a network namespace
* The containers have their own PID and mount namespaces
* The containers have their own CGroups for CPU, memory, devices and so forth. They are controlled by k8s and can't be modified from outside.
* Pods can be started with extended privileges (`CAP_SYS_NICE`, `CAP_NET_RAW`, root user, ...)
* The app in the pods can drop the privileges, but the pod cannot drop them (`kubectl exec` gives you a shell with the full privileges).

Creating pods with elevated privileges is generally frowned upon, and depending on the policy decided by the cluster administrator it might be outright impossible.
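The shared network namespace can be illustrated with a minimal two-container pod manifest (the image names are placeholders): each container gets its own filesystem and PID namespace, but `sidecar` can reach `app` on `localhost`, which is exactly the property service meshes and KubeVirt's `virt-launcher` rely on.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  containers:
    # Both containers run in the same network namespace,
    # but each gets its own mount and PID namespace.
    - name: app
      image: example.registry/app:latest      # placeholder image
      ports:
        - containerPort: 8080
    - name: sidecar
      image: example.registry/sidecar:latest  # placeholder image
      # can talk to "app" via localhost:8080
```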
## KubeVirt architecture

Let's now discuss how KubeVirt is structured.

![KubeVirt architecture][Components-Kubevirt]

The main components are:

* `virt-launcher`, a copy of which runs inside each pod alongside QEMU and libvirt, is the unprivileged component responsible for receiving commands from other KubeVirt components and reporting back events such as VM crashes;
* `virt-handler` runs at the node level via a DaemonSet, and is the privileged component which takes care of the VM setup by reaching into the corresponding pod and modifying its namespaces;
* `virt-controller` runs at the cluster level and monitors the API server so that it can react to user requests and VM events;
* `virt-api`, also running at the cluster level, exposes a few additional APIs that only apply to VMs, such as the "console" and "vnc" actions.

When a KubeVirt VM is started:

* We request a Pod with certain privileges and resources from Kubernetes.
* The kubelet (the node daemon of Kubernetes) prepares the environment with the help of a container runtime.
* A shim process (virt-launcher) is our main entrypoint in the pod, which starts libvirt.
* Once our node daemon (virt-handler) can reach our shim process, it does privileged setup from outside. It reaches into the namespaces and modifies their content as needed. We mostly have to modify the mount and network namespaces.
* Once the environment is prepared, virt-handler asks virt-launcher to start a VM via its libvirt component.

More information can be found in the [KubeVirt architecture][] page.

## Comparison

The two architectures are quite similar from the high-level point of view: in both cases there are a number of privileged components which take care of preparing an environment suitable for running an unprivileged QEMU process in.
The difference, however, is that while libvirtd takes care of all this setup itself, in the case of KubeVirt several smaller components are involved: some of these components are privileged just as libvirtd is, but others are not, and some of the tasks are not even performed by KubeVirt itself but rather delegated to the existing Kubernetes infrastructure.

## Use of libvirtd in KubeVirt

In the traditional virtualization scenario, `libvirtd` provides a number of useful features on top of those available with plain QEMU, including

* support for multiple clients connecting at the same time
* management of multiple VMs through a single entry point
* remote API access

KubeVirt interacts with libvirt under certain conditions that make the features described above irrelevant:

* there's only one client talking to libvirt: `virt-handler`
* libvirt is only asked to manage a single VM
* client and libvirt are running in the same pod, no remote libvirt access

[Components-Kubevirt]: https://gitlab.com/abologna/kubevirt-and-kvm/-/blob/master/Images/Components-Kubevirt.png
[Components-Libvirt]:
[RFC DOCUMENT 05/12] kubevirt-and-kvm: Add Networking page
https://gitlab.com/abologna/kubevirt-and-kvm/-/blob/master/Networking.md

# Networking

## Problem description

Service meshes (such as [Istio][], [Linkerd][]) typically expect application processes to run on the same physical host, usually in a separate user namespace. Network namespaces might be used too, for additional isolation. Network traffic to and from local processes is monitored and proxied by redirection and observation of local sockets. `iptables` and `nftables` (collectively referred to as the `netfilter` framework) are the typical Linux facilities providing classification and redirection of packets.

![containers][Networking-Containers]

*Service meshes with containers. Typical ingress path: **1.** NIC driver queues buffers for IP processing **2.** `netfilter` rules installed by *service mesh* redirect packets to proxy **3.** IP receive path completes, L4 protocol handler invoked **4.** TCP socket of proxy receives packets **5.** proxy opens TCP socket towards application service **6.** packets get TCP header, ready for classification **7.** `netfilter` rules installed by service mesh forward request to service **8.** local IP routing queues packets for TCP protocol handler **9.** application process receives packets and handles request. Egress path is conceptually symmetrical.*

If we move application processes to VMs, sockets and processes are not visible anymore. All the traffic is typically forwarded via interfaces operating at data link level. Socket redirection and port mapping to local processes don't work.

![and now?][Networking-Challenge]

*Application process moved to VM: **8.** IP layer enqueues packets to L2 interface towards application **9.** `tap` driver forwards L2 packets to guest **10.** packets are received on `virtio-net` ring buffer **11.** guest driver queues buffers for IP processing **12.** IP receive path completes, L4 protocol handler invoked **13.** TCP socket of application receives packets and handles request.*
**:warning: Proxy challenge**: the service mesh can't forward packets to local sockets via `netfilter` rules. *Add-on* NAT rules might conflict, as service meshes expect full control of the ruleset. Socket monitoring and PID/UID classification aren't possible.

## Existing solutions

Existing solutions typically implement a full TCP/IP stack, replaying traffic on sockets local to the Pod of the service mesh. This creates the illusion of application processes running on the same host, possibly separated by user namespaces.

![slirp][Networking-Slirp]

*Existing solutions introduce a third TCP/IP stack: **8.** local IP routing queues packets for TCP protocol handler **9.** userspace implementation of TCP/IP stack receives packets on local socket, and **10.** forwards L2 encapsulation to `tap` *QEMU* interface (socket back-end).*

While being almost transparent to the service mesh infrastructure, this kind of solution comes with a number of downsides:

* three different TCP/IP stacks (guest, adaptation and host) need to be traversed for every service request. There is no chance to implement zero-copy mechanisms, and the number of context switches increases dramatically
* addressing needs to be coordinated to create the pretense of consistent addresses and routes between guest and host environments.
This typically needs a NAT with masquerading, or some form of packet bridging
* the traffic seen by the service mesh and observable externally is a distant replica of the packets forwarded to and from the guest environment:
  * TCP congestion windows and network buffering mechanisms in general operate differently from what would be naturally expected by the application
  * protocols carrying addressing information might pose additional challenges, as the applications don't see the same set of addresses and routes as they would if deployed with regular containers

## Experiments

![experiments: thin layer][Networking-Experiments-Thin-Layer]

*How can we improve on the existing solutions while maintaining drop-in compatibility? A thin layer implements a TCP adaptation and IP services.*

These are some directions we have been exploring so far:

* a thinner layer between guest and host, that only implements what's strictly needed to pretend processes are running locally. A further TCP/IP stack is not necessarily needed. Some sort of TCP adaptation is needed, however, if this layer (currently implemented as a userspace process) runs without the `CAP_NET_RAW` capability: we can't create raw IP sockets on the Pod, and therefore need to map packets at layer 2 to layer 4 sockets offered by the host kernel
* to avoid implementing an actual TCP/IP stack like the one offered by *libslirp*, we can align TCP parameters advertised towards the guest (MSS, congestion window) to the socket parameters provided by the host kernel, probing them via the `TCP_INFO` socket option (introduced in
[RFC DOCUMENT 03/12] kubevirt-and-kvm: Add Hotplug page
https://gitlab.com/abologna/kubevirt-and-kvm/-/blob/master/Hotplug.md

# Hotplug

In Kubernetes, pods are defined to be immutable, so it's not possible to perform hotplug of devices in the same way as with the traditional virtualization stack. This limitation is a result of KubeVirt's guiding principle of integrating with Kubernetes as much as possible and making VMs appear the same as containers from the user's point of view. There have been several attempts at lifting this restriction in Kubernetes over the years, but they have all been unsuccessful so far.

## Existing hotplug support

When working with containers, changing the amount of resources associated with a pod will result in it being destroyed and a new pod with the updated resource allocation being created in its place. This works fine for containers, which are designed to be clonable and disposable, but when it comes to VMs they usually can't be destroyed on a whim, and running multiple instances in parallel is generally not wise even when possible.

## Possible workarounds

Until a proper hotplug API makes its way into Kubernetes, one possible way to implement hotplug could be to perform a migration to a container compliant with the new allocation request, and only then perform the QEMU-level hotplug operation.
[RFC DOCUMENT 04/12] kubevirt-and-kvm: Add Storage page
https://gitlab.com/abologna/kubevirt-and-kvm/-/blob/master/Storage.md

# Storage

This document describes the known use-cases and architecture options we have for Linux Virtualization storage in [KubeVirt][].

## Problem description

The main goal of KubeVirt is to leverage the storage subsystem of Kubernetes (built around [CSI][] and [Persistent Volumes][], aka PVs), in order to let both workloads (VMs and containers) leverage the same storage. As a consequence, KubeVirt is limited in its use of the QEMU storage subsystem and features. That means:

* Storage solutions should be implemented in k8s in a way that can be consumed by both containers and VMs.
* VMs can only consume (and provide) storage features which are available in the pod, through k8s APIs. For example, a VM will not support disk snapshots if it's attached to a storage provider that doesn't support it. Ditto for incremental backup, block jobs, encryption, etc.

## Current situation

### Storage handled outside of QEMU

In this scenario, the VM pod uses a [Persistent Volume Claim (PVC)][Persistent Volumes] to give QEMU access to a raw storage device or fs mount, which is provided by a [CSI][] driver. QEMU **doesn't** handle any of the storage use-cases such as thin provisioning, snapshots, change block tracking, block jobs, etc. This is how things work today in KubeVirt.

![Storage handled outside of QEMU][Storage-Current]

Devices and interfaces:

* PVC: block or fs
* QEMU backend: raw device or raw image
* QEMU frontend: virtio-blk
  * alternative: emulated device for wider compatibility and Windows installations
    * CDROM (sata)
    * disk (sata)

Pros:

* Simplicity
* Sharing the same storage model with other pods/containers

Cons:

* Limited feature-set (fully off-loaded to the storage provider from CSI).
  * No VM snapshots (disk + memory)
* Limited opportunities for fine-tuning and optimizations for high-performance.
* Hotplug is challenging, because the set of PVCs in a pod is immutable.
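The PVC side of this setup can be sketched as a block-mode claim that, once bound by the CSI driver, shows up in the `virt-launcher` pod as a raw device for QEMU to open (the storage class name below is a placeholder, and sizes are purely illustrative):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: vm-disk
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Block             # raw device, no filesystem in between
  resources:
    requests:
      storage: 20Gi
  storageClassName: example-csi # placeholder CSI-backed storage class
```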
Questions and comments:

* How to optimize this in QEMU?
  * Can we bypass the block layer for this use-case? Like having SPDK inside the VM pod?
  * Rust-based storage daemon (e.g. [vhost_user_block][]) running inside the VM pod alongside QEMU (bypassing the block layer)
  * We should be able to achieve high-performance with local NVMe storage here, with multiple polling IOThreads and multi queue.
* See [this blog post][PVC resize blog] for information about the PVC resize feature. To implement this for VMs, we could have KubeVirt watch PVCs and respond to capacity changes with a corresponding call to resize the image file (if applicable) and to notify QEMU of the enlarged device.
* Features such as incremental backup (CBT) and snapshots could be implemented through a generic CSI backend... Device mapper? Stratis? (See [Other Topics](#other-topics))

## Possible alternatives

### Storage device passthrough (highest performance)

Device passthrough via PCI VFIO, SCSI, or vDPA. No storage use-cases and no CSI, as the device is passed directly to the guest.

![Storage device passthrough][Storage-Passthrough]

Devices and interfaces:

* N/A (hardware passthrough)

Pros:

* Highest possible performance (same as host)

Cons:

* No storage features anywhere outside of the guest.
* No live-migration for most cases.

### File-system passthrough (virtio-fs)

File mount volumes (directories, actually) can be exposed to QEMU via [virtio-fs][] so that VMs have access to files and directories.

![File-system passthrough (virtio-fs)][Storage-Virtiofs]

Devices and interfaces:

* PVC: file-system

Pros:

* Simplicity from the user perspective
* Flexibility
* Great for heterogeneous workloads that share data between containers and VMs (i.e.
OpenShift pipelines)

Cons:

* Performance when compared to block device passthrough

Questions and comments:

* The feature is still quite new (the Windows driver is fresh out of the oven)

### QEMU storage daemon in CSI for local storage

The qemu-storage-daemon is a user-space daemon that exposes QEMU's block layer to external users. It's similar to [SPDK][], but includes the implementation of QEMU block layer features such as snapshots and bitmap tracking for incremental backup (CBT). It also allows splitting a single NVMe device so that multiple QEMU VMs can share one NVMe disk.

In this architecture, the storage daemon runs as part of CSI (control plane), with the data plane being either a vhost-user-blk interface for QEMU or an fs-mount export for containers.

![QEMU storage daemon in CSI for local storage][Storage-QSD]

Devices and interfaces:

* CSI:
  * fs mount with a vhost-user-blk socket for QEMU to open
  * (OR) fs mount via NBD or FUSE with the actual file-system contents
* qemu-storage-daemon backend: NVMe local device w/ raw or qcow2
  * alternative: any driver supported by QEMU, such as file-posix.
* QEMU frontend: virtio-blk
  * alternative: any emulated
[RFC DOCUMENT 01/12] kubevirt-and-kvm: Add Index page
https://gitlab.com/abologna/kubevirt-and-kvm/-/blob/master/Index.md

# KubeVirt and the KVM user space

This is the entry point to a series of documents which, together, detail the current status of KubeVirt and how it interacts with the KVM user space.

The intended audience is people who are familiar with the traditional virtualization stack (QEMU plus libvirt); in order to make the documents more approachable to them, comparisons are included and little to no knowledge of KubeVirt or Kubernetes is assumed.

Each section contains a short summary as well as a link to a separate document discussing the topic in more detail, with the intention that readers will be able to quickly get a high-level understanding of the various topics by reading this overview document and then dig further into the specific areas they're interested in.

## Architecture

### Goals

* KubeVirt aims to feel completely native to Kubernetes users
  * VMs should behave like containers whenever possible
  * There should be no features that are limited to VMs when it would make sense to implement them for containers too
* KubeVirt also aims to support all the workloads that traditional virtualization can handle
  * Windows support, device assignment etc.
are all fair game
* When these two goals clash, integration with Kubernetes usually wins

### Components

* KubeVirt is made up of various discrete components that interact with Kubernetes and the KVM user space
* The overall design is somewhat similar to that of libvirt, except with a much higher granularity and many of the tasks offloaded to Kubernetes
* Some of the components run at the cluster level or host level with very high privileges, others run at the pod level with significantly reduced privileges

Additional information: [Components][]

### Runtime environment

* QEMU expects its environment to be set up in advance, something that is typically taken care of by libvirt
* libvirtd, when not running in session mode, assumes that it has root-level access to the system and can perform pretty much any privileged operation
* In Kubernetes, the runtime environment is usually heavily locked down and many privileged operations are not permitted
* Requiring additional permissions for VMs goes against the goal, mentioned earlier, to have VMs behave the same as containers whenever possible

## Specific areas

### Hotplug

* QEMU supports hotplug (and hot-unplug) of most devices, and its use is extremely common
* Conversely, resources associated with containers such as storage volumes, network interfaces and CPU shares are allocated upfront and do not change throughout the life of the workload
* If the container needs more (or less) resources, the Kubernetes approach is to destroy the existing one and schedule a new one to take over its role

Additional information: [Hotplug][]

### Storage

* Handled through the same Kubernetes APIs used for containers
* QEMU / libvirt only see an image file and don't have direct access to the underlying storage implementation
* This makes certain scenarios that are common in the virtualization world very challenging: examples include hotplug and full VM snapshots (storage plus memory)
* It might be possible to remove some of these limitations
by changing the way storage is exposed to QEMU, or even take advantage of the storage technologies that QEMU already implements and make them available to containers in addition to VMs.

Additional information: [Storage][]

### Networking

* Application processes running in VMs are hidden behind a network interface, as opposed to local sockets and processes running in a separate user namespace
* Service meshes proxy and monitor applications by means of socket redirection and classification on local ports and process identifiers. We need to aim for generic compatibility
* Existing solutions replicate a full TCP/IP stack to pretend applications running in a QEMU instance are local. No chance for zero-copy or avoiding context switches
* Network connectivity is shared between the control plane and the workload itself. Addressing and port mapping need particular attention
* Linux capabilities granted to the pod might be minimal, or none at all. Live migration presents further challenges in terms of network addressing and port mapping

Additional information: [Networking][]

### Live migration

* QEMU supports live migration between hosts, usually coordinated by libvirt
* Kubernetes expects containers to be disposable, so the equivalent of live migration would be to simply destroy the ones running on the source node and schedule replacements on the destination node
* For KubeVirt, a hybrid approach is used: a new container is created on the target node, then the VM is migrated from the old container, running on the source node, to the newly-created one

Additional information: [Live migration][]

### CPU
[RFC DOCUMENT 00/12] kubevirt-and-kvm: Add documents
Hello there!

Several weeks ago, a group of Red Hatters working on the virtualization stack (primarily QEMU and libvirt) started a conversation with developers from the KubeVirt project with the goal of better understanding and documenting the interactions between the two. Specifically, we were interested in integration pain points, with the underlying ideas being that only once those issues are understood does it become possible to look for solutions, and that better communication would naturally lead to improvements on both sides.

This series of documents was born out of that conversation. We're sharing them with the QEMU and libvirt communities in the hope that they can be a valuable resource for understanding how the projects they're working on are consumed by higher-level tools, and what challenges are encountered in the process.

Note that, while the documents describe a number of potential directions for things like development of new components, that's all just brainstorming that naturally occurred as we were learning new things: the actual design process should, and will, happen on the upstream lists.

Right now the documents live in their own little git repository[1], but the expectation is that eventually they will find a suitable long-term home. The most likely candidate right now is the main KubeVirt repository, but if you have other locations in mind please do speak up!

I'm also aware of the fact that this delivery mechanism is fairly unconventional, but I thought it would be the best way to spark a discussion around these topics with the QEMU and libvirt developers.

Last but not least, please keep in mind that the documents are a work in progress, and polish has been applied to them unevenly: while the information presented is, to the best of our knowledge, all accurate, some parts are in a rougher state than others. Improvements will hopefully come over time - and if you feel like helping out in making that happen, it would certainly be appreciated!
Looking forward to your feedback :)

[1] https://gitlab.com/abologna/kubevirt-and-kvm

-- 
Andrea Bolognani / Red Hat / Virtualization
Re: Python 3.5 EOL; when can require 3.6?
On Wed, 2020-09-16 at 09:53 +0200, Philippe Mathieu-Daudé wrote:
> On 9/16/20 9:43 AM, Markus Armbruster wrote:
> > We require Python 3.5. It will reach its "end of life" at the end of
> > September 2020[*]. Any reason not to require 3.6 for 5.2?
> > qemu-iotests already does for its Python parts.
> 
> Not answering your question, but it would help to start a table
> of "oldest package released" versions, with our supported distributions
> as columns and package names as row.
> 
> This way when new distributions are released (and oldest dropped from
> our side) we can add/remove a column and see the oldest version we aim
> to support.

In case you're not already aware of it, https://repology.org/ is a very
valuable tool when it comes to figuring out minimum versions for
dependencies.

-- 
Andrea Bolognani / Red Hat / Virtualization
Re: [PATCH v2 2/2] GitLab Gating CI: initial set of jobs, documentation and scripts
On Thu, 2020-09-03 at 17:12 -0400, Cleber Rosa wrote: > On Thu, Jul 09, 2020 at 10:55:07AM +0200, Erik Skultety wrote: > > On Wed, Jul 08, 2020 at 10:46:57PM -0400, Cleber Rosa wrote: > > > +.. note:: there are currently limitations to gitlab-runner itself when > > > + setting up a service under FreeBSD systems. You will need to > > > + perform steps 4 to 10 manually, as described at > > > + https://docs.gitlab.com/runner/install/freebsd.html > > > > What kinds of limitations? Is it architecture constrained maybe? I'm asking > > because we have all of the steps covered by an Ansible playbook, so I kinda > > got > > triggered by the word "manually". Also, the document only mentions 9 steps > > overall. > > FreeBSD's "service management" (systemd/sys-v like) is not covered by > the GO library[1] used on gitlab-runner. It's not ideal, and the > second best solution would be to script the equivalent handling within > the playbook, but I remember trying and finding some inconsistencies. > Then, I had to give it up and defer to whenever a FreeBSD job is > actually added. > > [1] - https://github.com/ayufan/golang-kardianos-service Note that this is a fork of https://github.com/kardianos/service where FreeBSD support was added recently with https://github.com/kardianos/service/commit/14b2cc59a290407a6f1cb3daba59069429d9665b I'm not sure why gitlab-runner would use a fork rather than the primary repository, but perhaps they can be convinced to switch and gain better FreeBSD support in the process. -- Andrea Bolognani / Red Hat / Virtualization
[PATCH] schemas: Add vim modeline
The various schemas included in QEMU use a JSON-based format which is,
however, strictly speaking not valid JSON.

As a consequence, when vim tries to apply syntax highlight rules for
JSON (as guessed from the file name), the result is an unreadable mess
which mostly consists of red markers pointing out supposed errors in,
well, pretty much everything.

Using Python syntax highlighting produces much better results, and in
fact these files already start with specially-formatted comments that
instruct Emacs to process them as if they were Python files.

This commit adds the equivalent special comments for vim.

Signed-off-by: Andrea Bolognani
---
 docs/interop/firmware.json | 1 +
 docs/interop/vhost-user.json | 1 +
 qapi/authz.json | 1 +
 qapi/block-core.json | 1 +
 qapi/block.json | 1 +
 qapi/char.json | 1 +
 qapi/common.json | 1 +
 qapi/control.json | 1 +
 qapi/crypto.json | 1 +
 qapi/dump.json | 1 +
 qapi/error.json | 1 +
 qapi/introspect.json | 1 +
 qapi/job.json | 1 +
 qapi/machine-target.json | 1 +
 qapi/machine.json | 1 +
 qapi/migration.json | 1 +
 qapi/misc-target.json | 1 +
 qapi/misc.json | 1 +
 qapi/net.json | 1 +
 qapi/qapi-schema.json | 1 +
 qapi/qdev.json | 1 +
 qapi/qom.json | 1 +
 qapi/rdma.json | 1 +
 qapi/rocker.json | 1 +
 qapi/run-state.json | 1 +
 qapi/sockets.json | 1 +
 qapi/tpm.json | 1 +
 qapi/transaction.json | 1 +
 qapi/ui.json | 1 +
 qga/qapi-schema.json | 1 +
 storage-daemon/qapi/qapi-schema.json | 1 +
 tests/qapi-schema/doc-good.json | 2 ++
 tests/qapi-schema/include/sub-module.json | 1 +
 tests/qapi-schema/qapi-schema-test.json | 1 +
 tests/qapi-schema/sub-sub-module.json | 1 +
 35 files changed, 36 insertions(+)

diff --git a/docs/interop/firmware.json b/docs/interop/firmware.json
index 240f565397..989f10b626 100644
--- a/docs/interop/firmware.json
+++ b/docs/interop/firmware.json
@@ -1,4 +1,5 @@
 # -*- Mode: Python -*-
+# vim: filetype=python
 #
 # Copyright (C) 2018 Red Hat, Inc.
 #
diff --git a/docs/interop/vhost-user.json b/docs/interop/vhost-user.json
index ef8ac5941f..feb5fe58ca 100644
--- a/docs/interop/vhost-user.json
+++ b/docs/interop/vhost-user.json
@@ -1,4 +1,5 @@
 # -*- Mode: Python -*-
+# vim: filetype=python
 #
 # Copyright (C) 2018 Red Hat, Inc.
 #
diff --git a/qapi/authz.json b/qapi/authz.json
index 1c836a3abd..f3e9745426 100644
--- a/qapi/authz.json
+++ b/qapi/authz.json
@@ -1,4 +1,5 @@
 # -*- Mode: Python -*-
+# vim: filetype=python
 #
 # QAPI authz definitions
 
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 943df1926a..5f72b50149 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1,4 +1,5 @@
 # -*- Mode: Python -*-
+# vim: filetype=python
 
 ##
 # == Block core (VM unrelated)
diff --git a/qapi/block.json b/qapi/block.json
index 2ddbfa8306..c54a393cf3 100644
--- a/qapi/block.json
+++ b/qapi/block.json
@@ -1,4 +1,5 @@
 # -*- Mode: Python -*-
+# vim: filetype=python
 
 ##
 # = Block devices
diff --git a/qapi/char.json b/qapi/char.json
index daceb20f84..8aeedf96b2 100644
--- a/qapi/char.json
+++ b/qapi/char.json
@@ -1,4 +1,5 @@
 # -*- Mode: Python -*-
+# vim: filetype=python
 #
 
 ##
diff --git a/qapi/common.json b/qapi/common.json
index 7b9cbcd97b..716712d4b3 100644
--- a/qapi/common.json
+++ b/qapi/common.json
@@ -1,4 +1,5 @@
 # -*- Mode: Python -*-
+# vim: filetype=python
 
 ##
 # = Common data types
diff --git a/qapi/control.json b/qapi/control.json
index 6b816bb61f..de51e9916c 100644
--- a/qapi/control.json
+++ b/qapi/control.json
@@ -1,4 +1,5 @@
 # -*- Mode: Python -*-
+# vim: filetype=python
 #
 
 ##
diff --git a/qapi/crypto.json b/qapi/crypto.json
index b2a4cff683..c41e869e31 100644
--- a/qapi/crypto.json
+++ b/qapi/crypto.json
@@ -1,4 +1,5 @@
 # -*- Mode: Python -*-
+# vim: filetype=python
 #
 
 ##
diff --git a/qapi/dump.json b/qapi/dump.json
index a1eed7b15c..f7c4267e3f 100644
--- a/qapi/dump.json
+++ b/qapi/dump.json
@@ -1,4 +1,5 @@
 # -*- Mode: Python -*-
+# vim: filetype=python
 #
 # This work is licensed under the terms of the GNU GPL, version 2 or later.
 # See the COPYING file in the top-level directory.
diff --git a/qapi/error.json b/qapi/error.json
index 3fad08f506..94a6502de9 100644
--- a/qapi/error.json
+++ b/qapi/error.json
@@ -1,4 +1,5 @@
 # -*- Mode: Python -*-
+# vim: filetype=python
Re: [PATCH v2 2/2] GitLab Gating CI: initial set of jobs, documentation and scripts
On Thu, 2020-07-09 at 11:30 +0100, Daniel P. Berrangé wrote: > On Wed, Jul 08, 2020 at 10:46:57PM -0400, Cleber Rosa wrote: > > +- name: Installation of basic packages to build QEMU > > + hosts: all > > + vars_files: > > +- vars.yml > > + tasks: > > +- name: Install basic packages to build QEMU on Ubuntu 18.04/20.04 > > + apt: > > +update_cache: yes > > +# This matches the packages on > > tests/docker/Dockerfiles/ubuntu1804.docker > > I'd be inclined to actually use docker on the custom runners. > > eg. instead of having separate physical machines or VMs for each > (distro, arch) pair, have a single host distro for the arch. Then > use docker to provide the build environment against each distro. > > IOW, a RHEL-8 aarch64 host, running docker for ubuntu18.04, fedora30 > etc. > > That way we don't end up duplicating all these packages, and instead > can use tests/docker/Dockerfiles/ubuntu1804.docker. This ensures > that if a user needs to reproduce a build failure on their own local > aarch64 machine, they can run docker and get the exact same build > architecture. > > It also has the benefit that we don't need to worry about how to > setup gitlab runners for every distro we care about. We only need to > do gitlab runner for the standard host distro, which spawns a pristine > throwaway docker env. > > I appreciate this is a big change from what you've done in this patch > though, so don't consider this comment a blocker for initial merge. > I think we should do this as the long term strategy though. Essentially > for Linux builds, everything should always be container based. Agreed. You should be able to set up a fairly minimal environment, which consists of Docker, gitlab-runner and not much else, using a long-term supported distro such as CentOS and then just schedule whatever container build on it. No need to provision a new machine every time a new Fedora release comes out, just create a container image for it and add it to the mix. 
Additionally, the gitlab-runner Docker executor provides more isolation
than the shell executor, so running untrusted builds becomes a more
reasonable proposition - this is how the shared runners on gitlab.com
work - and you don't have to worry about your jobs cleaning up properly
after themselves nearly as much.

-- 
Andrea Bolognani / Red Hat / Virtualization
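To make the setup sketched above concrete: registering a runner that
uses the Docker executor is a single command. This is only an
illustrative sketch - the URL, token, description and default image
below are made-up placeholders, not values taken from this thread:

```
# Register a gitlab-runner that spawns a throwaway container per job.
# URL, token, description and image are placeholder values.
gitlab-runner register \
    --non-interactive \
    --url "https://gitlab.example.com/" \
    --registration-token "REGISTRATION_TOKEN" \
    --executor docker \
    --description "centos-docker-runner" \
    --docker-image "centos:8"
```

Each CI job then runs in a pristine container based on the requested
image, rather than directly on the host.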
Re: [PATCH 0/5] QEMU Gating CI
On Mon, 2020-04-27 at 01:24 -0400, Cleber Rosa wrote: > On Fri, 24 Apr 2020 08:57:46 +0200 > Erik Skultety wrote: > > On Thu, Apr 23, 2020 at 11:28:21PM +0200, Philippe Mathieu-Daudé > > wrote: > > > Thanks to insist with that point Daniel. I'd rather see every > > > configuration reproducible, so if we loose a hardware sponsor, we > > > can find another one and start another runner. > > > Also note, if it is not easy to reproduce a runner, it will be very > > > hard to debug a reported build/test error. > > > > (Thanks for bringing ^this point up Philippe) > > > > ...However, what we've been actively working on in libvirt is to > > extend the lcitool we have (which can spawn local test VMs) to the > > point where we're able to generate machines that would be the > > reproducible. Right now I'm playing with cloud-init integration with > > lcitool (patches coming soon) that would allow us to use the same > > machines locally as we'd want to have in, say, OpenStack and share > > them as compressed images, so even when updated/managed by lcitool > > locally, you'd get the same environment. > > This is great, and it aligns with my previous point that reproducibility > it's not *just* about the hardware, but about diligently documenting > and automating the environments, be them mundane or super specialized. > And IMO that should score some points when it comes to being > promoted/demoted as a Gating machine/job. I think there's room to extend and rework lcitool so that it can be used for QEMU CI as well, and we should definitely look into that. Right now it only really covers VMs and containers, but there's already one situation (FreeBSD) where the expectation is that you'd import an existing VM image and then apply CI-related customizations on top, so it's not too much of a stretch to imagine doing the same for a bare metal machine. We use Ansible, so as long as we can connect to the machine via ssh we're pretty much good to go. 
Installation of VMs we already perform in an unattended fashion using
preseed/kickstart, and it should be relatively straightforward to adapt
those configurations to also work on real hardware. This way we'd both
be able to rely on having a sane OS as the base, and relieve the
administrator from the duty of manually installing the machines.

-- 
Andrea Bolognani / Red Hat / Virtualization
[PATCH] qapi: Expand documentation for LostTickPolicy
The current documentation is fairly terse and not easy to decode for
someone who's not intimately familiar with the inner workings of timer
devices.

Expand on it by providing a somewhat verbose description of what
behavior each policy will result in, as seen from both the guest OS and
host point of view.

Signed-off-by: Andrea Bolognani
---
This information is reported pretty much word by word in
https://libvirt.org/formatdomain.html#elementsTime so I'm hoping I can
get the QEMU documentation updated and then just merge back the changes.

 qapi/misc.json | 34 +++---
 1 file changed, 23 insertions(+), 11 deletions(-)

diff --git a/qapi/misc.json b/qapi/misc.json
index 33b94e3589..cd7445d29f 100644
--- a/qapi/misc.json
+++ b/qapi/misc.json
@@ -163,17 +163,29 @@
 ##
 # @LostTickPolicy:
 #
-# Policy for handling lost ticks in timer devices.
-#
-# @discard: throw away the missed tick(s) and continue with future injection
-#           normally. Guest time may be delayed, unless the OS has explicit
-#           handling of lost ticks
-#
-# @delay: continue to deliver ticks at the normal rate. Guest time will be
-#         delayed due to the late tick
-#
-# @slew: deliver ticks at a higher rate to catch up with the missed tick. The
-#        guest time should not be delayed once catchup is complete.
+# Policy for handling lost ticks in timer devices. Ticks end up getting
+# lost when, for example, the guest is paused.
+#
+# @discard: throw away the missed ticks and continue with future injection
+#           normally. The guest OS will see the timer jump ahead by a
+#           potentially quite significant amount all at once, as if the
+#           intervening chunk of time had simply not existed; needless to
+#           say, such a sudden jump can easily confuse a guest OS which is
+#           not specifically prepared to deal with it. Assuming the guest
+#           OS can deal correctly with the time jump, the time in the guest
+#           and in the host should now match.
+#
+# @delay: continue to deliver ticks at the normal rate. The guest OS will
+#         not notice anything is amiss, as from its point of view time will
+#         have continued to flow normally. The time in the guest should now
+#         be behind the time in the host by exactly the amount of time during
+#         which ticks have been missed.
+#
+# @slew: deliver ticks at a higher rate to catch up with the missed ticks.
+#        The guest OS will not notice anything is amiss, as from its point
+#        of view time will have continued to flow normally. Once the timer
+#        has managed to catch up with all the missing ticks, the time in
+#        the guest and in the host should match.
 #
 # Since: 2.0
 ##
-- 
2.24.1
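As a usage illustration (not part of the patch itself): on the QEMU
command line the slew policy for the RTC is selected through the -rtc
option, and libvirt exposes the same set of policies through the
tickpolicy attribute described at the libvirt.org link above. A sketch,
assuming a typical x86 guest:

```
# QEMU: deliver RTC ticks at a higher rate to catch up (the @slew policy)
qemu-system-x86_64 -rtc base=utc,driftfix=slew ...

# libvirt equivalent in the domain XML:
#   <clock offset='utc'>
#     <timer name='rtc' tickpolicy='catchup'/>  <!-- maps to slew -->
#     <timer name='pit' tickpolicy='delay'/>
#   </clock>
```

Note that libvirt calls the slew policy "catchup"; delay and discard
keep the same names on both sides.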
Re: Making QEMU easier for management tools and applications
h its JSON-based API.

Obviously QEMU developers, for their own use, could still benefit from
having access to a user interface that doesn't require either rigging up
libvirt support or messing with JSON directly, and such an interface
could even look very similarly to the current CLI, but it could simply
not be made user-facing and thus not subject to any compatibility
concerns.

[1] https://lists.nongnu.org/archive/html/qemu-devel/2020-01/msg06034.html

-- 
Andrea Bolognani / Red Hat / Virtualization
Re: [RFC PATCH v3 0/6] target/arm/kvm: Adjust virtual time
On Mon, 2020-01-20 at 11:10 +0100, Andrew Jones wrote:
> v3:
>  - Added a target/arm/kvm_arm.h comment cleanup patch (1/6)
>  - Minor refactoring of assert_has_feature_enabled/disabled in 4/6,
>    kept Richard's r-b.
>  - Rewrote kvm-no-adjvtime documentation in 6/6.
>  - Reworked approach in 5/6 to properly deal with migration and to
>    track running vs. !running, rather than running vs. paused states.

Probably too late since Peter already queued the series, but FWIW

Tested-by: Andrea Bolognani

-- 
Andrea Bolognani / Red Hat / Virtualization
Re: [PATCH qemu v5] spapr: Kill SLOF
On Thu, 2020-01-23 at 16:11 +1100, David Gibson wrote:
> On Wed, Jan 22, 2020 at 06:14:37PM +1100, Alexey Kardashevskiy wrote:
> > On 22/01/2020 17:32, David Gibson wrote:
> > > I'm not thinking of "grub" as a separate option - that would be the
> > > same as "vof". Using vof + no -kernel we'd need to scan the disks in
> > > the same way SLOF does, and look for a boot partition, which will
> > > probably contain a GRUB image.
> > 
> > I was hoping we can avoid that by allowing
> > "-kernel grub" and let grub do filesystems and MBR/GPT.
> 
> I don't want that to be the only way, because I want the GRUB
> installed by the OS installer to be the GRUB we use.

Agreed, the bootloader and the kernel should live inside the guest image
and not on the host's filesystem.

-- 
Andrea Bolognani / Red Hat / Virtualization
Re: [RFC PATCH qemu] spapr: Kill SLOF
On Wed, 2020-01-08 at 13:34 +1100, Alexey Kardashevskiy wrote:
> On 07/01/2020 20:39, Andrea Bolognani wrote:
> > On Tue, 2020-01-07 at 12:55 +1100, Alexey Kardashevskiy wrote:
> > > Petitboot kernel+initramdisk almost replaces SLOF + GRUB.
> > 
> > Is this necessarily a good thing?
> 
> The bare metal host and the powernv machine in QEMU do not use grub,
> they use petitboot which parses all grub configs and supports quite a lot.

How well does the distro integration work? Eg. if I change something in
/etc/default/grub and then run grub2-mkconfig, can I expect my changes
to be picked up? In which scenarios will that *not* work?

> Using Linux for a boot loader is not powerpc-only thing really, some
> folks do this too (forgot who, just heard this at the KVM forum).

While other options are available and some architectures use something
else entirely, GRUB is the de-facto standard across most of the
non-obscure architectures. I guess the question is whether it's more
important to be consistent within the architecture or across them.

I think the latter might be preferable, especially when we consider what
I think is the most common scenario, that is, someone who's used to
having GRUB on their x86 machine running a ppc64 guest on the cloud.
The more skills they can automatically transfer over, the better.

> > Personally I quite like the fact
> > that I can use the same bootloader across x86, ppc64 and aarch64.
> 
> I am not suggesting removing SLOF soon

Perhaps the patch subject shouldn't be "kill SLOF" then? ;)

-- 
Andrea Bolognani / Red Hat / Virtualization
Re: [RFC PATCH qemu] spapr: Kill SLOF
On Tue, 2020-01-07 at 12:55 +1100, Alexey Kardashevskiy wrote:
> On 07/01/2020 01:15, Daniel Henrique Barboza wrote:
> > Question: does Petitboot already replaces SLOF in every possible
> > scenario for all the spapr machine features?
> 
> Petitboot kernel+initramdisk almost replaces SLOF + GRUB.

Is this necessarily a good thing? Personally I quite like the fact that
I can use the same bootloader across x86, ppc64 and aarch64.

-- 
Andrea Bolognani / Red Hat / Virtualization
Re: [PATCH v1 0/5] target/arm/kvm: Provide an option to adjust virtual time
On Tue, 2019-12-10 at 17:08 +0100, Andrew Jones wrote:
> I don't have a strong enough opinion about kvm-adjvtime vs.
> kvm-no-adjvtime to insist one way or another. I agree double inversions
> are easier to mess up, but I also like the way the '-no-' better
> communicates that the default is [probably] 'yes'.
> 
> All interested parties, please vote. I'll be sending v2 soon and I can
> call this thing anything the majority (or the dominate minority) prefer.

I like kvm-adjvtime better because it avoids the double negative, but
on the other hand if the new default is to adjust the time then I don't
expect many people will actually need to use the parameter, so the name
doesn't matter that much after all :)

-- 
Andrea Bolognani / Red Hat / Virtualization
Re: [PATCH v1 0/5] target/arm/kvm: Provide an option to adjust virtual time
On Tue, 2019-12-10 at 15:33 +0100, Andrew Jones wrote:
> On Tue, Dec 10, 2019 at 03:21:02PM +0100, Andrea Bolognani wrote:
> > I agree with everything except the naming: why would
> > 
> >   kvm-no-adjvtime=off  vtime is adjusted (default)
> >   kvm-no-adjvtime=on   vtime is not adjusted
> > 
> > be better than
> > 
> >   kvm-adjvtime=on   vtime is adjusted (default)
> >   kvm-adjvtime=off  vtime is not adjusted
> > 
> > ? Both offer the exact same amount of flexibility, but the latter has
> > the advantage of not introducing any unwieldy double negatives.
> 
> A default of 'off' == 0 means not setting anything at all. There's
> already precedent for 'kvm-no*' prefixed cpu features,
> 
>   kvm-no-smi-migration
>   kvm-nopiodelay

Sorry, I'm not sure I understand. Do you mean that the array where you
store CPU features is 0-initialized, so it's more convenient to have
off (0) as the default because it means you don't have to touch it
beforehand? Or something different?

-- 
Andrea Bolognani / Red Hat / Virtualization
Re: [PATCH v1 0/5] target/arm/kvm: Provide an option to adjust virtual time
On Tue, 2019-12-10 at 14:32 +0100, Andrew Jones wrote:
> After having done this git mining, it looks more and more like we should
> at least consider naming this feature 'kvm-no-adjvtime' and probably
> also change arm's default.

I agree with everything except the naming: why would

  kvm-no-adjvtime=off  vtime is adjusted (default)
  kvm-no-adjvtime=on   vtime is not adjusted

be better than

  kvm-adjvtime=on   vtime is adjusted (default)
  kvm-adjvtime=off  vtime is not adjusted

? Both offer the exact same amount of flexibility, but the latter has
the advantage of not introducing any unwieldy double negatives.

-- 
Andrea Bolognani / Red Hat / Virtualization
Re: [PATCH v1 0/5] target/arm/kvm: Provide an option to adjust virtual time
On Fri, 2019-12-06 at 16:53 +0100, Andrew Jones wrote:
> On Fri, Dec 06, 2019 at 03:22:58PM +, Peter Maydell wrote:
> > On Wed, 16 Oct 2019 at 15:34, Andrew Jones wrote:
> > > This series is inspired by a series[1] posted by Bijan Mottahedeh about
> > > a year ago. The problem described in the cover letter of [1] is easily
> > > reproducible and some users would like to have the option to avoid it.
> > > However the solution, which is to adjust the virtual counter offset each
> > > time the VM transitions to the running state, introduces a different
> > > problem, which is that the virtual and physical counters diverge. As
> > > described in the cover letter of [1] this divergence is easily observed
> > > when comparing the output of `date` and `hwclock` after suspending the
> > > guest, waiting a while, and then resuming it. Because this different
> > > problem may actually be worse for some users, unlike [1], the series
> > > posted here makes the virtual counter offset adjustment optional and not
> > > even enabled by default. Besides the adjustment being optional, this
> > > series approaches the needed changes differently to apply them in more
> > > appropriate locations. Finally, unlike [1], this series doesn't attempt
> > > to measure "pause time" itself. Simply using QEMU_CLOCK_VIRTUAL, which
> > > only ticks when the VM is not stopped, is sufficient.
> > 
> > So I guess my overall question is "what is the x86 solution to
> > this problem, and why is this all arm-specific?" It would also
> 
> x86 adjusts the counter offset by default, and I don't think there's any
> way to turn that behavior off. I think it's too late to follow that
> default for arm, but this series provides a way to opt into the same
> behavior.

My understanding is that turning kvm-adjvtime either on or off results
in a different set of advantages and drawbacks, with neither being a
one-size-fits-all solution. So it's good that we offer a way for the
user to pick one or the other based on their specific needs.

The main difference is that kvm-adjvtime=on matches x86's behavior,
which is fairly well understood and thoroughly documented, along with
the corresponding workarounds, dos and don'ts.

With that in mind, in my opinion it would make sense to make
kvm-adjvtime=on the behavior for newer machine types so that people
coming from an x86 background and/or having to manage both at the same
time will get a consistent experience and will be able to draw, even for
aarch64, on the considerable amount of existing x86-centric literature
on the subject.

-- 
Andrea Bolognani / Red Hat / Virtualization
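For concreteness, usage from the QEMU command line would look something
along these lines - a sketch based on the kvm-adjvtime naming used in
v1 of the series (the name was still under discussion at this point in
the thread, so treat the exact spelling as an assumption):

```
# Default proposed in this series: the virtual counter keeps counting
# across pauses, so guest time falls behind host time while stopped
qemu-system-aarch64 -machine virt -cpu host ...

# Opt into x86-like behavior: adjust the virtual counter offset when
# the VM resumes, so guest time stays in sync with host time
qemu-system-aarch64 -machine virt -cpu host,kvm-adjvtime=on ...
```

Either way the trade-off described above applies: with adjustment the
virtual and physical counters diverge, without it guest time lags.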
Re: [PATCH-for-5.0] roms/edk2-funcs.sh: Use available GCC for ARM/Aarch64 targets
On Thu, 2019-12-05 at 17:27 +0100, Philippe Mathieu-Daudé wrote:
> On 12/5/19 5:13 PM, Laszlo Ersek wrote:
> > If that rules out CentOS 7 as a QEMU project build / CI platform for the
> > bundled ArmVirtQemu binaries, then we need a more recent platform
> > (perhaps CentOS 8, not sure).
> 
> Unfortunately CentOS 8 is not available as a Docker image, which is a
> convenient way to build EDK2 in a CI.

Uh? https://hub.docker.com/_/centos seems to disagree with you. Is the
'centos:8' image not suitable for some non-obvious reason?

-- 
Andrea Bolognani / Red Hat / Virtualization
Re: [Qemu-devel] git.qemu.org gitweb misrenders on git/ URLs
On Thu, 2019-08-29 at 16:48 +0100, Stefan Hajnoczi wrote: > Hi Jeff, > Philippe noticed that the git HTTPS clone URL > https://git.qemu.org/git/libslirp.git renders a gitweb page that looks > right but has broken links. The correct gitweb URL listed on > https://git.qemu.org/ is https://git.qemu.org/?p=libslirp.git;a=summary, > but there's a chance that people will open the HTTPS clone URL in their > browser and expect to see gitweb working. > > Is it possible to tweak the Apache configuration so that > https://git.qemu.org/git/libslirp.git[/] redirects to the working gitweb > URL? > > The tricky part is not breaking HTTPS git clone, which accesses URLs > below https://git.qemu.org/git/libslirp.git/ :). I know that's not quite the answer to your question, but if you look for example at https://git.zx2c4.com/cgit you'll see that the same URL can be used both for viewing with a browser *and* cloning. Basically with cgit all requests go through the CGI script, and an advantage of that is that you don't even need to call git update-server-info to make the repository accessible via HTTPs. It's also pretty fast and extremely easy to setup. Maybe consider switching from gitweb to it? -- Andrea Bolognani / Red Hat / Virtualization
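To give an idea of how little setup cgit needs for the scenario
described above, a minimal /etc/cgitrc might look something like this -
an illustrative sketch only, with invented paths, not a recommendation
for the qemu.org deployment:

```
# /etc/cgitrc -- minimal cgit configuration (illustrative values)

# Cache rendered pages to keep page loads fast
cache-size=1000

# Serve `git clone` over the same URLs used for browsing
enable-http-clone=1

# Automatically pick up every repository found under this directory
scan-path=/srv/git
```

With this in place the web server only needs a single rule routing
requests to the cgit CGI binary; no per-repository configuration or
`git update-server-info` hooks are required.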
Re: [Qemu-devel] [PATCH v3 00/15] target/arm/kvm: enable SVE in guests
On Tue, 2019-08-20 at 09:53 +0200, Andrew Jones wrote: > On Tue, Aug 20, 2019 at 06:08:04AM +, Zhang, Lei wrote: > > Hi Andrew, > > > > I have tested your patch on kernel- 5.2.7 + QEMU (4.0.94 + patch). > > Thanks for the testing! I guess it's time for me to get back to this > series and spin a v4 (so we can test some more :-) > > > This patch series works fine for my tests when use qemu-system-aarch64 > > directly. > > But I can't startup kvm when I use virsh[1]. > > > > Command I executed. > > # virsh start test1 > > > > The error message is [internal error: CPU features not supported by > > hypervisor for aarch64 architecture.] > > Do you have any ideas for this error? > > I've CC'ed Andrea. I've specifically patched out that check in my series... Are you sure you're using the modified libvirt version, and that your guest is configured to use the modified QEMU binary? Anyway, once v4 has been posted I'll respin the libvirt series as well, since in the meantime conflicts have popped up and it no longer applies cleanly on top of master. -- Andrea Bolognani / Red Hat / Virtualization
Re: [Qemu-devel] [PATCH 2/4] configure: Avoid using libssh deprecated API
On Wed, 2019-08-14 at 17:14 +0200, Philippe Mathieu-Daudé wrote: > On 8/14/19 4:51 PM, Andrea Bolognani wrote: > > On Wed, 2019-08-14 at 16:15 +0200, Philippe Mathieu-Daudé wrote: > > > On 8/14/19 3:27 PM, Andrea Bolognani wrote: > > > > On Wed, 2019-08-14 at 14:15 +0200, Philippe Mathieu-Daudé wrote: > > > > > Suggested-by: Andrea Bolognani > > > > > Signed-off-by: Philippe Mathieu-Daudé > > > > > --- > > > > > block/ssh.c | 2 +- > > > > > configure | 7 +++ > > > > > 2 files changed, 8 insertions(+), 1 deletion(-) > > > > > > > > Did I really suggest this? I have no recollection of doing so, or > > > > even getting involved with libssh support in QEMU at all for that > > > > matter. > > > > > > I took this suggestion from > > > https://www.redhat.com/archives/libvir-list/2018-May/msg00597.html > > > > I see :) > > > > I feel like adding a Suggested-by because of something that was > > posted on an unrelated project's mailing list is stretching the > > definition of the tag a bit, so if you end up having to respin I > > think it would be reasonable to drop it, but honestly it's not a > > big deal either way: I was just curious. > > Understood, sorry. Nothing to apologize for! :) -- Andrea Bolognani / Red Hat / Virtualization
Re: [Qemu-devel] [PATCH 2/4] configure: Avoid using libssh deprecated API
On Wed, 2019-08-14 at 16:15 +0200, Philippe Mathieu-Daudé wrote: > On 8/14/19 3:27 PM, Andrea Bolognani wrote: > > On Wed, 2019-08-14 at 14:15 +0200, Philippe Mathieu-Daudé wrote: > > > The libssh packaged by a distribution can predate version 0.8, > > > but still provides the newer API introduced after version 0.7. > > > > > > Using the deprecated API leads to build failure, as on Ubuntu 18.04: > > > > > > CC block/ssh.o > > > block/ssh.c: In function 'check_host_key_hash': > > > block/ssh.c:444:5: error: 'ssh_get_publickey' is deprecated > > > [-Werror=deprecated-declarations] > > >r = ssh_get_publickey(s->session, ); > > >^ > > > In file included from block/ssh.c:27:0: > > > /usr/include/libssh/libssh.h:489:31: note: declared here > > >SSH_DEPRECATED LIBSSH_API int ssh_get_publickey(ssh_session session, > > > ssh_key *key); > > > ^ > > > rules.mak:69: recipe for target 'block/ssh.o' failed > > > make: *** [block/ssh.o] Error 1 > > > > > > Fix by using the newer API if available. > > > > > > Suggested-by: Andrea Bolognani > > > Signed-off-by: Philippe Mathieu-Daudé > > > --- > > > block/ssh.c | 2 +- > > > configure | 7 +++ > > > 2 files changed, 8 insertions(+), 1 deletion(-) > > > > Did I really suggest this? I have no recollection of doing so, or > > even getting involved with libssh support in QEMU at all for that > > matter. > > I took this suggestion from > https://www.redhat.com/archives/libvir-list/2018-May/msg00597.html I see :) I feel like adding a Suggested-by because of something that was posted on an unrelated project's mailing list is stretching the definition of the tag a bit, so if you end up having to respin I think it would be reasonable to drop it, but honestly it's not a big deal either way: I was just curious. -- Andrea Bolognani / Red Hat / Virtualization
Re: [Qemu-devel] [PATCH 2/4] configure: Avoid using libssh deprecated API
On Wed, 2019-08-14 at 14:15 +0200, Philippe Mathieu-Daudé wrote: > The libssh packaged by a distribution can predate version 0.8, > but still provides the newer API introduced after version 0.7. > > Using the deprecated API leads to build failure, as on Ubuntu 18.04: > > CC block/ssh.o > block/ssh.c: In function 'check_host_key_hash': > block/ssh.c:444:5: error: 'ssh_get_publickey' is deprecated > [-Werror=deprecated-declarations] >r = ssh_get_publickey(s->session, ); >^ > In file included from block/ssh.c:27:0: > /usr/include/libssh/libssh.h:489:31: note: declared here >SSH_DEPRECATED LIBSSH_API int ssh_get_publickey(ssh_session session, > ssh_key *key); > ^ > rules.mak:69: recipe for target 'block/ssh.o' failed > make: *** [block/ssh.o] Error 1 > > Fix by using the newer API if available. > > Suggested-by: Andrea Bolognani > Signed-off-by: Philippe Mathieu-Daudé > --- > block/ssh.c | 2 +- > configure | 7 +++ > 2 files changed, 8 insertions(+), 1 deletion(-) Did I really suggest this? I have no recollection of doing so, or even getting involved with libssh support in QEMU at all for that matter. -- Andrea Bolognani / Red Hat / Virtualization
Re: [Qemu-devel] [RFC] virtio-mmio: implement modern (v2) personality (virtio-1)
On Tue, 2019-07-30 at 13:35 +0200, Cornelia Huck wrote: > On Tue, 30 Jul 2019 12:25:30 +0200 > Andrea Bolognani wrote: > > Can you please make sure virtio-mmio uses the existing interface > > instead of introducing a new one? > > FWIW, I really hate virtio-pci's disable-modern/disable-legacy... for a > starter, what is 'modern'? Will we have 'ultra-modern' in the future? AIUI the modern/legacy terminology is part of the VirtIO spec, so while I agree that it's not necessarily the least prone to ambiguity at least it's well defined. > It is also quite backwards with the 'disable' terminology. That's also true. I never claimed the way virtio-pci does it is perfect! > We also have a different mechanism for virtio-ccw ('max_revision', > which covers a bit more than virtio-1; it doesn't have a 'min_revision', > as negotiating the revision down is fine), so I don't see why > virtio-mmio should replicate the virtio-pci mechanism. > > Also, IIUC, virtio-mmio does not have transitional devices, but either > version 1 (legacy) or version 2 (virtio-1). It probably makes more > sense to expose the device version instead; either as an exact version > (especially if it isn't supposed to go up without incompatible > changes), or with some min/max concept (where version 1 would stand a > bit alone, so that would probably be a bit awkward.) I think that if reinventing the wheel is generally agreed not to be a good idea, then it stands to reason that reinventing it twice can only be described as absolute madness :) We should have a single way to control the VirtIO protocol version that works for all VirtIO devices, regardless of transport. We might even want to have virtio-*-{device,ccw}-non-transitional to mirror the existing virtio-*-pci-non-transitional. FWIW, libvirt already implements support for (non)-transitional virtio-pci devices using either the dedicated devices or the base virtio-pci plus the disable-{modern,legacy} attributes. 
--
Andrea Bolognani / Red Hat / Virtualization
Re: [Qemu-devel] [RFC] virtio-mmio: implement modern (v2) personality (virtio-1)
On Mon, 2019-07-29 at 14:57 +0200, Sergio Lopez wrote:
[...]
> /* virtio-mmio device */
>
> static Property virtio_mmio_properties[] = {
>     DEFINE_PROP_BOOL("format_transport_address", VirtIOMMIOProxy,
>                      format_transport_address, true),
> +   DEFINE_PROP_BOOL("modern", VirtIOMMIOProxy, modern, false),
>     DEFINE_PROP_END_OF_LIST(),
> };

Not a QEMU developer so forgive me if I say something silly, but IIUC
you'd be able to opt into the new feature by using eg.

  -device virtio-net-device,modern=on

However, virtio-pci devices already have a mechanism to control the
VirtIO protocol version, where you use

  -device virtio-net-pci,disable-modern=no,disable-legacy=yes

to get a VirtIO 1.x-only device and

  -device virtio-net-pci,disable-modern=no,disable-legacy=no

for a transitional device. Can you please make sure virtio-mmio uses
the existing interface instead of introducing a new one?

--
Andrea Bolognani / Red Hat / Virtualization
Re: [Qemu-devel] [Qemu-riscv] [RFC v1 0/5] RISC-V: Add firmware loading support and default
On Fri, 2019-06-21 at 14:35 +0200, Andrea Bolognani wrote:
> On Thu, 2019-06-20 at 21:43 +0300, David Abdurachmanov wrote:
> > On Thu, Jun 20, 2019 at 9:18 PM Alistair Francis wrote:
> > > OE-Core already packages OpenSBI by default, Fedora and Debian are
> > > moving to OpenSBI for RISC-V targets as well.
> > >
> > > Any distro that supports the RISC-V toolchain (which is all
> > > upstreamed) can build OpenSBI.
> >
> > Fedora uses OpenSBI for the last 2 or 3 months now. I don't plan to
> > update BBL builds. OpenSBI packages in Fedora/RISCV isn't finalized,
> > but it does ship *.elf and *.bin files.
>
> Sounds good to me, thanks for confirming!

From further off-list discussion with David, I have learned that
recent Fedora images include an OpenSBI build with embedded U-Boot
payload, such that you only need to have that single file on the host
and pass it to QEMU via -kernel[1] for RISC-V guest boot to work. I
played with it over the past few days, and it works very nicely.

I think this is the result that we want to ultimately reach: a single
RISC-V "firmware" binary installed on the host through an appropriate
distro package, shared among guests, with everything else that is
guest-specific being contained in the corresponding disk image.

This is what other architectures are already doing, with SeaBIOS and
OVMF on x86_64, AAVMF on aarch64 and SLOF on ppc64 all being handled
this way: RISC-V should, where it makes sense, follow suit.

QEMU also recently introduced a JSON-based specification that can be
used to advertise guest firmwares and libvirt already supports it,
which makes firmware configuration either extremely convenient or
entirely automatic for the user: the OpenSBI support should also be
advertised this way.

[1] I guess that'd be -bios after these patches?

--
Andrea Bolognani / Red Hat / Virtualization
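For illustration, here is roughly what advertising such a build under that JSON firmware-metadata format might look like, expressed as the Python dictionary a packaging script could serialize. This is a sketch based on my reading of QEMU's docs/interop/firmware.json schema: the file path, interface type and mapping device below are assumptions for illustration, not values taken from an actual descriptor.

```python
import json

# Hypothetical descriptor for an OpenSBI build with embedded U-Boot
# payload.  Every concrete value here is an assumption; consult
# docs/interop/firmware.json for the exact interface types and
# mapping devices the schema actually defines.
descriptor = {
    "description": "OpenSBI with embedded U-Boot payload for riscv64",
    "interface-types": ["uboot"],  # assumed name, not from the schema
    "mapping": {
        "device": "memory",  # loaded as a single blob, not as flash
        "filename": "/usr/share/opensbi/opensbi-uboot-riscv64.bin",
    },
    "targets": [
        {"architecture": "riscv64", "machines": ["virt"]},
    ],
    "features": [],
    "tags": [],
}

wire = json.dumps(descriptor, indent=2)
```

A management application would enumerate such descriptors from the firmware directories, filter on the target architecture and machine, and pass the selected filename to QEMU, with no per-guest firmware configuration needed.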
Re: [Qemu-devel] [Qemu-arm] [RFC v2 PATCH] hw/arm/virt: makes virt a default machine type
On Sat, 2019-06-22 at 16:58 +0100, Peter Maydell wrote:
> On Fri, 21 Jun 2019 at 20:04, Cleber Rosa wrote:
> > You can consider me biased (I do consider myself), but trying to wear
> > the hat of a user first interacting with QEMU, I would expect a (any)
> > reasonably capable environment that can represent the given target.
> > That will probably be a different environment than the one I may need,
> > and I think that's fine.
>
> I'm really not sure what you're trying to suggest here; maybe
> you could clarify? If you specify a target (ie a machine type),
> you get that machine type. If you don't specify a target, then
> we can't really guess what you were hoping to run and
> magically pick something that works.
>
> The main problem here is that users expect "all the world is a PC"
> type behaviour, ie they can just provide qemu-system-arm or
> qemu-system-aarch64 with no command line arguments except
> a guest kernel (which is half the time something they found under
> a rock or extracted from some firmware image) or a guest CDROM
> image and have it boot, because that generally works for x86. It
> doesn't and can't work for Arm, because of the much greater
> diversity of machine types and the way that kernels are often
> only compiled to work on a specific subset of machines.
> Making the user specify a machine type means they do at least
> get prompted that the world is more complicated than they
> think it is and there are decisions that have to be made.
>
> In any case even if we did default to "virt" the user still
> has to specify a CPU type, may well also want to provide
> a GIC version (gicv3 being better than the default v2),
> likely more RAM than the very small default, they need to provide
> all the virtio devices, and so on and so on. So giving
> them one option they no longer need to specify doesn't
> really make it any easier IMHO.
Additional note on GIC: most server-grade machines you can buy today
do *not* support GICv2, so you will need to opt-in to GICv3 if you
want your guest to even start.

More generally, as someone who has worked on supporting non-x86 guests
in libvirt for the past few years, I can tell you from experience that
you're always going to need some arch-specific logic to deal with the
small (and not so small :) differences in behavior between QEMU
targets: as Peter correctly says, machine type is just a single
example among many.

--
Andrea Bolognani / Red Hat / Virtualization
Re: [Qemu-devel] [Qemu-riscv] [RFC v1 0/5] RISC-V: Add firmware loading support and default
On Thu, 2019-06-20 at 21:43 +0300, David Abdurachmanov wrote:
> On Thu, Jun 20, 2019 at 9:18 PM Alistair Francis wrote:
> > On Thu, Jun 20, 2019 at 1:16 AM Andrea Bolognani wrote:
> > > but one question comes to mind: once this is in, we will probably
> > > want to have OpenSBI packaged separately in distributions, the same
> > > way it already happens for SeaBIOS, SLOF and edk2-based firmwares.
> > >
> > > Will using either of the formats prevent that from happening?
> >
> > Both options allow this.
> >
> > OE-Core already packages OpenSBI by default, Fedora and Debian are
> > moving to OpenSBI for RISC-V targets as well.
> >
> > Any distro that supports the RISC-V toolchain (which is all
> > upstreamed) can build OpenSBI.
>
> Fedora uses OpenSBI for the last 2 or 3 months now. I don't plan to
> update BBL builds. OpenSBI packages in Fedora/RISCV isn't finalized,
> but it does ship *.elf and *.bin files.

Sounds good to me, thanks for confirming!

--
Andrea Bolognani / Red Hat / Virtualization
Re: [Qemu-devel] [Qemu-riscv] [RFC v1 0/5] RISC-V: Add firmware loading support and default
On Wed, 2019-06-19 at 11:23 -0700, Alistair Francis wrote:
> On Wed, Jun 19, 2019 at 7:42 AM Bin Meng wrote:
> > On Wed, Jun 19, 2019 at 10:30 PM Alistair Francis wrote:
> > > On Wed, Jun 19, 2019 at 7:26 AM Bin Meng wrote:
> > > > > pc-bios/opensbi-riscv32-fw_jump.elf | Bin 0 -> 197988 bytes
> > > > > pc-bios/opensbi-riscv64-fw_jump.elf | Bin 0 -> 200192 bytes
> > > >
> > > > Since we are considering adding "bios" images, I prefer to add the
> > > > pure binary images instead of ELF images here.
> > >
> > > I didn't think about that. Can we just boot them in QEMU like we do
> > > with the ELFs?
> >
> > Yes, use load_image_targphys() instead of load_elf().
>
> Ah, that is obvious. I'll update it to use the bin files then.

I'm unclear on the advantages of using one format over the other, but
one question comes to mind: once this is in, we will probably want to
have OpenSBI packaged separately in distributions, the same way it
already happens for SeaBIOS, SLOF and edk2-based firmwares.

Will using either of the formats prevent that from happening?

--
Andrea Bolognani / Red Hat / Virtualization
Re: [Qemu-devel] How do we do user input bitmap properties?
On Fri, 2019-05-24 at 15:24 -0300, Eduardo Habkost wrote:
> On Thu, May 23, 2019 at 10:35:24AM +0200, Andrea Bolognani wrote:
> > [...] the above looks good to
> > me as a general direction, but note that you'll have to implement at
> > the very least the query-cpu-model-expansion QMP command for the
> > introspection to work.
>
> Why is query-cpu-model-expansion needed? Isn't
> device-list-properties enough?

Good question. I'll have to check with Jirka, but from playing with
both commands it looks like the latter returns a superset of what the
former does, so for the purpose of figuring out which vector lengths
the QEMU binary recognizes it should be good enough; I suspect,
however, that query-cpu-model-expansion might be (made to be) smarter
and for example not report vector lengths that the underlying hardware
doesn't support, which would be valuable for the purpose of user
friendly error reporting and allowing applications to decide which
vector lengths to request when creating guests.

I'll try to come back with more than theories soon :)

--
Andrea Bolognani / Red Hat / Virtualization
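As a concrete reference for the command under discussion, a client would issue something like the following over QMP. The request shape follows the query-cpu-model-expansion QAPI documentation; the choice of the "host" model and "full" expansion type here is just illustrative:

```python
import json

# Build the QMP request a client library might send to expand the
# "host" CPU model into its full set of properties.  "type" selects
# how aggressively the model is expanded ("static" or "full").
request = {
    "execute": "query-cpu-model-expansion",
    "arguments": {
        "type": "full",
        "model": {"name": "host"},
    },
}

wire = json.dumps(request)
```

The reply carries an expanded model with a property dictionary, which is where per-vector-length properties could be reported, or omitted when the underlying hardware lacks them.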
Re: [Qemu-devel] How do we do user input bitmap properties?
On Wed, 2019-05-15 at 13:54 +0200, Andrew Jones wrote:
> On Wed, May 15, 2019 at 12:52:29PM +0200, Igor Mammedov wrote:
> > since using magic numbers is not very descriptive
> > (but if there is some spec where they come from that we could point
> > users to it might be acceptable too, but I'd reserve number approach
> > for values only).
>
> The numbers aren't magic, they're part of the name. '1' in the above
> 'sve1' means one quadword. It would probably have been better to use
> bits instead in the example, i.e.
>
>   -cpu host,sve128=on,sve256=on,sve384=off,sve512=on
>
> where it's now clear that "sve512" has an analogy with x86's "avx512".
> [...]
>
> So I set off to convince Igor of the wide word idea (he sits next to me,
> so I didn't have go far), but he has convinced me of the above property
> idea. He used the magic phrase: "less code would be needed". If we use
> the properties like above then we get introspection for free (cpu
> property listing which libvirt already knows how to do) - so no QMP
> query needed. The cost is adding several properties (16 to handle the
> current 2048-bit limit), but I guess that's cheap enough. The command
> line is verbose, but also much easier for a human to construct and
> read. I'm pretty sold on this path, but adding Andrea and Eduardo for
> their input as well.

Sorry for taking a while to respond. Anyway, the above looks good to
me as a general direction, but note that you'll have to implement at
the very least the query-cpu-model-expansion QMP command for the
introspection to work.

query-cpu-model-baseline and query-cpu-model-comparison are two more
QMP commands which, while perhaps not immediately applicable, we will
want to implement at some point; more in general, what s390x is doing
with respect to CPU models is a good blueprint, according to the
libvirt developer who's the most involved with that specific area of
the project.

--
Andrea Bolognani / Red Hat / Virtualization
Re: [Qemu-devel] [PATCH 00/13] target/arm/kvm: enable SVE in guests
On Wed, 2019-05-15 at 12:14 +0100, Dave Martin wrote:
> On Wed, May 15, 2019 at 09:03:58AM +0100, Andrea Bolognani wrote:
> > On Tue, 2019-05-14 at 13:14 -0700, Richard Henderson wrote:
> > > Why is =4 less user-friendly than =512?
> > >
> > > I don't actually see "total bits in vector" as more user-friendly
> > > than "number of quadwords" when it comes to non-powers-of-2 like
> > > =7 vs =896 or =13 vs =1664.
> >
> > I would wager most people are intimately familiar with bits, bytes
> > and multiples due to having to work with them daily. Quadwords, not
> > so much.
>
> Generally I tend to agree. For kvmtool I leaned torward quadwords
> purely because
>
>   16,32,48,64,80,96,112,128,144,160,176,192,208
>
> is a big pain to type compared with
>
>   1,2,3,4,5,6,7,8,9,10,11,12,13
>
> Even though I prefer to specify vector lengths in bytes everywhere else
> in the Linux user API (precisely to avoid the confusion you object to).
>
> This isn't finalised yet for kvmtool -- I need to rework the patches
> and may not include it at all initially: kvmtool doesn't support
> migration, which is the main usecase for being able to specify an exact
> set of vector lengths AFAICT.
>
> Since this is otherwise only useful for migration, experimentation or
> machine-driven configuration, a bitmask
>
>   0x1fff
>
> as some have suggested may well be a pragmatic alternative for kvmtool.

Just to be clear, I have suggested using bits (or bytes or megabytes
depending on the exact value) only for the command-line-user-oriented
sve-vl-max option, which would take a single value.

For interoperation with the management layer, on the other hand, using
a bitmap is perfectly fine, and whether the values encoded within are
expressed in quadwords or whatever other format is largely irrelevant,
so long as it's properly documented of course.

--
Andrea Bolognani / Red Hat / Virtualization
Re: [Qemu-devel] [PATCH 00/13] target/arm/kvm: enable SVE in guests
On Tue, 2019-05-14 at 13:14 -0700, Richard Henderson wrote:
> On 5/14/19 9:03 AM, Andrea Bolognani wrote:
> > On Tue, 2019-05-14 at 14:53 +0200, Andrew Jones wrote:
> > > We already have sve-max-vq, so I'm not sure we want to rename it.
> >
> > Oh, I didn't realize that was the case. And of course it already
> > takes a number of quadwords as argument, I suppose? That's pretty
> > unfortunate :(
> >
> > Perhaps we could consider deprecating it in favor of a user-friendly
> > variant that's actually suitable for regular humans, like the one I
> > suggest above?
>
> Why is =4 less user-friendly than =512?
>
> I don't actually see "total bits in vector" as more user-friendly than
> "number of quadwords" when it comes to non-powers-of-2 like =7 vs =896
> or =13 vs =1664.

I would wager most people are intimately familiar with bits, bytes and
multiples due to having to work with them daily. Quadwords, not so
much.

--
Andrea Bolognani / Red Hat / Virtualization
Re: [Qemu-devel] [PATCH 00/13] target/arm/kvm: enable SVE in guests
On Tue, 2019-05-14 at 14:53 +0200, Andrew Jones wrote:
> On Tue, May 14, 2019 at 02:29:51PM +0200, Andrea Bolognani wrote:
> > Since we expect management applications to use QMP to discover what
> > vector lengths are supported and then provide an explicit map, I
> > think it's fair to say that the ability to specify a single maximum
> > vector length is going to be exclusively used as a convenience for
> > command line users.
> >
> > In that light, I think it would be reasonable for the usage to look
> > along the lines of
> >
> >   -cpu host,sve-vl-map=0xd   # machine-friendly variant
> >   -cpu max,sve-vl-max=512    # user-friendly variant
>
> We already have sve-max-vq, so I'm not sure we want to rename it.

Oh, I didn't realize that was the case. And of course it already takes
a number of quadwords as argument, I suppose? That's pretty
unfortunate :(

Perhaps we could consider deprecating it in favor of a user-friendly
variant that's actually suitable for regular humans, like the one I
suggest above?

[...]
> > > Dave points out
> > > we may want to reduce the list to a single set and then add flags
> > > to indicate what can be done with it in order to derive other sets.
> > > What do you think about that?
> >
> > So right now all that can be done is truncating the list by removing
> > an arbitrary number of elements from the end, right? Eg. if you have
> > [ 1, 2, 4 ] you can use either that or [ 1, 2 ] or [ 1 ]. But in the
> > future you might also be able to mask single elements in the middle
> > of the list, thus enabling things like [ 1, 4 ].
> >
> > That doesn't sound very complicated to support in libvirt, though I
> > have to say that I'm not a big fan of this proposal because as far as
> > I can see it basically means implementing the very same logic twice,
> > once in QEMU and then once more in libvirt.
>
> So if big tables of bits aren't a problem for QMP queries, then I'll
> just leave the design as it is.
I thought about it a bit more and perhaps the simplified design is
better after all.

Whatever the interface looks like on the QEMU side, we're going to
want to offer libvirt users two options for configuring vector
lengths: listing all desired vector lengths explicitly (basically
sve-vl-map but more verbose and readable) and providing just the
biggest desired vector length (like in sve-max-vq).

In the latter case, we'll want to expand the user-provided value into
an explicit list anyway in order to guarantee guest ABI stability, and
doing so when a single bitmap has been obtained via QMP seems like it
would be more manageable.

Sorry for the flip-flop, but after all isn't this exactly what
upstream design discussion is supposed to be all about? :)

[...]
> > If the size of the bitmap on the KVM side is 512 bits, why don't we
> > just make it that size on the QEMU side too from the start?
>
> I'd still only want to input 64-bits on the command line, otherwise
> we get into inputting arrays, which isn't easy. KVM's interface is
> meant for expansion, but it won't be using most of those bits for
> quite some time either.

I'm probably missing something entirely obvious, but couldn't you just
have a single, possibly fairly massive (up to 128 hex digits if I'm
not mistaken) value on the command line and just work with that one,
no arrays necessary?

--
Andrea Bolognani / Red Hat / Virtualization
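The expansion step described above is simple enough to sketch. Assuming the bitmap encoding used throughout this thread (bit N-1 set means a vector length of N quadwords is supported, one quadword being 128 bits), a hypothetical helper turning a user-supplied maximum length in bits plus the QMP-reported bitmap into an explicit list could look like:

```python
def expand_max_vl(max_vl_bits, supported_map):
    """Expand a user-friendly maximum vector length (in bits) into the
    explicit list of quadword counts, pinning down the guest ABI.

    supported_map uses the encoding discussed in this thread: bit
    (vq - 1) set means a length of vq quadwords is supported.  The
    helper name and signature are illustrative, not libvirt code.
    """
    if max_vl_bits % 128:
        raise ValueError("SVE vector lengths are multiples of 128 bits")
    max_vq = max_vl_bits // 128  # a quadword is 128 bits
    return [vq for vq in range(1, max_vq + 1)
            if supported_map & (1 << (vq - 1))]
```

For example, with a host map of 0xb (lengths of 1, 2 and 4 quadwords) a request for a 512-bit maximum would expand to [1, 2, 4], which is the explicit list that would then be recorded in the stable guest configuration.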
Re: [Qemu-devel] [PATCH 00/13] target/arm/kvm: enable SVE in guests
On Mon, 2019-05-13 at 14:36 +0200, Andrew Jones wrote:
> On Mon, May 13, 2019 at 11:32:46AM +0200, Andrea Bolognani wrote:
> > On Sun, 2019-05-12 at 10:36 +0200, Andrew Jones wrote:
> > [...]
> > >    CPU type | accel | sve-max-vq | sve-vls-map
> > >    --------------------------------------------
> > > 1)  max     | tcg   | $MAX_VQ    | $VLS_MAP
> > > 2)  max     | kvm   | $MAX_VQ    | $VLS_MAP
> > > 3)  host    | kvm   | N/A        | $VLS_MAP
> > >
> > > Where for (1) $MAX_VQ sets the maximum vq and smaller vqs are
> > > all supported when $VLS_MAP is zero, or when the vqs are selected
> > > in $VLS_MAP.
> >
> > I'm a bit confused by the nomenclature here. VL clearly stands for
> > Vector Length, but what does VQ stand for? You seem to be using the
> > two terms pretty much interchangeably throughout the cover letter.
>
> As Dave pointed out, they're both lengths, but VQ specifically points
> out that the unit is 'Q'uadwords. We could use VQS instead of VLS,
> "Vector Lengths" sounds better.

Alright, it makes sense - once you've managed to figure out what
exactly a "quadword" is, at least :)

Since we expect management applications to use QMP to discover what
vector lengths are supported and then provide an explicit map, I think
it's fair to say that the ability to specify a single maximum vector
length is going to be exclusively used as a convenience for command
line users.

In that light, I think it would be reasonable for the usage to look
along the lines of

  -cpu host,sve-vl-map=0xd   # machine-friendly variant
  -cpu max,sve-vl-max=512    # user-friendly variant

with documentation clearly pointing out that the two options expect
completely different formats - but that was the case even before, so
we would have had to document that anyway.

> > > The QMP query returns a list of valid vq lists. For example, if
> > > a guest can use vqs 1, 2, 3, and 4, then the following list will
> > > be returned
> > >
> > >   [ [ 1 ], [ 1, 2 ], [ 1, 2, 3 ], [ 1, 2, 3, 4 ] ]
> > >
> > > Another example might be 1, 2, 4, as the architecture states 3
> > > is optional.
> > > In that case the list would be
> > >
> > >   [ [ 1 ], [ 1, 2 ], [ 1, 2, 4 ] ]
> >
> > I think the proposed QMP command is problematic, because it reports
> > the valid vector lengths for either KVM or TCG based on which
> > accelerator is currently enabled: it should report all information
> > at all times instead, similarly to how query-gic-capabilities does
> > it.
>
> OK, and then with a flag stating which is when then.

Correct.

> Dave points out
> we may want to reduce the list to a single set and then add flags
> to indicate what can be done with it in order to derive other sets.
> What do you think about that?

So right now all that can be done is truncating the list by removing
an arbitrary number of elements from the end, right? Eg. if you have
[ 1, 2, 4 ] you can use either that or [ 1, 2 ] or [ 1 ]. But in the
future you might also be able to mask single elements in the middle
of the list, thus enabling things like [ 1, 4 ].

That doesn't sound very complicated to support in libvirt, though I
have to say that I'm not a big fan of this proposal because as far as
I can see it basically means implementing the very same logic twice,
once in QEMU and then once more in libvirt.

> > [...]
> > > And now for what might be a bit more controversial; how we input
> > > the valid vector set with sve-vls-map. Well, sve-vls-map is a
> > > 64-bit bitmap, which is admittedly not user friendly and also not
> > > the same size as KVM's vls bitmap (which is 8 64-bit words). Here's
> > > the justification:
> > >
> > > 1) We want to use the QEMU command line in order for the information
> > >    to be migrated without needing to add more VM state.
> > > 2) It's not easy/pretty to input arrays on the QEMU command line.
> > > 3) We already need to use the QMP query to get a valid set, which
> > >    isn't user friendly either, meaning we're already in libvirt
> > >    territory.
> > > 4) A 64-bit map (supporting up to 8192-bit vectors) is likely big
> > >    enough for quite some time (currently KVM and TCG only support
> > >    2048-bit vectors).
> > > 5) If user friendliness is needed more than migratability then
> > >    the 'max' cpu type can be used with the sve-max-vq property.
> > > 6) It's possible to probe the full valid vector set from the
> > >    command line by using something like sve-vls-map=0x and
> > >    then, when it fails, the error message will state the correct
> > >    map, e.g. 0xb.
Re: [Qemu-devel] [PATCH 00/13] target/arm/kvm: enable SVE in guests
On Sun, 2019-05-12 at 10:36 +0200, Andrew Jones wrote:
[...]
>    CPU type | accel | sve-max-vq | sve-vls-map
>    --------------------------------------------
> 1)  max     | tcg   | $MAX_VQ    | $VLS_MAP
> 2)  max     | kvm   | $MAX_VQ    | $VLS_MAP
> 3)  host    | kvm   | N/A        | $VLS_MAP
>
> Where for (1) $MAX_VQ sets the maximum vq and smaller vqs are
> all supported when $VLS_MAP is zero, or when the vqs are selected
> in $VLS_MAP.

I'm a bit confused by the nomenclature here. VL clearly stands for
Vector Length, but what does VQ stand for? You seem to be using the
two terms pretty much interchangeably throughout the cover letter.

[...]
> There is never any need to provide both properties, but if both
> are provided then they are checked for consistency.

I would personally just error out when both are provided.

> The QMP query returns a list of valid vq lists. For example, if
> a guest can use vqs 1, 2, 3, and 4, then the following list will
> be returned
>
>   [ [ 1 ], [ 1, 2 ], [ 1, 2, 3 ], [ 1, 2, 3, 4 ] ]
>
> Another example might be 1, 2, 4, as the architecture states 3
> is optional. In that case the list would be
>
>   [ [ 1 ], [ 1, 2 ], [ 1, 2, 4 ] ]

I think the proposed QMP command is problematic, because it reports
the valid vector lengths for either KVM or TCG based on which
accelerator is currently enabled: it should report all information at
all times instead, similarly to how query-gic-capabilities does it.

[...]
> And now for what might be a bit more controversial; how we input
> the valid vector set with sve-vls-map. Well, sve-vls-map is a
> 64-bit bitmap, which is admittedly not user friendly and also not
> the same size as KVM's vls bitmap (which is 8 64-bit words). Here's
> the justification:
>
> 1) We want to use the QEMU command line in order for the information
>    to be migrated without needing to add more VM state.
> 2) It's not easy/pretty to input arrays on the QEMU command line.
> 3) We already need to use the QMP query to get a valid set, which
>    isn't user friendly either, meaning we're already in libvirt
>    territory.
> 4) A 64-bit map (supporting up to 8192-bit vectors) is likely big
>    enough for quite some time (currently KVM and TCG only support
>    2048-bit vectors).
> 5) If user friendliness is needed more than migratability then
>    the 'max' cpu type can be used with the sve-max-vq property.
> 6) It's possible to probe the full valid vector set from the
>    command line by using something like sve-vls-map=0x and
>    then, when it fails, the error message will state the correct
>    map, e.g. 0xb.

I don't have a problem with having to use a bitmap internally, though
libvirt will clearly want to expose a more approachable interface to
users.

However, QMP reporting the information in the current format means
we'd have to build an additional parser on top of the bitmap handling
and conversion routines we'll clearly need to make this work; plus it
just feels weird that the information reported by QMP can't be used on
the command line without going through some transformation first.

Wouldn't it make more sense to both accept and report bitmaps?

--
Andrea Bolognani / Red Hat / Virtualization
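The conversion routines mentioned above are tiny either way. A sketch of translating the nested-list QMP reply into sve-vls-map-style bitmaps (bit vq-1 representing vq quadwords, as discussed in this thread), with illustrative function names:

```python
def vq_list_to_map(vqs):
    """Encode a list of supported quadword counts as a bitmap,
    with bit (vq - 1) standing for a length of vq quadwords."""
    bitmap = 0
    for vq in vqs:
        bitmap |= 1 << (vq - 1)
    return bitmap

def qmp_reply_to_maps(valid_sets):
    """Convert the list-of-lists QMP reply into one bitmap per valid
    set, i.e. the format a bitmap-only interface would report."""
    return [vq_list_to_map(s) for s in valid_sets]
```

With the example reply from this thread, [[1], [1, 2], [1, 2, 4]] becomes [0x1, 0x3, 0xb], and the last entry is exactly the value one would pass as sve-vls-map on the command line.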
Re: [Qemu-devel] [PATCH] vl: fix -sandbox parsing crash when seccomp support is disabled
On Mon, 2019-04-29 at 15:47 +0200, Marc-André Lureau wrote:
> $ ./x86_64-softmmu/qemu-system-x86_64 -sandbox off
> qemu-system-x86_64: -sandbox off: There is no option group 'sandbox'
> Segmentation fault
>
> Commit 5780760f5e ("seccomp: check TSYNC host capability") wrapped one
> use of the sandbox option group to produce a sensible error, it didn't
> do the same for another call to qemu_opts_parse_noisily():
>
> (gdb) bt
> #0 0x105b36d8 in opts_parse (list=0x0, params=0x3ab5 "off",
>    permit_abbrev=true, defaults=false, errp=0x3080)
>    at util/qemu-option.c:829
> #1 0x105b3b74 in qemu_opts_parse_noisily (list=<optimized out>,
>    params=<optimized out>, permit_abbrev=<optimized out>)
>    at util/qemu-option.c:890
> #2 0x10024964 in main (argc=<optimized out>, argv=<optimized out>,
>    envp=<optimized out>) at vl.c:3589
>
> Fixes: 5780760f5ea6163939a5dabe7427318b4f07d1a2
> Cc: da...@gibson.dropbear.id.au
> Cc: ot...@redhat.com
> Signed-off-by: Marc-André Lureau
> ---
>  vl.c | 18 ++
>  1 file changed, 10 insertions(+), 8 deletions(-)

This fixes the crash for me, so

Tested-by: Andrea Bolognani

--
Andrea Bolognani / Red Hat / Virtualization
Re: [Qemu-devel] [Qemu-ppc] [PATCH v7 0/2] spapr-rtas: add ibm, get-vpd RTAS interface
Apologies for taking this long to respond.

On Mon, 2019-04-08 at 14:27 +1000, David Gibson wrote:
> On Tue, Apr 02, 2019 at 12:28:07PM +0200, Greg Kurz wrote:
> > The recent fixes around "host-serial" and "host-model" simply moved
> > the decision to expose host data to the upper layer, ie. libvirt
> > which should be involved in this discussion.
>
> Right, that's deliberate. Note that roughly-equivalent information on
> x86 is currently supplied via the SMBIOS. OpenStack Nova sets that,
> rather than qemu, and I'd like to move towards a common configuration
> model with x86, though it's a fairly long path to there.
>
> OpenStack had an equivalent security problem to our one, which it
> addressed by taking the host serial from /etc/machine-id if present
> rather than the real host info.

IIUC the situation is a bit different between x86 and ppc64, because
while for the latter SPAPR defines a way for the guest to access
information about the host it's running on, that's not the case for
the former, at least to the best of my knowledge.

What OpenStack is doing is reading the machine-id (if explicitly
configured to do so: the default is to use the guest's own UUID[1])
and exposing that as the *guest* serial, not as the *host* serial.

From libvirt's point of view, the entire mechanism is entirely
optional, so unless the management layer explicitly asks it to set a
certain value for the serial, libvirt will simply pass no information
down to QEMU.

The relevant XML elements[2] are clearly modeled after x86, so I
wonder if Nova is setting them also on ppc64 and if so, what the guest
will ultimately see...

> > Cc'ing Andrea for expertise. Problem exposed below.
> >
> > The pseries machine used to expose the content of the host's
> > /proc/device-tree/system-id and /proc/device-tree/model in the guest
> > DT. This led to a CVE and QEMU doesn't do that anymore for new
> > machine types.
> > Instead, two new properties where added to the pseries machine:
> >
> > pseries-4.0.host-serial=string (Host serial number to advertise in
> > guest device tree)
> > pseries-4.0.host-model=string (Host model to advertise in guest
> > device tree)
> >
> > It is up to the caller to pass something... which may be anything,
> > including something like $(cat /proc/device-tree/system-id) or
> > randomly generated.

What happens if the caller doesn't provide any value? Will QEMU come
up with something itself?

Adding a few extra knobs in the same vein as the existing ones sounds
like a fairly reasonable idea. It will still be up to the management
layer to actually provide the values.

> > Is there a chance libvirt can be taught to pass a different string
> > to the target QEMU in case of migration ?

libvirt already supports providing a different XML to the target host,
so changing a couple values should be no big deal.

As a final note, unless I've gotten it wrong and x86 actually *does*
provide a way for the guest to figure out its host's serial, then any
software relying on the attributes defined by SPAPR is ultimately not
portable to non-ppc64 hardware and should probably be rearchitected to
go through the management layer, as Daniel was also suggesting earlier
in the thread.

[1] https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L364-L372
[2] https://libvirt.org/formatdomain.html#elementsSysinfo

--
Andrea Bolognani / Red Hat / Virtualization
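The Nova policy summarized above (use /etc/machine-id only when explicitly configured, otherwise fall back to the guest's own UUID, and expose the result as the *guest* serial) can be sketched roughly like this. The function name and signature are illustrative, not Nova's actual code:

```python
import uuid

def pick_guest_serial(guest_uuid, use_machine_id=False,
                      machine_id_path="/etc/machine-id"):
    """Pick the serial number to expose inside the guest.

    Real host identifiers are never leaked: at most the host's opaque
    machine-id is used, and only when explicitly requested.
    """
    if use_machine_id:
        try:
            with open(machine_id_path) as f:
                raw = f.read().strip()
            if raw:
                # machine-id is 32 hex digits; normalize to UUID form
                return str(uuid.UUID(raw))
        except (OSError, ValueError):
            pass  # unreadable or malformed: fall back to the guest UUID
    return guest_uuid
```

The management layer would then hand the chosen value down (on ppc64, via something like the host-serial knob; on x86, via the SMBIOS fields), keeping the policy decision out of QEMU itself.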
Re: [Qemu-devel] [Qemu-riscv] [PATCH v1 1/1] riscv: plic: Set msi_nonbroken as true
On Mon, 2019-03-18 at 02:31 -0700, Palmer Dabbelt wrote:
> On Mon, 18 Mar 2019 01:39:46 PDT (-0700), pbonz...@redhat.com wrote:
> > On 15/03/19 21:05, Alistair Francis wrote:
> > > Set msi_nonbroken as true for the PLIC.
> > >
> > > According to the comment located here:
> > > https://git.qemu.org/?p=qemu.git;a=blob;f=hw/pci/msi.c;h=47d2b0f33c664533b8dbd5cb17faa8e6a01afe1f;hb=HEAD#l38
> > > the msi_nonbroken variable should be set to true even if they don't
> > > support MSI. In this case that is what we are doing as we don't
> > > support MSI.
> >
> > I can queue this patch, and add the "select MSI" to CONFIG_SIFIVE.
>
> Works for me. Thanks!

Just so we're on the same page, are you targeting this at 4.0.0? If it
gets merged in the next few days I can probably get the corresponding
libvirt patches in before our own freeze starts.

It would be great if we could make it so guests created with QEMU
4.0.0 + libvirt 5.2.0 get PCI by default :)

--
Andrea Bolognani / Red Hat / Virtualization
Re: [Qemu-devel] [PATCH v1 1/1] riscv: plic: Set msi_nonbroken as true
On Fri, 2019-03-15 at 20:05 +, Alistair Francis wrote:
> Set msi_nonbroken as true for the PLIC.
>
> According to the comment located here:
> https://git.qemu.org/?p=qemu.git;a=blob;f=hw/pci/msi.c;h=47d2b0f33c664533b8dbd5cb17faa8e6a01afe1f;hb=HEAD#l38
> the msi_nonbroken variable should be set to true even if they don't
> support MSI. In this case that is what we are doing as we don't support
> MSI.
>
> Signed-off-by: Alistair Francis
> Reported-by: Andrea Bolognani
> Reported-by: David Abdurachmanov
> ---
> This should allow working pcie-root-ports in QEMU and allow libvirt
> to start using PCIe by default for RISC-V guests.
>
>  hw/riscv/sifive_plic.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/hw/riscv/sifive_plic.c b/hw/riscv/sifive_plic.c
> index d12ec3fc9a..4b0537c912 100644
> --- a/hw/riscv/sifive_plic.c
> +++ b/hw/riscv/sifive_plic.c
> @@ -22,6 +22,7 @@
>  #include "qemu/log.h"
>  #include "qemu/error-report.h"
>  #include "hw/sysbus.h"
> +#include "hw/pci/msi.h"
>  #include "target/riscv/cpu.h"
>  #include "hw/riscv/sifive_plic.h"
>
> @@ -443,6 +444,8 @@ static void sifive_plic_realize(DeviceState *dev, Error **errp)
>      plic->enable = g_new0(uint32_t, plic->bitfield_words * plic->num_addrs);
>      sysbus_init_mmio(SYS_BUS_DEVICE(dev), &plic->mmio);
>      qdev_init_gpio_in(dev, sifive_plic_irq_request, plic->num_sources);
> +
> +    msi_nonbroken = true;
>  }
>
>  static void sifive_plic_class_init(ObjectClass *klass, void *data)

With this patch applied, I was able to bring up a riscv64/virt guest
with graphics, using PCIe devices only:

  https://imgur.com/a/taN06hE

Tested-by: Andrea Bolognani

--
Andrea Bolognani / Red Hat / Virtualization
Re: [Qemu-devel] [PATCH v1 1/1] riscv: plic: Set msi_nonbroken as true
On Mon, 2019-03-18 at 09:39 +0100, Paolo Bonzini wrote:
> On 15/03/19 21:05, Alistair Francis wrote:
> > Set msi_nonbroken as true for the PLIC.
> >
> > According to the comment located here:
> > https://git.qemu.org/?p=qemu.git;a=blob;f=hw/pci/msi.c;h=47d2b0f33c664533b8dbd5cb17faa8e6a01afe1f;hb=HEAD#l38
> > the msi_nonbroken variable should be set to true even if they don't
> > support MSI. In this case that is what we are doing as we don't support
> > MSI.
> >
> > Signed-off-by: Alistair Francis
> > Reported-by: Andrea Bolognani
> > Reported-by: David Abdurachmanov
> > ---
> > This should allow working pcie-root-ports in QEMU and allow libvirt
> > to start using PCIe by default for RISC-V guests.
> >
> >  hw/riscv/sifive_plic.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/hw/riscv/sifive_plic.c b/hw/riscv/sifive_plic.c
> > index d12ec3fc9a..4b0537c912 100644
> > --- a/hw/riscv/sifive_plic.c
> > +++ b/hw/riscv/sifive_plic.c
> > @@ -22,6 +22,7 @@
> >  #include "qemu/log.h"
> >  #include "qemu/error-report.h"
> >  #include "hw/sysbus.h"
> > +#include "hw/pci/msi.h"
> >  #include "target/riscv/cpu.h"
> >  #include "hw/riscv/sifive_plic.h"
> >
> > @@ -443,6 +444,8 @@ static void sifive_plic_realize(DeviceState *dev, Error **errp)
> >      plic->enable = g_new0(uint32_t, plic->bitfield_words * plic->num_addrs);
> >      sysbus_init_mmio(SYS_BUS_DEVICE(dev), &plic->mmio);
> >      qdev_init_gpio_in(dev, sifive_plic_irq_request, plic->num_sources);
> > +
> > +    msi_nonbroken = true;
> >  }
> >
> >  static void sifive_plic_class_init(ObjectClass *klass, void *data)
>
> I can queue this patch, and add the "select MSI" to CONFIG_SIFIVE.

The interrupt controller is used by the virt machine type too IIUC,
so the same should be added to CONFIG_RISCV_VIRT I think.

-- 
Andrea Bolognani / Red Hat / Virtualization
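[Editorial note: a minimal sketch of what the suggested Kconfig change might look like, assuming QEMU's Kconfig dialect and the symbol names mentioned in this thread; the actual entry in hw/riscv/Kconfig may select a different set of symbols.]

```kconfig
# hw/riscv/Kconfig (sketch; symbol names assumed from this thread)
config RISCV_VIRT
    bool
    # make MSI support available so pcie-root-port is usable
    select MSI
    select PCI
    select SERIAL
    select VIRTIO_MMIO
```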
Re: [Qemu-devel] [PATCH] ci: store Patchew configuration in the tree
On Fri, 2019-03-15 at 10:19 +0100, Paolo Bonzini wrote:
[...]
> +s390x:
> +  enabled: false
> +  requirements: ppcle

This doesn't look right :)

-- 
Andrea Bolognani / Red Hat / Virtualization
Re: [Qemu-devel] [PATCH 0/2] kconfig: add fine-grained dependencies for MSI
On Thu, 2019-03-14 at 15:30 +0100, Paolo Bonzini wrote:
> RISC-V targets did not include PCIe ports before the Kconfig transition,
> and grew them afterwards, but they are nonfunctional because the interrupt
> controller does not support MSI. This patch restores the situation prior to
> the introduction of Kconfig; in fact, it will automatically drop devices
> that require MSI unless the binary includes an MSI-enabled board.
>
> Paolo
>
> Paolo Bonzini (2):
>   kconfig: add CONFIG_MSI
>   kconfig: add dependencies on CONFIG_MSI
>
>  Kconfig.host          | 3 +++
>  Makefile              | 3 ++-
>  hw/Kconfig            | 1 +
>  hw/intc/Kconfig       | 3 +++
>  hw/misc/Kconfig       | 4 ++--
>  hw/net/Kconfig        | 4 ++--
>  hw/pci-bridge/Kconfig | 6 +++---
>  hw/pci-host/Kconfig   | 1 +
>  hw/pci/Kconfig        | 4
>  hw/ppc/Kconfig        | 1 +
>  hw/rdma/Kconfig       | 3 +++
>  hw/rdma/Makefile.objs | 6 ++
>  hw/s390x/Kconfig      | 1 +
>  13 files changed, 28 insertions(+), 12 deletions(-)
>  create mode 100644 hw/rdma/Kconfig

FWIW

Tested-by: Andrea Bolognani

Thanks for addressing this :)

-- 
Andrea Bolognani / Red Hat / Virtualization
Re: [Qemu-devel] [PULL 15/54] build: convert pci.mak to Kconfig
On Mon, 2019-03-04 at 19:19 +0100, Paolo Bonzini wrote:
> Instead of including the same list of devices for each target,
> set CONFIG_PCI to true, and make the devices default to present
> whenever PCI is available. However, s390x does not want all the
> PCI devices, so there is a separate symbol to enable them.
[...]
> +++ b/default-configs/riscv32-softmmu.mak
> @@ -1,8 +1,8 @@
>  # Default configuration for riscv-softmmu
>
> -include pci.mak
>  include usb.mak
> -
> +CONFIG_PCI=y
> +CONFIG_PCI_DEVICES=y
>  CONFIG_SERIAL=y
>  CONFIG_VIRTIO_MMIO=y

I *think* this might have caused some unwanted changes for RISC-V.

pcie-root-port is built into qemu-system-riscv64 by default as of
dbbc277510aa (along with ioh3420!), but if you actually try to use it
you'll get:

  $ ./riscv64-softmmu/qemu-system-riscv64 \
      -M virt \
      -device pcie-root-port
  qemu-system-riscv64: -device pcie-root-port: MSI-X is not supported
  by interrupt controller

This is a limitation we have been aware of, and the plan was to
enable the device in QEMU once it had been addressed: from the
libvirt side, the availability of the device would have meant that it
was safe to use it, but if the device is enabled in QEMU before it
can actually be used, then that makes detection on the libvirt side
problematic.

I haven't spent time digging further - and I'm not familiar enough
with the QEMU build system anyway O:-) - but I wouldn't be surprised
if the same happened for other architectures, too.

-- 
Andrea Bolognani / Red Hat / Virtualization
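[Editorial note: the detection problem described above is that a device showing up in `-device help` only proves it was compiled in, not that the machine type can actually use it. A minimal sketch of that kind of presence probing, assuming the usual `-device help` output format where each device appears as `name "..."` (the sample text below is illustrative, not real QEMU output):]

```python
import re


def list_devices(help_text):
    """Parse 'qemu-system-* -device help' style output into a set of names."""
    return set(re.findall(r'name "([^"]+)"', help_text))


# Abbreviated sample in the assumed output format.
sample = '''
name "pcie-root-port", bus PCIE, desc "PCI Express root port"
name "ioh3420", bus PCIE, desc "Intel IOH device id 3420 PCIE Root Port"
name "virtio-blk-device", bus virtio-bus
'''

devices = list_devices(sample)

# Presence here does NOT imply the device is usable on every machine
# type: on riscv64/virt it was compiled in, yet plugging it still
# failed with "MSI-X is not supported by interrupt controller".
print("pcie-root-port" in devices)
```

This is exactly why a capability probe based on device lists alone can mislead a management application: the extra "can the board actually deliver MSI/MSI-X" dimension is invisible in the list.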
Re: [Qemu-devel] [libvirt] [PATCH for-4.0 v4 0/2] virtio: Provide version-specific variants of virtio PCI devices
On Wed, 2019-03-06 at 09:30 +0100, Ján Tomko wrote:
> On Wed, Mar 06, 2019 at 08:41:48AM +0100, Peter Krempa wrote:
> > On Tue, Mar 05, 2019 at 16:56:43 +0100, Andrea Bolognani wrote:
> > > So I agree neither scenario is exactly perfect, but I still think
> > > adding non-transitional alias devices would overall be more
> > > user-friendly.
> >
> > I don't think it makes sense to add it at the qemu level. From libvirt's
> > point of view users should be shielded from any qemu impl detail or
> > inconsistency as libvirt is the 'user friendly'[1] layer. In qemu the
> > devices would be the same and thus does not make sense to do that
> > because it would be more confusing.
> >
> > You can argue that we should add the alias at the libvirt level though.
>
> You can, but please don't.

It would seem nobody except me thinks this is a good idea, so I guess
I'll just drop it now.

-- 
Andrea Bolognani / Red Hat / Virtualization
Re: [Qemu-devel] [PATCH for-4.0 v4 0/2] virtio: Provide version-specific variants of virtio PCI devices
On Tue, 2019-03-05 at 15:38 +0100, Gerd Hoffmann wrote:
>   Hi,
>
> >   -device virtio-blk-pci-non-transitional \
> >   -device virtio-net-pci-non-transitional \
> >   -device virtio-gpu-pci-non-transitional \
> >
> > and you wouldn't have to question why you can use the
> > non-transitional variant for pretty much everything, except for the
> > few cases where you can't - for no apparent reason...
>
> Well, there are no variants, only a single virtio-$foo-pci* device. So
> you don't have to worry about picking one of the available variants,
> there is no choice in the first place.
>
> When adding an virtio-gpu-pci-non-transitional variant we'll create
> confusion too, because it wouldn't be a real variant. We would have two
> 100% identical devices then, and people will probably wonder why they
> exist and what the difference is ...

When looking at a single device, I mostly agree with your assessment;
however, when looking at the overall situation with VirtIO devices,
one might quite reasonably infer the following rules:

  * devices marked as (non-)transitional are going to show up as
    (non-)transitional;

  * unmarked devices might show up as either one, depending on some
    factor which is not immediately obvious.

So if you knew you wanted non-transitional devices you would expect
to just use the non-transitional variant for *all* VirtIO devices,
including virtio-gpu, without necessarily caring whether the unmarked
devices behave any differently; if you tried to use the transitional
device, you'd get an error message telling you that device doesn't
exist, which is pretty reasonable and easy to research / understand.

With the current situation, once you've tried using non-transitional
virtio-gpu and gotten back an error message, there's quite a bit more
digging required to figure out *why* the device is not there in the
first place.

So I agree neither scenario is exactly perfect, but I still think
adding non-transitional alias devices would overall be more
user-friendly.
> So I can't see how this would be so much better. We have to document
> the mess no matter what.

We have some documentation in libvirt:

  https://libvirt.org/formatdomain.html#elementsVirtioTransitional

Not that more / improved documentation is ever a bad idea :)

-- 
Andrea Bolognani / Red Hat / Virtualization
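[Editorial note: for readers unfamiliar with the libvirt side, the documentation linked above describes selecting the variant through a `model` value on the device element, roughly as in this sketch; attribute names and placement should be checked against the current formatdomain page.]

```xml
<!-- Sketch: explicitly request a non-transitional (virtio 1.0-only) NIC,
     letting libvirt pick the matching QEMU device name. -->
<interface type='network'>
  <source network='default'/>
  <model type='virtio-non-transitional'/>
</interface>
```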
Re: [Qemu-devel] [PATCH for-4.0 v4 0/2] virtio: Provide version-specific variants of virtio PCI devices
Sorry to resurrect such an old thread, but I have been wondering...

On Wed, 2018-12-05 at 17:57 -0200, Eduardo Habkost wrote:
[...]
> Changes v1 -> v2:
> * Removed *-0.9 devices. Nobody will want to use them, if
>   transitional devices work with legacy drivers
>   (Gerd Hoffmann, Michael S. Tsirkin)
> * Drop virtio version from name: rename -1.0-transitional to
>   -transitional (Michael S. Tsirkin)
> * Renamed -1.0 to -non-transitional
> * Don't add any extra variants to modern-only device types
>   (they don't need it)

... if doing this was a good idea after all?

While I understand that something like virtio-gpu, which supports the
1.0 specification exclusively, only really needs to have a single
device associated with it from the functionality point of view,
looking at it from a user's perspective it seems to me like providing
an explicit non-transitional variant would be appropriate for
consistency reasons, so that your guest could look like

  -device virtio-blk-pci-non-transitional \
  -device virtio-net-pci-non-transitional \
  -device virtio-gpu-pci-non-transitional \

and you wouldn't have to question why you can use the
non-transitional variant for pretty much everything, except for the
few cases where you can't - for no apparent reason...

It would also signal quite clearly which devices support both
transitional and non-transitional variants and which ones don't,
without having to infer that the complete lack of (non-)transitional
variants means that only the non-transitional variant is available -
except you have to use the suffix-less device name to use it.

tl;dr providing the non-transitional variant for virtio 1.0-only
devices would make using this much more user-friendly.

-- 
Andrea Bolognani / Red Hat / Virtualization