Re: [Qemu-discuss] [kubevirt-dev] Re: Converting qcow2 image on the fly to raw format

2018-07-16 Thread Daniel P. Berrangé
On Wed, Jul 11, 2018 at 02:17:18PM +0300, Adam Litke wrote:
> Adding some kubevirt developers to the thread.  Thanks guys for the
> information!  I think this could work perfectly for on the fly conversion
> of qcow2 images to raw format on our PVCs.

FYI, if you are intending to accept qcow2 images from untrustworthy sources,
you must take special care to validate the image in a confined environment.
It is possible to construct malicious images that inflict a denial of
service attack on CPU or memory or both, even when merely opening the image
to query its metadata. This has been reported as a CVE against OpenStack
in the past:

  https://bugs.launchpad.net/ossa/+bug/1449062

The recommendation is to run 'qemu-img info' to extract the metadata and
sanity-check the results, e.g. no backing file list, no unreasonable size,
etc. When running 'qemu-img info', apply process limits of 30 seconds of
CPU time and 1 GB of address space.

Regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|



Re: [Qemu-discuss] [kubevirt-dev] Re: Converting qcow2 image on the fly to raw format

2018-07-20 Thread Daniel P. Berrangé
On Thu, Jul 19, 2018 at 09:50:00PM +0300, Nir Soffer wrote:
> On Mon, Jul 16, 2018 at 11:56 AM Daniel P. Berrangé wrote:
> ...
> 
> > Recommendation is to run 'qemu-img info' to extract the metadata and sanity
> > check results eg no backing file list, not unreasonable size, etc. When
> > running 'qemu-img info' apply process limits of 30 secs CPU time, and 1 GB
> > address space.
> >
> 
> Can you explain the values of CPU seconds and address space?

Initially the values were (informed) guesswork. After testing real-world
examples we settled on these values as providing protection from DoS,
while not causing valid images to be rejected.

Regards,
Daniel



Re: [Qemu-discuss] [kubevirt-dev] Re: Converting qcow2 image on the fly to raw format

2018-07-20 Thread Daniel P. Berrangé
On Thu, Jul 19, 2018 at 09:39:35PM +0100, Richard W.M. Jones wrote:
> I did the original work using AFL to fuzz qemu-img and find
> problematic images.  From that work Dan & I suggested some fairly low
> limits (10 seconds IIRC).  See:
> 
> https://bugs.launchpad.net/qemu/+bug/1462944
> https://bugs.launchpad.net/qemu/+bug/1462949
> 
> A lot more problematic images were found (at least 16), but I cannot
> recall if we filed bugs for all of them.  Note the images do not need
> to be qcow2, since someone can upload any old thing to your service
> and cause you problems.
> 
> On Thu, Jul 19, 2018 at 11:00:14PM +0300, Nir Soffer wrote:
> > The 30 seconds cpu_time time limit confuses me; it was added in:
> > https://github.com/openstack/nova/commit/011ae614d5c5fb35b2e9c22a9c4c99158f6aee20
> >
> > The patch references this bug:
> > https://bugs.launchpad.net/nova/+bug/1705340
> 
> It looks as if those original limits were too low and they have been
> increased.  For RHV I think you should go with the same settings that
> OpenStack is using.

Yes, real world usage found our original limit was too low for certain
valid images, so we increased it to 30 seconds.

Regards,
Daniel



Re: [Qemu-discuss] [libvirt-users] Efficacy of jitterentropy RNG on qemu-kvm Guests

2018-08-16 Thread Daniel P. Berrangé
On Fri, Aug 10, 2018 at 08:33:00PM +, procmem wrote:
> Hello. I'm a distro maintainer and was wondering about the efficacy of
> entropy daemons like haveged and jitterentropyd in qemu-kvm. One of the
> authors of haveged [0] pointed out that if the hardware cycle counter is
> emulated, it is deterministic and thus predictable. He therefore does not
> recommend using HAVEGE on those systems. Is this the case with KVM's
> counters?
> 
> PS. I will be setting VM CPU settings to host-passthrough.

Hardware from circa 2011 onwards has RDRAND support, and with host-passthrough
this will be available to the guest.  The rngd daemon, running in the guest,
can use this as a source to feed the kernel entropy pool.

In addition, QEMU supports virtio-rng, which can pull entropy from
/dev/urandom on the host and feed it into the guest, where again rngd can
give it to the kernel.

So why do you need to consider haveged / jitterentropyd at all with QEMU?
It should suffice to just enable virtio-rng in the host and run rngd in
all guests. If the host has RDRAND, that's an extra bonus.

haveged / jitterentropyd should only be needed on other, non-QEMU hypervisors
that don't support something equivalent to virtio-rng and run on hardware
that is too old for RDRAND.
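For libvirt-managed guests, the virtio-rng setup described above is a small
domain XML fragment. This is a sketch of the common form (see the libvirt
domain XML documentation for the options your version supports):

```xml
<devices>
  <!-- virtio-rng device fed from the host's /dev/urandom -->
  <rng model='virtio'>
    <backend model='random'>/dev/urandom</backend>
  </rng>
</devices>
```

Inside the guest, rngd can then read the resulting /dev/hwrng device and feed
the kernel entropy pool.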

Regards,
Daniel



Re: [Qemu-discuss] [libvirt-users] Installing QEMU from source causing error

2019-07-19 Thread Daniel P. Berrangé
On Fri, Jul 19, 2019 at 12:16:25PM -0500, Probir Roy wrote:
> Hi,
> 
> I am trying to run Qemu-4.0 installed from source code.
> 
> When I run virt-manager to create a VM, I get the following error:
> 
> ```
> Unable to complete install: 'internal error: process exited while
> connecting to monitor: 2019-07-19T17:06:35.954242Z qemu-system-x86_64:
> -enable-kvm: unsupported machine type
> Use -machine help to list supported machines'

You don't mention what your OS is. Some OS vendors patch QEMU
to have different machine types than upstream QEMU provides, and
I guess that's what you've hit.

If you look in your libvirt XML 'virsh dumpxml $GUESTNAME' you'll
see a machine type listed - probably "pc-$SOMETHING" or "q35-$SOMETHING".

If you simply delete the "-$SOMETHING" part to leave just "pc" or
just "q35", then libvirt will expand the machine type into the
latest one available with your self-built QEMU.
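For example, the relevant element in the dumped XML might look like the
first fragment below, and trimming it to the bare alias is the whole change
(the versioned machine-type string here is illustrative, not from the
original report):

```xml
<!-- Before: distro-specific versioned machine type (example string) -->
<os>
  <type arch='x86_64' machine='pc-i440fx-2.9'>hvm</type>
</os>

<!-- After: bare alias, which libvirt expands to the newest machine
     type the self-built QEMU supports -->
<os>
  <type arch='x86_64' machine='pc'>hvm</type>
</os>
```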

Regards,
Daniel



Re: [RFC PATCH] configure: deprecate 32 bit build hosts

2019-09-26 Thread Daniel P. Berrangé
On Thu, Sep 26, 2019 at 08:50:36AM +0100, Peter Maydell wrote:
> On Thu, 26 Sep 2019 at 00:31, Alex Bennée  wrote:
> >
> > The 32 bit hosts are already second class citizens, especially with
> > support for running 64 bit guests under TCG. We are also limited by
> > testing, as actual working 32 bit machines are getting quite rare in
> > developers' personal menageries. For TCG, supporting newer types like
> > Int128 is a lot harder with 32 bit calling conventions compared to
> > their larger bit sized cousins. Fundamentally, address space is the
> > most useful thing for the translator to have; even for a 32 bit guest,
> > a 32 bit host is quite constrained.
> >
> > As far as I'm aware 32 bit KVM users are even less numerous. Even
> > ILP32 doesn't make much sense given the address space QEMU needs to
> > manage.
> 
> For KVM we should wait until the kernel chooses to drop support,
> I think.

What if the kernel is waiting for QEMU to drop support too ;-P

> > @@ -745,19 +744,22 @@ case "$cpu" in
> >;;
> >armv*b|armv*l|arm)
> >  cpu="arm"
> > -supported_cpu="yes"
> >;;
> 
> I'll leave others to voice opinions about their architectures,
> but I still have 32-bit arm in my test set for builds, and
> I'm pretty sure we have users (raspi users, for a start).

RHEL dropped all 32-bit host support a long time ago, so Red Hat
has no need for it in our products.

Fedora has recently stopped building i686 kernels and thus also no
longer composes i686 installs. Some users complained, but ultimately
no one cares enough to step forward as maintainers.

That leaves armv7 as the only 32-bit arch in Fedora that is somewhat
active & maintained. I don't have any real insight into whether any
armv7 (Fedora) users are making much use of QEMU/KVM though, for
either system or user emulation.

Our preference in Fedora is to have things built on every architecture
that the distro targets, but if upstream developers explicitly drop an
architecture we're not going to try to add it back.

Regards,
Daniel



Re: [RFC PATCH] configure: deprecate 32 bit build hosts

2019-09-30 Thread Daniel P. Berrangé
On Thu, Sep 26, 2019 at 10:11:05AM -0700, Richard Henderson wrote:
> On 9/26/19 12:50 AM, Peter Maydell wrote:
> > On Thu, 26 Sep 2019 at 00:31, Alex Bennée  wrote:
> >>
> >> The 32 bit hosts are already second class citizens, especially with
> >> support for running 64 bit guests under TCG. We are also limited by
> >> testing, as actual working 32 bit machines are getting quite rare in
> >> developers' personal menageries. For TCG, supporting newer types like
> >> Int128 is a lot harder with 32 bit calling conventions compared to
> >> their larger bit sized cousins. Fundamentally, address space is the
> >> most useful thing for the translator to have; even for a 32 bit guest,
> >> a 32 bit host is quite constrained.
> >>
> >> As far as I'm aware 32 bit KVM users are even less numerous. Even
> >> ILP32 doesn't make much sense given the address space QEMU needs to
> >> manage.
> > 
> > For KVM we should wait until the kernel chooses to drop support,
> > I think.
> 
> Agreed.  I think this discussion should be more about TCG.
> 
> >> @@ -745,19 +744,22 @@ case "$cpu" in
> >>;;
> >>armv*b|armv*l|arm)
> >>  cpu="arm"
> >> -supported_cpu="yes"
> >>;;
> > 
> > I'll leave others to voice opinions about their architectures,
> > but I still have 32-bit arm in my test set for builds, and
> > I'm pretty sure we have users (raspi users, for a start).
> 
> I'd really like to know what raspi users might be using qemu for.
> Depending on that answer, perhaps it would be sufficient for arm32 tcg
> to only support 32-bit guests.

I asked on the Fedora development lists for feedback on the idea of
dropping QEMU 32-bit host support:

  https://lists.fedoraproject.org/archives/list/de...@lists.fedoraproject.org/thread/TPAVIC6YANGP2NK4RUOP7OCIOIFIOV3A/

The response was rather underwhelming to say the least, with only one
person explicitly expressing a desire for QEMU to keep 32-bit host
support for armv7l.

The main interesting thing I learnt was that even with 64-bit Raspberry
Pi hardware, it can be desirable to run a 32-bit Raspberry Pi distro,
supposedly for performance benefits.

> For context, the discussion that Alex and I were having was about
> int128_t, and how to support that directly in tcg (especially to/from
> helpers), and how that might be vastly easier if we didn't have to
> consider 32-bit hosts.

I know nothing about TCG internals, but is it viable to implement int128_t
support only on 64-bit hosts, leaving 32-bit hosts without it? Or is this
really a blocking item that is holding back 64-bit host features?

Regards,
Daniel



Re: [RFC PATCH] configure: deprecate 32 bit build hosts

2019-10-02 Thread Daniel P. Berrangé
On Tue, Oct 01, 2019 at 11:02:14AM -0700, Richard Henderson wrote:
> On 10/1/19 10:56 AM, Mark Cave-Ayland wrote:
> > Just out of interest, which host/compiler combinations don't
> > currently implement int128_t?
> 
> GCC only implements int128_t for 64-bit targets.

QEMU probes for that during configure and sets CONFIG_INT128.

If I'm reading include/qemu/int128.h correctly, it then provides a
fallback type based on a struct with two int64s.

This has some inconvenience though, as you have to use the (inline)
function calls for all the basic operations, and it will be less
efficient when using the fallback.

Presumably this is not viable for TCG?

Regards,
Daniel



Re: X86: Abnormal variation in Freebsd VM launch time w.r.t freebsd guest config

2020-04-07 Thread Daniel P. Berrangé
On Tue, Apr 07, 2020 at 09:59:59AM +0530, gokul cg wrote:
> Hi Team,
> 
> We are observing abnormal variation in VM launch time w.r.t guest config.
> 
> A simple VM (2 GB RAM, no passthrough device) usually takes 6 sec to
> create (time from executing 'virsh create guest.xml' to the "Welcome
> to FreeBSD" greeting), but when we add a USB passthrough device the
> launch time increases to 18-19 sec, and further to 39-44 sec when we
> increase guest RAM to 48 GB.

[snip]

> Note: 1) We have seen this with a legacy PCI passthrough device, not
> with VFIO. And we have not noticed any performance impact other than
> the qemu init / VM launch time.

Legacy PCI passthrough support was deleted way back in 2017, and we had
deprecated it for 2 years before then, in favour of VFIO.

>  Any suggestions to improve launch time with legacy passthrough ?

Just stop using legacy PCI assignment. VFIO has been the recommended
implementation for 5+ years now.

Regards,
Daniel




Re: sync guest time

2020-04-30 Thread Daniel P. Berrangé
On Thu, Apr 30, 2020 at 01:52:12PM +0200, Miguel Duarte de Mora Barroso wrote:
> Hi,
> 
> I'm seeing the following issue when attempting to update the guest's
> clock on a running fc32 guest (using guest agent):
> 
> ```
> [root@virt-launcher-vmi-masquerade-mh2xm /]# virsh domtime 1 --pretty
> Time: 2020-04-30 23:27:29
> [root@virt-launcher-vmi-masquerade-mh2xm /]# virsh domtime 1 --sync
> error: internal error: unable to execute QEMU agent command
> 'guest-set-time': hwclock failed to set hardware clock to system time

This error is ultimately coming from the QEMU guest agent inside
your guest. It spawns "hwclock" and this is failing for some reason.
You'll probably need to debug this inside the guest - strace the
QEMU guest agent, see where it fails, and then file a bug against
the distro for it.

> # now, this one works.
> [root@virt-launcher-vmi-masquerade-mh2xm /]# virsh domtime 1 --now
> 
> [root@virt-launcher-vmi-masquerade-44v2x /]# virsh domtime 1 --pretty
> Time: 2020-04-30 11:15:45

This doesn't run hwclock, as it's merely reading the current time.

> Is there any workaround I could try ? Am I doing something wrong here ?

I don't think you're doing anything wrong. This just looks like a guest
OS bug to me.

Regards,
Daniel




Re: sync guest time

2020-04-30 Thread Daniel P. Berrangé
On Thu, Apr 30, 2020 at 05:39:45PM +0200, Miguel Duarte de Mora Barroso wrote:
> On Thu, Apr 30, 2020 at 2:15 PM Daniel P. Berrangé wrote:
> >
> > On Thu, Apr 30, 2020 at 01:52:12PM +0200, Miguel Duarte de Mora Barroso wrote:
> > > Hi,
> > >
> > > I'm seeing the following issue when attempting to update the guest's
> > > clock on a running fc32 guest (using guest agent):
> > >
> > > ```
> > > [root@virt-launcher-vmi-masquerade-mh2xm /]# virsh domtime 1 --pretty
> > > Time: 2020-04-30 23:27:29
> > > [root@virt-launcher-vmi-masquerade-mh2xm /]# virsh domtime 1 --sync
> > > error: internal error: unable to execute QEMU agent command
> > > 'guest-set-time': hwclock failed to set hardware clock to system time
> >
> > This error is ultimately coming from the QEMU guest agent inside
> > your guest. It spawns "hwclock" and this is failing for some reason.
> > You'll probably need to debug this inside the guest - strace the
> > QEMU guest agent, see where it fails, and then file a bug against
> > the distro for it.
> 
> Eventually I found out that if I make the call *without* specifying
> the `libvirt.DOMAIN_TIME_SYNC` flag this works
> as I intend. I've read the docs and could not understand the
> purpose of this flag.
> 
> It reads "Re-sync domain time from domain's RTC" on [0]. It begs the
> question: if I'm setting it to a fixed instant in time (which I am),
> why would I want it to sync with the domain's RTC ?
> 
> Is there any obvious issue that will appear from calling
> `virDomainSetTime` (defined at [1]) without the DOMAIN_TIME_SYNC flag
> specified ?
> 
> I'm not sure if this (removing the DOMAIN_TIME_SYNC) is a fix, an ugly
> hack, or a disaster waiting to happen.

If you pass DOMAIN_TIME_SYNC, then the guest agent updates the
guest OS system time to match the guest OS RTC time.

IOW, this assumes the guest RTC is in sync with the host time,
and we just need to resync the guest OS with the RTC.

If you don't pass DOMAIN_TIME_SYNC, then the time on the host
where you run virsh is sent to the guest OS, and this is used
to set the guest OS system time.  The issue here is all about
latency between when virsh reads the current time and when
the guest OS sets this time.

IOW, I'd generally consider DOMAIN_TIME_SYNC a good thing.


Regards,
Daniel




Re: Qemu, VNC and non-US keymaps

2020-05-11 Thread Daniel P. Berrangé
On Mon, May 11, 2020 at 04:24:32PM +0200, Philippe Mathieu-Daudé wrote:
> Cc'ing more developers.
> 
> On 5/11/20 4:17 PM, B3r3n wrote:
> > Dear all,
> > 
> > I am struggling for days/weeks with Qemu and its VNC accesses...with
> > non-US keymaps.
> > 
> > Let me sum up the facts:
> > - I am using a french keyboard over a Ubuntu 18.04.
> > - I installed a simple Debian in a Qemu VM, configured with FR keyboard
> > (AZERTY).
> > - I am launching the Qemu VM with the '-k fr' keymaping (original)
> > - I tested with Qemu 3.1.1, 4.2.0 & 5.0.0.
> > 
> > I fail to get the AltGr keys, critical to French users (pipe,
> > backslash, dash, etc).
> > Checking with showkey, I see the keys arriving properly (29+56,
> > 29+100, etc).

There is no mention here of what VNC client program is being used, which
is quite important, as key handling is a big mess in VNC.

The default VNC protocol passes X11 keysyms over the wire.

The remote desktop gets hardware scancodes and turns them into keysyms,
which the VNC client sees. The VNC client passes them to the VNC server
in QEMU, which then has to turn them back into hardware scancodes. This
reverse mapping relies on knowledge of the keyboard mapping, and is what
the "-k fr" argument tells QEMU.

For this to work at all, the keymap used by the remote desktop must
match the keymap used by QEMU, which must match the keymap used by
the guest OS.  Even this is not sufficient though, because the act
of translating hardware scancodes into keysyms is *lossy*. There is
no way to reliably go back to hardware scancodes, which is precisely
what QEMU tries to do - some reverse mappings will be ambiguous.

Due to this mess, years ago (over a decade) QEMU introduced a VNC
protocol extension that allows for passing hardware scancodes over
the wire.

With this extension, the VNC client gets the hardware scancode
from the remote desktop, and passes it straight to the VNC server,
which passes it straight to the guest OS, which then applies the
localized keyboard mapping.   This is good because the localized
keyboard mapping conversion is now only done once, in the guest
OS.

To make use of this protocol extension to VNC, you must *NOT*
pass any "-k" arg to QEMU, and must use a VNC client that has
support for this protocol extension.  The GTK-VNC widget supports
this and is used by the virt-viewer, remote-viewer, virt-manager,
GNOME Boxes and Vinagre client applications.  The TigerVNC client
also supports this extension.

To summarize, my recommendation is to remove the "-k" arg entirely,
and pick a VNC client that supports the scancode extension.
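In QEMU command-line terms, the change is simply dropping the "-k" option
(the rest of the command line here is a placeholder, not a complete
invocation):

```shell
# Keysym mode: QEMU applies the lossy 'fr' reverse mapping
qemu-system-x86_64 -vnc :0 -k fr ...

# Scancode-capable mode: with no -k, QEMU advertises the scancode
# extension and capable clients (GTK-VNC, TigerVNC) will use it
qemu-system-x86_64 -vnc :0 ...
```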

It is possible there might be a genuine bug in QEMU's 'fr' keymap
that can be fixed to deal with AltGr problems. Personally though I
don't spend time investigating these problems, as the broad reverse
keymapping problem is unfixable. The only sensible option is to take
the route of using the VNC hardware scancode extension. It is notable
that SPICE learnt from VNC's mistake and used hardware scancodes from
the very start.

Regards,
Daniel




Re: Qemu, VNC and non-US keymaps

2020-05-11 Thread Daniel P. Berrangé
On Mon, May 11, 2020 at 05:29:48PM +0200, B3r3n wrote:
> Hello Daniel,
> 
> > There is no mention here of what VNC client program is being used, which
> > is quite important, as key handling is a big mess in VNC.
> I tested with TightVNC & noVNC through Apache. Both behave the same. I
> did not test Ultr@VNC.

AFAIK, neither TightVNC nor Ultr@VNC supports the scancode extension.

noVNC does, for most modern browsers, so it might work if you remove
the -k arg from QEMU.

> > The default VNC protocol passes X11 keysyms over the wire.
> > 
> > The remote desktop gets hardware scancodes and turns them into keysyms,
> > which the VNC client sees. The VNC client passes them to the VNC server
> > in QEMU, which then has to turn them back into hardware scancodes. This
> > reverse mapping relies on knowledge of the keyboard mapping, and is what
> > the "-k fr" argument tells QEMU.
> > 
> > For this to work at all, the keymap used by the remote desktop must
> > match the keymap used by QEMU, which must match the keymap used by
> > the guest OS.  Even this is not sufficient though, because the act
> > of translating hardware scancodes into keysyms is *lossy*. There is
> > no way to reliably go back to hardware scancodes, which is precisely
> > what QEMU tries to do - some reverse mappings will be ambiguous.
> Yes, I saw that topic passing by. Looks messy with all these interferences...
> 
> > Due to this mess, years ago (over a decade) QEMU introduced a VNC
> > protocol extension that allows for passing hardware scancodes over
> > the wire.
> I guess I also crossed something about this on Internet.
> Are you talking of the RFB protocol ?

Yes, RFB protocol is the technical name for the VNC wire protocol.

> > With this extension, the VNC client gets the hardware scancode
> > from the remote desktop, and passes it straight to the VNC server,
> > which passes it straight to the guest OS, which then applies the
> > localized keyboard mapping.   This is good because the localized
> > keyboard mapping conversion is now only done once, in the guest
> > OS.
> > 
> > To make use of this protocol extension to VNC, you must *NOT*
> > pass any "-k" arg to QEMU, and must use a VNC client that has
> > support for this protocol extension.  The GTK-VNC widget supports
> > this and is used by virt-viewer, remote-viewer, virt-manager,
> > GNOME Boxes, Vinagre client applications.  The TigerVNC client
> > also supports this extension.
> So if I read you correctly, if the client "enforces" this protocol
> (supposedly RFB), QEMU will automatically use it as well?

The client should automatically activate the extension if QEMU advertises
it, and QEMU advertises it if you remove the -k arg.

> Removing -k option is great to me if it works, since user will have its own
> mapping and these are international :-)



> > To summarize, my recommendation is to remove the "-k" arg entirely,
> > and pick a VNC client that supports the scancode extension.
> For now I am using TightVNC & noVNC. noVNC is precious since it widens the
> user world, removing any client software constraint.

As above, noVNC ought to support the extension.

> 
> > It is possible there might be a genuine bug in QEMU's 'fr' keymap
> > that can be fixed to deal with AltGr problems. Personally though I
> > don't spend time investigating these problems, as the broad reverse
> > keymapping problem is unfixable. The only sensible option is to take
> > the route of using the VNC hardware scancode extension. It is notable
> > that SPICE learnt from VNC's mistake and used hardware scancodes from
> > the very start.
> 
> This was another path I intend to follow : using SPICE and a "noSPICE"
> client if VNC was too painful.
> If I understand you, using SPICE could also solve the issue ?
> 
> Many thanks for your inputs...

Regards,
Daniel




Re: Qemu, VNC and non-US keymaps

2020-05-12 Thread Daniel P. Berrangé
On Tue, May 12, 2020 at 09:45:20AM +0200, B3r3n wrote:
> Hello Daniel, all,
> 
> I am a bit confused.
> 
> Ok, RFB protocol should be the solution that solves all, sending scancodes
> rather than doing keysyms stuff. No pb for me.
> So I removed my '-k fr' to my Qemu VM start as it was before.
> 
> However, reading TightVNC & noVNC docs, both are able to perform RFB.

VNC == RFB - they're two different terms for the same thing.

The core RFB/VNC protocol only knows about keysyms.

The hardware scancode extension was defined by QEMU and GTK-VNC, and has
since been implemented by TigerVNC.

Removing the "-k" arg merely enables use of the scancode extension.
This requires a compatible client app that knows the scancode extension.
If the client doesn't support scancodes, it will continue using keysyms,
which will then be treated as a plain US keymap.

AFAIK, TightVNC doesn't support the scancode extension, only TigerVNC.

> Since these explanations, I replayed a bit:
> 
> Under my testing Debian 10, I redefined keyboard to French + No compose key
> + AltGr as CTRL_R

This is a key example of the problems with VNC's traditional key handling.

QEMU has a single keymap, "fr". It has no way of selecting compose key
on/off, or overriding AltGr to be Ctrl_R. So as soon as you do that on
your local desktop, you make it impossible for QEMU's VNC server to work
correctly.

> 
> Under noVNC: Ctrl_R works well as alternative but when using AltGr, I
> received 29+100+7 (AltGr + 6) and keep displaying a minus as with AltGr was
> not pressed.
> 
> Under TightVNC (2.7.10) : Ctrl_R displays characters, I am still in QWERTY
> for letters, weird mapping for other characters, did not checked if
> compliant with whatever definition.
> Under TightVNC (last 2.8.27, supposed to be able to RFB): Ctrl_R displays
> nothing, keys are QWERTY. Seems the same as TightVNC 2.7.10.
> 
> With the keyboard defining AltGr as AltGr, no change.
> 
> I realize that AltGr is sending 29+100 (seen via showkey), when CTRL_R only
> sends 97.
> When using a remote console (iLo and iDRAC), AltGr only sends 100.
> 
> I wonder if the issue would not also be the fact AltGr sends 2 codes, still
> another one to select the character key (6 for example).
> 
> Is that normal Qemu is transforming AltGr (100) in 29+100 ?

It is hard to say without debugging output showing what QEMU received.

Regards,
Daniel




Re: Qemu, VNC and non-US keymaps

2020-05-13 Thread Daniel P. Berrangé
On Wed, May 13, 2020 at 10:38:52AM +0200, B3r3n wrote:
> Hello Daniel,
> 
> Ok, TigerVNC, added -shared=1 to behave the same as TightVNC, works greatly,
> Thanks !
> 
> But funny thing, I saw you were part of exchanges on that topic, noVNC
> totally fails now.
> Despite my keyboard isnt changed, debian VM is just in QWERTY as if noVNC
> only send keysyms.
> 
> If you know how to force noVNC keycodes instead, digging to find the trick :-(

Looking at the current git master code, AFAICT it should "just work",
unless perhaps you have an older version of it.


Regards,
Daniel




Re: QEMU 5.1: Can we require each new device/machine to provided a test?

2020-05-15 Thread Daniel P. Berrangé
On Fri, May 15, 2020 at 12:11:17PM +0200, Thomas Huth wrote:
> On 07/04/2020 12.59, Philippe Mathieu-Daudé wrote:
> > Hello,
> > 
> > Following Markus thread on deprecating unmaintained (untested) code
> > (machines) [1] and the effort done to gather the information shared in
> > the replies [2], and the various acceptance tests added, is it
> > feasible to require for the next release that each new device/machine
> > is provided a test covering it?
> > 
> > If no, what is missing?
> 
> If a qtest is feasible, yes, I think we should require one for new
> devices. But what about machines - you normally need a test image for
> this. In that case, there is still the question where testing images
> could be hosted. Not every developer has a web space where they could
> put their test images onto. And what about images that contain non-free
> code?

Yep, it isn't feasible to make this a hard rule.

IMHO this is where a support tier classification comes into play:

 - Tier 1: actively maintained, qtest coverage available. Expected
   to work reliably at all times since every commit is CI
   tested

 - Tier 2: actively maintained, no qtest coverage. Should usually
   work but regressions may creep in due to reliance on the
   maintainer to manually test on an adhoc basis

 - Tier 3: not actively maintained, unknown state but liable to
   be broken indefinitely

Tier 1 is obviously the most desirable state, which we would like
everything to be at. Contributors will have to fix problems their
patches cause, as they will be blocked by CI.

Tier 2 is an admission that reality gets in the way. Ideally stuff in
this tier will graduate to Tier 1 at some point. Even if it doesn't
though, it is still valid to keep it in QEMU long term. Contributors
shouldn't gratuitously break stuff in these boards, but if they do,
then the maintainer is ultimately responsible for fixing it, as the
contributors don't have a test rig for it.

Tier 3 is abandonware. If a maintainer doesn't appear, users should
not expect it to continue to exist long term. Contributors are free
to send patches which break this, and are under no obligation to
fix problems in these boards. We may deprecate & delete it after a
while.


Over time we'll likely add more criteria to stuff in Tier 1. This
could lead to some things dropping from Tier 1 to Tier 2. This is
OK, as it doesn't make those things worse than they already were.
We're just saying that Tier 2 isn't as thoroughly tested as we
would like it to be in an ideal world.

Regards,
Daniel




Re: QEMU 5.1: Can we require each new device/machine to provided a test?

2020-05-19 Thread Daniel P . Berrangé
On Mon, May 18, 2020 at 03:56:36PM -0400, John Snow wrote:
> 
> 
> On 5/15/20 6:23 AM, Daniel P. Berrangé wrote:
> > On Fri, May 15, 2020 at 12:11:17PM +0200, Thomas Huth wrote:
> >> On 07/04/2020 12.59, Philippe Mathieu-Daudé wrote:
> >>> Hello,
> >>>
> >>> Following Markus thread on deprecating unmaintained (untested) code
> >>> (machines) [1] and the effort done to gather the information shared in
> >>> the replies [2], and the various acceptance tests added, is it
> >>> feasible to require for the next release that each new device/machine
> >>> is provided a test covering it?
> >>>
> >>> If no, what is missing?
> >>
> >> If a qtest is feasible, yes, I think we should require one for new
> >> devices. But what about machines - you normally need a test image for
> >> this. In that case, there is still the question where testing images
> >> could be hosted. Not every developer has a web space where they could
> >> put their test images onto. And what about images that contain non-free
> >> code?
> > 
> > Yep, it isn't feasible to make this a hard rule.
> > 
> > IMHO this is where a support tier classification comes into play
> > 
> >  - Tier 1: actively maintained, qtest coverage available. Expected
> >to work reliably at all times since every commit is CI
> >tested
> > 
> >   - Tier 2: actively maintained, no qtest coverage. Should usually
> >work but regression may creep in due to reliance on the
> >maintainer to manually test on adhoc basis
> > 
> >   - Tier 3: not actively maintained, unknown state but liable to
> > be broken indefinitely
> > 
> > Tier 1 is obviously the most desirable state we would like everthing to
> > be at. Contributors will have to fix problems their patches cause as
> > they will be blocked by CI.
> > 
> > Tier 2 is an admission that reality gets in the way. Ideally stuff in
> > this tier will graduate to Tier 1 at some point. Even if it doesn't
> > though, it is still valid to keep it in QEMU long term. Contributors
> > shouldn't gratuitously break stuff in these board, but if they do,
> > then the maintainer is ultimately responsible for fixing it, as the
> > contributors don't have a test rig for it.
> > 
> > Tier 3 is abandonware. If a maintainer doesn't appear, users should
> > not expect it to continue to exist long term. Contributors are free
> > to send patches which break this, and are under no obligation to
> > fix problems in these boards. We may deprecate & delete it after a
> > while
> > 
> > 
> > Over time we'll likely add more criteria to stuff in Tier 1. This
> > could lead to some things dropping from Tier 1 to Tier 2. This is
> > OK, as it doesn't make those things worse than they already were.
> > We're just saying that Tier 2 isn't as thoroughly tested as we
> > would like it to be in an ideal world.
> 
> I really like the idea of device support tiers codified directly in the
> QEMU codebase, to give upstream users some idea of which devices we
> expect to work and which we ... don't, really.
> 
> Not every last device we offer is enterprise production ready, but we
> don't necessarily do a good job of explaining which devices fall into
> which categories, and we've got quite a few of them.
> 
> I wonder if a 2.5th tier would be useful; something like a "hobbyist"
> tier for pet project SoC boards and the like -- they're not abandoned,
> but we also don't expect them to work, exactly.
> 
> Mild semantic difference from Tier 3.

I guess I was thinking such hobbyist stuff would fall into tier 2  if the
hobbyist maintainer actually responds to fixing stuff, or tier 3 if they
largely aren't active on the mailing list responding to issues/questions.

We could have a 4-tier system overall, putting hobbyist stuff at tier 3
and abandonware at tier 4.

Probably shouldn't go beyond 4 tiers though, as the more criteria we add
the harder it is to clearly decide which tier something should go into.

The tier 1 vs 2 division is clearly split based on CI, which is a simple
classification to decide on.

The tier 2 vs 3 division is moderately clearly split based on whether
there is a frequently active maintainer.

We can probably squeeze in the 4th tier without too much ambiguity in
the classification if we think it is adding something worthwhile, either
from our POV as maintainers, or for users consuming it.

Regards,
Daniel




Re: QEMU 5.1: Can we require each new device/machine to provided a test?

2020-05-20 Thread Daniel P . Berrangé
On Tue, May 19, 2020 at 07:06:40PM -0400, John Snow wrote:
> 
> 
> On 5/19/20 5:04 AM, Daniel P. Berrangé wrote:
> > On Mon, May 18, 2020 at 03:56:36PM -0400, John Snow wrote:
> >>
> >>
> >> On 5/15/20 6:23 AM, Daniel P. Berrangé wrote:
> >>> On Fri, May 15, 2020 at 12:11:17PM +0200, Thomas Huth wrote:
> >>>> On 07/04/2020 12.59, Philippe Mathieu-Daudé wrote:
> >>>>> Hello,
> >>>>>
> >>>>> Following Markus thread on deprecating unmaintained (untested) code
> >>>>> (machines) [1] and the effort done to gather the information shared in
> >>>>> the replies [2], and the various acceptance tests added, is it
> >>>>> feasible to require for the next release that each new device/machine
> >>>>> is provided a test covering it?
> >>>>>
> >>>>> If no, what is missing?
> >>>>
> >>>> If a qtest is feasible, yes, I think we should require one for new
> >>>> devices. But what about machines - you normally need a test image for
> >>>> this. In that case, there is still the question where testing images
> >>>> could be hosted. Not every developer has a web space where they could
> >>>> put their test images onto. And what about images that contain non-free
> >>>> code?
> >>>
> >>> Yep, it isn't feasible to make this a hard rule.
> >>>
> >>> IMHO this is where a support tier classification comes into play
> >>>
> >>>  - Tier 1: actively maintained, qtest coverage available. Expected
> >>>to work reliably at all times since every commit is CI
> >>>  tested
> >>>
> >>>   - Tier 2: actively maintained, no qtest coverage. Should usually
> >>>work but regression may creep in due to reliance on the
> >>>  maintainer to manually test on adhoc basis
> >>>
> >>>   - Tier 3: not actively maintained, unknown state but liable to
> >>> be broken indefinitely
> >>>
> >>> Tier 1 is obviously the most desirable state we would like everthing to
> >>> be at. Contributors will have to fix problems their patches cause as
> >>> they will be blocked by CI.
> >>>
> >>> Tier 2 is an admission that reality gets in the way. Ideally stuff in
> >>> this tier will graduate to Tier 1 at some point. Even if it doesn't
> >>> though, it is still valid to keep it in QEMU long term. Contributors
> >>> shouldn't gratuitously break stuff in these board, but if they do,
> >>> then the maintainer is ultimately responsible for fixing it, as the
> >>> contributors don't have a test rig for it.
> >>>
> >>> Tier 3 is abandonware. If a maintainer doesn't appear, users should
> >>> not expect it to continue to exist long term. Contributors are free
> >>> to send patches which break this, and are under no obligation to
> >>> fix problems in these boards. We may deprecate & delete it after a
> >>> while
> >>>
> >>>
> >>> Over time we'll likely add more criteria to stuff in Tier 1. This
> >>> could lead to some things dropping from Tier 1 to Tier 2. This is
> >>> OK, as it doesn't make those things worse than they already were.
> >>> We're just saying that Tier 2 isn't as thoroughly tested as we
> >>> would like it to be in an ideal world.
> >>
> >> I really like the idea of device support tiers codified directly in the
> >> QEMU codebase, to give upstream users some idea of which devices we
> >> expect to work and which we ... don't, really.
> >>
> >> Not every last device we offer is enterprise production ready, but we
> >> don't necessarily do a good job of explaining which devices fall into
> >> which categories, and we've got quite a few of them.
> >>
> >> I wonder if a 2.5th tier would be useful; something like a "hobbyist"
> >> tier for pet project SoC boards and the like -- they're not abandoned,
> >> but we also don't expect them to work, exactly.
> >>
> >> Mild semantic difference from Tier 3.
> > 
> > I guess I was thinking such hobbyist stuff would fall into tier 2  if the
> > hobbyist maintainer actually responds to fixing stuff, or tier 3 if they
> > largely aren't active on the mailing li

Re: QEMU 5.1: Can we require each new device/machine to provided a test?

2020-05-20 Thread Daniel P . Berrangé
On Wed, May 20, 2020 at 08:13:07AM +0200, Thomas Huth wrote:
> On 20/05/2020 01.06, John Snow wrote:
> > 
> > 
> > On 5/19/20 5:04 AM, Daniel P. Berrangé wrote:
> >> On Mon, May 18, 2020 at 03:56:36PM -0400, John Snow wrote:
> >>>
> >>>
> >>> On 5/15/20 6:23 AM, Daniel P. Berrangé wrote:
> >>>> On Fri, May 15, 2020 at 12:11:17PM +0200, Thomas Huth wrote:
> >>>>> On 07/04/2020 12.59, Philippe Mathieu-Daudé wrote:
> >>>>>> Hello,
> >>>>>>
> >>>>>> Following Markus thread on deprecating unmaintained (untested) code
> >>>>>> (machines) [1] and the effort done to gather the information shared in
> >>>>>> the replies [2], and the various acceptance tests added, is it
> >>>>>> feasible to require for the next release that each new device/machine
> >>>>>> is provided a test covering it?
> >>>>>>
> >>>>>> If no, what is missing?
> >>>>>
> >>>>> If a qtest is feasible, yes, I think we should require one for new
> >>>>> devices. But what about machines - you normally need a test image for
> >>>>> this. In that case, there is still the question where testing images
> >>>>> could be hosted. Not every developer has a web space where they could
> >>>>> put their test images onto. And what about images that contain non-free
> >>>>> code?
> >>>>
> >>>> Yep, it isn't feasible to make this a hard rule.
> >>>>
> >>>> IMHO this is where a support tier classification comes into play
> >>>>
> >>>>  - Tier 1: actively maintained, qtest coverage available. Expected
> >>>>to work reliably at all times since every commit is CI
> >>>> tested
> >>>>
> >>>>   - Tier 2: actively maintained, no qtest coverage. Should usually
> >>>>work but regression may creep in due to reliance on the
> >>>> maintainer to manually test on adhoc basis
> >>>>
> >>>>   - Tier 3: not actively maintained, unknown state but liable to
> >>>> be broken indefinitely
> >>>>
> >>>> Tier 1 is obviously the most desirable state we would like everthing to
> >>>> be at. Contributors will have to fix problems their patches cause as
> >>>> they will be blocked by CI.
> >>>>
> >>>> Tier 2 is an admission that reality gets in the way. Ideally stuff in
> >>>> this tier will graduate to Tier 1 at some point. Even if it doesn't
> >>>> though, it is still valid to keep it in QEMU long term. Contributors
> >>>> shouldn't gratuitously break stuff in these board, but if they do,
> >>>> then the maintainer is ultimately responsible for fixing it, as the
> >>>> contributors don't have a test rig for it.
> >>>>
> >>>> Tier 3 is abandonware. If a maintainer doesn't appear, users should
> >>>> not expect it to continue to exist long term. Contributors are free
> >>>> to send patches which break this, and are under no obligation to
> >>>> fix problems in these boards. We may deprecate & delete it after a
> >>>> while
> >>>>
> >>>>
> >>>> Over time we'll likely add more criteria to stuff in Tier 1. This
> >>>> could lead to some things dropping from Tier 1 to Tier 2. This is
> >>>> OK, as it doesn't make those things worse than they already were.
> >>>> We're just saying that Tier 2 isn't as thoroughly tested as we
> >>>> would like it to be in an ideal world.
> >>>
> >>> I really like the idea of device support tiers codified directly in the
> >>> QEMU codebase, to give upstream users some idea of which devices we
> >>> expect to work and which we ... don't, really.
> >>>
> >>> Not every last device we offer is enterprise production ready, but we
> >>> don't necessarily do a good job of explaining which devices fall into
> >>> which categories, and we've got quite a few of them.
> >>>
> >>> I wonder if a 2.5th tier would be useful; something like a "hobbyist"
> >>> tier for pet project SoC boards and the like -- they're not abandoned,
> >>>

Re: [ovirt-devel] [ARM64] Possiblity to support oVirt on ARM64

2020-07-22 Thread Daniel P . Berrangé
On Sun, Jul 19, 2020 at 09:06:42PM +0300, Nir Soffer wrote:
> On Sun, Jul 19, 2020 at 5:04 PM Zhenyu Zheng  
> wrote:
> >
> > Hi oVirt,
> >
> > We are currently trying to make oVirt work on ARM64 platform, since I'm 
> > quite new to oVirt community, I'm wondering what is the current status 
> > about ARM64 support in the oVirt upstream, as I saw the oVirt Wikipedia 
> > page mentioned there is an ongoing efforts to support ARM platform. We have 
> > a small team here and we are willing to also help to make this work.
> 
> Hi Zhenyu,
> 
> I think this is a great idea, both supporting more hardware, and
> enlarging the oVirt
> community.
> 
> Regarding hardware support we depend mostly on libvirt and qemu, and I
> don't know
> that is the status. Adding relevant lists and people.

libvirt and qemu both support aarch64 guests and hosts for years now,
as do various KVM mgmt apps, so I think it's largely just a matter of
the oVirt code that needs porting and updating to deal with any aarch64
specific aspects.

Regards,
Daniel




Re: Switching to the GitLab bug tracker

2021-05-05 Thread Daniel P . Berrangé
On Wed, May 05, 2021 at 11:55:30AM +0200, Stefano Garzarella wrote:
> On Tue, May 04, 2021 at 12:20:03PM +0200, Philippe Mathieu-Daudé wrote:
> > On 5/4/21 10:43 AM, Stefan Hajnoczi wrote:
> > > On Mon, May 03, 2021 at 01:16:51PM +0200, Thomas Huth wrote:
> > > > As you might have already noticed by some other mails on the qemu-devel
> > > > mailing list, we are in progress of switching our bug tracking tool from
> > > > Launchpad to Gitlab. The new tracker can now be found here:
> > > > 
> > > >  https://gitlab.com/qemu-project/qemu/-/issues
> > > 
> > > Thank you for doing this, Thomas!
> > > 
> > > > 1) We likely won't have the possibility anymore to automatically send 
> > > > e-mail
> > > > notifications for new bugs to the qemu-devel mailing list. If you want 
> > > > to
> > > > get informed about new bugs, please use the notification mechanism from
> > > > Gitlab instead. That means, log into your gitlab account, browse to
> > > > 
> > > >  https://gitlab.com/qemu-project/qemu
> > > > 
> > > > and click on the bell icon at the top of the page to manage your
> > > > notifications, e.g. enable notifications for "New issues" there.
> > > 
> > > All maintainers and most regular contributors should follow the issue
> > > tracker so that QEMU developers are aware of new issues. Please do this!
> > > 
> > > An alternative mechanism is the RSS/Atom feed available by clicking the
> > > "Subscribe to RSS feed" button left of the "New issue" button here:
> > > 
> > >   https://gitlab.com/qemu-project/qemu/-/issues
> > 
> > You can also subscribe to labels of interest [*] going to
> > https://gitlab.com/qemu-project/qemu/-/labels
> > 
> > For example in my case I subscribed to receive notifications
> > only from these labels:
> > 
> > - kind:Bug
> > - Storage
> > - pflash
> > - Fuzzer
> > - workflow:Merged
> 
> Cool feature, I also subscribed to some labels.
> 
> I was trying to assign a label, for example "Storage" to this issue:
> https://gitlab.com/qemu-project/qemu/-/issues/96
> 
> but I can't, should I have some special permission/role?

Yes, anyone who is a QEMU maintainer needs to be added to the GitLab
project with the "Reporter" role to be able to do bug janitoring.


Regards,
Daniel




Re: Libvirt on little.BIG ARM systems unable to start guest if no cpuset is provided

2021-12-14 Thread Daniel P . Berrangé
On Tue, Dec 14, 2021 at 09:34:18AM +, Marc Zyngier wrote:
> On Tue, 14 Dec 2021 00:41:01 +,
> Qu Wenruo  wrote:
> > 
> > 
> > 
> > On 2021/12/14 00:49, Marc Zyngier wrote:
> > > On Mon, 13 Dec 2021 16:06:14 +,
> > > Peter Maydell  wrote:
> > >> 
> > >> KVM on big.little setups is a kernel-level question really; I've
> > >> cc'd the kvmarm list.
> > > 
> > > Thanks Peter for throwing us under the big-little bus! ;-)
> > > 
> > >> 
> > >> On Mon, 13 Dec 2021 at 15:02, Qu Wenruo  wrote:
> > >>> 
> > >>> 
> > >>> 
> > >>> On 2021/12/13 21:17, Michal Prívozník wrote:
> >  On 12/11/21 02:58, Qu Wenruo wrote:
> > > Hi,
> > > 
> > > Recently I got my libvirt setup on both RK3399 (RockPro64) and RPI 
> > > CM4,
> > > with upstream kernels.
> > > 
> > > For RPI CM4 its mostly smooth sail, but on RK3399 due to its 
> > > little.BIG
> > > setup (core 0-3 are 4x A55 cores, and core 4-5 are 2x A72 cores), it
> > > brings quite some troubles for VMs.
> > > 
> > > In short, without proper cpuset to bind the VM to either all A72 cores
> > > or all A55 cores, the VM will mostly fail to boot.
> > > 
> > > s/A55/A53/. There were thankfully no A72+A55 ever produced (just the
> > > though of it makes me sick).
> > > 
> > > 
> > > Currently the working xml is:
> > > 
> > > 2
> > > 
> > > 
> > > But even with vcpupin, pinning each vcpu to each physical core, VM 
> > > will
> > > mostly fail to start up due to vcpu initialization failed with 
> > > -EINVAL.
> > > 
> > > Disclaimer: I know nothing about libvirt (and no, I don't want to
> > > know! ;-).
> > > 
> > > However, for things to be reliable, you need to taskset the whole QEMU
> > > process to the CPU type you intend to use.
> > 
> > Yep, that's what I'm doing.
> 
> Are you sure? The xml directive above seem to only apply to the vcpus,
> and no other QEMU thread.

For historical reasons this XML element is a bit misleadingly named.

With the config

   <vcpu cpuset='...'>2</vcpu>

the 'cpuset' applies to the QEMU process as a whole - its vCPUs,
I/O threads and any other emulator threads.

There is a separate config for setting per-VCPU binding which was
illustrated elsewhere in this thread.
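
As an illustrative sketch of the two levels (the CPU ranges here are invented
for the example, not taken from the thread), the domain XML distinguishes the
process-wide binding from per-vCPU pinning:

```xml
<!-- Process-wide: all QEMU threads (vCPUs, I/O, emulator) on host CPUs 4-5 -->
<vcpu placement='static' cpuset='4-5'>2</vcpu>

<!-- Separate, finer-grained per-vCPU pinning -->
<cputune>
  <vcpupin vcpu='0' cpuset='4'/>
  <vcpupin vcpu='1' cpuset='5'/>
</cputune>
```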

Regards,
Daniel




Re: Libvirt on little.BIG ARM systems unable to start guest if no cpuset is provided

2021-12-14 Thread Daniel P . Berrangé
On Tue, Dec 14, 2021 at 06:59:12PM +0800, Qu Wenruo wrote:
> 
> 
> On 2021/12/14 18:36, Daniel P. Berrangé wrote:
> > On Tue, Dec 14, 2021 at 09:34:18AM +, Marc Zyngier wrote:
> > > On Tue, 14 Dec 2021 00:41:01 +,
> > > Qu Wenruo  wrote:
> > > > 
> > > > 
> > > > 
> > > > On 2021/12/14 00:49, Marc Zyngier wrote:
> > > > > On Mon, 13 Dec 2021 16:06:14 +,
> > > > > Peter Maydell  wrote:
> > > > > > 
> > > > > > KVM on big.little setups is a kernel-level question really; I've
> > > > > > cc'd the kvmarm list.
> > > > > 
> > > > > Thanks Peter for throwing us under the big-little bus! ;-)
> > > > > 
> > > > > > 
> > > > > > On Mon, 13 Dec 2021 at 15:02, Qu Wenruo  
> > > > > > wrote:
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > On 2021/12/13 21:17, Michal Prívozník wrote:
> > > > > > > > On 12/11/21 02:58, Qu Wenruo wrote:
> > > > > > > > > Hi,
> > > > > > > > > 
> > > > > > > > > Recently I got my libvirt setup on both RK3399 (RockPro64) 
> > > > > > > > > and RPI CM4,
> > > > > > > > > with upstream kernels.
> > > > > > > > > 
> > > > > > > > > For RPI CM4 its mostly smooth sail, but on RK3399 due to its 
> > > > > > > > > little.BIG
> > > > > > > > > setup (core 0-3 are 4x A55 cores, and core 4-5 are 2x A72 
> > > > > > > > > cores), it
> > > > > > > > > brings quite some troubles for VMs.
> > > > > > > > > 
> > > > > > > > > In short, without proper cpuset to bind the VM to either all 
> > > > > > > > > A72 cores
> > > > > > > > > or all A55 cores, the VM will mostly fail to boot.
> > > > > 
> > > > > s/A55/A53/. There were thankfully no A72+A55 ever produced (just the
> > > > > though of it makes me sick).
> > > > > 
> > > > > > > > > 
> > > > > > > > > Currently the working xml is:
> > > > > > > > > 
> > > > > > > > >  2
> > > > > > > > >  
> > > > > > > > > 
> > > > > > > > > But even with vcpupin, pinning each vcpu to each physical 
> > > > > > > > > core, VM will
> > > > > > > > > mostly fail to start up due to vcpu initialization failed 
> > > > > > > > > with -EINVAL.
> > > > > 
> > > > > Disclaimer: I know nothing about libvirt (and no, I don't want to
> > > > > know! ;-).
> > > > > 
> > > > > However, for things to be reliable, you need to taskset the whole QEMU
> > > > > process to the CPU type you intend to use.
> > > > 
> > > > Yep, that's what I'm doing.
> > > 
> > > Are you sure? The xml directive above seem to only apply to the vcpus,
> > > and no other QEMU thread.
> > 
> > For historical reasons this XML element is a bit misleadingly named.
> > 
> > With the config
> > 
> > 2
> > 
> > the 'cpuset' applies to the QEMU process as a whole - its vCPUs,
> > I/O threads and any other emulator threads.
> > 
> > There is a separate config for setting per-VCPU binding which was
> > illustrated elsewhere in this thread.
> 
> Which also means, I can put the io threads to A53 cores freeing up the
> A72 cores more.
> 
> And is there any plan to deprecate the old "cpuset" key of vcpu element,
> and recommend to use "vcpupin" element?

No, they're complementary as they're operating at different levels,
and not every scenario needs this fine-grained level of control.
In fact if you just use 'vcpupin' and don't provide 'cpuset', then
libvirt internally treats it as if 'cpuset' was the union of all
'vcpupin' bitsets.
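
That union behaviour amounts to a one-line set operation; a toy model of the
inference (not libvirt code, names invented for the sketch):

```python
def effective_cpuset(vcpupins):
    """Given per-vCPU pinning sets, compute the process-wide cpuset that
    would be inferred when no explicit 'cpuset' is given: the union of
    all per-vCPU bitsets."""
    result = set()
    for cpus in vcpupins.values():
        result |= cpus
    return result

# vcpupin vcpu=0 -> host CPU 4, vcpupin vcpu=1 -> host CPU 5
pins = {0: {4}, 1: {5}}
print(sorted(effective_cpuset(pins)))   # [4, 5]
```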

Regards,
Daniel




Re: MP tables do not report multiple CPUs in Qemu 6.2.0 on x86 when given -smp cpus=n flag

2022-01-20 Thread Daniel P . Berrangé
On Thu, Jan 20, 2022 at 03:38:26PM +0530, Ani Sinha wrote:
> Actually I am not quite right. This is the real change which changed the
> preference. The previous change was a code re-org that preserved the
> behavior:
> 
> commit 4a0af2930a4e4f64ce551152fdb4b9e7be106408
> Author: Yanan Wang 
> Date:   Wed Sep 29 10:58:09 2021 +0800
> 
> machine: Prefer cores over sockets in smp parsing since 6.2
> 
> In the real SMP hardware topology world, it's much more likely that
> we have high cores-per-socket counts and few sockets totally. While
> the current preference of sockets over cores in smp parsing results
> in a virtual cpu topology with low cores-per-sockets counts and a
> large number of sockets, which is just contrary to the real world.
> 
> Given that it is better to make the virtual cpu topology be more
> reflective of the real world and also for the sake of compatibility,
> we start to prefer cores over sockets over threads in smp parsing
> since machine type 6.2 for different arches.
> 
> In this patch, a boolean "smp_prefer_sockets" is added, and we only
> enable the old preference on older machines and enable the new one
> since type 6.2 for all arches by using the machine compat mechanism.
> 
> Suggested-by: Daniel P. Berrange 
> Signed-off-by: Yanan Wang 
> Acked-by: David Gibson 
> Acked-by: Cornelia Huck 
> Reviewed-by: Andrew Jones 
> Reviewed-by: Pankaj Gupta 
> Reviewed-by: Daniel P. Berrangé 
> Message-Id: <20210929025816.21076-10-wangyana...@huawei.com>
> Signed-off-by: Paolo Bonzini 
> 
> In any case, the behavior change is intended because of the reasons the
> above commit outlines.

A further compelling reason not mentioned there is that some OSes will
artificially restrict how many sockets they are willing to use, while
happily using as many cores as they get. This is usually a licensing
or billing restriction rather than a technical one, and kinda
silly since cores/sockets are basically interchangeable, but that's
life.
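
The preference change can be sketched as a toy model (deliberately simplified;
QEMU's real -smp parser also handles explicitly given dimensions, maxcpus, and
per-arch details):

```python
def parse_smp(cpus, prefer_sockets):
    """Toy model of -smp parsing when only a CPU count is given: fill the
    preferred dimension with all unspecified CPUs, leaving the others at 1."""
    if prefer_sockets:
        # Machine types up to 6.1: prefer sockets over cores
        return {"sockets": cpus, "cores": 1, "threads": 1}
    # Machine types 6.2 onwards: prefer cores over sockets
    return {"sockets": 1, "cores": cpus, "threads": 1}

print(parse_smp(16, prefer_sockets=True))   # old: 16 sockets x 1 core
print(parse_smp(16, prefer_sockets=False))  # new: 1 socket x 16 cores
```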

Regards,
Daniel




Re: QEMU cpu socket allocation

2022-05-18 Thread Daniel P . Berrangé
On Tue, May 17, 2022 at 10:50:02AM +, Rajesh A wrote:
> Hi QEMU dev
> 
> Virt Manager is able to configure a QEMU VM with more CPU sockets
> than the physical host has.
> For example, in the below VM, when I request 16 vCPU cores,  by
> default it takes as 16 Sockets with 1 core each. The host itself
> has only 2 Sockets.

You've told virt-manager to overcommit your CPUs.

>   1.  How does QEMU allow this and how the VM works?

It is functionally fine. A vCPU is merely an OS thread, and the OS
scheduler will just schedule each vCPU as it would any OS thread.

If multiple vCPUs all try to run work at the same time, then they
are going to compete with each other for running time on the physical
CPUs, and so they'll only get a subset of the time they really want.

>   2.  What is the recommended configuration of Sockets/Cores/Threads
> for best VM performance of a 16 core VM running on a 2 sockets host ?

You've not said how many cores & threads your host CPUs have, so we
can't answer that.

The normal recommendation is not to overcommit logical CPUs, where
logical CPUs means sockets * cores * threads.

VMs created by virt-manager don't use CPU pinning, so they float
freely across host CPUs as decided by the host OS scheduler. With
floating vCPUs, the CPU topology should never use threads > 1,
but the mix of cores & sockets has essentially no impact on performance.
Since each vCPU will move across different host pCPUs over time, the
effective topology is constantly changing.
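
For example (illustrative numbers only): a host with 2 sockets x 8 cores x
1 thread has 16 logical CPUs, so a non-overcommitted 16-vCPU guest keeping
threads=1 might be launched along these lines:

```sh
# 16 vCPUs; the cores-vs-sockets split is largely a matter of taste for
# floating vCPUs, but threads should stay at 1
qemu-system-x86_64 -smp 16,sockets=2,cores=8,threads=1 ...
```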

With regards,
Daniel




Re: If your networking is failing after updating to the latest git version of QEMU...

2022-10-03 Thread Daniel P . Berrangé
On Mon, Oct 03, 2022 at 11:36:36AM +0100, Peter Maydell wrote:
> On Mon, 3 Oct 2022 at 11:25, Alex Bennée  wrote:
> >
> >
> > Peter Maydell  writes:
> >
> > > On Mon, 3 Oct 2022 at 10:09, Alex Bennée  wrote:
> > >>
> > >>
> > >> Thomas Huth  writes:
> > >>
> > >> > On 29/09/2022 04.32, Jason Wang wrote:
> > >> >> On Thu, Sep 29, 2022 at 1:06 AM Philippe Mathieu-Daudé 
> > >> >>  wrote:
> > >> >>> Jason, Marc-André, could we improve the buildsys check or display
> > >> >>> a more helpful information from the code instead?
> > >> >> It looks to me we need to improve the build.
> > >> >
> > >> > I'm not sure there is anything to improve in the build system -
> > >> > configure/meson.build are just doing what they should: Pick the
> > >> > default value for "slirp" if the user did not explicitly specify
> > >> > "--enable-slirp".
> > >>
> > >> Shouldn't it be the other way round and fail to configure unless the
> > >> user explicitly calls --disable-slirp?
> > >
> > > Our standard pattern for configure options is:
> > >  --enable-foo : check for foo; if it can't be enabled, fail configure
> > >  --disable-foo : don't even check for foo, and don't build it in
> > >  no option given : check for foo, decide whether to build in support if
> > >it's present
> >
> > Don't we make a distinction between libs that are truly optional and
> > those you probably need.
> 
> Yes. If something is truly mandatory then configure will always
> fail. This is true for zlib and glib, for instance...
> 
> > It seems missing working networking is one of
> > those things we should be telling the user about unless explicitly
> > disabled. It is after all how we worked before, we would silently
> > checkout libslirp and build it for you.
> 
> ...but building without libslirp is perfectly reasonable for some
> configurations, eg where you know you're going to be using QEMU
> in a TAP network config, and you don't want to have libslirp in
> your binary so you don't have to think about whether you need to
> act on security advisories relating to it. "no slirp" isn't like
> "no zlib", where you can't build a QEMU at all. I think it's more
> like gtk support, where we will happily configure without gtk/sdl/etc
> and only build in the VNC frontend -- that's a working configuration
> in some sense, but for the inexperienced user a QEMU which doesn't
> produce a GUI window is almost certainly not what they wanted.
> 
> So we could:
>  * say that we will opt for consistency, and have the slirp
>detection behave like every other optional library
>  * say that slirp is a special case purely because we used to
>ship it as a submodule and so users are used to it being present
>  * say that slirp is a special case because it's "optional but
>only experts will want to disable it", and think about what
>other configure options (like GUI support) we might want to
>move into this category
> 
> I don't think there's an obvious right answer here...

What I find particularly weird about slirp is how we have conditionalized
the behaviour of a bare 'qemu-system-XXX' invocation. In a fully featured
build, a no-args QEMU will be equivalent to "-net nic,model=MODEL -net user",
but in a non-SLIRP build a no-args QEMU is merely "-net nic,model=MODEL".

If you specified "-net nic,model=MODEL" normally, you would get a message
printed:

   warning: hub 0 is not connected to host network

but we squelch this warning for the built-in default network, even though
the defaults are useless because of the missing backend.

I'm surprised we didn't just entirely remove the default NIC when no slirp
is present, given it can't do anything useful with the traffic. The
complete absence of a NIC would give a stronger sign to users that
something is different.

Ultimately it is only this 'no args' default case that is a significant
problem, as for the explicit slirp-args case we can report a clear error
message that explains the missing feature. Is the 'no args' default important
enough that we need to make slirp diverge from normal configure/meson
practice for detecting libs?
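
To make the divergence concrete (command lines are illustrative; 'MODEL'
stands for the board's default NIC model as above): with slirp available, a
bare invocation implicitly behaves like the first command, while a slirp-less
build needs an explicit backend such as TAP:

```sh
# What a no-args QEMU is equivalent to in a fully featured build
qemu-system-x86_64 -net nic,model=MODEL -net user

# Without slirp, an explicit backend must be given, e.g. a TAP device
qemu-system-x86_64 -netdev tap,id=n0,ifname=tap0,script=no,downscript=no \
                   -device virtio-net-pci,netdev=n0
```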

With regards,
Daniel




Re: dropping 32-bit host support

2023-03-16 Thread Daniel P . Berrangé
On Thu, Mar 16, 2023 at 02:11:08PM +0300, Andrew Randrianasulu wrote:
> Thu, 16 Mar 2023, 14:02 Thomas Huth wrote:
> 
> > On 16/03/2023 11.22, Andrew Randrianasulu wrote:
> > >
> > >
> > > чт, 16 мар. 2023 г., 12:17 Andrew Randrianasulu  > > >:
> > >
> > >
> > >
> > > чт, 16 мар. 2023 г., 11:31 Thomas Huth  > > >:
> > >
> > > On 16/03/2023 08.36, Philippe Mathieu-Daudé wrote:
> > >  > On 16/3/23 08:17, Andrew Randrianasulu wrote:
> > >  >>
> > >  >> чт, 16 мар. 2023 г., 10:05 Philippe Mathieu-Daudé
> > > mailto:phi...@linaro.org>
> > >  >> >>:
> > >  >>
> > >  >> Hi Andrew,
> > >  >>
> > >  >> On 16/3/23 01:57, Andrew Randrianasulu wrote:
> > >  >>  > Looking at https://wiki.qemu.org/ChangeLog/8.0
> > >  >>  >
> > >  >>  > ===
> > >  >>  > System emulation on 32-bit x86 and ARM hosts has been deprecated.
> > >  >>  > The QEMU project no longer considers 32-bit x86 and ARM support
> > >  >>  > for system emulation to be an effective use of its limited
> > >  >>  > resources, and thus intends to discontinue it.
> > >  >>  >
> > >  >>  >   ==
> > >  >>  >
> > >  >>  > well, I guess arguing from the memory-consumption point on
> > >  >>  > 32-bit x86 hosts (like my machine, where I run a 32-bit
> > >  >>  > userspace on a 64-bit kernel)
> > >
> > > All current PCs have multiple gigabytes of RAM, so using a 32-bit
> > > userspace to save a few bytes sounds weird.
> > >
> > >
> > > I think the difference is more like 20-30% (on disk and in RAM), not a
> > > *few bytes*.
> > >
> > >
> > > I stand (self-)corrected on the *on disk* binary size; it tends to be
> > > about the same between the bash / php binaries from Slackware 15.0
> > > i586/x86_64. I do not have a fully identical x86_64 Slackware setup for
> > > measuring the memory impact.
> > >
> > >
> > > Still, pushing users into endless hardware upgrades is no fun:
> > >
> > >
> > https://hackaday.com/2023/02/28/repurposing-old-smartphones-when-reusing-makes-more-sense-than-recycling/
> > >
> > >
> > > note e-waste and energy consumption
> >
> > Now you're mixing things quite badly. That would be an argument in the
> > years
> > before 2010 maybe, when not everybody had a 64-bit processor in their PC
> > yet, but it has now been more than 12 years that all recent desktop
> > processors
> 
> ===
> 
> 
> Laptops, tablets etc exist.
> 
> 
> >
> > feature 64-bit mode. So if QEMU stops supporting 32-bit x86 environments,
> > this is not forcing you to buy a new hardware, since you're having a
> > 64-bit
> > hardware already anyway. If someone still has plain 32-bit x86 hardware
> > around for their daily use, that's certainly not a piece of hardware you
> > want to run QEMU on, since it's older than 12 years already, and thus not
> > really strong enough to run a recent emulator in a decent way.
> >
> 
> Well, current qemu runs quite well, thank you very much (modulo all this
> twiddling with command line switches). I think the very fact it runs well
> (even as a tcg-only emulator, on integer tasks at least) on 32-bit hosts is
> actually good, and if 32-bit arm hardware can keep some code paths in
> working state for me - even better.

The problem being debated here is not a technical one, but a question
of resource prioritization.

It is certainly technically possible to keep 32-bit support working
indefinitely and there are certainly people who would benefit from
that, like yourself.

The issue is that it comes at a cost to the QEMU project both in terms
of where our contributors invest their time, and in terms of what we
use our CI resources for. Both maintainer time and hardware resources
are finite quantities.

IOW, if we continue to support 32-bit host, that means that we will
be unable to work on some other feature(s) instead.

The question we're battling with is where to draw the line, so that
we can bring the biggest benefit to QEMU consumers as a whole.

If we keep supporting 32-bit host, that may (hypothetically) benefit
100 users.

If we drop 32-bit host we might be able to develop some new features
that (hypothetically) benefit 5000 new users.

In this illustration, it would make sense to drop 32-bit, because in
aggregate that choice benefits more users.

Re: dropping 32-bit host support

2023-03-16 Thread Daniel P . Berrangé
On Thu, Mar 16, 2023 at 04:01:06PM +0300, Andrew Randrianasulu wrote:
> Well, this language about "market" and "investment" is not just a figure of
> speech, sadly? Because paid developers work on areas they are paid to
> develop, by a boss with big bucks.

This is FUD.

Many QEMU maintainers are employed, but that does not mean that their
boss gets to dictate what the QEMU community does. The company has its
priorities but this cannot be forced onto the community. Changes have
to be made through tradeoffs and consensus building across all active
maintainers.

To put it another way, responsible open source maintainers/contributors
wear two hats.

With their corporate hat on they have tasks to work on that are directly
important to their employer in the short term. They can make a case for
why these contributions are beneficial, but there's never a guarantee
the community will agree / accept it.

With their community hat on they look at, and work on, what is important
for the health of the community in general. This can sometimes be contrary
to what the employer would otherwise like to see. Wise companies accept
this tradeoff, because the long term health of the community is ultimately
important to them too.

QEMU is fortunate to have many responsible maintainers who balance the
demands of their employer vs the community on an ongoing basis.

With regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|




Re: [PATCH] hw/i386/pc_piix: Mark the machine types from version 1.4 to 1.7 as deprecated

2023-09-13 Thread Daniel P . Berrangé
On Wed, Sep 13, 2023 at 08:33:01AM +0200, Philippe Mathieu-Daudé wrote:
> On 18/1/22 09:49, Thomas Huth wrote:
> > On 17/01/2022 21.12, Daniel P. Berrangé wrote:
> > > On Mon, Jan 17, 2022 at 08:16:39PM +0100, Thomas Huth wrote:
> > > > The list of machine types grows larger and larger each release ... and
> > > > it is unlikely that many people still use the very old ones for live
> > > > migration. QEMU v1.7 was released more than 8 years ago, so most
> > > > people should have updated their machines to a newer version in those
> > > > 8 years at least once. Thus let's mark the very old 1.x machine types
> > > > as deprecated now.
> > > 
> > > What criteria did you use for picking v1.7 as the end point ?
> > 
> > I picked everything starting with a "1." this time ;-)
> > 
> > No, honestly, since we don't have a deprecation policy in place yet,
> > there was no real good criteria around this time. For the machine types
> > < 1.3 there was a bug with migration, so these machine types could not
> > be used for reliable migration anymore anyway. But for the newer machine
> > types, we likely have to decide by other means indeed.
> > 
> > > I'm fine with the idea of aging out machine types, but I'd like us
> > > to explain the criteria we use for this, so that we can set clear
> > > expectations for users. I'm not a fan of adhoc decisions that have
> > > different impact every time we randomly decide to apply them.
> > > 
> > > A simple rule could be time based - eg we could say
> > > 
> > >    "we'll keep machine type versions for 5 years or 15 releases."
> > > 
> > > one factor is how long our downstream consumers have been keeping
> > > machines around for.
> > > 
> > > In RHEL-9 for example, the oldest machine is "pc-i440fx-rhel7.6.0"
> > > which IIUC is derived from QEMU 2.12.0. RHEL-9 is likely to rebase
> > > QEMU quite a few times over the coming years, so that 2.12.0 version
> > > sets an example baseline for how long machines might need to live for.
> > > That's 4 years this April, and could potentially be 6-7 years by the
> > > time RHEL-9 stops rebasing QEMU.
> > 
> > Yeah, 5 years still seemed a little bit short to me, that's one of the
> > reasons why I did not add more machine types in my patch here. I think
> > with 7 or 8 years, we should be on the safe side.
> > 
> > Any other opinions? And if we agree on an amount of years, where should
> > we document this? At the top of docs/about/deprecated.rst?
> 
> I suppose x86 being the oldest, x86 maintainers have to comment, but
> 5 years should be enough for sysadmins to migrate their VMs, isn't it?
> (No need to migrate from 1 -> 8, they can do 1 -> 3 -> 5 -> 8, right?)

You can't change guest hardware during migrate. So whether you go direct
from 1 -> 8, or go from 1 -> 2 -> 3 -> 4 -> 5 -> 6 -> 7 -> 8, you're
going to have the same guest hardware before and after every step.


If someone is using upstream QEMU, I'm sceptical they will successfully
live migrate over QEMU versions spanning 5+ years. While we make a pretty
decent effort at ensuring back compat, and fixing problems, we've had
a number of mistakes over the years, that were caught in RHEL downstream
testing.

If someone is using RHEL QEMU (or another vendor who's putting in a lot
of effort at live migration testing), then I can see them spanning over
5 years for a VM deployment. Of course they *should* have VM reboots over
that timeframe to deploy new kernels for example, so they will have had
opportunities to update the machine type, but it does not mean they have
actually done so.

The pc-i440fx-rhel7.6.0 machine type I mentioned earlier in the thread is
a bit of an unusual case, as that has lasted longer than intended (RHEL-7,
RHEL-8, and RHEL-9). Normally our downstream policy is for machine types
to last 2 major RHEL releases, so you can deploy on N and later upgrade
the VM to N+1 without a reboot for re-configuration.

Now in the case of RHEL we don't use upstream QEMU machines types, so we
don't actually care when QEMU deprecates and deletes old machine types.

What matters is whether there are any internal tunable knobs that were
used in the pc_compat_*_fn() functions that get deleted as a result
of their usage going away.  For example our rhel7.6.0 machine type uses

m->async_pf_vmexit_disable = true;
m->smbus_no_migration_support = true;
m->deprecation_reason = rhel_old_machine_deprecation;
pcmc->pvh_enabled = false;