[Qemu-devel] [PATCH] util/mmap-alloc: support MAP_SYNC in qemu_ram_mmap()

2017-12-26 Thread Haozhong Zhang
When a file supporting DAX is used as a vNVDIMM backend, additionally
mmapping it with the MAP_SYNC flag guarantees the persistence of guest
writes to the backend file without further QEMU actions (e.g., a
periodic fsync() by QEMU).

By using the MAP_SHARED_VALIDATE flag together with MAP_SYNC, we ensure
that mmap with MAP_SYNC fails if MAP_SYNC is not supported by the kernel
or the backend file. On such failures, QEMU retries mmap without
MAP_SYNC and MAP_SHARED_VALIDATE.

Signed-off-by: Haozhong Zhang 
---
 util/mmap-alloc.c | 36 +++-
 1 file changed, 35 insertions(+), 1 deletion(-)

diff --git a/util/mmap-alloc.c b/util/mmap-alloc.c
index 2fd8cbcc6f..37b302f057 100644
--- a/util/mmap-alloc.c
+++ b/util/mmap-alloc.c
@@ -18,7 +18,18 @@
 
 #ifdef CONFIG_LINUX
#include <linux/mman.h>
+
+/*
+ * MAP_SHARED_VALIDATE and MAP_SYNC were introduced in 4.15 kernel, so
+ * they may not be defined when compiling on older kernels.
+ */
+#ifndef MAP_SHARED_VALIDATE
+#define MAP_SHARED_VALIDATE   0x3
 #endif
+#ifndef MAP_SYNC
+#define MAP_SYNC  0x80000
+#endif
+#endif /* CONFIG_LINUX */
 
 size_t qemu_fd_getpagesize(int fd)
 {
@@ -97,6 +108,7 @@ void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared)
 #endif
 size_t offset;
 void *ptr1;
+int xflags = 0;
 
 if (ptr == MAP_FAILED) {
 return MAP_FAILED;
@@ -107,12 +119,34 @@ void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared)
 assert(align >= getpagesize());
 
 offset = QEMU_ALIGN_UP((uintptr_t)ptr, align) - (uintptr_t)ptr;
+
+#if defined(__linux__)
+/*
+ * If 'fd' refers to a file supporting DAX, mmapping it with MAP_SYNC
+ * guarantees the persistence of guest writes without other
+ * actions in QEMU (e.g., fsync() in QEMU).
+ *
+ * MAP_SHARED_VALIDATE ensures mmap with MAP_SYNC fails if
+ * MAP_SYNC is not supported by the kernel or the file.
+ *
+ * On failures of mmap with xflags, QEMU will retry mmap without
+ * xflags.
+ */
+xflags = shared ? (MAP_SHARED_VALIDATE | MAP_SYNC) : 0;
+#endif
+
+ retry_mmap_fd:
 ptr1 = mmap(ptr + offset, size, PROT_READ | PROT_WRITE,
 MAP_FIXED |
 (fd == -1 ? MAP_ANONYMOUS : 0) |
-(shared ? MAP_SHARED : MAP_PRIVATE),
+(shared ? MAP_SHARED : MAP_PRIVATE) | xflags,
 fd, 0);
 if (ptr1 == MAP_FAILED) {
+if (xflags) {
+xflags = 0;
+goto retry_mmap_fd;
+}
+
 munmap(ptr, total);
 return MAP_FAILED;
 }
-- 
2.14.1




Re: [Qemu-devel] [PATCH 13/17] e500: move PCI host bridge into CCSR

2017-12-26 Thread Michael Davidsaver
On 12/06/2017 05:11 AM, David Gibson wrote:
> On Tue, Dec 05, 2017 at 10:42:25PM -0500, Michael Davidsaver wrote:
>> On 12/05/2017 01:53 AM, David Gibson wrote:
>>> On Sun, Nov 26, 2017 at 03:59:11PM -0600, Michael Davidsaver wrote:
 Signed-off-by: Michael Davidsaver 
>>>
>>> Hmm.  Is there anything you're *not* planning to move under the CCSR.
>>
>> Well, the decrementer/timebase initialization for one as this has
>> nothing to do with the CCSR registers.
> 
> Right, but no actual devices, even small ones?

Well, there is the GPIO controller ("mpc8xxx_gpio"), though I frankly
have no idea where it comes from.  Neither the MPC8540 nor the 8544
defines anything at CCSR offset 0xFF000, and the registers modeled
differ from the GPIO controller defined at 0xE0030.

So I'm considering this to be specific to the existing boards.

>> I haven't added the TSEC/eTSEC instances either.
>> Partly this is because the existing boards, for reasons I don't understand,
>> use virtio NICs.
>>
>> Further, the mpc8540 has TSEC instances 1 and 2, while the mpc8544
>> has instances 1 and 3.  So I decided to leave NIC setup to the Machine
>> rather then add the extra code to parameterize this under the CCSR device.
>>
>>> If not, I'm really wondering if the CCSR ought to be a device in its
>>> own right, rather than just a container memory region used within the
>>> machine.
>>
>> I don't think I follow what you mean by "device" in this context?
>> The CCSR object is a SysBusDevice in the qom tree ("/machine/e500-ccsr").
>> What device-like characteristics could it have?
> 
> Sorry, I wasn't clear.  the CCSR definitely *is* a device in the
> current scheme, but I'm wondering if that was a good idea.

I think I see what you're getting at.  You're right that CCSR
isn't a "device" in the same sense as eg. a UART.  In my mind
it's more like a PCI host bridge, having a few registers itself,
though serving mainly to route to the devices behind it.

I see the CCSR "device" as the bridge onto the system bus.
In some ways it might be considered to be the only device
on the system bus apart from the CPU(s).  This "device"
handles first level routing of physical addresses to
RAM, PCI, or local bus via the LAWBAR* registers (unmodeled).
Or to the I/O devices in the CCSR range via CCSRBAR.

I don't have plans to model the LAWBAR* registers,
as it hasn't proved necessary for RTEMS or Linux guests.
I have considered how this could be done, since Linux does
touch these registers, but it doesn't read back or check the
values it has written.  I think having a CCSR "device"
would make it simpler to model the local access windows
should this become desirable, e.g. if Linux starts
validating LAWBAR* or to run unmodified u-boot.





Re: [Qemu-devel] [PATCH v3 2/6] tests: migration test deprecated commands

2017-12-26 Thread Peter Xu
On Tue, Dec 26, 2017 at 08:51:10PM +0100, Juan Quintela wrote:
> Peter Xu  wrote:
> > On Fri, Dec 01, 2017 at 01:58:09PM +0100, Juan Quintela wrote:
> >> We now test the deprecated commands every time that we test the new
> >> commands.  This makes it unnecessary to add tests for deprecated commands.
> >> 
> >> Signed-off-by: Juan Quintela 
> >> ---
> >>  tests/migration-test.c | 32 
> >>  1 file changed, 28 insertions(+), 4 deletions(-)
> >> 
> >> diff --git a/tests/migration-test.c b/tests/migration-test.c
> >> index 799e24ebc6..51f49c74e9 100644
> >> --- a/tests/migration-test.c
> >> +++ b/tests/migration-test.c
> >> @@ -369,7 +369,7 @@ static void migrate_check_parameter(QTestState *who, 
> >> const char *parameter,
> >>  QDECREF(rsp);
> >>  }
> >>  
> >> -static void migrate_set_downtime(QTestState *who, const double value)
> >> +static void deprecated_set_downtime(QTestState *who, const double value)
> >>  {
> >>  QDict *rsp;
> >>  gchar *cmd;
> >> @@ -388,7 +388,7 @@ static void migrate_set_downtime(QTestState *who, 
> >> const double value)
> >>  g_free(expected);
> >>  }
> >>  
> >> -static void migrate_set_speed(QTestState *who, const char *value)
> >> +static void deprecated_set_speed(QTestState *who, const char *value)
> >>  {
> >>  QDict *rsp;
> >>  gchar *cmd;
> >> @@ -402,6 +402,30 @@ static void migrate_set_speed(QTestState *who, const 
> >> char *value)
> >>  migrate_check_parameter(who, "max-bandwidth", value);
> >>  }
> >>  
> >> +static void migrate_set_parameter(QTestState *who, const char *parameter,
> >> +  const char *value)
> >> +{
> >> +QDict *rsp;
> >> +gchar *cmd;
> >> +
> >> +if (strcmp(parameter, "downtime-limit") == 0) {
> >> +deprecated_set_downtime(who, 0.12345);
> >> +}
> >> +
> >> +if (strcmp(parameter, "max-bandwidth") == 0) {
> >> +deprecated_set_speed(who, "12345");
> >> +}
> >
> > I'm fine with current approach, but I would really prefer to put them
> > all into a standalone test, by just calling them one by one with some
> > specific numbers and that's all.
> 
> That means another test (at least), and we have at least three
> deprecated commands:
> - migrate_set_speed
> - migrate_set_downtime
> - migrate_set_cache_size
> 
> And each test makes things slower.  So I *thought* it would be wiser to
> always exercise both the deprecated and the non-deprecated command.
> 
> > (luckily we only have two deprecated commands and limited tests,
> >  otherwise extra commands will be M*N, say "number of deprecated
> >  commands" * "number of test migrations")
> 
> Each test takes time, so adding tests makes everything much slower.
> Notice that setting a new setting is fast.
> 
> This was the way that I understood Dave he wanted.

Do you mean every test is slow, or just migration tests?  Here I mean
to only test setting the parameters, without doing real migrations
(then query to make sure the settings took effect).  I assume
that should be fast too?  Thanks,
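For reference, each deprecated command and its migrate-set-parameters equivalent from the quoted test correspond to QMP traffic roughly like this (note migrate_set_downtime takes seconds while downtime-limit is in milliseconds):

```json
{ "execute": "migrate_set_downtime", "arguments": { "value": 0.12345 } }
{ "execute": "migrate-set-parameters", "arguments": { "downtime-limit": 123 } }

{ "execute": "migrate_set_speed", "arguments": { "value": 12345 } }
{ "execute": "migrate-set-parameters", "arguments": { "max-bandwidth": 12345 } }
```

Either form can then be verified with query-migrate-parameters, which is what migrate_check_parameter() does.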

-- 
Peter Xu



Re: [Qemu-devel] [PATCH v3] Add ability for user to specify mouse ungrab key

2017-12-26 Thread no-reply
Hi,

This series failed automatic build test. Please find the testing commands and
their output below. If you have docker installed, you can probably reproduce it
locally.

Type: series
Message-id: 20171227003544.39826-1-programmingk...@gmail.com
Subject: [Qemu-devel] [PATCH v3] Add ability for user to specify mouse ungrab 
key

=== TEST SCRIPT BEGIN ===
#!/bin/bash
set -e
git submodule update --init dtc
# Let docker tests dump environment info
export SHOW_ENV=1
export J=8
time make docker-test-quick@centos6
time make docker-test-build@min-glib
time make docker-test-mingw@fedora
# iotests is broken now, skip
# time make docker-test-block@fedora
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 t [tag update]patchew/20171201125813.1437-1-quint...@redhat.com -> 
patchew/20171201125813.1437-1-quint...@redhat.com
 * [new tag]   
patchew/20171227003544.39826-1-programmingk...@gmail.com -> 
patchew/20171227003544.39826-1-programmingk...@gmail.com
Switched to a new branch 'test'
5b8d161b85 Add ability for user to specify mouse ungrab key

=== OUTPUT BEGIN ===
Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc'
Cloning into '/var/tmp/patchew-tester-tmp-3zjto7_2/src/dtc'...
Submodule path 'dtc': checked out '558cd81bdd432769b59bff01240c44f82cfb1a9d'
  BUILD   centos6
make[1]: Entering directory '/var/tmp/patchew-tester-tmp-3zjto7_2/src'
  GEN 
/var/tmp/patchew-tester-tmp-3zjto7_2/src/docker-src.2017-12-26-19.39.09.29962/qemu.tar
Cloning into 
'/var/tmp/patchew-tester-tmp-3zjto7_2/src/docker-src.2017-12-26-19.39.09.29962/qemu.tar.vroot'...
done.
Your branch is up-to-date with 'origin/test'.
Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc'
Cloning into 
'/var/tmp/patchew-tester-tmp-3zjto7_2/src/docker-src.2017-12-26-19.39.09.29962/qemu.tar.vroot/dtc'...
Submodule path 'dtc': checked out '558cd81bdd432769b59bff01240c44f82cfb1a9d'
Submodule 'ui/keycodemapdb' (git://git.qemu.org/keycodemapdb.git) registered 
for path 'ui/keycodemapdb'
Cloning into 
'/var/tmp/patchew-tester-tmp-3zjto7_2/src/docker-src.2017-12-26-19.39.09.29962/qemu.tar.vroot/ui/keycodemapdb'...
Submodule path 'ui/keycodemapdb': checked out 
'10739aa26051a5d49d88132604539d3ed085e72e'
  COPYRUNNER
RUN test-quick in qemu:centos6 
Packages installed:
SDL-devel-1.2.14-7.el6_7.1.x86_64
bison-2.4.1-5.el6.x86_64
bzip2-devel-1.0.5-7.el6_0.x86_64
ccache-3.1.6-2.el6.x86_64
csnappy-devel-0-6.20150729gitd7bc683.el6.x86_64
flex-2.5.35-9.el6.x86_64
gcc-4.4.7-18.el6.x86_64
gettext-0.17-18.el6.x86_64
git-1.7.1-9.el6_9.x86_64
glib2-devel-2.28.8-9.el6.x86_64
libepoxy-devel-1.2-3.el6.x86_64
libfdt-devel-1.4.0-1.el6.x86_64
librdmacm-devel-1.0.21-0.el6.x86_64
lzo-devel-2.03-3.1.el6_5.1.x86_64
make-3.81-23.el6.x86_64
mesa-libEGL-devel-11.0.7-4.el6.x86_64
mesa-libgbm-devel-11.0.7-4.el6.x86_64
package g++ is not installed
pixman-devel-0.32.8-1.el6.x86_64
spice-glib-devel-0.26-8.el6.x86_64
spice-server-devel-0.12.4-16.el6.x86_64
tar-1.23-15.el6_8.x86_64
vte-devel-0.25.1-9.el6.x86_64
xen-devel-4.6.6-2.el6.x86_64
zlib-devel-1.2.3-29.el6.x86_64

Environment variables:
PACKAGES=bison bzip2-devel ccache csnappy-devel flex g++
 gcc gettext git glib2-devel libepoxy-devel libfdt-devel
 librdmacm-devel lzo-devel make mesa-libEGL-devel 
mesa-libgbm-devel pixman-devel SDL-devel spice-glib-devel 
spice-server-devel tar vte-devel xen-devel zlib-devel
HOSTNAME=55efcff157a2
MAKEFLAGS= -j8
J=8
CCACHE_DIR=/var/tmp/ccache
EXTRA_CONFIGURE_OPTS=
V=
SHOW_ENV=1
PATH=/usr/lib/ccache:/usr/lib64/ccache:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PWD=/
TARGET_LIST=
SHLVL=1
HOME=/root
TEST_DIR=/tmp/qemu-test
FEATURES= dtc
DEBUG=
_=/usr/bin/env

Configure options:
--enable-werror --target-list=x86_64-softmmu,aarch64-softmmu 
--prefix=/tmp/qemu-test/install
No C++ compiler available; disabling C++ specific optional code
Install prefix/tmp/qemu-test/install
BIOS directory/tmp/qemu-test/install/share/qemu
firmware path /tmp/qemu-test/install/share/qemu-firmware
binary directory  /tmp/qemu-test/install/bin
library directory /tmp/qemu-test/install/lib
module directory  /tmp/qemu-test/install/lib/qemu
libexec directory /tmp/qemu-test/install/libexec
include directory /tmp/qemu-test/install/include
config directory  /tmp/qemu-test/install/etc
local state directory   /tmp/qemu-test/install/var
Manual directory  /tmp/qemu-test/install/share/man
ELF interp prefix /usr/gnemul/qemu-%M
Source path   /tmp/qemu-test/src
GIT binarygit
GIT submodules
C compilercc
Host C compiler   cc
C++ compiler  
Objective-C compiler cc
ARFLAGS   rv
CFLAGS-O2 -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -g 
QEMU_CFLAGS   -I/usr/include/pixman-1   -I$(SRC_PATH)/dtc/libfdt -pthread 
-I/usr/include/glib-2.0 

[Qemu-devel] [PATCH v4] Add ability for user to specify mouse ungrab key

2017-12-26 Thread John Arbuckle
Currently the ungrab key combination for the Cocoa and GTK interfaces is
Control-Alt-g. This combination may not be convenient for the user to enter,
so we now enable the user to specify their own key(s) as the ungrab key(s).
The list of usable keys is found in qapi/ui.json under QKeyCode; at most
three keys can be combined. The original ungrab keys remain usable as a
fallback, since there is a real risk of the user forgetting the keys they
specified.

Syntax: -ungrab 

Example usage:  -ungrab home
-ungrab shift-ctrl
-ungrab ctrl-x
-ungrab pgup-pgdn
-ungrab kp_5-kp_6
-ungrab kp_4-kp_5-kp_6

Signed-off-by: John Arbuckle 
---
v4 changes:
- Removed initialization code for key_value_array.
- Added void keyword to console_ungrab_key_sequence(),
  and console_ungrab_key_string() functions.

v3 changes:
- Added the ability for any "sendkey supported" key to be used.
- Added ability for one to three key sequences to be used.

v2 changes:
- Removed the "int i" code from the for loops. 

 include/ui/console.h |  6 ++
 qemu-options.hx  |  2 ++
 ui/cocoa.m   | 48 +++--
 ui/console.c | 60 
 vl.c |  3 +++
 5 files changed, 117 insertions(+), 2 deletions(-)

diff --git a/include/ui/console.h b/include/ui/console.h
index 580dfc57ee..37dc150268 100644
--- a/include/ui/console.h
+++ b/include/ui/console.h
@@ -534,4 +534,10 @@ static inline void early_gtk_display_init(int opengl)
 /* egl-headless.c */
 void egl_headless_init(void);
 
+/* console.c */
+void set_ungrab_seq(const char *new_seq);
+int *console_ungrab_key_sequence(void);
+const char *console_ungrab_key_string(void);
+int console_ungrab_sequence_length(void);
+
 #endif
diff --git a/qemu-options.hx b/qemu-options.hx
index 94647e21e3..51666e6f74 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -4264,6 +4264,8 @@ contents of @code{iv.b64} to the second secret
 
 ETEXI
 
+DEF("ungrab", HAS_ARG, QEMU_OPTION_ungrab, \
+"-ungrab ", QEMU_ARCH_ALL)
 
 HXCOMM This is the last statement. Insert new options before this line!
 STEXI
diff --git a/ui/cocoa.m b/ui/cocoa.m
index 330ccebf90..412a5fc02d 100644
--- a/ui/cocoa.m
+++ b/ui/cocoa.m
@@ -303,6 +303,7 @@ - (float) cdx;
 - (float) cdy;
 - (QEMUScreen) gscreen;
 - (void) raiseAllKeys;
+- (bool) user_ungrab_seq;
 @end
 
 QemuCocoaView *cocoaView;
@@ -674,9 +675,24 @@ - (void) handleEvent:(NSEvent *)event
 }
 }
 
+/*
+ * This code has to be here because the user might use a modifier
+ * key like shift as an ungrab key.
+ */
+if ([self user_ungrab_seq] == true) {
+[self ungrabMouse];
+return;
+}
 break;
 case NSEventTypeKeyDown:
 keycode = cocoa_keycode_to_qemu([event keyCode]);
+[self toggleModifier: keycode];
+
+// If the user is issuing the custom ungrab key sequence
+if ([self user_ungrab_seq] == true) {
+[self ungrabMouse];
+return;
+}
 
// forward command key combos to the host UI unless the mouse is grabbed
if (!isMouseGrabbed && ([event modifierFlags] & NSEventModifierFlagCommand)) {
@@ -714,6 +730,7 @@ - (void) handleEvent:(NSEvent *)event
 break;
 case NSEventTypeKeyUp:
 keycode = cocoa_keycode_to_qemu([event keyCode]);
+[self toggleModifier: keycode];
 
 // don't pass the guest a spurious key-up if we treated this
 // command-key combo as a host UI action
@@ -842,10 +859,18 @@ - (void) grabMouse
 COCOA_DEBUG("QemuCocoaView: grabMouse\n");
 
 if (!isFullscreen) {
+NSString * message_string;
+if (console_ungrab_sequence_length() == 0) {
+message_string = [NSString stringWithFormat: @"- (Press ctrl + alt + g to release Mouse)"];
+} else {
+message_string = [NSString stringWithFormat: @"- (Press ctrl + alt + g or %s to release Mouse)", console_ungrab_key_string()];
+}
+
+
 if (qemu_name)
-[normalWindow setTitle:[NSString stringWithFormat:@"QEMU %s - (Press ctrl + alt + g to release Mouse)", qemu_name]];
+[normalWindow setTitle:[NSString stringWithFormat: @"QEMU %s %@", qemu_name, message_string]];
 else
-[normalWindow setTitle:@"QEMU - (Press ctrl + alt + g to release Mouse)"];
+[normalWindow setTitle:[NSString stringWithFormat: @"QEMU %@", message_string]];
 }
 [self hideCursor];
 if (!isAbsoluteEnabled) {
@@ -898,6 +923,25 @@ - (void) raiseAllKeys
}
}
 }
+
+/* Determines if the user specified ungrab 

Re: [Qemu-devel] [PATCH v3] Add ability for user to specify mouse ungrab key

2017-12-26 Thread no-reply
Hi,

This series failed build test on s390x host. Please find the details below.

Type: series
Message-id: 20171227003544.39826-1-programmingk...@gmail.com
Subject: [Qemu-devel] [PATCH v3] Add ability for user to specify mouse ungrab 
key

=== TEST SCRIPT BEGIN ===
#!/bin/bash
# Testing script will be invoked under the git checkout with
# HEAD pointing to a commit that has the patches applied on top of "base"
# branch
set -e
echo "=== ENV ==="
env
echo "=== PACKAGES ==="
rpm -qa
echo "=== TEST BEGIN ==="
CC=$HOME/bin/cc
INSTALL=$PWD/install
BUILD=$PWD/build
echo -n "Using CC: "
realpath $CC
mkdir -p $BUILD $INSTALL
SRC=$PWD
cd $BUILD
$SRC/configure --cc=$CC --prefix=$INSTALL
make -j4
# XXX: we need reliable clean up
# make check -j4 V=1
make install
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 t [tag update]patchew/20171201125813.1437-1-quint...@redhat.com -> 
patchew/20171201125813.1437-1-quint...@redhat.com
 * [new tag]   
patchew/20171227003544.39826-1-programmingk...@gmail.com -> 
patchew/20171227003544.39826-1-programmingk...@gmail.com
Switched to a new branch 'test'
5b8d161b85 Add ability for user to specify mouse ungrab key

=== OUTPUT BEGIN ===
=== ENV ===
LANG=en_US.UTF-8
XDG_SESSION_ID=25790
USER=fam
PWD=/var/tmp/patchew-tester-tmp-r3_eybwz/src
HOME=/home/fam
SHELL=/bin/sh
SHLVL=2
PATCHEW=/home/fam/patchew/patchew-cli -s http://patchew.org --nodebug
LOGNAME=fam
DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1012/bus
XDG_RUNTIME_DIR=/run/user/1012
PATH=/usr/bin:/bin
_=/usr/bin/env
=== PACKAGES ===
gpg-pubkey-873529b8-54e386ff
glibc-debuginfo-common-2.24-10.fc25.s390x
fedora-release-26-1.noarch
dejavu-sans-mono-fonts-2.35-4.fc26.noarch
xemacs-filesystem-21.5.34-22.20170124hgf412e9f093d4.fc26.noarch
bash-4.4.12-7.fc26.s390x
freetype-2.7.1-9.fc26.s390x
libSM-1.2.2-5.fc26.s390x
libmpc-1.0.2-6.fc26.s390x
libaio-0.3.110-7.fc26.s390x
libverto-0.2.6-7.fc26.s390x
perl-Scalar-List-Utils-1.48-1.fc26.s390x
iptables-libs-1.6.1-2.fc26.s390x
perl-threads-shared-1.57-1.fc26.s390x
p11-kit-trust-0.23.9-2.fc26.s390x
tcl-8.6.6-2.fc26.s390x
libxshmfence-1.2-4.fc26.s390x
expect-5.45-23.fc26.s390x
perl-Thread-Queue-3.12-1.fc26.noarch
perl-encoding-2.19-6.fc26.s390x
keyutils-1.5.10-1.fc26.s390x
gmp-devel-6.1.2-4.fc26.s390x
python2-setuptools-36.2.0-2.fc26.noarch
enchant-1.6.0-16.fc26.s390x
net-snmp-libs-5.7.3-17.fc26.s390x
python-gobject-base-3.24.1-1.fc26.s390x
glusterfs-cli-3.10.7-1.fc26.s390x
python3-distro-1.0.3-1.fc26.noarch
python3-enchant-1.6.10-1.fc26.noarch
python-lockfile-0.11.0-6.fc26.noarch
python2-pyparsing-2.1.10-3.fc26.noarch
python2-lxml-4.1.1-1.fc26.s390x
librados2-10.2.7-2.fc26.s390x
trousers-lib-0.3.13-7.fc26.s390x
libpaper-1.1.24-14.fc26.s390x
libdatrie-0.2.9-4.fc26.s390x
libsoup-2.58.2-1.fc26.s390x
passwd-0.79-9.fc26.s390x
bind99-libs-9.9.10-3.P3.fc26.s390x
python3-rpm-4.13.0.2-1.fc26.s390x
bodhi-client-2.12.2-2.fc26.noarch
mock-core-configs-27.4-1.fc26.noarch
systemd-233-7.fc26.s390x
virglrenderer-0.6.0-1.20170210git76b3da97b.fc26.s390x
s390utils-ziomon-1.36.1-3.fc26.s390x
s390utils-osasnmpd-1.36.1-3.fc26.s390x
libXrandr-1.5.1-2.fc26.s390x
libglvnd-glx-1.0.0-1.fc26.s390x
texlive-ifxetex-svn19685.0.5-33.fc26.2.noarch
texlive-psnfss-svn33946.9.2a-33.fc26.2.noarch
texlive-dvipdfmx-def-svn40328-33.fc26.2.noarch
texlive-natbib-svn20668.8.31b-33.fc26.2.noarch
texlive-xdvi-bin-svn40750-33.20160520.fc26.2.s390x
texlive-cm-svn32865.0-33.fc26.2.noarch
texlive-beton-svn15878.0-33.fc26.2.noarch
texlive-fpl-svn15878.1.002-33.fc26.2.noarch
texlive-mflogo-svn38628-33.fc26.2.noarch
texlive-texlive-docindex-svn41430-33.fc26.2.noarch
texlive-luaotfload-bin-svn34647.0-33.20160520.fc26.2.noarch
texlive-koma-script-svn41508-33.fc26.2.noarch
texlive-pst-tree-svn24142.1.12-33.fc26.2.noarch
texlive-breqn-svn38099.0.98d-33.fc26.2.noarch
texlive-xetex-svn41438-33.fc26.2.noarch
gstreamer1-plugins-bad-free-1.12.3-1.fc26.s390x
xorg-x11-font-utils-7.5-33.fc26.s390x
ghostscript-fonts-5.50-36.fc26.noarch
libXext-devel-1.3.3-5.fc26.s390x
libusbx-devel-1.0.21-2.fc26.s390x
libglvnd-devel-1.0.0-1.fc26.s390x
pcre2-devel-10.23-10.fc26.s390x
emacs-25.3-3.fc26.s390x
alsa-lib-devel-1.1.4.1-1.fc26.s390x
kbd-2.0.4-2.fc26.s390x
dhcp-client-4.3.5-9.fc26.s390x
dconf-0.26.0-2.fc26.s390x
ccache-3.3.4-1.fc26.s390x
glibc-static-2.25-12.fc26.s390x
mc-4.8.19-5.fc26.s390x
doxygen-1.8.13-9.fc26.s390x
dpkg-1.18.24-1.fc26.s390x
libtdb-1.3.13-1.fc26.s390x
python2-pynacl-1.1.1-1.fc26.s390x
nss-sysinit-3.34.0-1.0.fc26.s390x
gnutls-dane-3.5.16-3.fc26.s390x
kernel-4.13.16-202.fc26.s390x
perl-Filter-1.58-1.fc26.s390x
glibc-debuginfo-2.24-10.fc25.s390x
kernel-devel-4.13.13-100.fc25.s390x
fedora-repos-26-1.noarch
dejavu-fonts-common-2.35-4.fc26.noarch
bind99-license-9.9.10-3.P3.fc26.noarch
ncurses-libs-6.0-8.20170212.fc26.s390x
libpng-1.6.28-2.fc26.s390x
libICE-1.0.9-9.fc26.s390x
boost-thread-1.63.0-9.fc26.s390x
kmod-24-1.fc26.s390x
libseccomp-2.3.2-1.fc26.s390x

[Qemu-devel] [PATCH v3] Add ability for user to specify mouse ungrab key

2017-12-26 Thread John Arbuckle
Currently the ungrab key combination for the Cocoa and GTK interfaces is
Control-Alt-g. This combination may not be convenient for the user to enter,
so we now enable the user to specify their own key(s) as the ungrab key(s).
The list of usable keys is found in qapi/ui.json under QKeyCode; at most
three keys can be combined. The original ungrab keys remain usable as a
fallback, since there is a real risk of the user forgetting the keys they
specified.

Syntax: -ungrab 

Example usage:  -ungrab home
-ungrab shift-ctrl
-ungrab ctrl-x
-ungrab pgup-pgdn
-ungrab kp_5-kp_6
-ungrab kp_4-kp_5-kp_6

Signed-off-by: John Arbuckle 
---
v3 changes:
- Added the ability for any "sendkey supported" key to be used.
- Added ability for one to three key sequences to be used.

v2 changes:
- Removed the "int i" code from the for loops. 

 include/ui/console.h |  6 ++
 qemu-options.hx  |  2 ++
 ui/cocoa.m   | 48 +++--
 ui/console.c | 61 
 vl.c |  3 +++
 5 files changed, 118 insertions(+), 2 deletions(-)

diff --git a/include/ui/console.h b/include/ui/console.h
index 580dfc57ee..37dc150268 100644
--- a/include/ui/console.h
+++ b/include/ui/console.h
@@ -534,4 +534,10 @@ static inline void early_gtk_display_init(int opengl)
 /* egl-headless.c */
 void egl_headless_init(void);
 
+/* console.c */
+void set_ungrab_seq(const char *new_seq);
+int *console_ungrab_key_sequence(void);
+const char *console_ungrab_key_string(void);
+int console_ungrab_sequence_length(void);
+
 #endif
diff --git a/qemu-options.hx b/qemu-options.hx
index 94647e21e3..51666e6f74 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -4264,6 +4264,8 @@ contents of @code{iv.b64} to the second secret
 
 ETEXI
 
+DEF("ungrab", HAS_ARG, QEMU_OPTION_ungrab, \
+"-ungrab ", QEMU_ARCH_ALL)
 
 HXCOMM This is the last statement. Insert new options before this line!
 STEXI
diff --git a/ui/cocoa.m b/ui/cocoa.m
index 330ccebf90..412a5fc02d 100644
--- a/ui/cocoa.m
+++ b/ui/cocoa.m
@@ -303,6 +303,7 @@ - (float) cdx;
 - (float) cdy;
 - (QEMUScreen) gscreen;
 - (void) raiseAllKeys;
+- (bool) user_ungrab_seq;
 @end
 
 QemuCocoaView *cocoaView;
@@ -674,9 +675,24 @@ - (void) handleEvent:(NSEvent *)event
 }
 }
 
+/*
+ * This code has to be here because the user might use a modifier
+ * key like shift as an ungrab key.
+ */
+if ([self user_ungrab_seq] == true) {
+[self ungrabMouse];
+return;
+}
 break;
 case NSEventTypeKeyDown:
 keycode = cocoa_keycode_to_qemu([event keyCode]);
+[self toggleModifier: keycode];
+
+// If the user is issuing the custom ungrab key sequence
+if ([self user_ungrab_seq] == true) {
+[self ungrabMouse];
+return;
+}
 
// forward command key combos to the host UI unless the mouse is grabbed
if (!isMouseGrabbed && ([event modifierFlags] & NSEventModifierFlagCommand)) {
@@ -714,6 +730,7 @@ - (void) handleEvent:(NSEvent *)event
 break;
 case NSEventTypeKeyUp:
 keycode = cocoa_keycode_to_qemu([event keyCode]);
+[self toggleModifier: keycode];
 
 // don't pass the guest a spurious key-up if we treated this
 // command-key combo as a host UI action
@@ -842,10 +859,18 @@ - (void) grabMouse
 COCOA_DEBUG("QemuCocoaView: grabMouse\n");
 
 if (!isFullscreen) {
+NSString * message_string;
+if (console_ungrab_sequence_length() == 0) {
+message_string = [NSString stringWithFormat: @"- (Press ctrl + alt + g to release Mouse)"];
+} else {
+message_string = [NSString stringWithFormat: @"- (Press ctrl + alt + g or %s to release Mouse)", console_ungrab_key_string()];
+}
+
+
 if (qemu_name)
-[normalWindow setTitle:[NSString stringWithFormat:@"QEMU %s - (Press ctrl + alt + g to release Mouse)", qemu_name]];
+[normalWindow setTitle:[NSString stringWithFormat: @"QEMU %s %@", qemu_name, message_string]];
 else
-[normalWindow setTitle:@"QEMU - (Press ctrl + alt + g to release Mouse)"];
+[normalWindow setTitle:[NSString stringWithFormat: @"QEMU %@", message_string]];
 }
 [self hideCursor];
 if (!isAbsoluteEnabled) {
@@ -898,6 +923,25 @@ - (void) raiseAllKeys
}
}
 }
+
+/* Determines if the user specified ungrab sequence is being used */
+- (bool) user_ungrab_seq
+{
+int *ungrab_seq, index, length;
+bool return_value = true;
+
+ungrab_seq = 

Re: [Qemu-devel] [PATCH v3 6/6] [RFH] tests: Add migration compress threads tests

2017-12-26 Thread Juan Quintela
"Dr. David Alan Gilbert"  wrote:
> * Juan Quintela (quint...@redhat.com) wrote:
>> Yeap, it is still not working.  Trying to learn how to debug threads
>> for guests running from the test harness.
>> 
>> For some reason, compression is not working at the moment; the test is
>> disabled until I find out why.
>
> How does it fail?

Source and destination hang.  Running exactly the same commands without
the test harness works as expected.

Attaching gdb shows that every thread is waiting, but I haven't found
anything obvious.  It happens on both sides, so I am not sure which
one is not working (or if both are broken).

Later, Juan.


>
> Dave
>
>> Signed-off-by: Juan Quintela 
>> ---
>>  tests/migration-test.c | 51 
>> ++
>>  1 file changed, 51 insertions(+)
>> 
>> diff --git a/tests/migration-test.c b/tests/migration-test.c
>> index 41dee78a9a..eab3b146a4 100644
>> --- a/tests/migration-test.c
>> +++ b/tests/migration-test.c
>> @@ -739,6 +739,54 @@ static void test_xbzrle_unix(void)
>>  g_free(uri);
>>  }
>>  
>> +static void test_compress(const char *uri)
>> +{
>> +QTestState *from, *to;
>> +
>> +test_migrate_start(&from, &to, uri);
>> +
>> +/* We want to pick a speed slow enough that the test completes
>> + * quickly, but that it doesn't complete precopy even on a slow
>> + * machine, so also set the downtime.
>> + */
>> +/* 100 ms */
>> +migrate_set_parameter(from, "downtime-limit", "1");
>> +/* 1MB/s slow*/
>> +migrate_set_parameter(from, "max-bandwidth", "1");
>> +
>> +migrate_set_parameter(from, "compress-threads", "4");
>> +migrate_set_parameter(to, "decompress-threads", "3");
>> +
>> +migrate_set_capability(from, "compress", "true");
>> +migrate_set_capability(to, "compress", "true");
>> +/* Wait for the first serial output from the source */
>> +wait_for_serial("src_serial");
>> +
>> +migrate(from, uri);
>> +
>> +wait_for_migration_pass(from);
>> +
>> +/* 300ms it should converge */
>> +migrate_set_parameter(from, "downtime-limit", "300");
>> +
>> +if (!got_stop) {
>> +qtest_qmp_eventwait(from, "STOP");
>> +}
>> +qtest_qmp_eventwait(to, "RESUME");
>> +
>> +wait_for_serial("dest_serial");
>> +wait_for_migration_complete(from);
>> +
>> +test_migrate_end(from, to);
>> +}
>> +
>> +static void test_compress_unix(void)
>> +{
>> +char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs);
>> +
>> +test_compress(uri);
>> +g_free(uri);
>> +}
>>  
>>  int main(int argc, char **argv)
>>  {
>> @@ -763,6 +811,9 @@ int main(int argc, char **argv)
>>  qtest_add_func("/migration/precopy/tcp", test_precopy_tcp);
>>  qtest_add_func("/migration/postcopy/unix", test_postcopy);
>>  qtest_add_func("/migration/xbzrle/unix", test_xbzrle_unix);
>> +if (0) {
>> +qtest_add_func("/migration/compress/unix", test_compress_unix);
>> +}
>>  
>>  ret = g_test_run();
>>  
>> -- 
>> 2.14.3
>> 
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH v3 5/6] tests: Add migration xbzrle test

2017-12-26 Thread Juan Quintela
"Dr. David Alan Gilbert"  wrote:
> * Juan Quintela (quint...@redhat.com) wrote:
>> Signed-off-by: Juan Quintela 
>> ---
>>  tests/migration-test.c | 68 
>> ++
>>  1 file changed, 68 insertions(+)
>> 
>> diff --git a/tests/migration-test.c b/tests/migration-test.c
>> index 841c89e28a..41dee78a9a 100644
>> --- a/tests/migration-test.c
>> +++ b/tests/migration-test.c
>> @@ -410,6 +410,20 @@ static void deprecated_set_speed(QTestState *who, const 
>> char *value)
>>  migrate_check_parameter(who, "max-bandwidth", value);
>>  }
>>  
>> +static void deprecated_set_cache_size(QTestState *who, const char *value)
>> +{
>> +QDict *rsp;
>> +gchar *cmd;
>> +
>> +cmd = g_strdup_printf("{ 'execute': 'migrate-set-cache-size',"
>> +  "'arguments': { 'value': %s } }", value);
>> +rsp = qtest_qmp(who, cmd);
>> +g_free(cmd);
>> +g_assert(qdict_haskey(rsp, "return"));
>> +QDECREF(rsp);
>> +migrate_check_parameter(who, "xbzrle-cache-size", value);
>> +}
>> +
>>  static void migrate_set_parameter(QTestState *who, const char *parameter,
>>const char *value)
>>  {
>> @@ -424,6 +438,10 @@ static void migrate_set_parameter(QTestState *who, 
>> const char *parameter,
>>  deprecated_set_speed(who, "12345");
>>  }
>>  
>> +if (strcmp(parameter, "xbzrle-cache-size") == 0) {
>> +deprecated_set_cache_size(who, "4096");
>> +}
>> +
>>  cmd = g_strdup_printf("{ 'execute': 'migrate-set-parameters',"
>>"'arguments': { '%s': %s } }",
>>parameter, value);
>> @@ -673,6 +691,55 @@ static void test_precopy_tcp(void)
>>  g_free(port);
>>  }
>>  
>> +static void test_xbzrle(const char *uri)
>> +{
>> +QTestState *from, *to;
>> +
>> +test_migrate_start(&from, &to, uri);
>> +
>> +/* We want to pick a speed slow enough that the test completes
>> + * quickly, but that it doesn't complete precopy even on a slow
>> + * machine, so also set the downtime.
>> + */
>> +/* 100 ms */
>> +migrate_set_parameter(from, "downtime-limit", "1");
>
> ? That's 1 ms not 100ms as the comment says?

You are right; I fixed all the other comments but forgot this one.
Fixed.

>> +/* 1MB/s slow*/
>> +migrate_set_parameter(from, "max-bandwidth", "10");
>
> That's still 1GB/s isn't it?
> Don't you need to reduce that if xbzrle is actually working otherwise
> it'll complete?

I was doing it differently before:

300ms downtime
1MB/s

But this was not reliable enough to make the test wait for it.

So, I changed it to be:

1GB/s (always full speed)
1ms downtime  (so it never converges)

and I change the downtime to something higher later.


>
>> +migrate_set_parameter(from, "xbzrle-cache-size", "33554432");
>
> That's 32M - why?

It is a value smaller than the full amount of memory.  I want to test
the xbzrle cache size command.  I don't care about *which* value we are
using, really.

> The test continually rewrites 100M of data; only changing one byte/page
> - so actually the fun way to do the xbzrle test is to leave this as 32M
> here.

>> +migrate_set_capability(from, "xbzrle", "true");
>> +migrate_set_capability(to, "xbzrle", "true");
>> +/* Wait for the first serial output from the source */
>> +wait_for_serial("src_serial");
>> +
>> +migrate(from, uri);
>> +
>> +wait_for_migration_pass(from);
>> +
>> +/* 300ms it should converge */
>> +migrate_set_parameter(from, "downtime-limit", "300");
>
> but first increase the xvzrle cache to 120M and give it a couple of
> seconds delay; you might even find it completes without the downtime
> limit schange; but you should still do that after a few seconds.
> You can check the stats as well, you should get some xbzrle cache hits
> as long as you raised the cache size.

OK, I will think some more about this.

Thanks, Juan.


> Dave
>
>> +if (!got_stop) {
>> +qtest_qmp_eventwait(from, "STOP");
>> +}
>> +qtest_qmp_eventwait(to, "RESUME");
>> +
>> +wait_for_serial("dest_serial");
>> +wait_for_migration_complete(from);
>> +
>> +test_migrate_end(from, to);
>> +}
>> +
>> +static void test_xbzrle_unix(void)
>> +{
>> +char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs);
>> +
>> +test_xbzrle(uri);
>> +g_free(uri);
>> +}
>> +
>> +
>>  int main(int argc, char **argv)
>>  {
>>  char template[] = "/tmp/migration-test-XX";
>> @@ -695,6 +762,7 @@ int main(int argc, char **argv)
>>  qtest_add_func("/migration/precopy/unix", test_precopy_unix);
>>  qtest_add_func("/migration/precopy/tcp", test_precopy_tcp);
>>  qtest_add_func("/migration/postcopy/unix", test_postcopy);
>> +qtest_add_func("/migration/xbzrle/unix", test_xbzrle_unix);
>>  
>>  ret = g_test_run();
>>  
>> -- 
>> 2.14.3
>> 
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [Qemu-devel] [PATCH v3 3/6] tests: Add migration precopy test

2017-12-26 Thread Juan Quintela
"Dr. David Alan Gilbert"  wrote:
> * Juan Quintela (quint...@redhat.com) wrote:
>> Signed-off-by: Juan Quintela 
>
> OK, although you seem to have chosen a bandwidth 10x what I use
> in the postcopy test but for the same reason.

I tried to make it as fast as possible while making sure that we wait
long enough.  On my home machine each test takes around 2-3 seconds,
and since I want to add more tests, it is better that they are fast.

Later, Juan.

> Reviewed-by: Dr. David Alan Gilbert 

Thanks, Juan.


>
>> ---
>>  tests/migration-test.c | 44 ++--
>>  1 file changed, 42 insertions(+), 2 deletions(-)
>> 
>> diff --git a/tests/migration-test.c b/tests/migration-test.c
>> index 51f49c74e9..3f3f056be6 100644
>> --- a/tests/migration-test.c
>> +++ b/tests/migration-test.c
>> @@ -539,7 +539,7 @@ static void test_migrate_end(QTestState *from, 
>> QTestState *to)
>>  cleanup("dest_serial");
>>  }
>>  
>> -static void test_migrate(void)
>> +static void test_postcopy(void)
>>  {
>>  char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs);
>>  QTestState *from, *to;
>> @@ -582,6 +582,45 @@ static void test_migrate(void)
>>  test_migrate_end(from, to);
>>  }
>>  
>> +static void test_precopy_unix(void)
>> +{
>> +char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs);
>> +QTestState *from, *to;
>> +
>> +test_migrate_start(, , uri);
>> +
>> +/* We want to pick a speed slow enough that the test completes
>> + * quickly, but that it doesn't complete precopy even on a slow
>> + * machine, so also set the downtime.
>> + */
>> +/* 1 ms */
>> +migrate_set_parameter(from, "downtime-limit", "1");
>> +/* 1GB/s slow*/
>> +migrate_set_parameter(from, "max-bandwidth", "10");
>> +
>> +/* Wait for the first serial output from the source */
>> +wait_for_serial("src_serial");
>> +
>> +migrate(from, uri);
>> +
>> +wait_for_migration_pass(from);
>> +
>> +/* 300 ms should converge */
>> +migrate_set_parameter(from, "downtime-limit", "300");
>> +
>> +if (!got_stop) {
>> +qtest_qmp_eventwait(from, "STOP");
>> +}
>> +
>> +qtest_qmp_eventwait(to, "RESUME");
>> +
>> +wait_for_serial("dest_serial");
>> +wait_for_migration_complete(from);
>> +
>> +test_migrate_end(from, to);
>> +g_free(uri);
>> +}
>> +
>>  int main(int argc, char **argv)
>>  {
>>  char template[] = "/tmp/migration-test-XX";
>> @@ -601,7 +640,8 @@ int main(int argc, char **argv)
>>  
>>  module_call_init(MODULE_INIT_QOM);
>>  
>> -qtest_add_func("/migration/postcopy/unix", test_migrate);
>> +qtest_add_func("/migration/precopy/unix", test_precopy_unix);
>> +qtest_add_func("/migration/postcopy/unix", test_postcopy);
>>  
>>  ret = g_test_run();
>>  
>> -- 
>> 2.14.3
>> 
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH v3 2/6] tests: migration test deprecated commands

2017-12-26 Thread Juan Quintela
"Dr. David Alan Gilbert"  wrote:
> * Juan Quintela (quint...@redhat.com) wrote:
>> We now test the deprecated commands every time that we test the new
>> commands.  This makes it unnecessary to add tests for deprecated commands.
>> 
>> Signed-off-by: Juan Quintela 
>> ---
>>  tests/migration-test.c | 32 
>>  1 file changed, 28 insertions(+), 4 deletions(-)
>> 
>> diff --git a/tests/migration-test.c b/tests/migration-test.c
>> index 799e24ebc6..51f49c74e9 100644
>> --- a/tests/migration-test.c
>> +++ b/tests/migration-test.c
>> @@ -369,7 +369,7 @@ static void migrate_check_parameter(QTestState *who, 
>> const char *parameter,
>>  QDECREF(rsp);
>>  }
>>  
>> -static void migrate_set_downtime(QTestState *who, const double value)
>> +static void deprecated_set_downtime(QTestState *who, const double value)
>>  {
>>  QDict *rsp;
>>  gchar *cmd;
>> @@ -388,7 +388,7 @@ static void migrate_set_downtime(QTestState *who, const 
>> double value)
>>  g_free(expected);
>>  }
>>  
>> -static void migrate_set_speed(QTestState *who, const char *value)
>> +static void deprecated_set_speed(QTestState *who, const char *value)
>>  {
>>  QDict *rsp;
>>  gchar *cmd;
>> @@ -402,6 +402,30 @@ static void migrate_set_speed(QTestState *who, const 
>> char *value)
>>  migrate_check_parameter(who, "max-bandwidth", value);
>>  }
>>  
>> +static void migrate_set_parameter(QTestState *who, const char *parameter,
>> +  const char *value)
>> +{
>> +QDict *rsp;
>> +gchar *cmd;
>> +
>> +if (strcmp(parameter, "downtime-limit") == 0) {
>> +deprecated_set_downtime(who, 0.12345);
>> +}
>> +
>> +if (strcmp(parameter, "max-bandwidth") == 0) {
>> +deprecated_set_speed(who, "12345");
>> +}
>
> I find that odd; you call migrate_set_parameter to set a particular
> value, but we set them to a different arbitrary value first?

If I do:

migrate_deprecated_command(real_value)
migrate_non_deprecated_command(real_value)
  here the value is already set, so I cannot be sure the second command worked.

This other way, if there is a _deprecated_ command,  I set it to a
known value, check that it was set correctly, and then set the real
value.

Later, Juan.

>
> Dave
>
>> +cmd = g_strdup_printf("{ 'execute': 'migrate-set-parameters',"
>> +  "'arguments': { '%s': %s } }",
>> +  parameter, value);
>> +rsp = qtest_qmp(who, cmd);
>> +g_free(cmd);
>> +g_assert(qdict_haskey(rsp, "return"));
>> +QDECREF(rsp);
>> +migrate_check_parameter(who, parameter, value);
>> +}
>> +
>>  static void migrate_set_capability(QTestState *who, const char *capability,
>> const char *value)
>>  {
>> @@ -530,8 +554,8 @@ static void test_migrate(void)
>>   * quickly, but that it doesn't complete precopy even on a slow
>>   * machine, so also set the downtime.
>>   */
>> -migrate_set_speed(from, "1");
>> -migrate_set_downtime(from, 0.001);
>> +migrate_set_parameter(from, "max-bandwidth", "1");
>> +migrate_set_parameter(from, "downtime-limit", "1");
>>  
>>  /* Wait for the first serial output from the source */
>>  wait_for_serial("src_serial");
>> -- 
>> 2.14.3
>> 
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH v3 2/6] tests: migration test deprecated commands

2017-12-26 Thread Juan Quintela
Peter Xu  wrote:
> On Fri, Dec 01, 2017 at 01:58:09PM +0100, Juan Quintela wrote:
>> We now test the deprecated commands every time that we test the new
>> commands.  This makes it unnecessary to add tests for deprecated commands.
>> 
>> Signed-off-by: Juan Quintela 
>> ---
>>  tests/migration-test.c | 32 
>>  1 file changed, 28 insertions(+), 4 deletions(-)
>> 
>> diff --git a/tests/migration-test.c b/tests/migration-test.c
>> index 799e24ebc6..51f49c74e9 100644
>> --- a/tests/migration-test.c
>> +++ b/tests/migration-test.c
>> @@ -369,7 +369,7 @@ static void migrate_check_parameter(QTestState *who, 
>> const char *parameter,
>>  QDECREF(rsp);
>>  }
>>  
>> -static void migrate_set_downtime(QTestState *who, const double value)
>> +static void deprecated_set_downtime(QTestState *who, const double value)
>>  {
>>  QDict *rsp;
>>  gchar *cmd;
>> @@ -388,7 +388,7 @@ static void migrate_set_downtime(QTestState *who, const 
>> double value)
>>  g_free(expected);
>>  }
>>  
>> -static void migrate_set_speed(QTestState *who, const char *value)
>> +static void deprecated_set_speed(QTestState *who, const char *value)
>>  {
>>  QDict *rsp;
>>  gchar *cmd;
>> @@ -402,6 +402,30 @@ static void migrate_set_speed(QTestState *who, const 
>> char *value)
>>  migrate_check_parameter(who, "max-bandwidth", value);
>>  }
>>  
>> +static void migrate_set_parameter(QTestState *who, const char *parameter,
>> +  const char *value)
>> +{
>> +QDict *rsp;
>> +gchar *cmd;
>> +
>> +if (strcmp(parameter, "downtime-limit") == 0) {
>> +deprecated_set_downtime(who, 0.12345);
>> +}
>> +
>> +if (strcmp(parameter, "max-bandwidth") == 0) {
>> +deprecated_set_speed(who, "12345");
>> +}
>
> I'm fine with current approach, but I would really prefer to put them
> all into a standalone test, by just calling them one by one with some
> specific numbers and that's all.

That means at least one more test, and we have at least three
deprecated commands:
- migrate_set_speed
- migrate_set_downtime
- migrate_set_cache_size

And each test makes things slower.  So I *thought* it would be wiser to
always exercise both the deprecated and the non-deprecated command.

> (luckily we only have two deprecated commands and limited tests,
>  otherwise extra commands will be M*N, say "number of deprecated
>  command" * "number of test migrations")

Each test takes time, so adding tests makes everything much slower.
Notice that setting a parameter is fast.

This is the way I understood Dave wanted it.

Later, Juan.


>
> Thanks,
>
>> +
>> +cmd = g_strdup_printf("{ 'execute': 'migrate-set-parameters',"
>> +  "'arguments': { '%s': %s } }",
>> +  parameter, value);
>> +rsp = qtest_qmp(who, cmd);
>> +g_free(cmd);
>> +g_assert(qdict_haskey(rsp, "return"));
>> +QDECREF(rsp);
>> +migrate_check_parameter(who, parameter, value);
>> +}
>> +
>>  static void migrate_set_capability(QTestState *who, const char *capability,
>> const char *value)
>>  {
>> @@ -530,8 +554,8 @@ static void test_migrate(void)
>>   * quickly, but that it doesn't complete precopy even on a slow
>>   * machine, so also set the downtime.
>>   */
>> -migrate_set_speed(from, "1");
>> -migrate_set_downtime(from, 0.001);
>> +migrate_set_parameter(from, "max-bandwidth", "1");
>> +migrate_set_parameter(from, "downtime-limit", "1");
>>  
>>  /* Wait for the first serial output from the source */
>>  wait_for_serial("src_serial");
>> -- 
>> 2.14.3
>> 



Re: [Qemu-devel] [PATCH v20 4/7] virtio-balloon: VIRTIO_BALLOON_F_SG

2017-12-26 Thread Tetsuo Handa
Wei Wang wrote:
> On 12/26/2017 06:38 PM, Tetsuo Handa wrote:
> > Wei Wang wrote:
> >> On 12/25/2017 10:51 PM, Tetsuo Handa wrote:
> >>> Wei Wang wrote:
> >>>
> >> What we are doing here is to free the pages that were just allocated in
> >> this round of inflating. Next round will be sometime later when the
> >> balloon work item gets its turn to run. Yes, it will then continue to
> >> inflate.
> >> Here are the two cases that will happen then:
> >> 1) the guest is still under memory pressure, the inflate will fail at
> >> memory allocation, which results in a msleep(200), and then it exits
> >> until its next turn to run.
> >> 2) the guest isn't under memory pressure any more (e.g. the task which
> >> consumes the huge amount of memory is gone), it will continue to inflate
> >> as normal till the requested size.
> >>
> > How likely does 2) occur? It is not so likely. msleep(200) is enough to spam
> > the guest with puff messages. Next round is starting too quickly.
> 
> I meant one of the two cases, 1) or 2), would happen, rather than 2) 
> happens after 1).
> 
> If 2) doesn't happen, then 1) happens. It will continue to try to 
> inflate round by round. But the memory allocation won't succeed, so 
> there will be no pages to inflate to the host. That is, the inflating is 
> simply a code path to the msleep(200) as long as the guest is under 
> memory pressure.

No. See 
http://lkml.kernel.org/r/201710181959.aci05296.jlmvqoofths...@i-love.sakura.ne.jp
 .
Did you try how unlikely 2) occurs if once 1) started?

> 
> Back to our code change, it doesn't result in incorrect behavior as 
> explained above.

The guest will be effectively unusable due to spam.

> 
> >> I think what we are doing is a quite sensible behavior, except a small
> >> change I plan to make:
> >>
> >>   while ((page = balloon_page_pop())) {
> >> -   balloon_page_enqueue(>vb_dev_info, page);
> >>   if (use_sg) {
> >>   if (xb_set_page(vb, page, _min, _max) <
> >> 0) {
> >>   __free_page(page);
> >>   continue;
> >>   }
> >>   } else {
> >>   set_page_pfns(vb, vb->pfns + vb->num_pfns, page);
> >>   }
> >> + balloon_page_enqueue(>vb_dev_info, page);
> >>
> >>> Also, as of Linux 4.15, only up to VIRTIO_BALLOON_ARRAY_PFNS_MAX pages 
> >>> (i.e.
> >>> 1MB) are invisible from deflate request. That amount would be an 
> >>> acceptable
> >>> error. But your patch makes more pages being invisible, for pages 
> >>> allocated
> >>> by balloon_page_alloc() without holding balloon_lock are stored into a 
> >>> local
> >>> variable "LIST_HEAD(pages)" (which means that balloon_page_dequeue() with
> >>> balloon_lock held won't be able to find pages not yet queued by
> >>> balloon_page_enqueue()), doesn't it? What if all memory pages were held in
> >>> "LIST_HEAD(pages)" and balloon_page_dequeue() was called before
> >>> balloon_page_enqueue() is called?
> >>>
> >> If we think of the balloon driver just as a regular driver or
> >> application, that will be a pretty natural thing. A regular driver can
> >> eat a huge amount of memory for its own usages, would this amount of
> >> memory be treated as an error as they are invisible to the
> >> balloon_page_enqueue?
> >>
> > No. Memory used by applications which consumed a lot of memory in their
> > mm_struct is reclaimed by the OOM killer/reaper. Drivers try to avoid
> > allocating more memory than they need. If drivers allocate more memory
> > than they need, they have a hook for releasing unused memory (i.e.
> > register_shrinker() or OOM notifier). What I'm saying here is that
> > the hook for releasing unused memory does not work unless memory held in
> > LIST_HEAD(pages) becomes visible to balloon_page_dequeue().
> >
> > If a system has 128GB of memory, and 127GB of memory was stored into
> > LIST_HEAD(pages) upon first fill_balloon() request, and somebody held
> > balloon_lock from OOM notifier path from out_of_memory() before
> > fill_balloon() holds balloon_lock, leak_balloon_sg_oom() finds that
> > no memory can be freed because balloon_page_enqueue() was never called,
> > and allows the caller of out_of_memory() to invoke the OOM killer despite
> > there is 127GB of memory which can be freed if fill_balloon() was able
> > to hold balloon_lock before leak_balloon_sg_oom() holds balloon_lock.
> > I don't think that that amount is an acceptable error.
> 
> I understand you are worried that OOM couldn't get balloon pages while 
> there are some in the local list. This is a debatable issue, and it may 
> lead to a long discussion. If this is considered to be a big issue, we 
> can make the local list to be global in vb, and accessed by oom 
> notifier, this won't affect this patch, and can be achieved with an 
> add-on patch. How about leaving this discussion as a second step outside 
> this 

Re: [Qemu-devel] [PATCH] Virt: ACPI: fix qemu assert due to re-assigned table data address

2017-12-26 Thread Andrew Jones
On Tue, Dec 26, 2017 at 07:54:15PM +0800, Shannon Zhao wrote:
> 
> 
> On 2017/12/26 19:48, Andrew Jones wrote:
> > On Fri, Dec 22, 2017 at 02:52:47PM +0800, Shannon Zhao wrote:
> >> acpi_data_push uses g_array_set_size to resize the memory size. If there
> >> is not enough contiguous memory, the address will be changed. If we use
> >> the old value, it will assert.
> >> qemu-kvm: hw/acpi/bios-linker-loader.c:214: 
> >> bios_linker_loader_add_checksum:
> >> Assertion `start_offset < file->blob->len' failed.`
> >>
> >> Signed-off-by: Shannon Zhao 
> >> ---
> >>  hw/arm/virt-acpi-build.c | 18 +++---
> >>  1 file changed, 11 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> >> index 3d78ff6..5901142 100644
> >> --- a/hw/arm/virt-acpi-build.c
> >> +++ b/hw/arm/virt-acpi-build.c
> >> @@ -453,6 +453,7 @@ build_spcr(GArray *table_data, BIOSLinker *linker, 
> >> VirtMachineState *vms)
> >>  AcpiSerialPortConsoleRedirection *spcr;
> >>  const MemMapEntry *uart_memmap = >memmap[VIRT_UART];
> >>  int irq = vms->irqmap[VIRT_UART] + ARM_SPI_BASE;
> >> +int spcr_start = table_data->len;
> >>  
> >>  spcr = acpi_data_push(table_data, sizeof(*spcr));
> >>  
> >> @@ -476,8 +477,8 @@ build_spcr(GArray *table_data, BIOSLinker *linker, 
> >> VirtMachineState *vms)
> >>  spcr->pci_device_id = 0x;  /* PCI Device ID: not a PCI device */
> >>  spcr->pci_vendor_id = 0x;  /* PCI Vendor ID: not a PCI device */
> >>  
> >> -build_header(linker, table_data, (void *)spcr, "SPCR", sizeof(*spcr), 
> >> 2,
> >> - NULL, NULL);
> >> +build_header(linker, table_data, (void *)(table_data->data + 
> >> spcr_start),
> >> + "SPCR", table_data->len - spcr_start, 2, NULL, NULL);
> >>  }
> > 
> > We don't need to change build_spcr(), as acpi_data_push() is only called
> > once, so spcr == new table_data->data + old table_data->len and new
> > table_data->len - spcr == sizeof(*spcr) (the size used in the only
> > acpi_data_push() call)
> > 
> >>  
> >>  static void
> >> @@ -512,8 +513,8 @@ build_srat(GArray *table_data, BIOSLinker *linker, 
> >> VirtMachineState *vms)
> >>  mem_base += numa_info[i].node_mem;
> >>  }
> >>  
> >> -build_header(linker, table_data, (void *)srat, "SRAT",
> >> - table_data->len - srat_start, 3, NULL, NULL);
> >> +build_header(linker, table_data, (void *)(table_data->data + 
> >> srat_start),
> >> + "SRAT", table_data->len - srat_start, 3, NULL, NULL);
> > 
> > Yes, we need this fix, as there are many acpi_data_push() calls in this
> > function. I guess this was the table that triggered the assert.
> > 
> >>  }
> >>  
> >>  static void
> >> @@ -522,6 +523,7 @@ build_mcfg(GArray *table_data, BIOSLinker *linker, 
> >> VirtMachineState *vms)
> >>  AcpiTableMcfg *mcfg;
> >>  const MemMapEntry *memmap = vms->memmap;
> >>  int len = sizeof(*mcfg) + sizeof(mcfg->allocation[0]);
> >> +int mcfg_start = table_data->len;
> >>  
> >>  mcfg = acpi_data_push(table_data, len);
> >>  mcfg->allocation[0].address = 
> >> cpu_to_le64(memmap[VIRT_PCIE_ECAM].base);
> >> @@ -532,7 +534,8 @@ build_mcfg(GArray *table_data, BIOSLinker *linker, 
> >> VirtMachineState *vms)
> >>  mcfg->allocation[0].end_bus_number = (memmap[VIRT_PCIE_ECAM].size
> >>/ PCIE_MMCFG_SIZE_MIN) - 1;
> >>  
> >> -build_header(linker, table_data, (void *)mcfg, "MCFG", len, 1, NULL, 
> >> NULL);
> >> +build_header(linker, table_data, (void *)(table_data->data + 
> >> mcfg_start),
> >> + "MCFG", len, 1, NULL, NULL);
> >>  }
> > 
> > No need to change this one.
> > 
> >>  
> >>  /* GTDT */
> >> @@ -651,6 +654,7 @@ build_madt(GArray *table_data, BIOSLinker *linker, 
> >> VirtMachineState *vms)
> >>  static void build_fadt(GArray *table_data, BIOSLinker *linker,
> >> VirtMachineState *vms, unsigned dsdt_tbl_offset)
> >>  {
> >> +int fadt_start = table_data->len;
> >>  AcpiFadtDescriptorRev5_1 *fadt = acpi_data_push(table_data, 
> >> sizeof(*fadt));
> >>  unsigned xdsdt_entry_offset = (char *)>x_dsdt - 
> >> table_data->data;
> >>  uint16_t bootflags;
> >> @@ -681,8 +685,8 @@ static void build_fadt(GArray *table_data, BIOSLinker 
> >> *linker,
> >>  ACPI_BUILD_TABLE_FILE, xdsdt_entry_offset, sizeof(fadt->x_dsdt),
> >>  ACPI_BUILD_TABLE_FILE, dsdt_tbl_offset);
> >>  
> >> -build_header(linker, table_data,
> >> - (void *)fadt, "FACP", sizeof(*fadt), 5, NULL, NULL);
> >> +build_header(linker, table_data, (void *)(table_data->data + 
> >> fadt_start),
> >> + "FACP", table_data->len - fadt_start, 5, NULL, NULL);
> >>  }
> > 
> > No need to change this one either.
> > 
> >>  
> >>  /* DSDT */
> >> -- 
> >> 2.0.4
> >>
> >>
> > 
> > Please respin only changing the one 

Re: [Qemu-devel] [PATCH] Virt: ACPI: fix qemu assert due to re-assigned table data address

2017-12-26 Thread Shannon Zhao


On 2017/12/26 19:48, Andrew Jones wrote:
> On Fri, Dec 22, 2017 at 02:52:47PM +0800, Shannon Zhao wrote:
>> acpi_data_push uses g_array_set_size to resize the memory size. If there is
>> not enough contiguous memory, the address will be changed. If we use the
>> old value, it will assert.
>> qemu-kvm: hw/acpi/bios-linker-loader.c:214: bios_linker_loader_add_checksum:
>> Assertion `start_offset < file->blob->len' failed.`
>>
>> Signed-off-by: Shannon Zhao 
>> ---
>>  hw/arm/virt-acpi-build.c | 18 +++---
>>  1 file changed, 11 insertions(+), 7 deletions(-)
>>
>> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
>> index 3d78ff6..5901142 100644
>> --- a/hw/arm/virt-acpi-build.c
>> +++ b/hw/arm/virt-acpi-build.c
>> @@ -453,6 +453,7 @@ build_spcr(GArray *table_data, BIOSLinker *linker, 
>> VirtMachineState *vms)
>>  AcpiSerialPortConsoleRedirection *spcr;
>>  const MemMapEntry *uart_memmap = >memmap[VIRT_UART];
>>  int irq = vms->irqmap[VIRT_UART] + ARM_SPI_BASE;
>> +int spcr_start = table_data->len;
>>  
>>  spcr = acpi_data_push(table_data, sizeof(*spcr));
>>  
>> @@ -476,8 +477,8 @@ build_spcr(GArray *table_data, BIOSLinker *linker, 
>> VirtMachineState *vms)
>>  spcr->pci_device_id = 0x;  /* PCI Device ID: not a PCI device */
>>  spcr->pci_vendor_id = 0x;  /* PCI Vendor ID: not a PCI device */
>>  
>> -build_header(linker, table_data, (void *)spcr, "SPCR", sizeof(*spcr), 2,
>> - NULL, NULL);
>> +build_header(linker, table_data, (void *)(table_data->data + 
>> spcr_start),
>> + "SPCR", table_data->len - spcr_start, 2, NULL, NULL);
>>  }
> 
> We don't need to change build_spcr(), as acpi_data_push() is only called
> once, so spcr == new table_data->data + old table_data->len and new
> table_data->len - spcr == sizeof(*spcr) (the size used in the only
> acpi_data_push() call)
> 
>>  
>>  static void
>> @@ -512,8 +513,8 @@ build_srat(GArray *table_data, BIOSLinker *linker, 
>> VirtMachineState *vms)
>>  mem_base += numa_info[i].node_mem;
>>  }
>>  
>> -build_header(linker, table_data, (void *)srat, "SRAT",
>> - table_data->len - srat_start, 3, NULL, NULL);
>> +build_header(linker, table_data, (void *)(table_data->data + 
>> srat_start),
>> + "SRAT", table_data->len - srat_start, 3, NULL, NULL);
> 
> Yes, we need this fix, as there are many acpi_data_push() calls in this
> function. I guess this was the table that triggered the assert.
> 
>>  }
>>  
>>  static void
>> @@ -522,6 +523,7 @@ build_mcfg(GArray *table_data, BIOSLinker *linker, 
>> VirtMachineState *vms)
>>  AcpiTableMcfg *mcfg;
>>  const MemMapEntry *memmap = vms->memmap;
>>  int len = sizeof(*mcfg) + sizeof(mcfg->allocation[0]);
>> +int mcfg_start = table_data->len;
>>  
>>  mcfg = acpi_data_push(table_data, len);
>>  mcfg->allocation[0].address = cpu_to_le64(memmap[VIRT_PCIE_ECAM].base);
>> @@ -532,7 +534,8 @@ build_mcfg(GArray *table_data, BIOSLinker *linker, 
>> VirtMachineState *vms)
>>  mcfg->allocation[0].end_bus_number = (memmap[VIRT_PCIE_ECAM].size
>>/ PCIE_MMCFG_SIZE_MIN) - 1;
>>  
>> -build_header(linker, table_data, (void *)mcfg, "MCFG", len, 1, NULL, 
>> NULL);
>> +build_header(linker, table_data, (void *)(table_data->data + 
>> mcfg_start),
>> + "MCFG", len, 1, NULL, NULL);
>>  }
> 
> No need to change this one.
> 
>>  
>>  /* GTDT */
>> @@ -651,6 +654,7 @@ build_madt(GArray *table_data, BIOSLinker *linker, 
>> VirtMachineState *vms)
>>  static void build_fadt(GArray *table_data, BIOSLinker *linker,
>> VirtMachineState *vms, unsigned dsdt_tbl_offset)
>>  {
>> +int fadt_start = table_data->len;
>>  AcpiFadtDescriptorRev5_1 *fadt = acpi_data_push(table_data, 
>> sizeof(*fadt));
>>  unsigned xdsdt_entry_offset = (char *)>x_dsdt - table_data->data;
>>  uint16_t bootflags;
>> @@ -681,8 +685,8 @@ static void build_fadt(GArray *table_data, BIOSLinker 
>> *linker,
>>  ACPI_BUILD_TABLE_FILE, xdsdt_entry_offset, sizeof(fadt->x_dsdt),
>>  ACPI_BUILD_TABLE_FILE, dsdt_tbl_offset);
>>  
>> -build_header(linker, table_data,
>> - (void *)fadt, "FACP", sizeof(*fadt), 5, NULL, NULL);
>> +build_header(linker, table_data, (void *)(table_data->data + 
>> fadt_start),
>> + "FACP", table_data->len - fadt_start, 5, NULL, NULL);
>>  }
> 
> No need to change this one either.
> 
>>  
>>  /* DSDT */
>> -- 
>> 2.0.4
>>
>>
> 
> Please respin only changing the one that needs the fix.
> 
Hi Andrew,

Thanks for your comments. You are right that only build_srat() needs the
fix, but I thought we should unify the style and avoid new issues if more
acpi_data_push() calls are added to other functions in the future.

Thanks,
-- 
Shannon




Re: [Qemu-devel] [PATCH] Virt: ACPI: fix qemu assert due to re-assigned table data address

2017-12-26 Thread Andrew Jones
On Fri, Dec 22, 2017 at 02:52:47PM +0800, Shannon Zhao wrote:
> acpi_data_push uses g_array_set_size to resize the memory size. If there is
> not enough contiguous memory, the address will be changed. If we use the old
> value, it will assert.
> qemu-kvm: hw/acpi/bios-linker-loader.c:214: bios_linker_loader_add_checksum:
> Assertion `start_offset < file->blob->len' failed.`
> 
> Signed-off-by: Shannon Zhao 
> ---
>  hw/arm/virt-acpi-build.c | 18 +++---
>  1 file changed, 11 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index 3d78ff6..5901142 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -453,6 +453,7 @@ build_spcr(GArray *table_data, BIOSLinker *linker, 
> VirtMachineState *vms)
>  AcpiSerialPortConsoleRedirection *spcr;
>  const MemMapEntry *uart_memmap = >memmap[VIRT_UART];
>  int irq = vms->irqmap[VIRT_UART] + ARM_SPI_BASE;
> +int spcr_start = table_data->len;
>  
>  spcr = acpi_data_push(table_data, sizeof(*spcr));
>  
> @@ -476,8 +477,8 @@ build_spcr(GArray *table_data, BIOSLinker *linker, 
> VirtMachineState *vms)
>  spcr->pci_device_id = 0x;  /* PCI Device ID: not a PCI device */
>  spcr->pci_vendor_id = 0x;  /* PCI Vendor ID: not a PCI device */
>  
> -build_header(linker, table_data, (void *)spcr, "SPCR", sizeof(*spcr), 2,
> - NULL, NULL);
> +build_header(linker, table_data, (void *)(table_data->data + spcr_start),
> + "SPCR", table_data->len - spcr_start, 2, NULL, NULL);
>  }

We don't need to change build_spcr(), as acpi_data_push() is only called
once, so spcr == new table_data->data + old table_data->len and new
table_data->len - spcr == sizeof(*spcr) (the size used in the only
acpi_data_push() call)

>  
>  static void
> @@ -512,8 +513,8 @@ build_srat(GArray *table_data, BIOSLinker *linker, 
> VirtMachineState *vms)
>  mem_base += numa_info[i].node_mem;
>  }
>  
> -build_header(linker, table_data, (void *)srat, "SRAT",
> - table_data->len - srat_start, 3, NULL, NULL);
> +build_header(linker, table_data, (void *)(table_data->data + srat_start),
> + "SRAT", table_data->len - srat_start, 3, NULL, NULL);

Yes, we need this fix, as there are many acpi_data_push() calls in this
function. I guess this was the table that triggered the assert.

>  }
>  
>  static void
> @@ -522,6 +523,7 @@ build_mcfg(GArray *table_data, BIOSLinker *linker, 
> VirtMachineState *vms)
>  AcpiTableMcfg *mcfg;
>  const MemMapEntry *memmap = vms->memmap;
>  int len = sizeof(*mcfg) + sizeof(mcfg->allocation[0]);
> +int mcfg_start = table_data->len;
>  
>  mcfg = acpi_data_push(table_data, len);
>  mcfg->allocation[0].address = cpu_to_le64(memmap[VIRT_PCIE_ECAM].base);
> @@ -532,7 +534,8 @@ build_mcfg(GArray *table_data, BIOSLinker *linker, 
> VirtMachineState *vms)
>  mcfg->allocation[0].end_bus_number = (memmap[VIRT_PCIE_ECAM].size
>/ PCIE_MMCFG_SIZE_MIN) - 1;
>  
> -build_header(linker, table_data, (void *)mcfg, "MCFG", len, 1, NULL, 
> NULL);
> +build_header(linker, table_data, (void *)(table_data->data + mcfg_start),
> + "MCFG", len, 1, NULL, NULL);
>  }

No need to change this one.

>  
>  /* GTDT */
> @@ -651,6 +654,7 @@ build_madt(GArray *table_data, BIOSLinker *linker, 
> VirtMachineState *vms)
>  static void build_fadt(GArray *table_data, BIOSLinker *linker,
> VirtMachineState *vms, unsigned dsdt_tbl_offset)
>  {
> +int fadt_start = table_data->len;
>  AcpiFadtDescriptorRev5_1 *fadt = acpi_data_push(table_data, 
> sizeof(*fadt));
>  unsigned xdsdt_entry_offset = (char *)>x_dsdt - table_data->data;
>  uint16_t bootflags;
> @@ -681,8 +685,8 @@ static void build_fadt(GArray *table_data, BIOSLinker 
> *linker,
>  ACPI_BUILD_TABLE_FILE, xdsdt_entry_offset, sizeof(fadt->x_dsdt),
>  ACPI_BUILD_TABLE_FILE, dsdt_tbl_offset);
>  
> -build_header(linker, table_data,
> - (void *)fadt, "FACP", sizeof(*fadt), 5, NULL, NULL);
> +build_header(linker, table_data, (void *)(table_data->data + fadt_start),
> + "FACP", table_data->len - fadt_start, 5, NULL, NULL);
>  }

No need to change this one either.

>  
>  /* DSDT */
> -- 
> 2.0.4
> 
>

Please respin only changing the one that needs the fix.

Thanks,
drew 



Re: [Qemu-devel] [GPU and VFIO] qemu hang at startup, VFIO_IOMMU_MAP_DMA is extremely slow

2017-12-26 Thread Bob Chen
2017-12-26 18:51 GMT+08:00 Liu, Yi L :

> > -Original Message-
> > From: Qemu-devel [mailto:qemu-devel-bounces+yi.l.liu=
> intel@nongnu.org]
> > On Behalf Of Bob Chen
> > Sent: Tuesday, December 26, 2017 6:30 PM
> > To: qemu-devel@nongnu.org
> > Subject: [Qemu-devel] [GPU and VFIO] qemu hang at startup,
> > VFIO_IOMMU_MAP_DMA is extremely slow
> >
> > Hi,
> >
> > I have a host server with multiple GPU cards, and was assigning them to
> qemu
> > with VFIO.
> >
> > I found that when setting up the last free GPU, the qemu process would
> hang
>
> Are all the GPUs in the same iommu group?
>

Each of them is in its own group.


>
> > there and took almost 10 minutes before finishing startup. I made some
> dig by
> > gdb, and found the slowest part occurred at the
> > hw/vfio/common.c:vfio_dma_map function call.
>
> This sets up the mapping, and it takes time. The function is called
> multiple times, so some delay is expected. By "the slowest part", do you
> mean a single vfio_dma_map() call takes a long time, or that the whole
> passthrough spends a lot of time creating mappings? If a single call
> takes a lot of time, then it may be a problem.
>

Each vfio_dma_map() call takes 3 to 10 minutes.


>
> Pasting your QEMU command line might help, and the dmesg output on the
> host would also help.
>

cmd line:
After adding -device vfio-pci,host=09:00.0,multifunction=on,addr=0x15, qemu
would hang. Without this option, it starts immediately.

dmesg:
[Tue Dec 26 18:39:50 2017] vfio-pci :09:00.0: enabling device (0400 ->
0402)
[Tue Dec 26 18:39:51 2017] vfio_ecap_init: :09:00.0 hiding ecap
0x1e@0x258
[Tue Dec 26 18:39:51 2017] vfio_ecap_init: :09:00.0 hiding ecap
0x19@0x900
[Tue Dec 26 18:39:55 2017] kvm: zapping shadow pages for mmio generation
wraparound
[Tue Dec 26 18:39:55 2017] kvm: zapping shadow pages for mmio generation
wraparound
[Tue Dec 26 18:40:03 2017] kvm [74663]: vcpu0 ignored rdmsr: 0x345

Kernel:
3.10.0-514.16.1  CentOS 7.3


>
> >
> >
> > static int vfio_dma_map(VFIOContainer *container, hwaddr iova, ram_addr_t
> > size, void *vaddr, bool readonly)
> > {
> > ...
> > if (ioctl(container->fd, VFIO_IOMMU_MAP_DMA, &map) == 0 ||
> > (errno == EBUSY && vfio_dma_unmap(container, iova, size) == 0 &&
> >  ioctl(container->fd, VFIO_IOMMU_MAP_DMA, &map) == 0)) {
> > return 0;
> > }
> > ...
> > }
> >
> >
> > The hang was reproducible on one of my hosts: I was setting up a VM with
> > 4GB of memory, while the host still had 16GB free. The GPU's physical
> > memory is 8GB.
>
> Does it happen when you only assign a single GPU?
>

Not sure. Didn't try multiple GPUs.


>
> > Also, this phenomenon was observed on other hosts occasionally, and the
> > similarity is that they always happened on the last free GPU.
> >
> >
> > A full stack trace file is attached. Looking forward to your help, thanks
> >
> >
> > - Bob
>
> Regards,
> Yi L
>


Re: [Qemu-devel] [PATCH v20 4/7] virtio-balloon: VIRTIO_BALLOON_F_SG

2017-12-26 Thread Wei Wang

On 12/26/2017 06:38 PM, Tetsuo Handa wrote:

Wei Wang wrote:

On 12/25/2017 10:51 PM, Tetsuo Handa wrote:

Wei Wang wrote:


What we are doing here is to free the pages that were just allocated in
this round of inflating. Next round will be sometime later when the
balloon work item gets its turn to run. Yes, it will then continue to
inflate.
Here are the two cases that will happen then:
1) the guest is still under memory pressure: the inflate will fail at
memory allocation, which results in a msleep(200), and then it exits and
waits for its next chance to run.
2) the guest isn't under memory pressure any more (e.g. the task that
consumed the huge amount of memory is gone): it will continue to inflate
as normal until the requested size is reached.


How likely is 2) to occur? Not very likely. msleep(200) is short enough to
spam the guest with puff messages; the next round starts too quickly.


I meant one of the two cases, 1) or 2), would happen, rather than 2) 
happens after 1).


If 2) doesn't happen, then 1) happens. It will continue to try to 
inflate round by round. But the memory allocation won't succeed, so 
there will be no pages to inflate to the host. That is, the inflating is 
simply a code path to the msleep(200) as long as the guest is under 
memory pressure.


Back to our code change, it doesn't result in incorrect behavior as 
explained above.



I think what we are doing is a quite sensible behavior, except a small
change I plan to make:

  while ((page = balloon_page_pop())) {
-   balloon_page_enqueue(&vb->vb_dev_info, page);
  if (use_sg) {
  if (xb_set_page(vb, page, &pfn_min, &pfn_max) < 0) {
  __free_page(page);
  continue;
  }
  } else {
  set_page_pfns(vb, vb->pfns + vb->num_pfns, page);
  }
+ balloon_page_enqueue(&vb->vb_dev_info, page);


Also, as of Linux 4.15, only up to VIRTIO_BALLOON_ARRAY_PFNS_MAX pages (i.e.
1MB) are invisible from deflate request. That amount would be an acceptable
error. But your patch makes more pages being invisible, for pages allocated
by balloon_page_alloc() without holding balloon_lock are stored into a local
variable "LIST_HEAD(pages)" (which means that balloon_page_dequeue() with
balloon_lock held won't be able to find pages not yet queued by
balloon_page_enqueue()), doesn't it? What if all memory pages were held in
"LIST_HEAD(pages)" and balloon_page_dequeue() was called before
balloon_page_enqueue() is called?


If we think of the balloon driver just as a regular driver or
application, that is a pretty natural thing. A regular driver can
consume a huge amount of memory for its own use; would that amount of
memory be treated as an error just because it is invisible to
balloon_page_enqueue()?


No. Memory used by applications which consumed a lot of memory in their
mm_struct is reclaimed by the OOM killer/reaper. Drivers try to avoid
allocating more memory than they need. If drivers allocate more memory
than they need, they have a hook for releasing unused memory (i.e.
register_shrinker() or OOM notifier). What I'm saying here is that
the hook for releasing unused memory does not work unless memory held in
LIST_HEAD(pages) becomes visible to balloon_page_dequeue().

If a system has 128GB of memory, and 127GB of memory was stored into
LIST_HEAD(pages) upon first fill_balloon() request, and somebody held
balloon_lock from OOM notifier path from out_of_memory() before
fill_balloon() holds balloon_lock, leak_balloon_sg_oom() finds that
no memory can be freed because balloon_page_enqueue() was never called,
and allows the caller of out_of_memory() to invoke the OOM killer despite
there is 127GB of memory which can be freed if fill_balloon() was able
to hold balloon_lock before leak_balloon_sg_oom() holds balloon_lock.
I don't think that that amount is an acceptable error.


I understand you are worried that the OOM path cannot reclaim balloon pages 
while some are still in the local list. This is a debatable issue, and it may 
lead to a long discussion. If it is considered a big issue, we can make the 
local list global in vb and accessible to the OOM notifier; this won't affect 
this patch and can be done in an add-on patch. How about leaving this 
discussion as a second step outside this series? The balloon has more that 
can be improved, and this patch series is already big.


Best,
Wei




Re: [Qemu-devel] [GPU and VFIO] qemu hang at startup, VFIO_IOMMU_MAP_DMA is extremely slow

2017-12-26 Thread Liu, Yi L
> -Original Message-
> From: Qemu-devel [mailto:qemu-devel-bounces+yi.l.liu=intel@nongnu.org]
> On Behalf Of Bob Chen
> Sent: Tuesday, December 26, 2017 6:30 PM
> To: qemu-devel@nongnu.org
> Subject: [Qemu-devel] [GPU and VFIO] qemu hang at startup,
> VFIO_IOMMU_MAP_DMA is extremely slow
> 
> Hi,
> 
> I have a host server with multiple GPU cards, and was assigning them to qemu
> with VFIO.
> 
> I found that when setting up the last free GPU, the qemu process would hang

Are all the GPUs in the same iommu group?

> there and took almost 10 minutes before finishing startup. I did some digging
> with gdb, and found the slowest part occurred at the
> hw/vfio/common.c:vfio_dma_map function call.

This sets up the mapping, and it takes time. The function is called multiple
times, so some delay is expected. By "the slowest part", do you mean a single
vfio_dma_map() call takes a long time, or that the whole passthrough spends a
lot of time creating mappings? If a single call takes a lot of time, then it
may be a problem.

Pasting your QEMU command line might help, and the dmesg output on the host
would also help.

> 
> 
> static int vfio_dma_map(VFIOContainer *container, hwaddr iova, ram_addr_t
> size, void *vaddr, bool readonly)
> {
> ...
> if (ioctl(container->fd, VFIO_IOMMU_MAP_DMA, &map) == 0 ||
> (errno == EBUSY && vfio_dma_unmap(container, iova, size) == 0 &&
>  ioctl(container->fd, VFIO_IOMMU_MAP_DMA, &map) == 0)) {
> return 0;
> }
> ...
> }
> 
> 
> The hang was reproducible on one of my hosts: I was setting up a VM with 4GB
> of memory, while the host still had 16GB free. The GPU's physical memory is 8GB.

Does it happen when you only assign a single GPU?

> Also, this phenomenon was observed on other hosts occasionally, and the
> similarity is that they always happened on the last free GPU.
> 
> 
> A full stack trace file is attached. Looking forward to your help, thanks
> 
> 
> - Bob

Regards,
Yi L


Re: [Qemu-devel] [PATCH v20 4/7] virtio-balloon: VIRTIO_BALLOON_F_SG

2017-12-26 Thread Tetsuo Handa
Wei Wang wrote:
> On 12/25/2017 10:51 PM, Tetsuo Handa wrote:
> > Wei Wang wrote:
> >> @@ -173,8 +292,15 @@ static unsigned fill_balloon(struct
> >> virtio_balloon *vb, size_t num)
> >>  while ((page = balloon_page_pop())) {
> >>  balloon_page_enqueue(&vb->vb_dev_info, page);
> >> +if (use_sg) {
> >> +if (xb_set_page(vb, page, &pfn_min, &pfn_max) < 0) {
> >> +__free_page(page);
> >> +continue;
> >> +}
> >> +} else {
> >> +set_page_pfns(vb, vb->pfns + vb->num_pfns, page);
> >> +}
> > Is this the right behaviour?
>  I don't think so. In the worst case, we can set no bit using
>  xb_set_page().
> > If we can't record the page in the xb,
> > wouldn't we rather send it across as a single page?
> >
>  I think that we need to be able to fallback to !use_sg path when OOM.
> >>> I also have different thoughts:
> >>>
> >>> 1) For OOM, we have leak_balloon_sg_oom (oom has nothing to do with
> >>> fill_balloon), which does not use xbitmap to record pages, thus no
> >>> memory allocation.
> >>>
> >>> 2) If the memory is already under pressure, it is pointless to
> >>> continue inflating memory to the host. We need to give thanks to the
> >>> memory allocation failure reported by xbitmap, which gets us a chance
> >>> to release the inflated pages that have been demonstrated to cause the
> >>> memory pressure of the guest.
> >>>
> >> Forgot to add my conclusion: I think the above behavior is correct.
> >>
> > What is the desired behavior when hitting OOM path during inflate/deflate?
> > Once inflation started, the inflation logic is called again and again
> > until the balloon inflates to the requested size.
> 
> The above is true, but I can't agree with the following. Please see below.
> 
> > Such a situation will keep wasting CPU, bouncing between
> > inflate-due-to-host's-request and deflate-due-to-guest's-OOM. It is
> > pointless, but it cannot stop doing the pointless thing.
> 
> What we are doing here is to free the pages that were just allocated in 
> this round of inflating. Next round will be sometime later when the 
> balloon work item gets its turn to run. Yes, it will then continue to 
> inflate.
> Here are the two cases that will happen then:
> 1) the guest is still under memory pressure: the inflate will fail at 
> memory allocation, which results in a msleep(200), and then it exits and 
> waits for its next chance to run.
> 2) the guest isn't under memory pressure any more (e.g. the task that 
> consumed the huge amount of memory is gone): it will continue to inflate 
> as normal until the requested size is reached.
> 

How likely is 2) to occur? Not very likely. msleep(200) is short enough to
spam the guest with puff messages; the next round starts too quickly.

> I think what we are doing is a quite sensible behavior, except a small 
> change I plan to make:
> 
>  while ((page = balloon_page_pop())) {
> -   balloon_page_enqueue(&vb->vb_dev_info, page);
>  if (use_sg) {
>  if (xb_set_page(vb, page, &pfn_min, &pfn_max) < 0) {
>  __free_page(page);
>  continue;
>  }
>  } else {
>  set_page_pfns(vb, vb->pfns + vb->num_pfns, page);
>  }
> + balloon_page_enqueue(&vb->vb_dev_info, page);
> 
> >
> > Also, as of Linux 4.15, only up to VIRTIO_BALLOON_ARRAY_PFNS_MAX pages (i.e.
> > 1MB) are invisible from deflate request. That amount would be an acceptable
> > error. But your patch makes more pages being invisible, for pages allocated
> > by balloon_page_alloc() without holding balloon_lock are stored into a local
> > variable "LIST_HEAD(pages)" (which means that balloon_page_dequeue() with
> > balloon_lock held won't be able to find pages not yet queued by
> > balloon_page_enqueue()), doesn't it? What if all memory pages were held in
> > "LIST_HEAD(pages)" and balloon_page_dequeue() was called before
> > balloon_page_enqueue() is called?
> >
> 
> If we think of the balloon driver just as a regular driver or 
> application, that is a pretty natural thing. A regular driver can 
> consume a huge amount of memory for its own use; would that amount of 
> memory be treated as an error just because it is invisible to 
> balloon_page_enqueue()?
> 

No. Memory used by applications which consumed a lot of memory in their
mm_struct is reclaimed by the OOM killer/reaper. Drivers try to avoid
allocating more memory than they need. If drivers allocate more memory
than they need, they have a hook for releasing unused memory (i.e.
register_shrinker() or OOM notifier). What I'm saying here is that
the hook for releasing unused memory does not work unless memory held in
LIST_HEAD(pages) becomes visible to balloon_page_dequeue().

If a system has 128GB of memory, and 127GB of memory was stored into
LIST_HEAD(pages) upon first fill_balloon() request, and somebody held
balloon_lock from OOM notifier path from out_of_memory() before
fill_balloon() holds balloon_lock, leak_balloon_sg_oom() finds that
no memory can be freed because balloon_page_enqueue() was never called,
and allows the caller of out_of_memory() to invoke the OOM killer despite
there is 127GB of memory which can be freed if fill_balloon() was able
to hold balloon_lock before leak_balloon_sg_oom() holds balloon_lock.
I don't think that that amount is an acceptable error.

[Qemu-devel] [GPU and VFIO] qemu hang at startup, VFIO_IOMMU_MAP_DMA is extremely slow

2017-12-26 Thread Bob Chen
Hi,

I have a host server with multiple GPU cards, and was assigning them to
qemu with VFIO.

I found that when setting up the last free GPU, the qemu process would hang
there and take almost 10 minutes to finish startup. I did some digging
with gdb, and found the slowest part occurred at the
hw/vfio/common.c:vfio_dma_map function call.


static int vfio_dma_map(VFIOContainer *container, hwaddr iova, ram_addr_t
size, void *vaddr, bool readonly)
{
...
if (ioctl(container->fd, VFIO_IOMMU_MAP_DMA, &map) == 0 ||
(errno == EBUSY && vfio_dma_unmap(container, iova, size) == 0 &&
 ioctl(container->fd, VFIO_IOMMU_MAP_DMA, &map) == 0)) {
return 0;
}
...
}


The hang was reproducible on one of my hosts: I was setting up a VM with 4GB
of memory, while the host still had 16GB free. The GPU's physical memory is 8GB.

Also, this phenomenon was observed on other hosts occasionally, and the
similarity is that they always happened on the last free GPU.


A full stack trace file is attached. Looking forward to your help, thanks


- Bob
#0  vfio_dma_map (container=0x56b1f880, iova=0, size=655360, 
vaddr=0x7ffecfe0, readonly=false) at 
/usr/src/debug/qemu-2.6.2/hw/vfio/common.c:227
map = {argsz = 655359, flags = 0, vaddr = 0, iova = 140737488339248, 
size = 93824994141691}
#1  0x557712dc in vfio_listener_region_add (listener=0x56b1f890, 
section=0x7fffc1f0) at /usr/src/debug/qemu-2.6.2/hw/vfio/common.c:419
container = 0x56b1f880
iova = 0
end = 655359
llend = {lo = 655360, hi = 0}
llsize = {lo = 655360, hi = 0}
vaddr = 0x7ffecfe0
ret = 7
__func__ = "vfio_listener_region_add"
#2  0x55728465 in listener_add_address_space (listener=0x56b1f890, 
as=0x560823e0 <address_space_memory>)
at /usr/src/debug/qemu-2.6.2/memory.c:2179
section = {mr = 0x566ec570, address_space = 0x560823e0 
<address_space_memory>, offset_within_region = 0, size = {lo = 655360, hi = 0},
  offset_within_address_space = 0, readonly = false}
view = 0x57ae3bd0
fr = 0x566f0c00
#3  0x5572860d in memory_listener_register (listener=0x56b1f890, 
filter=0x560823e0 <address_space_memory>)
at /usr/src/debug/qemu-2.6.2/memory.c:2208
other = 0x565b4910
as = 0x560823e0 <address_space_memory>
#4  0x55772811 in vfio_connect_container (group=0x5784bce0, 
as=0x560823e0 <address_space_memory>)
at /usr/src/debug/qemu-2.6.2/hw/vfio/common.c:900
container = 0x56b1f880
ret = 0
fd = 35
space = 0x5784bd20
#5  0x55772cbc in vfio_get_group (groupid=25, as=0x560823e0 
<address_space_memory>) at /usr/src/debug/qemu-2.6.2/hw/vfio/common.c:1008
group = 0x5784bce0
path = 
"/dev/vfio/25\000U\000\000P\303\377\377\377\177\000\000\332Q\224UUU\000"
status = {argsz = 8, flags = 1}
#6  0x5577af5c in vfio_initfn (pdev=0x581672b0) at 
/usr/src/debug/qemu-2.6.2/hw/vfio/pci.c:2447
vdev = 0x581672b0
vbasedev_iter = 0x40b
group = 0x55bbc65d
tmp = 0x57640b60 ""
group_path = 
"../../../../../../kernel/iommu_groups/25\000\000\000\000\343\003\000\000\031ĻUUU\000\000\000\000\000\000\000\000\000\000\220\304\377\377\377\177\000\000]ƻU\a\000\000\000\320ɻUUU\000\000\360\304\377\377\v\004\000\000\300\305\377\377\377\177\000\000I\252\260UUU\000\000\360\304\377\377\377\1-
77\000\000\000\000\000\000\000\000\000\000\320\304\377\377\377\177\000\000]ƻUUU\000\000\260ɻUUU\000\000f˲U\343\003\000\000\241:\000\000\000\200\377\377\002",
 '\000' , 
"\060\000\000\000[\000\000\000`\305\377\377\377\177"...
group_name = 0x7fffc466 "25"
len = 40
st = {st_dev = 17, st_ino = 39127, st_nlink = 3, st_mode = 16877, 
st_uid = 0, st_gid = 0, __pad0 = 0, st_rdev = 0, st_size = 0, st_blksize = 4096,
  st_blocks = 0, st_atim = {tv_sec = 1513939417, tv_nsec = 943657386}, 
st_mtim = {tv_sec = 1510113186, tv_nsec = 59601}, st_ctim = {
tv_sec = 1510113186, tv_nsec = 59601}, __unused = {0, 0, 0}}
groupid = 25
ret = 21845
#7  0x55943b65 in pci_default_realize (dev=0x581672b0, 
errp=0x7fffd4b8) at hw/pci/pci.c:1895
pc = 0x56568e70
__func__ = "pci_default_realize"
#8  0x55943a08 in pci_qdev_realize (qdev=0x581672b0, 
errp=0x7fffd520) at hw/pci/pci.c:1867
pci_dev = 0x581672b0
pc = 0x56568e70
__func__ = "pci_qdev_realize"
local_err = 0x0
bus = 0x569baea0
is_default_rom = false
#9  0x558af8da in device_set_realized (obj=0x581672b0, value=true, 
errp=0x7fffd6e0) at hw/core/qdev.c:1066
dev = 0x581672b0
__func__ = "device_set_realized"
dc = 0x56568e70
hotplug_ctrl = 0x55af83cf 
bus = 0x7fffd5c7
local_err = 0x0
#10 0x55a3754d in property_set_bool (obj=0x581672b0, 
v=0x565a9140, name=0x55b494e9 

Re: [Qemu-devel] [PATCH for-2.12 0/4] qmp dirty bitmap API

2017-12-26 Thread Fam Zheng
On Tue, 12/26 11:57, Vladimir Sementsov-Ogievskiy wrote:
> 26.12.2017 10:07, Fam Zheng wrote:
> > On Wed, 12/20 11:20, Vladimir Sementsov-Ogievskiy wrote:
> > > external backup:
> > > 
> > > 0. we have active_disk and attached to it dirty bitmap bitmap0
> > > 1. qmp blockdev-add tmp_disk (backing=active_disk)
> > > 2. guest fsfreeze
> > > 3. qmp transaction:
> > >      - block-dirty-bitmap-add node=active_disk name=bitmap1
> > >      - block-dirty-bitmap-disable node=active_disk name=bitmap0
> > >      - blockdev-backup drive=active_disk target=tmp_disk sync=none
> > > 4. guest fsthaw
> > > 5. (? not designed yet) qmp blockdev-add filter_node - special filter node
> > > over tmp_disk for synchronization of nbd-reads and backup(sync=none) cow
> > > requests (like it is done in block/replication)
> > > 6. qmp nbd-server-start
> > > 7. qmp nbd-server-add filter_node (there should be possibility of 
> > > exporting
> > > bitmap of child node filter_node->tmp_disk->active_disk->bitmap0)
> > > 
> > > then, external tool can connect to nbd server and get exported bitmap and
> > > read data (including bitmap0) accordingly to nbd specification.
> > > (also, external tool may get a merge of several bitmaps, if we already 
> > > have
> > > a sequence of them)
> > > then, after backup finish, what can be done:
> > > 
> > > 1. qmp block-job-cancel device=active_disk (stop our backup(sync=none))
> > > 2. qmp nbd-server-stop (or qmp nbd-server-remove filter_node)
> > > 3. qmp blockdev-remove filter_node
> > > 4. qmp blockdev-remove tmp_disk
> > > 
> > > on successful backup, you can drop old bitmap if you want (or do not drop
> > > it, if you need to keep sequence of disabled bitmaps):
> > > 1. block-dirty-bitmap-remove node=active_disk name=bitmap0
> > > 
> > > on failed backup, you can merge bitmaps, to make it look like nothing
> > > happened:
> > > 1. qmp transaction:
> > >     - block-dirty-bitmap-merge node=active_disk name-source=bitmap1
> > > name-target=bitmap0
> > Being done in a transaction, will merging a large-ish bitmap synchronously 
> > hurt
> > the responsiveness? Because we have the BQL lock held here which pauses all
> > device emulation.
> > 
> > Have you measured how long it takes to merge two typical bitmaps. Say, for 
> > a 1TB
> > disk?
> > 
> > Fam
> 
> We don't need merge in a transaction.

Yes. Either way, the command is synchronous and the whole merge process is done
with the BQL held, so my question still stands. But your numbers have answered it
and the time is negligible.

Bitmap merging doesn't even have to be synchronous if it really matters, but we
can live with a synchronous implementation for now.

Thanks!

Fam

> 
> Anyway, good question.
> 
> two all-ones bitmaps, 64k granularity, 1TB disk:
> # time virsh qemu-monitor-command tmp '{"execute":
> "block-dirty-bitmap-merge", "arguments": {"node": "disk", "src_name": "a",
> "dst_name": "b"}}'
> {"return":{},"id":"libvirt-1181"}
> real    0m0.009s
> user    0m0.006s
> sys 0m0.002s
> 
> and this is fine:
> for the last level of the hbitmap we will have
>    disk_size / granularity / nb_bits_in_long = (1024 ^ 4) / (64 * 1024) / 64
> = 262144
> operations, which is not many at all
> 
> 
> 
> bitmaps in gdb:
> 
> (gdb) p bdrv_lookup_bs ("disk", "disk", 0)
> $1 = (BlockDriverState *) 0x7fd3f6274940
> (gdb) p *$1->dirty_bitmaps.lh_first
> $2 = {mutex = 0x7fd3f6277b28, bitmap = 0x7fd3f5a5adc0, meta = 0x0, successor
> = 0x0,
>   name = 0x7fd3f637b410 "b", size = 1099511627776, disabled = false,
> active_iterators = 0,
>   readonly = false, autoload = false, persistent = false, list = {le_next =
> 0x7fd3f567c650,
>     le_prev = 0x7fd3f6277b58}}
> (gdb) p *$1->dirty_bitmaps.lh_first ->bitmap
> $3 = {size = 16777216, count = 16777216, granularity = 16, meta = 0x0,
> levels = {0x7fd3f6279a90,
>     0x7fd3f5506350, 0x7fd3f5affcb0, 0x7fd3f547a860, 0x7fd3f637b200,
> 0x7fd3f67ff5c0, 0x7fd3d8dfe010},
>   sizes = {1, 1, 1, 1, 64, 4096, 262144}}
> (gdb) p *$1->dirty_bitmaps.lh_first ->list .le_next
> $4 = {mutex = 0x7fd3f6277b28, bitmap = 0x7fd3f567cb30, meta = 0x0, successor
> = 0x0,
>   name = 0x7fd3f5482fb0 "a", size = 1099511627776, disabled = false,
> active_iterators = 0,
>   readonly = false, autoload = false, persistent = false, list = {le_next =
> 0x0,
>     le_prev = 0x7fd3f6c779e0}}
> (gdb) p *$1->dirty_bitmaps.lh_first ->list .le_next ->bitmap
> $5 = {size = 16777216, count = 16777216, granularity = 16, meta = 0x0,
> levels = {0x7fd3f5ef8880,
>     0x7fd3f5facea0, 0x7fd3f5f1cec0, 0x7fd3f5f40a00, 0x7fd3f6c80a00,
> 0x7fd3f66e5f60, 0x7fd3d8fff010},
>   sizes = {1, 1, 1, 1, 64, 4096, 262144}}
> 
> -- 
> Best regards,
> Vladimir
> 



Re: [Qemu-devel] [PATCH for-2.12 0/4] qmp dirty bitmap API

2017-12-26 Thread Vladimir Sementsov-Ogievskiy

26.12.2017 10:07, Fam Zheng wrote:

On Wed, 12/20 11:20, Vladimir Sementsov-Ogievskiy wrote:

external backup:

0. we have active_disk and attached to it dirty bitmap bitmap0
1. qmp blockdev-add tmp_disk (backing=active_disk)
2. guest fsfreeze
3. qmp transaction:
     - block-dirty-bitmap-add node=active_disk name=bitmap1
     - block-dirty-bitmap-disable node=active_disk name=bitmap0
     - blockdev-backup drive=active_disk target=tmp_disk sync=none
4. guest fsthaw
5. (? not designed yet) qmp blockdev-add filter_node - special filter node
over tmp_disk for synchronization of nbd-reads and backup(sync=none) cow
requests (like it is done in block/replication)
6. qmp nbd-server-start
7. qmp nbd-server-add filter_node (there should be possibility of exporting
bitmap of child node filter_node->tmp_disk->active_disk->bitmap0)

then, external tool can connect to nbd server and get exported bitmap and
read data (including bitmap0) accordingly to nbd specification.
(also, external tool may get a merge of several bitmaps, if we already have
a sequence of them)
then, after backup finish, what can be done:

1. qmp block-job-cancel device=active_disk (stop our backup(sync=none))
2. qmp nbd-server-stop (or qmp nbd-server-remove filter_node)
3. qmp blockdev-remove filter_node
4. qmp blockdev-remove tmp_disk

on successful backup, you can drop old bitmap if you want (or do not drop
it, if you need to keep sequence of disabled bitmaps):
1. block-dirty-bitmap-remove node=active_disk name=bitmap0

on failed backup, you can merge bitmaps, to make it look like nothing
happened:
1. qmp transaction:
    - block-dirty-bitmap-merge node=active_disk name-source=bitmap1
name-target=bitmap0

Being done in a transaction, will merging a large-ish bitmap synchronously hurt
the responsiveness? Because we have the BQL lock held here which pauses all
device emulation.

Have you measured how long it takes to merge two typical bitmaps. Say, for a 1TB
disk?

Fam


We don't need merge in a transaction.

Anyway, good question.

two all-ones bitmaps, 64k granularity, 1TB disk:
# time virsh qemu-monitor-command tmp '{"execute": 
"block-dirty-bitmap-merge", "arguments": {"node": "disk", "src_name": 
"a", "dst_name": "b"}}'

{"return":{},"id":"libvirt-1181"}
real    0m0.009s
user    0m0.006s
sys 0m0.002s

and this is fine:
for the last level of the hbitmap we will have
   disk_size / granularity / nb_bits_in_long = (1024 ^ 4) / (64 * 1024) / 64 = 262144
operations, which is not many at all
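That arithmetic can be checked mechanically. A small standalone sketch in plain C — this is not QEMU's HBitmap code, just the same word-count calculation for the bottom bitmap level:

```c
#include <stdint.h>

/*
 * Number of word-wise OR operations needed to merge the bottom level of
 * two hbitmaps: one bit per granularity-sized chunk of the disk, packed
 * 64 bits per long.
 */
static uint64_t merge_word_ops(uint64_t disk_size, uint64_t granularity)
{
    const uint64_t bits_per_long = 64;

    /* For a 1 TB disk at 64 KiB granularity this yields 262144,
     * matching the estimate in the message above. */
    return disk_size / granularity / bits_per_long;
}
```

At ~262k 64-bit ORs, the sub-10ms merge time measured with virsh above is exactly what one would expect.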



bitmaps in gdb:

(gdb) p bdrv_lookup_bs ("disk", "disk", 0)
$1 = (BlockDriverState *) 0x7fd3f6274940
(gdb) p *$1->dirty_bitmaps.lh_first
$2 = {mutex = 0x7fd3f6277b28, bitmap = 0x7fd3f5a5adc0, meta = 0x0, 
successor = 0x0,
  name = 0x7fd3f637b410 "b", size = 1099511627776, disabled = false, 
active_iterators = 0,
  readonly = false, autoload = false, persistent = false, list = 
{le_next = 0x7fd3f567c650,

    le_prev = 0x7fd3f6277b58}}
(gdb) p *$1->dirty_bitmaps.lh_first ->bitmap
$3 = {size = 16777216, count = 16777216, granularity = 16, meta = 0x0, 
levels = {0x7fd3f6279a90,
    0x7fd3f5506350, 0x7fd3f5affcb0, 0x7fd3f547a860, 0x7fd3f637b200, 
0x7fd3f67ff5c0, 0x7fd3d8dfe010},

  sizes = {1, 1, 1, 1, 64, 4096, 262144}}
(gdb) p *$1->dirty_bitmaps.lh_first ->list .le_next
$4 = {mutex = 0x7fd3f6277b28, bitmap = 0x7fd3f567cb30, meta = 0x0, 
successor = 0x0,
  name = 0x7fd3f5482fb0 "a", size = 1099511627776, disabled = false, 
active_iterators = 0,
  readonly = false, autoload = false, persistent = false, list = 
{le_next = 0x0,

    le_prev = 0x7fd3f6c779e0}}
(gdb) p *$1->dirty_bitmaps.lh_first ->list .le_next ->bitmap
$5 = {size = 16777216, count = 16777216, granularity = 16, meta = 0x0, 
levels = {0x7fd3f5ef8880,
    0x7fd3f5facea0, 0x7fd3f5f1cec0, 0x7fd3f5f40a00, 0x7fd3f6c80a00, 
0x7fd3f66e5f60, 0x7fd3d8fff010},

  sizes = {1, 1, 1, 1, 64, 4096, 262144}}

--
Best regards,
Vladimir




Re: [Qemu-devel] [PATCH 4/4] qapi: add block-dirty-bitmap-merge

2017-12-26 Thread Vladimir Sementsov-Ogievskiy

13.11.2017 19:20, Vladimir Sementsov-Ogievskiy wrote:

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  qapi/block-core.json | 38 ++
  include/block/dirty-bitmap.h |  2 ++
  block/dirty-bitmap.c | 17 +
  blockdev.c   | 30 ++
  4 files changed, 87 insertions(+)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 487744c16b..074c7c4cb9 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1604,6 +1604,20 @@
  '*persistent': 'bool', '*autoload': 'bool' } }
  
  ##

+# @BlockDirtyBitmapMerge:
+#
+# @node: name of device/node which the bitmap is tracking
+#
+# @dst_name: name of the destination dirty bitmap
+#
+# @src_name: name of the source dirty bitmap
+#
+# Since: 2.12
+##
+{ 'struct': 'BlockDirtyBitmapMerge',
+  'data': { 'node': 'str', 'dst_name': 'str', 'src_name': 'str' } }
+
+##
  # @block-dirty-bitmap-add:
  #
  # Create a dirty bitmap with a name on the node, and start tracking the 
writes.
@@ -1714,6 +1728,30 @@
'data': 'BlockDirtyBitmap' }
  
  ##

+# @block-dirty-bitmap-merge:
+#
+# Merge @src_name dirty bitmap into @dst_name dirty bitmap. @src_name dirty
+# bitmap is unchanged.
+#
+# Returns: nothing on success
+#  If @node is not a valid block device, DeviceNotFound
+#  If @dst_name or @src_name is not found, GenericError
+#  If the bitmaps have different sizes or granularities, GenericError
+#
+# Since: 2.12
+#
+# Example:
+#
+# -> { "execute": "block-dirty-bitmap-merge",
+#  "arguments": { "node": "drive0", "dst_name": "bitmap0",
+# "src_name": "bitmap1" } }
+# <- { "return": {} }
+#
+##
+  { 'command': 'block-dirty-bitmap-merge',
+'data': 'BlockDirtyBitmapMerge' }
+
+##
  # @BlockDirtyBitmapSha256:
  #
  # SHA256 hash of dirty bitmap data
diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index 3579a7597c..6d797b4a2e 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -68,6 +68,8 @@ void bdrv_dirty_bitmap_set_readonly(BdrvDirtyBitmap *bitmap, 
bool value);
  void bdrv_dirty_bitmap_set_autoload(BdrvDirtyBitmap *bitmap, bool autoload);
  void bdrv_dirty_bitmap_set_persistance(BdrvDirtyBitmap *bitmap,
 bool persistent);
+void bdrv_merge_dirty_bitmap(BdrvDirtyBitmap *dest, const BdrvDirtyBitmap *src,
+ Error **errp);
  
  /* Functions that require manual locking.  */

  void bdrv_dirty_bitmap_lock(BdrvDirtyBitmap *bitmap);
diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 2a0bcd9e51..f60c145e9c 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -719,3 +719,20 @@ char *bdrv_dirty_bitmap_sha256(const BdrvDirtyBitmap 
*bitmap, Error **errp)
  {
  return hbitmap_sha256(bitmap->bitmap, errp);
  }
+
+void bdrv_merge_dirty_bitmap(BdrvDirtyBitmap *dest, const BdrvDirtyBitmap *src,
+ Error **errp)
+{
+qemu_mutex_lock(dest->mutex);
+qemu_mutex_lock(src->mutex);


It's the same mutex (both bitmaps belong to the same node), so it must not be
locked twice here; taking it twice would deadlock.


+
+assert(bdrv_dirty_bitmap_enabled(dest));
+assert(!bdrv_dirty_bitmap_readonly(dest));
+
+if (!hbitmap_merge(dest->bitmap, src->bitmap)) {
+error_setg(errp, "Bitmaps are incompatible and can't be merged");
+}
+
+qemu_mutex_lock(src->mutex);
+qemu_mutex_lock(dest->mutex);

These two calls should be qemu_mutex_unlock(), not qemu_mutex_lock().

+}
diff --git a/blockdev.c b/blockdev.c
index f6595ddcd3..922e8da39b 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2948,6 +2948,36 @@ void qmp_block_dirty_bitmap_disable(const char *node, 
const char *name,
  bdrv_disable_dirty_bitmap(bitmap);
  }
  
+void qmp_block_dirty_bitmap_merge(const char *node, const char *dst_name,

+  const char *src_name, Error **errp)
+{
+BlockDriverState *bs;
+BdrvDirtyBitmap *dst, *src;
+
+dst = block_dirty_bitmap_lookup(node, dst_name, &bs, errp);
+if (!dst || !bs) {
+return;
+}
+
+if (bdrv_dirty_bitmap_frozen(dst)) {
+error_setg(errp, "Bitmap '%s' is frozen and cannot be modified",
+   dst_name);
+return;
+} else if (bdrv_dirty_bitmap_readonly(dst)) {
+error_setg(errp, "Bitmap '%s' is readonly and cannot be modified",
+   dst_name);
+return;
+}
+
+src = bdrv_find_dirty_bitmap(bs, src_name);
+if (!src) {
+error_setg(errp, "Dirty bitmap '%s' not found", src_name);
+return;
+}
+
+bdrv_merge_dirty_bitmap(dst, src, errp);
+}
+
  BlockDirtyBitmapSha256 *qmp_x_debug_block_dirty_bitmap_sha256(const char 
*node,
const char 
*name,
Error **errp)



--
Best regards,
Vladimir
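For illustration, the locking fix that the two review comments above ask for (take the shared mutex once, unlock on exit) could look like the following. This is a standalone sketch with stand-in types, not the actual QEMU patch; `ToyDirtyBitmap`, `merge_dirty_bitmap`, and the `bits` payload are hypothetical:

```c
#include <pthread.h>
#include <stdint.h>

/* Stand-in for BdrvDirtyBitmap: all bitmaps of one node share one mutex. */
typedef struct ToyDirtyBitmap {
    pthread_mutex_t *mutex;  /* shared, like the per-BDS bitmap mutex */
    uint64_t bits;           /* toy payload standing in for the HBitmap */
} ToyDirtyBitmap;

/*
 * Corrected merge: since dest->mutex and src->mutex point at the same
 * mutex, lock it exactly once, and unlock (not lock again) on the way out.
 */
static void merge_dirty_bitmap(ToyDirtyBitmap *dest, const ToyDirtyBitmap *src)
{
    pthread_mutex_lock(dest->mutex);    /* same object as src->mutex */
    dest->bits |= src->bits;            /* stands in for hbitmap_merge() */
    pthread_mutex_unlock(dest->mutex);  /* was a second lock() in the patch */
}
```

A second call on the same pair returns immediately instead of deadlocking, which is exactly what the trailing lock() calls in the posted patch would not do.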