date:20170512

Re: [Qemu-devel] [PATCH v2 0/3] script for crash-testing -device

2017-05-12 Thread no-reply

Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20170513033316.22395-1-ehabk...@redhat.com
Subject: [Qemu-devel] [PATCH v2 0/3] script for crash-testing -device

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

git config --local diff.renamelimit 0
git config --local diff.renames True

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
failed=1
echo
fi
n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
67ffc03 scripts: Test script to look for -device crashes
0be0012 qemu.py: Add QEMUMachine.exitcode() method
2dea278 qemu.py: Don't set _popen=None on error/shutdown

=== OUTPUT BEGIN ===
Checking PATCH 1/3: qemu.py: Don't set _popen=None on error/shutdown...
Checking PATCH 2/3: qemu.py: Add QEMUMachine.exitcode() method...
Checking PATCH 3/3: scripts: Test script to look for -device crashes...
WARNING: line over 80 characters
#82: FILE: scripts/device-crash-test.py:47:
+  dict(machine='niagara', expected=True),   # Unable to load a firmware 
for -M niagara

ERROR: line over 90 characters
#83: FILE: scripts/device-crash-test.py:48:
+  dict(machine='boston', expected=True),# Please provide either a 
-kernel or -bios argument

ERROR: line over 90 characters
#86: FILE: scripts/device-crash-test.py:51:
+  # devices that don't work out of the box because they require extra options 
to "-device DEV":

WARNING: line over 80 characters
#88: FILE: scripts/device-crash-test.py:53:
+  dict(device='.*-(i386|x86_64)-cpu', expected=True),# CPU socket-id is 
not set

WARNING: line over 80 characters
#89: FILE: scripts/device-crash-test.py:54:
+  dict(device='ARM,bitband-memory', expected=True),  # source-memory 
property not set

ERROR: line over 90 characters
#90: FILE: scripts/device-crash-test.py:55:
+  dict(device='arm.cortex-a9-global-timer', expected=True), # 
a9_gtimer_realize: num-cpu must be between 1 and 4

WARNING: line over 80 characters
#91: FILE: scripts/device-crash-test.py:56:
+  dict(device='arm_mptimer', expected=True), # num-cpu must be 
between 1 and 4

WARNING: line over 80 characters
#92: FILE: scripts/device-crash-test.py:57:
+  dict(device='armv7m', expected=True),  # memory property was 
not set

WARNING: line over 80 characters
#93: FILE: scripts/device-crash-test.py:58:
+  dict(device='aspeed.scu', expected=True),  # Unknown silicon 
revision: 0x0

WARNING: line over 80 characters
#94: FILE: scripts/device-crash-test.py:59:
+  dict(device='aspeed.sdmc', expected=True), # Unknown silicon 
revision: 0x0

ERROR: line over 90 characters
#95: FILE: scripts/device-crash-test.py:60:
+  dict(device='bcm2835-dma', expected=True), # 
bcm2835_dma_realize: required dma-mr link not found: Property '.dma-mr' not 
found

ERROR: line over 90 characters
#96: FILE: scripts/device-crash-test.py:61:
+  dict(device='bcm2835-fb', expected=True),  # bcm2835_fb_realize: 
required vcram-base property not set

ERROR: line over 90 characters
#97: FILE: scripts/device-crash-test.py:62:
+  dict(device='bcm2835-mbox', expected=True),# 
bcm2835_mbox_realize: required mbox-mr link not found: Property '.mbox-mr' not 
found

ERROR: line over 90 characters
#98: FILE: scripts/device-crash-test.py:63:
+  dict(device='bcm2835-peripherals', expected=True), # 
bcm2835_peripherals_realize: required ram link not found: Property '.ram' not 
found

ERROR: line over 90 characters
#99: FILE: scripts/device-crash-test.py:64:
+  dict(device='bcm2835-property', expected=True),# 
bcm2835_property_realize: required fb link not found: Property '.fb' not found

ERROR: line over 90 characters
#100: FILE: scripts/device-crash-test.py:65:
+  dict(device='bcm2835_gpio', expected=True),# 
bcm2835_gpio_realize: required sdhci link not found: Property '.sdbus-sdhci' 
not found

ERROR: line over 90 characters
#101: FILE: scripts/device-crash-test.py:66:
+  dict(device='bcm2836', expected=True), # bcm2836_realize: 
required ram link not found: Property '.ram' not found

ERROR: line over 90 characters
#102: FILE: scripts/device-crash-test.py:67:
+  dict(device='cfi.pflash01', expected=True),# attribute 
"sector-length" not specified or zero.

ERROR: line over 90 characters
#103: FILE: scripts/device-crash-test.py:68:
+  dict(device='cfi.pflash02', expected=True),# attribute 
"sector-length" not specified or zero.

ERROR: line over 90 characters
#104: FILE: scripts/device-crash-test.py:69:
+  dict(device='icp', expected=True), # icp_realize: 
required link

[Qemu-devel] [PATCH v2 3/3] scripts: Test script to look for -device crashes

2017-05-12 Thread Eduardo Habkost

Test code to check if we can crash QEMU using -device. It will
test all accel/machine/device combinations by default, which may
take a few hours (it's more than 90k test cases). There's a "-r"
option that makes it test a random sample of combinations.

The script contains a whitelist for: 1) known error messages
that make QEMU exit cleanly; 2) known QEMU crashes.

This is the behavior when the script finds a failure:

* Known clean (exitcode=1) error messages generate INFO messages
  (visible only in verbose mode), to make script output shorter
* Unknown clean error messages generate warnings
  (visible by default)
* Known crashes generate error messages, but are not fatal
* Unknown crashes generate fatal error messages

I'm unsure about the need to maintain a list of known clean error
messages, but I wanted to at least document all existing failure
cases to use as a base for building more comprehensive test code.
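
The four rules listed above can be sketched as a small Python function. This
is only an illustration of the described behavior, not code taken from the
script: the name classify_failure and its two parameters are hypothetical, and
it assumes that "clean" means an exit code of 1, as the list states.

```python
import logging


def classify_failure(exitcode, whitelisted):
    """Map a failed test case to a log level, following the four rules above.

    exitcode:    QEMU's exit status (1 is treated as a clean error exit).
    whitelisted: True if a whitelist entry matched this failure.
    Both names are hypothetical and exist only for this sketch.
    """
    clean = (exitcode == 1)
    if clean and whitelisted:
        return logging.INFO      # known clean error: visible only in verbose mode
    if clean:
        return logging.WARNING   # unknown clean error: visible by default
    if whitelisted:
        return logging.ERROR     # known crash: reported, but not fatal
    return logging.CRITICAL      # unknown crash: fatal
```

A caller could pass the returned level to logger.log() and stop the run only
when the level is logging.CRITICAL.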

Signed-off-by: Eduardo Habkost 
---
Changes v1 -> v2:
* New whitelist entries:
  * "could not find stage1 bootloader"
  * Segfaults when using devices: a15mpcore_priv, sb16, cs4231a, arm-gicv3
* Format "success" line using formatTestCase(), and using DEBUG
  loglevel
* Reword "test case:" line with "running test case:", for clarity
* Fix "pc-.*" whitelist to include "q35" too
* Add --devtype option to test only a specific device type
* Send all log messages to stdout instead of stderr
* Avoid printing "obsolete whitelist entry?" messages if we know
  we are not testing every single accel/machine/device
  combination
* --quick mode, to skip cases where failures are always expected,
  and to print a warning in case we don't get an expected failure
* Use qemu.QEMUMachine instead of qtest.QEMUQtestMachine, as we don't
  use any of the QEMUQtestMachine features
* Fix handling of multiple '-t' options
* Simplify code that generates a random sample of test cases
---
 scripts/device-crash-test.py | 520 +++
 1 file changed, 520 insertions(+)
 create mode 100755 scripts/device-crash-test.py

diff --git a/scripts/device-crash-test.py b/scripts/device-crash-test.py
new file mode 100755
index 00..550da70ec7
--- /dev/null
+++ b/scripts/device-crash-test.py
@@ -0,0 +1,520 @@
+#!/usr/bin/env python2.7
+#
+# Run QEMU with all combinations of -machine and -device types,
+# check for crashes and unexpected errors.
+#
+#  Copyright (c) 2017 Red Hat Inc
+#
+# Author:
+#  Eduardo Habkost 
+#
+# This library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2 of the License, or (at your option) any later version.
+#
+# This library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with this library; if not, see .
+#
+
+import sys, os, glob
+sys.path.append(os.path.join(os.path.dirname(__file__), '..', 'scripts'))
+
+from itertools import chain
+from qemu import QEMUMachine
+import logging, traceback, re, random, argparse
+
+logger = logging.getLogger('device-crash-test')
+dbg = logger.debug
+
+# Valid whitelist entry keys:
+# - accel: regexp, full match only
+# - machine: regexp, full match only
+# - device: regexp, full match only
+# - log: regexp, partial match allowed
+# - exitcode: if not present, defaults to 1. If None, matches any exitcode
+# - warn: if True, matching failures will be logged as warnings
+# - expected: if True, QEMU is expected to always fail every time
+#   when testing the corresponding test case
+ERROR_WHITELIST = [
+  # Machines that won't work out of the box:
+  # MACHINE | ERROR MESSAGE
+  dict(machine='niagara', expected=True),   # Unable to load a firmware 
for -M niagara
+  dict(machine='boston', expected=True),# Please provide either a 
-kernel or -bios argument
+  dict(machine='leon3_generic', expected=True), # Can't read bios image (null)
+
+  # devices that don't work out of the box because they require extra options 
to "-device DEV":
+  #DEVICE| ERROR MESSAGE
+  dict(device='.*-(i386|x86_64)-cpu', expected=True),# CPU socket-id is 
not set
+  dict(device='ARM,bitband-memory', expected=True),  # source-memory 
property not set
+  dict(device='arm.cortex-a9-global-timer', expected=True), # 
a9_gtimer_realize: num-cpu must be between 1 and 4
+  dict(device='arm_mptimer', expected=True), # num-cpu must be 
between 1 and 4
+  dict(device='armv7m', expected=True),  # memory property was 
not set
+  dict(device='aspeed.scu', expected=True),

[Qemu-devel] [PATCH v2 0/3] script for crash-testing -device

2017-05-12 Thread Eduardo Habkost

Changes v1 -> v2:
* Use a simpler method to query QEMU exit code in qemu.py
* Use only qemu.py module, instead of qtest.py
* New whitelist entries:
  * "could not find stage1 bootloader"
  * Segfaults when using devices: a15mpcore_priv, sb16, cs4231a, arm-gicv3
* Format "success" line using formatTestCase(), and using DEBUG
  loglevel
* Reword "test case:" line with "running test case:", for clarity
* Fix "pc-.*" whitelist to include "q35" too
* Add --devtype option to test only a specific device type
* Send all log messages to stdout instead of stderr
* Avoid printing "obsolete whitelist entry?" messages if we know
  we are not testing every single accel/machine/device
  combination
* --quick mode, to skip cases where failures are always expected,
  and to print a warning in case we don't get an expected failure
* Use qemu.QEMUMachine instead of qtest.QEMUQtestMachine, as we don't
  use any of the QEMUQtestMachine features
* Fix handling of multiple '-t' options
* Simplify code that generates a random sample of test cases

This series adds scripts/device-crash-test.py, which can be used to
crash-test -device with multiple machine/accel/device
combinations.

The script found a few crashes on some machines/devices. A dump
of existing cases can be seen here:
  https://gist.github.com/ehabkost/503b0af0375f0d98d3e84017e8ca54eb

The script contains a whitelist that can also be useful as
documentation of existing ways -device can fail or crash.
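
As a rough illustration of how such a whitelist could be consulted, the sketch
below follows the entry keys documented in the patch (accel, machine and
device as full-match regexps, log as a partial-match regexp, exitcode
defaulting to 1 with None matching anything). The function name entry_matches
and its argument list are assumptions made for this example; the script itself
may organize the check differently.

```python
import re


def entry_matches(entry, accel, machine, device, log, exitcode):
    """Return True if one whitelist entry (a dict) matches a failed test case.

    Hypothetical helper, written only to illustrate the documented key rules.
    """
    for key, value in (('accel', accel), ('machine', machine),
                       ('device', device)):
        pattern = entry.get(key)
        # accel/machine/device: the regexp must match the whole string
        if pattern is not None and not re.match('(?:%s)\\Z' % pattern, value):
            return False
    # log: a match anywhere in the captured output is enough
    if 'log' in entry and not re.search(entry['log'], log):
        return False
    # exitcode: defaults to 1; an explicit None matches any exit code
    expected = entry.get('exitcode', 1)
    return expected is None or expected == exitcode
```

With the entry dict(device='.*-(i386|x86_64)-cpu', expected=True) from the
patch, a clean failure (exit code 1) of device qemu64-x86_64-cpu matches,
while a crash of the same device with exit code -6 does not.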

Note that the script takes a few hours to run in the default mode
(testing all accel/machine/device combinations), but the "-r N"
option can be used to make it only test N random samples.
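
One simple way to draw N random samples from the full set of combinations is
shown below. This is a sketch under assumptions, not the script's actual
implementation: sample_test_cases, its parameters and the seed argument are
hypothetical names introduced here.

```python
import itertools
import random


def sample_test_cases(accels, machines, devices, n, seed=None):
    """Pick at most n distinct (accel, machine, device) combinations.

    Hypothetical helper; the optional seed makes a run reproducible.
    """
    cases = list(itertools.product(accels, machines, devices))
    rng = random.Random(seed)
    # random.sample() never returns the same combination twice
    return rng.sample(cases, min(n, len(cases)))
```

Building the full list first is affordable at the scale mentioned in the
patch (around 90k combinations), and it keeps the sampling uniform.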

Example script output:

  $ ../scripts/device-crash-test.py -v --shuffle
  INFO: test case: machine=verdex binary=./aarch64-softmmu/qemu-system-aarch64 
device=exynos4210-ehci-usb accel=tcg
  INFO: test case: machine=none binary=./aarch64-softmmu/qemu-system-aarch64 
device=onenand accel=tcg
  INFO: test case: machine=pc-i440fx-2.2 
binary=./x86_64-softmmu/qemu-system-x86_64 device=ide-cd accel=kvm
  INFO: success: ./x86_64-softmmu/qemu-system-x86_64 -S -machine 
pc-i440fx-2.2,accel=kvm -device ide-cd
  INFO: test case: machine=SPARCClassic 
binary=./sparc-softmmu/qemu-system-sparc device=memory accel=tcg
  qemu received signal 6: -S -machine SPARCClassic,accel=tcg -device memory
  ERROR: failed: machine=SPARCClassic binary=./sparc-softmmu/qemu-system-sparc 
device=memory accel=tcg
  ERROR: cmdline: ./sparc-softmmu/qemu-system-sparc -S -machine 
SPARCClassic,accel=tcg -device memory
  ERROR: log: qemu-system-sparc: /root/qemu-build/exec.c:1500: find_ram_offset: 
Assertion `size != 0' failed.
  ERROR: exit code: -6
  INFO: test case: machine=romulus-bmc binary=./arm-softmmu/qemu-system-arm 
device=ich9-usb-uhci6 accel=tcg
  INFO: test case: machine=ref405ep binary=./ppc-softmmu/qemu-system-ppc 
device=ivshmem-doorbell accel=tcg
  INFO: test case: machine=romulus-bmc 
binary=./aarch64-softmmu/qemu-system-aarch64 device=l2x0 accel=tcg
  INFO: test case: machine=pc-i440fx-1.7 
binary=./x86_64-softmmu/qemu-system-x86_64 device=virtio-input-host-pci 
accel=tcg
  INFO: test case: machine=none binary=./ppc-softmmu/qemu-system-ppc 
device=virtio-tablet-pci accel=tcg
  INFO: test case: machine=terrier binary=./aarch64-softmmu/qemu-system-aarch64 
device=sst25vf016b accel=tcg
  INFO: success: ./aarch64-softmmu/qemu-system-aarch64 -S -machine 
terrier,accel=tcg -device sst25vf016b
  INFO: test case: machine=none binary=./i386-softmmu/qemu-system-i386 
device=intel-iommu accel=kvm
  qemu received signal 6: -S -machine none,accel=kvm -device intel-iommu
  ERROR: failed: machine=none binary=./i386-softmmu/qemu-system-i386 
device=intel-iommu accel=kvm
  ERROR: cmdline: ./i386-softmmu/qemu-system-i386 -S -machine none,accel=kvm 
-device intel-iommu
  ERROR: log: /root/qemu-build/hw/i386/intel_iommu.c:2565:vtd_realize: Object 
0x7fe117fabfb0 is not an instance of type generic-pc-machine
  ERROR: exit code: -6
  INFO: test case: machine=tosa binary=./aarch64-softmmu/qemu-system-aarch64 
device=integrator_core accel=tcg
  INFO: test case: machine=isapc binary=./i386-softmmu/qemu-system-i386 
device=i82550 accel=kvm
  INFO: test case: machine=xlnx-ep108 
binary=./aarch64-softmmu/qemu-system-aarch64 device=digic accel=tcg
  qemu received signal 6: -S -machine xlnx-ep108,accel=tcg -device digic
  ERROR: failed: machine=xlnx-ep108 
binary=./aarch64-softmmu/qemu-system-aarch64 device=digic accel=tcg
  ERROR: cmdline: ./aarch64-softmmu/qemu-system-aarch64 -S -machine 
xlnx-ep108,accel=tcg -device digic
  ERROR: log: audio: Could not init `oss' audio driver
  ERROR: log: Unexpected error in qemu_chr_fe_init() at 
/root/qemu-build/chardev/char.c:512:
  ERROR: log: qemu-system-aarch64: -device digic: Device 'serial0' is in use
  ERROR: exit code: -6
  INFO: test case: machine=raspi2 binary=./arm-softmmu/qemu-system-arm 
device=sd-card accel=tcg
  INFO: success: ./arm-softmmu/qemu-system-arm -S -machine

[Qemu-devel] [PATCH v2 1/3] qemu.py: Don't set _popen=None on error/shutdown

2017-05-12 Thread Eduardo Habkost

Keep the Popen object around so we can query its exit code later.

To keep the existing 'self._popen is None' checks working, add an
is_running() method that checks whether the process is still running.
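
The idea can be tried outside QEMU with a small stand-in class. The sketch
below is written for Python 3 and is not the qemu.py code: the class Child and
its methods are hypothetical, and is_running() here returns a strict bool,
whereas the patch returns the result of an "and" expression.

```python
import subprocess
import sys


class Child(object):
    """Hypothetical stand-in for QEMUMachine, reduced to process tracking."""

    def __init__(self):
        self._popen = None

    def launch(self, args):
        self._popen = subprocess.Popen(args)

    def is_running(self):
        # returncode stays None until poll() or wait() has observed the exit
        return self._popen is not None and self._popen.returncode is None

    def wait(self):
        return self._popen.wait()


child = Child()
before_launch = child.is_running()
child.launch([sys.executable, '-c', 'import sys; sys.exit(3)'])
while_tracked = child.is_running()
child.wait()
after_exit = child.is_running()
# The Popen object is still around, so the exit status can be read later.
exit_status = child._popen.returncode
```

Note that a check based on returncode alone reports the process as running
until something has called poll() or wait() on it.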

Signed-off-by: Eduardo Habkost 
---
 scripts/qemu.py | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/scripts/qemu.py b/scripts/qemu.py
index 6d1b6230b7..16934f1e02 100644
--- a/scripts/qemu.py
+++ b/scripts/qemu.py
@@ -85,8 +85,11 @@ class QEMUMachine(object):
                 return
             raise
 
+    def is_running(self):
+        return self._popen and (self._popen.returncode is None)
+
     def get_pid(self):
-        if not self._popen:
+        if not self.is_running():
             return None
         return self._popen.pid
 
@@ -128,16 +131,16 @@ class QEMUMachine(object):
                                            stderr=subprocess.STDOUT, shell=False)
             self._post_launch()
         except:
-            if self._popen:
+            if self.is_running():
                 self._popen.kill()
+                self._popen.wait()
             self._load_io_log()
             self._post_shutdown()
-            self._popen = None
             raise
 
     def shutdown(self):
         '''Terminate the VM and clean up'''
-        if not self._popen is None:
+        if self.is_running():
             try:
                 self._qmp.cmd('quit')
                 self._qmp.close()
@@ -149,7 +152,6 @@ class QEMUMachine(object):
                 sys.stderr.write('qemu received signal %i: %s\n' % (-exitcode, ' '.join(self._args)))
             self._load_io_log()
             self._post_shutdown()
-            self._popen = None
 
     underscore_to_dash = string.maketrans('_', '-')
     def qmp(self, cmd, conv_keys=True, **args):
-- 
2.11.0.259.g40922b1

[Qemu-devel] [PATCH v2 2/3] qemu.py: Add QEMUMachine.exitcode() method

2017-05-12 Thread Eduardo Habkost

Allow the exit code of QEMU to be queried by scripts.
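
For scripts that consume this value, it helps to remember how
subprocess.Popen encodes the result on POSIX: None while no exit has been
observed, a non-negative number for a normal exit, and -N when the process was
killed by signal N. A minimal sketch, with the hypothetical helper name
describe_exit:

```python
def describe_exit(returncode):
    """Turn a Popen.returncode value into a short description.

    Hypothetical helper for illustration; follows the POSIX convention
    used by the subprocess module.
    """
    if returncode is None:
        return 'still running'
    if returncode < 0:
        return 'killed by signal %d' % -returncode
    return 'exited with status %d' % returncode
```

This is why the sample output in the cover letter pairs "qemu received
signal 6" with "exit code: -6".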

Signed-off-by: Eduardo Habkost 
---
 scripts/qemu.py | 4 
 1 file changed, 4 insertions(+)

diff --git a/scripts/qemu.py b/scripts/qemu.py
index 16934f1e02..ebe1c4b919 100644
--- a/scripts/qemu.py
+++ b/scripts/qemu.py
@@ -88,6 +88,10 @@ class QEMUMachine(object):
     def is_running(self):
         return self._popen and (self._popen.returncode is None)
 
+    def exitcode(self):
+        if self._popen:
+            return self._popen.returncode
+
     def get_pid(self):
         if not self.is_running():
             return None
-- 
2.11.0.259.g40922b1

[Qemu-devel] [PATCH] maintainers: Add myself as a NetBSD reviewer

2017-05-12 Thread Kamil Rytarowski

I volunteer to review NetBSD patches.
Adding myself will help to not miss some of them.

Restore NetBSD as a maintained host.

All patches needed to make QEMU build under pkgsrc have been submitted for review.

Signed-off-by: Kamil Rytarowski 
---
 MAINTAINERS | 6 ++
 configure   | 1 +
 2 files changed, 7 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 0e8d731ebf..c4eff13ce2 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -354,6 +354,12 @@ L: qemu-devel@nongnu.org
 S: Maintained
 F: *posix*
 
+NETBSD
+L: qemu-devel@nongnu.org
+M: Kamil Rytarowski 
+S: Maintained
+K: (?i)NetBSD
+
 W32, W64
 L: qemu-devel@nongnu.org
 M: Stefan Weil 
diff --git a/configure b/configure
index 7c020c076b..0b3e014c93 100755
--- a/configure
+++ b/configure
@@ -611,6 +611,7 @@ NetBSD)
   audio_possible_drivers="oss sdl"
   oss_lib="-lossaudio"
   HOST_VARIANT_DIR="netbsd"
+  supported_os="yes"
 ;;
 OpenBSD)
   bsd="yes"
-- 
2.12.2

Re: [Qemu-devel] [PATCH] tcg: optimize gen_extr_i64_i32()

2017-05-12 Thread Richard Henderson


On 05/12/2017 05:29 PM, Philippe Mathieu-Daudé wrote:

Inspired by Richard Henderson's comment:

 http://lists.nongnu.org/archive/html/qemu-devel/2017-05/msg02277.html

Patch applied mechanically with this coccinelle semantic patch:

 @@
 expression lo, hi,arg;
 @@
 -tcg_gen_extrl_i64_i32(lo, arg);
 -tcg_gen_extrh_i64_i32(hi, arg);
 +tcg_gen_extr_i64_i32(lo, hi, arg);

Signed-off-by: Philippe Mathieu-Daudé 
---
  tcg/tcg-op.c | 3 +--
  1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index 6b1f41500c..f3d556c21a 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -2562,8 +2562,7 @@ void tcg_gen_extr_i64_i32(TCGv_i32 lo, TCGv_i32 hi, 
TCGv_i64 arg)
  tcg_gen_mov_i32(lo, TCGV_LOW(arg));
  tcg_gen_mov_i32(hi, TCGV_HIGH(arg));
  } else {
-tcg_gen_extrl_i64_i32(lo, arg);
-tcg_gen_extrh_i64_i32(hi, arg);
+tcg_gen_extr_i64_i32(lo, hi, arg);


You've just created an instance of infinite self-recursion.
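
The problem can be reproduced with a small Python analogy. This is not TCG
code: the real helpers emit ops rather than return values, and the names
extrl, extrh, extr_correct and extr_rewritten are hypothetical. The point is
only that the semantic patch also matched the body of the combined helper
itself, so after the rewrite that helper calls itself.

```python
def extrl(arg):
    """Low 32 bits of a 64-bit value."""
    return arg & 0xffffffff


def extrh(arg):
    """High 32 bits of a 64-bit value."""
    return (arg >> 32) & 0xffffffff


def extr_correct(arg):
    # The combined helper is defined in terms of the two single helpers.
    return extrl(arg), extrh(arg)


def extr_rewritten(arg):
    # After the mechanical rewrite, the body is a call to the helper itself.
    return extr_rewritten(arg)


try:
    extr_rewritten(0x1122334455667788)
    outcome = 'returned'
except RecursionError:
    outcome = 'infinite recursion'
```

A semantic patch like this one would typically need to exclude the definition
of the replacement function from its matches.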


r~

[Qemu-devel] [PATCH] libvixl: Correct ordering of includes and fix NetBSD build

2017-05-12 Thread Kamil Rytarowski

The __STDC_CONSTANT_MACROS symbol must be defined before including
directly or indirectly  in order to get support for macros
for integer constants like INT8_C().

The vixl/globals.h header defines __STDC_CONSTANT_MACROS and must be
included before other system headers.

This change fixes build failures on NetBSD.

Signed-off-by: Kamil Rytarowski 
---
 disas/libvixl/vixl/a64/disasm-a64.cc | 2 +-
 disas/libvixl/vixl/utils.h   | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/disas/libvixl/vixl/a64/disasm-a64.cc 
b/disas/libvixl/vixl/a64/disasm-a64.cc
index 7a58a5c087..fc87306893 100644
--- a/disas/libvixl/vixl/a64/disasm-a64.cc
+++ b/disas/libvixl/vixl/a64/disasm-a64.cc
@@ -24,8 +24,8 @@
 // OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE 
USE
 // OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 
-#include 
 #include "vixl/a64/disasm-a64.h"
+#include 
 
 namespace vixl {
 
diff --git a/disas/libvixl/vixl/utils.h b/disas/libvixl/vixl/utils.h
index 5ab134e240..17034addbc 100644
--- a/disas/libvixl/vixl/utils.h
+++ b/disas/libvixl/vixl/utils.h
@@ -27,10 +27,10 @@
 #ifndef VIXL_UTILS_H
 #define VIXL_UTILS_H
 
-#include 
-#include 
 #include "vixl/globals.h"
 #include "vixl/compiler-intrinsics.h"
+#include 
+#include 
 
 namespace vixl {
 
-- 
2.12.2

Re: [Qemu-devel] [PATCH V4 02/12] net/filter-mirror.c: Add new option to enable vnet support for filter-mirror

2017-05-12 Thread Hailiang Zhang


Hi,

On 2017/5/12 9:41, Zhang Chen wrote:

We add the vnet_hdr option for filter-mirror; it is disabled by default.
If you use the virtio-net-pci net driver, please enable it.
For example:
-object filter-mirror,id=m0,netdev=hn0,queue=tx,outdev=mirror0,vnet_hdr=on


Is there any way to detect whether or not the vNIC is using vnet_hdr?
I don't think it is a good idea to make users confirm this, especially
users who may not be familiar with how the vNIC is realized in QEMU.


Thanks,
Hailiang


Signed-off-by: Zhang Chen 
---
  net/filter-mirror.c | 34 ++
  qemu-options.hx |  5 +++--
  2 files changed, 37 insertions(+), 2 deletions(-)

diff --git a/net/filter-mirror.c b/net/filter-mirror.c
index 72fa7c2..3766414 100644
--- a/net/filter-mirror.c
+++ b/net/filter-mirror.c
@@ -38,6 +38,7 @@ typedef struct MirrorState {
  NetFilterState parent_obj;
  char *indev;
  char *outdev;
+bool vnet_hdr;
  CharBackend chr_in;
  CharBackend chr_out;
  SocketReadState rs;
@@ -308,6 +309,13 @@ static char *filter_mirror_get_outdev(Object *obj, Error 
**errp)
  return g_strdup(s->outdev);
  }
  
+static char *filter_mirror_get_vnet_hdr(Object *obj, Error **errp)

+{
+MirrorState *s = FILTER_MIRROR(obj);
+
+return s->vnet_hdr ? g_strdup("on") : g_strdup("off");
+}
+
  static void
  filter_mirror_set_outdev(Object *obj, const char *value, Error **errp)
  {
@@ -322,6 +330,21 @@ filter_mirror_set_outdev(Object *obj, const char *value, 
Error **errp)
  }
  }
  
+static void filter_mirror_set_vnet_hdr(Object *obj,

+   const char *value,
+   Error **errp)
+{
+MirrorState *s = FILTER_MIRROR(obj);
+
+if (strcmp(value, "on") && strcmp(value, "off")) {
+error_setg(errp, "Invalid value for filter-mirror vnet_hdr, "
+ "should be 'on' or 'off'");
+return;
+}
+
+s->vnet_hdr = !strcmp(value, "on");
+}
+
  static char *filter_redirector_get_outdev(Object *obj, Error **errp)
  {
  MirrorState *s = FILTER_REDIRECTOR(obj);
@@ -340,8 +363,19 @@ filter_redirector_set_outdev(Object *obj, const char 
*value, Error **errp)
  
  static void filter_mirror_init(Object *obj)

  {
+MirrorState *s = FILTER_MIRROR(obj);
+
  object_property_add_str(obj, "outdev", filter_mirror_get_outdev,
  filter_mirror_set_outdev, NULL);
+
+/*
+ * The vnet_hdr is disabled by default, if you want to enable
+ * this option, you must enable all the option on related modules
+ * (like other filter or colo-compare).
+ */
+s->vnet_hdr = false;
+object_property_add_str(obj, "vnet_hdr", filter_mirror_get_vnet_hdr,
+ filter_mirror_set_vnet_hdr, NULL);
  }
  
  static void filter_redirector_init(Object *obj)

diff --git a/qemu-options.hx b/qemu-options.hx
index 70c0ded..1e08481 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -4024,10 +4024,11 @@ queue @var{all|rx|tx} is an option that can be applied 
to any netfilter.
  @option{tx}: the filter is attached to the transmit queue of the netdev,
   where it will receive packets sent by the netdev.
  
-@item -object filter-mirror,id=@var{id},netdev=@var{netdevid},outdev=@var{chardevid}[,queue=@var{all|rx|tx}]

+@item -object 
filter-mirror,id=@var{id},netdev=@var{netdevid},outdev=@var{chardevid},vnet_hdr=@var{on|off}[,queue=@var{all|rx|tx}]
  
  filter-mirror on netdev @var{netdevid},mirror net packet to chardev

-@var{chardevid}
+@var{chardevid}, if vnet_hdr = on, filter-mirror will mirror packet
+with vnet_hdr_len.
  
  @item -object filter-redirector,id=@var{id},netdev=@var{netdevid},indev=@var{chardevid},

  outdev=@var{chardevid}[,queue=@var{all|rx|tx}]

Re: [Qemu-devel] [PATCH] target/i386: enable A20 automatically in system management mode

2017-05-12 Thread Xu, Anthony

> -Original Message-
> From: Kevin O'Connor [mailto:ke...@koconnor.net]
> Sent: Friday, May 12, 2017 5:02 PM
> To: Xu, Anthony 
> Cc: Paolo Bonzini ; qemu-devel@nongnu.org
> Subject: Re: [PATCH] target/i386: enable A20 automatically in system
> management mode
> 
> On Fri, May 12, 2017 at 11:19:00PM +, Xu, Anthony wrote:
> > > SeaBIOS defaults to enabling A20 and it's a rare beast that disables
> > > it.  One could change x86.h:set_a20 and romlayout.S:transition32 to
> > > only issue the outb() if the inb() indicates a change is needed.  That
> > > would likely eliminate half the accesses.
> >
> > The 350 port 92 accesses are write operations only.
> > If we include the inb(), it would be 700, and every time there actually is
> > a change. To be precise, it is about 175 switches from 32 bit to 16 bit,
> > then back to 32 bit.
> > call16 is called 175 times during Seabios boot without any option rom,
> > It would be more if some option roms are included.
> >
> >
> > I think A20 is disabled by default in SeaBios.
> 
> I don't know why you think that.  One can check with:
> 
> --- a/src/stacks.c
> +++ b/src/stacks.c
> @@ -99,6 +99,8 @@ call32_post(void)
>  if (cr0_caching)
>  cr0_mask(CR0_CD|CR0_NW, cr0_caching);
>  }
> +if (!get_a20())
> +dprintf(1, "a20=0\n");
> 
>  // Restore cmos index register
>  outb(GET_LOW(Call16Data.cmosindex), PORT_CMOS_INDEX);
> 
> With the above I only see a handful of cases where SeaBIOS has to
> restore a20 to a disabled state.

I think it depends on the accelerator and platform; the result I gave before
was for q35 with tcg.

With the above change, I got the data below:

Platform  accel  count of restoring A20 to 0
Q35       kvm    96
Q35       tcg    271
PC        kvm    3
PC        tcg    3

A lot of the A20 restoring happens when SeaBIOS scans AHCI links.


> 
> The handful I do see are due to cases where yield() is called prior to
> option rom initialization.  Those handful are eliminated for me with
> the following fix:
> 
> --- a/src/stacks.c
> +++ b/src/stacks.c
> @@ -496,6 +496,7 @@ void
>  thread_setup(void)
>  {
>  CanInterrupt = 1;
> +call16_override(1);
>  if (! CONFIG_THREADS)
>  return;
>  ThreadControl = romfile_loadint("etc/threads", 1);

But I still see a lot of PORT_A20 accesses in QEMU as I expected


> 
> What OS / bootloader are you running?

/x86_64-softmmu/qemu-system-x86_64 -bios /home/root/git/seabios/out/bios.bin 
-smp 1 
-machine q35,accel=tcg -m 1G -drive 
format=raw,file=/home/root/images/centos7.2.img,if=ide,index=0 
-nographic  -nodefaults  -serial stdio -monitor pty

-Anthony

Re: [Qemu-devel] [RFC v1 8/9] virtio-crypto: add host feature bits support

2017-05-12 Thread Gonglei (Arei)

> From: Cornelia Huck [mailto:cornelia.h...@de.ibm.com]
> Sent: Friday, May 12, 2017 7:22 PM
> To: Gonglei (Arei)
> Cc: qemu-devel@nongnu.org; m...@redhat.com; Huangweidong (C);
> pa...@linux.vnet.ibm.com; stefa...@redhat.com; Luonengjun; Linqiangmin;
> xin.z...@intel.com; Wubin (H)
> Subject: Re: [RFC v1 8/9] virtio-crypto: add host feature bits support
> 
> On Fri, 12 May 2017 00:55:23 +
> "Gonglei (Arei)"  wrote:
> 
> > >
> > > From: Cornelia Huck [mailto:cornelia.h...@de.ibm.com]
> > > Sent: Thursday, May 11, 2017 11:05 PM
> > > Subject: Re: [RFC v1 8/9] virtio-crypto: add host feature bits support
> > >
> > > On Mon, 8 May 2017 19:38:23 +0800
> > > Gonglei  wrote:
> > >
> > > > We enable all feature bits acquiescently.
> > > >
> > > > Signed-off-by: Gonglei 
> > > > ---
> > > >  hw/virtio/virtio-crypto.c | 15 +++
> > > >  include/hw/virtio/virtio-crypto.h |  1 +
> > > >  2 files changed, 16 insertions(+)
> > > >
> > > > diff --git a/hw/virtio/virtio-crypto.c b/hw/virtio/virtio-crypto.c
> > > > index 5422f25..3dc0ff2 100644
> > > > --- a/hw/virtio/virtio-crypto.c
> > > > +++ b/hw/virtio/virtio-crypto.c
> > > > @@ -1034,6 +1034,11 @@ static uint64_t
> > > virtio_crypto_get_features(VirtIODevice *vdev,
> > > > uint64_t features,
> > > > Error **errp)
> > > >  {
> > > > +VirtIOCrypto *vcrypto = VIRTIO_CRYPTO(vdev);
> > > > +
> > > > +/* Firstly sync all virtio-crypto possible supported features */
> > > > +features |= vcrypto->host_features;
> > > > +
> > > >  return features;
> > > >  }
> > > >
> > > > @@ -1144,6 +1149,16 @@ static const VMStateDescription
> > > vmstate_virtio_crypto = {
> > > >  };
> > > >
> > > >  static Property virtio_crypto_properties[] = {
> > > > +DEFINE_PROP_BIT("mux_mode", VirtIOCrypto, host_features,
> > > > +VIRTIO_CRYPTO_F_MUX_MODE, true),
> > > > +DEFINE_PROP_BIT("cipher_stateless_mode", VirtIOCrypto,
> > > host_features,
> > > > +VIRTIO_CRYPTO_F_CIPHER_STATELESS_MODE,
> > > true),
> > > > +DEFINE_PROP_BIT("hash_stateless_mode", VirtIOCrypto,
> > > host_features,
> > > > +VIRTIO_CRYPTO_F_HASH_STATELESS_MODE,
> > > true),
> > > > +DEFINE_PROP_BIT("mac_stateless_mode", VirtIOCrypto,
> > > host_features,
> > > > +VIRTIO_CRYPTO_F_MAC_STATELESS_MODE,
> true),
> > > > +DEFINE_PROP_BIT("aead_stateless_mode", VirtIOCrypto,
> > > host_features,
> > > > +VIRTIO_CRYPTO_F_AEAD_STATELESS_MODE,
> > > true),
> > > >  DEFINE_PROP_END_OF_LIST(),
> > > >  };
> > > >
> > > > diff --git a/include/hw/virtio/virtio-crypto.h
> b/include/hw/virtio/virtio-crypto.h
> > > > index 465ad20..30ea51d 100644
> > > > --- a/include/hw/virtio/virtio-crypto.h
> > > > +++ b/include/hw/virtio/virtio-crypto.h
> > > > @@ -97,6 +97,7 @@ typedef struct VirtIOCrypto {
> > > >  int multiqueue;
> > > >  uint32_t curr_queues;
> > > >  size_t config_size;
> > > > +uint32_t host_features;
> > >
> > > I'd just make that 64 bits from the start.
> > >
> > Yes, that's better.
> >
> > > >  } VirtIOCrypto;
> > > >
> > > >  #endif /* _QEMU_VIRTIO_CRYPTO_H */
> > >
> > > Don't you need some kind of compat handling?
> >
> > I did that in patch 6, based on which of those feature bits were
> > negotiated.
> > Patch 9 tests both session mode and stateless mode, and both work. :)
> 
> Ah, I meant machine compat handling. You probably don't want to offer
> these feature bits in older compat machines.

Oh, yes, the older compat machines can only support session mode.
I don't think it's necessary to offer these feature bits there; they would
have to default to false.

Thanks,
-Gonglei

Re: [Qemu-devel] [RFC v1 6/9] virtio-crypto: rework virtio_crypto_handle_request

2017-05-12 Thread Gonglei (Arei)


> From: Halil Pasic [mailto:pa...@linux.vnet.ibm.com]
> Sent: Friday, May 12, 2017 7:02 PM
> 
> 
> On 05/08/2017 01:38 PM, Gonglei wrote:
> > According to the new spec, we should use a different
> > request structure to store the data request, depending
> > on whether the VIRTIO_CRYPTO_F_MUX_MODE feature bit is
> > negotiated.
> >
> > This patch does not support stateless mode yet. The
> > device reports an error if both VIRTIO_CRYPTO_F_MUX_MODE
> > and VIRTIO_CRYPTO_F_CIPHER_STATELESS_MODE are negotiated
> > while header.flag is not set to
> > VIRTIO_CRYPTO_FLAG_SESSION_MODE.
> >
> > Let's handle this scenario in the following patches.
> >
> > Signed-off-by: Gonglei 
> > ---
> >  hw/virtio/virtio-crypto.c | 83
> ---
> >  1 file changed, 71 insertions(+), 12 deletions(-)
> >
> > diff --git a/hw/virtio/virtio-crypto.c b/hw/virtio/virtio-crypto.c
> > index 0353eb6..c4b8a2c 100644
> > --- a/hw/virtio/virtio-crypto.c
> > +++ b/hw/virtio/virtio-crypto.c
> > @@ -577,6 +577,7 @@ virtio_crypto_handle_request(VirtIOCryptoReq
> *request)
> >  VirtQueueElement *elem = >elem;
> >  int queue_index =
> virtio_crypto_vq2q(virtio_get_queue_index(request->vq));
> >  struct virtio_crypto_op_data_req req;
> > +struct virtio_crypto_op_data_req_mux req_mux;
> >  int ret;
> >  struct iovec *in_iov;
> >  struct iovec *out_iov;
> > @@ -587,6 +588,9 @@ virtio_crypto_handle_request(VirtIOCryptoReq *request)
> >  uint64_t session_id;
> >  CryptoDevBackendSymOpInfo *sym_op_info = NULL;
> >  Error *local_err = NULL;
> > +bool mux_mode_is_negotiated;
> > +struct virtio_crypto_op_header *header;
> > +bool is_stateless_req = false;
> >
> >  if (elem->out_num < 1 || elem->in_num < 1) {
> >  virtio_error(vdev, "virtio-crypto dataq missing headers");
> > @@ -597,12 +601,28 @@ virtio_crypto_handle_request(VirtIOCryptoReq *request)
> >  out_iov = elem->out_sg;
> >  in_num = elem->in_num;
> >  in_iov = elem->in_sg;
> > -if (unlikely(iov_to_buf(out_iov, out_num, 0, &req, sizeof(req))
> > -!= sizeof(req))) {
> > -virtio_error(vdev, "virtio-crypto request outhdr too short");
> > -return -1;
> > +
> > +mux_mode_is_negotiated =
> > +virtio_vdev_has_feature(vdev, VIRTIO_CRYPTO_F_MUX_MODE);
> > +if (!mux_mode_is_negotiated) {
> > +if (unlikely(iov_to_buf(out_iov, out_num, 0, &req, sizeof(req))
> > +!= sizeof(req))) {
> > +virtio_error(vdev, "virtio-crypto request outhdr too short");
> > +return -1;
> > +}
> > +iov_discard_front(&out_iov, &out_num, sizeof(req));
> > +
> > +header = &req.header;
> > +} else {
> > +if (unlikely(iov_to_buf(out_iov, out_num, 0, &req_mux,
> > +sizeof(req_mux)) != sizeof(req_mux))) {
> > +virtio_error(vdev, "virtio-crypto request outhdr too short");
> > +return -1;
> > +}
> > +iov_discard_front(&out_iov, &out_num, sizeof(req_mux));
> > +
> > +header = &req_mux.header;
> 
> I wonder if this request length checking logic conforms to the
> most recent spec draft on the list ("[PATCH v18 0/2] virtio-crypto:
> virtio crypto device specification").
> 
Sure. Please see the normative statement below:

'''
\drivernormative{\paragraph}{Symmetric algorithms Operation}{Device Types / Crypto Device / Device Operation / Symmetric algorithms Operation}
...
\item If the VIRTIO_CRYPTO_F_MUX_MODE feature bit is negotiated, the driver 
MUST use struct virtio_crypto_op_data_req_mux to wrap crypto requests.
Otherwise, the driver MUST use struct virtio_crypto_op_data_req.
...
'''
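
To make the rule concrete, here is a minimal sketch in C of picking the
fixed request size from the negotiated feature. The demo_* structures, their
payload sizes, and both helper functions are hypothetical stand-ins for
illustration only; they are not the layouts from the spec draft or the code
in this patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the common operation header. */
struct demo_op_header {
    uint32_t opcode;
    uint32_t flag;
};

/* Hypothetical layout used when MUX_MODE is not negotiated. */
struct demo_req {
    struct demo_op_header header;
    uint8_t payload[48];        /* size chosen for the example only */
};

/* Hypothetical layout used when MUX_MODE is negotiated. */
struct demo_req_mux {
    struct demo_op_header header;
    uint8_t payload[72];        /* size chosen for the example only */
};

/* Size of the fixed request expected at the front of the buffer. */
static size_t demo_request_size(bool mux_negotiated)
{
    return mux_negotiated ? sizeof(struct demo_req_mux)
                          : sizeof(struct demo_req);
}

/*
 * Copy the header out of buf. Returns 0 on success, or -1 if buf is too
 * short to hold the whole fixed-size request for the negotiated mode.
 */
static int demo_read_header(const uint8_t *buf, size_t len,
                            bool mux_negotiated,
                            struct demo_op_header *out)
{
    if (len < demo_request_size(mux_negotiated)) {
        return -1;
    }
    memcpy(out, buf, sizeof(*out));
    return 0;
}
```

In both modes the header sits at the start of the request, so only the
length check depends on the feature bit.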

> AFAIU here you allow only requests of two sizes: one fixed size
> for VIRTIO_CRYPTO_F_MUX_MODE and one without that feature. This
> means that some requests need quite some padding between what
> you call the 'request' and the actual data on which the crypto
> operation dictated by the 'request' needs to be performed.

Yes, that's true.

> What are the benefits of this approach?
> 
We can unify the request for all algorithms, both symmetric and asymmetric,
which is very convenient when handling tens or hundreds of different algorithm
requests.


Thanks,
-Gonglei

Re: [Qemu-devel] [RFC PATCH v4 0/6] optimize various tcg_gen() functions using extract op

2017-05-12 Thread Julia Lawall



On Fri, 12 May 2017, Philippe Mathieu-Daudé wrote:

> * Changes from v3
>
> Tried to fix the wrong previous attempt...
> After getting some nice/fast pieces of advice from the Coccinelle folks, I
> tried to improve the script (not much inline documentation yet, though):
> - correctly check if this is optimizable?
> - document as Mersenne number instead of prime (Eric Blake)
> - try to write Python code instead of BASIC (Markus Elfring's advice)
> - try to reduce regex usage
> - try to match the shri(); unrelated(); andi(); pattern to optimize; I was
>   surprised to see the alpha diff Coccinelle found.
>
> This is surely not the last version of this patchset, but I think the
> generated patches are now correct, and I prefer reviewers to look at the fixed
> ones instead of the wrong ones on the ML.
> There is still a lot of work to do in the cocci script; it now seems to hang
> trying to parse "target/arm/translate.c".

Try using the arguments --debug and --show-trying.  This will help you see
what rule it is stuck on, and what function.  If the function is just very
complicated and the file is not important for transforming, you may just
want to give up on it by adding, e.g., --timeout 120.

julia


>
> * [v3] (v2 was a resend of the cocci script):
>
> In my first attempt I misunderstood the tcg_gen_extract() intrinsics, and
> Richard Henderson pointed that out.
> In this patchset the cocci script is corrected and clarified; it also prints
> how arguments are checked while running.
> Also:
> - incorrect patches have been removed. (Richard Henderson, Nikunj A Dadhania)
> - Coccinelle script licensed GPLv2+ (Eric Blake)
> - comment in each commit about how to apply the patch (Eric Blake)
> - added Acked-by for m68k (Laurent Vivier)
> - Cc: Coccinelle developers.
>
> [v1]
>
> While reviewing a commit from Aurelien Jarno where he optimized a TCG
> generator for SH-4 [1], I found the same optimization done on PPC by Nikunj A
> Dadhania a few months ago [2].
> After asking on the ML about a cocci script [3], I thought it would be easier
> to learn about Coccinelle.
>
> citing Aurelien Jarno:
> This doesn't change the generated code on x86, but optimizes it on most
> RISC architectures and makes the code simpler to read.
>
> I actually applied the script using the following command:
>
> $ docker run -v `pwd`:`pwd` -w `pwd` petersenna/coccinelle \
> --sp-file scripts/coccinelle/tcg_gen_extract.cocci \
> --macro-file scripts/cocci-macro-file.h \
> --dir target \
> --in-place
>
> Please review again! thanks.
>
> [1] http://lists.nongnu.org/archive/html/qemu-devel/2017-05/msg01466.html
> [2] http://lists.nongnu.org/archive/html/qemu-devel/2017-02/msg05211.html
> [3] http://lists.nongnu.org/archive/html/qemu-devel/2017-05/msg01499.html
>
> Philippe Mathieu-Daudé (6):
>   coccinelle: add a script to optimize tcg op using tcg_gen_extract()
>   target/alpha: optimize cvtlq() using extract op
>   target/arm: optimize rev16() using extract op
>   target/m68k: optimize bcd_flags() using extract op
>   target/ppc: optimize various functions using extract op
>   target/sparc: optimize various functions using extract op
>
>  scripts/coccinelle/tcg_gen_extract.cocci | 103 +++
>  target/alpha/translate.c                 |   3 +-
>  target/arm/translate-a64.c               |   6 +-
>  target/m68k/translate.c                  |   3 +-
>  target/ppc/translate.c                   |  21 +++
>  target/ppc/translate/vsx-impl.inc.c      |  24 +++
>  target/sparc/translate.c                 |  15 ++---
>  7 files changed, 127 insertions(+), 48 deletions(-)
>  create mode 100644 scripts/coccinelle/tcg_gen_extract.cocci
>
> --
> 2.11.0
>
>

[Qemu-devel] [PATCH RESEND v6] qga: Add support network interface statistics in guest-network-get-interfaces command

2017-05-12 Thread ZhiPeng Lu

We can get the network interface statistics inside a virtual machine with the
guest-network-get-interfaces command. This is very useful for monitoring
and analyzing network traffic.
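
For illustration, a minimal sketch of how one line of /proc/net/dev could be
parsed. The demo_* names are hypothetical and this is not the code from the
patch below; it assumes the usual Linux layout of eight receive counters
followed by eight transmit counters after the "name:" prefix.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the eight counters of interest. */
struct demo_net_stats {
    unsigned long long rx_bytes, rx_packets, rx_errs, rx_dropped;
    unsigned long long tx_bytes, tx_packets, tx_errs, tx_dropped;
};

/*
 * Parse one line of /proc/net/dev ("  eth0: 1 2 3 ...") for interface
 * `name`. Returns 0 if the line belongs to `name` and all 16 counters
 * were read, -1 otherwise.
 */
static int demo_parse_netdev_line(const char *line, const char *name,
                                  struct demo_net_stats *st)
{
    unsigned long long v[16];
    const char *colon = strchr(line, ':');
    size_t name_len = strlen(name);

    if (!colon) {
        return -1;                  /* the two header lines have no colon */
    }
    line += strspn(line, " \t");    /* skip the leading padding */
    if ((size_t)(colon - line) != name_len ||
        strncmp(line, name, name_len) != 0) {
        return -1;                  /* some other interface */
    }
    if (sscanf(colon + 1,
               "%llu %llu %llu %llu %llu %llu %llu %llu "
               "%llu %llu %llu %llu %llu %llu %llu %llu",
               &v[0], &v[1], &v[2], &v[3], &v[4], &v[5], &v[6], &v[7],
               &v[8], &v[9], &v[10], &v[11], &v[12], &v[13], &v[14],
               &v[15]) != 16) {
        return -1;
    }
    /* Receive columns 0..3 and transmit columns 8..11 are the ones kept. */
    st->rx_bytes = v[0];
    st->rx_packets = v[1];
    st->rx_errs = v[2];
    st->rx_dropped = v[3];
    st->tx_bytes = v[8];
    st->tx_packets = v[9];
    st->tx_errs = v[10];
    st->tx_dropped = v[11];
    return 0;
}
```

Comparing the full name length, not just a prefix, keeps "eth" from matching
a line for "eth0".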

Signed-off-by: ZhiPeng Lu 
Signed-off-by: Daniel P. Berrange 
---
 qga/commands-posix.c | 80 +++-
 qga/qapi-schema.json | 38 -
 2 files changed, 116 insertions(+), 2 deletions(-)

diff --git a/qga/commands-posix.c b/qga/commands-posix.c
index 915df9e..233b024 100644
--- a/qga/commands-posix.c
+++ b/qga/commands-posix.c
@@ -1638,6 +1638,73 @@ guest_find_interface(GuestNetworkInterfaceList *head,
 return head;
 }
 
+
+static int str_trim_off(const char *s, int off, int lmt)
+{
+for (; off < lmt; ++off) {
+if (!isspace(s[off])) {
+break;
+}
+}
+return off;
+}
+
+static int guest_get_network_stats(const char *name,
+   GuestNetworkInterfaceStat *stats)
+{
+int name_len;
+char const *devinfo = "/proc/net/dev";
+FILE *fp;
+char *line = NULL, *colon;
+size_t n;
+fp = fopen(devinfo, "r");
+if (!fp) {
+return -1;
+}
+name_len = strlen(name);
+while (getline(&line, &n, fp) != -1) {
+long long dummy;
+long long rx_bytes;
+long long rx_packets;
+long long rx_errs;
+long long rx_dropped;
+long long tx_bytes;
+long long tx_packets;
+long long tx_errs;
+long long tx_dropped;
+int trim_off;
+colon = strchr(line, ':');
+if (!colon) {
+continue;
+}
+trim_off = str_trim_off(line, 0, strlen(line));
+if (colon - name_len - trim_off == line &&
+   strncmp(line + trim_off, name, colon - line - trim_off) == 0) {
+if (sscanf(colon + 1,
+"%lld %lld %lld %lld %lld %lld %lld %lld %lld %lld %lld %lld %lld %lld %lld %lld",
+  &rx_bytes, &rx_packets, &rx_errs, &rx_dropped,
+  &dummy, &dummy, &dummy, &dummy,
+  &tx_bytes, &tx_packets, &tx_errs, &tx_dropped,
+  &dummy, &dummy, &dummy, &dummy) != 16) {
+continue;
+}
+stats->rx_bytes = rx_bytes;
+stats->rx_packets = rx_packets;
+stats->rx_errs = rx_errs;
+stats->rx_dropped = rx_dropped;
+stats->tx_bytes = tx_bytes;
+stats->tx_packets = tx_packets;
+stats->tx_errs = tx_errs;
+stats->tx_dropped = tx_dropped;
+fclose(fp);
+return 0;
+}
+}
+fclose(fp);
+g_debug("/proc/net/dev: Interface not found");
+return -1;
+}
+
 /*
  * Build information about guest interfaces
  */
@@ -1654,6 +1721,7 @@ GuestNetworkInterfaceList *qmp_guest_network_get_interfaces(Error **errp)
 for (ifa = ifap; ifa; ifa = ifa->ifa_next) {
 GuestNetworkInterfaceList *info;
 GuestIpAddressList **address_list = NULL, *address_item = NULL;
+GuestNetworkInterfaceStat  *interface_stat = NULL;
 char addr4[INET_ADDRSTRLEN];
 char addr6[INET6_ADDRSTRLEN];
 int sock;
@@ -1773,7 +1841,17 @@ GuestNetworkInterfaceList *qmp_guest_network_get_interfaces(Error **errp)
 
 info->value->has_ip_addresses = true;
 
-
+if (!info->value->has_statistics) {
+interface_stat = g_malloc0(sizeof(*interface_stat));
+if (guest_get_network_stats(info->value->name,
+interface_stat) == -1) {
+info->value->has_statistics = false;
+g_free(interface_stat);
+} else {
+info->value->statistics = interface_stat;
+info->value->has_statistics = true;
+}
+}
 }
 
 freeifaddrs(ifap);
diff --git a/qga/qapi-schema.json b/qga/qapi-schema.json
index a02dbf2..948219b 100644
--- a/qga/qapi-schema.json
+++ b/qga/qapi-schema.json
@@ -635,6 +635,38 @@
'prefix': 'int'} }
 
 ##
+# @GuestNetworkInterfaceStat:
+#
+# @rx-bytes: total bytes received
+#
+# @rx-packets: total packets received
+#
+# @rx-errs: bad packets received
+#
+# @rx-dropped: receiver dropped packets
+#
+# @tx-bytes: total bytes transmitted
+#
+# @tx-packets: total packets transmitted
+#
+# @tx-errs: packet transmit problems
+#
+# @tx-dropped: dropped packets transmitted
+#
+# Since: 2.10
+##
+{ 'struct': 'GuestNetworkInterfaceStat',
+  'data': {'rx-bytes': 'uint64',
+'rx-packets': 'uint64',
+'rx-errs': 'uint64',
+'rx-dropped': 'uint64',
+'tx-bytes': 'uint64',
+'tx-packets': 'uint64',
+'tx-errs': 'uint64',
+'tx-dropped': 'uint64'
+   } }
+
+##
 # @GuestNetworkInterface:
 #
 # @name: The name of interface for which info are being delivered
@@ -643,12 +675,16 @@
 #
 # @ip-addresses: List of addresses assigned to @name
 #
+# @statistics: various statistic counters related to @name
+#

[Qemu-devel] [PATCH] ivshmem-server: Detect and use if there is required -lrt linking

2017-05-12 Thread Kamil Rytarowski

ivshmem-server makes use of the POSIX shared memory object interfaces.
On NetBSD these are provided by -lrt (POSIX Real-time Library).
Add a ./configure check for whether shm_open() needs -lrt linking
and, if so, use it. Introduce a new configure-generated variable, LIBS_SHMLIB.

This fixes a build issue on NetBSD.

Signed-off-by: Kamil Rytarowski 
---
 Makefile  |  1 +
 configure | 20 
 2 files changed, 21 insertions(+)

diff --git a/Makefile b/Makefile
index 31d41a7eae..3248cb53d7 100644
--- a/Makefile
+++ b/Makefile
@@ -473,6 +473,7 @@ ivshmem-client$(EXESUF): $(ivshmem-client-obj-y) $(COMMON_LDADDS)
$(call LINK, $^)
 ivshmem-server$(EXESUF): $(ivshmem-server-obj-y) $(COMMON_LDADDS)
$(call LINK, $^)
+ivshmem-server$(EXESUF): LIBS += $(LIBS_SHMLIB)
 
 module_block.h: $(SRC_PATH)/scripts/modules/module_block.py config-host.mak
$(call quiet-command,$(PYTHON) $< $@ \
diff --git a/configure b/configure
index 7c020c076b..50c3aee746 100755
--- a/configure
+++ b/configure
@@ -179,6 +179,7 @@ audio_pt_int=""
 audio_win_int=""
 cc_i386=i386-pc-linux-gnu-gcc
 libs_qga=""
+libs_shmlib=""
 debug_info="yes"
 stack_protector=""
 
@@ -4133,6 +4134,24 @@ elif compile_prog "" "$pthread_lib -lrt" ; then
   libs_qga="$libs_qga -lrt"
 fi
 
+##
+# Do we need librt for shm_open()
+cat > $TMPC <<EOF
+#include 
+#include 
+#include 
+int main(void) {
+  return shm_open(NULL, O_RDWR, 0644);
+}
+EOF
+
+if compile_prog "" "" ; then
+  :
+elif compile_prog "" "-lrt" ; then
+  libs_shmlib="$libs_shmlib -lrt"
+fi
+
 if test "$darwin" != "yes" -a "$mingw32" != "yes" -a "$solaris" != yes -a \
 "$aix" != "yes" -a "$haiku" != "yes" ; then
 libs_softmmu="-lutil $libs_softmmu"
@@ -5949,6 +5968,7 @@ echo "EXESUF=$EXESUF" >> $config_host_mak
 echo "DSOSUF=$DSOSUF" >> $config_host_mak
 echo "LDFLAGS_SHARED=$LDFLAGS_SHARED" >> $config_host_mak
 echo "LIBS_QGA+=$libs_qga" >> $config_host_mak
+echo "LIBS_SHMLIB+=$libs_shmlib" >> $config_host_mak
 echo "TASN1_LIBS=$tasn1_libs" >> $config_host_mak
 echo "TASN1_CFLAGS=$tasn1_cflags" >> $config_host_mak
 echo "POD2MAN=$POD2MAN" >> $config_host_mak
-- 
2.12.2

[Qemu-devel] [PATCH] tcg: optimize gen_extr_i64_i32()

2017-05-12 Thread Philippe Mathieu-Daudé

Inspired by Richard Henderson's comment:

http://lists.nongnu.org/archive/html/qemu-devel/2017-05/msg02277.html

Patch applied mechanically with this coccinelle semantic patch:

@@
expression lo, hi,arg;
@@
-tcg_gen_extrl_i64_i32(lo, arg);
-tcg_gen_extrh_i64_i32(hi, arg);
+tcg_gen_extr_i64_i32(lo, hi, arg);
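
For readers unfamiliar with these ops, a plain-integer model of what the rule
relies on; this is only a sketch with hypothetical demo_* helpers, not the
TCG code. One caveat worth checking when a rule like this is applied
mechanically: it also matches the two calls inside the definition of the
combined helper itself, where the replacement would make the function call
itself.

```c
#include <stdint.h>

/* Model of the "low half" extraction: bits 0..31 of arg. */
static uint32_t demo_extrl(uint64_t arg)
{
    return (uint32_t)arg;
}

/* Model of the "high half" extraction: bits 32..63 of arg. */
static uint32_t demo_extrh(uint64_t arg)
{
    return (uint32_t)(arg >> 32);
}

/*
 * Combined form: both halves in one call. Its body must keep using the
 * two single-half helpers, so it cannot be rewritten by the same rule.
 */
static void demo_extr(uint32_t *lo, uint32_t *hi, uint64_t arg)
{
    *lo = demo_extrl(arg);
    *hi = demo_extrh(arg);
}
```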

Signed-off-by: Philippe Mathieu-Daudé 
---
 tcg/tcg-op.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index 6b1f41500c..f3d556c21a 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -2562,8 +2562,7 @@ void tcg_gen_extr_i64_i32(TCGv_i32 lo, TCGv_i32 hi, TCGv_i64 arg)
 tcg_gen_mov_i32(lo, TCGV_LOW(arg));
 tcg_gen_mov_i32(hi, TCGV_HIGH(arg));
 } else {
-tcg_gen_extrl_i64_i32(lo, arg);
-tcg_gen_extrh_i64_i32(hi, arg);
+tcg_gen_extr_i64_i32(lo, hi, arg);
 }
 }
 
-- 
2.11.0

Re: [Qemu-devel] [PATCH v4 6/6] target/sparc: optimize various functions using extract op

2017-05-12 Thread Richard Henderson


On 05/12/2017 04:38 PM, Philippe Mathieu-Daudé wrote:

Patch created mechanically using Coccinelle script via:

 $ spatch --macro-file scripts/cocci-macro-file.h --in-place \
 --sp-file scripts/coccinelle/tcg_gen_extract.cocci --dir target

Signed-off-by: Philippe Mathieu-Daudé 
---
  target/sparc/translate.c | 15 +--
  1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index aa6734d54e..67a83b77cc 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -380,29 +380,25 @@ static inline void gen_goto_tb(DisasContext *s, int tb_num,
  static inline void gen_mov_reg_N(TCGv reg, TCGv_i32 src)
  {
  tcg_gen_extu_i32_tl(reg, src);
-tcg_gen_shri_tl(reg, reg, PSR_NEG_SHIFT);
-tcg_gen_andi_tl(reg, reg, 0x1);
+tcg_gen_extract_tl(reg, reg, PSR_NEG_SHIFT, 1);
  }
  
  static inline void gen_mov_reg_Z(TCGv reg, TCGv_i32 src)

  {
  tcg_gen_extu_i32_tl(reg, src);
-tcg_gen_shri_tl(reg, reg, PSR_ZERO_SHIFT);
-tcg_gen_andi_tl(reg, reg, 0x1);
+tcg_gen_extract_tl(reg, reg, PSR_ZERO_SHIFT, 1);
  }
  
  static inline void gen_mov_reg_V(TCGv reg, TCGv_i32 src)

  {
  tcg_gen_extu_i32_tl(reg, src);
-tcg_gen_shri_tl(reg, reg, PSR_OVF_SHIFT);
-tcg_gen_andi_tl(reg, reg, 0x1);
+tcg_gen_extract_tl(reg, reg, PSR_OVF_SHIFT, 1);
  }
  
  static inline void gen_mov_reg_C(TCGv reg, TCGv_i32 src)

  {
  tcg_gen_extu_i32_tl(reg, src);
-tcg_gen_shri_tl(reg, reg, PSR_CARRY_SHIFT);
-tcg_gen_andi_tl(reg, reg, 0x1);
+tcg_gen_extract_tl(reg, reg, PSR_CARRY_SHIFT, 1);
  }
  


These ones get a

Reviewed-by: Richard Henderson 
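
For reference, the equivalence these hunks rely on can be modeled on plain
integers. This is only a sketch with hypothetical demo_* helpers, not the TCG
implementation: a right shift followed by a mask of len low bits selects the
same field as an extract at (pos, len).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model of an unsigned bit-field extract: `len` bits of `value`
 * starting at bit `pos`. The field must lie inside the 32-bit word.
 */
static uint32_t demo_extract32(uint32_t value, unsigned pos, unsigned len)
{
    assert(pos < 32 && len > 0 && len <= 32 - pos);
    return (value >> pos) & (UINT32_MAX >> (32 - len));
}

/* The two-step form being replaced: shift right, then mask. */
static uint32_t demo_shift_and_mask(uint32_t value, unsigned shift,
                                    uint32_t mask)
{
    return (value >> shift) & mask;
}
```

With a mask of 0x1 the matching length is 1, which is why each of the four
flag helpers above ends up with a final argument of 1.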


  static inline void gen_op_add_cc(TCGv dst, TCGv src1, TCGv src2)
@@ -638,8 +634,7 @@ static inline void gen_op_mulscc(TCGv dst, TCGv src1, TCGv src2)
  // env->y = (b2 << 31) | (env->y >> 1);
  tcg_gen_andi_tl(r_temp, cpu_cc_src, 0x1);
  tcg_gen_shli_tl(r_temp, r_temp, 31);
-tcg_gen_shri_tl(t0, cpu_y, 1);
-tcg_gen_andi_tl(t0, t0, 0x7fffffff);
+tcg_gen_extract_tl(t0, cpu_y, 1, 31);
  tcg_gen_or_tl(t0, t0, r_temp);
  tcg_gen_andi_tl(cpu_y, t0, 0xffffffff);


But this should use

  tcg_gen_extract_tl(cpu_y, cpu_y, 1, 31);
  tcg_gen_deposit_tl(cpu_y, cpu_y, cpu_cc_src, 31, 1);
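
The same computation modeled on plain 32-bit integers, as a sketch only (the
demo_* helpers are hypothetical, and target_ulong may be wider than 32 bits
in the real code): taking bits 1..31 of y and then depositing bit 0 of the
source at position 31 matches the comment
"env->y = (b2 << 31) | (env->y >> 1)".

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model of a deposit: replace `len` bits of `base` starting at bit `pos`
 * with the low `len` bits of `field`; all other bits of `base` are kept.
 */
static uint32_t demo_deposit32(uint32_t base, uint32_t field,
                               unsigned pos, unsigned len)
{
    uint32_t mask;

    assert(pos < 32 && len > 0 && len <= 32 - pos);
    mask = (UINT32_MAX >> (32 - len)) << pos;
    return (base & ~mask) | ((field << pos) & mask);
}

/* y' = (b2 << 31) | (y >> 1), where b2 is bit 0 of src. */
static uint32_t demo_mulscc_y(uint32_t y, uint32_t src)
{
    uint32_t t = (y >> 1) & 0x7fffffffu;    /* extract(y, 1, 31) */

    return demo_deposit32(t, src, 31, 1);   /* deposit(t, src, 31, 1) */
}
```

Because the deposit only reads the low bit of its source, no separate masking
of the source is needed before it.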


r~

Re: [Qemu-devel] [PATCH v4 4/6] target/m68k: optimize bcd_flags() using extract op

2017-05-12 Thread Richard Henderson


On 05/12/2017 04:38 PM, Philippe Mathieu-Daudé wrote:

Patch created mechanically using Coccinelle script via:

 $ spatch --macro-file scripts/cocci-macro-file.h --in-place \
 --sp-file scripts/coccinelle/tcg_gen_extract.cocci --dir target

Signed-off-by: Philippe Mathieu-Daudé
Acked-by: Laurent Vivier
---
  target/m68k/translate.c | 3 +--
  1 file changed, 1 insertion(+), 2 deletions(-)



Reviewed-by: Richard Henderson 


r~

Re: [Qemu-devel] [PATCH v4 5/6] target/ppc: optimize various functions using extract op

2017-05-12 Thread Richard Henderson


On 05/12/2017 04:38 PM, Philippe Mathieu-Daudé wrote:

Patch created mechanically using Coccinelle script via:

 $ spatch --macro-file scripts/cocci-macro-file.h --in-place \
 --sp-file scripts/coccinelle/tcg_gen_extract.cocci --dir target

Signed-off-by: Philippe Mathieu-Daudé
---
  target/ppc/translate.c  | 21 +++--
  target/ppc/translate/vsx-impl.inc.c | 24 
  2 files changed, 15 insertions(+), 30 deletions(-)


Reviewed-by: Richard Henderson 


r~

Re: [Qemu-devel] [PATCH v4 2/6] target/alpha: optimize cvtlq() using extract op

2017-05-12 Thread Richard Henderson


On 05/12/2017 04:38 PM, Philippe Mathieu-Daudé wrote:

Patch created mechanically using Coccinelle script via:

 $ spatch --macro-file scripts/cocci-macro-file.h --in-place \
 --sp-file scripts/coccinelle/tcg_gen_extract.cocci --dir target

Signed-off-by: Philippe Mathieu-Daudé 
---
  target/alpha/translate.c | 3 +--
  1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/target/alpha/translate.c b/target/alpha/translate.c
index df5d695344..531af4f5b8 100644
--- a/target/alpha/translate.c
+++ b/target/alpha/translate.c
@@ -747,9 +747,8 @@ static void gen_cvtlq(TCGv vc, TCGv vb)
  /* The arithmetic right shift here, plus the sign-extended mask below
 yields a sign-extended result without an explicit ext32s_i64.  */
  tcg_gen_sari_i64(tmp, vb, 32);
-tcg_gen_shri_i64(vc, vb, 29);
+tcg_gen_extract_i64(vc, vb, 29, 30);
  tcg_gen_andi_i64(tmp, tmp, (int32_t)0xc0000000);
-tcg_gen_andi_i64(vc, vc, 0x3fffffff);
  tcg_gen_or_i64(vc, vc, tmp);


While this is accurate, looking at the broader context I think it would be 
better to use a deposit operation for this case.


  tcg_gen_shri_i64(tmp, vb, 29);
  tcg_gen_sari_i64(vc, vb, 32);
  tcg_gen_deposit_i64(vc, vc, tmp, 0, 30);


r~

Re: [Qemu-devel] [PATCH 6/6] spec/vhost-user spec: Add IOMMU support

2017-05-12 Thread Michael S. Tsirkin

On Fri, May 12, 2017 at 04:21:58PM +0200, Maxime Coquelin wrote:
> 
> 
> On 05/11/2017 08:25 PM, Michael S. Tsirkin wrote:
> > On Thu, May 11, 2017 at 02:32:46PM +0200, Maxime Coquelin wrote:
> > > This patch specifies and implements the master/slave communication
> > > to support device IOTLB in slave.
> > > 
> > > The vhost_iotlb_msg structure introduced for kernel backends is
> > > re-used, making the design close between the two backends.
> > > 
> > > An exception is the use of the secondary channel to enable the
> > > slave to send IOTLB miss requests to the master.
> > > 
> > > Signed-off-by: Maxime Coquelin 
> > > ---
> > >   docs/specs/vhost-user.txt | 75 
> > > +++
> > >   hw/virtio/vhost-user.c| 31 
> > >   2 files changed, 106 insertions(+)
> > > 
> > > diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
> > > index 5fa7016..4a1f0c3 100644
> > > --- a/docs/specs/vhost-user.txt
> > > +++ b/docs/specs/vhost-user.txt
> > > @@ -97,6 +97,23 @@ Depending on the request type, payload can be:
> > >  log offset: offset from start of supplied file descriptor
> > >  where logging starts (i.e. where guest address 0 would be logged)
> > > + * An IOTLB message
> > > +   -
> > > +   | iova | size | user address | permissions flags | type |
> > > +   -
> > > +
> > > +   IOVA: a 64-bit guest I/O virtual address
> > 
> > guest -> VM
> 
> Ok.
> 
> > 
> > > +   Size: a 64-bit size
> > 
> > How do you specify "all memory"? give special meaning to size 0?
> 
> Good point, it does not support all memory currently.
> It is not vhost-user specific, but general to the vhost implementation.

But iommu needs it to support passthrough.

> 
> > > +   User address: a 64-bit user address
> > > +   Permissions flags: a 8-bit bit field:
> > > +- Bit 0: Read access
> > > +- Bit 1: Write access
> > 
> > Can both bits be set? Can none?
> 
> Both. I will change it by listing values directly:
>  - 0 : No access
>  - 1 : Read
>  - 2 : Write
>  - 3 : Read Write
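
As a reading aid, the message described in this hunk could be modeled roughly
as below. This is a sketch under assumptions: the demo_* names and helper
functions are hypothetical, the field order simply follows the table in the
quoted patch, and the numeric values are the ones listed in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Permission values as listed above (bit 0 = read, bit 1 = write). */
enum demo_iotlb_perm {
    DEMO_IOTLB_PERM_NONE = 0,
    DEMO_IOTLB_PERM_RO   = 1,
    DEMO_IOTLB_PERM_WO   = 2,
    DEMO_IOTLB_PERM_RW   = 3,
};

/* Message types as listed in the quoted spec text. */
enum demo_iotlb_type {
    DEMO_IOTLB_MISS        = 1,
    DEMO_IOTLB_UPDATE      = 2,
    DEMO_IOTLB_INVALIDATE  = 3,
    DEMO_IOTLB_ACCESS_FAIL = 4,
};

/* Hypothetical in-memory form of the IOTLB message. */
struct demo_iotlb_msg {
    uint64_t iova;   /* I/O virtual address */
    uint64_t size;   /* length of the range */
    uint64_t uaddr;  /* user virtual address */
    uint8_t perm;    /* one of demo_iotlb_perm */
    uint8_t type;    /* one of demo_iotlb_type */
};

/* Update events carry the address, size, user address and permissions. */
static struct demo_iotlb_msg demo_iotlb_update(uint64_t iova, uint64_t size,
                                               uint64_t uaddr, uint8_t perm)
{
    struct demo_iotlb_msg m = { iova, size, uaddr, perm, DEMO_IOTLB_UPDATE };
    return m;
}

/* Invalidation events only need the I/O virtual address and the size. */
static struct demo_iotlb_msg demo_iotlb_invalidate(uint64_t iova,
                                                   uint64_t size)
{
    struct demo_iotlb_msg m = { iova, size, 0, DEMO_IOTLB_PERM_NONE,
                                DEMO_IOTLB_INVALIDATE };
    return m;
}

/* True if the entry grants every access bit in `wanted`. */
static bool demo_iotlb_allows(const struct demo_iotlb_msg *m, uint8_t wanted)
{
    return (m->perm & wanted) == wanted;
}
```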
> 
> > > +   Type: a 8-bit IOTLB message type:
> > > +- 1: IOTLB miss
> > > +- 2: IOTLB update
> > > +- 3: IOTLB invalidate
> > > +- 4: IOTLB access fail
> > > +
> > >   In QEMU the vhost-user message is implemented with the following struct:
> > >   typedef struct VhostUserMsg {
> > > @@ -109,6 +126,7 @@ typedef struct VhostUserMsg {
> > >   struct vhost_vring_addr addr;
> > >   VhostUserMemory memory;
> > >   VhostUserLog log;
> > > +struct vhost_iotlb_msg iotlb;
> > >   };
> > >   } QEMU_PACKED VhostUserMsg;
> > > @@ -253,6 +271,31 @@ Once the source has finished migration, rings will be stopped by
> > >   the source. No further update must be done before rings are
> > >   restarted.
> > > +IOMMU support
> > > +-
> > > +
> > > +When the VIRTIO_F_IOMMU_PLATFORM feature has been negotiated, the master has
> > > +to send IOTLB entries update & invalidation by sending VHOST_USER_IOTLB_MSG
> > > +requests to the slave with a struct vhost_iotlb_msg payload.
> > 
> > Always? This seems a bit strange since iommu can be enabled/disabled
> > dynamically.
> Ok, what about:
> 
> When the VIRTIO_F_IOMMU_PLATFORM feature has been negotiated and the IOMMU
> is enabled, the master sends IOTLB entry updates & invalidations via
> VHOST_USER_IOTLB_MSG requests to the slave with a struct vhost_iotlb_msg
> payload.
> 
> 
> > Closing channel seems like a wrong thing to do for this.
> 
> Sorry, I'm not sure I get your comment.

What happens when guest disables the IOMMU?

> > > For update events,
> > > +the iotlb payload has to be filled with the update message type (2), the I/O
> > > +virtual address, the size, the user virtual address, and the permissions
> > > +flags. For invalidation events, the iotlb payload has to be filled with the
> > > +invalidation message type (3), the I/O virtual address and the size. On
> > > +success, the slave is expected to reply with a zero payload, non-zero
> > > +otherwise.
> > > +
> > > +When the VHOST_USER_PROTOCOL_F_SLAVE_REQ is supported by the slave, and the
> > > +master initiated the slave to master communication channel using the
> > > +VHOST_USER_SET_SLAVE_REQ_FD request, the slave can send IOTLB miss and access
> > > +failure events by sending VHOST_USER_SLAVE_IOTLB_MSG requests to the master
> > > +with a struct vhost_iotlb_msg payload. For miss events, the iotlb payload has
> > > +to be filled with the miss message type (1), the I/O virtual address and the
> > > +permissions flags. For access failure event, the iotlb payload has to be
> > > +filled with the access failure message type (4), the I/O virtual address and
> > > +the permissions flags. For synchronization purpose,

Re: [Qemu-devel] [PATCH] target/i386: enable A20 automatically in system management mode

2017-05-12 Thread Kevin O'Connor

On Fri, May 12, 2017 at 11:19:00PM +, Xu, Anthony wrote:
> > SeaBIOS defaults to enabling A20 and it's a rare beast that disables
> > it.  One could change x86.h:set_a20 and romlayout.S:transition32 to
> > only issue the outb() if the inb() indicates a change is needed.  That
> > would likely eliminate half the accesses.
> 
> The 350 port 92 accesses are for write operations only.
> If the inb() is included, it would be 700, and every time there actually is a
> change. To be precise, it is about 175 switches from 32 bit to 16 bit, then
> back to 32 bit.
> call16 is called 175 times during SeaBIOS boot without any option ROM;
> it would be more if some option ROMs are included.
> 
> 
> I think A20 is disabled by default in SeaBIOS.

I don't know why you think that.  One can check with:

--- a/src/stacks.c
+++ b/src/stacks.c
@@ -99,6 +99,8 @@ call32_post(void)
 if (cr0_caching)
 cr0_mask(CR0_CD|CR0_NW, cr0_caching);
 }
+if (!get_a20())
+dprintf(1, "a20=0\n");
 
 // Restore cmos index register
 outb(GET_LOW(Call16Data.cmosindex), PORT_CMOS_INDEX);

With the above I only see a handful of cases where SeaBIOS has to
restore a20 to a disabled state.

The handful I do see are due to cases where yield() is called prior to
option rom initialization.  Those handful are eliminated for me with
the following fix:

--- a/src/stacks.c
+++ b/src/stacks.c
@@ -496,6 +496,7 @@ void
 thread_setup(void)
 {
 CanInterrupt = 1;
+call16_override(1);
 if (! CONFIG_THREADS)
 return;
 ThreadControl = romfile_loadint("etc/threads", 1);

What OS / bootloader are you running?

-Kevin

Re: [Qemu-devel] [PATCH v3 08/15] target/sh4: fold ctx->bstate = BS_BRANCH into gen_conditional_jump

2017-05-12 Thread Philippe Mathieu-Daudé


On 05/10/2017 03:26 PM, Aurelien Jarno wrote:

Reviewed-by: Richard Henderson 
Signed-off-by: Aurelien Jarno 


Reviewed-by: Philippe Mathieu-Daudé 


---
 target/sh4/translate.c | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/target/sh4/translate.c b/target/sh4/translate.c
index 8cee7d333f..a4c7a0895b 100644
--- a/target/sh4/translate.c
+++ b/target/sh4/translate.c
@@ -279,6 +279,7 @@ static void gen_conditional_jump(DisasContext * ctx,
 gen_goto_tb(ctx, 0, ifnott);
 gen_set_label(l1);
 gen_goto_tb(ctx, 1, ift);
+ctx->bstate = BS_BRANCH;
 }

 /* Delayed conditional jump (bt or bf) */
@@ -1158,9 +1159,7 @@ static void _decode_opc(DisasContext * ctx)
return;
 case 0x8b00:   /* bf label */
CHECK_NOT_DELAY_SLOT
-   gen_conditional_jump(ctx, ctx->pc + 2,
-ctx->pc + 4 + B7_0s * 2);
-   ctx->bstate = BS_BRANCH;
+gen_conditional_jump(ctx, ctx->pc + 2, ctx->pc + 4 + B7_0s * 2);
return;
 case 0x8f00:   /* bf/s label */
CHECK_NOT_DELAY_SLOT
@@ -1170,9 +1169,7 @@ static void _decode_opc(DisasContext * ctx)
return;
 case 0x8900:   /* bt label */
CHECK_NOT_DELAY_SLOT
-   gen_conditional_jump(ctx, ctx->pc + 4 + B7_0s * 2,
-ctx->pc + 2);
-   ctx->bstate = BS_BRANCH;
+gen_conditional_jump(ctx, ctx->pc + 4 + B7_0s * 2, ctx->pc + 2);
return;
 case 0x8d00:   /* bt/s label */
CHECK_NOT_DELAY_SLOT

Re: [Qemu-devel] [PATCH v3 1/4] ACPI: Add APEI GHES Table Generation support

2017-05-12 Thread Michael S. Tsirkin

On Sun, Apr 30, 2017 at 01:35:03PM +0800, Dongjiu Geng wrote:
> This implements the APEI GHES table by passing the error CPER info
> to the guest via a fw_cfg_blob. After a CPER info is added, an
> SEA/SEI exception will be injected into the guest OS.
> 
> Below is the table layout; the max number of error sources is 11,
> which is classified by notification type.
> 
> etc/acpi/tables          etc/hardware_errors
> ===============   =========================================
>                   +-----------+
> +-------------+   | address   |        +-> +--------------+
> | HEST        |   | registers |        |   | Error Status |
> | +--------+  |   | +---------+        |   | Data Block 1 |
> | | GHES1  | ---> | |address1 | -------+   | +------------+
> | | GHES2  | ---> | |address2 | -----+     | |  CPER      |
> | | GHES3  | ---> | |address3 | ---+ |     | |  CPER      |
> | | ...    | ---> | | ...     |    | |     | |  CPER      |
> | | GHES10 | ---> | |address10| -+ | |     | |  CPER      |
> +-+--------+--+   +-+---------+  | | |     +-+------------+
>                                  | | |
>                                  | | +---> +--------------+
>                                  | |       | Error Status |
>                                  | |       | Data Block 2 |
>                                  | |       | +------------+
>                                  | |       | |  CPER      |
>                                  | |       | |  CPER      |
>                                  | |       +-+------------+
>                                  | |
>                                  | +-----> +--------------+
>                                  |         | Error Status |
>                                  |         | Data Block 3 |
>                                  |         | +------------+
>                                  |         | |  CPER      |
>                                  |         +-+------------+
>                                  |           ...
>                                  +-------> +--------------+
>                                            | Error Status |
>                                            | Data Block 10|
>                                            | +------------+
>                                            | |  CPER      |
>                                            | |  CPER      |
>                                            | |  CPER      |
>                                            +-+------------+
> 
> Signed-off-by: Dongjiu Geng 
> ---
>  default-configs/arm-softmmu.mak |   1 +
>  hw/acpi/Makefile.objs   |   1 +
>  hw/acpi/aml-build.c |   2 +
>  hw/acpi/hest_ghes.c | 203 +++
>  hw/arm/virt-acpi-build.c|   6 ++
>  include/hw/acpi/acpi-defs.h | 227 
> 
>  include/hw/acpi/aml-build.h |   1 +
>  include/hw/acpi/hest_ghes.h |  43 
>  8 files changed, 484 insertions(+)
>  create mode 100644 hw/acpi/hest_ghes.c
>  create mode 100644 include/hw/acpi/hest_ghes.h
> 
> diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
> index 1e3bd2b..d5f1552 100644
> --- a/default-configs/arm-softmmu.mak
> +++ b/default-configs/arm-softmmu.mak
> @@ -121,3 +121,4 @@ CONFIG_ACPI=y
>  CONFIG_SMBIOS=y
>  CONFIG_ASPEED_SOC=y
>  CONFIG_GPIO_KEY=y
> +CONFIG_ACPI_APEI_GENERATION=y
> diff --git a/hw/acpi/Makefile.objs b/hw/acpi/Makefile.objs
> index 11c35bc..776b46e 100644
> --- a/hw/acpi/Makefile.objs
> +++ b/hw/acpi/Makefile.objs
> @@ -6,6 +6,7 @@ common-obj-$(CONFIG_ACPI_MEMORY_HOTPLUG) += memory_hotplug.o
>  common-obj-$(CONFIG_ACPI_CPU_HOTPLUG) += cpu.o
>  common-obj-$(CONFIG_ACPI_NVDIMM) += nvdimm.o
>  common-obj-$(CONFIG_ACPI_VMGENID) += vmgenid.o
> +common-obj-$(CONFIG_ACPI_APEI_GENERATION) += hest_ghes.o
>  common-obj-$(call lnot,$(CONFIG_ACPI_X86)) += acpi-stub.o
>  
>  common-obj-y += acpi_interface.o
> diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
> index c6f2032..802b98d 100644
> --- a/hw/acpi/aml-build.c
> +++ b/hw/acpi/aml-build.c
> @@ -1560,6 +1560,7 @@ void acpi_build_tables_init(AcpiBuildTables *tables)
>  tables->table_data = g_array_new(false, true /* clear */, 1);
>  tables->tcpalog = g_array_new(false, true /* clear */, 1);
>  tables->vmgenid = g_array_new(false, true /* clear */, 1);
> +tables->hardware_errors = g_array_new(false, true /* clear */, 1);
>  tables->linker = bios_linker_loader_init();
>  }
>  
> @@ -1570,6 +1571,7 @@ void acpi_build_tables_cleanup(AcpiBuildTables *tables, 
> bool mfre)
>  g_array_free(tables->table_data, true);
>  g_array_free(tables->tcpalog, mfre);
>  g_array_free(tables->vmgenid, mfre);
> +g_array_free(tables->hardware_errors, mfre);
>  }
>  
>  /*

Re: [Qemu-devel] [PATCH v3 05/15] target/sh4: fix BS_STOP exit

2017-05-12 Thread Philippe Mathieu-Daudé


On 05/10/2017 03:26 PM, Aurelien Jarno wrote:

When stopping the translation because the state has changed, goto_tb
should not be used as it might link TB with different flags.

Reviewed-by: Richard Henderson 
Signed-off-by: Aurelien Jarno 


Reviewed-by: Philippe Mathieu-Daudé 


---
 target/sh4/translate.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/target/sh4/translate.c b/target/sh4/translate.c
index 2e29936ad8..04bc18bf7c 100644
--- a/target/sh4/translate.c
+++ b/target/sh4/translate.c
@@ -1901,8 +1901,9 @@ void gen_intermediate_code(CPUSH4State * env, struct 
TranslationBlock *tb)
 } else {
switch (ctx.bstate) {
 case BS_STOP:
-/* gen_op_interrupt_restart(); */
-/* fall through */
+tcg_gen_movi_i32(cpu_pc, ctx.pc);
+tcg_gen_exit_tb(0);
+break;
 case BS_NONE:
 if (ctx.envflags) {
 gen_store_flags(ctx.envflags);

Re: [Qemu-devel] [PATCH v3 4/5] target/ppc: using various functions using extract op

2017-05-12 Thread Philippe Mathieu-Daudé


This patch is also incorrect, please see v4.
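
For context, a sketch of the check involved, on plain integers (a
hypothetical demo_* helper, not taken from the cocci script): the last
argument of an extract op is a field length, so a mask such as 0xF has to be
converted to its bit count first, and only masks of the form 2^n - 1 qualify.

```c
#include <stdint.h>

/*
 * If `mask` has the form 2^n - 1 (its n low bits set and nothing else),
 * return n. Otherwise return 0, meaning that shift-then-mask is not a
 * plain bit-field extract and must be left alone.
 */
static unsigned demo_mask_to_len(uint64_t mask)
{
    unsigned len = 0;

    /* For a mask of n low ones, mask + 1 is a power of two (or wraps to 0). */
    if (mask == 0 || (mask & (mask + 1)) != 0) {
        return 0;
    }
    while (mask) {
        len++;
        mask >>= 1;
    }
    return len;
}
```

Applied to the masks in the hunks below, 0xF gives a length of 4, 0x7FF gives
11 and 0x7FFF gives 15, so passing the mask itself as the length is what made
those replacements wrong.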

On 05/12/2017 12:35 AM, Philippe Mathieu-Daudé wrote:

Patch created mechanically using Coccinelle script via:

$ spatch --macro-file scripts/cocci-macro-file.h --in-place \
--sp-file scripts/coccinelle/tcg_gen_extract.cocci --dir target

Signed-off-by: Philippe Mathieu-Daudé 
---

David, I did not add your Reviewed-by, as suggested by Laurent Vivier after
Nikunj A Dadhania's review.

 target/ppc/translate.c  |  9 +++--
 target/ppc/translate/vsx-impl.inc.c | 15 +--
 2 files changed, 8 insertions(+), 16 deletions(-)

diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index f40b5a1abf..64ab412bf3 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -868,8 +868,7 @@ static inline void gen_op_arith_add(DisasContext *ctx, TCGv 
ret, TCGv arg1,
 }
 tcg_gen_xor_tl(cpu_ca, t0, t1);/* bits changed w/ carry */
 tcg_temp_free(t1);
-tcg_gen_shri_tl(cpu_ca, cpu_ca, 32);   /* extract bit 32 */
-tcg_gen_andi_tl(cpu_ca, cpu_ca, 1);
+tcg_gen_extract_tl(cpu_ca, cpu_ca, 32, 1);
 if (is_isa300(ctx)) {
 tcg_gen_mov_tl(cpu_ca32, cpu_ca);
 }
@@ -1399,8 +1398,7 @@ static inline void gen_op_arith_subf(DisasContext *ctx, TCGv ret, TCGv arg1,
 tcg_temp_free(inv1);
 tcg_gen_xor_tl(cpu_ca, t0, t1); /* bits changes w/ carry */
 tcg_temp_free(t1);
-tcg_gen_shri_tl(cpu_ca, cpu_ca, 32);/* extract bit 32 */
-tcg_gen_andi_tl(cpu_ca, cpu_ca, 1);
+tcg_gen_extract_tl(cpu_ca, cpu_ca, 32, 1);
 if (is_isa300(ctx)) {
 tcg_gen_mov_tl(cpu_ca32, cpu_ca);
 }
@@ -5383,8 +5381,7 @@ static void gen_mfsri(DisasContext *ctx)
 CHK_SV;
 t0 = tcg_temp_new();
 gen_addr_reg_index(ctx, t0);
-tcg_gen_shri_tl(t0, t0, 28);
-tcg_gen_andi_tl(t0, t0, 0xF);
+tcg_gen_extract_tl(t0, t0, 28, 0xF);


WRONG


 gen_helper_load_sr(cpu_gpr[rd], cpu_env, t0);
 tcg_temp_free(t0);
 if (ra != 0 && ra != rd)
diff --git a/target/ppc/translate/vsx-impl.inc.c b/target/ppc/translate/vsx-impl.inc.c
index 7f12908029..9faffd2ddc 100644
--- a/target/ppc/translate/vsx-impl.inc.c
+++ b/target/ppc/translate/vsx-impl.inc.c
@@ -1262,8 +1262,7 @@ static void gen_xsxexpqp(DisasContext *ctx)
 gen_exception(ctx, POWERPC_EXCP_VSXU);
 return;
 }
-tcg_gen_shri_i64(xth, xbh, 48);
-tcg_gen_andi_i64(xth, xth, 0x7FFF);
+tcg_gen_extract_i64(xth, xbh, 48, 0x7FFF);


WRONG


 tcg_gen_movi_i64(xtl, 0);
 }

@@ -1448,10 +1447,8 @@ static void gen_xvxexpdp(DisasContext *ctx)
 gen_exception(ctx, POWERPC_EXCP_VSXU);
 return;
 }
-tcg_gen_shri_i64(xth, xbh, 52);
-tcg_gen_andi_i64(xth, xth, 0x7FF);
-tcg_gen_shri_i64(xtl, xbl, 52);
-tcg_gen_andi_i64(xtl, xtl, 0x7FF);
+tcg_gen_extract_i64(xth, xbh, 52, 0x7FF);
+tcg_gen_extract_i64(xtl, xbl, 52, 0x7FF);


WRONG


 }

 GEN_VSX_HELPER_2(xvxsigsp, 0x00, 0x04, 0, PPC2_ISA300)
@@ -1474,16 +1471,14 @@ static void gen_xvxsigdp(DisasContext *ctx)
 zr = tcg_const_i64(0);
 nan = tcg_const_i64(2047);

-tcg_gen_shri_i64(exp, xbh, 52);
-tcg_gen_andi_i64(exp, exp, 0x7FF);
+tcg_gen_extract_i64(exp, xbh, 52, 0x7FF);


WRONG


 tcg_gen_movi_i64(t0, 0x0010);
 tcg_gen_movcond_i64(TCG_COND_EQ, t0, exp, zr, zr, t0);
 tcg_gen_movcond_i64(TCG_COND_EQ, t0, exp, nan, zr, t0);
 tcg_gen_andi_i64(xth, xbh, 0x000F);
 tcg_gen_or_i64(xth, xth, t0);

-tcg_gen_shri_i64(exp, xbl, 52);
-tcg_gen_andi_i64(exp, exp, 0x7FF);
+tcg_gen_extract_i64(exp, xbl, 52, 0x7FF);


WRONG


 tcg_gen_movi_i64(t0, 0x0010);
 tcg_gen_movcond_i64(TCG_COND_EQ, t0, exp, zr, zr, t0);
 tcg_gen_movcond_i64(TCG_COND_EQ, t0, exp, nan, zr, t0);

[Qemu-devel] [PULL 8/9] target/s390x: fix SIGNAL PROCESSOR return value

2017-05-12 Thread Richard Henderson

From: Aurelien Jarno 

The SIGNAL PROCESSOR helper returns its value through the CC register.
set_cc_static should be called just after the helper.

Signed-off-by: Aurelien Jarno 
Message-Id: <20170509082800.10756-3-aurel...@aurel32.net>
Signed-off-by: Richard Henderson 
---
 target/s390x/translate.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/target/s390x/translate.c b/target/s390x/translate.c
index 19276cc..3a0a3ee 100644
--- a/target/s390x/translate.c
+++ b/target/s390x/translate.c
@@ -3406,6 +3406,7 @@ static ExitStatus op_sigp(DisasContext *s, DisasOps *o)
 check_privileged(s);
 potential_page_fault(s);
 gen_helper_sigp(cc_op, cpu_env, o->in2, r1, o->in1);
+set_cc_static(s);
 tcg_temp_free_i32(r1);
 return NO_EXIT;
 }
-- 
2.9.3

Re: [Qemu-devel] [PATCH v3 2/5] target/arm: optimize rev16() using extract op

2017-05-12 Thread Philippe Mathieu-Daudé


On 05/12/2017 01:50 PM, Richard Henderson wrote:

On 05/11/2017 08:35 PM, Philippe Mathieu-Daudé wrote:

-tcg_gen_shri_i64(tcg_tmp, tcg_rn, 16);
-tcg_gen_andi_i64(tcg_tmp, tcg_tmp, 0x);
+tcg_gen_extract_i64(tcg_tmp, tcg_rn, 16, 0x);


So your new script didn't work then?  This should be "..., 16, 16);".


Yeah this is wrong :(

I hope I got it in the last patchset (v4).

Thanks for the review,

Phil.
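
For reference, the mistake discussed above is passing the old AND mask where
the extract op expects a field length in bits. A minimal plain-C sketch of
that relationship follows. The names extract_bits and mask_to_len are
hypothetical and used only for illustration; this models the (ofs, len)
semantics and is not the TCG implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Reference model of an unsigned bitfield extract: return LEN bits of
 * VALUE starting at bit OFS.  The argument order (ofs, len) follows the
 * review comment; the function itself is an illustrative assumption. */
static uint64_t extract_bits(uint64_t value, unsigned ofs, unsigned len)
{
    assert(len > 0 && len <= 64 && ofs <= 64 - len);
    return (value >> ofs) & (~UINT64_C(0) >> (64 - len));
}

/* Hypothetical helper: convert a contiguous low mask such as 0xffff into
 * its bit count (16).  Returns 0 if MASK is not of the form 2^n - 1,
 * in which case a shift+and pair is not a plain extract. */
static unsigned mask_to_len(uint64_t mask)
{
    unsigned len = 0;

    if (mask == 0 || (mask & (mask + 1)) != 0) {
        return 0;
    }
    while (mask) {
        len++;
        mask >>= 1;
    }
    return len;
}
```

With these, (x >> 16) & 0xffff gives the same result as
extract_bits(x, 16, 16), which is why the last argument has to be the field
length rather than the mask.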

[Qemu-devel] [PULL 9/9] target/s390x: implement serialization in BRANCH CONDITION

2017-05-12 Thread Richard Henderson

From: Aurelien Jarno 

Signed-off-by: Aurelien Jarno 
Message-Id: <20170509082800.10756-4-aurel...@aurel32.net>
Signed-off-by: Richard Henderson 
---
 target/s390x/translate.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/target/s390x/translate.c b/target/s390x/translate.c
index 3a0a3ee..4c48c59 100644
--- a/target/s390x/translate.c
+++ b/target/s390x/translate.c
@@ -1518,6 +1518,21 @@ static ExitStatus op_bc(DisasContext *s, DisasOps *o)
 int imm = is_imm ? get_field(s->fields, i2) : 0;
 DisasCompare c;
 
+/* BCR with R2 = 0 causes no branching */
+if (have_field(s->fields, r2) && get_field(s->fields, r2) == 0) {
+if (m1 == 14) {
+/* Perform serialization */
+/* FIXME: check for fast-BCR-serialization facility */
+tcg_gen_mb(TCG_MO_ALL | TCG_BAR_SC);
+}
+if (m1 == 15) {
+/* Perform serialization */
+/* FIXME: perform checkpoint-synchronisation */
+tcg_gen_mb(TCG_MO_ALL | TCG_BAR_SC);
+}
+return NO_EXIT;
+}
+
 disas_jcc(s, , m1);
 return help_branch(s, , is_imm, imm, o->in2);
 }
-- 
2.9.3

Re: [Qemu-devel] [PATCH v3 5/5] target/sparc: optimize various functions using extract op

2017-05-12 Thread Philippe Mathieu-Daudé


This patch is incorrect, please see v4.

On 05/12/2017 12:35 AM, Philippe Mathieu-Daudé wrote:

Patch created mechanically using Coccinelle script via:

$ spatch --macro-file scripts/cocci-macro-file.h --in-place \
--sp-file scripts/coccinelle/tcg_gen_extract.cocci --dir target

Signed-off-by: Philippe Mathieu-Daudé 
---
 target/sparc/translate.c | 15 +--
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index aa6734d54e..a92b5c425c 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -380,29 +380,25 @@ static inline void gen_goto_tb(DisasContext *s, int tb_num,
 static inline void gen_mov_reg_N(TCGv reg, TCGv_i32 src)
 {
 tcg_gen_extu_i32_tl(reg, src);
-tcg_gen_shri_tl(reg, reg, PSR_NEG_SHIFT);
-tcg_gen_andi_tl(reg, reg, 0x1);
+tcg_gen_extract_tl(reg, reg, PSR_NEG_SHIFT, 0x1);
 }

 static inline void gen_mov_reg_Z(TCGv reg, TCGv_i32 src)
 {
 tcg_gen_extu_i32_tl(reg, src);
-tcg_gen_shri_tl(reg, reg, PSR_ZERO_SHIFT);
-tcg_gen_andi_tl(reg, reg, 0x1);
+tcg_gen_extract_tl(reg, reg, PSR_ZERO_SHIFT, 0x1);
 }

 static inline void gen_mov_reg_V(TCGv reg, TCGv_i32 src)
 {
 tcg_gen_extu_i32_tl(reg, src);
-tcg_gen_shri_tl(reg, reg, PSR_OVF_SHIFT);
-tcg_gen_andi_tl(reg, reg, 0x1);
+tcg_gen_extract_tl(reg, reg, PSR_OVF_SHIFT, 0x1);
 }

 static inline void gen_mov_reg_C(TCGv reg, TCGv_i32 src)
 {
 tcg_gen_extu_i32_tl(reg, src);
-tcg_gen_shri_tl(reg, reg, PSR_CARRY_SHIFT);
-tcg_gen_andi_tl(reg, reg, 0x1);
+tcg_gen_extract_tl(reg, reg, PSR_CARRY_SHIFT, 0x1);
 }

 static inline void gen_op_add_cc(TCGv dst, TCGv src1, TCGv src2)
@@ -638,8 +634,7 @@ static inline void gen_op_mulscc(TCGv dst, TCGv src1, TCGv src2)
 // env->y = (b2 << 31) | (env->y >> 1);
 tcg_gen_andi_tl(r_temp, cpu_cc_src, 0x1);
 tcg_gen_shli_tl(r_temp, r_temp, 31);
-tcg_gen_shri_tl(t0, cpu_y, 1);
-tcg_gen_andi_tl(t0, t0, 0x7fff);
+tcg_gen_extract_tl(t0, cpu_y, 1, 0x7fff);


WRONG


 tcg_gen_or_tl(t0, t0, r_temp);
 tcg_gen_andi_tl(cpu_y, t0, 0x);

[Qemu-devel] [PULL 6/9] target/s390x: Use atomic operations for LOAD AND OP

2017-05-12 Thread Richard Henderson

Reviewed-by: Aurelien Jarno 
Signed-off-by: Richard Henderson 
---
 target/s390x/insn-data.def | 20 ++--
 target/s390x/translate.c   | 78 +-
 2 files changed, 60 insertions(+), 38 deletions(-)

diff --git a/target/s390x/insn-data.def b/target/s390x/insn-data.def
index 5e5fcc5..55a7c52 100644
--- a/target/s390x/insn-data.def
+++ b/target/s390x/insn-data.def
@@ -390,20 +390,20 @@
 /* LOAD ADDRESS RELATIVE LONG */
 C(0xc000, LARL,RIL_b, Z,   0, ri2, 0, r1, mov2, 0)
 /* LOAD AND ADD */
-C(0xebf8, LAA, RSY_a, ILA, r3_32s, m2_32s_atomic, new, m2_32_r1_atomic, add, adds32)
-C(0xebe8, LAAG,RSY_a, ILA, r3, m2_64_atomic, new, m2_64_r1_atomic, add, adds64)
+D(0xebf8, LAA, RSY_a, ILA, r3_32s, a2, new, in2_r1_32, laa, adds32, MO_TESL)
+D(0xebe8, LAAG,RSY_a, ILA, r3, a2, new, in2_r1, laa, adds64, MO_TEQ)
 /* LOAD AND ADD LOGICAL */
-C(0xebfa, LAAL,RSY_a, ILA, r3_32s, m2_32s_atomic, new, m2_32_r1_atomic, add, addu32)
-C(0xebea, LAALG,   RSY_a, ILA, r3, m2_64_atomic, new, m2_64_r1_atomic, add, addu64)
+D(0xebfa, LAAL,RSY_a, ILA, r3_32u, a2, new, in2_r1_32, laa, addu32, MO_TEUL)
+D(0xebea, LAALG,   RSY_a, ILA, r3, a2, new, in2_r1, laa, addu64, MO_TEQ)
 /* LOAD AND AND */
-C(0xebf4, LAN, RSY_a, ILA, r3_32s, m2_32s_atomic, new, m2_32_r1_atomic, and, nz32)
-C(0xebe4, LANG,RSY_a, ILA, r3, m2_64_atomic, new, m2_64_r1_atomic, and, nz64)
+D(0xebf4, LAN, RSY_a, ILA, r3_32s, a2, new, in2_r1_32, lan, nz32, MO_TESL)
+D(0xebe4, LANG,RSY_a, ILA, r3, a2, new, in2_r1, lan, nz64, MO_TEQ)
 /* LOAD AND EXCLUSIVE OR */
-C(0xebf7, LAX, RSY_a, ILA, r3_32s, m2_32s_atomic, new, m2_32_r1_atomic, xor, nz32)
-C(0xebe7, LAXG,RSY_a, ILA, r3, m2_64_atomic, new, m2_64_r1_atomic, xor, nz64)
+D(0xebf7, LAX, RSY_a, ILA, r3_32s, a2, new, in2_r1_32, lax, nz32, MO_TESL)
+D(0xebe7, LAXG,RSY_a, ILA, r3, a2, new, in2_r1, lax, nz64, MO_TEQ)
 /* LOAD AND OR */
-C(0xebf6, LAO, RSY_a, ILA, r3_32s, m2_32s_atomic, new, m2_32_r1_atomic, or, nz32)
-C(0xebe6, LAOG,RSY_a, ILA, r3, m2_64_atomic, new, m2_64_r1_atomic, or, nz64)
+D(0xebf6, LAO, RSY_a, ILA, r3_32s, a2, new, in2_r1_32, lao, nz32, MO_TESL)
+D(0xebe6, LAOG,RSY_a, ILA, r3, a2, new, in2_r1, lao, nz64, MO_TEQ)
 /* LOAD AND TEST */
 C(0x1200, LTR, RR_a,  Z,   0, r2_o, 0, cond_r1r2_32, mov2, s32)
 C(0xb902, LTGR,RRE,   Z,   0, r2_o, 0, r1, mov2, s64)
diff --git a/target/s390x/translate.c b/target/s390x/translate.c
index f23b705..19276cc 100644
--- a/target/s390x/translate.c
+++ b/target/s390x/translate.c
@@ -2309,6 +2309,50 @@ static ExitStatus op_iske(DisasContext *s, DisasOps *o)
 }
 #endif
 
+static ExitStatus op_laa(DisasContext *s, DisasOps *o)
+{
+/* The real output is indeed the original value in memory;
+   recompute the addition for the computation of CC.  */
+tcg_gen_atomic_fetch_add_i64(o->in2, o->in2, o->in1, get_mem_index(s),
+ s->insn->data | MO_ALIGN);
+/* However, we need to recompute the addition for setting CC.  */
+tcg_gen_add_i64(o->out, o->in1, o->in2);
+return NO_EXIT;
+}
+
+static ExitStatus op_lan(DisasContext *s, DisasOps *o)
+{
+/* The real output is indeed the original value in memory;
+   recompute the addition for the computation of CC.  */
+tcg_gen_atomic_fetch_and_i64(o->in2, o->in2, o->in1, get_mem_index(s),
+ s->insn->data | MO_ALIGN);
+/* However, we need to recompute the operation for setting CC.  */
+tcg_gen_and_i64(o->out, o->in1, o->in2);
+return NO_EXIT;
+}
+
+static ExitStatus op_lao(DisasContext *s, DisasOps *o)
+{
+/* The real output is indeed the original value in memory;
+   recompute the addition for the computation of CC.  */
+tcg_gen_atomic_fetch_or_i64(o->in2, o->in2, o->in1, get_mem_index(s),
+s->insn->data | MO_ALIGN);
+/* However, we need to recompute the operation for setting CC.  */
+tcg_gen_or_i64(o->out, o->in1, o->in2);
+return NO_EXIT;
+}
+
+static ExitStatus op_lax(DisasContext *s, DisasOps *o)
+{
+/* The real output is indeed the original value in memory;
+   recompute the addition for the computation of CC.  */
+tcg_gen_atomic_fetch_xor_i64(o->in2, o->in2, o->in1, get_mem_index(s),
+ s->insn->data | MO_ALIGN);
+/* However, we need to recompute the operation for setting CC.  */
+tcg_gen_xor_i64(o->out, o->in1, o->in2);
+return NO_EXIT;
+}
+
 static ExitStatus op_ldeb(DisasContext *s, DisasOps *o)
 {
 gen_helper_ldeb(o->out, cpu_env, o->in2);
@@ -4483,21 +4527,17 @@ static void wout_m2_32(DisasContext *s, DisasFields *f, DisasOps *o)
 }
 #define SPEC_wout_m2_32 0
 
-static void wout_m2_32_r1_atomic(DisasContext *s, DisasFields *f, DisasOps *o)
+static void

[Qemu-devel] [PULL 7/9] target/s390x: mask the SIGP order_code using SIGP_ORDER_MASK

2017-05-12 Thread Richard Henderson

From: Aurelien Jarno 

To that end, move the definition from kvm.c to cpu.h.

Reviewed-by: Thomas Huth 
Reviewed-by: Cornelia Huck 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Aurelien Jarno 
Message-Id: <20170509082800.10756-2-aurel...@aurel32.net>
Signed-off-by: Richard Henderson 
---
 target/s390x/cpu.h | 3 +++
 target/s390x/kvm.c | 2 --
 target/s390x/misc_helper.c | 3 +--
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/target/s390x/cpu.h b/target/s390x/cpu.h
index bbed320..240b8a5 100644
--- a/target/s390x/cpu.h
+++ b/target/s390x/cpu.h
@@ -1078,6 +1078,9 @@ struct sysib_322 {
 #define SIGP_MODE_Z_ARCH_TRANS_ALL_PSW 1
 #define SIGP_MODE_Z_ARCH_TRANS_CUR_PSW 2
 
+/* SIGP order code mask corresponding to bit positions 56-63 */
+#define SIGP_ORDER_MASK 0x00ff
+
 void load_psw(CPUS390XState *env, uint64_t mask, uint64_t addr);
 int mmu_translate(CPUS390XState *env, target_ulong vaddr, int rw, uint64_t asc,
   target_ulong *raddr, int *flags, bool exc);
diff --git a/target/s390x/kvm.c b/target/s390x/kvm.c
index 1a249d8..fb10542 100644
--- a/target/s390x/kvm.c
+++ b/target/s390x/kvm.c
@@ -1764,8 +1764,6 @@ static int sigp_set_architecture(S390CPU *cpu, uint32_t param,
 return SIGP_CC_ORDER_CODE_ACCEPTED;
 }
 
-#define SIGP_ORDER_MASK 0x00ff
-
 static int handle_sigp(S390CPU *cpu, struct kvm_run *run, uint8_t ipa1)
 {
 CPUS390XState *env = >env;
diff --git a/target/s390x/misc_helper.c b/target/s390x/misc_helper.c
index bd94242..23ec52c 100644
--- a/target/s390x/misc_helper.c
+++ b/target/s390x/misc_helper.c
@@ -517,8 +517,7 @@ uint32_t HELPER(sigp)(CPUS390XState *env, uint64_t order_code, uint32_t r1,
 /* Remember: Use "R1 or R1 + 1, whichever is the odd-numbered register"
as parameter (input). Status (output) is always R1. */
 
-/* sigp contains the order code in bit positions 56-63, mask it here. */
-switch (order_code & 0xff) {
+switch (order_code & SIGP_ORDER_MASK) {
 case SIGP_SET_ARCH:
 /* switch arch */
 break;
-- 
2.9.3

[Qemu-devel] [PULL 2/9] target/s390x: Implement LOAD PROGRAM PARAMETER

2017-05-12 Thread Richard Henderson

From: Miroslav Benes 

Linux arch/s390/kernel/head(64).S uses the LPP instruction if it is
available in the facilities list provided by the stfl/stfle instructions.
This is the case for newer z/System generations and their QEMU
definitions.

The description of LPP is at
http://www-01.ibm.com/support/docview.wss?uid=isg26fcd1cc32246f4c8852574ce0044734a

Reviewed-by: Aurelien Jarno 
Signed-off-by: Miroslav Benes 
Message-Id: <20170227085353.20787-1-mbe...@suse.cz>
Signed-off-by: Richard Henderson 
---
 target/s390x/insn-data.def | 2 ++
 target/s390x/translate.c   | 9 +
 2 files changed, 11 insertions(+)

diff --git a/target/s390x/insn-data.def b/target/s390x/insn-data.def
index b6702da..43c5707 100644
--- a/target/s390x/insn-data.def
+++ b/target/s390x/insn-data.def
@@ -845,6 +845,8 @@
 /* LOAD CONTROL */
 C(0xb700, LCTL,RS_a,  Z,   0, a2, 0, 0, lctl, 0)
 C(0xeb2f, LCTLG,   RSY_a, Z,   0, a2, 0, 0, lctlg, 0)
+/* LOAD PROGRAM PARAMETER */
+C(0xb280, LPP, S,   LPP,   0, m2_64, 0, 0, lpp, 0)
 /* LOAD PSW */
 C(0x8200, LPSW,S, Z,   0, a2, 0, 0, lpsw, 0)
 /* LOAD PSW EXTENDED */
diff --git a/target/s390x/translate.c b/target/s390x/translate.c
index 69940e3..2b66a4e 100644
--- a/target/s390x/translate.c
+++ b/target/s390x/translate.c
@@ -1194,6 +1194,7 @@ typedef enum DisasFacility {
 FAC_SCF,/* store clock fast */
 FAC_SFLE,   /* store facility list extended */
 FAC_ILA,/* interlocked access facility 1 */
+FAC_LPP,/* load-program-parameter */
 } DisasFacility;
 
 struct DisasInsn {
@@ -2567,6 +2568,14 @@ static ExitStatus op_lra(DisasContext *s, DisasOps *o)
 return NO_EXIT;
 }
 
+static ExitStatus op_lpp(DisasContext *s, DisasOps *o)
+{
+check_privileged(s);
+
+tcg_gen_st_i64(o->in2, cpu_env, offsetof(CPUS390XState, pp));
+return NO_EXIT;
+}
+
 static ExitStatus op_lpsw(DisasContext *s, DisasOps *o)
 {
 TCGv_i64 t1, t2;
-- 
2.9.3

[Qemu-devel] [PATCH v4 6/6] target/sparc: optimize various functions using extract op

2017-05-12 Thread Philippe Mathieu-Daudé

Patch created mechanically using Coccinelle script via:

$ spatch --macro-file scripts/cocci-macro-file.h --in-place \
--sp-file scripts/coccinelle/tcg_gen_extract.cocci --dir target

Signed-off-by: Philippe Mathieu-Daudé 
---
 target/sparc/translate.c | 15 +--
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index aa6734d54e..67a83b77cc 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -380,29 +380,25 @@ static inline void gen_goto_tb(DisasContext *s, int tb_num,
 static inline void gen_mov_reg_N(TCGv reg, TCGv_i32 src)
 {
 tcg_gen_extu_i32_tl(reg, src);
-tcg_gen_shri_tl(reg, reg, PSR_NEG_SHIFT);
-tcg_gen_andi_tl(reg, reg, 0x1);
+tcg_gen_extract_tl(reg, reg, PSR_NEG_SHIFT, 1);
 }
 
 static inline void gen_mov_reg_Z(TCGv reg, TCGv_i32 src)
 {
 tcg_gen_extu_i32_tl(reg, src);
-tcg_gen_shri_tl(reg, reg, PSR_ZERO_SHIFT);
-tcg_gen_andi_tl(reg, reg, 0x1);
+tcg_gen_extract_tl(reg, reg, PSR_ZERO_SHIFT, 1);
 }
 
 static inline void gen_mov_reg_V(TCGv reg, TCGv_i32 src)
 {
 tcg_gen_extu_i32_tl(reg, src);
-tcg_gen_shri_tl(reg, reg, PSR_OVF_SHIFT);
-tcg_gen_andi_tl(reg, reg, 0x1);
+tcg_gen_extract_tl(reg, reg, PSR_OVF_SHIFT, 1);
 }
 
 static inline void gen_mov_reg_C(TCGv reg, TCGv_i32 src)
 {
 tcg_gen_extu_i32_tl(reg, src);
-tcg_gen_shri_tl(reg, reg, PSR_CARRY_SHIFT);
-tcg_gen_andi_tl(reg, reg, 0x1);
+tcg_gen_extract_tl(reg, reg, PSR_CARRY_SHIFT, 1);
 }
 
 static inline void gen_op_add_cc(TCGv dst, TCGv src1, TCGv src2)
@@ -638,8 +634,7 @@ static inline void gen_op_mulscc(TCGv dst, TCGv src1, TCGv src2)
 // env->y = (b2 << 31) | (env->y >> 1);
 tcg_gen_andi_tl(r_temp, cpu_cc_src, 0x1);
 tcg_gen_shli_tl(r_temp, r_temp, 31);
-tcg_gen_shri_tl(t0, cpu_y, 1);
-tcg_gen_andi_tl(t0, t0, 0x7fff);
+tcg_gen_extract_tl(t0, cpu_y, 1, 31);
 tcg_gen_or_tl(t0, t0, r_temp);
 tcg_gen_andi_tl(cpu_y, t0, 0x);
 
-- 
2.11.0

[Qemu-devel] [PULL 5/9] target/s390x: Use atomic operations for COMPARE SWAP

2017-05-12 Thread Richard Henderson

Reviewed-by: Aurelien Jarno 
Signed-off-by: Richard Henderson 
---
 target/s390x/helper.h  |  1 +
 target/s390x/insn-data.def | 10 +++---
 target/s390x/mem_helper.c  | 40 ++
 target/s390x/translate.c   | 83 --
 4 files changed, 60 insertions(+), 74 deletions(-)

diff --git a/target/s390x/helper.h b/target/s390x/helper.h
index 01adb50..0b70770 100644
--- a/target/s390x/helper.h
+++ b/target/s390x/helper.h
@@ -25,6 +25,7 @@ DEF_HELPER_3(cxgb, i64, env, s64, i32)
 DEF_HELPER_3(celgb, i64, env, i64, i32)
 DEF_HELPER_3(cdlgb, i64, env, i64, i32)
 DEF_HELPER_3(cxlgb, i64, env, i64, i32)
+DEF_HELPER_4(cdsg, void, env, i64, i32, i32)
 DEF_HELPER_FLAGS_3(aeb, TCG_CALL_NO_WG, i64, env, i64, i64)
 DEF_HELPER_FLAGS_3(adb, TCG_CALL_NO_WG, i64, env, i64, i64)
 DEF_HELPER_FLAGS_5(axb, TCG_CALL_NO_WG, i64, env, i64, i64, i64, i64)
diff --git a/target/s390x/insn-data.def b/target/s390x/insn-data.def
index 0909060..5e5fcc5 100644
--- a/target/s390x/insn-data.def
+++ b/target/s390x/insn-data.def
@@ -239,12 +239,12 @@
 D(0xec7d, CLGIJ,   RIE_c, GIE, r1_o, i2_8u, 0, 0, cj, 0, 1)
 
 /* COMPARE AND SWAP */
-D(0xba00, CS,  RS_a,  Z,   r3_32u, r1_32u, new, r1_32, cs, 0, 0)
-D(0xeb14, CSY, RSY_a, LD,  r3_32u, r1_32u, new, r1_32, cs, 0, 0)
-D(0xeb30, CSG, RSY_a, Z,   r3_o, r1_o, new, r1, cs, 0, 1)
+D(0xba00, CS,  RS_a,  Z,   r3_32u, r1_32u, new, r1_32, cs, 0, MO_TEUL)
+D(0xeb14, CSY, RSY_a, LD,  r3_32u, r1_32u, new, r1_32, cs, 0, MO_TEUL)
+D(0xeb30, CSG, RSY_a, Z,   r3_o, r1_o, new, r1, cs, 0, MO_TEQ)
 /* COMPARE DOUBLE AND SWAP */
-D(0xbb00, CDS, RS_a,  Z,   r3_D32, r1_D32, new, r1_D32, cs, 0, 1)
-D(0xeb31, CDSY,RSY_a, LD,  r3_D32, r1_D32, new, r1_D32, cs, 0, 1)
+D(0xbb00, CDS, RS_a,  Z,   r3_D32, r1_D32, new, r1_D32, cs, 0, MO_TEQ)
+D(0xeb31, CDSY,RSY_a, LD,  r3_D32, r1_D32, new, r1_D32, cs, 0, MO_TEQ)
 C(0xeb3e, CDSG,RSY_a, Z,   0, 0, 0, 0, cdsg, 0)
 
 /* COMPARE AND TRAP */
diff --git a/target/s390x/mem_helper.c b/target/s390x/mem_helper.c
index 675aba2..f6e5bce 100644
--- a/target/s390x/mem_helper.c
+++ b/target/s390x/mem_helper.c
@@ -23,6 +23,7 @@
 #include "exec/helper-proto.h"
 #include "exec/exec-all.h"
 #include "exec/cpu_ldst.h"
+#include "qemu/int128.h"
 
 #if !defined(CONFIG_USER_ONLY)
 #include "hw/s390x/storage-keys.h"
@@ -844,6 +845,45 @@ uint32_t HELPER(trt)(CPUS390XState *env, uint32_t len, uint64_t array,
 return cc;
 }
 
+void HELPER(cdsg)(CPUS390XState *env, uint64_t addr,
+  uint32_t r1, uint32_t r3)
+{
+uintptr_t ra = GETPC();
+Int128 cmpv = int128_make128(env->regs[r1 + 1], env->regs[r1]);
+Int128 newv = int128_make128(env->regs[r3 + 1], env->regs[r3]);
+Int128 oldv;
+bool fail;
+
+if (parallel_cpus) {
+#ifndef CONFIG_ATOMIC128
+cpu_loop_exit_atomic(ENV_GET_CPU(env), ra);
+#else
+int mem_idx = cpu_mmu_index(env, false);
+TCGMemOpIdx oi = make_memop_idx(MO_TEQ | MO_ALIGN_16, mem_idx);
+oldv = helper_atomic_cmpxchgo_be_mmu(env, addr, cmpv, newv, oi, ra);
+fail = !int128_eq(oldv, cmpv);
+#endif
+} else {
+uint64_t oldh, oldl;
+
+oldh = cpu_ldq_data_ra(env, addr + 0, ra);
+oldl = cpu_ldq_data_ra(env, addr + 8, ra);
+
+oldv = int128_make128(oldl, oldh);
+fail = !int128_eq(oldv, cmpv);
+if (fail) {
+newv = oldv;
+}
+
+cpu_stq_data_ra(env, addr + 0, int128_gethi(newv), ra);
+cpu_stq_data_ra(env, addr + 8, int128_getlo(newv), ra);
+}
+
+env->cc_op = fail;
+env->regs[r1] = int128_gethi(oldv);
+env->regs[r1 + 1] = int128_getlo(oldv);
+}
+
 #if !defined(CONFIG_USER_ONLY)
 void HELPER(lctlg)(CPUS390XState *env, uint32_t r1, uint64_t a2, uint32_t r3)
 {
diff --git a/target/s390x/translate.c b/target/s390x/translate.c
index 522a5e3..f23b705 100644
--- a/target/s390x/translate.c
+++ b/target/s390x/translate.c
@@ -1943,102 +1943,47 @@ static ExitStatus op_cps(DisasContext *s, DisasOps *o)
 
 static ExitStatus op_cs(DisasContext *s, DisasOps *o)
 {
-/* FIXME: needs an atomic solution for CONFIG_USER_ONLY.  */
 int d2 = get_field(s->fields, d2);
 int b2 = get_field(s->fields, b2);
-int is_64 = s->insn->data;
-TCGv_i64 addr, mem, cc, z;
+TCGv_i64 addr, cc;
 
 /* Note that in1 = R3 (new value) and
in2 = (zero-extended) R1 (expected value).  */
 
-/* Load the memory into the (temporary) output.  While the PoO only talks
-   about moving the memory to R1 on inequality, if we include equality it
-   means that R1 is equal to the memory in all conditions.  */
 addr = get_address(s, 0, b2, d2);
-if (is_64) {
-tcg_gen_qemu_ld64(o->out, addr, get_mem_index(s));
-} else {
-tcg_gen_qemu_ld32u(o->out, addr, get_mem_index(s));
-}
+tcg_gen_atomic_cmpxchg_i64(o->out, addr,

[Qemu-devel] [PULL 4/9] target/s390x: Implement LOAD PAIR DISJOINT

2017-05-12 Thread Richard Henderson

From: Eric Bischoff 

Reviewed-by: Aurelien Jarno 
Signed-off-by: Eric Bischoff 
Message-Id: <20170228120134.7921-1-ebisch...@suse.com>
[rth: Combine the two via insn->data; free the address temps.]
Signed-off-by: Richard Henderson 
---
 target/s390x/insn-data.def |  4 +++-
 target/s390x/translate.c   | 42 ++
 2 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/target/s390x/insn-data.def b/target/s390x/insn-data.def
index 43c5707..0909060 100644
--- a/target/s390x/insn-data.def
+++ b/target/s390x/insn-data.def
@@ -504,7 +504,9 @@
 C(0xb9e2, LOCGR,   RRF_c, LOC, r1, r2, r1, 0, loc, 0)
 C(0xebf2, LOC, RSY_b, LOC, r1, m2_32u, new, r1_32, loc, 0)
 C(0xebe2, LOCG,RSY_b, LOC, r1, m2_64, r1, 0, loc, 0)
-/* LOAD PAIR DISJOINT TODO */
+/* LOAD PAIR DISJOINT */
+D(0xc804, LPD, SSF,   ILA, 0, 0, new_P, r3_P32, lpd, 0, MO_TEUL)
+D(0xc805, LPDG,SSF,   ILA, 0, 0, new_P, r3_P64, lpd, 0, MO_TEQ)
 /* LOAD POSITIVE */
 C(0x1000, LPR, RR_a,  Z,   0, r2_32s, new, r1_32, abs, abs32)
 C(0xb900, LPGR,RRE,   Z,   0, r2, r1, 0, abs, abs64)
diff --git a/target/s390x/translate.c b/target/s390x/translate.c
index 2b66a4e..522a5e3 100644
--- a/target/s390x/translate.c
+++ b/target/s390x/translate.c
@@ -2559,6 +2559,7 @@ static ExitStatus op_lctlg(DisasContext *s, DisasOps *o)
 tcg_temp_free_i32(r3);
 return NO_EXIT;
 }
+
 static ExitStatus op_lra(DisasContext *s, DisasOps *o)
 {
 check_privileged(s);
@@ -2759,6 +2760,31 @@ static ExitStatus op_lm64(DisasContext *s, DisasOps *o)
 return NO_EXIT;
 }
 
+static ExitStatus op_lpd(DisasContext *s, DisasOps *o)
+{
+TCGv_i64 a1, a2;
+TCGMemOp mop = s->insn->data;
+
+/* In a parallel context, stop the world and single step.  */
+if (parallel_cpus) {
+potential_page_fault(s);
+gen_exception(EXCP_ATOMIC);
+return EXIT_NORETURN;
+}
+
+/* In a serial context, perform the two loads ... */
+a1 = get_address(s, 0, get_field(s->fields, b1), get_field(s->fields, d1));
+a2 = get_address(s, 0, get_field(s->fields, b2), get_field(s->fields, d2));
+tcg_gen_qemu_ld_i64(o->out, a1, get_mem_index(s), mop | MO_ALIGN);
+tcg_gen_qemu_ld_i64(o->out2, a2, get_mem_index(s), mop | MO_ALIGN);
+tcg_temp_free_i64(a1);
+tcg_temp_free_i64(a2);
+
+/* ... and indicate that we performed them while interlocked.  */
+gen_op_movi_cc(s, 0);
+return NO_EXIT;
+}
+
 #ifndef CONFIG_USER_ONLY
 static ExitStatus op_lura(DisasContext *s, DisasOps *o)
 {
@@ -4430,6 +4456,22 @@ static void wout_r1_D32(DisasContext *s, DisasFields *f, DisasOps *o)
 }
 #define SPEC_wout_r1_D32 SPEC_r1_even
 
+static void wout_r3_P32(DisasContext *s, DisasFields *f, DisasOps *o)
+{
+int r3 = get_field(f, r3);
+store_reg32_i64(r3, o->out);
+store_reg32_i64(r3 + 1, o->out2);
+}
+#define SPEC_wout_r3_P32 SPEC_r3_even
+
+static void wout_r3_P64(DisasContext *s, DisasFields *f, DisasOps *o)
+{
+int r3 = get_field(f, r3);
+store_reg(r3, o->out);
+store_reg(r3 + 1, o->out2);
+}
+#define SPEC_wout_r3_P64 SPEC_r3_even
+
 static void wout_e1(DisasContext *s, DisasFields *f, DisasOps *o)
 {
 store_freg32_i64(get_field(f, r1), o->out);
-- 
2.9.3

[Qemu-devel] [PULL 1/9] target/s390x: Implement STORE FACILITIES LIST EXTENDED

2017-05-12 Thread Richard Henderson

At the same time, improve STORE FACILITIES LIST
so that we don't hard-code the list for all cpus.

Signed-off-by: Richard Henderson 
---
 target/s390x/helper.h  |  2 ++
 target/s390x/insn-data.def |  2 ++
 target/s390x/misc_helper.c | 59 ++
 target/s390x/translate.c   | 17 ++---
 4 files changed, 72 insertions(+), 8 deletions(-)

diff --git a/target/s390x/helper.h b/target/s390x/helper.h
index 9102071..01adb50 100644
--- a/target/s390x/helper.h
+++ b/target/s390x/helper.h
@@ -83,6 +83,8 @@ DEF_HELPER_FLAGS_5(calc_cc, TCG_CALL_NO_RWG_SE, i32, env, i32, i64, i64, i64)
 DEF_HELPER_FLAGS_2(sfpc, TCG_CALL_NO_RWG, void, env, i64)
 DEF_HELPER_FLAGS_2(sfas, TCG_CALL_NO_WG, void, env, i64)
 DEF_HELPER_FLAGS_1(popcnt, TCG_CALL_NO_RWG_SE, i64, i64)
+DEF_HELPER_FLAGS_1(stfl, TCG_CALL_NO_RWG, void, env)
+DEF_HELPER_2(stfle, i32, env, i64)
 
 #ifndef CONFIG_USER_ONLY
 DEF_HELPER_3(servc, i32, env, i64, i64)
diff --git a/target/s390x/insn-data.def b/target/s390x/insn-data.def
index 075ff59..b6702da 100644
--- a/target/s390x/insn-data.def
+++ b/target/s390x/insn-data.def
@@ -747,6 +747,8 @@
 C(0xe33e, STRV,RXY_a, Z,   la2, r1_32u, new, m1_32, rev32, 0)
 C(0xe32f, STRVG,   RXY_a, Z,   la2, r1_o, new, m1_64, rev64, 0)
 
+/* STORE FACILITY LIST EXTENDED */
+C(0xb2b0, STFLE,   S,  SFLE,   0, a2, 0, 0, stfle, 0)
 /* STORE FPC */
 C(0xb29c, STFPC,   S, Z,   0, a2, new, m2_32, efpc, 0)
 
diff --git a/target/s390x/misc_helper.c b/target/s390x/misc_helper.c
index eca8244..bd94242 100644
--- a/target/s390x/misc_helper.c
+++ b/target/s390x/misc_helper.c
@@ -678,3 +678,62 @@ void HELPER(per_ifetch)(CPUS390XState *env, uint64_t addr)
 }
 }
 #endif
+
+/* The maximum bit defined at the moment is 129.  */
+#define MAX_STFL_WORDS  3
+
+/* Canonicalize the current cpu's features into the 64-bit words required
+   by STFLE.  Return the index-1 of the max word that is non-zero.  */
+static unsigned do_stfle(CPUS390XState *env, uint64_t words[MAX_STFL_WORDS])
+{
+S390CPU *cpu = s390_env_get_cpu(env);
+const unsigned long *features = cpu->model->features;
+unsigned max_bit = 0;
+S390Feat feat;
+
+memset(words, 0, sizeof(uint64_t) * MAX_STFL_WORDS);
+
+if (test_bit(S390_FEAT_ZARCH, features)) {
+/* z/Architecture is always active if around */
+words[0] = 1ull << (63 - 2);
+}
+
+for (feat = find_first_bit(features, S390_FEAT_MAX);
+ feat < S390_FEAT_MAX;
+ feat = find_next_bit(features, S390_FEAT_MAX, feat + 1)) {
+const S390FeatDef *def = s390_feat_def(feat);
+if (def->type == S390_FEAT_TYPE_STFL) {
+unsigned bit = def->bit;
+if (bit > max_bit) {
+max_bit = bit;
+}
+assert(bit / 64 < MAX_STFL_WORDS);
+words[bit / 64] |= 1ULL << (63 - bit % 64);
+}
+}
+
+return max_bit / 64;
+}
+
+void HELPER(stfl)(CPUS390XState *env)
+{
+uint64_t words[MAX_STFL_WORDS];
+
+do_stfle(env, words);
+cpu_stl_data(env, 200, words[0] >> 32);
+}
+
+uint32_t HELPER(stfle)(CPUS390XState *env, uint64_t addr)
+{
+uint64_t words[MAX_STFL_WORDS];
+unsigned count_m1 = env->regs[0] & 0xff;
+unsigned max_m1 = do_stfle(env, words);
+unsigned i;
+
+for (i = 0; i <= count_m1; ++i) {
+cpu_stq_data(env, addr + 8 * i, words[i]);
+}
+
+env->regs[0] = deposit64(env->regs[0], 0, 8, max_m1);
+return (count_m1 >= max_m1 ? 0 : 3);
+}
diff --git a/target/s390x/translate.c b/target/s390x/translate.c
index 01c6217..69940e3 100644
--- a/target/s390x/translate.c
+++ b/target/s390x/translate.c
@@ -3628,15 +3628,8 @@ static ExitStatus op_spt(DisasContext *s, DisasOps *o)
 
 static ExitStatus op_stfl(DisasContext *s, DisasOps *o)
 {
-TCGv_i64 f, a;
-/* We really ought to have more complete indication of facilities
-   that we implement.  Address this when STFLE is implemented.  */
 check_privileged(s);
-f = tcg_const_i64(0xc000);
-a = tcg_const_i64(200);
-tcg_gen_qemu_st32(f, a, get_mem_index(s));
-tcg_temp_free_i64(f);
-tcg_temp_free_i64(a);
+gen_helper_stfl(cpu_env);
 return NO_EXIT;
 }
 
@@ -3802,6 +3795,14 @@ static ExitStatus op_sturg(DisasContext *s, DisasOps *o)
 }
 #endif
 
+static ExitStatus op_stfle(DisasContext *s, DisasOps *o)
+{
+potential_page_fault(s);
+gen_helper_stfle(cc_op, cpu_env, o->in2);
+set_cc_static(s);
+return NO_EXIT;
+}
+
 static ExitStatus op_st8(DisasContext *s, DisasOps *o)
 {
 tcg_gen_qemu_st8(o->in1, o->in2, get_mem_index(s));
-- 
2.9.3
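
The bit placement used by do_stfle() in the patch above can be tried in
isolation. The sketch below is an illustration under stated assumptions:
set_facility_bits and its plain array argument are hypothetical stand-ins for
the loop over cpu->model->features, and MAX_WORDS mirrors MAX_STFL_WORDS.
Facility bit N lands in word N / 64 at position 63 - N % 64, so bit 0 is the
most significant bit of word 0.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_WORDS 3  /* same bound as MAX_STFL_WORDS in the patch */

/* Hypothetical stand-in for the feature loop: set each facility bit in
 * BITS[0..N-1] using big-endian bit numbering and return the index of the
 * highest word that holds a set bit. */
static unsigned set_facility_bits(const unsigned *bits, size_t n,
                                  uint64_t words[MAX_WORDS])
{
    unsigned max_bit = 0;
    size_t i;

    memset(words, 0, sizeof(uint64_t) * MAX_WORDS);

    for (i = 0; i < n; i++) {
        unsigned bit = bits[i];

        assert(bit / 64 < MAX_WORDS);
        if (bit > max_bit) {
            max_bit = bit;
        }
        /* Bit 0 is the leftmost (most significant) bit of its word. */
        words[bit / 64] |= UINT64_C(1) << (63 - bit % 64);
    }

    return max_bit / 64;
}
```

STFL then only needs the upper 32 bits of word 0 (words[0] >> 32), while STFLE
stores as many doublewords as the guest asked for.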

[Qemu-devel] [PATCH v4 3/6] target/arm: optimize rev16() using extract op

2017-05-12 Thread Philippe Mathieu-Daudé

Patch created mechanically using Coccinelle script via:

$ spatch --macro-file scripts/cocci-macro-file.h --in-place \
--sp-file scripts/coccinelle/tcg_gen_extract.cocci --dir target

Signed-off-by: Philippe Mathieu-Daudé 
---
 target/arm/translate-a64.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 24de30d92c..759b2466ef 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -4038,14 +4038,12 @@ static void handle_rev16(DisasContext *s, unsigned int sf,
 tcg_gen_andi_i64(tcg_tmp, tcg_rn, 0x);
 tcg_gen_bswap16_i64(tcg_rd, tcg_tmp);
 
-tcg_gen_shri_i64(tcg_tmp, tcg_rn, 16);
-tcg_gen_andi_i64(tcg_tmp, tcg_tmp, 0x);
+tcg_gen_extract_i64(tcg_tmp, tcg_rn, 16, 16);
 tcg_gen_bswap16_i64(tcg_tmp, tcg_tmp);
 tcg_gen_deposit_i64(tcg_rd, tcg_rd, tcg_tmp, 16, 16);
 
 if (sf) {
-tcg_gen_shri_i64(tcg_tmp, tcg_rn, 32);
-tcg_gen_andi_i64(tcg_tmp, tcg_tmp, 0x);
+tcg_gen_extract_i64(tcg_tmp, tcg_rn, 32, 16);
 tcg_gen_bswap16_i64(tcg_tmp, tcg_tmp);
 tcg_gen_deposit_i64(tcg_rd, tcg_rd, tcg_tmp, 32, 16);
 
-- 
2.11.0

[Qemu-devel] [PATCH v4 4/6] target/m68k: optimize bcd_flags() using extract op

2017-05-12 Thread Philippe Mathieu-Daudé

Patch created mechanically using Coccinelle script via:

$ spatch --macro-file scripts/cocci-macro-file.h --in-place \
--sp-file scripts/coccinelle/tcg_gen_extract.cocci --dir target

Signed-off-by: Philippe Mathieu-Daudé 
Acked-by: Laurent Vivier 
---
 target/m68k/translate.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/target/m68k/translate.c b/target/m68k/translate.c
index 9f60fbc0db..babb9e2c5b 100644
--- a/target/m68k/translate.c
+++ b/target/m68k/translate.c
@@ -1463,8 +1463,7 @@ static void bcd_flags(TCGv val)
 tcg_gen_andi_i32(QREG_CC_C, val, 0x0ff);
 tcg_gen_or_i32(QREG_CC_Z, QREG_CC_Z, QREG_CC_C);
 
-tcg_gen_shri_i32(QREG_CC_C, val, 8);
-tcg_gen_andi_i32(QREG_CC_C, QREG_CC_C, 1);
+tcg_gen_extract_i32(QREG_CC_C, val, 8, 1);
 
 tcg_gen_mov_i32(QREG_CC_X, QREG_CC_C);
 }
-- 
2.11.0

[Qemu-devel] [PULL 3/9] target/s390x: Diagnose specification exception for atomics

2017-05-12 Thread Richard Henderson

All of the interlocked access facility instructions raise a
specification exception for unaligned accesses.  Do this by
using the (previously unused) unaligned_access hook.

Reviewed-by: Aurelien Jarno 
Signed-off-by: Richard Henderson 
---
 target/s390x/cpu.c    |  1 +
 target/s390x/cpu.h    |  3 +++
 target/s390x/helper.c | 16 ++++++++++++++++
 3 files changed, 20 insertions(+)

diff --git a/target/s390x/cpu.c b/target/s390x/cpu.c
index 066dcd1..a1bf2ba 100644
--- a/target/s390x/cpu.c
+++ b/target/s390x/cpu.c
@@ -430,6 +430,7 @@ static void s390_cpu_class_init(ObjectClass *oc, void *data)
 cc->write_elf64_note = s390_cpu_write_elf64_note;
 cc->cpu_exec_interrupt = s390_cpu_exec_interrupt;
 cc->debug_excp_handler = s390x_cpu_debug_excp_handler;
+cc->do_unaligned_access = s390x_cpu_do_unaligned_access;
 #endif
 cc->disas_set_info = s390_cpu_disas_set_info;
 
diff --git a/target/s390x/cpu.h b/target/s390x/cpu.h
index 058ddad..bbed320 100644
--- a/target/s390x/cpu.h
+++ b/target/s390x/cpu.h
@@ -480,6 +480,9 @@ int s390_cpu_handle_mmu_fault(CPUState *cpu, vaddr address, int rw,
 
 #ifndef CONFIG_USER_ONLY
 void do_restart_interrupt(CPUS390XState *env);
+void s390x_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
+   MMUAccessType access_type,
+   int mmu_idx, uintptr_t retaddr);
 
 static inline hwaddr decode_basedisp_s(CPUS390XState *env, uint32_t ipb,
uint8_t *ar)
diff --git a/target/s390x/helper.c b/target/s390x/helper.c
index 68bd2f9..9978490 100644
--- a/target/s390x/helper.c
+++ b/target/s390x/helper.c
@@ -718,4 +718,20 @@ void s390x_cpu_debug_excp_handler(CPUState *cs)
 cpu_loop_exit_noexc(cs);
 }
 }
+
+/* Unaligned accesses are only diagnosed with MO_ALIGN.  At the moment,
+   this is only for the atomic operations, for which we want to raise a
+   specification exception.  */
+void s390x_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
+   MMUAccessType access_type,
+   int mmu_idx, uintptr_t retaddr)
+{
+S390CPU *cpu = S390_CPU(cs);
+CPUS390XState *env = &cpu->env;
+
+if (retaddr) {
+cpu_restore_state(cs, retaddr);
+}
+program_interrupt(env, PGM_SPECIFICATION, ILEN_LATER);
+}
 #endif /* CONFIG_USER_ONLY */
-- 
2.9.3
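
The condition that triggers the new hook can be pictured with a small standalone check. This is only a sketch in plain C, not QEMU code: `check_atomic_access` and the `access_result` enum are hypothetical names, and it assumes that "aligned" means naturally aligned to the operand size, which is a power of two.

```c
#include <stdint.h>

/* Hypothetical result type for this sketch; QEMU raises the exception
 * through program_interrupt() instead of returning a value. */
enum access_result {
    ACCESS_OK,
    ACCESS_SPECIFICATION_EXCEPTION,
};

/* 'size' is the operand size in bytes and must be a power of two.
 * An address is naturally aligned when its low log2(size) bits are zero. */
static enum access_result check_atomic_access(uint64_t addr, unsigned size)
{
    if (addr & (uint64_t)(size - 1)) {
        return ACCESS_SPECIFICATION_EXCEPTION;
    }
    return ACCESS_OK;
}
```

For example, an 8-byte access at 0x1004 would be rejected, while a 4-byte access at the same address would pass.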

[Qemu-devel] [PATCH v4 2/6] target/alpha: optimize cvtlq() using extract op

2017-05-12 Thread Philippe Mathieu-Daudé

Patch created mechanically using Coccinelle script via:

$ spatch --macro-file scripts/cocci-macro-file.h --in-place \
--sp-file scripts/coccinelle/tcg_gen_extract.cocci --dir target

Signed-off-by: Philippe Mathieu-Daudé 
---
 target/alpha/translate.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/target/alpha/translate.c b/target/alpha/translate.c
index df5d695344..531af4f5b8 100644
--- a/target/alpha/translate.c
+++ b/target/alpha/translate.c
@@ -747,9 +747,8 @@ static void gen_cvtlq(TCGv vc, TCGv vb)
 /* The arithmetic right shift here, plus the sign-extended mask below
yields a sign-extended result without an explicit ext32s_i64.  */
 tcg_gen_sari_i64(tmp, vb, 32);
-tcg_gen_shri_i64(vc, vb, 29);
+tcg_gen_extract_i64(vc, vb, 29, 30);
 tcg_gen_andi_i64(tmp, tmp, (int32_t)0xc0000000);
-tcg_gen_andi_i64(vc, vc, 0x3fffffff);
 tcg_gen_or_i64(vc, vc, tmp);
 
 tcg_temp_free(tmp);
-- 
2.11.0

[Qemu-devel] [PULL 0/9] Queued s390 patches

2017-05-12 Thread Richard Henderson

This is a combination of my queued patches and those
posted by Aurelien this week.


r~


The following changes since commit ecc1f5adeec4e3324d1b695a7c54e3967c526949:

  maintainers: Add myself as linux-user reviewer (2017-05-11 13:31:11 -0400)

are available in the git repository at:

  git://github.com/rth7680/qemu.git tags/pull-s390-20170512

for you to fetch changes up to 538fad597d898f677f81cb4daacd37e7cdc18e6e:

  target/s390x: implement serialization in BRANCH CONDITION (2017-05-12 15:48:41 -0700)


Queued target/s390 patches


Aurelien Jarno (3):
  target/s390x: mask the SIGP order_code using SIGP_ORDER_MASK
  target/s390x: fix SIGNAL PROCESSOR return value
  target/s390x: implement serialization in BRANCH CONDITION

Eric Bischoff (1):
  target/s390x: Implement LOAD PAIR DISJOINT

Miroslav Benes (1):
  target/s390x: Implement LOAD PROGRAM PARAMETER

Richard Henderson (4):
  target/s390x: Implement STORE FACILITIES LIST EXTENDED
  target/s390x: Diagnose specification exception for atomics
  target/s390x: Use atomic operations for COMPARE SWAP
  target/s390x: Use atomic operations for LOAD AND OP

 target/s390x/cpu.c |   1 +
 target/s390x/cpu.h |   6 ++
 target/s390x/helper.c  |  16 +++
 target/s390x/helper.h  |   3 +
 target/s390x/insn-data.def |  38 ---
 target/s390x/kvm.c |   2 -
 target/s390x/mem_helper.c  |  40 
 target/s390x/misc_helper.c |  62 +++-
 target/s390x/translate.c   | 245 ++---
 9 files changed, 288 insertions(+), 125 deletions(-)

[Qemu-devel] [RFC PATCH v4 0/6] optimize various tcg_gen() functions using extract op

2017-05-12 Thread Philippe Mathieu-Daudé

* Changes from v3

Tried to fix the previous, incorrect attempt...
After getting some quick and helpful advice from the Coccinelle folks, I tried
to improve the script (though there is not much inline documentation yet).
- correctly check whether a candidate is optimizable
- document the mask as a Mersenne number instead of a prime (Eric Blake)
- try to write Python code instead of BASIC (advice from Markus Elfring)
- try to reduce regex usage
- try to match the shri(); unrelated(); andi(); pattern to optimize; I was
  surprised to see the alpha diff Coccinelle found.

This is surely not the last version of this patchset, but I think the
generated patches are now correct, and I would rather have reviewers look at
the fixed patches than at the wrong ones on the ML.
There is still a lot of work to do in the cocci script; it currently seems to
hang while trying to parse "target/arm/translate.c".

* [v3] (v2 was a resend of the cocci script):

In my first attempt I misunderstood the tcg_gen_extract() intrinsics, and
Richard Henderson pointed that out.
In this patchset the cocci script is corrected and clarified; it also prints
how arguments are checked while running.
Also:
- incorrect patches have been removed. (Richard Henderson, Nikunj A Dadhania)
- Coccinelle script licensed GPLv2+ (Eric Blake)
- comment in each commit about how to apply the patch (Eric Blake)
- added Acked-by for m68k (Laurent Vivier)
- Cc: Coccinelle developers.

[v1]

While reviewing a commit from Aurelien Jarno where he optimized a TCG generator
for SH-4 [1], I found the same optimization done on PPC by Nikunj A Dadhania a
few months ago [2].
After asking on the ML about a cocci script [3], I thought it would be easier
to learn about Coccinelle myself.

citing Aurelien Jarno:
This doesn't change the generated code on x86, but optimizes it on most
RISC architectures and makes the code simpler to read.
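
The rewrite these patches perform replaces a right shift followed by an AND with a single extract operation. As a rough illustration in plain C (not TCG code; `extract64_sketch` and `shift_then_mask` are hypothetical names chosen for this sketch), the two forms give the same value whenever the mask is a contiguous run of low bits:

```c
#include <stdint.h>

/* Reference semantics of an unsigned bit-field extract: return the 'len'
 * bits of 'value' starting at bit 'ofs', zero-extended.
 * Assumes 0 < len and ofs + len <= 64. */
static uint64_t extract64_sketch(uint64_t value, unsigned ofs, unsigned len)
{
    uint64_t mask = (len == 64) ? UINT64_MAX : ((UINT64_C(1) << len) - 1);
    return (value >> ofs) & mask;
}

/* The two-operation sequence being replaced: shift right, then AND. */
static uint64_t shift_then_mask(uint64_t value, unsigned ofs, uint64_t mask)
{
    return (value >> ofs) & mask;
}
```

With a mask of 0xF and an offset of 28, for instance, `shift_then_mask(v, 28, 0xF)` and `extract64_sketch(v, 28, 4)` agree for every `v`, which is the shape of the mfsrin/mtsrin changes in the ppc patch.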

I actually applied the script using the following command:

$ docker run -v `pwd`:`pwd` -w `pwd` petersenna/coccinelle \
--sp-file scripts/coccinelle/tcg_gen_extract.cocci \
--macro-file scripts/cocci-macro-file.h \
--dir target \
--in-place

Please review again! thanks.

[1] http://lists.nongnu.org/archive/html/qemu-devel/2017-05/msg01466.html
[2] http://lists.nongnu.org/archive/html/qemu-devel/2017-02/msg05211.html
[3] http://lists.nongnu.org/archive/html/qemu-devel/2017-05/msg01499.html

Philippe Mathieu-Daudé (6):
  coccinelle: add a script to optimize tcg op using tcg_gen_extract()
  target/alpha: optimize cvtlq() using extract op
  target/arm: optimize rev16() using extract op
  target/m68k: optimize bcd_flags() using extract op
  target/ppc: optimize various functions using extract op
  target/sparc: optimize various functions using extract op

 scripts/coccinelle/tcg_gen_extract.cocci | 103 +++
 target/alpha/translate.c |   3 +-
 target/arm/translate-a64.c   |   6 +-
 target/m68k/translate.c  |   3 +-
 target/ppc/translate.c   |  21 +++
 target/ppc/translate/vsx-impl.inc.c  |  24 +++
 target/sparc/translate.c |  15 ++---
 7 files changed, 127 insertions(+), 48 deletions(-)
 create mode 100644 scripts/coccinelle/tcg_gen_extract.cocci

-- 
2.11.0

[Qemu-devel] [RFC PATCH v4 1/6] coccinelle: add a script to optimize tcg op using tcg_gen_extract()

2017-05-12 Thread Philippe Mathieu-Daudé

If you have coccinelle installed you can apply this script using:

$ spatch \
--sp-file scripts/coccinelle/tcg_gen_extract.cocci \
--macro-file scripts/cocci-macro-file.h \
--dir target --in-place

You can also directly use Peter Senna Tschudin's docker image (easier):

$ docker run -v `pwd`:`pwd` -w `pwd` petersenna/coccinelle \
--sp-file scripts/coccinelle/tcg_gen_extract.cocci \
--macro-file scripts/cocci-macro-file.h \
--dir target --in-place

I then verified that no manual touch-ups are required.

The following thread was helpful while writing this script:

https://github.com/coccinelle/coccinelle/issues/86

Signed-off-by: Philippe Mathieu-Daudé 
---
 scripts/coccinelle/tcg_gen_extract.cocci | 103 +++
 1 file changed, 103 insertions(+)
 create mode 100644 scripts/coccinelle/tcg_gen_extract.cocci

diff --git a/scripts/coccinelle/tcg_gen_extract.cocci b/scripts/coccinelle/tcg_gen_extract.cocci
new file mode 100644
index 0000000000..37546834ee
--- /dev/null
+++ b/scripts/coccinelle/tcg_gen_extract.cocci
@@ -0,0 +1,103 @@
+// optimize TCG using extract op
+//
+// Copyright: (C) 2017 Philippe Mathieu-Daudé. GPLv2+.
+// Confidence: High
+// Options: --macro-file scripts/cocci-macro-file.h
+//
+// Nikunj A Dadhania optimization:
+// http://lists.nongnu.org/archive/html/qemu-devel/2017-02/msg05211.html
+// Aurelien Jarno optimization:
+// http://lists.nongnu.org/archive/html/qemu-devel/2017-05/msg01466.html
+// Coccinelle helpful issue:
+// https://github.com/coccinelle/coccinelle/issues/86
+
+@initialize:python@
+@@
+import sys
+fd = sys.stderr
+def debug(msg="", trailer="\n"):
+fd.write("[DBG] " + msg + trailer)
+def low_bits_count(value):
+bits_count = 0
+while (value & (1 << bits_count)):
+bits_count += 1
+return bits_count
+def Mn(order): # Mersenne number
+return (1 << order) - 1
+
+@match@ // depends on never match_and_check_reg_used@
+metavariable ret, arg;
+constant ofs, msk;
+expression tcg_arg;
+identifier tcg_func =~ "^tcg_gen_";
+position shr_p, and_p;
+@@
+(
+tcg_gen_shri_i32@shr_p
+|
+tcg_gen_shri_i64@shr_p
+|
+tcg_gen_shri_tl@shr_p
+)(ret, arg, ofs);
+<...
+tcg_func(tcg_arg, ...);
+...>
+(
+tcg_gen_andi_i32@and_p
+|
+tcg_gen_andi_i64@and_p
+|
+tcg_gen_andi_tl@and_p
+)(ret, ret, msk);
+
+@script:python verify_len depends on match@
+ret_s << match.ret;
+msk_s << match.msk;
+shr_p << match.shr_p;
+tcg_func << match.tcg_func;
+tcg_arg << match.tcg_arg;
+extract_len;
+@@
+is_optimizable = False
+debug("candidate at %s:%s" % (shr_p[0].file, shr_p[0].line))
+if tcg_arg == ret_s:
+debug("  %s() modifies argument '%s'" % (tcg_func, ret_s))
+else:
+debug("candidate at %s:%s" % (shr_p[0].file, shr_p[0].line))
+try: # only eval integer, no #define like 'SR_M' (cpp did this, else some headers are missing).
+msk_v = long(msk_s.strip("UL"), 0)
+msk_b = low_bits_count(msk_v)
+if msk_b == 0:
+debug("  value: 0x%x low_bits: %d" % (msk_v, msk_b))
+else:
+debug("  value: 0x%x low_bits: %d [Mersenne prime: 0x%x]" % (msk_v, msk_b, Mn(msk_b)))
+is_optimizable = Mn(msk_b) == msk_v # check low_bits
+coccinelle.extract_len = "%d" % msk_b
+debug("  candidate %s optimizable" % ("IS" if is_optimizable else "is NOT"))
+except:
+debug("  ERROR (check included headers?)")
+cocci.include_match(is_optimizable)
+debug()
+
+@replacement depends on verify_len@
+metavariable match.ret, match.arg;
+constant match.ofs, match.msk;
+position match.shr_p, match.and_p;
+identifier verify_len.extract_len;
+@@
+(
+-tcg_gen_shri_i32@shr_p(ret, arg, ofs);
++tcg_gen_extract_i32(ret, arg, ofs, extract_len);
+...
+-tcg_gen_andi_i32@and_p(ret, ret, msk);
+|
+-tcg_gen_shri_i64@shr_p(ret, arg, ofs);
++tcg_gen_extract_i64(ret, arg, ofs, extract_len);
+...
+-tcg_gen_andi_i64@and_p(ret, ret, msk);
+|
+-tcg_gen_shri_tl@shr_p(ret, arg, ofs);
++tcg_gen_extract_tl(ret, arg, ofs, extract_len);
+...
+-tcg_gen_andi_tl@and_p(ret, ret, msk);
+)
-- 
2.11.0
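
The check in the verify_len rule decides whether an AND mask can be turned into an extract length: it counts the consecutive set bits from bit 0 and accepts the mask only if it equals (1 << count) - 1. A standalone C rendering of that logic might look like the sketch below. `low_bits_count` mirrors the Python helper of the same name, while `low_mask` (the counterpart of `Mn()`) and `mask_to_extract_len` are names invented for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Count consecutive set bits starting at bit 0. */
static unsigned low_bits_count(uint64_t value)
{
    unsigned n = 0;
    while (n < 64 && (value & (UINT64_C(1) << n))) {
        n++;
    }
    return n;
}

/* (1 << order) - 1, i.e. a value with the 'order' lowest bits set. */
static uint64_t low_mask(unsigned order)
{
    return order >= 64 ? UINT64_MAX : (UINT64_C(1) << order) - 1;
}

/* Store the extract length in *len and return true only when 'mask' is a
 * non-empty contiguous run of low bits; otherwise leave *len untouched. */
static bool mask_to_extract_len(uint64_t mask, unsigned *len)
{
    unsigned n = low_bits_count(mask);

    if (n == 0 || low_mask(n) != mask) {
        return false;
    }
    *len = n;
    return true;
}
```

Masks such as 0x7FF and 0x7FFF map to lengths 11 and 15, matching the vsx-impl.inc.c hunks, while a mask like 0xF0 is rejected because its low bits are clear.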

[Qemu-devel] [PATCH v4 5/6] target/ppc: optimize various functions using extract op

2017-05-12 Thread Philippe Mathieu-Daudé

Patch created mechanically using Coccinelle script via:

$ spatch --macro-file scripts/cocci-macro-file.h --in-place \
--sp-file scripts/coccinelle/tcg_gen_extract.cocci --dir target

Signed-off-by: Philippe Mathieu-Daudé 
---
 target/ppc/translate.c  | 21 +++--
 target/ppc/translate/vsx-impl.inc.c | 24 
 2 files changed, 15 insertions(+), 30 deletions(-)

diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index f40b5a1abf..6521365bfa 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -868,8 +868,7 @@ static inline void gen_op_arith_add(DisasContext *ctx, TCGv ret, TCGv arg1,
 }
 tcg_gen_xor_tl(cpu_ca, t0, t1);/* bits changed w/ carry */
 tcg_temp_free(t1);
-tcg_gen_shri_tl(cpu_ca, cpu_ca, 32);   /* extract bit 32 */
-tcg_gen_andi_tl(cpu_ca, cpu_ca, 1);
+tcg_gen_extract_tl(cpu_ca, cpu_ca, 32, 1);
 if (is_isa300(ctx)) {
 tcg_gen_mov_tl(cpu_ca32, cpu_ca);
 }
@@ -1399,8 +1398,7 @@ static inline void gen_op_arith_subf(DisasContext *ctx, TCGv ret, TCGv arg1,
 tcg_temp_free(inv1);
 tcg_gen_xor_tl(cpu_ca, t0, t1); /* bits changes w/ carry */
 tcg_temp_free(t1);
-tcg_gen_shri_tl(cpu_ca, cpu_ca, 32);/* extract bit 32 */
-tcg_gen_andi_tl(cpu_ca, cpu_ca, 1);
+tcg_gen_extract_tl(cpu_ca, cpu_ca, 32, 1);
 if (is_isa300(ctx)) {
 tcg_gen_mov_tl(cpu_ca32, cpu_ca);
 }
@@ -4310,8 +4308,7 @@ static void gen_mfsrin(DisasContext *ctx)
 
 CHK_SV;
 t0 = tcg_temp_new();
-tcg_gen_shri_tl(t0, cpu_gpr[rB(ctx->opcode)], 28);
-tcg_gen_andi_tl(t0, t0, 0xF);
+tcg_gen_extract_tl(t0, cpu_gpr[rB(ctx->opcode)], 28, 4);
 gen_helper_load_sr(cpu_gpr[rD(ctx->opcode)], cpu_env, t0);
 tcg_temp_free(t0);
 #endif /* defined(CONFIG_USER_ONLY) */
@@ -4342,8 +4339,7 @@ static void gen_mtsrin(DisasContext *ctx)
 CHK_SV;
 
 t0 = tcg_temp_new();
-tcg_gen_shri_tl(t0, cpu_gpr[rB(ctx->opcode)], 28);
-tcg_gen_andi_tl(t0, t0, 0xF);
+tcg_gen_extract_tl(t0, cpu_gpr[rB(ctx->opcode)], 28, 4);
 gen_helper_store_sr(cpu_env, t0, cpu_gpr[rD(ctx->opcode)]);
 tcg_temp_free(t0);
 #endif /* defined(CONFIG_USER_ONLY) */
@@ -4377,8 +4373,7 @@ static void gen_mfsrin_64b(DisasContext *ctx)
 
 CHK_SV;
 t0 = tcg_temp_new();
-tcg_gen_shri_tl(t0, cpu_gpr[rB(ctx->opcode)], 28);
-tcg_gen_andi_tl(t0, t0, 0xF);
+tcg_gen_extract_tl(t0, cpu_gpr[rB(ctx->opcode)], 28, 4);
 gen_helper_load_sr(cpu_gpr[rD(ctx->opcode)], cpu_env, t0);
 tcg_temp_free(t0);
 #endif /* defined(CONFIG_USER_ONLY) */
@@ -4409,8 +4404,7 @@ static void gen_mtsrin_64b(DisasContext *ctx)
 
 CHK_SV;
 t0 = tcg_temp_new();
-tcg_gen_shri_tl(t0, cpu_gpr[rB(ctx->opcode)], 28);
-tcg_gen_andi_tl(t0, t0, 0xF);
+tcg_gen_extract_tl(t0, cpu_gpr[rB(ctx->opcode)], 28, 4);
 gen_helper_store_sr(cpu_env, t0, cpu_gpr[rS(ctx->opcode)]);
 tcg_temp_free(t0);
 #endif /* defined(CONFIG_USER_ONLY) */
@@ -5383,8 +5377,7 @@ static void gen_mfsri(DisasContext *ctx)
 CHK_SV;
 t0 = tcg_temp_new();
 gen_addr_reg_index(ctx, t0);
-tcg_gen_shri_tl(t0, t0, 28);
-tcg_gen_andi_tl(t0, t0, 0xF);
+tcg_gen_extract_tl(t0, t0, 28, 4);
 gen_helper_load_sr(cpu_gpr[rd], cpu_env, t0);
 tcg_temp_free(t0);
 if (ra != 0 && ra != rd)
diff --git a/target/ppc/translate/vsx-impl.inc.c b/target/ppc/translate/vsx-impl.inc.c
index 7f12908029..85ed135d44 100644
--- a/target/ppc/translate/vsx-impl.inc.c
+++ b/target/ppc/translate/vsx-impl.inc.c
@@ -1248,8 +1248,7 @@ static void gen_xsxexpdp(DisasContext *ctx)
 gen_exception(ctx, POWERPC_EXCP_VSXU);
 return;
 }
-tcg_gen_shri_i64(rt, cpu_vsrh(xB(ctx->opcode)), 52);
-tcg_gen_andi_i64(rt, rt, 0x7FF);
+tcg_gen_extract_i64(rt, cpu_vsrh(xB(ctx->opcode)), 52, 11);
 }
 
 static void gen_xsxexpqp(DisasContext *ctx)
@@ -1262,8 +1261,7 @@ static void gen_xsxexpqp(DisasContext *ctx)
 gen_exception(ctx, POWERPC_EXCP_VSXU);
 return;
 }
-tcg_gen_shri_i64(xth, xbh, 48);
-tcg_gen_andi_i64(xth, xth, 0x7FFF);
+tcg_gen_extract_i64(xth, xbh, 48, 15);
 tcg_gen_movi_i64(xtl, 0);
 }
 
@@ -1323,8 +1321,7 @@ static void gen_xsxsigdp(DisasContext *ctx)
 zr = tcg_const_i64(0);
 nan = tcg_const_i64(2047);
 
-tcg_gen_shri_i64(exp, cpu_vsrh(xB(ctx->opcode)), 52);
-tcg_gen_andi_i64(exp, exp, 0x7FF);
+tcg_gen_extract_i64(exp, cpu_vsrh(xB(ctx->opcode)), 52, 11);
 tcg_gen_movi_i64(t0, 0x0010000000000000);
 tcg_gen_movcond_i64(TCG_COND_EQ, t0, exp, zr, zr, t0);
 tcg_gen_movcond_i64(TCG_COND_EQ, t0, exp, nan, zr, t0);
@@ -1352,8 +1349,7 @@ static void gen_xsxsigqp(DisasContext *ctx)
 zr = tcg_const_i64(0);
 nan = tcg_const_i64(32767);
 
-tcg_gen_shri_i64(exp,

[Qemu-devel] [PATCH 3/3] target/xtensa: support output to chardev console

2017-05-12 Thread Max Filippov

In semihosting mode QEMU allows guest to read and write host file
descriptors directly, including descriptors 0..2, a.k.a. stdin, stdout
and stderr. Sometimes it's desirable to have semihosting console
controlled by -serial option, e.g. to connect it to network.

Add semihosting console to xtensa-semi.c, open it in the 'sim' machine
in the presence of -serial option and direct stdout and stderr to it
when it's present.

Signed-off-by: Max Filippov 
---
 hw/xtensa/sim.c |  4 +++
 target/xtensa/cpu.h |  1 +
 target/xtensa/xtensa-semi.c | 66 +++--
 3 files changed, 57 insertions(+), 14 deletions(-)

diff --git a/hw/xtensa/sim.c b/hw/xtensa/sim.c
index b27e28d..5521e91 100644
--- a/hw/xtensa/sim.c
+++ b/hw/xtensa/sim.c
@@ -114,6 +114,9 @@ static void xtensa_sim_init(MachineState *machine)
 xtensa_create_memory_regions(&sysram, "xtensa.sysram");
 }
 
+if (serial_hds[0]) {
+xtensa_sim_open_console(serial_hds[0]);
+}
 if (kernel_filename) {
 uint64_t elf_entry;
 uint64_t elf_lowaddr;
@@ -136,6 +139,7 @@ static void xtensa_sim_machine_init(MachineClass *mc)
 mc->is_default = true;
 mc->init = xtensa_sim_init;
 mc->max_cpus = 4;
+mc->no_serial = 1;
 }
 
 DEFINE_MACHINE("sim", xtensa_sim_machine_init)
diff --git a/target/xtensa/cpu.h b/target/xtensa/cpu.h
index ecca17d..ee29fb1 100644
--- a/target/xtensa/cpu.h
+++ b/target/xtensa/cpu.h
@@ -483,6 +483,7 @@ void xtensa_translate_init(void);
 void xtensa_breakpoint_handler(CPUState *cs);
 void xtensa_finalize_config(XtensaConfig *config);
 void xtensa_register_core(XtensaConfigList *node);
+void xtensa_sim_open_console(Chardev *chr);
 void check_interrupts(CPUXtensaState *s);
 void xtensa_irq_init(CPUXtensaState *env);
 void *xtensa_get_extint(CPUXtensaState *env, unsigned extint);
diff --git a/target/xtensa/xtensa-semi.c b/target/xtensa/xtensa-semi.c
index ffcaf8d..01c622e 100644
--- a/target/xtensa/xtensa-semi.c
+++ b/target/xtensa/xtensa-semi.c
@@ -29,7 +29,12 @@
 #include "cpu.h"
 #include "exec/helper-proto.h"
 #include "exec/semihost.h"
+#include "qapi/error.h"
 #include "qemu/log.h"
+#include "sysemu/char.h"
+#include "sysemu/sysemu.h"
+
+static CharBackend *xtensa_sim_console;
 
 enum {
 TARGET_SYS_exit = 1,
@@ -148,6 +153,15 @@ static uint32_t errno_h2g(int host_errno)
 }
 }
 
+void xtensa_sim_open_console(Chardev *chr)
+{
+static CharBackend console;
+
+qemu_chr_fe_init(&console, chr, &error_abort);
+qemu_chr_fe_set_handlers(&console, NULL, NULL, NULL, NULL, NULL, true);
+xtensa_sim_console = &console;
+}
+
 void HELPER(simcall)(CPUXtensaState *env)
 {
 CPUState *cs = CPU(xtensa_env_get_cpu(env));
@@ -181,10 +195,25 @@ void HELPER(simcall)(CPUXtensaState *env)
 if (buf) {
 vaddr += io_sz;
 len -= io_sz;
-io_done = is_write ?
-write(fd, buf, io_sz) :
-read(fd, buf, io_sz);
-regs[3] = errno_h2g(errno);
+if (fd < 3 && xtensa_sim_console) {
+if (is_write && (fd == 1 || fd == 2)) {
+io_done = qemu_chr_fe_write_all(xtensa_sim_console,
+buf, io_sz);
+regs[3] = errno_h2g(errno);
+} else {
+qemu_log_mask(LOG_GUEST_ERROR,
+  "%s fd %d is not supported with chardev console\n",
+  is_write ?
+  "writing to" : "reading from", fd);
+io_done = -1;
+regs[3] = TARGET_EBADF;
+}
+} else {
+io_done = is_write ?
+write(fd, buf, io_sz) :
+read(fd, buf, io_sz);
+regs[3] = errno_h2g(errno);
+}
 if (io_done == -1) {
 error = true;
 io_done = 0;
@@ -256,10 +285,6 @@ void HELPER(simcall)(CPUXtensaState *env)
 uint32_t target_tvv[2];
 
 struct timeval tv = {0};
-fd_set fdset;
-
-FD_ZERO(&fdset);
-FD_SET(fd, &fdset);
 
 if (target_tv) {
 cpu_memory_rw_debug(cs, target_tv,
@@ -267,12 +292,25 @@ void HELPER(simcall)(CPUXtensaState *env)
 tv.tv_sec = (int32_t)tswap32(target_tvv[0]);
 tv.tv_usec = (int32_t)tswap32(target_tvv[1]);
 }
-regs[2] = select(fd + 1,
-rq == SELECT_ONE_READ   ? &fdset : NULL,
-rq == SELECT_ONE_WRITE  ? &fdset : NULL,
-rq == SELECT_ONE_EXCEPT ? &fdset : NULL,
-target_tv ? &tv : NULL);
-

[Qemu-devel] [PATCH 2/3] target/xtensa: fix return value of read/write simcalls

2017-05-12 Thread Max Filippov

Return value of read/write simcalls is not calculated correctly in case
of operations crossing page boundary and in case of short reads/writes.
Read and write simcalls should return the size of data actually
read/written or -1 in case of error.

Cc: qemu-sta...@nongnu.org
Signed-off-by: Max Filippov 
---
 target/xtensa/xtensa-semi.c | 25 -
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/target/xtensa/xtensa-semi.c b/target/xtensa/xtensa-semi.c
index 98ae28c..ffcaf8d 100644
--- a/target/xtensa/xtensa-semi.c
+++ b/target/xtensa/xtensa-semi.c
@@ -166,6 +166,7 @@ void HELPER(simcall)(CPUXtensaState *env)
 uint32_t fd = regs[3];
 uint32_t vaddr = regs[4];
 uint32_t len = regs[5];
+uint32_t len_done = 0;
 
 while (len > 0) {
 hwaddr paddr = cpu_get_phys_page_debug(cs, vaddr);
@@ -174,24 +175,38 @@ void HELPER(simcall)(CPUXtensaState *env)
 uint32_t io_sz = page_left < len ? page_left : len;
 hwaddr sz = io_sz;
 void *buf = cpu_physical_memory_map(paddr, &sz, !is_write);
+uint32_t io_done;
+bool error = false;
 
 if (buf) {
 vaddr += io_sz;
 len -= io_sz;
-regs[2] = is_write ?
+io_done = is_write ?
 write(fd, buf, io_sz) :
 read(fd, buf, io_sz);
 regs[3] = errno_h2g(errno);
-cpu_physical_memory_unmap(buf, sz, !is_write, sz);
-if (regs[2] == -1) {
-break;
+if (io_done == -1) {
+error = true;
+io_done = 0;
 }
+cpu_physical_memory_unmap(buf, sz, !is_write, io_done);
 } else {
-regs[2] = -1;
+error = true;
 regs[3] = TARGET_EINVAL;
 break;
 }
+if (error) {
+if (!len_done) {
+len_done = -1;
+}
+break;
+}
+len_done += io_done;
+if (io_done < io_sz) {
+break;
+}
 }
+regs[2] = len_done;
 }
 break;
 
-- 
2.1.4
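
The return-value rule this patch implements (total bytes transferred, or -1 only when an error occurs before anything was transferred, stopping early on a short transfer) can be sketched outside QEMU as follows. This is an illustrative model only: `chunked_io`, `chunk_io_fn`, and the `scripted_io` test double are hypothetical names, and a fixed `chunk_max` stands in for the per-page split done in the helper.

```c
#include <stddef.h>

/* Hypothetical per-chunk I/O callback: returns the number of bytes
 * transferred (possibly fewer than requested), or -1 on error. */
typedef long (*chunk_io_fn)(void *opaque, size_t chunk_len);

/* Transfer 'len' bytes in chunks of at most 'chunk_max' bytes
 * (chunk_max must be > 0). Returns the number of bytes actually
 * transferred, or -1 if the very first chunk fails. */
static long chunked_io(chunk_io_fn io, void *opaque, size_t len,
                       size_t chunk_max)
{
    size_t done = 0;

    while (len > 0) {
        size_t want = len < chunk_max ? len : chunk_max;
        long got = io(opaque, want);

        if (got < 0) {
            /* Report an error only if nothing was transferred yet. */
            return done ? (long)done : -1;
        }
        done += (size_t)got;
        len -= (size_t)got;
        if ((size_t)got < want) {
            break; /* short read/write: stop and report what we have */
        }
    }
    return (long)done;
}

/* Test double: replays a scripted list of per-chunk results. */
struct scripted_io {
    const long *results;
    size_t next;
};

static long scripted_io_step(void *opaque, size_t chunk_len)
{
    struct scripted_io *s = opaque;
    long r = s->results[s->next++];

    return r > (long)chunk_len ? (long)chunk_len : r;
}
```

A request that fails on its second chunk therefore reports the size of the first chunk rather than -1, which is the behaviour the `len_done` bookkeeping in the diff provides.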

[Qemu-devel] [PATCH 0/3] target/xtensa semihosting fixes

2017-05-12 Thread Max Filippov

Hello,

this series fixes two issues in xtensa semihosting read/write calls:
incorrect direction flags used to map physical memory and incorrect
return value for requests crossing page boundary, and allows using
QEMU chardev for stdout and stderr output in semihosting mode.

Max Filippov (3):
  target/xtensa: fix mapping direction in read/write simcalls
  target/xtensa: fix return value of read/write simcalls
  target/xtensa: support output to chardev console

 hw/xtensa/sim.c |  4 ++
 target/xtensa/cpu.h |  1 +
 target/xtensa/xtensa-semi.c | 91 +++--
 3 files changed, 77 insertions(+), 19 deletions(-)

-- 
2.1.4

[Qemu-devel] [PATCH 1/3] target/xtensa: fix mapping direction in read/write simcalls

2017-05-12 Thread Max Filippov

Read and write simcalls map physical memory to access I/O buffers, but
the 'read' simcall needs to map it for writing and the 'write' simcall needs
to map it for reading, i.e. the opposite of what they do now. Fix that.

Cc: qemu-sta...@nongnu.org
Signed-off-by: Max Filippov 
---
 target/xtensa/xtensa-semi.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/xtensa/xtensa-semi.c b/target/xtensa/xtensa-semi.c
index a888a9d..98ae28c 100644
--- a/target/xtensa/xtensa-semi.c
+++ b/target/xtensa/xtensa-semi.c
@@ -173,7 +173,7 @@ void HELPER(simcall)(CPUXtensaState *env)
 TARGET_PAGE_SIZE - (vaddr & (TARGET_PAGE_SIZE - 1));
 uint32_t io_sz = page_left < len ? page_left : len;
 hwaddr sz = io_sz;
-void *buf = cpu_physical_memory_map(paddr, &sz, is_write);
+void *buf = cpu_physical_memory_map(paddr, &sz, !is_write);
 
 if (buf) {
 vaddr += io_sz;
@@ -182,7 +182,7 @@ void HELPER(simcall)(CPUXtensaState *env)
 write(fd, buf, io_sz) :
 read(fd, buf, io_sz);
 regs[3] = errno_h2g(errno);
-cpu_physical_memory_unmap(buf, sz, is_write, sz);
+cpu_physical_memory_unmap(buf, sz, !is_write, sz);
 if (regs[2] == -1) {
 break;
 }
-- 
2.1.4

Re: [Qemu-devel] [PATCH] target/i386: enable A20 automatically in system management mode

2017-05-12 Thread Xu, Anthony

wrote:
> > On 12/05/2017 20:55, Xu, Anthony wrote:
> > > If that's the case,  QEMU/TCG should work with SeaBios even with ignoring A20.
> > >
> > > During SeaBios boot, there are >350 port 92 access, if we don't need to handle A20,
> > > we can make A20 configurable in Seabios, It may reduce SeaBios boot time.
> >
> > Yes, that's a good idea.
> 
> SeaBIOS defaults to enabling A20 and it's a rare beast that disables
> it.  One could change x86.h:set_a20 and romlayout.S:transition32 to
> only issue the outb() if the inb() indicates a change is needed.  That
> would likely eliminate half the accesses.

The 350 port 92 accesses are write operations only.
Including the inb(), it would be 700, and each access actually changes the
state. To be precise, there are about 175 switches from 32-bit to 16-bit mode
and then back to 32-bit.
call16 is called 175 times during a SeaBIOS boot without any option ROM;
it would be more if option ROMs are included.

I think A20 is disabled by default in SeaBios.
Call16Data.a20 is initialized to 0,
call16_override may set Call16Data.a20 to 1,
call16 called before call16_override would disable and enable A20.

BTW, A20 is enabled in QEMU by default.

> 
> I'd be surprised if it would impact the overall boot time though.
> SeaBIOS only touches the port on a cpu mode switch and I would have
> thought that was heavier than an IO port access.  Maybe that is skewed
> on KVM though.

Maybe not, but with more optimizations of this kind we may see an impact.
A20 handling is already configurable in SeaBIOS via CONFIG_DISABLE_A20,
so the change is very small. There is no PORT_A20 access after the change.

-Anthony

diff --git a/src/x86.h b/src/x86.h
index a770e6f..8efb94a 100644
--- a/src/x86.h
+++ b/src/x86.h
@@ -21,6 +21,7 @@
 #ifndef __ASSEMBLY__

 #include "types.h" // u32
+#include "../out/autoconf.h"

 static inline void irq_disable(void)
 {
@@ -254,13 +255,19 @@ static inline void lgdt(struct descloc_s *desc) {
 }

 static inline u8 get_a20(void) {
-return (inb(PORT_A20) & A20_ENABLE_BIT) != 0;
+if (CONFIG_DISABLE_A20) {
+return (inb(PORT_A20) & A20_ENABLE_BIT) != 0;
+}
+return 1;
 }

 static inline u8 set_a20(u8 cond) {
-u8 val = inb(PORT_A20);
-outb((val & ~A20_ENABLE_BIT) | (cond ? A20_ENABLE_BIT : 0), PORT_A20);
-return (val & A20_ENABLE_BIT) != 0;
+if (CONFIG_DISABLE_A20) {
+u8 val = inb(PORT_A20);
+outb((val & ~A20_ENABLE_BIT) | (cond ? A20_ENABLE_BIT : 0), PORT_A20);
+return (val & A20_ENABLE_BIT) != 0;
+}
+return 1;
 }
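
For comparison, the alternative suggested in the quoted reply (issue the outb() only when the inb() shows that a change is needed) could look roughly like the sketch below. It runs against a simulated port so that the access counts can be observed; `fake_port`, `port_read`, `port_write`, and `set_a20_if_changed` are hypothetical names, and the value 0x02 for `A20_ENABLE_BIT` is an assumption of this sketch, not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define A20_ENABLE_BIT 0x02 /* assumed value for this sketch */

/* Simulated port 0x92 with access counters, standing in for inb()/outb(). */
struct fake_port {
    uint8_t value;
    unsigned reads;
    unsigned writes;
};

static uint8_t port_read(struct fake_port *p)
{
    p->reads++;
    return p->value;
}

static void port_write(struct fake_port *p, uint8_t v)
{
    p->writes++;
    p->value = v;
}

/* Set A20 to 'enable', writing the port only when the state differs.
 * Returns the previous A20 state, like set_a20() in the diff. */
static bool set_a20_if_changed(struct fake_port *p, bool enable)
{
    uint8_t val = port_read(p);
    bool was_enabled = (val & A20_ENABLE_BIT) != 0;

    if (was_enabled != enable) {
        port_write(p, (uint8_t)((val & ~A20_ENABLE_BIT) |
                                (enable ? A20_ENABLE_BIT : 0)));
    }
    return was_enabled;
}
```

Note that this variant still performs one read per call; it only removes redundant writes, whereas the configuration switch in the diff avoids the port entirely.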

Re: [Qemu-devel] [PATCH v9 01/13] qcow2: Unallocate unmapped zero clusters if no backing file

2017-05-12 Thread John Snow



On 05/12/2017 12:06 PM, Max Reitz wrote:
> On 2017-05-11 16:56, Eric Blake wrote:
>> [revisiting this older patch version, even though the final version in
>> today's pull request changed somewhat from this approach]
>>
>> On 04/12/2017 04:49 AM, Kevin Wolf wrote:
>>> Am 11.04.2017 um 03:17 hat Eric Blake geschrieben:
>>>> 'qemu-img map' already coalesces information about unallocated
>>>> clusters (those with status 0) and pure zero clusters (those
>>>> with status BDRV_BLOCK_ZERO and no offset).  Furthermore, all
>>>> qcow2 images with no backing file already report all unallocated
>>>> clusters (in the preallocation sense of clusters with no offset)
>>>> as BDRV_BLOCK_ZERO, regardless of whether the QCOW_OFLAG_ZERO was
>>>> set in that L2 entry (QCOW_OFLAG_ZERO also implies a return of
>>>> BDRV_BLOCK_ALLOCATED, but we intentionally do not expose that bit
>>>> to external users), thanks to generic block layer code in
>>>> bdrv_co_get_block_status().
>>>>
>>>> So, for an image with no backing file, having bdrv_pwrite_zeroes
>>>> mark clusters as unallocated (defer to backing file) rather than
>>>> reads-as-zero (regardless of backing file) makes no difference
>>>> to normal behavior, but may potentially allow for fewer writes to
>>>> the L2 table of a freshly-created image where the L2 table is
>>>> initially written to all-zeroes (although I don't actually know
>>>> if we skip an L2 update and flush when re-writing the same
>>>> contents as already present).
>>>
>>> I don't get this. Allocating a cluster always involves an L2 update, no
>>> matter whether it was previously unallocated or a zero cluster.
>>
>> On IRC, John, Kevin, and I were discussing the current situation with
>> libvirt NBD storage migration.  When libvirt creates a file on the
>> destination (rather than the user pre-creating it), it currently
>> defaults to 0.10 [v2] images, even if the source was a 1.1 image [v3]
>> (arguably something that could be improved in libvirt, but
>> https://bugzilla.redhat.com/show_bug.cgi?id=1371749 was closed as not a
>> bug).
>>
>> Therefore, the use case of doing a mirror job to a v2 image, and having
>> that image become thick even though the source was thin, is happening
>> more than we'd like
>> (https://bugzilla.redhat.com/show_bug.cgi?id=1371749).  While Kevin had
>> a point that in the common case we ALWAYS want to turn an unallocated
>> cluster into a zero cluster (so that we don't have to audit whether all
>> callers are properly accounting for the case where a backing image is
>> added later), our conversation on IRC today conceded that we may want to
>> introduce a new BDRV_REQ_READ_ZERO_NOP (or some such name) that
>> particular callers can use to request that if a cluster already reads as
>> zeroes, the write zero request does NOT have to change it.  Normal guest
>> operations would not use the flag, but mirroring IS a case that would
>> use the flag, so that we can end up with thinner mirrors even to 0.10
>> images.
>>
>> The other consideration is that on 0.10 images, even if we have to
>> allocate, right now our allocation is done by way of failing with
>> -ENOTSUP and falling back to the normal pwrite() of explicit zeroes.  It
>> may be worth teaching the qcow2 layer to explicitly handle write zeroes,
>> even on 0.10 images, by allocating a cluster (as needed) but then
>> telling bs->file to write zeroes (punching a hole as appropriate) so
>> that the file is still thin.  In fact, it matches the fact that we
>> already have code that probes whether a qcow2 cluster that reports
>> BDRV_BLOCK_DATA|BDRV_BLOCK_OFFSET_VALID then probes bs->file to see if
>> there is a hole there, at which point it can add BDRV_BLOCK_ZERO to the
>> bdrv_get_block_status.
>>
>> I don't know which one of us will tackle patches along these lines, but
>> wanted to at least capture the gist of the IRC conversation in the
>> archives for a bit more permanence.
> 
> Just short ideas:
> 
> (1) I do consider it a bug if v2 images are created. The BZ wasn't
> closed for a very good reason, but because nobody answered this question:
> 
>> is this really a problem?
> 
> And apparently it is.
> 
> (2) There is a way of creating zero clusters on v2 images, but we've
> never done it (yet): You create a single data cluster containing zeroes
> and then you just point to it whenever you need a zero cluster (until
> its refcount overflows, at which point you allocate another cluster).
> Maybe that helps.
> 
> I'd be really careful with optimizations for v2 images. You have already
> proven to me that my fears of too much complexity are out of proportions
> sometimes, but still. v2 upstream is old legacy and should be treated as
> such, at least IMHO.
> 
> Max
> 
> 

I agree that V2 changes should be limited in nature, but there are
less-fiddly ways to have thin 0.10 images on sparse filesystems. This
way feels a little too clever and most likely too intrusive for an old
format.

I'd also think that it'd probably confuse

Re: [Qemu-devel] multiple -append?

2017-05-12 Thread Laszlo Ersek

On 05/12/17 16:20, Rob Landley wrote:
> When I feed a second -append to qemu-system-i386 they don't get
> concatenated, the second replaces the first. Why is it called "append" then?

This behavior dates back to commit a20dd508aa38 ("simplified invocation
- added automatic IDE disk geometry guessing to reuse old disk images
directly", 2003-09-30), which is when "-append" was added.

In said commit, "kernel_cmdline" was *appended* to "params->commandline"
with pstrcat(). The rest is history.
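
To make the two behaviours concrete, here is a small sketch. It is a toy
model and not QEMU's option parser: last_append() shows a "last option
wins" scan, which is what the question describes, and bounded_append() is a
hypothetical helper written in the spirit of pstrcat() (append without
overrunning the buffer), not a copy of it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy option scan: each "-append X" overwrites the previous value,
 * so only the last one survives. */
static const char *last_append(int argc, const char *const *argv)
{
    const char *cmdline = "";
    for (int i = 1; i + 1 < argc; i++) {
        if (strcmp(argv[i], "-append") == 0) {
            cmdline = argv[++i];
        }
    }
    return cmdline;
}

/* Bounded append: never writes past buf_size and keeps buf NUL-terminated.
 * Assumes buf already holds a NUL-terminated string within buf_size. */
static char *bounded_append(char *buf, size_t buf_size, const char *s)
{
    assert(buf_size > 0);
    size_t len = strlen(buf);
    if (len + 1 < buf_size) {
        strncat(buf, s, buf_size - len - 1);
    }
    return buf;
}
```

So the option value is appended to an internal command line, while repeated
options replace each other.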

TL;DR: "just because" :)

Thanks
Laszlo

[Qemu-devel] QEMU seg-fault with intermediate image streaming -- bdrv_reopen() in stream_start()

2017-05-12 Thread Kashyap Chamarthy

Reproducer
----------

[Disk image chain: disk1.qcow2 <- b.qcow2 <- c.qcow2]

$ qemu-system-x86_64 -display none -nodefconfig -nodefaults \
-m 512 -device virtio-scsi-pci,id=scsi \
-device virtio-serial-pci  \
-drive driver=qcow2,file.driver=file,file.filename=./disk1.qcow2,id=virtio0 \
-monitor stdio -qmp unix:./qmp-sock,server,nowait

Create two overlays (I used `qmp-shell`):

(QEMU) blockdev-snapshot-sync device=virtio0 snapshot-file=b.qcow2
(QEMU) blockdev-snapshot-sync device=virtio0 snapshot-file=c.qcow2


[Figure out the (format) 'node-name' of 'b.qcow2', from the output of
QMP `query-named-block-nodes` so that it can be supplied to the 'device'
parameter]

Try to perform intermediate streaming (pull clusters from 'disk1.qcow2'
into 'b.qcow2'):

(QEMU) block-stream device=#block832 base=disk1.qcow2


Result
------

QEMU crashes with SIGSEGV:

[...]
Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
0x5593d8f7 in stream_start (job_id=0x0, bs=0x58646e20, 
base=0x568548c0, backing_file_str=0x5863d710 "disk1.qcow2", speed=0, 
on_error=BLOCKDEV_ON_ERROR_REPORT, 
errp=0x7fffbcf8) at /home/kashyapc/tinker-space/qemu/block/stream.c:283
283 bdrv_reopen(bs, s->bs_flags, NULL);
[...]

* * *

NOTE: Of course, streaming to active layer works.


Stack traces
------------

I've attached the stack traces from GDB to this email.


Version
-------

v2.9.0-304-gca7305b


`git blame` seems to point to this commit:

commit a170a91fd3eab6155da39e740381867e80bcc93e
[...]
stream: Use real permissions in streaming block job

The correct permissions are relatively obvious here (and explained in
code comments). For intermediate streaming, we need to reopen the top
node read-write before creating the job now because the permissions
system catches attempts to get the BLK_PERM_WRITE_UNCHANGED permission
on a read-only node.


-- 
/kashyap
(gdb) thread apply all bt full

Thread 4 (Thread 0x7fffc4c8e700 (LWP 730)):
#0  0x7fffdccb4bd0 in pthread_cond_wait@@GLIBC_2.3.2 () at 
/lib64/libpthread.so.0
#1  0x55c83e8f in qemu_cond_wait (cond=0x568b9980, 
mutex=0x56323fc0 ) at 
/home/kashyapc/tinker-space/qemu/util/qemu-thread-posix.c:133
err = 21845
__func__ = "qemu_cond_wait"
#2  0x557a74c0 in qemu_tcg_wait_io_event (cpu=0x56886dc0) at 
/home/kashyapc/tinker-space/qemu/cpus.c:1074
#3  0x557a7d10 in qemu_tcg_rr_cpu_thread_fn (arg=0x56886dc0) at 
/home/kashyapc/tinker-space/qemu/cpus.c:1385
cpu = 0x0
#4  0x7fffdccaf5ca in start_thread () at /lib64/libpthread.so.0
#5  0x7fffdc9e90ed in clone () at /lib64/libc.so.6

Thread 2 (Thread 0x7fffd0b01700 (LWP 728)):
#0  0x7fffdc9e3239 in syscall () at /lib64/libc.so.6
#1  0x55c8421d in qemu_futex_wait (f=0x56757184 
, val=4294967295) at 
/home/kashyapc/tinker-space/qemu/include/qemu/futex.h:26
#2  0x55c84320 in qemu_event_wait (ev=0x56757184 
) at 
/home/kashyapc/tinker-space/qemu/util/qemu-thread-posix.c:399
value = 1
#3  0x55c9b7fd in call_rcu_thread (opaque=0x0) at 
/home/kashyapc/tinker-space/qemu/util/rcu.c:249
tries = 0
n = 0
node = 0x7fff941f9c10
#4  0x7fffdccaf5ca in start_thread () at /lib64/libpthread.so.0
#5  0x7fffdc9e90ed in clone () at /lib64/libc.so.6

Thread 1 (Thread 0x77ee0f80 (LWP 724)):
#0  0x5593d8f7 in stream_start (job_id=0x0, bs=0x58646e20, 
base=0x568548c0, backing_file_str=0x5863d710 "disk1.qcow2", speed=0, 
on_error=BLOCKDEV_ON_ERROR_REPORT, errp=0x
7fffbcf8) at /home/kashyapc/tinker-space/qemu/block/stream.c:283
s = 0x0
iter = 0xe5685e050
orig_bs_flags = 8192
#1  0x558f8acf in qmp_block_stream (has_job_id=false, job_id=0x0, 
device=0x586282f0 "#block830", has_base=true, base=0x5863d710 
"disk1.qcow2", has_base_node=false, base_node=
0x0, has_backing_file=false, backing_file=0x0, has_speed=false, speed=0, 
has_on_error=false, on_error=BLOCKDEV_ON_ERROR_REPORT, errp=0x7fffbda0)
at /home/kashyapc/tinker-space/qemu/blockdev.c:3033
bs = 0x58646e20
iter = 0x568548c0
base_bs = 0x568548c0
aio_context = 0x5683cb40
local_err = 0x5684a230
base_name = 0x5863d710 "disk1.qcow2"
__func__ = "qmp_block_stream"
__PRETTY_FUNCTION__ = "qmp_block_stream"
#2  0x5590f6e8 in qmp_marshal_block_stream (args=0x5689ddd0, 
ret=0x7fffbe90, errp=0x7fffbe88) at qmp-marshal.c:488
err = 0x0
v = 0x5779cd80
arg = 
  {has_job_id = false, job_id = 0x0, device = 0x586282f0 
"#block830", has_base = true, base = 0x5863d710 "disk1.qcow2",

Re: [Qemu-devel] [PATCH v3 15/15] target/sh4: use cpu_loop_exit_restore

2017-05-12 Thread Richard Henderson


On 05/10/2017 11:26 AM, Aurelien Jarno wrote:

Use cpu_loop_exit_restore when using cpu_restore_state and cpu_loop_exit
together.

Signed-off-by: Aurelien Jarno
---
  target/sh4/op_helper.c | 10 ++
  1 file changed, 2 insertions(+), 8 deletions(-)


Reviewed-by: Richard Henderson 


r~

Re: [Qemu-devel] [PATCH v3 14/15] target/sh4: trap unaligned accesses

2017-05-12 Thread Richard Henderson


On 05/10/2017 11:26 AM, Aurelien Jarno wrote:

SH4 requires that memory accesses are naturally aligned, except for the
SH4-A movua.l instructions which can do unaligned loads.

Reviewed-by: Philippe Mathieu-Daudé
Signed-off-by: Aurelien Jarno
---
  target/sh4/cpu.c   |  1 +
  target/sh4/cpu.h   |  4 
  target/sh4/op_helper.c | 16 
  target/sh4/translate.c |  6 --
  4 files changed, 25 insertions(+), 2 deletions(-)


Reviewed-by: Richard Henderson 


r~

Re: [Qemu-devel] [PATCH v3 09/15] target/sh4: optimize gen_store_fpr64

2017-05-12 Thread Richard Henderson


On 05/10/2017 11:26 AM, Aurelien Jarno wrote:

Using extr and avoiding intermediate temps.

Signed-off-by: Aurelien Jarno
---
  target/sh4/translate.c | 8 +---
  1 file changed, 1 insertion(+), 7 deletions(-)


Reviewed-by: Richard Henderson 


r~

Re: [Qemu-devel] [RFC PATCH 2/8] iommu/vt-d: add bind_pasid_table function

2017-05-12 Thread Alex Williamson

On Wed, 26 Apr 2017 18:11:59 +0800
"Liu, Yi L"  wrote:

> From: Jacob Pan 
> 
> Add Intel VT-d ops to the generic iommu_bind_pasid_table API
> functions.
> 
> The primary use case is for direct assignment of SVM capable
> device. Originated from emulated IOMMU in the guest, the request goes
> through many layers (e.g. VFIO). Upon calling host IOMMU driver, caller
> passes guest PASID table pointer (GPA) and size.
> 
> Device context table entry is modified by Intel IOMMU specific
> bind_pasid_table function. This will turn on nesting mode and matching
> translation type.
> 
> The unbind operation restores default context mapping.
> 
> Signed-off-by: Jacob Pan 
> Signed-off-by: Liu, Yi L 
> ---
>  drivers/iommu/intel-iommu.c   | 103 ++
>  include/linux/dma_remapping.h |   1 +
>  2 files changed, 104 insertions(+)
> 
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 646756c..6d5b939 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -5306,6 +5306,105 @@ struct intel_iommu *intel_svm_device_to_iommu(struct device *dev)
>  
>   return iommu;
>  }
> +
> +static int intel_iommu_bind_pasid_table(struct iommu_domain *domain,
> + struct device *dev, struct pasid_table_info *pasidt_binfo)
> +{
> + struct intel_iommu *iommu;
> + struct context_entry *context;
> + struct dmar_domain *dmar_domain = to_dmar_domain(domain);
> + struct device_domain_info *info;
> + u8 bus, devfn;
> + u16 did, *sid;
> + int ret = 0;
> + unsigned long flags;
> + u64 ctx_lo;
> +
> + if (pasidt_binfo == NULL || pasidt_binfo->model != INTEL_IOMMU) {
> + pr_warn("%s: Invalid bind request!\n", __func__);
> + return -EINVAL;
> + }
> +
> + iommu = device_to_iommu(dev, &bus, &devfn);
> + if (!iommu)
> + return -ENODEV;
> +
> + sid = (u16 *)&pasidt_binfo->opaque;

struct pasid_table_info is expected to be provided by a user, the
opaque data structure for model == INTEL_IOMMU therefore needs to be
documented in uapi.

> + /* check SID, if it is not correct, return */
> + if (PCI_DEVID(bus, devfn) != *sid)
> + return 0;

This is a bit weird, it took me until later in the series to understand
why this is a success case.  Perhaps the device matching needs to be
standardized in pasid_table_info rather than the opaque data.
Minimally, more comments.
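
One plausible reading of the success return, sketched under assumptions:
the VFIO patch later in the series calls the bind function for every device
in a group, and a walk of that kind typically stops at the first non-zero
return, so "not my device" has to report 0 to let the walk continue. The
types and helpers below (toy_dev, toy_bind, toy_for_each_dev) are
hypothetical stand-ins and not the kernel's API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a device; not a kernel structure. */
struct toy_dev {
    uint8_t bus;
    uint8_t devfn;
    int bound;
};

/* Same encoding as PCI_DEVID(): bus in the high byte, devfn in the low. */
static uint16_t toy_devid(uint8_t bus, uint8_t devfn)
{
    return (uint16_t)((bus << 8) | devfn);
}

/* Callback: 0 means "keep iterating", non-zero aborts the walk. */
static int toy_bind(struct toy_dev *dev, void *data)
{
    uint16_t wanted = *(uint16_t *)data;

    if (toy_devid(dev->bus, dev->devfn) != wanted) {
        return 0;   /* not the requested device: skip, not an error */
    }
    dev->bound = 1;
    return 0;
}

/* Walk every device, stopping at the first non-zero return. */
static int toy_for_each_dev(struct toy_dev *devs, size_t n, void *data,
                            int (*fn)(struct toy_dev *, void *))
{
    assert(fn != NULL);
    for (size_t i = 0; i < n; i++) {
        int ret = fn(&devs[i], data);
        if (ret) {
            return ret;
        }
    }
    return 0;
}
```

A side effect visible in the sketch: if no device matches, the walk still
returns 0, so the caller cannot tell "bound" from "nothing matched".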

> +
> + info = dev->archdata.iommu;
> + if (!info || !info->pasid_supported) {
> + pr_err("Device %d:%d.%d has no pasid support\n", bus,
> + PCI_SLOT(devfn), PCI_FUNC(devfn));

PCI addresses should be printed in hex and include the segment.  This
also looks like it might be user reachable, so a user could DoS the
host by continuously calling this where pasid is not supported and fill
logs with pr_err.  Maybe dropping the pr_err is the better choice.
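
For illustration, the conventional segment:bus:slot.function form in hex
can be produced as below. This is a userspace sketch; toy_format_pci_addr()
and the TOY_* macros are hypothetical names that only mirror the bit layout
of PCI_SLOT()/PCI_FUNC(), and where the segment number comes from is left
to the caller.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Same bit layout as PCI_SLOT()/PCI_FUNC(): slot in bits 7:3, func in 2:0. */
#define TOY_PCI_SLOT(devfn) (((devfn) >> 3) & 0x1f)
#define TOY_PCI_FUNC(devfn) ((devfn) & 0x07)

/* Format "ssss:bb:dd.f" in hex; returns what snprintf() returns. */
static int toy_format_pci_addr(char *buf, size_t len, uint16_t segment,
                               uint8_t bus, uint8_t devfn)
{
    assert(buf != NULL);
    return snprintf(buf, len, "%04x:%02x:%02x.%x",
                    (unsigned)segment, (unsigned)bus,
                    (unsigned)TOY_PCI_SLOT(devfn),
                    (unsigned)TOY_PCI_FUNC(devfn));
}
```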


> + ret = -EINVAL;
> + goto out;
> + }
> +
> + if (pasidt_binfo->size >= intel_iommu_get_pts(iommu)) {
> + pr_err("Invalid gPASID table size %llu, host size %lu\n",
> + pasidt_binfo->size,
> + intel_iommu_get_pts(iommu));
> + ret = -EINVAL;
> + goto out;

equal is not valid?

> + }
> + spin_lock_irqsave(&iommu->lock, flags);
> + context = iommu_context_addr(iommu, bus, devfn, 0);
> + if (!context || !context_present(context)) {
> + pr_warn("%s: ctx not present for bus devfn %x:%x\n",
> + __func__, bus, devfn);

Use standard PCI address format, including segment.

> + spin_unlock_irqrestore(&iommu->lock, flags);
> + goto out;
> + }
> + /* Anticipate guest to use SVM and owns the first level */
> + ctx_lo = context[0].lo;
> + ctx_lo |= CONTEXT_NESTE;
> + ctx_lo |= CONTEXT_PRS;
> + ctx_lo |= CONTEXT_PASIDE;
> + ctx_lo &= ~CONTEXT_TT_MASK;
> + ctx_lo |= CONTEXT_TT_DEV_IOTLB << 2;
> + context[0].lo = ctx_lo;
> +
> + /* Assign guest PASID table pointer and size */
> + ctx_lo = (pasidt_binfo->ptr & VTD_PAGE_MASK) | pasidt_binfo->size;
> + context[1].lo = ctx_lo;
> + /* make sure context entry is updated before flushing */
> + wmb();
> + did = dmar_domain->iommu_did[iommu->seq_id];
> + iommu->flush.flush_context(iommu, did,
> + (((u16)bus) << 8) | devfn,
> + DMA_CCMD_MASK_NOBIT,
> + DMA_CCMD_DEVICE_INVL);
> + iommu->flush.flush_iotlb(iommu, did, 0, 0, DMA_TLB_DSI_FLUSH);
> + spin_unlock_irqrestore(&iommu->lock, flags);

Mildly concerned what sort of Pandora's box this opens, but I guess
we're relying on the 2nd level translation to validate and make sure
the user can only hurt themselves.

> +
> +

Re: [Qemu-devel] [RFC PATCH 4/8] iommu/vt-d: Add iommu do invalidate function

2017-05-12 Thread Alex Williamson

On Wed, 26 Apr 2017 18:12:01 +0800
"Liu, Yi L"  wrote:

> From: Jacob Pan 
> 
> This patch adds Intel VT-d specific function to implement
> iommu_do_invalidate API.
> 
> The use case is for supporting caching structure invalidation
> of assigned SVM capable devices. Emulated IOMMU exposes queue
> invalidation capability and passes down all descriptors from the guest
> to the physical IOMMU.
> 
> The assumption is that guest to host device ID mapping should be
> resolved prior to calling IOMMU driver. Based on the device handle,
> host IOMMU driver can replace certain fields before submitting the
> descriptor to the invalidation queue.
> 
> Signed-off-by: Liu, Yi L 
> Signed-off-by: Jacob Pan 
> ---
>  drivers/iommu/intel-iommu.c | 43 +++
>  include/linux/intel-iommu.h | 11 +++
>  2 files changed, 54 insertions(+)
> 
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 6d5b939..0b098ad 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -5042,6 +5042,48 @@ static void intel_iommu_detach_device(struct iommu_domain *domain,
>   dmar_remove_one_dev_info(to_dmar_domain(domain), dev);
>  }
>  
> +static int intel_iommu_do_invalidate(struct iommu_domain *domain,
> + struct device *dev, struct tlb_invalidate_info *inv_info)
> +{
> + int ret = 0;
> + struct intel_iommu *iommu;
> + struct dmar_domain *dmar_domain = to_dmar_domain(domain);
> + struct intel_invalidate_data *inv_data;
> + struct qi_desc *qi;
> + u16 did;
> + u8 bus, devfn;
> +
> + if (!inv_info || !dmar_domain || (inv_info->model != INTEL_IOMMU))
> + return -EINVAL;
> +
> + iommu = device_to_iommu(dev, &bus, &devfn);
> + if (!iommu)
> + return -ENODEV;
> +
> + inv_data = (struct intel_invalidate_data *)&inv_info->opaque;
> +
> + /* check SID */
> + if (PCI_DEVID(bus, devfn) != inv_data->sid)
> + return 0;
> +
> + qi = &inv_data->inv_desc;
> +
> + switch (qi->low & QI_TYPE_MASK) {
> + case QI_DIOTLB_TYPE:
> + case QI_DEIOTLB_TYPE:
> + /* for device IOTLB, we just let it pass through */
> + break;
> + default:
> + did = dmar_domain->iommu_did[iommu->seq_id];
> + set_mask_bits(&qi->low, QI_DID_MASK, QI_DID(did));
> + break;
> + }
> +
> + ret = qi_submit_sync(qi, iommu);
> +
> + return ret;

nit, ret variable is unnecessary.

> +}
> +
>  static int intel_iommu_map(struct iommu_domain *domain,
>  unsigned long iova, phys_addr_t hpa,
>  size_t size, int iommu_prot)
> @@ -5416,6 +5458,7 @@ static int intel_iommu_unbind_pasid_table(struct iommu_domain *domain,
>  #ifdef CONFIG_INTEL_IOMMU_SVM
>   .bind_pasid_table   = intel_iommu_bind_pasid_table,
>   .unbind_pasid_table = intel_iommu_unbind_pasid_table,
> + .do_invalidate  = intel_iommu_do_invalidate,
>  #endif
>   .map= intel_iommu_map,
>   .unmap  = intel_iommu_unmap,
> diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
> index ac04f28..9d6562c 100644
> --- a/include/linux/intel-iommu.h
> +++ b/include/linux/intel-iommu.h
> @@ -29,6 +29,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  
> @@ -271,6 +272,10 @@ enum {
>  #define QI_PGRP_RESP_TYPE    0x9
>  #define QI_PSTRM_RESP_TYPE   0xa
>  
> +#define QI_DID(did)  (((u64)did & 0x) << 16)
> +#define QI_DID_MASK  GENMASK(31, 16)
> +#define QI_TYPE_MASK GENMASK(3, 0)
> +
>  #define QI_IEC_SELECTIVE (((u64)1) << 4)
>  #define QI_IEC_IIDEX(idx)    (((u64)(idx & 0x) << 32))
>  #define QI_IEC_IM(m) (((u64)(m & 0x1f) << 27))
> @@ -529,6 +534,12 @@ struct intel_svm {
>  extern struct intel_iommu *intel_svm_device_to_iommu(struct device *dev);
>  #endif
>  
> +struct intel_invalidate_data {
> + u16 sid;
> + u32 pasid;
> + struct qi_desc inv_desc;
> +};

This needs to be uapi since the vfio user is expected to create it, so
we need a uapi version of qi_desc too.
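
Because the descriptor arrives from userspace and the host then overwrites
its domain-ID field, it may help to see the field replacement in isolation.
The following is a userspace sketch only: the TOY_* macros mirror the bit
positions named by QI_DID_MASK (bits 31:16) and QI_TYPE_MASK (bits 3:0) in
the hunk above, toy_set_field() is a plain, non-atomic stand-in for the
clear-then-set that set_mask_bits() performs, and all names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for GENMASK(h, l) on 64-bit values. */
#define TOY_GENMASK64(h, l) \
    (((~UINT64_C(0)) >> (63 - (h))) & ((~UINT64_C(0)) << (l)))

#define TOY_TYPE_MASK TOY_GENMASK64(3, 0)    /* descriptor type, bits 3:0 */
#define TOY_DID_MASK  TOY_GENMASK64(31, 16)  /* domain ID, bits 31:16 */
#define TOY_DID(did)  ((((uint64_t)(did)) << 16) & TOY_DID_MASK)

/* Clear the field selected by mask, then OR in the new bits. */
static uint64_t toy_set_field(uint64_t word, uint64_t mask, uint64_t bits)
{
    return (word & ~mask) | (bits & mask);
}

/* Replace only the domain ID in the low descriptor word. */
static uint64_t toy_replace_did(uint64_t low, uint16_t did)
{
    uint64_t out = toy_set_field(low, TOY_DID_MASK, TOY_DID(did));

    /* Everything outside the DID field must be left untouched. */
    assert((out & ~TOY_DID_MASK) == (low & ~TOY_DID_MASK));
    return out;
}
```

Whatever the final uapi descriptor looks like, the same property should
hold: the guest-supplied domain ID never reaches the hardware queue.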

> +
>  extern const struct attribute_group *intel_iommu_groups[];
>  extern void intel_iommu_debugfs_init(void);
>  extern struct context_entry *iommu_context_addr(struct intel_iommu *iommu,

Re: [Qemu-devel] [RFC PATCH 3/8] iommu: Introduce iommu do invalidate API function

2017-05-12 Thread Alex Williamson

On Wed, 26 Apr 2017 18:12:00 +0800
"Liu, Yi L"  wrote:

> From: "Liu, Yi L" 
> 
> When a SVM capable device is assigned to a guest, the first level page
> tables are owned by the guest and the guest PASID table pointer is
> linked to the device context entry of the physical IOMMU.
> 
> Host IOMMU driver has no knowledge of caching structure updates unless
> the guest invalidation activities are passed down to the host. The
> primary usage is derived from emulated IOMMU in the guest, where QEMU
> can trap invalidation activities before passing them down to the
> host/physical IOMMU. IOMMU architecture-specific actions need to be
> taken, which requires the generic APIs introduced in this patch to
> have opaque data in the tlb_invalidate_info argument.
> 
> Signed-off-by: Liu, Yi L 
> Signed-off-by: Jacob Pan 
> ---
>  drivers/iommu/iommu.c | 13 +
>  include/linux/iommu.h | 16 
>  2 files changed, 29 insertions(+)
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index f2da636..ca7cff2 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -1153,6 +1153,19 @@ int iommu_unbind_pasid_table(struct iommu_domain *domain, struct device *dev)
>  }
>  EXPORT_SYMBOL_GPL(iommu_unbind_pasid_table);
>  
> +int iommu_do_invalidate(struct iommu_domain *domain,
> + struct device *dev, struct tlb_invalidate_info *inv_info)
> +{
> + int ret = 0;
> +
> + if (unlikely(domain->ops->do_invalidate == NULL))
> + return -ENODEV;
> +
> + ret = domain->ops->do_invalidate(domain, dev, inv_info);
> + return ret;

nit, ret is unnecessary.
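
The shape being suggested, shown with minimal stand-in types so it compiles
on its own (toy_domain, toy_ops and toy_driver_invalidate are hypothetical
and only mimic the ops-table dispatch in the patch):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct toy_domain;

/* Stand-in for the ops table: one optional driver callback. */
struct toy_ops {
    int (*do_invalidate)(struct toy_domain *domain, void *inv_info);
};

struct toy_domain {
    const struct toy_ops *ops;
};

/* Example driver callback used for the demonstration. */
static int toy_driver_invalidate(struct toy_domain *domain, void *inv_info)
{
    (void)domain;
    (void)inv_info;
    return 0;
}

/* Dispatch to the driver and return its result directly, no local 'ret'. */
static int toy_do_invalidate(struct toy_domain *domain, void *inv_info)
{
    assert(domain != NULL && domain->ops != NULL);
    if (domain->ops->do_invalidate == NULL) {
        return -ENODEV;
    }
    return domain->ops->do_invalidate(domain, inv_info);
}
```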

> +}
> +EXPORT_SYMBOL_GPL(iommu_do_invalidate);
> +
>  static void __iommu_detach_device(struct iommu_domain *domain,
> struct device *dev)
>  {
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 491a011..a48e3b75 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -140,6 +140,11 @@ struct pasid_table_info {
>   __u8    opaque[];       /* IOMMU-specific details */
>  };
>  
> +struct tlb_invalidate_info {
> + __u32   model;
> + __u8    opaque[];
> +};

I'm wondering if 'model' is really necessary here, shouldn't this
function only be called if a bind_pasid_table() succeeded, and then the
model would be set at that time?

This also needs to be uapi since you're expecting a user to provide it
to vfio.  The opaque data needs to be fully specified (relative to
uapi) per model.

> +
>  #ifdef CONFIG_IOMMU_API
>  
>  /**
> @@ -215,6 +220,8 @@ struct iommu_ops {
>   struct pasid_table_info *pasidt_binfo);
>   int (*unbind_pasid_table)(struct iommu_domain *domain,
>   struct device *dev);
> + int (*do_invalidate)(struct iommu_domain *domain,
> + struct device *dev, struct tlb_invalidate_info *inv_info);
>  
>   unsigned long pgsize_bitmap;
>  };
> @@ -240,6 +247,9 @@ extern int iommu_bind_pasid_table(struct iommu_domain *domain,
>   struct device *dev, struct pasid_table_info *pasidt_binfo);
>  extern int iommu_unbind_pasid_table(struct iommu_domain *domain,
>   struct device *dev);
> +extern int iommu_do_invalidate(struct iommu_domain *domain,
> + struct device *dev, struct tlb_invalidate_info *inv_info);
> +
>  extern struct iommu_domain *iommu_get_domain_for_dev(struct device *dev);
>  extern int iommu_map(struct iommu_domain *domain, unsigned long iova,
>phys_addr_t paddr, size_t size, int prot);
> @@ -626,6 +636,12 @@ int iommu_unbind_pasid_table(struct iommu_domain *domain, struct device *dev)
>   return -EINVAL;
>  }
>  
> +static inline int iommu_do_invalidate(struct iommu_domain *domain,
> + struct device *dev, struct tlb_invalidate_info *inv_info)
> +{
> + return -EINVAL;
> +}
> +
>  #endif /* CONFIG_IOMMU_API */
>  
>  #endif /* __LINUX_IOMMU_H */

Re: [Qemu-devel] [RFC PATCH 1/8] iommu: Introduce bind_pasid_table API function

2017-05-12 Thread Alex Williamson

On Wed, 26 Apr 2017 18:11:58 +0800
"Liu, Yi L"  wrote:

> From: Jacob Pan 
> 
> Virtual IOMMU was proposed to support Shared Virtual Memory (SVM) use
> case in the guest:
> https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg05311.html
> 
> As part of the proposed architecture, when a SVM capable PCI
> device is assigned to a guest, nested mode is turned on. Guest owns the
> first level page tables (request with PASID) and performs GVA->GPA
> translation. Second level page tables are owned by the host for GPA->HPA
> translation for both request with and without PASID.
> 
> A new IOMMU driver interface is therefore needed to perform tasks as
> follows:
> * Enable nested translation and appropriate translation type
> * Assign guest PASID table pointer (in GPA) and size to host IOMMU
> 
> This patch introduces new functions called iommu_(un)bind_pasid_table()
> to IOMMU APIs. Architecture specific IOMMU function can be added later
> to perform the specific steps for binding pasid table of assigned devices.
> 
> This patch also adds model definition in iommu.h. It would be used to
> check if the bind request is from a compatible entity. e.g. a bind
> request from an intel_iommu emulator may not be supported by an ARM SMMU
> driver.
> 
> Signed-off-by: Jacob Pan 
> Signed-off-by: Liu, Yi L 
> ---
>  drivers/iommu/iommu.c | 19 +++
>  include/linux/iommu.h | 31 +++
>  2 files changed, 50 insertions(+)
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index dbe7f65..f2da636 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -1134,6 +1134,25 @@ int iommu_attach_device(struct iommu_domain *domain, struct device *dev)
>  }
>  EXPORT_SYMBOL_GPL(iommu_attach_device);
>  
> +int iommu_bind_pasid_table(struct iommu_domain *domain, struct device *dev,
> + struct pasid_table_info *pasidt_binfo)
> +{
> + if (unlikely(!domain->ops->bind_pasid_table))
> + return -EINVAL;
> +
> + return domain->ops->bind_pasid_table(domain, dev, pasidt_binfo);
> +}
> +EXPORT_SYMBOL_GPL(iommu_bind_pasid_table);
> +
> +int iommu_unbind_pasid_table(struct iommu_domain *domain, struct device *dev)
> +{
> + if (unlikely(!domain->ops->unbind_pasid_table))
> + return -EINVAL;
> +
> + return domain->ops->unbind_pasid_table(domain, dev);
> +}
> +EXPORT_SYMBOL_GPL(iommu_unbind_pasid_table);
> +
>  static void __iommu_detach_device(struct iommu_domain *domain,
> struct device *dev)
>  {
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 0ff5111..491a011 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -131,6 +131,15 @@ struct iommu_dm_region {
>   int prot;
>  };
>  
> +struct pasid_table_info {
> + __u64   ptr;/* PASID table ptr */
> + __u64   size;   /* PASID table size*/
> + __u32   model;  /* magic number */
> +#define INTEL_IOMMU  (1 << 0)
> +#define ARM_SMMU (1 << 1)
> + __u8    opaque[];       /* IOMMU-specific details */
> +};

This needs to be in uapi since you're expecting a user to pass it.

> +
>  #ifdef CONFIG_IOMMU_API
>  
>  /**
> @@ -159,6 +168,8 @@ struct iommu_dm_region {
>   * @domain_get_windows: Return the number of windows for a domain
>   * @of_xlate: add OF master IDs to iommu grouping
>   * @pgsize_bitmap: bitmap of all possible supported page sizes
> + * @bind_pasid_table: bind pasid table pointer for guest SVM
> + * @unbind_pasid_table: unbind pasid table pointer and restore defaults
>   */
>  struct iommu_ops {
>   bool (*capable)(enum iommu_cap);
> @@ -200,6 +211,10 @@ struct iommu_ops {
>   u32 (*domain_get_windows)(struct iommu_domain *domain);
>  
>   int (*of_xlate)(struct device *dev, struct of_phandle_args *args);
> + int (*bind_pasid_table)(struct iommu_domain *domain, struct device *dev,
> + struct pasid_table_info *pasidt_binfo);
> + int (*unbind_pasid_table)(struct iommu_domain *domain,
> + struct device *dev);
>  
>   unsigned long pgsize_bitmap;
>  };
> @@ -221,6 +236,10 @@ extern int iommu_attach_device(struct iommu_domain *domain,
>  struct device *dev);
>  extern void iommu_detach_device(struct iommu_domain *domain,
>   struct device *dev);
> +extern int iommu_bind_pasid_table(struct iommu_domain *domain,
> + struct device *dev, struct pasid_table_info *pasidt_binfo);
> +extern int iommu_unbind_pasid_table(struct iommu_domain *domain,
> + struct device *dev);
>  extern struct iommu_domain *iommu_get_domain_for_dev(struct device *dev);
>  extern int iommu_map(struct iommu_domain *domain, unsigned long iova,
>phys_addr_t paddr, size_t size, int prot);
>

Re: [Qemu-devel] [RFC PATCH 6/8] VFIO: do pasid table binding

2017-05-12 Thread Alex Williamson

On Wed, 26 Apr 2017 18:12:03 +0800
"Liu, Yi L"  wrote:

> From: "Liu, Yi L" 
> 
> This patch adds IOCTL processing in vfio_iommu_type1 for
> VFIO_IOMMU_SVM_BIND_TASK. It binds the PASID table by
> calling iommu_ops->bind_pasid_table to link the whole
> PASID table to pIOMMU.
> 
> For VT-d, it is linking the guest PASID table to host pIOMMU.
> This is the key point to support SVM virtualization on VT-d.
> 
> Signed-off-by: Liu, Yi L 
> ---
>  drivers/vfio/vfio_iommu_type1.c | 72 +
>  1 file changed, 72 insertions(+)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index b3cc33f..30b6d48 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -1512,6 +1512,50 @@ static int vfio_domains_have_iommu_cache(struct vfio_iommu *iommu)
>   return ret;
>  }
>  
> +struct vfio_svm_task {
> + struct iommu_domain *domain;
> + void *payload;
> +};
> +
> +static int bind_pasid_tbl_fn(struct device *dev, void *data)
> +{
> + int ret = 0;
> + struct vfio_svm_task *task = data;

Maybe avoid "task" or use svm_task to differentiate from task_struct
task used elsewhere in this file.

> + struct pasid_table_info *pasidt_binfo;
> +
> + pasidt_binfo = task->payload;
> + ret = iommu_bind_pasid_table(task->domain, dev, pasidt_binfo);
> + return ret;
> +}
> +
> +static int vfio_do_svm_task(struct vfio_iommu *iommu, void *data,
> + int (*fn)(struct device *, void *))
> +{
> + int ret = 0;
> + struct vfio_domain *d;
> + struct vfio_group *g;
> + struct vfio_svm_task task;
> +
> + task.payload = data;
> +
> + mutex_lock(&iommu->lock);
> +
> + list_for_each_entry(d, &iommu->domain_list, next) {
> + list_for_each_entry(g, &d->group_list, next) {
> + if (g->iommu_group != NULL) {

Can it ever be NULL?

> + task.domain = d->domain;
> + ret = iommu_group_for_each_dev(
> + g->iommu_group, &task, fn);
> + if (ret != 0)
> + break;
> + }
> + }
> + }
> +
> + mutex_unlock(&iommu->lock);
> + return ret;
> +}
> +
>  static long vfio_iommu_type1_ioctl(void *iommu_data,
>  unsigned int cmd, unsigned long arg)
>  {
> @@ -1582,6 +1626,34 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
>  
>   return copy_to_user((void __user *)arg, , minsz) ?
>   -EFAULT : 0;
> + } else if (cmd == VFIO_IOMMU_SVM_BIND_TASK) {
> + struct vfio_device_svm hdr;
> + u8 *data = NULL;

But it really should be a struct pasid_table_info.

> + int ret = 0;
> +
> + minsz = offsetofend(struct vfio_device_svm, length);
> + if (copy_from_user(&hdr, (void __user *)arg, minsz))
> + return -EFAULT;
> +
> + if (hdr.length == 0)
> + return -EINVAL;
> +
> + data = memdup_user((void __user *)(arg + minsz),
> + hdr.length);
> + if (IS_ERR(data))
> + return PTR_ERR(data);
> +
> + switch (hdr.flags & VFIO_SVM_TYPE_MASK) {
> + case VFIO_SVM_BIND_PASIDTBL:
> + ret = vfio_do_svm_task(iommu, data,
> + bind_pasid_tbl_fn);
> + break;
> + default:
> + ret = -EINVAL;
> + break;
> + }
> + kfree(data);
> + return ret;
>   }
>  
>   return -ENOTTY;

Re: [Qemu-devel] [RFC PATCH 5/8] VFIO: Add new IOTCL for PASID Table bind propagation

2017-05-12 Thread Alex Williamson

On Wed, 26 Apr 2017 18:12:02 +0800
"Liu, Yi L"  wrote:

> From: "Liu, Yi L" 
> 
> This patch adds VFIO_IOMMU_SVM_BIND_TASK for potential PASID table
> binding requests.
> 
> On VT-d, this IOCTL cmd would be used to link the guest PASID page table
> to host. For other vendors, it may also be used to support other kinds
> of SVM bind requests. This was previously discussed with an ARM
> engineer; see the link below. This IOCTL cmd may support SVM PASID bind
> requests from a userspace driver, or page table (cr3) bind requests
> from a guest. These SVM bind requests would be supported by adding
> different flags, e.g. VFIO_SVM_BIND_PASID is added to support PASID
> bind from a userspace driver, and VFIO_SVM_BIND_PGTABLE is added to
> support page table bind from a guest.
> 
> https://patchwork.kernel.org/patch/9594231/
> 
> Signed-off-by: Liu, Yi L 
> ---
>  include/uapi/linux/vfio.h | 17 +
>  1 file changed, 17 insertions(+)
> 
> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> index 519eff3..6b97987 100644
> --- a/include/uapi/linux/vfio.h
> +++ b/include/uapi/linux/vfio.h
> @@ -547,6 +547,23 @@ struct vfio_iommu_type1_dma_unmap {
>  #define VFIO_IOMMU_ENABLE    _IO(VFIO_TYPE, VFIO_BASE + 15)
>  #define VFIO_IOMMU_DISABLE   _IO(VFIO_TYPE, VFIO_BASE + 16)
>  
> +/* IOCTL for Shared Virtual Memory Bind */
> +struct vfio_device_svm {
> + __u32   argsz;
> +#define VFIO_SVM_BIND_PASIDTBL   (1 << 0) /* Bind PASID Table */
> +#define VFIO_SVM_BIND_PASID  (1 << 1) /* Bind PASID from userspace driver */
> +#define VFIO_SVM_BIND_PGTABLE    (1 << 2) /* Bind guest mmu page table */
> + __u32   flags;
> + __u32   length;
> + __u8    data[];

In the case of VFIO_SVM_BIND_PASIDTBL this is clearly struct
pasid_table_info?  So at a minimum this is a union including struct
pasid_table_info.  Furthermore how does a user learn what the opaque
data in struct pasid_table_info is without looking at the code?  A user
API needs to be clear and documented, not opaque and variable.  We
should also have references to the hardware spec for an Intel or ARM
PASID table in uapi.  flags should be defined as they're used, let's
not reserve them with the expectation of future use.
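
As a rough illustration of "typed and documented rather than opaque", a
request could carry a union keyed by the flag, with the size checked per
type. Everything below is hypothetical (toy_svm_bind, toy_pasid_table_info,
TOY_BIND_PASIDTBL, the explicit pad field); it is a sketch of the idea and
not the proposed uapi.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_BIND_PASIDTBL (1u << 0)   /* the only request type defined */

/* Fixed-width fields only, so the layout is the same for every caller. */
struct toy_pasid_table_info {
    uint64_t ptr;    /* guest PASID table pointer (GPA) */
    uint64_t size;   /* PASID table size */
    uint32_t model;  /* vendor model identifier */
    uint32_t pad;    /* explicit padding keeps the layout fixed */
};

struct toy_svm_bind {
    uint32_t argsz;  /* total bytes the caller provided */
    uint32_t flags;  /* exactly one TOY_BIND_* bit */
    union {
        struct toy_pasid_table_info pasidtbl;
    } u;
};

/* Return 0 if the request is well-formed, -1 otherwise. */
static int toy_validate(const struct toy_svm_bind *req)
{
    size_t minsz;

    assert(req != NULL);
    switch (req->flags) {
    case TOY_BIND_PASIDTBL:
        minsz = offsetof(struct toy_svm_bind, u) + sizeof(req->u.pasidtbl);
        break;
    default:
        return -1;   /* unknown or combined flags are rejected */
    }
    return req->argsz >= minsz ? 0 : -1;
}
```

Only the flag that is actually implemented is defined, in line with the
point about not reserving flags ahead of use.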

> +};
> +
> +#define VFIO_SVM_TYPE_MASK   (VFIO_SVM_BIND_PASIDTBL | \
> + VFIO_SVM_BIND_PASID | \
> + VFIO_SVM_BIND_PGTABLE)
> +
> +#define VFIO_IOMMU_SVM_BIND_TASK _IO(VFIO_TYPE, VFIO_BASE + 22)
> +
>  /*  Additional API for SPAPR TCE (Server POWERPC) IOMMU  */
>  
>  /*

Re: [Qemu-devel] [RFC PATCH 7/8] VFIO: Add new IOCTL for IOMMU TLB invalidate propagation

2017-05-12 Thread Alex Williamson

On Wed, 26 Apr 2017 18:12:04 +0800
"Liu, Yi L"  wrote:

> From: "Liu, Yi L" 
> 
> This patch adds VFIO_IOMMU_TLB_INVALIDATE to propagate IOMMU TLB
> invalidate request from guest to host.
> 
> In the case of SVM virtualization on VT-d, host IOMMU driver has
> no knowledge of caching structure updates unless the guest
> invalidation activities are passed down to the host. So a new
> IOCTL is needed to propagate the guest cache invalidation through
> VFIO.
> 
> Signed-off-by: Liu, Yi L 
> ---
>  include/uapi/linux/vfio.h | 9 +
>  1 file changed, 9 insertions(+)
> 
> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> index 6b97987..50c51f8 100644
> --- a/include/uapi/linux/vfio.h
> +++ b/include/uapi/linux/vfio.h
> @@ -564,6 +564,15 @@ struct vfio_device_svm {
>  
>  #define VFIO_IOMMU_SVM_BIND_TASK _IO(VFIO_TYPE, VFIO_BASE + 22)
>  
> +/* For IOMMU TLB Invalidation Propagation */
> +struct vfio_iommu_tlb_invalidate {
> + __u32   argsz;
> + __u32   length;
> + __u8    data[];
> +};
> +
> +#define VFIO_IOMMU_TLB_INVALIDATE    _IO(VFIO_TYPE, VFIO_BASE + 23)

I'm kind of wondering why this isn't just a new flag bit on
vfio_device_svm, the data structure is so similar.  Of course data
needs to be fully specified in uapi.

> +
>  /*  Additional API for SPAPR TCE (Server POWERPC) IOMMU  */
>  
>  /*

Re: [Qemu-devel] [PATCH 7/7] curl: do not do aio_poll when waiting for a free CURLState

2017-05-12 Thread Jeff Cody

On Wed, May 10, 2017 at 04:32:05PM +0200, Paolo Bonzini wrote:
> Instead, put the CURLAIOCB on a wait list; curl_clean_state will
> wake the corresponding coroutine.
> 
> Because of CURL's callback-based structure, we cannot easily convert
> everything to CoMutex/CoQueue; keeping the QemuMutex is simpler.
> However, CoQueue is a simple wrapper around a linked list, so we can
> use QSIMPLEQ easily to open-code a CoQueue that has a QemuMutex's
> protection instead of a CoMutex's.
> 
> Signed-off-by: Paolo Bonzini 
> ---
>  block/curl.c | 16 +++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/block/curl.c b/block/curl.c
> index 80870bd60c..4ccdf63510 100644
> --- a/block/curl.c
> +++ b/block/curl.c
> @@ -98,6 +98,8 @@ typedef struct CURLAIOCB {
>  
>  size_t start;
>  size_t end;
> +
> +QSIMPLEQ_ENTRY(CURLAIOCB) next;
>  } CURLAIOCB;
>  
>  typedef struct CURLSocket {
> @@ -133,6 +135,7 @@ typedef struct BDRVCURLState {
>  bool accept_range;
>  AioContext *aio_context;
>  QemuMutex mutex;
> +QSIMPLEQ_HEAD(, CURLAIOCB) free_state_waitq;
>  char *username;
>  char *password;
>  char *proxyusername;
> @@ -532,6 +535,7 @@ static int curl_init_state(BDRVCURLState *s, CURLState *state)
>  /* Called with s->mutex held.  */
>  static void curl_clean_state(CURLState *s)
>  {
> +CURLAIOCB *next;
>  int j;
>  for (j=0; j  assert(!s->acb[j]);
> @@ -548,6 +552,14 @@ static void curl_clean_state(CURLState *s)
>  }
>  
>  s->in_use = 0;
> +
> +next = QSIMPLEQ_FIRST(&s->s->free_state_waitq);
> +if (next) {
> +QSIMPLEQ_REMOVE_HEAD(&s->s->free_state_waitq, next);
> +qemu_mutex_unlock(&s->s->mutex);
> +aio_co_wake(next->co);
> +qemu_mutex_lock(&s->s->mutex);
> +}
>  }
>  
>  static void curl_parse_filename(const char *filename, QDict *options,
> @@ -744,6 +756,7 @@ static int curl_open(BlockDriverState *bs, QDict 
> *options, int flags,
>  
>  DPRINTF("CURL: Opening %s\n", file);
>  qemu_mutex_init(&s->mutex);
> +QSIMPLEQ_INIT(&s->free_state_waitq);
>  s->aio_context = bdrv_get_aio_context(bs);
>  s->url = g_strdup(file);
>  qemu_mutex_lock(&s->mutex);
> @@ -843,8 +856,9 @@ static void curl_setup_preadv(BlockDriverState *bs, 
> CURLAIOCB *acb)
>  if (state) {
>  break;
>  }
> +QSIMPLEQ_INSERT_TAIL(&s->free_state_waitq, acb, next);
>  qemu_mutex_unlock(&s->mutex);
> -aio_poll(bdrv_get_aio_context(bs), true);
> +qemu_coroutine_yield();
>  qemu_mutex_lock(&s->mutex);
>  }
>  
> -- 
> 2.12.2
>

Reviewed-by: Jeff Cody
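
The wait-list idea in the commit message above can be sketched outside QEMU. This is only an illustration under stated assumptions: it uses a hand-rolled singly linked FIFO rather than QEMU's QSIMPLEQ macros, it omits the mutex and the coroutine wake-up, and the names Waiter, Pool, pool_init, pool_acquire and pool_release are hypothetical, not taken from block/curl.c.

```c
#include <stddef.h>

/* Hypothetical stand-in for CURLAIOCB: one request waiting for a state. */
struct Waiter {
    int woken;              /* set to 1 when the waiter has been woken */
    struct Waiter *next;    /* link in the FIFO wait list */
};

/* Hypothetical stand-in for the pool of CURLStates plus its wait list. */
struct Pool {
    int free_states;        /* number of states not currently in use */
    struct Waiter *head;    /* oldest waiter, or NULL if nobody waits */
    struct Waiter **tail;   /* where the next waiter will be linked */
};

void pool_init(struct Pool *p, int nstates)
{
    p->free_states = nstates;
    p->head = NULL;
    p->tail = &p->head;
}

/* Try to take a free state. On failure, append the waiter to the list
 * (the real code would then yield). Returns 1 on success, 0 if queued. */
int pool_acquire(struct Pool *p, struct Waiter *w)
{
    if (p->free_states > 0) {
        p->free_states--;
        return 1;
    }
    w->woken = 0;
    w->next = NULL;
    *p->tail = w;
    p->tail = &w->next;
    return 0;
}

/* Give a state back and wake the oldest waiter, if any. The woken waiter
 * is expected to retry pool_acquire(). Returns the waiter or NULL. */
struct Waiter *pool_release(struct Pool *p)
{
    struct Waiter *w = p->head;

    p->free_states++;
    if (w) {
        p->head = w->next;
        if (!p->head) {
            p->tail = &p->head;
        }
        w->woken = 1;
    }
    return w;
}
```

The point of the sketch is only the ordering: a release hands the wake-up to the waiter that has queued longest, and the waiter itself retries the acquisition rather than being handed a state directly.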

Re: [Qemu-devel] [PATCH 6/7] curl: convert readv to coroutines

2017-05-12 Thread Jeff Cody

On Wed, May 10, 2017 at 04:32:04PM +0200, Paolo Bonzini wrote:
> This is pretty simple.  The bottom half goes away because, unlike
> bdrv_aio_readv, coroutine-based read can return immediately without
> yielding.  However, for simplicity I kept the former bottom half
> handler in a separate function.
> 
> Signed-off-by: Paolo Bonzini 
> ---
>  block/curl.c | 94 
> 
>  1 file changed, 38 insertions(+), 56 deletions(-)
> 
> diff --git a/block/curl.c b/block/curl.c
> index 3e288f2bc7..80870bd60c 100644
> --- a/block/curl.c
> +++ b/block/curl.c
> @@ -76,10 +76,6 @@ static CURLMcode __curl_multi_socket_action(CURLM 
> *multi_handle,
>  #define CURL_TIMEOUT_DEFAULT 5
>  #define CURL_TIMEOUT_MAX 1
>  
> -#define FIND_RET_NONE   0
> -#define FIND_RET_OK 1
> -#define FIND_RET_WAIT   2
> -
>  #define CURL_BLOCK_OPT_URL   "url"
>  #define CURL_BLOCK_OPT_READAHEAD "readahead"
>  #define CURL_BLOCK_OPT_SSLVERIFY "sslverify"
> @@ -93,11 +89,12 @@ static CURLMcode __curl_multi_socket_action(CURLM 
> *multi_handle,
>  struct BDRVCURLState;
>  
>  typedef struct CURLAIOCB {
> -BlockAIOCB common;
> +Coroutine *co;
>  QEMUIOVector *qiov;
>  
>  uint64_t offset;
>  uint64_t bytes;
> +int ret;
>  
>  size_t start;
>  size_t end;
> @@ -268,11 +265,11 @@ static size_t curl_read_cb(void *ptr, size_t size, 
> size_t nmemb, void *opaque)
>request_length - offset);
>  }
>  
> +acb->ret = 0;
> +s->acb[i] = NULL;
>  qemu_mutex_unlock(&s->s->mutex);
> -acb->common.cb(acb->common.opaque, 0);
> +aio_co_wake(acb->co);
>  qemu_mutex_lock(&s->s->mutex);
> -qemu_aio_unref(acb);
> -s->acb[i] = NULL;
>  }
>  }
>  
> @@ -282,8 +279,8 @@ read_end:
>  }
>  
>  /* Called with s->mutex held.  */
> -static int curl_find_buf(BDRVCURLState *s, uint64_t start, uint64_t len,
> - CURLAIOCB *acb)
> +static bool curl_find_buf(BDRVCURLState *s, uint64_t start, uint64_t len,
> +  CURLAIOCB *acb)
>  {
>  int i;
>  uint64_t end = start + len;
> @@ -312,7 +309,8 @@ static int curl_find_buf(BDRVCURLState *s, uint64_t 
> start, uint64_t len,
>  if (clamped_len < len) {
>  qemu_iovec_memset(acb->qiov, clamped_len, 0, len - 
> clamped_len);
>  }
> -return FIND_RET_OK;
> +acb->ret = 0;
> +return true;
>  }
>  
>  // Wait for unfinished chunks
> @@ -330,13 +328,13 @@ static int curl_find_buf(BDRVCURLState *s, uint64_t 
> start, uint64_t len,
>  for (j=0; j<CURL_NUM_ACB; j++) {
>  if (!state->acb[j]) {
>  state->acb[j] = acb;
> -return FIND_RET_WAIT;
> +return true;
>  }
>  }
>  }
>  }
>  
> -return FIND_RET_NONE;
> +return false;
>  }
>  
>  /* Called with s->mutex held.  */
> @@ -381,11 +379,11 @@ static void curl_multi_check_completion(BDRVCURLState 
> *s)
>  continue;
>  }
>  
> +acb->ret = -EIO;
> +state->acb[i] = NULL;
>  qemu_mutex_unlock(&s->mutex);
> -acb->common.cb(acb->common.opaque, -EIO);
> +aio_co_wake(acb->co);
>  qemu_mutex_lock(&s->mutex);
> -qemu_aio_unref(acb);
> -state->acb[i] = NULL;
>  }
>  }
>  
> @@ -821,19 +819,11 @@ out_noclean:
>  return -EINVAL;
>  }
>  
> -static const AIOCBInfo curl_aiocb_info = {
> -.aiocb_size = sizeof(CURLAIOCB),
> -};
> -
> -
> -static void curl_readv_bh_cb(void *p)
> +static void curl_setup_preadv(BlockDriverState *bs, CURLAIOCB *acb)
>  {
>  CURLState *state;
>  int running;
> -int ret = -EINPROGRESS;
>  
> -CURLAIOCB *acb = p;
> -BlockDriverState *bs = acb->common.bs;
>  BDRVCURLState *s = bs->opaque;
>  
>  uint64_t start = acb->offset;
> @@ -843,14 +833,8 @@ static void curl_readv_bh_cb(void *p)
>  
>  // In case we have the requested data already (e.g. read-ahead),
>  // we can just call the callback and be done.
> -switch (curl_find_buf(s, start, acb->bytes, acb)) {
> -case FIND_RET_OK:
> -ret = 0;
> -goto out;
> -case FIND_RET_WAIT:
> -goto out;
> -default:
> -break;
> +if (curl_find_buf(s, start, acb->bytes, acb)) {
> +goto out;
>  }
>  
>  // No cache found, so let's start a new request
> @@ -866,7 +850,7 @@ static void curl_readv_bh_cb(void *p)
>  
>  if (curl_init_state(s, state) < 0) {
>  curl_clean_state(state);
> -ret = -EIO;
> +acb->ret = -EIO;
>

Re: [Qemu-devel] [PATCH 5/7] curl: convert CURLAIOCB to byte values

2017-05-12 Thread Jeff Cody

On Wed, May 10, 2017 at 04:32:03PM +0200, Paolo Bonzini wrote:
> This is in preparation for the conversion from bdrv_aio_readv to
> bdrv_co_preadv, and it also requires changing some of the size_t values
> to uint64_t.  This was broken before for disks > 2TB, but now it would
> break at 4GB.
> 
> Signed-off-by: Paolo Bonzini 
> ---
>  block/curl.c | 44 ++--
>  1 file changed, 22 insertions(+), 22 deletions(-)
> 
> diff --git a/block/curl.c b/block/curl.c
> index 4b4d5a2389..3e288f2bc7 100644
> --- a/block/curl.c
> +++ b/block/curl.c
> @@ -96,8 +96,8 @@ typedef struct CURLAIOCB {
>  BlockAIOCB common;
>  QEMUIOVector *qiov;
>  
> -int64_t sector_num;
> -int nb_sectors;
> +uint64_t offset;
> +uint64_t bytes;
>  
>  size_t start;
>  size_t end;
> @@ -115,7 +115,7 @@ typedef struct CURLState
>  CURL *curl;
>  QLIST_HEAD(, CURLSocket) sockets;
>  char *orig_buf;
> -size_t buf_start;
> +uint64_t buf_start;
>  size_t buf_off;
>  size_t buf_len;
>  char range[128];
> @@ -126,7 +126,7 @@ typedef struct CURLState
>  typedef struct BDRVCURLState {
>  CURLM *multi;
>  QEMUTimer timer;
> -size_t len;
> +uint64_t len;
>  CURLState states[CURL_NUM_STATES];
>  char *url;
>  size_t readahead_size;
> @@ -257,7 +257,7 @@ static size_t curl_read_cb(void *ptr, size_t size, size_t 
> nmemb, void *opaque)
>  continue;
>  
>  if ((s->buf_off >= acb->end)) {
> -size_t request_length = acb->nb_sectors * BDRV_SECTOR_SIZE;
> +size_t request_length = acb->bytes;
>  
>  qemu_iovec_from_buf(acb->qiov, 0, s->orig_buf + acb->start,
>  acb->end - acb->start);
> @@ -282,18 +282,18 @@ read_end:
>  }
>  
>  /* Called with s->mutex held.  */
> -static int curl_find_buf(BDRVCURLState *s, size_t start, size_t len,
> +static int curl_find_buf(BDRVCURLState *s, uint64_t start, uint64_t len,
>   CURLAIOCB *acb)
>  {
>  int i;
> -size_t end = start + len;
> -size_t clamped_end = MIN(end, s->len);
> -size_t clamped_len = clamped_end - start;
> +uint64_t end = start + len;
> +uint64_t clamped_end = MIN(end, s->len);
> +uint64_t clamped_len = clamped_end - start;
>  
>  for (i=0; i<CURL_NUM_STATES; i++) {
>  CURLState *state = &s->states[i];
> -size_t buf_end = (state->buf_start + state->buf_off);
> -size_t buf_fend = (state->buf_start + state->buf_len);
> +uint64_t buf_end = (state->buf_start + state->buf_off);
> +uint64_t buf_fend = (state->buf_start + state->buf_len);
>  
>  if (!state->orig_buf)
>  continue;
> @@ -788,7 +788,7 @@ static int curl_open(BlockDriverState *bs, QDict 
> *options, int flags,
>  }
>  #endif
>  
> -s->len = (size_t)d;
> +s->len = d;
>  
>  if ((!strncasecmp(s->url, "http://", strlen("http://"))
>  || !strncasecmp(s->url, "https://", strlen("https://")))
> @@ -797,7 +797,7 @@ static int curl_open(BlockDriverState *bs, QDict 
> *options, int flags,
>  "Server does not support 'range' (byte ranges).");
>  goto out;
>  }
> -DPRINTF("CURL: Size = %zd\n", s->len);
> +DPRINTF("CURL: Size = %" PRIu64 "\n", s->len);
>  
>  qemu_mutex_lock(&s->mutex);
>  curl_clean_state(state);
> @@ -836,14 +836,14 @@ static void curl_readv_bh_cb(void *p)
>  BlockDriverState *bs = acb->common.bs;
>  BDRVCURLState *s = bs->opaque;
>  
> -size_t start = acb->sector_num * BDRV_SECTOR_SIZE;
> -size_t end;
> +uint64_t start = acb->offset;
> +uint64_t end;
>  
>  qemu_mutex_lock(&s->mutex);
>  
>  // In case we have the requested data already (e.g. read-ahead),
>  // we can just call the callback and be done.
> -switch (curl_find_buf(s, start, acb->nb_sectors * BDRV_SECTOR_SIZE, 
> acb)) {
> +switch (curl_find_buf(s, start, acb->bytes, acb)) {
>  case FIND_RET_OK:
>  ret = 0;
>  goto out;
> @@ -871,7 +871,7 @@ static void curl_readv_bh_cb(void *p)
>  }
>  
>  acb->start = 0;
> -acb->end = MIN(acb->nb_sectors * BDRV_SECTOR_SIZE, s->len - start);
> +acb->end = MIN(acb->bytes, s->len - start);
>  
>  state->buf_off = 0;
>  g_free(state->orig_buf);
> @@ -886,9 +886,9 @@ static void curl_readv_bh_cb(void *p)
>  }
>  state->acb[0] = acb;
>  
> -snprintf(state->range, 127, "%zd-%zd", start, end);
> -DPRINTF("CURL (AIO): Reading %llu at %zd (%s)\n",
> -(acb->nb_sectors * BDRV_SECTOR_SIZE), start, state->range);
> +snprintf(state->range, 127, "%" PRIu64 "-%" PRIu64, start, end);
> +DPRINTF("CURL (AIO): Reading %" PRIu64 " at %" PRIu64 " (%s)\n",
> +acb->bytes, start, state->range);
>  curl_easy_setopt(state->curl, CURLOPT_RANGE, state->range);
>  
>  curl_multi_add_handle(s->multi, state->curl);
> @@
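
To illustrate the truncation the commit message of this patch warns about, here is a minimal sketch. It assumes a host where size_t is 32 bits wide, modelled here with uint32_t; SECTOR_SIZE stands in for BDRV_SECTOR_SIZE, and both helper names are hypothetical and do not appear in block/curl.c.

```c
#include <stdint.h>

/* Stand-in for BDRV_SECTOR_SIZE (assumed to be 512 bytes). */
#define SECTOR_SIZE 512ULL

/* Convert a sector number to a byte offset using 64-bit arithmetic. */
uint64_t sector_to_offset(int64_t sector_num)
{
    return (uint64_t)sector_num * SECTOR_SIZE;
}

/* What a byte offset becomes when stored in a 32-bit size_t:
 * everything above the low 32 bits is silently discarded. */
uint32_t offset_in_32bit_size_t(uint64_t offset)
{
    return (uint32_t)offset;
}
```

With these helpers, sector 10485760 maps to a byte offset of 5 GiB, and storing that offset in a 32-bit variable leaves 1 GiB, which is why byte-valued fields are better kept in uint64_t.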

Re: [Qemu-devel] [PATCH 4/7] curl: split curl_find_state/curl_init_state

2017-05-12 Thread Jeff Cody

On Wed, May 10, 2017 at 04:32:02PM +0200, Paolo Bonzini wrote:
> If curl_easy_init fails, a CURLState is left with s->in_use = 1.  Split
> curl_init_state in two, so that we can distinguish the two failures and
> call curl_clean_state if needed.
> 
> While at it, simplify curl_find_state, removing a dummy loop.  The
> aio_poll loop is moved to the sole caller that needs it.
> 
> Signed-off-by: Paolo Bonzini 
> ---
>  block/curl.c | 52 ++--
>  1 file changed, 30 insertions(+), 22 deletions(-)
> 
> diff --git a/block/curl.c b/block/curl.c
> index b18e79bf54..4b4d5a2389 100644
> --- a/block/curl.c
> +++ b/block/curl.c
> @@ -455,34 +455,27 @@ static void curl_multi_timeout_do(void *arg)
>  }
>  
>  /* Called with s->mutex held.  */
> -static CURLState *curl_init_state(BlockDriverState *bs, BDRVCURLState *s)
> +static CURLState *curl_find_state(BDRVCURLState *s)
>  {
>  CURLState *state = NULL;
> -int i, j;
> -
> -do {
> -for (i=0; i<CURL_NUM_STATES; i++) {
> -for (j=0; j<CURL_NUM_ACB; j++)
> -if (s->states[i].acb[j])
> -continue;
> -if (s->states[i].in_use)
> -continue;
> +int i;
>  
> +for (i=0; i<CURL_NUM_STATES; i++) {
> +if (!s->states[i].in_use) {
>  state = &s->states[i];
>  state->in_use = 1;
>  break;
>  }
> -if (!state) {
> -qemu_mutex_unlock(&s->mutex);
> -aio_poll(bdrv_get_aio_context(bs), true);
> -qemu_mutex_lock(&s->mutex);
> -}
> -} while(!state);
> +}
> +return state;
> +}
>  
> +static int curl_init_state(BDRVCURLState *s, CURLState *state)
> +{
>  if (!state->curl) {
>  state->curl = curl_easy_init();
>  if (!state->curl) {
> -return NULL;
> +return -EIO;
>  }
>  curl_easy_setopt(state->curl, CURLOPT_URL, s->url);
>  curl_easy_setopt(state->curl, CURLOPT_SSL_VERIFYPEER,
> @@ -535,7 +528,7 @@ static CURLState *curl_init_state(BlockDriverState *bs, 
> BDRVCURLState *s)
>  QLIST_INIT(&state->sockets);
>  state->s = s;
>  
> -return state;
> +return 0;
>  }
>  
>  /* Called with s->mutex held.  */
> @@ -756,13 +749,18 @@ static int curl_open(BlockDriverState *bs, QDict 
> *options, int flags,
>  s->aio_context = bdrv_get_aio_context(bs);
>  s->url = g_strdup(file);
>  qemu_mutex_lock(&s->mutex);
> -state = curl_init_state(bs, s);
> +state = curl_find_state(s);
>  qemu_mutex_unlock(&s->mutex);
> -if (!state)
> +if (!state) {
>  goto out_noclean;
> +}
>  
>  // Get file size
>  
> +if (curl_init_state(s, state) < 0) {
> +goto out;
> +}
> +
>  s->accept_range = false;
>  curl_easy_setopt(state->curl, CURLOPT_NOBODY, 1);
>  curl_easy_setopt(state->curl, CURLOPT_HEADERFUNCTION,
> @@ -856,8 +854,18 @@ static void curl_readv_bh_cb(void *p)
>  }
>  
>  // No cache found, so let's start a new request
> -state = curl_init_state(acb->common.bs, s);
> -if (!state) {
> +for (;;) {
> +state = curl_find_state(s);
> +if (state) {
> +break;
> +}
> +qemu_mutex_unlock(&s->mutex);
> +aio_poll(bdrv_get_aio_context(bs), true);
> +qemu_mutex_lock(&s->mutex);
> +}
> +
> +if (curl_init_state(s, state) < 0) {
> +curl_clean_state(state);

For some reason, I initially thought this might cause problems with the
assert in curl_clean_state(), but that isn't the case.

Reviewed-by: Jeff Cody 

>  ret = -EIO;
>  goto out;
>  }
> -- 
> 2.12.2
> 
>

Re: [Qemu-devel] [PATCH V2] migration: expose qemu_announce_self() via qmp

2017-05-12 Thread Vlad Yasevich

On 05/12/2017 03:24 PM, Dr. David Alan Gilbert wrote:
> * Vlad Yasevich (vyase...@redhat.com) wrote:
>> On 02/20/2017 07:16 PM, Germano Veit Michel wrote:
>>> qemu_announce_self() is triggered by qemu at the end of migrations
>>> to update the network regarding the path to the guest l2addr.
>>>
>>> however it is also useful when there is a network change such as
>>> an active bond slave swap. Essentially, it's the same as a migration
>>> from a network perspective - the guest moves to a different point
>>> in the network topology.
>>>
>>> this exposes the function via qmp.
>>>
>>> Signed-off-by: Germano Veit Michel 
>>> ---
>>>  include/migration/vmstate.h |  5 +
>>>  migration/savevm.c  | 30 +++---
>>>  qapi-schema.json| 18 ++
>>>  3 files changed, 42 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
>>> index 63e7b02..a08715c 100644
>>> --- a/include/migration/vmstate.h
>>> +++ b/include/migration/vmstate.h
>>> @@ -1042,6 +1042,11 @@ int64_t self_announce_delay(int round)
>>>  return 50 + (SELF_ANNOUNCE_ROUNDS - round - 1) * 100;
>>>  }
>>>
>>> +struct AnnounceRound {
>>> +QEMUTimer *timer;
>>> +int count;
>>> +};
>>> +
>>>  void dump_vmstate_json_to_file(FILE *out_fp);
>>>
>>>  #endif
>>> diff --git a/migration/savevm.c b/migration/savevm.c
>>> index 5ecd264..44e196b 100644
>>> --- a/migration/savevm.c
>>> +++ b/migration/savevm.c
>>> @@ -118,29 +118,37 @@ static void qemu_announce_self_iter(NICState
>>> *nic, void *opaque)
>>>  qemu_send_packet_raw(qemu_get_queue(nic), buf, len);
>>>  }
>>>
>>> -
>>>  static void qemu_announce_self_once(void *opaque)
>>>  {
>>> -static int count = SELF_ANNOUNCE_ROUNDS;
>>> -QEMUTimer *timer = *(QEMUTimer **)opaque;
>>> +struct AnnounceRound *round = opaque;
>>>
>>>  qemu_foreach_nic(qemu_announce_self_iter, NULL);
>>>
>>> -if (--count) {
>>> +round->count--;
>>> +if (round->count) {
>>>  /* delay 50ms, 150ms, 250ms, ... */
>>> -timer_mod(timer, qemu_clock_get_ms(QEMU_CLOCK_REALTIME) +
>>> -  self_announce_delay(count));
>>> +timer_mod(round->timer, qemu_clock_get_ms(QEMU_CLOCK_REALTIME) +
>>> +  self_announce_delay(round->count));
>>>  } else {
>>> -timer_del(timer);
>>> -timer_free(timer);
>>> +timer_del(round->timer);
>>> +timer_free(round->timer);
>>> +g_free(round);
>>>  }
>>>  }
>>>
>>>  void qemu_announce_self(void)
>>>  {
>>> -static QEMUTimer *timer;
>>> -timer = timer_new_ms(QEMU_CLOCK_REALTIME, qemu_announce_self_once, 
>>> &timer);
>>> -qemu_announce_self_once(&timer);
>>> +struct AnnounceRound *round = g_malloc(sizeof(struct AnnounceRound));
>>> +if (!round)
>>> +return;
>>> +round->count = SELF_ANNOUNCE_ROUNDS;
>>> +round->timer = timer_new_ms(QEMU_CLOCK_REALTIME,
>>> qemu_announce_self_once, round);
>>> +qemu_announce_self_once(round);
>>> +}
>>
>> So, I've been looking at this code and have been playing with it and with
>> David's patches and my patches to include virtio self announcements as
>> well.  What I've discovered is what I think is a possible packet
>> amplification issue here.
>>
>> This creates a new timer every time we do an announce_self.  With just
>> migration, this is not an issue since you only migrate once at a time, so
>> there is only 1 timer.  With exposing this as an API, a user can
>> potentially call it in a tight loop and now you have a ton of timers being
>> created.  Add in David's patches allowing timeouts and retries to be
>> configurable, and you may now have a ton of long-lived timers.  Add in the
>> patches I am working on to let virtio do self announcements too (to really
>> fix bonding issues), and now you add in a possibility of a lot of packets
>> being sent for each timeout (RARP, GARP, NA, IGMPv4 Reports, IGMPv6 Reports
>> [even worse if MLD1 is used]).
>>
>> As you can see, this can get rather ugly...
>>
>> I think we need timer users here, migration and QMP being two to begin
>> with.  Each one would get a single timer to play with.  If a given user
>> already has a timer running, we could return an error or just not do
>> anything.
> 
> If you did have specific timers, then you could add to/reset the counts
> rather than doing nothing.  That way it's less racy; if you issue the
> command just as you reconfigure your network, there's no chance the
> command would fail, you will send the packets out.

Yes.  That's another possible way to handle this.

-vlad
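
A minimal sketch of the "one timer per user, re-arm instead of stacking" idea discussed above. This is not QEMU code: the timer is reduced to a round counter, SELF_ANNOUNCE_ROUNDS is assumed to be 5, and the enum, struct and function names are hypothetical.

```c
#include <stdbool.h>

/* Assumed value; stands in for the constant used by the patch. */
#define SELF_ANNOUNCE_ROUNDS 5

/* Hypothetical list of announce-timer users. */
enum AnnounceUser {
    ANNOUNCE_USER_MIGRATION,
    ANNOUNCE_USER_QMP,
    ANNOUNCE_USER_MAX
};

/* One slot per user, so repeated requests can never stack up timers. */
struct AnnounceTimer {
    bool active;        /* a timer is currently running for this user */
    int rounds_left;    /* announcement rounds still to be sent */
};

static struct AnnounceTimer announce_timers[ANNOUNCE_USER_MAX];

/* Request an announcement. If the user's timer is already running, only
 * its round count is reset. Returns true if a new timer was started. */
bool announce_request(enum AnnounceUser user)
{
    struct AnnounceTimer *t = &announce_timers[user];
    bool started = !t->active;

    t->active = true;
    t->rounds_left = SELF_ANNOUNCE_ROUNDS;
    return started;
}

/* One timer expiry: real code would send the packets here.
 * Returns true while further rounds remain for this user. */
bool announce_tick(enum AnnounceUser user)
{
    struct AnnounceTimer *t = &announce_timers[user];

    if (!t->active) {
        return false;
    }
    if (--t->rounds_left == 0) {
        t->active = false;
    }
    return t->active;
}

/* Number of timers currently running; bounded by ANNOUNCE_USER_MAX. */
int announce_active_timers(void)
{
    int i, n = 0;

    for (i = 0; i < ANNOUNCE_USER_MAX; i++) {
        n += announce_timers[i].active ? 1 : 0;
    }
    return n;
}
```

Calling announce_request() in a tight loop leaves at most one running timer per user, and a request that races with a network reconfiguration still results in a full set of rounds being sent.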
> 
> Dave
> 
>> -vlad
>>
>>> +
>>> +void qmp_announce_self(Error **errp)
>>> +{
>>> +qemu_announce_self();
>>>  }
>>>
>>>  /***/
>>> diff --git a/qapi-schema.json b/qapi-schema.json
>>> index baa0d26..0d9bffd 100644
>>> ---

Re: [Qemu-devel] [PATCH v3 1/4] ACPI: Add APEI GHES Table Generation support

2017-05-12 Thread Laszlo Ersek

On 04/30/17 07:35, Dongjiu Geng wrote:
> This implements APEI GHES Table by passing the error cper info
> to the guest via a fw_cfg_blob. After a CPER info is added, an
> SEA/SEI exception will be injected into the guest OS.
>
> Below is the table layout; the maximum number of error sources is 11,
> classified by notification type.
>
> [Diagram garbled in the archive. Recoverable content: the HEST in
> etc/acpi/tables contains entries GHES1 ... GHES10; each GHESn points to
> an address register (address1 ... address10) in etc/hardware_errors; each
> address register points to its own Error Status Data Block (1 ... 10),
> which holds one or more CPER records.]
>
> Signed-off-by: Dongjiu Geng 
> ---
>  default-configs/arm-softmmu.mak |   1 +
>  hw/acpi/Makefile.objs   |   1 +
>  hw/acpi/aml-build.c |   2 +
>  hw/acpi/hest_ghes.c | 203 +++
>  hw/arm/virt-acpi-build.c|   6 ++
>  include/hw/acpi/acpi-defs.h | 227 
> 
>  include/hw/acpi/aml-build.h |   1 +
>  include/hw/acpi/hest_ghes.h |  43 
>  8 files changed, 484 insertions(+)
>  create mode 100644 hw/acpi/hest_ghes.c
>  create mode 100644 include/hw/acpi/hest_ghes.h

Disclaimer: I'm not an ACPI (or any kind of) QEMU maintainer, so I can
only share my personal opinion.

(1) This patch is too big. It should be split in two parts at least.

The first patch should contain the new ACPI structures and macros. The
second patch should contain the generation feature.

I'll reorder the diff in my response.

> diff --git a/include/hw/acpi/acpi-defs.h b/include/hw/acpi/acpi-defs.h
> index 4cc3630..27adede 100644
> --- a/include/hw/acpi/acpi-defs.h
> +++ b/include/hw/acpi/acpi-defs.h
> @@ -295,6 +295,58 @@ typedef struct AcpiMultipleApicTable 
> AcpiMultipleApicTable;
>  #define ACPI_APIC_GENERIC_TRANSLATOR15
>  #define ACPI_APIC_RESERVED  16   /* 16 and greater are reserved 
> */
>

(2) Please add a comment above the following macros: they come from the
UEFI Spec 2.6, "N.2.5 Memory Error Section".

> +#define CPER_MEM_VALID_ERROR_STATUS 0x0001
> +#define CPER_MEM_VALID_PA   0x0002
> +#define CPER_MEM_VALID_PA_MASK  0x0004
> +#define CPER_MEM_VALID_NODE 0x0008
> +#define CPER_MEM_VALID_CARD 0x0010
> +#define CPER_MEM_VALID_MODULE   0x0020
> +#define CPER_MEM_VALID_BANK 0x0040
> +#define CPER_MEM_VALID_DEVICE   0x0080
> +#define CPER_MEM_VALID_ROW  0x0100
> +#define CPER_MEM_VALID_COLUMN   0x0200
> +#define CPER_MEM_VALID_BIT_POSITION 0x0400
> +#define CPER_MEM_VALID_REQUESTOR_ID 0x0800
> +#define CPER_MEM_VALID_RESPONDER_ID 0x1000
> +#define CPER_MEM_VALID_TARGET_ID0x2000

(3) _ID should be dropped.

> +#define CPER_MEM_VALID_ERROR_TYPE   0x4000
> +#define CPER_MEM_VALID_RANK_NUMBER  0x8000
> +#define CPER_MEM_VALID_CARD_HANDLE  0x1
> +#define

Re: [Qemu-devel] [PULL for-2.9 1/3] coroutine: remove GThread implementation

2017-05-12 Thread Eric Blake

On 05/12/2017 09:37 AM, Stefan Hajnoczi wrote:
> From: "Daniel P. Berrange" 
> 
> The GThread implementation is not functional enough to actually
> run QEMU reliably. While it was potentially useful for debugging,
> we have a scripts/qemugdb/coroutine.py to enable tracing of
> ucontext coroutines in GDB, so that removes the only reason for
> GThread to exist.
> 

Quick question (unrelated to the pull request): Can we get a patch to
scripts/qemugdb/coroutine.py or nearby that mentions the preferred way
to actually use the script?  I didn't know the script existed before
seeing this commit message, and was sad to see that it was not
self-documenting (or else a scripts/qemugdb/README file) on how best to
utilize the script (is it as simple as running 'gdb --some-arg
/path/to/coroutine.py'? do you have to install some magic file to tell
gdb to autoload the file any time you are debugging qemu? something
else???).

Okay, I searched a bit harder, and found that scripts/qemu-gdb.py loads
these subscripts, if you type 'source scripts/qemu-gdb.py' at the (gdb)
prompt.  Still, is there a way to automate that so that the scripts are
loaded every time you fire up gdb, instead of having to remember to do
it by hand? And if so, can that be documented?

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


Re: [Qemu-devel] [PATCH V5 0/9] calculate blocktime for postcopy live migration

2017-05-12 Thread Eric Blake

On 05/12/2017 08:31 AM, Alexey Perevalov wrote:
> The rationale for the idea is as follows:
> a vCPU can be suspended during postcopy live migration until the faulted
> page has been copied into the kernel. Downtime on the source side is the
> time interval from when the source turns the vCPU off until the destination
> starts running the vCPU. That value is proper for precopy migration, where
> it really shows the amount of time the vCPU is down, but not for postcopy
> migration, because several vCPU threads can be suspended after the vCPU was
> started. That is important for estimating packet drop for SDN software.
> 
> This is the 5th version of the patch set. The previous version had a build
> error in mingw. The first version was tagged as RFC, the second was without
> a version tag, the third was tagged V3.
> 
> This patch set doesn't include the improvements suggested by Peter Xu for
> get_mem_fault_cpu_index, but I would prefer to do it. I am thinking of
> introducing a tree for fast CPUState lookup by thread_id, or general code,
> since there are places like qemu_get_cpu (cpus.c) with a similar lookup.
> 
> (V4 -> V5)
> - the fill_destination_postcopy_migration_info empty stub was missing for
> non-Linux builds
> (V3 -> V4)

I reviewed a couple of spots related to QMP in v4 before seeing that you
had already posted v5; those comments still apply.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




Re: [Qemu-devel] [PATCH V4 9/9] migration: postcopy_blocktime documentation

2017-05-12 Thread Eric Blake

On 05/12/2017 06:31 AM, Alexey Perevalov wrote:
> Signed-off-by: Alexey Perevalov 
> ---
>  docs/migration.txt | 10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/docs/migration.txt b/docs/migration.txt
> index 1b940a8..d0f5a6d 100644
> --- a/docs/migration.txt
> +++ b/docs/migration.txt
> @@ -402,6 +402,16 @@ will now cause the transition from precopy to postcopy.
>  It can be issued immediately after migration is started or any
>  time later on.  Issuing it after the end of a migration is harmless.
>  
> +Blocktime it's a postcopy live migration metric, intend to show

s/it's/is/
s/intend/intended/

> +when source vCPU was in state interruptable sleep due to pagefault.

s/when/how long the/

> +This value is calculated on destination side.
> +To enable postcopy blocktime calculation, enter following command on 
> destination
> +monitor:
> +
> +migrate_set_capability postcopy-blocktime on
> +
> +Postcopy blocktime could be retrieved by query-migrate qmp command.

s/could/can/

> +
>  Note: During the postcopy phase, the bandwidth limits set using
>  migrate_set_speed is ignored (to avoid delaying requested pages that
>  the destination is waiting for).
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




Re: [Qemu-devel] [PATCH V4 8/9] migration: add postcopy total blocktime into query-migrate

2017-05-12 Thread Eric Blake

On 05/12/2017 06:31 AM, Alexey Perevalov wrote:
> Postcopy total blocktime is available on destination side only.
> But query-migrate was possible only for source. This patch
> adds ability to call query-migrate on destination. To distinguish
> src/dst, state of the MigrationState is using, query-migrate prepares
> MigrationInfo for source machine only in case of migration's state is 
> different
> than MIGRATION_STATUS_NONE.
> 
> To be able to see postcopy blocktime, need to request postcopy-blocktime
> capability.
> 
> The query-migrate command will show following sample result:
> {"return":
> "postcopy_vcpu_blocktime": [115, 100],
> "status": "completed",
> "postcopy_blocktime": 100
> }}
> 

> +++ b/qapi-schema.json
> @@ -712,6 +712,8 @@
>  #  @status is 'failed'. Clients should not attempt to parse the
>  #  error strings. (Since 2.7)
>  #
> +# @postcopy_vcpu_blocktime: list of the postcopy blocktime per vCPU (Since 
> 2.9)

New members should favor '-' over '_' in their name, unless being
consistent with existing members...

> +#
>  # Since: 0.14.0
>  ##
>  { 'struct': 'MigrationInfo',
> @@ -723,7 +725,9 @@
> '*downtime': 'int',
> '*setup-time': 'int',
> '*cpu-throttle-percentage': 'int',
> -   '*error-desc': 'str'} }
> +   '*error-desc': 'str',
> +   '*postcopy_blocktime' : 'int64',
> +   '*postcopy_vcpu_blocktime': ['int64']} }

...but existing members use '-', so these should be 'postcopy-blocktime'
and 'postcopy-vcpu-blocktime'.


-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




Re: [Qemu-devel] [Qemu-ppc] [PATCH v9 6/6] migration: spapr: migrate pending_events of spapr state

2017-05-12 Thread Daniel Henrique Barboza




On 05/12/2017 03:28 AM, David Gibson wrote:

On Fri, May 05, 2017 at 05:47:46PM -0300, Daniel Henrique Barboza wrote:

From: Jianjun Duan 

In racing situations between hotplug events and migration operation,
a rtas hotplug event might not yet have been delivered to the source
guest when migration is started. In this case the pending_events of
spapr state need to be transmitted to the target so that the hotplug
event can be finished on the target.

All the different fields of the events are encoded as defined by
PAPR. We can migrate them as uint8_t binary stream without any
concerns about data padding or endianness.

pending_events is put in a subsection in the spapr state VMSD to make
sure migration across different versions is not broken.

Signed-off-by: Jianjun Duan 
Signed-off-by: Daniel Henrique Barboza 

This seems like it's probably a good idea, even independent of the
hotplug migration stuff.  I suspect there are other races where we
could lose a shutdown event or similar if there's a migration.

Perhaps we can detach this patch (and the ccs_list one) from this
series and evaluate them separately?


Daniel




---
  hw/ppc/spapr.c | 33 +
  hw/ppc/spapr_events.c  | 24 +---
  include/hw/ppc/spapr.h |  3 ++-
  3 files changed, 48 insertions(+), 12 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index bc56249..e924fd4 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1498,6 +1498,38 @@ static const VMStateDescription vmstate_spapr_ccs_list = 
{
  },
  };
  
+static bool spapr_pending_events_needed(void *opaque)

+{
+sPAPRMachineState *spapr = (sPAPRMachineState *)opaque;
+return !QTAILQ_EMPTY(&spapr->pending_events);
+}
+
+static const VMStateDescription vmstate_spapr_event_entry = {
+.name = "spapr_event_log_entry",
+.version_id = 1,
+.minimum_version_id = 1,
+.fields = (VMStateField[]) {
+VMSTATE_INT32(log_type, sPAPREventLogEntry),

This requires changing the actual type to int32_t in the structure.


+VMSTATE_BOOL(exception, sPAPREventLogEntry),

So, at the moment, AFAICT every event is marked as exception == true,
so this doesn't actually tell us anything.   If that becomes not the
case in future, can the exception flag be derived from the log_type or
information in the event buffer?


+VMSTATE_UINT32(data_size, sPAPREventLogEntry),
+VMSTATE_VARRAY_UINT32_ALLOC(data, sPAPREventLogEntry, data_size,
+0, vmstate_info_uint8, uint8_t),

So, data_size duplicates information that's in the event header, which
is a bit sad.  I suppose I'm ok with that, since setting up the VARRAY
thing is going to be pretty awkward otherwise.


+VMSTATE_END_OF_LIST()
+},
+};
+
+static const VMStateDescription vmstate_spapr_pending_events = {
+.name = "spapr_pending_events",
+.version_id = 1,
+.minimum_version_id = 1,
+.needed = spapr_pending_events_needed,
+.fields = (VMStateField[]) {
+VMSTATE_QTAILQ_V(pending_events, sPAPRMachineState, 1,
+ vmstate_spapr_event_entry, sPAPREventLogEntry, next),
+VMSTATE_END_OF_LIST()
+},
+};
+
  static bool spapr_ov5_cas_needed(void *opaque)
  {
  sPAPRMachineState *spapr = opaque;
@@ -1598,6 +1630,7 @@ static const VMStateDescription vmstate_spapr = {
  &vmstate_spapr_patb_entry,
  &vmstate_spapr_pending_dimm_unplugs,
  &vmstate_spapr_ccs_list,
+&vmstate_spapr_pending_events,
  NULL
  }
  };
diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
index f0b28d8..70c7cfc 100644
--- a/hw/ppc/spapr_events.c
+++ b/hw/ppc/spapr_events.c
@@ -342,7 +342,8 @@ static int rtas_event_log_to_irq(sPAPRMachineState *spapr, 
int log_type)
  return source->irq;
  }
  
-static void rtas_event_log_queue(int log_type, void *data, bool exception)

+static void rtas_event_log_queue(int log_type, void *data, bool exception,
+ int data_size)
  {
  sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
  sPAPREventLogEntry *entry = g_new(sPAPREventLogEntry, 1);
@@ -351,6 +352,7 @@ static void rtas_event_log_queue(int log_type, void *data, 
bool exception)
  entry->log_type = log_type;
  entry->exception = exception;
  entry->data = data;
+entry->data_size = data_size;

I think it would make more sense to derive data_size from the buffer
header contents here, rather than in all the callers.


  QTAILQ_INSERT_TAIL(&spapr->pending_events, entry, next);
  }
  
@@ -445,6 +447,7 @@ static void spapr_powerdown_req(Notifier *n, void *opaque)

  struct rtas_event_log_v6_mainb *mainb;
  struct rtas_event_log_v6_epow *epow;
  struct epow_log_full *new_epow;
+uint32_t data_size;
  
  new_epow = g_malloc0(sizeof(*new_epow));

  hdr = &new_epow->hdr;
@@ -453,14 +456,13 @@ static void spapr_powerdown_req(Notifier

Re: [Qemu-devel] [PATCH 2/3] migration: Remove use of old MigrationParams

2017-05-12 Thread Eric Blake

On 05/12/2017 05:55 AM, Juan Quintela wrote:
>>> @@ -1239,6 +1240,7 @@ void qmp_migrate(const char *uri, bool has_blk, bool 
>>> blk,
>>>  }
>>>  
>>>  if (has_inc && inc) {
>>> +migrate_set_block_enabled(s, true);
>>>  migrate_set_block_shared(s, true);
>>
>> [2]
>>
>> IIUC for [1] & [2] we are solving the same problem that "shared"
>> depends on "enabled" bit. Would it be good to unify this dependency
>> somewhere? E.g., by changing migrate_set_block_shared() into:
>>
>> void migrate_set_block_shared(MigrationState *s, bool value)
>> {
>> s->enabled_capabilities[MIGRATION_CAPABILITY_BLOCK_SHARED] = value;
>> if (value) {
>> migrate_set_block_enabled(s, true);
>> }
>> }
> 
> ok with this.

Or, as I commented on 1/3, maybe having a single property that is a
tri-state enum value, instead of 2 separate boolean properties, might be
nicer (but certainly a bit more complex to code up).
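
To make the tri-state idea concrete, here is a minimal sketch, assuming only three states are meaningful (off, full block migration, incremental block migration). The enum and helper names are hypothetical and are not part of the patch or of QEMU.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical tri-state replacing the two independent booleans. */
typedef enum {
    BLOCK_MIG_OFF,          /* no block migration */
    BLOCK_MIG_FULL,         /* "enabled" without "shared" (HMP -b) */
    BLOCK_MIG_INCREMENTAL   /* "enabled" with "shared" (HMP -i) */
} BlockMigMode;

/* Map the legacy flag pair onto the enum; "shared" implies "enabled". */
BlockMigMode block_mig_mode_from_flags(bool blk, bool inc)
{
    if (inc) {
        return BLOCK_MIG_INCREMENTAL;
    }
    return blk ? BLOCK_MIG_FULL : BLOCK_MIG_OFF;
}

/* The two old booleans become derived views of the single value. */
bool block_mig_enabled(BlockMigMode m) { return m != BLOCK_MIG_OFF; }
bool block_mig_shared(BlockMigMode m)  { return m == BLOCK_MIG_INCREMENTAL; }

/* Checks that "shared without enabled" cannot be represented. */
void block_mig_selftest(void)
{
    for (int blk = 0; blk <= 1; blk++) {
        for (int inc = 0; inc <= 1; inc++) {
            BlockMigMode m = block_mig_mode_from_flags(blk, inc);
            assert(!block_mig_shared(m) || block_mig_enabled(m));
        }
    }
}
```

With a single value, the "shared depends on enabled" invariant holds by construction instead of being enforced in each setter.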

> Should I also add here that when we disable block enabled, we also disable
> shared, or just leave it that way?
> 
>> Another thing to mention: after switching to the capability interface,
>> we'll cache the "enabled" and "shared" bits now while we don't cache
>> it before, right? IIUC it'll affect behavior of such sequence:
>>
>> - 1st migrate with enabled=1, shared=1, then
>> - 2nd migrate with enabled=0, shared=0
>>
>> Before the series, the 2nd migrate will use enabled=shared=0, but
>> after the series it should be using enabled=shared=1. Not sure whether
>> this would be a problem (or I missed anything?).
> 
> We can't be consistent with both old/new way.
> 
> Old way: we always setup the capabilities on command line (that should
> have been deprecated long, long ago)

Well, the easy way out is to have the HMP migrate command (I assume
that's what you mean by "on command line") explicitly clear the
parameters if it is called without the -b/-i flag.  So the start of each
migration is what changes the properties, so long as you are still using
HMP to start the migration.  Or, on the QMP side, since 'migrate' has
optional 'blk' and 'inc' booleans, basically leave the settings alone if
the parameters were omitted, and explicitly update the property to the
value of those parameters if they were present.
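
The two behaviours described above can be sketched as follows. This is only an illustration under stated assumptions: MigCaps is a hypothetical stand-in for the cached capability bits, and neither function is the actual QEMU code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the cached "enabled"/"shared" capabilities. */
typedef struct {
    bool block_enabled;
    bool block_shared;
} MigCaps;

/* HMP-style: each migrate command rewrites both bits, so omitting
 * -b/-i explicitly clears whatever an earlier migration left behind. */
void hmp_style_apply(MigCaps *caps, bool flag_b, bool flag_i)
{
    caps->block_enabled = flag_b || flag_i;
    caps->block_shared = flag_i;
}

/* QMP-style: an omitted optional parameter leaves the cached value
 * alone; a present one overwrites it. */
void qmp_style_apply(MigCaps *caps, bool has_blk, bool blk,
                     bool has_inc, bool inc)
{
    if (has_blk) {
        caps->block_enabled = blk;
    }
    if (has_inc) {
        caps->block_shared = inc;
        if (inc) {
            caps->block_enabled = true;   /* shared implies enabled */
        }
    }
}

/* Replays the "1st migrate with both set, 2nd with neither" sequence. */
void mig_caps_selftest(void)
{
    MigCaps caps = { false, false };
    hmp_style_apply(&caps, true, true);
    hmp_style_apply(&caps, false, false);
    assert(!caps.block_enabled && !caps.block_shared);
}
```

Under the HMP-style rule the second migration in the sequence quoted above runs with enabled=shared=0 again, which matches the behaviour before the series.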

Or is the proposal that we are also going to simplify the QMP 'migrate'
command to get rid of crufty parameters?

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


Re: [Qemu-devel] [Qemu-ppc] [PATCH v9 4/6] hw/ppc/spapr.c: migrate pending_dimm_unplugs of spapr state

2017-05-12 Thread Daniel Henrique Barboza




On 05/12/2017 03:12 AM, David Gibson wrote:

On Fri, May 05, 2017 at 05:47:44PM -0300, Daniel Henrique Barboza wrote:

To allow for a DIMM unplug event to resume its work if a migration
occurs in the middle of it, this patch migrates the non-empty
pending_dimm_unplugs QTAILQ that stores the DIMM information
that the spapr_lmb_release() callback uses.

An approach was considered where the DIMM states would be restored
in post_load after a migration. The problem is that there is
no way of knowing, from the sPAPRMachineState, if a given DIMM is going
through an unplug process and the callback needs the updated DIMM State.

We could migrate a flag indicating that there is an unplug event going
on for a certain DIMM, fetching this information from the start of the
spapr_del_lmbs call. But this would also require a scan on post_load to
figure out how many nr_lmbs are left. At this point we can just
migrate the nr_lmbs information as well, given that it is being calculated
at spapr_del_lmbs already, and spare a scanning/discovery in the
post-load. All that we need is inside the sPAPRDIMMState structure
that is added to the pending_dimm_unplugs queue at the start of the
spapr_del_lmbs, so it's convenient to just migrate this queue if it's
not empty.

Signed-off-by: Daniel Henrique Barboza 

NACK.

As I believe I suggested previously, you can reconstruct this state on
the receiving side by doing a full scan of the DIMM and LMB DRC states.


Just had an idea that I think is in line with what you're suggesting. Given
that the information we need is only created in spapr_del_lmbs
(as per patch 1), we can use the absence of this information in the
release callback as a sort of flag, an indication that a migration got
in the way and we need to reconstruct the nr_lmbs state again, using
the same scanning function I've used in v8.

The flow would be like this (considering the changes in the
previous 3 patches so far):



/* Callback to be called during DRC release. */
void spapr_lmb_release(DeviceState *dev)
{
 HotplugHandler *hotplug_ctrl;

 uint64_t addr = spapr_dimm_get_address(PC_DIMM(dev));
 sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
 sPAPRDIMMState *ds = spapr_pending_dimm_unplugs_find(spapr, addr);

// no DIMM state found in spapr - re-create it to find out how many
// LMBs are left

if (ds == NULL) {
uint32 nr_lmbs  = ***call_scanning_LMB_DRCs_function(dev)***
// recreate the sPAPRDIMMState element and add it back to spapr
}

( resume callback as usual )

---
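
The flow above amounts to a "find, or rebuild by scanning" lookup. A self-contained sketch of that pattern follows. The types are simplified, hypothetical stand-ins for sPAPRDIMMState and sPAPRMachineState, and the scan is modelled as counting entries in a plain array rather than walking real LMB DRCs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified, hypothetical stand-in for sPAPRDIMMState. */
typedef struct DimmState {
    uint64_t addr;            /* DIMM base address, used as lookup key */
    uint32_t nr_lmbs;         /* LMBs still waiting to be released */
    struct DimmState *next;
} DimmState;

/* Hypothetical stand-in for the machine state; the list is not migrated. */
typedef struct {
    DimmState *pending;
} Machine;

DimmState *pending_find(Machine *m, uint64_t addr)
{
    for (DimmState *ds = m->pending; ds != NULL; ds = ds->next) {
        if (ds->addr == addr) {
            return ds;
        }
    }
    return NULL;
}

/* Stand-in for the LMB DRC scan: count LMBs that are still attached. */
uint32_t count_attached_lmbs(const bool *attached, size_t n)
{
    uint32_t count = 0;
    for (size_t i = 0; i < n; i++) {
        count += attached[i] ? 1 : 0;
    }
    return count;
}

/* A missing entry is taken as the sign that a migration happened
 * mid-unplug, so the state is rebuilt from the scan and re-queued. */
DimmState *pending_find_or_rebuild(Machine *m, uint64_t addr,
                                   const bool *attached, size_t n)
{
    DimmState *ds = pending_find(m, addr);
    if (ds == NULL) {
        ds = malloc(sizeof(*ds));
        if (ds == NULL) {
            return NULL;
        }
        ds->addr = addr;
        ds->nr_lmbs = count_attached_lmbs(attached, n);
        ds->next = m->pending;
        m->pending = ds;
    }
    return ds;
}

void dimm_state_selftest(void)
{
    Machine m = { NULL };
    const bool attached[4] = { true, false, true, true };
    DimmState *ds = pending_find_or_rebuild(&m, 0x40000000, attached, 4);
    assert(ds != NULL && ds->nr_lmbs == 3);
    assert(pending_find_or_rebuild(&m, 0x40000000, attached, 4) == ds);
    free(ds);
}
```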

Would this approach be adequate? Another alternative would be to use another
way of detecting whether an LMB unplug is happening and, if so, do the same
process in post_load(). In that case I'll need to take a look at the code
and see how we can detect an ongoing unplug besides what I've described above.


Thanks,


Daniel




---
  hw/ppc/spapr.c | 31 +++
  1 file changed, 31 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index e190eb9..30f0b7b 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1437,6 +1437,36 @@ static bool version_before_3(void *opaque, int version_id)
  return version_id < 3;
  }
  
+static bool spapr_pending_dimm_unplugs_needed(void *opaque)

+{
+sPAPRMachineState *spapr = (sPAPRMachineState *)opaque;
+return !QTAILQ_EMPTY(&spapr->pending_dimm_unplugs);
+}
+
+static const VMStateDescription vmstate_spapr_dimmstate = {
+.name = "spapr_dimm_state",
+.version_id = 1,
+.minimum_version_id = 1,
+.fields = (VMStateField[]) {
+VMSTATE_UINT64(addr, sPAPRDIMMState),
+VMSTATE_UINT32(nr_lmbs, sPAPRDIMMState),
+VMSTATE_END_OF_LIST()
+},
+};
+
+static const VMStateDescription vmstate_spapr_pending_dimm_unplugs = {
+.name = "spapr_pending_dimm_unplugs",
+.version_id = 1,
+.minimum_version_id = 1,
+.needed = spapr_pending_dimm_unplugs_needed,
+.fields = (VMStateField[]) {
+VMSTATE_QTAILQ_V(pending_dimm_unplugs, sPAPRMachineState, 1,
+ vmstate_spapr_dimmstate, sPAPRDIMMState,
+ next),
+VMSTATE_END_OF_LIST()
+},
+};
+
  static bool spapr_ov5_cas_needed(void *opaque)
  {
  sPAPRMachineState *spapr = opaque;
@@ -1535,6 +1565,7 @@ static const VMStateDescription vmstate_spapr = {
  .subsections = (const VMStateDescription*[]) {
  &vmstate_spapr_ov5_cas,
  &vmstate_spapr_patb_entry,
+&vmstate_spapr_pending_dimm_unplugs,
  NULL
  }
  };

Re: [Qemu-devel] [PATCH 1/3] migration: Create block capabilities for shared and enable

2017-05-12 Thread Eric Blake

On 05/11/2017 11:32 AM, Juan Quintela wrote:
> Those two capabilities were added through the command line.  Notice that
> we just created them.  This is just the boilerplate.
> 
> Signed-off-by: Juan Quintela 
> Reviewed-by: Eric Blake 
> 
> --
> 
> Make migrate_set_block_* take a boolean argument.

Question - do we support the orthogonal selection of all 4 combinations
under HMP 'migrate' (no argument, -b alone, -i alone, -b and -i
together), or are there only 3 actual states? If the latter, should we
represent this as a single enum-valued property, rather than as two
independent boolean properties?

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


Re: [Qemu-devel] [PATCH v4] qemu-img: Check for backing image if specified during create

2017-05-12 Thread John Snow



On 05/12/2017 03:46 PM, Eric Blake wrote:
> On 05/12/2017 01:07 PM, Max Reitz wrote:
>> On 2017-05-11 20:27, John Snow wrote:
>>> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1213786
>>>
>>> Or, rather, force the open of a backing image if one was specified
>>> for creation. Using a similar -unsafe option as rebase, allow qemu-img
>>> to ignore the backing file validation if possible.
>>>
> 
>>> +++ b/block.c
>>> @@ -4275,37 +4275,37 @@ void bdrv_img_create(const char *filename, const 
>>> char *fmt,
>>>  // The size for the image must always be specified, with one exception:
>>>  // If we are using a backing file, we can obtain the size from there
>>>  size = qemu_opt_get_size(opts, BLOCK_OPT_SIZE, 0);
>>> -if (size == -1) {
>>
>> "Hang on, why should this be -1 when the defval is 0? Where does the -1
>> come from?"
>> "..."
>> "Oh, the option exists and is set to -1? Why is that?"
>> "..."
>> "Oh, because this function always sets it itself, and because @img_size
>> is set to (uint64_t)-1."
> 
> I had pretty much the same conversation on my v1 review.
> https://lists.gnu.org/archive/html/qemu-devel/2017-05/msg01097.html
> 
>>
>> First, I won't start with how signed integer overflow is
>> implementation-defined in C because I hope you have thrashed that out
>> with Eric (I hope that "to thrash out" is a good translation for
>> "auskaspern" (lit. "to buffoon out").).
> 
> Sounds like a reasonable choice of words, even if I don't speak the
> counterpart language to validate your translation.
> 
> (uint64_t)-1 is well-defined in C (so I think we're just fine here). But
> (int64_t)UINT64_MAX is where signed integer overflow does indeed throw
> wrinkles at you.
> 
> I seem to recall that qemu has chosen to use compiler flags and/or
> assumptions that we are using 2s-complement arithmetic with sane
> behavior (that is, tighter behavior than the bare minimum that C
> requires), because it was easier than auditing our code for strict C
> compliance on border cases of conversions from unsigned to signed that
> trigger undefined behavior.  But again, I don't think it affects this
> patch (where our conversion is only from signed to unsigned, and that is
> well-defined behavior).
> 
> 
>>
>> Second, well, at least we should put -1 as the default value here, then.
> 
> Indeed, now that two reviewers have tripped on it,
> qemu_opt_get_size(,,-1) would be nicer.
> 
>>
>> Not strictly your fault or something that you need to fix, but it is
>> just a single line in the vicinity...
>>
>> Let me know if you want to address this, for now I'll leave a
>>
>> Reviewed-by: Max Reitz 
>>
>> here if you don't want to.
> 
> I'm okay whether you want to squash that fix into this patch, or whether
> you do it as a separate followup patch.
> 

I had considered the issue separate; but you're welcome to either write
a patch or squish it into this one, I'm not going to be picky.

--js

Re: [Qemu-devel] [PATCH v4] qemu-img: Check for backing image if specified during create

2017-05-12 Thread Eric Blake

On 05/12/2017 01:07 PM, Max Reitz wrote:
> On 2017-05-11 20:27, John Snow wrote:
>> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1213786
>>
>> Or, rather, force the open of a backing image if one was specified
>> for creation. Using a similar -unsafe option as rebase, allow qemu-img
>> to ignore the backing file validation if possible.
>>

>> +++ b/block.c
>> @@ -4275,37 +4275,37 @@ void bdrv_img_create(const char *filename, const 
>> char *fmt,
>>  // The size for the image must always be specified, with one exception:
>>  // If we are using a backing file, we can obtain the size from there
>>  size = qemu_opt_get_size(opts, BLOCK_OPT_SIZE, 0);
>> -if (size == -1) {
> 
> "Hang on, why should this be -1 when the defval is 0? Where does the -1
> come from?"
> "..."
> "Oh, the option exists and is set to -1? Why is that?"
> "..."
> "Oh, because this function always sets it itself, and because @img_size
> is set to (uint64_t)-1."

I had pretty much the same conversation on my v1 review.
https://lists.gnu.org/archive/html/qemu-devel/2017-05/msg01097.html

> 
> First, I won't start with how signed integer overflow is
> implementation-defined in C because I hope you have thrashed that out
> with Eric (I hope that "to thrash out" is a good translation for
> "auskaspern" (lit. "to buffoon out").).

Sounds like a reasonable choice of words, even if I don't speak the
counterpart language to validate your translation.

(uint64_t)-1 is well-defined in C (so I think we're just fine here). But
(int64_t)UINT64_MAX is where signed integer overflow does indeed throw
wrinkles at you.

I seem to recall that qemu has chosen to use compiler flags and/or
assumptions that we are using 2s-complement arithmetic with sane
behavior (that is, tighter behavior than the bare minimum that C
requires), because it was easier than auditing our code for strict C
compliance on border cases of conversions from unsigned to signed that
trigger undefined behavior.  But again, I don't think it affects this
patch (where our conversion is only from signed to unsigned, and that is
well-defined behavior).
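
A small C sketch of the two directions discussed above. The function names are made up for illustration. To my reading of C11 6.3.1.3, the signed-to-unsigned direction is fully defined (reduction modulo 2^64), while an out-of-range unsigned-to-signed conversion gives an implementation-defined result, so the second helper avoids relying on it.

```c
#include <assert.h>
#include <stdint.h>

/* Signed to unsigned: defined as reduction modulo 2^64, so the result
 * is UINT64_MAX on every conforming implementation. */
uint64_t size_unset_sentinel(void)
{
    return (uint64_t)-1;
}

/* Unsigned to signed without relying on the implementation-defined
 * out-of-range conversion: values above INT64_MAX are mapped to
 * u - 2^64 using only in-range arithmetic. */
int64_t u64_to_i64_portable(uint64_t u)
{
    if (u <= (uint64_t)INT64_MAX) {
        return (int64_t)u;
    }
    /* UINT64_MAX - u is in [0, INT64_MAX], so nothing overflows here. */
    return -(int64_t)(UINT64_MAX - u) - 1;
}

void conversion_selftest(void)
{
    assert(size_unset_sentinel() == UINT64_MAX);
    assert(u64_to_i64_portable(UINT64_MAX) == -1);
    assert(u64_to_i64_portable((uint64_t)INT64_MAX + 1) == INT64_MIN);
}
```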

> 
> Second, well, at least we should put -1 as the default value here, then.

Indeed, now that two reviewers have tripped on it,
qemu_opt_get_size(,,-1) would be nicer.

> 
> Not strictly your fault or something that you need to fix, but it is
> just a single line in the vicinity...
> 
> Let me know if you want to address this, for now I'll leave a
> 
> Reviewed-by: Max Reitz 
> 
> here if you don't want to.

I'm okay whether you want to squash that fix into this patch, or whether
you do it as a separate followup patch.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


Re: [Qemu-devel] [PATCH] target/i386: enable A20 automatically in system management mode

2017-05-12 Thread Kevin O'Connor

On Fri, May 12, 2017 at 09:16:31PM +0200, Paolo Bonzini wrote:
> On 12/05/2017 20:55, Xu, Anthony wrote:
> > If that's the case,  QEMU/TCG should work with SeaBios even with ignoring 
> > A20.
> > 
> > During SeaBios boot, there are >350 port 92 access, if we don't need to 
> > handle A20, 
> > we can make A20 configurable in Seabios, It may reduce SeaBios boot time.
> 
> Yes, that's a good idea.

SeaBIOS defaults to enabling A20 and it's a rare beast that disables
it.  One could change x86.h:set_a20 and romlayout.S:transition32 to
only issue the outb() if the inb() indicates a change is needed.  That
would likely eliminate half the accesses.

I'd be surprised if it would impact the overall boot time though.
SeaBIOS only touches the port on a cpu mode switch and I would have
thought that was heavier than an IO port access.  Maybe that is skewed
on KVM though.
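
The "only write when the read shows a change is needed" idea can be sketched like this. It is not the SeaBIOS code: the port I/O is replaced by mock functions that count accesses, and the port number and enable bit are assumptions about the fast-A20 port rather than values taken from this thread.

```c
#include <assert.h>
#include <stdint.h>

#define PORT_A20       0x0092   /* assumed fast-A20 control port */
#define A20_ENABLE_BIT 0x02     /* assumed A20 enable bit */

/* Mock port state so the sketch runs in user space. */
uint8_t port92_value;
unsigned port92_reads;
unsigned port92_writes;

uint8_t mock_inb(uint16_t port)
{
    assert(port == PORT_A20);
    port92_reads++;
    return port92_value;
}

void mock_outb(uint8_t val, uint16_t port)
{
    assert(port == PORT_A20);
    port92_writes++;
    port92_value = val;
}

/* Returns the previous A20 state. The write is skipped when the port
 * already holds the requested state, which is the common case if A20
 * is enabled by default and rarely disabled. */
uint8_t set_a20_if_needed(uint8_t enable)
{
    uint8_t val = mock_inb(PORT_A20);
    uint8_t was_enabled = (val & A20_ENABLE_BIT) != 0;

    if (was_enabled != (enable != 0)) {
        uint8_t next = enable ? (uint8_t)(val | A20_ENABLE_BIT)
                              : (uint8_t)(val & ~A20_ENABLE_BIT);
        mock_outb(next, PORT_A20);
    }
    return was_enabled;
}
```

Each call still costs one port read, so at best this removes the write half of the accesses.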

-Kevin

Re: [Qemu-devel] [PATCH v3 2/5] target/arm: optimize rev16() using extract op

2017-05-12 Thread Richard Henderson


On 05/12/2017 12:22 PM, Aurelien Jarno wrote:

On 2017-05-12 12:05, Richard Henderson wrote:

On 05/12/2017 11:21 AM, Aurelien Jarno wrote:

+uint64_t mask1 = sf ? 0x00ff00ff00ff00ffull : 0x00ff00ff;
+uint64_t mask2 = sf ? 0xff00ff00ff00ff00ull : 0xff00ff00;
+
+tcg_gen_shri_i64(tcg_tmp, tcg_rn, 8);
+tcg_gen_andi_i64(tcg_tmp, tcg_tmp, mask1);
+tcg_gen_shli_i64(tcg_rd, tcg_rn, 8);
+tcg_gen_andi_i64(tcg_rd, tcg_rd, mask2);


It would probably be better to use a single mask, since they're not free to
instantiate in a register.  So e.g.

   TCGv mask = tcg_const_i64(sf ? 0x00ff00ff00ff00ffull : 0x00ff00ff);
   tcg_gen_shri_i64(tcg_tmp, tcg_rn, 8);
   tcg_gen_and_i64(tcg_rd, tcg_rn, mask);
   tcg_gen_and_i64(tcg_tmp, tcg_tmp, mask);
   tcg_gen_shli_i64(tcg_rd, tcg_rd, 8);


Indeed that improves things a bit for sf=1. For sf=0 though the
constant is never loaded into a register, it is passed to the and
instructions as an immediate.

For x86 (and sometimes s390) it isn't, but it certainly would be for all other 
hosts.



r~

[Qemu-devel] [PATCH] block: Correct documentation for BLOCK_WRITE_THRESHOLD

2017-05-12 Thread Eric Blake

Use the correct command name.

Signed-off-by: Eric Blake 
---
 qapi/block-core.json | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 614181b..206e33b 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3582,7 +3582,7 @@
 # means the device should be extended to avoid pausing for
 # disk exhaustion.
 # The event is one shot. Once triggered, it needs to be
-# re-registered with another block-set-threshold command.
+# re-registered with another block-set-write-threshold command.
 #
 # @node-name: graph node name on which the threshold was exceeded.
 #
-- 
2.9.3

Re: [Qemu-devel] [PATCH V2] migration: expose qemu_announce_self() via qmp

2017-05-12 Thread Dr. David Alan Gilbert

* Vlad Yasevich (vyase...@redhat.com) wrote:
> On 02/20/2017 07:16 PM, Germano Veit Michel wrote:
> > qemu_announce_self() is triggered by qemu at the end of migrations
> > to update the network regarding the path to the guest l2addr.
> > 
> > however it is also useful when there is a network change such as
> > an active bond slave swap. Essentially, it's the same as a migration
> > from a network perspective - the guest moves to a different point
> > in the network topology.
> > 
> > this exposes the function via qmp.
> > 
> > Signed-off-by: Germano Veit Michel 
> > ---
> >  include/migration/vmstate.h |  5 +
> >  migration/savevm.c  | 30 +++---
> >  qapi-schema.json| 18 ++
> >  3 files changed, 42 insertions(+), 11 deletions(-)
> > 
> > diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
> > index 63e7b02..a08715c 100644
> > --- a/include/migration/vmstate.h
> > +++ b/include/migration/vmstate.h
> > @@ -1042,6 +1042,11 @@ int64_t self_announce_delay(int round)
> >  return 50 + (SELF_ANNOUNCE_ROUNDS - round - 1) * 100;
> >  }
> > 
> > +struct AnnounceRound {
> > +QEMUTimer *timer;
> > +int count;
> > +};
> > +
> >  void dump_vmstate_json_to_file(FILE *out_fp);
> > 
> >  #endif
> > diff --git a/migration/savevm.c b/migration/savevm.c
> > index 5ecd264..44e196b 100644
> > --- a/migration/savevm.c
> > +++ b/migration/savevm.c
> > @@ -118,29 +118,37 @@ static void qemu_announce_self_iter(NICState
> > *nic, void *opaque)
> >  qemu_send_packet_raw(qemu_get_queue(nic), buf, len);
> >  }
> > 
> > -
> >  static void qemu_announce_self_once(void *opaque)
> >  {
> > -static int count = SELF_ANNOUNCE_ROUNDS;
> > -QEMUTimer *timer = *(QEMUTimer **)opaque;
> > +struct AnnounceRound *round = opaque;
> > 
> >  qemu_foreach_nic(qemu_announce_self_iter, NULL);
> > 
> > -if (--count) {
> > +round->count--;
> > +if (round->count) {
> >  /* delay 50ms, 150ms, 250ms, ... */
> > -timer_mod(timer, qemu_clock_get_ms(QEMU_CLOCK_REALTIME) +
> > -  self_announce_delay(count));
> > +timer_mod(round->timer, qemu_clock_get_ms(QEMU_CLOCK_REALTIME) +
> > +  self_announce_delay(round->count));
> >  } else {
> > -timer_del(timer);
> > -timer_free(timer);
> > +timer_del(round->timer);
> > +timer_free(round->timer);
> > +g_free(round);
> >  }
> >  }
> > 
> >  void qemu_announce_self(void)
> >  {
> > -static QEMUTimer *timer;
> > -timer = timer_new_ms(QEMU_CLOCK_REALTIME, qemu_announce_self_once, &timer);
> > -qemu_announce_self_once(&timer);
> > +struct AnnounceRound *round = g_malloc(sizeof(struct AnnounceRound));
> > +if (!round)
> > +return;
> > +round->count = SELF_ANNOUNCE_ROUNDS;
> > +round->timer = timer_new_ms(QEMU_CLOCK_REALTIME,
> > qemu_announce_self_once, round);
> > +qemu_announce_self_once(round);
> > +}
> 
> So, I've been looking at this code and have been playing with it and with
> David's patches and my patches to include virtio self announcements as well.
> What I've discovered is what I think is a possible packet amplification
> issue here.
> 
> This creates a new timer every time we do an announce_self.  With just
> migration, this is not an issue since you only migrate once at a time, so
> there is only 1 timer.  With exposing this as an API, a user can potentially
> call it in a tight loop and now you have a ton of timers being created.  Add
> in David's patches allowing timeouts and retries to be configurable, and you
> may now have a ton of long-lived timers.  Add in the patches I am working on
> to let virtio do self announcements too (to really fix bonding issues), and
> now you add in a possibility of a lot of packets being sent for each timeout
> (RARP, GARP, NA, IGMPv4 Reports, IGMPv6 Reports [even worse if MLD1 is
> used]).
> 
> As you can see, this can get rather ugly...
> 
> I think we need timer users here, with migration and QMP being two to begin
> with.  Each one would get a single timer to play with.  If a given user
> already has a timer running, we could return an error or just not do
> anything.

If you did have specific timers, then you could add to/reset the counts
rather than doing nothing.  That way it's less racy; if you issue the
command just as you reconfigure your network, there's no chance the
command would fail, you will send the packets out.
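
A rough sketch of the "reset the count instead of doing nothing" idea, with one state object per user (migration, QMP). AnnounceTimer and both functions are hypothetical, the real QEMU timer is replaced by a plain flag, and SELF_ANNOUNCE_ROUNDS is assumed to be 5. Only the delay formula is taken from the quoted patch.

```c
#include <assert.h>
#include <stdbool.h>

#define SELF_ANNOUNCE_ROUNDS 5   /* assumed value */

/* Delay formula as quoted in the patch: 50ms, 150ms, 250ms, ... */
int self_announce_delay(int round)
{
    return 50 + (SELF_ANNOUNCE_ROUNDS - round - 1) * 100;
}

/* Hypothetical per-user announce state; one instance per user. */
typedef struct {
    bool running;   /* a timer is already armed for this user */
    int count;      /* announce rounds still to send */
} AnnounceTimer;

/* Request an announcement. If a timer is already running, the round
 * count is reset rather than a second timer being created. Returns
 * true when a new timer had to be started. */
bool announce_request(AnnounceTimer *t)
{
    bool started = !t->running;
    t->running = true;
    t->count = SELF_ANNOUNCE_ROUNDS;
    return started;
}

/* One timer expiry: the packets would be sent here. Returns the delay
 * in ms until the next round, or -1 once the rounds are exhausted. */
int announce_tick(AnnounceTimer *t)
{
    assert(t->running && t->count > 0);
    if (--t->count) {
        return self_announce_delay(t->count);
    }
    t->running = false;
    return -1;
}
```

Because a repeated request only rewinds the counter, calling the command in a tight loop keeps at most one timer alive per user while still guaranteeing a full set of rounds after the last request.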

Dave

> -vlad
> 
> > +
> > +void qmp_announce_self(Error **errp)
> > +{
> > +qemu_announce_self();
> >  }
> > 
> >  /***/
> > diff --git a/qapi-schema.json b/qapi-schema.json
> > index baa0d26..0d9bffd 100644
> > --- a/qapi-schema.json
> > +++ b/qapi-schema.json
> > @@ -6080,3 +6080,21 @@
> >  #
> >  ##
> >  { 'command': 'query-hotpluggable-cpus', 'returns':

Re: [Qemu-devel] [PATCH v3 2/5] target/arm: optimize rev16() using extract op

2017-05-12 Thread Aurelien Jarno

On 2017-05-12 12:05, Richard Henderson wrote:
> On 05/12/2017 11:21 AM, Aurelien Jarno wrote:
> > +uint64_t mask1 = sf ? 0x00ff00ff00ff00ffull : 0x00ff00ff;
> > +uint64_t mask2 = sf ? 0xff00ff00ff00ff00ull : 0xff00ff00;
> > +
> > +tcg_gen_shri_i64(tcg_tmp, tcg_rn, 8);
> > +tcg_gen_andi_i64(tcg_tmp, tcg_tmp, mask1);
> > +tcg_gen_shli_i64(tcg_rd, tcg_rn, 8);
> > +tcg_gen_andi_i64(tcg_rd, tcg_rd, mask2);
> 
> It would probably be better to use a single mask, since they're not free to
> instantiate in a register.  So e.g.
> 
>   TCGv mask = tcg_const_i64(sf ? 0x00ff00ff00ff00ffull : 0x00ff00ff);
>   tcg_gen_shri_i64(tcg_tmp, tcg_rn, 8);
>   tcg_gen_and_i64(tcg_rd, tcg_rn, mask);
>   tcg_gen_and_i64(tcg_tmp, tcg_tmp, mask);
>   tcg_gen_shli_i64(tcg_rd, tcg_rd, 8);

Indeed that improves things a bit for sf=1. For sf=0 though the
constant is never loaded into a register, it is passed to the and
instructions as an immediate.

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net

Re: [Qemu-devel] [PATCH] target/i386: enable A20 automatically in system management mode

2017-05-12 Thread Paolo Bonzini



On 12/05/2017 20:55, Xu, Anthony wrote:
>  
>> On 12/05/2017 01:55, Xu, Anthony wrote:
>>> Hi Paolo,
>>>
>>> In KVM mode, seems A20 is ignored.
>>> Do you see any potential issue here?
>>
>> No; recent processors don't have A20 at all.
> 
> I mean A20 in guest, not A20 in host.  
> Guest is running on old platform, it tries to control A20  through port 92 
> like what SeaBios does.
> QEMU/KVM does handle port 92 access to set correct env->a20_mask,
> but QEMU/KVM ignores A20 status when handling guest memory access.
> 
> Since QEMU/KVM works well with SeaBios, does that imply SeaBios doesn't 
> generate address
> larger than 0x10 in real mode?

No, only the guest's OS or software (not the firmware) might require
A20.  But really anything that ran with MS-DOS 5.0 HIMEM.SYS or newer
(which used DOS=HIGH to relocate DOS into the HMA, I think) does not
need it.

> If that's the case,  QEMU/TCG should work with SeaBios even with ignoring A20.
> 
> During SeaBios boot, there are >350 port 92 access, if we don't need to 
> handle A20, 
> we can make A20 configurable in Seabios, It may reduce SeaBios boot time.

Yes, that's a good idea.

Paolo

Re: [Qemu-devel] [PATCH v3 2/5] target/arm: optimize rev16() using extract op

2017-05-12 Thread Richard Henderson


On 05/12/2017 11:21 AM, Aurelien Jarno wrote:

+uint64_t mask1 = sf ? 0x00ff00ff00ff00ffull : 0x00ff00ff;
+uint64_t mask2 = sf ? 0xff00ff00ff00ff00ull : 0xff00ff00;
+
+tcg_gen_shri_i64(tcg_tmp, tcg_rn, 8);
+tcg_gen_andi_i64(tcg_tmp, tcg_tmp, mask1);
+tcg_gen_shli_i64(tcg_rd, tcg_rn, 8);
+tcg_gen_andi_i64(tcg_rd, tcg_rd, mask2);


It would probably be better to use a single mask, since they're not free to 
instantiate in a register.  So e.g.


  TCGv mask = tcg_const_i64(sf ? 0x00ff00ff00ff00ffull : 0x00ff00ff);
  tcg_gen_shri_i64(tcg_tmp, tcg_rn, 8);
  tcg_gen_and_i64(tcg_rd, tcg_rn, mask);
  tcg_gen_and_i64(tcg_tmp, tcg_tmp, mask);
  tcg_gen_shli_i64(tcg_rd, tcg_rd, 8);
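
For reference, the single-mask form can be checked against the two-mask form in plain C. This is ordinary integer arithmetic, not TCG code, and the function names are made up for the comparison. The final OR, which the listing above leaves implicit, is included.

```c
#include <assert.h>
#include <stdint.h>

/* Two-mask form, as in the patch under review. */
uint64_t rev16_two_masks(uint64_t x, int sf)
{
    uint64_t mask1 = sf ? 0x00ff00ff00ff00ffull : 0x00ff00ff;
    uint64_t mask2 = sf ? 0xff00ff00ff00ff00ull : 0xff00ff00;
    return ((x >> 8) & mask1) | ((x << 8) & mask2);
}

/* Single-mask form: masking before the left shift selects the same
 * bytes, so only one constant is needed. */
uint64_t rev16_one_mask(uint64_t x, int sf)
{
    uint64_t mask = sf ? 0x00ff00ff00ff00ffull : 0x00ff00ff;
    return ((x >> 8) & mask) | ((x & mask) << 8);
}

void rev16_mask_selftest(void)
{
    const uint64_t samples[] = {
        0, 0x11223344, 0x0102030405060708ull, 0xffffffffffffffffull
    };
    for (unsigned i = 0; i < sizeof(samples) / sizeof(samples[0]); i++) {
        for (int sf = 0; sf <= 1; sf++) {
            assert(rev16_one_mask(samples[i], sf) ==
                   rev16_two_masks(samples[i], sf));
        }
    }
}
```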


r~

Re: [Qemu-devel] [PATCH] target/i386: enable A20 automatically in system management mode

2017-05-12 Thread Xu, Anthony

> On 12/05/2017 01:55, Xu, Anthony wrote:
> > Hi Paolo,
> >
> > In KVM mode, seems A20 is ignored.
> > Do you see any potential issue here?
> 
> No; recent processors don't have A20 at all.

I mean A20 in the guest, not A20 in the host.
The guest is running on an old platform and tries to control A20 through
port 92, as SeaBIOS does.
QEMU/KVM does handle port 92 accesses to set the correct env->a20_mask,
but QEMU/KVM ignores the A20 status when handling guest memory accesses.

Since QEMU/KVM works well with SeaBIOS, does that imply SeaBIOS doesn't
generate addresses larger than 0x10 in real mode?

If that's the case, QEMU/TCG should work with SeaBIOS even when ignoring A20.

During SeaBIOS boot there are >350 port 92 accesses. If we don't need to
handle A20, we can make A20 configurable in SeaBIOS, which may reduce SeaBIOS
boot time.

Anthony

Re: [Qemu-devel] [PATCH RESEND v2 00/21] qdev/sysbus: Set user_creatable=false by default on sysbus

2017-05-12 Thread Eduardo Habkost

Ping? If there are no objections to this series, I plan to merge
it through the Machine Core tree.

If anybody is interested, below are the results of squashing
patches 2-20 together:

---
 hw/core/sysbus.c | 11 +++
 hw/i386/amd_iommu.c  |  2 ++
 hw/i386/intel_iommu.c|  2 ++
 hw/net/fsl_etsec/etsec.c |  2 ++
 hw/ppc/spapr_pci.c   |  2 ++
 hw/vfio/amd-xgbe.c   |  2 ++
 hw/vfio/calxeda-xgmac.c  |  2 ++
 hw/xen/xen_backend.c |  2 ++
 8 files changed, 25 insertions(+)

diff --git a/hw/core/sysbus.c b/hw/core/sysbus.c
index c0f560b289..5d0887f499 100644
--- a/hw/core/sysbus.c
+++ b/hw/core/sysbus.c
@@ -326,6 +326,17 @@ static void sysbus_device_class_init(ObjectClass *klass, 
void *data)
 DeviceClass *k = DEVICE_CLASS(klass);
 k->init = sysbus_device_init;
 k->bus_type = TYPE_SYSTEM_BUS;
+/*
+ * device_add plugs devices into a suitable bus.  For "real" buses,
+ * that actually connects the device.  For sysbus, the connections
+ * need to be made separately, and device_add can't do that.  The
+ * device would be left unconnected, and will probably not work
+ *
+ * However, a few machines can handle device_add/-device with
+ * a few specific sysbus devices. In those cases, the device
+ * subclass needs to override it and set user_creatable=true.
+ */
+k->user_creatable = false;
 }
 
 static const TypeInfo sysbus_device_type_info = {
diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
index f86a40aa30..efcc93cbfd 100644
--- a/hw/i386/amd_iommu.c
+++ b/hw/i386/amd_iommu.c
@@ -1186,6 +1186,8 @@ static void amdvi_class_init(ObjectClass *klass, void* data)
 dc->vmsd = &vmstate_amdvi;
 dc->hotpluggable = false;
 dc_class->realize = amdvi_realize;
+/* Supported by the pc-q35-* machine types */
+dc->user_creatable = true;
 }
 
 static const TypeInfo amdvi = {
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 02f047c8e3..327a46cd19 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -3009,6 +3009,8 @@ static void vtd_class_init(ObjectClass *klass, void *data)
 dc->hotpluggable = false;
 x86_class->realize = vtd_realize;
 x86_class->int_remap = vtd_int_remap;
+/* Supported by the pc-q35-* machine types */
+dc->user_creatable = true;
 }
 
 static const TypeInfo vtd_info = {
diff --git a/hw/net/fsl_etsec/etsec.c b/hw/net/fsl_etsec/etsec.c
index aa2b0d5a85..9da1932970 100644
--- a/hw/net/fsl_etsec/etsec.c
+++ b/hw/net/fsl_etsec/etsec.c
@@ -416,6 +416,8 @@ static void etsec_class_init(ObjectClass *klass, void *data)
 dc->realize = etsec_realize;
 dc->reset = etsec_reset;
 dc->props = etsec_properties;
+/* Supported by ppce500 machine */
+dc->user_creatable = true;
 }
 
 static TypeInfo etsec_info = {
diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index e7567e2e8f..a7cff32bbf 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -1994,6 +1994,8 @@ static void spapr_phb_class_init(ObjectClass *klass, void *data)
 dc->props = spapr_phb_properties;
 dc->reset = spapr_phb_reset;
 dc->vmsd = &vmstate_spapr_pci;
+/* Supported by TYPE_SPAPR_MACHINE */
+dc->user_creatable = true;
 set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
 hp->plug = spapr_phb_hot_plug_child;
 hp->unplug = spapr_phb_hot_unplug_child;
diff --git a/hw/vfio/amd-xgbe.c b/hw/vfio/amd-xgbe.c
index 2c60310cf9..fab196cebf 100644
--- a/hw/vfio/amd-xgbe.c
+++ b/hw/vfio/amd-xgbe.c
@@ -38,6 +38,8 @@ static void vfio_amd_xgbe_class_init(ObjectClass *klass, void *data)
 dc->realize = amd_xgbe_realize;
 dc->desc = "VFIO AMD XGBE";
 dc->vmsd = &vfio_platform_amd_xgbe_vmstate;
+/* Supported by TYPE_VIRT_MACHINE */
+dc->user_creatable = true;
 }
 
 static const TypeInfo vfio_amd_xgbe_dev_info = {
diff --git a/hw/vfio/calxeda-xgmac.c b/hw/vfio/calxeda-xgmac.c
index bb15d588e5..7bb17af7ad 100644
--- a/hw/vfio/calxeda-xgmac.c
+++ b/hw/vfio/calxeda-xgmac.c
@@ -38,6 +38,8 @@ static void vfio_calxeda_xgmac_class_init(ObjectClass *klass, void *data)
 dc->realize = calxeda_xgmac_realize;
 dc->desc = "VFIO Calxeda XGMAC";
 dc->vmsd = &vfio_platform_calxeda_xgmac_vmstate;
+/* Supported by TYPE_VIRT_MACHINE */
+dc->user_creatable = true;
 }
 
 static const TypeInfo vfio_calxeda_xgmac_dev_info = {
diff --git a/hw/xen/xen_backend.c b/hw/xen/xen_backend.c
index c85f1637e4..f29b2b027b 100644
--- a/hw/xen/xen_backend.c
+++ b/hw/xen/xen_backend.c
@@ -619,6 +619,8 @@ static void xendev_class_init(ObjectClass *klass, void 
*data)
 
 dc->props = xendev_properties;
 set_bit(DEVICE_CATEGORY_MISC, dc->categories);
+/* xen-backend devices can be plugged/unplugged dynamically */
+dc->user_creatable = true;
 }
 
 static const TypeInfo xendev_type_info = {
-- 
2.11.0.259.g40922b1

-- 
Eduardo
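
For readers unfamiliar with the class_init chain, the default-then-override pattern in the squashed diff can be modelled in a few lines of plain C. This is a toy model, not QOM: DeviceClass is reduced to one field, build_class is a hypothetical helper, and the generic device default of user_creatable=true is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a device class with only the field of interest. */
typedef struct {
    bool user_creatable;
} DeviceClass;

typedef void (*ClassInitFn)(DeviceClass *dc);

/* Assumed generic default: devices are creatable with -device. */
void device_class_init(DeviceClass *dc) { dc->user_creatable = true; }

/* The sysbus base class turns it off for all of its subclasses... */
void sysbus_class_init(DeviceClass *dc) { dc->user_creatable = false; }

/* ...and a subclass that a machine can wire up turns it back on. */
void etsec_like_class_init(DeviceClass *dc) { dc->user_creatable = true; }

/* Runs the initialisers parent-first, as a class hierarchy would. */
DeviceClass build_class(const ClassInitFn *chain, size_t n)
{
    DeviceClass dc = { false };
    for (size_t i = 0; i < n; i++) {
        chain[i](&dc);
    }
    return dc;
}

void user_creatable_selftest(void)
{
    const ClassInitFn plain_sysbus[] = { device_class_init,
                                         sysbus_class_init };
    const ClassInitFn opted_in[] = { device_class_init, sysbus_class_init,
                                     etsec_like_class_init };
    assert(!build_class(plain_sysbus, 2).user_creatable);
    assert(build_class(opted_in, 3).user_creatable);
}
```

The effect is that a sysbus device is rejected by -device and device_add unless its own class_init opts in.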

Re: [Qemu-devel] [PATCH v3 2/5] target/arm: optimize rev16() using extract op

2017-05-12 Thread Aurelien Jarno

On 2017-05-12 09:50, Richard Henderson wrote:
> On 05/11/2017 08:35 PM, Philippe Mathieu-Daudé wrote:
> > -tcg_gen_shri_i64(tcg_tmp, tcg_rn, 16);
> > -tcg_gen_andi_i64(tcg_tmp, tcg_tmp, 0xffff);
> > +tcg_gen_extract_i64(tcg_tmp, tcg_rn, 16, 0xffff);
> 
> So your new script didn't work then?  This should be "..., 16, 16);".

Indeed that should be ..., 16, 16). That said, looking a bit at the
actual code, it looks like rev16 is not implemented efficiently. Instead
of byteswapping individual 16-bit words one by one, it would be better
to work on the whole register at the same time using shifts and masks.
This is actually how rev16 is implemented for aarch32 (and a few other
targets). Something like this (I can send a proper patch later):

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 24de30d92c..ccb276417b 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -4034,25 +4034,14 @@ static void handle_rev16(DisasContext *s, unsigned int 
sf,
 TCGv_i64 tcg_rd = cpu_reg(s, rd);
 TCGv_i64 tcg_tmp = tcg_temp_new_i64();
 TCGv_i64 tcg_rn = read_cpu_reg(s, rn, sf);
-
-tcg_gen_andi_i64(tcg_tmp, tcg_rn, 0xffff);
-tcg_gen_bswap16_i64(tcg_rd, tcg_tmp);
-
-tcg_gen_shri_i64(tcg_tmp, tcg_rn, 16);
-tcg_gen_andi_i64(tcg_tmp, tcg_tmp, 0xffff);
-tcg_gen_bswap16_i64(tcg_tmp, tcg_tmp);
-tcg_gen_deposit_i64(tcg_rd, tcg_rd, tcg_tmp, 16, 16);
-
-if (sf) {
-tcg_gen_shri_i64(tcg_tmp, tcg_rn, 32);
-tcg_gen_andi_i64(tcg_tmp, tcg_tmp, 0xffff);
-tcg_gen_bswap16_i64(tcg_tmp, tcg_tmp);
-tcg_gen_deposit_i64(tcg_rd, tcg_rd, tcg_tmp, 32, 16);
-
-tcg_gen_shri_i64(tcg_tmp, tcg_rn, 48);
-tcg_gen_bswap16_i64(tcg_tmp, tcg_tmp);
-tcg_gen_deposit_i64(tcg_rd, tcg_rd, tcg_tmp, 48, 16);
-}
+uint64_t mask1 = sf ? 0x00ff00ff00ff00ffull : 0x00ff00ff;
+uint64_t mask2 = sf ? 0xff00ff00ff00ff00ull : 0xff00ff00;
+
+tcg_gen_shri_i64(tcg_tmp, tcg_rn, 8);
+tcg_gen_andi_i64(tcg_tmp, tcg_tmp, mask1);
+tcg_gen_shli_i64(tcg_rd, tcg_rn, 8);
+tcg_gen_andi_i64(tcg_rd, tcg_rd, mask2);
+tcg_gen_or_i64(tcg_rd, tcg_rd, tcg_tmp);
 
 tcg_temp_free_i64(tcg_tmp);
 }

This makes the generated x86-64 code much shorter, especially with sf=1:


* rev16 with sf = 0

before:
0x5631ecfda582:  movzwl %bx,%r12d
0x5631ecfda586:  rol    $0x8,%r12w
0x5631ecfda58b:  shr    $0x10,%rbx
0x5631ecfda58f:  rol    $0x8,%bx
0x5631ecfda593:  movzwl %bx,%ebx
0x5631ecfda596:  shl    $0x10,%rbx
0x5631ecfda59a:  mov    $0x,%r13
0x5631ecfda5a4:  and    %r13,%r12
0x5631ecfda5a7:  or     %rbx,%r12

after:
0x559f7aeae5f2:  mov    %rbx,%r12
0x559f7aeae5f5:  shr    $0x8,%r12
0x559f7aeae5f9:  and    $0xff00ff,%r12d
0x559f7aeae600:  shl    $0x8,%rbx
0x559f7aeae604:  and    $0xff00ff00,%ebx
0x559f7aeae60a:  or     %r12,%rbx


* rev16 with sf = 1

before:
0x5631ecfe5380:  mov    %rbx,%r12
0x5631ecfe5383:  movzwl %bx,%ebx
0x5631ecfe5386:  rol    $0x8,%bx
0x5631ecfe538a:  mov    %r12,%r13
0x5631ecfe538d:  shr    $0x10,%r13
0x5631ecfe5391:  movzwl %r13w,%r13d
0x5631ecfe5395:  rol    $0x8,%r13w
0x5631ecfe539a:  movzwl %r13w,%r13d
0x5631ecfe539e:  shl    $0x10,%r13
0x5631ecfe53a2:  mov    $0x,%r15
0x5631ecfe53ac:  and    %r15,%rbx
0x5631ecfe53af:  or     %r13,%rbx
0x5631ecfe53b2:  mov    %r12,%r13
0x5631ecfe53b5:  shr    $0x20,%r13
0x5631ecfe53b9:  movzwl %r13w,%r13d
0x5631ecfe53bd:  rol    $0x8,%r13w
0x5631ecfe53c2:  movzwl %r13w,%r13d
0x5631ecfe53c6:  shl    $0x20,%r13
0x5631ecfe53ca:  mov    $0x,%r15
0x5631ecfe53d4:  and    %r15,%rbx
0x5631ecfe53d7:  or     %r13,%rbx
0x5631ecfe53da:  shr    $0x30,%r12
0x5631ecfe53de:  rol    $0x8,%r12w
0x5631ecfe53e3:  shl    $0x30,%r12
0x5631ecfe53e7:  mov    $0x,%r13
0x5631ecfe53f1:  and    %r13,%rbx
0x5631ecfe53f4:  or     %r12,%rbx

after:
0x559f7aeb93e0:  mov    %rbx,%r12
0x559f7aeb93e3:  shr    $0x8,%r12
0x559f7aeb93e7:  mov    $0xff00ff00ff00ff,%r13
0x559f7aeb93f1:  and    %r13,%r12
0x559f7aeb93f4:  shl    $0x8,%rbx
0x559f7aeb93f8:  mov    $0xff00ff00ff00ff00,%r13
0x559f7aeb9402:  and    %r13,%rbx
0x559f7aeb9405:  or     %r12,%rbx
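
For anyone who wants to convince themselves that the shift-and-mask form
is equivalent to swapping each 16-bit lane separately, here is a plain-C
sketch of both approaches. It is only a model of the arithmetic, not
QEMU or TCG code; the function names (rev16_per_lane, rev16_shift_mask,
rev16_self_check) are made up for illustration, and the masks are the
ones from the diff above.

```c
#include <assert.h>
#include <stdint.h>

/* Reference model: swap the two bytes of each 16-bit lane, one lane at
 * a time. sf selects 64-bit (4 lanes) or 32-bit (2 lanes) operation. */
uint64_t rev16_per_lane(uint64_t x, int sf)
{
    int lanes = sf ? 4 : 2;
    uint64_t result = 0;

    for (int i = 0; i < lanes; i++) {
        uint64_t half = (x >> (16 * i)) & 0xffff;
        uint64_t swapped = ((half >> 8) | (half << 8)) & 0xffff;
        result |= swapped << (16 * i);
    }
    return result;
}

/* Whole-register model: two shifts, two masks and one OR. */
uint64_t rev16_shift_mask(uint64_t x, int sf)
{
    uint64_t mask1 = sf ? 0x00ff00ff00ff00ffull : 0x00ff00ff;
    uint64_t mask2 = sf ? 0xff00ff00ff00ff00ull : 0xff00ff00;

    /* High bytes move down into the low-byte positions, low bytes move up. */
    return ((x >> 8) & mask1) | ((x << 8) & mask2);
}

/* Compare both models on a few sample values. */
void rev16_self_check(void)
{
    static const uint64_t samples[] = {
        0, 0x11223344ull, 0xdeadbeefull, 0x1122334455667788ull, ~0ull,
    };

    for (unsigned i = 0; i < sizeof(samples) / sizeof(samples[0]); i++) {
        assert(rev16_per_lane(samples[i], 0) == rev16_shift_mask(samples[i], 0));
        assert(rev16_per_lane(samples[i], 1) == rev16_shift_mask(samples[i], 1));
    }
}
```

With sf = 0 the masks also discard anything above bit 31, so the result
stays 32 bits wide even if the input has high bits set.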

Aurelien

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net

Re: [Qemu-devel] [PATCH v4] qemu-img: Check for backing image if specified during create

2017-05-12 Thread Max Reitz

On 2017-05-11 20:27, John Snow wrote:
> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1213786
> 
> Or, rather, force the open of a backing image if one was specified
> for creation. Using a similar -unsafe option as rebase, allow qemu-img
> to ignore the backing file validation if possible.
> 
> It may not always be possible, as in the existing case when a filesize
> for the new image was not specified.
> 
> This is accomplished by shifting around the conditionals in
> bdrv_img_create, such that a backing file is always opened unless we
> provide BDRV_O_NO_BACKING. qemu-img is adjusted to pass this new flag
> when -u is provided to create.
> 
> Sorry for the heinous looking diffstat, but it's mostly whitespace.
> 
> Reported-by: Yi Sun 
> Signed-off-by: John Snow 
> Reviewed-by: Eric Blake 
> ---
> 
> v4: Actually do the things Eric told me to.
> v3: Rebased
> v2: Rebased for 2.10
> Corrected some of my less cromulent grammar
> 
> 
>  block.c| 73 +++---
>  qemu-img-cmds.hx   |  4 +--
>  qemu-img.c | 16 ++
>  tests/qemu-iotests/082 |  4 +--
>  tests/qemu-iotests/082.out |  4 +--
>  5 files changed, 54 insertions(+), 47 deletions(-)
> 
> diff --git a/block.c b/block.c
> index a45b9b5..3c3df54 100644
> --- a/block.c
> +++ b/block.c
> @@ -4275,37 +4275,37 @@ void bdrv_img_create(const char *filename, const char *fmt,
>  // The size for the image must always be specified, with one exception:
>  // If we are using a backing file, we can obtain the size from there
>  size = qemu_opt_get_size(opts, BLOCK_OPT_SIZE, 0);
> -if (size == -1) {

"Hang on, why should this be -1 when the defval is 0? Where does the -1
come from?"
"..."
"Oh, the option exists and is set to -1? Why is that?"
"..."
"Oh, because this function always sets it itself, and because @img_size
is set to (uint64_t)-1."

First, I won't start with how signed integer overflow is
implementation-defined in C because I hope you have thrashed that out
with Eric (I hope that "to thrash out" is a good translation for
"auskaspern" (lit. "to buffoon out").).

Second, well, at least we should put -1 as the default value here, then.
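
To illustrate the sentinel issue, here is a small standalone sketch. It
is not QEMU code: IMG_SIZE_UNSET, size_is_unset and as_signed_size are
hypothetical names standing in for the (uint64_t)-1 value and the
"size == -1" test discussed here. Strictly speaking, what is
implementation-defined (before C23) is the conversion of an out-of-range
unsigned value to a signed type; the sketch keeps the comparison in the
unsigned domain to sidestep that.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the "size not specified" sentinel. */
#define IMG_SIZE_UNSET ((uint64_t)-1)

/* Unsigned comparison: fully defined by the C standard. */
int size_is_unset(uint64_t stored)
{
    return stored == IMG_SIZE_UNSET;
}

/* Models what happens when the sentinel lands in a signed variable.
 * The result of this conversion is implementation-defined before C23;
 * two's-complement targets yield -1 for UINT64_MAX. */
int64_t as_signed_size(uint64_t stored)
{
    return (int64_t)stored;
}

/* Small demonstration of the well-defined part. */
void sentinel_demo(void)
{
    uint64_t img_size = IMG_SIZE_UNSET;

    assert(size_is_unset(img_size));
    assert(!size_is_unset(0));
}
```

Using the same sentinel as the default value, as suggested above, would
make the "not specified" and "explicitly set to the sentinel" cases
indistinguishable on purpose, which is the behaviour the code already
relies on.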

Not strictly your fault or something that you need to fix, but it is
just a single line in the vicinity...

Let me know if you want to address this, for now I'll leave a

Reviewed-by: Max Reitz 

here if you don't want to.

Max

> -if (backing_file) {
> -BlockDriverState *bs;
> -char *full_backing = g_new0(char, PATH_MAX);
> -int64_t size;
> -int back_flags;
> -QDict *backing_options = NULL;
> -
> -bdrv_get_full_backing_filename_from_filename(filename, backing_file,
> - full_backing, PATH_MAX,
> - &local_err);
> -if (local_err) {
> -g_free(full_backing);
> -goto out;
> -}
> -
> -/* backing files always opened read-only */
> -back_flags = flags;
> -back_flags &= ~(BDRV_O_RDWR | BDRV_O_SNAPSHOT | BDRV_O_NO_BACKING);
> -
> -if (backing_fmt) {
> -backing_options = qdict_new();
> -qdict_put_str(backing_options, "driver", backing_fmt);
> -}
> -
> -bs = bdrv_open(full_backing, NULL, backing_options, back_flags,
> -   &local_err);
> +if (backing_file && !(flags & BDRV_O_NO_BACKING)) {
> +BlockDriverState *bs;
> +char *full_backing = g_new0(char, PATH_MAX);
> +int back_flags;
> +QDict *backing_options = NULL;
> +
> +bdrv_get_full_backing_filename_from_filename(filename, backing_file,
> + full_backing, PATH_MAX,
> + &local_err);
> +if (local_err) {
>  g_free(full_backing);
> -if (!bs) {
> -goto out;
> -}
> +goto out;
> +}
> +
> +/* backing files always opened read-only */
> +back_flags = flags;
> +back_flags &= ~(BDRV_O_RDWR | BDRV_O_SNAPSHOT | BDRV_O_NO_BACKING);
> +
> +if (backing_fmt) {
> +backing_options = qdict_new();
> +qdict_put_str(backing_options, "driver", backing_fmt);
> +}
> +
> +bs = bdrv_open(full_backing, NULL, backing_options, back_flags,
> +   &local_err);
> +g_free(full_backing);
> +if (!bs) {
> +goto out;
> +}
> +
> +if (size == -1) {
>  size = bdrv_getlength(bs);
>  if (size < 0) {
>  error_setg_errno(errp, -size, "Could not get size of '%s'",
> @@ -4313,14 +4313,15 @@ void

Re: [Qemu-devel] [PATCH 12/12] migration: migration.h was not needed

2017-05-12 Thread Dr. David Alan Gilbert

* Juan Quintela (quint...@redhat.com) wrote:
> These files don't use any function from migration.h, so drop it.
> 
> Signed-off-by: Juan Quintela 
> ---
>  block/qed.c | 1 -
>  hw/i386/pc_q35.c| 1 -
>  hw/virtio/vhost-user.c  | 1 -
>  hw/virtio/vhost-vsock.c | 1 -
>  hw/virtio/virtio.c  | 1 -
>  monitor.c   | 1 -
>  6 files changed, 6 deletions(-)
> 
> diff --git a/block/qed.c b/block/qed.c
> index fd76817..8d899fd 100644
> --- a/block/qed.c
> +++ b/block/qed.c
> @@ -19,7 +19,6 @@
>  #include "trace.h"
>  #include "qed.h"
>  #include "qapi/qmp/qerror.h"
> -#include "migration/migration.h"
>  #include "sysemu/block-backend.h"
>  
>  static const AIOCBInfo qed_aiocb_info = {
> diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
> index dd792a8..76b08f8 100644
> --- a/hw/i386/pc_q35.c
> +++ b/hw/i386/pc_q35.c
> @@ -46,7 +46,6 @@
>  #include "hw/ide/ahci.h"
>  #include "hw/usb.h"
>  #include "qemu/error-report.h"
> -#include "migration/migration.h"
>  
>  /* ICH9 AHCI has 6 ports */
>  #define MAX_SATA_PORTS 6
> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> index 9334a8a..ebc8ccf 100644
> --- a/hw/virtio/vhost-user.c
> +++ b/hw/virtio/vhost-user.c
> @@ -17,7 +17,6 @@
>  #include "sysemu/kvm.h"
>  #include "qemu/error-report.h"
>  #include "qemu/sockets.h"
> -#include "migration/migration.h"
>  
>  #include 
>  #include 
> diff --git a/hw/virtio/vhost-vsock.c b/hw/virtio/vhost-vsock.c
> index b481562..49e0022 100644
> --- a/hw/virtio/vhost-vsock.c
> +++ b/hw/virtio/vhost-vsock.c
> @@ -17,7 +17,6 @@
>  #include "qapi/error.h"
>  #include "hw/virtio/virtio-bus.h"
>  #include "hw/virtio/virtio-access.h"
> -#include "migration/migration.h"
>  #include "qemu/error-report.h"
>  #include "hw/virtio/vhost-vsock.h"
>  #include "qemu/iov.h"

Aren't these including it to get the vmstate macros?
Or have they picked those up from somewhere else instead?

Dave

> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> index 03592c5..2d2b6bf 100644
> --- a/hw/virtio/virtio.c
> +++ b/hw/virtio/virtio.c
> @@ -21,7 +21,6 @@
>  #include "hw/virtio/virtio.h"
>  #include "qemu/atomic.h"
>  #include "hw/virtio/virtio-bus.h"
> -#include "migration/migration.h"
>  #include "hw/virtio/virtio-access.h"
>  #include "sysemu/dma.h"
>  
> diff --git a/monitor.c b/monitor.c
> index 078cba5..fa295c4 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -49,7 +49,6 @@
>  #include "disas/disas.h"
>  #include "sysemu/balloon.h"
>  #include "qemu/timer.h"
> -#include "migration/migration.h"
>  #include "sysemu/hw_accel.h"
>  #include "qemu/acl.h"
>  #include "sysemu/tpm.h"
> -- 
> 2.9.3
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [Qemu-devel] [PATCH 09/12] migration: Split vmstate-types.c from vmstate.c

2017-05-12 Thread Dr. David Alan Gilbert

* Juan Quintela (quint...@redhat.com) wrote:
> Now one just has the interpreter, and the other has the basic types.
> Once there, add copyright boilerplate.
> 
> Signed-off-by: Juan Quintela 

I think this is generally OK, but as discussed on IRC, I think
you need to check the licenses.

Dave

> ---
>  migration/Makefile.objs   |   2 +-
>  migration/vmstate-types.c | 676 ++
>  migration/vmstate.c   | 669 ++---
>  tests/Makefile.include|   2 +-
>  4 files changed, 705 insertions(+), 644 deletions(-)
>  create mode 100644 migration/vmstate-types.c
> 
> diff --git a/migration/Makefile.objs b/migration/Makefile.objs
> index ce8ce12..812b2ec 100644
> --- a/migration/Makefile.objs
> +++ b/migration/Makefile.objs
> @@ -1,7 +1,7 @@
>  common-obj-y += migration.o socket.o fd.o exec.o
>  common-obj-y += tls.o channel.o
>  common-obj-y += colo-comm.o colo.o colo-failover.o
> -common-obj-y += vmstate.o page_cache.o
> +common-obj-y += vmstate.o vmstate-types.o page_cache.o
>  common-obj-y += qemu-file.o
>  common-obj-y += qemu-file-channel.o
>  common-obj-y += xbzrle.o postcopy-ram.o
> diff --git a/migration/vmstate-types.c b/migration/vmstate-types.c
> new file mode 100644
> index 000..0cf14d4
> --- /dev/null
> +++ b/migration/vmstate-types.c
> @@ -0,0 +1,676 @@
> +/*
> + * QEMU System Emulator
> + *
> + * Copyright (c) 2009-2017 Red Hat Inc
> + *
> + * Authors:
> + *  Juan Quintela 
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a copy
> + * of this software and associated documentation files (the "Software"), to deal
> + * in the Software without restriction, including without limitation the rights
> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
> + * copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
> + * THE SOFTWARE.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qemu-common.h"
> +#include "migration/migration.h"
> +#include "migration/qemu-file.h"
> +#include "migration/vmstate.h"
> +#include "qemu/error-report.h"
> +#include "qemu/queue.h"
> +#include "trace.h"
> +
> +/* bool */
> +
> +static int get_bool(QEMUFile *f, void *pv, size_t size, VMStateField *field)
> +{
> +bool *v = pv;
> +*v = qemu_get_byte(f);
> +return 0;
> +}
> +
> +static int put_bool(QEMUFile *f, void *pv, size_t size, VMStateField *field,
> +QJSON *vmdesc)
> +{
> +bool *v = pv;
> +qemu_put_byte(f, *v);
> +return 0;
> +}
> +
> +const VMStateInfo vmstate_info_bool = {
> +.name = "bool",
> +.get  = get_bool,
> +.put  = put_bool,
> +};
> +
> +/* 8 bit int */
> +
> +static int get_int8(QEMUFile *f, void *pv, size_t size, VMStateField *field)
> +{
> +int8_t *v = pv;
> +qemu_get_s8s(f, v);
> +return 0;
> +}
> +
> +static int put_int8(QEMUFile *f, void *pv, size_t size, VMStateField *field,
> + QJSON *vmdesc)
> +{
> +int8_t *v = pv;
> +qemu_put_s8s(f, v);
> +return 0;
> +}
> +
> +const VMStateInfo vmstate_info_int8 = {
> +.name = "int8",
> +.get  = get_int8,
> +.put  = put_int8,
> +};
> +
> +/* 16 bit int */
> +
> +static int get_int16(QEMUFile *f, void *pv, size_t size, VMStateField *field)
> +{
> +int16_t *v = pv;
> +qemu_get_sbe16s(f, v);
> +return 0;
> +}
> +
> +static int put_int16(QEMUFile *f, void *pv, size_t size, VMStateField *field,
> + QJSON *vmdesc)
> +{
> +int16_t *v = pv;
> +qemu_put_sbe16s(f, v);
> +return 0;
> +}
> +
> +const VMStateInfo vmstate_info_int16 = {
> +.name = "int16",
> +.get  = get_int16,
> +.put  = put_int16,
> +};
> +
> +/* 32 bit int */
> +
> +static int get_int32(QEMUFile *f, void *pv, size_t size, VMStateField *field)
> +{
> +int32_t *v = pv;
> +qemu_get_sbe32s(f, v);
> +return 0;
> +}
> +
> +static int put_int32(QEMUFile *f, void *pv, size_t size, VMStateField *field,
> + QJSON *vmdesc)
> +{
> +int32_t *v = pv;
> +qemu_put_sbe32s(f, v);
> +return 0;
> +}
> +
> +const VMStateInfo vmstate_info_int32 = {
> +.name = "int32",
> +.get  = get_int32,
> +.put  = put_int32,
> +};
>

Re: [Qemu-devel] [PATCH 05/12] migration: Move colo.h to migration/

2017-05-12 Thread Dr. David Alan Gilbert

* Juan Quintela (quint...@redhat.com) wrote:
> There are functions only used by migration code.

That's only mostly true; see the current 'integrate colo frame with
block replication and net compare' series (posted 22nd April).
That adds colo_handle_shutdown to this header and calls it from vl.c
( https://lists.gnu.org/archive/html/qemu-devel/2017-04/msg03901.html ).
Where should that go?

There's also a net/colo.h, so using
  #include "colo.h"
in migration is correct, but it's really scary when there are two
files of the same name.

Dave

> Signed-off-by: Juan Quintela 
> ---
>  MAINTAINERS | 2 +-
>  migration/colo-comm.c   | 2 +-
>  migration/colo-failover.c   | 2 +-
>  {include/migration => migration}/colo.h | 0
>  migration/migration.c   | 2 +-
>  migration/ram.c | 2 +-
>  6 files changed, 5 insertions(+), 5 deletions(-)
>  rename {include/migration => migration}/colo.h (100%)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 0e8d731..e834876 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -1537,7 +1537,7 @@ COLO Framework
>  M: zhanghailiang 
>  S: Maintained
>  F: migration/colo*
> -F: include/migration/colo.h
> +F: migration/colo.h
>  F: include/migration/failover.h
>  F: docs/COLO-FT.txt
>  
> diff --git a/migration/colo-comm.c b/migration/colo-comm.c
> index 3d91798..9b35027 100644
> --- a/migration/colo-comm.c
> +++ b/migration/colo-comm.c
> @@ -13,7 +13,7 @@
>  
>  #include "qemu/osdep.h"
>  #include "migration/migration.h"
> -#include "migration/colo.h"
> +#include "colo.h"
>  #include "trace.h"
>  
>  typedef struct {
> diff --git a/migration/colo-failover.c b/migration/colo-failover.c
> index cc229f5..29b8d63 100644
> --- a/migration/colo-failover.c
> +++ b/migration/colo-failover.c
> @@ -11,7 +11,7 @@
>   */
>  
>  #include "qemu/osdep.h"
> -#include "migration/colo.h"
> +#include "colo.h"
>  #include "migration/failover.h"
>  #include "qmp-commands.h"
>  #include "qapi/qmp/qerror.h"
> diff --git a/include/migration/colo.h b/migration/colo.h
> similarity index 100%
> rename from include/migration/colo.h
> rename to migration/colo.h
> diff --git a/migration/migration.c b/migration/migration.c
> index ad5ed14..a5a17fe 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -38,7 +38,7 @@
>  #include "exec/address-spaces.h"
>  #include "io/channel-buffer.h"
>  #include "io/channel-tls.h"
> -#include "migration/colo.h"
> +#include "colo.h"
>  
>  #define MAX_THROTTLE  (32 << 20)  /* Migration transfer speed throttling */
>  
> diff --git a/migration/ram.c b/migration/ram.c
> index 2564c00..7a5f5fa 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -44,7 +44,7 @@
>  #include "trace.h"
>  #include "exec/ram_addr.h"
>  #include "qemu/rcu_queue.h"
> -#include "migration/colo.h"
> +#include "colo.h"
>  
>  /***/
>  /* ram save/restore */
> -- 
> 2.9.3
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [Qemu-devel] [RFC PATCH v3 1/5] coccinelle: add a script to optimize tcg op using tcg_gen_extract()

2017-05-12 Thread Eric Blake

On 05/12/2017 12:36 AM, Philippe Mathieu-Daudé wrote:

> In this patch
> http://lists.nongnu.org/archive/html/qemu-devel/2017-05/msg01466.html
> Aurelien does:
> 
> -tcg_gen_shri_i32(cpu_sr_q, src, SR_Q);
> -tcg_gen_andi_i32(cpu_sr_q, cpu_sr_q, 1);
> +tcg_gen_extract_i32(cpu_sr_q, src, SR_Q, 1);
> 
> having:
> 
> #define SR_Q  8
> 
> I wanted to write a Coccinelle script to check for this pattern.
> My first version was wrong, as Richard Henderson reminded me this
> pattern can be applied as long as the len argument (here "1") is a
> Mersenne prime (all least significant bits as "1").

Side note: while you are correct that a Mersenne prime is one less than
a power of 2, your use of the term here is incorrect. You are looking
for ALL instances of numbers that are one less than a power of two, and
not just the Mersenne primes.  (For instance, 0xf is NOT a Mersenne
prime, but IS a candidate for an optimization using a length of 4.)
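
A small plain-C sketch of the condition being described: a shift
followed by an AND can become an extract whenever the mask is a run of
low 1 bits, i.e. one less than a power of two, prime or not. The helper
names (low_mask_len, extract64_model) are invented for this example and
are not QEMU or Coccinelle APIs.

```c
#include <assert.h>
#include <stdint.h>

/* Returns len if mask == 2^len - 1 (len >= 1), or 0 for any other mask. */
unsigned low_mask_len(uint64_t mask)
{
    unsigned len = 0;

    /* mask & (mask + 1) is zero exactly when mask is a contiguous run of
     * low 1 bits (for all-ones, mask + 1 wraps around to 0). */
    if (mask == 0 || (mask & (mask + 1)) != 0) {
        return 0;
    }
    while (mask) {
        len++;
        mask >>= 1;
    }
    return len;
}

/* Plain-C model of an extract: len bits of src starting at bit ofs.
 * Assumes 0 < len and ofs + len <= 64. */
uint64_t extract64_model(uint64_t src, unsigned ofs, unsigned len)
{
    uint64_t mask;

    assert(len > 0 && ofs + len <= 64);
    mask = (len == 64) ? ~0ull : ((1ull << len) - 1);
    return (src >> ofs) & mask;
}
```

So a mask of 0xf gives a length of 4 even though 15 is not prime, while
a mask such as 0xf0 is rejected because its 1 bits do not start at bit 0.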

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Re: [Qemu-devel] [PATCH 3/3] net/filter-rewriter: Remove unused option in filter-rewirter

2017-05-12 Thread Eric Blake

On 05/11/2017 08:35 PM, Zhang Chen wrote:
> Signed-off-by: Zhang Chen 

In the subject: s/rewirter/rewriter/

> ---
>  qemu-options.hx | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/qemu-options.hx b/qemu-options.hx
> index 70c0ded..f5e088e 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -4038,7 +4038,8 @@ Create a filter-redirector we need to differ outdev id from indev id, id can not
>  be the same. we can just use indev or outdev, but at least one of indev or outdev
>  need to be specified.
>  
> -@item -object filter-rewriter,id=@var{id},netdev=@var{netdevid},rewriter-mode=@var{mode}[,queue=@var{all|rx|tx}]
> +@item -object filter-rewriter,id=@var{id},netdev=@var{netdevid},
> +[,queue=@var{all|rx|tx}]

Texinfo sources have restrictions on how line-wrapping works, and it's
not always consistent.  It's probably best to stick with the long line,
if you haven't actually validated that the rendered documentation is not
broken when you split the line.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



Re: [Qemu-devel] [PATCH v8 0/4] Improve convert and dd commands

2017-05-12 Thread Max Reitz

On 2017-05-09 11:48, Daniel P. Berrange wrote:
> Update to
> 
>   v1: https://lists.gnu.org/archive/html/qemu-devel/2017-01/msg05699.html
>   v2: https://lists.gnu.org/archive/html/qemu-devel/2017-02/msg00728.html
>   v3: https://lists.gnu.org/archive/html/qemu-devel/2017-02/msg04391.html
>   v4: https://lists.gnu.org/archive/html/qemu-devel/2017-04/msg02153.html
>   v5: https://lists.gnu.org/archive/html/qemu-devel/2017-04/msg04109.html
>   v6: https://lists.gnu.org/archive/html/qemu-devel/2017-05/msg00215.html
> 
> This series is in response to Max pointing out that you cannot
> use 'convert' for an encrypted target image.
> 
> The 'convert' and 'dd' commands need to first create the image
> and then open it. The bdrv_create() method takes a set of options
> for creating the image, which let us provide a key-secret for the
> encryption key. When the commands then open the new image, they
> don't provide any options, so the image is unable to be opened
> due to lack of encryption key. It is also not possible to use
> the --image-opts argument to provide structured options in the
> target image name - it must be a plain filename to satisfy the
> bdrv_create() API contract.
> 
> This series addresses these problems to some extent
> 
>  - Adds a new --target-image-opts flag which is used to say
>that the target filename is using structured options.
>It is *only* permitted to use this when -n is also set.
>ie the target image must be pre-created so convert/dd
>don't need to run bdrv_create().
> 
>  - When --target-image-opts is not used, add special case
>code that identifies options passed to bdrv_create()
>named "*key-secret" and adds them to the options used
>to open the new image
> 
> In future it is desirable to make --target-image-opts work even when -n is
> *not* given. This requires considerable work to create a new bdrv_create()
> API impl.
> 
> The first patch fixes a bug in the 'dd' command while the second adds support
> for the missing '--object' arg to 'dd', allowing it to reference secrets when
> opening files.  The last two patches implement the new features described above
> for the 'convert' command.
> 
> NB v8 is based against git master once more, since the img_convert changes
> previously in block-next have now merged.
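
The "*key-secret" matching described in the cover letter above amounts
to a suffix test on the option name. A minimal sketch of that test in
plain C follows; is_key_secret_opt is a hypothetical helper written for
illustration and is not taken from the series itself.

```c
#include <assert.h>
#include <string.h>

/* Returns nonzero if the option name ends in "key-secret",
 * e.g. "key-secret" itself or "encrypt.key-secret". */
int is_key_secret_opt(const char *name)
{
    static const char suffix[] = "key-secret";
    size_t suffix_len = sizeof(suffix) - 1;
    size_t name_len;

    assert(name != NULL);
    name_len = strlen(name);

    /* Compare only the tail of the name against the suffix. */
    return name_len >= suffix_len &&
           strcmp(name + name_len - suffix_len, suffix) == 0;
}
```

Options for which this returns nonzero would be the ones copied from the
creation options into the options used to reopen the new image.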

Changes from the previous version look good, but unfortunately here's
the "but": The image locking series has brought even more changes to
qemu-img. :-(

I tried resolving them, but the following backport-diff didn't look like
I should proceed:

001/4:[] [-C] 'qemu-img: add support for --object with 'dd' command'
002/4:[0004] [FC] 'qemu-img: fix --image-opts usage with dd command'
003/4:[0015] [FC] 'qemu-img: introduce --target-image-opts for 'convert' command'
004/4:[0024] [FC] 'qemu-img: copy *key-secret opts when opening newly created files'

The fun is increased by the fact that the locking series has
(inadvertently) removed the -B documentation from convert, so there is
another conflict looming in the future...

(Or you just inadvertently add it back. Then we'd have resolved the
issue altogether...)

Max



Re: [Qemu-devel] [Qemu-block] [PULL 05/58] qemu-img: Update documentation for -U

2017-05-12 Thread Max Reitz

On 2017-05-11 16:32, Kevin Wolf wrote:
> From: Fam Zheng 
> 
> Signed-off-by: Fam Zheng 
> Signed-off-by: Kevin Wolf 
> ---
>  qemu-img-cmds.hx | 36 ++--
>  1 file changed, 18 insertions(+), 18 deletions(-)
> 
> diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx
> index bf4ce59..e5bc28f 100644
> --- a/qemu-img-cmds.hx
> +++ b/qemu-img-cmds.hx

[...]

>  DEF("convert", img_convert,
> -"convert [--object objectdef] [--image-opts] [-c] [-p] [-q] [-n] [-f 
> fmt] [-t cache] [-T src_cache] [-O output_fmt] [-B backing_file] [-o options] 
> [-s snapshot_id_or_name] [-l snapshot_param] [-S sparse_size] [-m 
> num_coroutines] [-W] filename [filename2 [...]] output_filename")
> +"convert [--object objectdef] [--image-opts] [-U] [-c] [-p] [-q] [-n] 
> [-f fmt] [-t cache] [-T src_cache] [-O output_fmt] [-o options] [-s 
> snapshot_id_or_name] [-l snapshot_param] [-S sparse_size] [-m num_coroutines] 
> [-W] filename [filename2 [...]] output_filename")
>  STEXI
> -@item convert [--object @var{objectdef}] [--image-opts] [-c] [-p] [-q] [-n] 
> [-f @var{fmt}] [-t @var{cache}] [-T @var{src_cache}] [-O @var{output_fmt}] 
> [-B @var{backing_file}] [-o @var{options}] [-s @var{snapshot_id_or_name}] [-l 
> @var{snapshot_param}] [-S @var{sparse_size}] [-m @var{num_coroutines}] [-W] 
> @var{filename} [@var{filename2} [...]] @var{output_filename}
> +@item convert [--object @var{objectdef}] [--image-opts] [-U] [-c] [-p] [-q] 
> [-n] [-f @var{fmt}] [-t @var{cache}] [-T @var{src_cache}] [-O 
> @var{output_fmt}] [-o @var{options}] [-s @var{snapshot_id_or_name}] [-l 
> @var{snapshot_param}] [-S @var{sparse_size}] [-m @var{num_coroutines}] [-W] 
> @var{filename} [@var{filename2} [...]] @var{output_filename}

Soo... Who gets to add the -B documentation back?

Max



Re: [Qemu-devel] [PATCH v3 2/3] arm64: kvm: inject SError with virtual syndrome

2017-05-12 Thread James Morse

Hi gengdongjiu,

On 05/05/17 14:19, gengdongjiu wrote:
> On 2017/5/2 23:37, James Morse wrote:
> > ... I think you expect an SError to arrive at EL2 and have its ESR recorded in
> > vcpu->arch.fault.vsesr_el2. Some time later KVM decides to inject an SError into
> > the guest, and this ESR is reused...
> > 
> > We shouldn't do this. Qemu/kvmtool may want to inject a virtual-SError that
> > never started as a physical-SError. Qemu/kvmtool may choose to notify the guest
> > of RAS events via another mechanism, or not at all.
> > 
> > KVM should not give the guest an ESR value of its choice. For SError the ESR
> > describes whether the error is corrected, correctable or fatal. Qemu/kvmtool
> > must choose this.

> Below is my previous solution:
> For an SError, the CPU first traps to the EL3 firmware, which records the
> syndrome in ESR_EL3. Before jumping to the EL2 hypervisor, it copies
> esr_el3 to esr_el2.

(Copying the ESR value won't always be the right thing to do.)


> So, to pass this syndrome to vsesr_el2, the esr_el2 value is assigned to it.

> If Qemu/kvmtool chooses the ESR value, and the ESR only describes whether the
> error is corrected/correctable/fatal, is that information not enough for the guest?

So the API should specify which of these three severities to use? I think this
is too specific. The API should be useful for anything the VSE/VSESR hardware
can do.

VSESR_EL2 is described in the RAS spec: 4.4.12 [0]; it's a 64-bit register. I
think we should let Qemu/kvmtool specify any 64-bit value here, but KVM should
reject values that try to set bits described as RES0.

This would let Qemu/kvmtool specify any SError ISS, either setting ESR_ELx.IDS
and some virtual-machine specific value, or encoding any severity in AET and
choosing the DFSC/EA bits appropriately.


> > I think we need an API that allows Qemu/kvmtool to inject SError into a guest,
> > but that isn't quite what you have here.

> KVM provides an API to inject the SError, and Qemu/kvmtool calls that API
> through an ioctl. Would that be OK?

(just the one API call), yes.


Thanks,

James

[0]
https://static.docs.arm.com/ddi0587/a/RAS%20Extension-release%20candidate_march_29.pdf

Re: [Qemu-devel] KVM "fake DAX" device flushing

2017-05-12 Thread Kevin Wolf

On 12.05.2017 at 15:42, Stefan Hajnoczi wrote:
> On Thu, May 11, 2017 at 05:38:40PM -0400, Rik van Riel wrote:
> > On Thu, 2017-05-11 at 14:17 -0400, Stefan Hajnoczi wrote:
> > > On Wed, May 10, 2017 at 09:26:00PM +0530, Pankaj Gupta wrote:
> > > > * For live migration use case, if host side backing file is
> > > >   shared storage, we need to flush the page cache for the disk
> > > >   image at the destination (new fadvise interface, FADV_INVALIDATE_CACHE?)
> > > >   before starting execution of the guest on the destination host.
> > > 
> > > Good point.  QEMU currently only supports live migration with O_DIRECT.
> > > I think the problem was that userspace cannot guarantee consistency in
> > > the general case.  If you find a solution to this problem for fake
> > > NVDIMM then maybe the QEMU block layer can also begin supporting live
> > > migration with buffered I/O.
> > 
> > I'll be happy to work with you on that, independently
> > of Pankaj's project.
> > 
> > It looks like the fadvise system call could be extended
> > pretty easily with an FADV_INVALIDATE_CACHE command, the
> > other side of which can simply hook into the existing
> > page cache invalidation code in the kernel.
> > 
> > Qemu will need to know whether the invalidation succeeded,
> > but that is something we can test for pretty easily before
> > returning to userspace.
> 
> Sounds great.  I will review the long discussions that took place on
> qemu-devel about cache invalidation for live migration - just want to
> make sure there were no other reasons why only O_DIRECT is supported
> :).

There are other reasons why we recommend against using non-O_DIRECT
modes in production (including the error handling), but with respect to
live migration, this is the only one I'm aware of.

As I already said in the private email thread, an FADV_INVALIDATE_CACHE
should do the trick and I'd be happy to work with you guys on that.

Kevin


Re: [Qemu-devel] [Qemu-block] [PATCH] qcow2: remove extra local_error variable

2017-05-12 Thread Max Reitz

On 2017-05-11 17:03, Alberto Garcia wrote:
> Commit d7086422b1c1e75e320519cfe26176db6ec97a37 added a local_err
> variable global to the qcow2_amend_options() function, so there's no
> need to have this other one.
> 
> Signed-off-by: Alberto Garcia 
> ---
>  block/qcow2.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)

Thanks, applied to my block branch:

https://github.com/XanClic/qemu/commits/block

Max


