Re: [Qemu-devel] [RFC PATCH] ati-vga: Implement dummy VBlank IRQ

2019-08-14 Thread Gerd Hoffmann
On Thu, Aug 15, 2019 at 02:25:07AM +0200, BALATON Zoltan wrote:
> The MacOS driver exits if the card does not have an interrupt. If we
> set PCI_INTERRUPT_PIN to 1 then it enables VBlank interrupts and it
> boots but the mouse pointer cannot be moved. This patch implements a
> dummy VBlank interrupt by a timer triggered at 60 Hz to test if it
> helps. Unfortunately it doesn't: MacOS with this patch hangs during
> boot just polling interrupts and acknowledging them so maybe it needs
> something else or there may be some other problem with this
> implementation.
> 
> This is posted for comments and to let others experiment with it but
> probably should not be committed upstream yet.
> 
> Signed-off-by: BALATON Zoltan 
> ---
>  hw/display/ati.c  | 41 +
>  hw/display/ati_dbg.c  |  1 +
>  hw/display/ati_int.h  |  4 
>  hw/display/ati_regs.h |  1 +
>  4 files changed, 47 insertions(+)
> 
> diff --git a/hw/display/ati.c b/hw/display/ati.c
> index a365e2455d..e06cbf3e91 100644
> --- a/hw/display/ati.c
> +++ b/hw/display/ati.c
> @@ -243,6 +243,21 @@ static uint64_t ati_i2c(bitbang_i2c_interface *i2c, uint64_t data, int base)
>  return data;
>  }
>  
> +static void ati_vga_update_irq(ATIVGAState *s)
> +{
> +pci_set_irq(&s->dev, s->regs.gen_int_status & 1);

This should be "s->regs.gen_int_status & s->regs.gen_int_cntl" I guess?

> +static void ati_vga_vblank_irq(void *opaque)
> +{
> +ATIVGAState *s = opaque;
> +
> +timer_mod(&s->vblank_timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) +
> +  NANOSECONDS_PER_SECOND / 60);
> +s->regs.gen_int_status |= 1;

#defines for the irq status bits would be nice.

> +case GEN_INT_CNTL:
> +s->regs.gen_int_cntl = data;
> +if (data & 1) {
> +ati_vga_vblank_irq(s);
> +} else {
> +timer_del(&s->vblank_timer);
> +}

ati_vga_update_irq() needed here.

> +break;
> +case GEN_INT_STATUS:
> +data &= (s->dev_id == PCI_DEVICE_ID_ATI_RAGE128_PF ?
> + 0x000f040fUL : 0xfc080effUL);

Add IRQ_MASK #define ?

> +s->regs.gen_int_status &= ~data;

ati_vga_update_irq() needed here too.

cheers,
  Gerd
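
To make the review points above concrete, here is a minimal, self-contained sketch of the register arithmetic under discussion: the interrupt line level derived from the status register masked by the enable register, a named bit instead of a bare 1, and write-one-to-clear acknowledgement limited to a writable mask. All sketch_* names and SKETCH_VBLANK_INT are hypothetical; this is not the ati-vga code and it does not touch the PCI or timer APIs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name for the VBlank bit; the patch uses the literal 1. */
#define SKETCH_VBLANK_INT (1u << 0)

struct sketch_regs {
    uint32_t gen_int_cntl;    /* interrupt enable bits */
    uint32_t gen_int_status;  /* interrupt pending bits */
};

/* The line is asserted only if some pending bit is also enabled. */
static bool sketch_irq_level(const struct sketch_regs *r)
{
    return (r->gen_int_status & r->gen_int_cntl) != 0;
}

/* What the periodic timer callback would do to the status register. */
static void sketch_vblank_tick(struct sketch_regs *r)
{
    r->gen_int_status |= SKETCH_VBLANK_INT;
}

/* Guest write to the status register: write 1 to clear, within the mask. */
static void sketch_ack(struct sketch_regs *r, uint32_t data, uint32_t mask)
{
    r->gen_int_status &= ~(data & mask);
}
```

In a device model the level would be recomputed after every change to either register, which is why the review asks for an update call in both write paths.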




Re: [Qemu-devel] [PATCH 0/3] colo: Add support for continious replication

2019-08-14 Thread Zhang, Chen
Hi Lukas,

Please fix this issue and add more comments in the commit log.

Thanks
Zhang Chen

> -----Original Message-----
> From: no-re...@patchew.org [mailto:no-re...@patchew.org]
> Sent: Thursday, August 15, 2019 11:20 AM
> To: lukasstra...@web.de
> Cc: Zhang, Chen ; qemu-devel@nongnu.org
> Subject: Re: [Qemu-devel] [PATCH 0/3] colo: Add support for continious
> replication
> 
> Patchew URL:
> https://patchew.org/QEMU/cover.1565814686.git.lukasstra...@web.de/
> 
> 
> 
> Hi,
> 
> This series failed build test on s390x host. Please find the details below.
> 
> === TEST SCRIPT BEGIN ===
> #!/bin/bash
> # Testing script will be invoked under the git checkout with
> # HEAD pointing to a commit that has the patches applied on top of "base"
> # branch
> set -e
> 
> echo
> echo "=== ENV ==="
> env
> 
> echo
> echo "=== PACKAGES ==="
> rpm -qa
> 
> echo
> echo "=== UNAME ==="
> uname -a
> 
> CC=$HOME/bin/cc
> INSTALL=$PWD/install
> BUILD=$PWD/build
> mkdir -p $BUILD $INSTALL
> SRC=$PWD
> cd $BUILD
> $SRC/configure --cc=$CC --prefix=$INSTALL
> make -j4
> # XXX: we need reliable clean up
> # make check -j4 V=1
> make install
> === TEST SCRIPT END ===
> 
>  from /var/tmp/patchew-tester-tmp-6ji6qfi2/src/include/net/filter.h:13,
>  from /var/tmp/patchew-tester-tmp-6ji6qfi2/src/net/filter.c:14:
> /var/tmp/patchew-tester-tmp-6ji6qfi2/src/net/filter.c: In function ‘netfilter_complete’:
> /var/tmp/patchew-tester-tmp-6ji6qfi2/src/include/qemu/queue.h:412:44: error: ‘position’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
>   412 | (listelm)->field.tqe_circ.tql_prev = &(elm)->field.tqe_circ; \
>   |^
> /var/tmp/patchew-tester-tmp-6ji6qfi2/src/net/filter.c:237:21: note: ‘position’ was declared here
> 
> 
> The full log is available at
> http://patchew.org/logs/cover.1565814686.git.lukasstra...@web.de/testing.s390x/?type=message.
> ---
> Email generated automatically by Patchew [https://patchew.org/].
> Please send your feedback to patchew-de...@redhat.com


Re: [Qemu-devel] [PATCH] usb: reword -usb command-line option and mention xHCI

2019-08-14 Thread Gerd Hoffmann
  Hi,

> > > -Enable the USB driver (if it is not used by default yet).
> > > +Enable USB emulation on machine types with an on-board USB host controller (if
> > > +not enabled by default).  Note that on-board USB host controllers may not
> > > +support USB 3.0.  In this case -device nec-usb-xhci can be used instead on
> > 
> > Should we maybe rather recommend qemu-xhci instead?
> 
> I think nec-usb-xhci is preferred because there are Windows drivers.
> IIRC qemu-xhci works under Linux but not under Windows (just because the
> PCI Vendor/Device ID aren't covered by any driver).
> 
> Gerd: Can you confirm this?

That applies to windows 7 only, which is EOL next year.

win7 doesn't ship with xhci drivers, but you can download and use
nec/renesas drivers which require nec-usb-xhci.

win8+ ships with generic xhci drivers which work with all xhci
hardware, including qemu-xhci.

So it indeed makes sense to refer to qemu-xhci.

cheers,
  Gerd




Re: [Qemu-devel] [PATCH 0/3] colo: Add support for continious replication

2019-08-14 Thread no-reply
Patchew URL: https://patchew.org/QEMU/cover.1565814686.git.lukasstra...@web.de/



Hi,

This series failed build test on s390x host. Please find the details below.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
# Testing script will be invoked under the git checkout with
# HEAD pointing to a commit that has the patches applied on top of "base"
# branch
set -e

echo
echo "=== ENV ==="
env

echo
echo "=== PACKAGES ==="
rpm -qa

echo
echo "=== UNAME ==="
uname -a

CC=$HOME/bin/cc
INSTALL=$PWD/install
BUILD=$PWD/build
mkdir -p $BUILD $INSTALL
SRC=$PWD
cd $BUILD
$SRC/configure --cc=$CC --prefix=$INSTALL
make -j4
# XXX: we need reliable clean up
# make check -j4 V=1
make install
=== TEST SCRIPT END ===

 from /var/tmp/patchew-tester-tmp-6ji6qfi2/src/include/net/filter.h:13,
 from /var/tmp/patchew-tester-tmp-6ji6qfi2/src/net/filter.c:14:
/var/tmp/patchew-tester-tmp-6ji6qfi2/src/net/filter.c: In function ‘netfilter_complete’:
/var/tmp/patchew-tester-tmp-6ji6qfi2/src/include/qemu/queue.h:412:44: error: ‘position’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
  412 | (listelm)->field.tqe_circ.tql_prev = &(elm)->field.tqe_circ; \
  |^
/var/tmp/patchew-tester-tmp-6ji6qfi2/src/net/filter.c:237:21: note: ‘position’ was declared here


The full log is available at
http://patchew.org/logs/cover.1565814686.git.lukasstra...@web.de/testing.s390x/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com
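
For readers unfamiliar with this class of failure, the following is a minimal sketch of how a -Wmaybe-uninitialized report typically arises and one common way to silence it correctly. It assumes a variable that is assigned only on some branches before being used; the enum, the function, and the accepted strings are hypothetical and are not taken from net/filter.c.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical insertion positions, for illustration only. */
enum sketch_position {
    SKETCH_POS_HEAD,
    SKETCH_POS_TAIL,
    SKETCH_POS_INVALID
};

/*
 * If "position" were declared without an initializer and assigned only
 * inside the two branches below, the compiler could not prove that it is
 * set on every path and might warn at the point of use. Giving it a
 * defined value up front covers every path.
 */
static enum sketch_position sketch_parse_position(const char *s)
{
    enum sketch_position position = SKETCH_POS_INVALID;

    if (s == NULL || strcmp(s, "tail") == 0) {
        position = SKETCH_POS_TAIL;      /* default when nothing is given */
    } else if (strcmp(s, "head") == 0) {
        position = SKETCH_POS_HEAD;
    }
    return position;                     /* defined on every path */
}
```

Whether the warning in this series points at a real missing assignment or at a path the compiler cannot rule out has to be checked against the patch itself.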

Re: [Qemu-devel] [PATCH v2] target/riscv: Hardwire mcounter.TM and upper bits of [m|s]counteren

2019-08-14 Thread Jonathan Behrens
Ping! What is the status of this patch?

On Wed, Jul 3, 2019 at 2:02 PM Jonathan Behrens 
wrote:

> Bin, that proposal proved to be somewhat more controversial than I was
> expecting, since it was different than how currently available hardware
> worked. This option seemed much more likely to be accepted in the short
> term.
>
> Jonathan
>
> On Mon, Jul 1, 2019 at 9:26 PM Bin Meng  wrote:
>
>> On Tue, Jul 2, 2019 at 8:20 AM Alistair Francis 
>> wrote:
>> >
>> > On Mon, Jul 1, 2019 at 8:56 AM  wrote:
>> > >
>> > > From: Jonathan Behrens 
>> > >
>> > > QEMU currently always triggers an illegal instruction exception when
>> > > code attempts to read the time CSR. This is valid behavior, but only if
>> > > the TM bit in mcounteren is hardwired to zero. This change also
>> > > corrects mcounteren and scounteren CSRs to be 32-bits on both 32-bit
>> > > and 64-bit targets.
>> > >
>> > > Signed-off-by: Jonathan Behrens 
>> >
>> > Reviewed-by: Alistair Francis 
>> >
>>
>> I am a little bit lost here. I think we agreed to allow directly read
>> to time CSR when mcounteren.TM is set, no?
>>
>> Regards,
>> Bin
>>
>
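
The behavior described in the quoted commit message can be sketched as a write mask. The bit positions CY=0, TM=1, IR=2 come from the RISC-V privileged specification; the helper names are hypothetical, and reading "upper bits" as "everything above bit 31" is an assumption based on the statement that the CSRs are 32 bits wide on all targets. This is not the code from the patch.

```c
#include <stdint.h>

/* Low counter-enable bits as laid out in the RISC-V privileged spec. */
#define SKETCH_COUNTEREN_CY (1u << 0)
#define SKETCH_COUNTEREN_TM (1u << 1)
#define SKETCH_COUNTEREN_IR (1u << 2)

/*
 * Value actually stored after a guest write: truncated to 32 bits and
 * with TM forced to zero, so software can never enable direct reads of
 * the time CSR.
 */
static uint32_t sketch_write_counteren(uint64_t written)
{
    return (uint32_t)written & ~SKETCH_COUNTEREN_TM;
}

/* With TM clear, a lower-privilege read of the time CSR must trap. */
static int sketch_time_read_traps(uint32_t counteren)
{
    return (counteren & SKETCH_COUNTEREN_TM) == 0;
}
```

With TM hardwired this way, the existing illegal-instruction exception on time reads stays consistent with what the register reports.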


Re: [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation

2019-08-14 Thread no-reply
Patchew URL: https://patchew.org/QEMU/20190815020928.9679-1-jan.bo...@gmail.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Subject: [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation
Message-id: 20190815020928.9679-1-jan.bo...@gmail.com
Type: series

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag] patchew/20190815020928.9679-1-jan.bo...@gmail.com -> patchew/20190815020928.9679-1-jan.bo...@gmail.com
Submodule 'capstone' (https://git.qemu.org/git/capstone.git) registered for path 'capstone'
Submodule 'dtc' (https://git.qemu.org/git/dtc.git) registered for path 'dtc'
Submodule 'roms/QemuMacDrivers' (https://git.qemu.org/git/QemuMacDrivers.git) registered for path 'roms/QemuMacDrivers'
Submodule 'roms/SLOF' (https://git.qemu.org/git/SLOF.git) registered for path 'roms/SLOF'
Submodule 'roms/edk2' (https://git.qemu.org/git/edk2.git) registered for path 'roms/edk2'
Submodule 'roms/ipxe' (https://git.qemu.org/git/ipxe.git) registered for path 'roms/ipxe'
Submodule 'roms/openbios' (https://git.qemu.org/git/openbios.git) registered for path 'roms/openbios'
Submodule 'roms/openhackware' (https://git.qemu.org/git/openhackware.git) registered for path 'roms/openhackware'
Submodule 'roms/opensbi' (https://git.qemu.org/git/opensbi.git) registered for path 'roms/opensbi'
Submodule 'roms/qemu-palcode' (https://git.qemu.org/git/qemu-palcode.git) registered for path 'roms/qemu-palcode'
Submodule 'roms/seabios' (https://git.qemu.org/git/seabios.git/) registered for path 'roms/seabios'
Submodule 'roms/seabios-hppa' (https://git.qemu.org/git/seabios-hppa.git) registered for path 'roms/seabios-hppa'
Submodule 'roms/sgabios' (https://git.qemu.org/git/sgabios.git) registered for path 'roms/sgabios'
Submodule 'roms/skiboot' (https://git.qemu.org/git/skiboot.git) registered for path 'roms/skiboot'
Submodule 'roms/u-boot' (https://git.qemu.org/git/u-boot.git) registered for path 'roms/u-boot'
Submodule 'roms/u-boot-sam460ex' (https://git.qemu.org/git/u-boot-sam460ex.git) registered for path 'roms/u-boot-sam460ex'
Submodule 'slirp' (https://git.qemu.org/git/libslirp.git) registered for path 'slirp'
Submodule 'tests/fp/berkeley-softfloat-3' (https://git.qemu.org/git/berkeley-softfloat-3.git) registered for path 'tests/fp/berkeley-softfloat-3'
Submodule 'tests/fp/berkeley-testfloat-3' (https://git.qemu.org/git/berkeley-testfloat-3.git) registered for path 'tests/fp/berkeley-testfloat-3'
Submodule 'ui/keycodemapdb' (https://git.qemu.org/git/keycodemapdb.git) registered for path 'ui/keycodemapdb'
Cloning into 'capstone'...
Submodule path 'capstone': checked out '22ead3e0bfdb87516656453336160e0a37b066bf'
Cloning into 'dtc'...
Submodule path 'dtc': checked out '88f18909db731a627456f26d779445f84e449536'
Cloning into 'roms/QemuMacDrivers'...
Submodule path 'roms/QemuMacDrivers': checked out '90c488d5f4a407342247b9ea869df1c2d9c8e266'
Cloning into 'roms/SLOF'...
Submodule path 'roms/SLOF': checked out 'ba1ab360eebe6338bb8d7d83a9220ccf7e213af3'
Cloning into 'roms/edk2'...
Submodule path 'roms/edk2': checked out '20d2e5a125e34fc8501026613a71549b2a1a3e54'
Submodule 'SoftFloat' (https://github.com/ucb-bar/berkeley-softfloat-3.git) registered for path 'ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3'
Submodule 'CryptoPkg/Library/OpensslLib/openssl' (https://github.com/openssl/openssl) registered for path 'CryptoPkg/Library/OpensslLib/openssl'
Cloning into 'ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3'...
Submodule path 'roms/edk2/ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3': checked out 'b64af41c3276f97f0e181920400ee056b9c88037'
Cloning into 'CryptoPkg/Library/OpensslLib/openssl'...
Submodule path 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl': checked out '50eaac9f3337667259de725451f201e784599687'
Submodule 'boringssl' (https://boringssl.googlesource.com/boringssl) registered for path 'boringssl'
Submodule 'krb5' (https://github.com/krb5/krb5) registered for path 'krb5'
Submodule 'pyca.cryptography' (https://github.com/pyca/cryptography.git) registered for path 'pyca-cryptography'
Cloning into 'boringssl'...
Submodule path 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl/boringssl': checked out '2070f8ad9151dc8f3a73bffaa146b5e6937a583f'
Cloning into 'krb5'...
Submodule path 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl/krb5': checked out 'b9ad6c49505c96a088326b62a52568e3484f2168'
Cloning into 'pyca-cryptography'...
Submodule path 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl/pyca-cryptography': checked out '09403100de2f6f1cdd0d484dcb8e620f1c335c8f'
Cloning into 'roms/ip

Re: [Qemu-devel] [PULL 5/7] file-posix: Support BDRV_REQ_NO_FALLBACK for zero writes

2019-08-14 Thread Eric Blake
On 3/26/19 10:51 AM, Kevin Wolf wrote:
> We know that the kernel implements a slow fallback code path for
> BLKZEROOUT, so if BDRV_REQ_NO_FALLBACK is given, we shouldn't call it.
> The other operations we call in the context of .bdrv_co_pwrite_zeroes
> should usually be quick, so no modification should be needed for them.
> If we ever notice that there are additional problematic cases, we can
> still make these conditional as well.

Are there cases where fallocate(FALLOC_FL_ZERO_RANGE) falls back to slow
writes?  It may be fast on some file systems, but when used on a block
device, that may equally trigger slow fallbacks.  The man page is not
clear on that fact; I suspect that there may be cases in there that need
to be made conditional (it would be awesome if the kernel folks would
give us another FALLOC_ flag when we want to guarantee no fallback).

By the way, is there an easy setup to prove (maybe some qemu-img convert
command on a specially-prepared source image) whether the no fallback
flag makes a difference?  I'm about to cross-post a series of patches to
nbd/qemu/nbdkit/libnbd that adds a new NBD_CMD_FLAG_FAST_ZERO which fits
the bill of BDRV_REQ_NO_FALLBACK, but would like to include some
benchmark numbers in my cover letter if I can reproduce a setup where it
matters.

And this patch has a bug:

> +++ b/block/file-posix.c
> @@ -652,7 +652,7 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
>  }
>  #endif
>  
> -bs->supported_zero_flags = BDRV_REQ_MAY_UNMAP;
> +bs->supported_zero_flags = BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK;
>  ret = 0;
>  fail:
>  if (filename && (bdrv_flags & BDRV_O_TEMPORARY)) {
> @@ -1500,14 +1500,19 @@ static ssize_t handle_aiocb_write_zeroes_block(RawPosixAIOData *aiocb)
{
int ret = -ENOTSUP;
BDRVRawState *s = aiocb->bs->opaque;

if (!s->has_write_zeroes) {
return -ENOTSUP;
>  }

At this point, ret is -ENOTSUP.

>  
>  #ifdef BLKZEROOUT
> -do {
> -uint64_t range[2] = { aiocb->aio_offset, aiocb->aio_nbytes };
> -if (ioctl(aiocb->aio_fildes, BLKZEROOUT, range) == 0) {
> -return 0;
> -}
> -} while (errno == EINTR);
> +/* The BLKZEROOUT implementation in the kernel doesn't set
> + * BLKDEV_ZERO_NOFALLBACK, so we can't call this if we have to avoid slow
> + * fallbacks. */
> +if (!(aiocb->aio_type & QEMU_AIO_NO_FALLBACK)) {
> +do {
> +uint64_t range[2] = { aiocb->aio_offset, aiocb->aio_nbytes };
> +if (ioctl(aiocb->aio_fildes, BLKZEROOUT, range) == 0) {
> +return 0;
> +}
> +} while (errno == EINTR);
>  
> -ret = translate_err(-errno);
> +ret = translate_err(-errno);
> +}

If the very first call to this function is with NO_FALLBACK, then this
'if' is skipped,

>  #endif
>  
>  if (ret == -ENOTSUP) {
s->has_write_zeroes = false;
}

and we set s->has_write_zeroes to false, permanently disabling any
BLKZEROOUT attempts in future calls, even if the future calls no longer
pass the NO_FALLBACK flag.
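
The control flow described above can be reproduced in isolation. The sketch below models only the cached capability flag and a fake ioctl; struct sketch_state, sketch_zeroout() and both sketch_zero_* functions are hypothetical stand-ins, and the "fixed" variant is just one possible shape of a fix, not necessarily what gets committed.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver state plus a fake block device. */
struct sketch_state {
    bool has_write_zeroes;   /* cached "zeroing is supported" flag */
    bool device_supports;    /* what the fake ioctl will report */
};

/* Stand-in for ioctl(BLKZEROOUT): 0 on success, -ENOTSUP otherwise. */
static int sketch_zeroout(const struct sketch_state *s)
{
    return s->device_supports ? 0 : -ENOTSUP;
}

/* Same control flow as the quoted hunk. */
static int sketch_zero_as_posted(struct sketch_state *s, bool no_fallback)
{
    int ret = -ENOTSUP;

    if (!s->has_write_zeroes) {
        return -ENOTSUP;
    }
    if (!no_fallback) {
        ret = sketch_zeroout(s);
        if (ret == 0) {
            return 0;
        }
    }
    if (ret == -ENOTSUP) {
        s->has_write_zeroes = false;   /* also reached when nothing was tried */
    }
    return ret;
}

/* One possible fix: return before the cached flag can be touched. */
static int sketch_zero_fixed(struct sketch_state *s, bool no_fallback)
{
    int ret;

    if (!s->has_write_zeroes || no_fallback) {
        return -ENOTSUP;
    }
    ret = sketch_zeroout(s);
    if (ret == -ENOTSUP) {
        s->has_write_zeroes = false;   /* the device itself refused */
    }
    return ret;
}
```

Calling the as-posted variant once with no_fallback set leaves has_write_zeroes false even though the fake device supports zeroing, so a later call without the flag fails too; the second variant keeps the flag intact.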

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org





[Qemu-devel] [PATCH v1 1/2] accel/tcg: adding integration with linux perf

2019-08-14 Thread vandersonmr
This commit adds support for Linux Perf, making it
possible to analyze QEMU's JITed code and
also to see the TBs' PCs in it.

Signed-off-by: Vanderson M. do Rosario 
---
 accel/tcg/Makefile.objs  |   1 +
 accel/tcg/perf/Makefile.objs |   1 +
 accel/tcg/perf/jitdump.c | 180 +++
 accel/tcg/perf/jitdump.h |  19 
 accel/tcg/translate-all.c|  12 +++
 include/qemu-common.h|   3 +
 linux-user/main.c|   7 ++
 qemu-options.hx  |  12 +++
 8 files changed, 235 insertions(+)
 create mode 100644 accel/tcg/perf/Makefile.objs
 create mode 100644 accel/tcg/perf/jitdump.c
 create mode 100644 accel/tcg/perf/jitdump.h

diff --git a/accel/tcg/Makefile.objs b/accel/tcg/Makefile.objs
index d381a02f34..f393a7438f 100644
--- a/accel/tcg/Makefile.objs
+++ b/accel/tcg/Makefile.objs
@@ -3,6 +3,7 @@ obj-$(CONFIG_SOFTMMU) += cputlb.o
 obj-y += tcg-runtime.o tcg-runtime-gvec.o
 obj-y += cpu-exec.o cpu-exec-common.o translate-all.o
 obj-y += translator.o
+obj-y += perf/
 
 obj-$(CONFIG_USER_ONLY) += user-exec.o
 obj-$(call lnot,$(CONFIG_SOFTMMU)) += user-exec-stub.o
diff --git a/accel/tcg/perf/Makefile.objs b/accel/tcg/perf/Makefile.objs
new file mode 100644
index 0000000000..f82fba35e5
--- /dev/null
+++ b/accel/tcg/perf/Makefile.objs
@@ -0,0 +1 @@
+obj-y += jitdump.o
diff --git a/accel/tcg/perf/jitdump.c b/accel/tcg/perf/jitdump.c
new file mode 100644
index 0000000000..6f4c0911c2
--- /dev/null
+++ b/accel/tcg/perf/jitdump.c
@@ -0,0 +1,180 @@
+#ifdef __linux__
+
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "jitdump.h"
+#include "qemu-common.h"
+
+struct jitheader {
+uint32_t magic; /* characters "jItD" */
+uint32_t version;   /* header version */
+uint32_t total_size;/* total size of header */
+uint32_t elf_mach;  /* elf mach target */
+uint32_t pad1;  /* reserved */
+uint32_t pid;   /* JIT process id */
+uint64_t timestamp; /* timestamp */
+uint64_t flags; /* flags */
+};
+
+enum jit_record_type {
+JIT_CODE_LOAD   = 0,
+JIT_CODE_MOVE   = 1,
+JIT_CODE_DEBUG_INFO = 2,
+JIT_CODE_CLOSE  = 3,
+
+JIT_CODE_MAX,
+};
+
+/* record prefix (mandatory in each record) */
+struct jr_prefix {
+uint32_t id;
+uint32_t total_size;
+uint64_t timestamp;
+};
+
+struct jr_code_load {
+struct jr_prefix p;
+
+uint32_t pid;
+uint32_t tid;
+uint64_t vma;
+uint64_t code_addr;
+uint64_t code_size;
+uint64_t code_index;
+};
+
+struct jr_code_close {
+struct jr_prefix p;
+};
+
+struct jr_code_move {
+struct jr_prefix p;
+
+uint32_t pid;
+uint32_t tid;
+uint64_t vma;
+uint64_t old_code_addr;
+uint64_t new_code_addr;
+uint64_t code_size;
+uint64_t code_index;
+};
+
+FILE *dumpfile;
+void *perf_marker;
+
+static uint64_t get_timestamp(void)
+{
+struct timespec ts;
+if (clock_gettime(CLOCK_MONOTONIC, &ts)) {
+fprintf(stderr, "No support for CLOCK_MONOTONIC! -perf cannot be used!\n");
+exit(1);
+}
+return (uint64_t) ts.tv_sec * 1000000000 + ts.tv_nsec;
+}
+
+static uint32_t get_e_machine(void)
+{
+uint32_t e_machine = EM_NONE;
+Elf64_Ehdr elf_header;
+FILE *exe = fopen("/proc/self/exe", "r");
+
+if (exe == NULL) {
+return e_machine;
+}
+
+if (fread(&elf_header, sizeof(Elf64_Ehdr), 1, exe) != 1) {
+goto end;
+}
+
+e_machine = elf_header.e_machine;
+
+end:
+fclose(exe);
+return e_machine;
+}
+
+void start_jitdump_file(void)
+{
+GString *dumpfile_name = g_string_new(NULL);;
+g_string_printf(dumpfile_name, "./jit-%d.dump", getpid());
+dumpfile = fopen(dumpfile_name->str, "w+");
+
+perf_marker = mmap(NULL, sysconf(_SC_PAGESIZE),
+  PROT_READ | PROT_EXEC,
+  MAP_PRIVATE,
+  fileno(dumpfile), 0);
+
+if (perf_marker == MAP_FAILED) {
+printf("Failed to create mmap marker file for perf %d\n", fileno(dumpfile));
+fclose(dumpfile);
+return;
+}
+
+g_string_free(dumpfile_name, TRUE);
+
+struct jitheader *header = g_new0(struct jitheader, 1);
+header->magic = 0x4A695444;
+header->version = 1;
+header->elf_mach = get_e_machine();
+header->total_size = sizeof(struct jitheader);
+header->pid = getpid();
+header->timestamp = get_timestamp();
+
+fwrite(header, header->total_size, 1, dumpfile);
+
+free(header);
+fflush(dumpfile);
+}
+
+void append_load_in_jitdump_file(TranslationBlock *tb)
+{
+GString *func_name = g_string_new(NULL);
+g_string_printf(func_name, "TB virt:0x"TARGET_FMT_lx"%c", tb->pc, '\0');
+
+struct jr_code_load *load_event = g_new0(struct jr_code_load, 1);
+load_event->p.id = JIT_CODE_LOAD;
+load_event->p.total_size = sizeof(struct jr_code_load) + func_name->len + tb->tc.size;
+load_event->p.timestamp = get_timestamp();
+l

[Qemu-devel] [PATCH v1 0/2] Integrating qemu to Linux Perf

2019-08-14 Thread vandersonmr
This patch is part of Google Summer of Code (GSoC) 2019.
More about the project can be found in:
https://wiki.qemu.org/Internships/ProjectIdeas/TCGCodeQuality

This adds a --perf command-line option to dump Linux Perf
jitdump files. These files are used to enhance the perf report
and to analyze and dump JITed code with perf.

Example of use:
 perf record -k 1 qemu-x86_64 -perf ./a.out
 perf inject -j -i perf.data -o perf.data.jitted
 perf report -i perf.data.jitted
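
As background for what perf inject consumes, here is a minimal sketch of filling the fixed-size header that starts a jitdump file. The field layout, the magic value 0x4A695444 and version 1 are taken from patch 1/2 of this series; the sketch_* names, the macros, and the in-memory (rather than on-disk) handling are assumptions made for illustration.

```c
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Same field order as struct jitheader in patch 1/2. */
struct sketch_jitheader {
    uint32_t magic;       /* file magic */
    uint32_t version;     /* header version */
    uint32_t total_size;  /* total size of header */
    uint32_t elf_mach;    /* ELF machine of the JITed code */
    uint32_t pad1;        /* reserved */
    uint32_t pid;         /* JIT process id */
    uint64_t timestamp;   /* creation time in nanoseconds */
    uint64_t flags;       /* flags */
};

#define SKETCH_JIT_MAGIC  0x4A695444u
#define SKETCH_NS_PER_SEC 1000000000ull

/* Convert a timespec to a single nanosecond count. */
static uint64_t sketch_timespec_to_ns(const struct timespec *ts)
{
    return (uint64_t)ts->tv_sec * SKETCH_NS_PER_SEC + (uint64_t)ts->tv_nsec;
}

/* Fill a header in memory; the caller is responsible for writing it out. */
static void sketch_fill_header(struct sketch_jitheader *h, uint32_t elf_mach,
                               uint32_t pid, uint64_t now_ns)
{
    memset(h, 0, sizeof(*h));
    h->magic = SKETCH_JIT_MAGIC;
    h->version = 1;
    h->total_size = (uint32_t)sizeof(*h);
    h->elf_mach = elf_mach;
    h->pid = pid;
    h->timestamp = now_ns;
}
```

The timestamps only line up with perf's own samples if both use the same clock, which is what the -k 1 argument to perf record in the example selects.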

vandersonmr (2):
  accel/tcg: adding integration with linux perf
  tb-stats: adding TBStatistics info into perf dump

 accel/tcg/Makefile.objs  |   1 +
 accel/tcg/perf/Makefile.objs |   1 +
 accel/tcg/perf/jitdump.c | 193 +++
 accel/tcg/perf/jitdump.h |  19 
 accel/tcg/translate-all.c|  12 +++
 include/qemu-common.h|   3 +
 linux-user/main.c|   7 ++
 qemu-options.hx  |  12 +++
 8 files changed, 248 insertions(+)
 create mode 100644 accel/tcg/perf/Makefile.objs
 create mode 100644 accel/tcg/perf/jitdump.c
 create mode 100644 accel/tcg/perf/jitdump.h

-- 
2.22.0




Re: [Qemu-devel] [PATCH v9 05/11] numa: Extend CLI to provide initiator information for numa nodes

2019-08-14 Thread Dan Williams
On Wed, Aug 14, 2019 at 6:57 PM Tao Xu  wrote:
>
> On 8/15/2019 5:29 AM, Dan Williams wrote:
> > On Tue, Aug 13, 2019 at 10:14 PM Tao Xu  wrote:
> >>
> >> On 8/14/2019 10:39 AM, Dan Williams wrote:
> >>> On Tue, Aug 13, 2019 at 8:00 AM Igor Mammedov  wrote:
> 
>  On Fri,  9 Aug 2019 14:57:25 +0800
>  Tao  wrote:
> 
> > From: Tao Xu 
> >
> [...]
> > +for (i = 0; i < machine->numa_state->num_nodes; i++) {
> > +if (numa_info[i].initiator_valid &&
> > +!numa_info[numa_info[i].initiator].has_cpu) {
>  ^^ possible out of bounds read, see below
> 
> > +error_report("The initiator-id %"PRIu16 " of NUMA node %d"
> > + " does not exist.", numa_info[i].initiator, i);
> > +error_printf("\n");
> > +
> > +exit(1);
> > +}
>  it takes care only about nodes that have cpus or memory-only ones that 
>  have
>  initiator explicitly provided on CLI. And leaves possibility to have
>  memory-only nodes without initiator mixed with nodes that have initiator.
>  Is it valid to have mixed configuration?
>  Should we forbid it?
> >>>
> >>> The spec talks about the "Proximity Domain for the Attached Initiator"
> >>> field only being valid if the memory controller for the memory can be
> >>> identified by an initiator id in the SRAT. So I expect the only way to
> >>> define a memory proximity domain without this local initiator is to
> >>> allow specifying a node-id that does not have an entry in the SRAT.
> >>>
> >> Hi Dan,
> >>
> >> So there may be a situation for the Attached Initiator field is not
> >> valid? If true, I would allow user to input Initiator invalid.
> >
> > Yes it's something the OS needs to consider because the platform may
> > not be able to meet the constraint that a single initiator is
> > associated with the memory controller for a given memory target. In
> > retrospect it would have been nice if the spec reserved 0x for
> > this purpose, but it seems "not in SRAT" is the only way to identify
> > memory that is not attached to any single initiator.
> >
> But as far as I know, QEMU can't emulate a NUMA node "not in SRAT". I am
> wondering whether it would be effective to only set the Initiator invalid?

You don't need to emulate a NUMA node not in SRAT. Just put a number
in this HMAT entry larger than the largest proximity domain number
found in the SRAT.
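
Two points from this exchange can be sketched in a few lines of C: bounds-checking the initiator index before using it (the out-of-bounds read flagged earlier in the thread), and picking a proximity domain number that is guaranteed not to appear in the SRAT. The struct and both helpers are hypothetical; they mirror the field names in the quoted hunk but are not the QEMU code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of the per-node information in the quoted hunk. */
struct sketch_node {
    bool initiator_valid;
    uint16_t initiator;
    bool has_cpu;
};

/*
 * Return the index of the first node whose initiator is out of range or
 * refers to a node without CPUs, or -1 if every node passes. The range
 * check comes first so the array is never indexed out of bounds.
 */
static int sketch_check_initiators(const struct sketch_node *n, size_t num)
{
    for (size_t i = 0; i < num; i++) {
        if (!n[i].initiator_valid) {
            continue;
        }
        if (n[i].initiator >= num || !n[n[i].initiator].has_cpu) {
            return (int)i;
        }
    }
    return -1;
}

/*
 * Pick a proximity domain number larger than any listed in the SRAT.
 * Assumes the largest listed value is below UINT32_MAX; otherwise the
 * addition would wrap.
 */
static uint32_t sketch_unattached_domain(const uint32_t *srat, size_t num)
{
    uint32_t max = 0;

    for (size_t i = 0; i < num; i++) {
        if (srat[i] > max) {
            max = srat[i];
        }
    }
    return max + 1;
}
```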
>



[Qemu-devel] [PATCH v5 09/10] monitor: adding new info cfg command

2019-08-14 Thread vandersonmr
Adding the "info cfg id depth" command to HMP.
This command allows the exploration of a TB's
neighbors by dumping [and opening] a .dot
file with the TB CFG neighbors colorized
by their hotness.
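
As a rough illustration of "colorized by their hotness", the sketch below buckets a TB by comparing its execution count with that of the root TB, using the same thresholds as the patch (1.6x, 1.2x, 0.8x, 0.4x). The enum and function names are hypothetical, and the buckets stand in for the actual fill colors.

```c
#include <stdint.h>

/* Hypothetical heat buckets standing in for the node fill colors. */
enum sketch_heat {
    SKETCH_COLDEST,
    SKETCH_COLD,
    SKETCH_NEUTRAL,
    SKETCH_HOT,
    SKETCH_HOTTEST
};

/* Classify a TB by its execution count relative to the root TB. */
static enum sketch_heat sketch_classify(uint64_t count, uint64_t root_count)
{
    double c = (double)count;
    double r = (double)root_count;

    if (c > 1.6 * r) {
        return SKETCH_HOTTEST;
    } else if (c > 1.2 * r) {
        return SKETCH_HOT;
    } else if (c < 0.4 * r) {
        return SKETCH_COLDEST;
    } else if (c < 0.8 * r) {
        return SKETCH_COLD;
    }
    return SKETCH_NEUTRAL;
}
```

Because the comparison is relative to the root TB, the same block can land in a different bucket depending on which TB id the dump starts from.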

The goal of this command is to allow the dynamic exploration
of TCG behavior and code quality. Therefore, for now, a
corresponding QMP command is not worthwhile.

Signed-off-by: Vanderson M. do Rosario 
---
 accel/tcg/tb-stats.c| 177 
 hmp-commands-info.hx|   7 ++
 include/exec/tb-stats.h |   1 +
 monitor/misc.c  |  22 +
 4 files changed, 207 insertions(+)

diff --git a/accel/tcg/tb-stats.c b/accel/tcg/tb-stats.c
index f5e519bdb7..5fda2bed9e 100644
--- a/accel/tcg/tb-stats.c
+++ b/accel/tcg/tb-stats.c
@@ -637,6 +637,182 @@ void dump_tb_info(int id, int log_mask, bool use_monitor)
 /* tbdi free'd by do_dump_tb_info_safe */
 }
 
+/* TB CFG xdot/dot dump implementation */
+#define MAX_CFG_NUM_NODES 1000
+static int cfg_tb_id;
+static GHashTable *cfg_nodes;
+static uint64_t root_count;
+
+static void fputs_jump(TBStatistics *from, TBStatistics *to, FILE *dot)
+{
+if (!from || !to) {
+return;
+}
+
+int *from_id = (int *) g_hash_table_lookup(cfg_nodes, from);
+int *to_id   = (int *) g_hash_table_lookup(cfg_nodes, to);
+
+if (!from_id || !to_id) {
+return;
+}
+
+GString *line = g_string_new(NULL);
+
+g_string_printf(line, "   node_%d -> node_%d;\n", *from_id, *to_id);
+
+fputs(line->str, dot);
+
+g_string_free(line, true);
+}
+
+static void fputs_tbstats(TBStatistics *tbs, FILE *dot, int log_flags)
+{
+if (!tbs) {
+return;
+}
+
+GString *line = g_string_new(NULL);;
+
+uint32_t color = 0xFF666;
+uint64_t count = tbs->executions.normal;
+if (count > 1.6 * root_count) {
+color = 0xFF000;
+} else if (count > 1.2 * root_count) {
+color = 0xFF333;
+} else if (count < 0.4 * root_count) {
+color = 0xFFCCC;
+} else if (count < 0.8 * root_count) {
+color = 0xFF999;
+}
+
+GString *code_s = get_code_string(tbs, log_flags);
+
+for (int i = 0; i < code_s->len; i++) {
+if (code_s->str[i] == '\n') {
+code_s->str[i] = ' ';
+code_s = g_string_insert(code_s, i, "\\l");
+i += 2;
+}
+}
+
+g_string_printf(line,
+"   node_%d [fillcolor=\"#%xFF\" shape=\"record\" "
+"label=\"TB %d\\l"
+"-\\l"
+"PC:\t0x"TARGET_FMT_lx"\\l"
+"exec count:\t%lu\\l"
+"\\l %s\"];\n",
+cfg_tb_id, color, cfg_tb_id, tbs->pc,
+tbs->executions.normal, code_s->str);
+
+fputs(line->str, dot);
+
+int *id = g_new(int, 1);
+*id = cfg_tb_id;
+g_hash_table_insert(cfg_nodes, tbs, id);
+
+cfg_tb_id++;
+
+g_string_free(line, true);
+g_string_free(code_s, true);
+}
+
+static void fputs_preorder_walk(TBStatistics *tbs, int depth, FILE *dot, int log_flags)
+{
+if (tbs && depth > 0
+&& cfg_tb_id < MAX_CFG_NUM_NODES
+&& !g_hash_table_contains(cfg_nodes, tbs)) {
+
+fputs_tbstats(tbs, dot, log_flags);
+
+if (tbs->tb) {
+TranslationBlock *left_tb  = NULL;
+TranslationBlock *right_tb = NULL;
+if (tbs->tb->jmp_dest[0]) {
+left_tb = (TranslationBlock *) atomic_read(tbs->tb->jmp_dest);
+}
+if (tbs->tb->jmp_dest[1]) {
+right_tb = (TranslationBlock *) atomic_read(tbs->tb->jmp_dest + 1);
+}
+
+if (left_tb) {
+fputs_preorder_walk(left_tb->tb_stats, depth - 1, dot, log_flags);
+fputs_jump(tbs, left_tb->tb_stats, dot);
+}
+if (right_tb) {
+fputs_preorder_walk(right_tb->tb_stats, depth - 1, dot, log_flags);
+fputs_jump(tbs, right_tb->tb_stats, dot);
+}
+}
+}
+}
+
+struct PreorderInfo {
+TBStatistics *tbs;
+int depth;
+int log_flags;
+};
+
+static void fputs_preorder_walk_safe(CPUState *cpu, run_on_cpu_data icmd)
+{
+struct PreorderInfo *info = icmd.host_ptr;
+
+GString *file_name = g_string_new(NULL);;
+g_string_printf(file_name, "/tmp/qemu-cfg-tb-%d-%d.dot", id, info->depth);
+FILE *dot = fopen(file_name->str, "w+");
+
+fputs(
+"digraph G {\n"
+"   mclimit=1.5;\n"
+"   rankdir=TD; ordering=out;\n"
+"   graph[fontsize=10 fontname=\"Verdana\"];\n"
+"   color=\"#efefef\";\n"
+"   node[shape=box style=filled fontsize=8 fontname=\"Verdana\" fillcolor=\"#efefef\"];\n"
+"   edge[fontsize=8 fontname=\"Verdana\"];\n"
+ , dot);
+
+cfg_nodes = g_hash_table_new(NULL, NULL);
+fputs_preorder_walk(info->tbs, info->depth, dot, info->log_flags);
+g_hash_table_destroy(cfg_nodes);
+
+fputs("}\n\0", dot);
+fclose(dot

[Qemu-devel] [PATCH v5 08/10] Adding info [tbs|tb|coverset] commands to HMP. These commands allow the exploration of TBs generated by the TCG. Understand which one hotter, with more guest/host instruc

2019-08-14 Thread vandersonmr
The goal of this command is to allow the dynamic exploration
of TCG behavior and code quality. Therefore, for now, a
corresponding QMP command is not worthwhile.

Signed-off-by: Vanderson M. do Rosario 
---
 accel/tcg/tb-stats.c | 398 ++-
 accel/tcg/translate-all.c|   2 +-
 disas.c  |  31 ++-
 hmp-commands-info.hx |  24 +++
 include/exec/tb-stats.h  |  43 +++-
 include/qemu/log-for-trace.h |   4 +
 include/qemu/log.h   |   2 +
 monitor/misc.c   |  74 +++
 util/log.c   |  52 -
 9 files changed, 609 insertions(+), 21 deletions(-)

diff --git a/accel/tcg/tb-stats.c b/accel/tcg/tb-stats.c
index f28fd7b434..f5e519bdb7 100644
--- a/accel/tcg/tb-stats.c
+++ b/accel/tcg/tb-stats.c
@@ -11,9 +11,36 @@
 
 /* only accessed in safe work */
 static GList *last_search;
-
+int id = 1; /* display_id increment counter */
 uint64_t dev_time;
 
+static TBStatistics *get_tbstats_by_id(int id)
+{
+GList *iter;
+
+for (iter = last_search; iter; iter = g_list_next(iter)) {
+TBStatistics *tbs = iter->data;
+if (tbs && tbs->display_id == id) {
+return tbs;
+break;
+}
+}
+return NULL;
+}
+
+static TBStatistics *get_tbstats_by_addr(target_ulong pc)
+{
+GList *iter;
+for (iter = last_search; iter; iter = g_list_next(iter)) {
+TBStatistics *tbs = iter->data;
+if (tbs && tbs->pc == pc) {
+return tbs;
+break;
+}
+}
+return NULL;
+}
+
 struct jit_profile_info {
 uint64_t translations;
 uint64_t aborted;
@@ -155,6 +182,7 @@ static void clean_tbstats(void)
 qht_destroy(&tb_ctx.tb_stats);
 }
 
+
 void do_hmp_tbstats_safe(CPUState *cpu, run_on_cpu_data icmd)
 {
 struct TbstatsCommand *cmdinfo = icmd.host_ptr;
@@ -242,6 +270,374 @@ void init_tb_stats_htable_if_not(void)
 }
 }
 
+static void collect_tb_stats(void *p, uint32_t hash, void *userp)
+{
+last_search = g_list_prepend(last_search, p);
+}
+
+static void dump_tb_targets(TBStatistics *tbs)
+{
+if (tbs && tbs->tb) {
+uintptr_t dst1 = atomic_read(tbs->tb->jmp_dest);
+uintptr_t dst2 = atomic_read(tbs->tb->jmp_dest + 1);
+TranslationBlock* tb_dst1 = dst1 > 1 ? (TranslationBlock *) dst1 : 0;
+TranslationBlock* tb_dst2 = dst2 > 1 ? (TranslationBlock *) dst2 : 0;
+target_ulong pc1 = tb_dst1 ? tb_dst1->pc : 0;
+target_ulong pc2 = tb_dst2 ? tb_dst2->pc : 0;
+
+/* if there is no display id from the last_search, then create one */
+TBStatistics *tbstats_pc1 = get_tbstats_by_addr(pc1);
+TBStatistics *tbstats_pc2 = get_tbstats_by_addr(pc2);
+
+if (!tbstats_pc1 && tb_dst1 && tb_dst1->tb_stats) {
+last_search = g_list_append(last_search, tb_dst1->tb_stats);
+tbstats_pc1 = tb_dst1->tb_stats;
+}
+
+if (!tbstats_pc2 && tb_dst2 && tb_dst2->tb_stats) {
+last_search = g_list_append(last_search, tb_dst2->tb_stats);
+tbstats_pc2 = tb_dst2->tb_stats;
+}
+
+if (tbstats_pc1 && tbstats_pc1->display_id == 0) {
+tbstats_pc1->display_id = id++;
+}
+
+if (tbstats_pc2 && tbstats_pc2->display_id == 0) {
+tbstats_pc2->display_id = id++;
+}
+
+if (pc1 && !pc2) {
+qemu_log("\t| targets: 0x"TARGET_FMT_lx" (id:%d)\n",
+pc1, tb_dst1 ? tbstats_pc1->display_id : -1);
+} else if (pc1 && pc2) {
+qemu_log("\t| targets: 0x"TARGET_FMT_lx" (id:%d), "
+ "0x"TARGET_FMT_lx" (id:%d)\n",
+pc1, tb_dst1 ? tbstats_pc1->display_id : -1,
+pc2, tb_dst2 ? tbstats_pc2->display_id : -1);
+} else {
+qemu_log("\t| targets: no direct target\n");
+}
+}
+}
+
+static void dump_tb_header(TBStatistics *tbs)
+{
+unsigned g = stat_per_translation(tbs, code.num_guest_inst);
+unsigned ops = stat_per_translation(tbs, code.num_tcg_ops);
+unsigned ops_opt = stat_per_translation(tbs, code.num_tcg_ops_opt);
+unsigned spills = stat_per_translation(tbs, code.spills);
+unsigned h = stat_per_translation(tbs, code.out_len);
+
+float guest_host_prop = g ? ((float) h / g) : 0;
+
+qemu_log("TB id:%d | phys:0x"TB_PAGE_ADDR_FMT" virt:0x"TARGET_FMT_lx
+ " flags:%#08x\n", tbs->display_id, tbs->phys_pc, tbs->pc, tbs->flags);
+
+if (tbs_stats_enabled(tbs, TB_EXEC_STATS)) {
+qemu_log("\t| exec:%lu/%lu\n", tbs->executions.normal, tbs->executions.atomic);
+}
+
+if (tbs_stats_enabled(tbs, TB_JIT_STATS)) {
+qemu_log("\t| trans:%lu ints: g:%u op:%u op_opt:%u spills:%d"
+ "\n\t| h/g (host bytes / guest insts): %f\n",
+ tbs->translations.total, g, ops, ops_opt, spills, guest_host_prop);
+}
+
+if (tbs_stats_enabled(tbs, TB_JIT_TIME)) {
+qemu_log("\t| time

[Qemu-devel] [PATCH v5 00/10] Measure Tiny Code Generation Quality

2019-08-14 Thread vandersonmr
This patch is part of Google Summer of Code (GSoC) 2019.
More about the project can be found in:
https://wiki.qemu.org/Internships/ProjectIdeas/TCGCodeQuality

The goal of this patch is to add infrastructure to collect
execution and JIT statistics during emulation with accel/TCG.
The statistics are stored in TBStatistics structures (TBStats),
with each TB having its own TBStats.

We added -d tb_stats and the HMP tb_stats command to control
this statistics collection. The info tb, tbs, and coverset commands
were also added to allow dumping and exploring this information
while emulating.

Collecting these statistics is useful for understanding
QEMU performance and for helping to add trace support to QEMU.

v5:
 - full replacement of CONFIG_PROFILER
 - several fixes
 - adds "info cfg"
 - adds TB's targets to dump

vandersonmr (10):
  accel: introducing TBStatistics structure
  accel: collecting TB execution count
  accel: collecting JIT statistics
  accel: replacing part of CONFIG_PROFILER with TBStats
  accel: adding TB_JIT_TIME and full replacing CONFIG_PROFILER
  log: adding -d tb_stats to control tbstats
  monitor: adding tb_stats hmp command
  Adding info [tbs|tb|coverset] commands to HMP. These commands allow
the exploration of TBs generated by the TCG. Understand which ones are
hotter, with more guest/host instructions, and examine their
guest, host and IR code.
  monitor: adding new info cfg command
  linux-user: dumping hot TBs at the end of the execution

 accel/tcg/Makefile.objs  |   2 +-
 accel/tcg/cpu-exec.c |   4 +
 accel/tcg/perf/Makefile.objs |   1 +
 accel/tcg/tb-stats.c | 865 +++
 accel/tcg/tcg-runtime.c  |   7 +
 accel/tcg/tcg-runtime.h  |   2 +
 accel/tcg/translate-all.c| 133 --
 accel/tcg/translator.c   |   6 +
 configure|   3 -
 cpus.c   |  14 +-
 disas.c  |  31 +-
 hmp-commands-info.hx |  31 ++
 hmp-commands.hx  |  17 +
 include/exec/exec-all.h  |  15 +-
 include/exec/gen-icount.h|  10 +
 include/exec/tb-context.h|  12 +
 include/exec/tb-hash.h   |   7 +
 include/exec/tb-stats.h  | 142 ++
 include/qemu-common.h|  16 +
 include/qemu/log-for-trace.h |   4 +
 include/qemu/log.h   |   3 +
 include/qemu/timer.h |   5 +-
 linux-user/exit.c|   4 +
 monitor/misc.c   | 171 ++-
 tcg/tcg.c| 231 +++---
 tcg/tcg.h|  22 +-
 util/log.c   |  90 +++-
 vl.c |   8 +-
 28 files changed, 1572 insertions(+), 284 deletions(-)
 create mode 100644 accel/tcg/perf/Makefile.objs
 create mode 100644 accel/tcg/tb-stats.c
 create mode 100644 include/exec/tb-stats.h

-- 
2.22.0




[Qemu-devel] [PATCH v5 04/10] accel: replacing part of CONFIG_PROFILER with TBStats

2019-08-14 Thread vandersonmr
We add some of the statistics collected in TCGProfiler
to TBStats, so that the statistics are available not only for the
whole emulation but also for each TB. We then remove these stats
from TCGProfiler and reconstruct the information for
"info jit" from the sum of all TBStats statistics.

The goal is to have a single, better way of collecting
emulation statistics. Moreover, checking dynamically whether
profiling is enabled was shown to have an insignificant impact
on performance:
https://wiki.qemu.org/Internships/ProjectIdeas/TCGCodeQuality#Overheads.
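
The reconstruction described above amounts to a fold over the per-TB
records. The following is a minimal standalone sketch, not the code from
this patch: tb_stat, jit_totals, sum_tb_stats and avg_ops_per_tb are
hypothetical names, a plain array stands in for the hash table that the
patch walks with qht_iter(), and the per-translation value is assumed to
be the accumulated sum divided by the translation count.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-TB record; the real TBStatistics has more fields. */
struct tb_stat {
    uint64_t translations;  /* times this TB was translated */
    uint64_t num_tcg_ops;   /* TCG ops summed over all its translations */
};

/* Hypothetical whole-run totals, in the spirit of jit_profile_info. */
struct jit_totals {
    uint64_t translations;
    uint64_t ops;
    unsigned ops_max;       /* largest per-translation op count seen */
};

/* Fold every per-TB record into one set of totals. */
static struct jit_totals sum_tb_stats(const struct tb_stat *tbs, size_t n)
{
    struct jit_totals t = {0};

    for (size_t i = 0; i < n; i++) {
        t.translations += tbs[i].translations;
        t.ops += tbs[i].num_tcg_ops;
        if (tbs[i].translations) {
            /* Assumed definition: sum divided by translation count. */
            unsigned per = (unsigned)(tbs[i].num_tcg_ops /
                                      tbs[i].translations);
            if (per > t.ops_max) {
                t.ops_max = per;
            }
        }
    }
    return t;
}

/* Average ops per translation; 0 when nothing was translated. */
static double avg_ops_per_tb(const struct jit_totals *t)
{
    return t->translations ? (double)t->ops / (double)t->translations : 0.0;
}
```

With this shape the global "info jit" style numbers never need their own
counters; they are derived on demand from whatever per-TB data exists.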

Signed-off-by: Vanderson M. do Rosario 
---
 accel/tcg/tb-stats.c  | 95 +++
 accel/tcg/translate-all.c |  8 +---
 include/exec/tb-stats.h   | 11 +
 tcg/tcg.c | 93 +-
 tcg/tcg.h | 10 -
 5 files changed, 118 insertions(+), 99 deletions(-)

diff --git a/accel/tcg/tb-stats.c b/accel/tcg/tb-stats.c
index 3489133e9e..9b720d9b86 100644
--- a/accel/tcg/tb-stats.c
+++ b/accel/tcg/tb-stats.c
@@ -1,9 +1,104 @@
 #include "qemu/osdep.h"
 
 #include "disas/disas.h"
+#include "exec/exec-all.h"
+#include "tcg.h"
+
+#include "qemu/qemu-print.h"
 
 #include "exec/tb-stats.h"
 
+struct jit_profile_info {
+uint64_t translations;
+uint64_t aborted;
+uint64_t ops;
+unsigned ops_max;
+uint64_t del_ops;
+uint64_t temps;
+unsigned temps_max;
+uint64_t host;
+uint64_t guest;
+uint64_t search_data;
+};
+
+/* accumulate the statistics from all TBs */
+static void collect_jit_profile_info(void *p, uint32_t hash, void *userp)
+{
+struct jit_profile_info *jpi = userp;
+TBStatistics *tbs = p;
+
+jpi->translations += tbs->translations.total;
+jpi->ops += tbs->code.num_tcg_ops;
+if (stat_per_translation(tbs, code.num_tcg_ops) > jpi->ops_max) {
+jpi->ops_max = stat_per_translation(tbs, code.num_tcg_ops);
+}
+jpi->del_ops += tbs->code.deleted_ops;
+jpi->temps += tbs->code.temps;
+if (stat_per_translation(tbs, code.temps) > jpi->temps_max) {
+jpi->temps_max = stat_per_translation(tbs, code.temps);
+}
+jpi->host += tbs->code.out_len;
+jpi->guest += tbs->code.in_len;
+jpi->search_data += tbs->code.search_out_len;
+}
+
+/* dump JIT statistics using TCGProfile and TBStats */
+void dump_jit_profile_info(TCGProfile *s)
+{
+if (!tb_stats_collection_enabled()) {
+return;
+}
+
+struct jit_profile_info *jpi = g_new0(struct jit_profile_info, 1);
+
+qht_iter(&tb_ctx.tb_stats, collect_jit_profile_info, jpi);
+
+if (jpi->translations) {
+qemu_printf("translated TBs  %" PRId64 "\n", jpi->translations);
+qemu_printf("avg ops/TB  %0.1f max=%d\n",
+jpi->ops / (double) jpi->translations, jpi->ops_max);
+qemu_printf("deleted ops/TB  %0.2f\n",
+jpi->del_ops / (double) jpi->translations);
+qemu_printf("avg temps/TB%0.2f max=%d\n",
+jpi->temps / (double) jpi->translations, jpi->temps_max);
+qemu_printf("avg host code/TB%0.1f\n",
+jpi->host / (double) jpi->translations);
+qemu_printf("avg search data/TB  %0.1f\n",
+jpi->search_data / (double) jpi->translations);
+
+if (s) {
+int64_t tot = s->interm_time + s->code_time;
+qemu_printf("JIT cycles  %" PRId64 " (%0.3f s at 2.4 GHz)\n",
+tot, tot / 2.4e9);
+qemu_printf("cycles/op   %0.1f\n",
+jpi->ops ? (double)tot / jpi->ops : 0);
+qemu_printf("cycles/in byte  %0.1f\n",
+jpi->guest ? (double)tot / jpi->guest : 0);
+qemu_printf("cycles/out byte %0.1f\n",
+jpi->host ? (double)tot / jpi->host : 0);
+qemu_printf("cycles/search byte %0.1f\n",
+jpi->search_data ? (double)tot / jpi->search_data : 0);
+if (tot == 0) {
+tot = 1;
+}
+qemu_printf("  gen_interm time   %0.1f%%\n",
+(double)s->interm_time / tot * 100.0);
+qemu_printf("  gen_code time %0.1f%%\n",
+(double)s->code_time / tot * 100.0);
+qemu_printf("optim./code time%0.1f%%\n",
+(double)s->opt_time / (s->code_time ? s->code_time : 1)
+* 100.0);
+qemu_printf("liveness/code time  %0.1f%%\n",
+(double)s->la_time / (s->code_time ? s->code_time : 1) * 100.0);
+qemu_printf("cpu_restore count   %" PRId64 "\n",
+s->restore_count);
+qemu_printf("  avg cycles%0.1f\n",
+s->restore_count ? (double)s->restore_time / s->restore_count : 0);
+}
+}
+}
+
+
 void init_tb_stats_htable_if_not(void)
 {
 if (tb_stats_c

[Qemu-devel] [RFC PATCH v3 43/46] target/i386: introduce SSE2 instructions to sse-opcode.inc.h

2019-08-14 Thread Jan Bobek
Add all the SSE2 instruction entries to sse-opcode.inc.h.
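
For readers unfamiliar with this kind of table file: a list of
OPCODE(...) entries is typically consumed with the X-macro technique,
where the consumer defines the macro differently for each expansion.
How translate.c actually consumes sse-opcode.inc.h is not shown in this
mail, so the following is only a sketch of the general pattern. All
DEMO_* and demo_* names are hypothetical, and a list macro stands in
for the separate include file.

```c
#include <stddef.h>

/* Hypothetical opcode table; a real .inc.h file would hold these entries. */
#define DEMO_OPCODES(X) \
    X(movaps, 0x28)     \
    X(movups, 0x10)     \
    X(addps,  0x58)

/* Expansion 1: one enum constant per entry, plus a trailing count. */
#define AS_ENUM(name, byte) DEMO_OP_##name,
enum demo_op { DEMO_OPCODES(AS_ENUM) DEMO_OP_COUNT };

/* Expansion 2: a lookup table built from the same entries. */
struct demo_entry {
    const char *name;
    unsigned char byte;
};
#define AS_ENTRY(name, byte) { #name, byte },
static const struct demo_entry demo_table[] = { DEMO_OPCODES(AS_ENTRY) };

/* Return the mnemonic for an opcode byte, or NULL if it is not listed. */
static const char *demo_lookup(unsigned char byte)
{
    for (size_t i = 0; i < DEMO_OP_COUNT; i++) {
        if (demo_table[i].byte == byte) {
            return demo_table[i].name;
        }
    }
    return NULL;
}
```

The benefit is that adding an instruction means adding one table line;
every expansion picks it up without further edits.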

Signed-off-by: Jan Bobek 
---
 target/i386/sse-opcode.inc.h | 323 ++-
 1 file changed, 322 insertions(+), 1 deletion(-)

diff --git a/target/i386/sse-opcode.inc.h b/target/i386/sse-opcode.inc.h
index 39947aeb51..efa67b7ce2 100644
--- a/target/i386/sse-opcode.inc.h
+++ b/target/i386/sse-opcode.inc.h
@@ -43,241 +43,535 @@
 OPCODE(movd, LEG(NP, 0F, 0, 0x6e), MMX, WR, Pq, Ed)
 /* NP 0F 7E /r: MOVD r/m32,mm */
 OPCODE(movd, LEG(NP, 0F, 0, 0x7e), MMX, WR, Ed, Pq)
+/* 66 0F 6E /r: MOVD xmm,r/m32 */
+OPCODE(movd, LEG(66, 0F, 0, 0x6e), SSE2, WR, Vdq, Ed)
+/* 66 0F 7E /r: MOVD r/m32,xmm */
+OPCODE(movd, LEG(66, 0F, 0, 0x7e), SSE2, WR, Ed, Vdq)
 /* NP REX.W + 0F 6E /r: MOVQ mm,r/m64 */
 OPCODE(movq, LEG(NP, 0F, 1, 0x6e), MMX, WR, Pq, Eq)
 /* NP REX.W + 0F 7E /r: MOVQ r/m64,mm */
 OPCODE(movq, LEG(NP, 0F, 1, 0x7e), MMX, WR, Eq, Pq)
+/* 66 REX.W 0F 6E /r: MOVQ xmm,r/m64 */
+OPCODE(movq, LEG(66, 0F, 1, 0x6e), SSE2, WR, Vdq, Eq)
+/* 66 REX.W 0F 7E /r: MOVQ r/m64,xmm */
+OPCODE(movq, LEG(66, 0F, 1, 0x7e), SSE2, WR, Eq, Vdq)
 /* NP 0F 6F /r: MOVQ mm, mm/m64 */
 OPCODE(movq, LEG(NP, 0F, 0, 0x6f), MMX, WR, Pq, Qq)
 /* NP 0F 7F /r: MOVQ mm/m64, mm */
 OPCODE(movq, LEG(NP, 0F, 0, 0x7f), MMX, WR, Qq, Pq)
+/* F3 0F 7E /r: MOVQ xmm1, xmm2/m64 */
+OPCODE(movq, LEG(F3, 0F, 0, 0x7e), SSE2, WR, Vdq, Wq)
+/* 66 0F D6 /r: MOVQ xmm2/m64, xmm1 */
+OPCODE(movq, LEG(66, 0F, 0, 0xd6), SSE2, WR, UdqMq, Vq)
 /* NP 0F 28 /r: MOVAPS xmm1, xmm2/m128 */
 OPCODE(movaps, LEG(NP, 0F, 0, 0x28), SSE, WR, Vdq, Wdq)
 /* NP 0F 29 /r: MOVAPS xmm2/m128, xmm1 */
 OPCODE(movaps, LEG(NP, 0F, 0, 0x29), SSE, WR, Wdq, Vdq)
+/* 66 0F 28 /r: MOVAPD xmm1, xmm2/m128 */
+OPCODE(movapd, LEG(66, 0F, 0, 0x28), SSE2, WR, Vdq, Wdq)
+/* 66 0F 29 /r: MOVAPD xmm2/m128, xmm1 */
+OPCODE(movapd, LEG(66, 0F, 0, 0x29), SSE2, WR, Wdq, Vdq)
+/* 66 0F 6F /r: MOVDQA xmm1, xmm2/m128 */
+OPCODE(movdqa, LEG(66, 0F, 0, 0x6f), SSE2, WR, Vdq, Wdq)
+/* 66 0F 7F /r: MOVDQA xmm2/m128, xmm1 */
+OPCODE(movdqa, LEG(66, 0F, 0, 0x7f), SSE2, WR, Wdq, Vdq)
 /* NP 0F 10 /r: MOVUPS xmm1, xmm2/m128 */
 OPCODE(movups, LEG(NP, 0F, 0, 0x10), SSE, WR, Vdq, Wdq)
 /* NP 0F 11 /r: MOVUPS xmm2/m128, xmm1 */
 OPCODE(movups, LEG(NP, 0F, 0, 0x11), SSE, WR, Wdq, Vdq)
+/* 66 0F 10 /r: MOVUPD xmm1, xmm2/m128 */
+OPCODE(movupd, LEG(66, 0F, 0, 0x10), SSE2, WR, Vdq, Wdq)
+/* 66 0F 11 /r: MOVUPD xmm2/m128, xmm1 */
+OPCODE(movupd, LEG(66, 0F, 0, 0x11), SSE2, WR, Wdq, Vdq)
+/* F3 0F 6F /r: MOVDQU xmm1,xmm2/m128 */
+OPCODE(movdqu, LEG(F3, 0F, 0, 0x6f), SSE2, WR, Vdq, Wdq)
+/* F3 0F 7F /r: MOVDQU xmm2/m128,xmm1 */
+OPCODE(movdqu, LEG(F3, 0F, 0, 0x7f), SSE2, WR, Wdq, Vdq)
 /* F3 0F 10 /r: MOVSS xmm1, xmm2/m32 */
 OPCODE(movss, LEG(F3, 0F, 0, 0x10), SSE, WRRR, Vdq, Vdq, Wd, modrm_mod)
 /* F3 0F 11 /r: MOVSS xmm2/m32, xmm1 */
 OPCODE(movss, LEG(F3, 0F, 0, 0x11), SSE, WR, Wd, Vd)
+/* F2 0F 10 /r: MOVSD xmm1, xmm2/m64 */
+OPCODE(movsd, LEG(F2, 0F, 0, 0x10), SSE2, WRRR, Vdq, Vdq, Wq, modrm_mod)
+/* F2 0F 11 /r: MOVSD xmm1/m64, xmm2 */
+OPCODE(movsd, LEG(F2, 0F, 0, 0x11), SSE2, WR, Wq, Vq)
+/* F3 0F D6 /r: MOVQ2DQ xmm, mm */
+OPCODE(movq2dq, LEG(F3, 0F, 0, 0xd6), SSE2, WR, Vdq, Nq)
+/* F2 0F D6 /r: MOVDQ2Q mm, xmm */
+OPCODE(movdq2q, LEG(F2, 0F, 0, 0xd6), SSE2, WR, Pq, Uq)
 /* NP 0F 12 /r: MOVHLPS xmm1, xmm2 */
 /* NP 0F 12 /r: MOVLPS xmm1, m64 */
 OPCODE(movhlps, LEG(NP, 0F, 0, 0x12), SSE, WR, Vq, UdqMhq)
 /* 0F 13 /r: MOVLPS m64, xmm1 */
 OPCODE(movlps, LEG(NP, 0F, 0, 0x13), SSE, WR, Mq, Vq)
+/* 66 0F 12 /r: MOVLPD xmm1,m64 */
+OPCODE(movlpd, LEG(66, 0F, 0, 0x12), SSE2, WR, Vq, Mq)
+/* 66 0F 13 /r: MOVLPD m64,xmm1 */
+OPCODE(movlpd, LEG(66, 0F, 0, 0x13), SSE2, WR, Mq, Vq)
 /* NP 0F 16 /r: MOVLHPS xmm1, xmm2 */
 /* NP 0F 16 /r: MOVHPS xmm1, m64 */
 OPCODE(movlhps, LEG(NP, 0F, 0, 0x16), SSE, WRR, Vdq, Vq, Wq)
 /* NP 0F 17 /r: MOVHPS m64, xmm1 */
 OPCODE(movhps, LEG(NP, 0F, 0, 0x17), SSE, WR, Mq, Vdq)
+/* 66 0F 16 /r: MOVHPD xmm1, m64 */
+OPCODE(movhpd, LEG(66, 0F, 0, 0x16), SSE2, WRR, Vdq, Vd, Mq)
+/* 66 0F 17 /r: MOVHPD m64, xmm1 */
+OPCODE(movhpd, LEG(66, 0F, 0, 0x17), SSE2, WR, Mq, Vdq)
 /* NP 0F D7 /r: PMOVMSKB r32, mm */
 OPCODE(pmovmskb, LEG(NP, 0F, 0, 0xd7), SSE, WR, Gd, Nq)
 /* NP REX.W 0F D7 /r: PMOVMSKB r64, mm */
 OPCODE(pmovmskb, LEG(NP, 0F, 1, 0xd7), SSE, WR, Gq, Nq)
+/* 66 0F D7 /r: PMOVMSKB r32, xmm */
+OPCODE(pmovmskb, LEG(66, 0F, 0, 0xd7), SSE2, WR, Gd, Udq)
+/* 66 REX.W 0F D7 /r: PMOVMSKB r64, xmm */
+OPCODE(pmovmskb, LEG(66, 0F, 1, 0xd7), SSE2, WR, Gq, Udq)
 /* NP 0F 50 /r: MOVMSKPS r32, xmm */
 OPCODE(movmskps, LEG(NP, 0F, 0, 0x50), SSE, WR, Gd, Udq)
 /* NP REX.W 0F 50 /r: MOVMSKPS r64, xmm */
 OPCODE(movmskps, LEG(NP, 0F, 1, 0x50), SSE, WR, Gq, Udq)
+/* 66 0F 50 /r: MOVMSKPD r32, xmm */
+OPCODE(movmskpd, LEG(66, 0F, 0, 0x50), SSE2, WR, Gd, Udq)
+/* 66 REX.W 0F 50 /r: MOVMSKPD r64, xmm */
+OPCODE(movmskpd, LEG(66, 0F, 1, 0x50), SSE2, WR, Gq, Udq)
 /* NP 0F FC /r: PADDB mm, mm/m64 */
 OPCODE(paddb, LEG(NP, 0F, 0, 0xfc), MM

[Qemu-devel] [PATCH v5 02/10] accel: collecting TB execution count

2019-08-14 Thread vandersonmr
If a TB has a TBS (TBStatistics) with TB_EXEC_STATS
enabled, we instrument the start of the TB's generated code
to atomically count the number of times it is executed.
We count both the "normal" executions and the atomic
executions of a TB.

The execution count of the TB is stored in its respective
TBS.

All TBStatistics are created by default with the flags from
default_tbstats_flag.
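
The counting scheme can be sketched in standalone C11 as follows. This
is an illustration only: exec_counts and the three functions are
hypothetical names, C11 <stdatomic.h> stands in for QEMU's own atomic
helpers, and the relaxed memory ordering is an assumption of this
sketch rather than something the patch specifies.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the executions field of TBStatistics. */
struct exec_counts {
    atomic_ulong normal;      /* bumped from generated code, by any vCPU */
    unsigned long exclusive;  /* bumped only inside the exclusive region */
};

/* Normal execution: several vCPUs may race, so use an atomic add. */
static void count_normal_exec(struct exec_counts *c)
{
    atomic_fetch_add_explicit(&c->normal, 1, memory_order_relaxed);
}

/* Atomic-step execution: other vCPUs are stopped, a plain add suffices. */
static void count_exclusive_exec(struct exec_counts *c)
{
    c->exclusive++;
}

/* Total number of executions recorded so far. */
static unsigned long total_execs(struct exec_counts *c)
{
    return atomic_load_explicit(&c->normal, memory_order_relaxed) +
           c->exclusive;
}
```

The split mirrors the patch: the helper called from generated code uses
an atomic increment, while cpu_exec_step_atomic() increments a plain
field after start_exclusive().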

Signed-off-by: Vanderson M. do Rosario 
---
 accel/tcg/cpu-exec.c  |  4 
 accel/tcg/tb-stats.c  |  5 +
 accel/tcg/tcg-runtime.c   |  7 +++
 accel/tcg/tcg-runtime.h   |  2 ++
 accel/tcg/translate-all.c |  7 +++
 accel/tcg/translator.c|  1 +
 include/exec/gen-icount.h |  9 +
 include/exec/tb-stats.h   | 19 +++
 util/log.c|  1 +
 9 files changed, 55 insertions(+)

diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index 6c85c3ee1e..e54be69499 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -252,6 +252,10 @@ void cpu_exec_step_atomic(CPUState *cpu)
 
 start_exclusive();
 
+if (tb_stats_enabled(tb, TB_EXEC_STATS)) {
+tb->tb_stats->executions.atomic++;
+}
+
 /* Since we got here, we know that parallel_cpus must be true.  */
 parallel_cpus = false;
 in_exclusive_region = true;
diff --git a/accel/tcg/tb-stats.c b/accel/tcg/tb-stats.c
index 02844717cb..3489133e9e 100644
--- a/accel/tcg/tb-stats.c
+++ b/accel/tcg/tb-stats.c
@@ -37,3 +37,8 @@ bool tb_stats_collection_paused(void)
 {
 return tcg_collect_tb_stats == TB_STATS_PAUSED;
 }
+
+uint32_t get_default_tbstats_flag(void)
+{
+return default_tbstats_flag;
+}
diff --git a/accel/tcg/tcg-runtime.c b/accel/tcg/tcg-runtime.c
index 8a1e408e31..6f4aafba11 100644
--- a/accel/tcg/tcg-runtime.c
+++ b/accel/tcg/tcg-runtime.c
@@ -167,3 +167,10 @@ void HELPER(exit_atomic)(CPUArchState *env)
 {
 cpu_loop_exit_atomic(env_cpu(env), GETPC());
 }
+
+void HELPER(inc_exec_freq)(void *ptr)
+{
+TBStatistics *stats = (TBStatistics *) ptr;
+g_assert(stats);
+atomic_inc(&stats->executions.normal);
+}
diff --git a/accel/tcg/tcg-runtime.h b/accel/tcg/tcg-runtime.h
index 4fa61b49b4..bf0b75dbe8 100644
--- a/accel/tcg/tcg-runtime.h
+++ b/accel/tcg/tcg-runtime.h
@@ -28,6 +28,8 @@ DEF_HELPER_FLAGS_1(lookup_tb_ptr, TCG_CALL_NO_WG_SE, ptr, env)
 
 DEF_HELPER_FLAGS_1(exit_atomic, TCG_CALL_NO_WG, noreturn, env)
 
+DEF_HELPER_FLAGS_1(inc_exec_freq, TCG_CALL_NO_RWG, void, ptr)
+
 #ifdef CONFIG_SOFTMMU
 
 DEF_HELPER_FLAGS_5(atomic_cmpxchgb, TCG_CALL_NO_WG,
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index b7bccacd3b..df08d183df 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -1785,6 +1785,13 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
  */
 if (tb_stats_collection_enabled()) {
 tb->tb_stats = tb_get_stats(phys_pc, pc, cs_base, flags, tb);
+uint32_t flag = get_default_tbstats_flag();
+
+if (qemu_log_in_addr_range(tb->pc)) {
+if (flag & TB_EXEC_STATS) {
+tb->tb_stats->stats_enabled |= TB_EXEC_STATS;
+}
+}
 } else {
 tb->tb_stats = NULL;
 }
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index 9226a348a3..396a11e828 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -46,6 +46,7 @@ void translator_loop(const TranslatorOps *ops, DisasContextBase *db,
 
 ops->init_disas_context(db, cpu);
 tcg_debug_assert(db->is_jmp == DISAS_NEXT);  /* no early exit */
+gen_tb_exec_count(tb);
 
 /* Reset the temp count so that we can identify leaks */
 tcg_clear_temp_count();
diff --git a/include/exec/gen-icount.h b/include/exec/gen-icount.h
index f7669b6841..b3efe41894 100644
--- a/include/exec/gen-icount.h
+++ b/include/exec/gen-icount.h
@@ -7,6 +7,15 @@
 
 static TCGOp *icount_start_insn;
 
+static inline void gen_tb_exec_count(TranslationBlock *tb)
+{
+if (tb_stats_enabled(tb, TB_EXEC_STATS)) {
+TCGv_ptr ptr = tcg_const_ptr(tb->tb_stats);
+gen_helper_inc_exec_freq(ptr);
+tcg_temp_free_ptr(ptr);
+}
+}
+
 static inline void gen_tb_start(TranslationBlock *tb)
 {
 TCGv_i32 count, imm;
diff --git a/include/exec/tb-stats.h b/include/exec/tb-stats.h
index cc8f8a6ce6..0265050b79 100644
--- a/include/exec/tb-stats.h
+++ b/include/exec/tb-stats.h
@@ -6,6 +6,9 @@
 #include "exec/tb-context.h"
 #include "tcg.h"
 
+#define tb_stats_enabled(tb, JIT_STATS) \
+(tb && tb->tb_stats && (tb->tb_stats->stats_enabled & JIT_STATS))
+
 typedef struct TBStatistics TBStatistics;
 
 /*
@@ -22,6 +25,15 @@ struct TBStatistics {
 uint32_t flags;
 /* cs_base isn't included in the hash but we do check for matches */
 target_ulong cs_base;
+
+uint32_t stats_enabled;
+
+/* Execution stats */
+struct {
+unsigned long normal;
+unsigned long atomic;
+} executions;
+
 /* current TB linked to this TBStatistics */
   

[Qemu-devel] [RFC PATCH v3 42/46] target/i386: introduce SSE2 code generators

2019-08-14 Thread Jan Bobek
Introduce code generators required by SSE2 instructions.
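
As background for the movq/movsd generators below: MOVQ xmm, r/m64 and
the memory-source form of MOVSD clear the upper 64 bits of the
destination, whereas register-to-register MOVSD keeps them. The
following plain-C model illustrates that distinction. It is not TCG
code; struct xmm and the three functions are hypothetical names used
only for this sketch.

```c
#include <stdint.h>

/* Hypothetical model of one XMM register as two 64-bit lanes. */
struct xmm {
    uint64_t q[2];  /* q[0] = bits 63:0, q[1] = bits 127:64 */
};

/* Zero-extending move: write the low lane, clear the high lane. */
static void mov_zero_extend(struct xmm *dst, uint64_t src)
{
    dst->q[0] = src;
    dst->q[1] = 0;
}

/* Merging move: replace the low lane, keep the high lane. */
static void mov_merge(struct xmm *dst, const struct xmm *src)
{
    dst->q[0] = src->q[0];
}

/*
 * MOVSD xmm1, xmm2/m64: ModRM mod == 3 selects the register form
 * (merge); any other value is a memory operand (zero-extend).
 * For the memory case, src->q[0] models the 64-bit value loaded.
 */
static void movsd_load(struct xmm *dst, const struct xmm *src, int modrm_mod)
{
    if (modrm_mod == 3) {
        mov_merge(dst, src);
    } else {
        mov_zero_extend(dst, src->q[0]);
    }
}
```

This is the reason the four-operand movsd generator takes modrm_mod as
an explicit argument: the same opcode has two different effects on the
upper half of the destination.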

Signed-off-by: Jan Bobek 
---
 target/i386/translate.c | 444 +++-
 1 file changed, 442 insertions(+), 2 deletions(-)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 177bedd0ef..7ec082e79d 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -5391,6 +5391,21 @@ GEN_INSN2(movd, Ed, Pq)
 tcg_gen_ld_i32(arg1, cpu_env, arg2 + ofs);
 }
 
+GEN_INSN2(movq, Vdq, Eq);   /* forward declaration */
+GEN_INSN2(movd, Vdq, Ed)
+{
+const insnop_arg_t(Eq) arg2_r64 = tcg_temp_new_i64();
+tcg_gen_extu_i32_i64(arg2_r64, arg2);
+gen_insn2(movq, Vdq, Eq)(env, s, arg1, arg2_r64);
+tcg_temp_free_i64(arg2_r64);
+}
+
+GEN_INSN2(movd, Ed, Vdq)
+{
+const insnop_arg_t(Vdq) ofs = offsetof(ZMMReg, ZMM_L(0));
+tcg_gen_ld_i32(arg1, cpu_env, arg2 + ofs);
+}
+
 GEN_INSN2(movq, Pq, Eq)
 {
 const insnop_arg_t(Pq) ofs = offsetof(MMXReg, MMX_Q(0));
@@ -5403,12 +5418,53 @@ GEN_INSN2(movq, Eq, Pq)
 tcg_gen_ld_i64(arg1, cpu_env, arg2 + ofs);
 }
 
+GEN_INSN2(movq, Vdq, Eq)
+{
+const insnop_arg_t(Vdq) ofs0 = offsetof(ZMMReg, ZMM_Q(0));
+tcg_gen_st_i64(arg2, cpu_env, arg1 + ofs0);
+
+const insnop_arg_t(Vdq) ofs1 = offsetof(ZMMReg, ZMM_Q(1));
+tcg_gen_movi_i64(arg2, 0);
+tcg_gen_st_i64(arg2, cpu_env, arg1 + ofs1);
+}
+
+GEN_INSN2(movq, Eq, Vdq)
+{
+const insnop_arg_t(Vdq) ofs = offsetof(ZMMReg, ZMM_Q(0));
+tcg_gen_ld_i64(arg1, cpu_env, arg2 + ofs);
+}
+
 DEF_GEN_INSN2_GVEC_MM(movq, mov, Pq, Qq, MO_64)
 DEF_GEN_INSN2_GVEC_MM(movq, mov, Qq, Pq, MO_64)
+
+GEN_INSN2(movq, Vdq, Wq)
+{
+const insnop_arg_t(Vdq) dofs = offsetof(ZMMReg, ZMM_Q(0));
+const insnop_arg_t(Wq) aofs = offsetof(ZMMReg, ZMM_Q(0));
+gen_op_movq(s, arg1 + dofs, arg2 + aofs);
+
+const TCGv_i64 r64z = tcg_const_i64(0);
+tcg_gen_st_i64(r64z, cpu_env, arg1 + offsetof(ZMMReg, ZMM_Q(1)));
+tcg_temp_free_i64(r64z);
+}
+
+GEN_INSN2(movq, UdqMq, Vq)
+{
+gen_insn2(movq, Vdq, Wq)(env, s, arg1, arg2);
+}
+
 DEF_GEN_INSN2_GVEC_XMM(movaps, mov, Vdq, Wdq, MO_64)
 DEF_GEN_INSN2_GVEC_XMM(movaps, mov, Wdq, Vdq, MO_64)
+DEF_GEN_INSN2_GVEC_XMM(movapd, mov, Vdq, Wdq, MO_64)
+DEF_GEN_INSN2_GVEC_XMM(movapd, mov, Wdq, Vdq, MO_64)
+DEF_GEN_INSN2_GVEC_XMM(movdqa, mov, Vdq, Wdq, MO_64)
+DEF_GEN_INSN2_GVEC_XMM(movdqa, mov, Wdq, Vdq, MO_64)
 DEF_GEN_INSN2_GVEC_XMM(movups, mov, Vdq, Wdq, MO_64)
 DEF_GEN_INSN2_GVEC_XMM(movups, mov, Wdq, Vdq, MO_64)
+DEF_GEN_INSN2_GVEC_XMM(movupd, mov, Vdq, Wdq, MO_64)
+DEF_GEN_INSN2_GVEC_XMM(movupd, mov, Wdq, Vdq, MO_64)
+DEF_GEN_INSN2_GVEC_XMM(movdqu, mov, Vdq, Wdq, MO_64)
+DEF_GEN_INSN2_GVEC_XMM(movdqu, mov, Wdq, Vdq, MO_64)
 
 GEN_INSN2(movss, Wd, Vd);   /* forward declaration */
 GEN_INSN4(movss, Vdq, Vdq, Wd, modrm_mod)
@@ -5442,6 +5498,44 @@ GEN_INSN2(movss, Wd, Vd)
 gen_op_movl(s, arg1 + dofs, arg2 + aofs);
 }
 
+GEN_INSN2(movsd, Wq, Vq);   /* forward declaration */
+GEN_INSN4(movsd, Vdq, Vdq, Wq, modrm_mod)
+{
+assert(arg1 == arg2);
+
+if (arg4 == 3) {
+/* merging movsd */
+gen_insn2(movsd, Wq, Vq)(env, s, arg1, arg3);
+} else {
+/* zero-extending movsd */
+gen_insn2(movq, Vdq, Wq)(env, s, arg1, arg3);
+}
+}
+
+GEN_INSN2(movsd, Wq, Vq)
+{
+const size_t ofs = offsetof(ZMMReg, ZMM_Q(0));
+gen_op_movq(s, arg1 + ofs, arg2 + ofs);
+}
+
+GEN_INSN2(movq2dq, Vdq, Nq)
+{
+const insnop_arg_t(Vdq) dofs = offsetof(ZMMReg, ZMM_Q(0));
+const insnop_arg_t(Nq) aofs = offsetof(MMXReg, MMX_Q(0));
+gen_op_movq(s, arg1 + dofs, arg2 + aofs);
+
+const TCGv_i64 r64z = tcg_const_i64(0);
+tcg_gen_st_i64(r64z, cpu_env, arg1 + offsetof(ZMMReg, ZMM_Q(1)));
+tcg_temp_free_i64(r64z);
+}
+
+GEN_INSN2(movdq2q, Pq, Uq)
+{
+const insnop_arg_t(Pq) dofs = offsetof(MMXReg, MMX_Q(0));
+const insnop_arg_t(Uq) aofs = offsetof(ZMMReg, ZMM_Q(0));
+gen_op_movq(s, arg1 + dofs, arg2 + aofs);
+}
+
 GEN_INSN2(movhlps, Vq, UdqMhq)
 {
 const size_t dofs = offsetof(ZMMReg, ZMM_Q(0));
@@ -5455,6 +5549,17 @@ GEN_INSN2(movlps, Mq, Vq)
 gen_stq_env_A0(s, arg2 + offsetof(ZMMReg, ZMM_Q(0)));
 }
 
+GEN_INSN2(movlpd, Vq, Mq)
+{
+assert(arg2 == s->A0);
+gen_ldq_env_A0(s, arg1 + offsetof(ZMMReg, ZMM_Q(0)));
+}
+
+GEN_INSN2(movlpd, Mq, Vq)
+{
+gen_insn2(movlps, Mq, Vq)(env, s, arg1, arg2);
+}
+
 GEN_INSN3(movlhps, Vdq, Vq, Wq)
 {
 assert(arg1 == arg2);
@@ -5470,6 +5575,18 @@ GEN_INSN2(movhps, Mq, Vdq)
 gen_stq_env_A0(s, arg2 + offsetof(ZMMReg, ZMM_Q(1)));
 }
 
+GEN_INSN3(movhpd, Vdq, Vd, Mq)
+{
+assert(arg1 == arg2);
+assert(arg3 == s->A0);
+gen_ldq_env_A0(s, arg1 + offsetof(ZMMReg, ZMM_Q(1)));
+}
+
+GEN_INSN2(movhpd, Mq, Vdq)
+{
+gen_insn2(movhps, Mq, Vdq)(env, s, arg1, arg2);
+}
+
 DEF_GEN_INSN2_HELPER_DEP(pmovmskb, pmovmskb_mmx, Gd, Nq)
 
 GEN_INSN2(pmovmskb, Gq, Nq)
@@ -5480,6 +5597,16 @@ GEN_INSN2(pmovmskb, Gq, Nq)
 tcg_temp_free_i32(arg1_r32);
 }
 
+DEF_GEN_

[Qemu-devel] [RFC PATCH v3 44/46] target/i386: introduce SSE3 translators

2019-08-14 Thread Jan Bobek
Use the translator macros to define translators required by SSE3
instructions.

Signed-off-by: Jan Bobek 
---
 target/i386/translate.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 7ec082e79d..c72138014a 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -6363,6 +6363,7 @@ DEF_TRANSLATE_INSN2(Vd, Wd)
 DEF_TRANSLATE_INSN2(Vd, Wq)
 DEF_TRANSLATE_INSN2(Vdq, Ed)
 DEF_TRANSLATE_INSN2(Vdq, Eq)
+DEF_TRANSLATE_INSN2(Vdq, Mdq)
 DEF_TRANSLATE_INSN2(Vdq, Nq)
 DEF_TRANSLATE_INSN2(Vdq, Qq)
 DEF_TRANSLATE_INSN2(Vdq, Udq)
-- 
2.20.1




[Qemu-devel] [RFC PATCH v3 38/46] target/i386: introduce SSE translators

2019-08-14 Thread Jan Bobek
Use the translator macros to define translators required by SSE
instructions.
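
The DEF_TRANSLATE_INSNn macros stamp out one function per operand-type
combination, with the function name built by token pasting. A reduced
sketch of that pattern follows. The demo_insn2 and DEF_DEMO_INSN2
macros are hypothetical, and the generated bodies just return a string
so the example stays self-contained; the real translators decode and
load operands instead.

```c
/* Hypothetical name-mangling macro, in the style of translate_insn2(). */
#define demo_insn2(opT1, opT2) demo_insn2_##opT1##_##opT2

/* Hypothetical generator macro: one function per operand-type pair. */
#define DEF_DEMO_INSN2(opT1, opT2)                    \
    static const char *demo_insn2(opT1, opT2)(void)   \
    {                                                 \
        return #opT1 "," #opT2;                       \
    }

/* Each line defines a distinct function, e.g. demo_insn2_Vdq_Wdq(). */
DEF_DEMO_INSN2(Vdq, Wdq)
DEF_DEMO_INSN2(Gd, Udq)
```

A call site can then refer to a specific instance with the same macro,
for example demo_insn2(Vdq, Wdq)(), without spelling out the mangled
name.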

Signed-off-by: Jan Bobek 
---
 target/i386/translate.c | 29 +
 1 file changed, 29 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index a02e9cd0d2..ef64fe606f 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -5533,6 +5533,9 @@ static void translate_insn0()(
 }   \
 }
 
+DEF_TRANSLATE_INSN1(Mb)
+DEF_TRANSLATE_INSN1(Md)
+
 #define DEF_TRANSLATE_INSN2(opT1, opT2) \
 static void translate_insn2(opT1, opT2)(\
 CPUX86State *env, DisasContext *s, int modrm,   \
@@ -5571,11 +5574,29 @@ static void translate_insn0()(
 DEF_TRANSLATE_INSN2(Ed, Pq)
 DEF_TRANSLATE_INSN2(Eq, Pq)
 DEF_TRANSLATE_INSN2(Gd, Nq)
+DEF_TRANSLATE_INSN2(Gd, Udq)
+DEF_TRANSLATE_INSN2(Gd, Wd)
 DEF_TRANSLATE_INSN2(Gq, Nq)
+DEF_TRANSLATE_INSN2(Gq, Udq)
+DEF_TRANSLATE_INSN2(Gq, Wd)
+DEF_TRANSLATE_INSN2(Mdq, Vdq)
+DEF_TRANSLATE_INSN2(Mq, Pq)
+DEF_TRANSLATE_INSN2(Mq, Vdq)
+DEF_TRANSLATE_INSN2(Mq, Vq)
 DEF_TRANSLATE_INSN2(Pq, Ed)
 DEF_TRANSLATE_INSN2(Pq, Eq)
+DEF_TRANSLATE_INSN2(Pq, Nq)
 DEF_TRANSLATE_INSN2(Pq, Qq)
+DEF_TRANSLATE_INSN2(Pq, Wq)
 DEF_TRANSLATE_INSN2(Qq, Pq)
+DEF_TRANSLATE_INSN2(Vd, Ed)
+DEF_TRANSLATE_INSN2(Vd, Eq)
+DEF_TRANSLATE_INSN2(Vd, Wd)
+DEF_TRANSLATE_INSN2(Vdq, Qq)
+DEF_TRANSLATE_INSN2(Vdq, Wdq)
+DEF_TRANSLATE_INSN2(Vq, UdqMhq)
+DEF_TRANSLATE_INSN2(Wd, Vd)
+DEF_TRANSLATE_INSN2(Wdq, Vdq)
 
 #define DEF_TRANSLATE_INSN3(opT1, opT2, opT3)   \
 static void translate_insn3(opT1, opT2, opT3)(  \
@@ -5627,6 +5648,9 @@ DEF_TRANSLATE_INSN3(Nq, Nq, Ib)
 DEF_TRANSLATE_INSN3(Pq, Pq, Qd)
 DEF_TRANSLATE_INSN3(Pq, Pq, Qq)
 DEF_TRANSLATE_INSN3(Pq, Qq, Ib)
+DEF_TRANSLATE_INSN3(Vd, Vd, Wd)
+DEF_TRANSLATE_INSN3(Vdq, Vdq, Wdq)
+DEF_TRANSLATE_INSN3(Vdq, Vq, Wq)
 
 #define DEF_TRANSLATE_INSN4(opT1, opT2, opT3, opT4) \
 static void translate_insn4(opT1, opT2, opT3, opT4)(\
@@ -5680,6 +5704,11 @@ DEF_TRANSLATE_INSN3(Pq, Qq, Ib)
 }   \
 }
 
+DEF_TRANSLATE_INSN4(Pq, Pq, RdMw, Ib)
+DEF_TRANSLATE_INSN4(Vd, Vd, Wd, Ib)
+DEF_TRANSLATE_INSN4(Vdq, Vdq, Wd, modrm_mod)
+DEF_TRANSLATE_INSN4(Vdq, Vdq, Wdq, Ib)
+
 #define OPCODE_GRP_BEGIN(grpname)   \
 static void translate_group(grpname)(   \
 CPUX86State *env, DisasContext *s, int modrm)   \
-- 
2.20.1




[Qemu-devel] [RFC PATCH v3 40/46] target/i386: introduce SSE instructions to sse-opcode.inc.h

2019-08-14 Thread Jan Bobek
Add all the SSE instruction entries to sse-opcode.inc.h.

Signed-off-by: Jan Bobek 
---
 target/i386/sse-opcode.inc.h | 158 +++
 1 file changed, 158 insertions(+)

diff --git a/target/i386/sse-opcode.inc.h b/target/i386/sse-opcode.inc.h
index 36963e5a7c..39947aeb51 100644
--- a/target/i386/sse-opcode.inc.h
+++ b/target/i386/sse-opcode.inc.h
@@ -51,6 +51,36 @@ OPCODE(movq, LEG(NP, 0F, 1, 0x7e), MMX, WR, Eq, Pq)
 OPCODE(movq, LEG(NP, 0F, 0, 0x6f), MMX, WR, Pq, Qq)
 /* NP 0F 7F /r: MOVQ mm/m64, mm */
 OPCODE(movq, LEG(NP, 0F, 0, 0x7f), MMX, WR, Qq, Pq)
+/* NP 0F 28 /r: MOVAPS xmm1, xmm2/m128 */
+OPCODE(movaps, LEG(NP, 0F, 0, 0x28), SSE, WR, Vdq, Wdq)
+/* NP 0F 29 /r: MOVAPS xmm2/m128, xmm1 */
+OPCODE(movaps, LEG(NP, 0F, 0, 0x29), SSE, WR, Wdq, Vdq)
+/* NP 0F 10 /r: MOVUPS xmm1, xmm2/m128 */
+OPCODE(movups, LEG(NP, 0F, 0, 0x10), SSE, WR, Vdq, Wdq)
+/* NP 0F 11 /r: MOVUPS xmm2/m128, xmm1 */
+OPCODE(movups, LEG(NP, 0F, 0, 0x11), SSE, WR, Wdq, Vdq)
+/* F3 0F 10 /r: MOVSS xmm1, xmm2/m32 */
+OPCODE(movss, LEG(F3, 0F, 0, 0x10), SSE, WRRR, Vdq, Vdq, Wd, modrm_mod)
+/* F3 0F 11 /r: MOVSS xmm2/m32, xmm1 */
+OPCODE(movss, LEG(F3, 0F, 0, 0x11), SSE, WR, Wd, Vd)
+/* NP 0F 12 /r: MOVHLPS xmm1, xmm2 */
+/* NP 0F 12 /r: MOVLPS xmm1, m64 */
+OPCODE(movhlps, LEG(NP, 0F, 0, 0x12), SSE, WR, Vq, UdqMhq)
+/* 0F 13 /r: MOVLPS m64, xmm1 */
+OPCODE(movlps, LEG(NP, 0F, 0, 0x13), SSE, WR, Mq, Vq)
+/* NP 0F 16 /r: MOVLHPS xmm1, xmm2 */
+/* NP 0F 16 /r: MOVHPS xmm1, m64 */
+OPCODE(movlhps, LEG(NP, 0F, 0, 0x16), SSE, WRR, Vdq, Vq, Wq)
+/* NP 0F 17 /r: MOVHPS m64, xmm1 */
+OPCODE(movhps, LEG(NP, 0F, 0, 0x17), SSE, WR, Mq, Vdq)
+/* NP 0F D7 /r: PMOVMSKB r32, mm */
+OPCODE(pmovmskb, LEG(NP, 0F, 0, 0xd7), SSE, WR, Gd, Nq)
+/* NP REX.W 0F D7 /r: PMOVMSKB r64, mm */
+OPCODE(pmovmskb, LEG(NP, 0F, 1, 0xd7), SSE, WR, Gq, Nq)
+/* NP 0F 50 /r: MOVMSKPS r32, xmm */
+OPCODE(movmskps, LEG(NP, 0F, 0, 0x50), SSE, WR, Gd, Udq)
+/* NP REX.W 0F 50 /r: MOVMSKPS r64, xmm */
+OPCODE(movmskps, LEG(NP, 0F, 1, 0x50), SSE, WR, Gq, Udq)
 /* NP 0F FC /r: PADDB mm, mm/m64 */
 OPCODE(paddb, LEG(NP, 0F, 0, 0xfc), MMX, WRR, Pq, Pq, Qq)
 /* NP 0F FD /r: PADDW mm, mm/m64 */
@@ -65,6 +95,10 @@ OPCODE(paddsw, LEG(NP, 0F, 0, 0xed), MMX, WRR, Pq, Pq, Qq)
 OPCODE(paddusb, LEG(NP, 0F, 0, 0xdc), MMX, WRR, Pq, Pq, Qq)
 /* NP 0F DD /r: PADDUSW mm,mm/m64 */
 OPCODE(paddusw, LEG(NP, 0F, 0, 0xdd), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F 58 /r: ADDPS xmm1, xmm2/m128 */
+OPCODE(addps, LEG(NP, 0F, 0, 0x58), SSE, WRR, Vdq, Vdq, Wdq)
+/* F3 0F 58 /r: ADDSS xmm1, xmm2/m32 */
+OPCODE(addss, LEG(F3, 0F, 0, 0x58), SSE, WRR, Vd, Vd, Wd)
 /* NP 0F F8 /r: PSUBB mm, mm/m64 */
 OPCODE(psubb, LEG(NP, 0F, 0, 0xf8), MMX, WRR, Pq, Pq, Qq)
 /* NP 0F F9 /r: PSUBW mm, mm/m64 */
@@ -79,12 +113,60 @@ OPCODE(psubsw, LEG(NP, 0F, 0, 0xe9), MMX, WRR, Pq, Pq, Qq)
 OPCODE(psubusb, LEG(NP, 0F, 0, 0xd8), MMX, WRR, Pq, Pq, Qq)
 /* NP 0F D9 /r: PSUBUSW mm, mm/m64 */
 OPCODE(psubusw, LEG(NP, 0F, 0, 0xd9), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F 5C /r: SUBPS xmm1, xmm2/m128 */
+OPCODE(subps, LEG(NP, 0F, 0, 0x5c), SSE, WRR, Vdq, Vdq, Wdq)
+/* F3 0F 5C /r: SUBSS xmm1, xmm2/m32 */
+OPCODE(subss, LEG(F3, 0F, 0, 0x5c), SSE, WRR, Vd, Vd, Wd)
 /* NP 0F D5 /r: PMULLW mm, mm/m64 */
 OPCODE(pmullw, LEG(NP, 0F, 0, 0xd5), MMX, WRR, Pq, Pq, Qq)
 /* NP 0F E5 /r: PMULHW mm, mm/m64 */
 OPCODE(pmulhw, LEG(NP, 0F, 0, 0xe5), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F E4 /r: PMULHUW mm1, mm2/m64 */
+OPCODE(pmulhuw, LEG(NP, 0F, 0, 0xe4), SSE, WRR, Pq, Pq, Qq)
+/* NP 0F 59 /r: MULPS xmm1, xmm2/m128 */
+OPCODE(mulps, LEG(NP, 0F, 0, 0x59), SSE, WRR, Vdq, Vdq, Wdq)
+/* F3 0F 59 /r: MULSS xmm1,xmm2/m32 */
+OPCODE(mulss, LEG(F3, 0F, 0, 0x59), SSE, WRR, Vd, Vd, Wd)
 /* NP 0F F5 /r: PMADDWD mm, mm/m64 */
 OPCODE(pmaddwd, LEG(NP, 0F, 0, 0xf5), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F 5E /r: DIVPS xmm1, xmm2/m128 */
+OPCODE(divps, LEG(NP, 0F, 0, 0x5e), SSE, WRR, Vdq, Vdq, Wdq)
+/* F3 0F 5E /r: DIVSS xmm1, xmm2/m32 */
+OPCODE(divss, LEG(F3, 0F, 0, 0x5e), SSE, WRR, Vd, Vd, Wd)
+/* NP 0F 53 /r: RCPPS xmm1, xmm2/m128 */
+OPCODE(rcpps, LEG(NP, 0F, 0, 0x53), SSE, WR, Vdq, Wdq)
+/* F3 0F 53 /r: RCPSS xmm1, xmm2/m32 */
+OPCODE(rcpss, LEG(F3, 0F, 0, 0x53), SSE, WR, Vd, Wd)
+/* NP 0F 51 /r: SQRTPS xmm1, xmm2/m128 */
+OPCODE(sqrtps, LEG(NP, 0F, 0, 0x51), SSE, WR, Vdq, Wdq)
+/* F3 0F 51 /r: SQRTSS xmm1, xmm2/m32 */
+OPCODE(sqrtss, LEG(F3, 0F, 0, 0x51), SSE, WR, Vd, Wd)
+/* NP 0F 52 /r: RSQRTPS xmm1, xmm2/m128 */
+OPCODE(rsqrtps, LEG(NP, 0F, 0, 0x52), SSE, WR, Vdq, Wdq)
+/* F3 0F 52 /r: RSQRTSS xmm1, xmm2/m32 */
+OPCODE(rsqrtss, LEG(F3, 0F, 0, 0x52), SSE, WR, Vd, Wd)
+/* NP 0F DA /r: PMINUB mm1, mm2/m64 */
+OPCODE(pminub, LEG(NP, 0F, 0, 0xda), SSE, WRR, Pq, Pq, Qq)
+/* NP 0F EA /r: PMINSW mm1, mm2/m64 */
+OPCODE(pminsw, LEG(NP, 0F, 0, 0xea), SSE, WRR, Pq, Pq, Qq)
+/* NP 0F 5D /r: MINPS xmm1, xmm2/m128 */
+OPCODE(minps, LEG(NP, 0F, 0, 0x5d), SSE, WRR, Vdq, Vdq, Wdq)
+/* F3 0F 5D /r: MINSS xmm1,xmm2/m32 */
+OPCODE(minss, LEG(F3, 0F, 0, 0x5d), SSE, WRR, Vd, Vd, Wd)
+/

[Qemu-devel] [RFC PATCH v3 41/46] target/i386: introduce SSE2 translators

2019-08-14 Thread Jan Bobek
Use the translator macros to define translators required by SSE2
instructions.

Signed-off-by: Jan Bobek 
---
 target/i386/translate.c | 33 +
 1 file changed, 33 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 3d526ee470..177bedd0ef 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -5891,14 +5891,20 @@ DEF_TRANSLATE_INSN1(Md)
 }
 
 DEF_TRANSLATE_INSN2(Ed, Pq)
+DEF_TRANSLATE_INSN2(Ed, Vdq)
 DEF_TRANSLATE_INSN2(Eq, Pq)
+DEF_TRANSLATE_INSN2(Eq, Vdq)
 DEF_TRANSLATE_INSN2(Gd, Nq)
 DEF_TRANSLATE_INSN2(Gd, Udq)
 DEF_TRANSLATE_INSN2(Gd, Wd)
+DEF_TRANSLATE_INSN2(Gd, Wq)
 DEF_TRANSLATE_INSN2(Gq, Nq)
 DEF_TRANSLATE_INSN2(Gq, Udq)
 DEF_TRANSLATE_INSN2(Gq, Wd)
+DEF_TRANSLATE_INSN2(Gq, Wq)
+DEF_TRANSLATE_INSN2(Md, Gd)
 DEF_TRANSLATE_INSN2(Mdq, Vdq)
+DEF_TRANSLATE_INSN2(Mq, Gq)
 DEF_TRANSLATE_INSN2(Mq, Pq)
 DEF_TRANSLATE_INSN2(Mq, Vdq)
 DEF_TRANSLATE_INSN2(Mq, Vq)
@@ -5906,16 +5912,33 @@ DEF_TRANSLATE_INSN2(Pq, Ed)
 DEF_TRANSLATE_INSN2(Pq, Eq)
 DEF_TRANSLATE_INSN2(Pq, Nq)
 DEF_TRANSLATE_INSN2(Pq, Qq)
+DEF_TRANSLATE_INSN2(Pq, Uq)
+DEF_TRANSLATE_INSN2(Pq, Wdq)
 DEF_TRANSLATE_INSN2(Pq, Wq)
 DEF_TRANSLATE_INSN2(Qq, Pq)
+DEF_TRANSLATE_INSN2(UdqMq, Vq)
 DEF_TRANSLATE_INSN2(Vd, Ed)
 DEF_TRANSLATE_INSN2(Vd, Eq)
 DEF_TRANSLATE_INSN2(Vd, Wd)
+DEF_TRANSLATE_INSN2(Vd, Wq)
+DEF_TRANSLATE_INSN2(Vdq, Ed)
+DEF_TRANSLATE_INSN2(Vdq, Eq)
+DEF_TRANSLATE_INSN2(Vdq, Nq)
 DEF_TRANSLATE_INSN2(Vdq, Qq)
+DEF_TRANSLATE_INSN2(Vdq, Udq)
 DEF_TRANSLATE_INSN2(Vdq, Wdq)
+DEF_TRANSLATE_INSN2(Vdq, Wq)
+DEF_TRANSLATE_INSN2(Vq, Ed)
+DEF_TRANSLATE_INSN2(Vq, Eq)
+DEF_TRANSLATE_INSN2(Vq, Mq)
 DEF_TRANSLATE_INSN2(Vq, UdqMhq)
+DEF_TRANSLATE_INSN2(Vq, Wd)
+DEF_TRANSLATE_INSN2(Vq, Wq)
 DEF_TRANSLATE_INSN2(Wd, Vd)
 DEF_TRANSLATE_INSN2(Wdq, Vdq)
+DEF_TRANSLATE_INSN2(Wq, Vq)
+DEF_TRANSLATE_INSN2(Wq, Wd)
+DEF_TRANSLATE_INSN2(modrm_mod, modrm)
 
 #define DEF_TRANSLATE_INSN3(opT1, opT2, opT3)   \
 static void translate_insn3(opT1, opT2, opT3)(  \
@@ -5962,14 +5985,21 @@ DEF_TRANSLATE_INSN2(Wdq, Vdq)
 }
 
 DEF_TRANSLATE_INSN3(Gd, Nq, Ib)
+DEF_TRANSLATE_INSN3(Gd, Udq, Ib)
 DEF_TRANSLATE_INSN3(Gq, Nq, Ib)
+DEF_TRANSLATE_INSN3(Gq, Udq, Ib)
 DEF_TRANSLATE_INSN3(Nq, Nq, Ib)
 DEF_TRANSLATE_INSN3(Pq, Pq, Qd)
 DEF_TRANSLATE_INSN3(Pq, Pq, Qq)
 DEF_TRANSLATE_INSN3(Pq, Qq, Ib)
+DEF_TRANSLATE_INSN3(Udq, Udq, Ib)
 DEF_TRANSLATE_INSN3(Vd, Vd, Wd)
+DEF_TRANSLATE_INSN3(Vdq, Vd, Mq)
+DEF_TRANSLATE_INSN3(Vdq, Vdq, Mq)
 DEF_TRANSLATE_INSN3(Vdq, Vdq, Wdq)
 DEF_TRANSLATE_INSN3(Vdq, Vq, Wq)
+DEF_TRANSLATE_INSN3(Vdq, Wdq, Ib)
+DEF_TRANSLATE_INSN3(Vq, Vq, Wq)
 
 #define DEF_TRANSLATE_INSN4(opT1, opT2, opT3, opT4) \
 static void translate_insn4(opT1, opT2, opT3, opT4)(\
@@ -6025,8 +6055,11 @@ DEF_TRANSLATE_INSN3(Vdq, Vq, Wq)
 
 DEF_TRANSLATE_INSN4(Pq, Pq, RdMw, Ib)
 DEF_TRANSLATE_INSN4(Vd, Vd, Wd, Ib)
+DEF_TRANSLATE_INSN4(Vdq, Vdq, RdMw, Ib)
 DEF_TRANSLATE_INSN4(Vdq, Vdq, Wd, modrm_mod)
 DEF_TRANSLATE_INSN4(Vdq, Vdq, Wdq, Ib)
+DEF_TRANSLATE_INSN4(Vdq, Vdq, Wq, modrm_mod)
+DEF_TRANSLATE_INSN4(Vq, Vq, Wq, Ib)
 
 #define OPCODE_GRP_BEGIN(grpname)   \
 static void translate_group(grpname)(   \
-- 
2.20.1




[Qemu-devel] [RFC PATCH v3 37/46] target/i386: introduce MMX instructions to sse-opcode.inc.h

2019-08-14 Thread Jan Bobek
Add all MMX instruction entries to sse-opcode.inc.h.

Signed-off-by: Jan Bobek 
---
 target/i386/sse-opcode.inc.h | 131 +++
 1 file changed, 131 insertions(+)

diff --git a/target/i386/sse-opcode.inc.h b/target/i386/sse-opcode.inc.h
index c5e81a6a80..36963e5a7c 100644
--- a/target/i386/sse-opcode.inc.h
+++ b/target/i386/sse-opcode.inc.h
@@ -39,6 +39,137 @@
 #   define OPCODE_GRP_END(grpname)
 #endif /* OPCODE_GRP_END */
 
+/* NP 0F 6E /r: MOVD mm,r/m32 */
+OPCODE(movd, LEG(NP, 0F, 0, 0x6e), MMX, WR, Pq, Ed)
+/* NP 0F 7E /r: MOVD r/m32,mm */
+OPCODE(movd, LEG(NP, 0F, 0, 0x7e), MMX, WR, Ed, Pq)
+/* NP REX.W + 0F 6E /r: MOVQ mm,r/m64 */
+OPCODE(movq, LEG(NP, 0F, 1, 0x6e), MMX, WR, Pq, Eq)
+/* NP REX.W + 0F 7E /r: MOVQ r/m64,mm */
+OPCODE(movq, LEG(NP, 0F, 1, 0x7e), MMX, WR, Eq, Pq)
+/* NP 0F 6F /r: MOVQ mm, mm/m64 */
+OPCODE(movq, LEG(NP, 0F, 0, 0x6f), MMX, WR, Pq, Qq)
+/* NP 0F 7F /r: MOVQ mm/m64, mm */
+OPCODE(movq, LEG(NP, 0F, 0, 0x7f), MMX, WR, Qq, Pq)
+/* NP 0F FC /r: PADDB mm, mm/m64 */
+OPCODE(paddb, LEG(NP, 0F, 0, 0xfc), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F FD /r: PADDW mm, mm/m64 */
+OPCODE(paddw, LEG(NP, 0F, 0, 0xfd), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F FE /r: PADDD mm, mm/m64 */
+OPCODE(paddd, LEG(NP, 0F, 0, 0xfe), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F EC /r: PADDSB mm, mm/m64 */
+OPCODE(paddsb, LEG(NP, 0F, 0, 0xec), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F ED /r: PADDSW mm, mm/m64 */
+OPCODE(paddsw, LEG(NP, 0F, 0, 0xed), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F DC /r: PADDUSB mm,mm/m64 */
+OPCODE(paddusb, LEG(NP, 0F, 0, 0xdc), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F DD /r: PADDUSW mm,mm/m64 */
+OPCODE(paddusw, LEG(NP, 0F, 0, 0xdd), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F F8 /r: PSUBB mm, mm/m64 */
+OPCODE(psubb, LEG(NP, 0F, 0, 0xf8), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F F9 /r: PSUBW mm, mm/m64 */
+OPCODE(psubw, LEG(NP, 0F, 0, 0xf9), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F FA /r: PSUBD mm, mm/m64 */
+OPCODE(psubd, LEG(NP, 0F, 0, 0xfa), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F E8 /r: PSUBSB mm, mm/m64 */
+OPCODE(psubsb, LEG(NP, 0F, 0, 0xe8), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F E9 /r: PSUBSW mm, mm/m64 */
+OPCODE(psubsw, LEG(NP, 0F, 0, 0xe9), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F D8 /r: PSUBUSB mm, mm/m64 */
+OPCODE(psubusb, LEG(NP, 0F, 0, 0xd8), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F D9 /r: PSUBUSW mm, mm/m64 */
+OPCODE(psubusw, LEG(NP, 0F, 0, 0xd9), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F D5 /r: PMULLW mm, mm/m64 */
+OPCODE(pmullw, LEG(NP, 0F, 0, 0xd5), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F E5 /r: PMULHW mm, mm/m64 */
+OPCODE(pmulhw, LEG(NP, 0F, 0, 0xe5), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F F5 /r: PMADDWD mm, mm/m64 */
+OPCODE(pmaddwd, LEG(NP, 0F, 0, 0xf5), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F 74 /r: PCMPEQB mm,mm/m64 */
+OPCODE(pcmpeqb, LEG(NP, 0F, 0, 0x74), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F 75 /r: PCMPEQW mm,mm/m64 */
+OPCODE(pcmpeqw, LEG(NP, 0F, 0, 0x75), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F 76 /r: PCMPEQD mm,mm/m64 */
+OPCODE(pcmpeqd, LEG(NP, 0F, 0, 0x76), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F 64 /r: PCMPGTB mm,mm/m64 */
+OPCODE(pcmpgtb, LEG(NP, 0F, 0, 0x64), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F 65 /r: PCMPGTW mm,mm/m64 */
+OPCODE(pcmpgtw, LEG(NP, 0F, 0, 0x65), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F 66 /r: PCMPGTD mm,mm/m64 */
+OPCODE(pcmpgtd, LEG(NP, 0F, 0, 0x66), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F DB /r: PAND mm, mm/m64 */
+OPCODE(pand, LEG(NP, 0F, 0, 0xdb), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F DF /r: PANDN mm, mm/m64 */
+OPCODE(pandn, LEG(NP, 0F, 0, 0xdf), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F EB /r: POR mm, mm/m64 */
+OPCODE(por, LEG(NP, 0F, 0, 0xeb), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F EF /r: PXOR mm, mm/m64 */
+OPCODE(pxor, LEG(NP, 0F, 0, 0xef), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F F1 /r: PSLLW mm, mm/m64 */
+OPCODE(psllw, LEG(NP, 0F, 0, 0xf1), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F F2 /r: PSLLD mm, mm/m64 */
+OPCODE(pslld, LEG(NP, 0F, 0, 0xf2), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F F3 /r: PSLLQ mm, mm/m64 */
+OPCODE(psllq, LEG(NP, 0F, 0, 0xf3), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F D1 /r: PSRLW mm, mm/m64 */
+OPCODE(psrlw, LEG(NP, 0F, 0, 0xd1), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F D2 /r: PSRLD mm, mm/m64 */
+OPCODE(psrld, LEG(NP, 0F, 0, 0xd2), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F D3 /r: PSRLQ mm, mm/m64 */
+OPCODE(psrlq, LEG(NP, 0F, 0, 0xd3), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F E1 /r: PSRAW mm,mm/m64 */
+OPCODE(psraw, LEG(NP, 0F, 0, 0xe1), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F E2 /r: PSRAD mm,mm/m64 */
+OPCODE(psrad, LEG(NP, 0F, 0, 0xe2), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F 63 /r: PACKSSWB mm1, mm2/m64 */
+OPCODE(packsswb, LEG(NP, 0F, 0, 0x63), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F 6B /r: PACKSSDW mm1, mm2/m64 */
+OPCODE(packssdw, LEG(NP, 0F, 0, 0x6b), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F 67 /r: PACKUSWB mm, mm/m64 */
+OPCODE(packuswb, LEG(NP, 0F, 0, 0x67), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F 68 /r: PUNPCKHBW mm, mm/m64 */
+OPCODE(punpckhbw, LEG(NP, 0F, 0, 0x68), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F 69 /r: PUNPCKHWD mm, mm/m64 */
+OPCODE(punpckhwd, LEG(NP, 0F, 0, 0x69), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F 6A /r: PUNPCKHDQ mm, mm/m64 */
+OPCODE(punpckhdq, LEG(NP, 0F, 0, 0x6a), M

[Qemu-devel] [PATCH v5 06/10] log: adding -d tb_stats to control tbstats

2019-08-14 Thread vandersonmr
Adding -d tb_stats to control TBStatistics collection:

 -d tb_stats[[,level=(+all+jit+exec+time)][,dump_limit=]]

"dump_limit" is used to limit the number of dumped TBStats in
linux-user mode.

[all+jit+exec+time] controls the profiling level used
by the TBStats. It can be used as follows:

-d tb_stats
-d tb_stats,level=jit+time
-d tb_stats,dump_limit=15
...
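
To make the "level=" syntax above more concrete, here is a minimal sketch of how a value such as "jit+time" could be folded into the TB_* flag bits. The function name parse_tbstats_level is hypothetical and is not taken from this patch. The flag values are the ones defined in this series. Treating "all" as the union of the three flags, and silently ignoring unknown tokens, are assumptions made only for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Flag values as defined by this series. */
#define TB_NOTHING    0
#define TB_EXEC_STATS 1
#define TB_JIT_STATS  2
#define TB_JIT_TIME   4

/*
 * Hypothetical helper (not part of the patch): convert a '+'-separated
 * level string such as "jit+time" into a bitmask of TB_* flags.
 * Unknown tokens are ignored; that policy is an assumption.
 */
static uint32_t parse_tbstats_level(const char *level)
{
    uint32_t flags = TB_NOTHING;
    const char *p = level;

    assert(level != NULL);
    while (*p) {
        /* Length of the current token, up to the next '+' or the end. */
        size_t len = strcspn(p, "+");

        if (len == 3 && strncmp(p, "jit", len) == 0) {
            flags |= TB_JIT_STATS;
        } else if (len == 4 && strncmp(p, "exec", len) == 0) {
            flags |= TB_EXEC_STATS;
        } else if (len == 4 && strncmp(p, "time", len) == 0) {
            flags |= TB_JIT_TIME;
        } else if (len == 3 && strncmp(p, "all", len) == 0) {
            flags |= TB_EXEC_STATS | TB_JIT_STATS | TB_JIT_TIME;
        }
        p += len;
        if (*p == '+') {
            p++;    /* skip the separator */
        }
    }
    return flags;
}
```

With this sketch, "jit+time" maps to TB_JIT_STATS | TB_JIT_TIME, and an empty string leaves the mask at TB_NOTHING.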

Signed-off-by: Vanderson M. do Rosario 
---
 accel/tcg/tb-stats.c  |  1 +
 accel/tcg/translator.c|  1 +
 include/exec/gen-icount.h |  1 +
 include/exec/tb-stats.h   | 15 ---
 include/qemu-common.h | 15 +++
 include/qemu/log.h|  1 +
 tcg/tcg.c |  1 +
 util/log.c| 35 +++
 8 files changed, 55 insertions(+), 15 deletions(-)

diff --git a/accel/tcg/tb-stats.c b/accel/tcg/tb-stats.c
index 2bb1fde837..dddb9d4537 100644
--- a/accel/tcg/tb-stats.c
+++ b/accel/tcg/tb-stats.c
@@ -3,6 +3,7 @@
 #include "disas/disas.h"
 #include "exec/exec-all.h"
 #include "tcg.h"
+#include "qemu-common.h"
 
 #include "qemu/qemu-print.h"
 
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index 834265d5be..ea7c3a9f77 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -16,6 +16,7 @@
 #include "exec/gen-icount.h"
 #include "exec/log.h"
 #include "exec/translator.h"
+#include "qemu-common.h"
 
 /* Pairs with tcg_clear_temp_count.
To be called by #TranslatorOps.{translate_insn,tb_stop} if
diff --git a/include/exec/gen-icount.h b/include/exec/gen-icount.h
index b3efe41894..6f54586dd6 100644
--- a/include/exec/gen-icount.h
+++ b/include/exec/gen-icount.h
@@ -2,6 +2,7 @@
 #define GEN_ICOUNT_H
 
 #include "qemu/timer.h"
+#include "qemu-common.h"
 
 /* Helpers for instruction counting code generation.  */
 
diff --git a/include/exec/tb-stats.h b/include/exec/tb-stats.h
index 1dcfcdf9e8..c0948e606a 100644
--- a/include/exec/tb-stats.h
+++ b/include/exec/tb-stats.h
@@ -79,21 +79,6 @@ void init_tb_stats_htable_if_not(void);
 void dump_jit_profile_info(TCGProfile *s);
 
 /* TBStatistic collection controls */
-enum TBStatsStatus {
-TB_STATS_DISABLED = 0,
-TB_STATS_RUNNING,
-TB_STATS_PAUSED,
-TB_STATS_STOPPED
-};
-
-#define TB_NOTHING0
-#define TB_EXEC_STATS 1
-#define TB_JIT_STATS  2
-#define TB_JIT_TIME   4
-
-extern int tcg_collect_tb_stats;
-extern uint32_t default_tbstats_flag;
-
 void enable_collect_tb_stats(void);
 void disable_collect_tb_stats(void);
 void pause_collect_tb_stats(void);
diff --git a/include/qemu-common.h b/include/qemu-common.h
index 0235cd3b91..362e48c445 100644
--- a/include/qemu-common.h
+++ b/include/qemu-common.h
@@ -130,4 +130,19 @@ void page_size_init(void);
  * returned. */
 bool dump_in_progress(void);
 
+enum TBStatsStatus {
+TB_STATS_DISABLED = 0,
+TB_STATS_RUNNING,
+TB_STATS_PAUSED,
+TB_STATS_STOPPED
+};
+
+#define TB_NOTHING0
+#define TB_EXEC_STATS 1
+#define TB_JIT_STATS  2
+#define TB_JIT_TIME   4
+
+extern int tcg_collect_tb_stats;
+extern uint32_t default_tbstats_flag;
+
 #endif
diff --git a/include/qemu/log.h b/include/qemu/log.h
index b097a6cae1..a8d1997cde 100644
--- a/include/qemu/log.h
+++ b/include/qemu/log.h
@@ -45,6 +45,7 @@ static inline bool qemu_log_separate(void)
 /* LOG_TRACE (1 << 15) is defined in log-for-trace.h */
 #define CPU_LOG_TB_OP_IND  (1 << 16)
 #define CPU_LOG_TB_FPU (1 << 17)
+#define CPU_LOG_TB_STATS   (1 << 18)
 
 /* Lock output for a series of related logs.  Since this is not needed
  * for a single qemu_log / qemu_log_mask / qemu_log_mask_and_addr, we
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 1cd07c6c47..c6c9c938dc 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -35,6 +35,7 @@
 #include "qemu/host-utils.h"
 #include "qemu/qemu-print.h"
 #include "qemu/timer.h"
+#include "qemu-common.h"
 
 /* Note: the long term plan is to reduce the dependencies on the QEMU
CPU definitions. Currently they are used for qemu_ld/st
diff --git a/util/log.c b/util/log.c
index 29021a4584..09cfb13b45 100644
--- a/util/log.c
+++ b/util/log.c
@@ -19,17 +19,20 @@
 
 #include "qemu/osdep.h"
 #include "qemu/log.h"
+#include "qemu/qemu-print.h"
 #include "qemu/range.h"
 #include "qemu/error-report.h"
 #include "qapi/error.h"
 #include "qemu/cutils.h"
 #include "trace/control.h"
+#include "qemu-common.h"
 
 static char *logfilename;
 FILE *qemu_logfile;
 int qemu_loglevel;
 static int log_append = 0;
 static GArray *debug_regions;
+int32_t max_num_hot_tbs_to_dump;
 
 int tcg_collect_tb_stats;
 uint32_t default_tbstats_flag;
@@ -276,6 +279,9 @@ const QEMULogItem qemu_log_items[] = {
 { CPU_LOG_TB_NOCHAIN, "nochain",
   "do not chain compiled TBs so that \"exec\" and \"cpu\" show\n"
   "complete traces" },
+{ CPU_LOG_TB_STATS, "tb_stats[[,level=(+all+jit+exec+time)][,dump_limit=]]",
+  "enable collection of TBs statistics "
+  "(and dump until given a limit if in user mode).\n" },
 { 0, NULL, NULL },
 };
 
@@ -297,6 +303,35 @@ int qemu_str_to_log_mask(cons

[Qemu-devel] [RFC PATCH v3 39/46] target/i386: introduce SSE code generators

2019-08-14 Thread Jan Bobek
Introduce code generators required by SSE instructions.

Signed-off-by: Jan Bobek 
---
 target/i386/translate.c | 319 
 1 file changed, 319 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index ef64fe606f..3d526ee470 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -5360,6 +5360,9 @@ INSNOP_LDST(xmm_t0, Mhq)
 #define DEF_GEN_INSN2_GVEC_MM(mnem, gvec, opT1, opT2, vece)   \
 DEF_GEN_INSN2_GVEC(mnem, gvec, opT1, opT2, vece,  \
sizeof(MMXReg), sizeof(MMXReg))
+#define DEF_GEN_INSN2_GVEC_XMM(mnem, gvec, opT1, opT2, vece)  \
+DEF_GEN_INSN2_GVEC(mnem, gvec, opT1, opT2, vece,  \
+   sizeof(XMMReg), sizeof(XMMReg))
 
 #define DEF_GEN_INSN3_GVEC(mnem, gvec, opT1, opT2, opT3, vece, oprsz, maxsz) \
 GEN_INSN3(mnem, opT1, opT2, opT3)   \
@@ -5369,6 +5372,9 @@ INSNOP_LDST(xmm_t0, Mhq)
 #define DEF_GEN_INSN3_GVEC_MM(mnem, gvec, opT1, opT2, opT3, vece)   \
 DEF_GEN_INSN3_GVEC(mnem, gvec, opT1, opT2, opT3, vece,  \
sizeof(MMXReg), sizeof(MMXReg))
+#define DEF_GEN_INSN3_GVEC_XMM(mnem, gvec, opT1, opT2, opT3, vece)  \
+DEF_GEN_INSN3_GVEC(mnem, gvec, opT1, opT2, opT3, vece,  \
+   sizeof(XMMReg), sizeof(XMMReg))
 
 GEN_INSN2(movq, Pq, Eq);/* forward declaration */
 GEN_INSN2(movd, Pq, Ed)
@@ -5399,6 +5405,90 @@ GEN_INSN2(movq, Eq, Pq)
 
 DEF_GEN_INSN2_GVEC_MM(movq, mov, Pq, Qq, MO_64)
 DEF_GEN_INSN2_GVEC_MM(movq, mov, Qq, Pq, MO_64)
+DEF_GEN_INSN2_GVEC_XMM(movaps, mov, Vdq, Wdq, MO_64)
+DEF_GEN_INSN2_GVEC_XMM(movaps, mov, Wdq, Vdq, MO_64)
+DEF_GEN_INSN2_GVEC_XMM(movups, mov, Vdq, Wdq, MO_64)
+DEF_GEN_INSN2_GVEC_XMM(movups, mov, Wdq, Vdq, MO_64)
+
+GEN_INSN2(movss, Wd, Vd);   /* forward declaration */
+GEN_INSN4(movss, Vdq, Vdq, Wd, modrm_mod)
+{
+assert(arg1 == arg2);
+
+if (arg4 == 3) {
+/* merging movss */
+gen_insn2(movss, Wd, Vd)(env, s, arg1, arg3);
+} else {
+/* zero-extending movss */
+const TCGv_i32 r32 = tcg_temp_new_i32();
+const TCGv_i64 r64 = tcg_temp_new_i64();
+
+tcg_gen_ld_i32(r32, cpu_env, arg3 + offsetof(ZMMReg, ZMM_L(0)));
+tcg_gen_extu_i32_i64(r64, r32);
+tcg_gen_st_i64(r64, cpu_env, arg1 + offsetof(ZMMReg, ZMM_Q(0)));
+
+tcg_gen_movi_i64(r64, 0);
+tcg_gen_st_i64(r64, cpu_env, arg1 + offsetof(ZMMReg, ZMM_Q(1)));
+
+tcg_temp_free_i32(r32);
+tcg_temp_free_i64(r64);
+}
+}
+
+GEN_INSN2(movss, Wd, Vd)
+{
+const insnop_arg_t(Wd) dofs = offsetof(ZMMReg, ZMM_L(0));
+const insnop_arg_t(Vd) aofs = offsetof(ZMMReg, ZMM_L(0));
+gen_op_movl(s, arg1 + dofs, arg2 + aofs);
+}
+
+GEN_INSN2(movhlps, Vq, UdqMhq)
+{
+const size_t dofs = offsetof(ZMMReg, ZMM_Q(0));
+const size_t aofs = offsetof(ZMMReg, ZMM_Q(1));
+gen_op_movq(s, arg1 + dofs, arg2 + aofs);
+}
+
+GEN_INSN2(movlps, Mq, Vq)
+{
+assert(arg1 == s->A0);
+gen_stq_env_A0(s, arg2 + offsetof(ZMMReg, ZMM_Q(0)));
+}
+
+GEN_INSN3(movlhps, Vdq, Vq, Wq)
+{
+assert(arg1 == arg2);
+
+const size_t dofs = offsetof(ZMMReg, ZMM_Q(1));
+const size_t aofs = offsetof(ZMMReg, ZMM_Q(0));
+gen_op_movq(s, arg1 + dofs, arg3 + aofs);
+}
+
+GEN_INSN2(movhps, Mq, Vdq)
+{
+assert(arg1 == s->A0);
+gen_stq_env_A0(s, arg2 + offsetof(ZMMReg, ZMM_Q(1)));
+}
+
+DEF_GEN_INSN2_HELPER_DEP(pmovmskb, pmovmskb_mmx, Gd, Nq)
+
+GEN_INSN2(pmovmskb, Gq, Nq)
+{
+const TCGv_i32 arg1_r32 = tcg_temp_new_i32();
+gen_insn2(pmovmskb, Gd, Nq)(env, s, arg1_r32, arg2);
+tcg_gen_extu_i32_i64(arg1, arg1_r32);
+tcg_temp_free_i32(arg1_r32);
+}
+
+DEF_GEN_INSN2_HELPER_DEP(movmskps, movmskps, Gd, Udq)
+
+GEN_INSN2(movmskps, Gq, Udq)
+{
+const TCGv_i32 arg1_r32 = tcg_temp_new_i32();
+gen_insn2(movmskps, Gd, Udq)(env, s, arg1_r32, arg2);
+tcg_gen_extu_i32_i64(arg1, arg1_r32);
+tcg_temp_free_i32(arg1_r32);
+}
 
 DEF_GEN_INSN3_GVEC_MM(paddb, add, Pq, Pq, Qq, MO_8)
 DEF_GEN_INSN3_GVEC_MM(paddw, add, Pq, Pq, Qq, MO_16)
@@ -5407,6 +5497,8 @@ DEF_GEN_INSN3_GVEC_MM(paddsb, ssadd, Pq, Pq, Qq, MO_8)
 DEF_GEN_INSN3_GVEC_MM(paddsw, ssadd, Pq, Pq, Qq, MO_16)
 DEF_GEN_INSN3_GVEC_MM(paddusb, usadd, Pq, Pq, Qq, MO_8)
 DEF_GEN_INSN3_GVEC_MM(paddusw, usadd, Pq, Pq, Qq, MO_16)
+DEF_GEN_INSN3_HELPER_EPP(addps, addps, Vdq, Vdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(addss, addss, Vd, Vd, Wd)
 
 DEF_GEN_INSN3_GVEC_MM(psubb, sub, Pq, Pq, Qq, MO_8)
 DEF_GEN_INSN3_GVEC_MM(psubw, sub, Pq, Pq, Qq, MO_16)
@@ -5415,11 +5507,38 @@ DEF_GEN_INSN3_GVEC_MM(psubsb, sssub, Pq, Pq, Qq, MO_8)
 DEF_GEN_INSN3_GVEC_MM(psubsw, sssub, Pq, Pq, Qq, MO_16)
 DEF_GEN_INSN3_GVEC_MM(psubusb, ussub, Pq, Pq, Qq, MO_8)
 DEF_GEN_INSN3_GVEC_MM(psubusw, ussub, Pq, Pq, Qq, MO_16)
+DEF_GEN_INSN3_HELPER_EPP(subps, subps, Vdq, Vdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(subss, subss, Vd, Vd, Wd)
 
 DEF_GEN_INSN3_HELPER_

[Qemu-devel] [PATCH v5 01/10] accel: introducing TBStatistics structure

2019-08-14 Thread vandersonmr
To store statistics for each TB, we created a TBStatistics structure
which is linked with the TBs. TBStatistics can stay alive after
tb_flush and be relinked to a regenerated TB, so the statistics can
be accumulated even across flushes.

The goal is to have all present and future qemu/tcg statistics and
meta-data stored in this new structure.
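
The relinking idea can be sketched in isolation as a "get or create" lookup keyed on (phys_pc, pc, cs_base, flags). Everything below is a simplified stand-in: the Demo* names, the fixed-size table, and the translations counter are hypothetical, and a linear scan replaces the qht hash table purely for brevity. Only the key comparison mirrors tb_stats_cmp() from this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for TBStatistics; key field names follow the patch. */
typedef struct {
    uint64_t phys_pc;
    uint64_t pc;
    uint64_t cs_base;
    uint32_t flags;
    unsigned long translations;   /* hypothetical counter for the demo */
    bool in_use;
} DemoTBStats;

#define DEMO_TABLE_SIZE 16
static DemoTBStats demo_table[DEMO_TABLE_SIZE];

/* Same key comparison as tb_stats_cmp() in this patch. */
static bool demo_stats_match(const DemoTBStats *s, uint64_t phys_pc,
                             uint64_t pc, uint64_t cs_base, uint32_t flags)
{
    return s->phys_pc == phys_pc && s->pc == pc &&
           s->cs_base == cs_base && s->flags == flags;
}

/*
 * Return the existing record for this key, or claim a free slot.
 * Returns NULL when the demo table is full.
 */
static DemoTBStats *demo_get_stats(uint64_t phys_pc, uint64_t pc,
                                   uint64_t cs_base, uint32_t flags)
{
    DemoTBStats *free_slot = NULL;

    for (size_t i = 0; i < DEMO_TABLE_SIZE; i++) {
        DemoTBStats *s = &demo_table[i];

        if (s->in_use && demo_stats_match(s, phys_pc, pc, cs_base, flags)) {
            return s;               /* survives any number of "flushes" */
        }
        if (!s->in_use && !free_slot) {
            free_slot = s;
        }
    }
    if (free_slot) {
        free_slot->phys_pc = phys_pc;
        free_slot->pc = pc;
        free_slot->cs_base = cs_base;
        free_slot->flags = flags;
        free_slot->in_use = true;
    }
    return free_slot;
}

/* Called whenever a block with this key is (re)translated. */
static unsigned long demo_record_translation(uint64_t phys_pc, uint64_t pc,
                                             uint64_t cs_base, uint32_t flags)
{
    DemoTBStats *s = demo_get_stats(phys_pc, pc, cs_base, flags);

    assert(s != NULL);
    return ++s->translations;
}
```

Because the record is looked up by key rather than owned by the translated block, retranslating the same block after a flush bumps the same counter instead of starting from zero.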

Signed-off-by: Vanderson M. do Rosario 
---
 accel/tcg/Makefile.objs  |  2 +-
 accel/tcg/perf/Makefile.objs |  1 +
 accel/tcg/tb-stats.c | 39 
 accel/tcg/translate-all.c| 57 
 include/exec/exec-all.h  | 15 +++---
 include/exec/tb-context.h| 12 
 include/exec/tb-hash.h   |  7 +
 include/exec/tb-stats.h  | 43 +++
 util/log.c   |  2 ++
 9 files changed, 166 insertions(+), 12 deletions(-)
 create mode 100644 accel/tcg/perf/Makefile.objs
 create mode 100644 accel/tcg/tb-stats.c
 create mode 100644 include/exec/tb-stats.h

diff --git a/accel/tcg/Makefile.objs b/accel/tcg/Makefile.objs
index d381a02f34..49ffe81b5d 100644
--- a/accel/tcg/Makefile.objs
+++ b/accel/tcg/Makefile.objs
@@ -2,7 +2,7 @@ obj-$(CONFIG_SOFTMMU) += tcg-all.o
 obj-$(CONFIG_SOFTMMU) += cputlb.o
 obj-y += tcg-runtime.o tcg-runtime-gvec.o
 obj-y += cpu-exec.o cpu-exec-common.o translate-all.o
-obj-y += translator.o
+obj-y += translator.o tb-stats.o
 
 obj-$(CONFIG_USER_ONLY) += user-exec.o
 obj-$(call lnot,$(CONFIG_SOFTMMU)) += user-exec-stub.o
diff --git a/accel/tcg/perf/Makefile.objs b/accel/tcg/perf/Makefile.objs
new file mode 100644
index 00..f82fba35e5
--- /dev/null
+++ b/accel/tcg/perf/Makefile.objs
@@ -0,0 +1 @@
+obj-y += jitdump.o
diff --git a/accel/tcg/tb-stats.c b/accel/tcg/tb-stats.c
new file mode 100644
index 00..02844717cb
--- /dev/null
+++ b/accel/tcg/tb-stats.c
@@ -0,0 +1,39 @@
+#include "qemu/osdep.h"
+
+#include "disas/disas.h"
+
+#include "exec/tb-stats.h"
+
+void init_tb_stats_htable_if_not(void)
+{
+if (tb_stats_collection_enabled() && !tb_ctx.tb_stats.map) {
+qht_init(&tb_ctx.tb_stats, tb_stats_cmp,
+CODE_GEN_HTABLE_SIZE, QHT_MODE_AUTO_RESIZE);
+}
+}
+
+void enable_collect_tb_stats(void)
+{
+init_tb_stats_htable_if_not();
+tcg_collect_tb_stats = TB_STATS_RUNNING;
+}
+
+void disable_collect_tb_stats(void)
+{
+tcg_collect_tb_stats = TB_STATS_STOPPED;
+}
+
+void pause_collect_tb_stats(void)
+{
+tcg_collect_tb_stats = TB_STATS_PAUSED;
+}
+
+bool tb_stats_collection_enabled(void)
+{
+return tcg_collect_tb_stats == TB_STATS_RUNNING;
+}
+
+bool tb_stats_collection_paused(void)
+{
+return tcg_collect_tb_stats == TB_STATS_PAUSED;
+}
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 5d1e08b169..b7bccacd3b 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -1118,6 +1118,23 @@ static inline void code_gen_alloc(size_t tb_size)
 }
 }
 
+/*
+ * This is more or less the same comparison as tb_cmp(), but the
+ * data persists over tb_flush. We also aggregate the various
+ * variations of cflags under one record and ignore the details of
+ * page overlap (although we can count it).
+ */
+bool tb_stats_cmp(const void *ap, const void *bp)
+{
+const TBStatistics *a = ap;
+const TBStatistics *b = bp;
+
+return a->phys_pc == b->phys_pc &&
+a->pc == b->pc &&
+a->cs_base == b->cs_base &&
+a->flags == b->flags;
+}
+
 static bool tb_cmp(const void *ap, const void *bp)
 {
 const TranslationBlock *a = ap;
@@ -1137,6 +1154,7 @@ static void tb_htable_init(void)
 unsigned int mode = QHT_MODE_AUTO_RESIZE;
 
 qht_init(&tb_ctx.htable, tb_cmp, CODE_GEN_HTABLE_SIZE, mode);
+init_tb_stats_htable_if_not();
 }
 
 /* Must be called before using the QEMU cpus. 'tb_size' is the size
@@ -1666,6 +1684,34 @@ tb_link_page(TranslationBlock *tb, tb_page_addr_t phys_pc,
 return tb;
 }
 
+static TBStatistics *tb_get_stats(tb_page_addr_t phys_pc, target_ulong pc,
+  target_ulong cs_base, uint32_t flags,
+  TranslationBlock *current_tb)
+{
+TBStatistics *new_stats = g_new0(TBStatistics, 1);
+uint32_t hash = tb_stats_hash_func(phys_pc, pc, flags);
+void *existing_stats = NULL;
+new_stats->phys_pc = phys_pc;
+new_stats->pc = pc;
+new_stats->cs_base = cs_base;
+new_stats->flags = flags;
+new_stats->tb = current_tb;
+
+qht_insert(&tb_ctx.tb_stats, new_stats, hash, &existing_stats);
+
+if (unlikely(existing_stats)) {
+/*
+ * If there is already a TBStatistic for this TB from a previous flush
+ * then just make the new TB point to the older TBStatistic
+ */
+g_free(new_stats);
+return existing_stats;
+} else {
+return new_stats;
+}
+}
+
+
 /* Called with mmap_lock held for user mode emulation.  */
 TranslationBlock *tb_gen_code(CPUState *cpu,
  

[Qemu-devel] [RFC PATCH v3 32/46] target/i386: introduce gvec-based code generator macros

2019-08-14 Thread Jan Bobek
Code generators defined using these macros rely on a gvec operation
(i.e. tcg_gen_gvec_*).

Signed-off-by: Jan Bobek 
---
 target/i386/translate.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index b28d651b82..75652afb45 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -23,6 +23,7 @@
 #include "disas/disas.h"
 #include "exec/exec-all.h"
 #include "tcg-op.h"
+#include "tcg-op-gvec.h"
 #include "exec/cpu_ldst.h"
 #include "exec/translator.h"
 
@@ -5351,6 +5352,18 @@ INSNOP_LDST(xmm_t0, Mhq)
 tcg_temp_free_i32(arg4_r32);\
 }
 
+#define DEF_GEN_INSN2_GVEC(mnem, gvec, opT1, opT2, vece, oprsz, maxsz)  \
+GEN_INSN2(mnem, opT1, opT2) \
+{   \
+tcg_gen_gvec_ ## gvec(vece, arg1, arg2, oprsz, maxsz);  \
+}
+
+#define DEF_GEN_INSN3_GVEC(mnem, gvec, opT1, opT2, opT3, vece, oprsz, maxsz) \
+GEN_INSN3(mnem, opT1, opT2, opT3)   \
+{   \
+tcg_gen_gvec_ ## gvec(vece, arg1, arg2, arg3, oprsz, maxsz);\
+}
+
 static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
 {
 enum {
-- 
2.20.1
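
For readers unfamiliar with the token-pasting pattern used by DEF_GEN_INSN3_GVEC, the following stand-alone sketch shows the same shape outside of QEMU. All demo_* and DEF_DEMO_* names are hypothetical. demo_gvec_add() is only a stand-in for tcg_gen_gvec_add() that records its arguments, and the literal values 0 and 8 are assumed to correspond to MO_8 and sizeof(MMXReg).

```c
#include <assert.h>
#include <stdint.h>

/* Records the arguments of the most recent stand-in gvec call. */
typedef struct {
    unsigned vece;
    uint32_t dofs, aofs, bofs;
    uint32_t oprsz, maxsz;
} DemoGvecCall;

static DemoGvecCall demo_last_call;

/* Stand-in for tcg_gen_gvec_add(); it only logs what it was given. */
static void demo_gvec_add(unsigned vece, uint32_t dofs, uint32_t aofs,
                          uint32_t bofs, uint32_t oprsz, uint32_t maxsz)
{
    demo_last_call = (DemoGvecCall){ vece, dofs, aofs, bofs, oprsz, maxsz };
}

/* Same shape as DEF_GEN_INSN3_GVEC: paste the op name, forward the rest. */
#define DEF_DEMO_INSN3_GVEC(mnem, gvec, vece, oprsz, maxsz)             \
    static void demo_gen_ ## mnem(uint32_t arg1, uint32_t arg2,         \
                                  uint32_t arg3)                        \
    {                                                                   \
        assert(arg1 == arg2);   /* destructive two-operand form */      \
        demo_gvec_ ## gvec(vece, arg1, arg2, arg3, oprsz, maxsz);       \
    }

/* Byte-wise add over an 8-byte register, in the spirit of paddb. */
DEF_DEMO_INSN3_GVEC(paddb, add, 0, 8, 8)
```

Calling demo_gen_paddb(16, 16, 24) forwards the three offsets unchanged and fills in the element size and register size chosen at definition time, which is what the macros in this patch do for the real gvec operations.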




[Qemu-devel] [PATCH v1 2/2] tb-stats: adding TBStatistics info into perf dump

2019-08-14 Thread vandersonmr
Add TBStatistics information to the TB symbol names shown by Linux perf.

This commit depends on the following PATCH:
[PATCH v5 00/10] Measure Tiny Code Generation Quality

Signed-off-by: Vanderson M. do Rosario 
---
 accel/tcg/perf/jitdump.c | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/accel/tcg/perf/jitdump.c b/accel/tcg/perf/jitdump.c
index 6f4c0911c2..b2334fd601 100644
--- a/accel/tcg/perf/jitdump.c
+++ b/accel/tcg/perf/jitdump.c
@@ -8,6 +8,7 @@
 #include 
 #include 
 
+#include "exec/tb-stats.h"
 #include "jitdump.h"
 #include "qemu-common.h"
 
@@ -135,7 +136,19 @@ void start_jitdump_file(void)
 void append_load_in_jitdump_file(TranslationBlock *tb)
 {
 GString *func_name = g_string_new(NULL);
-g_string_printf(func_name, "TB virt:0x"TARGET_FMT_lx"%c", tb->pc, '\0');
+if (tb->tb_stats) {
+TBStatistics *tbs = tb->tb_stats;
+unsigned g = stat_per_translation(tbs, code.num_guest_inst);
+unsigned ops = stat_per_translation(tbs, code.num_tcg_ops);
+unsigned ops_opt = stat_per_translation(tbs, code.num_tcg_ops_opt);
+int spills = stat_per_translation(tbs, code.spills);
+
+g_string_printf(func_name,
+"TB virt:0x"TARGET_FMT_lx" (g:%u op:%u opt:%u spills:%d)%c",
+tb->pc, g, ops, ops_opt, spills, '\0');
+} else {
+g_string_printf(func_name, "TB virt:0x"TARGET_FMT_lx"%c", tb->pc, '\0');
+}
 
 struct jr_code_load *load_event = g_new0(struct jr_code_load, 1);
 load_event->p.id = JIT_CODE_LOAD;
-- 
2.22.0
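
The resulting symbol name format can be tried outside of QEMU with a small sketch. demo_format_tb_name is a hypothetical function. It assumes a 64-bit guest, so that TARGET_FMT_lx behaves like a zero-padded 16-digit hex format, and it uses snprintf in place of g_string_printf to avoid a GLib dependency.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build a perf symbol name carrying the
 * per-translation averages, in the same layout as the patch.
 * Returns the value of snprintf().
 */
static int demo_format_tb_name(char *buf, size_t size, uint64_t pc,
                               unsigned g, unsigned ops, unsigned ops_opt,
                               int spills)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size,
                    "TB virt:0x%016" PRIx64 " (g:%u op:%u opt:%u spills:%d)",
                    pc, g, ops, ops_opt, spills);
}
```

A TB at guest address 0x400000 with 12 guest instructions, 40 TCG ops, 31 optimized ops and 2 spills would then show up in perf as "TB virt:0x0000000000400000 (g:12 op:40 opt:31 spills:2)".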




[Qemu-devel] [RFC PATCH v3 36/46] target/i386: introduce MMX code generators

2019-08-14 Thread Jan Bobek
Define code generators required for MMX instructions.

Signed-off-by: Jan Bobek 
---
 target/i386/translate.c | 111 
 1 file changed, 111 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 4fecb0d240..a02e9cd0d2 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -5357,12 +5357,123 @@ INSNOP_LDST(xmm_t0, Mhq)
 {   \
 tcg_gen_gvec_ ## gvec(vece, arg1, arg2, oprsz, maxsz);  \
 }
+#define DEF_GEN_INSN2_GVEC_MM(mnem, gvec, opT1, opT2, vece)   \
+DEF_GEN_INSN2_GVEC(mnem, gvec, opT1, opT2, vece,  \
+   sizeof(MMXReg), sizeof(MMXReg))
 
 #define DEF_GEN_INSN3_GVEC(mnem, gvec, opT1, opT2, opT3, vece, oprsz, maxsz) \
 GEN_INSN3(mnem, opT1, opT2, opT3)   \
 {   \
 tcg_gen_gvec_ ## gvec(vece, arg1, arg2, arg3, oprsz, maxsz);\
 }
+#define DEF_GEN_INSN3_GVEC_MM(mnem, gvec, opT1, opT2, opT3, vece)   \
+DEF_GEN_INSN3_GVEC(mnem, gvec, opT1, opT2, opT3, vece,  \
+   sizeof(MMXReg), sizeof(MMXReg))
+
+GEN_INSN2(movq, Pq, Eq);/* forward declaration */
+GEN_INSN2(movd, Pq, Ed)
+{
+const insnop_arg_t(Eq) arg2_r64 = tcg_temp_new_i64();
+tcg_gen_extu_i32_i64(arg2_r64, arg2);
+gen_insn2(movq, Pq, Eq)(env, s, arg1, arg2_r64);
+tcg_temp_free_i64(arg2_r64);
+}
+
+GEN_INSN2(movd, Ed, Pq)
+{
+const insnop_arg_t(Pq) ofs = offsetof(MMXReg, MMX_L(0));
+tcg_gen_ld_i32(arg1, cpu_env, arg2 + ofs);
+}
+
+GEN_INSN2(movq, Pq, Eq)
+{
+const insnop_arg_t(Pq) ofs = offsetof(MMXReg, MMX_Q(0));
+tcg_gen_st_i64(arg2, cpu_env, arg1 + ofs);
+}
+
+GEN_INSN2(movq, Eq, Pq)
+{
+const insnop_arg_t(Pq) ofs = offsetof(MMXReg, MMX_Q(0));
+tcg_gen_ld_i64(arg1, cpu_env, arg2 + ofs);
+}
+
+DEF_GEN_INSN2_GVEC_MM(movq, mov, Pq, Qq, MO_64)
+DEF_GEN_INSN2_GVEC_MM(movq, mov, Qq, Pq, MO_64)
+
+DEF_GEN_INSN3_GVEC_MM(paddb, add, Pq, Pq, Qq, MO_8)
+DEF_GEN_INSN3_GVEC_MM(paddw, add, Pq, Pq, Qq, MO_16)
+DEF_GEN_INSN3_GVEC_MM(paddd, add, Pq, Pq, Qq, MO_32)
+DEF_GEN_INSN3_GVEC_MM(paddsb, ssadd, Pq, Pq, Qq, MO_8)
+DEF_GEN_INSN3_GVEC_MM(paddsw, ssadd, Pq, Pq, Qq, MO_16)
+DEF_GEN_INSN3_GVEC_MM(paddusb, usadd, Pq, Pq, Qq, MO_8)
+DEF_GEN_INSN3_GVEC_MM(paddusw, usadd, Pq, Pq, Qq, MO_16)
+
+DEF_GEN_INSN3_GVEC_MM(psubb, sub, Pq, Pq, Qq, MO_8)
+DEF_GEN_INSN3_GVEC_MM(psubw, sub, Pq, Pq, Qq, MO_16)
+DEF_GEN_INSN3_GVEC_MM(psubd, sub, Pq, Pq, Qq, MO_32)
+DEF_GEN_INSN3_GVEC_MM(psubsb, sssub, Pq, Pq, Qq, MO_8)
+DEF_GEN_INSN3_GVEC_MM(psubsw, sssub, Pq, Pq, Qq, MO_16)
+DEF_GEN_INSN3_GVEC_MM(psubusb, ussub, Pq, Pq, Qq, MO_8)
+DEF_GEN_INSN3_GVEC_MM(psubusw, ussub, Pq, Pq, Qq, MO_16)
+
+DEF_GEN_INSN3_HELPER_EPP(pmullw, pmullw_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(pmulhw, pmulhw_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(pmaddwd, pmaddwd_mmx, Pq, Pq, Qq)
+
+DEF_GEN_INSN3_GVEC_MM(pcmpeqb, cmpeq, Pq, Pq, Qq, MO_8)
+DEF_GEN_INSN3_GVEC_MM(pcmpeqw, cmpeq, Pq, Pq, Qq, MO_16)
+DEF_GEN_INSN3_GVEC_MM(pcmpeqd, cmpeq, Pq, Pq, Qq, MO_32)
+DEF_GEN_INSN3_GVEC_MM(pcmpgtb, cmpgt, Pq, Pq, Qq, MO_8)
+DEF_GEN_INSN3_GVEC_MM(pcmpgtw, cmpgt, Pq, Pq, Qq, MO_16)
+DEF_GEN_INSN3_GVEC_MM(pcmpgtd, cmpgt, Pq, Pq, Qq, MO_32)
+
+DEF_GEN_INSN3_GVEC_MM(pand, and, Pq, Pq, Qq, MO_64)
+DEF_GEN_INSN3_GVEC_MM(pandn, andn, Pq, Pq, Qq, MO_64)
+DEF_GEN_INSN3_GVEC_MM(por, or, Pq, Pq, Qq, MO_64)
+DEF_GEN_INSN3_GVEC_MM(pxor, xor, Pq, Pq, Qq, MO_64)
+
+DEF_GEN_INSN3_HELPER_EPP(psllw, psllw_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(pslld, pslld_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(psllq, psllq_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(psrlw, psrlw_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(psrld, psrld_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(psrlq, psrlq_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(psraw, psraw_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(psrad, psrad_mmx, Pq, Pq, Qq)
+
+#define DEF_GEN_PSHIFT_IMM_MM(mnem, opT1, opT2) \
+GEN_INSN3(mnem, opT1, opT2, Ib) \
+{   \
+const uint64_t arg3_ui64 = (uint8_t)arg3;   \
+const insnop_arg_t(Eq) arg3_r64 = s->tmp1_i64;  \
+const insnop_arg_t(Qq) arg3_mm =\
+offsetof(CPUX86State, mmx_t0.MMX_Q(0)); \
+\
+tcg_gen_movi_i64(arg3_r64, arg3_ui64);  \
+gen_insn2(movq, Pq, Eq)(env, s, arg3_mm, arg3_r64); \
+gen_insn3(mnem, Pq, Pq, Qq)(env, s, arg1, arg2, arg3_mm);   \
+}
+
+DEF_GEN_PSHIFT_IMM_MM(psllw, Nq, Nq)
+DEF_GEN_PSHIFT_IMM_MM(pslld, Nq, Nq)
+DEF_GEN_PSHIFT_IMM_MM(psllq, Nq, Nq)
+

Re: [Qemu-devel] [PATCH v2 0/3] Fix MemoryRegionSection alignment and comparison

2019-08-14 Thread no-reply
Patchew URL: https://patchew.org/QEMU/20190814175535.2023-1-dgilb...@redhat.com/



Hi,

This series failed build test on s390x host. Please find the details below.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
# Testing script will be invoked under the git checkout with
# HEAD pointing to a commit that has the patches applied on top of "base"
# branch
set -e

echo
echo "=== ENV ==="
env

echo
echo "=== PACKAGES ==="
rpm -qa

echo
echo "=== UNAME ==="
uname -a

CC=$HOME/bin/cc
INSTALL=$PWD/install
BUILD=$PWD/build
mkdir -p $BUILD $INSTALL
SRC=$PWD
cd $BUILD
$SRC/configure --cc=$CC --prefix=$INSTALL
make -j4
# XXX: we need reliable clean up
# make check -j4 V=1
make install
=== TEST SCRIPT END ===

  CC  xtensa-linux-user/accel/stubs/kvm-stub.o
  CC  xtensa-linux-user/accel/tcg/tcg-runtime.o
  CC  xtensa-linux-user/accel/tcg/tcg-runtime-gvec.o
collect2: error: ld returned 1 exit status
make[1]: *** [Makefile:209: qemu-x86_64] Error 1
make: *** [Makefile:472: x86_64-linux-user/all] Error 2
make: *** Waiting for unfinished jobs


The full log is available at
http://patchew.org/logs/20190814175535.2023-1-dgilb...@redhat.com/testing.s390x/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

[Qemu-devel] [RFC PATCH v3 31/46] target/i386: introduce helper-based code generator macros

2019-08-14 Thread Jan Bobek
Code generators defined using these macros rely on a helper function
(as emitted by gen_helper_*).

Signed-off-by: Jan Bobek 
---
 target/i386/translate.c | 106 
 1 file changed, 106 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index b5f609e147..b28d651b82 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -5245,6 +5245,112 @@ INSNOP_LDST(xmm_t0, Mhq)
 insnop_arg_t(opT1) arg1, insnop_arg_t(opT2) arg2,   \
 insnop_arg_t(opT3) arg3, insnop_arg_t(opT4) arg4)
 
+#define DEF_GEN_INSN0_HELPER(mnem, helper)  \
+GEN_INSN0(mnem) \
+{   \
+gen_helper_ ## helper(cpu_env); \
+}
+
+#define DEF_GEN_INSN2_HELPER_EPD(mnem, helper, opT1, opT2)  \
+GEN_INSN2(mnem, opT1, opT2) \
+{   \
+const TCGv_ptr arg1_ptr = tcg_temp_new_ptr();   \
+\
+tcg_gen_addi_ptr(arg1_ptr, cpu_env, arg1);  \
+gen_helper_ ## helper(cpu_env, arg1_ptr, arg2); \
+\
+tcg_temp_free_ptr(arg1_ptr);\
+}
+#define DEF_GEN_INSN2_HELPER_DEP(mnem, helper, opT1, opT2)  \
+GEN_INSN2(mnem, opT1, opT2) \
+{   \
+const TCGv_ptr arg2_ptr = tcg_temp_new_ptr();   \
+\
+tcg_gen_addi_ptr(arg2_ptr, cpu_env, arg2);  \
+gen_helper_ ## helper(arg1, cpu_env, arg2_ptr); \
+\
+tcg_temp_free_ptr(arg2_ptr);\
+}
+#ifdef TARGET_X86_64
+#define DEF_GEN_INSN2_HELPER_EPQ(mnem, helper, opT1, opT2)  \
+DEF_GEN_INSN2_HELPER_EPD(mnem, helper, opT1, opT2)
+#define DEF_GEN_INSN2_HELPER_QEP(mnem, helper, opT1, opT2)  \
+DEF_GEN_INSN2_HELPER_DEP(mnem, helper, opT1, opT2)
+#else /* !TARGET_X86_64 */
+#define DEF_GEN_INSN2_HELPER_EPQ(mnem, helper, opT1, opT2)  \
+GEN_INSN2(mnem, opT1, opT2) \
+{   \
+g_assert_not_reached(); \
+}
+#define DEF_GEN_INSN2_HELPER_QEP(mnem, helper, opT1, opT2)  \
+GEN_INSN2(mnem, opT1, opT2) \
+{   \
+g_assert_not_reached(); \
+}
+#endif /* !TARGET_X86_64 */
+#define DEF_GEN_INSN2_HELPER_EPP(mnem, helper, opT1, opT2)  \
+GEN_INSN2(mnem, opT1, opT2) \
+{   \
+const TCGv_ptr arg1_ptr = tcg_temp_new_ptr();   \
+const TCGv_ptr arg2_ptr = tcg_temp_new_ptr();   \
+\
+tcg_gen_addi_ptr(arg1_ptr, cpu_env, arg1);  \
+tcg_gen_addi_ptr(arg2_ptr, cpu_env, arg2);  \
+gen_helper_ ## helper(cpu_env, arg1_ptr, arg2_ptr); \
+\
+tcg_temp_free_ptr(arg1_ptr);\
+tcg_temp_free_ptr(arg2_ptr);\
+}
+
+#define DEF_GEN_INSN3_HELPER_EPP(mnem, helper, opT1, opT2, opT3)\
+GEN_INSN3(mnem, opT1, opT2, opT3)   \
+{   \
+const TCGv_ptr arg1_ptr = tcg_temp_new_ptr();   \
+const TCGv_ptr arg3_ptr = tcg_temp_new_ptr();   \
+\
+assert(arg1 == arg2);   \
+tcg_gen_addi_ptr(arg1_ptr, cpu_env, arg1);  \
+tcg_gen_addi_ptr(arg3_ptr, cpu_env, arg3);  \
+gen_helper_ ## helper(cpu_env, arg1_ptr, arg3_ptr); \
+\
+tcg_temp_free_ptr(arg1_ptr);\
+tcg_temp_free_ptr(arg3_ptr);\
+}
+#define DEF_GEN_INSN3_HELPER_PPI(mnem, helper, opT1, opT2, opT3)\
+GEN_INSN3(mnem, opT1, opT2, opT3)   \
+{   \
+const TCGv_ptr arg1_ptr = tcg_temp_new_ptr();   \
+const TCGv_pt

[Qemu-devel] [RFC PATCH v3 28/46] target/i386: introduce P*, N*, Q* (MMX) operands

2019-08-14 Thread Jan Bobek
These address the MMX-technology register file; the corresponding
cpu_env offset is passed as the operand value. Notably, the offset of the
entire register is passed at all times, regardless of the operand-size
suffix.

Signed-off-by: Jan Bobek 
---
 target/i386/translate.c | 79 +
 1 file changed, 79 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 779b692942..bd3c7f9356 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -5011,6 +5011,85 @@ INSNOP_LDST(tcg_temp_i64, Mq)
 }
 }
 
+/*
+ * MMX-technology register operands
+ */
+#define DEF_INSNOP_MM(opT, opTmmid) \
+typedef unsigned int insnop_arg_t(opT); \
+typedef struct {\
+insnop_ctxt_t(opTmmid) mmid;\
+} insnop_ctxt_t(opT);   \
+\
+INSNOP_INIT(opT)\
+{   \
+return insnop_init(opTmmid)(&ctxt->mmid, env, s, modrm, is_write); \
+}   \
+INSNOP_PREPARE(opT) \
+{   \
+const insnop_arg_t(opTmmid) mmid =  \
+insnop_prepare(opTmmid)(&ctxt->mmid, env, s, modrm, is_write); \
+const insnop_arg_t(opT) arg =   \
+offsetof(CPUX86State, fpregs[mmid & 7].mmx);\
+insnop_finalize(opTmmid)(&ctxt->mmid, env, s, modrm, is_write, mmid); \
+return arg; \
+}   \
+INSNOP_FINALIZE(opT)\
+{   \
+}
+
+typedef unsigned int insnop_arg_t(mm_t0);
+typedef struct {} insnop_ctxt_t(mm_t0);
+
+INSNOP_INIT(mm_t0)
+{
+return 0;
+}
+INSNOP_PREPARE(mm_t0)
+{
+return offsetof(CPUX86State, mmx_t0);
+}
+INSNOP_FINALIZE(mm_t0)
+{
+}
+
+DEF_INSNOP_MM(P, modrm_reg)
+DEF_INSNOP_ALIAS(Pd, P)
+DEF_INSNOP_ALIAS(Pq, P)
+
+DEF_INSNOP_MM(N, modrm_rm_direct)
+DEF_INSNOP_ALIAS(Nd, N)
+DEF_INSNOP_ALIAS(Nq, N)
+
+DEF_INSNOP_LDST(MQd, mm_t0, Md)
+DEF_INSNOP_LDST(MQq, mm_t0, Mq)
+DEF_INSNOP_EITHER(Qd, Nd, MQd)
+DEF_INSNOP_EITHER(Qq, Nq, MQq)
+
+INSNOP_LDST(mm_t0, Md)
+{
+const insnop_arg_t(mm_t0) ofs =
+offsetof(MMXReg, MMX_L(0));
+
+assert(ptr == s->A0);
+if (is_write) {
+gen_std_env_A0(s, arg + ofs);
+} else {
+gen_ldd_env_A0(s, arg + ofs);
+}
+}
+INSNOP_LDST(mm_t0, Mq)
+{
+const insnop_arg_t(mm_t0) ofs =
+offsetof(MMXReg, MMX_Q(0));
+
+assert(ptr == s->A0);
+if (is_write) {
+gen_stq_env_A0(s, arg + ofs);
+} else {
+gen_ldq_env_A0(s, arg + ofs);
+}
+}
+
 static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
 {
 enum {
-- 
2.20.1




[Qemu-devel] [PATCH v5 10/10] linux-user: dumping hot TBs at the end of the execution

2019-08-14 Thread vandersonmr
Dump, in linux-user mode, the hottest TBs at exit if -d tb_stats is used.

Signed-off-by: Vanderson M. do Rosario 
---
 linux-user/exit.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/linux-user/exit.c b/linux-user/exit.c
index bdda720553..7226104959 100644
--- a/linux-user/exit.c
+++ b/linux-user/exit.c
@@ -28,6 +28,10 @@ extern void __gcov_dump(void);
 
 void preexit_cleanup(CPUArchState *env, int code)
 {
+if (tb_stats_collection_enabled()) {
+dump_tbs_info(max_num_hot_tbs_to_dump, SORT_BY_HOTNESS, false);
+}
+
 #ifdef TARGET_GPROF
 _mcleanup();
 #endif
-- 
2.22.0




[Qemu-devel] [RFC PATCH v3 33/46] target/i386: introduce sse-opcode.inc.h

2019-08-14 Thread Jan Bobek
This header is intended to eventually list all supported instructions
along with some useful details (e.g. mnemonics, opcodes, operands, etc.).
It shall be used (along with some preprocessor magic) anytime we need
to automatically generate code for every instruction.

Signed-off-by: Jan Bobek 
---
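A note for readers, not part of the patch: the "preprocessor magic" is the usual X-macro technique, where one instruction list is expanded several times with different definitions of the per-entry macro. The header does this by letting the includer define OPCODE before the #include; the sketch below swaps that for a list macro so it fits in one file. All DEMO_/demo_ names and the three table entries are assumptions chosen for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical instruction list: mnemonic, opcode byte, operand count. */
#define DEMO_OPCODE_LIST(X) \
    X(movd,  0x6e, 2)       \
    X(paddb, 0xfc, 2)       \
    X(emms,  0x77, 0)

/* First expansion: one enum constant per instruction. */
#define DEMO_ENUM_ENTRY(mnem, opcode, argc) DEMO_INSN_ ## mnem,
enum demo_insn { DEMO_OPCODE_LIST(DEMO_ENUM_ENTRY) DEMO_INSN_COUNT };
#undef DEMO_ENUM_ENTRY

struct demo_insn_info {
    const char *name;
    unsigned char opcode;
    int argc;
};

/* Second expansion: a lookup table built from the same list. */
#define DEMO_TABLE_ENTRY(mnem, opcode, argc) { #mnem, opcode, argc },
static const struct demo_insn_info demo_insn_table[] = {
    DEMO_OPCODE_LIST(DEMO_TABLE_ENTRY)
};
#undef DEMO_TABLE_ENTRY

/* Linear search by opcode byte; returns NULL if the opcode is not listed. */
static const struct demo_insn_info *demo_lookup(unsigned char opcode)
{
    for (size_t i = 0; i < (size_t)DEMO_INSN_COUNT; i++) {
        if (demo_insn_table[i].opcode == opcode) {
            return &demo_insn_table[i];
        }
    }
    return NULL;
}
```

Because both the enum and the table come from one list, adding an instruction means touching a single line.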
 target/i386/sse-opcode.inc.h | 69 
 1 file changed, 69 insertions(+)
 create mode 100644 target/i386/sse-opcode.inc.h

diff --git a/target/i386/sse-opcode.inc.h b/target/i386/sse-opcode.inc.h
new file mode 100644
index 00..c5e81a6a80
--- /dev/null
+++ b/target/i386/sse-opcode.inc.h
@@ -0,0 +1,69 @@
+#define FMTI (0, 0, 0, )
+#define FMTI__R__    (1, 1, 0, r)
+#define FMTI__RR__   (2, 2, 0, rr)
+#define FMTI__W__    (1, 0, 1, w)
+#define FMTI__WR__   (2, 1, 1, wr)
+#define FMTI__WRR__  (3, 2, 1, wrr)
+#define FMTI__WRRR__ (4, 3, 1, wrrr)
+
+#define FMTI__(prop, fmti) FMTI_ ## prop ## __ fmti
+
+#define FMTI_ARGC__(argc, argc_rd, argc_wr, lower)    argc
+#define FMTI_ARGC_RD__(argc, argc_rd, argc_wr, lower) argc_rd
+#define FMTI_ARGC_WR__(argc, argc_rd, argc_wr, lower) argc_wr
+#define FMTI_LOWER__(argc, argc_rd, argc_wr, lower)   lower
+
+#define FMT_ARGC(fmt)    FMTI__(ARGC, FMTI__ ## fmt ## __)
+#define FMT_ARGC_RD(fmt) FMTI__(ARGC_RD, FMTI__ ## fmt ## __)
+#define FMT_ARGC_WR(fmt) FMTI__(ARGC_WR, FMTI__ ## fmt ## __)
+#define FMT_LOWER(fmt)   FMTI__(LOWER, FMTI__ ## fmt ## __)
+#define FMT_UPPER(fmt)   fmt
+
+#ifndef OPCODE
+#   define OPCODE(mnem, opcode, feat, fmt, ...)
+#endif /* OPCODE */
+
+#ifndef OPCODE_GRP
+#   define OPCODE_GRP(grpname, opcode)
+#endif /* OPCODE_GRP */
+
+#ifndef OPCODE_GRP_BEGIN
+#   define OPCODE_GRP_BEGIN(grpname)
+#endif /* OPCODE_GRP_BEGIN */
+
+#ifndef OPCODE_GRPMEMB
+#   define OPCODE_GRPMEMB(grpname, mnem, opcode, feat, fmt, ...)
+#endif /* OPCODE_GRPMEMB */
+
+#ifndef OPCODE_GRP_END
+#   define OPCODE_GRP_END(grpname)
+#endif /* OPCODE_GRP_END */
+
+#undef FMTI
+#undef FMTI__R__
+#undef FMTI__RR__
+#undef FMTI__W__
+#undef FMTI__WR__
+#undef FMTI__WRR__
+#undef FMTI__WRRR__
+
+#undef FMTI__
+
+#undef FMTI_ARGC__
+#undef FMTI_ARGC_RD__
+#undef FMTI_ARGC_WR__
+#undef FMTI_LOWER__
+
+#undef FMT_ARGC
+#undef FMT_ARGC_RD
+#undef FMT_ARGC_WR
+#undef FMT_LOWER
+#undef FMT_UPPER
+
+#undef LEG
+#undef VEX
+#undef OPCODE
+#undef OPCODE_GRP
+#undef OPCODE_GRP_BEGIN
+#undef OPCODE_GRPMEMB
+#undef OPCODE_GRP_END
-- 
2.20.1




[Qemu-devel] [RFC PATCH v3 34/46] target/i386: introduce instruction translator macros

2019-08-14 Thread Jan Bobek
Instruction "translators" are responsible for decoding and loading
instruction operands, calling the passed-in code generator, and
storing the operands back (if applicable). Once a translator returns,
the instruction has been translated to TCG ops, hence the name.

Signed-off-by: Jan Bobek 
---
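A note for readers, not part of the patch: the control flow of a two-operand translator can be sketched in plain C as follows. Operand n is treated as written when n <= argc_wr, any failing init step rejects the instruction, and the generator runs between prepare and finalize. The demo_ functions are hypothetical simplifications (no REX bits, no TCG, no CPUID table), and the order of the finalize calls is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Values recorded by the hypothetical generator and finalizer. */
static int demo_last_arg1, demo_last_arg2, demo_writebacks;

static void demo_gen(int arg1, int arg2)
{
    demo_last_arg1 = arg1;
    demo_last_arg2 = arg2;
}

/* Operand 1: ModR/M reg field (bits 5:3); always decodable here. */
static int demo_init_reg(int modrm) { (void)modrm; return 0; }
static int demo_prepare_reg(int modrm) { return (modrm >> 3) & 7; }

/* Operand 2: ModR/M rm field (bits 2:0); register-direct (mod == 3) only. */
static int demo_init_rm_direct(int modrm) { return ((modrm >> 6) & 3) != 3; }
static int demo_prepare_rm_direct(int modrm) { return modrm & 7; }

/* Count a write-back for every operand flagged as written. */
static void demo_finalize(bool is_write)
{
    if (is_write) {
        demo_writebacks++;
    }
}

/* Returns true if the instruction was translated, false if it is illegal. */
static bool demo_translate_insn2(int modrm, int feat_missing,
                                 unsigned argc_wr, void (*gen)(int, int))
{
    const bool is_write1 = (1 <= argc_wr);
    const bool is_write2 = (2 <= argc_wr);
    int ret = feat_missing;             /* stands in for ck_cpuid() */

    if (!ret) {
        ret = demo_init_reg(modrm);
    }
    if (!ret) {
        ret = demo_init_rm_direct(modrm);
    }
    if (ret) {
        return false;                   /* stands in for gen_illegal_opcode() */
    }

    const int arg1 = demo_prepare_reg(modrm);
    const int arg2 = demo_prepare_rm_direct(modrm);
    gen(arg1, arg2);
    demo_finalize(is_write1);
    demo_finalize(is_write2);
    return true;
}
```

With modrm 0xd1 (mod = 3, reg = 2, rm = 1) and argc_wr = 1, the generator sees the arguments 2 and 1 and only the first operand is written back.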
 target/i386/translate.c | 237 
 1 file changed, 237 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 75652afb45..76c27d0380 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -5364,6 +5364,228 @@ INSNOP_LDST(xmm_t0, Mhq)
 tcg_gen_gvec_ ## gvec(vece, arg1, arg2, arg3, oprsz, maxsz);\
 }
 
+/*
+ * Instruction translators
+ */
+#define translate_insn(argc, ...)   \
+glue(translate_insn, argc)(__VA_ARGS__)
+#define translate_insn0()   \
+translate_insn_0
+#define translate_insn1(opT1)   \
+translate_insn_1 ## opT1
+#define translate_insn2(opT1, opT2) \
+translate_insn_2 ## opT1 ## opT2
+#define translate_insn3(opT1, opT2, opT3)   \
+translate_insn_3 ## opT1 ## opT2 ## opT3
+#define translate_insn4(opT1, opT2, opT3, opT4) \
+translate_insn_4 ## opT1 ## opT2 ## opT3 ## opT4
+#define translate_group(grpname)\
+translate_group_ ## grpname
+
+static void translate_insn0()(
+CPUX86State *env, DisasContext *s, int modrm,
+int ck_cpuid_feat, unsigned int argc_wr,
+void (*gen_insn_fp)(CPUX86State *, DisasContext *))
+{
+if (ck_cpuid(env, s, ck_cpuid_feat)) {
+gen_illegal_opcode(s);
+return;
+}
+
+(*gen_insn_fp)(env, s);
+}
+
+#define DEF_TRANSLATE_INSN1(opT1)   \
+static void translate_insn1(opT1)(  \
+CPUX86State *env, DisasContext *s, int modrm,   \
+int ck_cpuid_feat, unsigned int argc_wr,\
+void (*gen_insn1_fp)(CPUX86State *, DisasContext *, \
+ insnop_arg_t(opT1)))   \
+{   \
+insnop_ctxt_t(opT1) ctxt1;  \
+\
+const bool is_write1 = (1 <= argc_wr);  \
+\
+int ret = ck_cpuid(env, s, ck_cpuid_feat);  \
+if (!ret) { \
+ret = insnop_init(opT1)(&ctxt1, env, s, modrm, is_write1);  \
+}   \
+if (!ret) { \
+const insnop_arg_t(opT1) arg1 = \
+insnop_prepare(opT1)(&ctxt1, env, s, modrm, is_write1); \
+\
+(*gen_insn1_fp)(env, s, arg1);  \
+\
+insnop_finalize(opT1)(&ctxt1, env, s, modrm, is_write1, arg1); \
+} else {\
+gen_illegal_opcode(s);  \
+}   \
+}
+
+#define DEF_TRANSLATE_INSN2(opT1, opT2) \
+static void translate_insn2(opT1, opT2)(\
+CPUX86State *env, DisasContext *s, int modrm,   \
+int ck_cpuid_feat, unsigned int argc_wr,\
+void (*gen_insn2_fp)(CPUX86State *, DisasContext *, \
+ insnop_arg_t(opT1), insnop_arg_t(opT2)))   \
+{   \
+insnop_ctxt_t(opT1) ctxt1;  \
+insnop_ctxt_t(opT2) ctxt2;  \
+\
+const bool is_write1 = (1 <= argc_wr);  \
+const bool is_write2 = (2 <= argc_wr);  \
+\
+int ret = ck_cpuid(env, s, ck_cpuid_feat);  \
+if (!ret) { \
+ret = insnop_init(opT1)(&ctxt1, env, s, modrm, is_write1);  \
+}   \
+if (!ret) { \
+ret = insnop_init(opT2)(&ctxt2, env, s, modrm, is_write2

[Qemu-devel] [PATCH v5 07/10] monitor: adding tb_stats hmp command

2019-08-14 Thread vandersonmr
Add the tb_stats [start|pause|stop|filter] command to HMP.
This allows controlling the collection of statistics.
It is also possible to set the level of collection:
all, jit, or exec.

The tb_stats filter command collects statistics only for the TBs
in the last_search list.

The goal of this command is to allow the dynamic exploration
of the TCG behavior and quality. Therefore, for now, a
corresponding QMP command is not worthwhile.

Signed-off-by: Vanderson M. do Rosario 
---
 accel/tcg/tb-stats.c| 111 
 hmp-commands.hx |  17 ++
 include/exec/tb-stats.h |  12 +
 include/qemu-common.h   |   1 +
 monitor/misc.c  |  49 ++
 vl.c|   6 +++
 6 files changed, 196 insertions(+)

diff --git a/accel/tcg/tb-stats.c b/accel/tcg/tb-stats.c
index dddb9d4537..f28fd7b434 100644
--- a/accel/tcg/tb-stats.c
+++ b/accel/tcg/tb-stats.c
@@ -9,6 +9,9 @@
 
 #include "exec/tb-stats.h"
 
+/* only accessed in safe work */
+static GList *last_search;
+
 uint64_t dev_time;
 
 struct jit_profile_info {
@@ -140,6 +143,96 @@ void dump_jit_profile_info(TCGProfile *s)
 g_free(jpi);
 }
 
+static void free_tbstats(void *p, uint32_t hash, void *userp)
+{
+g_free(p);
+}
+
+static void clean_tbstats(void)
+{
+/* remove all tb_stats */
+qht_iter(&tb_ctx.tb_stats, free_tbstats, NULL);
+qht_destroy(&tb_ctx.tb_stats);
+}
+
+void do_hmp_tbstats_safe(CPUState *cpu, run_on_cpu_data icmd)
+{
+struct TbstatsCommand *cmdinfo = icmd.host_ptr;
+int cmd = cmdinfo->cmd;
+uint32_t level = cmdinfo->level;
+
+switch (cmd) {
+case START:
+if (tb_stats_collection_paused()) {
+set_tbstats_flags(level);
+} else {
+if (tb_stats_collection_enabled()) {
+qemu_printf("TB information already being recorded");
+return;
+}
+qht_init(&tb_ctx.tb_stats, tb_stats_cmp, CODE_GEN_HTABLE_SIZE,
+QHT_MODE_AUTO_RESIZE);
+}
+
+set_default_tbstats_flag(level);
+enable_collect_tb_stats();
+tb_flush(cpu);
+break;
+case PAUSE:
+if (!tb_stats_collection_enabled()) {
+qemu_printf("TB information not being recorded");
+return;
+}
+
+/* Continue to create TBStatistic structures but stop collecting statistics */
+pause_collect_tb_stats();
+set_default_tbstats_flag(TB_NOTHING);
+set_tbstats_flags(TB_PAUSED);
+tb_flush(cpu);
+break;
+case STOP:
+if (!tb_stats_collection_enabled()) {
+qemu_printf("TB information not being recorded");
+return;
+}
+
+/* Deallocate all TBStatistics structures and stop creating new ones */
+disable_collect_tb_stats();
+clean_tbstats();
+tb_flush(cpu);
+break;
+case FILTER:
+if (!tb_stats_collection_enabled()) {
+qemu_printf("TB information not being recorded");
+return;
+}
+if (!last_search) {
+qemu_printf("no search on record! execute info tbs before filtering!");
+return;
+}
+
+set_default_tbstats_flag(TB_NOTHING);
+
+/* Set all tbstats as paused, then return only the ones from last_search */
+pause_collect_tb_stats();
+set_tbstats_flags(TB_PAUSED);
+
+for (GList *iter = last_search; iter; iter = g_list_next(iter)) {
+TBStatistics *tbs = iter->data;
+tbs->stats_enabled = level;
+}
+
+tb_flush(cpu);
+
+break;
+default: /* INVALID */
+g_assert_not_reached();
+break;
+}
+
+g_free(cmdinfo);
+}
+
 
 void init_tb_stats_htable_if_not(void)
 {
@@ -175,6 +268,24 @@ bool tb_stats_collection_paused(void)
 return tcg_collect_tb_stats == TB_STATS_PAUSED;
 }
 
+static void reset_tbstats_flag(void *p, uint32_t hash, void *userp)
+{
+uint32_t flag = *((int *)userp);
+TBStatistics *tbs = p;
+tbs->stats_enabled = flag;
+}
+
+void set_default_tbstats_flag(uint32_t flag)
+{
+default_tbstats_flag = flag;
+}
+
+void set_tbstats_flags(uint32_t flag)
+{
+/* iterate over tbstats, setting each one's flag to the given value */
+qht_iter(&tb_ctx.tb_stats, reset_tbstats_flag, &flag);
+}
+
 uint32_t get_default_tbstats_flag(void)
 {
 return default_tbstats_flag;
diff --git a/hmp-commands.hx b/hmp-commands.hx
index bfa5681dd2..419898751e 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1885,6 +1885,23 @@ STEXI
 @findex qemu-io
 Executes a qemu-io command on the given block device.
 
+ETEXI
+#if defined(CONFIG_TCG)
+{
+.name   = "tb_stats",
+.args_type  = "command:s,level:s?",
+.params = "command [stats_level]",
+.help   = "Control tb statistics collection:"
+"tb_stats (start|pause|stop|filter) [all|jit_stats|exec_stats]",
+.c

[Qemu-devel] [RFC PATCH v3 29/46] target/i386: introduce H*, V*, U*, W* (SSE/AVX) operands

2019-08-14 Thread Jan Bobek
These address the SSE/AVX-technology register file. The offset of the
entire corresponding register is passed as the operand value,
regardless of operand-size suffix.

Signed-off-by: Jan Bobek 
---
 target/i386/translate.c | 117 
 1 file changed, 117 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index bd3c7f9356..69233fd0f8 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4930,6 +4930,7 @@ DEF_INSNOP_ALIAS(Mb, M)
 DEF_INSNOP_ALIAS(Mw, M)
 DEF_INSNOP_ALIAS(Md, M)
 DEF_INSNOP_ALIAS(Mq, M)
+DEF_INSNOP_ALIAS(Mhq, M)
 DEF_INSNOP_ALIAS(Mdq, M)
 DEF_INSNOP_ALIAS(Mqq, M)
 
@@ -5090,6 +5091,122 @@ INSNOP_LDST(mm_t0, Mq)
 }
 }
 
+/*
+ * SSE/AVX-technology registers
+ */
+#define DEF_INSNOP_XMM(opT, opTxmmid)   \
+typedef unsigned int insnop_arg_t(opT); \
+typedef struct {\
+insnop_ctxt_t(opTxmmid) xmmid;  \
+} insnop_ctxt_t(opT);   \
+\
+INSNOP_INIT(opT)\
+{   \
+return insnop_init(opTxmmid)(&ctxt->xmmid, env, s, modrm, is_write); \
+}   \
+INSNOP_PREPARE(opT) \
+{   \
+const insnop_arg_t(opTxmmid) xmmid =\
+insnop_prepare(opTxmmid)(&ctxt->xmmid, env, s, modrm, is_write); \
+const insnop_arg_t(opT) arg =   \
+offsetof(CPUX86State, xmm_regs[xmmid]); \
+insnop_finalize(opTxmmid)(&ctxt->xmmid, env, s, \
+  modrm, is_write, xmmid);  \
+return arg; \
+}   \
+INSNOP_FINALIZE(opT)\
+{   \
+}
+
+typedef unsigned int insnop_arg_t(xmm_t0);
+typedef struct {} insnop_ctxt_t(xmm_t0);
+
+INSNOP_INIT(xmm_t0)
+{
+return 0;
+}
+INSNOP_PREPARE(xmm_t0)
+{
+return offsetof(CPUX86State, xmm_t0);
+}
+INSNOP_FINALIZE(xmm_t0)
+{
+}
+
+DEF_INSNOP_XMM(V, modrm_reg)
+DEF_INSNOP_ALIAS(Vd, V)
+DEF_INSNOP_ALIAS(Vq, V)
+DEF_INSNOP_ALIAS(Vdq, V)
+DEF_INSNOP_ALIAS(Vqq, V)
+
+DEF_INSNOP_XMM(U, modrm_rm_direct)
+DEF_INSNOP_ALIAS(Ud, U)
+DEF_INSNOP_ALIAS(Uq, U)
+DEF_INSNOP_ALIAS(Udq, U)
+DEF_INSNOP_ALIAS(Uqq, U)
+
+DEF_INSNOP_XMM(H, vex_v)
+DEF_INSNOP_ALIAS(Hd, H)
+DEF_INSNOP_ALIAS(Hq, H)
+DEF_INSNOP_ALIAS(Hdq, H)
+DEF_INSNOP_ALIAS(Hqq, H)
+
+DEF_INSNOP_LDST(MUd, xmm_t0, Md)
+DEF_INSNOP_LDST(MUq, xmm_t0, Mq)
+DEF_INSNOP_LDST(MWdq, xmm_t0, Mdq)
+DEF_INSNOP_LDST(MUdqMhq, xmm_t0, Mhq)
+DEF_INSNOP_EITHER(Wd, Ud, MUd)
+DEF_INSNOP_EITHER(Wq, Uq, MUq)
+DEF_INSNOP_EITHER(Wdq, Udq, MWdq)
+DEF_INSNOP_EITHER(UdqMq, Udq, MUq)
+DEF_INSNOP_EITHER(UdqMhq, Udq, MUdqMhq)
+
+INSNOP_LDST(xmm_t0, Md)
+{
+const insnop_arg_t(xmm_t0) ofs =
+offsetof(ZMMReg, ZMM_L(0));
+
+assert(ptr == s->A0);
+if (is_write) {
+gen_std_env_A0(s, arg + ofs);
+} else {
+gen_ldd_env_A0(s, arg + ofs);
+}
+}
+INSNOP_LDST(xmm_t0, Mq)
+{
+const insnop_arg_t(xmm_t0) ofs =
+offsetof(ZMMReg, ZMM_Q(0));
+
+assert(ptr == s->A0);
+if (is_write) {
+gen_stq_env_A0(s, arg + ofs);
+} else {
+gen_ldq_env_A0(s, arg + ofs);
+}
+}
+INSNOP_LDST(xmm_t0, Mdq)
+{
+assert(ptr == s->A0);
+if (is_write) {
+gen_sto_env_A0(s, arg);
+} else {
+gen_ldo_env_A0(s, arg);
+}
+}
+INSNOP_LDST(xmm_t0, Mhq)
+{
+const insnop_arg_t(xmm_t0) ofs =
+offsetof(ZMMReg, ZMM_Q(1));
+
+assert(ptr == s->A0);
+if (is_write) {
+gen_stq_env_A0(s, arg + ofs);
+} else {
+gen_ldq_env_A0(s, arg + ofs);
+}
+}
+
 static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
 {
 enum {
-- 
2.20.1




[Qemu-devel] [RFC PATCH v3 27/46] target/i386: introduce G*, R*, E* (general register) operands

2019-08-14 Thread Jan Bobek
These address the general-purpose register file. The corresponding
32-bit or 64-bit register is passed as the operand value.

Signed-off-by: Jan Bobek 
---
 target/i386/translate.c | 78 +
 1 file changed, 78 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 2374876b38..779b692942 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4933,6 +4933,84 @@ DEF_INSNOP_ALIAS(Mq, M)
 DEF_INSNOP_ALIAS(Mdq, M)
 DEF_INSNOP_ALIAS(Mqq, M)
 
+/*
+ * 32-bit general register operands
+ */
+DEF_INSNOP_LDST(Gd, tcg_temp_i32, modrm_reg)
+DEF_INSNOP_LDST(Rd, tcg_temp_i32, modrm_rm_direct)
+
+INSNOP_LDST(tcg_temp_i32, modrm_reg)
+{
+assert(0 <= ptr && ptr < CPU_NB_REGS);
+if (is_write) {
+tcg_gen_extu_i32_tl(cpu_regs[ptr], arg);
+} else {
+tcg_gen_trunc_tl_i32(arg, cpu_regs[ptr]);
+}
+}
+INSNOP_LDST(tcg_temp_i32, modrm_rm_direct)
+{
+insnop_ldst(tcg_temp_i32, modrm_reg)(env, s, modrm, is_write, arg, ptr);
+}
+
+DEF_INSNOP_LDST(MEd, tcg_temp_i32, Md)
+DEF_INSNOP_EITHER(Ed, Rd, MEd)
+DEF_INSNOP_LDST(MRdMw, tcg_temp_i32, Mw)
+DEF_INSNOP_EITHER(RdMw, Rd, MRdMw)
+
+INSNOP_LDST(tcg_temp_i32, Md)
+{
+if (is_write) {
+tcg_gen_qemu_st_i32(arg, ptr, s->mem_index, MO_LEUL);
+} else {
+tcg_gen_qemu_ld_i32(arg, ptr, s->mem_index, MO_LEUL);
+}
+}
+INSNOP_LDST(tcg_temp_i32, Mw)
+{
+if (is_write) {
+tcg_gen_qemu_st_i32(arg, ptr, s->mem_index, MO_LEUW);
+} else {
+tcg_gen_qemu_ld_i32(arg, ptr, s->mem_index, MO_LEUW);
+}
+}
+
+/*
+ * 64-bit general register operands
+ */
+DEF_INSNOP_LDST(Gq, tcg_temp_i64, modrm_reg)
+DEF_INSNOP_LDST(Rq, tcg_temp_i64, modrm_rm_direct)
+
+INSNOP_LDST(tcg_temp_i64, modrm_reg)
+{
+#ifdef TARGET_X86_64
+assert(0 <= ptr && ptr < CPU_NB_REGS);
+if (is_write) {
+tcg_gen_mov_i64(cpu_regs[ptr], arg);
+} else {
+tcg_gen_mov_i64(arg, cpu_regs[ptr]);
+}
+#else /* !TARGET_X86_64 */
+g_assert_not_reached();
+#endif /* !TARGET_X86_64 */
+}
+INSNOP_LDST(tcg_temp_i64, modrm_rm_direct)
+{
+insnop_ldst(tcg_temp_i64, modrm_reg)(env, s, modrm, is_write, arg, ptr);
+}
+
+DEF_INSNOP_LDST(MEq, tcg_temp_i64, Mq)
+DEF_INSNOP_EITHER(Eq, Rq, MEq)
+
+INSNOP_LDST(tcg_temp_i64, Mq)
+{
+if (is_write) {
+tcg_gen_qemu_st_i64(arg, ptr, s->mem_index, MO_LEQ);
+} else {
+tcg_gen_qemu_ld_i64(arg, ptr, s->mem_index, MO_LEQ);
+}
+}
+
 static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
 {
 enum {
-- 
2.20.1




[Qemu-devel] [RFC PATCH v3 26/46] target/i386: introduce M* (memptr) operands

2019-08-14 Thread Jan Bobek
The memory-pointer operand decodes the indirect form of the ModR/M byte,
loads the effective address into a register and passes that register
as the operand.

Signed-off-by: Jan Bobek 
---
 target/i386/translate.c | 36 
 1 file changed, 36 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 78e8f7a212..2374876b38 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4897,6 +4897,42 @@ INSNOP_FINALIZE(Ib)
 {
 }
 
+/*
+ * Memory-pointer operand
+ */
+typedef TCGv insnop_arg_t(M);
+typedef struct {} insnop_ctxt_t(M);
+
+INSNOP_INIT(M)
+{
+int ret;
+insnop_ctxt_t(modrm_mod) modctxt;
+
+ret = insnop_init(modrm_mod)(&modctxt, env, s, modrm, is_write);
+if (!ret) {
+const int mod =
+insnop_prepare(modrm_mod)(&modctxt, env, s, modrm, is_write);
+ret = !(mod != 3);
+insnop_finalize(modrm_mod)(&modctxt, env, s, modrm, is_write, mod);
+}
+return ret;
+}
+INSNOP_PREPARE(M)
+{
+gen_lea_modrm(env, s, modrm);
+return s->A0;
+}
+INSNOP_FINALIZE(M)
+{
+}
+
+DEF_INSNOP_ALIAS(Mb, M)
+DEF_INSNOP_ALIAS(Mw, M)
+DEF_INSNOP_ALIAS(Md, M)
+DEF_INSNOP_ALIAS(Mq, M)
+DEF_INSNOP_ALIAS(Mdq, M)
+DEF_INSNOP_ALIAS(Mqq, M)
+
 static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
 {
 enum {
-- 
2.20.1




[Qemu-devel] [PATCH v5 05/10] accel: adding TB_JIT_TIME and full replacing CONFIG_PROFILER

2019-08-14 Thread vandersonmr
Replace all other CONFIG_PROFILER statistics and migrate them to the
TBStatistics system. However, TCGProfiler still exists and can
be used to store global statistics and times. All TB-related
statistics go to TBStatistics.

Signed-off-by: Vanderson M. do Rosario 
---
 accel/tcg/tb-stats.c  |  95 -
 accel/tcg/translate-all.c |  47 ---
 configure |   3 -
 cpus.c|  14 ++---
 include/exec/tb-stats.h   |  21 ++-
 include/qemu/timer.h  |   5 +-
 monitor/misc.c|  28 ++---
 tcg/tcg.c | 124 +++---
 tcg/tcg.h |  10 +--
 vl.c  |   8 +--
 10 files changed, 161 insertions(+), 194 deletions(-)

diff --git a/accel/tcg/tb-stats.c b/accel/tcg/tb-stats.c
index 9b720d9b86..2bb1fde837 100644
--- a/accel/tcg/tb-stats.c
+++ b/accel/tcg/tb-stats.c
@@ -8,6 +8,8 @@
 
 #include "exec/tb-stats.h"
 
+uint64_t dev_time;
+
 struct jit_profile_info {
 uint64_t translations;
 uint64_t aborted;
@@ -19,6 +21,13 @@ struct jit_profile_info {
 uint64_t host;
 uint64_t guest;
 uint64_t search_data;
+
+uint64_t interm_time;
+uint64_t code_time;
+uint64_t restore_count;
+uint64_t restore_time;
+uint64_t opt_time;
+uint64_t la_time;
 };
 
 /* accumulate the statistics from all TBs */
@@ -40,6 +49,29 @@ static void collect_jit_profile_info(void *p, uint32_t hash, void *userp)
 jpi->host += tbs->code.out_len;
 jpi->guest += tbs->code.in_len;
 jpi->search_data += tbs->code.search_out_len;
+
+jpi->interm_time += stat_per_translation(tbs, time.interm);
+jpi->code_time += stat_per_translation(tbs, time.code);
+jpi->opt_time += stat_per_translation(tbs, time.opt);
+jpi->la_time += stat_per_translation(tbs, time.la);
+jpi->restore_time += tbs->time.restore;
+jpi->restore_count += tbs->time.restore_count;
+}
+
+void dump_jit_exec_time_info(uint64_t dev_time)
+{
+static uint64_t last_cpu_exec_time;
+uint64_t cpu_exec_time;
+uint64_t delta;
+
+cpu_exec_time = tcg_cpu_exec_time();
+delta = cpu_exec_time - last_cpu_exec_time;
+
+qemu_printf("async time  %" PRId64 " (%0.3f)\n",
+   dev_time, dev_time / (double) NANOSECONDS_PER_SECOND);
+qemu_printf("qemu time   %" PRId64 " (%0.3f)\n",
+   delta, delta / (double) NANOSECONDS_PER_SECOND);
+last_cpu_exec_time = cpu_exec_time;
 }
 
 /* dump JIT statisticis using TCGProfile and TBStats */
@@ -66,36 +98,45 @@ void dump_jit_profile_info(TCGProfile *s)
 qemu_printf("avg search data/TB  %0.1f\n",
 jpi->search_data / (double) jpi->translations);
 
+uint64_t tot = jpi->interm_time + jpi->code_time;
+
+qemu_printf("JIT cycles  %" PRId64 " (%0.3fs at 2.4 GHz)\n",
+tot, tot / 2.4e9);
+qemu_printf("cycles/op   %0.1f\n",
+jpi->ops ? (double)tot / jpi->ops : 0);
+qemu_printf("cycles/in byte  %0.1f\n",
+jpi->guest ? (double)tot / jpi->guest : 0);
+qemu_printf("cycles/out byte %0.1f\n",
+jpi->host ? (double)tot / jpi->host : 0);
+qemu_printf("cycles/search byte %0.1f\n",
+jpi->search_data ? (double)tot / jpi->search_data : 0);
+if (tot == 0) {
+tot = 1;
+}
+
+qemu_printf("  gen_interm time   %0.1f%%\n",
+(double)jpi->interm_time / tot * 100.0);
+qemu_printf("  gen_code time %0.1f%%\n",
+(double)jpi->code_time / tot * 100.0);
+
+qemu_printf("optim./code time%0.1f%%\n",
+(double)jpi->opt_time / (jpi->code_time ? jpi->code_time : 1)
+* 100.0);
+qemu_printf("liveness/code time  %0.1f%%\n",
+(double)jpi->la_time / (jpi->code_time ? jpi->code_time : 1) * 100.0);
+
+qemu_printf("cpu_restore count   %" PRId64 "\n",
+jpi->restore_count);
+qemu_printf("  avg cycles%0.1f\n",
+jpi->restore_count ? (double)jpi->restore_time / jpi->restore_count : 0);
+
 if (s) {
-int64_t tot = s->interm_time + s->code_time;
-qemu_printf("JIT cycles  %" PRId64 " (%0.3f s at 2.4 GHz)\n",
-tot, tot / 2.4e9);
-qemu_printf("cycles/op   %0.1f\n",
-jpi->ops ? (double)tot / jpi->ops : 0);
-qemu_printf("cycles/in byte  %0.1f\n",
-jpi->guest ? (double)tot / jpi->guest : 0);
-qemu_printf("cycles/out byte %0.1f\n",
-jpi->host ? (double)tot / jpi->host : 0);
-qemu_printf("cycles/search byte %0.1f\n",
-jpi->search_data ? (double)tot / jpi->search_data : 0);
-if (tot == 0) {
-tot = 1;
-}
-qe

[Qemu-devel] [RFC PATCH v3 30/46] target/i386: introduce code generators

2019-08-14 Thread Jan Bobek
In this context, "code generators" are functions that receive decoded
instruction operands and emit TCG ops implementing the correct
instruction functionality. Introduce the naming macros first; the actual
generator macros will be added later.

Signed-off-by: Jan Bobek 
---
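A note for readers, not part of the patch: the naming scheme relies on token pasting, so a mnemonic plus its operand types map to exactly one function name, and an argc-indexed dispatcher picks the right naming macro. The sketch below reproduces that with hypothetical demo_ names and int operands; demo_glue stands in for QEMU's glue(), and the zero-operand case is left out so that plain __VA_ARGS__ is enough.

```c
#include <assert.h>

/* Two-level paste so that argc is expanded before concatenation. */
#define demo_xglue(a, b) a ## b
#define demo_glue(a, b) demo_xglue(a, b)

/* Naming macros: mnemonic + operand types -> unique function name. */
#define demo_gen_insn1(mnem, opT1) \
    demo_gen_ ## mnem ## _1 ## opT1
#define demo_gen_insn2(mnem, opT1, opT2) \
    demo_gen_ ## mnem ## _2 ## opT1 ## opT2

/* Dispatcher: pick the naming macro that matches the operand count. */
#define demo_gen_insn(mnem, argc, ...) \
    demo_glue(demo_gen_insn, argc)(mnem, __VA_ARGS__)

/* Definition macros: emit the function header under the generated name. */
#define DEMO_GEN_INSN1(mnem, opT1) \
    static int demo_gen_insn1(mnem, opT1)(int arg1)
#define DEMO_GEN_INSN2(mnem, opT1, opT2) \
    static int demo_gen_insn2(mnem, opT1, opT2)(int arg1, int arg2)

/* Defines demo_gen_neg_1Gd(). */
DEMO_GEN_INSN1(neg, Gd)
{
    return -arg1;
}

/* Defines demo_gen_add_2GdEd(). */
DEMO_GEN_INSN2(add, Gd, Ed)
{
    return arg1 + arg2;
}
```

Here demo_gen_insn(add, 2, Gd, Ed) and demo_gen_add_2GdEd name the same function, so a table of instructions can refer to generators without spelling the mangled names by hand.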
 target/i386/translate.c | 38 ++
 1 file changed, 38 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 69233fd0f8..b5f609e147 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -5207,6 +5207,44 @@ INSNOP_LDST(xmm_t0, Mhq)
 }
 }
 
+/*
+ * Code generators
+ */
+#define gen_insn(mnem, argc, ...)   \
+glue(gen_insn, argc)(mnem, ## __VA_ARGS__)
+#define gen_insn0(mnem) \
+gen_ ## mnem ## _0
+#define gen_insn1(mnem, opT1)   \
+gen_ ## mnem ## _1 ## opT1
+#define gen_insn2(mnem, opT1, opT2) \
+gen_ ## mnem ## _2 ## opT1 ## opT2
+#define gen_insn3(mnem, opT1, opT2, opT3)   \
+gen_ ## mnem ## _3 ## opT1 ## opT2 ## opT3
+#define gen_insn4(mnem, opT1, opT2, opT3, opT4) \
+gen_ ## mnem ## _4 ## opT1 ## opT2 ## opT3 ## opT4
+
+#define GEN_INSN0(mnem) \
+static void gen_insn0(mnem)(\
+CPUX86State *env, DisasContext *s)
+#define GEN_INSN1(mnem, opT1)   \
+static void gen_insn1(mnem, opT1)(  \
+CPUX86State *env, DisasContext *s,  \
+insnop_arg_t(opT1) arg1)
+#define GEN_INSN2(mnem, opT1, opT2) \
+static void gen_insn2(mnem, opT1, opT2)(\
+CPUX86State *env, DisasContext *s,  \
+insnop_arg_t(opT1) arg1, insnop_arg_t(opT2) arg2)
+#define GEN_INSN3(mnem, opT1, opT2, opT3)   \
+static void gen_insn3(mnem, opT1, opT2, opT3)(  \
+CPUX86State *env, DisasContext *s,  \
+insnop_arg_t(opT1) arg1, insnop_arg_t(opT2) arg2,   \
+insnop_arg_t(opT3) arg3)
+#define GEN_INSN4(mnem, opT1, opT2, opT3, opT4) \
+static void gen_insn4(mnem, opT1, opT2, opT3, opT4)(\
+CPUX86State *env, DisasContext *s,  \
+insnop_arg_t(opT1) arg1, insnop_arg_t(opT2) arg2,   \
+insnop_arg_t(opT3) arg3, insnop_arg_t(opT4) arg4)
+
 static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
 {
 enum {
-- 
2.20.1




[Qemu-devel] [RFC PATCH v3 25/46] target/i386: introduce Ib (immediate) operand

2019-08-14 Thread Jan Bobek
Introduce the immediate-byte operand, which loads a byte from the
instruction stream and passes its value as the operand.

Signed-off-by: Jan Bobek 
---
 target/i386/translate.c | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 4562a097fa..78e8f7a212 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4879,6 +4879,24 @@ INSNOP_FINALIZE(vex_v)
 {
 }
 
+/*
+ * Immediate operand
+ */
+typedef uint8_t insnop_arg_t(Ib);
+typedef struct {} insnop_ctxt_t(Ib);
+
+INSNOP_INIT(Ib)
+{
+return 0;
+}
+INSNOP_PREPARE(Ib)
+{
+return x86_ldub_code(env, s);
+}
+INSNOP_FINALIZE(Ib)
+{
+}
+
 static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
 {
 enum {
-- 
2.20.1




[Qemu-devel] [RFC PATCH v3 19/46] target/i386: introduce generic load-store operand

2019-08-14 Thread Jan Bobek
This operand attempts to capture the "indirect" or "memory" operand in
a generic way. It significantly reduces the amount of code that needs to
be written in order to read operands from memory to temporary storage
and write them back.

Signed-off-by: Jan Bobek 
---
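A note for readers, not part of the patch: the load-store operand loads into temporary storage during prepare when the operand is read, and stores back during finalize when it is written. The sketch below mimics that protocol with a small array in place of guest memory and plain integers in place of TCG temporaries; every demo_ name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static uint32_t demo_mem[4]; /* stands in for guest memory */

/* Counterpart of an INSNOP_LDST body: one routine for both directions. */
static void demo_ldst(bool is_write, uint32_t *arg, unsigned ptr)
{
    if (is_write) {
        demo_mem[ptr] = *arg;
    } else {
        *arg = demo_mem[ptr];
    }
}

/* Prepare: read operands are loaded into the temporary. */
static uint32_t *demo_prepare(uint32_t *tmp, bool is_write, unsigned ptr)
{
    if (!is_write) {
        demo_ldst(false, tmp, ptr);
    }
    return tmp;
}

/* Finalize: written operands are stored back from the temporary. */
static void demo_finalize(uint32_t *tmp, bool is_write, unsigned ptr)
{
    if (is_write) {
        demo_ldst(true, tmp, ptr);
    }
}

/* Hypothetical instruction: mem[dst] = mem[src] + 1. */
static void demo_insn_inc_copy(unsigned dst, unsigned src)
{
    uint32_t tmp_dst = 0, tmp_src = 0;
    uint32_t *arg1 = demo_prepare(&tmp_dst, true, dst);
    uint32_t *arg2 = demo_prepare(&tmp_src, false, src);

    *arg1 = *arg2 + 1; /* the "code generator" works on temporaries only */

    demo_finalize(arg1, true, dst);
    demo_finalize(arg2, false, src);
}
```

The generator never touches memory itself, which is why one generator can serve both the register and the memory form of an operand.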
 target/i386/translate.c | 54 +
 1 file changed, 54 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index a0b883c680..99f46be34e 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4642,6 +4642,60 @@ static int ck_cpuid(CPUX86State *env, DisasContext *s, CkCpuidFeat feat)
  env, s, modrm, is_write, arg));\
 }
 
+/*
+ * Generic load-store operand
+ */
+#define insnop_ldst(opTarg, opTptr) \
+insnop_ldst_ ## opTarg ## opTptr
+
+#define INSNOP_LDST(opTarg, opTptr) \
+static void insnop_ldst(opTarg, opTptr)(CPUX86State *env,   \
+DisasContext *s,\
+int modrm, bool is_write,   \
+insnop_arg_t(opTarg) arg,   \
+insnop_arg_t(opTptr) ptr)
+
+#define DEF_INSNOP_LDST(opT, opTarg, opTptr)\
+typedef insnop_arg_t(opTarg) insnop_arg_t(opT); \
+typedef struct {\
+insnop_ctxt_t(opTarg) arg;  \
+insnop_ctxt_t(opTptr) ptr;  \
+} insnop_ctxt_t(opT);   \
+\
+/* forward declaration */   \
+INSNOP_LDST(opTarg, opTptr);\
+\
+INSNOP_INIT(opT)\
+{   \
+int ret = insnop_init(opTarg)(&ctxt->arg, env, s, modrm, is_write); \
+if (!ret) { \
+ret = insnop_init(opTptr)(&ctxt->ptr, env, s, modrm, is_write); \
+}   \
+return ret; \
+}   \
+INSNOP_PREPARE(opT) \
+{   \
+const insnop_arg_t(opTarg) arg =\
+insnop_prepare(opTarg)(&ctxt->arg, env, s, modrm, is_write); \
+if (!is_write) {\
+const insnop_arg_t(opTptr) ptr =\
+insnop_prepare(opTptr)(&ctxt->ptr, env, s, modrm, is_write); \
+insnop_ldst(opTarg, opTptr)(env, s, modrm, is_write, arg, ptr); \
+insnop_finalize(opTptr)(&ctxt->ptr, env, s, modrm, is_write, ptr); \
+}   \
+return arg; \
+}   \
+INSNOP_FINALIZE(opT)\
+{   \
+if (is_write) { \
+const insnop_arg_t(opTptr) ptr =\
+insnop_prepare(opTptr)(&ctxt->ptr, env, s, modrm, is_write); \
+insnop_ldst(opTarg, opTptr)(env, s, modrm, is_write, arg, ptr); \
+insnop_finalize(opTptr)(&ctxt->ptr, env, s, modrm, is_write, ptr); \
+}   \
+insnop_finalize(opTarg)(&ctxt->arg, env, s, modrm, is_write, arg); \
+}
+
 static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
 {
 enum {
-- 
2.20.1




[Qemu-devel] [RFC PATCH v3 24/46] target/i386: introduce operand vex_v

2019-08-14 Thread Jan Bobek
This operand yields the value of the VEX.vvvv field.

Signed-off-by: Jan Bobek 
---
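A note for readers, not part of the patch: the operand simply returns s->vex_v, which the decoder has already extracted. For background, the VEX prefix carries the four-bit vvvv register specifier in bits 6:3 of its last payload byte, stored inverted (one's complement). The sketch below shows that extraction; demo_vex_vvvv is a hypothetical name, not a QEMU function.

```c
#include <assert.h>
#include <stdint.h>

/* Extract the vvvv field (bits 6:3, stored inverted) from a VEX payload byte. */
static unsigned demo_vex_vvvv(uint8_t vex_byte)
{
    return (~(unsigned)vex_byte >> 3) & 0xfu;
}
```

A byte with all four vvvv bits set therefore decodes to register 0, and one with all four clear decodes to register 15.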
 target/i386/translate.c | 20 
 1 file changed, 20 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index c918065b96..4562a097fa 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4859,6 +4859,26 @@ INSNOP_FINALIZE(modrm_rm_direct)
 insnop_finalize(modrm_rm)(&ctxt->rm, env, s, modrm, is_write, arg);
 }
 
+/*
+ * vex_v
+ *
+ * Operand whose value is the vvvv field of the VEX prefix.
+ */
+typedef int insnop_arg_t(vex_v);
+typedef struct {} insnop_ctxt_t(vex_v);
+
+INSNOP_INIT(vex_v)
+{
+return !(s->prefix & PREFIX_VEX);
+}
+INSNOP_PREPARE(vex_v)
+{
+return s->vex_v;
+}
+INSNOP_FINALIZE(vex_v)
+{
+}
+
 static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
 {
 enum {
-- 
2.20.1




[Qemu-devel] [PATCH v5 03/10] accel: collecting JIT statistics

2019-08-14 Thread vandersonmr
If a TB has a TBS (TBStatistics) with TB_JIT_STATS
enabled, then we collect statistics on its translation
process and the translated code.

Collecting the number of host instructions does not seem
simple, as it would require modifying several
target source files. So, for now, we only collect
the size of the generated host code.

Signed-off-by: Vanderson M. do Rosario 
---
 accel/tcg/translate-all.c | 14 ++
 accel/tcg/translator.c|  4 
 include/exec/tb-stats.h   | 15 +++
 tcg/tcg.c | 23 +++
 tcg/tcg.h |  2 ++
 5 files changed, 58 insertions(+)

diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index df08d183df..85c6b7b409 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -1696,6 +1696,7 @@ static TBStatistics *tb_get_stats(tb_page_addr_t phys_pc, target_ulong pc,
 new_stats->cs_base = cs_base;
 new_stats->flags = flags;
 new_stats->tb = current_tb;
+new_stats->translations.total = 1;
 
 qht_insert(&tb_ctx.tb_stats, new_stats, hash, &existing_stats);
 
@@ -1705,6 +1706,7 @@ static TBStatistics *tb_get_stats(tb_page_addr_t phys_pc, target_ulong pc,
  * then just make the new TB point to the older TBStatistic
  */
 g_free(new_stats);
+((TBStatistics *) existing_stats)->tb = current_tb;
 return existing_stats;
 } else {
 return new_stats;
@@ -1792,6 +1794,11 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
 tb->tb_stats->stats_enabled |= TB_EXEC_STATS;
 }
 }
+
+if (flag & TB_JIT_STATS) {
+tb->tb_stats->stats_enabled |= TB_JIT_STATS;
+atomic_inc(&tb->tb_stats->translations.total);
+}
 } else {
 tb->tb_stats = NULL;
 }
@@ -1869,6 +1876,10 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
 atomic_set(&prof->search_out_len, prof->search_out_len + search_size);
 #endif
 
+if (tb_stats_enabled(tb, TB_JIT_STATS)) {
+atomic_add(&tb->tb_stats->code.out_len, gen_code_size);
+}
+
 #ifdef DEBUG_DISAS
 if (qemu_loglevel_mask(CPU_LOG_TB_OUT_ASM) &&
 qemu_log_in_addr_range(tb->pc)) {
@@ -1926,6 +1937,9 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
 phys_page2 = -1;
 if ((pc & TARGET_PAGE_MASK) != virt_page2) {
 phys_page2 = get_page_addr_code(env, virt_page2);
+if (tb_stats_enabled(tb, TB_JIT_STATS)) {
+atomic_inc(&tb->tb_stats->translations.spanning);
+}
 }
 /*
  * No explicit memory barrier is required -- tb_link_page() makes the
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index 396a11e828..834265d5be 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -117,6 +117,10 @@ void translator_loop(const TranslatorOps *ops, DisasContextBase *db,
 db->tb->size = db->pc_next - db->pc_first;
 db->tb->icount = db->num_insns;
 
+if (tb_stats_enabled(tb, TB_JIT_STATS)) {
+atomic_add(&db->tb->tb_stats->code.num_guest_inst, db->num_insns);
+}
+
 #ifdef DEBUG_DISAS
 if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)
 && qemu_log_in_addr_range(db->pc_first)) {
diff --git a/include/exec/tb-stats.h b/include/exec/tb-stats.h
index 0265050b79..3c219123c2 100644
--- a/include/exec/tb-stats.h
+++ b/include/exec/tb-stats.h
@@ -34,6 +34,20 @@ struct TBStatistics {
 unsigned long atomic;
 } executions;
 
+struct {
+unsigned num_guest_inst;
+unsigned num_tcg_ops;
+unsigned num_tcg_ops_opt;
+unsigned spills;
+unsigned out_len;
+} code;
+
+struct {
+unsigned long total;
+unsigned long uncached;
+unsigned long spanning;
+} translations;
+
 /* current TB linked to this TBStatistics */
 TranslationBlock *tb;
 };
@@ -47,6 +61,7 @@ enum TBStatsStatus { TB_STATS_RUNNING, TB_STATS_PAUSED, TB_STATS_STOPPED };
 
 #define TB_NOTHING0
 #define TB_EXEC_STATS 1
+#define TB_JIT_STATS  (1 << 2)
 
 extern int tcg_collect_tb_stats;
 extern uint32_t default_tbstats_flag;
diff --git a/tcg/tcg.c b/tcg/tcg.c
index be2c33c400..446e3d1708 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -3126,6 +3126,11 @@ static void temp_sync(TCGContext *s, TCGTemp *ts, TCGRegSet allocated_regs,
 case TEMP_VAL_REG:
 tcg_out_st(s, ts->type, ts->reg,
ts->mem_base->reg, ts->mem_offset);
+
+/* Count number of spills */
+if (tb_stats_enabled(s->current_tb, TB_JIT_STATS)) {
+atomic_inc(&s->current_tb->tb_stats->code.spills);
+}
 break;
 
 case TEMP_VAL_MEM:
@@ -3997,6 +4002,8 @@ int tcg_gen_code(TCGContext *s, TranslationBlock *tb)
 int i, num_insns;
 TCGOp *op;
 
+s->current_tb = tb;
+
 #ifdef CONFIG_PROFILER
 {
 int n = 0;
@@ -4028,6 +4035,14 @@ int tcg_gen_code(TCGContext *s, TranslationBlock *tb)
 }
 #endif
 
+ 

[Qemu-devel] [RFC PATCH v3 23/46] target/i386: introduce operand for direct-only r/m field

2019-08-14 Thread Jan Bobek
Many operands can only decode successfully if the ModR/M byte has the
direct form (i.e. MOD=3). Capture this common aspect by introducing a
special direct-only r/m field operand.

Signed-off-by: Jan Bobek 
---
 target/i386/translate.c | 37 +
 1 file changed, 37 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index e4515e81df..c918065b96 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4822,6 +4822,43 @@ INSNOP_FINALIZE(modrm_rm)
 {
 }
 
+/*
+ * modrm_rm_direct
+ *
+ * Equivalent of modrm_rm, but only decodes successfully if
+ * the ModR/M byte has the direct form (i.e. MOD=3).
+ */
+typedef insnop_arg_t(modrm_rm) insnop_arg_t(modrm_rm_direct);
+typedef struct {
+insnop_ctxt_t(modrm_rm) rm;
+} insnop_ctxt_t(modrm_rm_direct);
+
+INSNOP_INIT(modrm_rm_direct)
+{
+int ret;
+insnop_ctxt_t(modrm_mod) modctxt;
+
+ret = insnop_init(modrm_mod)(&modctxt, env, s, modrm, 0);
+if (!ret) {
+const int mod = insnop_prepare(modrm_mod)(&modctxt, env, s, modrm, 0);
+if (mod == 3) {
+ret = insnop_init(modrm_rm)(&ctxt->rm, env, s, modrm, is_write);
+} else {
+ret = 1;
+}
+insnop_finalize(modrm_mod)(&modctxt, env, s, modrm, 0, mod);
+}
+return ret;
+}
+INSNOP_PREPARE(modrm_rm_direct)
+{
+return insnop_prepare(modrm_rm)(&ctxt->rm, env, s, modrm, is_write);
+}
+INSNOP_FINALIZE(modrm_rm_direct)
+{
+insnop_finalize(modrm_rm)(&ctxt->rm, env, s, modrm, is_write, arg);
+}
+
 static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
 {
 enum {
-- 
2.20.1
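
For readers less familiar with the encoding: the ModR/M byte packs three
fields, MOD in bits 7..6, REG in bits 5..3 and R/M in bits 2..0, and MOD=3
selects the direct (register) form. Below is a minimal standalone sketch of
the check this operand performs. It uses nothing from QEMU: ModRMFields,
modrm_split and modrm_is_direct are hypothetical names chosen for
illustration, and REX/VEX extension bits are ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* The three bit fields of a ModR/M byte (hypothetical helper type). */
typedef struct {
    unsigned mod;   /* bits 7..6: addressing form */
    unsigned reg;   /* bits 5..3: register or opcode extension */
    unsigned rm;    /* bits 2..0: register or memory operand */
} ModRMFields;

/* Split a raw ModR/M byte into its fields. */
static ModRMFields modrm_split(uint8_t modrm)
{
    ModRMFields f;

    f.mod = (modrm >> 6) & 3;
    f.reg = (modrm >> 3) & 7;
    f.rm = modrm & 7;
    return f;
}

/* True if the byte uses the direct (register-to-register) form. */
static bool modrm_is_direct(uint8_t modrm)
{
    return modrm_split(modrm).mod == 3;
}
```

For instance, modrm_is_direct(0xC1) holds (MOD=3), whereas 0x41 has MOD=1
and would make a direct-only operand fail to decode.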




[Qemu-devel] [RFC PATCH v3 16/46] target/i386: introduce instruction operand infrastructure

2019-08-14 Thread Jan Bobek
insnop_arg_t, insnop_ctxt_t and init, prepare and finalize functions
form the basis of instruction operand decoding. Introduce macros for
defining a generic instruction operand; use cases for operand decoding
will be introduced later.

Signed-off-by: Jan Bobek 
---
 target/i386/translate.c | 28 
 1 file changed, 28 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 0cffa2226b..9d00b36406 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4548,6 +4548,34 @@ static int ck_cpuid(CPUX86State *env, DisasContext *s, CkCpuidFeat feat)
 }
 }
 
+/*
+ * Instruction operand
+ */
+#define insnop_arg_t(opT)insnop_ ## opT ## _arg_t
+#define insnop_ctxt_t(opT)   insnop_ ## opT ## _ctxt_t
+#define insnop_init(opT) insnop_ ## opT ## _init
+#define insnop_prepare(opT)  insnop_ ## opT ## _prepare
+#define insnop_finalize(opT) insnop_ ## opT ## _finalize
+
+#define INSNOP_INIT(opT)\
+static int insnop_init(opT)(insnop_ctxt_t(opT) *ctxt,   \
+CPUX86State *env,   \
+DisasContext *s,\
+int modrm, bool is_write)
+
+#define INSNOP_PREPARE(opT) \
+static insnop_arg_t(opT) insnop_prepare(opT)(insnop_ctxt_t(opT) *ctxt, \
+ CPUX86State *env,  \
+ DisasContext *s,   \
+ int modrm, bool is_write)
+
+#define INSNOP_FINALIZE(opT)\
+static void insnop_finalize(opT)(insnop_ctxt_t(opT) *ctxt,  \
+ CPUX86State *env,  \
+ DisasContext *s,   \
+ int modrm, bool is_write,  \
+ insnop_arg_t(opT) arg)
+
 static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
 {
 enum {
-- 
2.20.1
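
To illustrate how the token-pasting macros are meant to be used, here is a
simplified standalone sketch of the same init/prepare/finalize pattern. It
is not the code from this series: the op_* macro names, the reduced argument
lists (only ctxt and modrm), the operand "reg3" and the driver decode_reg3
are all assumptions made for the example.

```c
#include <assert.h>

/* Each operand type opT gets its own family of names via token pasting. */
#define op_arg_t(opT)    op_ ## opT ## _arg_t
#define op_ctxt_t(opT)   op_ ## opT ## _ctxt_t
#define op_init(opT)     op_ ## opT ## _init
#define op_prepare(opT)  op_ ## opT ## _prepare
#define op_finalize(opT) op_ ## opT ## _finalize

/* Function headers; the body follows the macro invocation. */
#define OP_INIT(opT) \
    static int op_init(opT)(op_ctxt_t(opT) *ctxt, int modrm)
#define OP_PREPARE(opT) \
    static op_arg_t(opT) op_prepare(opT)(op_ctxt_t(opT) *ctxt, int modrm)
#define OP_FINALIZE(opT) \
    static void op_finalize(opT)(op_ctxt_t(opT) *ctxt, int modrm, \
                                 op_arg_t(opT) arg)

/* Hypothetical operand "reg3": the 3-bit REG field of a ModR/M byte. */
typedef int op_arg_t(reg3);
typedef struct { int finalized; } op_ctxt_t(reg3);

OP_INIT(reg3)
{
    (void)modrm;
    ctxt->finalized = 0;
    return 0;                   /* zero means the operand decodes */
}
OP_PREPARE(reg3)
{
    (void)ctxt;
    return (modrm >> 3) & 7;    /* value handed to the code generator */
}
OP_FINALIZE(reg3)
{
    (void)modrm;
    (void)arg;
    ctxt->finalized = 1;        /* cleanup hook, runs after use */
}

/* Drive one operand through the three phases; -1 on decode failure. */
static int decode_reg3(int modrm)
{
    op_ctxt_t(reg3) ctxt;
    op_arg_t(reg3) arg;

    if (op_init(reg3)(&ctxt, modrm) != 0) {
        return -1;
    }
    arg = op_prepare(reg3)(&ctxt, modrm);
    op_finalize(reg3)(&ctxt, modrm, arg);
    assert(ctxt.finalized);
    return arg;
}
```

With this sketch, decode_reg3(0xC8) yields 1, the REG field of that byte.
Keeping init separate from prepare lets a caller test whether an operand
decodes at all before any value is produced.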




[Qemu-devel] [RFC PATCH v3 18/46] target/i386: introduce generic either-or operand

2019-08-14 Thread Jan Bobek
The either-or operand attempts to decode one operand, and if it fails,
it falls back to a second operand. It is unifying, meaning that
insnop_arg_t of the second operand must be implicitly castable to
insnop_arg_t of the first operand.

Signed-off-by: Jan Bobek 
---
 target/i386/translate.c | 46 +
 1 file changed, 46 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 8989e6504c..a0b883c680 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4596,6 +4596,52 @@ static int ck_cpuid(CPUX86State *env, DisasContext *s, CkCpuidFeat feat)
 insnop_finalize(opT2)(ctxt, env, s, modrm, is_write, arg);  \
 }
 
+/*
+ * Generic unifying either-or operand
+ */
+#define DEF_INSNOP_EITHER(opT, opT1, opT2)  \
+typedef insnop_arg_t(opT1) insnop_arg_t(opT);   \
+typedef struct {\
+bool is_ ## opT1;   \
+union { \
+insnop_ctxt_t(opT1) ctxt_ ## opT1;  \
+insnop_ctxt_t(opT2) ctxt_ ## opT2;  \
+};  \
+} insnop_ctxt_t(opT);   \
+\
+INSNOP_INIT(opT)\
+{   \
+int ret = insnop_init(opT1)(&ctxt->ctxt_ ## opT1,   \
+env, s, modrm, is_write);   \
+if (!ret) { \
+ctxt->is_ ## opT1 = 1;  \
+return 0;   \
+}   \
+ret = insnop_init(opT2)(&ctxt->ctxt_ ## opT2,   \
+env, s, modrm, is_write);   \
+if (!ret) { \
+ctxt->is_ ## opT1 = 0;  \
+return 0;   \
+}   \
+return ret; \
+}   \
+INSNOP_PREPARE(opT) \
+{   \
+return (ctxt->is_ ## opT1   \
+? insnop_prepare(opT1)(&ctxt->ctxt_ ## opT1,\
+   env, s, modrm, is_write) \
+: insnop_prepare(opT2)(&ctxt->ctxt_ ## opT2,\
+   env, s, modrm, is_write));   \
+}   \
+INSNOP_FINALIZE(opT)\
+{   \
+(ctxt->is_ ## opT1  \
+ ? insnop_finalize(opT1)(&ctxt->ctxt_ ## opT1,  \
+ env, s, modrm, is_write, arg)  \
+ : insnop_finalize(opT2)(&ctxt->ctxt_ ## opT2,  \
+ env, s, modrm, is_write, arg));\
+}
+
 static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
 {
 enum {
-- 
2.20.1
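
The fallback logic can be sketched outside of QEMU as follows. This is only
an illustration of the idea, written as plain functions rather than macros:
the two component operands (a register form and a memory form), their
context types and the values they return are hypothetical. What it keeps
from the patch is the shape: a flag recording which alternative matched, a
union holding whichever context is live, and a common argument type.

```c
#include <stdbool.h>

/* Hypothetical first operand: decodes only the register form (MOD == 3). */
typedef struct { int rm; } RegCtxt;

static int reg_init(RegCtxt *c, int modrm)
{
    if (((modrm >> 6) & 3) != 3) {
        return 1;               /* nonzero: this operand does not apply */
    }
    c->rm = modrm & 7;
    return 0;
}
static int reg_prepare(RegCtxt *c)
{
    return c->rm;
}

/* Hypothetical second operand: decodes only the memory forms. */
typedef struct { int mod; } MemCtxt;

static int mem_init(MemCtxt *c, int modrm)
{
    int mod = (modrm >> 6) & 3;

    if (mod == 3) {
        return 1;
    }
    c->mod = mod;
    return 0;
}
static int mem_prepare(MemCtxt *c)
{
    return 100 + c->mod;        /* arbitrary marker value for the sketch */
}

/* Either-or: try the first operand, fall back to the second. */
typedef struct {
    bool is_reg;
    union {
        RegCtxt reg;
        MemCtxt mem;
    } u;
} EitherCtxt;

static int either_init(EitherCtxt *c, int modrm)
{
    int ret = reg_init(&c->u.reg, modrm);

    if (!ret) {
        c->is_reg = true;
        return 0;
    }
    ret = mem_init(&c->u.mem, modrm);
    if (!ret) {
        c->is_reg = false;
    }
    return ret;
}
static int either_prepare(EitherCtxt *c)
{
    return c->is_reg ? reg_prepare(&c->u.reg) : mem_prepare(&c->u.mem);
}
```

Both alternatives return int here, which is the simplest way to satisfy the
"unifying" requirement; in the patch the second operand's argument type only
needs to be implicitly convertible to the first's.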




[Qemu-devel] [RFC PATCH v3 21/46] target/i386: introduce modrm operand

2019-08-14 Thread Jan Bobek
This permits the ModR/M byte to be passed raw into the code generator,
which effectively short-circuits the operand decoding mechanism and
lets the code generator do the decoding work manually.

Signed-off-by: Jan Bobek 
---
 target/i386/translate.c | 20 
 1 file changed, 20 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 7fc5149d29..25c25a30fb 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4740,6 +4740,26 @@ INSNOP_FINALIZE(tcg_temp_i64)
 tcg_temp_free_i64(arg);
 }
 
+/*
+ * modrm
+ *
+ * Operand whose value is the ModR/M byte.
+ */
+typedef int insnop_arg_t(modrm);
+typedef struct {} insnop_ctxt_t(modrm);
+
+INSNOP_INIT(modrm)
+{
+return 0;
+}
+INSNOP_PREPARE(modrm)
+{
+return modrm;
+}
+INSNOP_FINALIZE(modrm)
+{
+}
+
 static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
 {
 enum {
-- 
2.20.1




[Qemu-devel] [RFC PATCH v3 46/46] target/i386: introduce SSE3 instructions to sse-opcode.inc.h

2019-08-14 Thread Jan Bobek
Add all the SSE3 instruction entries to sse-opcode.inc.h.

Signed-off-by: Jan Bobek 
---
 target/i386/sse-opcode.inc.h | 20 
 1 file changed, 20 insertions(+)

diff --git a/target/i386/sse-opcode.inc.h b/target/i386/sse-opcode.inc.h
index efa67b7ce2..0cfe6fbe31 100644
--- a/target/i386/sse-opcode.inc.h
+++ b/target/i386/sse-opcode.inc.h
@@ -133,6 +133,14 @@ OPCODE(movmskps, LEG(NP, 0F, 1, 0x50), SSE, WR, Gq, Udq)
 OPCODE(movmskpd, LEG(66, 0F, 0, 0x50), SSE2, WR, Gd, Udq)
 /* 66 REX.W 0F 50 /r: MOVMSKPD r64, xmm */
 OPCODE(movmskpd, LEG(66, 0F, 1, 0x50), SSE2, WR, Gq, Udq)
+/* F2 0F F0 /r: LDDQU xmm1, m128 */
+OPCODE(lddqu, LEG(F2, 0F, 0, 0xf0), SSE3, WR, Vdq, Mdq)
+/* F3 0F 16 /r: MOVSHDUP xmm1, xmm2/m128 */
+OPCODE(movshdup, LEG(F3, 0F, 0, 0x16), SSE3, WR, Vdq, Wdq)
+/* F3 0F 12 /r: MOVSLDUP xmm1, xmm2/m128 */
+OPCODE(movsldup, LEG(F3, 0F, 0, 0x12), SSE3, WR, Vdq, Wdq)
+/* F2 0F 12 /r: MOVDDUP xmm1, xmm2/m64 */
+OPCODE(movddup, LEG(F2, 0F, 0, 0x12), SSE3, WR, Vdq, Wq)
 /* NP 0F FC /r: PADDB mm, mm/m64 */
 OPCODE(paddb, LEG(NP, 0F, 0, 0xfc), MMX, WRR, Pq, Pq, Qq)
 /* 66 0F FC /r: PADDB xmm1, xmm2/m128 */
@@ -173,6 +181,10 @@ OPCODE(addpd, LEG(66, 0F, 0, 0x58), SSE2, WRR, Vdq, Vdq, Wdq)
 OPCODE(addss, LEG(F3, 0F, 0, 0x58), SSE, WRR, Vd, Vd, Wd)
 /* F2 0F 58 /r: ADDSD xmm1, xmm2/m64 */
 OPCODE(addsd, LEG(F2, 0F, 0, 0x58), SSE2, WRR, Vq, Vq, Wq)
+/* F2 0F 7C /r: HADDPS xmm1, xmm2/m128 */
+OPCODE(haddps, LEG(F2, 0F, 0, 0x7c), SSE3, WRR, Vdq, Vdq, Wdq)
+/* 66 0F 7C /r: HADDPD xmm1, xmm2/m128 */
+OPCODE(haddpd, LEG(66, 0F, 0, 0x7c), SSE3, WRR, Vdq, Vdq, Wdq)
 /* NP 0F F8 /r: PSUBB mm, mm/m64 */
 OPCODE(psubb, LEG(NP, 0F, 0, 0xf8), MMX, WRR, Pq, Pq, Qq)
 /* 66 0F F8 /r: PSUBB xmm1, xmm2/m128 */
@@ -213,6 +225,14 @@ OPCODE(subpd, LEG(66, 0F, 0, 0x5c), SSE2, WRR, Vdq, Vdq, Wdq)
 OPCODE(subss, LEG(F3, 0F, 0, 0x5c), SSE, WRR, Vd, Vd, Wd)
 /* F2 0F 5C /r: SUBSD xmm1, xmm2/m64 */
 OPCODE(subsd, LEG(F2, 0F, 0, 0x5c), SSE2, WRR, Vq, Vq, Wq)
+/* F2 0F 7D /r: HSUBPS xmm1, xmm2/m128 */
+OPCODE(hsubps, LEG(F2, 0F, 0, 0x7d), SSE3, WRR, Vdq, Vdq, Wdq)
+/* 66 0F 7D /r: HSUBPD xmm1, xmm2/m128 */
+OPCODE(hsubpd, LEG(66, 0F, 0, 0x7d), SSE3, WRR, Vdq, Vdq, Wdq)
+/* F2 0F D0 /r: ADDSUBPS xmm1, xmm2/m128 */
+OPCODE(addsubps, LEG(F2, 0F, 0, 0xd0), SSE3, WRR, Vdq, Vdq, Wdq)
+/* 66 0F D0 /r: ADDSUBPD xmm1, xmm2/m128 */
+OPCODE(addsubpd, LEG(66, 0F, 0, 0xd0), SSE3, WRR, Vdq, Vdq, Wdq)
 /* NP 0F D5 /r: PMULLW mm, mm/m64 */
 OPCODE(pmullw, LEG(NP, 0F, 0, 0xd5), MMX, WRR, Pq, Pq, Qq)
 /* 66 0F D5 /r: PMULLW xmm1, xmm2/m128 */
-- 
2.20.1




[Qemu-devel] [RFC PATCH v3 06/46] target/i386: Simplify gen_exception arguments

2019-08-14 Thread Jan Bobek
From: Richard Henderson 

We can compute cur_eip from values present within DisasContext.

Signed-off-by: Richard Henderson 
---
 target/i386/translate.c | 89 -
 1 file changed, 44 insertions(+), 45 deletions(-)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 40a4844b64..7532d65778 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -1272,10 +1272,10 @@ static void gen_helper_fp_arith_STN_ST0(int op, int opreg)
 }
 }
 
-static void gen_exception(DisasContext *s, int trapno, target_ulong cur_eip)
+static void gen_exception(DisasContext *s, int trapno)
 {
 gen_update_cc_op(s);
-gen_jmp_im(s, cur_eip);
+gen_jmp_im(s, s->pc_start - s->cs_base);
 gen_helper_raise_exception(cpu_env, tcg_const_i32(trapno));
 s->base.is_jmp = DISAS_NORETURN;
 }
@@ -1284,7 +1284,7 @@ static void gen_exception(DisasContext *s, int trapno, target_ulong cur_eip)
the instruction is known, but it isn't allowed in the current cpu mode.  */
 static void gen_illegal_opcode(DisasContext *s)
 {
-gen_exception(s, EXCP06_ILLOP, s->pc_start - s->cs_base);
+gen_exception(s, EXCP06_ILLOP);
 }
 
 /* if d == OR_TMP0, it means memory operand (address in A0) */
@@ -3040,8 +3040,7 @@ static const struct SSEOpHelper_eppi sse_op_table7[256] = {
 [0xdf] = AESNI_OP(aeskeygenassist),
 };
 
-static void gen_sse(CPUX86State *env, DisasContext *s, int b,
-target_ulong pc_start)
+static void gen_sse(CPUX86State *env, DisasContext *s, int b)
 {
 int b1, op1_offset, op2_offset, is_xmm, val;
 int modrm, mod, rm, reg;
@@ -3076,7 +3075,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
 }
 /* simple MMX/SSE operation */
 if (s->flags & HF_TS_MASK) {
-gen_exception(s, EXCP07_PREX, pc_start - s->cs_base);
+gen_exception(s, EXCP07_PREX);
 return;
 }
 if (s->flags & HF_EM_MASK) {
@@ -4515,7 +4514,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
 s->vex_l = 0;
 s->vex_v = 0;
 if (sigsetjmp(s->jmpbuf, 0) != 0) {
-gen_exception(s, EXCP0D_GPF, pc_start - s->cs_base);
+gen_exception(s, EXCP0D_GPF);
 return s->pc;
 }
 
@@ -5854,7 +5853,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
 if (s->flags & (HF_EM_MASK | HF_TS_MASK)) {
 /* if CR0.EM or CR0.TS are set, generate an FPU exception */
 /* XXX: what to do if illegal op ? */
-gen_exception(s, EXCP07_PREX, pc_start - s->cs_base);
+gen_exception(s, EXCP07_PREX);
 break;
 }
 modrm = x86_ldub_code(env, s);
@@ -6572,7 +6571,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
 set_cc_op(s, CC_OP_EFLAGS);
 } else if (s->vm86) {
 if (s->iopl != 3) {
-gen_exception(s, EXCP0D_GPF, pc_start - s->cs_base);
+gen_exception(s, EXCP0D_GPF);
 } else {
 gen_helper_iret_real(cpu_env, tcg_const_i32(s->dflag - 1));
 set_cc_op(s, CC_OP_EFLAGS);
@@ -6694,7 +6693,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
 case 0x9c: /* pushf */
 gen_svm_check_intercept(s, pc_start, SVM_EXIT_PUSHF);
 if (s->vm86 && s->iopl != 3) {
-gen_exception(s, EXCP0D_GPF, pc_start - s->cs_base);
+gen_exception(s, EXCP0D_GPF);
 } else {
 gen_update_cc_op(s);
 gen_helper_read_eflags(s->T0, cpu_env);
@@ -6704,7 +6703,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
 case 0x9d: /* popf */
 gen_svm_check_intercept(s, pc_start, SVM_EXIT_POPF);
 if (s->vm86 && s->iopl != 3) {
-gen_exception(s, EXCP0D_GPF, pc_start - s->cs_base);
+gen_exception(s, EXCP0D_GPF);
 } else {
 ot = gen_pop_T0(s);
 if (s->cpl == 0) {
@@ -7021,7 +7020,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
 goto illegal_op;
 val = x86_ldub_code(env, s);
 if (val == 0) {
-gen_exception(s, EXCP00_DIVZ, pc_start - s->cs_base);
+gen_exception(s, EXCP00_DIVZ);
 } else {
 gen_helper_aam(cpu_env, tcg_const_i32(val));
 set_cc_op(s, CC_OP_LOGICB);
@@ -7055,7 +7054,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
 case 0x9b: /* fwait */
 if ((s->flags & (HF_MP_MASK | HF_TS_MASK)) ==
 (HF_MP_MASK | HF_TS_MASK)) {
-gen_exception(s, EXCP07_PREX, pc_start - s->cs_base);
+gen_exception(s, EXCP07_PREX);
 } else {
 gen_helper_fwait(cpu_env);
 }
@@ -7066,7 +7065,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
 case 0xcd: /* int N */
 val = x86_ldub_code(env, s);
 if (s->vm86 && s->iopl != 3) {
-gen

[Qemu-devel] [RFC PATCH v3 15/46] target/i386: introduce function ck_cpuid

2019-08-14 Thread Jan Bobek
Introduce a helper function to take care of instruction CPUID checks.

Signed-off-by: Jan Bobek 
---
 target/i386/translate.c | 48 +
 1 file changed, 48 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 6296a02991..0cffa2226b 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4500,6 +4500,54 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b)
 #define tcg_gen_gvec_cmpgt(vece, dofs, aofs, bofs, oprsz, maxsz)\
 tcg_gen_gvec_cmp(TCG_COND_GT, vece, dofs, aofs, bofs, oprsz, maxsz)
 
+typedef enum {
+CK_CPUID_MMX = 1,
+CK_CPUID_3DNOW,
+CK_CPUID_SSE,
+CK_CPUID_SSE2,
+CK_CPUID_CLFLUSH,
+CK_CPUID_SSE3,
+CK_CPUID_SSSE3,
+CK_CPUID_SSE4_1,
+CK_CPUID_SSE4_2,
+CK_CPUID_SSE4A,
+CK_CPUID_AVX,
+CK_CPUID_AVX2,
+} CkCpuidFeat;
+
+static int ck_cpuid(CPUX86State *env, DisasContext *s, CkCpuidFeat feat)
+{
+switch (feat) {
+case CK_CPUID_MMX:
+return !(s->cpuid_features & CPUID_MMX)
+|| !(s->cpuid_ext2_features & CPUID_EXT2_MMX);
+case CK_CPUID_3DNOW:
+return !(s->cpuid_ext2_features & CPUID_EXT2_3DNOW);
+case CK_CPUID_SSE:
+return !(s->cpuid_features & CPUID_SSE);
+case CK_CPUID_SSE2:
+return !(s->cpuid_features & CPUID_SSE2);
+case CK_CPUID_CLFLUSH:
+return !(s->cpuid_features & CPUID_CLFLUSH);
+case CK_CPUID_SSE3:
+return !(s->cpuid_ext_features & CPUID_EXT_SSE3);
+case CK_CPUID_SSSE3:
+return !(s->cpuid_ext_features & CPUID_EXT_SSSE3);
+case CK_CPUID_SSE4_1:
+return !(s->cpuid_ext_features & CPUID_EXT_SSE41);
+case CK_CPUID_SSE4_2:
+return !(s->cpuid_ext_features & CPUID_EXT_SSE42);
+case CK_CPUID_SSE4A:
+return !(s->cpuid_ext3_features & CPUID_EXT3_SSE4A);
+case CK_CPUID_AVX:
+return !(s->cpuid_ext_features & CPUID_EXT_AVX);
+case CK_CPUID_AVX2:
+return !(s->cpuid_7_0_ebx_features & CPUID_7_0_EBX_AVX2);
+default:
+g_assert_not_reached();
+}
+}
+
 static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
 {
 enum {
-- 
2.20.1




[Qemu-devel] [RFC PATCH v3 45/46] target/i386: introduce SSE3 code generators

2019-08-14 Thread Jan Bobek
Introduce code generators required by SSE3 instructions.

Signed-off-by: Jan Bobek 
---
 target/i386/translate.c | 64 +
 1 file changed, 64 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index c72138014a..9da3fbb611 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -5627,6 +5627,63 @@ GEN_INSN2(movmskpd, Gq, Udq)
 tcg_temp_free_i32(arg1_r32);
 }
 
+GEN_INSN2(lddqu, Vdq, Mdq)
+{
+assert(arg2 == s->A0);
+gen_ldo_env_A0(s, arg1);
+}
+
+GEN_INSN2(movshdup, Vdq, Wdq)
+{
+const TCGv_i32 r32 = tcg_temp_new_i32();
+
+tcg_gen_ld_i32(r32, cpu_env, arg2 + offsetof(ZMMReg, ZMM_L(1)));
+tcg_gen_st_i32(r32, cpu_env, arg1 + offsetof(ZMMReg, ZMM_L(0)));
+if (arg1 != arg2) {
+tcg_gen_st_i32(r32, cpu_env, arg1 + offsetof(ZMMReg, ZMM_L(1)));
+}
+
+tcg_gen_ld_i32(r32, cpu_env, arg2 + offsetof(ZMMReg, ZMM_L(3)));
+tcg_gen_st_i32(r32, cpu_env, arg1 + offsetof(ZMMReg, ZMM_L(2)));
+if (arg1 != arg2) {
+tcg_gen_st_i32(r32, cpu_env, arg1 + offsetof(ZMMReg, ZMM_L(3)));
+}
+
+tcg_temp_free_i32(r32);
+}
+
+GEN_INSN2(movsldup, Vdq, Wdq)
+{
+const TCGv_i32 r32 = tcg_temp_new_i32();
+
+tcg_gen_ld_i32(r32, cpu_env, arg2 + offsetof(ZMMReg, ZMM_L(0)));
+if (arg1 != arg2) {
+tcg_gen_st_i32(r32, cpu_env, arg1 + offsetof(ZMMReg, ZMM_L(0)));
+}
+tcg_gen_st_i32(r32, cpu_env, arg1 + offsetof(ZMMReg, ZMM_L(1)));
+
+tcg_gen_ld_i32(r32, cpu_env, arg2 + offsetof(ZMMReg, ZMM_L(2)));
+if (arg1 != arg2) {
+tcg_gen_st_i32(r32, cpu_env, arg1 + offsetof(ZMMReg, ZMM_L(2)));
+}
+tcg_gen_st_i32(r32, cpu_env, arg1 + offsetof(ZMMReg, ZMM_L(3)));
+
+tcg_temp_free_i32(r32);
+}
+
+GEN_INSN2(movddup, Vdq, Wq)
+{
+const TCGv_i64 r64 = tcg_temp_new_i64();
+
+tcg_gen_ld_i64(r64, cpu_env, arg2 + offsetof(ZMMReg, ZMM_Q(0)));
+if (arg1 != arg2) {
+tcg_gen_st_i64(r64, cpu_env, arg1 + offsetof(ZMMReg, ZMM_Q(0)));
+}
+tcg_gen_st_i64(r64, cpu_env, arg1 + offsetof(ZMMReg, ZMM_Q(1)));
+
+tcg_temp_free_i64(r64);
+}
+
 DEF_GEN_INSN3_GVEC_MM(paddb, add, Pq, Pq, Qq, MO_8)
 DEF_GEN_INSN3_GVEC_XMM(paddb, add, Vdq, Vdq, Wdq, MO_8)
 DEF_GEN_INSN3_GVEC_MM(paddw, add, Pq, Pq, Qq, MO_16)
@@ -5647,6 +5704,8 @@ DEF_GEN_INSN3_HELPER_EPP(addps, addps, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(addss, addss, Vd, Vd, Wd)
 DEF_GEN_INSN3_HELPER_EPP(addpd, addpd, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(addsd, addsd, Vq, Vq, Wq)
+DEF_GEN_INSN3_HELPER_EPP(haddps, haddps, Vdq, Vdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(haddpd, haddpd, Vdq, Vdq, Wdq)
 
 DEF_GEN_INSN3_GVEC_MM(psubb, sub, Pq, Pq, Qq, MO_8)
 DEF_GEN_INSN3_GVEC_XMM(psubb, sub, Vdq, Vdq, Wdq, MO_8)
@@ -5668,6 +5727,11 @@ DEF_GEN_INSN3_HELPER_EPP(subps, subps, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(subpd, subpd, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(subss, subss, Vd, Vd, Wd)
 DEF_GEN_INSN3_HELPER_EPP(subsd, subsd, Vq, Vq, Wq)
+DEF_GEN_INSN3_HELPER_EPP(hsubps, hsubps, Vdq, Vdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(hsubpd, hsubpd, Vdq, Vdq, Wdq)
+
+DEF_GEN_INSN3_HELPER_EPP(addsubps, addsubps, Vdq, Vdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(addsubpd, addsubpd, Vdq, Vdq, Wdq)
 
 DEF_GEN_INSN3_HELPER_EPP(pmullw, pmullw_mmx, Pq, Pq, Qq)
 DEF_GEN_INSN3_HELPER_EPP(pmullw, pmullw_xmm, Vdq, Vdq, Wdq)
-- 
2.20.1
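
As a cross-check of the three hand-written generators, the lane movements
can be modelled on plain arrays. MOVSHDUP duplicates the odd-indexed 32-bit
lanes, MOVSLDUP the even-indexed ones, and MOVDDUP the low 64-bit lane. The
sketch below is a reference model only, not QEMU code, and the *_ref names
are hypothetical. It keeps the load-then-store order of the generators, so
it also shows why the stores guarded by "arg1 != arg2" can be skipped when
source and destination are the same register: they would write back the
value already there.

```c
#include <stdint.h>

/* MOVSHDUP model: lane 1 -> lanes 0 and 1, lane 3 -> lanes 2 and 3. */
static void movshdup_ref(uint32_t dst[4], const uint32_t src[4])
{
    uint32_t t = src[1];

    dst[0] = t;
    dst[1] = t;     /* no-op when dst == src */
    t = src[3];
    dst[2] = t;
    dst[3] = t;     /* no-op when dst == src */
}

/* MOVSLDUP model: lane 0 -> lanes 0 and 1, lane 2 -> lanes 2 and 3. */
static void movsldup_ref(uint32_t dst[4], const uint32_t src[4])
{
    uint32_t t = src[0];

    dst[0] = t;     /* no-op when dst == src */
    dst[1] = t;
    t = src[2];
    dst[2] = t;     /* no-op when dst == src */
    dst[3] = t;
}

/* MOVDDUP model: low quadword -> both quadwords. */
static void movddup_ref(uint64_t dst[2], const uint64_t src[2])
{
    uint64_t t = src[0];

    dst[0] = t;     /* no-op when dst == src */
    dst[1] = t;
}
```

Because each source lane is read into a temporary before any lane that
could alias it is written, the model gives the same result whether or not
dst and src overlap completely.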




[Qemu-devel] [RFC PATCH v3 20/46] target/i386: introduce tcg_temp operands

2019-08-14 Thread Jan Bobek
TCG temporary operands allocate a 32-bit or 64-bit TCG temporary, and
later automatically free it.

Signed-off-by: Jan Bobek 
---
 target/i386/translate.c | 44 +
 1 file changed, 44 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 99f46be34e..7fc5149d29 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4696,6 +4696,50 @@ static int ck_cpuid(CPUX86State *env, DisasContext *s, CkCpuidFeat feat)
 insnop_finalize(opTarg)(&ctxt->arg, env, s, modrm, is_write, arg); \
 }
 
+/*
+ * tcg_temp_i32
+ *
+ * Operand which allocates a 32-bit TCG temporary and frees it
+ * automatically after use.
+ */
+typedef TCGv_i32 insnop_arg_t(tcg_temp_i32);
+typedef struct {} insnop_ctxt_t(tcg_temp_i32);
+
+INSNOP_INIT(tcg_temp_i32)
+{
+return 0;
+}
+INSNOP_PREPARE(tcg_temp_i32)
+{
+return tcg_temp_new_i32();
+}
+INSNOP_FINALIZE(tcg_temp_i32)
+{
+tcg_temp_free_i32(arg);
+}
+
+/*
+ * tcg_temp_i64
+ *
+ * Operand which allocates a 64-bit TCG temporary and frees it
+ * automatically after use.
+ */
+typedef TCGv_i64 insnop_arg_t(tcg_temp_i64);
+typedef struct {} insnop_ctxt_t(tcg_temp_i64);
+
+INSNOP_INIT(tcg_temp_i64)
+{
+return 0;
+}
+INSNOP_PREPARE(tcg_temp_i64)
+{
+return tcg_temp_new_i64();
+}
+INSNOP_FINALIZE(tcg_temp_i64)
+{
+tcg_temp_free_i64(arg);
+}
+
 static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
 {
 enum {
-- 
2.20.1




[Qemu-devel] [RFC PATCH v3 05/46] target/i386: use prefix from DisasContext

2019-08-14 Thread Jan Bobek
Reduce scope of the local variable prefixes to enforce use of prefix
from DisasContext instead.

Suggested-by: Richard Henderson 
Reviewed-by: Richard Henderson 
Signed-off-by: Jan Bobek 
---
 target/i386/translate.c | 113 
 1 file changed, 57 insertions(+), 56 deletions(-)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index bb13877df7..40a4844b64 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4491,7 +4491,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
 static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
 {
 CPUX86State *env = cpu->env_ptr;
-int b, prefixes;
+int b;
 int shift;
 TCGMemOp ot;
 int modrm, reg, rm, mod, op, opreg, val;
@@ -4499,6 +4499,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
 target_ulong pc_start = s->base.pc_next;
 
 {
+int prefixes;
 TCGMemOp aflag, dflag;
 
 s->pc_start = s->pc = pc_start;
@@ -6356,7 +6357,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
 case 0xa4: /* movsS */
 case 0xa5:
 ot = mo_b_d(b, s->dflag);
-if (prefixes & (PREFIX_REPZ | PREFIX_REPNZ)) {
+if (s->prefix & (PREFIX_REPZ | PREFIX_REPNZ)) {
 gen_repz_movs(s, ot, pc_start - s->cs_base, s->pc - s->cs_base);
 } else {
 gen_movs(s, ot);
@@ -6366,7 +6367,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
 case 0xaa: /* stosS */
 case 0xab:
 ot = mo_b_d(b, s->dflag);
-if (prefixes & (PREFIX_REPZ | PREFIX_REPNZ)) {
+if (s->prefix & (PREFIX_REPZ | PREFIX_REPNZ)) {
 gen_repz_stos(s, ot, pc_start - s->cs_base, s->pc - s->cs_base);
 } else {
 gen_stos(s, ot);
@@ -6375,7 +6376,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
 case 0xac: /* lodsS */
 case 0xad:
 ot = mo_b_d(b, s->dflag);
-if (prefixes & (PREFIX_REPZ | PREFIX_REPNZ)) {
+if (s->prefix & (PREFIX_REPZ | PREFIX_REPNZ)) {
 gen_repz_lods(s, ot, pc_start - s->cs_base, s->pc - s->cs_base);
 } else {
 gen_lods(s, ot);
@@ -6384,9 +6385,9 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
 case 0xae: /* scasS */
 case 0xaf:
 ot = mo_b_d(b, s->dflag);
-if (prefixes & PREFIX_REPNZ) {
+if (s->prefix & PREFIX_REPNZ) {
 gen_repz_scas(s, ot, pc_start - s->cs_base, s->pc - s->cs_base, 1);
-} else if (prefixes & PREFIX_REPZ) {
+} else if (s->prefix & PREFIX_REPZ) {
 gen_repz_scas(s, ot, pc_start - s->cs_base, s->pc - s->cs_base, 0);
 } else {
 gen_scas(s, ot);
@@ -6396,9 +6397,9 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
 case 0xa6: /* cmpsS */
 case 0xa7:
 ot = mo_b_d(b, s->dflag);
-if (prefixes & PREFIX_REPNZ) {
+if (s->prefix & PREFIX_REPNZ) {
 gen_repz_cmps(s, ot, pc_start - s->cs_base, s->pc - s->cs_base, 1);
-} else if (prefixes & PREFIX_REPZ) {
+} else if (s->prefix & PREFIX_REPZ) {
 gen_repz_cmps(s, ot, pc_start - s->cs_base, s->pc - s->cs_base, 0);
 } else {
 gen_cmps(s, ot);
@@ -6409,8 +6410,8 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
 ot = mo_b_d32(b, s->dflag);
 tcg_gen_ext16u_tl(s->T0, cpu_regs[R_EDX]);
 gen_check_io(s, ot, pc_start - s->cs_base, 
- SVM_IOIO_TYPE_MASK | svm_is_rep(prefixes) | 4);
-if (prefixes & (PREFIX_REPZ | PREFIX_REPNZ)) {
+ SVM_IOIO_TYPE_MASK | svm_is_rep(s->prefix) | 4);
+if (s->prefix & (PREFIX_REPZ | PREFIX_REPNZ)) {
 gen_repz_ins(s, ot, pc_start - s->cs_base, s->pc - s->cs_base);
 } else {
 gen_ins(s, ot);
@@ -6424,8 +6425,8 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
 ot = mo_b_d32(b, s->dflag);
 tcg_gen_ext16u_tl(s->T0, cpu_regs[R_EDX]);
 gen_check_io(s, ot, pc_start - s->cs_base,
- svm_is_rep(prefixes) | 4);
-if (prefixes & (PREFIX_REPZ | PREFIX_REPNZ)) {
+ svm_is_rep(s->prefix) | 4);
+if (s->prefix & (PREFIX_REPZ | PREFIX_REPNZ)) {
 gen_repz_outs(s, ot, pc_start - s->cs_base, s->pc - s->cs_base);
 } else {
 gen_outs(s, ot);
@@ -6444,7 +6445,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
 val = x86_ldub_code(env, s);
 tcg_gen_movi_tl(s->T0, val);
 gen_check_io(s, ot, pc_start - s->cs_base,
- SVM_IOIO_TYPE_MASK | svm_is_rep(prefixes));
+ SVM_IOIO_TYPE_MASK | svm_is_rep(s->prefix));
 if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
 gen_io_start();
 }
@@ -6463,7 +6464,7 @@ static target_ulong disas_insn(DisasContext *s, C

[Qemu-devel] [RFC PATCH v3 17/46] target/i386: introduce generic operand alias

2019-08-14 Thread Jan Bobek
It turns out it is useful to be able to declare operand name
aliases. Introduce a macro to capture this functionality.

Signed-off-by: Jan Bobek 
---
 target/i386/translate.c | 20 
 1 file changed, 20 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 9d00b36406..8989e6504c 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4576,6 +4576,26 @@ static int ck_cpuid(CPUX86State *env, DisasContext *s, CkCpuidFeat feat)
  int modrm, bool is_write,  \
  insnop_arg_t(opT) arg)
 
+/*
+ * Operand alias
+ */
+#define DEF_INSNOP_ALIAS(opT, opT2) \
+typedef insnop_arg_t(opT2) insnop_arg_t(opT);   \
+typedef insnop_ctxt_t(opT2) insnop_ctxt_t(opT); \
+\
+INSNOP_INIT(opT)\
+{   \
+return insnop_init(opT2)(ctxt, env, s, modrm, is_write);\
+}   \
+INSNOP_PREPARE(opT) \
+{   \
+return insnop_prepare(opT2)(ctxt, env, s, modrm, is_write); \
+}   \
+INSNOP_FINALIZE(opT)\
+{   \
+insnop_finalize(opT2)(ctxt, env, s, modrm, is_write, arg);  \
+}
+
 static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
 {
 enum {
-- 
2.20.1




[Qemu-devel] [RFC PATCH v3 35/46] target/i386: introduce MMX translators

2019-08-14 Thread Jan Bobek
Use the translator macros to define instruction translators required
by MMX instructions.

Signed-off-by: Jan Bobek 
---
 target/i386/translate.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 76c27d0380..4fecb0d240 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -5457,6 +5457,15 @@ static void translate_insn0()(
 }   \
 }
 
+DEF_TRANSLATE_INSN2(Ed, Pq)
+DEF_TRANSLATE_INSN2(Eq, Pq)
+DEF_TRANSLATE_INSN2(Gd, Nq)
+DEF_TRANSLATE_INSN2(Gq, Nq)
+DEF_TRANSLATE_INSN2(Pq, Ed)
+DEF_TRANSLATE_INSN2(Pq, Eq)
+DEF_TRANSLATE_INSN2(Pq, Qq)
+DEF_TRANSLATE_INSN2(Qq, Pq)
+
 #define DEF_TRANSLATE_INSN3(opT1, opT2, opT3)   \
 static void translate_insn3(opT1, opT2, opT3)(  \
 CPUX86State *env, DisasContext *s, int modrm,   \
@@ -5501,6 +5510,13 @@ static void translate_insn0()(
 }   \
 }
 
+DEF_TRANSLATE_INSN3(Gd, Nq, Ib)
+DEF_TRANSLATE_INSN3(Gq, Nq, Ib)
+DEF_TRANSLATE_INSN3(Nq, Nq, Ib)
+DEF_TRANSLATE_INSN3(Pq, Pq, Qd)
+DEF_TRANSLATE_INSN3(Pq, Pq, Qq)
+DEF_TRANSLATE_INSN3(Pq, Qq, Ib)
+
 #define DEF_TRANSLATE_INSN4(opT1, opT2, opT3, opT4) \
 static void translate_insn4(opT1, opT2, opT3, opT4)(\
 CPUX86State *env, DisasContext *s, int modrm,   \
-- 
2.20.1




[Qemu-devel] [RFC PATCH v3 09/46] target/i386: make variable is_xmm const

2019-08-14 Thread Jan Bobek
The variable is_xmm does not change value after assignment, so make
this fact explicit by marking it const.

Reviewed-by: Richard Henderson 
Signed-off-by: Jan Bobek 
---
 target/i386/translate.c | 17 ++---
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 8bf39b73c4..c5ec309fe2 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -3042,7 +3042,7 @@ static const struct SSEOpHelper_eppi sse_op_table7[256] = {
 
 static void gen_sse(CPUX86State *env, DisasContext *s, int b)
 {
-int op1_offset, op2_offset, is_xmm, val;
+int op1_offset, op2_offset, val;
 int modrm, mod, rm, reg;
 SSEFunc_0_epp sse_fn_epp;
 SSEFunc_0_eppi sse_fn_eppi;
@@ -3056,20 +3056,15 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b)
 : s->prefix & PREFIX_REPZ ? 2
 : s->prefix & PREFIX_REPNZ ? 3
 : 0;
+const int is_xmm =
+(0x10 <= b && b <= 0x5f)
+|| b == 0xc6
+|| b == 0xc2
+|| !!b1;
 sse_fn_epp = sse_op_table1[b][b1];
 if (!sse_fn_epp) {
 goto unknown_op;
 }
-if ((b <= 0x5f && b >= 0x10) || b == 0xc6 || b == 0xc2) {
-is_xmm = 1;
-} else {
-if (b1 == 0) {
-/* MMX case */
-is_xmm = 0;
-} else {
-is_xmm = 1;
-}
-}
 /* simple MMX/SSE operation */
 if (s->flags & HF_TS_MASK) {
 gen_exception(s, EXCP07_PREX);
-- 
2.20.1
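A quick way to check that the new const expression matches the nested if/else it replaces is to compare both forms over every opcode byte and every b1 value. The sketch below does this outside of QEMU; the function names are hypothetical and not part of the patch.

```c
#include <stdbool.h>

/* Old form: the nested if/else removed by the patch. */
int is_xmm_old(int b, int b1)
{
    int is_xmm;

    if ((b <= 0x5f && b >= 0x10) || b == 0xc6 || b == 0xc2) {
        is_xmm = 1;
    } else {
        if (b1 == 0) {
            /* MMX case */
            is_xmm = 0;
        } else {
            is_xmm = 1;
        }
    }
    return is_xmm;
}

/* New form: the single expression added by the patch. */
int is_xmm_new(int b, int b1)
{
    return (0x10 <= b && b <= 0x5f)
        || b == 0xc6
        || b == 0xc2
        || !!b1;
}

/* Returns true if both forms agree for all b in 0..255 and b1 in 0..3. */
bool is_xmm_forms_agree(void)
{
    for (int b = 0; b < 256; b++) {
        for (int b1 = 0; b1 < 4; b1++) {
            if (is_xmm_old(b, b1) != is_xmm_new(b, b1)) {
                return false;
            }
        }
    }
    return true;
}
```

Since neither form has side effects, moving the computation ahead of the sse_fn_epp check should not change behaviour.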




[Qemu-devel] [RFC PATCH v3 11/46] target/i386: introduce gen_(ld, st)d_env_A0

2019-08-14 Thread Jan Bobek
Similar in spirit to the already present gen_(ld,st)(q,o)_env_A0, it
will prove useful in later commits for smaller-sized vector loads.

Reviewed-by: Richard Henderson 
Signed-off-by: Jan Bobek 
---
 target/i386/translate.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index c5ec309fe2..258351fce3 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -2652,6 +2652,18 @@ static void gen_jmp(DisasContext *s, target_ulong eip)
 gen_jmp_tb(s, eip, 0);
 }
 
+static inline void gen_ldd_env_A0(DisasContext *s, int offset)
+{
+tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0, s->mem_index, MO_LEUL);
+tcg_gen_st_i32(s->tmp2_i32, cpu_env, offset);
+}
+
+static inline void gen_std_env_A0(DisasContext *s, int offset)
+{
+tcg_gen_ld_i32(s->tmp2_i32, cpu_env, offset);
+tcg_gen_qemu_st_i32(s->tmp2_i32, s->A0, s->mem_index, MO_LEUL);
+}
+
 static inline void gen_ldq_env_A0(DisasContext *s, int offset)
 {
 tcg_gen_qemu_ld_i64(s->tmp1_i64, s->A0, s->mem_index, MO_LEQ);
-- 
2.20.1




[Qemu-devel] [RFC PATCH v3 14/46] target/i386: introduce mnemonic aliases for several gvec operations

2019-08-14 Thread Jan Bobek
It is helpful to introduce aliases for some general gvec operations as
it makes a couple of instruction code generators simpler (added
later).

Reviewed-by: Richard Henderson 
Signed-off-by: Jan Bobek 
---
 target/i386/translate.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index e9741cd7f7..6296a02991 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4493,6 +4493,13 @@ static void gen_sse(CPUX86State *env, DisasContext *s, 
int b)
 #pragma GCC diagnostic push
 #pragma GCC diagnostic ignored "-Wunused-function"
 
+#define tcg_gen_gvec_andn(vece, dofs, aofs, bofs, oprsz, maxsz) \
+tcg_gen_gvec_andc(vece, dofs, bofs, aofs, oprsz, maxsz)
+#define tcg_gen_gvec_cmpeq(vece, dofs, aofs, bofs, oprsz, maxsz)\
+tcg_gen_gvec_cmp(TCG_COND_EQ, vece, dofs, aofs, bofs, oprsz, maxsz)
+#define tcg_gen_gvec_cmpgt(vece, dofs, aofs, bofs, oprsz, maxsz)\
+tcg_gen_gvec_cmp(TCG_COND_GT, vece, dofs, aofs, bofs, oprsz, maxsz)
+
 static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
 {
 enum {
-- 
2.20.1
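Note the swapped operands in the andn alias: TCG's andc computes a & ~b, while an x86-style andn complements its first operand, ~a & b. A scalar sketch of that relationship, with hypothetical helper names operating on plain 64-bit integers rather than on gvec offsets:

```c
#include <stdint.h>

/* TCG-style "and with complement": a & ~b. */
uint64_t andc64(uint64_t a, uint64_t b)
{
    return a & ~b;
}

/*
 * x86 PANDN-style operation: ~a & b.
 * Implemented by swapping the operands of andc, as the alias does.
 */
uint64_t andn64(uint64_t a, uint64_t b)
{
    return andc64(b, a);
}
```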




[Qemu-devel] [RFC PATCH v3 22/46] target/i386: introduce operands for decoding modrm fields

2019-08-14 Thread Jan Bobek
The old code uses bitshifts and bitwise-and all over the place for
decoding ModR/M fields. Avoid doing that by introducing proper
decoding operands.

Signed-off-by: Jan Bobek 
---
 target/i386/translate.c | 62 +
 1 file changed, 62 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 25c25a30fb..e4515e81df 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4760,6 +4760,68 @@ INSNOP_FINALIZE(modrm)
 {
 }
 
+/*
+ * modrm_mod
+ *
+ * Operand whose value is the MOD field of the ModR/M byte.
+ */
+typedef int insnop_arg_t(modrm_mod);
+typedef struct {} insnop_ctxt_t(modrm_mod);
+
+INSNOP_INIT(modrm_mod)
+{
+return 0;
+}
+INSNOP_PREPARE(modrm_mod)
+{
+return (modrm >> 6) & 3;
+}
+INSNOP_FINALIZE(modrm_mod)
+{
+}
+
+/*
+ * modrm_reg
+ *
+ * Operand whose value is the REG field of the ModR/M byte, extended
+ * with the REX.R bit if REX prefix is present.
+ */
+typedef int insnop_arg_t(modrm_reg);
+typedef struct {} insnop_ctxt_t(modrm_reg);
+
+INSNOP_INIT(modrm_reg)
+{
+return 0;
+}
+INSNOP_PREPARE(modrm_reg)
+{
+return ((modrm >> 3) & 7) | REX_R(s);
+}
+INSNOP_FINALIZE(modrm_reg)
+{
+}
+
+/*
+ * modrm_rm
+ *
+ * Operand whose value is the RM field of the ModR/M byte, extended
+ * with the REX.B bit if REX prefix is present.
+ */
+typedef int insnop_arg_t(modrm_rm);
+typedef struct {} insnop_ctxt_t(modrm_rm);
+
+INSNOP_INIT(modrm_rm)
+{
+return 0;
+}
+INSNOP_PREPARE(modrm_rm)
+{
+return (modrm & 7) | REX_B(s);
+}
+INSNOP_FINALIZE(modrm_rm)
+{
+}
+
 static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
 {
 enum {
-- 
2.20.1
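For reference, the three operands above extract the following fields from the ModR/M byte. This is a standalone sketch, not QEMU code: the struct and function names are hypothetical, and rex_r/rex_b are assumed to be pre-shifted to bit 3 (0 or 8), as they are stored in DisasContext.

```c
#include <stdint.h>

/* Decoded ModR/M fields (hypothetical helper type). */
typedef struct {
    int mod;  /* bits 7..6 */
    int reg;  /* bits 5..3, extended with REX.R */
    int rm;   /* bits 2..0, extended with REX.B */
} ModRMFields;

/* rex_r and rex_b must be either 0 or 8. */
ModRMFields decode_modrm(uint8_t modrm, int rex_r, int rex_b)
{
    ModRMFields f;

    f.mod = (modrm >> 6) & 3;
    f.reg = ((modrm >> 3) & 7) | rex_r;
    f.rm = (modrm & 7) | rex_b;
    return f;
}
```

For example, the byte 0xC1 (binary 11 000 001) decodes to mod = 3, reg = 0, rm = 1 when no REX bits are set.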




[Qemu-devel] [RFC PATCH v3 07/46] target/i386: use pc_start from DisasContext

2019-08-14 Thread Jan Bobek
The variable pc_start is already a member of DisasContext. Remove the
superfluous local variable.

Reviewed-by: Richard Henderson 
Signed-off-by: Jan Bobek 
---
 target/i386/translate.c | 131 
 1 file changed, 65 insertions(+), 66 deletions(-)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 7532d65778..b1ba2fc3e5 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4495,13 +4495,12 @@ static target_ulong disas_insn(DisasContext *s, 
CPUState *cpu)
 TCGMemOp ot;
 int modrm, reg, rm, mod, op, opreg, val;
 target_ulong next_eip, tval;
-target_ulong pc_start = s->base.pc_next;
 
 {
 int prefixes;
 TCGMemOp aflag, dflag;
 
-s->pc_start = s->pc = pc_start;
+s->pc_start = s->pc = s->base.pc_next;
 s->override = -1;
 #ifdef TARGET_X86_64
 s->rex_x = 0;
@@ -6357,7 +6356,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState 
*cpu)
 case 0xa5:
 ot = mo_b_d(b, s->dflag);
 if (s->prefix & (PREFIX_REPZ | PREFIX_REPNZ)) {
-gen_repz_movs(s, ot, pc_start - s->cs_base, s->pc - s->cs_base);
+gen_repz_movs(s, ot, s->pc_start - s->cs_base, s->pc - s->cs_base);
 } else {
 gen_movs(s, ot);
 }
@@ -6367,7 +6366,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState 
*cpu)
 case 0xab:
 ot = mo_b_d(b, s->dflag);
 if (s->prefix & (PREFIX_REPZ | PREFIX_REPNZ)) {
-gen_repz_stos(s, ot, pc_start - s->cs_base, s->pc - s->cs_base);
+gen_repz_stos(s, ot, s->pc_start - s->cs_base, s->pc - s->cs_base);
 } else {
 gen_stos(s, ot);
 }
@@ -6376,7 +6375,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState 
*cpu)
 case 0xad:
 ot = mo_b_d(b, s->dflag);
 if (s->prefix & (PREFIX_REPZ | PREFIX_REPNZ)) {
-gen_repz_lods(s, ot, pc_start - s->cs_base, s->pc - s->cs_base);
+gen_repz_lods(s, ot, s->pc_start - s->cs_base, s->pc - s->cs_base);
 } else {
 gen_lods(s, ot);
 }
@@ -6385,9 +6384,9 @@ static target_ulong disas_insn(DisasContext *s, CPUState 
*cpu)
 case 0xaf:
 ot = mo_b_d(b, s->dflag);
 if (s->prefix & PREFIX_REPNZ) {
-gen_repz_scas(s, ot, pc_start - s->cs_base, s->pc - s->cs_base, 1);
+gen_repz_scas(s, ot, s->pc_start - s->cs_base, s->pc - s->cs_base, 
1);
 } else if (s->prefix & PREFIX_REPZ) {
-gen_repz_scas(s, ot, pc_start - s->cs_base, s->pc - s->cs_base, 0);
+gen_repz_scas(s, ot, s->pc_start - s->cs_base, s->pc - s->cs_base, 
0);
 } else {
 gen_scas(s, ot);
 }
@@ -6397,9 +6396,9 @@ static target_ulong disas_insn(DisasContext *s, CPUState 
*cpu)
 case 0xa7:
 ot = mo_b_d(b, s->dflag);
 if (s->prefix & PREFIX_REPNZ) {
-gen_repz_cmps(s, ot, pc_start - s->cs_base, s->pc - s->cs_base, 1);
+gen_repz_cmps(s, ot, s->pc_start - s->cs_base, s->pc - s->cs_base, 
1);
 } else if (s->prefix & PREFIX_REPZ) {
-gen_repz_cmps(s, ot, pc_start - s->cs_base, s->pc - s->cs_base, 0);
+gen_repz_cmps(s, ot, s->pc_start - s->cs_base, s->pc - s->cs_base, 
0);
 } else {
 gen_cmps(s, ot);
 }
@@ -6408,10 +6407,10 @@ static target_ulong disas_insn(DisasContext *s, 
CPUState *cpu)
 case 0x6d:
 ot = mo_b_d32(b, s->dflag);
 tcg_gen_ext16u_tl(s->T0, cpu_regs[R_EDX]);
-gen_check_io(s, ot, pc_start - s->cs_base, 
+gen_check_io(s, ot, s->pc_start - s->cs_base,
  SVM_IOIO_TYPE_MASK | svm_is_rep(s->prefix) | 4);
 if (s->prefix & (PREFIX_REPZ | PREFIX_REPNZ)) {
-gen_repz_ins(s, ot, pc_start - s->cs_base, s->pc - s->cs_base);
+gen_repz_ins(s, ot, s->pc_start - s->cs_base, s->pc - s->cs_base);
 } else {
 gen_ins(s, ot);
 if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
@@ -6423,10 +6422,10 @@ static target_ulong disas_insn(DisasContext *s, 
CPUState *cpu)
 case 0x6f:
 ot = mo_b_d32(b, s->dflag);
 tcg_gen_ext16u_tl(s->T0, cpu_regs[R_EDX]);
-gen_check_io(s, ot, pc_start - s->cs_base,
+gen_check_io(s, ot, s->pc_start - s->cs_base,
  svm_is_rep(s->prefix) | 4);
 if (s->prefix & (PREFIX_REPZ | PREFIX_REPNZ)) {
-gen_repz_outs(s, ot, pc_start - s->cs_base, s->pc - s->cs_base);
+gen_repz_outs(s, ot, s->pc_start - s->cs_base, s->pc - s->cs_base);
 } else {
 gen_outs(s, ot);
 if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
@@ -6443,7 +6442,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState 
*cpu)
 ot = mo_b_d32(b, s->dflag);
 val = x86_ldub_code(env, s);
 tcg_gen_movi_tl(s->T0, val);
-gen_check_io(s, ot, pc_start - s->cs_base,
+gen_check_io(s, ot, s->pc_sta

[Qemu-devel] [RFC PATCH v3 08/46] target/i386: make variable b1 const

2019-08-14 Thread Jan Bobek
The variable b1 does not change value once assigned. Make this fact
explicit by marking it const.

Reviewed-by: Richard Henderson 
Signed-off-by: Jan Bobek 
---
 target/i386/translate.c | 15 ++-
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index b1ba2fc3e5..8bf39b73c4 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -3042,7 +3042,7 @@ static const struct SSEOpHelper_eppi sse_op_table7[256] = 
{
 
 static void gen_sse(CPUX86State *env, DisasContext *s, int b)
 {
-int b1, op1_offset, op2_offset, is_xmm, val;
+int op1_offset, op2_offset, is_xmm, val;
 int modrm, mod, rm, reg;
 SSEFunc_0_epp sse_fn_epp;
 SSEFunc_0_eppi sse_fn_eppi;
@@ -3051,14 +3051,11 @@ static void gen_sse(CPUX86State *env, DisasContext *s, 
int b)
 TCGMemOp ot;
 
 b &= 0xff;
-if (s->prefix & PREFIX_DATA)
-b1 = 1;
-else if (s->prefix & PREFIX_REPZ)
-b1 = 2;
-else if (s->prefix & PREFIX_REPNZ)
-b1 = 3;
-else
-b1 = 0;
+const int b1 =
+s->prefix & PREFIX_DATA ? 1
+: s->prefix & PREFIX_REPZ ? 2
+: s->prefix & PREFIX_REPNZ ? 3
+: 0;
 sse_fn_epp = sse_op_table1[b][b1];
 if (!sse_fn_epp) {
 goto unknown_op;
-- 
2.20.1
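The conditional chain keeps the priority order of the old if/else ladder: the 0x66 prefix wins over F3, which wins over F2. A standalone sketch of the same mapping follows; the SK_PREFIX_* values are assumptions made for illustration and are not taken from the patch.

```c
/* Assumed prefix flag values, for illustration only. */
enum {
    SK_PREFIX_REPZ  = 0x01,
    SK_PREFIX_REPNZ = 0x02,
    SK_PREFIX_DATA  = 0x08,
};

/* Maps the prefix flags to the second index of sse_op_table1 (b1). */
int sse_prefix_index(int prefix)
{
    return prefix & SK_PREFIX_DATA ? 1
        : prefix & SK_PREFIX_REPZ ? 2
        : prefix & SK_PREFIX_REPNZ ? 3
        : 0;
}
```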




[Qemu-devel] [RFC PATCH v3 12/46] target/i386: introduce gen_sse_ng

2019-08-14 Thread Jan Bobek
This function serves as the point-of-intercept for all newly
implemented instructions. If no new implementation exists, fall back
to gen_sse.

Reviewed-by: Richard Henderson 
Signed-off-by: Jan Bobek 
---
 target/i386/translate.c | 29 -
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 258351fce3..fdc7cb0054 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4489,6 +4489,33 @@ static void gen_sse(CPUX86State *env, DisasContext *s, 
int b)
 }
 }
 
+static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
+{
+enum {
+P_NP = 0,
+P_66 = 1 << (0 + 8),
+P_F3 = 1 << (1 + 8),
+P_F2 = 1 << (2 + 8),
+W_0  = 0 << (3 + 8),
+W_1  = 1 << (3 + 8),
+M_NA = 0,
+M_0F = 1 << (4 + 8),
+};
+
+switch ((b & 0xff) | M_0F
+| (s->prefix & PREFIX_DATA ? P_66 : 0)
+| (s->prefix & PREFIX_REPZ ? P_F3 : 0)
+| (s->prefix & PREFIX_REPNZ ? P_F2 : 0)
+| (REX_W(s) > 0 ? W_1 : W_0)) {
+
+default:
+gen_sse(env, s, b);
+return;
+}
+
+g_assert_not_reached();
+}
+
 /* convert one instruction. s->base.is_jmp is set if the translation must
be stopped. Return the next pc value */
 static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
@@ -8379,7 +8406,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState 
*cpu)
 case 0x1c2:
 case 0x1c4 ... 0x1c6:
 case 0x1d0 ... 0x1fe:
-gen_sse(env, s, b);
+gen_sse_ng(env, s, b);
 break;
 default:
 goto unknown_op;
-- 
2.20.1
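The switch key packs the low opcode byte together with the prefix, REX.W and opcode-map flags into one integer, so that each case label can name a complete encoding. A standalone sketch of how such a key is built; the enum mirrors the one in the patch, while the function name and its boolean parameters are hypothetical.

```c
enum {
    P_NP = 0,
    P_66 = 1 << (0 + 8),
    P_F3 = 1 << (1 + 8),
    P_F2 = 1 << (2 + 8),
    W_0  = 0 << (3 + 8),
    W_1  = 1 << (3 + 8),
    M_NA = 0,
    M_0F = 1 << (4 + 8),
};

/*
 * Builds the switch key for an opcode in the 0F map.
 * rex_w follows the patch convention: -1 (absent), 0 or 1.
 */
int sse_ng_key(int b, int has_66, int has_f3, int has_f2, int rex_w)
{
    return (b & 0xff) | M_0F
        | (has_66 ? P_66 : 0)
        | (has_f3 ? P_F3 : 0)
        | (has_f2 ? P_F2 : 0)
        | (rex_w > 0 ? W_1 : W_0);
}
```

With this layout, opcode 0x6f with a 0x66 prefix and no REX.W yields 0x6f | M_0F | P_66, which is 0x116f.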




[Qemu-devel] [RFC PATCH v3 04/46] target/i386: use dflag from DisasContext

2019-08-14 Thread Jan Bobek
There already is a variable dflag in DisasContext, so reduce the scope
of the local variable dflag to enforce use of the one in DisasContext.

Suggested-by: Richard Henderson 
Reviewed-by: Richard Henderson 
Signed-off-by: Jan Bobek 
---
 target/i386/translate.c | 184 
 1 file changed, 92 insertions(+), 92 deletions(-)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index bda96277e4..bb13877df7 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4493,13 +4493,13 @@ static target_ulong disas_insn(DisasContext *s, 
CPUState *cpu)
 CPUX86State *env = cpu->env_ptr;
 int b, prefixes;
 int shift;
-TCGMemOp ot, dflag;
+TCGMemOp ot;
 int modrm, reg, rm, mod, op, opreg, val;
 target_ulong next_eip, tval;
 target_ulong pc_start = s->base.pc_next;
 
 {
-TCGMemOp aflag;
+TCGMemOp aflag, dflag;
 
 s->pc_start = s->pc = pc_start;
 s->override = -1;
@@ -4686,7 +4686,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState 
*cpu)
 op = (b >> 3) & 7;
 f = (b >> 1) & 3;
 
-ot = mo_b_d(b, dflag);
+ot = mo_b_d(b, s->dflag);
 
 switch(f) {
 case 0: /* OP Ev, Gv */
@@ -4744,7 +4744,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState 
*cpu)
 {
 int val;
 
-ot = mo_b_d(b, dflag);
+ot = mo_b_d(b, s->dflag);
 
 modrm = x86_ldub_code(env, s);
 mod = (modrm >> 6) & 3;
@@ -4781,16 +4781,16 @@ static target_ulong disas_insn(DisasContext *s, 
CPUState *cpu)
 /**/
 /* inc, dec, and other misc arith */
 case 0x40 ... 0x47: /* inc Gv */
-ot = dflag;
+ot = s->dflag;
 gen_inc(s, ot, OR_EAX + (b & 7), 1);
 break;
 case 0x48 ... 0x4f: /* dec Gv */
-ot = dflag;
+ot = s->dflag;
 gen_inc(s, ot, OR_EAX + (b & 7), -1);
 break;
 case 0xf6: /* GRP3 */
 case 0xf7:
-ot = mo_b_d(b, dflag);
+ot = mo_b_d(b, s->dflag);
 
 modrm = x86_ldub_code(env, s);
 mod = (modrm >> 6) & 3;
@@ -5022,7 +5022,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState 
*cpu)
 
 case 0xfe: /* GRP4 */
 case 0xff: /* GRP5 */
-ot = mo_b_d(b, dflag);
+ot = mo_b_d(b, s->dflag);
 
 modrm = x86_ldub_code(env, s);
 mod = (modrm >> 6) & 3;
@@ -5036,10 +5036,10 @@ static target_ulong disas_insn(DisasContext *s, 
CPUState *cpu)
 /* operand size for jumps is 64 bit */
 ot = MO_64;
 } else if (op == 3 || op == 5) {
-ot = dflag != MO_16 ? MO_32 + (REX_W(s) == 1) : MO_16;
+ot = s->dflag != MO_16 ? MO_32 + (REX_W(s) == 1) : MO_16;
 } else if (op == 6) {
 /* default push size is 64 bit */
-ot = mo_pushpop(s, dflag);
+ot = mo_pushpop(s, s->dflag);
 }
 }
 if (mod != 3) {
@@ -5067,7 +5067,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState 
*cpu)
 break;
 case 2: /* call Ev */
 /* XXX: optimize if memory (no 'and' is necessary) */
-if (dflag == MO_16) {
+if (s->dflag == MO_16) {
 tcg_gen_ext16u_tl(s->T0, s->T0);
 }
 next_eip = s->pc - s->cs_base;
@@ -5085,19 +5085,19 @@ static target_ulong disas_insn(DisasContext *s, 
CPUState *cpu)
 if (s->pe && !s->vm86) {
 tcg_gen_trunc_tl_i32(s->tmp2_i32, s->T0);
 gen_helper_lcall_protected(cpu_env, s->tmp2_i32, s->T1,
-   tcg_const_i32(dflag - 1),
+   tcg_const_i32(s->dflag - 1),
tcg_const_tl(s->pc - s->cs_base));
 } else {
 tcg_gen_trunc_tl_i32(s->tmp2_i32, s->T0);
 gen_helper_lcall_real(cpu_env, s->tmp2_i32, s->T1,
-  tcg_const_i32(dflag - 1),
+  tcg_const_i32(s->dflag - 1),
   tcg_const_i32(s->pc - s->cs_base));
 }
 tcg_gen_ld_tl(s->tmp4, cpu_env, offsetof(CPUX86State, eip));
 gen_jr(s, s->tmp4);
 break;
 case 4: /* jmp Ev */
-if (dflag == MO_16) {
+if (s->dflag == MO_16) {
 tcg_gen_ext16u_tl(s->T0, s->T0);
 }
 gen_op_jmp_v(s->T0);
@@ -5130,7 +5130,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState 
*cpu)
 
 case 0x84: /* test Ev, Gv */
 case 0x85:
-ot = mo_b_d(b, dflag);
+ot = mo_b_d(b, s->dflag);
 
 modrm = x86_ldub_code(env, s);
 reg = ((modrm >> 3) & 7) | REX_R(s);
@@ -5143,7 +5143,7 @@ static target_ulong disas_insn(DisasContext *s, 

[Qemu-devel] [RFC PATCH v3 13/46] target/i386: disable unused function warning temporarily

2019-08-14 Thread Jan Bobek
Some functions added later are generated by preprocessor macros and
end up being unused (e.g. not all operands can serve as a destination
operand). Disable unused function warnings for the new code until I
figure out how I want to solve this particular issue.

Note: This changeset is intended for development only and shall not be
included in the final patch series.

Signed-off-by: Jan Bobek 
---
 target/i386/translate.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index fdc7cb0054..e9741cd7f7 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4489,6 +4489,10 @@ static void gen_sse(CPUX86State *env, DisasContext *s, 
int b)
 }
 }
 
+/* XXX TODO get rid of this eventually */
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wunused-function"
+
 static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
 {
 enum {
@@ -4515,6 +4519,7 @@ static void gen_sse_ng(CPUX86State *env, DisasContext *s, 
int b)
 
 g_assert_not_reached();
 }
+#pragma GCC diagnostic pop
 
 /* convert one instruction. s->base.is_jmp is set if the translation must
be stopped. Return the next pc value */
-- 
2.20.1




[Qemu-devel] [RFC PATCH v3 02/46] target/i386: Push rex_w into DisasContext

2019-08-14 Thread Jan Bobek
From: Richard Henderson 

Treat this the same as we already do for other rex bits.

Signed-off-by: Richard Henderson 
---
 target/i386/translate.c | 19 +++
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index d74dbfd585..c0866c2797 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -44,11 +44,13 @@
 #define REX_X(s) ((s)->rex_x)
 #define REX_B(s) ((s)->rex_b)
 #define REX_R(s) ((s)->rex_r)
+#define REX_W(s) ((s)->rex_w)
 #else
 #define CODE64(s) 0
 #define REX_X(s) 0
 #define REX_B(s) 0
 #define REX_R(s) 0
+#define REX_W(s) -1
 #endif
 
 #ifdef TARGET_X86_64
@@ -100,7 +102,7 @@ typedef struct DisasContext {
 #ifdef TARGET_X86_64
 int lma;/* long mode active */
 int code64; /* 64 bit code segment */
-int rex_x, rex_b, rex_r;
+int rex_x, rex_b, rex_r, rex_w;
 #endif
 int vex_l;  /* vex vector length */
 int vex_v;  /* vex  register, without 1's complement.  */
@@ -4495,7 +4497,6 @@ static target_ulong disas_insn(DisasContext *s, CPUState 
*cpu)
 int modrm, reg, rm, mod, op, opreg, val;
 target_ulong next_eip, tval;
 target_ulong pc_start = s->base.pc_next;
-int rex_w;
 
 s->pc_start = s->pc = pc_start;
 s->override = -1;
@@ -4503,6 +4504,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState 
*cpu)
 s->rex_x = 0;
 s->rex_b = 0;
 s->rex_r = 0;
+s->rex_w = -1;
 s->x86_64_hregs = false;
 #endif
 s->rip_offset = 0; /* for relative ip address */
@@ -4514,7 +4516,6 @@ static target_ulong disas_insn(DisasContext *s, CPUState 
*cpu)
 }
 
 prefixes = 0;
-rex_w = -1;
 
  next_byte:
 b = x86_ldub_code(env, s);
@@ -4557,7 +4558,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState 
*cpu)
 case 0x40 ... 0x4f:
 if (CODE64(s)) {
 /* REX prefix */
-rex_w = (b >> 3) & 1;
+s->rex_w = (b >> 3) & 1;
 s->rex_r = (b & 0x4) << 1;
 s->rex_x = (b & 0x2) << 2;
 s->rex_b = (b & 0x1) << 3;
@@ -4606,7 +4607,9 @@ static target_ulong disas_insn(DisasContext *s, CPUState 
*cpu)
 s->rex_b = (~vex2 >> 2) & 8;
 #endif
 vex3 = x86_ldub_code(env, s);
-rex_w = (vex3 >> 7) & 1;
+#ifdef TARGET_X86_64
+s->rex_w = (vex3 >> 7) & 1;
+#endif
 switch (vex2 & 0x1f) {
 case 0x01: /* Implied 0f leading opcode bytes.  */
 b = x86_ldub_code(env, s) | 0x100;
@@ -4631,9 +4634,9 @@ static target_ulong disas_insn(DisasContext *s, CPUState 
*cpu)
 /* Post-process prefixes.  */
 if (CODE64(s)) {
 /* In 64-bit mode, the default data size is 32-bit.  Select 64-bit
-   data with rex_w, and 16-bit data with 0x66; rex_w takes precedence
+   data with REX_W, and 16-bit data with 0x66; REX_W takes precedence
over 0x66 if both are present.  */
-dflag = (rex_w > 0 ? MO_64 : prefixes & PREFIX_DATA ? MO_16 : MO_32);
+dflag = (REX_W(s) > 0 ? MO_64 : prefixes & PREFIX_DATA ? MO_16 : 
MO_32);
 /* In 64-bit mode, 0x67 selects 32-bit addressing.  */
 aflag = (prefixes & PREFIX_ADR ? MO_32 : MO_64);
 } else {
@@ -5029,7 +5032,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState 
*cpu)
 /* operand size for jumps is 64 bit */
 ot = MO_64;
 } else if (op == 3 || op == 5) {
-ot = dflag != MO_16 ? MO_32 + (rex_w == 1) : MO_16;
+ot = dflag != MO_16 ? MO_32 + (REX_W(s) == 1) : MO_16;
 } else if (op == 6) {
 /* default push size is 64 bit */
 ot = mo_pushpop(s, dflag);
-- 
2.20.1
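As a reminder of what ends up in DisasContext, here is a standalone sketch of the REX decoding and of the operand-size selection described in the comment above. The type and function names are hypothetical; the bit manipulation follows the lines shown in the diff.

```c
#include <stdint.h>

/* REX bits as stored by the translator: W is 0/1, R/X/B are 0 or 8. */
typedef struct {
    int w, r, x, b;
} RexBits;

/* Decodes a REX prefix byte (0x40..0x4f). */
RexBits decode_rex(uint8_t byte)
{
    RexBits rex;

    rex.w = (byte >> 3) & 1;
    rex.r = (byte & 0x4) << 1;
    rex.x = (byte & 0x2) << 2;
    rex.b = (byte & 0x1) << 3;
    return rex;
}

/*
 * Operand size in bits for 64-bit code: REX.W selects 64-bit data and
 * takes precedence over 0x66, which selects 16-bit; the default is 32.
 * rex_w is -1 when no REX prefix was seen.
 */
int operand_size_bits(int rex_w, int has_66)
{
    return rex_w > 0 ? 64 : has_66 ? 16 : 32;
}
```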




[Qemu-devel] [RFC PATCH v3 01/46] target/i386: Push rex_r into DisasContext

2019-08-14 Thread Jan Bobek
From: Richard Henderson 

Treat this value the same as we do for rex_b and rex_x.

Signed-off-by: Richard Henderson 
---
 target/i386/translate.c | 85 +
 1 file changed, 44 insertions(+), 41 deletions(-)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 03150a86e2..d74dbfd585 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -43,10 +43,12 @@
 #define CODE64(s) ((s)->code64)
 #define REX_X(s) ((s)->rex_x)
 #define REX_B(s) ((s)->rex_b)
+#define REX_R(s) ((s)->rex_r)
 #else
 #define CODE64(s) 0
 #define REX_X(s) 0
 #define REX_B(s) 0
+#define REX_R(s) 0
 #endif
 
 #ifdef TARGET_X86_64
@@ -98,7 +100,7 @@ typedef struct DisasContext {
 #ifdef TARGET_X86_64
 int lma;/* long mode active */
 int code64; /* 64 bit code segment */
-int rex_x, rex_b;
+int rex_x, rex_b, rex_r;
 #endif
 int vex_l;  /* vex vector length */
 int vex_v;  /* vex  register, without 1's complement.  */
@@ -3037,7 +3039,7 @@ static const struct SSEOpHelper_eppi sse_op_table7[256] = 
{
 };
 
 static void gen_sse(CPUX86State *env, DisasContext *s, int b,
-target_ulong pc_start, int rex_r)
+target_ulong pc_start)
 {
 int b1, op1_offset, op2_offset, is_xmm, val;
 int modrm, mod, rm, reg;
@@ -3107,8 +3109,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s, 
int b,
 
 modrm = x86_ldub_code(env, s);
 reg = ((modrm >> 3) & 7);
-if (is_xmm)
-reg |= rex_r;
+if (is_xmm) {
+reg |= REX_R(s);
+}
 mod = (modrm >> 6) & 3;
 if (sse_fn_epp == SSE_SPECIAL) {
 b |= (b1 << 8);
@@ -3642,7 +3645,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, 
int b,
 tcg_gen_ld16u_tl(s->T0, cpu_env,
 
offsetof(CPUX86State,fpregs[rm].mmx.MMX_W(val)));
 }
-reg = ((modrm >> 3) & 7) | rex_r;
+reg = ((modrm >> 3) & 7) | REX_R(s);
 gen_op_mov_reg_v(s, ot, reg, s->T0);
 break;
 case 0x1d6: /* movq ea, xmm */
@@ -3686,7 +3689,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, 
int b,
  offsetof(CPUX86State, fpregs[rm].mmx));
 gen_helper_pmovmskb_mmx(s->tmp2_i32, cpu_env, s->ptr0);
 }
-reg = ((modrm >> 3) & 7) | rex_r;
+reg = ((modrm >> 3) & 7) | REX_R(s);
 tcg_gen_extu_i32_tl(cpu_regs[reg], s->tmp2_i32);
 break;
 
@@ -3698,7 +3701,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, 
int b,
 }
 modrm = x86_ldub_code(env, s);
 rm = modrm & 7;
-reg = ((modrm >> 3) & 7) | rex_r;
+reg = ((modrm >> 3) & 7) | REX_R(s);
 mod = (modrm >> 6) & 3;
 if (b1 >= 2) {
 goto unknown_op;
@@ -3774,7 +3777,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, 
int b,
 /* Various integer extensions at 0f 38 f[0-f].  */
 b = modrm | (b1 << 8);
 modrm = x86_ldub_code(env, s);
-reg = ((modrm >> 3) & 7) | rex_r;
+reg = ((modrm >> 3) & 7) | REX_R(s);
 
 switch (b) {
 case 0x3f0: /* crc32 Gd,Eb */
@@ -4128,7 +4131,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, 
int b,
 b = modrm;
 modrm = x86_ldub_code(env, s);
 rm = modrm & 7;
-reg = ((modrm >> 3) & 7) | rex_r;
+reg = ((modrm >> 3) & 7) | REX_R(s);
 mod = (modrm >> 6) & 3;
 if (b1 >= 2) {
 goto unknown_op;
@@ -4148,7 +4151,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, 
int b,
 rm = (modrm & 7) | REX_B(s);
 if (mod != 3)
 gen_lea_modrm(env, s, modrm);
-reg = ((modrm >> 3) & 7) | rex_r;
+reg = ((modrm >> 3) & 7) | REX_R(s);
 val = x86_ldub_code(env, s);
 switch (b) {
 case 0x14: /* pextrb */
@@ -4317,7 +4320,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, 
int b,
 /* Various integer extensions at 0f 3a f[0-f].  */
 b = modrm | (b1 << 8);
 modrm = x86_ldub_code(env, s);
-reg = ((modrm >> 3) & 7) | rex_r;
+reg = ((modrm >> 3) & 7) | REX_R(s);
 
 switch (b) {
 case 0x3f0: /* rorx Gy,Ey, Ib */
@@ -4491,14 +4494,15 @@ static target_ulong disas_insn(DisasContext *s, 
CPUState *cpu)
 TCGMemOp ot, aflag, dflag;
 int modrm, reg, rm, mod, op, opreg, val;
 target_ulong next_eip, tval;
-int rex_w, rex_r;
 target_ulong pc_start = s->base.pc_next;
+int rex_w;
 
 s->pc_start = s->pc = pc_start;
 s->override = -1;
 #ifdef TARGET_X86_64
 s->rex_x = 0;
 s->rex_b = 0;
+s->rex_r = 0;
 s->x86_64_hregs = false;
 #endif
 s->rip_offset = 0

[Qemu-devel] [RFC PATCH v3 03/46] target/i386: reduce scope of variable aflag

2019-08-14 Thread Jan Bobek
The variable aflag is not used in most of disas_insn; make this clear
by explicitly reducing its scope to the block where it is used.

Suggested-by: Richard Henderson 
Reviewed-by: Richard Henderson 
Signed-off-by: Jan Bobek 
---
 target/i386/translate.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index c0866c2797..bda96277e4 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4493,11 +4493,14 @@ static target_ulong disas_insn(DisasContext *s, 
CPUState *cpu)
 CPUX86State *env = cpu->env_ptr;
 int b, prefixes;
 int shift;
-TCGMemOp ot, aflag, dflag;
+TCGMemOp ot, dflag;
 int modrm, reg, rm, mod, op, opreg, val;
 target_ulong next_eip, tval;
 target_ulong pc_start = s->base.pc_next;
 
+{
+TCGMemOp aflag;
+
 s->pc_start = s->pc = pc_start;
 s->override = -1;
 #ifdef TARGET_X86_64
@@ -4657,6 +4660,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState 
*cpu)
 s->prefix = prefixes;
 s->aflag = aflag;
 s->dflag = dflag;
+}
 
 /* now check op code */
  reswitch:
-- 
2.20.1




[Qemu-devel] [RFC PATCH v3 10/46] target/i386: add vector register file alignment constraints

2019-08-14 Thread Jan Bobek
gvec operations require that all vectors be aligned on a 16-byte
boundary; make sure the MM/XMM/YMM/ZMM register file is aligned as
necessary.

Reviewed-by: Richard Henderson 
Signed-off-by: Jan Bobek 
---
 target/i386/cpu.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 8b3dc5533e..cb407b86ba 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1199,9 +1199,9 @@ typedef struct CPUX86State {
 float_status mmx_status; /* for 3DNow! float ops */
 float_status sse_status;
 uint32_t mxcsr;
-ZMMReg xmm_regs[CPU_NB_REGS == 8 ? 8 : 32];
-ZMMReg xmm_t0;
-MMXReg mmx_t0;
+ZMMReg xmm_regs[CPU_NB_REGS == 8 ? 8 : 32] QEMU_ALIGNED(16);
+ZMMReg xmm_t0 QEMU_ALIGNED(16);
+MMXReg mmx_t0 QEMU_ALIGNED(8);
 
 XMMReg ymmh_regs[CPU_NB_REGS];
 
-- 
2.20.1
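The effect of such an annotation on a struct member can be checked with offsetof. The sketch below assumes GCC or Clang and uses a hypothetical SK_ALIGNED macro and made-up types in place of QEMU_ALIGNED and CPUX86State.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for QEMU_ALIGNED; GCC/Clang attribute syntax. */
#define SK_ALIGNED(x) __attribute__((aligned(x)))

/* A 64-byte register, made up for this sketch. */
typedef struct {
    uint8_t bytes[64];
} SketchVecReg;

typedef struct {
    uint32_t mxcsr;                      /* 4 bytes, forces padding below */
    SketchVecReg regs[8] SK_ALIGNED(16); /* starts on a 16-byte boundary */
    SketchVecReg t0 SK_ALIGNED(16);
} SketchState;
```

Without the attribute, regs would follow mxcsr at offset 4, since SketchVecReg only has byte alignment.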




[Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation

2019-08-14 Thread Jan Bobek
The previous version can be found at [1]. Changes compared to v2:

  - Expanded the instruction operand infrastructure a bit; I am now
fairly confident that it is powerful enough to accommodate all
the use cases I will need. It's still a bit clunky to work with at
times, but I am happy with it for now.

  - Reduced the number of various INSN_* (now called OPCODE_*) macro
variants using variadic macros.

  - Implemented translation for instructions up to SSE3.

Cheers,
 -Jan

References:
  1. https://lists.nongnu.org/archive/html/qemu-devel/2019-08/msg01790.html

Jan Bobek (43):
  target/i386: reduce scope of variable aflag
  target/i386: use dflag from DisasContext
  target/i386: use prefix from DisasContext
  target/i386: use pc_start from DisasContext
  target/i386: make variable b1 const
  target/i386: make variable is_xmm const
  target/i386: add vector register file alignment constraints
  target/i386: introduce gen_(ld,st)d_env_A0
  target/i386: introduce gen_sse_ng
  target/i386: disable unused function warning temporarily
  target/i386: introduce mnemonic aliases for several gvec operations
  target/i386: introduce function ck_cpuid
  target/i386: introduce instruction operand infrastructure
  target/i386: introduce generic operand alias
  target/i386: introduce generic either-or operand
  target/i386: introduce generic load-store operand
  target/i386: introduce tcg_temp operands
  target/i386: introduce modrm operand
  target/i386: introduce operands for decoding modrm fields
  target/i386: introduce operand for direct-only r/m field
  target/i386: introduce operand vex_v
  target/i386: introduce Ib (immediate) operand
  target/i386: introduce M* (memptr) operands
  target/i386: introduce G*, R*, E* (general register) operands
  target/i386: introduce P*, N*, Q* (MMX) operands
  target/i386: introduce H*, V*, U*, W* (SSE/AVX) operands
  target/i386: introduce code generators
  target/i386: introduce helper-based code generator macros
  target/i386: introduce gvec-based code generator macros
  target/i386: introduce sse-opcode.inc.h
  target/i386: introduce instruction translator macros
  target/i386: introduce MMX translators
  target/i386: introduce MMX code generators
  target/i386: introduce MMX instructions to sse-opcode.inc.h
  target/i386: introduce SSE translators
  target/i386: introduce SSE code generators
  target/i386: introduce SSE instructions to sse-opcode.inc.h
  target/i386: introduce SSE2 translators
  target/i386: introduce SSE2 code generators
  target/i386: introduce SSE2 instructions to sse-opcode.inc.h
  target/i386: introduce SSE3 translators
  target/i386: introduce SSE3 code generators
  target/i386: introduce SSE3 instructions to sse-opcode.inc.h

Richard Henderson (3):
  target/i386: Push rex_r into DisasContext
  target/i386: Push rex_w into DisasContext
  target/i386: Simplify gen_exception arguments

 target/i386/cpu.h|6 +-
 target/i386/sse-opcode.inc.h |  699 +
 target/i386/translate.c  | 2808 ++
 3 files changed, 3189 insertions(+), 324 deletions(-)
 create mode 100644 target/i386/sse-opcode.inc.h

-- 
2.20.1




Re: [Qemu-devel] [PATCH v9 05/11] numa: Extend CLI to provide initiator information for numa nodes

2019-08-14 Thread Tao Xu

On 8/15/2019 5:29 AM, Dan Williams wrote:

On Tue, Aug 13, 2019 at 10:14 PM Tao Xu  wrote:


On 8/14/2019 10:39 AM, Dan Williams wrote:

On Tue, Aug 13, 2019 at 8:00 AM Igor Mammedov  wrote:


On Fri,  9 Aug 2019 14:57:25 +0800
Tao  wrote:


From: Tao Xu 


[...]

+for (i = 0; i < machine->numa_state->num_nodes; i++) {
+if (numa_info[i].initiator_valid &&
+!numa_info[numa_info[i].initiator].has_cpu) {

^^ possible out-of-bounds read, see below


+error_report("The initiator-id %"PRIu16 " of NUMA node %d"
+ " does not exist.", numa_info[i].initiator, i);
+error_printf("\n");
+
+exit(1);
+}

This only takes care of nodes that have CPUs, or memory-only nodes that
have an initiator explicitly provided on the CLI. It leaves open the
possibility of memory-only nodes without an initiator being mixed with
nodes that have one.
Is such a mixed configuration valid?
Should we forbid it?


The spec talks about the "Proximity Domain for the Attached Initiator"
field only being valid if the memory controller for the memory can be
identified by an initiator id in the SRAT. So I expect the only way to
define a memory proximity domain without this local initiator is to
allow specifying a node-id that does not have an entry in the SRAT.


Hi Dan,

So there may be situations where the Attached Initiator field is not
valid? If so, I would allow the user to mark the initiator as invalid.


Yes it's something the OS needs to consider because the platform may
not be able to meet the constraint that a single initiator is
associated with the memory controller for a given memory target. In
retrospect it would have been nice if the spec reserved 0x for
this purpose, but it seems "not in SRAT" is the only way to identify
memory that is not attached to any single initiator.

But as far as I know, QEMU can't emulate a NUMA node that is "not in SRAT". I am
wondering whether it is sufficient to only mark the initiator as invalid?
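Regarding the possible out-of-bounds read flagged above: one way to avoid it is to range-check the initiator id before using it as an index. This is only a sketch under assumptions; the struct and function below are hypothetical stand-ins and not the actual numa_info layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-node information. */
typedef struct {
    bool has_cpu;
    bool initiator_valid;
    uint16_t initiator;
} NodeSketch;

/*
 * Returns the index of the first node whose initiator is out of range
 * or refers to a node without CPUs, or -1 if every initiator is usable.
 */
int first_bad_initiator(const NodeSketch *nodes, int num_nodes)
{
    for (int i = 0; i < num_nodes; i++) {
        if (!nodes[i].initiator_valid) {
            continue;
        }
        /* Check the range first so the lookup below stays in bounds. */
        if (nodes[i].initiator >= num_nodes) {
            return i;
        }
        if (!nodes[nodes[i].initiator].has_cpu) {
            return i;
        }
    }
    return -1;
}
```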





Re: [Qemu-devel] [PATCH 1/3] riscv: sifive_u: Add support for loading initrd

2019-08-14 Thread Palmer Dabbelt

On Wed, 14 Aug 2019 18:30:59 PDT (-0700), bmeng...@gmail.com wrote:

Hi Palmer,

On Thu, Aug 15, 2019 at 1:06 AM Palmer Dabbelt  wrote:


On Mon, 12 Aug 2019 16:48:00 PDT (-0700), bmeng...@gmail.com wrote:
> Hi Palmer,
>
> On Tue, Aug 13, 2019 at 6:45 AM Palmer Dabbelt  wrote:
>>
>> On Fri, 19 Jul 2019 06:40:43 PDT (-0700), li...@roeck-us.net wrote:
>> > Add support for loading initrd with "-initrd "
>> > to the sifive_u machine. This lets us boot into Linux without
>> > disk drive.
>> >
>> > Signed-off-by: Guenter Roeck 
>> > ---
>> >  hw/riscv/sifive_u.c | 20 +---
>> >  1 file changed, 17 insertions(+), 3 deletions(-)
>> >
>> > diff --git a/hw/riscv/sifive_u.c b/hw/riscv/sifive_u.c
>> > index 71b8083..0657046 100644
>> > --- a/hw/riscv/sifive_u.c
>> > +++ b/hw/riscv/sifive_u.c
>> > @@ -67,7 +67,7 @@ static const struct MemmapEntry {
>> >
>> >  #define GEM_REVISION0x10070109
>> >
>> > -static void create_fdt(SiFiveUState *s, const struct MemmapEntry *memmap,
>> > +static void *create_fdt(SiFiveUState *s, const struct MemmapEntry *memmap,
>> >  uint64_t mem_size, const char *cmdline)
>> >  {
>> >  void *fdt;
>> > @@ -244,11 +244,14 @@ static void create_fdt(SiFiveUState *s, const struct 
MemmapEntry *memmap,
>> >  qemu_fdt_setprop_string(fdt, "/chosen", "bootargs", cmdline);
>> >  }
>> >  g_free(nodename);
>> > +
>> > +return fdt;
>> >  }
>> >
>> >  static void riscv_sifive_u_init(MachineState *machine)
>> >  {
>> >  const struct MemmapEntry *memmap = sifive_u_memmap;
>> > +void *fdt;
>> >
>> >  SiFiveUState *s = g_new0(SiFiveUState, 1);
>> >  MemoryRegion *system_memory = get_system_memory();
>> > @@ -269,13 +272,24 @@ static void riscv_sifive_u_init(MachineState 
*machine)
>> >  main_mem);
>> >
>> >  /* create device tree */
>> > -create_fdt(s, memmap, machine->ram_size, machine->kernel_cmdline);
>> > +fdt = create_fdt(s, memmap, machine->ram_size, 
machine->kernel_cmdline);
>> >
>> >  riscv_find_and_load_firmware(machine, BIOS_FILENAME,
>> >   memmap[SIFIVE_U_DRAM].base);
>> >
>> >  if (machine->kernel_filename) {
>> > -riscv_load_kernel(machine->kernel_filename);
>> > +uint64_t kernel_entry = 
riscv_load_kernel(machine->kernel_filename);
>> > +
>> > +if (machine->initrd_filename) {
>> > +hwaddr start;
>> > +hwaddr end = riscv_load_initrd(machine->initrd_filename,
>> > +   machine->ram_size, 
kernel_entry,
>> > +   &start);
>> > +qemu_fdt_setprop_cell(fdt, "/chosen",
>> > +  "linux,initrd-start", start);
>> > +qemu_fdt_setprop_cell(fdt, "/chosen", "linux,initrd-end",
>> > +  end);
>> > +}
>> >  }
>> >
>> >  /* reset vector */
>>
>> Thanks.  I've queued all three of these.
>>
>
> Ah, looks I did a duplicate.
> http://patchwork.ozlabs.org/patch/1145247/
>
> Which git repo/branch should I rebase my series on?

github.com/palmer-dabbelt/riscv-qemu -b for-master


I did not see branch "for-master" in the riscv-qemu repo. However I
did find the branch in the github.com/palmer-dabbelt/qemu repo.

I assume that's the correct one I should rebase my patch series on.


Thanks, I've deleted that confusing fork.



Re: [Qemu-devel] [PATCH 1/3] riscv: sifive_u: Add support for loading initrd

2019-08-14 Thread Bin Meng
Hi Palmer,

On Thu, Aug 15, 2019 at 1:06 AM Palmer Dabbelt  wrote:
>
> On Mon, 12 Aug 2019 16:48:00 PDT (-0700), bmeng...@gmail.com wrote:
> > Hi Palmer,
> >
> > On Tue, Aug 13, 2019 at 6:45 AM Palmer Dabbelt  wrote:
> >>
> >> On Fri, 19 Jul 2019 06:40:43 PDT (-0700), li...@roeck-us.net wrote:
> >> > Add support for loading initrd with "-initrd "
> >> > to the sifive_u machine. This lets us boot into Linux without
> >> > disk drive.
> >> >
> >> > Signed-off-by: Guenter Roeck 
> >> > ---
> >> >  hw/riscv/sifive_u.c | 20 +---
> >> >  1 file changed, 17 insertions(+), 3 deletions(-)
> >> >
> >> > diff --git a/hw/riscv/sifive_u.c b/hw/riscv/sifive_u.c
> >> > index 71b8083..0657046 100644
> >> > --- a/hw/riscv/sifive_u.c
> >> > +++ b/hw/riscv/sifive_u.c
> >> > @@ -67,7 +67,7 @@ static const struct MemmapEntry {
> >> >
> >> >  #define GEM_REVISION0x10070109
> >> >
> >> > -static void create_fdt(SiFiveUState *s, const struct MemmapEntry 
> >> > *memmap,
> >> > +static void *create_fdt(SiFiveUState *s, const struct MemmapEntry 
> >> > *memmap,
> >> >  uint64_t mem_size, const char *cmdline)
> >> >  {
> >> >  void *fdt;
> >> > @@ -244,11 +244,14 @@ static void create_fdt(SiFiveUState *s, const 
> >> > struct MemmapEntry *memmap,
> >> >  qemu_fdt_setprop_string(fdt, "/chosen", "bootargs", cmdline);
> >> >  }
> >> >  g_free(nodename);
> >> > +
> >> > +return fdt;
> >> >  }
> >> >
> >> >  static void riscv_sifive_u_init(MachineState *machine)
> >> >  {
> >> >  const struct MemmapEntry *memmap = sifive_u_memmap;
> >> > +void *fdt;
> >> >
> >> >  SiFiveUState *s = g_new0(SiFiveUState, 1);
> >> >  MemoryRegion *system_memory = get_system_memory();
> >> > @@ -269,13 +272,24 @@ static void riscv_sifive_u_init(MachineState 
> >> > *machine)
> >> >  main_mem);
> >> >
> >> >  /* create device tree */
> >> > -create_fdt(s, memmap, machine->ram_size, machine->kernel_cmdline);
> >> > +fdt = create_fdt(s, memmap, machine->ram_size, 
> >> > machine->kernel_cmdline);
> >> >
> >> >  riscv_find_and_load_firmware(machine, BIOS_FILENAME,
> >> >   memmap[SIFIVE_U_DRAM].base);
> >> >
> >> >  if (machine->kernel_filename) {
> >> > -riscv_load_kernel(machine->kernel_filename);
> >> > +uint64_t kernel_entry = 
> >> > riscv_load_kernel(machine->kernel_filename);
> >> > +
> >> > +if (machine->initrd_filename) {
> >> > +hwaddr start;
> >> > +hwaddr end = riscv_load_initrd(machine->initrd_filename,
> >> > +   machine->ram_size, 
> >> > kernel_entry,
> >> > +   &start);
> >> > +qemu_fdt_setprop_cell(fdt, "/chosen",
> >> > +  "linux,initrd-start", start);
> >> > +qemu_fdt_setprop_cell(fdt, "/chosen", "linux,initrd-end",
> >> > +  end);
> >> > +}
> >> >  }
> >> >
> >> >  /* reset vector */
> >>
> >> Thanks.  I've queued all three of these.
> >>
> >
> > Ah, looks I did a duplicate.
> > http://patchwork.ozlabs.org/patch/1145247/
> >
> > Which git repo/branch should I rebase my series on?
>
> github.com/palmer-dabbelt/riscv-qemu -b for-master

I did not see branch "for-master" in the riscv-qemu repo. However I
did find the branch in the github.com/palmer-dabbelt/qemu repo.

I assume that's the correct one I should rebase my patch series on.

Regards,
Bin



Re: [Qemu-devel] [RFC PATCH v2 21/39] target/i386: introduce insn.h

2019-08-14 Thread Jan Bobek
On 8/13/19 2:00 AM, Richard Henderson wrote:
> On 8/10/19 5:12 AM, Jan Bobek wrote:
>> This header is intended to eventually list all supported instructions
>> along with some useful details (e.g. mnemonics, opcode, operands etc.)
>> It shall be used (along with some preprocessor magic) anytime we need
>> to automatically generate code for every instruction.
>>
>> Signed-off-by: Jan Bobek 
>> ---
>>  target/i386/insn.h | 87 ++
>>  1 file changed, 87 insertions(+)
>>  create mode 100644 target/i386/insn.h
> 
> Things that are included multiple times should be named *.inc.h.  There are
> quite a few that don't follow this in the tree, but we are slowly fixing 
> those.
> 
> Though even "insn.inc.h" isn't particularly descriptive, and definitely
> overstates the case.  Maybe sse-opcode.inc.h?  While it's not only sse, it is
> used by gen_sse_ng().

"sse-opcode.inc.h" isn't 100 % as you point out, but looks good enough for now.

-Jan





Re: [Qemu-devel] [PATCH v9 00/11] Build ACPI Heterogeneous Memory Attribute Table (HMAT)

2019-08-14 Thread Tao Xu

On 8/15/2019 4:57 AM, Eduardo Habkost wrote:

On Tue, Aug 13, 2019 at 04:53:33PM +0800, Tao Xu wrote:

Hi Igor and Eduardo,

I am wondering if there are more comments about patches 1/11~4/11? These
4 patches are independent, and the series is big and has been under review
for a long time. Could patches 1/11~4/11 be queued first?


Now that I got a few Acked-bys for patch 1/4, I plan to queue
patches 1-4 in machine-next soon.


Thank you very much!



[Qemu-devel] current QEMU can't start pc-q35-2.12 SEV guest

2019-08-14 Thread Bruce Rogers
Hi,

I ran into a case where a guest on an SEV-capable host, enabled to use
SEV and using an older machine type, was no longer able to run after
the QEMU version had been updated.

Specifically, when the guest was installed and running under a v2.12
QEMU, set up for SEV (ok it was v2.11 with SEV support backported, but
the details still apply), using a command line such as follows:

qemu-system-x86_64 -cpu EPYC-IBRS \
-machine pc-q35-2.12,accel=kvm,memory-encryption=sev0 \
-object sev-guest,id=sev0,...

The guest ran fine, using SEV memory encryption.

Later the version of QEMU was updated to v3.1.0, and the same guest now
hung at boot, when using the exact same command line. (Current QEMU
still has the same problem.)

Upon investigation, I find that the handling of xlevel in
target/i386/cpu.c includes an explicit detection of SEV being
enabled and sets the cpuid_min_xlevel in the CPUX86State structure to
0x801F as the required minimum for SEV support. This is normally
used to set the xlevel the guest sees, allowing it to use SEV.

The compat settings for the v2.12 machine type include an xlevel value
associated with it (0x800A). Unfortunately the processing of the
compat settings gets conflated with the logic of handling a user
explicitly specifying an xlevel on the command line, which is treated
as an "override" condition, overriding the other xlevel selections
which would otherwise be done in the QEMU cpu code.

So, in the scenario I describe above, the original, working case would
provide a cpuid xlevel value of 0x801F to the guest (correct), and
the failing case ends up providing the value 0x800A (incorrect).

It seems to me that the handling of the compat settings and the
explicit xlevel setting by the user should be processed separately, but
I don't see how to do that easily.

How should this problem be resolved?

In my case, I've added another sev_enabled() check to the code that
handles a user-provided xlevel value; if SEV is enabled, I still apply
the cpuid_min_xlevel value. This works for the time being, but doesn't
seem to be the right solution.
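
To make the workaround concrete, here is a rough sketch of the selection logic as I understand it. It is not the actual code in target/i386/cpu.c: effective_xlevel() and all of its parameters are hypothetical names for illustration, and it deliberately treats compat settings and command-line values the same way, since that conflation is the problem.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: pick the xlevel the guest sees.
 *   model_xlevel: value derived from the CPU model
 *   min_xlevel:   minimum required by enabled features (e.g. SEV)
 *   has_set:      whether an xlevel came from compat props or the CLI
 *   set_xlevel:   that value, if has_set is true
 *   sev:          whether SEV is enabled for this guest
 */
static uint32_t effective_xlevel(uint32_t model_xlevel, uint32_t min_xlevel,
                                 bool has_set, uint32_t set_xlevel, bool sev)
{
    if (has_set) {
        /* Workaround: with SEV, the override may not drop below the minimum. */
        if (sev && set_xlevel < min_xlevel) {
            return min_xlevel;
        }
        return set_xlevel; /* otherwise the override wins, as before */
    }
    /* No override: raise the model value to the feature minimum if needed. */
    return model_xlevel > min_xlevel ? model_xlevel : min_xlevel;
}
```

A cleaner fix would presumably need to tell a compat-provided value apart from one the user typed, which this sketch cannot do.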

Looking forward to your help with this issue.

Thanks,

Bruce


Re: [Qemu-devel] [RFC PATCH v2 23/39] target/i386: introduce instruction translator macros

2019-08-14 Thread Jan Bobek
On 8/13/19 2:30 AM, Richard Henderson wrote:
> On 8/10/19 5:12 AM, Jan Bobek wrote:
>> +#define CASES_LEG_NP_0F_W0(opcode)  \
>> +case opcode | M_0F | W_0:
>> +#define CASES_LEG_NP_0F_W1(opcode)  \
>> +case opcode | M_0F | W_1:
>> +#define CASES_LEG_F3_0F_W0(opcode)  \
>> +case opcode | M_0F | P_F3 | W_0:
>> +#define CASES_LEG_F3_0F_W1(opcode)  \
>> +case opcode | M_0F | P_F3 | W_1:
>> +
>> +#define LEG(p, m, w)\
>> +CASES_LEG_ ## p ## _ ## m ## _W ## w
>> +#define INSN(mnem, cases, opcode, feat) \
>> +cases(opcode)   \
> 
> It appears as if you don't need the CASES_* macros here.
> 
> #define LEG(p, m, w, op) \
>case P_##p | M_##m | W_##w | op
> 
> #define INSN(mnem, leg, feat) \
>leg: translate_insn(env, s, CK_CPUID_##feat, gen_insn(mnem));
> 
> so long as P_NP is in the enumeration above with value 0.
> 
> Unless there's some other reason that opcode needs to stay separate?

I was thinking ahead with the CASES_* macros here: if I have LIG
and/or WIG in the VEX prefix, I'll need more than one case label,
but only one label in other cases. However, that's not a reason
for the opcode to be separate, and I think I like it stashed with
the rest of the prefix fields better.
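
As a rough illustration of where this could end up, here is a self-contained sketch. The field values are made up for the example, LEG_WIG is a hypothetical name for the "more than one case label" variant, and decode() is a toy stand-in for the real translator switch.

```c
#include <assert.h>

/* Illustrative field encodings; the real values in the series may differ. */
enum {
    P_NP = 0x0000, /* no mandatory prefix; must be 0 for this scheme */
    P_F3 = 0x0100,
    M_0F = 0x0200,
    W_0  = 0x0000,
    W_1  = 0x0400,
};

/* One case label built from prefix, opcode map, W bit and opcode byte. */
#define LEG(p, m, w, op) \
    case P_##p | M_##m | W_##w | (op)

/* W-ignored variant (hypothetical): emits two case labels, one per W value. */
#define LEG_WIG(p, m, op) \
    LEG(p, m, 0, op): LEG(p, m, 1, op)

/* Toy dispatcher: returns an id for the matched encoding, or -1. */
static int decode(int key)
{
    switch (key) {
    LEG(NP, 0F, 0, 0x6e): return 1;
    LEG(NP, 0F, 1, 0x6e): return 2;
    LEG_WIG(F3, 0F, 0x7e): return 3;
    default:               return -1;
    }
}
```

With the opcode folded into LEG, the single-label and multi-label cases share one macro shape, and only the WIG/LIG variants need to expand to several labels.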

-Jan
 





[Qemu-devel] [RFC PATCH] ati-vga: Implement dummy VBlank IRQ

2019-08-14 Thread BALATON Zoltan
The MacOS driver exits if the card does not have an interrupt. If we
set PCI_INTERRUPT_PIN to 1 then it enables VBlank interrupts and it
boots but the mouse pointer cannot be moved. This patch implements a
dummy VBlank interrupt by a timer triggered at 60 Hz to test if it
helps. Unfortunately it doesn't: MacOS with this patch hangs during
boot just polling interrupts and acknowledging them so maybe it needs
something else or there may be some other problem with this
implementation.

This is posted for comments and to let others experiment with it but
probably should not be committed upstream yet.

Signed-off-by: BALATON Zoltan 
---
 hw/display/ati.c  | 41 +
 hw/display/ati_dbg.c  |  1 +
 hw/display/ati_int.h  |  4 
 hw/display/ati_regs.h |  1 +
 4 files changed, 47 insertions(+)

diff --git a/hw/display/ati.c b/hw/display/ati.c
index a365e2455d..e06cbf3e91 100644
--- a/hw/display/ati.c
+++ b/hw/display/ati.c
@@ -243,6 +243,21 @@ static uint64_t ati_i2c(bitbang_i2c_interface *i2c, 
uint64_t data, int base)
 return data;
 }
 
+static void ati_vga_update_irq(ATIVGAState *s)
+{
+pci_set_irq(&s->dev, s->regs.gen_int_status & 1);
+}
+
+static void ati_vga_vblank_irq(void *opaque)
+{
+ATIVGAState *s = opaque;
+
+timer_mod(&s->vblank_timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) +
+  NANOSECONDS_PER_SECOND / 60);
+s->regs.gen_int_status |= 1;
+ati_vga_update_irq(s);
+}
+
 static inline uint64_t ati_reg_read_offs(uint32_t reg, int offs,
  unsigned int size)
 {
@@ -283,6 +298,12 @@ static uint64_t ati_mm_read(void *opaque, hwaddr addr, 
unsigned int size)
 addr - (BIOS_0_SCRATCH + i * 4), size);
 break;
 }
+case GEN_INT_CNTL:
+val = s->regs.gen_int_cntl;
+break;
+case GEN_INT_STATUS:
+val = s->regs.gen_int_status;
+break;
 case CRTC_GEN_CNTL ... CRTC_GEN_CNTL + 3:
 val = ati_reg_read_offs(s->regs.crtc_gen_cntl,
 addr - CRTC_GEN_CNTL, size);
@@ -512,6 +533,19 @@ static void ati_mm_write(void *opaque, hwaddr addr,
addr - (BIOS_0_SCRATCH + i * 4), data, size);
 break;
 }
+case GEN_INT_CNTL:
+s->regs.gen_int_cntl = data;
+if (data & 1) {
+ati_vga_vblank_irq(s);
+} else {
+timer_del(&s->vblank_timer);
+}
+break;
+case GEN_INT_STATUS:
+data &= (s->dev_id == PCI_DEVICE_ID_ATI_RAGE128_PF ?
+ 0x000f040fUL : 0xfc080effUL);
+s->regs.gen_int_status &= ~data;
+break;
 case CRTC_GEN_CNTL ... CRTC_GEN_CNTL + 3:
 {
 uint32_t val = s->regs.crtc_gen_cntl;
@@ -902,12 +936,18 @@ static void ati_vga_realize(PCIDevice *dev, Error **errp)
 pci_register_bar(dev, 0, PCI_BASE_ADDRESS_MEM_PREFETCH, &vga->vram);
 pci_register_bar(dev, 1, PCI_BASE_ADDRESS_SPACE_IO, &s->io);
 pci_register_bar(dev, 2, PCI_BASE_ADDRESS_SPACE_MEMORY, &s->mm);
+
+/* most interrupts are not yet emulated but MacOS needs at least VBlank */
+dev->config[PCI_INTERRUPT_PIN] = 1;
+timer_init_ns(&s->vblank_timer, QEMU_CLOCK_VIRTUAL, ati_vga_vblank_irq, s);
 }
 
 static void ati_vga_reset(DeviceState *dev)
 {
 ATIVGAState *s = ATI_VGA(dev);
 
+timer_del(&s->vblank_timer);
+
 /* reset vga */
 vga_common_reset(&s->vga);
 s->mode = VGA_MODE;
@@ -917,6 +957,7 @@ static void ati_vga_exit(PCIDevice *dev)
 {
 ATIVGAState *s = ATI_VGA(dev);
 
+timer_del(&s->vblank_timer);
 graphic_console_close(s->vga.con);
 }
 
diff --git a/hw/display/ati_dbg.c b/hw/display/ati_dbg.c
index 7e59c41ac2..0ebbd36f14 100644
--- a/hw/display/ati_dbg.c
+++ b/hw/display/ati_dbg.c
@@ -16,6 +16,7 @@ static struct ati_regdesc ati_reg_names[] = {
 {"BUS_CNTL", 0x0030},
 {"BUS_CNTL1", 0x0034},
 {"GEN_INT_CNTL", 0x0040},
+{"GEN_INT_STATUS", 0x0044},
 {"CRTC_GEN_CNTL", 0x0050},
 {"CRTC_EXT_CNTL", 0x0054},
 {"DAC_CNTL", 0x0058},
diff --git a/hw/display/ati_int.h b/hw/display/ati_int.h
index 5b4d3be1e6..2a16708e4f 100644
--- a/hw/display/ati_int.h
+++ b/hw/display/ati_int.h
@@ -9,6 +9,7 @@
 #ifndef ATI_INT_H
 #define ATI_INT_H
 
+#include "qemu/timer.h"
 #include "hw/pci/pci.h"
 #include "hw/i2c/bitbang_i2c.h"
 #include "vga_int.h"
@@ -33,6 +34,8 @@
 typedef struct ATIVGARegs {
 uint32_t mm_index;
 uint32_t bios_scratch[8];
+uint32_t gen_int_cntl;
+uint32_t gen_int_status;
 uint32_t crtc_gen_cntl;
 uint32_t crtc_ext_cntl;
 uint32_t dac_cntl;
@@ -89,6 +92,7 @@ typedef struct ATIVGAState {
 uint16_t cursor_size;
 uint32_t cursor_offset;
 QEMUCursor *cursor;
+QEMUTimer vblank_timer;
 bitbang_i2c_interface bbi2c;
 MemoryRegion io;
 MemoryRegion mm;
diff --git a/hw/display/ati_regs.h b/hw/display/ati_regs.h
index 02046e97c2..2b6f949bd4 100644
--- a/hw/display/ati_regs.h

Re: [Qemu-devel] [PATCH 0/6] Fix multifd with big number of channels

2019-08-14 Thread no-reply
Patchew URL: https://patchew.org/QEMU/20190814020218.1868-1-quint...@redhat.com/



Hi,

This series failed the asan build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-fedora V=1 NETWORK=1
time make docker-test-debug@fedora TARGET_LIST=x86_64-softmmu J=14 NETWORK=1
=== TEST SCRIPT END ===

PASS 1 fdc-test /x86_64/fdc/cmos
PASS 2 fdc-test /x86_64/fdc/no_media_on_start
PASS 3 fdc-test /x86_64/fdc/read_without_media
==7932==WARNING: ASan doesn't fully support makecontext/swapcontext functions 
and may produce false positives in some cases!
PASS 4 fdc-test /x86_64/fdc/media_change
PASS 5 fdc-test /x86_64/fdc/sense_interrupt
PASS 6 fdc-test /x86_64/fdc/relative_seek
---
PASS 32 test-opts-visitor /visitor/opts/range/beyond
PASS 33 test-opts-visitor /visitor/opts/dict/unvisited
MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}  
tests/test-coroutine -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl 
--test-name="test-coroutine" 
==7977==WARNING: ASan doesn't fully support makecontext/swapcontext functions 
and may produce false positives in some cases!
==7977==WARNING: ASan is ignoring requested __asan_handle_no_return: stack top: 
0x7ffec6185000; bottom 0x7fb9ac9f8000; size: 0x00451978d000 (296780091392)
False positive error reports may follow
For details see https://github.com/google/sanitizers/issues/189
PASS 1 test-coroutine /basic/no-dangling-access
---
PASS 11 test-aio /aio/event/wait
PASS 12 test-aio /aio/event/flush
PASS 13 test-aio /aio/event/wait/no-flush-cb
==7996==WARNING: ASan doesn't fully support makecontext/swapcontext functions 
and may produce false positives in some cases!
PASS 14 test-aio /aio/timer/schedule
PASS 15 test-aio /aio/coroutine/queue-chaining
PASS 16 test-aio /aio-gsource/flush
---
MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}  
QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64 QTEST_QEMU_IMG=qemu-img 
tests/ide-test -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl 
--test-name="ide-test" 
PASS 28 test-aio /aio-gsource/timer/schedule
MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}  
tests/test-aio-multithread -m=quick -k --tap < /dev/null | 
./scripts/tap-driver.pl --test-name="test-aio-multithread" 
==8005==WARNING: ASan doesn't fully support makecontext/swapcontext functions 
and may produce false positives in some cases!
==8011==WARNING: ASan doesn't fully support makecontext/swapcontext functions 
and may produce false positives in some cases!
PASS 1 test-aio-multithread /aio/multi/lifecycle
PASS 1 ide-test /x86_64/ide/identify
==8026==WARNING: ASan doesn't fully support makecontext/swapcontext functions 
and may produce false positives in some cases!
PASS 2 test-aio-multithread /aio/multi/schedule
PASS 2 ide-test /x86_64/ide/flush
==8037==WARNING: ASan doesn't fully support makecontext/swapcontext functions 
and may produce false positives in some cases!
PASS 3 test-aio-multithread /aio/multi/mutex/contended
PASS 3 ide-test /x86_64/ide/bmdma/simple_rw
==8048==WARNING: ASan doesn't fully support makecontext/swapcontext functions 
and may produce false positives in some cases!
PASS 4 ide-test /x86_64/ide/bmdma/trim
==8054==WARNING: ASan doesn't fully support makecontext/swapcontext functions 
and may produce false positives in some cases!
PASS 5 ide-test /x86_64/ide/bmdma/short_prdt
==8060==WARNING: ASan doesn't fully support makecontext/swapcontext functions 
and may produce false positives in some cases!
PASS 4 test-aio-multithread /aio/multi/mutex/handoff
PASS 6 ide-test /x86_64/ide/bmdma/one_sector_short_prdt
==8071==WARNING: ASan doesn't fully support makecontext/swapcontext functions 
and may produce false positives in some cases!
PASS 5 test-aio-multithread /aio/multi/mutex/mcs
PASS 7 ide-test /x86_64/ide/bmdma/long_prdt
==8082==WARNING: ASan doesn't fully support makecontext/swapcontext functions 
and may produce false positives in some cases!
==8082==WARNING: ASan is ignoring requested __asan_handle_no_return: stack top: 
0x7ffe85f3; bottom 0x7f578fb66000; size: 0x00a6f63ca000 (717095739392)
False positive error reports may follow
For details see https://github.com/google/sanitizers/issues/189
PASS 6 test-aio-multithread /aio/multi/mutex/pthread
MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}  
tests/test-throttle -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl 
--test-name="test-throttle" 
PASS 8 ide-test /x86_64/ide/bmdma/no_busmaster
==8090==WARNING: ASan doesn't fully support makecontext/swapcontext functions 
and may produce false positives in some cases!
PASS 1 test-throttle /throttle/leak_bucket
PASS 2 test-throttle /throttle/compute_wait
PASS 3 test-throttle /throttle/init
---
MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}  
tests/test-thread-pool -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl 
--test

Re: [Qemu-devel] [RFC PATCH v2 16/39] target/i386: introduce instruction operand infrastructure

2019-08-14 Thread Jan Bobek
On 8/13/19 2:07 AM, Richard Henderson wrote:
> On 8/10/19 5:12 AM, Jan Bobek wrote:
>> +#define INSNOP_INIT(opT, init_stmt)\
>> +static int insnop_init(opT)(CPUX86State *env, DisasContext *s, \
>> +int modrm, insnop_t(opT) *op)  \
>> +{  \
>> +init_stmt; \
>> +}
> ...
>> +#define INSNOP_INIT_FAILreturn 1
>> +#define INSNOP_INIT_OK(x)   return ((*(op) = (x)), 0)
> 
> Return bool and true on success.

So, the reason why I did this "inverted" logic (0 = success, 1 =
failure) is because I was anticipating I might need to differentiate
between two or more different failures, in which case returning
different non-zero values for different error cases makes perfect
sense. I have not made use of it yet, but I'd rather hold on to this
idiom at least for now, until I am 100 % sure it really is
unnecessary.
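
To show what I have in mind, here is a small sketch of the idiom. The error names and init_reg_operand() are hypothetical and only illustrate the convention; none of them exist in the series.

```c
#include <assert.h>

/* Hypothetical status codes: 0 means success, non-zero names the failure. */
enum {
    INIT_OK = 0,
    INIT_ERR_NOT_REG = 1,   /* modrm does not use the register form */
    INIT_ERR_BAD_INDEX = 2, /* register index above the allowed maximum */
};

/* Toy operand initializer: accept only register-form modrm (mod == 3). */
static int init_reg_operand(int modrm, int max_reg, int *op)
{
    int rm = modrm & 7; /* rm field selects the register */

    if (((modrm >> 6) & 3) != 3) {
        return INIT_ERR_NOT_REG;
    }
    if (rm > max_reg) {
        return INIT_ERR_BAD_INDEX;
    }
    *op = rm; /* only written on success */
    return INIT_OK;
}
```

A caller that does not care about the reason can still write if (init_reg_operand(...)) to detect any failure, so the int convention costs little if the extra codes turn out to be unnecessary.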

-Jan





Re: [Qemu-devel] [PATCH v3] ppc: Add support for 'mffsl' instruction

2019-08-14 Thread David Gibson
On Wed, Aug 14, 2019 at 11:34:13AM -0500, Paul Clarke wrote:
> Should these 'checkpatch' ERRORs be addressed, even if it will diverge the 
> code style from the existing, surrounding code?
> 
> On 8/14/19 11:30 AM, no-re...@patchew.org wrote:
> > This series seems to have some coding style problems. See output below for
> > more information:
> 
> > === OUTPUT BEGIN ===
> > ERROR: code indent should never use tabs
> > #54: FILE: disas/ppc.c:5004:
> > +{ "mffsl",   XRA(63,583,12), XRARB_MASK,^IPOWER9,^I{ FRT } },$
> > 
> > ERROR: space required after that ',' (ctx:VxV)
> > #54: FILE: disas/ppc.c:5004:
> > +{ "mffsl",   XRA(63,583,12), XRARB_MASK,   POWER9, { FRT } },
> > ^
> > 
> > ERROR: space required after that ',' (ctx:VxV)
> > #54: FILE: disas/ppc.c:5004:
> > +{ "mffsl",   XRA(63,583,12), XRARB_MASK,   POWER9, { FRT } },

The ones above, no.

> > ERROR: braces {} are necessary for all arms of this statement
> > #148: FILE: target/ppc/translate/fp-impl.inc.c:625:
> > +if (unlikely(!(ctx->insns_flags2 & PPC2_ISA300)))

But this one, yes.

> > [...]
> > 
> > total: 4 errors, 0 warnings, 115 lines checked
> > 
> > Commit c51c0f894525 (ppc: Add support for 'mffsl' instruction) has style 
> > problems, please review.  If any of these errors
> > are false positives report them to the maintainer, see
> > CHECKPATCH in MAINTAINERS.
> > === OUTPUT END ===
> 
> PC
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [Qemu-devel] [PATCH v9 05/11] numa: Extend CLI to provide initiator information for numa nodes

2019-08-14 Thread Dan Williams
On Tue, Aug 13, 2019 at 10:14 PM Tao Xu  wrote:
>
> On 8/14/2019 10:39 AM, Dan Williams wrote:
> > On Tue, Aug 13, 2019 at 8:00 AM Igor Mammedov  wrote:
> >>
> >> On Fri,  9 Aug 2019 14:57:25 +0800
> >> Tao  wrote:
> >>
> >>> From: Tao Xu 
> >>>
> >>> In ACPI 6.3 chapter 5.2.27 Heterogeneous Memory Attribute Table (HMAT),
> >>> The initiator represents processor which access to memory. And in 5.2.27.3
> >>> Memory Proximity Domain Attributes Structure, the attached initiator is
> >>> defined as where the memory controller responsible for a memory proximity
> >>> domain. With attached initiator information, the topology of heterogeneous
> >>> memory can be described.
> >>>
> >>> Extend CLI of "-numa node" option to indicate the initiator numa node-id.
> >>> In the linux kernel, the codes in drivers/acpi/hmat/hmat.c parse and 
> >>> report
> >>> the platform's HMAT tables.
> >>>
> >>> Reviewed-by: Jingqi Liu 
> >>> Suggested-by: Dan Williams 
> >>> Signed-off-by: Tao Xu 
> >>> ---
> >>>
> >>> No changes in v9
> >>> ---
> >>>   hw/core/machine.c | 24 
> >>>   hw/core/numa.c| 13 +
> >>>   include/sysemu/numa.h |  3 +++
> >>>   qapi/machine.json |  6 +-
> >>>   qemu-options.hx   | 27 +++
> >>>   5 files changed, 68 insertions(+), 5 deletions(-)
> >>>
> >>> diff --git a/hw/core/machine.c b/hw/core/machine.c
> >>> index 3c55470103..113184a9df 100644
> >>> --- a/hw/core/machine.c
> >>> +++ b/hw/core/machine.c
> >>> @@ -640,6 +640,7 @@ void machine_set_cpu_numa_node(MachineState *machine,
> >>>  const CpuInstanceProperties *props, 
> >>> Error **errp)
> >>>   {
> >>>   MachineClass *mc = MACHINE_GET_CLASS(machine);
> >>> +NodeInfo *numa_info = machine->numa_state->nodes;
> >>>   bool match = false;
> >>>   int i;
> >>>
> >>> @@ -709,6 +710,16 @@ void machine_set_cpu_numa_node(MachineState *machine,
> >>>   match = true;
> >>>   slot->props.node_id = props->node_id;
> >>>   slot->props.has_node_id = props->has_node_id;
> >>> +
> >>> +if (numa_info[props->node_id].initiator_valid &&
> >>> +(props->node_id != numa_info[props->node_id].initiator)) {
> >>> +error_setg(errp, "The initiator of CPU NUMA node %" PRId64
> >>> +   " should be itself.", props->node_id);
> >>> +return;
> >>> +}
> >>> +numa_info[props->node_id].initiator_valid = true;
> >>> +numa_info[props->node_id].has_cpu = true;
> >>> +numa_info[props->node_id].initiator = props->node_id;
> >>>   }
> >>>
> >>>   if (!match) {
> >>> @@ -1050,6 +1061,7 @@ static void 
> >>> machine_numa_finish_cpu_init(MachineState *machine)
> >>>   GString *s = g_string_new(NULL);
> >>>   MachineClass *mc = MACHINE_GET_CLASS(machine);
> >>>   const CPUArchIdList *possible_cpus = 
> >>> mc->possible_cpu_arch_ids(machine);
> >>> +NodeInfo *numa_info = machine->numa_state->nodes;
> >>>
> >>>   assert(machine->numa_state->num_nodes);
> >>>   for (i = 0; i < possible_cpus->len; i++) {
> >>> @@ -1083,6 +1095,18 @@ static void 
> >>> machine_numa_finish_cpu_init(MachineState *machine)
> >>>   machine_set_cpu_numa_node(machine, &props, &error_fatal);
> >>>   }
> >>>   }
> >>> +
> >>> +for (i = 0; i < machine->numa_state->num_nodes; i++) {
> >>> +if (numa_info[i].initiator_valid &&
> >>> +!numa_info[numa_info[i].initiator].has_cpu) {
> >>^^ possible out of bounds 
> >> read, see below
> >>
> >>> +error_report("The initiator-id %"PRIu16 " of NUMA node %d"
> >>> + " does not exist.", numa_info[i].initiator, i);
> >>> +error_printf("\n");
> >>> +
> >>> +exit(1);
> >>> +}
> >> It takes care only of nodes that have CPUs, or memory-only ones that have
> >> an initiator explicitly provided on the CLI. It leaves open the possibility
> >> of memory-only nodes without an initiator mixed with nodes that have one.
> >> Is such a mixed configuration valid?
> >> Should we forbid it?
> >
> > The spec talks about the "Proximity Domain for the Attached Initiator"
> > field only being valid if the memory controller for the memory can be
> > identified by an initiator id in the SRAT. So I expect the only way to
> > define a memory proximity domain without this local initiator is to
> > allow specifying a node-id that does not have an entry in the SRAT.
> >
> Hi Dan,
>
> So there may be a situation where the Attached Initiator field is not
> valid? If so, I would allow the user to mark the initiator as invalid.

Yes it's something the OS needs to consider because the platform may
not be able to meet the constraint that a single initiator is
associated with the memory controller for a given memory target. In
retrospect it would have been nice if the spec reserved 0x f

Re: [Qemu-devel] [PATCH 00/13] RFC: luks/encrypted qcow2 key management

2019-08-14 Thread Eric Blake
On 8/14/19 3:22 PM, Maxim Levitsky wrote:

> This is an issue that was raised today on IRC with Kevin Wolf. Really thanks
> for the idea!
> 
> We agreed that this new qmp interface should take the same options as
> blockdev-create does, however since we want to be able to edit the encryption
> slots separately, this implies that we sort of need to allow this on creation
> time as well.
> 
> Also the BlockdevCreateOptions is a union, which is specialized by the driver
> name which is great for creation, but for update, the driver name is already
> known, and thus the user should not be forced to pass it again.
> However qmp doesn't seem to support union type guessing based on actual fields
> given (this might not be desired either), which complicates this somewhat.

Does the idea of a union type with a default value for the discriminator
help?  Maybe we have a discriminator which defaults to 'auto', and add a
union branch 'auto':'any'.  During creation, if the "driver":"auto"
branch is selected (usually implicitly by omitting "driver", but also
possible explicitly), the creation attempt is rejected as invalid
regardless of the contents of the remaining 'any'.  But during amend
usage, if the 'auto' branch is selected, we then add in the proper
"driver":"xyz" and reparse the QAPI object to determine if the remaining
fields in 'any' still meet the specification for the required driver branch.

This idea may still require some tweaks to the QAPI generator, but it's
the best I can come up with for a way to parse an arbitrary JSON object
with unknown validation, then reparse it again after adding more
information that would constrain the parse differently.
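
Stripped of the QAPI machinery, the decision I am describing looks roughly like the sketch below. resolve_driver() is a hypothetical helper, not an existing QEMU or QAPI function, and rejecting an explicit driver that disagrees with the existing image is my assumption rather than something we agreed on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Resolve the discriminator for a create or amend request.
 *   requested: "driver" from the JSON object, or NULL if omitted
 *   existing:  driver of the image being amended, or NULL when creating
 * Returns the driver branch to validate against, or NULL to reject.
 */
static const char *resolve_driver(const char *requested, const char *existing)
{
    bool is_auto = requested == NULL || strcmp(requested, "auto") == 0;

    if (!is_auto) {
        /* Explicit driver: on amend it must match the image (assumption). */
        if (existing != NULL && strcmp(requested, existing) != 0) {
            return NULL;
        }
        return requested;
    }
    /* 'auto': fill in the known driver on amend, reject on creation. */
    return existing;
}
```

After this step the object would be reparsed with the resolved "driver" filled in, so the remaining fields get validated against the right union branch.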

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org





Re: [Qemu-devel] [PATCH v9 00/11] Build ACPI Heterogeneous Memory Attribute Table (HMAT)

2019-08-14 Thread Eduardo Habkost
On Tue, Aug 13, 2019 at 04:53:33PM +0800, Tao Xu wrote:
> Hi Igor and Eduardo,
> 
> I am wondering if there are more comments about patches 1/11~4/11? These
> 4 patches are independent, and the series is big and has been under review
> for a long time. Could patches 1/11~4/11 be queued first?

Now that I got a few Acked-bys for patch 1/4, I plan to queue
patches 1-4 in machine-next soon.

-- 
Eduardo



[Qemu-devel] [PATCH 3/3] Document the qmp commands for continuous replication

2019-08-14 Thread Lukas Straub
Signed-off-by: Lukas Straub 
---
 docs/COLO-FT.txt | 185 +++
 1 file changed, 138 insertions(+), 47 deletions(-)

diff --git a/docs/COLO-FT.txt b/docs/COLO-FT.txt
index ad24680d13..c08bfbd3a8 100644
--- a/docs/COLO-FT.txt
+++ b/docs/COLO-FT.txt
@@ -145,35 +145,64 @@ The diagram just shows the main qmp command, you can get 
the detail
 in test procedure.
 
 == Test procedure ==
+Note: Here we are running both instances on the same Machine for testing, 
+change the IP Addresses if you want to run it on two Hosts
+
 1. Startup qemu
 Primary:
-# qemu-system-x86_64 -accel kvm -m 2048 -smp 2 -qmp stdio -name primary \
-  -device piix3-usb-uhci -vnc :7 \
-  -device usb-tablet -netdev tap,id=hn0,vhost=off \
-  -device virtio-net-pci,id=net-pci0,netdev=hn0 \
-  -drive 
if=virtio,id=primary-disk0,driver=quorum,read-pattern=fifo,vote-threshold=1,\
- children.0.file.filename=1.raw,\
- children.0.driver=raw -S
+# imagefolder="/mnt/vms/colo-test"
+
+# cp --reflink=auto $imagefolder/primary.qcow2 $imagefolder/primary-copy.qcow2
+
+# qemu-system-x86_64 -enable-kvm -cpu qemu64,+kvmclock -m 512 -smp 1 -qmp stdio \
+   -vnc :0 -k de -device piix3-usb-uhci -device usb-tablet -name primary \
+   -netdev tap,id=hn0,vhost=off,helper=/usr/lib/qemu/qemu-bridge-helper \
+   -device rtl8139,id=e0,netdev=hn0 \
+   -chardev socket,id=mirror0,host=127.0.0.1,port=9003,server,nowait \
+   -chardev socket,id=compare1,host=127.0.0.1,port=9004,server,wait \
+   -chardev socket,id=compare0,host=127.0.0.1,port=9001,server,nowait \
+   -chardev socket,id=compare0-0,host=127.0.0.1,port=9001 \
+   -chardev socket,id=compare_out,host=127.0.0.1,port=9005,server,nowait \
+   -chardev socket,id=compare_out0,host=127.0.0.1,port=9005 \
+   -object filter-mirror,id=m0,netdev=hn0,queue=tx,outdev=mirror0 \
+   -object filter-redirector,netdev=hn0,id=redire0,queue=rx,indev=compare_out \
+   -object filter-redirector,netdev=hn0,id=redire1,queue=rx,outdev=compare0 \
+   -object iothread,id=iothread1 \
+   -object colo-compare,id=comp0,primary_in=compare0-0,secondary_in=compare1,\
+outdev=compare_out0,iothread=iothread1 \
+   -drive if=ide,id=colo-disk0,driver=quorum,read-pattern=fifo,vote-threshold=1,\
+children.0.file.filename=$imagefolder/primary.qcow2,children.0.driver=qcow2 -S
+
 Secondary:
-# qemu-system-x86_64 -accel kvm -m 2048 -smp 2 -qmp stdio -name secondary \
-  -device piix3-usb-uhci -vnc :7 \
-  -device usb-tablet -netdev tap,id=hn0,vhost=off \
-  -device virtio-net-pci,id=net-pci0,netdev=hn0 \
-  -drive if=none,id=secondary-disk0,file.filename=1.raw,driver=raw,node-name=node0 \
-  -drive if=virtio,id=active-disk0,driver=replication,mode=secondary,\
- file.driver=qcow2,top-id=active-disk0,\
- file.file.filename=/mnt/ramfs/active_disk.img,\
- file.backing.driver=qcow2,\
- file.backing.file.filename=/mnt/ramfs/hidden_disk.img,\
- file.backing.backing=secondary-disk0 \
-  -incoming tcp:0:
+# imagefolder="/mnt/vms/colo-test"
+
+# qemu-img create -f qcow2 $imagefolder/secondary-active.qcow2 10G
+
+# qemu-img create -f qcow2 $imagefolder/secondary-hidden.qcow2 10G
+
+# qemu-system-x86_64 -enable-kvm -cpu qemu64,+kvmclock -m 512 -smp 1 -qmp stdio \
+   -vnc :1 -k de -device piix3-usb-uhci -device usb-tablet -name secondary \
+   -netdev tap,id=hn0,vhost=off,helper=/usr/lib/qemu/qemu-bridge-helper \
+   -device rtl8139,id=e0,netdev=hn0 \
+   -chardev socket,id=red0,host=127.0.0.1,port=9003,reconnect=1 \
+   -chardev socket,id=red1,host=127.0.0.1,port=9004,reconnect=1 \
+   -object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0 \
+   -object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1 \
+   -object filter-rewriter,id=rew0,netdev=hn0,queue=all \
+   -drive if=none,id=parent0,file.filename=$imagefolder/primary-copy.qcow2,driver=qcow2 \
+   -drive if=none,id=childs0,driver=replication,mode=secondary,file.driver=qcow2,\
+top-id=childs0,file.file.filename=$imagefolder/secondary-active.qcow2,\
+file.backing.driver=qcow2,file.backing.file.filename=$imagefolder/secondary-hidden.qcow2,\
+file.backing.backing=parent0 \
+   -drive if=ide,id=colo-disk0,driver=quorum,read-pattern=fifo,vote-threshold=1,\
+children.0.file=childs0,children.0.driver=raw \
+   -incoming tcp:0:9998
+
 
 2. On Secondary VM's QEMU monitor, issue command
 {'execute':'qmp_capabilities'}
-{ 'execute': 'nbd-server-start',
-  'arguments': {'addr': {'type': 'inet', 'data': {'host': 'xx.xx.xx.xx', 'port': '8889'} } }
-}
-{'execute': 'nbd-server-add', 'arguments': {'device': 'secondary-disk0', 'writable': true } }
+{'execute': 'nbd-server-start', 'arguments': {'addr': {'type': 'inet', 'data': {'host': '127.0.0.1', 'port': ''} } } }
+{'execute': 'nbd-server-add', 'arguments': {'device': 'parent0', 'writable': true } }
 
 Note:
   a. The qmp command nbd-server-start and nbd-server-add must be run
@@ -184,12 +213,10 @@ Note:
 
 3. On Primary VM's QEMU monitor, issu

[Qemu-devel] [PATCH 0/3] colo: Add support for continuous replication

2019-08-14 Thread Lukas Straub
Hello Everyone,
These patches add support for continuous replication to COLO.
Please review.

Regards,
Lukas Straub

Lukas Straub (3):
  Replication: Ignore requests after failover
  net/filter.c: Add Options to insert filters anywhere in the filter list
  Document the qmp commands for continuous replication

 block/replication.c  |  31 +++-
 docs/COLO-FT.txt | 185 ---
 include/net/filter.h |   2 +
 net/filter.c |  73 -
 qemu-options.hx  |  10 +--
 5 files changed, 243 insertions(+), 58 deletions(-)

--
2.20.1



[Qemu-devel] [PATCH 2/3] net/filter.c: Add Options to insert filters anywhere in the filter list

2019-08-14 Thread Lukas Straub
To switch the Secondary to Primary, we need to insert new filters before
the filter-rewriter.

Add the necessary options to insert filters anywhere in the filter list.
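
The placement rule described above can be sketched outside QEMU as follows. This is only an illustration under stated assumptions: Filter, FilterList and filter_insert() are hypothetical names invented for the sketch, and a hand-rolled doubly linked list stands in for the QTAILQ that the patch actually uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a filter object; not QEMU's NetFilterState. */
typedef struct Filter {
    const char *id;
    struct Filter *prev, *next;
} Filter;

/* Hypothetical per-netdev filter list. */
typedef struct FilterList {
    Filter *head, *tail;
} FilterList;

/* Look up a filter by id; returns NULL if it is not in the list. */
static Filter *filter_find(FilterList *l, const char *id)
{
    for (Filter *f = l->head; f; f = f->next) {
        if (strcmp(f->id, id) == 0) {
            return f;
        }
    }
    return NULL;
}

/* Link nf between prev and next; either may be NULL at the list ends. */
static void filter_link(FilterList *l, Filter *nf, Filter *prev, Filter *next)
{
    nf->prev = prev;
    nf->next = next;
    if (prev) {
        prev->next = nf;
    } else {
        l->head = nf;
    }
    if (next) {
        next->prev = nf;
    } else {
        l->tail = nf;
    }
}

/*
 * position is "head", "tail", or the id of an existing filter.
 * insert_before only matters when position names a filter.
 * Returns false when the named filter does not exist.
 */
static bool filter_insert(FilterList *l, Filter *nf, const char *position,
                          bool insert_before)
{
    if (strcmp(position, "head") == 0) {
        filter_link(l, nf, NULL, l->head);
    } else if (strcmp(position, "tail") == 0) {
        filter_link(l, nf, l->tail, NULL);
    } else {
        Filter *ref = filter_find(l, position);
        if (!ref) {
            return false;
        }
        if (insert_before) {
            filter_link(l, nf, ref->prev, ref);
        } else {
            filter_link(l, nf, ref, ref->next);
        }
    }
    return true;
}
```

With a rewriter already in the list, calling filter_insert() with its id as the position and insert_before set to true places the new filter directly in front of it, which is the case the commit message is concerned with.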

Signed-off-by: Lukas Straub 
---
 include/net/filter.h |  2 ++
 net/filter.c | 73 ++--
 qemu-options.hx  | 10 +++---
 3 files changed, 78 insertions(+), 7 deletions(-)

diff --git a/include/net/filter.h b/include/net/filter.h
index 49da666ac0..355c178f75 100644
--- a/include/net/filter.h
+++ b/include/net/filter.h
@@ -62,6 +62,8 @@ struct NetFilterState {
 NetClientState *netdev;
 NetFilterDirection direction;
 bool on;
+char *position;
+bool insert_before;
 QTAILQ_ENTRY(NetFilterState) next;
 };

diff --git a/net/filter.c b/net/filter.c
index 28d1930db7..1058100b83 100644
--- a/net/filter.c
+++ b/net/filter.c
@@ -171,11 +171,47 @@ static void netfilter_set_status(Object *obj, const char *str, Error **errp)
 }
 }

+static char *netfilter_get_position(Object *obj, Error **errp)
+{
+NetFilterState *nf = NETFILTER(obj);
+
+return g_strdup(nf->position);
+}
+
+static void netfilter_set_position(Object *obj, const char *str, Error **errp)
+{
+NetFilterState *nf = NETFILTER(obj);
+
+nf->position = g_strdup(str);
+}
+
+static char *netfilter_get_insert(Object *obj, Error **errp)
+{
+NetFilterState *nf = NETFILTER(obj);
+
+return nf->insert_before ? g_strdup("before") : g_strdup("after");
+}
+
+static void netfilter_set_insert(Object *obj, const char *str, Error **errp)
+{
+NetFilterState *nf = NETFILTER(obj);
+
+if (strcmp(str, "before") && strcmp(str, "after")) {
+error_setg(errp, "Invalid value for netfilter insert, "
+ "should be 'before' or 'after'");
+return;
+}
+
+nf->insert_before = !strcmp(str, "before");
+}
+
 static void netfilter_init(Object *obj)
 {
 NetFilterState *nf = NETFILTER(obj);

 nf->on = true;
+nf->insert_before = false;
+nf->position = g_strdup("tail");

 object_property_add_str(obj, "netdev",
 netfilter_get_netdev_id, netfilter_set_netdev_id,
@@ -187,16 +223,23 @@ static void netfilter_init(Object *obj)
 object_property_add_str(obj, "status",
 netfilter_get_status, netfilter_set_status,
 NULL);
+object_property_add_str(obj, "position",
+netfilter_get_position, netfilter_set_position,
+NULL);
+object_property_add_str(obj, "insert",
+netfilter_get_insert, netfilter_set_insert,
+NULL);
 }

 static void netfilter_complete(UserCreatable *uc, Error **errp)
 {
 NetFilterState *nf = NETFILTER(uc);
+NetFilterState *position;
 NetClientState *ncs[MAX_QUEUE_NUM];
 NetFilterClass *nfc = NETFILTER_GET_CLASS(uc);
 int queues;
 Error *local_err = NULL;
-
+
 if (!nf->netdev_id) {
 error_setg(errp, "Parameter 'netdev' is required");
 return;
@@ -219,6 +262,20 @@ static void netfilter_complete(UserCreatable *uc, Error **errp)
 return;
 }

+if (strcmp(nf->position, "head") && strcmp(nf->position, "tail")) {
+/* Search for the position to insert before/after */
+Object *container;
+Object *obj;
+
+container = object_get_objects_root();
+obj = object_resolve_path_component(container, nf->position);
+if (!obj) {
+error_setg(errp, "filter '%s' not found", nf->position);
+return;
+}
+position = NETFILTER(obj);
+}
+
 nf->netdev = ncs[0];

 if (nfc->setup) {
@@ -228,7 +285,18 @@ static void netfilter_complete(UserCreatable *uc, Error **errp)
 return;
 }
 }
-QTAILQ_INSERT_TAIL(&nf->netdev->filters, nf, next);
+
+if (!strcmp(nf->position, "head")) {
+QTAILQ_INSERT_HEAD(&nf->netdev->filters, nf, next);
+} else if (!strcmp(nf->position, "tail")) {
+QTAILQ_INSERT_TAIL(&nf->netdev->filters, nf, next);
+} else {
+if (nf->insert_before) {
+QTAILQ_INSERT_BEFORE(position, nf, next);
+} else {
+QTAILQ_INSERT_AFTER(&nf->netdev->filters, position, nf, next);
+}
+}
 }

 static void netfilter_finalize(Object *obj)
@@ -245,6 +313,7 @@ static void netfilter_finalize(Object *obj)
 QTAILQ_REMOVE(&nf->netdev->filters, nf, next);
 }
 g_free(nf->netdev_id);
+g_free(nf->position);
 }

 static void default_handle_event(NetFilterState *nf, int event, Error **errp)
diff --git a/qemu-options.hx b/qemu-options.hx
index 08749a3391..f0a47a0746 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -4368,7 +4368,7 @@ applications, they can do this through this parameter. Its format is
 a gnutls priority string as described at
 @url{https://gnutls.org/manual/html_node/Priority-Strings.html}.

-@item -object 
filter-bu

[Qemu-devel] [PATCH 1/3] Replication: Ignore requests after failover

2019-08-14 Thread Lukas Straub
After failover, the Secondary side of replication shouldn't change state.
Add the necessary checks to ignore requests after failover.
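
A minimal sketch of the guard pattern described above, with every name hypothetical: Stage, ReqResult and the *_sketch functions are invented for illustration and only loosely mirror the BLOCK_REPLICATION_* stages in the diff. The transition chosen for a stop without failover is an assumption of the sketch, not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stages, loosely mirroring the BLOCK_REPLICATION_* names. */
typedef enum {
    STAGE_NONE,
    STAGE_RUNNING,
    STAGE_FAILOVER,
    STAGE_DONE,
} Stage;

/* Outcome of a request: handled, silently ignored, or rejected. */
typedef enum { REQ_OK, REQ_IGNORED, REQ_ERROR } ReqResult;

/* True once the secondary has been (or is being) promoted. */
static bool after_failover(Stage s)
{
    return s == STAGE_FAILOVER || s == STAGE_DONE;
}

static ReqResult replication_start_sketch(Stage *stage)
{
    if (after_failover(*stage)) {
        return REQ_IGNORED;     /* no error and no state change */
    }
    if (*stage != STAGE_NONE) {
        return REQ_ERROR;       /* already running */
    }
    *stage = STAGE_RUNNING;
    return REQ_OK;
}

static ReqResult replication_stop_sketch(Stage *stage, bool failover)
{
    if (after_failover(*stage)) {
        return REQ_IGNORED;     /* no error and no state change */
    }
    if (*stage != STAGE_RUNNING) {
        return REQ_ERROR;       /* not running */
    }
    /* Assumption of this sketch: a plain stop goes straight to DONE. */
    *stage = failover ? STAGE_FAILOVER : STAGE_DONE;
    return REQ_OK;
}
```

The point of the early return is that it comes before the ordinary state checks, so a late request after promotion neither raises an error nor moves the state machine.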

Signed-off-by: Lukas Straub 
---
 block/replication.c | 31 +++
 1 file changed, 27 insertions(+), 4 deletions(-)

diff --git a/block/replication.c b/block/replication.c
index 3d4dedddfc..466d463963 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -454,6 +454,14 @@ static void replication_start(ReplicationState *rs, ReplicationMode mode,
 aio_context_acquire(aio_context);
 s = bs->opaque;

+if (s->stage == BLOCK_REPLICATION_DONE || s->stage == BLOCK_REPLICATION_FAILOVER) {
+/* This case happens when a secondary is promoted to primary.
+ * Ignore the request because the secondary side of replication
+ * doesn't have to do anything anymore. */
+aio_context_release(aio_context);
+return;
+}
+
 if (s->stage != BLOCK_REPLICATION_NONE) {
 error_setg(errp, "Block replication is running or done");
 aio_context_release(aio_context);
@@ -529,8 +537,7 @@ static void replication_start(ReplicationState *rs, ReplicationMode mode,
"Block device is in use by internal backup job");

 top_bs = bdrv_lookup_bs(s->top_id, s->top_id, NULL);
-if (!top_bs || !bdrv_is_root_node(top_bs) ||
-!check_top_bs(top_bs, bs)) {
+if (!top_bs || !check_top_bs(top_bs, bs)) {
 error_setg(errp, "No top_bs or it is invalid");
 reopen_backing_file(bs, false, NULL);
 aio_context_release(aio_context);
@@ -577,6 +584,14 @@ static void replication_do_checkpoint(ReplicationState *rs, Error **errp)
 aio_context_acquire(aio_context);
 s = bs->opaque;

+if (s->stage == BLOCK_REPLICATION_DONE || s->stage == BLOCK_REPLICATION_FAILOVER) {
+/* This case happens when a secondary was promoted to primary.
+ * Ignore the request because the secondary side of replication
+ * doesn't have to do anything anymore. */
+aio_context_release(aio_context);
+return;
+}
+
 if (s->mode == REPLICATION_MODE_SECONDARY) {
 secondary_do_checkpoint(s, errp);
 }
@@ -592,8 +607,8 @@ static void replication_get_error(ReplicationState *rs, Error **errp)
 aio_context = bdrv_get_aio_context(bs);
 aio_context_acquire(aio_context);
 s = bs->opaque;
-
-if (s->stage != BLOCK_REPLICATION_RUNNING) {
+
+if (s->stage == BLOCK_REPLICATION_NONE) {
 error_setg(errp, "Block replication is not running");
 aio_context_release(aio_context);
 return;
@@ -635,6 +650,14 @@ static void replication_stop(ReplicationState *rs, bool failover, Error **errp)
 aio_context_acquire(aio_context);
 s = bs->opaque;

+if (s->stage == BLOCK_REPLICATION_DONE || s->stage == BLOCK_REPLICATION_FAILOVER) {
+/* This case happens when a secondary was promoted to primary.
+ * Ignore the request because the secondary side of replication
+ * doesn't have to do anything anymore. */
+aio_context_release(aio_context);
+return;
+}
+
 if (s->stage != BLOCK_REPLICATION_RUNNING) {
 error_setg(errp, "Block replication is not running");
 aio_context_release(aio_context);
--
2.20.1



Re: [Qemu-devel] [Qemu-block] [PATCH 2/2] qapi: deprecate implicit filters

2019-08-14 Thread Maxim Levitsky
On Wed, 2019-08-14 at 15:27 -0400, John Snow wrote:
> 
> On 8/14/19 6:07 AM, Vladimir Sementsov-Ogievskiy wrote:
> > To get rid of implicit filters related workarounds in future let's
> > deprecate them now.
> > 
> > Signed-off-by: Vladimir Sementsov-Ogievskiy 
> > ---
> >  qemu-deprecated.texi  |  7 +++
> >  qapi/block-core.json  |  6 --
> >  include/block/block_int.h | 10 +-
> >  blockdev.c| 10 ++
> >  4 files changed, 30 insertions(+), 3 deletions(-)
> > 
> > diff --git a/qemu-deprecated.texi b/qemu-deprecated.texi
> > index 2753fafd0b..8222440148 100644
> > --- a/qemu-deprecated.texi
> > +++ b/qemu-deprecated.texi
> > @@ -183,6 +183,13 @@ the 'wait' field, which is only applicable to sockets in server mode
> >  
> >  Use blockdev-mirror and blockdev-backup instead.
> >  
> > +@subsection implicit filters (since 4.2)
> > +
> > +Mirror and commit jobs insert filters, which become implicit if the user
> > +omits the filter-node-name parameter. Omitting it is therefore deprecated;
> > +always set it. Note that drive-mirror doesn't have this parameter, so it will
> > +create an implicit filter anyway, but drive-mirror itself is deprecated too.
> > +
> >  @section Human Monitor Protocol (HMP) commands
> >  
> >  @subsection The hub_id parameter of 'hostfwd_add' / 'hostfwd_remove' (since 3.1)
> > diff --git a/qapi/block-core.json b/qapi/block-core.json
> > index 4e35526634..0505ac9d8b 100644
> > --- a/qapi/block-core.json
> > +++ b/qapi/block-core.json
> > @@ -1596,7 +1596,8 @@
> >  # @filter-node-name: the node name that should be assigned to the
> >  #filter driver that the commit job inserts into the graph
> >  #above @top. If this option is not given, a node name is
> > -#autogenerated. (Since: 2.9)
> > +#autogenerated. Omitting this option is deprecated, it will
> > +#be required in future. (Since: 2.9)
> >  #
> >  # @auto-finalize: When false, this job will wait in a PENDING state after it has
> >  # finished its work, waiting for @block-job-finalize before
> > @@ -2249,7 +2250,8 @@
> >  # @filter-node-name: the node name that should be assigned to the
> >  #filter driver that the mirror job inserts into the graph
> >  #above @device. If this option is not given, a node name is
> > -#autogenerated. (Since: 2.9)
> > +#autogenerated. Omitting this option is deprecated, it will
> > +#be required in future. (Since: 2.9)
> >  #
> >  # @copy-mode: when to copy data to the destination; defaults to 'background'
> >  # (Since: 3.0)
> > diff --git a/include/block/block_int.h b/include/block/block_int.h
> > index 3aa1e832a8..624da0b4a2 100644
> > --- a/include/block/block_int.h
> > +++ b/include/block/block_int.h
> > @@ -762,7 +762,15 @@ struct BlockDriverState {
> >  bool sg;/* if true, the device is a /dev/sg* */
> >  bool probed;/* if true, format was probed rather than specified */
> >  bool force_share; /* if true, always allow all shared permissions */
> > -bool implicit;  /* if true, this filter node was automatically inserted */
> > +
> > +/*
> > + * @implicit field is deprecated, don't set it to true for new filters.
> > + * If true, this filter node was automatically inserted and the user doesn't
> > + * know about it and is unprepared for any effects of it. So, implicit
> > + * filters are worked around and skipped in many places of the block
> > + * layer code.
> > + */
> > +bool implicit;
> >  
> >  BlockDriver *drv; /* NULL means no media */
> >  void *opaque;
> > diff --git a/blockdev.c b/blockdev.c
> > index 36e9368e01..b3cfaccce1 100644
> > --- a/blockdev.c
> > +++ b/blockdev.c
> > @@ -3292,6 +3292,11 @@ void qmp_block_commit(bool has_job_id, const char *job_id, const char *device,
> >  BlockdevOnError on_error = BLOCKDEV_ON_ERROR_REPORT;
> >  int job_flags = JOB_DEFAULT;
> >  
> > +if (!has_filter_node_name) {
> > +warn_report("Omitting filter-node-name parameter is deprecated, it "
> > +"will be required in future");
> > +}
> > +
> >  if (!has_speed) {
> >  speed = 0;
> >  }
> > @@ -3990,6 +3995,11 @@ void qmp_blockdev_mirror(bool has_job_id, const char *job_id,
> >  Error *local_err = NULL;
> >  int ret;
> >  
> > +if (!has_filter_node_name) {
> > +warn_report("Omitting filter-node-name parameter is deprecated, it "
> > +"will be required in future");
> > +}
> > +
> >  bs = qmp_get_root_bs(device, errp);
> >  if (!bs) {
> >  return;
> > 
> 
> This might be OK to do right away, though.
> 
> I asked Markus this not too long ago; do we want to amend the QAPI
> schema specification to allow com

[Qemu-devel] [RFC v2] hw/sd/aspeed_sdhci: New device

2019-08-14 Thread Eddie James
The Aspeed SOCs have two SD/MMC controllers. Add a device that
encapsulates both of these controllers and models the Aspeed-specific
registers and behavior.

Tested by reading from mmcblk0 in Linux:
qemu-system-arm -machine romulus-bmc -nographic -serial mon:stdio \
 -drive file=_tmp/flash-romulus,format=raw,if=mtd \
 -device sd-card,drive=sd0 -drive file=_tmp/kernel,format=raw,if=sd

Signed-off-by: Eddie James 
---
This patch applies on top of Cedric's set of recent Aspeed changes. Therefore,
I'm sending as an RFC rather than a patch.

Changes since v1:
 - Move slot realize code into the Aspeed SDHCI realize function
 - Fix interrupt handling by creating input irqs and connecting them to the
   slot irqs.
 - Removed card device creation code

 hw/arm/aspeed.c  |   1 -
 hw/arm/aspeed_soc.c  |  24 ++
 hw/sd/Makefile.objs  |   1 +
 hw/sd/aspeed_sdhci.c | 190 +++
 include/hw/arm/aspeed_soc.h  |   3 +
 include/hw/sd/aspeed_sdhci.h |  35 
 6 files changed, 253 insertions(+), 1 deletion(-)
 create mode 100644 hw/sd/aspeed_sdhci.c
 create mode 100644 include/hw/sd/aspeed_sdhci.h

diff --git a/hw/arm/aspeed.c b/hw/arm/aspeed.c
index 2574425..aeed5b6 100644
--- a/hw/arm/aspeed.c
+++ b/hw/arm/aspeed.c
@@ -480,7 +480,6 @@ static void aspeed_machine_class_init(ObjectClass *oc, void *data)
 mc->desc = board->desc;
 mc->init = aspeed_machine_init;
 mc->max_cpus = ASPEED_CPUS_NUM;
-mc->no_sdcard = 1;
 mc->no_floppy = 1;
 mc->no_cdrom = 1;
 mc->no_parallel = 1;
diff --git a/hw/arm/aspeed_soc.c b/hw/arm/aspeed_soc.c
index 8df96f2..a12f14a 100644
--- a/hw/arm/aspeed_soc.c
+++ b/hw/arm/aspeed_soc.c
@@ -22,6 +22,7 @@
 #include "qemu/error-report.h"
 #include "hw/i2c/aspeed_i2c.h"
 #include "net/net.h"
+#include "sysemu/blockdev.h"
 
 #define ASPEED_SOC_IOMEM_SIZE   0x0020
 
@@ -62,6 +63,7 @@ static const hwaddr aspeed_soc_ast2500_memmap[] = {
 [ASPEED_XDMA]   = 0x1E6E7000,
 [ASPEED_ADC]= 0x1E6E9000,
 [ASPEED_SRAM]   = 0x1E72,
+[ASPEED_SDHCI]  = 0x1E74,
 [ASPEED_GPIO]   = 0x1E78,
 [ASPEED_RTC]= 0x1E781000,
 [ASPEED_TIMER1] = 0x1E782000,
@@ -100,6 +102,7 @@ static const hwaddr aspeed_soc_ast2600_memmap[] = {
 [ASPEED_XDMA]   = 0x1E6E7000,
 [ASPEED_ADC]= 0x1E6E9000,
 [ASPEED_VIDEO]  = 0x1E70,
+[ASPEED_SDHCI]  = 0x1E74,
 [ASPEED_GPIO]   = 0x1E78,
 [ASPEED_RTC]= 0x1E781000,
 [ASPEED_TIMER1] = 0x1E782000,
@@ -146,6 +149,7 @@ static const int aspeed_soc_ast2400_irqmap[] = {
 [ASPEED_ETH1]   = 2,
 [ASPEED_ETH2]   = 3,
 [ASPEED_XDMA]   = 6,
+[ASPEED_SDHCI]  = 26,
 };
 
 #define aspeed_soc_ast2500_irqmap aspeed_soc_ast2400_irqmap
@@ -163,6 +167,7 @@ static const int aspeed_soc_ast2600_irqmap[] = {
 [ASPEED_SDMC]   = 0,
 [ASPEED_SCU]= 12,
 [ASPEED_XDMA]   = 6,
+[ASPEED_SDHCI]  = 43,
 [ASPEED_ADC]= 46,
 [ASPEED_GPIO]   = 40,
 [ASPEED_RTC]= 13,
@@ -350,6 +355,15 @@ static void aspeed_soc_init(Object *obj)
 sysbus_init_child_obj(obj, "fsi[*]", OBJECT(&s->fsi[0]),
   sizeof(s->fsi[0]), TYPE_ASPEED_FSI);
 }
+
+sysbus_init_child_obj(obj, "sdhci", OBJECT(&s->sdhci), sizeof(s->sdhci),
+  TYPE_ASPEED_SDHCI);
+
+/* Init sd card slot class here so that they're under the correct parent */
+for (i = 0; i < ASPEED_SDHCI_NUM_SLOTS; ++i) {
+sysbus_init_child_obj(obj, "sdhci_slot[*]", OBJECT(&s->sdhci.slots[i]),
+  sizeof(s->sdhci.slots[i]), TYPE_SYSBUS_SDHCI);
+}
 }
 
 /*
@@ -680,6 +694,16 @@ static void aspeed_soc_realize(DeviceState *dev, Error **errp)
 sysbus_connect_irq(SYS_BUS_DEVICE(&s->fsi[0]), 0,
aspeed_soc_get_irq(s, ASPEED_FSI1));
 }
+
+/* SD/SDIO - set the reg address so slot memory mapping can be set up */
+s->sdhci.ioaddr = sc->info->memmap[ASPEED_SDHCI];
+object_property_set_bool(OBJECT(&s->sdhci), true, "realized", &err);
+if (err) {
+error_propagate(errp, err);
+return;
+}
+sysbus_connect_irq(SYS_BUS_DEVICE(&s->sdhci), 0,
+   aspeed_soc_get_irq(s, ASPEED_SDHCI));
 }
 static Property aspeed_soc_properties[] = {
 DEFINE_PROP_UINT32("num-cpus", AspeedSoCState, num_cpus, 0),
diff --git a/hw/sd/Makefile.objs b/hw/sd/Makefile.objs
index 0665727..a884c23 100644
--- a/hw/sd/Makefile.objs
+++ b/hw/sd/Makefile.objs
@@ -8,3 +8,4 @@ obj-$(CONFIG_MILKYMIST) += milkymist-memcard.o
 obj-$(CONFIG_OMAP) += omap_mmc.o
 obj-$(CONFIG_PXA2XX) += pxa2xx_mmci.o
 obj-$(CONFIG_RASPI) += bcm2835_sdhost.o
+obj-$(CONFIG_ASPEED_SOC) += aspeed_sdhci.o
diff --git a/hw/sd/aspeed_sdhci.c b/hw/sd/aspeed_sdhci.c
new file mode 100644
index 000..d1a05e9
--- /dev/null
+++ b/hw/sd/aspeed_sdhci.c
@@ -0,0 +1,190 @@
+/*
+ * Aspeed SD Host Controller
+ * Eddie James 
+ *
+ * Copyright (

[Qemu-devel] [PATCH 13/13] iotests : add tests for encryption key management

2019-08-14 Thread Maxim Levitsky
Signed-off-by: Maxim Levitsky 
---
 tests/qemu-iotests/257   | 197 ++
 tests/qemu-iotests/257.out   |  96 +++
 tests/qemu-iotests/258   |  95 +++
 tests/qemu-iotests/258.out   |  30 +
 tests/qemu-iotests/259   | 199 +++
 tests/qemu-iotests/259.out   |   5 +
 tests/qemu-iotests/common.filter |   5 +-
 tests/qemu-iotests/group |   3 +
 8 files changed, 628 insertions(+), 2 deletions(-)
 create mode 100755 tests/qemu-iotests/257
 create mode 100644 tests/qemu-iotests/257.out
 create mode 100755 tests/qemu-iotests/258
 create mode 100644 tests/qemu-iotests/258.out
 create mode 100644 tests/qemu-iotests/259
 create mode 100644 tests/qemu-iotests/259.out

diff --git a/tests/qemu-iotests/257 b/tests/qemu-iotests/257
new file mode 100755
index 00..5991e4a8c7
--- /dev/null
+++ b/tests/qemu-iotests/257
@@ -0,0 +1,197 @@
+#!/usr/bin/env bash
+#
+# Test encryption key management with luks
+# Based on 134
+#
+# Copyright (C) 2019 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+#
+
+# creator
+owner=mlevi...@redhat.com
+
+seq=`basename $0`
+echo "QA output created by $seq"
+
+status=1   # failure is the default!
+
+_cleanup()
+{
+   _cleanup_test_img
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+
+_supported_fmt qcow2 luks
+_supported_proto file #TODO
+
+QEMU_IO_OPTIONS=$QEMU_IO_OPTIONS_NO_FMT
+
+# you are supposed to see the password as ***, see :-)
+SECRET0="--object secret,id=sec0,data=hunter0"
+SECRET1="--object secret,id=sec1,data=hunter1"
+SECRET2="--object secret,id=sec2,data=hunter2"
+SECRET3="--object secret,id=sec3,data=hunter3"
+SECRETS="$SECRET0 $SECRET1 $SECRET2 $SECRET3"
+
+
+if [ "$IMGFMT" = "qcow2" ] ; then
+   OPTPREFIX="encrypt."
+   EXTRA_IMG_ARGS="-o encrypt.format=luks"
+fi
+
+IMGSPEC0="driver=$IMGFMT,file.filename=$TEST_IMG,${OPTPREFIX}key-secret=sec0"
+IMGSPEC1="driver=$IMGFMT,file.filename=$TEST_IMG,${OPTPREFIX}key-secret=sec1"
+IMGSPEC2="driver=$IMGFMT,file.filename=$TEST_IMG,${OPTPREFIX}key-secret=sec2"
+IMGSPEC3="driver=$IMGFMT,file.filename=$TEST_IMG,${OPTPREFIX}key-secret=sec3"
+
+echo "== creating a test image =="
+_make_test_img $SECRET0 $EXTRA_IMG_ARGS -o "${OPTPREFIX}key-secret=sec0,${OPTPREFIX}iter-time=10" 32M
+
+echo
+echo "== test that key 0 opens the image =="
+$QEMU_IO $SECRET0 -c "read 0 4096" --image-opts $IMGSPEC0 | _filter_qemu_io | _filter_testdir
+
+
+echo
+echo "== adding a password to slot 1 =="
+$QEMU_IMG add_encryption_key $SECRETS --image-opts $IMGSPEC0 --keydef key-secret=sec1,iter-time=10
+echo "== adding a password to slot 3 =="
+$QEMU_IMG add_encryption_key $SECRETS --image-opts $IMGSPEC1 --keydef key-secret=sec3,iter-time=100,slot=3
+echo "== adding a password to slot 2 =="
+$QEMU_IMG add_encryption_key $SECRETS --image-opts $IMGSPEC3 --keydef key-secret=sec2,iter-time=10
+
+echo
+echo "== all secrets should work =="
+for IMGSPEC in $IMGSPEC0 $IMGSPEC1 $IMGSPEC2 $IMGSPEC3; do
+   $QEMU_IO $SECRETS -c "read 0 4096" --image-opts $IMGSPEC | _filter_qemu_io | _filter_testdir
+done
+
+
+echo
+echo "== erase slot 0 and try it =="
+$QEMU_IMG erase_encryption_key $SECRETS --image-opts $IMGSPEC1 --keydef key-secret=sec0 | _filter_img_create
+$QEMU_IO $SECRETS -c "read 0 4096" --image-opts $IMGSPEC0 | _filter_qemu_io | _filter_testdir
+
+echo
+echo "== erase slot 2 and try it =="
+$QEMU_IMG erase_encryption_key $SECRETS --image-opts $IMGSPEC1 --keydef slot=2 | _filter_img_create
+$QEMU_IO $SECRETS -c "read 0 4096" --image-opts $IMGSPEC2 | _filter_qemu_io | _filter_testdir
+
+
+# at this point slots 1 and 3 should be active
+
+echo
+echo "== filling  4 slots with secret 2 =="
+for i in $(seq 0 3) ; do
+   $QEMU_IMG add_encryption_key $SECRETS --image-opts $IMGSPEC3 --keydef key-secret=sec2,iter-time=10
+done
+
+echo
+echo "== adding secret 0 =="
+   $QEMU_IMG add_encryption_key $SECRETS --image-opts $IMGSPEC3 --keydef key-secret=sec0,iter-time=10
+
+echo
+echo "== adding secret 3 (last slot) =="
+   $QEMU_IMG add_encryption_key $SECRETS --image-opts $IMGSPEC3 --keydef key-secret=sec3,iter-time=10
+
+echo
+echo "== trying to add another slot (should fail) =="
+$QEMU_IMG add_encryption_key $SECRETS --image-opt

[Qemu-devel] [PATCH 12/13] qemu-img: implement key management

2019-08-14 Thread Maxim Levitsky
Signed-off-by: Maxim Levitsky 
---
 block/crypto.c   |  16 ++
 block/crypto.h   |   3 +
 qemu-img-cmds.hx |  13 +
 qemu-img.c   | 140 +++
 4 files changed, 172 insertions(+)

diff --git a/block/crypto.c b/block/crypto.c
index 415b6db041..2fcdf9dd39 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -203,6 +203,22 @@ block_crypto_create_opts_init(QDict *opts, Error **errp)
 return ret;
 }
 
+QCryptoEncryptionSetupOptions *
+block_crypto_setup_opts_init(QDict *opts, Error **errp)
+{
+Visitor *v;
+QCryptoEncryptionSetupOptions *ret;
+
+v = qobject_input_visitor_new_flat_confused(opts, errp);
+if (!v) {
+return NULL;
+}
+
+visit_type_QCryptoEncryptionSetupOptions(v, NULL, &ret, errp);
+
+visit_free(v);
+return ret;
+}
 
 static int block_crypto_open_generic(QCryptoBlockFormat format,
  QemuOptsList *opts_spec,
diff --git a/block/crypto.h b/block/crypto.h
index b935695e79..ece4d64aef 100644
--- a/block/crypto.h
+++ b/block/crypto.h
@@ -94,4 +94,7 @@ block_crypto_create_opts_init(QDict *opts, Error **errp);
 QCryptoBlockOpenOptions *
 block_crypto_open_opts_init(QDict *opts, Error **errp);
 
+QCryptoEncryptionSetupOptions *
+block_crypto_setup_opts_init(QDict *opts, Error **errp);
+
 #endif /* BLOCK_CRYPTO_H */
diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx
index 1c93e6d185..7816a0adfb 100644
--- a/qemu-img-cmds.hx
+++ b/qemu-img-cmds.hx
@@ -19,6 +19,18 @@ STEXI
 @item amend [--object @var{objectdef}] [--image-opts] [-p] [-q] [-f @var{fmt}] [-t @var{cache}] -o @var{options} @var{filename}
 ETEXI
 
+DEF("add_encryption_key", img_add_encryption_key,
+"add_encryption_key [--object objectdef] [--image-opts] [--force] -U --keydef key_definition filename")
+STEXI
+@item add_encryption_key [--object @var{objectdef}] [--image-opts] [--force] -U --keydef @var{key_definition} @var{filename}
+ETEXI
+
+DEF("erase_encryption_key", img_erase_encryption_key,
+"erase_encryption_key [--object objectdef] [--image-opts] [--force] -U --keydef key_definition filename")
+STEXI
+@item erase_encryption_key [--object @var{objectdef}] [--image-opts] [--force] -U --keydef @var{key_definition} @var{filename}
+ETEXI
+
 DEF("bench", img_bench,
 "bench [-c count] [-d depth] [-f fmt] [--flush-interval=flush_interval] [-n] [--no-drain] [-o offset] [--pattern=pattern] [-q] [-s buffer_size] [-S step_size] [-t cache] [-w] [-U] filename")
 STEXI
@@ -97,6 +109,7 @@ STEXI
 @item resize [--object @var{objectdef}] [--image-opts] [-f @var{fmt}] [--preallocation=@var{prealloc}] [-q] [--shrink] @var{filename} [+ | -]@var{size}
 ETEXI
 
+
 STEXI
 @end table
 ETEXI
diff --git a/qemu-img.c b/qemu-img.c
index 79983772de..bc6cd60df1 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -47,6 +47,7 @@
 #include "block/blockjob.h"
 #include "block/qapi.h"
 #include "crypto/init.h"
+#include "block/crypto.h"
 #include "trace/control.h"
 
 #define QEMU_IMG_VERSION "qemu-img version " QEMU_FULL_VERSION \
@@ -70,6 +71,8 @@ enum {
 OPTION_PREALLOCATION = 265,
 OPTION_SHRINK = 266,
 OPTION_SALVAGE = 267,
+OPTION_FORCE = 268,
+OPTION_KEYDEF = 269,
 };
 
 typedef enum OutputFormat {
@@ -223,6 +226,14 @@ static QemuOptsList qemu_source_opts = {
 },
 };
 
+static QemuOptsList keydef_opts = {
+.name = "encryption_opts",
+.head = QTAILQ_HEAD_INITIALIZER(keydef_opts.head),
+.desc = {
+{ }
+},
+};
+
 static int GCC_FMT_ATTR(2, 3) qprintf(bool quiet, const char *fmt, ...)
 {
 int ret = 0;
@@ -4997,6 +5008,135 @@ out:
 return ret;
 }
 
+
+static QemuOptsList keydef_options_list = {
+.name = "encryption",
+.head = QTAILQ_HEAD_INITIALIZER(keydef_options_list.head),
+.desc = {
+{ }
+},
+};
+
+static int setup_encryption(int argc, char **argv,
+enum BlkSetupEncryptionAction action)
+{
+static const struct option long_options[] = {
+{"help", no_argument, 0, 'h'},
+{"image-opts", no_argument, 0, OPTION_IMAGE_OPTS},
+{"object", required_argument, 0, OPTION_OBJECT},
+{"force", no_argument, 0, OPTION_FORCE},
+{"force-share", no_argument, 0, 'U'},
+{"keydef", required_argument, 0, OPTION_KEYDEF},
+{0, 0, 0, 0}
+};
+
+BlockBackend *blk = NULL;
+const char *filename = NULL;
+bool force_share = false;
+QemuOpts *keydef_opts = NULL;
+bool image_opts = false;
+Error *local_err = NULL;
+QDict *keydef_dict = NULL;
+QCryptoEncryptionSetupOptions *qcrypto_options = NULL;
+bool force = false;
+
+int ret = 1;
+int c;
+
+while ((c = getopt_long(argc, argv, "hU", long_options, NULL)) != -1) {
+switch (c) {
+case '?':
+case 'h':
+help();
+break;
+case 'U':
+force_share = true;
+break;
+
+case OPTION_KEYDEF:
+if (keydef_opts) {
+error_report("O

[Qemu-devel] [PATCH 05/13] qcrypto-luks: clear the masterkey and password before freeing them always

2019-08-14 Thread Maxim Levitsky
While there are other places where these are still stored in memory,
this is still one less area of key material that can be sniffed with
various side-channel attacks.
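
The wipe-then-free idea can be sketched as below. The helper names wipe(), wipe_and_free() and wipe_and_free_str() are hypothetical and do not come from the patch, which calls memset() and g_free() directly. One caveat worth stating: an optimizing compiler is allowed to drop a plain memset() on a buffer that is freed right afterwards, so this sketch writes through a volatile pointer, which is a common though not formally guaranteed way to keep the stores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Overwrite len bytes with zeros through a volatile pointer so the
 * compiler is less likely to treat the stores as dead and remove them.
 */
static void wipe(void *buf, size_t len)
{
    volatile unsigned char *p = buf;

    while (len--) {
        *p++ = 0;
    }
}

/* Zero a heap buffer of known length, then release it. NULL is a no-op. */
static void wipe_and_free(void *buf, size_t len)
{
    if (!buf) {
        return;
    }
    wipe(buf, len);
    free(buf);
}

/* Same as above for a NUL-terminated secret such as a password. */
static void wipe_and_free_str(char *s)
{
    if (s) {
        wipe_and_free(s, strlen(s));
    }
}
```

Funnelling every cleanup path through one helper of this kind would also avoid repeating the NULL check and the length computation at each exit label.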



Signed-off-by: Maxim Levitsky 
---
 crypto/block-luks.c | 52 ++---
 1 file changed, 44 insertions(+), 8 deletions(-)

diff --git a/crypto/block-luks.c b/crypto/block-luks.c
index e1a4df94b7..336e633df4 100644
--- a/crypto/block-luks.c
+++ b/crypto/block-luks.c
@@ -1023,8 +1023,18 @@ qcrypto_block_luks_load_key(QCryptoBlock *block,
  cleanup:
 qcrypto_ivgen_free(ivgen);
 qcrypto_cipher_free(cipher);
-g_free(splitkey);
-g_free(possiblekey);
+
+if (splitkey) {
+memset(splitkey, 0, splitkeylen);
+g_free(splitkey);
+}
+
+if (possiblekey) {
+memset(possiblekey, 0, masterkeylen(luks));
+g_free(possiblekey);
+
+}
+
 return ret;
 }
 
@@ -1161,16 +1171,34 @@ qcrypto_block_luks_open(QCryptoBlock *block,
 block->sector_size = QCRYPTO_BLOCK_LUKS_SECTOR_SIZE;
 block->payload_offset = luks->header.payload_offset * block->sector_size;
 
-g_free(masterkey);
-g_free(password);
+if (masterkey) {
+memset(masterkey, 0, masterkeylen(luks));
+g_free(masterkey);
+}
+
+if (password) {
+memset(password, 0, strlen(password));
+g_free(password);
+}
+
 return 0;
 
  fail:
-g_free(masterkey);
+
+if (masterkey) {
+memset(masterkey, 0, masterkeylen(luks));
+g_free(masterkey);
+}
+
+if (password) {
+memset(password, 0, strlen(password));
+g_free(password);
+}
+
 qcrypto_block_free_cipher(block);
 qcrypto_ivgen_free(block->ivgen);
+
 g_free(luks);
-g_free(password);
 return ret;
 }
 
@@ -1459,7 +1487,10 @@ qcrypto_block_luks_create(QCryptoBlock *block,
 
 memset(masterkey, 0, luks->header.key_bytes);
 g_free(masterkey);
+
+memset(password, 0, strlen(password));
 g_free(password);
+
 g_free(cipher_mode_spec);
 
 return 0;
@@ -1467,9 +1498,14 @@ qcrypto_block_luks_create(QCryptoBlock *block,
  error:
 if (masterkey) {
 memset(masterkey, 0, luks->header.key_bytes);
+g_free(masterkey);
 }
-g_free(masterkey);
-g_free(password);
+
+if (password) {
+memset(password, 0, strlen(password));
+g_free(password);
+}
+
 g_free(cipher_mode_spec);
 
 qcrypto_block_free_cipher(block);
-- 
2.17.2
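
The wipe-then-free pattern repeated in this patch could be collected into one helper. The following is only a sketch, not part of the patch or of QEMU: the names wipe_bytes, wipe_and_free and wipe_and_free_string are hypothetical, and plain free() stands in for g_free(). It writes through a volatile pointer because an optimizing compiler is allowed to drop a plain memset() of a buffer that is about to be freed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not a QEMU API): overwrite len bytes with zeros.
 * The volatile-qualified pointer keeps the compiler from treating the
 * stores as dead and removing them.
 */
static void wipe_bytes(void *buf, size_t len)
{
    volatile unsigned char *p = buf;

    while (len--) {
        *p++ = 0;
    }
}

/* Hypothetical helper: wipe a buffer of known length, then free it. */
static void wipe_and_free(void *buf, size_t len)
{
    if (!buf) {
        return; /* mirrors the NULL checks added in the patch */
    }
    wipe_bytes(buf, len);
    free(buf);
}

/* Hypothetical helper: same, for a NUL-terminated secret such as a password. */
static void wipe_and_free_string(char *str)
{
    if (str) {
        wipe_and_free(str, strlen(str));
    }
}
```

With helpers like these, each cleanup path would shrink to one call per secret, for example wipe_and_free(masterkey, masterkeylen(luks)) and wipe_and_free_string(password).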




[Qemu-devel] [PATCH 09/13] qcrypto-luks: implement the encryption key management

2019-08-14 Thread Maxim Levitsky
Signed-off-by: Maxim Levitsky 
---
 crypto/block-luks.c | 374 +++-
 1 file changed, 373 insertions(+), 1 deletion(-)

diff --git a/crypto/block-luks.c b/crypto/block-luks.c
index 1997e92fe1..2c33643b52 100644
--- a/crypto/block-luks.c
+++ b/crypto/block-luks.c
@@ -72,6 +72,8 @@ typedef struct QCryptoBlockLUKSKeySlot QCryptoBlockLUKSKeySlot;
 
 #define QCRYPTO_BLOCK_LUKS_DEFAULT_ITER_TIME 2000
 
+#define QCRYPTO_BLOCK_LUKS_ERASE_ITERATIONS 40
+
 static const char qcrypto_block_luks_magic[QCRYPTO_BLOCK_LUKS_MAGIC_LEN] = {
 'L', 'U', 'K', 'S', 0xBA, 0xBE
 };
@@ -221,6 +223,9 @@ struct QCryptoBlockLUKS {
 
 /* Hash algorithm used in pbkdf2 function */
 QCryptoHashAlgorithm hash_alg;
+
+/* Name of the secret that was used to open the image */
+char *secret;
 };
 
 
@@ -1121,6 +1126,194 @@ qcrypto_block_luks_find_key(QCryptoBlock *block,
 }
 
 
+
+/*
+ * Returns true if the slot with the given index is marked as active
+ * (contains encrypted copy of the master key)
+ */
+
+static bool
+qcrypto_block_luks_slot_active(QCryptoBlockLUKS *luks, int slot_idx)
+{
+uint32_t val = luks->header.key_slots[slot_idx].active;
+return val ==  QCRYPTO_BLOCK_LUKS_KEY_SLOT_ENABLED;
+}
+
+/*
+ * Returns the number of slots that are marked as active
+ * (contains encrypted copy of the master key)
+ */
+
+static int
+qcrypto_block_luks_count_active_slots(QCryptoBlockLUKS *luks)
+{
+int i, ret = 0;
+
+for (i = 0; i < QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS; i++) {
+if (qcrypto_block_luks_slot_active(luks, i)) {
+ret++;
+}
+}
+return ret;
+}
+
+
+/*
+ * Finds first key slot which is not active
+ * Returns the key slot index, or -1 if no free slot exists
+ */
+
+static int
+qcrypto_block_luks_find_free_keyslot(QCryptoBlockLUKS *luks)
+{
+uint i;
+
+for (i = 0; i < QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS; i++) {
+if (!qcrypto_block_luks_slot_active(luks, i)) {
+return i;
+}
+}
+return -1;
+
+}
+
+/*
+ * Erases a keyslot given its index
+ *
+ * Returns:
+ *    0 if the keyslot was erased successfully
+ *   -1 if an error occurred while erasing the keyslot
+ *
+ */
+
+static int
+qcrypto_block_luks_erase_key(QCryptoBlock *block,
+ uint slot_idx,
+ QCryptoBlockWriteFunc writefunc,
+ void *opaque,
+ Error **errp)
+{
+QCryptoBlockLUKS *luks = block->opaque;
+QCryptoBlockLUKSKeySlot *slot = &luks->header.key_slots[slot_idx];
+uint8_t *garbagekey = NULL;
+size_t splitkeylen = masterkeylen(luks) * slot->stripes;
+int i;
+int ret = -1;
+
+assert(slot_idx < QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS);
+assert(splitkeylen > 0);
+
+garbagekey = g_malloc0(splitkeylen);
+
+/* Reset the key slot header */
+memset(slot->salt, 0, QCRYPTO_BLOCK_LUKS_SALT_LEN);
+slot->iterations = 0;
+slot->active = QCRYPTO_BLOCK_LUKS_KEY_SLOT_DISABLED;
+
+qcrypto_block_luks_store_header(block,  writefunc, opaque, errp);
+
+/*
+ * Now try to erase the key material, even if the header
+ * update failed
+ */
+
+for (i = 0 ; i < QCRYPTO_BLOCK_LUKS_ERASE_ITERATIONS ; i++) {
+if (qcrypto_random_bytes(garbagekey, splitkeylen, errp) < 0) {
+
+/*
+ * If we failed to get the random data, still write
+ * *something* to the key slot at least once
+ */
+
+if (i > 0) {
+goto cleanup;
+}
+}
+
+if (writefunc(block, slot->key_offset * QCRYPTO_BLOCK_LUKS_SECTOR_SIZE,
+  garbagekey,
+  splitkeylen,
+  opaque,
+  errp) != splitkeylen) {
+goto cleanup;
+}
+}
+
+ret = 0;
+cleanup:
+g_free(garbagekey);
+return ret;
+}
+
+
+/*
+ * Erase all the keys that match the given password
+ * Will stop when only one keyslot is remaining
+ * Returns 0 if some keys were erased or -1 on failure
+ */
+
+static int
+qcrypto_block_luks_erase_matching_keys(QCryptoBlock *block,
+ const char *password,
+ QCryptoBlockReadFunc readfunc,
+ QCryptoBlockWriteFunc writefunc,
+ void *opaque,
+ bool force,
+ Error **errp)
+{
+QCryptoBlockLUKS *luks = block->opaque;
+uint i;
+int rv, ret = -1;
+uint8_t *masterkey;
+uint erased_count = 0;
+uint active_slot_count = qcrypto_block_luks_count_active_slots(luks);
+
+masterkey = g_new0(uint8_t, masterkeylen(luks));
+
+for (i = 0; i < QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS; i++) {
+
+/* refuse to erase last key if not forced */
+if (!force && active_slot_count == 1) {
+break;
+}
+
+rv = qcrypto_block_luks_lo

[Qemu-devel] [PATCH 10/13] block/crypto: implement the encryption key management

2019-08-14 Thread Maxim Levitsky
This implements the encryption key management
using the generic code in the qcrypto layer.

This code adds another 'write_func', because the initialization
write_func works directly on the underlying file:
during creation there is no open instance
of the luks driver. During regular use we have one,
and should use it instead.

Signed-off-by: Maxim Levitsky 
---
 block/crypto.c | 96 --
 1 file changed, 93 insertions(+), 3 deletions(-)

diff --git a/block/crypto.c b/block/crypto.c
index 42a3f0898b..415b6db041 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -36,6 +36,7 @@ typedef struct BlockCrypto BlockCrypto;
 
 struct BlockCrypto {
 QCryptoBlock *block;
+bool updating_keys;
 };
 
 
@@ -69,6 +70,24 @@ static ssize_t block_crypto_read_func(QCryptoBlock *block,
 return ret;
 }
 
+static ssize_t block_crypto_write_func(QCryptoBlock *block,
+  size_t offset,
+  const uint8_t *buf,
+  size_t buflen,
+  void *opaque,
+  Error **errp)
+{
+BlockDriverState *bs = opaque;
+ssize_t ret;
+
+ret = bdrv_pwrite(bs->file, offset, buf, buflen);
+if (ret < 0) {
+error_setg_errno(errp, -ret, "Could not write encryption header");
+return ret;
+}
+return ret;
+}
+
 
 struct BlockCryptoCreateData {
 BlockBackend *blk;
@@ -622,6 +641,78 @@ block_crypto_get_specific_info_luks(BlockDriverState *bs, Error **errp)
 return spec_info;
 }
 
+
+static int
+block_crypto_setup_encryption(BlockDriverState *bs,
+  enum BlkSetupEncryptionAction action,
+  QCryptoEncryptionSetupOptions *options,
+  bool force,
+  Error **errp)
+{
+BlockCrypto *crypto = bs->opaque;
+int ret;
+
+assert(crypto);
+assert(crypto->block);
+
+crypto->updating_keys = true;
+
+ret = bdrv_child_refresh_perms(bs, bs->file, errp);
+
+if (ret) {
+crypto->updating_keys = false;
+return ret;
+}
+
+ret = qcrypto_block_setup_encryption(crypto->block,
+  block_crypto_read_func,
+  block_crypto_write_func,
+  bs,
+  action,
+  options,
+  force,
+  errp);
+
+crypto->updating_keys = false;
+bdrv_child_refresh_perms(bs, bs->file, errp);
+
+
+return ret;
+
+}
+
+
+static void
+block_crypto_child_perms(BlockDriverState *bs, BdrvChild *c,
+ const BdrvChildRole *role,
+ BlockReopenQueue *reopen_queue,
+ uint64_t perm, uint64_t shared,
+ uint64_t *nperm, uint64_t *nshared)
+{
+
+BlockCrypto *crypto = bs->opaque;
+
+/*
+ * This driver doesn't modify LUKS metadata except
+ * when updating the encryption slots.
+ * Allow share-rw=on as a special case.
+ *
+ * Encryption update will set the crypto->updating_keys
+ * during that period and refresh permissions
+ *
+ * */
+
+if (crypto->updating_keys) {
+/* need exclusive write access for header update */
+perm |= BLK_PERM_WRITE;
+shared &= ~BLK_PERM_WRITE;
+}
+
+bdrv_filter_default_perms(bs, c, role, reopen_queue,
+perm, shared, nperm, nshared);
+}
+
+
 static const char *const block_crypto_strong_runtime_opts[] = {
 BLOCK_CRYPTO_OPT_LUKS_KEY_SECRET,
 
@@ -634,9 +725,7 @@ static BlockDriver bdrv_crypto_luks = {
 .bdrv_probe = block_crypto_probe_luks,
 .bdrv_open  = block_crypto_open_luks,
 .bdrv_close = block_crypto_close,
-/* This driver doesn't modify LUKS metadata except when creating image.
- * Allow share-rw=on as a special case. */
-.bdrv_child_perm= bdrv_filter_default_perms,
+.bdrv_child_perm= block_crypto_child_perms,
 .bdrv_co_create = block_crypto_co_create_luks,
 .bdrv_co_create_opts = block_crypto_co_create_opts_luks,
 .bdrv_co_truncate   = block_crypto_co_truncate,
@@ -649,6 +738,7 @@ static BlockDriver bdrv_crypto_luks = {
 .bdrv_getlength = block_crypto_getlength,
 .bdrv_get_info  = block_crypto_get_info_luks,
 .bdrv_get_specific_info = block_crypto_get_specific_info_luks,
+.bdrv_setup_encryption = block_crypto_setup_encryption,
 
 .strong_runtime_opts = block_crypto_strong_runtime_opts,
 };
-- 
2.17.2
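
The permission switch done by block_crypto_child_perms can be shown in isolation. This is a minimal sketch under stated assumptions: DEMO_PERM_READ and DEMO_PERM_WRITE are hypothetical bit values rather than QEMU's BLK_PERM_* constants, and the call to bdrv_filter_default_perms is replaced by a plain pass-through.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical permission bits; 64-bit constants so that ~ keeps all bits. */
#define DEMO_PERM_READ  (UINT64_C(1) << 0)
#define DEMO_PERM_WRITE (UINT64_C(1) << 1)

/*
 * While keys are being updated, require write access for ourselves and
 * stop sharing write access with others; otherwise pass the requested
 * permissions through unchanged.
 */
static void demo_child_perms(bool updating_keys,
                             uint64_t perm, uint64_t shared,
                             uint64_t *nperm, uint64_t *nshared)
{
    if (updating_keys) {
        perm |= DEMO_PERM_WRITE;
        shared &= ~DEMO_PERM_WRITE;
    }
    *nperm = perm;
    *nshared = shared;
}
```

The flag is set before the first permission refresh and cleared before the second one, so exclusive write access is requested only for the duration of the header update.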




[Qemu-devel] [PATCH 11/13] block/qcow2: implement the encryption key management

2019-08-14 Thread Maxim Levitsky
This is the main purpose of the patchset: to enable
us to manage a LUKS-like header embedded in a qcow2
image, which standard cryptsetup tools don't support.

Signed-off-by: Maxim Levitsky 
---
 block/qcow2.c | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/block/qcow2.c b/block/qcow2.c
index 039bdc2f7e..a87e58f36a 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -5086,6 +5086,31 @@ void qcow2_signal_corruption(BlockDriverState *bs, bool fatal, int64_t offset,
 s->signaled_corruption = true;
 }
 
+
+static int qcow2_setup_encryption(BlockDriverState *bs,
+  enum BlkSetupEncryptionAction action,
+  QCryptoEncryptionSetupOptions *options,
+  bool force,
+  Error **errp)
+{
+BDRVQcow2State *s = bs->opaque;
+
+if (!s->crypto) {
+error_setg(errp, "Can't manage encryption - image is not encrypted");
+return -EINVAL;
+}
+
+return qcrypto_block_setup_encryption(s->crypto,
+  qcow2_crypto_hdr_read_func,
+  qcow2_crypto_hdr_write_func,
+  bs,
+  action,
+  options,
+  force,
+  errp);
+}
+
+
 static QemuOptsList qcow2_create_opts = {
 .name = "qcow2-create-opts",
 .head = QTAILQ_HEAD_INITIALIZER(qcow2_create_opts.head),
@@ -5232,6 +5257,8 @@ BlockDriver bdrv_qcow2 = {
 .bdrv_reopen_bitmaps_rw = qcow2_reopen_bitmaps_rw,
 .bdrv_can_store_new_dirty_bitmap = qcow2_can_store_new_dirty_bitmap,
 .bdrv_remove_persistent_dirty_bitmap = qcow2_remove_persistent_dirty_bitmap,
+
+.bdrv_setup_encryption = qcow2_setup_encryption,
 };
 
 static void bdrv_qcow2_init(void)
-- 
2.17.2




[Qemu-devel] [PATCH 08/13] qcrypto: add the plumbing for encryption management

2019-08-14 Thread Maxim Levitsky
This adds qcrypto_block_setup_encryption, which
is a thin wrapper around the setup_encryption callback
of the crypto driver, which is also added.

Signed-off-by: Maxim Levitsky 
---
 crypto/block.c | 29 +
 crypto/blockpriv.h |  9 +
 include/crypto/block.h | 27 +++
 3 files changed, 65 insertions(+)

diff --git a/crypto/block.c b/crypto/block.c
index ee96759f7d..5916e49aba 100644
--- a/crypto/block.c
+++ b/crypto/block.c
@@ -20,6 +20,7 @@
 
 #include "qemu/osdep.h"
 #include "qapi/error.h"
+
 #include "blockpriv.h"
 #include "block-qcow.h"
 #include "block-luks.h"
@@ -282,6 +283,34 @@ void qcrypto_block_free(QCryptoBlock *block)
 }
 
 
+int qcrypto_block_setup_encryption(QCryptoBlock *block,
+   QCryptoBlockReadFunc readfunc,
+   QCryptoBlockWriteFunc writefunc,
+   void *opaque,
+   enum BlkSetupEncryptionAction action,
+   QCryptoEncryptionSetupOptions *options,
+   bool force,
+   Error **errp)
+{
+if (!block->driver->setup_encryption) {
+error_setg(errp,
+"Crypto format %s doesn't support management of encryption keys",
+QCryptoBlockFormat_str(block->format));
+return -1;
+}
+
+return block->driver->setup_encryption(block,
+   readfunc,
+   writefunc,
+   opaque,
+   action,
+   options,
+   force,
+   errp);
+}
+
+
+
 typedef int (*QCryptoCipherEncDecFunc)(QCryptoCipher *cipher,
 const void *in,
 void *out,
diff --git a/crypto/blockpriv.h b/crypto/blockpriv.h
index 71c59cb542..804965dca3 100644
--- a/crypto/blockpriv.h
+++ b/crypto/blockpriv.h
@@ -81,6 +81,15 @@ struct QCryptoBlockDriver {
 
 bool (*has_format)(const uint8_t *buf,
size_t buflen);
+
+int (*setup_encryption)(QCryptoBlock *block,
+QCryptoBlockReadFunc readfunc,
+QCryptoBlockWriteFunc writefunc,
+void *opaque,
+enum BlkSetupEncryptionAction action,
+QCryptoEncryptionSetupOptions *options,
+bool force,
+Error **errp);
 };
 
 
diff --git a/include/crypto/block.h b/include/crypto/block.h
index fe12899831..60d46e3efc 100644
--- a/include/crypto/block.h
+++ b/include/crypto/block.h
@@ -23,6 +23,7 @@
 
 #include "crypto/cipher.h"
 #include "crypto/ivgen.h"
+#include "block/block.h"
 
 typedef struct QCryptoBlock QCryptoBlock;
 
@@ -268,4 +269,30 @@ uint64_t qcrypto_block_get_sector_size(QCryptoBlock *block);
  */
 void qcrypto_block_free(QCryptoBlock *block);
 
+
+/**
+ * qcrypto_block_setup_encryption:
+ * @block: the block encryption object
+ *
+ * @readfunc: callback for reading data from the volume header
+ * @writefunc: callback for writing data to the volume header
+ * @opaque: data to pass to @readfunc and @writefunc
+ * @action: tell the driver the setup action (add/erase currently)
+ * @options: driver specific options, that specify
+ *   what encryption settings to manage
+ * @force: hint for the driver to allow unsafe operation
+ * @errp: error pointer
+ *
+ * Adds/Erases a new encryption key using @options
+ *
+ */
+int qcrypto_block_setup_encryption(QCryptoBlock *block,
+   QCryptoBlockReadFunc readfunc,
+   QCryptoBlockWriteFunc writefunc,
+   void *opaque,
+   enum BlkSetupEncryptionAction action,
+   QCryptoEncryptionSetupOptions *options,
+   bool force,
+   Error **errp);
+
 #endif /* QCRYPTO_BLOCK_H */
-- 
2.17.2
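
The wrapper added here follows the usual optional-callback pattern: check whether the driver provides the hook, report an error if not, otherwise forward the call. A reduced sketch of that pattern follows. All demo_* names are hypothetical, and the error is reported through a plain string pointer instead of QEMU's Error API.

```c
#include <assert.h>
#include <stddef.h>

struct demo_block;

/* Hypothetical driver table; setup_encryption may be NULL. */
struct demo_driver {
    const char *name;
    int (*setup_encryption)(struct demo_block *block, int force);
};

struct demo_block {
    const struct demo_driver *driver;
    int keys; /* number of keys, only used to observe the callback */
};

/* Thin wrapper: fail cleanly when the driver lacks the callback. */
static int demo_setup_encryption(struct demo_block *block, int force,
                                 const char **err)
{
    if (!block->driver->setup_encryption) {
        *err = "format doesn't support management of encryption keys";
        return -1;
    }
    return block->driver->setup_encryption(block, force);
}

/* Example callback for a driver that does support key management. */
static int demo_luks_setup(struct demo_block *block, int force)
{
    (void)force;
    block->keys++;
    return 0;
}

static const struct demo_driver demo_luks = { "luks-like", demo_luks_setup };
static const struct demo_driver demo_legacy = { "legacy", NULL };
```

A block backed by demo_legacy gets -1 and an error string, while one backed by demo_luks has its callback invoked. The wrapper in the patch makes the same distinction.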




[Qemu-devel] [PATCH 03/13] qcrypto-luks: refactoring: extract load/store/check/parse header functions

2019-08-14 Thread Maxim Levitsky
With upcoming key management, the header will
need to be stored after the image is created.

Extracting the load-header function isn't strictly needed,
but do it anyway for symmetry.

Also extract a function that does basic sanity
checks on the just-read header, and a function
that parses the crypto format strings, to make the
code a bit more readable. In addition, the code no longer
destroys the in-header cipher-mode string,
so the header can now be stored many times,
which is needed for key management.

This also allows the endianness conversions to be
contained in these functions alone.

The header is no longer endian-swapped in place,
to prevent (mostly theoretical, I think) races
where someone could see the header in the
process of being byteswapped.

Signed-off-by: Maxim Levitsky 
---
 crypto/block-luks.c | 756 ++--
 1 file changed, 440 insertions(+), 316 deletions(-)

diff --git a/crypto/block-luks.c b/crypto/block-luks.c
index 48213abde7..6bb369f3b4 100644
--- a/crypto/block-luks.c
+++ b/crypto/block-luks.c
@@ -417,6 +417,427 @@ static int masterkeylen(QCryptoBlockLUKS *luks)
 }
 
 
+/*
+ * Stores the main LUKS header, taking care of endianess
+ */
+static int
+qcrypto_block_luks_store_header(QCryptoBlock *block,
+QCryptoBlockWriteFunc writefunc,
+void *opaque,
+Error **errp)
+{
+QCryptoBlockLUKS *luks = block->opaque;
+Error *local_err = NULL;
+size_t i;
+QCryptoBlockLUKSHeader *hdr_copy;
+
+/* Create a copy of the header */
+hdr_copy = g_new0(QCryptoBlockLUKSHeader, 1);
+memcpy(hdr_copy, &luks->header, sizeof(QCryptoBlockLUKSHeader));
+
+/*
+ * Everything on disk uses Big Endian (tm), so flip header fields
+ * before writing them
+ */
+cpu_to_be16s(&hdr_copy->version);
+cpu_to_be32s(&hdr_copy->payload_offset);
+cpu_to_be32s(&hdr_copy->key_bytes);
+cpu_to_be32s(&hdr_copy->master_key_iterations);
+
+for (i = 0; i < QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS; i++) {
+cpu_to_be32s(&hdr_copy->key_slots[i].active);
+cpu_to_be32s(&hdr_copy->key_slots[i].iterations);
+cpu_to_be32s(&hdr_copy->key_slots[i].key_offset);
+cpu_to_be32s(&hdr_copy->key_slots[i].stripes);
+}
+
+/* Write out the partition header and key slot headers */
+writefunc(block, 0, (const uint8_t *)hdr_copy, sizeof(*hdr_copy),
+  opaque, &local_err);
+
+g_free(hdr_copy);
+
+if (local_err) {
+error_propagate(errp, local_err);
+return -1;
+}
+return 0;
+}
+
+/*
+ * Loads the main LUKS header,and byteswaps it to native endianess
+ * And run basic sanity checks on it
+ */
+static int
+qcrypto_block_luks_load_header(QCryptoBlock *block,
+QCryptoBlockReadFunc readfunc,
+void *opaque,
+Error **errp)
+{
+ssize_t rv;
+size_t i;
+int ret = 0;
+QCryptoBlockLUKS *luks = block->opaque;
+
+/*
+ * Read the entire LUKS header, minus the key material from
+ * the underlying device
+ */
+
+rv = readfunc(block, 0,
+  (uint8_t *)&luks->header,
+  sizeof(luks->header),
+  opaque,
+  errp);
+if (rv < 0) {
+ret = rv;
+goto fail;
+}
+
+/*
+ * The header is always stored in big-endian format, so
+ * convert everything to native
+ */
+be16_to_cpus(&luks->header.version);
+be32_to_cpus(&luks->header.payload_offset);
+be32_to_cpus(&luks->header.key_bytes);
+be32_to_cpus(&luks->header.master_key_iterations);
+
+for (i = 0; i < QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS; i++) {
+be32_to_cpus(&luks->header.key_slots[i].active);
+be32_to_cpus(&luks->header.key_slots[i].iterations);
+be32_to_cpus(&luks->header.key_slots[i].key_offset);
+be32_to_cpus(&luks->header.key_slots[i].stripes);
+}
+
+
+return 0;
+fail:
+return ret;
+}
+
+
+/*
+ * Does basic sanity checks on the LUKS header
+ */
+static int
+qcrypto_block_luks_check_header(QCryptoBlockLUKS *luks, Error **errp)
+{
+int ret;
+
+if (memcmp(luks->header.magic, qcrypto_block_luks_magic,
+   QCRYPTO_BLOCK_LUKS_MAGIC_LEN) != 0) {
+error_setg(errp, "Volume is not in LUKS format");
+ret = -EINVAL;
+goto fail;
+}
+
+if (luks->header.version != QCRYPTO_BLOCK_LUKS_VERSION) {
+error_setg(errp, "LUKS version %" PRIu32 " is not supported",
+   luks->header.version);
+ret = -ENOTSUP;
+goto fail;
+}
+
+return 0;
+fail:
+return ret;
+}
+
+
+/*
+ * Parses the crypto parameters that are stored in the LUKS header
+ * to string
+ */
+
+static int
+qcrypto_block_luks_parse_header(QCryptoBlockLUKS *luks, Error **errp)
+{
+char *cipher_mode = g_strdup(luks->header.cipher_mode);
+ 
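
The copy-then-convert approach used by qcrypto_block_luks_store_header can be illustrated on a much smaller structure. This is only a sketch: struct demo_header is a hypothetical three-field stand-in for the real LUKS header, and to_be16/to_be32 are local helpers rather than QEMU's cpu_to_be16s/cpu_to_be32s.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, heavily reduced stand-in for the on-disk header. */
struct demo_header {
    uint16_t version;
    uint32_t payload_offset;
    uint32_t key_bytes;
};

/* Return v laid out in memory as big-endian, regardless of host order. */
static uint16_t to_be16(uint16_t v)
{
    uint8_t b[2] = { (uint8_t)(v >> 8), (uint8_t)v };
    uint16_t out;

    memcpy(&out, b, sizeof(out));
    return out;
}

static uint32_t to_be32(uint32_t v)
{
    uint8_t b[4] = {
        (uint8_t)(v >> 24), (uint8_t)(v >> 16), (uint8_t)(v >> 8), (uint8_t)v
    };
    uint32_t out;

    memcpy(&out, b, sizeof(out));
    return out;
}

/*
 * Build the on-disk (big-endian) form in a copy, leaving the native
 * header untouched so that it can be stored again later.
 */
static struct demo_header demo_header_to_disk(const struct demo_header *native)
{
    struct demo_header copy = *native;

    copy.version = to_be16(copy.version);
    copy.payload_offset = to_be32(copy.payload_offset);
    copy.key_bytes = to_be32(copy.key_bytes);
    return copy;
}
```

Because only the copy is converted, concurrent readers of the native structure never observe a half-swapped header, which is the concern raised in the commit message.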

[Qemu-devel] [PATCH 04/13] qcrypto-luks: refactoring: simplify the math used for keyslot locations

2019-08-14 Thread Maxim Levitsky
Signed-off-by: Maxim Levitsky 
---
 crypto/block-luks.c | 64 +++--
 1 file changed, 38 insertions(+), 26 deletions(-)

diff --git a/crypto/block-luks.c b/crypto/block-luks.c
index 6bb369f3b4..e1a4df94b7 100644
--- a/crypto/block-luks.c
+++ b/crypto/block-luks.c
@@ -417,6 +417,33 @@ static int masterkeylen(QCryptoBlockLUKS *luks)
 }
 
 
+/*
+ * Returns number of sectors needed to store the key material
+ * given number of anti forensic stripes
+ */
+static int splitkeylen_sectors(QCryptoBlockLUKS *luks, int stripes)
+
+{
+/*
+ * This calculation doesn't match that shown in the spec,
+ * but instead follows the cryptsetup implementation.
+ */
+
+size_t header_sectors = QCRYPTO_BLOCK_LUKS_KEY_SLOT_OFFSET /
+ QCRYPTO_BLOCK_LUKS_SECTOR_SIZE;
+
+size_t splitkeylen = masterkeylen(luks) * stripes;
+
+/* First align the key material size to block size*/
+size_t splitkeylen_sectors =
+DIV_ROUND_UP(splitkeylen, QCRYPTO_BLOCK_LUKS_SECTOR_SIZE);
+
+/* Then also align the key material size to the size of the header */
+return ROUND_UP(splitkeylen_sectors, header_sectors);
+}
+
+
+
 /*
  * Stores the main LUKS header, taking care of endianess
  */
@@ -1169,7 +1196,7 @@ qcrypto_block_luks_create(QCryptoBlock *block,
 QCryptoBlockCreateOptionsLUKS luks_opts;
 Error *local_err = NULL;
 uint8_t *masterkey = NULL;
-size_t splitkeylen = 0;
+size_t next_sector;
 size_t i;
 char *password;
 const char *cipher_alg;
@@ -1388,23 +1415,16 @@ qcrypto_block_luks_create(QCryptoBlock *block,
 goto error;
 }
 
+/* start with the sector that follows the header*/
+next_sector = QCRYPTO_BLOCK_LUKS_KEY_SLOT_OFFSET /
+  QCRYPTO_BLOCK_LUKS_SECTOR_SIZE;
 
-/* Although LUKS has multiple key slots, we're just going
- * to use the first key slot */
-splitkeylen = luks->header.key_bytes * QCRYPTO_BLOCK_LUKS_STRIPES;
 for (i = 0; i < QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS; i++) {
-luks->header.key_slots[i].active = QCRYPTO_BLOCK_LUKS_KEY_SLOT_DISABLED;
-luks->header.key_slots[i].stripes = QCRYPTO_BLOCK_LUKS_STRIPES;
-
-/* This calculation doesn't match that shown in the spec,
- * but instead follows the cryptsetup implementation.
- */
-luks->header.key_slots[i].key_offset =
-(QCRYPTO_BLOCK_LUKS_KEY_SLOT_OFFSET /
- QCRYPTO_BLOCK_LUKS_SECTOR_SIZE) +
-(ROUND_UP(DIV_ROUND_UP(splitkeylen, QCRYPTO_BLOCK_LUKS_SECTOR_SIZE),
-  (QCRYPTO_BLOCK_LUKS_KEY_SLOT_OFFSET /
-   QCRYPTO_BLOCK_LUKS_SECTOR_SIZE)) * i);
+QCryptoBlockLUKSKeySlot *slot = &luks->header.key_slots[i];
+slot->active = QCRYPTO_BLOCK_LUKS_KEY_SLOT_DISABLED;
+slot->key_offset = next_sector;
+slot->stripes = QCRYPTO_BLOCK_LUKS_STRIPES;
+next_sector += splitkeylen_sectors(luks, QCRYPTO_BLOCK_LUKS_STRIPES);
 }
 
 
@@ -1412,17 +1432,9 @@ qcrypto_block_luks_create(QCryptoBlock *block,
  * slot headers, rounded up to the nearest sector, combined with
  * the size of each master key material region, also rounded up
  * to the nearest sector */
-luks->header.payload_offset =
-(QCRYPTO_BLOCK_LUKS_KEY_SLOT_OFFSET /
- QCRYPTO_BLOCK_LUKS_SECTOR_SIZE) +
-(ROUND_UP(DIV_ROUND_UP(splitkeylen, QCRYPTO_BLOCK_LUKS_SECTOR_SIZE),
-  (QCRYPTO_BLOCK_LUKS_KEY_SLOT_OFFSET /
-   QCRYPTO_BLOCK_LUKS_SECTOR_SIZE)) *
- QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS);
-
+luks->header.payload_offset = next_sector;
 block->sector_size = QCRYPTO_BLOCK_LUKS_SECTOR_SIZE;
-block->payload_offset = luks->header.payload_offset *
-block->sector_size;
+block->payload_offset = luks->header.payload_offset * block->sector_size;
 
 /* Reserve header space to match payload offset */
 initfunc(block, block->payload_offset, opaque, &local_err);
-- 
2.17.2
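
The simplified keyslot arithmetic can be checked with a standalone calculation. This sketch assumes the constants below (512-byte sectors, a 4096-byte key slot offset, 4000 stripes, 8 key slots). Only the sector size is visible in this series; the other three are assumptions and should be compared against block-luks.c. The demo_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values; verify against crypto/block-luks.c. */
enum {
    DEMO_SECTOR_SIZE     = 512,
    DEMO_KEY_SLOT_OFFSET = 4096,
    DEMO_STRIPES         = 4000,
    DEMO_NUM_KEY_SLOTS   = 8
};

static size_t demo_div_round_up(size_t n, size_t d)
{
    return (n + d - 1) / d;
}

static size_t demo_round_up(size_t n, size_t d)
{
    return demo_div_round_up(n, d) * d;
}

/* Sectors needed for one slot's key material, aligned to the header size. */
static size_t demo_splitkeylen_sectors(size_t key_bytes, size_t stripes)
{
    size_t header_sectors = DEMO_KEY_SLOT_OFFSET / DEMO_SECTOR_SIZE;
    size_t sectors = demo_div_round_up(key_bytes * stripes, DEMO_SECTOR_SIZE);

    return demo_round_up(sectors, header_sectors);
}

/* Fill in each slot's starting sector and return the payload offset. */
static size_t demo_layout_key_slots(size_t key_bytes,
                                    size_t offsets[DEMO_NUM_KEY_SLOTS])
{
    size_t next_sector = DEMO_KEY_SLOT_OFFSET / DEMO_SECTOR_SIZE;
    size_t i;

    for (i = 0; i < DEMO_NUM_KEY_SLOTS; i++) {
        offsets[i] = next_sector;
        next_sector += demo_splitkeylen_sectors(key_bytes, DEMO_STRIPES);
    }
    return next_sector;
}
```

Under these assumptions a 32-byte master key needs 128000 bytes of split key material per slot, which is 250 sectors, rounded up to 256. The slots then start at sectors 8, 264, 520 and so on, and the payload begins at sector 8 + 8 * 256 = 2056.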




[Qemu-devel] [PATCH 02/13] qcrypto-luks: misc refactoring

2019-08-14 Thread Maxim Levitsky
This is also a preparation for key read/write/erase functions

* use master key len from the header
* prefer to use crypto params in the QCryptoBlockLUKS
  over passing them as function arguments
* define QCRYPTO_BLOCK_LUKS_DEFAULT_ITER_TIME
* Add comments to various crypto parameters in the QCryptoBlockLUKS

Signed-off-by: Maxim Levitsky 
---
 crypto/block-luks.c | 213 ++--
 1 file changed, 105 insertions(+), 108 deletions(-)

diff --git a/crypto/block-luks.c b/crypto/block-luks.c
index 409ab50f20..48213abde7 100644
--- a/crypto/block-luks.c
+++ b/crypto/block-luks.c
@@ -70,6 +70,8 @@ typedef struct QCryptoBlockLUKSKeySlot QCryptoBlockLUKSKeySlot;
 
 #define QCRYPTO_BLOCK_LUKS_SECTOR_SIZE 512LL
 
+#define QCRYPTO_BLOCK_LUKS_DEFAULT_ITER_TIME 2000
+
 static const char qcrypto_block_luks_magic[QCRYPTO_BLOCK_LUKS_MAGIC_LEN] = {
 'L', 'U', 'K', 'S', 0xBA, 0xBE
 };
@@ -199,13 +201,25 @@ QEMU_BUILD_BUG_ON(sizeof(struct QCryptoBlockLUKSHeader) != 592);
 struct QCryptoBlockLUKS {
 QCryptoBlockLUKSHeader header;
 
-/* Cache parsed versions of what's in header fields,
- * as we can't rely on QCryptoBlock.cipher being
- * non-NULL */
+/* Main encryption algorithm used for encryption*/
 QCryptoCipherAlgorithm cipher_alg;
+
+/* Mode of encryption for the selected encryption algorithm */
 QCryptoCipherMode cipher_mode;
+
+/* Initialization vector generation algorithm */
 QCryptoIVGenAlgorithm ivgen_alg;
+
+/* Hash algorithm used for IV generation*/
 QCryptoHashAlgorithm ivgen_hash_alg;
+
+/*
+ * Encryption algorithm used for IV generation.
+ * Usually the same as main encryption algorithm
+ */
+QCryptoCipherAlgorithm ivgen_cipher_alg;
+
+/* Hash algorithm used in pbkdf2 function */
 QCryptoHashAlgorithm hash_alg;
 };
 
@@ -397,6 +411,12 @@ qcrypto_block_luks_essiv_cipher(QCryptoCipherAlgorithm cipher,
 }
 }
 
+static int masterkeylen(QCryptoBlockLUKS *luks)
+{
+return luks->header.key_bytes;
+}
+
+
 /*
  * Given a key slot, and user password, this will attempt to unlock
  * the master encryption key from the key slot.
@@ -410,21 +430,15 @@ qcrypto_block_luks_essiv_cipher(QCryptoCipherAlgorithm cipher,
  */
 static int
 qcrypto_block_luks_load_key(QCryptoBlock *block,
-QCryptoBlockLUKSKeySlot *slot,
+uint slot_idx,
 const char *password,
-QCryptoCipherAlgorithm cipheralg,
-QCryptoCipherMode ciphermode,
-QCryptoHashAlgorithm hash,
-QCryptoIVGenAlgorithm ivalg,
-QCryptoCipherAlgorithm ivcipheralg,
-QCryptoHashAlgorithm ivhash,
 uint8_t *masterkey,
-size_t masterkeylen,
 QCryptoBlockReadFunc readfunc,
 void *opaque,
 Error **errp)
 {
 QCryptoBlockLUKS *luks = block->opaque;
+QCryptoBlockLUKSKeySlot *slot = &luks->header.key_slots[slot_idx];
 uint8_t *splitkey;
 size_t splitkeylen;
 uint8_t *possiblekey;
@@ -439,9 +453,9 @@ qcrypto_block_luks_load_key(QCryptoBlock *block,
 return 0;
 }
 
-splitkeylen = masterkeylen * slot->stripes;
+splitkeylen = masterkeylen(luks) * slot->stripes;
 splitkey = g_new0(uint8_t, splitkeylen);
-possiblekey = g_new0(uint8_t, masterkeylen);
+possiblekey = g_new0(uint8_t, masterkeylen(luks));
 
 /*
  * The user password is used to generate a (possible)
@@ -450,11 +464,11 @@ qcrypto_block_luks_load_key(QCryptoBlock *block,
  * the key is correct and validate the results of
  * decryption later.
  */
-if (qcrypto_pbkdf2(hash,
+if (qcrypto_pbkdf2(luks->hash_alg,
(const uint8_t *)password, strlen(password),
slot->salt, QCRYPTO_BLOCK_LUKS_SALT_LEN,
slot->iterations,
-   possiblekey, masterkeylen,
+   possiblekey, masterkeylen(luks),
errp) < 0) {
 goto cleanup;
 }
@@ -478,19 +492,19 @@ qcrypto_block_luks_load_key(QCryptoBlock *block,
 
 /* Setup the cipher/ivgen that we'll use to try to decrypt
  * the split master key material */
-cipher = qcrypto_cipher_new(cipheralg, ciphermode,
-possiblekey, masterkeylen,
+cipher = qcrypto_cipher_new(luks->cipher_alg, luks->cipher_mode,
+possiblekey, masterkeylen(luks),
 errp);
 if (!cipher) {
 goto cleanup;
 }
 
-niv = qcrypto_cipher_get_iv_len(cipheralg,
-ciphermode);
-ivgen = qcrypto_ivgen_new(ivalg,
-  ivcipheralg,
-
