Re: [PATCH 1/2] qapi/block-core: Improve MapEntry documentation

2020-11-05 Thread Markus Armbruster
Max Reitz  writes:

> MapEntry and BlockDeviceMapEntry are kind of the same thing, and the
> latter is not used, so we want to remove it.  However, the documentation
> it provides for some fields is better than that of MapEntry, so steal
> some of it for the latter.
>
> (And adjust them a bit in the process, because I feel like we can make
> them even clearer.)
>
> Signed-off-by: Max Reitz 
> ---
>  qapi/block-core.json | 18 +-
>  1 file changed, 13 insertions(+), 5 deletions(-)
>
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index 1b8b4156b4..3f86675357 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -244,17 +244,25 @@
>  #
>  # Mapping information from a virtual block range to a host file range
>  #
> -# @start: the start byte of the mapped virtual range
> +# @start: virtual (guest) offset of the first byte described by this
> +# entry
>  #
>  # @length: the number of bytes of the mapped virtual range
>  #
> -# @data: whether the mapped range has data
> +# @data: reading the image will actually read data from a file (in
> +#particular, if @offset is present this means that the sectors
> +#are not simply preallocated, but contain actual data in raw
> +#format)
>  #
> -# @zero: whether the virtual blocks are zeroed
> +# @zero: whether the virtual blocks read as zeroes
>  #
> -# @depth: the depth of the mapping
> +# @depth: number of layers (0 = top image, 1 = top image's backing
> +# file, ..., n - 1 = bottom image (where n is the number of
> +# images in the chain)) before reaching one for which the
> +# range is allocated
>  #
> -# @offset: the offset in file that the virtual sectors are mapped to
> +# @offset: if present, the image file stores the data for this range
> +#  in raw format at the given (host) offset
>  #
>  # @filename: filename that is referred to by @offset
>  #

I can't vouch for the comment's accuracy without reading a lot of code,
but I can say that the comment is clearly much clearer, so:

Acked-by: Markus Armbruster 
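
For readers who want to see how the documented fields fit together, here is a
minimal sketch. The struct and enum below are made up for illustration and are
not the generated QAPI type; the classification only restates the comment text
quoted above (zero means "reads as zeroes", data plus a present offset means
raw data at a host offset).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative mirror of the documented fields; NOT the real MapEntry type. */
struct map_entry_sketch {
    uint64_t start;    /* virtual (guest) offset of the first byte */
    uint64_t length;   /* number of bytes in the range */
    bool data;         /* reading the image reads data from a file */
    bool zero;         /* the range reads as zeroes */
    int depth;         /* 0 = top image, 1 = its backing file, ... */
    bool has_offset;   /* whether @offset is present */
    uint64_t offset;   /* host offset, valid only if has_offset */
};

/* Hypothetical labels for the cases the documentation distinguishes. */
enum range_kind {
    RANGE_READS_ZERO,
    RANGE_RAW_DATA_AT_OFFSET,
    RANGE_DATA_NOT_RAW,
    RANGE_NO_DATA,
};

static enum range_kind classify_entry(const struct map_entry_sketch *e)
{
    assert(e != NULL);
    if (e->zero) {
        return RANGE_READS_ZERO;
    }
    if (e->data && e->has_offset) {
        /* Stored in raw format at the given host offset. */
        return RANGE_RAW_DATA_AT_OFFSET;
    }
    if (e->data) {
        /* Data is read from a file, but no raw host offset is reported. */
        return RANGE_DATA_NOT_RAW;
    }
    return RANGE_NO_DATA;
}
```

Checking @zero first is a choice made for this sketch; nothing in the patch
prescribes an order.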




Re: [PATCH v2 5/6] macio: don't reference serial_hd() directly within the device

2020-11-05 Thread Mark Cave-Ayland

On 05/11/2020 05:31, Thomas Huth wrote:


(goes and looks)

Ah okay it appears to be because the object property link to the PIC is
missing, which is to be expected as it is only present on the Mac machines.

With the latest round of QOM updates I can see the solution but it's
probably a bit much now that we've reached rc-0. The easiest thing for the
moment is to switch user_creatable back to false if this is causing an issue.


+1 for setting user_creatable back to false ... can you send a patch or
shall I prepare one?


No that's fine, I can come up with a fix over the next couple of days.


Just out of interest how did you find this? My new workflow involves a local
"make check" with all ppc targets and a pass through Travis-CI and it didn't
show up there for me (or indeed Peter's pre-merge tests).


I've found it with the scripts/device-crash-test script. (You currently need
to apply Eduardo's patch "Check if path is actually an executable file" on
top first to run it)


Have you got a link for this? I've tried doing some basic searches in my email client 
and couldn't easily spot it...



ATB,

Mark.



Re: [PATCH] target/openrisc: fix icount handling for timer instructions

2020-11-05 Thread Pavel Dovgalyuk

On 06.11.2020 00:39, Richard Henderson wrote:

On 11/5/20 3:54 AM, Pavel Dovgalyuk wrote:

This patch adds icount handling to mfspr/mtspr instructions
that may deal with hardware timers.

Signed-off-by: Pavel Dovgalyuk 
---
  target/openrisc/translate.c |   15 +++
  1 file changed, 15 insertions(+)


Looks correct, but it would be better not to duplicate the code from
trans_l_mtspr, and use an is_jmp code (called DISAS_UPDATE_EXIT in some other
targets).


mtspr includes the following comment:
* Save all of the cpu state first, allowing it to be overwritten.

Does that mean the helper can overwrite the PC? If so, the PC can't be
updated in the is_jmp handler at the end of instruction translation.



Pavel Dovgalyuk





Re: [PATCH V2] qtest: Fix bad printf format specifiers

2020-11-05 Thread Markus Armbruster
AlexChen  writes:

> We should use printf format specifier PRIu32 instead of "%d" for
> argument of type 'uint32_t'.

I prefer v1, which uses %u.

[...]




Re: [PATCH] qtest: Fix bad printf format specifiers

2020-11-05 Thread Markus Armbruster
Thomas Huth  writes:

> On 05/11/2020 06.14, AlexChen wrote:
>> On 2020/11/4 18:44, Thomas Huth wrote:
>>> On 04/11/2020 11.23, AlexChen wrote:
 We should use printf format specifier "%u" instead of "%d" for
 argument of type "unsigned int".

 Reported-by: Euler Robot 
 Signed-off-by: Alex Chen 
 ---
  tests/qtest/arm-cpu-features.c | 8 
  1 file changed, 4 insertions(+), 4 deletions(-)

 diff --git a/tests/qtest/arm-cpu-features.c 
 b/tests/qtest/arm-cpu-features.c
 index d20094d5a7..bc681a95d5 100644
 --- a/tests/qtest/arm-cpu-features.c
 +++ b/tests/qtest/arm-cpu-features.c
 @@ -536,7 +536,7 @@ static void test_query_cpu_model_expansion_kvm(const 
 void *data)
  if (kvm_supports_sve) {
  g_assert(vls != 0);
  max_vq = 64 - __builtin_clzll(vls);
 -sprintf(max_name, "sve%d", max_vq * 128);
 +sprintf(max_name, "sve%u", max_vq * 128);

  /* Enabling a supported length is of course fine. */
  assert_sve_vls(qts, "host", vls, "{ %s: true }", max_name);
 @@ -556,7 +556,7 @@ static void test_query_cpu_model_expansion_kvm(const 
 void *data)
   * unless all larger, supported vector lengths are also
   * disabled.
   */
 -sprintf(name, "sve%d", vq * 128);
 +sprintf(name, "sve%u", vq * 128);
  error = g_strdup_printf("cannot disable %s", name);
  assert_error(qts, "host", error,
   "{ %s: true, %s: false }",
 @@ -569,7 +569,7 @@ static void test_query_cpu_model_expansion_kvm(const 
 void *data)
   * we need at least one vector length enabled.
   */
  vq = __builtin_ffsll(vls);
 -sprintf(name, "sve%d", vq * 128);
 +sprintf(name, "sve%u", vq * 128);
  error = g_strdup_printf("cannot disable %s", name);
  assert_error(qts, "host", error, "{ %s: false }", name);
  g_free(error);
 @@ -581,7 +581,7 @@ static void test_query_cpu_model_expansion_kvm(const 
 void *data)
  }
  }
  if (vq <= SVE_MAX_VQ) {
 -sprintf(name, "sve%d", vq * 128);
 +sprintf(name, "sve%u", vq * 128);
  error = g_strdup_printf("cannot enable %s", name);
  assert_error(qts, "host", error, "{ %s: true }", name);
  g_free(error);

>>>
>>> max_vq and vq are both "uint32_t" and not "unsigned int" ... so if you want
>>> to fix this really really correctly, please use PRIu32 from inttypes.h 
>>> instead.
>>>
>> 
>> Hi Thomas,
>> Thanks for your review.
>> According to the definition of the macro PRIu32(# define PRIu32 "u"),
>> using PRIu32 works the same as using %u to print, and using PRIu32 to print
>> is relatively rare in QEMU(%u 720, PRIu32 only 120). Can we continue to use 
>> %u to
>> print max_vq and vq in this patch.
>> Of course, this is just my small small suggestion. If you think it is better 
>> to use
>> PRIu32 for printing, I will send patch V2.
>
> Well, %u happens to work since "int" is 32-bit with all current compilers
> that we support.

Yes, it works.

>  But if there is ever a compiler where the size of int is
> different, you'll get a compiler warning here again.

No, we won't.

If we ever use a compiler where int is narrower than 32 bits, then the
type of the argument is actually uint32_t[1].  We can forget about this
case, because "int narrower than 32 bits" is not going to fly with our
code base.

If we ever use a compiler where int is wider than 32 bits, then the type
of the argument is *not* uint32_t[2].  PRIu32 will work anyway, because
it will actually retrieve an unsigned int argument, *not* an uint32_t
argument[3].

In other words "%" PRIu32 is just a less legible alias for "%u" in all
cases that matter.

And that's why I would simply use "%u".

>  So if we now fix this
> up, then let's do it really right and use PRIu32, please.
>
>  Thomas


[1] Because promotion does nothing to either argument, and the usual
arithmetic conversions convert to uint32_t.  See my first reply.

[2] Because uint32_t gets promoted to unsigned int.  See my first reply.

[3] Because variable arguments undergo default argument promotion (§
6.5.2.2 Function calls), which promotes uint32_t to unsigned int.
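
The footnotes above can be checked with a small standalone sketch. The helper
names (sve_name_u, sve_name_pri, sve_names_agree) are made up for illustration
and are not part of the test being patched. The sketch formats the same
uint32_t expression once with "%u" and once with PRIu32 and compares the
results; it assumes a platform where int is 32 bits wide, as on all hosts QEMU
supports.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the patch): format with plain "%u". */
static void sve_name_u(char *buf, size_t len, uint32_t vq)
{
    assert(len > 0);
    /* With 32-bit int, vq * 128 is an unsigned int, so "%u" matches. */
    snprintf(buf, len, "sve%u", vq * 128);
}

/* Hypothetical helper: same output, spelled with the <inttypes.h> macro. */
static void sve_name_pri(char *buf, size_t len, uint32_t vq)
{
    assert(len > 0);
    snprintf(buf, len, "sve%" PRIu32, vq * 128);
}

/* Returns 1 when both spellings produce the same string. */
static int sve_names_agree(uint32_t vq)
{
    char a[16], b[16];

    sve_name_u(a, sizeof(a), vq);
    sve_name_pri(b, sizeof(b), vq);
    return strcmp(a, b) == 0;
}
```

Calling sve_names_agree() for any vector quadword count used by the test
should return 1 under the stated assumption.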




Re: [PATCH] migration/dirtyrate: simplify inlcudes in dirtyrate.c

2020-11-05 Thread Zheng Chuan
Kind ping so that this trivial fix is not forgotten :)

On 2020/10/30 22:09, Mark Kanda wrote:
> On 10/29/2020 10:58 PM, Chuan Zheng wrote:
>> Remove redundant blank line which is left by Commit 662770af7c6e8c,
>> also take this opportunity to remove redundant includes in dirtyrate.c.
>>
>> Signed-off-by: Chuan Zheng 
> 
> Reviewed-by: Mark Kanda 
> 
>> ---
>>   migration/dirtyrate.c | 5 -
>>   1 file changed, 5 deletions(-)
>>
>> diff --git a/migration/dirtyrate.c b/migration/dirtyrate.c
>> index 8f728d2..ccb9814 100644
>> --- a/migration/dirtyrate.c
>> +++ b/migration/dirtyrate.c
>> @@ -11,17 +11,12 @@
>>    */
>>     #include "qemu/osdep.h"
>> -
>>   #include 
>>   #include "qapi/error.h"
>>   #include "cpu.h"
>> -#include "qemu/config-file.h"
>> -#include "exec/memory.h"
>>   #include "exec/ramblock.h"
>> -#include "exec/target_page.h"
>>   #include "qemu/rcu_queue.h"
>>   #include "qapi/qapi-commands-migration.h"
>> -#include "migration.h"
>>   #include "ram.h"
>>   #include "trace.h"
>>   #include "dirtyrate.h"
>>

-- 
Regards.
Chuan



[PATCH] scripts/checkpatch.pl: Modify the line length limit of the code

2020-11-05 Thread Gan Qixin
Modify the rule that limits line length as follows:

--add a variable max_line_length for the line length limit and set it to 100
by default
--when a line exceeds max_line_length, output a warning instead of an error
--an if/while/etc. opening brace must not go on the next line, whether or not
the line exceeds max_line_length

Signed-off-by: Gan Qixin 
---
 scripts/checkpatch.pl | 18 +-
 1 file changed, 5 insertions(+), 13 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 88c858f67c..84a72d47ad 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -35,6 +35,7 @@ my $summary_file = 0;
 my $root;
 my %debug;
 my $help = 0;
+my $max_line_length = 100;
 
 sub help {
my ($exitcode) = @_;
@@ -1628,17 +1629,13 @@ sub process {
 # check we are in a valid source file if not then ignore this hunk
next if ($realfile !~ /$SrcFile/);
 
-#90 column limit; exempt URLs, if no other words on line
+#$max_line_length column limit; exempt URLs, if no other words on line
if ($line =~ /^\+/ &&
!($line =~ /^\+\s*"[^"]*"\s*(?:\s*|,|\)\s*;)\s*$/) &&
!($rawline =~ /^[^[:alnum:]]*https?:\S*$/) &&
-   $length > 80)
+   $length > $max_line_length)
{
-   if ($length > 90) {
-   ERROR("line over 90 characters\n" . $herecurr);
-   } else {
-   WARN("line over 80 characters\n" . $herecurr);
-   }
+   WARN("line over $max_line_length characters\n" . $herecurr);
}
 
 # check for spaces before a quoted newline
@@ -1831,13 +1828,8 @@ sub process {
#print "realcnt<$realcnt> ctx_cnt<$ctx_cnt>\n";
#print "pre<$pre_ctx>\nline<$line>\nctx<$ctx>\nnext<$lines[$ctx_ln - 1]>\n";
 
-   # The length of the "previous line" is checked against 80 because it
-   # includes the + at the beginning of the line (if the actual line has
-   # 79 or 80 characters, it is no longer possible to add a space and an
-   # opening brace there)
if ($#ctx == 0 && $ctx !~ /{\s*/ &&
-   defined($lines[$ctx_ln - 1]) && $lines[$ctx_ln - 1] =~ /^\+\s*\{/ &&
-   defined($lines[$ctx_ln - 2]) && length($lines[$ctx_ln - 2]) < 80) {
+   defined($lines[$ctx_ln - 1]) && $lines[$ctx_ln - 1] =~ /^\+\s*\{/) {
ERROR("that open brace { should be on the previous line\n" .
"$here\n$ctx\n$rawlines[$ctx_ln - 1]\n");
}
-- 
2.23.0




[PATCH] migration/multifd: fix hangup with TLS-Multifd due to blocking handshake

2020-11-05 Thread Chuan Zheng
The QEMU main loop can hang forever when TLS and multifd are both enabled.
The Src multifd_send_0 starts the TLS handshake: it sends the hello to the
server and waits for the response.
However, the Dst main QEMU loop is already blocked in recvmsg() for
multifd_recv_1.
Both the Src and Dst main loops are then blocked waiting for a response,
which results in hanging forever.

Src (multifd_send_0):
multifd_channel_connect
  multifd_tls_channel_connect
    multifd_tls_channel_connect
      qio_channel_tls_handshake
        qio_channel_tls_handshake_task
          qcrypto_tls_session_handshake
            gnutls_handshake
              ...
                recvmsg (Blocking I/O waiting for response)

Dst (multifd_recv_1):
migration_channel_process_incoming
  migration_tls_channel_process_incoming
    qio_channel_tls_handshake_task
      gnutls_handshake
        ...
          recvmsg (Blocking I/O waiting for response)

Fix this by offloading the handshake work to a background thread.

Reported-by: Yan Jin 
Suggested-by: Daniel P. Berrangé 
Signed-off-by: Chuan Zheng 
---
 migration/multifd.c | 23 +--
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/migration/multifd.c b/migration/multifd.c
index 68b171f..88486b9 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -739,6 +739,19 @@ static void multifd_tls_outgoing_handshake(QIOTask *task,
 multifd_channel_connect(p, ioc, err);
 }
 
+static void *multifd_tls_handshake_thread(void *opaque)
+{
+MultiFDSendParams *p = opaque;
+QIOChannelTLS *tioc = QIO_CHANNEL_TLS(p->c);
+
+qio_channel_tls_handshake(tioc,
+  multifd_tls_outgoing_handshake,
+  p,
+  NULL,
+  NULL);
+return NULL;
+}
+
 static void multifd_tls_channel_connect(MultiFDSendParams *p,
 QIOChannel *ioc,
 Error **errp)
@@ -754,12 +767,10 @@ static void multifd_tls_channel_connect(MultiFDSendParams *p,
 
 trace_multifd_tls_outgoing_handshake_start(ioc, tioc, hostname);
 qio_channel_set_name(QIO_CHANNEL(tioc), "multifd-tls-outgoing");
-qio_channel_tls_handshake(tioc,
-  multifd_tls_outgoing_handshake,
-  p,
-  NULL,
-  NULL);
-
+p->c = QIO_CHANNEL(tioc);
+qemu_thread_create(&p->thread, "multifd-tls-handshake-worker",
+   multifd_tls_handshake_thread, p,
+   QEMU_THREAD_JOINABLE);
 }
 
 static bool multifd_channel_connect(MultiFDSendParams *p,
-- 
1.8.3.1
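
The idea behind the fix (do the blocking wait in a separate thread so the
caller stays responsive) can be sketched outside QEMU with plain POSIX calls.
This is only an illustration under stated assumptions: it uses pthreads and a
pipe rather than QEMU's qemu_thread_create() and QIOChannel, and the names
handshake_ctx, handshake_thread and run_offloaded_handshake are hypothetical.

```c
#include <pthread.h>
#include <unistd.h>

/* Hypothetical stand-in for a blocking handshake: waits for one byte. */
struct handshake_ctx {
    int read_fd;   /* read end of a pipe the "peer" writes to */
    char reply;    /* byte received from the peer */
    int ok;        /* set to 1 once the reply has arrived */
};

static void *handshake_thread(void *opaque)
{
    struct handshake_ctx *ctx = opaque;

    /* Blocks here, like recvmsg() in the trace, but only this thread waits. */
    if (read(ctx->read_fd, &ctx->reply, 1) == 1) {
        ctx->ok = 1;
    }
    return NULL;
}

/* Returns 1 if the offloaded "handshake" completed, 0 otherwise. */
static int run_offloaded_handshake(void)
{
    struct handshake_ctx ctx = { .read_fd = -1, .reply = 0, .ok = 0 };
    pthread_t tid;
    ssize_t written;
    int fds[2];

    if (pipe(fds) != 0) {
        return 0;
    }
    ctx.read_fd = fds[0];

    if (pthread_create(&tid, NULL, handshake_thread, &ctx) != 0) {
        close(fds[0]);
        close(fds[1]);
        return 0;
    }

    /*
     * The caller is not blocked, so it can keep working. Here it plays
     * the peer and sends the byte the worker is waiting for.
     */
    written = write(fds[1], "h", 1);
    close(fds[1]);             /* on write failure the worker sees EOF */

    pthread_join(tid, NULL);   /* after the join, ctx is safe to read */
    close(fds[0]);
    return written == 1 && ctx.ok && ctx.reply == 'h';
}
```

If the caller performed the read() itself before writing, it would wait
forever, which is the same shape as the hang described in the commit message.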




Re: Emulation for riscv

2020-11-05 Thread Palmer Dabbelt

On Thu, 22 Oct 2020 17:56:38 PDT (-0700), alistai...@gmail.com wrote:

On Thu, Oct 22, 2020 at 4:58 PM Moises Arreola  wrote:


Hello everyone, my name is Moses and I'm trying to set up a VM for a RISC-V
processor. I'm using the RISC-V Getting Started Guide, and on the final step I'm
getting an error while trying to launch the virtual machine using the cmd:


Hello,

Please don't use the RISC-V Getting Started Guide. Pretty much all of
the information there is out of date and wrong. Unfortunately we are
unable to correct it.

The QEMU wiki is a much better place for information:
https://wiki.qemu.org/Documentation/Platforms/RISCV


Ya, everything at riscv.org is useless.  It's best to stick to the open source
documentation, as when that gets out of date we can at least fix it.  Using a
distro helps a lot here, the wiki describes how to run a handful of popular
ones that were ported to RISC-V early but if your favorite isn't on the list
then it may have its own documentation somewhere else.


sudo qemu-system-riscv64 -nographic -machine virt \
-kernel linux/arch/riscv/boot/Image -append "root=/dev/vda ro console=ttyS0" \
-drive file=busybox,format=raw,id=hd0 \
-device virtio-blk-device,drive=hd0

But what I get in return is a message telling me that the file I gave wasn't 
the right one, the actual output is:

qemu-system-riscv64: -drive file=busybox,format=raw,id=hd0: A regular file was 
expected by the 'file' driver, but something else was given

And I checked the file busybox with the cmd "file" and got the following:
busybox: ELF 64-bit LSB executable, UCB RISC-V, version 1 (SYSV), dynamically 
linked, interpreter /lib/ld-linux-riscv64-lp64d.so.1, for GNU/Linux 4.15.0, 
stripped


That looks like an ELF, which won't work when attached as a drive.

How are you building this rootFS?

Alistair



So I was wondering if the error message was related to qemu.
Thanks in advance for answering; any suggestions are welcome.




[PATCH V17 3/6] hw/mips: Implement fw_cfg_arch_key_name()

2020-11-05 Thread Huacai Chen
Implement fw_cfg_arch_key_name(), which returns the name of a
mips-specific key.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Huacai Chen 
Co-developed-by: Jiaxun Yang 
Signed-off-by: Jiaxun Yang 
---
 hw/mips/fw_cfg.c| 35 +++
 hw/mips/fw_cfg.h| 19 +++
 hw/mips/meson.build |  2 +-
 3 files changed, 55 insertions(+), 1 deletion(-)
 create mode 100644 hw/mips/fw_cfg.c
 create mode 100644 hw/mips/fw_cfg.h

diff --git a/hw/mips/fw_cfg.c b/hw/mips/fw_cfg.c
new file mode 100644
index 000..67c4a74
--- /dev/null
+++ b/hw/mips/fw_cfg.c
@@ -0,0 +1,35 @@
+/*
+ * QEMU fw_cfg helpers (MIPS specific)
+ *
+ * Copyright (c) 2020 Lemote, Inc.
+ *
+ * Author:
+ *   Huacai Chen (che...@lemote.com)
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/mips/fw_cfg.h"
+#include "hw/nvram/fw_cfg.h"
+
+const char *fw_cfg_arch_key_name(uint16_t key)
+{
+static const struct {
+uint16_t key;
+const char *name;
+} fw_cfg_arch_wellknown_keys[] = {
+{FW_CFG_MACHINE_VERSION, "machine_version"},
+{FW_CFG_CPU_FREQ, "cpu_frequency"},
+};
+
+for (size_t i = 0; i < ARRAY_SIZE(fw_cfg_arch_wellknown_keys); i++) {
+if (fw_cfg_arch_wellknown_keys[i].key == key) {
+return fw_cfg_arch_wellknown_keys[i].name;
+}
+}
+return NULL;
+}
diff --git a/hw/mips/fw_cfg.h b/hw/mips/fw_cfg.h
new file mode 100644
index 000..e317d5b
--- /dev/null
+++ b/hw/mips/fw_cfg.h
@@ -0,0 +1,19 @@
+/*
+ * QEMU fw_cfg helpers (MIPS specific)
+ *
+ * Copyright (c) 2020 Huacai Chen
+ *
+ * SPDX-License-Identifier: MIT
+ */
+
+#ifndef HW_MIPS_FW_CFG_H
+#define HW_MIPS_FW_CFG_H
+
+#include "hw/boards.h"
+#include "hw/nvram/fw_cfg.h"
+
+/* Data for BIOS to identify machine */
+#define FW_CFG_MACHINE_VERSION  (FW_CFG_ARCH_LOCAL + 0)
+#define FW_CFG_CPU_FREQ (FW_CFG_ARCH_LOCAL + 1)
+
+#endif
diff --git a/hw/mips/meson.build b/hw/mips/meson.build
index bcdf96b..0e9f930 100644
--- a/hw/mips/meson.build
+++ b/hw/mips/meson.build
@@ -1,5 +1,5 @@
 mips_ss = ss.source_set()
-mips_ss.add(files('addr.c', 'mips_int.c'))
+mips_ss.add(files('addr.c', 'mips_int.c', 'fw_cfg.c'))
 mips_ss.add(when: 'CONFIG_FULOONG', if_true: files('fuloong2e.c'))
 mips_ss.add(when: 'CONFIG_JAZZ', if_true: files('jazz.c'))
 mips_ss.add(when: 'CONFIG_MALTA', if_true: files('gt64xxx_pci.c', 'malta.c'))
-- 
2.7.0
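
For anyone wanting to try the lookup outside the tree, here is a standalone
sketch of the same table-driven approach. The SKETCH_* names are made up, and
the base value 0x8000 is an assumption standing in for FW_CFG_ARCH_LOCAL; the
real definitions come from QEMU's headers.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed base value for illustration only. */
#define SKETCH_ARCH_LOCAL       0x8000
#define SKETCH_MACHINE_VERSION  (SKETCH_ARCH_LOCAL + 0)
#define SKETCH_CPU_FREQ         (SKETCH_ARCH_LOCAL + 1)
#define SKETCH_ARRAY_SIZE(a)    (sizeof(a) / sizeof((a)[0]))

/* Returns the name of a known key, or NULL if the key is not in the table. */
static const char *sketch_key_name(uint16_t key)
{
    static const struct {
        uint16_t key;
        const char *name;
    } known[] = {
        { SKETCH_MACHINE_VERSION, "machine_version" },
        { SKETCH_CPU_FREQ, "cpu_frequency" },
    };

    /* A linear scan is fine for a table this small. */
    for (size_t i = 0; i < SKETCH_ARRAY_SIZE(known); i++) {
        if (known[i].key == key) {
            return known[i].name;
        }
    }
    return NULL;
}
```

Callers have to handle the NULL return for keys that are not listed.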




[PATCH V17 6/6] docs/system: Update MIPS machine documentation

2020-11-05 Thread Huacai Chen
Update MIPS machine documentation to add Loongson-3 based machine description.

Signed-off-by: Huacai Chen 
---
 docs/system/target-mips.rst | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/docs/system/target-mips.rst b/docs/system/target-mips.rst
index cd2a931..138441b 100644
--- a/docs/system/target-mips.rst
+++ b/docs/system/target-mips.rst
@@ -84,6 +84,16 @@ The Fuloong 2E emulation supports:
 
 -  RTL8139D as a network card chipset
 
+The Loongson-3 virtual platform emulation supports:
+
+-  Loongson 3A CPU
+
+-  LIOINTC as interrupt controller
+
+-  GPEX and virtio as peripheral devices
+
+-  Both KVM and TCG supported
+
 The mipssim pseudo board emulation provides an environment similar to
 what the proprietary MIPS emulator uses for running Linux. It supports:
 
-- 
2.7.0




[PATCH V17 2/6] hw/intc: Rework Loongson LIOINTC

2020-11-05 Thread Huacai Chen
As suggested by Philippe Mathieu-Daudé, rework Loongson's liointc:
1, Move macro definitions to loongson_liointc.h;
2, Remove magic values and use macros instead;
3, Replace dead D() code by trace events.

Suggested-by: Philippe Mathieu-Daudé 
Signed-off-by: Huacai Chen 
---
 hw/intc/loongson_liointc.c | 49 +++---
 include/hw/intc/loongson_liointc.h | 39 ++
 2 files changed, 53 insertions(+), 35 deletions(-)
 create mode 100644 include/hw/intc/loongson_liointc.h

diff --git a/hw/intc/loongson_liointc.c b/hw/intc/loongson_liointc.c
index fbbfb57..be29e2f 100644
--- a/hw/intc/loongson_liointc.c
+++ b/hw/intc/loongson_liointc.c
@@ -1,6 +1,7 @@
 /*
  * QEMU Loongson Local I/O interrupt controler.
  *
+ * Copyright (c) 2020 Huacai Chen 
  * Copyright (c) 2020 Jiaxun Yang 
  *
  * This program is free software: you can redistribute it and/or modify
@@ -19,33 +20,11 @@
  */
 
 #include "qemu/osdep.h"
-#include "hw/sysbus.h"
 #include "qemu/module.h"
+#include "qemu/log.h"
 #include "hw/irq.h"
 #include "hw/qdev-properties.h"
-#include "qom/object.h"
-
-#define D(x)
-
-#define NUM_IRQS32
-
-#define NUM_CORES   4
-#define NUM_IPS 4
-#define NUM_PARENTS (NUM_CORES * NUM_IPS)
-#define PARENT_COREx_IPy(x, y)  (NUM_IPS * x + y)
-
-#define R_MAPPER_START  0x0
-#define R_MAPPER_END0x20
-#define R_ISR   R_MAPPER_END
-#define R_IEN   0x24
-#define R_IEN_SET   0x28
-#define R_IEN_CLR   0x2c
-#define R_PERCORE_ISR(x)(0x40 + 0x8 * x)
-#define R_END   0x64
-
-#define TYPE_LOONGSON_LIOINTC "loongson.liointc"
-DECLARE_INSTANCE_CHECKER(struct loongson_liointc, LOONGSON_LIOINTC,
- TYPE_LOONGSON_LIOINTC)
+#include "hw/intc/loongson_liointc.h"
 
 struct loongson_liointc {
 SysBusDevice parent_obj;
@@ -123,14 +102,13 @@ liointc_read(void *opaque, hwaddr addr, unsigned int size)
 goto out;
 }
 
-/* Rest is 4 byte */
+/* Rest are 4 bytes */
 if (size != 4 || (addr % 4)) {
 goto out;
 }
 
-if (addr >= R_PERCORE_ISR(0) &&
-addr < R_PERCORE_ISR(NUM_CORES)) {
-int core = (addr - R_PERCORE_ISR(0)) / 8;
+if (addr >= R_START && addr < R_END) {
+int core = (addr - R_START) / R_ISR_SIZE;
 r = p->per_core_isr[core];
 goto out;
 }
@@ -147,7 +125,8 @@ liointc_read(void *opaque, hwaddr addr, unsigned int size)
 }
 
 out:
-D(qemu_log("%s: size=%d addr=%lx val=%x\n", __func__, size, addr, r));
+qemu_log_mask(CPU_LOG_INT, "%s: size=%d addr=%lx val=%x\n",
+  __func__, size, addr, r);
 return r;
 }
 
@@ -158,7 +137,8 @@ liointc_write(void *opaque, hwaddr addr,
 struct loongson_liointc *p = opaque;
 uint32_t value = val64;
 
-D(qemu_log("%s: size=%d, addr=%lx val=%x\n", __func__, size, addr, value));
+qemu_log_mask(CPU_LOG_INT, "%s: size=%d, addr=%lx val=%x\n",
+  __func__, size, addr, value);
 
 /* Mapper is 1 byte */
 if (size == 1 && addr < R_MAPPER_END) {
@@ -166,14 +146,13 @@ liointc_write(void *opaque, hwaddr addr,
 goto out;
 }
 
-/* Rest is 4 byte */
+/* Rest are 4 bytes */
 if (size != 4 || (addr % 4)) {
 goto out;
 }
 
-if (addr >= R_PERCORE_ISR(0) &&
-addr < R_PERCORE_ISR(NUM_CORES)) {
-int core = (addr - R_PERCORE_ISR(0)) / 8;
+if (addr >= R_START && addr < R_END) {
+int core = (addr - R_START) / R_ISR_SIZE;
 p->per_core_isr[core] = value;
 goto out;
 }
@@ -224,7 +203,7 @@ static void loongson_liointc_init(Object *obj)
 }
 
 memory_region_init_io(&p->mmio, obj, &pic_ops, p,
- "loongson.liointc", R_END);
+ TYPE_LOONGSON_LIOINTC, R_END);
 sysbus_init_mmio(SYS_BUS_DEVICE(obj), &p->mmio);
 }
 
diff --git a/include/hw/intc/loongson_liointc.h b/include/hw/intc/loongson_liointc.h
new file mode 100644
index 000..e11f482
--- /dev/null
+++ b/include/hw/intc/loongson_liointc.h
@@ -0,0 +1,39 @@
+/*
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file "COPYING" in the main directory of this archive
+ * for more details.
+ *
+ * Copyright (c) 2020 Huacai Chen 
+ * Copyright (c) 2020 Jiaxun Yang 
+ *
+ */
+
+#ifndef LOONGSON_LIOINTC_H
+#define LOONGSON_LIOINTC_H
+
+#include "qemu/units.h"
+#include "hw/sysbus.h"
+#include "qom/object.h"
+
+#define NUM_IRQS32
+
+#define NUM_CORES   4
+#define NUM_IPS 4
+#define NUM_PARENTS (NUM_CORES * NUM_IPS)
+#define PARENT_COREx_IPy(x, y)  (NUM_IPS * x + y)
+
+#define R_MAPPER_START  0x0
+#define R_MAPPER_END0x20
+#define R_ISR   R_MAPPER_END
+#define R_IEN   0x24
+#define R_IEN_SET   0x28
+#defin
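
The per-core ISR address decoding can be tried in isolation. The sketch below
uses the macro values from the lines removed in this patch (R_PERCORE_ISR and
NUM_CORES), because the new header is cut off above and the values of R_START
and R_ISR_SIZE are not visible here. The function name percore_isr_index is
hypothetical.

```c
#include <stdint.h>

#define NUM_CORES         4
#define R_PERCORE_ISR(x)  (0x40 + 0x8 * (x))

/*
 * Returns the core index for an address inside the per-core ISR window,
 * or -1 if the address lies outside it. Each core owns an 8-byte slot.
 */
static int percore_isr_index(uint64_t addr)
{
    if (addr >= R_PERCORE_ISR(0) && addr < R_PERCORE_ISR(NUM_CORES)) {
        return (int)((addr - R_PERCORE_ISR(0)) / 8);
    }
    return -1;
}
```

With these values the window covers 0x40 up to, but not including, 0x60.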

[PATCH V17 5/6] hw/mips: Add Loongson-3 machine support

2020-11-05 Thread Huacai Chen
Add Loongson-3 based machine support. It uses liointc as the interrupt
controller and GPEX as the PCI controller. Currently it works with
both TCG and KVM.

As the machine model is not based on any existing physical hardware, the
name of the machine is "loongson3-virt". It may be superseded in future
by a real machine model. If this happens, then a regular deprecation
procedure shall occur for the "loongson3-virt" machine.

We already have a fully functional Linux kernel (based on Linux-5.4.x
LTS) here:

https://github.com/chenhuacai/linux

Of course the upstream kernel is also usable (the kvm host side and
guest side have both been upstream in Linux-5.9):

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

How to use QEMU/Loongson-3?
1, Download kernel source from the above URL;
2, Build a kernel with arch/mips/configs/loongson3_defconfig;
3, Boot a Loongson-3A4000 host with this kernel (for KVM mode);
4, Build QEMU-master with this patchset;
5, modprobe kvm (only necessary for KVM mode);
6, Use QEMU with TCG:
   qemu-system-mips64el -M loongson3-virt,accel=tcg -cpu Loongson-3A1000 -kernel  -append ...
   Use QEMU with KVM:
   qemu-system-mips64el -M loongson3-virt,accel=kvm -cpu Loongson-3A4000 -kernel  -append ...

   The "-cpu" parameter is optional here and QEMU will use the correct type for TCG/KVM automatically.

Signed-off-by: Huacai Chen 
Co-developed-by: Jiaxun Yang 
Signed-off-by: Jiaxun Yang 
---
 default-configs/devices/mips64el-softmmu.mak |   1 +
 hw/mips/Kconfig  |  12 +
 hw/mips/loongson3_virt.c | 614 +++
 hw/mips/meson.build  |   2 +-
 4 files changed, 628 insertions(+), 1 deletion(-)
 create mode 100644 hw/mips/loongson3_virt.c

diff --git a/default-configs/devices/mips64el-softmmu.mak b/default-configs/devices/mips64el-softmmu.mak
index 9f8a3ef..26c660a 100644
--- a/default-configs/devices/mips64el-softmmu.mak
+++ b/default-configs/devices/mips64el-softmmu.mak
@@ -3,6 +3,7 @@
 include mips-softmmu-common.mak
 CONFIG_IDE_VIA=y
 CONFIG_FULOONG=y
+CONFIG_LOONGSON3V=y
 CONFIG_ATI_VGA=y
 CONFIG_RTL8139_PCI=y
 CONFIG_JAZZ=y
diff --git a/hw/mips/Kconfig b/hw/mips/Kconfig
index 8be7012..ef5cee1 100644
--- a/hw/mips/Kconfig
+++ b/hw/mips/Kconfig
@@ -32,6 +32,18 @@ config FULOONG
 bool
 select PCI_BONITO
 
+config LOONGSON3V
+bool
+select PCKBD
+select SERIAL
+select GOLDFISH_RTC
+select LOONGSON_LIOINTC
+select PCI_DEVICES
+select PCI_EXPRESS_GENERIC_BRIDGE
+select VIRTIO_VGA
+select QXL if SPICE
+select MSI_NONBROKEN
+
 config MIPS_CPS
 bool
 select PTIMER
diff --git a/hw/mips/loongson3_virt.c b/hw/mips/loongson3_virt.c
new file mode 100644
index 000..c5db2db
--- /dev/null
+++ b/hw/mips/loongson3_virt.c
@@ -0,0 +1,614 @@
+/*
+ * Generic Loongson-3 Platform support
+ *
+ * Copyright (c) 2018-2020 Huacai Chen (che...@lemote.com)
+ * Copyright (c) 2018-2020 Jiaxun Yang 
+ *
+ * This program is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see .
+ */
+
+/*
+ * Generic virtualized PC Platform based on Loongson-3 CPU (MIPS64R2 with
+ * extensions, 800~2000MHz)
+ */
+
+#include "qemu/osdep.h"
+#include "qemu-common.h"
+#include "qemu/units.h"
+#include "qemu/cutils.h"
+#include "qapi/error.h"
+#include "cpu.h"
+#include "elf.h"
+#include "kvm_mips.h"
+#include "hw/boards.h"
+#include "hw/char/serial.h"
+#include "hw/intc/loongson_liointc.h"
+#include "hw/mips/mips.h"
+#include "hw/mips/cpudevs.h"
+#include "hw/mips/fw_cfg.h"
+#include "hw/mips/loongson3_bootp.h"
+#include "hw/misc/unimp.h"
+#include "hw/intc/i8259.h"
+#include "hw/loader.h"
+#include "hw/isa/superio.h"
+#include "hw/pci/msi.h"
+#include "hw/pci/pci.h"
+#include "hw/pci/pci_host.h"
+#include "hw/pci-host/gpex.h"
+#include "hw/usb.h"
+#include "net/net.h"
+#include "exec/address-spaces.h"
+#include "sysemu/kvm.h"
+#include "sysemu/qtest.h"
+#include "sysemu/reset.h"
+#include "sysemu/runstate.h"
+#include "qemu/log.h"
+#include "qemu/error-report.h"
+
+#define PM_CNTL_MODE  0x10
+
+#define LOONGSON_MAX_VCPUS  16
+
+/*
+ * Loongson-3's virtual machine BIOS can be obtained here:
+ * 1, https://github.com/loongson-community/firmware-nonfree
+ * 2, http://dev.lemote.com:8000/files/firmware/UEFI/KVM/bios_loongson3.bin
+ */
+#define LOONGSON3_BIOSNAME "bios_loongson3.bin"

[PATCH V17 4/6] hw/mips: Add Loongson-3 boot parameter helpers

2020-11-05 Thread Huacai Chen
Preparing to add Loongson-3 machine support, add Loongson-3's LEFI (a
UEFI-like interface for BIOS-Kernel boot parameters) helpers first.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Huacai Chen 
Co-developed-by: Jiaxun Yang 
Signed-off-by: Jiaxun Yang 
---
 hw/mips/loongson3_bootp.c | 165 +++
 hw/mips/loongson3_bootp.h | 241 ++
 hw/mips/meson.build   |   1 +
 3 files changed, 407 insertions(+)
 create mode 100644 hw/mips/loongson3_bootp.c
 create mode 100644 hw/mips/loongson3_bootp.h

diff --git a/hw/mips/loongson3_bootp.c b/hw/mips/loongson3_bootp.c
new file mode 100644
index 000..3a16081
--- /dev/null
+++ b/hw/mips/loongson3_bootp.c
@@ -0,0 +1,165 @@
+/*
+ * LEFI (a UEFI-like interface for BIOS-Kernel boot parameters) helpers
+ *
+ * Copyright (c) 2018-2020 Huacai Chen (che...@lemote.com)
+ * Copyright (c) 2018-2020 Jiaxun Yang 
+ *
+ * This program is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see .
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/units.h"
+#include "qemu/cutils.h"
+#include "cpu.h"
+#include "hw/boards.h"
+#include "hw/mips/loongson3_bootp.h"
+
+#define LOONGSON3_CORE_PER_NODE 4
+
+static struct efi_cpuinfo_loongson *init_cpu_info(void *g_cpuinfo, uint64_t cpu_freq)
+{
+struct efi_cpuinfo_loongson *c = g_cpuinfo;
+
+stl_le_p(&c->cputype, Loongson_3A);
+stl_le_p(&c->processor_id, MIPS_CPU(first_cpu)->env.CP0_PRid);
+if (cpu_freq > UINT_MAX) {
+stl_le_p(&c->cpu_clock_freq, UINT_MAX);
+} else {
+stl_le_p(&c->cpu_clock_freq, cpu_freq);
+}
+
+stw_le_p(&c->cpu_startup_core_id, 0);
+stl_le_p(&c->nr_cpus, current_machine->smp.cpus);
+stl_le_p(&c->total_node, DIV_ROUND_UP(current_machine->smp.cpus,
+  LOONGSON3_CORE_PER_NODE));
+
+return c;
+}
+
+static struct efi_memory_map_loongson *init_memory_map(void *g_map, uint64_t ram_size)
+{
+struct efi_memory_map_loongson *emap = g_map;
+
+stl_le_p(&emap->nr_map, 2);
+stl_le_p(&emap->mem_freq, 3);
+
+stl_le_p(&emap->map[0].node_id, 0);
+stl_le_p(&emap->map[0].mem_type, 1);
+stq_le_p(&emap->map[0].mem_start, 0x0);
+stl_le_p(&emap->map[0].mem_size, 240);
+
+stl_le_p(&emap->map[1].node_id, 0);
+stl_le_p(&emap->map[1].mem_type, 2);
+stq_le_p(&emap->map[1].mem_start, 0x9000);
+stl_le_p(&emap->map[1].mem_size, (ram_size / MiB) - 256);
+
+return emap;
+}
+
+static struct system_loongson *init_system_loongson(void *g_system)
+{
+struct system_loongson *s = g_system;
+
+stl_le_p(&s->ccnuma_smp, 0);
+stl_le_p(&s->sing_double_channel, 1);
+stl_le_p(&s->nr_uarts, 1);
+stl_le_p(&s->uarts[0].iotype, 2);
+stl_le_p(&s->uarts[0].int_offset, 2);
+stl_le_p(&s->uarts[0].uartclk, 2500); /* Random value */
+stq_le_p(&s->uarts[0].uart_base, virt_memmap[VIRT_UART].base);
+
+return s;
+}
+
+static struct irq_source_routing_table *init_irq_source(void *g_irq_source)
+{
+struct irq_source_routing_table *irq_info = g_irq_source;
+
+stl_le_p(&irq_info->node_id, 0);
+stl_le_p(&irq_info->PIC_type, 0);
+stw_le_p(&irq_info->dma_mask_bits, 64);
+stq_le_p(&irq_info->pci_mem_start_addr, virt_memmap[VIRT_PCIE_MMIO].base);
+stq_le_p(&irq_info->pci_mem_end_addr, virt_memmap[VIRT_PCIE_MMIO].base +
+  virt_memmap[VIRT_PCIE_MMIO].size - 1);
+stq_le_p(&irq_info->pci_io_start_addr, virt_memmap[VIRT_PCIE_PIO].base);
+
+return irq_info;
+}
+
+static struct interface_info *init_interface_info(void *g_interface)
+{
+struct interface_info *interface = g_interface;
+
+stw_le_p(&interface->vers, 0x01);
+strpadcpy(interface->description, 64, "UEFI_Version_v1.0", '\0');
+
+return interface;
+}
+
+static struct board_devices *board_devices_info(void *g_board)
+{
+struct board_devices *bd = g_board;
+
+strpadcpy(bd->name, 64, "Loongson-3A-VIRT-1w-V1.00-demo", '\0');
+
+return bd;
+}
+
+static struct loongson_special_attribute *init_special_info(void *g_special)
+{
+struct loongson_special_attribute *special = g_special;
+
+strpadcpy(special->special_name, 64, "2018-04-01", '\0');
+
+return special;
+}
+
+void init_loongson_params(struct loongson_params *lp, void *p,
+  uint64_t cpu_freq, uint64_t ram_size)
+{
+  
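
For readers who want to try the field-filling conventions used by the helpers above outside of QEMU, here is a minimal standalone sketch. It assumes nothing from the QEMU tree: store_le32(), clamp_freq() and node_count() are illustrative names (stand-ins for stl_le_p(), the UINT_MAX clamp in init_cpu_info() and the DIV_ROUND_UP() call), and CORES_PER_NODE mirrors LOONGSON3_CORE_PER_NODE.

```c
#include <assert.h>
#include <stdint.h>

#define CORES_PER_NODE 4   /* mirrors LOONGSON3_CORE_PER_NODE in the patch */

/* Store a 32-bit value in little-endian byte order, whatever the host is. */
static void store_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)((v >> 8) & 0xff);
    p[2] = (uint8_t)((v >> 16) & 0xff);
    p[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Clamp a 64-bit frequency so that it fits a 32-bit firmware field. */
static uint32_t clamp_freq(uint64_t hz)
{
    return hz > UINT32_MAX ? UINT32_MAX : (uint32_t)hz;
}

/* Number of nodes needed for 'cpus' cores, rounding up. */
static uint32_t node_count(uint32_t cpus)
{
    assert(cpus > 0);
    return (cpus + CORES_PER_NODE - 1) / CORES_PER_NODE;
}
```

With these helpers, 5 CPUs occupy 2 nodes, and a 5 GHz clock is reported as the largest value the 32-bit field can hold.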

[PATCH V17 0/6] mips: Add Loongson-3 machine support

2020-11-05 Thread Huacai Chen
The Loongson-3 CPU family includes Loongson-3A R1/R2/R3/R4 and Loongson-3B
R1/R2. Loongson-3A R1 is the oldest and its ISA is the smallest, while
Loongson-3A R4 is the newest and its ISA is almost a superset of all the
others. To reduce complexity, QEMU defines just two CPU types:

1, "Loongson-3A1000" CPU, which corresponds to Loongson-3A R1. It is
   suitable for TCG because Loongson-3A R1 has the fewest ASEs.
2, "Loongson-3A4000" CPU, which corresponds to Loongson-3A R4. It is
   suitable for KVM because Loongson-3A R4 has the VZ ASE.

Loongson-3 lacks English documentation. I've tried to translate the manuals
with translate.google.com, and the machine-translated documents (together
with their original Chinese versions) are available here.

Loongson-3A R1 (Loongson-3A1000)
User Manual Part 1:
http://ftp.godson.ac.cn/lemote/3A1000_p1.pdf
http://ftp.godson.ac.cn/lemote/Loongson3A1000_processor_user_manual_P1.pdf (Chinese Version)
User Manual Part 2:
http://ftp.godson.ac.cn/lemote/3A1000_p2.pdf
http://ftp.godson.ac.cn/lemote/Loongson3A1000_processor_user_manual_P2.pdf (Chinese Version)

Loongson-3A R2 (Loongson-3A2000)
User Manual Part 1:
http://ftp.godson.ac.cn/lemote/3A2000_p1.pdf
http://ftp.godson.ac.cn/lemote/Loongson3A2000_user1.pdf (Chinese Version)
User Manual Part 2:
http://ftp.godson.ac.cn/lemote/3A2000_p2.pdf
http://ftp.godson.ac.cn/lemote/Loongson3A2000_user2.pdf (Chinese Version)

Loongson-3A R3 (Loongson-3A3000)
User Manual Part 1:
http://ftp.godson.ac.cn/lemote/3A3000_p1.pdf
http://ftp.godson.ac.cn/lemote/Loongson3A3000_3B3000usermanual1.pdf (Chinese Version)
User Manual Part 2:
http://ftp.godson.ac.cn/lemote/3A3000_p2.pdf
http://ftp.godson.ac.cn/lemote/Loongson3A3000_3B3000usermanual2.pdf (Chinese Version)

Loongson-3A R4 (Loongson-3A4000)
User Manual Part 1:
http://ftp.godson.ac.cn/lemote/3A4000_p1.pdf
http://ftp.godson.ac.cn/lemote/3A4000user.pdf (Chinese Version)
User Manual Part 2:
I'm sorry that it is unavailable now.

And human-translated documents (W.I.P) are available here now:
https://github.com/loongson-community/docs/tree/master/English-translation-of-Loongson-manual

Both KVM and TCG are available now!

We already have a fully functional Linux kernel (based on Linux-5.4.x
LTS) here:

https://github.com/chenhuacai/linux

Of course the upstream kernel is also usable (both the KVM host side and
the guest side have been upstream since Linux-5.9):

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

How to use QEMU/Loongson-3?
1, Download kernel source from the above URL;
2, Build a kernel with arch/mips/configs/loongson3_{def,hpc}config;
3, Boot a Loongson-3A4000 host with this kernel (for KVM mode);
4, Build QEMU-master with this patchset;
5, modprobe kvm (only necessary for KVM mode);
6, Use QEMU with TCG:
   qemu-system-mips64el -M loongson3-virt,accel=tcg -cpu Loongson-3A1000 -kernel  -append ...
   Use QEMU with KVM:
   qemu-system-mips64el -M loongson3-virt,accel=kvm -cpu Loongson-3A4000 -kernel  -append ...

   The "-cpu" parameter is optional here and QEMU will use the correct type for TCG/KVM automatically.

V1 -> V2:
1, Add a cover letter;
2, Improve CPU definitions;
3, Remove LS7A-related things (Use GPEX instead);
4, Add a description of how to run QEMU/Loongson-3.

V2 -> V3:
1, Fix all possible checkpatch.pl errors and warnings.

V3 -> V4:
1, Sync code with upstream;
2, Remove merged patches;
3, Fix build failure without CONFIG_KVM;
4, Add Reviewed-by: Aleksandar Markovic .

V4 -> V5:
1, Improve coding style;
2, Remove merged patches;
3, Rename machine name from "loongson3" to "loongson3-virt";
4, Rework the "loongson3-virt" machine to drop any ISA things;
5, Rework "hw/mips: Implement the kvm_type() hook in MachineClass";
6, Add Jiaxun Yang as a reviewer of Loongson-3.

V5 -> V6:
1, Fix license preamble;
2, Improve commit messages;
3, Add hw/intc/loongson_liointc.c to MAINTAINERS;
4, Fix all possible checkpatch.pl errors and warnings.

V7 and V8 have only one patch (machine definition) with some minor improvements.

V8 -> V9:
1, Update KVM type definition from kernel;
2, Fix PageMask with variable page size for TCG;
3, Add TCG support (add Loongson-EXT instructions).

V9 -> V10:
1, Split fw_cfg to a separate patch;
2, Split boot parameters definition to a local header;
3, Update MIPS machine documentation;
4, Many other improvements suggested by Philippe Mathieu-Daudé.

V10 -> V11:
1, Fix some typos;
2, Add Reviewed-by: Philippe Mathieu-Daudé .

V11 -> V12:
1, Split boot parameter helpers to loongson3_bootp.c;
2, Support both BE and LE host (Loongson guests are always LE).

V12 -> V13:
1, Sync code with upstream;
2, Re-enable KVM support for MIPS in meson;

V13 -> V14:
1, Remove merged patches;
2, Split boot parameter helpers to a separate patch;
3, Many other improvements suggested by Philippe Mathieu-Daudé.

V14 -> V15:
1, Remove merged patches;
2, Fix malta breakage caused by variable page size;
3, Add unaligned access support for Loongson-3's TCG m

[PATCH V17 1/6] target/mips: Fix PageMask with variable page size

2020-11-05 Thread Huacai Chen
From: Jiaxun Yang 

Our current code assumes the target page size is always 4k
when handling PageMask and VPN2. However, variable page size
support was recently added to the MIPS target, so that assumption
no longer holds.

Fixes: ee3863b9d414 ("target/mips: Support variable page size")
Signed-off-by: Jiaxun Yang 
Signed-off-by: Huacai Chen 
---
 target/mips/cp0_helper.c | 27 +--
 target/mips/cpu.h|  1 +
 2 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/target/mips/cp0_helper.c b/target/mips/cp0_helper.c
index 709cc9a..92bf14f 100644
--- a/target/mips/cp0_helper.c
+++ b/target/mips/cp0_helper.c
@@ -892,13 +892,28 @@ void helper_mtc0_memorymapid(CPUMIPSState *env, target_ulong arg1)
 
 void update_pagemask(CPUMIPSState *env, target_ulong arg1, int32_t *pagemask)
 {
-uint64_t mask = arg1 >> (TARGET_PAGE_BITS + 1);
-if (!(env->insn_flags & ISA_MIPS32R6) || (arg1 == ~0) ||
-(mask == 0x0000 || mask == 0x0003 || mask == 0x000F ||
- mask == 0x003F || mask == 0x00FF || mask == 0x03FF ||
- mask == 0x0FFF || mask == 0x3FFF || mask == 0xFFFF)) {
-env->CP0_PageMask = arg1 & (0x1FFF & (TARGET_PAGE_MASK << 1));
+unsigned long mask;
+int maskbits;
+
+/* Don't care MASKX as we don't support 1KB page */
+mask = extract32((uint32_t)arg1, CP0PM_MASK, 16);
+maskbits = find_first_zero_bit(&mask, 32);
+
+/* Ensure no more set bit after first zero */
+if (mask >> maskbits) {
+goto invalid;
+}
+/* We don't support VTLB entry smaller than target page */
+if ((maskbits + 12) < TARGET_PAGE_BITS) {
+goto invalid;
 }
+env->CP0_PageMask = mask << CP0PM_MASK;
+
+return;
+
+invalid:
+/* When invalid, set to default target page size. */
+env->CP0_PageMask = (~TARGET_PAGE_MASK >> 12) << CP0PM_MASK;
 }
 
 void helper_mtc0_pagemask(CPUMIPSState *env, target_ulong arg1)
diff --git a/target/mips/cpu.h b/target/mips/cpu.h
index d41579d..23f8c6f 100644
--- a/target/mips/cpu.h
+++ b/target/mips/cpu.h
@@ -619,6 +619,7 @@ struct CPUMIPSState {
  * CP0 Register 5
  */
 int32_t CP0_PageMask;
+#define CP0PM_MASK 13
 int32_t CP0_PageGrain_rw_bitmask;
 int32_t CP0_PageGrain;
 #define CP0PG_RIE 31
-- 
2.7.0
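
For anyone who wants to experiment with the validation rule in the hunk above without building QEMU, the following is a minimal standalone sketch. It is an illustration under stated assumptions, not the QEMU code: sanitize_pagemask(), first_zero_bit() and PM_SHIFT are made-up names (PM_SHIFT mirrors CP0PM_MASK), and page_bits stands in for TARGET_PAGE_BITS.

```c
#include <assert.h>
#include <stdint.h>

#define PM_SHIFT 13   /* mirrors CP0PM_MASK: the Mask field starts at bit 13 */

/* Index of the lowest clear bit in v (32 if all bits are set). */
static int first_zero_bit(uint32_t v)
{
    int n = 0;
    while (n < 32 && ((v >> n) & 1)) {
        n++;
    }
    return n;
}

/*
 * Return the PageMask value to store for a guest write of 'arg'.
 * The 16-bit Mask field must be a contiguous run of ones starting at
 * bit 0, and the resulting page must not be smaller than the target
 * page; otherwise fall back to the default target page size.
 */
static uint32_t sanitize_pagemask(uint32_t arg, unsigned page_bits)
{
    assert(page_bits >= 12 && page_bits <= 28);

    uint32_t mask = (arg >> PM_SHIFT) & 0xffff;
    int maskbits = first_zero_bit(mask);
    uint32_t dflt = (((1u << page_bits) - 1) >> 12) << PM_SHIFT;

    if (mask >> maskbits) {
        return dflt;            /* a set bit follows the first zero */
    }
    if ((unsigned)maskbits + 12 < page_bits) {
        return dflt;            /* smaller than the target page */
    }
    return mask << PM_SHIFT;
}
```

With a 4k target page a Mask of 0x3 (16k pages) is accepted as-is, while the non-contiguous pattern 0x5 is rejected and replaced by the default.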




RE: [PATCH v2 4/4] hw/riscv: Load the kernel after the firmware

2020-11-05 Thread Anup Patel


> -Original Message-
> From: Qemu-riscv  bounces+anup.patel=wdc@nongnu.org> On Behalf Of Palmer Dabbelt
> Sent: 06 November 2020 08:19
> To: alistai...@gmail.com
> Cc: qemu-ri...@nongnu.org; bmeng...@gmail.com; Alistair Francis
> ; qemu-devel@nongnu.org
> Subject: Re: [PATCH v2 4/4] hw/riscv: Load the kernel after the firmware
> 
> On Tue, 20 Oct 2020 08:46:45 PDT (-0700), alistai...@gmail.com wrote:
> > On Mon, Oct 19, 2020 at 4:17 PM Palmer Dabbelt wrote:
> >>
> >> On Tue, 13 Oct 2020 17:17:33 PDT (-0700), Alistair Francis wrote:
> >> > Instead of loading the kernel at a hardcoded start address, let's
> >> > load the kernel at the next aligned address after the end of the
> >> > firmware.
> >> >
> >> > This should have no impact for current users of OpenSBI, but will
> >> > allow loading a noMMU kernel at the start of memory.
> >> >
> >> > Signed-off-by: Alistair Francis 
> >> > ---
> >> >  include/hw/riscv/boot.h |  3 +++
> >> >  hw/riscv/boot.c | 19 ++-
> >> >  hw/riscv/opentitan.c|  3 ++-
> >> >  hw/riscv/sifive_e.c |  3 ++-
> >> >  hw/riscv/sifive_u.c | 10 --
> >> >  hw/riscv/spike.c| 11 ---
> >> >  hw/riscv/virt.c | 11 ---
> >> >  7 files changed, 45 insertions(+), 15 deletions(-)
> >> >
> >> > diff --git a/include/hw/riscv/boot.h b/include/hw/riscv/boot.h
> >> > index 2975ed1a31..0b01988727 100644
> >> > --- a/include/hw/riscv/boot.h
> >> > +++ b/include/hw/riscv/boot.h
> >> > @@ -25,6 +25,8 @@
> >> >
> >> >  bool riscv_is_32_bit(MachineState *machine);
> >> >
> >> > +target_ulong riscv_calc_kernel_start_addr(MachineState *machine,
> >> > +  target_ulong
> >> > +firmware_end_addr);
> >> >  target_ulong riscv_find_and_load_firmware(MachineState *machine,
> >> >const char 
> >> > *default_machine_firmware,
> >> >hwaddr
> >> > firmware_load_addr, @@ -34,6 +36,7 @@ target_ulong
> riscv_load_firmware(const char *firmware_filename,
> >> >   hwaddr firmware_load_addr,
> >> >   symbol_fn_t sym_cb);
> >> > target_ulong riscv_load_kernel(const char *kernel_filename,
> >> > +   target_ulong firmware_end_addr,
> >> > symbol_fn_t sym_cb);  hwaddr
> >> > riscv_load_initrd(const char *filename, uint64_t mem_size,
> >> >   uint64_t kernel_entry, hwaddr *start);
> >> > diff --git a/hw/riscv/boot.c b/hw/riscv/boot.c index
> >> > 5dea644f47..9b3fe3fb1e 100644
> >> > --- a/hw/riscv/boot.c
> >> > +++ b/hw/riscv/boot.c
> >> > @@ -33,10 +33,8 @@
> >> >  #include 
> >> >
> >> >  #if defined(TARGET_RISCV32)
> >> > -# define KERNEL_BOOT_ADDRESS 0x8040
> >> >  #define fw_dynamic_info_data(__val) cpu_to_le32(__val)
> >> >  #else
> >> > -# define KERNEL_BOOT_ADDRESS 0x8020
> >> >  #define fw_dynamic_info_data(__val) cpu_to_le64(__val)
> >> >  #endif
> >> >
> >> > @@ -49,6 +47,15 @@ bool riscv_is_32_bit(MachineState *machine)
> >> >  }
> >> >  }
> >> >
> >> > +target_ulong riscv_calc_kernel_start_addr(MachineState *machine,
> >> > +  target_ulong 
> >> > firmware_end_addr) {
> >> > +if (riscv_is_32_bit(machine)) {
> >> > +return QEMU_ALIGN_UP(firmware_end_addr, 4 * MiB);
> >> > +} else {
> >> > +return QEMU_ALIGN_UP(firmware_end_addr, 2 * MiB);
> >> > +}
> >> > +}
> >> > +
> >> >  target_ulong riscv_find_and_load_firmware(MachineState *machine,
> >> >const char 
> >> > *default_machine_firmware,
> >> >hwaddr
> >> > firmware_load_addr, @@ -123,7 +130,9 @@ target_ulong
> riscv_load_firmware(const char *firmware_filename,
> >> >  exit(1);
> >> >  }
> >> >
> >> > -target_ulong riscv_load_kernel(const char *kernel_filename,
> >> > symbol_fn_t sym_cb)
> >> > +target_ulong riscv_load_kernel(const char *kernel_filename,
> >> > +   target_ulong kernel_start_addr,
> >> > +   symbol_fn_t sym_cb)
> >> >  {
> >> >  uint64_t kernel_entry;
> >> >
> >> > @@ -138,9 +147,9 @@ target_ulong riscv_load_kernel(const char
> *kernel_filename, symbol_fn_t sym_cb)
> >> >  return kernel_entry;
> >> >  }
> >> >
> >> > -if (load_image_targphys_as(kernel_filename,
> KERNEL_BOOT_ADDRESS,
> >> > +if (load_image_targphys_as(kernel_filename, kernel_start_addr,
> >> > ram_size, NULL) > 0) {
> >> > -return KERNEL_BOOT_ADDRESS;
> >> > +return kernel_start_addr;
> >> >  }
> >> >
> >> >  error_report("could not load kernel '%s'", kernel_filename);
> >> > diff --git a/hw/riscv/opentitan.c b/hw/riscv/opentitan.c index
> >> > 0531bd879b..cc758b78b8 100644
> >> > --- a/hw/riscv/opentitan.c
> >> > +++ b
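
The alignment rule in the quoted riscv_calc_kernel_start_addr() can be tried in isolation with the sketch below. It only assumes what the quoted hunk shows (4 MiB alignment for 32-bit machines, 2 MiB for 64-bit ones); align_up() and kernel_start_addr() are illustrative names standing in for QEMU_ALIGN_UP() and the new helper, and the is_32_bit flag replaces the MachineState lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MiB (1024ULL * 1024ULL)

/* Round n up to the next multiple of m; m must be a power of two. */
static uint64_t align_up(uint64_t n, uint64_t m)
{
    assert(m != 0 && (m & (m - 1)) == 0);
    return (n + m - 1) & ~(m - 1);
}

/* First address at or after the firmware end where the kernel may start. */
static uint64_t kernel_start_addr(bool is_32_bit, uint64_t firmware_end_addr)
{
    return align_up(firmware_end_addr, is_32_bit ? 4 * MiB : 2 * MiB);
}
```

If the firmware ends one byte past 0x80000000, a 64-bit machine would load the kernel at 0x80200000 and a 32-bit machine at 0x80400000.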

[PATCH 1/2] file-posix: Use OFD lock only if the filesystem supports the lock

2020-11-05 Thread Masayoshi Mizuma
From: Masayoshi Mizuma 

locking=auto doesn't work if the filesystem doesn't support OFD locks.
In that situation, the following error occurs:

  qemu-system-x86_64: -blockdev driver=qcow2,node-name=disk,file.driver=file,file.filename=/mnt/guest.qcow2,file.locking=auto: Failed to lock byte 100

qemu_probe_lock_ops() decides whether QEMU can use OFD locks by
calling fcntl(F_OFD_GETLK) on /dev/null. The error therefore occurs
when /dev/null supports OFD locks but the filesystem holding the
image does not.

Probe the actual file instead of /dev/null, using F_OFD_SETLK, and
fall back to F_SETLK if that fails.
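
As a rough standalone illustration of the idea (not the code in this patch), the sketch below probes the file itself and picks the fcntl() command accordingly. It assumes Linux with glibc. pick_setlk_cmd() is a hypothetical helper, and unlike the patch, which duplicates an already open descriptor, it opens the path afresh so that closing the probe descriptor also drops the probe lock.

```c
#define _GNU_SOURCE          /* exposes F_OFD_SETLK on glibc */
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: return the fcntl() locking command to use for
 * 'path'. F_OFD_SETLK is chosen only if a shared OFD lock can really be
 * taken on that file; otherwise fall back to POSIX F_SETLK.
 */
static int pick_setlk_cmd(const char *path)
{
    assert(path != NULL);
#ifdef F_OFD_SETLK
    struct flock fl = {
        .l_whence = SEEK_SET,
        .l_start  = 0,
        .l_len    = 0,        /* 0 means "to the end of the file" */
        .l_type   = F_RDLCK,  /* l_pid stays 0, as OFD commands require */
    };
    int fd = open(path, O_RDONLY);

    if (fd >= 0) {
        int ret = fcntl(fd, F_OFD_SETLK, &fl);
        close(fd);            /* releases the probe lock with its description */
        if (ret == 0) {
            return F_OFD_SETLK;
        }
    }
#endif
    return F_SETLK;
}
```

A production probe would probably want to distinguish lock contention (EAGAIN/EACCES) from a filesystem that does not implement OFD locks at all; this sketch treats every failure as "unsupported".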

Signed-off-by: Masayoshi Mizuma 
---
 block/file-posix.c   |  56 
 include/qemu/osdep.h |   2 +-
 util/osdep.c | 149 ---
 3 files changed, 125 insertions(+), 82 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index c63926d592..a568dbf125 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -584,34 +584,6 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
 s->use_linux_io_uring = (aio == BLOCKDEV_AIO_OPTIONS_IO_URING);
 #endif
 
-locking = qapi_enum_parse(&OnOffAuto_lookup,
-  qemu_opt_get(opts, "locking"),
-  ON_OFF_AUTO_AUTO, &local_err);
-if (local_err) {
-error_propagate(errp, local_err);
-ret = -EINVAL;
-goto fail;
-}
-switch (locking) {
-case ON_OFF_AUTO_ON:
-s->use_lock = true;
-if (!qemu_has_ofd_lock()) {
-warn_report("File lock requested but OFD locking syscall is "
-"unavailable, falling back to POSIX file locks");
-error_printf("Due to the implementation, locks can be lost "
- "unexpectedly.\n");
-}
-break;
-case ON_OFF_AUTO_OFF:
-s->use_lock = false;
-break;
-case ON_OFF_AUTO_AUTO:
-s->use_lock = qemu_has_ofd_lock();
-break;
-default:
-abort();
-}
-
 str = qemu_opt_get(opts, "pr-manager");
 if (str) {
 s->pr_mgr = pr_manager_lookup(str, &local_err);
@@ -641,6 +613,34 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
 }
 s->fd = fd;
 
+locking = qapi_enum_parse(&OnOffAuto_lookup,
+  qemu_opt_get(opts, "locking"),
+  ON_OFF_AUTO_AUTO, &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+ret = -EINVAL;
+goto fail;
+}
+switch (locking) {
+case ON_OFF_AUTO_ON:
+s->use_lock = true;
+if (!qemu_has_ofd_lock(s->fd)) {
+warn_report("File lock requested but OFD locking syscall is "
+"unavailable, falling back to POSIX file locks");
+error_printf("Due to the implementation, locks can be lost "
+ "unexpectedly.\n");
+}
+break;
+case ON_OFF_AUTO_OFF:
+s->use_lock = false;
+break;
+case ON_OFF_AUTO_AUTO:
+s->use_lock = qemu_has_ofd_lock(s->fd);
+break;
+default:
+abort();
+}
+
 /* Check s->open_flags rather than bdrv_flags due to auto-read-only */
 if (s->open_flags & O_RDWR) {
 ret = check_hdev_writable(s->fd);
diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
index f9ec8c84e9..222138a81a 100644
--- a/include/qemu/osdep.h
+++ b/include/qemu/osdep.h
@@ -512,7 +512,7 @@ int qemu_dup(int fd);
 int qemu_lock_fd(int fd, int64_t start, int64_t len, bool exclusive);
 int qemu_unlock_fd(int fd, int64_t start, int64_t len);
 int qemu_lock_fd_test(int fd, int64_t start, int64_t len, bool exclusive);
-bool qemu_has_ofd_lock(void);
+bool qemu_has_ofd_lock(int orig_fd);
 #endif
 
 #if defined(__HAIKU__) && defined(__i386__)
diff --git a/util/osdep.c b/util/osdep.c
index 66d01b9160..454e8ef9f4 100644
--- a/util/osdep.c
+++ b/util/osdep.c
@@ -117,9 +117,6 @@ int qemu_mprotect_none(void *addr, size_t size)
 
 #ifndef _WIN32
 
-static int fcntl_op_setlk = -1;
-static int fcntl_op_getlk = -1;
-
 /*
  * Dups an fd and sets the flags
  */
@@ -187,68 +184,87 @@ static int qemu_parse_fdset(const char *param)
 return qemu_parse_fd(param);
 }
 
-static void qemu_probe_lock_ops(void)
+bool qemu_has_ofd_lock(int orig_fd)
 {
-if (fcntl_op_setlk == -1) {
 #ifdef F_OFD_SETLK
-int fd;
-int ret;
-struct flock fl = {
-.l_whence = SEEK_SET,
-.l_start  = 0,
-.l_len= 0,
-.l_type   = F_WRLCK,
-};
-
-fd = open("/dev/null", O_RDWR);
-if (fd < 0) {
+int fd;
+int ret;
+struct flock fl = {
+.l_whence = SEEK_SET,
+.l_start  = 0,
+.l_len= 0,
+.l_type   = F_RDLCK,
+};
+
+fd = qemu_dup(orig_fd);
+if (fd >= 0) {
+ret = fcntl_setfl(fd, O_RDONLY);
+if (ret) {
 fprintf(stder

Re: [PATCH v3 00/41] Mirror map JIT memory for TCG

2020-11-05 Thread no-reply
Patchew URL: https://patchew.org/QEMU/20201106032921.600200-1-richard.hender...@linaro.org/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20201106032921.600200-1-richard.hender...@linaro.org
Subject: [PATCH v3 00/41] Mirror map JIT memory for TCG

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag] patchew/20201106032921.600200-1-richard.hender...@linaro.org -> patchew/20201106032921.600200-1-richard.hender...@linaro.org
Switched to a new branch 'test'
170f310 tcg: Constify TCGLabelQemuLdst.raddr
c336494 tcg: Constify tcg_code_gen_epilogue
a009e99 tcg: Remove TCG_TARGET_SUPPORT_MIRROR
545feb7 tcg/arm: Support split-wx code generation
a873c61 tcg/mips: Support split-wx code generation
7f12d40 tcg/mips: Do not assert on relocation overflow
200ecb3 accel/tcg: Add mips support to alloc_code_gen_buffer_splitwx_memfd
edd72db tcg/riscv: Support split-wx code generation
1c6764d tcg/riscv: Remove branch-over-branch fallback
63883fc tcg/riscv: Fix branch range checks
dff34e0 tcg/s390: Support split-wx code generation
8d88879 tcg/s390: Use tcg_tbrel_diff
f0fea63 tcg/sparc: Support split-wx code generation
5a837a3 tcg/sparc: Use tcg_tbrel_diff
684c281 tcg/ppc: Support split-wx code generation
a254bfd tcg/ppc: Use tcg_out_mem_long to reset TCG_REG_TB
4c4f647 tcg/ppc: Use tcg_tbrel_diff
5f81f0e tcg: Introduce tcg_tbrel_diff
1cab418 tcg/tci: Push const down through bytecode reading
c55a8c0 disas: Push const down through host disasassembly
55b926c tcg/aarch64: Support split-wx code generation
aef71b4 tcg/aarch64: Implement flush_idcache_range manually
548fc79 tcg/aarch64: Use B not BL for tcg_out_goto_long
6cbd22a tcg/i386: Support split-wx code generation
fe36cad tcg: Return the TB pointer from the rx region from exit_tb
ef96a10 accel/tcg: Support split-wx for darwin/iOS with vm_remap
fed5e19 accel/tcg: Support split-wx for linux with memfd
caaf645 tcg: Add --accel tcg,split-wx property
f93ae22 tcg: Use Error with alloc_code_gen_buffer
b6992b5 tcg: Make tb arg to synchronize_from_tb const
043973b tcg: Make DisasContextBase.tb const
1d83486 tcg: Adjust tb_target_set_jmp_target for split-wx
eec18a6 tcg: Adjust tcg_register_jit for const
65e76b9 tcg: Adjust tcg_out_label for const
44975a9 tcg: Adjust tcg_out_call for const
c3e1e5d tcg: Adjust TCGLabel for const
72ac21e tcg: Introduce tcg_splitwx_to_{rx,rw}
3e322da tcg: Add in_code_gen_buffer
ccb0c48 tcg: Move tcg epilogue pointer out of TCGContext
09ef808 tcg: Move tcg prologue pointer out of TCGContext
e488e58 tcg: Enhance flush_icache_range with separate data pointer

=== OUTPUT BEGIN ===
1/41 Checking commit e488e58096f9 (tcg: Enhance flush_icache_range with separate data pointer)
2/41 Checking commit 09ef8082ce7f (tcg: Move tcg prologue pointer out of TCGContext)
3/41 Checking commit ccb0c482bf3e (tcg: Move tcg epilogue pointer out of TCGContext)
4/41 Checking commit 3e322da5de89 (tcg: Add in_code_gen_buffer)
5/41 Checking commit 72ac21e27103 (tcg: Introduce tcg_splitwx_to_{rx,rw})
6/41 Checking commit c3e1e5d3a470 (tcg: Adjust TCGLabel for const)
7/41 Checking commit 44975a9cbb02 (tcg: Adjust tcg_out_call for const)
8/41 Checking commit 65e76b95a029 (tcg: Adjust tcg_out_label for const)
9/41 Checking commit eec18a607903 (tcg: Adjust tcg_register_jit for const)
10/41 Checking commit 1d83486ee180 (tcg: Adjust tb_target_set_jmp_target for split-wx)
11/41 Checking commit 043973b272b1 (tcg: Make DisasContextBase.tb const)
12/41 Checking commit b6992b579570 (tcg: Make tb arg to synchronize_from_tb const)
13/41 Checking commit f93ae2267092 (tcg: Use Error with alloc_code_gen_buffer)
14/41 Checking commit caaf645ec574 (tcg: Add --accel tcg,split-wx property)
15/41 Checking commit fed5e19b3c0c (accel/tcg: Support split-wx for linux with memfd)
16/41 Checking commit ef96a10480c7 (accel/tcg: Support split-wx for darwin/iOS with vm_remap)
ERROR: externs should be avoided in .c files
#24: FILE: accel/tcg/translate-all.c:1172:
+extern kern_return_t mach_vm_remap(vm_map_t target_task,

total: 1 errors, 0 warnings, 80 lines checked

Patch 16/41 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

17/41 Checking commit fe36cad8bf7d (tcg: Return the TB pointer from the rx region from exit_tb)
18/41 Checking commit 6cbd22af41aa (tcg/i386: Support split-wx code generation)
19/41 Checking commit 548fc7975cc4 (tcg/aarch64: Use B not BL for tcg_out_goto_long)
20/41 Checking commit aef71b442d41 (tcg/aarch64: Implement flush_idcache_range manually)
21/41 Checking commit 55b926c57a1b (tcg/aar

[PATCH 2/2] tests/test-image-locking: Pass the fd to the argument of qemu_has_ofd_lock()

2020-11-05 Thread Masayoshi Mizuma
From: Masayoshi Mizuma 

Pass the file descriptor of /dev/null to qemu_has_ofd_lock() because the
previous patch changed the function to take a file descriptor argument.

Signed-off-by: Masayoshi Mizuma 
---
 tests/test-image-locking.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/tests/test-image-locking.c b/tests/test-image-locking.c
index ba057bd66c..2b823e1588 100644
--- a/tests/test-image-locking.c
+++ b/tests/test-image-locking.c
@@ -144,14 +144,19 @@ static void test_set_perm_abort(void)
 
 int main(int argc, char **argv)
 {
+int fd;
+
 bdrv_init();
 qemu_init_main_loop(&error_abort);
 
 g_test_init(&argc, &argv, NULL);
 
-if (qemu_has_ofd_lock()) {
+fd = open("/dev/null", O_RDONLY);
+
+if ((fd != -1) && (qemu_has_ofd_lock(fd))) {
 g_test_add_func("/image-locking/basic", test_image_locking_basic);
 g_test_add_func("/image-locking/set-perm-abort", test_set_perm_abort);
+close(fd);
 }
 
 return g_test_run();
-- 
2.27.0




[PATCH v3 40/41] tcg: Constify tcg_code_gen_epilogue

2020-11-05 Thread Richard Henderson
Now that all native tcg hosts support splitwx,
make this pointer const.

Signed-off-by: Richard Henderson 
---
 include/tcg/tcg.h| 2 +-
 tcg/tcg.c| 2 +-
 tcg/aarch64/tcg-target.c.inc | 3 +--
 tcg/arm/tcg-target.c.inc | 3 +--
 tcg/i386/tcg-target.c.inc| 3 +--
 tcg/mips/tcg-target.c.inc| 3 +--
 tcg/ppc/tcg-target.c.inc | 3 +--
 tcg/riscv/tcg-target.c.inc   | 3 +--
 tcg/s390/tcg-target.c.inc| 3 +--
 tcg/sparc/tcg-target.c.inc   | 3 +--
 10 files changed, 10 insertions(+), 18 deletions(-)

diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index 90ec7c1445..477919aeb6 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -677,7 +677,7 @@ struct TCGContext {
 
 extern TCGContext tcg_init_ctx;
 extern __thread TCGContext *tcg_ctx;
-extern void *tcg_code_gen_epilogue;
+extern const void *tcg_code_gen_epilogue;
 extern uintptr_t tcg_splitwx_diff;
 extern TCGv_env cpu_env;
 
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 07a4bd2c57..5c19b1e6b3 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -160,7 +160,7 @@ static int tcg_out_ldst_finalize(TCGContext *s);
 static TCGContext **tcg_ctxs;
 static unsigned int n_tcg_ctxs;
 TCGv_env cpu_env = 0;
-void *tcg_code_gen_epilogue;
+const void *tcg_code_gen_epilogue;
 uintptr_t tcg_splitwx_diff;
 
 #ifndef CONFIG_TCG_INTERPRETER
diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
index e398913c0c..32a60eba5e 100644
--- a/tcg/aarch64/tcg-target.c.inc
+++ b/tcg/aarch64/tcg-target.c.inc
@@ -2900,8 +2900,7 @@ static void tcg_target_qemu_prologue(TCGContext *s)
  * Return path for goto_ptr. Set return value to 0, a-la exit_tb,
  * and fall through to the rest of the epilogue.
  */
-/* TODO: Cast goes away when all hosts converted */
-tcg_code_gen_epilogue = (void *)tcg_splitwx_to_rx(s->code_ptr);
+tcg_code_gen_epilogue = tcg_splitwx_to_rx(s->code_ptr);
 tcg_out_movi(s, TCG_TYPE_REG, TCG_REG_X0, 0);
 
 /* TB epilogue */
diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc
index 3d2717aeb0..d6cb19ca9f 100644
--- a/tcg/arm/tcg-target.c.inc
+++ b/tcg/arm/tcg-target.c.inc
@@ -2301,8 +2301,7 @@ static void tcg_target_qemu_prologue(TCGContext *s)
  * Return path for goto_ptr. Set return value to 0, a-la exit_tb,
  * and fall through to the rest of the epilogue.
  */
-/* TODO: Cast goes away when all hosts converted */
-tcg_code_gen_epilogue = (void *)tcg_splitwx_to_rx(s->code_ptr);
+tcg_code_gen_epilogue = tcg_splitwx_to_rx(s->code_ptr);
 tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R0, 0);
 tcg_out_epilogue(s);
 }
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index 23c7a8a383..be57d2330a 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -3826,8 +3826,7 @@ static void tcg_target_qemu_prologue(TCGContext *s)
  * Return path for goto_ptr. Set return value to 0, a-la exit_tb,
  * and fall through to the rest of the epilogue.
  */
-/* TODO: Cast goes away when all hosts converted */
-tcg_code_gen_epilogue = (void *)tcg_splitwx_to_rx(s->code_ptr);
+tcg_code_gen_epilogue = tcg_splitwx_to_rx(s->code_ptr);
 tcg_out_movi(s, TCG_TYPE_REG, TCG_REG_EAX, 0);
 
 /* TB epilogue */
diff --git a/tcg/mips/tcg-target.c.inc b/tcg/mips/tcg-target.c.inc
index a2201bd1dd..18fd474593 100644
--- a/tcg/mips/tcg-target.c.inc
+++ b/tcg/mips/tcg-target.c.inc
@@ -2473,8 +2473,7 @@ static void tcg_target_qemu_prologue(TCGContext *s)
  * Return path for goto_ptr. Set return value to 0, a-la exit_tb,
  * and fall through to the rest of the epilogue.
  */
-/* TODO: Cast goes away when all hosts converted */
-tcg_code_gen_epilogue = (void *)tcg_splitwx_to_rx(s->code_ptr);
+tcg_code_gen_epilogue = tcg_splitwx_to_rx(s->code_ptr);
 tcg_out_mov(s, TCG_TYPE_REG, TCG_REG_V0, TCG_REG_ZERO);
 
 /* TB epilogue */
diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
index fe33687787..3580bbabc1 100644
--- a/tcg/ppc/tcg-target.c.inc
+++ b/tcg/ppc/tcg-target.c.inc
@@ -2346,8 +2346,7 @@ static void tcg_target_qemu_prologue(TCGContext *s)
 tcg_out32(s, BCCTR | BO_ALWAYS);
 
 /* Epilogue */
-/* TODO: Cast goes away when all hosts converted */
-tcg_code_gen_epilogue = (void *)tcg_splitwx_to_rx(s->code_ptr);
+tcg_code_gen_epilogue = tcg_splitwx_to_rx(s->code_ptr);
 
 tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_R0, TCG_REG_R1, FRAME_SIZE+LR_OFFSET);
 for (i = 0; i < ARRAY_SIZE(tcg_target_callee_save_regs); ++i) {
diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
index 5c1e0f8fc1..7b4ee4a084 100644
--- a/tcg/riscv/tcg-target.c.inc
+++ b/tcg/riscv/tcg-target.c.inc
@@ -1784,8 +1784,7 @@ static void tcg_target_qemu_prologue(TCGContext *s)
 tcg_out_opc_imm(s, OPC_JALR, TCG_REG_ZERO, tcg_target_call_iarg_regs[1], 0);
 
 /* Return path for goto_ptr. Set return value to 0 */
-/* TODO: Cast goes away when all hosts converted */
-tcg_code_gen_ep

[PATCH v3 38/41] tcg/arm: Support split-wx code generation

2020-11-05 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 tcg/arm/tcg-target.h |  2 +-
 tcg/arm/tcg-target.c.inc | 37 +
 2 files changed, 22 insertions(+), 17 deletions(-)

diff --git a/tcg/arm/tcg-target.h b/tcg/arm/tcg-target.h
index e355d6a4b2..17f6be9cfc 100644
--- a/tcg/arm/tcg-target.h
+++ b/tcg/arm/tcg-target.h
@@ -150,6 +150,6 @@ void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t, uintptr_t);
 #define TCG_TARGET_NEED_LDST_LABELS
 #endif
 #define TCG_TARGET_NEED_POOL_LABELS
-#define TCG_TARGET_SUPPORT_MIRROR   0
+#define TCG_TARGET_SUPPORT_MIRROR   1
 
 #endif
diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc
index 9b9400f164..3d2717aeb0 100644
--- a/tcg/arm/tcg-target.c.inc
+++ b/tcg/arm/tcg-target.c.inc
@@ -187,29 +187,32 @@ static const uint8_t tcg_cond_to_arm_cond[] = {
 [TCG_COND_GTU] = COND_HI,
 };
 
-static inline bool reloc_pc24(tcg_insn_unit *code_ptr, tcg_insn_unit *target)
+static bool reloc_pc24(tcg_insn_unit *src_rw, const tcg_insn_unit *target)
 {
-ptrdiff_t offset = (tcg_ptr_byte_diff(target, code_ptr) - 8) >> 2;
+const tcg_insn_unit *src_rx = tcg_splitwx_to_rx(src_rw);
+ptrdiff_t offset = (tcg_ptr_byte_diff(target, src_rx) - 8) >> 2;
+
 if (offset == sextract32(offset, 0, 24)) {
-*code_ptr = (*code_ptr & ~0xffffff) | (offset & 0xffffff);
+*src_rw = deposit32(*src_rw, 0, 24, offset);
 return true;
 }
 return false;
 }
 
-static inline bool reloc_pc13(tcg_insn_unit *code_ptr, tcg_insn_unit *target)
+static bool reloc_pc13(tcg_insn_unit *src_rw, const tcg_insn_unit *target)
 {
-ptrdiff_t offset = tcg_ptr_byte_diff(target, code_ptr) - 8;
+const tcg_insn_unit *src_rx = tcg_splitwx_to_rx(src_rw);
+ptrdiff_t offset = tcg_ptr_byte_diff(target, src_rx) - 8;
 
 if (offset >= -0xfff && offset <= 0xfff) {
-tcg_insn_unit insn = *code_ptr;
+tcg_insn_unit insn = *src_rw;
 bool u = (offset >= 0);
 if (!u) {
 offset = -offset;
 }
 insn = deposit32(insn, 23, 1, u);
 insn = deposit32(insn, 0, 12, offset);
-*code_ptr = insn;
+*src_rw = insn;
 return true;
 }
 return false;
@@ -221,9 +224,9 @@ static bool patch_reloc(tcg_insn_unit *code_ptr, int type,
 tcg_debug_assert(addend == 0);
 
 if (type == R_ARM_PC24) {
-return reloc_pc24(code_ptr, (tcg_insn_unit *)value);
+return reloc_pc24(code_ptr, (const tcg_insn_unit *)value);
 } else if (type == R_ARM_PC13) {
-return reloc_pc13(code_ptr, (tcg_insn_unit *)value);
+return reloc_pc13(code_ptr, (const tcg_insn_unit *)value);
 } else {
 g_assert_not_reached();
 }
@@ -617,7 +620,7 @@ static void tcg_out_movi32(TCGContext *s, int cond, int rd, uint32_t arg)
 
 /* Check for a pc-relative address.  This will usually be the TB,
or within the TB, which is immediately before the code block.  */
-diff = arg - ((intptr_t)s->code_ptr + 8);
+diff = tcg_pcrel_diff(s, (void *)arg) - 8;
 if (diff >= 0) {
 rot = encode_imm(diff);
 if (rot >= 0) {
@@ -1337,7 +1340,8 @@ static void add_qemu_ldst_label(TCGContext *s, bool is_ld, TCGMemOpIdx oi,
 label->datahi_reg = datahi;
 label->addrlo_reg = addrlo;
 label->addrhi_reg = addrhi;
-label->raddr = raddr;
+/* TODO: Cast goes away when all hosts converted */
+label->raddr = (void *)tcg_splitwx_to_rx(raddr);
 label->label_ptr[0] = label_ptr;
 }
 
@@ -1348,7 +1352,7 @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
 MemOp opc = get_memop(oi);
 void *func;
 
-if (!reloc_pc24(lb->label_ptr[0], s->code_ptr)) {
+if (!reloc_pc24(lb->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
 return false;
 }
 
@@ -1411,7 +1415,7 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
 TCGMemOpIdx oi = lb->oi;
 MemOp opc = get_memop(oi);
 
-if (!reloc_pc24(lb->label_ptr[0], s->code_ptr)) {
+if (!reloc_pc24(lb->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
 return false;
 }
 
@@ -1762,8 +1766,8 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
 TCGReg base = TCG_REG_PC;
 
 tcg_debug_assert(s->tb_jmp_insn_offset == 0);
-ptr = (intptr_t)(s->tb_jmp_target_addr + args[0]);
-dif = ptr - ((intptr_t)s->code_ptr + 8);
+ptr = (intptr_t)tcg_splitwx_to_rx(s->tb_jmp_target_addr + args[0]);
+dif = tcg_pcrel_diff(s, (void *)ptr) - 8;
 dil = sextract32(dif, 0, 12);
 if (dif != dil) {
 /* The TB is close, but outside the 12 bits addressable by
@@ -2297,7 +2301,8 @@ static void tcg_target_qemu_prologue(TCGContext *s)
  * Return path for goto_ptr. Set return value to 0, a-la exit_tb,
  * and fall through to the rest of the epilogue.
  */
-tcg_code_gen_epilogue = s->code_ptr;

[PATCH v3 37/41] tcg/mips: Support split-wx code generation

2020-11-05 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 tcg/mips/tcg-target.h |  2 +-
 tcg/mips/tcg-target.c.inc | 43 ++-
 2 files changed, 25 insertions(+), 20 deletions(-)

diff --git a/tcg/mips/tcg-target.h b/tcg/mips/tcg-target.h
index d231522dc9..d7d8e6ea1c 100644
--- a/tcg/mips/tcg-target.h
+++ b/tcg/mips/tcg-target.h
@@ -206,7 +206,7 @@ extern bool use_mips32r2_instructions;
 
 #define TCG_TARGET_DEFAULT_MO (0)
 #define TCG_TARGET_HAS_MEMORY_BSWAP 1
-#define TCG_TARGET_SUPPORT_MIRROR   0
+#define TCG_TARGET_SUPPORT_MIRROR   1
 
 /* Flush the dcache at RW, and the icache at RX, as necessary. */
 static inline void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len)
diff --git a/tcg/mips/tcg-target.c.inc b/tcg/mips/tcg-target.c.inc
index 37faf1356c..a2201bd1dd 100644
--- a/tcg/mips/tcg-target.c.inc
+++ b/tcg/mips/tcg-target.c.inc
@@ -139,17 +139,18 @@ static const TCGReg tcg_target_call_oarg_regs[2] = {
 TCG_REG_V1
 };
 
-static tcg_insn_unit *tb_ret_addr;
-static tcg_insn_unit *bswap32_addr;
-static tcg_insn_unit *bswap32u_addr;
-static tcg_insn_unit *bswap64_addr;
+static const tcg_insn_unit *tb_ret_addr;
+static const tcg_insn_unit *bswap32_addr;
+static const tcg_insn_unit *bswap32u_addr;
+static const tcg_insn_unit *bswap64_addr;
 
-static bool reloc_pc16(tcg_insn_unit *pc, const tcg_insn_unit *target)
+static bool reloc_pc16(tcg_insn_unit *src_rw, const tcg_insn_unit *target)
 {
 /* Let the compiler perform the right-shift as part of the arithmetic.  */
-ptrdiff_t disp = target - (pc + 1);
+const tcg_insn_unit *src_rx = tcg_splitwx_to_rx(src_rw);
+ptrdiff_t disp = target - (src_rx + 1);
 if (disp == (int16_t)disp) {
-*pc = deposit32(*pc, 0, 16, disp);
+*src_rw = deposit32(*src_rw, 0, 16, disp);
 return true;
 }
 return false;
@@ -505,7 +506,7 @@ static void tcg_out_opc_sa64(TCGContext *s, MIPSInsn opc1, MIPSInsn opc2,
 static bool tcg_out_opc_jmp(TCGContext *s, MIPSInsn opc, const void *target)
 {
 uintptr_t dest = (uintptr_t)target;
-uintptr_t from = (uintptr_t)s->code_ptr + 4;
+uintptr_t from = (uintptr_t)tcg_splitwx_to_rx(s->code_ptr) + 4;
 int32_t inst;
 
 /* The pc-region branch happens within the 256MB region of
@@ -617,7 +618,7 @@ static inline void tcg_out_bswap16s(TCGContext *s, TCGReg ret, TCGReg arg)
 }
 }
 
-static void tcg_out_bswap_subr(TCGContext *s, tcg_insn_unit *sub)
+static void tcg_out_bswap_subr(TCGContext *s, const tcg_insn_unit *sub)
 {
 bool ok = tcg_out_opc_jmp(s, OPC_JAL, sub);
 tcg_debug_assert(ok);
@@ -1282,7 +1283,8 @@ static void add_qemu_ldst_label(TCGContext *s, int is_ld, TCGMemOpIdx oi,
 label->datahi_reg = datahi;
 label->addrlo_reg = addrlo;
 label->addrhi_reg = addrhi;
-label->raddr = raddr;
+/* TODO: Cast goes away when all hosts converted */
+label->raddr = (void *)tcg_splitwx_to_rx(raddr);
 label->label_ptr[0] = label_ptr[0];
 if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
 label->label_ptr[1] = label_ptr[1];
@@ -1291,15 +1293,16 @@ static void add_qemu_ldst_label(TCGContext *s, int is_ld, TCGMemOpIdx oi,
 
 static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
 {
+const tcg_insn_unit *tgt_rx = tcg_splitwx_to_rx(s->code_ptr);
 TCGMemOpIdx oi = l->oi;
 MemOp opc = get_memop(oi);
 TCGReg v0;
 int i;
 
 /* resolve label address */
-if (!reloc_pc16(l->label_ptr[0], s->code_ptr)
+if (!reloc_pc16(l->label_ptr[0], tgt_rx)
 || (TCG_TARGET_REG_BITS < TARGET_LONG_BITS
-&& !reloc_pc16(l->label_ptr[1], s->code_ptr))) {
+&& !reloc_pc16(l->label_ptr[1], tgt_rx))) {
 return false;
 }
 
@@ -1344,15 +1347,16 @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
 
 static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
 {
+const tcg_insn_unit *tgt_rx = tcg_splitwx_to_rx(s->code_ptr);
 TCGMemOpIdx oi = l->oi;
 MemOp opc = get_memop(oi);
 MemOp s_bits = opc & MO_SIZE;
 int i;
 
 /* resolve label address */
-if (!reloc_pc16(l->label_ptr[0], s->code_ptr)
+if (!reloc_pc16(l->label_ptr[0], tgt_rx)
 || (TCG_TARGET_REG_BITS < TARGET_LONG_BITS
-&& !reloc_pc16(l->label_ptr[1], s->code_ptr))) {
+&& !reloc_pc16(l->label_ptr[1], tgt_rx))) {
 return false;
 }
 
@@ -2469,11 +2473,12 @@ static void tcg_target_qemu_prologue(TCGContext *s)
  * Return path for goto_ptr. Set return value to 0, a-la exit_tb,
  * and fall through to the rest of the epilogue.
  */
-tcg_code_gen_epilogue = s->code_ptr;
+/* TODO: Cast goes away when all hosts converted */
+tcg_code_gen_epilogue = (void *)tcg_splitwx_to_rx(s->code_ptr);
 tcg_out_mov(s, TCG_TYPE_REG, TCG_REG_V0, TCG_REG_ZERO);
 
 /* TB epilogue */
-tb_ret_addr = s->code_ptr;
+tb_ret_addr = tcg_splitwx_to_rx(s->code_ptr);
 for

[PATCH v3 34/41] tcg/riscv: Support split-wx code generation

2020-11-05 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 tcg/riscv/tcg-target.h |  2 +-
 tcg/riscv/tcg-target.c.inc | 41 +-
 2 files changed, 24 insertions(+), 19 deletions(-)

diff --git a/tcg/riscv/tcg-target.h b/tcg/riscv/tcg-target.h
index 3c2e8305b0..0eb19f2b11 100644
--- a/tcg/riscv/tcg-target.h
+++ b/tcg/riscv/tcg-target.h
@@ -179,6 +179,6 @@ void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t, uintptr_t);
 #define TCG_TARGET_NEED_POOL_LABELS
 
 #define TCG_TARGET_HAS_MEMORY_BSWAP 0
-#define TCG_TARGET_SUPPORT_MIRROR   0
+#define TCG_TARGET_SUPPORT_MIRROR   1
 
 #endif
diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
index 02beb86977..5c1e0f8fc1 100644
--- a/tcg/riscv/tcg-target.c.inc
+++ b/tcg/riscv/tcg-target.c.inc
@@ -425,41 +425,44 @@ static void tcg_out_nop_fill(tcg_insn_unit *p, int count)
  * Relocations
  */
 
-static bool reloc_sbimm12(tcg_insn_unit *code_ptr, tcg_insn_unit *target)
+static bool reloc_sbimm12(tcg_insn_unit *src_rw, const tcg_insn_unit *target)
 {
-intptr_t offset = (intptr_t)target - (intptr_t)code_ptr;
+const tcg_insn_unit *src_rx = tcg_splitwx_to_rx(src_rw);
+intptr_t offset = (intptr_t)target - (intptr_t)src_rx;
 
 tcg_debug_assert((offset & 1) == 0);
 if (offset == sextreg(offset, 0, 12)) {
-code_ptr[0] |= encode_sbimm12(offset);
+*src_rw |= encode_sbimm12(offset);
 return true;
 }
 
 return false;
 }
 
-static bool reloc_jimm20(tcg_insn_unit *code_ptr, tcg_insn_unit *target)
+static bool reloc_jimm20(tcg_insn_unit *src_rw, const tcg_insn_unit *target)
 {
-intptr_t offset = (intptr_t)target - (intptr_t)code_ptr;
+const tcg_insn_unit *src_rx = tcg_splitwx_to_rx(src_rw);
+intptr_t offset = (intptr_t)target - (intptr_t)src_rx;
 
 tcg_debug_assert((offset & 1) == 0);
 if (offset == sextreg(offset, 0, 20)) {
-code_ptr[0] |= encode_ujimm20(offset);
+*src_rw |= encode_ujimm20(offset);
 return true;
 }
 
 return false;
 }
 
-static bool reloc_call(tcg_insn_unit *code_ptr, const tcg_insn_unit *target)
+static bool reloc_call(tcg_insn_unit *src_rw, const tcg_insn_unit *target)
 {
-intptr_t offset = (intptr_t)target - (intptr_t)code_ptr;
+const tcg_insn_unit *src_rx = tcg_splitwx_to_rx(src_rw);
+intptr_t offset = (intptr_t)target - (intptr_t)src_rx;
 int32_t lo = sextreg(offset, 0, 12);
 int32_t hi = offset - lo;
 
 if (offset == hi + lo) {
-code_ptr[0] |= encode_uimm20(hi);
-code_ptr[1] |= encode_imm12(lo);
+src_rw[0] |= encode_uimm20(hi);
+src_rw[1] |= encode_imm12(lo);
 return true;
 }
 
@@ -532,7 +535,7 @@ static void tcg_out_movi(TCGContext *s, TCGType type, TCGReg rd,
 if (tmp == (int32_t)tmp) {
 tcg_out_opc_upper(s, OPC_AUIPC, rd, 0);
 tcg_out_opc_imm(s, OPC_ADDI, rd, rd, 0);
-ret = reloc_call(s->code_ptr - 2, (tcg_insn_unit *)val);
+ret = reloc_call(s->code_ptr - 2, (const tcg_insn_unit *)val);
 tcg_debug_assert(ret == true);
 return;
 }
@@ -917,7 +920,7 @@ QEMU_BUILD_BUG_ON(TCG_TARGET_REG_BITS < TARGET_LONG_BITS);
 QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
 QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -(1 << 11));
 
-static void tcg_out_goto(TCGContext *s, tcg_insn_unit *target)
+static void tcg_out_goto(TCGContext *s, const tcg_insn_unit *target)
 {
 tcg_out_opc_jump(s, OPC_JAL, TCG_REG_ZERO, 0);
 bool ok = reloc_jimm20(s->code_ptr - 1, target);
@@ -993,7 +996,8 @@ static void add_qemu_ldst_label(TCGContext *s, int is_ld, TCGMemOpIdx oi,
 label->datahi_reg = datahi;
 label->addrlo_reg = addrlo;
 label->addrhi_reg = addrhi;
-label->raddr = raddr;
+/* TODO: Cast goes away when all hosts converted */
+label->raddr = (void *)tcg_splitwx_to_rx(raddr);
 label->label_ptr[0] = label_ptr[0];
 }
 
@@ -1012,7 +1016,7 @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
 }
 
 /* resolve label address */
-if (!reloc_sbimm12(l->label_ptr[0], s->code_ptr)) {
+if (!reloc_sbimm12(l->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
 return false;
 }
 
@@ -1046,7 +1050,7 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
 }
 
 /* resolve label address */
-if (!reloc_sbimm12(l->label_ptr[0], s->code_ptr)) {
+if (!reloc_sbimm12(l->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
 return false;
 }
 
@@ -1232,7 +1236,7 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, bool is_64)
 #endif
 }
 
-static tcg_insn_unit *tb_ret_addr;
+static const tcg_insn_unit *tb_ret_addr;
 
 static void tcg_out_op(TCGContext *s, TCGOpcode opc,
const TCGArg *args, const int *const_args)
@@ -1780,11 +1784,12 @@ static void tcg_target_qemu_prologue(TCGContext *s)
 tcg_out_opc_imm(s, OPC_JALR, TCG_REG_ZERO, tcg_target_call_iarg_regs[1], 0);
 
 /* R

[PATCH v3 36/41] tcg/mips: Do not assert on relocation overflow

2020-11-05 Thread Richard Henderson
This target was not updated with 7ecd02a06f8, and so did
not allow re-compilation with relocation overflow.

Remove reloc_26 and reloc_26_val as unused.

Signed-off-by: Richard Henderson 
---
 tcg/mips/tcg-target.c.inc | 53 ++-
 1 file changed, 19 insertions(+), 34 deletions(-)

diff --git a/tcg/mips/tcg-target.c.inc b/tcg/mips/tcg-target.c.inc
index 52638e920c..37faf1356c 100644
--- a/tcg/mips/tcg-target.c.inc
+++ b/tcg/mips/tcg-target.c.inc
@@ -144,29 +144,15 @@ static tcg_insn_unit *bswap32_addr;
 static tcg_insn_unit *bswap32u_addr;
 static tcg_insn_unit *bswap64_addr;
 
-static inline uint32_t reloc_pc16_val(tcg_insn_unit *pc,
-  const tcg_insn_unit *target)
+static bool reloc_pc16(tcg_insn_unit *pc, const tcg_insn_unit *target)
 {
 /* Let the compiler perform the right-shift as part of the arithmetic.  */
 ptrdiff_t disp = target - (pc + 1);
-tcg_debug_assert(disp == (int16_t)disp);
-return disp & 0xffff;
-}
-
-static inline void reloc_pc16(tcg_insn_unit *pc, const tcg_insn_unit *target)
-{
-*pc = deposit32(*pc, 0, 16, reloc_pc16_val(pc, target));
-}
-
-static inline uint32_t reloc_26_val(tcg_insn_unit *pc, tcg_insn_unit *target)
-{
-tcg_debug_assert((((uintptr_t)pc ^ (uintptr_t)target) & 0xf0000000) == 0);
-return ((uintptr_t)target >> 2) & 0x3ffffff;
-}
-
-static inline void reloc_26(tcg_insn_unit *pc, tcg_insn_unit *target)
-{
-*pc = deposit32(*pc, 0, 26, reloc_26_val(pc, target));
+if (disp == (int16_t)disp) {
+*pc = deposit32(*pc, 0, 16, disp);
+return true;
+}
+return false;
 }
 
 static bool patch_reloc(tcg_insn_unit *code_ptr, int type,
@@ -174,8 +160,7 @@ static bool patch_reloc(tcg_insn_unit *code_ptr, int type,
 {
 tcg_debug_assert(type == R_MIPS_PC16);
 tcg_debug_assert(addend == 0);
-reloc_pc16(code_ptr, (tcg_insn_unit *)value);
-return true;
+return reloc_pc16(code_ptr, (const tcg_insn_unit *)value);
 }
 
 #define TCG_CT_CONST_ZERO 0x100
@@ -925,11 +910,7 @@ static void tcg_out_brcond(TCGContext *s, TCGCond cond, TCGReg arg1,
 }
 
 tcg_out_opc_br(s, b_opc, arg1, arg2);
-if (l->has_value) {
-reloc_pc16(s->code_ptr - 1, l->u.value_ptr);
-} else {
-tcg_out_reloc(s, s->code_ptr - 1, R_MIPS_PC16, l, 0);
-}
+tcg_out_reloc(s, s->code_ptr - 1, R_MIPS_PC16, l, 0);
 tcg_out_nop(s);
 }
 
@@ -1316,9 +1297,10 @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
 int i;
 
 /* resolve label address */
-reloc_pc16(l->label_ptr[0], s->code_ptr);
-if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
-reloc_pc16(l->label_ptr[1], s->code_ptr);
+if (!reloc_pc16(l->label_ptr[0], s->code_ptr)
+|| (TCG_TARGET_REG_BITS < TARGET_LONG_BITS
+&& !reloc_pc16(l->label_ptr[1], s->code_ptr))) {
+return false;
 }
 
 i = 1;
@@ -1346,7 +1328,9 @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
 }
 
 tcg_out_opc_br(s, OPC_BEQ, TCG_REG_ZERO, TCG_REG_ZERO);
-reloc_pc16(s->code_ptr - 1, l->raddr);
+if (!reloc_pc16(s->code_ptr - 1, l->raddr)) {
+return false;
+}
 
 /* delay slot */
 if (TCG_TARGET_REG_BITS == 64 && l->type == TCG_TYPE_I32) {
@@ -1366,9 +1350,10 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
 int i;
 
 /* resolve label address */
-reloc_pc16(l->label_ptr[0], s->code_ptr);
-if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
-reloc_pc16(l->label_ptr[1], s->code_ptr);
+if (!reloc_pc16(l->label_ptr[0], s->code_ptr)
+|| (TCG_TARGET_REG_BITS < TARGET_LONG_BITS
+&& !reloc_pc16(l->label_ptr[1], s->code_ptr))) {
+return false;
 }
 
 i = 1;
-- 
2.25.1
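
Editorial sketch, not part of the patch: the change above turns a relocation
helper that asserted on overflow into one that reports failure, so the caller
can return false and let the code be regenerated. The standalone C below is a
minimal illustration of that pattern. The names insn_unit, deposit_bits and
patch_pc16 are hypothetical stand-ins for tcg_insn_unit, deposit32 and
reloc_pc16, and the bit-field helper is a simplified assumption rather than
QEMU's implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for tcg_insn_unit on a 32-bit fixed-width ISA. */
typedef uint32_t insn_unit;

/* Simplified stand-in for deposit32(): overwrite `len` bits at `start`. */
static uint32_t deposit_bits(uint32_t word, int start, int len, uint32_t field)
{
    uint32_t ones = (len == 32) ? UINT32_MAX : ((UINT32_C(1) << len) - 1);
    uint32_t mask = ones << start;
    return (word & ~mask) | ((field << start) & mask);
}

/*
 * Patch a 16-bit pc-relative displacement, counted in instruction units
 * from the instruction after the branch. Returns false instead of
 * asserting when the displacement does not fit in 16 signed bits.
 */
static bool patch_pc16(insn_unit *pc, const insn_unit *target)
{
    ptrdiff_t disp = target - (pc + 1);
    if (disp == (int16_t)disp) {
        *pc = deposit_bits(*pc, 0, 16, (uint32_t)disp);
        return true;
    }
    return false;
}
```

A caller in this style checks the result and propagates false, as the
slow-path hunks above do, rather than relying on a debug assertion.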




[PATCH v3 28/41] tcg/sparc: Use tcg_tbrel_diff

2020-11-05 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 tcg/sparc/tcg-target.c.inc | 15 +++
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/tcg/sparc/tcg-target.c.inc b/tcg/sparc/tcg-target.c.inc
index d599ae27b5..8f04fdf981 100644
--- a/tcg/sparc/tcg-target.c.inc
+++ b/tcg/sparc/tcg-target.c.inc
@@ -440,7 +440,7 @@ static void tcg_out_movi_int(TCGContext *s, TCGType type, TCGReg ret,
 
 /* A 13-bit constant relative to the TB.  */
 if (!in_prologue && USE_REG_TB) {
-test = arg - (uintptr_t)s->code_gen_ptr;
+test = tcg_tbrel_diff(s, (void *)arg);
 if (check_fit_ptr(test, 13)) {
 tcg_out_arithi(s, ret, TCG_REG_TB, test, ARITH_ADD);
 return;
@@ -537,15 +537,15 @@ static inline bool tcg_out_sti(TCGContext *s, TCGType type, TCGArg val,
 return false;
 }
 
-static void tcg_out_ld_ptr(TCGContext *s, TCGReg ret, uintptr_t arg)
+static void tcg_out_ld_ptr(TCGContext *s, TCGReg ret, const void *arg)
 {
-intptr_t diff = arg - (uintptr_t)s->code_gen_ptr;
+intptr_t diff = tcg_tbrel_diff(s, arg);
 if (USE_REG_TB && check_fit_ptr(diff, 13)) {
 tcg_out_ld(s, TCG_TYPE_PTR, ret, TCG_REG_TB, diff);
 return;
 }
-tcg_out_movi(s, TCG_TYPE_PTR, ret, arg & ~0x3ff);
-tcg_out_ld(s, TCG_TYPE_PTR, ret, ret, arg & 0x3ff);
+tcg_out_movi(s, TCG_TYPE_PTR, ret, (uintptr_t)arg & ~0x3ff);
+tcg_out_ld(s, TCG_TYPE_PTR, ret, ret, (uintptr_t)arg & 0x3ff);
 }
 
 static inline void tcg_out_sety(TCGContext *s, TCGReg rs)
@@ -1313,7 +1313,7 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
 tcg_out_movi_imm13(s, TCG_REG_O0, a0);
 break;
 } else if (USE_REG_TB) {
-intptr_t tb_diff = a0 - (uintptr_t)s->code_gen_ptr;
+intptr_t tb_diff = tcg_tbrel_diff(s, (void *)a0);
 if (check_fit_ptr(tb_diff, 13)) {
 tcg_out_arithi(s, TCG_REG_G0, TCG_REG_I7, 8, RETURN);
 /* Note that TCG_REG_TB has been unwound to O1.  */
@@ -1345,8 +1345,7 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
 }
 } else {
 /* indirect jump method */
-tcg_out_ld_ptr(s, TCG_REG_TB,
-   (uintptr_t)(s->tb_jmp_target_addr + a0));
+tcg_out_ld_ptr(s, TCG_REG_TB, s->tb_jmp_target_addr + a0);
 tcg_out_arithi(s, TCG_REG_G0, TCG_REG_TB, 0, JMPL);
 tcg_out_nop(s);
 }
-- 
2.25.1




[PATCH v3 35/41] accel/tcg: Add mips support to alloc_code_gen_buffer_splitwx_memfd

2020-11-05 Thread Richard Henderson
Re-use the 256MiB region handling from alloc_code_gen_buffer_anon,
and replace that with the shared file mapping.

Signed-off-by: Richard Henderson 
---
 accel/tcg/translate-all.c | 46 ---
 1 file changed, 38 insertions(+), 8 deletions(-)

diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 17df6c94fa..b49aaf1026 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -1137,24 +1137,40 @@ static bool alloc_code_gen_buffer_anon(size_t size, int prot,
 
 static bool alloc_code_gen_buffer_splitwx_memfd(size_t size, Error **errp)
 {
-void *buf_rw, *buf_rx;
+void *buf_rw = NULL, *buf_rx = MAP_FAILED;
 int fd = -1;
 
+#ifdef __mips__
+/* Find space for the RX mapping, vs the 256MiB regions. */
+if (!alloc_code_gen_buffer_anon(size, PROT_NONE,
+MAP_PRIVATE | MAP_ANONYMOUS |
+MAP_NORESERVE, errp)) {
+return false;
+}
+/* The size of the mapping may have been adjusted. */
+size = tcg_ctx->code_gen_buffer_size;
+buf_rx = tcg_ctx->code_gen_buffer;
+#endif
+
 buf_rw = qemu_memfd_alloc("tcg-jit", size, 0, &fd, errp);
 if (buf_rw == NULL) {
-return false;
+goto fail;
 }
 
+#ifdef __mips__
+void *tmp = mmap(buf_rx, size, PROT_READ | PROT_EXEC,
+ MAP_SHARED | MAP_FIXED, fd, 0);
+if (tmp != buf_rx) {
+goto fail_rx;
+}
+#else
 buf_rx = mmap(NULL, size, PROT_READ | PROT_EXEC, MAP_SHARED, fd, 0);
 if (buf_rx == MAP_FAILED) {
-error_setg_errno(errp, errno,
- "failed to map shared memory for execute");
-munmap(buf_rw, size);
-close(fd);
-return false;
+goto fail_rx;
 }
-close(fd);
+#endif
 
+close(fd);
 tcg_ctx->code_gen_buffer = buf_rw;
 tcg_ctx->code_gen_buffer_size = size;
 tcg_splitwx_diff = buf_rx - buf_rw;
@@ -1163,6 +1179,20 @@ static bool alloc_code_gen_buffer_splitwx_memfd(size_t size, Error **errp)
 qemu_madvise(buf_rw, size, QEMU_MADV_HUGEPAGE);
 qemu_madvise(buf_rx, size, QEMU_MADV_HUGEPAGE);
 return true;
+
+ fail_rx:
+error_setg_errno(errp, errno, "failed to map shared memory for execute");
+ fail:
+if (buf_rx != MAP_FAILED) {
+munmap(buf_rx, size);
+}
+if (buf_rw) {
+munmap(buf_rw, size);
+}
+if (fd >= 0) {
+close(fd);
+}
+return false;
 }
 #endif /* CONFIG_POSIX */
 
-- 
2.25.1
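
Editorial sketch, not part of the patch: the function above records
tcg_splitwx_diff = buf_rx - buf_rw, and the per-host patches in this series
use that difference to turn a writable address into its executable alias
before computing pc-relative displacements. The C below illustrates only the
arithmetic. The names splitwx_diff, to_rx, to_rw and pcrel_from_rw are
hypothetical, and addresses are plain integers here; no memory is mapped.

```c
#include <stdint.h>

/* Hypothetical mirror of tcg_splitwx_diff: RX base minus RW base. */
static intptr_t splitwx_diff;

/* Address of the executable alias of a writable address. */
static uintptr_t to_rx(uintptr_t rw)
{
    return rw + (uintptr_t)splitwx_diff;
}

/* Inverse conversion: writable alias of an executable address. */
static uintptr_t to_rw(uintptr_t rx)
{
    return rx - (uintptr_t)splitwx_diff;
}

/*
 * Displacement the CPU will see when it executes the instruction that is
 * being written at src_rw. The target is already an RX address, so the
 * source must be converted to RX before subtracting.
 */
static intptr_t pcrel_from_rw(uintptr_t src_rw, uintptr_t target_rx)
{
    return (intptr_t)(target_rx - to_rx(src_rw));
}
```

The point of the sketch is that stores go through the RW address while every
distance encoded into an instruction is measured between RX addresses, which
is the shape of the reloc_* changes elsewhere in this thread.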




[PATCH v3 31/41] tcg/s390: Support split-wx code generation

2020-11-05 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 tcg/s390/tcg-target.h |  2 +-
 tcg/s390/tcg-target.c.inc | 69 +--
 2 files changed, 31 insertions(+), 40 deletions(-)

diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
index 8324197127..1fd8b3858e 100644
--- a/tcg/s390/tcg-target.h
+++ b/tcg/s390/tcg-target.h
@@ -163,6 +163,6 @@ static inline void tb_target_set_jmp_target(uintptr_t tc_ptr, uintptr_t jmp_rx,
 #define TCG_TARGET_NEED_LDST_LABELS
 #endif
 #define TCG_TARGET_NEED_POOL_LABELS
-#define TCG_TARGET_SUPPORT_MIRROR   0
+#define TCG_TARGET_SUPPORT_MIRROR   1
 
 #endif
diff --git a/tcg/s390/tcg-target.c.inc b/tcg/s390/tcg-target.c.inc
index e4c61fc014..582a8ef941 100644
--- a/tcg/s390/tcg-target.c.inc
+++ b/tcg/s390/tcg-target.c.inc
@@ -363,36 +363,37 @@ static void * const qemu_st_helpers[16] = {
 };
 #endif
 
-static tcg_insn_unit *tb_ret_addr;
+static const tcg_insn_unit *tb_ret_addr;
 uint64_t s390_facilities;
 
-static bool patch_reloc(tcg_insn_unit *code_ptr, int type,
+static bool patch_reloc(tcg_insn_unit *src_rw, int type,
 intptr_t value, intptr_t addend)
 {
+const tcg_insn_unit *src_rx = tcg_splitwx_to_rx(src_rw);
 intptr_t pcrel2;
 uint32_t old;
 
 value += addend;
-pcrel2 = (tcg_insn_unit *)value - code_ptr;
+pcrel2 = (tcg_insn_unit *)value - src_rx;
 
 switch (type) {
 case R_390_PC16DBL:
 if (pcrel2 == (int16_t)pcrel2) {
-tcg_patch16(code_ptr, pcrel2);
+tcg_patch16(src_rw, pcrel2);
 return true;
 }
 break;
 case R_390_PC32DBL:
 if (pcrel2 == (int32_t)pcrel2) {
-tcg_patch32(code_ptr, pcrel2);
+tcg_patch32(src_rw, pcrel2);
 return true;
 }
 break;
 case R_390_20:
 if (value == sextract64(value, 0, 20)) {
-old = *(uint32_t *)code_ptr & 0xf00000ff;
+old = *(uint32_t *)src_rw & 0xf00000ff;
 old |= ((value & 0xfff) << 16) | ((value & 0xff000) >> 4);
-tcg_patch32(code_ptr, old);
+tcg_patch32(src_rw, old);
 return true;
 }
 break;
@@ -730,7 +731,8 @@ static inline bool tcg_out_sti(TCGContext *s, TCGType type, TCGArg val,
 }
 
 /* load data from an absolute host address */
-static void tcg_out_ld_abs(TCGContext *s, TCGType type, TCGReg dest, void *abs)
+static void tcg_out_ld_abs(TCGContext *s, TCGType type,
+   TCGReg dest, const void *abs)
 {
 intptr_t addr = (intptr_t)abs;
 
@@ -1304,7 +1306,7 @@ static void tgen_extract(TCGContext *s, TCGReg dest, TCGReg src,
 
 static void tgen_gotoi(TCGContext *s, int cc, const tcg_insn_unit *dest)
 {
-ptrdiff_t off = dest - s->code_ptr;
+ptrdiff_t off = tcg_pcrel_diff(s, dest) >> 1;
 if (off == (int16_t)off) {
 tcg_out_insn(s, RI, BRC, cc, off);
 } else if (off == (int32_t)off) {
@@ -1333,34 +1335,18 @@ static void tgen_branch(TCGContext *s, int cc, TCGLabel *l)
 static void tgen_compare_branch(TCGContext *s, S390Opcode opc, int cc,
 TCGReg r1, TCGReg r2, TCGLabel *l)
 {
-intptr_t off = 0;
-
-if (l->has_value) {
-off = l->u.value_ptr - s->code_ptr;
-tcg_debug_assert(off == (int16_t)off);
-} else {
-tcg_out_reloc(s, s->code_ptr + 1, R_390_PC16DBL, l, 2);
-}
-
+tcg_out_reloc(s, s->code_ptr + 1, R_390_PC16DBL, l, 2);
 tcg_out16(s, (opc & 0xff00) | (r1 << 4) | r2);
-tcg_out16(s, off);
+tcg_out16(s, 0);
 tcg_out16(s, cc << 12 | (opc & 0xff));
 }
 
 static void tgen_compare_imm_branch(TCGContext *s, S390Opcode opc, int cc,
 TCGReg r1, int i2, TCGLabel *l)
 {
-tcg_target_long off = 0;
-
-if (l->has_value) {
-off = l->u.value_ptr - s->code_ptr;
-tcg_debug_assert(off == (int16_t)off);
-} else {
-tcg_out_reloc(s, s->code_ptr + 1, R_390_PC16DBL, l, 2);
-}
-
+tcg_out_reloc(s, s->code_ptr + 1, R_390_PC16DBL, l, 2);
 tcg_out16(s, (opc & 0xff00) | (r1 << 4) | cc);
-tcg_out16(s, off);
+tcg_out16(s, 0);
 tcg_out16(s, (i2 << 8) | (opc & 0xff));
 }
 
@@ -1417,7 +1403,7 @@ static void tgen_brcond(TCGContext *s, TCGType type, TCGCond c,
 
 static void tcg_out_call(TCGContext *s, const tcg_insn_unit *dest)
 {
-ptrdiff_t off = dest - s->code_ptr;
+ptrdiff_t off = tcg_pcrel_diff(s, dest) >> 1;
 if (off == (int32_t)off) {
 tcg_out_insn(s, RIL, BRASL, TCG_REG_R14, off);
 } else {
@@ -1601,7 +1587,8 @@ static void add_qemu_ldst_label(TCGContext *s, bool is_ld, TCGMemOpIdx oi,
 label->oi = oi;
 label->datalo_reg = data;
 label->addrlo_reg = addr;
-label->raddr = raddr;
+/* TODO: Cast goes away when all hosts converted */
+label->raddr = (void *)tcg_splitwx_to_rx(raddr);
 label->label_ptr[0] = label_ptr;
 }
 
@@ -1613,7 +1600,7 @@ static bool tcg_out_qemu_ld_slow_path(TCG

[PATCH v3 41/41] tcg: Constify TCGLabelQemuLdst.raddr

2020-11-05 Thread Richard Henderson
Now that all native tcg hosts support splitwx,
make this pointer const.

Signed-off-by: Richard Henderson 
---
 tcg/aarch64/tcg-target.c.inc | 3 +--
 tcg/arm/tcg-target.c.inc | 3 +--
 tcg/i386/tcg-target.c.inc| 3 +--
 tcg/mips/tcg-target.c.inc| 3 +--
 tcg/ppc/tcg-target.c.inc | 3 +--
 tcg/riscv/tcg-target.c.inc   | 3 +--
 tcg/s390/tcg-target.c.inc| 3 +--
 tcg/tcg-ldst.c.inc   | 2 +-
 8 files changed, 8 insertions(+), 15 deletions(-)

diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
index 32a60eba5e..9e15128f31 100644
--- a/tcg/aarch64/tcg-target.c.inc
+++ b/tcg/aarch64/tcg-target.c.inc
@@ -1636,8 +1636,7 @@ static void add_qemu_ldst_label(TCGContext *s, bool is_ld, TCGMemOpIdx oi,
 label->type = ext;
 label->datalo_reg = data_reg;
 label->addrlo_reg = addr_reg;
-/* TODO: Cast goes away when all hosts converted */
-label->raddr = (void *)tcg_splitwx_to_rx(raddr);
+label->raddr = tcg_splitwx_to_rx(raddr);
 label->label_ptr[0] = label_ptr;
 }
 
diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc
index d6cb19ca9f..0fd1126454 100644
--- a/tcg/arm/tcg-target.c.inc
+++ b/tcg/arm/tcg-target.c.inc
@@ -1340,8 +1340,7 @@ static void add_qemu_ldst_label(TCGContext *s, bool is_ld, TCGMemOpIdx oi,
 label->datahi_reg = datahi;
 label->addrlo_reg = addrlo;
 label->addrhi_reg = addrhi;
-/* TODO: Cast goes away when all hosts converted */
-label->raddr = (void *)tcg_splitwx_to_rx(raddr);
+label->raddr = tcg_splitwx_to_rx(raddr);
 label->label_ptr[0] = label_ptr;
 }
 
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index be57d2330a..18a3af53bb 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -1786,8 +1786,7 @@ static void add_qemu_ldst_label(TCGContext *s, bool is_ld, bool is_64,
 label->datahi_reg = datahi;
 label->addrlo_reg = addrlo;
 label->addrhi_reg = addrhi;
-/* TODO: Cast goes away when all hosts converted */
-label->raddr = (void *)tcg_splitwx_to_rx(raddr);
+label->raddr = tcg_splitwx_to_rx(raddr);
 label->label_ptr[0] = label_ptr[0];
 if (TARGET_LONG_BITS > TCG_TARGET_REG_BITS) {
 label->label_ptr[1] = label_ptr[1];
diff --git a/tcg/mips/tcg-target.c.inc b/tcg/mips/tcg-target.c.inc
index 18fd474593..add157f6c3 100644
--- a/tcg/mips/tcg-target.c.inc
+++ b/tcg/mips/tcg-target.c.inc
@@ -1283,8 +1283,7 @@ static void add_qemu_ldst_label(TCGContext *s, int is_ld, TCGMemOpIdx oi,
 label->datahi_reg = datahi;
 label->addrlo_reg = addrlo;
 label->addrhi_reg = addrhi;
-/* TODO: Cast goes away when all hosts converted */
-label->raddr = (void *)tcg_splitwx_to_rx(raddr);
+label->raddr = tcg_splitwx_to_rx(raddr);
 label->label_ptr[0] = label_ptr[0];
 if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
 label->label_ptr[1] = label_ptr[1];
diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
index 3580bbabc1..71b40e7490 100644
--- a/tcg/ppc/tcg-target.c.inc
+++ b/tcg/ppc/tcg-target.c.inc
@@ -2001,8 +2001,7 @@ static void add_qemu_ldst_label(TCGContext *s, bool is_ld, TCGMemOpIdx oi,
 label->datahi_reg = datahi_reg;
 label->addrlo_reg = addrlo_reg;
 label->addrhi_reg = addrhi_reg;
-/* TODO: Cast goes away when all hosts converted */
-label->raddr = (void *)tcg_splitwx_to_rx(raddr);
+label->raddr = tcg_splitwx_to_rx(raddr);
 label->label_ptr[0] = lptr;
 }
 
diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
index 7b4ee4a084..f48a028dac 100644
--- a/tcg/riscv/tcg-target.c.inc
+++ b/tcg/riscv/tcg-target.c.inc
@@ -996,8 +996,7 @@ static void add_qemu_ldst_label(TCGContext *s, int is_ld, TCGMemOpIdx oi,
 label->datahi_reg = datahi;
 label->addrlo_reg = addrlo;
 label->addrhi_reg = addrhi;
-/* TODO: Cast goes away when all hosts converted */
-label->raddr = (void *)tcg_splitwx_to_rx(raddr);
+label->raddr = tcg_splitwx_to_rx(raddr);
 label->label_ptr[0] = label_ptr[0];
 }
 
diff --git a/tcg/s390/tcg-target.c.inc b/tcg/s390/tcg-target.c.inc
index b3660ffedf..d7ef079055 100644
--- a/tcg/s390/tcg-target.c.inc
+++ b/tcg/s390/tcg-target.c.inc
@@ -1587,8 +1587,7 @@ static void add_qemu_ldst_label(TCGContext *s, bool is_ld, TCGMemOpIdx oi,
 label->oi = oi;
 label->datalo_reg = data;
 label->addrlo_reg = addr;
-/* TODO: Cast goes away when all hosts converted */
-label->raddr = (void *)tcg_splitwx_to_rx(raddr);
+label->raddr = tcg_splitwx_to_rx(raddr);
 label->label_ptr[0] = label_ptr;
 }
 
diff --git a/tcg/tcg-ldst.c.inc b/tcg/tcg-ldst.c.inc
index 05f9b3ccd6..c3ce88e69d 100644
--- a/tcg/tcg-ldst.c.inc
+++ b/tcg/tcg-ldst.c.inc
@@ -28,7 +28,7 @@ typedef struct TCGLabelQemuLdst {
 TCGReg addrhi_reg;  /* reg index for high word of guest virtual addr */
 TCGReg datalo_reg;  /* reg index for low word to be loaded or stored */
 TCGReg datahi_reg;  /* reg index for high word

[PATCH v3 29/41] tcg/sparc: Support split-wx code generation

2020-11-05 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 tcg/sparc/tcg-target.h |  2 +-
 tcg/sparc/tcg-target.c.inc | 24 +---
 2 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/tcg/sparc/tcg-target.h b/tcg/sparc/tcg-target.h
index 517840705f..bb2505bfc7 100644
--- a/tcg/sparc/tcg-target.h
+++ b/tcg/sparc/tcg-target.h
@@ -181,6 +181,6 @@ static inline void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len)
 void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t, uintptr_t);
 
 #define TCG_TARGET_NEED_POOL_LABELS
-#define TCG_TARGET_SUPPORT_MIRROR   0
+#define TCG_TARGET_SUPPORT_MIRROR   1
 
 #endif
diff --git a/tcg/sparc/tcg-target.c.inc b/tcg/sparc/tcg-target.c.inc
index 8f04fdf981..182124b96c 100644
--- a/tcg/sparc/tcg-target.c.inc
+++ b/tcg/sparc/tcg-target.c.inc
@@ -291,14 +291,15 @@ static inline int check_fit_i32(int32_t val, unsigned int bits)
 # define check_fit_ptr  check_fit_i32
 #endif
 
-static bool patch_reloc(tcg_insn_unit *code_ptr, int type,
+static bool patch_reloc(tcg_insn_unit *src_rw, int type,
 intptr_t value, intptr_t addend)
 {
-uint32_t insn = *code_ptr;
+const tcg_insn_unit *src_rx = tcg_splitwx_to_rx(src_rw);
+uint32_t insn = *src_rw;
 intptr_t pcrel;
 
 value += addend;
-pcrel = tcg_ptr_byte_diff((tcg_insn_unit *)value, code_ptr);
+pcrel = tcg_ptr_byte_diff((tcg_insn_unit *)value, src_rx);
 
 switch (type) {
 case R_SPARC_WDISP16:
@@ -315,7 +316,7 @@ static bool patch_reloc(tcg_insn_unit *code_ptr, int type,
 g_assert_not_reached();
 }
 
-*code_ptr = insn;
+*src_rw = insn;
 return true;
 }
 
@@ -868,8 +869,8 @@ static void tcg_out_mb(TCGContext *s, TCGArg a0)
 }
 
 #ifdef CONFIG_SOFTMMU
-static tcg_insn_unit *qemu_ld_trampoline[16];
-static tcg_insn_unit *qemu_st_trampoline[16];
+static const tcg_insn_unit *qemu_ld_trampoline[16];
+static const tcg_insn_unit *qemu_st_trampoline[16];
 
 static void emit_extend(TCGContext *s, TCGReg r, int op)
 {
@@ -930,7 +931,7 @@ static void build_trampolines(TCGContext *s)
 while ((uintptr_t)s->code_ptr & 15) {
 tcg_out_nop(s);
 }
-qemu_ld_trampoline[i] = s->code_ptr;
+qemu_ld_trampoline[i] = tcg_splitwx_to_rx(s->code_ptr);
 
 if (SPARC64 || TARGET_LONG_BITS == 32) {
 ra = TCG_REG_O3;
@@ -958,7 +959,7 @@ static void build_trampolines(TCGContext *s)
 while ((uintptr_t)s->code_ptr & 15) {
 tcg_out_nop(s);
 }
-qemu_st_trampoline[i] = s->code_ptr;
+qemu_st_trampoline[i] = tcg_splitwx_to_rx(s->code_ptr);
 
 if (SPARC64) {
 emit_extend(s, TCG_REG_O2, i);
@@ -1038,7 +1039,8 @@ static void tcg_target_qemu_prologue(TCGContext *s)
 tcg_out_nop(s);
 
 /* Epilogue for goto_ptr.  */
-tcg_code_gen_epilogue = s->code_ptr;
+/* TODO: Cast goes away when all hosts converted */
+tcg_code_gen_epilogue = (void *)tcg_splitwx_to_rx(s->code_ptr);
 tcg_out_arithi(s, TCG_REG_G0, TCG_REG_I7, 8, RETURN);
 /* delay slot */
 tcg_out_movi_imm13(s, TCG_REG_O0, 0);
@@ -1163,7 +1165,7 @@ static void tcg_out_qemu_ld(TCGContext *s, TCGReg data, TCGReg addr,
 #ifdef CONFIG_SOFTMMU
 unsigned memi = get_mmuidx(oi);
 TCGReg addrz, param;
-tcg_insn_unit *func;
+const tcg_insn_unit *func;
 tcg_insn_unit *label_ptr;
 
 addrz = tcg_out_tlb_load(s, addr, memi, memop,
@@ -1245,7 +1247,7 @@ static void tcg_out_qemu_st(TCGContext *s, TCGReg data, TCGReg addr,
 #ifdef CONFIG_SOFTMMU
 unsigned memi = get_mmuidx(oi);
 TCGReg addrz, param;
-tcg_insn_unit *func;
+const tcg_insn_unit *func;
 tcg_insn_unit *label_ptr;
 
 addrz = tcg_out_tlb_load(s, addr, memi, memop,
-- 
2.25.1




[PATCH v3 23/41] tcg/tci: Push const down through bytecode reading

2020-11-05 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 tcg/tci.c | 60 +++
 1 file changed, 34 insertions(+), 26 deletions(-)

diff --git a/tcg/tci.c b/tcg/tci.c
index 262a2b39ce..388c1dbee8 100644
--- a/tcg/tci.c
+++ b/tcg/tci.c
@@ -163,34 +163,34 @@ static uint64_t tci_uint64(uint32_t high, uint32_t low)
 #endif
 
 /* Read constant (native size) from bytecode. */
-static tcg_target_ulong tci_read_i(uint8_t **tb_ptr)
+static tcg_target_ulong tci_read_i(const uint8_t **tb_ptr)
 {
-tcg_target_ulong value = *(tcg_target_ulong *)(*tb_ptr);
+tcg_target_ulong value = *(const tcg_target_ulong *)(*tb_ptr);
 *tb_ptr += sizeof(value);
 return value;
 }
 
 /* Read unsigned constant (32 bit) from bytecode. */
-static uint32_t tci_read_i32(uint8_t **tb_ptr)
+static uint32_t tci_read_i32(const uint8_t **tb_ptr)
 {
-uint32_t value = *(uint32_t *)(*tb_ptr);
+uint32_t value = *(const uint32_t *)(*tb_ptr);
 *tb_ptr += sizeof(value);
 return value;
 }
 
 /* Read signed constant (32 bit) from bytecode. */
-static int32_t tci_read_s32(uint8_t **tb_ptr)
+static int32_t tci_read_s32(const uint8_t **tb_ptr)
 {
-int32_t value = *(int32_t *)(*tb_ptr);
+int32_t value = *(const int32_t *)(*tb_ptr);
 *tb_ptr += sizeof(value);
 return value;
 }
 
 #if TCG_TARGET_REG_BITS == 64
 /* Read constant (64 bit) from bytecode. */
-static uint64_t tci_read_i64(uint8_t **tb_ptr)
+static uint64_t tci_read_i64(const uint8_t **tb_ptr)
 {
-uint64_t value = *(uint64_t *)(*tb_ptr);
+uint64_t value = *(const uint64_t *)(*tb_ptr);
 *tb_ptr += sizeof(value);
 return value;
 }
@@ -198,7 +198,7 @@ static uint64_t tci_read_i64(uint8_t **tb_ptr)
 
 /* Read indexed register (native size) from bytecode. */
 static tcg_target_ulong
-tci_read_r(const tcg_target_ulong *regs, uint8_t **tb_ptr)
+tci_read_r(const tcg_target_ulong *regs, const uint8_t **tb_ptr)
 {
 tcg_target_ulong value = tci_read_reg(regs, **tb_ptr);
 *tb_ptr += 1;
@@ -206,7 +206,7 @@ tci_read_r(const tcg_target_ulong *regs, uint8_t **tb_ptr)
 }
 
 /* Read indexed register (8 bit) from bytecode. */
-static uint8_t tci_read_r8(const tcg_target_ulong *regs, uint8_t **tb_ptr)
+static uint8_t tci_read_r8(const tcg_target_ulong *regs, const uint8_t **tb_ptr)
 {
 uint8_t value = tci_read_reg8(regs, **tb_ptr);
 *tb_ptr += 1;
@@ -215,7 +215,7 @@ static uint8_t tci_read_r8(const tcg_target_ulong *regs, uint8_t **tb_ptr)
 
 #if TCG_TARGET_HAS_ext8s_i32 || TCG_TARGET_HAS_ext8s_i64
 /* Read indexed register (8 bit signed) from bytecode. */
-static int8_t tci_read_r8s(const tcg_target_ulong *regs, uint8_t **tb_ptr)
+static int8_t tci_read_r8s(const tcg_target_ulong *regs, const uint8_t **tb_ptr)
 {
 int8_t value = tci_read_reg8s(regs, **tb_ptr);
 *tb_ptr += 1;
@@ -224,7 +224,8 @@ static int8_t tci_read_r8s(const tcg_target_ulong *regs, uint8_t **tb_ptr)
 #endif
 
 /* Read indexed register (16 bit) from bytecode. */
-static uint16_t tci_read_r16(const tcg_target_ulong *regs, uint8_t **tb_ptr)
+static uint16_t tci_read_r16(const tcg_target_ulong *regs,
+ const uint8_t **tb_ptr)
 {
 uint16_t value = tci_read_reg16(regs, **tb_ptr);
 *tb_ptr += 1;
@@ -233,7 +234,8 @@ static uint16_t tci_read_r16(const tcg_target_ulong *regs, uint8_t **tb_ptr)
 
 #if TCG_TARGET_HAS_ext16s_i32 || TCG_TARGET_HAS_ext16s_i64
 /* Read indexed register (16 bit signed) from bytecode. */
-static int16_t tci_read_r16s(const tcg_target_ulong *regs, uint8_t **tb_ptr)
+static int16_t tci_read_r16s(const tcg_target_ulong *regs,
+ const uint8_t **tb_ptr)
 {
 int16_t value = tci_read_reg16s(regs, **tb_ptr);
 *tb_ptr += 1;
@@ -242,7 +244,8 @@ static int16_t tci_read_r16s(const tcg_target_ulong *regs, uint8_t **tb_ptr)
 #endif
 
 /* Read indexed register (32 bit) from bytecode. */
-static uint32_t tci_read_r32(const tcg_target_ulong *regs, uint8_t **tb_ptr)
+static uint32_t tci_read_r32(const tcg_target_ulong *regs,
+ const uint8_t **tb_ptr)
 {
 uint32_t value = tci_read_reg32(regs, **tb_ptr);
 *tb_ptr += 1;
@@ -251,14 +254,16 @@ static uint32_t tci_read_r32(const tcg_target_ulong *regs, uint8_t **tb_ptr)
 
 #if TCG_TARGET_REG_BITS == 32
 /* Read two indexed registers (2 * 32 bit) from bytecode. */
-static uint64_t tci_read_r64(const tcg_target_ulong *regs, uint8_t **tb_ptr)
+static uint64_t tci_read_r64(const tcg_target_ulong *regs,
+ const uint8_t **tb_ptr)
 {
 uint32_t low = tci_read_r32(regs, tb_ptr);
 return tci_uint64(tci_read_r32(regs, tb_ptr), low);
 }
 #elif TCG_TARGET_REG_BITS == 64
 /* Read indexed register (32 bit signed) from bytecode. */
-static int32_t tci_read_r32s(const tcg_target_ulong *regs, uint8_t **tb_ptr)
+static int32_t tci_read_r32s(const tcg_target_ulong *regs,
+ const uint8_t **tb_ptr)
 {
 int32_t value = tci_read_reg32s(re

[PATCH v3 22/41] disas: Push const down through host disassembly

2020-11-05 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 include/disas/dis-asm.h | 4 ++--
 disas.c | 4 +---
 disas/capstone.c| 2 +-
 3 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/include/disas/dis-asm.h b/include/disas/dis-asm.h
index 2164762b46..d1133a4e04 100644
--- a/include/disas/dis-asm.h
+++ b/include/disas/dis-asm.h
@@ -358,7 +358,7 @@ typedef struct disassemble_info {
 (bfd_vma addr, struct disassemble_info * info);
 
   /* These are for buffer_read_memory.  */
-  bfd_byte *buffer;
+  const bfd_byte *buffer;
   bfd_vma buffer_vma;
   int buffer_length;
 
@@ -462,7 +462,7 @@ int print_insn_rx(bfd_vma, disassemble_info *);
 
 #ifdef CONFIG_CAPSTONE
 bool cap_disas_target(disassemble_info *info, uint64_t pc, size_t size);
-bool cap_disas_host(disassemble_info *info, void *code, size_t size);
+bool cap_disas_host(disassemble_info *info, const void *code, size_t size);
 bool cap_disas_monitor(disassemble_info *info, uint64_t pc, int count);
 bool cap_disas_plugin(disassemble_info *info, uint64_t pc, size_t size);
 #else
diff --git a/disas.c b/disas.c
index de1de7be94..a61f95b580 100644
--- a/disas.c
+++ b/disas.c
@@ -299,10 +299,8 @@ char *plugin_disas(CPUState *cpu, uint64_t addr, size_t size)
 }
 
 /* Disassemble this for me please... (debugging). */
-void disas(FILE *out, const void *ccode, unsigned long size)
+void disas(FILE *out, const void *code, unsigned long size)
 {
-/* TODO: Push constness through the disas backends. */
-void *code = (void *)ccode;
 uintptr_t pc;
 int count;
 CPUDebug s;
diff --git a/disas/capstone.c b/disas/capstone.c
index 7462c0e305..20bc8f9669 100644
--- a/disas/capstone.c
+++ b/disas/capstone.c
@@ -229,7 +229,7 @@ bool cap_disas_target(disassemble_info *info, uint64_t pc, size_t size)
 }
 
 /* Disassemble SIZE bytes at CODE for the host.  */
-bool cap_disas_host(disassemble_info *info, void *code, size_t size)
+bool cap_disas_host(disassemble_info *info, const void *code, size_t size)
 {
 csh handle;
 const uint8_t *cbuf;
-- 
2.25.1




[PATCH v3 32/41] tcg/riscv: Fix branch range checks

2020-11-05 Thread Richard Henderson
The checks that the offset is even were folded into the range checks
incorrectly.  By extracting from bit 1 without decrementing the field
width, we silently allowed out-of-range branches.

Assert that the offset is always even instead.  Move tcg_out_goto
down into the CONFIG_SOFTMMU block so that it is not unused.
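
To make the difference between the two predicates concrete, here is a
minimal stand-alone sketch.  The helper sext_field() is a hypothetical
stand-in for sextreg(), assumed to sign-extract LEN bits starting at bit
POS; it is not the QEMU helper itself, and old_check()/new_check() are
illustrative names only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for sextreg(): sign-extract LEN bits of VALUE
 * starting at bit POS.  Assumes len > 0, pos + len <= 64, and an
 * arithmetic right shift of negative values (true of common compilers).
 */
static int64_t sext_field(int64_t value, int pos, int len)
{
    uint64_t shifted = (uint64_t)value << (64 - len - pos);
    return (int64_t)shifted >> (64 - len);
}

/* Old form: evenness and range folded into a single comparison. */
static bool old_check(int64_t offset, int len)
{
    return offset == sext_field(offset, 1, len) * 2;
}

/* New form: evenness is asserted, and the range test starts at bit 0. */
static bool new_check(int64_t offset, int len)
{
    assert((offset & 1) == 0);
    return offset == sext_field(offset, 0, len);
}
```

Under these assumptions and a 12-bit width, old_check() accepts 4094,
while new_check() rejects it and accepts values from -2048 to 2046.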

Signed-off-by: Richard Henderson 
---
 tcg/riscv/tcg-target.c.inc | 28 +++-
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
index 025e3cd0bb..195c3eff03 100644
--- a/tcg/riscv/tcg-target.c.inc
+++ b/tcg/riscv/tcg-target.c.inc
@@ -429,7 +429,8 @@ static bool reloc_sbimm12(tcg_insn_unit *code_ptr, tcg_insn_unit *target)
 {
 intptr_t offset = (intptr_t)target - (intptr_t)code_ptr;
 
-if (offset == sextreg(offset, 1, 12) << 1) {
+tcg_debug_assert((offset & 1) == 0);
+if (offset == sextreg(offset, 0, 12)) {
 code_ptr[0] |= encode_sbimm12(offset);
 return true;
 }
@@ -441,7 +442,8 @@ static bool reloc_jimm20(tcg_insn_unit *code_ptr, tcg_insn_unit *target)
 {
 intptr_t offset = (intptr_t)target - (intptr_t)code_ptr;
 
-if (offset == sextreg(offset, 1, 20) << 1) {
+tcg_debug_assert((offset & 1) == 0);
+if (offset == sextreg(offset, 0, 20)) {
 code_ptr[0] |= encode_ujimm20(offset);
 return true;
 }
@@ -854,28 +856,21 @@ static void tcg_out_setcond2(TCGContext *s, TCGCond cond, TCGReg ret,
 g_assert_not_reached();
 }
 
-static inline void tcg_out_goto(TCGContext *s, tcg_insn_unit *target)
-{
-ptrdiff_t offset = tcg_pcrel_diff(s, target);
-tcg_debug_assert(offset == sextreg(offset, 1, 20) << 1);
-tcg_out_opc_jump(s, OPC_JAL, TCG_REG_ZERO, offset);
-}
-
 static void tcg_out_call_int(TCGContext *s, const tcg_insn_unit *arg, bool tail)
 {
 TCGReg link = tail ? TCG_REG_ZERO : TCG_REG_RA;
 ptrdiff_t offset = tcg_pcrel_diff(s, arg);
 int ret;
 
-if (offset == sextreg(offset, 1, 20) << 1) {
+tcg_debug_assert((offset & 1) == 0);
+if (offset == sextreg(offset, 0, 20)) {
 /* short jump: -2097150 to 2097152 */
 tcg_out_opc_jump(s, OPC_JAL, link, offset);
-} else if (TCG_TARGET_REG_BITS == 32 ||
-offset == sextreg(offset, 1, 31) << 1) {
+} else if (TCG_TARGET_REG_BITS == 32 || offset == (int32_t)offset) {
 /* long jump: -2147483646 to 2147483648 */
 tcg_out_opc_upper(s, OPC_AUIPC, TCG_REG_TMP0, 0);
 tcg_out_opc_imm(s, OPC_JALR, link, TCG_REG_TMP0, 0);
-ret = reloc_call(s->code_ptr - 2, arg);\
+ret = reloc_call(s->code_ptr - 2, arg);
 tcg_debug_assert(ret == true);
 } else if (TCG_TARGET_REG_BITS == 64) {
 /* far jump: 64-bit */
@@ -962,6 +957,13 @@ QEMU_BUILD_BUG_ON(TCG_TARGET_REG_BITS < TARGET_LONG_BITS);
 QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
 QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -(1 << 11));
 
+static void tcg_out_goto(TCGContext *s, tcg_insn_unit *target)
+{
+tcg_out_opc_jump(s, OPC_JAL, TCG_REG_ZERO, 0);
+bool ok = reloc_jimm20(s->code_ptr - 1, target);
+tcg_debug_assert(ok);
+}
+
 static void tcg_out_tlb_load(TCGContext *s, TCGReg addrl,
  TCGReg addrh, TCGMemOpIdx oi,
  tcg_insn_unit **label_ptr, bool is_load)
-- 
2.25.1




[PATCH v3 21/41] tcg/aarch64: Support split-wx code generation

2020-11-05 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 tcg/aarch64/tcg-target.h |  2 +-
 tcg/aarch64/tcg-target.c.inc | 57 
 2 files changed, 33 insertions(+), 26 deletions(-)

diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h
index e62d38ba55..abb94f9458 100644
--- a/tcg/aarch64/tcg-target.h
+++ b/tcg/aarch64/tcg-target.h
@@ -155,6 +155,6 @@ void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t, uintptr_t);
 #define TCG_TARGET_NEED_LDST_LABELS
 #endif
 #define TCG_TARGET_NEED_POOL_LABELS
-#define TCG_TARGET_SUPPORT_MIRROR   0
+#define TCG_TARGET_SUPPORT_MIRROR   1
 
 #endif /* AARCH64_TCG_TARGET_H */
diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
index 8aa1fafd91..e398913c0c 100644
--- a/tcg/aarch64/tcg-target.c.inc
+++ b/tcg/aarch64/tcg-target.c.inc
@@ -78,38 +78,42 @@ static const int tcg_target_call_oarg_regs[1] = {
 #define TCG_REG_GUEST_BASE TCG_REG_X28
 #endif
 
-static inline bool reloc_pc26(tcg_insn_unit *code_ptr, tcg_insn_unit *target)
+static bool reloc_pc26(tcg_insn_unit *src_rw, const tcg_insn_unit *target)
 {
-ptrdiff_t offset = target - code_ptr;
+const tcg_insn_unit *src_rx = tcg_splitwx_to_rx(src_rw);
+ptrdiff_t offset = target - src_rx;
+
 if (offset == sextract64(offset, 0, 26)) {
 /* read instruction, mask away previous PC_REL26 parameter contents,
set the proper offset, then write back the instruction. */
-*code_ptr = deposit32(*code_ptr, 0, 26, offset);
+*src_rw = deposit32(*src_rw, 0, 26, offset);
 return true;
 }
 return false;
 }
 
-static inline bool reloc_pc19(tcg_insn_unit *code_ptr, tcg_insn_unit *target)
+static bool reloc_pc19(tcg_insn_unit *src_rw, const tcg_insn_unit *target)
 {
-ptrdiff_t offset = target - code_ptr;
+const tcg_insn_unit *src_rx = tcg_splitwx_to_rx(src_rw);
+ptrdiff_t offset = target - src_rx;
+
 if (offset == sextract64(offset, 0, 19)) {
-*code_ptr = deposit32(*code_ptr, 5, 19, offset);
+*src_rw = deposit32(*src_rw, 5, 19, offset);
 return true;
 }
 return false;
 }
 
-static inline bool patch_reloc(tcg_insn_unit *code_ptr, int type,
-   intptr_t value, intptr_t addend)
+static bool patch_reloc(tcg_insn_unit *code_ptr, int type,
+intptr_t value, intptr_t addend)
 {
 tcg_debug_assert(addend == 0);
 switch (type) {
 case R_AARCH64_JUMP26:
 case R_AARCH64_CALL26:
-return reloc_pc26(code_ptr, (tcg_insn_unit *)value);
+return reloc_pc26(code_ptr, (const tcg_insn_unit *)value);
 case R_AARCH64_CONDBR19:
-return reloc_pc19(code_ptr, (tcg_insn_unit *)value);
+return reloc_pc19(code_ptr, (const tcg_insn_unit *)value);
 default:
 g_assert_not_reached();
 }
@@ -1050,12 +1054,13 @@ static void tcg_out_movi(TCGContext *s, TCGType type, TCGReg rd,
 /* Look for host pointer values within 4G of the PC.  This happens
often when loading pointers to QEMU's own data structures.  */
 if (type == TCG_TYPE_I64) {
-tcg_target_long disp = value - (intptr_t)s->code_ptr;
+intptr_t src_rx = (intptr_t)tcg_splitwx_to_rx(s->code_ptr);
+tcg_target_long disp = value - src_rx;
 if (disp == sextract64(disp, 0, 21)) {
 tcg_out_insn(s, 3406, ADR, rd, disp);
 return;
 }
-disp = (value >> 12) - ((intptr_t)s->code_ptr >> 12);
+disp = (value >> 12) - (src_rx >> 12);
 if (disp == sextract64(disp, 0, 21)) {
 tcg_out_insn(s, 3406, ADRP, rd, disp);
 if (value & 0xfff) {
@@ -1308,14 +1313,14 @@ static void tcg_out_cmp(TCGContext *s, TCGType ext, TCGReg a,
 
 static void tcg_out_goto(TCGContext *s, const tcg_insn_unit *target)
 {
-ptrdiff_t offset = target - s->code_ptr;
+ptrdiff_t offset = tcg_pcrel_diff(s, target) >> 2;
 tcg_debug_assert(offset == sextract64(offset, 0, 26));
 tcg_out_insn(s, 3206, B, offset);
 }
 
-static inline void tcg_out_goto_long(TCGContext *s, tcg_insn_unit *target)
+static void tcg_out_goto_long(TCGContext *s, const tcg_insn_unit *target)
 {
-ptrdiff_t offset = target - s->code_ptr;
+ptrdiff_t offset = tcg_pcrel_diff(s, target) >> 2;
 if (offset == sextract64(offset, 0, 26)) {
 tcg_out_insn(s, 3206, B, offset);
 } else {
@@ -1329,9 +1334,9 @@ static inline void tcg_out_callr(TCGContext *s, TCGReg reg)
 tcg_out_insn(s, 3207, BLR, reg);
 }
 
-static inline void tcg_out_call(TCGContext *s, const tcg_insn_unit *target)
+static void tcg_out_call(TCGContext *s, const tcg_insn_unit *target)
 {
-ptrdiff_t offset = target - s->code_ptr;
+ptrdiff_t offset = tcg_pcrel_diff(s, target) >> 2;
 if (offset == sextract64(offset, 0, 26)) {
 tcg_out_insn(s, 3206, BL, offset);
 } else {
@@ -1393,7 +1398,7 @@ static void tcg_out_brcond(TCGContext *s, TCGType ext, TCGCond c, TCGArg a,

[PATCH v3 27/41] tcg/ppc: Support split-wx code generation

2020-11-05 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 tcg/ppc/tcg-target.h |  2 +-
 tcg/ppc/tcg-target.c.inc | 53 +++-
 2 files changed, 31 insertions(+), 24 deletions(-)

diff --git a/tcg/ppc/tcg-target.h b/tcg/ppc/tcg-target.h
index 78d6a5e96f..a8628b6cad 100644
--- a/tcg/ppc/tcg-target.h
+++ b/tcg/ppc/tcg-target.h
@@ -185,6 +185,6 @@ void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t, uintptr_t);
 #define TCG_TARGET_NEED_LDST_LABELS
 #endif
 #define TCG_TARGET_NEED_POOL_LABELS
-#define TCG_TARGET_SUPPORT_MIRROR   0
+#define TCG_TARGET_SUPPORT_MIRROR   1
 
 #endif
diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
index 91d5d95ddf..fe33687787 100644
--- a/tcg/ppc/tcg-target.c.inc
+++ b/tcg/ppc/tcg-target.c.inc
@@ -62,8 +62,6 @@
 #define TCG_CT_CONST_MONE 0x2000
 #define TCG_CT_CONST_WSZ  0x4000
 
-static tcg_insn_unit *tb_ret_addr;
-
 TCGPowerISA have_isa;
 static bool have_isel;
 bool have_altivec;
@@ -184,35 +182,41 @@ static inline bool in_range_b(tcg_target_long target)
 return target == sextract64(target, 0, 26);
 }
 
-static uint32_t reloc_pc24_val(tcg_insn_unit *pc, const tcg_insn_unit *target)
+static uint32_t reloc_pc24_val(const tcg_insn_unit *pc,
+  const tcg_insn_unit *target)
 {
 ptrdiff_t disp = tcg_ptr_byte_diff(target, pc);
 tcg_debug_assert(in_range_b(disp));
 return disp & 0x3fc;
 }
 
-static bool reloc_pc24(tcg_insn_unit *pc, tcg_insn_unit *target)
+static bool reloc_pc24(tcg_insn_unit *src_rw, const tcg_insn_unit *target)
 {
-ptrdiff_t disp = tcg_ptr_byte_diff(target, pc);
+const tcg_insn_unit *src_rx = tcg_splitwx_to_rx(src_rw);
+ptrdiff_t disp = tcg_ptr_byte_diff(target, src_rx);
+
 if (in_range_b(disp)) {
-*pc = (*pc & ~0x3fc) | (disp & 0x3fc);
+*src_rw = (*src_rw & ~0x3fc) | (disp & 0x3fc);
 return true;
 }
 return false;
 }
 
-static uint16_t reloc_pc14_val(tcg_insn_unit *pc, const tcg_insn_unit *target)
+static uint16_t reloc_pc14_val(const tcg_insn_unit *pc,
+  const tcg_insn_unit *target)
 {
 ptrdiff_t disp = tcg_ptr_byte_diff(target, pc);
 tcg_debug_assert(disp == (int16_t) disp);
 return disp & 0xfffc;
 }
 
-static bool reloc_pc14(tcg_insn_unit *pc, tcg_insn_unit *target)
+static bool reloc_pc14(tcg_insn_unit *src_rw, const tcg_insn_unit *target)
 {
-ptrdiff_t disp = tcg_ptr_byte_diff(target, pc);
+const tcg_insn_unit *src_rx = tcg_splitwx_to_rx(src_rw);
+ptrdiff_t disp = tcg_ptr_byte_diff(target, src_rx);
+
 if (disp == (int16_t) disp) {
-*pc = (*pc & ~0xfffc) | (disp & 0xfffc);
+*src_rw = (*src_rw & ~0xfffc) | (disp & 0xfffc);
 return true;
 }
 return false;
@@ -673,12 +677,12 @@ static const uint32_t tcg_to_isel[] = {
 static bool patch_reloc(tcg_insn_unit *code_ptr, int type,
 intptr_t value, intptr_t addend)
 {
-tcg_insn_unit *target;
+const tcg_insn_unit *target;
 int16_t lo;
 int32_t hi;
 
 value += addend;
-target = (tcg_insn_unit *)value;
+target = (const tcg_insn_unit *)value;
 
 switch (type) {
 case R_PPC_REL14:
@@ -1544,7 +1548,7 @@ static void tcg_out_setcond(TCGContext *s, TCGType type, TCGCond cond,
 static void tcg_out_bc(TCGContext *s, int bc, TCGLabel *l)
 {
 if (l->has_value) {
-bc |= reloc_pc14_val(s->code_ptr, l->u.value_ptr);
+bc |= reloc_pc14_val(tcg_splitwx_to_rx(s->code_ptr), l->u.value_ptr);
 } else {
 tcg_out_reloc(s, s->code_ptr, R_PPC_REL14, l, 0);
 }
@@ -1997,7 +2001,8 @@ static void add_qemu_ldst_label(TCGContext *s, bool is_ld, TCGMemOpIdx oi,
 label->datahi_reg = datahi_reg;
 label->addrlo_reg = addrlo_reg;
 label->addrhi_reg = addrhi_reg;
-label->raddr = raddr;
+/* TODO: Cast goes away when all hosts converted */
+label->raddr = (void *)tcg_splitwx_to_rx(raddr);
 label->label_ptr[0] = lptr;
 }
 
@@ -2007,7 +2012,7 @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
 MemOp opc = get_memop(oi);
 TCGReg hi, lo, arg = TCG_REG_R3;
 
-if (!reloc_pc14(lb->label_ptr[0], s->code_ptr)) {
+if (!reloc_pc14(lb->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
 return false;
 }
 
@@ -2055,7 +2060,7 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
 MemOp s_bits = opc & MO_SIZE;
 TCGReg hi, lo, arg = TCG_REG_R3;
 
-if (!reloc_pc14(lb->label_ptr[0], s->code_ptr)) {
+if (!reloc_pc14(lb->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
 return false;
 }
 
@@ -2306,10 +2311,10 @@ static void tcg_target_qemu_prologue(TCGContext *s)
 int i;
 
 #ifdef _CALL_AIX
-void **desc = (void **)s->code_ptr;
-desc[0] = desc + 2;   /* entry point */
-desc[1] = 0;  /* environment pointer */
-s->code_ptr = (void *)(desc + 2);

[PATCH v3 20/41] tcg/aarch64: Implement flush_idcache_range manually

2020-11-05 Thread Richard Henderson
Copy the single-pointer implementation from libgcc and modify it to
support the double-pointer (rx, rw) interface we require.  This halves
the number of cache operations required when split-wx is enabled.
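
The address arithmetic used by the new loops can be sketched on its own.
This is only an illustration: field32(), line_size() and lines_touched()
are hypothetical names, field32() stands in for extract32(), and any
CTR_EL0 value passed in is made up rather than read from hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for extract32(): LEN bits of VALUE from bit START. */
static uint32_t field32(uint32_t value, int start, int len)
{
    assert(start >= 0 && len > 0 && len < 32 && start + len <= 32);
    return (value >> start) & ((1u << len) - 1);
}

/* Line size in bytes: the 4-bit field holds log2 of the size in 4-byte words. */
static uintptr_t line_size(uint32_t ctr, int start)
{
    return (uintptr_t)4 << field32(ctr, start, 4);
}

/*
 * Count the cache lines visited by a loop of the form
 *   for (p = addr & -lsize; p < addr + len; p += lsize)
 * which rounds the start address down to a line boundary.
 */
static size_t lines_touched(uintptr_t addr, size_t len, uintptr_t lsize)
{
    size_t n = 0;
    for (uintptr_t p = addr & -lsize; p < addr + len; p += lsize) {
        n++;
    }
    return n;
}
```

With separate rx and rw arguments, the dcache loop only has to walk the
rw range and the icache loop only the rx range, instead of running both
loops over both ranges as two __builtin___clear_cache calls would.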

Signed-off-by: Richard Henderson 
---
 tcg/aarch64/tcg-target.h | 11 +--
 tcg/aarch64/tcg-target.c.inc | 64 
 2 files changed, 65 insertions(+), 10 deletions(-)

diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h
index fa64058d43..e62d38ba55 100644
--- a/tcg/aarch64/tcg-target.h
+++ b/tcg/aarch64/tcg-target.h
@@ -148,16 +148,7 @@ typedef enum {
 #define TCG_TARGET_DEFAULT_MO (0)
 #define TCG_TARGET_HAS_MEMORY_BSWAP 1
 
-/* Flush the dcache at RW, and the icache at RX, as necessary. */
-static inline void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len)
-{
-/* TODO: Copy this from gcc to avoid 4 loops instead of 2. */
-if (rw != rx) {
-__builtin___clear_cache((char *)rw, (char *)(rw + len));
-}
-__builtin___clear_cache((char *)rx, (char *)(rx + len));
-}
-
+void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len);
 void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t, uintptr_t);
 
 #ifdef CONFIG_SOFTMMU
diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
index bd888bc66d..8aa1fafd91 100644
--- a/tcg/aarch64/tcg-target.c.inc
+++ b/tcg/aarch64/tcg-target.c.inc
@@ -2968,3 +2968,67 @@ void tcg_register_jit(const void *buf, size_t buf_size)
 {
 tcg_register_jit_int(buf, buf_size, &debug_frame, sizeof(debug_frame));
 }
+
+/* Flush the dcache at RW, and the icache at RX, as necessary. */
+#ifdef CONFIG_DARWIN
+/* Apple does not expose CTR_EL0, so we must use system interfaces. */
+extern void sys_icache_invalidate(void *start, size_t len);
+extern void sys_dcache_flush(void *start, size_t len);
+void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len)
+{
+sys_dcache_flush((void *)rw, len);
+sys_icache_invalidate((void *)rx, len);
+}
+#else
+/*
+ * This is a copy of gcc's __aarch64_sync_cache_range, modified
+ * to fit this three-operand interface.
+ */
+void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len)
+{
+const unsigned CTR_IDC = 1u << 28;
+const unsigned CTR_DIC = 1u << 29;
+static unsigned int cache_info;
+uintptr_t icache_lsize, dcache_lsize, p;
+
+if (!cache_info) {
+/*
+ * CTR_EL0 [3:0] contains log2 of icache line size in words.
+ * CTR_EL0 [19:16] contains log2 of dcache line size in words.
+ */
+asm volatile("mrs\t%0, ctr_el0" : "=r"(cache_info));
+}
+
+icache_lsize = 4 << extract32(cache_info, 0, 4);
+dcache_lsize = 4 << extract32(cache_info, 16, 4);
+
+/*
+ * If CTR_EL0.IDC is enabled, Data cache clean to the Point of Unification
+ * is not required for instruction to data coherence.
+ */
+if (!(cache_info & CTR_IDC)) {
+/*
+ * Loop over the address range, clearing one cache line at once.
+ * Data cache must be flushed to unification first to make sure
+ * the instruction cache fetches the updated data.
+ */
+for (p = rw & -dcache_lsize; p < rw + len; p += dcache_lsize) {
+asm volatile("dc\tcvau, %0" : : "r" (p) : "memory");
+}
+asm volatile("dsb\tish" : : : "memory");
+}
+
+/*
+ * If CTR_EL0.DIC is enabled, Instruction cache cleaning to the Point
+ * of Unification is not required for instruction to data coherence.
+ */
+if (!(cache_info & CTR_DIC)) {
+for (p = rx & -icache_lsize; p < rx + len; p += icache_lsize) {
+asm volatile("ic\tivau, %0" : : "r"(p) : "memory");
+}
+asm volatile ("dsb\tish" : : : "memory");
+}
+
+asm volatile("isb" : : : "memory");
+}
+#endif /* CONFIG_DARWIN */
-- 
2.25.1




[PATCH v3 19/41] tcg/aarch64: Use B not BL for tcg_out_goto_long

2020-11-05 Thread Richard Henderson
A typo generated a branch-and-link insn instead of plain branch.

Signed-off-by: Richard Henderson 
---
 tcg/aarch64/tcg-target.c.inc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
index fea784cf75..bd888bc66d 100644
--- a/tcg/aarch64/tcg-target.c.inc
+++ b/tcg/aarch64/tcg-target.c.inc
@@ -1317,7 +1317,7 @@ static inline void tcg_out_goto_long(TCGContext *s, tcg_insn_unit *target)
 {
 ptrdiff_t offset = target - s->code_ptr;
 if (offset == sextract64(offset, 0, 26)) {
-tcg_out_insn(s, 3206, BL, offset);
+tcg_out_insn(s, 3206, B, offset);
 } else {
 tcg_out_movi(s, TCG_TYPE_I64, TCG_REG_TMP, (intptr_t)target);
 tcg_out_insn(s, 3207, BR, TCG_REG_TMP);
-- 
2.25.1




[PATCH v3 26/41] tcg/ppc: Use tcg_out_mem_long to reset TCG_REG_TB

2020-11-05 Thread Richard Henderson
The maximum TB code gen size is UINT16_MAX, which the current
code does not support.  Use our utility function to optimally
add an arbitrary constant.
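
As a rough illustration of why the removed assert was too strict, the
sketch below shows that a constant of -UINT16_MAX does not fit in a
signed 16-bit immediate, and how such a constant can be split into a
high part and a signed low part.  The names fits_int16() and
split_hi_lo() are hypothetical, and whether tcg_out_mem_long performs
exactly this split is an assumption; this is the general technique only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if C can be encoded as a signed 16-bit immediate. */
static bool fits_int16(int32_t c)
{
    return c >= INT16_MIN && c <= INT16_MAX;
}

/* Split C so that c == hi * 65536 + lo, with lo a signed 16-bit value. */
static void split_hi_lo(int32_t c, int32_t *hi, int16_t *lo)
{
    /* Sign-extend the low 16 bits by hand (assumes two's complement). */
    int32_t low = c & 0xffff;
    if (low >= 0x8000) {
        low -= 0x10000;
    }
    *lo = (int16_t)low;
    /* Widen before subtracting so that c - low cannot overflow. */
    *hi = (int32_t)(((int64_t)c - low) / 65536);
}
```

For c = -65535 this yields hi = -1 and lo = 1, so the constant needs two
parts even though each part fits a 16-bit immediate field.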

Signed-off-by: Richard Henderson 
---
 tcg/ppc/tcg-target.c.inc | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
index dc90705d02..91d5d95ddf 100644
--- a/tcg/ppc/tcg-target.c.inc
+++ b/tcg/ppc/tcg-target.c.inc
@@ -2392,9 +2392,8 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args,
 set_jmp_reset_offset(s, args[0]);
 if (USE_REG_TB) {
 /* For the unlinked case, need to reset TCG_REG_TB.  */
-c = -tcg_current_code_size(s);
-assert(c == (int16_t)c);
-tcg_out32(s, ADDI | TAI(TCG_REG_TB, TCG_REG_TB, c));
+tcg_out_mem_long(s, ADDI, ADD, TCG_REG_TB, TCG_REG_TB,
+ -tcg_current_code_size(s));
 }
 break;
 case INDEX_op_goto_ptr:
-- 
2.25.1




[PATCH v3 25/41] tcg/ppc: Use tcg_tbrel_diff

2020-11-05 Thread Richard Henderson
Use tcg_tbrel_diff when we need a displacement to a label,
and with a NULL argument when we need the normalizing addend.
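
The two uses can be modelled with a small sketch.  The definition of
tcg_tbrel_diff is not part of this message, so tbrel_diff_model() below
is a hypothetical model inferred from the expressions being replaced
(arg - base, and -base), not the real helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the start of a translation block's code. */
static char demo_tb[64];

/*
 * Hypothetical model: byte displacement of TARGET from TB_START.
 * With a NULL target (address 0 on common platforms) the result is
 * -TB_START, the addend that turns an absolute address into a
 * TB-relative displacement.
 */
static intptr_t tbrel_diff_model(const void *tb_start, const void *target)
{
    return (intptr_t)target - (intptr_t)tb_start;
}
```

Adding the NULL-argument result to an absolute address gives the same
displacement as passing that address directly, which is why one helper
can serve both call sites.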

Signed-off-by: Richard Henderson 
---
 tcg/ppc/tcg-target.c.inc | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
index bc0057eedf..dc90705d02 100644
--- a/tcg/ppc/tcg-target.c.inc
+++ b/tcg/ppc/tcg-target.c.inc
@@ -837,7 +837,7 @@ static void tcg_out_movi_int(TCGContext *s, TCGType type, TCGReg ret,
 }
 
 /* Load addresses within the TB with one insn.  */
-tb_diff = arg - (intptr_t)s->code_gen_ptr;
+tb_diff = tcg_tbrel_diff(s, (void *)arg);
 if (!in_prologue && USE_REG_TB && tb_diff == (int16_t)tb_diff) {
 tcg_out32(s, ADDI | TAI(ret, TCG_REG_TB, tb_diff));
 return;
@@ -890,7 +890,7 @@ static void tcg_out_movi_int(TCGContext *s, TCGType type, TCGReg ret,
 /* Use the constant pool, if possible.  */
 if (!in_prologue && USE_REG_TB) {
 new_pool_label(s, arg, R_PPC_ADDR16, s->code_ptr,
-   -(intptr_t)s->code_gen_ptr);
+   tcg_tbrel_diff(s, NULL));
 tcg_out32(s, LD | TAI(ret, TCG_REG_TB, 0));
 return;
 }
@@ -940,7 +940,7 @@ static void tcg_out_dupi_vec(TCGContext *s, TCGType type, TCGReg ret,
  */
 if (USE_REG_TB) {
 rel = R_PPC_ADDR16;
-add = -(intptr_t)s->code_gen_ptr;
+add = tcg_tbrel_diff(s, NULL);
 } else {
 rel = R_PPC_ADDR32;
 add = 0;
-- 
2.25.1




[PATCH v3 16/41] accel/tcg: Support split-wx for darwin/iOS with vm_remap

2020-11-05 Thread Richard Henderson
Cribbed from code posted by Joelle van Dyne,
and rearranged to a cleaner structure.  Completely untested.
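
The allocator ends by recording tcg_splitwx_diff = buf_rx - buf_rw.  A
minimal sketch of how such a difference can translate between the two
views of the same buffer follows.  The function names are hypothetical
models, not the QEMU helpers, and the addresses used with them are plain
integers rather than real mappings.

```c
#include <assert.h>
#include <stdint.h>

/* Executable-view address for a writable-view address. */
static uintptr_t to_rx_model(uintptr_t rw, intptr_t diff)
{
    return rw + (uintptr_t)diff;
}

/* Writable-view address for an executable-view address. */
static uintptr_t to_rw_model(uintptr_t rx, intptr_t diff)
{
    return rx - (uintptr_t)diff;
}

/*
 * A pc-relative displacement must be measured from where the insn will
 * execute (the rx view), even though it is patched through the rw view.
 */
static intptr_t pcrel_disp_model(uintptr_t target_rx, uintptr_t src_rw,
                                 intptr_t diff)
{
    return (intptr_t)(target_rx - to_rx_model(src_rw, diff));
}
```

This mirrors the pattern in the backend patches of this series, where a
relocation computes its offset from the rx address and stores the result
through the rw pointer.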

Signed-off-by: Richard Henderson 
---
 accel/tcg/translate-all.c | 65 +++
 1 file changed, 65 insertions(+)

diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 1931e65365..17df6c94fa 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -1166,9 +1166,71 @@ static bool alloc_code_gen_buffer_splitwx_memfd(size_t size, Error **errp)
 }
 #endif /* CONFIG_POSIX */
 
+#ifdef CONFIG_DARWIN
+#include 
+
+extern kern_return_t mach_vm_remap(vm_map_t target_task,
+   mach_vm_address_t *target_address,
+   mach_vm_size_t size,
+   mach_vm_offset_t mask,
+   int flags,
+   vm_map_t src_task,
+   mach_vm_address_t src_address,
+   boolean_t copy,
+   vm_prot_t *cur_protection,
+   vm_prot_t *max_protection,
+   vm_inherit_t inheritance);
+
+static bool alloc_code_gen_buffer_splitwx_vmremap(size_t size, Error **errp)
+{
+kern_return_t ret;
+mach_vm_address_t buf_rw, buf_rx;
+vm_prot_t cur_prot, max_prot;
+
+/* Map the read-write portion via normal anon memory. */
+if (!alloc_code_gen_buffer_anon(size, PROT_READ | PROT_WRITE,
+MAP_PRIVATE | MAP_ANONYMOUS, errp)) {
+return false;
+}
+
+buf_rw = (mach_vm_address_t)tcg_ctx->code_gen_buffer;
+buf_rx = 0;
+ret = mach_vm_remap(mach_task_self(),
+&buf_rx,
+size,
+0,
+VM_FLAGS_ANYWHERE,
+mach_task_self(),
+buf_rw,
+false,
+&cur_prot,
+&max_prot,
+VM_INHERIT_NONE);
+if (ret != KERN_SUCCESS) {
+/* TODO: Convert "ret" to a human readable error message. */
+error_setg(errp, "vm_remap for jit splitwx failed");
+munmap((void *)buf_rw, size);
+return false;
+}
+
+if (mprotect((void *)buf_rx, size, PROT_READ | PROT_EXEC) != 0) {
+error_setg_errno(errp, errno, "mprotect for jit splitwx");
+munmap((void *)buf_rx, size);
+munmap((void *)buf_rw, size);
+return false;
+}
+
+tcg_splitwx_diff = buf_rx - buf_rw;
+return true;
+}
+#endif /* CONFIG_DARWIN */
+
 static bool alloc_code_gen_buffer_splitwx(size_t size, Error **errp)
 {
 if (TCG_TARGET_SUPPORT_MIRROR) {
+#ifdef CONFIG_DARWIN
+return alloc_code_gen_buffer_splitwx_vmremap(size, errp);
+#endif
 #ifdef CONFIG_POSIX
 return alloc_code_gen_buffer_splitwx_memfd(size, errp);
 #endif
@@ -1201,6 +1263,9 @@ static bool alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
 #ifdef CONFIG_TCG_INTERPRETER
 /* The tcg interpreter does not need execute permission. */
 prot = PROT_READ | PROT_WRITE;
+#elif defined(CONFIG_DARWIN)
+/* Applicable to both iOS and macOS (Apple Silicon). */
+flags |= MAP_JIT;
 #endif
 
 return alloc_code_gen_buffer_anon(size, prot, flags, errp);
-- 
2.25.1




[PATCH v3 18/41] tcg/i386: Support split-wx code generation

2020-11-05 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 tcg/i386/tcg-target.h |  2 +-
 tcg/i386/tcg-target.c.inc | 20 +++-
 2 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
index 1b9d41bd56..bbbd1c2d4a 100644
--- a/tcg/i386/tcg-target.h
+++ b/tcg/i386/tcg-target.h
@@ -236,6 +236,6 @@ static inline void tb_target_set_jmp_target(uintptr_t tc_ptr, uintptr_t jmp_rx,
 #define TCG_TARGET_NEED_LDST_LABELS
 #endif
 #define TCG_TARGET_NEED_POOL_LABELS
-#define TCG_TARGET_SUPPORT_MIRROR   0
+#define TCG_TARGET_SUPPORT_MIRROR   1
 
 #endif
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index 7f74c77d7f..23c7a8a383 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -165,7 +165,7 @@ static bool have_lzcnt;
 # define have_lzcnt 0
 #endif
 
-static tcg_insn_unit *tb_ret_addr;
+static const tcg_insn_unit *tb_ret_addr;
 
 static bool patch_reloc(tcg_insn_unit *code_ptr, int type,
 intptr_t value, intptr_t addend)
@@ -173,7 +173,7 @@ static bool patch_reloc(tcg_insn_unit *code_ptr, int type,
 value += addend;
 switch(type) {
 case R_386_PC32:
-value -= (uintptr_t)code_ptr;
+value -= (uintptr_t)tcg_splitwx_to_rx(code_ptr);
 if (value != (int32_t)value) {
 return false;
 }
@@ -182,7 +182,7 @@ static bool patch_reloc(tcg_insn_unit *code_ptr, int type,
 tcg_patch32(code_ptr, value);
 break;
 case R_386_PC8:
-value -= (uintptr_t)code_ptr;
+value -= (uintptr_t)tcg_splitwx_to_rx(code_ptr);
 if (value != (int8_t)value) {
 return false;
 }
@@ -1006,7 +1006,7 @@ static void tcg_out_movi(TCGContext *s, TCGType type,
 }
 
 /* Try a 7 byte pc-relative lea before the 10 byte movq.  */
-diff = arg - ((uintptr_t)s->code_ptr + 7);
+diff = tcg_pcrel_diff(s, (const void *)arg) - 7;
 if (diff == (int32_t)diff) {
 tcg_out_opc(s, OPC_LEA | P_REXW, ret, 0, 0);
 tcg_out8(s, (LOWREGMASK(ret) << 3) | 5);
@@ -1615,7 +1615,7 @@ static inline void tcg_out_call(TCGContext *s, const tcg_insn_unit *dest)
 tcg_out_branch(s, 1, dest);
 }
 
-static void tcg_out_jmp(TCGContext *s, tcg_insn_unit *dest)
+static void tcg_out_jmp(TCGContext *s, const tcg_insn_unit *dest)
 {
 tcg_out_branch(s, 0, dest);
 }
@@ -1786,7 +1786,8 @@ static void add_qemu_ldst_label(TCGContext *s, bool is_ld, bool is_64,
 label->datahi_reg = datahi;
 label->addrlo_reg = addrlo;
 label->addrhi_reg = addrhi;
-label->raddr = raddr;
+/* TODO: Cast goes away when all hosts converted */
+label->raddr = (void *)tcg_splitwx_to_rx(raddr);
 label->label_ptr[0] = label_ptr[0];
 if (TARGET_LONG_BITS > TCG_TARGET_REG_BITS) {
 label->label_ptr[1] = label_ptr[1];
@@ -2280,7 +2281,7 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
 /* jump displacement must be aligned for atomic patching;
  * see if we need to add extra nops before jump
  */
-gap = tcg_pcrel_diff(s, QEMU_ALIGN_PTR_UP(s->code_ptr + 1, 4));
+gap = QEMU_ALIGN_PTR_UP(s->code_ptr + 1, 4) - s->code_ptr;
 if (gap != 1) {
 tcg_out_nopn(s, gap - 1);
 }
@@ -3825,11 +3826,12 @@ static void tcg_target_qemu_prologue(TCGContext *s)
  * Return path for goto_ptr. Set return value to 0, a-la exit_tb,
  * and fall through to the rest of the epilogue.
  */
-tcg_code_gen_epilogue = s->code_ptr;
+/* TODO: Cast goes away when all hosts converted */
+tcg_code_gen_epilogue = (void *)tcg_splitwx_to_rx(s->code_ptr);
 tcg_out_movi(s, TCG_TYPE_REG, TCG_REG_EAX, 0);
 
 /* TB epilogue */
-tb_ret_addr = s->code_ptr;
+tb_ret_addr = tcg_splitwx_to_rx(s->code_ptr);
 
 tcg_out_addi(s, TCG_REG_CALL_STACK, stack_addend);
 
-- 
2.25.1




[PATCH v3 39/41] tcg: Remove TCG_TARGET_SUPPORT_MIRROR

2020-11-05 Thread Richard Henderson
Now that all native tcg hosts support splitwx,
remove the define.  Replace the one use with a
test for CONFIG_TCG_INTERPRETER.

Signed-off-by: Richard Henderson 
---
 tcg/aarch64/tcg-target.h  |  1 -
 tcg/arm/tcg-target.h  |  1 -
 tcg/i386/tcg-target.h |  1 -
 tcg/mips/tcg-target.h |  1 -
 tcg/ppc/tcg-target.h  |  1 -
 tcg/riscv/tcg-target.h|  1 -
 tcg/s390/tcg-target.h |  1 -
 tcg/sparc/tcg-target.h|  1 -
 tcg/tci/tcg-target.h  |  1 -
 accel/tcg/translate-all.c | 14 +++---
 10 files changed, 7 insertions(+), 16 deletions(-)

diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h
index abb94f9458..fedd88f6fb 100644
--- a/tcg/aarch64/tcg-target.h
+++ b/tcg/aarch64/tcg-target.h
@@ -155,6 +155,5 @@ void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t, uintptr_t);
 #define TCG_TARGET_NEED_LDST_LABELS
 #endif
 #define TCG_TARGET_NEED_POOL_LABELS
-#define TCG_TARGET_SUPPORT_MIRROR   1
 
 #endif /* AARCH64_TCG_TARGET_H */
diff --git a/tcg/arm/tcg-target.h b/tcg/arm/tcg-target.h
index 17f6be9cfc..b21a2fb6a1 100644
--- a/tcg/arm/tcg-target.h
+++ b/tcg/arm/tcg-target.h
@@ -150,6 +150,5 @@ void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t, uintptr_t);
 #define TCG_TARGET_NEED_LDST_LABELS
 #endif
 #define TCG_TARGET_NEED_POOL_LABELS
-#define TCG_TARGET_SUPPORT_MIRROR   1
 
 #endif
diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
index bbbd1c2d4a..f52ba0ffec 100644
--- a/tcg/i386/tcg-target.h
+++ b/tcg/i386/tcg-target.h
@@ -236,6 +236,5 @@ static inline void tb_target_set_jmp_target(uintptr_t tc_ptr, uintptr_t jmp_rx,
 #define TCG_TARGET_NEED_LDST_LABELS
 #endif
 #define TCG_TARGET_NEED_POOL_LABELS
-#define TCG_TARGET_SUPPORT_MIRROR   1
 
 #endif
diff --git a/tcg/mips/tcg-target.h b/tcg/mips/tcg-target.h
index d7d8e6ea1c..cd548dacec 100644
--- a/tcg/mips/tcg-target.h
+++ b/tcg/mips/tcg-target.h
@@ -206,7 +206,6 @@ extern bool use_mips32r2_instructions;
 
 #define TCG_TARGET_DEFAULT_MO (0)
 #define TCG_TARGET_HAS_MEMORY_BSWAP 1
-#define TCG_TARGET_SUPPORT_MIRROR   1
 
 /* Flush the dcache at RW, and the icache at RX, as necessary. */
 static inline void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len)
diff --git a/tcg/ppc/tcg-target.h b/tcg/ppc/tcg-target.h
index a8628b6cad..8f3e4c924a 100644
--- a/tcg/ppc/tcg-target.h
+++ b/tcg/ppc/tcg-target.h
@@ -185,6 +185,5 @@ void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t, uintptr_t);
 #define TCG_TARGET_NEED_LDST_LABELS
 #endif
 #define TCG_TARGET_NEED_POOL_LABELS
-#define TCG_TARGET_SUPPORT_MIRROR   1
 
 #endif
diff --git a/tcg/riscv/tcg-target.h b/tcg/riscv/tcg-target.h
index 0eb19f2b11..e03fd17427 100644
--- a/tcg/riscv/tcg-target.h
+++ b/tcg/riscv/tcg-target.h
@@ -179,6 +179,5 @@ void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t, uintptr_t);
 #define TCG_TARGET_NEED_POOL_LABELS
 
 #define TCG_TARGET_HAS_MEMORY_BSWAP 0
-#define TCG_TARGET_SUPPORT_MIRROR   1
 
 #endif
diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
index 1fd8b3858e..c5a749e425 100644
--- a/tcg/s390/tcg-target.h
+++ b/tcg/s390/tcg-target.h
@@ -163,6 +163,5 @@ static inline void tb_target_set_jmp_target(uintptr_t tc_ptr, uintptr_t jmp_rx,
 #define TCG_TARGET_NEED_LDST_LABELS
 #endif
 #define TCG_TARGET_NEED_POOL_LABELS
-#define TCG_TARGET_SUPPORT_MIRROR   1
 
 #endif
diff --git a/tcg/sparc/tcg-target.h b/tcg/sparc/tcg-target.h
index bb2505bfc7..87e2be61e6 100644
--- a/tcg/sparc/tcg-target.h
+++ b/tcg/sparc/tcg-target.h
@@ -181,6 +181,5 @@ static inline void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len)
 void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t, uintptr_t);
 
 #define TCG_TARGET_NEED_POOL_LABELS
-#define TCG_TARGET_SUPPORT_MIRROR   1
 
 #endif
diff --git a/tcg/tci/tcg-target.h b/tcg/tci/tcg-target.h
index 3653fef947..a19a6b06e5 100644
--- a/tcg/tci/tcg-target.h
+++ b/tcg/tci/tcg-target.h
@@ -200,7 +200,6 @@ static inline void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len)
 #define TCG_TARGET_DEFAULT_MO  (0)
 
 #define TCG_TARGET_HAS_MEMORY_BSWAP 1
-#define TCG_TARGET_SUPPORT_MIRROR   0
 
 static inline void tb_target_set_jmp_target(uintptr_t tc_ptr, uintptr_t jmp_rx,
 uintptr_t jmp_rw, uintptr_t addr)
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index b49aaf1026..06102871e7 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -1257,14 +1257,14 @@ static bool alloc_code_gen_buffer_splitwx_vmremap(size_t size, Error **errp)
 
 static bool alloc_code_gen_buffer_splitwx(size_t size, Error **errp)
 {
-if (TCG_TARGET_SUPPORT_MIRROR) {
-#ifdef CONFIG_DARWIN
-return alloc_code_gen_buffer_splitwx_vmremap(size, errp);
+#ifndef CONFIG_TCG_INTERPRETER
+# ifdef CONFIG_DARWIN
+return alloc_code_gen_buffer_splitwx_vmremap(size, errp);
+# endif
+# ifdef CONFIG_POSIX
+return alloc_code_gen_buffer_splitw

[PATCH v3 13/41] tcg: Use Error with alloc_code_gen_buffer

2020-11-05 Thread Richard Henderson
Report better error messages than just "could not allocate".
Let alloc_code_gen_buffer set ctx->code_gen_buffer_size
and ctx->code_gen_buffer, and simply return bool.
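
As a rough illustration of the calling convention described above (return
bool, report details through an error out-parameter, and store the buffer
in the context on success), here is a minimal self-contained sketch.  It
does not use QEMU's Error API: DemoError, DemoCtx, demo_error_set_errno
and demo_alloc_buffer are hypothetical stand-ins, and malloc stands in
for the real mmap/VirtualAlloc paths.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an error object; not QEMU's Error. */
typedef struct DemoError {
    char msg[128];
} DemoError;

/* Hypothetical stand-in for the two context fields the patch fills in. */
typedef struct DemoCtx {
    void *buf;
    size_t size;
} DemoCtx;

/* Store "what: strerror(err)" in *errp; a NULL errp means "ignore". */
static void demo_error_set_errno(DemoError **errp, int err, const char *what)
{
    DemoError *e;

    if (errp == NULL) {
        return;
    }
    assert(*errp == NULL);          /* never overwrite a pending error */
    e = malloc(sizeof(*e));
    if (e == NULL) {
        abort();
    }
    snprintf(e->msg, sizeof(e->msg), "%s: %s", what, strerror(err));
    *errp = e;
}

/* Return true and fill in ctx on success; false with *errp set on failure. */
static bool demo_alloc_buffer(DemoCtx *ctx, size_t size, DemoError **errp)
{
    void *buf = size ? malloc(size) : NULL;

    if (buf == NULL) {
        demo_error_set_errno(errp, size ? ENOMEM : EINVAL,
                             "allocate jit buffer");
        return false;
    }
    ctx->buf = buf;
    ctx->size = size;
    return true;
}
```

The point of the shape is that the caller decides what a failure means:
it can print the message and exit, or try a different allocation
strategy, without the allocator itself calling abort() or exit().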

Signed-off-by: Richard Henderson 
---
 accel/tcg/translate-all.c | 60 ++-
 1 file changed, 34 insertions(+), 26 deletions(-)

diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 7b85ddacd2..2824b3e387 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -59,6 +59,7 @@
 #include "sysemu/cpus.h"
 #include "sysemu/cpu-timers.h"
 #include "sysemu/tcg.h"
+#include "qapi/error.h"
 
 /* #define DEBUG_TB_INVALIDATE */
 /* #define DEBUG_TB_FLUSH */
@@ -963,7 +964,7 @@ static void page_lock_pair(PageDesc **ret_p1, tb_page_addr_t phys1,
   (DEFAULT_CODE_GEN_BUFFER_SIZE_1 < MAX_CODE_GEN_BUFFER_SIZE \
? DEFAULT_CODE_GEN_BUFFER_SIZE_1 : MAX_CODE_GEN_BUFFER_SIZE)
 
-static inline size_t size_code_gen_buffer(size_t tb_size)
+static size_t size_code_gen_buffer(size_t tb_size)
 {
 /* Size the buffer.  */
 if (tb_size == 0) {
@@ -1014,7 +1015,7 @@ static inline void *split_cross_256mb(void *buf1, size_t size1)
 static uint8_t static_code_gen_buffer[DEFAULT_CODE_GEN_BUFFER_SIZE]
 __attribute__((aligned(CODE_GEN_ALIGN)));
 
-static inline void *alloc_code_gen_buffer(void)
+static bool alloc_code_gen_buffer(size_t tb_size, Error **errp)
 {
 void *buf = static_code_gen_buffer;
 void *end = static_code_gen_buffer + sizeof(static_code_gen_buffer);
@@ -1027,9 +1028,8 @@ static inline void *alloc_code_gen_buffer(void)
 size = end - buf;
 
 /* Honor a command-line option limiting the size of the buffer.  */
-if (size > tcg_ctx->code_gen_buffer_size) {
-size = QEMU_ALIGN_DOWN(tcg_ctx->code_gen_buffer_size,
-   qemu_real_host_page_size);
+if (size > tb_size) {
+size = QEMU_ALIGN_DOWN(tb_size, qemu_real_host_page_size);
 }
 tcg_ctx->code_gen_buffer_size = size;
 
@@ -1041,31 +1041,43 @@ static inline void *alloc_code_gen_buffer(void)
 #endif
 
 if (qemu_mprotect_rwx(buf, size)) {
-abort();
+error_setg_errno(errp, errno, "mprotect of jit buffer");
+return false;
 }
 qemu_madvise(buf, size, QEMU_MADV_HUGEPAGE);
 
-return buf;
+tcg_ctx->code_gen_buffer = buf;
+return true;
 }
 #elif defined(_WIN32)
-static inline void *alloc_code_gen_buffer(void)
+static bool alloc_code_gen_buffer(size_t size, Error **errp)
 {
-size_t size = tcg_ctx->code_gen_buffer_size;
-return VirtualAlloc(NULL, size, MEM_RESERVE | MEM_COMMIT,
-PAGE_EXECUTE_READWRITE);
+void *buf = VirtualAlloc(NULL, size, MEM_RESERVE | MEM_COMMIT,
+ PAGE_EXECUTE_READWRITE);
+if (buf == NULL) {
+error_setg_win32(errp, GetLastError(),
+ "allocate %zu bytes for jit buffer", size);
+return false;
+}
+
+tcg_ctx->code_gen_buffer = buf;
+tcg_ctx->code_gen_buffer_size = size;
+return true;
 }
 #else
-static inline void *alloc_code_gen_buffer(void)
+static bool alloc_code_gen_buffer(size_t size, Error **errp)
 {
 int prot = PROT_WRITE | PROT_READ | PROT_EXEC;
 int flags = MAP_PRIVATE | MAP_ANONYMOUS;
-size_t size = tcg_ctx->code_gen_buffer_size;
 void *buf;
 
 buf = mmap(NULL, size, prot, flags, -1, 0);
 if (buf == MAP_FAILED) {
-return NULL;
+error_setg_errno(errp, errno,
+ "allocate %zu bytes for jit buffer", size);
+return false;
 }
+tcg_ctx->code_gen_buffer_size = size;
 
 #ifdef __mips__
 if (cross_256mb(buf, size)) {
@@ -1104,20 +1116,11 @@ static inline void *alloc_code_gen_buffer(void)
 /* Request large pages for the buffer.  */
 qemu_madvise(buf, size, QEMU_MADV_HUGEPAGE);
 
-return buf;
+tcg_ctx->code_gen_buffer = buf;
+return true;
 }
 #endif /* USE_STATIC_CODE_GEN_BUFFER, WIN32, POSIX */
 
-static inline void code_gen_alloc(size_t tb_size)
-{
-tcg_ctx->code_gen_buffer_size = size_code_gen_buffer(tb_size);
-tcg_ctx->code_gen_buffer = alloc_code_gen_buffer();
-if (tcg_ctx->code_gen_buffer == NULL) {
-fprintf(stderr, "Could not allocate dynamic translator buffer\n");
-exit(1);
-}
-}
-
 static bool tb_cmp(const void *ap, const void *bp)
 {
 const TranslationBlock *a = ap;
@@ -1144,11 +1147,16 @@ static void tb_htable_init(void)
size. */
 void tcg_exec_init(unsigned long tb_size)
 {
+bool ok;
+
 tcg_allowed = true;
 cpu_gen_init();
 page_init();
 tb_htable_init();
-code_gen_alloc(tb_size);
+
+ok = alloc_code_gen_buffer(size_code_gen_buffer(tb_size), &error_fatal);
+assert(ok);
+
 #if defined(CONFIG_SOFTMMU)
 /* There's no guest base to take into account, so go ahead and
initialize the prologue now.  */
-- 
2.25.1




[PATCH v3 12/41] tcg: Make tb arg to synchronize_from_tb const

2020-11-05 Thread Richard Henderson
There is nothing within the translators that ought to be
changing the TranslationBlock data, so make it const.

This does not actually use the read-only copy of the
data structure that exists within the rx region.

Signed-off-by: Richard Henderson 
---
 include/hw/core/cpu.h   | 3 ++-
 target/arm/cpu.c| 3 ++-
 target/avr/cpu.c| 3 ++-
 target/hppa/cpu.c   | 3 ++-
 target/i386/cpu.c   | 3 ++-
 target/microblaze/cpu.c | 3 ++-
 target/mips/cpu.c   | 3 ++-
 target/riscv/cpu.c  | 3 ++-
 target/rx/cpu.c | 3 ++-
 target/sh4/cpu.c| 3 ++-
 target/sparc/cpu.c  | 3 ++-
 target/tricore/cpu.c| 2 +-
 12 files changed, 23 insertions(+), 12 deletions(-)

diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index 3d92c967ff..44c336a96a 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -189,7 +189,8 @@ struct CPUClass {
 void (*get_memory_mapping)(CPUState *cpu, MemoryMappingList *list,
Error **errp);
 void (*set_pc)(CPUState *cpu, vaddr value);
-void (*synchronize_from_tb)(CPUState *cpu, struct TranslationBlock *tb);
+void (*synchronize_from_tb)(CPUState *cpu,
+const struct TranslationBlock *tb);
 bool (*tlb_fill)(CPUState *cpu, vaddr address, int size,
  MMUAccessType access_type, int mmu_idx,
  bool probe, uintptr_t retaddr);
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 07492e9f9a..2f9be1c0ee 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -54,7 +54,8 @@ static void arm_cpu_set_pc(CPUState *cs, vaddr value)
 }
 }
 
-static void arm_cpu_synchronize_from_tb(CPUState *cs, TranslationBlock *tb)
+static void arm_cpu_synchronize_from_tb(CPUState *cs,
+const TranslationBlock *tb)
 {
 ARMCPU *cpu = ARM_CPU(cs);
 CPUARMState *env = &cpu->env;
diff --git a/target/avr/cpu.c b/target/avr/cpu.c
index 5d9c4ad5bf..6f3d5a9e4a 100644
--- a/target/avr/cpu.c
+++ b/target/avr/cpu.c
@@ -41,7 +41,8 @@ static bool avr_cpu_has_work(CPUState *cs)
 && cpu_interrupts_enabled(env);
 }
 
-static void avr_cpu_synchronize_from_tb(CPUState *cs, TranslationBlock *tb)
+static void avr_cpu_synchronize_from_tb(CPUState *cs,
+const TranslationBlock *tb)
 {
 AVRCPU *cpu = AVR_CPU(cs);
 CPUAVRState *env = &cpu->env;
diff --git a/target/hppa/cpu.c b/target/hppa/cpu.c
index 71b6aca45d..e28f047d10 100644
--- a/target/hppa/cpu.c
+++ b/target/hppa/cpu.c
@@ -35,7 +35,8 @@ static void hppa_cpu_set_pc(CPUState *cs, vaddr value)
 cpu->env.iaoq_b = value + 4;
 }
 
-static void hppa_cpu_synchronize_from_tb(CPUState *cs, TranslationBlock *tb)
+static void hppa_cpu_synchronize_from_tb(CPUState *cs,
+ const TranslationBlock *tb)
 {
 HPPACPU *cpu = HPPA_CPU(cs);
 
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 0d8606958e..01a8acafe3 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -7012,7 +7012,8 @@ static void x86_cpu_set_pc(CPUState *cs, vaddr value)
 cpu->env.eip = value;
 }
 
-static void x86_cpu_synchronize_from_tb(CPUState *cs, TranslationBlock *tb)
+static void x86_cpu_synchronize_from_tb(CPUState *cs,
+const TranslationBlock *tb)
 {
 X86CPU *cpu = X86_CPU(cs);
 
diff --git a/target/microblaze/cpu.c b/target/microblaze/cpu.c
index 9b2482159d..c8e754cfb1 100644
--- a/target/microblaze/cpu.c
+++ b/target/microblaze/cpu.c
@@ -83,7 +83,8 @@ static void mb_cpu_set_pc(CPUState *cs, vaddr value)
 cpu->env.iflags = 0;
 }
 
-static void mb_cpu_synchronize_from_tb(CPUState *cs, TranslationBlock *tb)
+static void mb_cpu_synchronize_from_tb(CPUState *cs,
+   const TranslationBlock *tb)
 {
 MicroBlazeCPU *cpu = MICROBLAZE_CPU(cs);
 
diff --git a/target/mips/cpu.c b/target/mips/cpu.c
index 76d50b00b4..79eee215cf 100644
--- a/target/mips/cpu.c
+++ b/target/mips/cpu.c
@@ -44,7 +44,8 @@ static void mips_cpu_set_pc(CPUState *cs, vaddr value)
 }
 }
 
-static void mips_cpu_synchronize_from_tb(CPUState *cs, TranslationBlock *tb)
+static void mips_cpu_synchronize_from_tb(CPUState *cs,
+ const TranslationBlock *tb)
 {
 MIPSCPU *cpu = MIPS_CPU(cs);
 CPUMIPSState *env = &cpu->env;
diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 6a0264fc6b..1b2f40de47 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -285,7 +285,8 @@ static void riscv_cpu_set_pc(CPUState *cs, vaddr value)
 env->pc = value;
 }
 
-static void riscv_cpu_synchronize_from_tb(CPUState *cs, TranslationBlock *tb)
+static void riscv_cpu_synchronize_from_tb(CPUState *cs,
+  const TranslationBlock *tb)
 {
 RISCVCPU *cpu = RISCV_CPU(cs);
 CPURISCVState *env = &cpu->env;
diff --git a/target/rx/cpu.c b/target/rx/cpu.c
index 23ee17a701..

[PATCH v3 30/41] tcg/s390: Use tcg_tbrel_diff

2020-11-05 Thread Richard Henderson
Use tcg_tbrel_diff when we need a displacement to a label,
and with a NULL argument when we need the normalizing addend.
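
To make the two uses concrete, here is a small sketch with plain integer
addresses.  It is only a model of the idea: DemoCtx, demo_to_rx,
demo_tbrel_diff and demo_fits_s20 are hypothetical names, and the
rx_minus_rw field is an assumed way of representing the distance between
the writable and executable mappings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical miniature of the context; only what this sketch needs. */
typedef struct DemoCtx {
    uintptr_t code_buf;     /* start of the current TB, writable view */
    ptrdiff_t rx_minus_rw;  /* distance between mappings, 0 if not split */
} DemoCtx;

/* Translate an address in the writable view to the executable view. */
static uintptr_t demo_to_rx(const DemoCtx *s, uintptr_t rw)
{
    return rw + (uintptr_t)s->rx_minus_rw;
}

/* Displacement from the start of the current TB (rx view) to target. */
static ptrdiff_t demo_tbrel_diff(const DemoCtx *s, uintptr_t target)
{
    assert(s != NULL);
    return (ptrdiff_t)(target - demo_to_rx(s, s->code_buf));
}

/* True if disp fits a signed 20-bit displacement field. */
static int demo_fits_s20(ptrdiff_t disp)
{
    return disp >= -((ptrdiff_t)1 << 19) && disp < ((ptrdiff_t)1 << 19);
}
```

With target 0 (the NULL case) the result is minus the rx address of the
TB start, so adding it to any absolute address gives that address's
TB-relative displacement.  That is the sense in which it acts as a
normalizing addend.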

Signed-off-by: Richard Henderson 
---
 tcg/s390/tcg-target.c.inc | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/tcg/s390/tcg-target.c.inc b/tcg/s390/tcg-target.c.inc
index 1444914428..e4c61fc014 100644
--- a/tcg/s390/tcg-target.c.inc
+++ b/tcg/s390/tcg-target.c.inc
@@ -630,7 +630,7 @@ static void tcg_out_movi_int(TCGContext *s, TCGType type, TCGReg ret,
 return;
 }
 } else if (USE_REG_TB && !in_prologue) {
-ptrdiff_t off = sval - (uintptr_t)s->code_gen_ptr;
+ptrdiff_t off = tcg_tbrel_diff(s, (void *)sval);
 if (off == sextract64(off, 0, 20)) {
 /* This is certain to be an address within TB, and therefore
OFF will be negative; don't try RX_LA.  */
@@ -655,7 +655,7 @@ static void tcg_out_movi_int(TCGContext *s, TCGType type, TCGReg ret,
 } else if (USE_REG_TB && !in_prologue) {
 tcg_out_insn(s, RXY, LG, ret, TCG_REG_TB, TCG_REG_NONE, 0);
 new_pool_label(s, sval, R_390_20, s->code_ptr - 2,
-   -(intptr_t)s->code_gen_ptr);
+   tcg_tbrel_diff(s, NULL));
 } else {
 TCGReg base = ret ? ret : TCG_TMP0;
 tcg_out_insn(s, RIL, LARL, base, 0);
@@ -746,7 +746,7 @@ static void tcg_out_ld_abs(TCGContext *s, TCGType type, TCGReg dest, void *abs)
 }
 }
 if (USE_REG_TB) {
-ptrdiff_t disp = abs - (void *)s->code_gen_ptr;
+ptrdiff_t disp = tcg_tbrel_diff(s, abs);
 if (disp == sextract64(disp, 0, 20)) {
 tcg_out_ld(s, type, dest, TCG_REG_TB, disp);
 return;
@@ -956,7 +956,7 @@ static void tgen_andi(TCGContext *s, TCGType type, TCGReg dest, uint64_t val)
 if (!maybe_out_small_movi(s, type, TCG_TMP0, val)) {
 tcg_out_insn(s, RXY, NG, dest, TCG_REG_TB, TCG_REG_NONE, 0);
 new_pool_label(s, val & valid, R_390_20, s->code_ptr - 2,
-   -(intptr_t)s->code_gen_ptr);
+   tcg_tbrel_diff(s, NULL));
 return;
 }
 } else {
@@ -1015,7 +1015,7 @@ static void tgen_ori(TCGContext *s, TCGType type, TCGReg dest, uint64_t val)
 } else if (USE_REG_TB) {
 tcg_out_insn(s, RXY, OG, dest, TCG_REG_TB, TCG_REG_NONE, 0);
 new_pool_label(s, val, R_390_20, s->code_ptr - 2,
-   -(intptr_t)s->code_gen_ptr);
+   tcg_tbrel_diff(s, NULL));
 } else {
 /* Perform the OR via sequential modifications to the high and
low parts.  Do this via recursion to handle 16-bit vs 32-bit
@@ -1050,7 +1050,7 @@ static void tgen_xori(TCGContext *s, TCGType type, TCGReg dest, uint64_t val)
 } else if (USE_REG_TB) {
 tcg_out_insn(s, RXY, XG, dest, TCG_REG_TB, TCG_REG_NONE, 0);
 new_pool_label(s, val, R_390_20, s->code_ptr - 2,
-   -(intptr_t)s->code_gen_ptr);
+   tcg_tbrel_diff(s, NULL));
 } else {
 /* Perform the xor by parts.  */
 tcg_debug_assert(s390_facilities & FACILITY_EXT_IMM);
@@ -1108,12 +1108,12 @@ static int tgen_cmp(TCGContext *s, TCGType type, TCGCond c, TCGReg r1,
 op = (is_unsigned ? RXY_CLY : RXY_CY);
 tcg_out_insn_RXY(s, op, r1, TCG_REG_TB, TCG_REG_NONE, 0);
 new_pool_label(s, (uint32_t)c2, R_390_20, s->code_ptr - 2,
-   4 - (intptr_t)s->code_gen_ptr);
+   4 - tcg_tbrel_diff(s, NULL));
 } else {
 op = (is_unsigned ? RXY_CLG : RXY_CG);
 tcg_out_insn_RXY(s, op, r1, TCG_REG_TB, TCG_REG_NONE, 0);
 new_pool_label(s, c2, R_390_20, s->code_ptr - 2,
-   -(intptr_t)s->code_gen_ptr);
+   tcg_tbrel_diff(s, NULL));
 }
 goto exit;
 } else {
-- 
2.25.1




[PATCH v3 24/41] tcg: Introduce tcg_tbrel_diff

2020-11-05 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 include/tcg/tcg.h | 13 +
 1 file changed, 13 insertions(+)

diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index 67d57695c2..90ec7c1445 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -1141,6 +1141,19 @@ static inline ptrdiff_t tcg_pcrel_diff(TCGContext *s, const void *target)
 return tcg_ptr_byte_diff(target, tcg_splitwx_to_rx(s->code_ptr));
 }
 
+/**
+ * tcg_tbrel_diff
+ * @s: the tcg context
+ * @target: address of the target
+ *
+ * Produce a difference, from the beginning of the current TB code
+ * to the destination address.
+ */
+static inline ptrdiff_t tcg_tbrel_diff(TCGContext *s, const void *target)
+{
+return tcg_ptr_byte_diff(target, tcg_splitwx_to_rx(s->code_buf));
+}
+
 /**
  * tcg_current_code_size
  * @s: the tcg context
-- 
2.25.1




[PATCH v3 09/41] tcg: Adjust tcg_register_jit for const

2020-11-05 Thread Richard Henderson
We must change all targets at once, since all must match
the declaration in tcg.c.

Signed-off-by: Richard Henderson 
---
 include/tcg/tcg.h|  2 +-
 tcg/tcg.c| 10 +-
 tcg/aarch64/tcg-target.c.inc |  2 +-
 tcg/arm/tcg-target.c.inc |  2 +-
 tcg/i386/tcg-target.c.inc|  2 +-
 tcg/mips/tcg-target.c.inc|  2 +-
 tcg/ppc/tcg-target.c.inc |  2 +-
 tcg/riscv/tcg-target.c.inc   |  2 +-
 tcg/s390/tcg-target.c.inc|  2 +-
 tcg/sparc/tcg-target.c.inc   |  2 +-
 10 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index b2ba16ea8c..67d57695c2 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -1250,7 +1250,7 @@ typedef uintptr_t tcg_prologue_fn(CPUArchState *env, const void *tb_ptr);
 extern tcg_prologue_fn *tcg_qemu_tb_exec;
 #endif
 
-void tcg_register_jit(void *buf, size_t buf_size);
+void tcg_register_jit(const void *buf, size_t buf_size);
 
 #if TCG_TARGET_MAYBE_vec
 /* Return zero if the tuple (opc, type, vece) is unsupportable;
diff --git a/tcg/tcg.c b/tcg/tcg.c
index e5d2208e88..07a4bd2c57 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -96,7 +96,7 @@ typedef struct QEMU_PACKED {
 DebugFrameFDEHeader fde;
 } DebugFrameHeader;
 
-static void tcg_register_jit_int(void *buf, size_t size,
+static void tcg_register_jit_int(const void *buf, size_t size,
  const void *debug_frame,
  size_t debug_frame_size)
 __attribute__((unused));
@@ -1134,7 +1134,7 @@ void tcg_prologue_init(TCGContext *s)
 total_size -= prologue_size;
 s->code_gen_buffer_size = total_size;
 
-tcg_register_jit(s->code_gen_buffer, total_size);
+tcg_register_jit(tcg_splitwx_to_rx(s->code_gen_buffer), total_size);
 
 #ifdef DEBUG_DISAS
 if (qemu_loglevel_mask(CPU_LOG_TB_OUT_ASM)) {
@@ -4502,7 +4502,7 @@ static int find_string(const char *strtab, const char *str)
 }
 }
 
-static void tcg_register_jit_int(void *buf_ptr, size_t buf_size,
+static void tcg_register_jit_int(const void *buf_ptr, size_t buf_size,
  const void *debug_frame,
  size_t debug_frame_size)
 {
@@ -4704,13 +4704,13 @@ static void tcg_register_jit_int(void *buf_ptr, size_t buf_size,
 /* No support for the feature.  Provide the entry point expected by exec.c,
and implement the internal function we declared earlier.  */
 
-static void tcg_register_jit_int(void *buf, size_t size,
+static void tcg_register_jit_int(const void *buf, size_t size,
  const void *debug_frame,
  size_t debug_frame_size)
 {
 }
 
-void tcg_register_jit(void *buf, size_t buf_size)
+void tcg_register_jit(const void *buf, size_t buf_size)
 {
 }
 #endif /* ELF_HOST_MACHINE */
diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
index 6d8152c468..9ace859db3 100644
--- a/tcg/aarch64/tcg-target.c.inc
+++ b/tcg/aarch64/tcg-target.c.inc
@@ -2964,7 +2964,7 @@ static const DebugFrame debug_frame = {
 }
 };
 
-void tcg_register_jit(void *buf, size_t buf_size)
+void tcg_register_jit(const void *buf, size_t buf_size)
 {
 tcg_register_jit_int(buf, buf_size, &debug_frame, sizeof(debug_frame));
 }
diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc
index a197e6bc45..9b9400f164 100644
--- a/tcg/arm/tcg-target.c.inc
+++ b/tcg/arm/tcg-target.c.inc
@@ -2353,7 +2353,7 @@ static const DebugFrame debug_frame = {
 }
 };
 
-void tcg_register_jit(void *buf, size_t buf_size)
+void tcg_register_jit(const void *buf, size_t buf_size)
 {
 tcg_register_jit_int(buf, buf_size, &debug_frame, sizeof(debug_frame));
 }
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index 0ac1ef3d82..7f74c77d7f 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -3998,7 +3998,7 @@ static const DebugFrame debug_frame = {
 #endif
 
 #if defined(ELF_HOST_MACHINE)
-void tcg_register_jit(void *buf, size_t buf_size)
+void tcg_register_jit(const void *buf, size_t buf_size)
 {
 tcg_register_jit_int(buf, buf_size, &debug_frame, sizeof(debug_frame));
 }
diff --git a/tcg/mips/tcg-target.c.inc b/tcg/mips/tcg-target.c.inc
index 6d2c369a85..e9c8c24741 100644
--- a/tcg/mips/tcg-target.c.inc
+++ b/tcg/mips/tcg-target.c.inc
@@ -2703,7 +2703,7 @@ static const DebugFrame debug_frame = {
 }
 };
 
-void tcg_register_jit(void *buf, size_t buf_size)
+void tcg_register_jit(const void *buf, size_t buf_size)
 {
 tcg_register_jit_int(buf, buf_size, &debug_frame, sizeof(debug_frame));
 }
diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
index 36129b976f..ff667b1531 100644
--- a/tcg/ppc/tcg-target.c.inc
+++ b/tcg/ppc/tcg-target.c.inc
@@ -3847,7 +3847,7 @@ static DebugFrame debug_frame = {
 }
 };
 
-void tcg_register_jit(void *buf, size_t buf_size)
+void tcg_register_jit(const void *buf, size_t buf_size)
 {
 uint8_t *p = &debug_frame.fde_reg_of

[PATCH v3 11/41] tcg: Make DisasContextBase.tb const

2020-11-05 Thread Richard Henderson
There is nothing within the translators that ought to be
changing the TranslationBlock data, so make it const.

This does not actually use the read-only copy of the
data structure that exists within the rx region.
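
A small sketch of what the const qualification buys, under made-up
names (DemoTB, DemoDisasBase, demo_translator_loop and demo_tb_size are
all hypothetical): the shared context only exposes a read-only view of
the block, while the one routine that legitimately fills in the block
receives its own non-const pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of a translation block. */
typedef struct DemoTB {
    unsigned size;
    unsigned icount;
} DemoTB;

/* Hypothetical miniature of the per-translation context. */
typedef struct DemoDisasBase {
    const DemoTB *tb;       /* translators may read, but not write */
    unsigned num_insns;
} DemoDisasBase;

/* The loop owns a writable pointer and records the final sizes. */
static void demo_translator_loop(DemoDisasBase *db, DemoTB *tb,
                                 unsigned bytes, unsigned insns)
{
    assert(db != NULL && tb != NULL);
    db->tb = tb;
    db->num_insns = insns;

    /* Writes go through the non-const parameter, not through db->tb. */
    tb->size = bytes;
    tb->icount = db->num_insns;
}

/* A translator-side helper: reading through the const view is fine. */
static unsigned demo_tb_size(const DemoDisasBase *db)
{
    return db->tb->size;
}
```

An attempt to assign through db->tb (for example db->tb->size = 0)
would be rejected by the compiler, which is the check the patch is after.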

Signed-off-by: Richard Henderson 
---
 include/exec/gen-icount.h  | 4 ++--
 include/exec/translator.h  | 2 +-
 include/tcg/tcg-op.h   | 2 +-
 accel/tcg/translator.c | 4 ++--
 target/arm/translate-a64.c | 2 +-
 tcg/tcg-op.c   | 2 +-
 6 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/include/exec/gen-icount.h b/include/exec/gen-icount.h
index 822c43cfd3..aa4b44354a 100644
--- a/include/exec/gen-icount.h
+++ b/include/exec/gen-icount.h
@@ -32,7 +32,7 @@ static inline void gen_io_end(void)
 tcg_temp_free_i32(tmp);
 }
 
-static inline void gen_tb_start(TranslationBlock *tb)
+static inline void gen_tb_start(const TranslationBlock *tb)
 {
 TCGv_i32 count, imm;
 
@@ -71,7 +71,7 @@ static inline void gen_tb_start(TranslationBlock *tb)
 tcg_temp_free_i32(count);
 }
 
-static inline void gen_tb_end(TranslationBlock *tb, int num_insns)
+static inline void gen_tb_end(const TranslationBlock *tb, int num_insns)
 {
 if (tb_cflags(tb) & CF_USE_ICOUNT) {
 /* Update the num_insn immediate parameter now that we know
diff --git a/include/exec/translator.h b/include/exec/translator.h
index 638e1529c5..24232ead41 100644
--- a/include/exec/translator.h
+++ b/include/exec/translator.h
@@ -67,7 +67,7 @@ typedef enum DisasJumpType {
  * Architecture-agnostic disassembly context.
  */
 typedef struct DisasContextBase {
-TranslationBlock *tb;
+const TranslationBlock *tb;
 target_ulong pc_first;
 target_ulong pc_next;
 DisasJumpType is_jmp;
diff --git a/include/tcg/tcg-op.h b/include/tcg/tcg-op.h
index 5abf17fecc..cbe39a3b95 100644
--- a/include/tcg/tcg-op.h
+++ b/include/tcg/tcg-op.h
@@ -805,7 +805,7 @@ static inline void tcg_gen_insn_start(target_ulong pc, target_ulong a1,
  * be NULL and @idx should be 0.  Otherwise, @tb should be valid and
  * @idx should be one of the TB_EXIT_ values.
  */
-void tcg_gen_exit_tb(TranslationBlock *tb, unsigned idx);
+void tcg_gen_exit_tb(const TranslationBlock *tb, unsigned idx);
 
 /**
  * tcg_gen_goto_tb() - output goto_tb TCG operation
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index fb1e19c585..a49a794065 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -133,8 +133,8 @@ void translator_loop(const TranslatorOps *ops, DisasContextBase *db,
 }
 
 /* The disas_log hook may use these values rather than recompute.  */
-db->tb->size = db->pc_next - db->pc_first;
-db->tb->icount = db->num_insns;
+tb->size = db->pc_next - db->pc_first;
+tb->icount = db->num_insns;
 
 #ifdef DEBUG_DISAS
 if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 072754fa24..297782e6ef 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -410,7 +410,7 @@ static inline bool use_goto_tb(DisasContext *s, int n, uint64_t dest)
 
 static inline void gen_goto_tb(DisasContext *s, int n, uint64_t dest)
 {
-TranslationBlock *tb;
+const TranslationBlock *tb;
 
 tb = s->base.tb;
 if (use_goto_tb(s, n, dest)) {
diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index 4b8a473fad..e3dc0cb4cb 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -2664,7 +2664,7 @@ void tcg_gen_extr32_i64(TCGv_i64 lo, TCGv_i64 hi, TCGv_i64 arg)
 
 /* QEMU specific operations.  */
 
-void tcg_gen_exit_tb(TranslationBlock *tb, unsigned idx)
+void tcg_gen_exit_tb(const TranslationBlock *tb, unsigned idx)
 {
 uintptr_t val = (uintptr_t)tb + idx;
 
-- 
2.25.1




[PATCH v3 17/41] tcg: Return the TB pointer from the rx region from exit_tb

2020-11-05 Thread Richard Henderson
This produces a small pc-relative displacement within the
generated code to the TB structure that precedes it.
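
The value that comes back from the generated code carries two things at
once: the address of the TB and a small exit index in the low bits.  The
sketch below models that packing and the rx-to-rw translation on the way
back, using plain integers.  DEMO_EXIT_MASK, demo_pack_exit and
demo_unpack_exit are hypothetical names, and the two-bit mask is an
assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed for this sketch: the low two bits carry the exit index. */
#define DEMO_EXIT_MASK ((uintptr_t)3)

/* What the generated code hands back: rx address of the TB plus index. */
static uintptr_t demo_pack_exit(uintptr_t tb_rx, unsigned idx)
{
    assert((tb_rx & DEMO_EXIT_MASK) == 0);  /* TB must be aligned */
    assert(idx <= DEMO_EXIT_MASK);
    return tb_rx + idx;
}

/*
 * What the caller does: split the value, then translate the rx address
 * back to the writable view.  rx_minus_rw is the assumed distance
 * between the two mappings (0 when the buffer is not split).
 */
static uintptr_t demo_unpack_exit(uintptr_t ret, ptrdiff_t rx_minus_rw,
                                  unsigned *idx)
{
    *idx = (unsigned)(ret & DEMO_EXIT_MASK);
    return (ret & ~DEMO_EXIT_MASK) - (uintptr_t)rx_minus_rw;
}
```

Because the packed address is the rx one, it sits close to the code that
loads it, so a short pc-relative form has a good chance of reaching it.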

Signed-off-by: Richard Henderson 
---
 accel/tcg/cpu-exec.c | 35 ++-
 tcg/tcg-op.c | 13 -
 2 files changed, 34 insertions(+), 14 deletions(-)

diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index 272d596e0c..8df0a1782e 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -144,12 +144,13 @@ static void init_delay_params(SyncClocks *sc, const CPUState *cpu)
 #endif /* CONFIG USER ONLY */
 
 /* Execute a TB, and fix up the CPU state afterwards if necessary */
-static inline tcg_target_ulong cpu_tb_exec(CPUState *cpu, TranslationBlock *itb)
+static inline TranslationBlock *cpu_tb_exec(CPUState *cpu,
+TranslationBlock *itb,
+int *tb_exit)
 {
 CPUArchState *env = cpu->env_ptr;
 uintptr_t ret;
 TranslationBlock *last_tb;
-int tb_exit;
 const void *tb_ptr = itb->tc.ptr;
 
 qemu_log_mask_and_addr(CPU_LOG_EXEC, itb->pc,
@@ -177,11 +178,20 @@ static inline tcg_target_ulong cpu_tb_exec(CPUState *cpu, TranslationBlock *itb)
 
 ret = tcg_qemu_tb_exec(env, tb_ptr);
 cpu->can_do_io = 1;
-last_tb = (TranslationBlock *)(ret & ~TB_EXIT_MASK);
-tb_exit = ret & TB_EXIT_MASK;
-trace_exec_tb_exit(last_tb, tb_exit);
+/*
+ * TODO: Delay swapping back to the read-write region of the TB
+ * until we actually need to modify the TB.  The read-only copy,
+ * coming from the rx region, shares the same host TLB entry as
+ * the code that executed the exit_tb opcode that arrived here.
+ * If we insist on touching both the RX and the RW pages, we
+ * double the host TLB pressure.
+ */
+last_tb = tcg_splitwx_to_rw((void *)(ret & ~TB_EXIT_MASK));
+*tb_exit = ret & TB_EXIT_MASK;
 
-if (tb_exit > TB_EXIT_IDX1) {
+trace_exec_tb_exit(last_tb, *tb_exit);
+
+if (*tb_exit > TB_EXIT_IDX1) {
 /* We didn't start executing this TB (eg because the instruction
  * counter hit zero); we must restore the guest PC to the address
  * of the start of the TB.
@@ -199,7 +209,7 @@ static inline tcg_target_ulong cpu_tb_exec(CPUState *cpu, TranslationBlock *itb)
 cc->set_pc(cpu, last_tb->pc);
 }
 }
-return ret;
+return last_tb;
 }
 
 #ifndef CONFIG_USER_ONLY
@@ -210,6 +220,7 @@ static void cpu_exec_nocache(CPUState *cpu, int max_cycles,
 {
 TranslationBlock *tb;
 uint32_t cflags = curr_cflags() | CF_NOCACHE;
+int tb_exit;
 
 if (ignore_icount) {
 cflags &= ~CF_USE_ICOUNT;
@@ -227,7 +238,7 @@ static void cpu_exec_nocache(CPUState *cpu, int max_cycles,
 
 /* execute the generated code */
 trace_exec_tb_nocache(tb, tb->pc);
-cpu_tb_exec(cpu, tb);
+cpu_tb_exec(cpu, tb, &tb_exit);
 
 mmap_lock();
 tb_phys_invalidate(tb, -1);
@@ -244,6 +255,7 @@ void cpu_exec_step_atomic(CPUState *cpu)
 uint32_t flags;
 uint32_t cflags = 1;
 uint32_t cf_mask = cflags & CF_HASH_MASK;
+int tb_exit;
 
 if (sigsetjmp(cpu->jmp_env, 0) == 0) {
 start_exclusive();
@@ -260,7 +272,7 @@ void cpu_exec_step_atomic(CPUState *cpu)
 cc->cpu_exec_enter(cpu);
 /* execute the generated code */
 trace_exec_tb(tb, pc);
-cpu_tb_exec(cpu, tb);
+cpu_tb_exec(cpu, tb, &tb_exit);
 cc->cpu_exec_exit(cpu);
 } else {
 /*
@@ -653,13 +665,10 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
 static inline void cpu_loop_exec_tb(CPUState *cpu, TranslationBlock *tb,
 TranslationBlock **last_tb, int *tb_exit)
 {
-uintptr_t ret;
 int32_t insns_left;
 
 trace_exec_tb(tb, tb->pc);
-ret = cpu_tb_exec(cpu, tb);
-tb = (TranslationBlock *)(ret & ~TB_EXIT_MASK);
-*tb_exit = ret & TB_EXIT_MASK;
+tb = cpu_tb_exec(cpu, tb, tb_exit);
 if (*tb_exit != TB_EXIT_REQUESTED) {
 *last_tb = tb;
 return;
diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index e3dc0cb4cb..56bb8db040 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -2666,7 +2666,18 @@ void tcg_gen_extr32_i64(TCGv_i64 lo, TCGv_i64 hi, TCGv_i64 arg)
 
 void tcg_gen_exit_tb(const TranslationBlock *tb, unsigned idx)
 {
-uintptr_t val = (uintptr_t)tb + idx;
+/*
+ * Let the jit code return the read-only version of the
+ * TranslationBlock, so that we minimize the pc-relative
+ * distance of the address of the exit_tb code to TB.
+ * This will improve utilization of pc-relative address loads.
+ *
+ * TODO: Move this to translator_loop, so that all const
+ * TranslationBlock pointers refer to read-only memory.
+ * This requires coordination with targets that do not use
+ * the translator_loop.
+ */
+uintptr_t val = (uintptr_t)tcg_splitwx_to_rx((void *)tb) + idx;
 
 if (tb == NULL) {
 tc

[PATCH v3 33/41] tcg/riscv: Remove branch-over-branch fallback

2020-11-05 Thread Richard Henderson
Since 7ecd02a06f8, we are prepared to re-start code generation
with a smaller TB if a relocation is out of range.  We no longer
need to leave a nop in the stream Just In Case.
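
A toy model of the approach, for illustration only.  A RISC-V
conditional branch encodes an even byte offset in the range -4096 to
+4094; the sketch checks that range and, when the offset does not fit,
retries with a smaller block instead of reserving a spare instruction
slot.  demo_branch_in_range and demo_pick_tb_insns are hypothetical, and
the "farthest branch spans the whole block" assumption is just a
worst-case simplification, not how the real generator measures distances.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if offset can be encoded in a RISC-V conditional branch. */
static bool demo_branch_in_range(intptr_t offset)
{
    return (offset & 1) == 0 && offset >= -4096 && offset <= 4094;
}

/*
 * Toy driver: assume the farthest branch spans the whole block, and
 * halve the instruction budget until that branch becomes encodable.
 * Returns the budget that worked, or 0 if none did.
 */
static int demo_pick_tb_insns(int max_insns, int bytes_per_insn)
{
    assert(bytes_per_insn > 0);
    while (max_insns > 0) {
        intptr_t offset = (intptr_t)max_insns * bytes_per_insn;

        if (demo_branch_in_range(offset)) {
            return max_insns;
        }
        max_insns /= 2;     /* restart with a smaller block */
    }
    return 0;
}
```

The trade-off is that the common case (branch fits) costs one
instruction instead of two, while the rare out-of-range case pays for a
regeneration.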

Signed-off-by: Richard Henderson 
---
 tcg/riscv/tcg-target.c.inc | 56 --
 1 file changed, 6 insertions(+), 50 deletions(-)

diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
index 195c3eff03..02beb86977 100644
--- a/tcg/riscv/tcg-target.c.inc
+++ b/tcg/riscv/tcg-target.c.inc
@@ -469,43 +469,16 @@ static bool reloc_call(tcg_insn_unit *code_ptr, const tcg_insn_unit *target)
 static bool patch_reloc(tcg_insn_unit *code_ptr, int type,
 intptr_t value, intptr_t addend)
 {
-uint32_t insn = *code_ptr;
-intptr_t diff;
-bool short_jmp;
-
 tcg_debug_assert(addend == 0);
-
 switch (type) {
 case R_RISCV_BRANCH:
-diff = value - (uintptr_t)code_ptr;
-short_jmp = diff == sextreg(diff, 0, 12);
-if (short_jmp) {
-return reloc_sbimm12(code_ptr, (tcg_insn_unit *)value);
-} else {
-/* Invert the condition */
-insn = insn ^ (1 << 12);
-/* Clear the offset */
-insn &= 0x01fff07f;
-/* Set the offset to the PC + 8 */
-insn |= encode_sbimm12(8);
-
-/* Move forward */
-code_ptr[0] = insn;
-
-/* Overwrite the NOP with jal x0,value */
-diff = value - (uintptr_t)(code_ptr + 1);
-insn = encode_uj(OPC_JAL, TCG_REG_ZERO, diff);
-code_ptr[1] = insn;
-
-return true;
-}
-break;
+return reloc_sbimm12(code_ptr, (tcg_insn_unit *)value);
 case R_RISCV_JAL:
 return reloc_jimm20(code_ptr, (tcg_insn_unit *)value);
 case R_RISCV_CALL:
 return reloc_call(code_ptr, (tcg_insn_unit *)value);
 default:
-tcg_abort();
+g_assert_not_reached();
 }
 }
 
@@ -779,21 +752,8 @@ static void tcg_out_brcond(TCGContext *s, TCGCond cond, TCGReg arg1,
 arg2 = t;
 }
 
-if (l->has_value) {
-intptr_t diff = tcg_pcrel_diff(s, l->u.value_ptr);
-if (diff == sextreg(diff, 0, 12)) {
-tcg_out_opc_branch(s, op, arg1, arg2, diff);
-} else {
-/* Invert the conditional branch.  */
-tcg_out_opc_branch(s, op ^ (1 << 12), arg1, arg2, 8);
-tcg_out_opc_jump(s, OPC_JAL, TCG_REG_ZERO, diff - 4);
-}
-} else {
-tcg_out_reloc(s, s->code_ptr, R_RISCV_BRANCH, l, 0);
-tcg_out_opc_branch(s, op, arg1, arg2, 0);
-/* NOP to allow patching later */
-tcg_out_opc_imm(s, OPC_ADDI, TCG_REG_ZERO, TCG_REG_ZERO, 0);
-}
+tcg_out_reloc(s, s->code_ptr, R_RISCV_BRANCH, l, 0);
+tcg_out_opc_branch(s, op, arg1, arg2, 0);
 }
 
 static void tcg_out_setcond(TCGContext *s, TCGCond cond, TCGReg ret,
@@ -1009,8 +969,6 @@ static void tcg_out_tlb_load(TCGContext *s, TCGReg addrl,
 /* Compare masked address with the TLB entry. */
 label_ptr[0] = s->code_ptr;
 tcg_out_opc_branch(s, OPC_BNE, TCG_REG_TMP0, TCG_REG_TMP1, 0);
-/* NOP to allow patching later */
-tcg_out_opc_imm(s, OPC_ADDI, TCG_REG_ZERO, TCG_REG_ZERO, 0);
 
 /* TLB Hit - translate address using addend.  */
 if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
@@ -1054,8 +1012,7 @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
 }
 
 /* resolve label address */
-if (!patch_reloc(l->label_ptr[0], R_RISCV_BRANCH,
- (intptr_t) s->code_ptr, 0)) {
+if (!reloc_sbimm12(l->label_ptr[0], s->code_ptr)) {
 return false;
 }
 
@@ -1089,8 +1046,7 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
 }
 
 /* resolve label address */
-if (!patch_reloc(l->label_ptr[0], R_RISCV_BRANCH,
- (intptr_t) s->code_ptr, 0)) {
+if (!reloc_sbimm12(l->label_ptr[0], s->code_ptr)) {
 return false;
 }
 
-- 
2.25.1
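
As a reading aid for the patch above: with the branch-over-branch fallback
gone, a conditional branch whose target is out of range is reported as a
failed relocation, and code generation is restarted with a smaller TB. A
minimal sketch of the kind of range test involved, assuming the standard
RISC-V B-type encoding (13-bit signed, even byte offsets). The names
fits_signed and try_resolve_branch are hypothetical and are not taken from
QEMU:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True if 'value' is representable as a signed two's-complement
 * field that is 'bits' wide.  Hypothetical helper, not QEMU's sextreg().
 */
static bool fits_signed(int64_t value, unsigned bits)
{
    assert(bits >= 1 && bits <= 63);
    int64_t max = ((int64_t)1 << (bits - 1)) - 1;
    int64_t min = -max - 1;
    return value >= min && value <= max;
}

/*
 * Try to resolve a conditional branch located at branch_addr.
 * On success, store the byte offset and return true.  On failure,
 * return false; the caller is expected to retry with less code
 * rather than patch in an inverted branch plus a long jump.
 */
static bool try_resolve_branch(uint64_t branch_addr, uint64_t target_addr,
                               int32_t *offset_out)
{
    int64_t offset = (int64_t)(target_addr - branch_addr);

    /* B-type offsets are even and must fit in 13 signed bits. */
    if ((offset & 1) != 0 || !fits_signed(offset, 13)) {
        return false;
    }
    *offset_out = (int32_t)offset;
    return true;
}
```

The point of returning false instead of emitting a fallback sequence is that
the common case no longer pays for a reserved nop after every branch.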




[PATCH v3 05/41] tcg: Introduce tcg_splitwx_to_{rx,rw}

2020-11-05 Thread Richard Henderson
Add two helper functions, using a global variable to hold
the displacement.  The displacement is currently always 0,
so no change in behaviour.

Begin using the functions in tcg common code only.

Signed-off-by: Richard Henderson 
---
 accel/tcg/tcg-runtime.h   |  2 +-
 include/disas/disas.h |  2 +-
 include/exec/exec-all.h   |  2 +-
 include/exec/log.h|  2 +-
 include/tcg/tcg.h | 26 ++
 accel/tcg/cpu-exec.c  |  2 +-
 accel/tcg/tcg-runtime.c   |  2 +-
 accel/tcg/translate-all.c | 33 +++
 disas.c   |  4 ++-
 tcg/tcg.c | 56 ++-
 tcg/tci.c |  5 ++--
 accel/tcg/trace-events|  2 +-
 tcg/tcg-pool.c.inc|  6 -
 13 files changed, 104 insertions(+), 40 deletions(-)

diff --git a/accel/tcg/tcg-runtime.h b/accel/tcg/tcg-runtime.h
index 4eda24e63a..c276c8beb5 100644
--- a/accel/tcg/tcg-runtime.h
+++ b/accel/tcg/tcg-runtime.h
@@ -24,7 +24,7 @@ DEF_HELPER_FLAGS_1(clrsb_i64, TCG_CALL_NO_RWG_SE, i64, i64)
 DEF_HELPER_FLAGS_1(ctpop_i32, TCG_CALL_NO_RWG_SE, i32, i32)
 DEF_HELPER_FLAGS_1(ctpop_i64, TCG_CALL_NO_RWG_SE, i64, i64)
 
-DEF_HELPER_FLAGS_1(lookup_tb_ptr, TCG_CALL_NO_WG_SE, ptr, env)
+DEF_HELPER_FLAGS_1(lookup_tb_ptr, TCG_CALL_NO_WG_SE, cptr, env)
 
 DEF_HELPER_FLAGS_1(exit_atomic, TCG_CALL_NO_WG, noreturn, env)
 
diff --git a/include/disas/disas.h b/include/disas/disas.h
index 36c33f6f19..d363e95ede 100644
--- a/include/disas/disas.h
+++ b/include/disas/disas.h
@@ -7,7 +7,7 @@
 #include "cpu.h"
 
 /* Disassemble this for me please... (debugging). */
-void disas(FILE *out, void *code, unsigned long size);
+void disas(FILE *out, const void *code, unsigned long size);
 void target_disas(FILE *out, CPUState *cpu, target_ulong code,
   target_ulong size);
 
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 4707ac140c..aa65103702 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -448,7 +448,7 @@ int probe_access_flags(CPUArchState *env, target_ulong addr,
  * Note: the address of search data can be obtained by adding @size to @ptr.
  */
 struct tb_tc {
-void *ptr;/* pointer to the translated code */
+const void *ptr;/* pointer to the translated code */
 size_t size;
 };
 
diff --git a/include/exec/log.h b/include/exec/log.h
index e02fff5de1..3c7fa65ead 100644
--- a/include/exec/log.h
+++ b/include/exec/log.h
@@ -56,7 +56,7 @@ static inline void log_target_disas(CPUState *cpu, target_ulong start,
 rcu_read_unlock();
 }
 
-static inline void log_disas(void *code, unsigned long size)
+static inline void log_disas(const void *code, unsigned long size)
 {
 QemuLogFile *logfile;
 rcu_read_lock();
diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index e4d0ace44b..249f83be72 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -678,6 +678,7 @@ struct TCGContext {
 extern TCGContext tcg_init_ctx;
 extern __thread TCGContext *tcg_ctx;
 extern void *tcg_code_gen_epilogue;
+extern uintptr_t tcg_splitwx_diff;
 extern TCGv_env cpu_env;
 
 static inline bool in_code_gen_buffer(const void *p)
@@ -686,6 +687,21 @@ static inline bool in_code_gen_buffer(const void *p)
 return (size_t)(p - s->code_gen_buffer) <= s->code_gen_buffer_size;
 }
 
+#ifdef CONFIG_DEBUG_TCG
+const void *tcg_splitwx_to_rx(void *rw);
+void *tcg_splitwx_to_rw(const void *rx);
+#else
+static inline const void *tcg_splitwx_to_rx(void *rw)
+{
+return rw ? rw + tcg_splitwx_diff : NULL;
+}
+
+static inline void *tcg_splitwx_to_rw(const void *rx)
+{
+return rx ? (void *)rx - tcg_splitwx_diff : NULL;
+}
+#endif
+
 static inline size_t temp_idx(TCGTemp *ts)
 {
 ptrdiff_t n = ts - tcg_ctx->temps;
@@ -1106,7 +1122,7 @@ static inline TCGLabel *arg_label(TCGArg i)
  * correct result.
  */
 
-static inline ptrdiff_t tcg_ptr_byte_diff(void *a, void *b)
+static inline ptrdiff_t tcg_ptr_byte_diff(const void *a, const void *b)
 {
 return a - b;
 }
@@ -1120,9 +1136,9 @@ static inline ptrdiff_t tcg_ptr_byte_diff(void *a, void *b)
  * to the destination address.
  */
 
-static inline ptrdiff_t tcg_pcrel_diff(TCGContext *s, void *target)
+static inline ptrdiff_t tcg_pcrel_diff(TCGContext *s, const void *target)
 {
-return tcg_ptr_byte_diff(target, s->code_ptr);
+return tcg_ptr_byte_diff(target, tcg_splitwx_to_rx(s->code_ptr));
 }
 
 /**
@@ -1228,9 +1244,9 @@ static inline unsigned get_mmuidx(TCGMemOpIdx oi)
 #define TB_EXIT_REQUESTED 3
 
 #ifdef CONFIG_TCG_INTERPRETER
-uintptr_t tcg_qemu_tb_exec(CPUArchState *env, void *tb_ptr);
+uintptr_t tcg_qemu_tb_exec(CPUArchState *env, const void *tb_ptr);
 #else
-typedef uintptr_t tcg_prologue_fn(CPUArchState *env, void *tb_ptr);
+typedef uintptr_t tcg_prologue_fn(CPUArchState *env, const void *tb_ptr);
 extern tcg_prologue_fn *tcg_qemu_tb_exec;
 #endif
 
diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index 58aea605d8..1e3cb570f6 100644
--- a/accel/tcg/cpu-exec
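
The two helpers added to include/tcg/tcg.h above convert between the
writable and the executable address of the same byte by adding or
subtracting one global displacement, with NULL mapped to NULL. A standalone
sketch of that idea follows. It uses uintptr_t arithmetic instead of the GCC
void-pointer extension the patch relies on, and the names splitwx_diff,
to_rx, to_rw and check_roundtrip are stand-ins chosen for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the global displacement: rx base minus rw base. */
static uintptr_t splitwx_diff;

/* Writable address -> executable alias; NULL stays NULL. */
static const void *to_rx(void *rw)
{
    return rw ? (const void *)((uintptr_t)rw + splitwx_diff) : NULL;
}

/* Executable address -> writable alias; NULL stays NULL. */
static void *to_rw(const void *rx)
{
    return rx ? (void *)((uintptr_t)rx - splitwx_diff) : NULL;
}

/* Converting to rx and back must give the original pointer. */
static void check_roundtrip(void *rw)
{
    assert(to_rw(to_rx(rw)) == rw);
}
```

With a displacement of 0, as in this patch, both conversions are the
identity, which is why the commit message can promise no change in
behaviour.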

[PATCH v3 15/41] accel/tcg: Support split-wx for linux with memfd

2020-11-05 Thread Richard Henderson
We cannot use a real temp file, because we would need to find
a filesystem that does not have noexec enabled.  However, a
memfd is not associated with any filesystem.

Signed-off-by: Richard Henderson 
---
 accel/tcg/translate-all.c | 84 +++
 1 file changed, 76 insertions(+), 8 deletions(-)

diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index a29cb4a42e..1931e65365 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -1078,17 +1078,11 @@ static bool alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
 return true;
 }
 #else
-static bool alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
+static bool alloc_code_gen_buffer_anon(size_t size, int prot,
+   int flags, Error **errp)
 {
-int prot = PROT_WRITE | PROT_READ | PROT_EXEC;
-int flags = MAP_PRIVATE | MAP_ANONYMOUS;
 void *buf;
 
-if (splitwx > 0) {
-error_setg(errp, "jit split-wx not supported");
-return false;
-}
-
 buf = mmap(NULL, size, prot, flags, -1, 0);
 if (buf == MAP_FAILED) {
 error_setg_errno(errp, errno,
@@ -1137,6 +1131,80 @@ static bool alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
 tcg_ctx->code_gen_buffer = buf;
 return true;
 }
+
+#ifdef CONFIG_POSIX
+#include "qemu/memfd.h"
+
+static bool alloc_code_gen_buffer_splitwx_memfd(size_t size, Error **errp)
+{
+void *buf_rw, *buf_rx;
+int fd = -1;
+
+buf_rw = qemu_memfd_alloc("tcg-jit", size, 0, &fd, errp);
+if (buf_rw == NULL) {
+return false;
+}
+
+buf_rx = mmap(NULL, size, PROT_READ | PROT_EXEC, MAP_SHARED, fd, 0);
+if (buf_rx == MAP_FAILED) {
+error_setg_errno(errp, errno,
+ "failed to map shared memory for execute");
+munmap(buf_rw, size);
+close(fd);
+return false;
+}
+close(fd);
+
+tcg_ctx->code_gen_buffer = buf_rw;
+tcg_ctx->code_gen_buffer_size = size;
+tcg_splitwx_diff = buf_rx - buf_rw;
+
+/* Request large pages for the buffer and the splitwx.  */
+qemu_madvise(buf_rw, size, QEMU_MADV_HUGEPAGE);
+qemu_madvise(buf_rx, size, QEMU_MADV_HUGEPAGE);
+return true;
+}
+#endif /* CONFIG_POSIX */
+
+static bool alloc_code_gen_buffer_splitwx(size_t size, Error **errp)
+{
+if (TCG_TARGET_SUPPORT_MIRROR) {
+#ifdef CONFIG_POSIX
+return alloc_code_gen_buffer_splitwx_memfd(size, errp);
+#endif
+}
+error_setg(errp, "jit split-wx not supported");
+return false;
+}
+
+static bool alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
+{
+ERRP_GUARD();
+int prot, flags;
+
+if (splitwx) {
+if (alloc_code_gen_buffer_splitwx(size, errp)) {
+return true;
+}
+/*
+ * If splitwx force-on (1), fail;
+ * if splitwx default-on (-1), fall through to splitwx off.
+ */
+if (splitwx > 0) {
+return false;
+}
+error_free_or_abort(errp);
+}
+
+prot = PROT_READ | PROT_WRITE | PROT_EXEC;
+flags = MAP_PRIVATE | MAP_ANONYMOUS;
+#ifdef CONFIG_TCG_INTERPRETER
+/* The tcg interpreter does not need execute permission. */
+prot = PROT_READ | PROT_WRITE;
+#endif
+
+return alloc_code_gen_buffer_anon(size, prot, flags, errp);
+}
 #endif /* USE_STATIC_CODE_GEN_BUFFER, WIN32, POSIX */
 
 static bool tb_cmp(const void *ap, const void *bp)
-- 
2.25.1
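
For readers unfamiliar with the technique in the patch above: one memfd is
mapped twice, once read+write and once read+execute, and the difference
between the two base addresses becomes the split-wx displacement. The
following is a Linux-only sketch under several assumptions. It calls the
memfd_create system call directly through syscall() instead of using
qemu_memfd_alloc, it omits the huge page advice, and struct dual_map,
dual_map_create and dual_map_destroy are hypothetical names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical container for the two views; not a QEMU type. */
struct dual_map {
    void *rw;          /* writable view */
    const void *rx;    /* read+execute view of the same pages */
    ptrdiff_t diff;    /* rx - rw, in bytes */
    size_t size;
};

/* Returns 0 on success, -1 on failure. */
static int dual_map_create(struct dual_map *m, size_t size)
{
    void *rw, *rx;
    int fd;

    assert(m != NULL && size > 0);

    /* A memfd is not backed by any filesystem, so noexec mounts do not apply. */
    fd = (int)syscall(SYS_memfd_create, "jit-sketch", 0u);
    if (fd < 0) {
        return -1;
    }
    if (ftruncate(fd, (off_t)size) != 0) {
        close(fd);
        return -1;
    }

    rw = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (rw == MAP_FAILED) {
        close(fd);
        return -1;
    }
    rx = mmap(NULL, size, PROT_READ | PROT_EXEC, MAP_SHARED, fd, 0);
    close(fd);                  /* the mappings keep the memory alive */
    if (rx == MAP_FAILED) {
        munmap(rw, size);
        return -1;
    }

    m->rw = rw;
    m->rx = rx;
    m->diff = (ptrdiff_t)((intptr_t)rx - (intptr_t)rw);
    m->size = size;
    return 0;
}

/* Unmap both views. */
static void dual_map_destroy(struct dual_map *m)
{
    munmap(m->rw, m->size);
    munmap((void *)m->rx, m->size);
}
```

Both mappings must be MAP_SHARED; a private mapping would get its own copy
of the pages on write, and the executable view would never see the
generated code.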




[PATCH v3 10/41] tcg: Adjust tb_target_set_jmp_target for split-wx

2020-11-05 Thread Richard Henderson
Pass both rx and rw addresses to tb_target_set_jmp_target.

Signed-off-by: Richard Henderson 
---
 tcg/aarch64/tcg-target.h |  2 +-
 tcg/arm/tcg-target.h |  2 +-
 tcg/i386/tcg-target.h|  6 +++---
 tcg/mips/tcg-target.h|  2 +-
 tcg/ppc/tcg-target.h |  2 +-
 tcg/riscv/tcg-target.h   |  2 +-
 tcg/s390/tcg-target.h|  8 
 tcg/sparc/tcg-target.h   |  2 +-
 tcg/tci/tcg-target.h |  6 +++---
 accel/tcg/cpu-exec.c |  4 +++-
 tcg/aarch64/tcg-target.c.inc | 12 ++--
 tcg/mips/tcg-target.c.inc|  8 
 tcg/ppc/tcg-target.c.inc | 16 
 tcg/sparc/tcg-target.c.inc   | 14 +++---
 14 files changed, 44 insertions(+), 42 deletions(-)

diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h
index d0a6a059b7..91313d93be 100644
--- a/tcg/aarch64/tcg-target.h
+++ b/tcg/aarch64/tcg-target.h
@@ -158,7 +158,7 @@ static inline void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len)
 __builtin___clear_cache((char *)rx, (char *)(rx + len));
 }
 
-void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t);
+void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t, uintptr_t);
 
 #ifdef CONFIG_SOFTMMU
 #define TCG_TARGET_NEED_LDST_LABELS
diff --git a/tcg/arm/tcg-target.h b/tcg/arm/tcg-target.h
index fa88b24e43..b21a2fb6a1 100644
--- a/tcg/arm/tcg-target.h
+++ b/tcg/arm/tcg-target.h
@@ -144,7 +144,7 @@ static inline void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len)
 }
 
 /* not defined -- call should be eliminated at compile time */
-void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t);
+void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t, uintptr_t);
 
 #ifdef CONFIG_SOFTMMU
 #define TCG_TARGET_NEED_LDST_LABELS
diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
index 8323e72639..f52ba0ffec 100644
--- a/tcg/i386/tcg-target.h
+++ b/tcg/i386/tcg-target.h
@@ -211,11 +211,11 @@ static inline void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len)
 {
 }
 
-static inline void tb_target_set_jmp_target(uintptr_t tc_ptr,
-uintptr_t jmp_addr, uintptr_t addr)
+static inline void tb_target_set_jmp_target(uintptr_t tc_ptr, uintptr_t jmp_rx,
+uintptr_t jmp_rw, uintptr_t addr)
 {
 /* patch the branch destination */
-qatomic_set((int32_t *)jmp_addr, addr - (jmp_addr + 4));
+qatomic_set((int32_t *)jmp_rw, addr - (jmp_rx + 4));
 /* no need to flush icache explicitly */
 }
 
diff --git a/tcg/mips/tcg-target.h b/tcg/mips/tcg-target.h
index 47b1226ee9..cd548dacec 100644
--- a/tcg/mips/tcg-target.h
+++ b/tcg/mips/tcg-target.h
@@ -216,7 +216,7 @@ static inline void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len)
 cacheflush((void *)rx, len, ICACHE);
 }
 
-void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t);
+void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t, uintptr_t);
 
 #ifdef CONFIG_SOFTMMU
 #define TCG_TARGET_NEED_LDST_LABELS
diff --git a/tcg/ppc/tcg-target.h b/tcg/ppc/tcg-target.h
index fbb6dc1b47..8f3e4c924a 100644
--- a/tcg/ppc/tcg-target.h
+++ b/tcg/ppc/tcg-target.h
@@ -176,7 +176,7 @@ extern bool have_vsx;
 #define TCG_TARGET_HAS_cmpsel_vec   0
 
 void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len);
-void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t);
+void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t, uintptr_t);
 
 #define TCG_TARGET_DEFAULT_MO (0)
 #define TCG_TARGET_HAS_MEMORY_BSWAP 1
diff --git a/tcg/riscv/tcg-target.h b/tcg/riscv/tcg-target.h
index 0fa6ae358e..e03fd17427 100644
--- a/tcg/riscv/tcg-target.h
+++ b/tcg/riscv/tcg-target.h
@@ -169,7 +169,7 @@ static inline void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len)
 }
 
 /* not defined -- call should be eliminated at compile time */
-void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t);
+void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t, uintptr_t);
 
 #define TCG_TARGET_DEFAULT_MO (0)
 
diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
index c3dc2e8938..c5a749e425 100644
--- a/tcg/s390/tcg-target.h
+++ b/tcg/s390/tcg-target.h
@@ -150,12 +150,12 @@ static inline void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len)
 {
 }
 
-static inline void tb_target_set_jmp_target(uintptr_t tc_ptr,
-uintptr_t jmp_addr, uintptr_t addr)
+static inline void tb_target_set_jmp_target(uintptr_t tc_ptr, uintptr_t jmp_rx,
+uintptr_t jmp_rw, uintptr_t addr)
 {
 /* patch the branch destination */
-intptr_t disp = addr - (jmp_addr - 2);
-qatomic_set((int32_t *)jmp_addr, disp / 2);
+intptr_t disp = addr - (jmp_rx - 2);
+qatomic_set((int32_t *)jmp_rw, disp / 2);
 /* no need to flush icache explicitly */
 }
 
diff --git a/tcg/sparc/tcg-target.h b/tcg/sparc/tcg-target
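
The i386 hunk above shows why both addresses are needed: the new
displacement is stored through the writable alias, but it is computed
relative to the executable address, because that is where the CPU will
fetch the jump from. A small sketch of the same pattern, assuming a 32-bit
pc-relative field that ends 4 bytes after jmp_rx. The function name
patch_rel32 is hypothetical, and the GCC/Clang builtin __atomic_store_n is
used here as a stand-in for qatomic_set:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Patch a 32-bit pc-relative displacement.
 *   jmp_rx: address of the displacement field as seen when executing
 *   jmp_rw: writable alias of that same field
 *   target: executable address the jump should reach
 */
static void patch_rel32(uintptr_t jmp_rx, uintptr_t jmp_rw, uintptr_t target)
{
    /* The displacement is relative to the end of the 4-byte field. */
    intptr_t disp = (intptr_t)target - (intptr_t)(jmp_rx + 4);

    /* The sketch only handles targets reachable with 32 bits. */
    assert(disp == (int32_t)disp);

    /* Single aligned store through the writable view. */
    __atomic_store_n((int32_t *)jmp_rw, (int32_t)disp, __ATOMIC_RELAXED);
}
```

When the two views coincide, jmp_rx equals jmp_rw and this reduces to the
code that the patch replaces.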

[PATCH v3 14/41] tcg: Add --accel tcg,split-wx property

2020-11-05 Thread Richard Henderson
Plumb the value through to alloc_code_gen_buffer.  This is not yet
supported by any OS or tcg backend, so for now enabling it will
result in an error.

Signed-off-by: Richard Henderson 
---
 include/sysemu/tcg.h  |  3 ++-
 tcg/aarch64/tcg-target.h  |  1 +
 tcg/arm/tcg-target.h  |  1 +
 tcg/i386/tcg-target.h |  1 +
 tcg/mips/tcg-target.h |  1 +
 tcg/ppc/tcg-target.h  |  1 +
 tcg/riscv/tcg-target.h|  1 +
 tcg/s390/tcg-target.h |  1 +
 tcg/sparc/tcg-target.h|  1 +
 tcg/tci/tcg-target.h  |  1 +
 accel/tcg/tcg-all.c   | 26 +-
 accel/tcg/translate-all.c | 35 +++
 bsd-user/main.c   |  2 +-
 linux-user/main.c |  2 +-
 14 files changed, 65 insertions(+), 12 deletions(-)

diff --git a/include/sysemu/tcg.h b/include/sysemu/tcg.h
index d9d3ca8559..00349fb18a 100644
--- a/include/sysemu/tcg.h
+++ b/include/sysemu/tcg.h
@@ -8,7 +8,8 @@
 #ifndef SYSEMU_TCG_H
 #define SYSEMU_TCG_H
 
-void tcg_exec_init(unsigned long tb_size);
+void tcg_exec_init(unsigned long tb_size, int splitwx);
+
 #ifdef CONFIG_TCG
 extern bool tcg_allowed;
 #define tcg_enabled() (tcg_allowed)
diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h
index 91313d93be..fa64058d43 100644
--- a/tcg/aarch64/tcg-target.h
+++ b/tcg/aarch64/tcg-target.h
@@ -164,5 +164,6 @@ void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t, uintptr_t);
 #define TCG_TARGET_NEED_LDST_LABELS
 #endif
 #define TCG_TARGET_NEED_POOL_LABELS
+#define TCG_TARGET_SUPPORT_MIRROR   0
 
 #endif /* AARCH64_TCG_TARGET_H */
diff --git a/tcg/arm/tcg-target.h b/tcg/arm/tcg-target.h
index b21a2fb6a1..e355d6a4b2 100644
--- a/tcg/arm/tcg-target.h
+++ b/tcg/arm/tcg-target.h
@@ -150,5 +150,6 @@ void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t, uintptr_t);
 #define TCG_TARGET_NEED_LDST_LABELS
 #endif
 #define TCG_TARGET_NEED_POOL_LABELS
+#define TCG_TARGET_SUPPORT_MIRROR   0
 
 #endif
diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
index f52ba0ffec..1b9d41bd56 100644
--- a/tcg/i386/tcg-target.h
+++ b/tcg/i386/tcg-target.h
@@ -236,5 +236,6 @@ static inline void tb_target_set_jmp_target(uintptr_t tc_ptr, uintptr_t jmp_rx,
 #define TCG_TARGET_NEED_LDST_LABELS
 #endif
 #define TCG_TARGET_NEED_POOL_LABELS
+#define TCG_TARGET_SUPPORT_MIRROR   0
 
 #endif
diff --git a/tcg/mips/tcg-target.h b/tcg/mips/tcg-target.h
index cd548dacec..d231522dc9 100644
--- a/tcg/mips/tcg-target.h
+++ b/tcg/mips/tcg-target.h
@@ -206,6 +206,7 @@ extern bool use_mips32r2_instructions;
 
 #define TCG_TARGET_DEFAULT_MO (0)
 #define TCG_TARGET_HAS_MEMORY_BSWAP 1
+#define TCG_TARGET_SUPPORT_MIRROR   0
 
 /* Flush the dcache at RW, and the icache at RX, as necessary. */
 static inline void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len)
diff --git a/tcg/ppc/tcg-target.h b/tcg/ppc/tcg-target.h
index 8f3e4c924a..78d6a5e96f 100644
--- a/tcg/ppc/tcg-target.h
+++ b/tcg/ppc/tcg-target.h
@@ -185,5 +185,6 @@ void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t, uintptr_t);
 #define TCG_TARGET_NEED_LDST_LABELS
 #endif
 #define TCG_TARGET_NEED_POOL_LABELS
+#define TCG_TARGET_SUPPORT_MIRROR   0
 
 #endif
diff --git a/tcg/riscv/tcg-target.h b/tcg/riscv/tcg-target.h
index e03fd17427..3c2e8305b0 100644
--- a/tcg/riscv/tcg-target.h
+++ b/tcg/riscv/tcg-target.h
@@ -179,5 +179,6 @@ void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t, uintptr_t);
 #define TCG_TARGET_NEED_POOL_LABELS
 
 #define TCG_TARGET_HAS_MEMORY_BSWAP 0
+#define TCG_TARGET_SUPPORT_MIRROR   0
 
 #endif
diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
index c5a749e425..8324197127 100644
--- a/tcg/s390/tcg-target.h
+++ b/tcg/s390/tcg-target.h
@@ -163,5 +163,6 @@ static inline void tb_target_set_jmp_target(uintptr_t tc_ptr, uintptr_t jmp_rx,
 #define TCG_TARGET_NEED_LDST_LABELS
 #endif
 #define TCG_TARGET_NEED_POOL_LABELS
+#define TCG_TARGET_SUPPORT_MIRROR   0
 
 #endif
diff --git a/tcg/sparc/tcg-target.h b/tcg/sparc/tcg-target.h
index 87e2be61e6..517840705f 100644
--- a/tcg/sparc/tcg-target.h
+++ b/tcg/sparc/tcg-target.h
@@ -181,5 +181,6 @@ static inline void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len)
 void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t, uintptr_t);
 
 #define TCG_TARGET_NEED_POOL_LABELS
+#define TCG_TARGET_SUPPORT_MIRROR   0
 
 #endif
diff --git a/tcg/tci/tcg-target.h b/tcg/tci/tcg-target.h
index a19a6b06e5..3653fef947 100644
--- a/tcg/tci/tcg-target.h
+++ b/tcg/tci/tcg-target.h
@@ -200,6 +200,7 @@ static inline void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len)
 #define TCG_TARGET_DEFAULT_MO  (0)
 
 #define TCG_TARGET_HAS_MEMORY_BSWAP 1
+#define TCG_TARGET_SUPPORT_MIRROR   0
 
 static inline void tb_target_set_jmp_target(uintptr_t tc_ptr, uintptr_t jmp_rx,
 uintptr_t jmp_rw, uintptr_t addr)
diff --git a/accel/tcg/tcg-all.c b/accel/tcg/tcg-all.c
inde

[PATCH v3 03/41] tcg: Move tcg epilogue pointer out of TCGContext

2020-11-05 Thread Richard Henderson
This value is constant across all thread-local copies of TCGContext,
so we might as well move it out of thread-local storage.

Signed-off-by: Richard Henderson 
---
 include/tcg/tcg.h| 2 +-
 accel/tcg/tcg-runtime.c  | 2 +-
 tcg/tcg.c| 3 ++-
 tcg/aarch64/tcg-target.c.inc | 4 ++--
 tcg/arm/tcg-target.c.inc | 2 +-
 tcg/i386/tcg-target.c.inc| 4 ++--
 tcg/mips/tcg-target.c.inc| 2 +-
 tcg/ppc/tcg-target.c.inc | 2 +-
 tcg/riscv/tcg-target.c.inc   | 4 ++--
 tcg/s390/tcg-target.c.inc| 4 ++--
 tcg/sparc/tcg-target.c.inc   | 2 +-
 11 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index 9cc412f90c..bb1e97b13b 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -621,7 +621,6 @@ struct TCGContext {
here, because there's too much arithmetic throughout that relies
on addition and subtraction working on bytes.  Rely on the GCC
extension that allows arithmetic on void*.  */
-void *code_gen_epilogue;
 void *code_gen_buffer;
 size_t code_gen_buffer_size;
 void *code_gen_ptr;
@@ -678,6 +677,7 @@ struct TCGContext {
 
 extern TCGContext tcg_init_ctx;
 extern __thread TCGContext *tcg_ctx;
+extern void *tcg_code_gen_epilogue;
 extern TCGv_env cpu_env;
 
 static inline size_t temp_idx(TCGTemp *ts)
diff --git a/accel/tcg/tcg-runtime.c b/accel/tcg/tcg-runtime.c
index 446465a09a..f85dfefeab 100644
--- a/accel/tcg/tcg-runtime.c
+++ b/accel/tcg/tcg-runtime.c
@@ -154,7 +154,7 @@ void *HELPER(lookup_tb_ptr)(CPUArchState *env)
 
 tb = tb_lookup__cpu_state(cpu, &pc, &cs_base, &flags, curr_cflags());
 if (tb == NULL) {
-return tcg_ctx->code_gen_epilogue;
+return tcg_code_gen_epilogue;
 }
 qemu_log_mask_and_addr(CPU_LOG_EXEC, pc,
"Chain %d: %p ["
diff --git a/tcg/tcg.c b/tcg/tcg.c
index a6f47b033c..1e83eed83f 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -160,6 +160,7 @@ static int tcg_out_ldst_finalize(TCGContext *s);
 static TCGContext **tcg_ctxs;
 static unsigned int n_tcg_ctxs;
 TCGv_env cpu_env = 0;
+void *tcg_code_gen_epilogue;
 
 #ifndef CONFIG_TCG_INTERPRETER
 tcg_prologue_fn *tcg_qemu_tb_exec;
@@ -1130,7 +1131,7 @@ void tcg_prologue_init(TCGContext *s)
 
 /* Assert that goto_ptr is implemented completely.  */
 if (TCG_TARGET_HAS_goto_ptr) {
-tcg_debug_assert(s->code_gen_epilogue != NULL);
+tcg_debug_assert(tcg_code_gen_epilogue != NULL);
 }
 }
 
diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
index 83af3108a4..76f8ae48ad 100644
--- a/tcg/aarch64/tcg-target.c.inc
+++ b/tcg/aarch64/tcg-target.c.inc
@@ -1873,7 +1873,7 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
 case INDEX_op_exit_tb:
 /* Reuse the zeroing that exists for goto_ptr.  */
 if (a0 == 0) {
-tcg_out_goto_long(s, s->code_gen_epilogue);
+tcg_out_goto_long(s, tcg_code_gen_epilogue);
 } else {
 tcg_out_movi(s, TCG_TYPE_I64, TCG_REG_X0, a0);
 tcg_out_goto_long(s, tb_ret_addr);
@@ -2894,7 +2894,7 @@ static void tcg_target_qemu_prologue(TCGContext *s)
  * Return path for goto_ptr. Set return value to 0, a-la exit_tb,
  * and fall through to the rest of the epilogue.
  */
-s->code_gen_epilogue = s->code_ptr;
+tcg_code_gen_epilogue = s->code_ptr;
 tcg_out_movi(s, TCG_TYPE_REG, TCG_REG_X0, 0);
 
 /* TB epilogue */
diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc
index 62c37a954b..1e32bf42b8 100644
--- a/tcg/arm/tcg-target.c.inc
+++ b/tcg/arm/tcg-target.c.inc
@@ -2297,7 +2297,7 @@ static void tcg_target_qemu_prologue(TCGContext *s)
  * Return path for goto_ptr. Set return value to 0, a-la exit_tb,
  * and fall through to the rest of the epilogue.
  */
-s->code_gen_epilogue = s->code_ptr;
+tcg_code_gen_epilogue = s->code_ptr;
 tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R0, 0);
 tcg_out_epilogue(s);
 }
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index d8797ed398..424dd1cdcf 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -2267,7 +2267,7 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
 case INDEX_op_exit_tb:
 /* Reuse the zeroing that exists for goto_ptr.  */
 if (a0 == 0) {
-tcg_out_jmp(s, s->code_gen_epilogue);
+tcg_out_jmp(s, tcg_code_gen_epilogue);
 } else {
 tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_EAX, a0);
 tcg_out_jmp(s, tb_ret_addr);
@@ -3825,7 +3825,7 @@ static void tcg_target_qemu_prologue(TCGContext *s)
  * Return path for goto_ptr. Set return value to 0, a-la exit_tb,
  * and fall through to the rest of the epilogue.
  */
-s->code_gen_epilogue = s->code_ptr;
+tcg_code_gen_epilogue = s->code_ptr;
 tcg_out_movi(s, TCG_TYPE_REG, TCG_REG_EAX, 0);
 
 /* TB epilogue */
diff --git a/tcg/mips/tcg-target.c.inc

[PATCH v3 06/41] tcg: Adjust TCGLabel for const

2020-11-05 Thread Richard Henderson
Change TCGLabel.u.value_ptr to const, and initialize it with
tcg_splitwx_to_rx.  Propagate const through tcg/host/ only
as far as needed to avoid errors from the value_ptr change.

Signed-off-by: Richard Henderson 
---
 include/tcg/tcg.h| 2 +-
 tcg/tcg.c| 2 +-
 tcg/aarch64/tcg-target.c.inc | 2 +-
 tcg/arm/tcg-target.c.inc | 2 +-
 tcg/mips/tcg-target.c.inc| 5 +++--
 tcg/ppc/tcg-target.c.inc | 4 ++--
 tcg/s390/tcg-target.c.inc| 2 +-
 7 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index 249f83be72..b2ba16ea8c 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -261,7 +261,7 @@ struct TCGLabel {
 unsigned refs : 16;
 union {
 uintptr_t value;
-tcg_insn_unit *value_ptr;
+const tcg_insn_unit *value_ptr;
 } u;
 QSIMPLEQ_HEAD(, TCGRelocation) relocs;
 QSIMPLEQ_ENTRY(TCGLabel) next;
diff --git a/tcg/tcg.c b/tcg/tcg.c
index cea3a4e4f2..a88b314e97 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -305,7 +305,7 @@ static void tcg_out_label(TCGContext *s, TCGLabel *l, tcg_insn_unit *ptr)
 {
 tcg_debug_assert(!l->has_value);
 l->has_value = 1;
-l->u.value_ptr = ptr;
+l->u.value_ptr = tcg_splitwx_to_rx(ptr);
 }
 
 TCGLabel *gen_new_label(void)
diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
index 76f8ae48ad..96dc9f4d0b 100644
--- a/tcg/aarch64/tcg-target.c.inc
+++ b/tcg/aarch64/tcg-target.c.inc
@@ -1306,7 +1306,7 @@ static void tcg_out_cmp(TCGContext *s, TCGType ext, TCGReg a,
 }
 }
 
-static inline void tcg_out_goto(TCGContext *s, tcg_insn_unit *target)
+static void tcg_out_goto(TCGContext *s, const tcg_insn_unit *target)
 {
 ptrdiff_t offset = target - s->code_ptr;
 tcg_debug_assert(offset == sextract64(offset, 0, 26));
diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc
index 1e32bf42b8..f8f485d807 100644
--- a/tcg/arm/tcg-target.c.inc
+++ b/tcg/arm/tcg-target.c.inc
@@ -1019,7 +1019,7 @@ static inline void tcg_out_st8(TCGContext *s, int cond,
  * with the code buffer limited to 16MB we wouldn't need the long case.
  * But we also use it for the tail-call to the qemu_ld/st helpers, which does.
  */
-static void tcg_out_goto(TCGContext *s, int cond, tcg_insn_unit *addr)
+static void tcg_out_goto(TCGContext *s, int cond, const tcg_insn_unit *addr)
 {
 intptr_t addri = (intptr_t)addr;
 ptrdiff_t disp = tcg_pcrel_diff(s, addr);
diff --git a/tcg/mips/tcg-target.c.inc b/tcg/mips/tcg-target.c.inc
index f641105f9a..a3f838fa51 100644
--- a/tcg/mips/tcg-target.c.inc
+++ b/tcg/mips/tcg-target.c.inc
@@ -144,7 +144,8 @@ static tcg_insn_unit *bswap32_addr;
 static tcg_insn_unit *bswap32u_addr;
 static tcg_insn_unit *bswap64_addr;
 
-static inline uint32_t reloc_pc16_val(tcg_insn_unit *pc, tcg_insn_unit *target)
+static inline uint32_t reloc_pc16_val(tcg_insn_unit *pc,
+  const tcg_insn_unit *target)
 {
 /* Let the compiler perform the right-shift as part of the arithmetic.  */
 ptrdiff_t disp = target - (pc + 1);
@@ -152,7 +153,7 @@ static inline uint32_t reloc_pc16_val(tcg_insn_unit *pc, tcg_insn_unit *target)
 return disp & 0xffff;
 }
 
-static inline void reloc_pc16(tcg_insn_unit *pc, tcg_insn_unit *target)
+static inline void reloc_pc16(tcg_insn_unit *pc, const tcg_insn_unit *target)
 {
 *pc = deposit32(*pc, 0, 16, reloc_pc16_val(pc, target));
 }
diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
index be116c6164..8a0b20a86b 100644
--- a/tcg/ppc/tcg-target.c.inc
+++ b/tcg/ppc/tcg-target.c.inc
@@ -184,7 +184,7 @@ static inline bool in_range_b(tcg_target_long target)
 return target == sextract64(target, 0, 26);
 }
 
-static uint32_t reloc_pc24_val(tcg_insn_unit *pc, tcg_insn_unit *target)
+static uint32_t reloc_pc24_val(tcg_insn_unit *pc, const tcg_insn_unit *target)
 {
 ptrdiff_t disp = tcg_ptr_byte_diff(target, pc);
 tcg_debug_assert(in_range_b(disp));
@@ -201,7 +201,7 @@ static bool reloc_pc24(tcg_insn_unit *pc, tcg_insn_unit *target)
 return false;
 }
 
-static uint16_t reloc_pc14_val(tcg_insn_unit *pc, tcg_insn_unit *target)
+static uint16_t reloc_pc14_val(tcg_insn_unit *pc, const tcg_insn_unit *target)
 {
 ptrdiff_t disp = tcg_ptr_byte_diff(target, pc);
 tcg_debug_assert(disp == (int16_t) disp);
diff --git a/tcg/s390/tcg-target.c.inc b/tcg/s390/tcg-target.c.inc
index ac99ccea73..1b5c4f0ab0 100644
--- a/tcg/s390/tcg-target.c.inc
+++ b/tcg/s390/tcg-target.c.inc
@@ -1302,7 +1302,7 @@ static void tgen_extract(TCGContext *s, TCGReg dest, TCGReg src,
 tcg_out_risbg(s, dest, src, 64 - len, 63, 64 - ofs, 1);
 }
 
-static void tgen_gotoi(TCGContext *s, int cc, tcg_insn_unit *dest)
+static void tgen_gotoi(TCGContext *s, int cc, const tcg_insn_unit *dest)
 {
 ptrdiff_t off = dest - s->code_ptr;
 if (off == (int16_t)off) {
-- 
2.25.1




[PATCH v3 08/41] tcg: Adjust tcg_out_label for const

2020-11-05 Thread Richard Henderson
Simplify the arguments to always use s->code_ptr instead of
taking it as an argument.  That makes it easy to ensure that
the value_ptr is always the rx version.

Signed-off-by: Richard Henderson 
---
 tcg/tcg.c |  6 +++---
 tcg/i386/tcg-target.c.inc | 10 +-
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index d3eeea355c..e5d2208e88 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -301,11 +301,11 @@ static void tcg_out_reloc(TCGContext *s, tcg_insn_unit *code_ptr, int type,
 QSIMPLEQ_INSERT_TAIL(&l->relocs, r, next);
 }
 
-static void tcg_out_label(TCGContext *s, TCGLabel *l, tcg_insn_unit *ptr)
+static void tcg_out_label(TCGContext *s, TCGLabel *l)
 {
 tcg_debug_assert(!l->has_value);
 l->has_value = 1;
-l->u.value_ptr = tcg_splitwx_to_rx(ptr);
+l->u.value_ptr = tcg_splitwx_to_rx(s->code_ptr);
 }
 
 TCGLabel *gen_new_label(void)
@@ -4322,7 +4322,7 @@ int tcg_gen_code(TCGContext *s, TranslationBlock *tb)
 break;
 case INDEX_op_set_label:
 tcg_reg_alloc_bb_end(s, s->reserved_regs);
-tcg_out_label(s, arg_label(op->args[0]), s->code_ptr);
+tcg_out_label(s, arg_label(op->args[0]));
 break;
 case INDEX_op_call:
 tcg_reg_alloc_call(s, op);
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index 095553ce28..0ac1ef3d82 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -1452,7 +1452,7 @@ static void tcg_out_brcond2(TCGContext *s, const TCGArg *args,
 default:
 tcg_abort();
 }
-tcg_out_label(s, label_next, s->code_ptr);
+tcg_out_label(s, label_next);
 }
 #endif
 
@@ -1494,10 +1494,10 @@ static void tcg_out_setcond2(TCGContext *s, const TCGArg *args,
 
 tcg_out_movi(s, TCG_TYPE_I32, args[0], 0);
 tcg_out_jxx(s, JCC_JMP, label_over, 1);
-tcg_out_label(s, label_true, s->code_ptr);
+tcg_out_label(s, label_true);
 
 tcg_out_movi(s, TCG_TYPE_I32, args[0], 1);
-tcg_out_label(s, label_over, s->code_ptr);
+tcg_out_label(s, label_over);
 } else {
 /* When the destination does not overlap one of the arguments,
clear the destination first, jump if cond false, and emit an
@@ -1511,7 +1511,7 @@ static void tcg_out_setcond2(TCGContext *s, const TCGArg *args,
 tcg_out_brcond2(s, new_args, const_args+1, 1);
 
 tgen_arithi(s, ARITH_ADD, args[0], 1, 0);
-tcg_out_label(s, label_over, s->code_ptr);
+tcg_out_label(s, label_over);
 }
 }
 #endif
@@ -1525,7 +1525,7 @@ static void tcg_out_cmov(TCGContext *s, TCGCond cond, int rexw,
 TCGLabel *over = gen_new_label();
 tcg_out_jxx(s, tcg_cond_to_jcc[tcg_invert_cond(cond)], over, 1);
 tcg_out_mov(s, TCG_TYPE_I32, dest, v1);
-tcg_out_label(s, over, s->code_ptr);
+tcg_out_label(s, over);
 }
 }
 
-- 
2.25.1




[PATCH v3 02/41] tcg: Move tcg prologue pointer out of TCGContext

2020-11-05 Thread Richard Henderson
This value is constant across all thread-local copies of TCGContext,
so we might as well move it out of thread-local storage.

Use the correct function pointer type, and name the variable
tcg_qemu_tb_exec, which means that we are able to remove the
macro that does the casting.

Replace HAVE_TCG_QEMU_TB_EXEC with CONFIG_TCG_INTERPRETER,
as this is somewhat clearer in intent.
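
As a rough standalone sketch of the idea (all sketch_* names are
hypothetical, and an ordinary C function stands in for the generated
prologue code): a single global pointer of the right function type is set
once at init, and call sites need no cast.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the tcg_prologue_fn typedef in this patch. */
typedef uintptr_t sketch_prologue_fn(void *env, void *tb_ptr);

/* One process-wide pointer, set once, instead of a per-context void *. */
static sketch_prologue_fn *sketch_tb_exec;

/* Ordinary C function standing in for the generated prologue. */
static uintptr_t sketch_fake_prologue(void *env, void *tb_ptr)
{
    (void)env;
    return (uintptr_t)tb_ptr;
}

static void sketch_prologue_init(void)
{
    sketch_tb_exec = sketch_fake_prologue;
}

static uintptr_t sketch_run(void *env, void *tb_ptr)
{
    assert(sketch_tb_exec != 0);
    /* No cast at the call site: the pointer already has the right type. */
    return sketch_tb_exec(env, tb_ptr);
}
```

In the interpreter configuration there is no such pointer at all, since
tcg_qemu_tb_exec is then a real function.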

Signed-off-by: Richard Henderson 
---
 include/tcg/tcg.h| 9 -
 tcg/tci/tcg-target.h | 2 --
 tcg/tcg.c| 9 -
 tcg/tci.c| 3 ++-
 4 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index 8ff9dad4ef..9cc412f90c 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -621,7 +621,6 @@ struct TCGContext {
here, because there's too much arithmetic throughout that relies
on addition and subtraction working on bytes.  Rely on the GCC
extension that allows arithmetic on void*.  */
-void *code_gen_prologue;
 void *code_gen_epilogue;
 void *code_gen_buffer;
 size_t code_gen_buffer_size;
@@ -1222,11 +1221,11 @@ static inline unsigned get_mmuidx(TCGMemOpIdx oi)
 #define TB_EXIT_IDXMAX    1
 #define TB_EXIT_REQUESTED 3
 
-#ifdef HAVE_TCG_QEMU_TB_EXEC
-uintptr_t tcg_qemu_tb_exec(CPUArchState *env, uint8_t *tb_ptr);
+#ifdef CONFIG_TCG_INTERPRETER
+uintptr_t tcg_qemu_tb_exec(CPUArchState *env, void *tb_ptr);
 #else
-# define tcg_qemu_tb_exec(env, tb_ptr) \
-((uintptr_t (*)(void *, void *))tcg_ctx->code_gen_prologue)(env, tb_ptr)
+typedef uintptr_t tcg_prologue_fn(CPUArchState *env, void *tb_ptr);
+extern tcg_prologue_fn *tcg_qemu_tb_exec;
 #endif
 
 void tcg_register_jit(void *buf, size_t buf_size);
diff --git a/tcg/tci/tcg-target.h b/tcg/tci/tcg-target.h
index 6460449719..49f3291f8a 100644
--- a/tcg/tci/tcg-target.h
+++ b/tcg/tci/tcg-target.h
@@ -189,8 +189,6 @@ typedef enum {
 
 void tci_disas(uint8_t opc);
 
-#define HAVE_TCG_QEMU_TB_EXEC
-
 /* Flush the dcache at RW, and the icache at RX, as necessary. */
 static inline void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len)
 {
diff --git a/tcg/tcg.c b/tcg/tcg.c
index d5a72c226f..a6f47b033c 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -161,6 +161,10 @@ static TCGContext **tcg_ctxs;
 static unsigned int n_tcg_ctxs;
 TCGv_env cpu_env = 0;
 
+#ifndef CONFIG_TCG_INTERPRETER
+tcg_prologue_fn *tcg_qemu_tb_exec;
+#endif
+
 struct tcg_region_tree {
 QemuMutex lock;
 GTree *tree;
@@ -1054,7 +1058,10 @@ void tcg_prologue_init(TCGContext *s)
 s->code_ptr = buf0;
 s->code_buf = buf0;
 s->data_gen_ptr = NULL;
-s->code_gen_prologue = buf0;
+
+#ifndef CONFIG_TCG_INTERPRETER
+tcg_qemu_tb_exec = (tcg_prologue_fn *)buf0;
+#endif
 
 /* Compute a high-water mark, at which we voluntarily flush the buffer
and start over.  The size here is arbitrary, significantly larger
diff --git a/tcg/tci.c b/tcg/tci.c
index 82039fd163..d996eb7cf8 100644
--- a/tcg/tci.c
+++ b/tcg/tci.c
@@ -475,8 +475,9 @@ static bool tci_compare64(uint64_t u0, uint64_t u1, TCGCond condition)
 #endif
 
 /* Interpret pseudo code in tb. */
-uintptr_t tcg_qemu_tb_exec(CPUArchState *env, uint8_t *tb_ptr)
+uintptr_t tcg_qemu_tb_exec(CPUArchState *env, void *v_tb_ptr)
 {
+uint8_t *tb_ptr = v_tb_ptr;
 tcg_target_ulong regs[TCG_TARGET_NB_REGS];
 long tcg_temps[CPU_TEMP_BUF_NLONGS];
 uintptr_t sp_value = (uintptr_t)(tcg_temps + CPU_TEMP_BUF_NLONGS);
-- 
2.25.1




[PATCH v3 01/41] tcg: Enhance flush_icache_range with separate data pointer

2020-11-05 Thread Richard Henderson
We are shortly going to have a split rw/rx jit buffer.  Depending
on the host, we need to flush the dcache at the rw data pointer and
flush the icache at the rx code pointer.

For now, the two passed pointers are identical, so there is no
effective change in behaviour.
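
A minimal sketch of the generic pattern, assuming a compiler that provides
__builtin___clear_cache (GCC and Clang do).  The function name and the
returned count are hypothetical additions for illustration; the real
helpers return void.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Flush the data side at RW and the instruction side at RX.
 * Returns how many ranges were flushed (illustration only).
 */
static int sketch_flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len)
{
    int ranges = 0;

    assert(len > 0);
    if (rw != rx) {
        /* Split mapping: the writable alias needs its own flush. */
        __builtin___clear_cache((char *)rw, (char *)(rw + len));
        ranges++;
    }
    __builtin___clear_cache((char *)rx, (char *)(rx + len));
    ranges++;
    return ranges;
}
```

An old call flush_icache_range(start, stop) corresponds to passing start
for both rx and rw, with len = stop - start, which takes the single-flush
path.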

Signed-off-by: Richard Henderson 
---
 tcg/aarch64/tcg-target.h |  9 +++--
 tcg/arm/tcg-target.h |  8 ++--
 tcg/i386/tcg-target.h|  3 ++-
 tcg/mips/tcg-target.h|  8 ++--
 tcg/ppc/tcg-target.h |  2 +-
 tcg/riscv/tcg-target.h   |  8 ++--
 tcg/s390/tcg-target.h|  3 ++-
 tcg/sparc/tcg-target.h   |  8 +---
 tcg/tci/tcg-target.h |  3 ++-
 softmmu/physmem.c|  9 -
 tcg/tcg.c|  6 --
 tcg/aarch64/tcg-target.c.inc |  2 +-
 tcg/mips/tcg-target.c.inc|  2 +-
 tcg/ppc/tcg-target.c.inc | 21 +++--
 tcg/sparc/tcg-target.c.inc   |  4 ++--
 15 files changed, 64 insertions(+), 32 deletions(-)

diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h
index 663dd0b95e..d0a6a059b7 100644
--- a/tcg/aarch64/tcg-target.h
+++ b/tcg/aarch64/tcg-target.h
@@ -148,9 +148,14 @@ typedef enum {
 #define TCG_TARGET_DEFAULT_MO (0)
 #define TCG_TARGET_HAS_MEMORY_BSWAP 1
 
-static inline void flush_icache_range(uintptr_t start, uintptr_t stop)
+/* Flush the dcache at RW, and the icache at RX, as necessary. */
+static inline void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len)
 {
-__builtin___clear_cache((char *)start, (char *)stop);
+/* TODO: Copy this from gcc to avoid 4 loops instead of 2. */
+if (rw != rx) {
+__builtin___clear_cache((char *)rw, (char *)(rw + len));
+}
+__builtin___clear_cache((char *)rx, (char *)(rx + len));
 }
 
 void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t);
diff --git a/tcg/arm/tcg-target.h b/tcg/arm/tcg-target.h
index 17e771374d..fa88b24e43 100644
--- a/tcg/arm/tcg-target.h
+++ b/tcg/arm/tcg-target.h
@@ -134,9 +134,13 @@ enum {
 #define TCG_TARGET_DEFAULT_MO (0)
 #define TCG_TARGET_HAS_MEMORY_BSWAP 1
 
-static inline void flush_icache_range(uintptr_t start, uintptr_t stop)
+/* Flush the dcache at RW, and the icache at RX, as necessary. */
+static inline void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len)
 {
-__builtin___clear_cache((char *) start, (char *) stop);
+if (rw != rx) {
+__builtin___clear_cache((char *)rw, (char *)(rw + len));
+}
+__builtin___clear_cache((char *)rx, (char *)(rx + len));
 }
 
 /* not defined -- call should be eliminated at compile time */
diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
index abd4ac7fc0..8323e72639 100644
--- a/tcg/i386/tcg-target.h
+++ b/tcg/i386/tcg-target.h
@@ -206,7 +206,8 @@ extern bool have_avx2;
 #define TCG_TARGET_extract_i64_valid(ofs, len) \
 (((ofs) == 8 && (len) == 8) || ((ofs) + (len)) == 32)
 
-static inline void flush_icache_range(uintptr_t start, uintptr_t stop)
+/* Flush the dcache at RW, and the icache at RX, as necessary. */
+static inline void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len)
 {
 }
 
diff --git a/tcg/mips/tcg-target.h b/tcg/mips/tcg-target.h
index c6b091d849..47b1226ee9 100644
--- a/tcg/mips/tcg-target.h
+++ b/tcg/mips/tcg-target.h
@@ -207,9 +207,13 @@ extern bool use_mips32r2_instructions;
 #define TCG_TARGET_DEFAULT_MO (0)
 #define TCG_TARGET_HAS_MEMORY_BSWAP 1
 
-static inline void flush_icache_range(uintptr_t start, uintptr_t stop)
+/* Flush the dcache at RW, and the icache at RX, as necessary. */
+static inline void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len)
 {
-cacheflush ((void *)start, stop-start, ICACHE);
+if (rx != rw) {
+cacheflush((void *)rw, len, DCACHE);
+}
+cacheflush((void *)rx, len, ICACHE);
 }
 
 void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t);
diff --git a/tcg/ppc/tcg-target.h b/tcg/ppc/tcg-target.h
index be10363956..fbb6dc1b47 100644
--- a/tcg/ppc/tcg-target.h
+++ b/tcg/ppc/tcg-target.h
@@ -175,7 +175,7 @@ extern bool have_vsx;
 #define TCG_TARGET_HAS_bitsel_vec   have_vsx
 #define TCG_TARGET_HAS_cmpsel_vec   0
 
-void flush_icache_range(uintptr_t start, uintptr_t stop);
+void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len);
 void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t);
 
 #define TCG_TARGET_DEFAULT_MO (0)
diff --git a/tcg/riscv/tcg-target.h b/tcg/riscv/tcg-target.h
index 032439d806..0fa6ae358e 100644
--- a/tcg/riscv/tcg-target.h
+++ b/tcg/riscv/tcg-target.h
@@ -159,9 +159,13 @@ typedef enum {
 #define TCG_TARGET_HAS_mulsh_i64        1
 #endif
 
-static inline void flush_icache_range(uintptr_t start, uintptr_t stop)
+/* Flush the dcache at RW, and the icache at RX, as necessary. */
+static inline void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len)
 {
-__builtin___clear_cache((char *)start, (char *)stop);
+if (rx != rw) {
+__builtin___clear_cache((char *)r

[PATCH v3 07/41] tcg: Adjust tcg_out_call for const

2020-11-05 Thread Richard Henderson
We must change all targets at once, since all must match
the declaration in tcg.c.

Signed-off-by: Richard Henderson 
---
 tcg/tcg.c| 2 +-
 tcg/aarch64/tcg-target.c.inc | 2 +-
 tcg/arm/tcg-target.c.inc | 2 +-
 tcg/i386/tcg-target.c.inc| 4 ++--
 tcg/mips/tcg-target.c.inc| 6 +++---
 tcg/ppc/tcg-target.c.inc | 8 
 tcg/riscv/tcg-target.c.inc   | 6 +++---
 tcg/s390/tcg-target.c.inc| 2 +-
 tcg/sparc/tcg-target.c.inc   | 4 ++--
 tcg/tci/tcg-target.c.inc | 2 +-
 10 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index a88b314e97..d3eeea355c 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -148,7 +148,7 @@ static void tcg_out_st(TCGContext *s, TCGType type, TCGReg arg, TCGReg arg1,
intptr_t arg2);
 static bool tcg_out_sti(TCGContext *s, TCGType type, TCGArg val,
 TCGReg base, intptr_t ofs);
-static void tcg_out_call(TCGContext *s, tcg_insn_unit *target);
+static void tcg_out_call(TCGContext *s, const tcg_insn_unit *target);
 static int tcg_target_const_match(tcg_target_long val, TCGType type,
   const TCGArgConstraint *arg_ct);
 #ifdef TCG_TARGET_NEED_LDST_LABELS
diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
index 96dc9f4d0b..6d8152c468 100644
--- a/tcg/aarch64/tcg-target.c.inc
+++ b/tcg/aarch64/tcg-target.c.inc
@@ -1329,7 +1329,7 @@ static inline void tcg_out_callr(TCGContext *s, TCGReg reg)
 tcg_out_insn(s, 3207, BLR, reg);
 }
 
-static inline void tcg_out_call(TCGContext *s, tcg_insn_unit *target)
+static inline void tcg_out_call(TCGContext *s, const tcg_insn_unit *target)
 {
 ptrdiff_t offset = target - s->code_ptr;
 if (offset == sextract64(offset, 0, 26)) {
diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc
index f8f485d807..a197e6bc45 100644
--- a/tcg/arm/tcg-target.c.inc
+++ b/tcg/arm/tcg-target.c.inc
@@ -1033,7 +1033,7 @@ static void tcg_out_goto(TCGContext *s, int cond, const tcg_insn_unit *addr)
 
 /* The call case is mostly used for helpers - so it's not unreasonable
  * for them to be beyond branch range */
-static void tcg_out_call(TCGContext *s, tcg_insn_unit *addr)
+static void tcg_out_call(TCGContext *s, const tcg_insn_unit *addr)
 {
 intptr_t addri = (intptr_t)addr;
 ptrdiff_t disp = tcg_pcrel_diff(s, addr);
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index 424dd1cdcf..095553ce28 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -1591,7 +1591,7 @@ static void tcg_out_clz(TCGContext *s, int rexw, TCGReg dest, TCGReg arg1,
 }
 }
 
-static void tcg_out_branch(TCGContext *s, int call, tcg_insn_unit *dest)
+static void tcg_out_branch(TCGContext *s, int call, const tcg_insn_unit *dest)
 {
 intptr_t disp = tcg_pcrel_diff(s, dest) - 5;
 
@@ -1610,7 +1610,7 @@ static void tcg_out_branch(TCGContext *s, int call, tcg_insn_unit *dest)
 }
 }
 
-static inline void tcg_out_call(TCGContext *s, tcg_insn_unit *dest)
+static inline void tcg_out_call(TCGContext *s, const tcg_insn_unit *dest)
 {
 tcg_out_branch(s, 1, dest);
 }
diff --git a/tcg/mips/tcg-target.c.inc b/tcg/mips/tcg-target.c.inc
index a3f838fa51..6d2c369a85 100644
--- a/tcg/mips/tcg-target.c.inc
+++ b/tcg/mips/tcg-target.c.inc
@@ -517,7 +517,7 @@ static void tcg_out_opc_sa64(TCGContext *s, MIPSInsn opc1, MIPSInsn opc2,
  * Type jump.
  * Returns true if the branch was in range and the insn was emitted.
  */
-static bool tcg_out_opc_jmp(TCGContext *s, MIPSInsn opc, void *target)
+static bool tcg_out_opc_jmp(TCGContext *s, MIPSInsn opc, const void *target)
 {
 uintptr_t dest = (uintptr_t)target;
 uintptr_t from = (uintptr_t)s->code_ptr + 4;
@@ -1080,7 +1080,7 @@ static void tcg_out_movcond(TCGContext *s, TCGCond cond, TCGReg ret,
 }
 }
 
-static void tcg_out_call_int(TCGContext *s, tcg_insn_unit *arg, bool tail)
+static void tcg_out_call_int(TCGContext *s, const tcg_insn_unit *arg, bool tail)
 {
 /* Note that the ABI requires the called function's address to be
loaded into T9, even if a direct branch is in range.  */
@@ -1098,7 +1098,7 @@ static void tcg_out_call_int(TCGContext *s, tcg_insn_unit *arg, bool tail)
 }
 }
 
-static void tcg_out_call(TCGContext *s, tcg_insn_unit *arg)
+static void tcg_out_call(TCGContext *s, const tcg_insn_unit *arg)
 {
 tcg_out_call_int(s, arg, false);
 tcg_out_nop(s);
diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
index 8a0b20a86b..36129b976f 100644
--- a/tcg/ppc/tcg-target.c.inc
+++ b/tcg/ppc/tcg-target.c.inc
@@ -1106,7 +1106,7 @@ static void tcg_out_xori32(TCGContext *s, TCGReg dst, TCGReg src, uint32_t c)
 tcg_out_zori32(s, dst, src, c, XORI, XORIS);
 }
 
-static void tcg_out_b(TCGContext *s, int mask, tcg_insn_unit *target)
+static void tcg_out_b(TCGContext *s, int mask, const tcg_insn_unit *target)
 {
 ptrdiff_t disp = tcg_pcrel_diff(s, target);
 if (in_range_b(disp

[PATCH v3 00/41] Mirror map JIT memory for TCG

2020-11-05 Thread Richard Henderson
This is my take on Joelle's patch set:
https://lists.nongnu.org/archive/html/qemu-devel/2020-10/msg07837.html

Changes for v3:
  * Even more patches -- all tcg backends converted.
  * Fixups for darwin/ios merged (Joelle).
  * Feature renamed to splitwx (Paolo).


r~


Richard Henderson (41):
  tcg: Enhance flush_icache_range with separate data pointer
  tcg: Move tcg prologue pointer out of TCGContext
  tcg: Move tcg epilogue pointer out of TCGContext
  tcg: Add in_code_gen_buffer
  tcg: Introduce tcg_splitwx_to_{rx,rw}
  tcg: Adjust TCGLabel for const
  tcg: Adjust tcg_out_call for const
  tcg: Adjust tcg_out_label for const
  tcg: Adjust tcg_register_jit for const
  tcg: Adjust tb_target_set_jmp_target for split-wx
  tcg: Make DisasContextBase.tb const
  tcg: Make tb arg to synchronize_from_tb const
  tcg: Use Error with alloc_code_gen_buffer
  tcg: Add --accel tcg,split-wx property
  accel/tcg: Support split-wx for linux with memfd
  accel/tcg: Support split-wx for darwin/iOS with vm_remap
  tcg: Return the TB pointer from the rx region from exit_tb
  tcg/i386: Support split-wx code generation
  tcg/aarch64: Use B not BL for tcg_out_goto_long
  tcg/aarch64: Implement flush_idcache_range manually
  tcg/aarch64: Support split-wx code generation
  disas: Push const down through host disassembly
  tcg/tci: Push const down through bytecode reading
  tcg: Introduce tcg_tbrel_diff
  tcg/ppc: Use tcg_tbrel_diff
  tcg/ppc: Use tcg_out_mem_long to reset TCG_REG_TB
  tcg/ppc: Support split-wx code generation
  tcg/sparc: Use tcg_tbrel_diff
  tcg/sparc: Support split-wx code generation
  tcg/s390: Use tcg_tbrel_diff
  tcg/s390: Support split-wx code generation
  tcg/riscv: Fix branch range checks
  tcg/riscv: Remove branch-over-branch fallback
  tcg/riscv: Support split-wx code generation
  accel/tcg: Add mips support to alloc_code_gen_buffer_splitwx_memfd
  tcg/mips: Do not assert on relocation overflow
  tcg/mips: Support split-wx code generation
  tcg/arm: Support split-wx code generation
  tcg: Remove TCG_TARGET_SUPPORT_MIRROR
  tcg: Constify tcg_code_gen_epilogue
  tcg: Constify TCGLabelQemuLdst.raddr

 accel/tcg/tcg-runtime.h  |   2 +-
 include/disas/dis-asm.h  |   4 +-
 include/disas/disas.h|   2 +-
 include/exec/exec-all.h  |   2 +-
 include/exec/gen-icount.h|   4 +-
 include/exec/log.h   |   2 +-
 include/exec/translator.h|   2 +-
 include/hw/core/cpu.h|   3 +-
 include/sysemu/tcg.h |   3 +-
 include/tcg/tcg-op.h |   2 +-
 include/tcg/tcg.h|  56 +--
 tcg/aarch64/tcg-target.h |   8 +-
 tcg/arm/tcg-target.h |  10 +-
 tcg/i386/tcg-target.h|   9 +-
 tcg/mips/tcg-target.h|  10 +-
 tcg/ppc/tcg-target.h |   4 +-
 tcg/riscv/tcg-target.h   |  10 +-
 tcg/s390/tcg-target.h|  11 +-
 tcg/sparc/tcg-target.h   |  10 +-
 tcg/tci/tcg-target.h |  11 +-
 accel/tcg/cpu-exec.c |  41 +++--
 accel/tcg/tcg-all.c  |  26 ++-
 accel/tcg/tcg-runtime.c  |   4 +-
 accel/tcg/translate-all.c| 307 +++
 accel/tcg/translator.c   |   4 +-
 bsd-user/main.c  |   2 +-
 disas.c  |   2 +-
 disas/capstone.c |   2 +-
 linux-user/main.c|   2 +-
 softmmu/physmem.c|   9 +-
 target/arm/cpu.c |   3 +-
 target/arm/translate-a64.c   |   2 +-
 target/avr/cpu.c |   3 +-
 target/hppa/cpu.c|   3 +-
 target/i386/cpu.c|   3 +-
 target/microblaze/cpu.c  |   3 +-
 target/mips/cpu.c|   3 +-
 target/riscv/cpu.c   |   3 +-
 target/rx/cpu.c  |   3 +-
 target/sh4/cpu.c |   3 +-
 target/sparc/cpu.c   |   3 +-
 target/tricore/cpu.c |   2 +-
 tcg/tcg-op.c |  15 +-
 tcg/tcg.c|  86 --
 tcg/tci.c|  60 ---
 accel/tcg/trace-events   |   2 +-
 tcg/aarch64/tcg-target.c.inc | 139 
 tcg/arm/tcg-target.c.inc |  41 ++---
 tcg/i386/tcg-target.c.inc|  36 ++--
 tcg/mips/tcg-target.c.inc|  97 +--
 tcg/ppc/tcg-target.c.inc | 105 ++--
 tcg/riscv/tcg-target.c.inc   | 125 +-
 tcg/s390/tcg-target.c.inc|  91 +--
 tcg/sparc/tcg-target.c.inc   |  58 +++
 tcg/tcg-ldst.c.inc   |   2 +-
 tcg/tcg-pool.c.inc   |   6 +-
 tcg/tci/tcg-target.c.inc |   2 +-
 57 files changed, 917 insertions(+), 546 deletions(-)

-- 
2.25.1




[PATCH v3 04/41] tcg: Add in_code_gen_buffer

2020-11-05 Thread Richard Henderson
Create a function to determine if a pointer is within the buffer.
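
For reference, a minimal standalone sketch of the check.  struct
sketch_buffer is a hypothetical stand-in for the code_gen_buffer fields of
TCGContext, and the subtraction is done on uintptr_t rather than relying
on the void * arithmetic extension.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for code_gen_buffer / code_gen_buffer_size. */
struct sketch_buffer {
    const char *base;
    size_t size;
};

static bool sketch_in_buffer(const struct sketch_buffer *b, const void *p)
{
    assert(b != NULL);
    /*
     * Unsigned subtraction: if p is below base the difference wraps
     * to a huge value, so one comparison covers both bounds.
     * As in the patch, "<=" also accepts the one-past-the-end address.
     */
    return (size_t)((uintptr_t)p - (uintptr_t)b->base) <= b->size;
}
```

A pointer one byte below base gives a difference of SIZE_MAX, which fails
the comparison without a separate lower-bound test.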

Signed-off-by: Richard Henderson 
---
 include/tcg/tcg.h |  6 ++
 accel/tcg/translate-all.c | 26 --
 2 files changed, 14 insertions(+), 18 deletions(-)

diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index bb1e97b13b..e4d0ace44b 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -680,6 +680,12 @@ extern __thread TCGContext *tcg_ctx;
 extern void *tcg_code_gen_epilogue;
 extern TCGv_env cpu_env;
 
+static inline bool in_code_gen_buffer(const void *p)
+{
+const TCGContext *s = &tcg_init_ctx;
+return (size_t)(p - s->code_gen_buffer) <= s->code_gen_buffer_size;
+}
+
 static inline size_t temp_idx(TCGTemp *ts)
 {
 ptrdiff_t n = ts - tcg_ctx->temps;
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 4572b4901f..744f97a717 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -392,27 +392,18 @@ void tb_destroy(TranslationBlock *tb)
 
 bool cpu_restore_state(CPUState *cpu, uintptr_t host_pc, bool will_exit)
 {
-TranslationBlock *tb;
-bool r = false;
-uintptr_t check_offset;
-
-/* The host_pc has to be in the region of current code buffer. If
- * it is not we will not be able to resolve it here. The two cases
- * where host_pc will not be correct are:
+/*
+ * The host_pc has to be in the region of the code buffer.
+ * If it is not we will not be able to resolve it here.
+ * The two cases where host_pc will not be correct are:
  *
  *  - fault during translation (instruction fetch)
  *  - fault from helper (not using GETPC() macro)
  *
  * Either way we need return early as we can't resolve it here.
- *
- * We are using unsigned arithmetic so if host_pc <
- * tcg_init_ctx.code_gen_buffer check_offset will wrap to way
- * above the code_gen_buffer_size
  */
-check_offset = host_pc - (uintptr_t) tcg_init_ctx.code_gen_buffer;
-
-if (check_offset < tcg_init_ctx.code_gen_buffer_size) {
-tb = tcg_tb_lookup(host_pc);
+if (in_code_gen_buffer((const void *)host_pc)) {
+TranslationBlock *tb = tcg_tb_lookup(host_pc);
 if (tb) {
 cpu_restore_state_from_tb(cpu, tb, host_pc, will_exit);
 if (tb_cflags(tb) & CF_NOCACHE) {
@@ -421,11 +412,10 @@ bool cpu_restore_state(CPUState *cpu, uintptr_t host_pc, bool will_exit)
 tcg_tb_remove(tb);
 tb_destroy(tb);
 }
-r = true;
+return true;
 }
 }
-
-return r;
+return false;
 }
 
 static void page_init(void)
-- 
2.25.1




Re: [RFC PATCH 02/15] hw/riscv: migrate fdt field to generic MachineState

2020-11-05 Thread Bin Meng
On Fri, Nov 6, 2020 at 1:57 AM Alex Bennée  wrote:
>
> This is a mechanical change to make the fdt available through
> MachineState.
>
> Signed-off-by: Alex Bennée 
> Reviewed-by: Alistair Francis 
> Message-Id: <20201021170842.25762-3-alex.ben...@linaro.org>
> Signed-off-by: Alex Bennée 
> ---
>  include/hw/riscv/virt.h |  1 -
>  hw/riscv/virt.c | 20 ++--
>  2 files changed, 10 insertions(+), 11 deletions(-)

What about the 'sifive_u' and 'spike' machines?

Regards,
Bin



Re: [PATCH v2 4/4] hw/riscv: Load the kernel after the firmware

2020-11-05 Thread Palmer Dabbelt

On Tue, 20 Oct 2020 08:46:45 PDT (-0700), alistai...@gmail.com wrote:

On Mon, Oct 19, 2020 at 4:17 PM Palmer Dabbelt  wrote:


On Tue, 13 Oct 2020 17:17:33 PDT (-0700), Alistair Francis wrote:
> Instead of loading the kernel at a hardcoded start address, let's load
> the kernel at the next aligned address after the end of the firmware.
>
> This should have no impact for current users of OpenSBI, but will
> allow loading a noMMU kernel at the start of memory.
>
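
For readers following along, a minimal sketch of the address calculation
described above.  The sketch_* names are hypothetical, the rounding mirrors
what QEMU_ALIGN_UP does, and the firmware end address used in the note
below is a made-up example value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MIB (1024ULL * 1024ULL)

/* Round n up to the next multiple of m. */
static uint64_t sketch_align_up(uint64_t n, uint64_t m)
{
    assert(m != 0);
    return (n + m - 1) / m * m;
}

/* 32-bit kernels get 4 MiB alignment, 64-bit kernels 2 MiB. */
static uint64_t sketch_kernel_start(bool is_32_bit, uint64_t firmware_end)
{
    return sketch_align_up(firmware_end, (is_32_bit ? 4 : 2) * SKETCH_MIB);
}
```

Assuming firmware that starts at 0x80000000 and ends at 0x8001a000, this
gives 0x80200000 for 64-bit and 0x80400000 for 32-bit.
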
> Signed-off-by: Alistair Francis 
> ---
>  include/hw/riscv/boot.h |  3 +++
>  hw/riscv/boot.c | 19 ++-
>  hw/riscv/opentitan.c|  3 ++-
>  hw/riscv/sifive_e.c |  3 ++-
>  hw/riscv/sifive_u.c | 10 --
>  hw/riscv/spike.c| 11 ---
>  hw/riscv/virt.c | 11 ---
>  7 files changed, 45 insertions(+), 15 deletions(-)
>
> diff --git a/include/hw/riscv/boot.h b/include/hw/riscv/boot.h
> index 2975ed1a31..0b01988727 100644
> --- a/include/hw/riscv/boot.h
> +++ b/include/hw/riscv/boot.h
> @@ -25,6 +25,8 @@
>
>  bool riscv_is_32_bit(MachineState *machine);
>
> +target_ulong riscv_calc_kernel_start_addr(MachineState *machine,
> +  target_ulong firmware_end_addr);
>  target_ulong riscv_find_and_load_firmware(MachineState *machine,
>const char *default_machine_firmware,
>hwaddr firmware_load_addr,
> @@ -34,6 +36,7 @@ target_ulong riscv_load_firmware(const char *firmware_filename,
>   hwaddr firmware_load_addr,
>   symbol_fn_t sym_cb);
>  target_ulong riscv_load_kernel(const char *kernel_filename,
> +   target_ulong firmware_end_addr,
> symbol_fn_t sym_cb);
>  hwaddr riscv_load_initrd(const char *filename, uint64_t mem_size,
>   uint64_t kernel_entry, hwaddr *start);
> diff --git a/hw/riscv/boot.c b/hw/riscv/boot.c
> index 5dea644f47..9b3fe3fb1e 100644
> --- a/hw/riscv/boot.c
> +++ b/hw/riscv/boot.c
> @@ -33,10 +33,8 @@
>  #include 
>
>  #if defined(TARGET_RISCV32)
> -# define KERNEL_BOOT_ADDRESS 0x80400000
>  #define fw_dynamic_info_data(__val) cpu_to_le32(__val)
>  #else
> -# define KERNEL_BOOT_ADDRESS 0x80200000
>  #define fw_dynamic_info_data(__val) cpu_to_le64(__val)
>  #endif
>
> @@ -49,6 +47,15 @@ bool riscv_is_32_bit(MachineState *machine)
>  }
>  }
>
> +target_ulong riscv_calc_kernel_start_addr(MachineState *machine,
> +  target_ulong firmware_end_addr) {
> +if (riscv_is_32_bit(machine)) {
> +return QEMU_ALIGN_UP(firmware_end_addr, 4 * MiB);
> +} else {
> +return QEMU_ALIGN_UP(firmware_end_addr, 2 * MiB);
> +}
> +}
> +
>  target_ulong riscv_find_and_load_firmware(MachineState *machine,
>const char *default_machine_firmware,
>hwaddr firmware_load_addr,
> @@ -123,7 +130,9 @@ target_ulong riscv_load_firmware(const char *firmware_filename,
>  exit(1);
>  }
>
> -target_ulong riscv_load_kernel(const char *kernel_filename, symbol_fn_t sym_cb)
> +target_ulong riscv_load_kernel(const char *kernel_filename,
> +   target_ulong kernel_start_addr,
> +   symbol_fn_t sym_cb)
>  {
>  uint64_t kernel_entry;
>
> @@ -138,9 +147,9 @@ target_ulong riscv_load_kernel(const char *kernel_filename, symbol_fn_t sym_cb)
>  return kernel_entry;
>  }
>
> -if (load_image_targphys_as(kernel_filename, KERNEL_BOOT_ADDRESS,
> +if (load_image_targphys_as(kernel_filename, kernel_start_addr,
> ram_size, NULL) > 0) {
> -return KERNEL_BOOT_ADDRESS;
> +return kernel_start_addr;
>  }
>
>  error_report("could not load kernel '%s'", kernel_filename);
> diff --git a/hw/riscv/opentitan.c b/hw/riscv/opentitan.c
> index 0531bd879b..cc758b78b8 100644
> --- a/hw/riscv/opentitan.c
> +++ b/hw/riscv/opentitan.c
> @@ -75,7 +75,8 @@ static void opentitan_board_init(MachineState *machine)
>  }
>
>  if (machine->kernel_filename) {
> -riscv_load_kernel(machine->kernel_filename, NULL);
> +riscv_load_kernel(machine->kernel_filename,
> +  memmap[IBEX_DEV_RAM].base, NULL);
>  }
>  }
>
> diff --git a/hw/riscv/sifive_e.c b/hw/riscv/sifive_e.c
> index fcfac16816..59bac4cc9a 100644
> --- a/hw/riscv/sifive_e.c
> +++ b/hw/riscv/sifive_e.c
> @@ -114,7 +114,8 @@ static void sifive_e_machine_init(MachineState *machine)
>memmap[SIFIVE_E_DEV_MROM].base, &address_space_memory);
>
>  if (machine->kernel_filename) {
> -riscv_load_kernel(machine->kernel_filename, NULL);
> +riscv_load_kernel(machine->kernel_filename,
> +  memmap[SIFIVE_E_DEV_DTIM].base, NUL

[PATCH v1 1/1] hw/intc/ibex_plic: Clear the claim register when read

2020-11-05 Thread Alistair Francis
After claiming the interrupt by reading the claim register we want to
clear the register to make sure the interrupt doesn't appear at the next
read.

This matches the documentation for the claim register as it clears the
pending bit (which we already do): 
https://docs.opentitan.org/hw/ip/rv_plic/doc/index.html

This also matches the current hardware.
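
As a minimal model of the intended read semantics (struct sketch_plic and
sketch_claim_read are hypothetical names, and the pending-bit and
interrupt-status updates are omitted):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the claim field of the device state. */
struct sketch_plic {
    uint32_t claim;
};

/* Reading the claim register returns the value and then clears it. */
static uint32_t sketch_claim_read(struct sketch_plic *s)
{
    uint32_t ret;

    assert(s != NULL);
    ret = s->claim;
    s->claim = 0;      /* the same interrupt must not show up again */
    return ret;
}
```

Two back-to-back reads therefore return the claimed interrupt once and
then zero, until a new interrupt is claimed.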

Signed-off-by: Alistair Francis 
---
 hw/intc/ibex_plic.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/intc/ibex_plic.c b/hw/intc/ibex_plic.c
index f49fa67c91..235e6b88ff 100644
--- a/hw/intc/ibex_plic.c
+++ b/hw/intc/ibex_plic.c
@@ -139,6 +139,9 @@ static uint64_t ibex_plic_read(void *opaque, hwaddr addr,
 /* Return the current claimed interrupt */
 ret = s->claim;
 
+/* Clear the claimed interrupt */
+s->claim = 0x00000000;
+
 /* Update the interrupt status after the claim */
 ibex_plic_update(s);
 }
-- 
2.28.0




Re: [PATCH] quorum: Implement bdrv_co_block_status()

2020-11-05 Thread Tao Xu
I tested this patch with COLO; it resolves the issue of the qcow2 image
becoming larger after drive-mirror. Thank you!


Tested-by: Tao Xu 

On 11/5/2020 2:04 AM, Alberto Garcia wrote:

The quorum driver does not implement bdrv_co_block_status() and
because of that it always reports to contain data even if all its
children are known to be empty.

One consequence of this is that if we for example create a quorum with
a size of 10GB and we mirror it to a new image the operation will
write 10GB of actual zeroes to the destination image wasting a lot of
time and disk space.

Since a quorum has an arbitrary number of children of potentially
different formats there is no way to report all possible allocation
status flags in a way that makes sense, so this implementation only
reports when a given region is known to contain zeroes
(BDRV_BLOCK_ZERO) or not (BDRV_BLOCK_DATA).

If all children agree that a region contains zeroes then we can return
BDRV_BLOCK_ZERO using the smallest size reported by the children
(because all agree that a region of at least that size contains
zeroes).

If at least one child disagrees we have to return BDRV_BLOCK_DATA.
In this case we use the largest of the sizes reported by the children
that didn't return BDRV_BLOCK_ZERO (because we know that there won't
be an agreement for at least that size).
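
The merging rule described above can be sketched standalone as follows.
The sketch_* types are hypothetical simplifications: each child report
carries only a zero/non-zero flag and a length, and error handling is left
out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum sketch_status { SKETCH_ZERO, SKETCH_DATA };

/* What one child reports for the queried offset. */
struct sketch_child_report {
    bool zero;         /* child says the region reads as zeroes */
    int64_t bytes;     /* length for which that status holds */
};

static enum sketch_status sketch_merge(const struct sketch_child_report *r,
                                       size_t n, int64_t count,
                                       int64_t *pnum)
{
    int64_t pnum_zero = count;   /* smallest size among zero reports */
    int64_t pnum_data = 0;       /* largest size among data reports */
    size_t i;

    assert(r != NULL && pnum != NULL);
    for (i = 0; i < n; i++) {
        if (r[i].zero) {
            if (r[i].bytes < pnum_zero) {
                pnum_zero = r[i].bytes;
            }
        } else if (r[i].bytes > pnum_data) {
            pnum_data = r[i].bytes;
        }
    }

    if (pnum_data) {
        /* At least one child disagrees: report data. */
        *pnum = pnum_data;
        return SKETCH_DATA;
    }
    /* All children agree on zeroes for at least pnum_zero bytes. */
    *pnum = pnum_zero;
    return SKETCH_ZERO;
}
```

For example, two children reporting zeroes for 4096 and 8192 bytes merge
to zeroes for 4096 bytes, while adding a child that reports 2048 bytes of
data turns the result into data for 2048 bytes.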

Signed-off-by: Alberto Garcia 
---
  block/quorum.c |  49 
  tests/qemu-iotests/312 | 148 +
  tests/qemu-iotests/312.out |  67 +
  tests/qemu-iotests/group   |   1 +
  4 files changed, 265 insertions(+)
  create mode 100755 tests/qemu-iotests/312
  create mode 100644 tests/qemu-iotests/312.out

diff --git a/block/quorum.c b/block/quorum.c
index e846a7e892..29cee42705 100644
--- a/block/quorum.c
+++ b/block/quorum.c
@@ -18,6 +18,7 @@
  #include "qemu/module.h"
  #include "qemu/option.h"
  #include "block/block_int.h"
+#include "block/coroutines.h"
  #include "block/qdict.h"
  #include "qapi/error.h"
  #include "qapi/qapi-events-block.h"
@@ -1174,6 +1175,53 @@ static void quorum_child_perm(BlockDriverState *bs, BdrvChild *c,
   | DEFAULT_PERM_UNCHANGED;
  }
  
+/*
+ * Each one of the children can report different status flags even
+ * when they contain the same data, so what this function does is
+ * return BDRV_BLOCK_ZERO if *all* children agree that a certain
+ * region contains zeroes, and BDRV_BLOCK_DATA otherwise.
+ */
+static int coroutine_fn quorum_co_block_status(BlockDriverState *bs,
+   bool want_zero,
+   int64_t offset, int64_t count,
+   int64_t *pnum, int64_t *map,
+   BlockDriverState **file)
+{
+BDRVQuorumState *s = bs->opaque;
+int i, ret;
+int64_t pnum_zero = count;
+int64_t pnum_data = 0;
+
+for (i = 0; i < s->num_children; i++) {
+int64_t bytes;
+ret = bdrv_co_common_block_status_above(s->children[i]->bs, NULL, false,
+want_zero, offset, count,
+&bytes, NULL, NULL, NULL);
+if (ret < 0) {
+return ret;
+}
+/*
+ * Even if all children agree about whether there are zeroes
+ * or not at @offset they might disagree on the size, so use
+ * the smallest when reporting BDRV_BLOCK_ZERO and the largest
+ * when reporting BDRV_BLOCK_DATA.
+ */
+if (ret & BDRV_BLOCK_ZERO) {
+pnum_zero = MIN(pnum_zero, bytes);
+} else {
+pnum_data = MAX(pnum_data, bytes);
+}
+}
+
+if (pnum_data) {
+*pnum = pnum_data;
+return BDRV_BLOCK_DATA;
+} else {
+*pnum = pnum_zero;
+return BDRV_BLOCK_ZERO;
+}
+}
+
  static const char *const quorum_strong_runtime_opts[] = {
  QUORUM_OPT_VOTE_THRESHOLD,
  QUORUM_OPT_BLKVERIFY,
@@ -1192,6 +1240,7 @@ static BlockDriver bdrv_quorum = {
  .bdrv_close = quorum_close,
  .bdrv_gather_child_options  = quorum_gather_child_options,
  .bdrv_dirname   = quorum_dirname,
+.bdrv_co_block_status   = quorum_co_block_status,
  
  .bdrv_co_flush_to_disk  = quorum_co_flush,
  
diff --git a/tests/qemu-iotests/312 b/tests/qemu-iotests/312
new file mode 100755
index 0000000000..1b08f1552f
--- /dev/null
+++ b/tests/qemu-iotests/312
@@ -0,0 +1,148 @@
+#!/usr/bin/env bash
+#
+# Test drive-mirror with quorum
+#
+# The goal of this test is to check how the quorum driver reports
+# regions that are known to read as zeroes (BDRV_BLOCK_ZERO). The idea
+# is that drive-mirror will try the efficient representation of zeroes
+# in the destination image instead of writing actual zeroes.
+#
+# Copyright (C) 2020 Igalia, S.L.
+# Author: Alberto Garcia 

[Bug 1902470] Re: migration with TLS-MultiFD is stuck when the dst-libvirtd service restarts

2020-11-05 Thread Chuan Zheng
** Changed in: qemu
   Status: New => Confirmed

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1902470

Title:
  migration with TLS-MultiFD is stuck when the dst-libvirtd service
  restarts

Status in QEMU:
  Confirmed

Bug description:
  hi,

  I found that the multi-channel TLS handshake gets stuck when the
  dst-libvirtd restarts: both the src and dst sockets are blocked in
  recvmsg. In the meantime, the live_migration thread is blocked in
  multifd_send_sync_main, so the migration cannot be cancelled even
  though src-libvirt has delivered the QMP command.

  Is there any way to exit migration when the multi-channel TLS
  handshake is stuck? Would setting a TLS-handshake timeout take
  effect?

  The stack traces are as follows:

  =src qemu-system-aar stack=:
  #0  0x87d6f28c in recvmsg () from target:/usr/lib64/libpthread.so.0
  #1  0xe3817424 in qio_channel_socket_readv (ioc=0xe9e30a30, 
iov=0xdb58e8a8, niov=1, fds=0x0, nfds=0x0, errp=0x0) at 
../io/channel-socket.c:502
  #2  0xe380f468 in qio_channel_readv_full (ioc=0xe9e30a30, 
iov=0xdb58e8a8, niov=1, fds=0x0, nfds=0x0, errp=0x0) at ../io/channel.c:66
  #3  0xe380f9e8 in qio_channel_read (ioc=0xe9e30a30, 
buf=0xea204e9b "\026\003\001\001L\001", buflen=5, errp=0x0) at 
../io/channel.c:217
  #4  0xe380e7d4 in qio_channel_tls_read_handler (buf=0xea204e9b 
"\026\003\001\001L\001", len=5, opaque=0xfffd38001190) at ../io/channel-tls.c:53
  #5  0xe3801114 in qcrypto_tls_session_pull (opaque=0xe99d5700, 
buf=0xea204e9b, len=5) at ../crypto/tlssession.c:89
  #6  0x8822ed30 in _gnutls_stream_read (ms=0xdb58eaac, 
pull_func=0xfffd38001870, size=5, bufel=, 
session=0xe983cd60) at buffers.c:346
  #7  _gnutls_read (ms=0xdb58eaac, pull_func=0xfffd38001870, size=5, 
bufel=, session=0xe983cd60) at buffers.c:426
  #8  _gnutls_io_read_buffered (session=session@entry=0xe983cd60, total=5, 
recv_type=recv_type@entry=4294967295, ms=0xdb58eaac) at buffers.c:581
  #9  0x88224954 in recv_headers (ms=, 
record=0x883cd000 , 
htype=65535, type=2284006288, record_params=0xe9e22a60, 
session=0xe983cd60) at record.c:1163
  #10 _gnutls_recv_in_buffers (session=session@entry=0xe983cd60, 
type=2284006288, type@entry=GNUTLS_HANDSHAKE, htype=65535, 
htype@entry=GNUTLS_HANDSHAKE_HELLO_RETRY_REQUEST, ms=, 
ms@entry=0) at record.c:1302
  #11 0x88230568 in _gnutls_handshake_io_recv_int 
(session=session@entry=0xe983cd60, 
htype=htype@entry=GNUTLS_HANDSHAKE_HELLO_RETRY_REQUEST, 
hsk=hsk@entry=0xdb58ec38, optional=optional@entry=1) at buffers.c:1445
  #12 0x88232b90 in _gnutls_recv_handshake 
(session=session@entry=0xe983cd60, 
type=type@entry=GNUTLS_HANDSHAKE_HELLO_RETRY_REQUEST, 
optional=optional@entry=1, buf=buf@entry=0x0) at handshake.c:1534
  #13 0x88235b40 in handshake_client 
(session=session@entry=0xe983cd60) at handshake.c:2925
  #14 0x88237824 in gnutls_handshake (session=0xe983cd60) at 
handshake.c:2739
  #15 0xe380213c in qcrypto_tls_session_handshake 
(session=0xe99d5700, errp=0xdb58ee58) at ../crypto/tlssession.c:493
  #16 0xe380ea40 in qio_channel_tls_handshake_task (ioc=0xfffd38001190, 
task=0xea61d4e0, context=0x0) at ../io/channel-tls.c:161
  #17 0xe380ec60 in qio_channel_tls_handshake (ioc=0xfffd38001190, 
func=0xe3394d20 , opaque=0xea189c30, 
destroy=0x0, context=0x0) at ../io/channel-tls.c:239
  #18 0xe3394e78 in multifd_tls_channel_connect (p=0xea189c30, 
ioc=0xe9e30a30, errp=0xdb58ef28) at ../migration/multifd.c:782
  #19 0xe3394f30 in multifd_channel_connect (p=0xea189c30, 
ioc=0xe9e30a30, error=0x0) at ../migration/multifd.c:804
  #20 0xe33950b8 in multifd_new_send_channel_async 
(task=0xea6855a0, opaque=0xea189c30) at ../migration/multifd.c:858
  #21 0xe3810cf8 in qio_task_complete (task=0xea6855a0) at 
../io/task.c:197
  #22 0xe381096c in qio_task_thread_result (opaque=0xea6855a0) at 
../io/task.c:112
  #23 0x88701df8 in ?? () from target:/usr/lib64/libglib-2.0.so.0
  #24 0x88705a7c in g_main_context_dispatch () from 
target:/usr/lib64/libglib-2.0.so.0
  #25 0xe3a5a29c in glib_pollfds_poll () at ../util/main-loop.c:221
  #26 0xe3a5a324 in os_host_main_loop_wait (timeout=0) at 
../util/main-loop.c:244
  #27 0xe3a5a444 in main_loop_wait (nonblocking=0) at 
../util/main-loop.c:520
  #28 0xe3696b20 in qemu_main_loop () at ../softmmu/vl.c:1677
  #29 0xe30949e4 in main (argc=81, argv=0xdb58f2c8, 
envp=0xdb58f558) at ../softmmu/main.c:50

  =src live_migration stack=:
  #0  0x87d6a5d8 in pthread_cond_wait () from 
target:/usr/lib64/libpthread.so.0
  #1  0xe3a5f3ec in qem

RE: [PATCH] block: Fix integer promotion error in bdrv_getlength()

2020-11-05 Thread Tuguoyi
On Wed, 2019-07-03 at 10:13 -0500, Eric Blake wrote:
> On 11/5/20 2:31 AM, Max Reitz wrote:
> > On 05.11.20 06:40, Tuguoyi wrote:
> >> As BDRV_SECTOR_SIZE is of type uint64_t, the expression will
> >> automatically convert the @ret to uint64_t. When an error code
> >> returned from bdrv_nb_sectors(), the promoted @ret will be a very
> >> large number, as a result the -EFBIG will be returned which is not the
> >> real error code.
> >>
> >> Signed-off-by: Guoyi Tu 
> >> ---
> >>   block.c | 2 +-
> >>   1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > Thanks, applied to my block branch:
> >
> > https://git.xanclic.moe/XanClic/qemu/commits/branch/block
> >
> 
> I actually preferred the v1 solution, rather than this v2, as it avoided
> a cast.

Several type promotion bugs have been found recently (commit '570542ec'),
so I think an explicit cast of the integer type gives a hint that a type
promotion would otherwise happen there.
 
Actually, your solution looks much simpler and clearer, and it's OK with me.

--
Best regards,
Guoyi
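
The promotion problem described in the quoted commit message can be reproduced in isolation. The following is a minimal sketch, assuming a 512-byte sector constant of type uint64_t; SECTOR_SIZE, length_buggy() and length_fixed() are hypothetical names for illustration and are not the code in block.c.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Stand-in for a sector-size constant of type uint64_t (assumed value). */
#define SECTOR_SIZE ((uint64_t)512)

/*
 * Hypothetical helper: takes a sector count or a negative errno and
 * returns a byte length or a negative errno.
 */
static int64_t length_buggy(int64_t ret)
{
    /*
     * The right-hand side has type uint64_t, so ret is converted to
     * uint64_t for the comparison. A negative errno becomes a huge
     * value and is misreported as -EFBIG.
     */
    if (ret > INT64_MAX / SECTOR_SIZE) {
        return -EFBIG;
    }
    return ret < 0 ? ret : ret * (int64_t)SECTOR_SIZE;
}

static int64_t length_fixed(int64_t ret)
{
    if (ret < 0) {
        return ret; /* propagate the real error before any range check */
    }
    if (ret > INT64_MAX / (int64_t)SECTOR_SIZE) {
        return -EFBIG;
    }
    return ret * (int64_t)SECTOR_SIZE;
}
```

Checking for the error first is only one way to avoid the conversion; casting the constant to a signed type in the comparison, as discussed in this thread, addresses the same issue.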


[PATCH V2 2/2] plugins: Fix two resource leaks in setup_socket()

2020-11-05 Thread AlexChen
Whether accept() fails or returns normally, we need to close the fd.

Reported-by: Euler Robot 
Signed-off-by: Alex Chen 
---
 contrib/plugins/lockstep.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/contrib/plugins/lockstep.c b/contrib/plugins/lockstep.c
index 319bd44b83..5aad50869d 100644
--- a/contrib/plugins/lockstep.c
+++ b/contrib/plugins/lockstep.c
@@ -268,11 +268,13 @@ static bool setup_socket(const char *path)
 socket_fd = accept(fd, NULL, NULL);
 if (socket_fd < 0 && errno != EINTR) {
 perror("accept socket");
+close(fd);
 return false;
 }

 qemu_plugin_outs("setup_socket::ready\n");

+close(fd);
 return true;
 }

-- 
2.19.1
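
As an illustration of the pattern this patch applies, here is a minimal standalone sketch in which the listening descriptor is released on every path once accept() has returned. The helper name accept_one_and_close_listener() and its signature are assumptions made for the example; this is not the code in lockstep.c.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: accept one peer on an already-listening
 * descriptor and close the listener whether accept() worked or not.
 * On success *conn_fd holds the connected descriptor.
 */
static bool accept_one_and_close_listener(int listen_fd, int *conn_fd)
{
    int fd = accept(listen_fd, NULL, NULL);
    int saved_errno = errno; /* close() may overwrite errno */

    close(listen_fd);

    if (fd < 0) {
        errno = saved_errno;
        perror("accept socket");
        return false;
    }
    *conn_fd = fd;
    return true;
}
```

Closing in one place right after accept() avoids having to repeat close() before each return statement, at the cost of saving errno for the error message.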



[PATCH V2 1/2] plugins: Fix resource leak in connect_socket()

2020-11-05 Thread AlexChen
Close the fd when the connect() fails.

Reported-by: Euler Robot 
Signed-off-by: Alex Chen 
---
 contrib/plugins/lockstep.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/contrib/plugins/lockstep.c b/contrib/plugins/lockstep.c
index a696673dff..319bd44b83 100644
--- a/contrib/plugins/lockstep.c
+++ b/contrib/plugins/lockstep.c
@@ -292,6 +292,7 @@ static bool connect_socket(const char *path)

 if (connect(fd, (struct sockaddr *)&sockaddr, sizeof(sockaddr)) < 0) {
 perror("failed to connect");
+close(fd);
 return false;
 }

-- 
2.19.1



[PATCH V2 0/2] plugins: Fix some resource leaks

2020-11-05 Thread AlexChen
There are 3 resource leaks in contrib/plugins/lockstep.c; fix them.

v1->v2:
- add the cover letter
- modify the subject of the patch[2/2]

alexchen (2):
  plugins: Fix resource leak in connect_socket()
  plugins: Fix two resource leaks in setup_socket()

 contrib/plugins/lockstep.c | 3 +++
 1 file changed, 3 insertions(+)

-- 
2.19.1



Re: [PATCH v5] introduce vfio-user protocol specification

2020-11-05 Thread John G Johnson



> On Nov 2, 2020, at 3:51 AM, Thanos Makatos  wrote:
> 
> 
> 
>> -Original Message-
>> From: Qemu-devel > bounces+thanos.makatos=nutanix@nongnu.org> On Behalf Of John
>> Levon
>> Sent: 02 November 2020 11:41
>> To: Thanos Makatos 
>> Cc: benjamin.wal...@intel.com; Elena Ufimtseva
>> ; jag.ra...@oracle.com;
>> james.r.har...@intel.com; Swapnil Ingle ;
>> john.g.john...@oracle.com; yuvalkash...@gmail.com;
>> konrad.w...@oracle.com; tina.zh...@intel.com; qemu-devel@nongnu.org;
>> dgilb...@redhat.com; Marc-André Lureau
>> ; ism...@linux.com;
>> alex.william...@redhat.com; Stefan Hajnoczi ;
>> Felipe Franciosi ; xiuchun...@intel.com;
>> tomassetti.and...@gmail.com; changpeng@intel.com; Raphael Norwitz
>> ; kanth.ghatr...@oracle.com
>> Subject: Re: [PATCH v5] introduce vfio-user protocol specification
>> 
>> On Mon, Nov 02, 2020 at 11:29:23AM +, Thanos Makatos wrote:
>> 
 
>> +==++=
 ==+
> | version  | object | ``{"major": , "minor": }``
>> |
> |  || 
>   |
> |  || Version supported by the sender, e.g. "0.1".
>   |
 
 It seems quite unlikely but this should specify it's strings not floating 
 point
 values maybe?
 
 Definitely applies to max_fds too.
>>> 
>>> major and minor are JSON numbers and specifically integers.
>> 
>> It is debatable as to whether there is such a thing as a JSON integer :)
> 
> AFAIK there isn't.
> 
>> 
>>> The rationale behind this is to simplify parsing. Is specifying that
>>> major/minor/max_fds should be an integer sufficient to clear any vagueness
>>> here?
>> 
>> I suppose that's OK as long as we never want a 0.1.1 or whatever. I'm not
>> sure
>> it simplifies parsing, but maybe it does.
> 
> Now that you mention it, why preclude 0.1.1? IIUC the whole point of not
> stating the version as a float is to simply have this flexibility in the 
> future.
> You're right in your earlier suggestion to explicitly state major/minor as
> strings.
> 

The idea behind the version IDs is to identify incompatible protocol
changes as major versions, and compatible changes as minor versions.  What
would be the purpose of a third version component?

What makes parsing the JSON easier is knowing the version beforehand, so the
parser knows which keys to expect. So I’d like to promote major and minor to
separate fields in the packet, rather than having them embedded at an
arbitrary point in the JSON string.


>> 
> Versioning and Feature Support
> ^^
> Upon accepting a connection, the server must send a
>> VFIO_USER_VERSION
 message
> proposing a protocol version and a set of capabilities. The client
>> compares
> these with the versions and capabilities it supports and sends a
> VFIO_USER_VERSION reply according to the following rules.
 
 I'm curious if there was a specific reason it's this way around, when it
>> seems
 more natural for the client to propose first, and the server to reply?
>>> 
>>> I'm not aware of any specific reason.
>> 
>> So can we switch it now so the initial setup is a send/recv too?
> 
> I'm fine with that but would first like to hear back from John in case he 
> objects.


I think I wrote that section, and just switched client and server.  The code
is written as client proposes, server responds; this is the better model.

JJ
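
To make the major/minor convention above concrete, here is a small sketch of how two endpoints might settle on a version when major marks incompatible changes and minor marks compatible ones. The rule used (equal major required, lower minor wins) is an assumption chosen for illustration and is not taken from the vfio-user specification; struct ex_version and ex_negotiate() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical wire representation: two separate integer fields. */
struct ex_version {
    uint16_t major;
    uint16_t minor;
};

/*
 * Assumed rule: a differing major means the peers cannot talk at all;
 * otherwise both sides fall back to the lower minor version.
 */
static bool ex_negotiate(struct ex_version ours, struct ex_version theirs,
                         struct ex_version *out)
{
    if (ours.major != theirs.major) {
        return false;
    }
    out->major = ours.major;
    out->minor = ours.minor < theirs.minor ? ours.minor : theirs.minor;
    return true;
}
```

With integer fields, the comparison needs no string or floating-point parsing, which is the simplification argued for earlier in the thread.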





Re: [PATCH 1/2] plugins: Fix resource leak in connect_socket()

2020-11-05 Thread AlexChen
On 2020/11/5 18:37, Alex Bennée wrote:
> 
> AlexChen  writes:
> 
>> Kindly ping.
> 
> Ahh sorry I missed these. Was there a cover letter for the series?
> 

I forgot to send the cover letter; I will send V2 of the patches with a cover
letter.

Thanks,
Alex Chen



[PATCH v3 1/3] hw/block/m25p80: Fix Numonyx NVCFG DIO and QIO bit polarity

2020-11-05 Thread Joe Komlodi
QIO and DIO modes should be enabled when the bits in NVCFG are set to 0.
This matches the behavior of the other bits in the NVCFG register.

Signed-off-by: Joe Komlodi 
---
 hw/block/m25p80.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/block/m25p80.c b/hw/block/m25p80.c
index 483925f..4255a6a 100644
--- a/hw/block/m25p80.c
+++ b/hw/block/m25p80.c
@@ -783,10 +783,10 @@ static void reset_memory(Flash *s)
 s->enh_volatile_cfg |= EVCFG_OUT_DRIVER_STRENGTH_DEF;
 s->enh_volatile_cfg |= EVCFG_VPP_ACCELERATOR;
 s->enh_volatile_cfg |= EVCFG_RESET_HOLD_ENABLED;
-if (s->nonvolatile_cfg & NVCFG_DUAL_IO_MASK) {
+if (!(s->nonvolatile_cfg & NVCFG_DUAL_IO_MASK)) {
 s->enh_volatile_cfg |= EVCFG_DUAL_IO_ENABLED;
 }
-if (s->nonvolatile_cfg & NVCFG_QUAD_IO_MASK) {
+if (!(s->nonvolatile_cfg & NVCFG_QUAD_IO_MASK)) {
 s->enh_volatile_cfg |= EVCFG_QUAD_IO_ENABLED;
 }
 if (!(s->nonvolatile_cfg & NVCFG_4BYTE_ADDR_MASK)) {
-- 
2.7.4
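
The polarity rule stated in the commit message (a mode is enabled when its NVCFG bit is 0) can be written as a tiny predicate. This is only a sketch: the EX_ mask values below are illustrative stand-ins, and nvcfg_feature_enabled() is a hypothetical helper, not a function in m25p80.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions; the real masks are defined in m25p80.c. */
#define EX_NVCFG_DUAL_IO_MASK (1u << 2)
#define EX_NVCFG_QUAD_IO_MASK (1u << 3)

/* Active-low: the feature is enabled when its NVCFG bit reads 0. */
static bool nvcfg_feature_enabled(uint16_t nvcfg, uint16_t mask)
{
    return (nvcfg & mask) == 0;
}
```

For example, an all-ones register value leaves both modes disabled, while clearing only the quad bit enables QIO.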




[PATCH v3 3/3] hw/block/m25p80: Fix Numonyx fast read dummy cycle count

2020-11-05 Thread Joe Komlodi
Numonyx chips determine the number of cycles to wait based on bits 7:4
in the volatile configuration register.

However, if these bits are 0x0 or 0xF, the number of dummy cycles to wait is
10 on a QIOR or QIOR4 command, or 8 on any other currently supported
fast read command. [1]

[1]
https://www.micron.com/-/media/client/global/documents/products/data-sheet/nor-flash/serial-nor/mt25q/die-rev-b/mt25q_qlkt_u_02g_cbb_0.pdf?rev=9b167fbf2b3645efba6385949a72e453

Signed-off-by: Joe Komlodi 
---
 hw/block/m25p80.c | 43 ---
 1 file changed, 40 insertions(+), 3 deletions(-)

diff --git a/hw/block/m25p80.c b/hw/block/m25p80.c
index 8a1b684..a2cdfb6 100644
--- a/hw/block/m25p80.c
+++ b/hw/block/m25p80.c
@@ -841,6 +841,43 @@ static uint8_t numonyx_get_mode(Flash *s)
 return mode;
 }
 
+static uint8_t numonyx_extract_cfg_num_dummies(Flash *s)
+{
+uint8_t cycle_count;
+uint8_t num_dummies;
+uint8_t mode;
+uint8_t cycle_table[0x100][3] = {
+[FAST_READ] = {8, 8, 10},
+[FAST_READ4] = {8, 8, 10},
+[DOR] = {8, 8, 0xff},
+[DOR4] = {8, 8, 0xff},
+[QOR] = {8, 0xff, 10},
+[QOR4] = {8, 0xff, 10},
+[DIOR] = {8, 8, 0xff},
+[DIOR4] = {8, 8, 0xff},
+[QIOR] = {10, 0xff, 10},
+[QIOR4] = {10, 0xff, 10},
+};
+assert(get_man(s) == MAN_NUMONYX);
+
+mode = numonyx_get_mode(s);
+
+cycle_count = extract32(s->volatile_cfg, 4, 4);
+if (cycle_count == 0x0 || cycle_count == 0xf) {
+num_dummies = cycle_table[s->cmd_in_progress][mode];
+} else {
+num_dummies = cycle_count;
+}
+
+/*
+ * Validation if the command can be executed should be done outside of
+ * this function. e.g. trying to execute DIOR in QIO mode.
+ */
+assert(num_dummies != 0xff);
+
+return num_dummies;
+}
+
 static bool numonyx_check_cmd_mode(Flash *s)
 {
 uint8_t mode;
@@ -901,7 +938,7 @@ static void decode_fast_read_cmd(Flash *s)
 break;
 case MAN_NUMONYX:
 if (numonyx_check_cmd_mode(s)) {
-s->needed_bytes += extract32(s->volatile_cfg, 4, 4);
+s->needed_bytes += numonyx_extract_cfg_num_dummies(s);
 s->state = STATE_COLLECTING_DATA;
 }
 break;
@@ -947,7 +984,7 @@ static void decode_dio_read_cmd(Flash *s)
 break;
 case MAN_NUMONYX:
 if (numonyx_check_cmd_mode(s)) {
-s->needed_bytes += extract32(s->volatile_cfg, 4, 4);
+s->needed_bytes += numonyx_extract_cfg_num_dummies(s);
 s->state = STATE_COLLECTING_DATA;
 }
 break;
@@ -993,7 +1030,7 @@ static void decode_qio_read_cmd(Flash *s)
 break;
 case MAN_NUMONYX:
 if (numonyx_check_cmd_mode(s)) {
-s->needed_bytes += extract32(s->volatile_cfg, 4, 4);
+s->needed_bytes += numonyx_extract_cfg_num_dummies(s);
 s->state = STATE_COLLECTING_DATA;
 }
 break;
-- 
2.7.4
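
The rule from the commit message can be condensed into a few lines. The sketch below follows only the simplified wording of the message (10 cycles for QIOR/QIOR4, 8 for other fast reads when the field is 0x0 or 0xF); it does not reproduce the per-mode table in the patch, and fast_read_dummy_cycles() is a hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: volatile_cfg is the volatile configuration
 * register, whose bits 7:4 hold the dummy cycle count; is_qio_read is
 * true for QIOR and QIOR4.
 */
static uint8_t fast_read_dummy_cycles(uint8_t volatile_cfg, bool is_qio_read)
{
    uint8_t field = (volatile_cfg >> 4) & 0xf;

    if (field == 0x0 || field == 0xf) {
        /* All 0s or all 1s selects the default for the command. */
        return is_qio_read ? 10 : 8;
    }
    return field; /* otherwise the programmed count is used as-is */
}
```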




[PATCH v3 0/3] hw/block/m25p80: Numonyx: Fix dummy cycles and check for SPI mode on cmds

2020-11-05 Thread Joe Komlodi
Changelog:
v2 -> v3
 - 1/3: Added, Fixes NVCFG polarity for DIO/QIO.
 - 2/3: Added, Checks if we can execute the current command in standard/DIO/QIO mode.
 - 3/3: Was 1/1 in v2.  Added cycle counts for DIO/QIO mode.
v1 -> v2
 - 1/2: Change function name to be more accurate
 - 2/2: Dropped

Hi all,

The series fixes the behavior of the dummy cycle register for Numonyx flashes so
it's closer to how hardware behaves.
It also checks if a command can be executed in the current SPI mode
(standard, DIO, or QIO) before extracting dummy cycles for the command.

On hardware, the dummy cycles for fast read commands are set to a specific value
(8 or 10) if the register field is all 0s or all 1s.
Otherwise, the flash expects the number of cycles sent to equal the count in
the register.

Thanks!
Joe

Joe Komlodi (3):
  hw/block/m25p80: Fix Numonyx NVCFG DIO and QIO bit polarity
  hw/block/m25p80: Check SPI mode before running some Numonyx commands
  hw/block/m25p80: Fix Numonyx fast read dummy cycle count

 hw/block/m25p80.c | 176 +-
 1 file changed, 161 insertions(+), 15 deletions(-)

-- 
2.7.4




[PATCH v3 2/3] hw/block/m25p80: Check SPI mode before running some Numonyx commands

2020-11-05 Thread Joe Komlodi
Some Numonyx flash commands cannot be executed in DIO or QIO mode; for example,
DPP and DOR cannot be used in QIO mode.

Signed-off-by: Joe Komlodi 
---
 hw/block/m25p80.c | 132 --
 1 file changed, 119 insertions(+), 13 deletions(-)

diff --git a/hw/block/m25p80.c b/hw/block/m25p80.c
index 4255a6a..8a1b684 100644
--- a/hw/block/m25p80.c
+++ b/hw/block/m25p80.c
@@ -413,6 +413,12 @@ typedef enum {
 MAN_GENERIC,
 } Manufacturer;
 
+typedef enum {
+MODE_STD = 0,
+MODE_DIO = 1,
+MODE_QIO = 2
+} SPIMode;
+
 #define M25P80_INTERNAL_DATA_BUFFER_SZ 16
 
 struct Flash {
@@ -820,6 +826,70 @@ static void reset_memory(Flash *s)
 trace_m25p80_reset_done(s);
 }
 
+static uint8_t numonyx_get_mode(Flash *s)
+{
+uint8_t mode;
+
+if (s->enh_volatile_cfg & EVCFG_QUAD_IO_ENABLED) {
+mode = MODE_QIO;
+} else if (s->enh_volatile_cfg & EVCFG_DUAL_IO_ENABLED) {
+mode = MODE_DIO;
+} else {
+mode = MODE_STD;
+}
+
+return mode;
+}
+
+static bool numonyx_check_cmd_mode(Flash *s)
+{
+uint8_t mode;
+assert(get_man(s) == MAN_NUMONYX);
+
+mode = numonyx_get_mode(s);
+
+switch (mode) {
+case MODE_STD:
+return true;
+case MODE_DIO:
+switch (s->cmd_in_progress) {
+case QOR:
+case QOR4:
+case QIOR:
+case QIOR4:
+case QPP:
+case QPP_4:
+case PP4_4:
+case JEDEC_READ:
+case READ:
+case READ4:
+qemu_log_mask(LOG_GUEST_ERROR, "M25P80: Cannot execute cmd %x in "
+  "DIO mode\n", s->cmd_in_progress);
+return false;
+default:
+return true;
+}
+case MODE_QIO:
+switch (s->cmd_in_progress) {
+case DOR:
+case DOR4:
+case DIOR:
+case DIOR4:
+case DPP:
+case JEDEC_READ:
+case READ:
+case READ4:
+qemu_log_mask(LOG_GUEST_ERROR, "M25P80: Cannot execute cmd %x in "
+  "QIO mode\n", s->cmd_in_progress);
+return false;
+default:
+return true;
+}
+default:
+g_assert_not_reached();
+}
+}
+
 static void decode_fast_read_cmd(Flash *s)
 {
 s->needed_bytes = get_addr_length(s);
@@ -827,9 +897,13 @@ static void decode_fast_read_cmd(Flash *s)
 /* Dummy cycles - modeled with bytes writes instead of bits */
 case MAN_WINBOND:
 s->needed_bytes += 8;
+s->state = STATE_COLLECTING_DATA;
 break;
 case MAN_NUMONYX:
-s->needed_bytes += extract32(s->volatile_cfg, 4, 4);
+if (numonyx_check_cmd_mode(s)) {
+s->needed_bytes += extract32(s->volatile_cfg, 4, 4);
+s->state = STATE_COLLECTING_DATA;
+}
 break;
 case MAN_MACRONIX:
 if (extract32(s->volatile_cfg, 6, 2) == 1) {
@@ -837,19 +911,21 @@ static void decode_fast_read_cmd(Flash *s)
 } else {
 s->needed_bytes += 8;
 }
+s->state = STATE_COLLECTING_DATA;
 break;
 case MAN_SPANSION:
 s->needed_bytes += extract32(s->spansion_cr2v,
 SPANSION_DUMMY_CLK_POS,
 SPANSION_DUMMY_CLK_LEN
 );
+s->state = STATE_COLLECTING_DATA;
 break;
 default:
+s->state = STATE_COLLECTING_DATA;
 break;
 }
 s->pos = 0;
 s->len = 0;
-s->state = STATE_COLLECTING_DATA;
 }
 
 static void decode_dio_read_cmd(Flash *s)
@@ -859,6 +935,7 @@ static void decode_dio_read_cmd(Flash *s)
 switch (get_man(s)) {
 case MAN_WINBOND:
 s->needed_bytes += WINBOND_CONTINUOUS_READ_MODE_CMD_LEN;
+s->state = STATE_COLLECTING_DATA;
 break;
 case MAN_SPANSION:
 s->needed_bytes += SPANSION_CONTINUOUS_READ_MODE_CMD_LEN;
@@ -866,9 +943,13 @@ static void decode_dio_read_cmd(Flash *s)
 SPANSION_DUMMY_CLK_POS,
 SPANSION_DUMMY_CLK_LEN
 );
+s->state = STATE_COLLECTING_DATA;
 break;
 case MAN_NUMONYX:
-s->needed_bytes += extract32(s->volatile_cfg, 4, 4);
+if (numonyx_check_cmd_mode(s)) {
+s->needed_bytes += extract32(s->volatile_cfg, 4, 4);
+s->state = STATE_COLLECTING_DATA;
+}
 break;
 case MAN_MACRONIX:
 switch (extract32(s->volatile_cfg, 6, 2)) {
@@ -882,13 +963,14 @@ static void decode_dio_read_cmd(Flash *s)
 s->needed_bytes += 4;
 break;
 }
+s->state = STATE_COLLECTING_DATA;
 break;
 default:
+s->state = STATE_COLLECTING_DATA;
 break;
 }
 s->pos = 0;
 s->len = 0;
-s->state = STATE_COLLECTING_DATA;
 }
 
 static void decode_qio_read_cmd(Flash *s)
@@ -899,6 +981,7 @@ static void decode_qi

[Bug 1901981] Re: assert issue locates in hw/usb/dev-storage.c:248: usb_msd_send_status

2020-11-05 Thread Gaoning Pan
** Changed in: qemu
   Status: New => Confirmed

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1901981

Title:
  assert issue locates in hw/usb/dev-storage.c:248: usb_msd_send_status

Status in QEMU:
  Confirmed

Bug description:
  Hello,

  I found an assertion failure in hw/usb/dev-storage.c.

  This was found in the latest version, 5.1.0.

  

  qemu-system-x86_64: hw/usb/dev-storage.c:248: usb_msd_send_status: Assertion 
`s->csw.sig == cpu_to_le32(0x53425355)' failed.
  [1]29544 abort  sudo  -enable-kvm -boot c -m 2G -drive 
format=qcow2,file=./ubuntu.img -nic

  To reproduce the assertion failure, run QEMU with the following
  command line.

  
  $ qemu-system-x86_64 -enable-kvm -boot c -m 2G -drive 
format=qcow2,file=./ubuntu.img -nic 
user,model=rtl8139,hostfwd=tcp:0.0.0.0:-:22 -device piix4-usb-uhci,id=uhci 
-device usb-storage,drive=mydrive -drive 
id=mydrive,file=null-co://,size=2M,format=raw,if=none

  The poc is attached.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1901981/+subscriptions



Re: [PULL v3 04/32] vfio: Add migration region initialization and finalize function

2020-11-05 Thread Alex Williamson
On Thu, 5 Nov 2020 23:55:32 +
Peter Maydell  wrote:

> On Sun, 1 Nov 2020 at 21:02, Alex Williamson  
> wrote:
> >
> > From: Kirti Wankhede 
> >
> > Whether the VFIO device supports migration or not is decided based on
> > the migration region query. If the migration region query is successful and
> > migration region initialization is successful, then migration is supported;
> > otherwise migration is blocked.
> >
> > Signed-off-by: Kirti Wankhede 
> > Reviewed-by: Neo Jia 
> > Acked-by: Dr. David Alan Gilbert 
> > Reviewed-by: Cornelia Huck 
> > Signed-off-by: Alex Williamson   
> 
> Hi; Coverity points out (CID 1436126) that this code has a
> use-after-free:

Thanks, I already relayed this to Kirti and expect to see a patch.
Thanks,

Alex


> > +int vfio_migration_probe(VFIODevice *vbasedev, Error **errp)
> > +{
> > +struct vfio_region_info *info = NULL;
> > +Error *local_err = NULL;
> > +int ret;
> > +
> > +ret = vfio_get_dev_region_info(vbasedev, VFIO_REGION_TYPE_MIGRATION,
> > +   VFIO_REGION_SUBTYPE_MIGRATION, &info);
> > +if (ret) {
> > +goto add_blocker;
> > +}
> > +
> > +ret = vfio_migration_init(vbasedev, info);
> > +if (ret) {
> > +goto add_blocker;
> > +}
> > +
> > +g_free(info);
> > +trace_vfio_migration_probe(vbasedev->name, info->index);  
> 
> We free info, and then access info->index. Switching the
> order of the g_free() and the tracepoint seems the obvious fix.
> 
> thanks
> -- PMM
> 




Re: [PULL v3 04/32] vfio: Add migration region initialization and finalize function

2020-11-05 Thread Peter Maydell
On Sun, 1 Nov 2020 at 21:02, Alex Williamson  wrote:
>
> From: Kirti Wankhede 
>
> Whether the VFIO device supports migration or not is decided based on
> the migration region query. If the migration region query is successful and
> migration region initialization is successful, then migration is supported;
> otherwise migration is blocked.
>
> Signed-off-by: Kirti Wankhede 
> Reviewed-by: Neo Jia 
> Acked-by: Dr. David Alan Gilbert 
> Reviewed-by: Cornelia Huck 
> Signed-off-by: Alex Williamson 

Hi; Coverity points out (CID 1436126) that this code has a
use-after-free:


> +int vfio_migration_probe(VFIODevice *vbasedev, Error **errp)
> +{
> +struct vfio_region_info *info = NULL;
> +Error *local_err = NULL;
> +int ret;
> +
> +ret = vfio_get_dev_region_info(vbasedev, VFIO_REGION_TYPE_MIGRATION,
> +   VFIO_REGION_SUBTYPE_MIGRATION, &info);
> +if (ret) {
> +goto add_blocker;
> +}
> +
> +ret = vfio_migration_init(vbasedev, info);
> +if (ret) {
> +goto add_blocker;
> +}
> +
> +g_free(info);
> +trace_vfio_migration_probe(vbasedev->name, info->index);

We free info, and then access info->index. Switching the
order of the g_free() and the tracepoint seems the obvious fix.

thanks
-- PMM
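
The reordering suggested above can be shown with a self-contained sketch. Everything here is a stand-in: struct ex_region_info, ex_trace_probe() and ex_probe() are hypothetical names that only mimic the shape of the reported code, with malloc()/free() used in place of the GLib allocator.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the region info structure. */
struct ex_region_info {
    unsigned index;
};

/* Stand-in for a tracepoint: records the last index it was given. */
static unsigned ex_last_traced_index;

static void ex_trace_probe(const char *name, unsigned index)
{
    (void)name;
    ex_last_traced_index = index;
}

static int ex_probe(const char *name)
{
    struct ex_region_info *info = malloc(sizeof(*info));

    if (info == NULL) {
        return -1;
    }
    info->index = 9;

    /* Read info->index first, while the allocation is still valid... */
    ex_trace_probe(name, info->index);
    /* ...and only then release it. */
    free(info);
    return 0;
}
```

An alternative with the same effect is to copy the index into a local variable before freeing.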



Re: [PULL v3 06/12] qga: add implementation of guest-get-disks for Linux

2020-11-05 Thread Peter Maydell
On Tue, 3 Nov 2020 at 02:45, Michael Roth  wrote:
>
> From: Tomáš Golembiovský 
>
> The command lists all disks (real and virtual) as well as disk
> partitions. For each disk the list of dependent disks is also listed and
> /dev path is used as a handle so it can be matched with "name" field of
> other returned disk entries. For disk partitions the "dependents" list
> is populated with the parent device for easier tracking of
> hierarchy.

Hi; Coverity points out a resource leak in this function
(CID 1436130):


> +GuestDiskInfoList *qmp_guest_get_disks(Error **errp)
> +{
> +GuestDiskInfoList *item, *ret = NULL;
> +GuestDiskInfo *disk;
> +DIR *dp = NULL;
> +struct dirent *de = NULL;
> +
> +g_debug("listing /sys/block directory");
> +dp = opendir("/sys/block");

Here we opendir()...

> +if (dp == NULL) {
> +error_setg_errno(errp, errno, "Can't open directory \"/sys/block\"");
> +return NULL;
> +}
> +while ((de = readdir(dp)) != NULL) {
[stuff]
> +}
> +return ret;

...but we forget to closedir() it again.

> +}

thanks
-- PMM
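
For reference, the usual shape of a directory walk that does not leak the handle looks like the sketch below. count_dir_entries() is a hypothetical helper written for this example, not part of the guest agent.

```c
#include <assert.h>
#include <dirent.h>
#include <stddef.h>

/* Hypothetical helper: count the entries of a directory, -1 on error. */
static int count_dir_entries(const char *path)
{
    DIR *dp = opendir(path);
    struct dirent *de;
    int n = 0;

    if (dp == NULL) {
        return -1; /* nothing was opened, so nothing to close */
    }
    while ((de = readdir(dp)) != NULL) {
        n++;
    }
    closedir(dp); /* pair every successful opendir() with closedir() */
    return n;
}
```

If the loop body can return early, each such exit needs its own closedir() call or a shared cleanup label.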



Re: [PULL 0/3] ppc-for-5.2 patch queue 2020-11-05

2020-11-05 Thread Peter Maydell
On Thu, 5 Nov 2020 at 03:49, David Gibson  wrote:
>
> The following changes since commit 3c8c36c9087da957f580a9bb5ebf7814a753d1c6:
>
>   Merge remote-tracking branch 'remotes/kraxel/tags/ui-20201104-pull-request' 
> into staging (2020-11-04 16:52:17 +)
>
> are available in the Git repository at:
>
>   https://gitlab.com/dgibson/qemu.git tags/ppc-for-5.2-20201105
>
> for you to fetch changes up to f29b959dc6871c9d8df781d1bedcfaebc76d5565:
>
>   spapr: Convert hpt_prepare_thread() to use qemu_try_memalign() (2020-11-05 
> 12:18:48 +1100)
>
> 
> ppc patch queue for 2020-11-05
>
> A small PR this time, one bugfix, one removal of minor dead code, one
> warning suppression.
>
> 
> Chen Qun (1):
>   target/ppc/excp_helper: Add a fallthrough for fix compiler warning
>
> Greg Kurz (2):
>   spapr: Drop dead code in spapr_reallocate_hpt()
>   spapr: Convert hpt_prepare_thread() to use qemu_try_memalign()


Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/5.2
for any user-visible changes.

-- PMM



Re: VFIO Migration

2020-11-05 Thread Yan Zhao
On Tue, Nov 03, 2020 at 10:13:05AM -0700, Alex Williamson wrote:
> On Tue, 3 Nov 2020 11:03:24 +
> Stefan Hajnoczi  wrote:

<...>
>  
> > Management tools need to match the device model/configuration from the
> > source device against the destination device. If the destination is
> > capable of supporting the source's device model/configuration then
> > migration can proceed safely.
> > 
> > Let's look at the case where we are migrating from an older version of a
> > device to a newer version. On the source we have:
> > 
> >   model = https://vendor-a.com/my-nic
> > 
> > On the destination we have:
> > 
> >   model = https://vendor-a.com/my-nic
> >   rss = on
> > 
> > The two devices are incompatible because the destination exposes the RSS
> > feature that is not present on the source. The RSS feature involves
> > guest-visible hardware interface changes and a change to the device
> > state representation. It is not safe to migrate!
> > 
> > In this case an extra configuration step is necessary so that the
> > destination device can accept the device state from the source. The
> > management tool invokes a vendor-specific tool to put the device into
> > the right configuration:
> > 
> >   # vendor-tool set-migration-config --device :00:04.0 \
> >  --model https://vendor-a.com/my-nic
> > 
> > (This tool only succeeds when the device is bound to VFIO but not yet
> > opened.)
> > 
> > The tool invokes ioctls on the vendor-specific VFIO driver that does two
> > things:
> > 1. Tells the device to present the old hardware interface without RSS
> > 2. Uses the old device state representation without RSS support
> > 
> > Does this approach fit?
> 
> 
> Should we not require that any sort of configuration like this occurs
> through sysfs?  We must be able to create an instance with a specific
> configuration without using vendor specific tools, therefore in the
> worst case we should be able to remove and recreate an instance as we
> desire without invoking vendor specific tools.  Thanks,
> 
Hi Alex,
could mdevctl serve as a general configuration tool to create, destroy,
and configure mdev devices?

I think the main debate previously was about an easy way for management
tools to find and create a compatible target mdev device according to the
sysfs info of the source mdev device, wasn't it?
As in [1], we have simplified the method to 1:1 matching of mdev_type
in src and target, and we can further force it to be 1:1 matching of
vendor-specific attributes (e.g. pci id) and dynamic resources
(e.g. aggregator, fps, ...) and have mdevctl create a compatible target
for management tools.

Given that management tools like OpenStack are still at a preliminary
stage of supporting mdev devices, could we first settle the
compatibility sysfs protocol and treat mdevctl as a userspace tool
for now?

[1]: https://lists.gnu.org/archive/html/qemu-devel/2020-09/msg03273.html

Thanks
Yan



Re: [PATCH for-5.2 1/3] linux-user/sparc: Fix errors in target_ucontext structures

2020-11-05 Thread Peter Maydell
On Thu, 5 Nov 2020 at 22:15, Richard Henderson
 wrote:
>
> On 11/5/20 1:23 PM, Peter Maydell wrote:
> > +} __attribute__((aligned(16)));
>
> Hmph, 96 uses of the attribute directly, 20 uses of QEMU_ALIGNED.  I suppose we
> should just remove the wrapper...

Oops, I forgot about that. We're better at adhering to use
of QEMU_SENTINEL and QEMU_NORETURN, at least. And a fair
chunk of those 96 are in code-that's-not-ours like the
headers imported from Linux or the pc-bios/s390-ccw code.

I'm in two minds here -- the wrappers look less clunky than
the __attribute__ syntax, but on the other hand "there is
only one way this can be written" results in less inconsistency
than "there are two ways".

thanks
-- PMM



[PATCH v3 3/9] hw/usb: reorder fields in UASStatus

2020-11-05 Thread Daniele Buono
The UASStatus data structure contains a variable-sized field of type uas_iu
that is not placed at the end of the data structure.

This placement triggers a warning with clang 11, and while not a bug right now
(the status is never a uas_iu_command, which is the variable-sized case),
it could become one in the future.

../qemu-base/hw/usb/dev-uas.c:157:31: error: field 'status' with variable sized
type 'uas_iu' not at the end of a struct or class is a GNU extension
[-Werror,-Wgnu-variable-sized-type-not-at-end]
    uas_iu                    status;
                              ^
1 error generated.

Fix this by moving the uas_iu field to the end of the struct.

Signed-off-by: Daniele Buono 
---
 hw/usb/dev-uas.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/usb/dev-uas.c b/hw/usb/dev-uas.c
index cec071d96c..5ef3f4fec9 100644
--- a/hw/usb/dev-uas.c
+++ b/hw/usb/dev-uas.c
@@ -154,9 +154,9 @@ struct UASRequest {
 
 struct UASStatus {
     uint32_t                  stream;
-    uas_iu                    status;
     uint32_t                  length;
     QTAILQ_ENTRY(UASStatus)   next;
+    uas_iu                    status;
 };
 
 /* - */
-- 
2.17.1
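
The layout rule behind this change can be illustrated with a reduced example. The types below (ex_iu, struct ex_status) and the allocator ex_status_new() are hypothetical and much simpler than the real uas_iu union; note that embedding a type that ends in a flexible array member is itself an extension accepted by GCC and clang, as it is in the patched code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical variable-sized unit: the payload length is set at allocation. */
typedef struct {
    uint8_t id;
    uint8_t payload[]; /* flexible array member */
} ex_iu;

struct ex_status {
    uint32_t stream;
    uint32_t length;
    ex_iu    status; /* variable-sized member goes last */
};

/* Allocate a status with room for payload_len trailing bytes. */
static struct ex_status *ex_status_new(size_t payload_len)
{
    return calloc(1, sizeof(struct ex_status) + payload_len);
}
```

With the variable-sized member last, any payload bytes extend past the end of the struct instead of overlapping the fields that would otherwise follow it.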




Re: [PATCH for-5.2 3/3] linux-user/sparc: Don't zero high half of PC, NPC, PSR in sigreturn

2020-11-05 Thread Richard Henderson
On 11/5/20 1:23 PM, Peter Maydell wrote:
> The function do_sigreturn() tries to store the PC, NPC and PSR in
> uint32_t local variables, which implicitly drops the high half of
> these fields for 64-bit guests.
> 
> The usual effect was that a guest which used signals would crash on
> return from a signal unless it was lucky enough to take it while the
> PC was in the low 4GB of the address space.  In particular, Debian
> /bin/dash and /bin/bash would segfault after executing external
> commands.
> 
> Use abi_ulong, which is the type these fields all have in the
> __siginfo_t struct.
> 
> Signed-off-by: Peter Maydell 
> ---
>  linux-user/sparc/signal.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Richard Henderson 

r~
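
The truncation described in the commit message is easy to demonstrate outside QEMU. In this sketch ex_abi_ulong stands in for abi_ulong on a 64-bit guest, and the two restore functions are hypothetical; they only model storing a saved PC in a local variable of each width.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for abi_ulong on a 64-bit guest (assumption for the example). */
typedef uint64_t ex_abi_ulong;

static ex_abi_ulong restore_pc_truncating(ex_abi_ulong saved_pc)
{
    uint32_t pc = saved_pc; /* implicit narrowing drops bits 63:32 */
    return pc;
}

static ex_abi_ulong restore_pc_full(ex_abi_ulong saved_pc)
{
    ex_abi_ulong pc = saved_pc; /* same width as the saved field */
    return pc;
}
```

A PC below 4GB survives both versions, which matches the observation that affected guests only worked when the signal was taken in the low 4GB of the address space.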



[PATCH v3 9/9] configure,meson: support Control-Flow Integrity

2020-11-05 Thread Daniele Buono
This patch adds a flag to enable/disable control flow integrity checks
on indirect function calls.
This feature only allows indirect function calls at runtime to functions
with compatible signatures.

This feature is only provided by LLVM/Clang, and depends on link-time
optimization which is currently supported only with LLVM/Clang >= 6.0

We also add an option to enable a debugging version of cfi, with verbose
output in case of a CFI violation.

CFI on indirect function calls does not support calls to functions in
shared libraries (since they were not known at compile time), and such
calls are forbidden. QEMU relies on dlopen/dlsym when using modules,
so we make modules incompatible with CFI.

All the checks are performed in meson.build. configure is only used to
forward the flags to meson

Signed-off-by: Daniele Buono 
---
 configure | 21 -
 meson.build   | 45 +
 meson_options.txt |  4 
 3 files changed, 69 insertions(+), 1 deletion(-)

diff --git a/configure b/configure
index 7115655fe4..020c5a6aff 100755
--- a/configure
+++ b/configure
@@ -399,6 +399,8 @@ coroutine=""
 coroutine_pool=""
 debug_stack_usage="no"
 crypto_afalg="no"
+cfi="disabled"
+cfi_debug="disabled"
 seccomp=""
 glusterfs=""
 glusterfs_xlator_opt="no"
@@ -1179,6 +1181,16 @@ for opt do
   ;;
   --disable-safe-stack) safe_stack="no"
   ;;
+  --enable-cfi)
+  cfi="enabled";
+  lto="true";
+  ;;
+  --disable-cfi) cfi="disabled"
+  ;;
+  --enable-cfi-debug) cfi_debug="enabled"
+  ;;
+  --disable-cfi-debug) cfi_debug="disabled"
+  ;;
   --disable-curses) curses="disabled"
   ;;
   --enable-curses) curses="enabled"
@@ -1753,6 +1765,13 @@ disabled with --disable-FEATURE, default is enabled if available:
   sparse  sparse checker
   safe-stack  SafeStack Stack Smash Protection. Depends on
   clang/llvm >= 3.7 and requires coroutine backend ucontext.
+  cfi Enable Control-Flow Integrity for indirect function calls.
+  In case of a cfi violation, QEMU is terminated with SIGILL
+  Depends on lto and is incompatible with modules
+  Automatically enables Link-Time Optimization (lto)
+  cfi-debug   In case of a cfi violation, a message containing the line that
+  triggered the error is written to stderr. After the error,
+  QEMU is still terminated with SIGILL
 
   gnutls  GNUTLS cryptography support
   nettle  nettle cryptography support
@@ -6997,7 +7016,7 @@ NINJA=$ninja $meson setup \
 -Dcapstone=$capstone -Dslirp=$slirp -Dfdt=$fdt \
 -Diconv=$iconv -Dcurses=$curses -Dlibudev=$libudev\
 -Ddocs=$docs -Dsphinx_build=$sphinx_build -Dinstall_blobs=$blobs \
--Db_lto=$lto \
+-Db_lto=$lto -Dcfi=$cfi -Dcfi_debug=$cfi_debug \
 $cross_arg \
 "$PWD" "$source_path"
 
diff --git a/meson.build b/meson.build
index 99c7ab1d38..49a301888e 100644
--- a/meson.build
+++ b/meson.build
@@ -751,6 +751,48 @@ statx_test = '''
 
 has_statx = cc.links(statx_test)
 
+if get_option('cfi').enabled()
+  cfi_flags=[]
+  # Check for dependency on LTO
+  if not get_option('b_lto')
+error('Selected Control-Flow Integrity but LTO is disabled')
+  endif
+  if config_host.has_key('CONFIG_MODULES')
+error('Selected Control-Flow Integrity is not compatible with modules')
+  endif
+  # Check for cfi flags. CFI requires LTO so we can't use
+  # get_supported_arguments, but need a more complex "compiles" which allows
+  # custom arguments
+  if cc.compiles('int main () { return 0; }', name: '-fsanitize=cfi-icall',
+ args: ['-flto', '-fsanitize=cfi-icall'] )
+cfi_flags += '-fsanitize=cfi-icall'
+  else
+error('-fsanitize=cfi-icall is not supported by the compiler')
+  endif
+  if cc.compiles('int main () { return 0; }',
+ name: '-fsanitize-cfi-icall-generalize-pointers',
+ args: ['-flto', '-fsanitize=cfi-icall',
+'-fsanitize-cfi-icall-generalize-pointers'] )
+cfi_flags += '-fsanitize-cfi-icall-generalize-pointers'
+  else
+error('-fsanitize-cfi-icall-generalize-pointers is not supported by the compiler')
+  endif
+  if get_option('cfi_debug').enabled()
+if cc.compiles('int main () { return 0; }',
+   name: '-fno-sanitize-trap=cfi-icall',
+   args: ['-flto', '-fsanitize=cfi-icall',
+  '-fno-sanitize-trap=cfi-icall'] )
+  cfi_flags += '-fno-sanitize-trap=cfi-icall'
+else
+  error('-fno-sanitize-trap=cfi-icall is not supported by the compiler')
+endif
+  endif
+  add_project_arguments(cfi_flags, native: false, language: ['c', 'cpp',
+ 'objc'])
+  add_project_link_arguments(cfi_flags, native: false, language: ['c', 'cpp',
+   

[PATCH v3 5/9] scsi: fix overflow in scsi_disk_new_request_dump

2020-11-05 Thread Daniele Buono
scsi_disk_new_request_dump is used to dump the content of a scsi request
for tracing. It does that by decoding the command to get the size of the
command buffer, and then printing the content of such buffer on a string.

When using gcc with link-time optimizations, it warns that the argument of
malloc may be too large.

In function 'scsi_disk_new_request_dump',
inlined from 'scsi_new_request' at ../qemu-cfi-v3/hw/scsi/scsi-disk.c:2588:9:
../qemu-cfi-v3/hw/scsi/scsi-disk.c:2562:17: warning: argument 1 value '18446744073709551612' exceeds maximum object size 9223372036854775807 [-Walloc-size-larger-than=]
 line_buffer = g_malloc(len * 5 + 1);
 ^
../qemu-cfi-v3/hw/scsi/scsi-disk.c: In function 'scsi_new_request':
/usr/include/glib-2.0/glib/gmem.h:78:10: note: in a call to allocation function 'g_malloc' declared here
 gpointer g_malloc (gsize  n_bytes) G_GNUC_MALLOC G_GNUC_ALLOC_SIZE(1);

len is a signed integer set by scsi_cdb_length, which returns -1 if it
cannot decode the command. In that case the size passed to g_malloc wraps
around to a huge value and the allocation would probably fail.
However, an unknown command is a real possibility here, and since this is
used for tracing, we should still try to print the command for debugging.

Since the size of the command in the buffer cannot be known (the command
could not be decoded), only print the first byte by setting len=1 if
scsi_cdb_length returned -1.

Signed-off-by: Daniele Buono 
---
If we had a way to know the (maximum) size of the buffer, we could
alternatively dump the whole buffer, instead of dumping only the
first byte. Not sure if this can be done, nor if it is considered
a better option.

We could also produce an error instead/in addition to just dumping
the buffer, if the command cannot be decoded.

 hw/scsi/scsi-disk.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index e859534eaf..d70dfdd9dc 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -2559,6 +2559,10 @@ static void scsi_disk_new_request_dump(uint32_t lun, uint32_t tag, uint8_t *buf)
 int len = scsi_cdb_length(buf);
 char *line_buffer, *p;
 
+if (len < 0) {
+len = 1;
+}
+
 line_buffer = g_malloc(len * 5 + 1);
 
 for (i = 0, p = line_buffer; i < len; i++) {
-- 
2.17.1
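
To make the number in the warning concrete, here is a minimal sketch of the
size computation. It is not QEMU code: the helper names are hypothetical, and
it assumes a 64-bit gsize, modelled here with uint64_t.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper mirroring the size expression "len * 5 + 1".
 * The int result is converted to an unsigned 64-bit size, which is
 * what happens when it is passed to an allocator taking a 64-bit gsize.
 */
static uint64_t demo_dump_alloc_size(int len)
{
    return (uint64_t)(int64_t)(len * 5 + 1);
}

/* Same computation, with the clamp from the patch applied first. */
static uint64_t demo_dump_alloc_size_clamped(int len)
{
    if (len < 0) {
        len = 1;
    }
    assert(len >= 0);
    return demo_dump_alloc_size(len);
}
```

With len == -1 the unclamped helper yields 2^64 - 4, the value gcc reports;
with the clamp the request shrinks to 6 bytes.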




[PATCH v3 8/9] check-block: enable iotests with cfi-icall

2020-11-05 Thread Daniele Buono
cfi-icall is a form of Control-Flow Integrity for indirect function
calls implemented by llvm. It is enabled with a -fsanitize flag.

iotests are currently disabled when any -fsanitize option is used, with the
exception of SafeStack.

This patch implements a generic filtering mechanism that allows iotests
with a set of known-to-be-safe -fsanitize options, then marks SafeStack
and the new options used for cfi-icall as safe for iotests.

Signed-off-by: Daniele Buono 
---
 tests/check-block.sh | 18 +++---
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/tests/check-block.sh b/tests/check-block.sh
index f6b1bda7b9..fb4c1baae9 100755
--- a/tests/check-block.sh
+++ b/tests/check-block.sh
@@ -21,14 +21,18 @@ if grep -q "CONFIG_GPROF=y" config-host.mak 2>/dev/null ; then
 exit 0
 fi
 
-# Disable tests with any sanitizer except for SafeStack
-CFLAGS=$( grep "CFLAGS.*-fsanitize" config-host.mak 2>/dev/null )
-SANITIZE_FLAGS=""
-#Remove all occurrencies of -fsanitize=safe-stack
-for i in ${CFLAGS}; do
-if [ "${i}" != "-fsanitize=safe-stack" ]; then
-SANITIZE_FLAGS="${SANITIZE_FLAGS} ${i}"
+# Disable tests with any sanitizer except for specific ones
+SANITIZE_FLAGS=$( grep "CFLAGS.*-fsanitize" config-host.mak 2>/dev/null )
+ALLOWED_SANITIZE_FLAGS="safe-stack cfi-icall"
+#Remove all occurrences of allowed Sanitize flags
+for j in ${ALLOWED_SANITIZE_FLAGS}; do
+TMP_FLAGS=${SANITIZE_FLAGS}
+SANITIZE_FLAGS=""
+for i in ${TMP_FLAGS}; do
+if ! echo ${i} | grep -q "${j}" 2>/dev/null; then
+SANITIZE_FLAGS="${SANITIZE_FLAGS} ${i}"
 fi
+done
 done
 if echo ${SANITIZE_FLAGS} | grep -q "\-fsanitize" 2>/dev/null; then
 # Have a sanitize flag that is not allowed, stop
-- 
2.17.1




[PATCH v3 2/9] s390x: fix clang 11 warnings in cpu_models.c

2020-11-05 Thread Daniele Buono
cpu_models.c casts void * pointers directly to enums.
Such casts may convert the pointer to a smaller integer type, and clang
warns about them starting with version 11:

Clang 11 finds four spots in the code that trigger this new warning:

../qemu-base/target/s390x/cpu_models.c:985:21: error: cast to smaller integer type 'S390Feat' from 'void *' [-Werror,-Wvoid-pointer-to-enum-cast]
S390Feat feat = (S390Feat) opaque;
^
../qemu-base/target/s390x/cpu_models.c:1002:21: error: cast to smaller integer type 'S390Feat' from 'void *' [-Werror,-Wvoid-pointer-to-enum-cast]
S390Feat feat = (S390Feat) opaque;
^
../qemu-base/target/s390x/cpu_models.c:1036:27: error: cast to smaller integer type 'S390FeatGroup' from 'void *' [-Werror,-Wvoid-pointer-to-enum-cast]
S390FeatGroup group = (S390FeatGroup) opaque;
  ^~
../qemu-base/target/s390x/cpu_models.c:1057:27: error: cast to smaller integer type 'S390FeatGroup' from 'void *' [-Werror,-Wvoid-pointer-to-enum-cast]
S390FeatGroup group = (S390FeatGroup) opaque;
  ^~
4 errors generated.

Avoid this warning by casting the pointer to uintptr_t first.

Signed-off-by: Daniele Buono 
---
 target/s390x/cpu_models.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/target/s390x/cpu_models.c b/target/s390x/cpu_models.c
index 461e0b8f4a..b5abff8bef 100644
--- a/target/s390x/cpu_models.c
+++ b/target/s390x/cpu_models.c
@@ -986,7 +986,7 @@ void s390_realize_cpu_model(CPUState *cs, Error **errp)
 static void get_feature(Object *obj, Visitor *v, const char *name,
 void *opaque, Error **errp)
 {
-S390Feat feat = (S390Feat) opaque;
+S390Feat feat = (S390Feat) (uintptr_t) opaque;
 S390CPU *cpu = S390_CPU(obj);
 bool value;
 
@@ -1003,7 +1003,7 @@ static void get_feature(Object *obj, Visitor *v, const char *name,
 static void set_feature(Object *obj, Visitor *v, const char *name,
 void *opaque, Error **errp)
 {
-S390Feat feat = (S390Feat) opaque;
+S390Feat feat = (S390Feat) (uintptr_t) opaque;
 DeviceState *dev = DEVICE(obj);
 S390CPU *cpu = S390_CPU(obj);
 bool value;
@@ -1037,7 +1037,7 @@ static void set_feature(Object *obj, Visitor *v, const char *name,
 static void get_feature_group(Object *obj, Visitor *v, const char *name,
   void *opaque, Error **errp)
 {
-S390FeatGroup group = (S390FeatGroup) opaque;
+S390FeatGroup group = (S390FeatGroup) (uintptr_t) opaque;
 const S390FeatGroupDef *def = s390_feat_group_def(group);
 S390CPU *cpu = S390_CPU(obj);
 S390FeatBitmap tmp;
@@ -1058,7 +1058,7 @@ static void get_feature_group(Object *obj, Visitor *v, const char *name,
 static void set_feature_group(Object *obj, Visitor *v, const char *name,
   void *opaque, Error **errp)
 {
-S390FeatGroup group = (S390FeatGroup) opaque;
+S390FeatGroup group = (S390FeatGroup) (uintptr_t) opaque;
 const S390FeatGroupDef *def = s390_feat_group_def(group);
 DeviceState *dev = DEVICE(obj);
 S390CPU *cpu = S390_CPU(obj);
-- 
2.17.1
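
The cast pattern used in this patch can be illustrated in isolation. The
following is only a sketch: DemoFeat and both helpers are hypothetical
stand-ins for S390Feat and the property callbacks, not code from
cpu_models.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical enum standing in for S390Feat. */
typedef enum DemoFeat {
    DEMO_FEAT_A,
    DEMO_FEAT_B,
    DEMO_FEAT_C,
} DemoFeat;

/* Pack an enum value into the opaque pointer handed to a callback. */
static void *demo_feat_to_opaque(DemoFeat feat)
{
    return (void *)(uintptr_t)feat;
}

/*
 * Unpack it again. Going through uintptr_t first makes the narrowing
 * explicit: pointer -> pointer-sized integer -> enum.
 */
static DemoFeat demo_feat_from_opaque(void *opaque)
{
    uintptr_t raw = (uintptr_t)opaque;

    assert(raw <= DEMO_FEAT_C);
    return (DemoFeat)raw;
}
```

The intermediate uintptr_t cast does not change the value; it only tells the
compiler that the conversion from a pointer-sized quantity is intentional.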




[PATCH v3 6/9] configure,meson: add option to enable LTO

2020-11-05 Thread Daniele Buono
This patch allows compiling QEMU with link-time optimization (LTO).
Compilation with LTO is handled directly by meson. This patch only
adds the option in configure and forwards the request to meson.

Tested with all major versions of clang from 6 to 12.

Signed-off-by: Daniele Buono 
---
 configure   | 7 +++
 meson.build | 1 +
 2 files changed, 8 insertions(+)

diff --git a/configure b/configure
index 2c3c69f118..7115655fe4 100755
--- a/configure
+++ b/configure
@@ -242,6 +242,7 @@ host_cc="cc"
 audio_win_int=""
 libs_qga=""
 debug_info="yes"
+lto="false"
 stack_protector=""
 safe_stack=""
 use_containers="yes"
@@ -1166,6 +1167,10 @@ for opt do
   ;;
   --disable-werror) werror="no"
   ;;
+  --enable-lto) lto="true"
+  ;;
+  --disable-lto) lto="false"
+  ;;
   --enable-stack-protector) stack_protector="yes"
   ;;
   --disable-stack-protector) stack_protector="no"
@@ -1744,6 +1749,7 @@ disabled with --disable-FEATURE, default is enabled if available:
   module-upgrades try to load modules from alternate paths for upgrades
   debug-tcg   TCG debugging (default is disabled)
   debug-info  debugging information
+  lto Enable Link-Time Optimization.
   sparse  sparse checker
   safe-stack  SafeStack Stack Smash Protection. Depends on
   clang/llvm >= 3.7 and requires coroutine backend ucontext.
@@ -6991,6 +6997,7 @@ NINJA=$ninja $meson setup \
 -Dcapstone=$capstone -Dslirp=$slirp -Dfdt=$fdt \
 -Diconv=$iconv -Dcurses=$curses -Dlibudev=$libudev\
 -Ddocs=$docs -Dsphinx_build=$sphinx_build -Dinstall_blobs=$blobs \
+-Db_lto=$lto \
 $cross_arg \
 "$PWD" "$source_path"
 
diff --git a/meson.build b/meson.build
index 39ac5cf6d8..99c7ab1d38 100644
--- a/meson.build
+++ b/meson.build
@@ -2023,6 +2023,7 @@ summary_info += {'gprof enabled': config_host.has_key('CONFIG_GPROF')}
 summary_info += {'sparse enabled':sparse.found()}
 summary_info += {'strip binaries':get_option('strip')}
 summary_info += {'profiler':  config_host.has_key('CONFIG_PROFILER')}
+summary_info += {'link-time optimization (LTO)': get_option('b_lto')}
 summary_info += {'static build':  config_host.has_key('CONFIG_STATIC')}
 if targetos == 'darwin'
   summary_info += {'Cocoa support': config_host.has_key('CONFIG_COCOA')}
-- 
2.17.1




Re: [PATCH for-5.2 2/3] linux-user/sparc: Correct set/get_context handling of fp and i7

2020-11-05 Thread Richard Henderson
On 11/5/20 1:23 PM, Peter Maydell wrote:
> Because QEMU's user-mode emulation just directly accesses guest CPU
> state, for SPARC the guest register window state is not the same in
> the sparc64_get_context() and sparc64_set_context() functions as it
> is for the real kernel's versions of those functions.  Specifically,
> for the kernel it has saved the user space state such that the O*
> registers go into a pt_regs struct as UREG_I*, and the I* registers
> have been spilled onto the userspace stack.  For QEMU, we haven't
> done that, so the guest's O* registers are still in WREG_O* and the
> I* registers in WREG_I*.
> 
> The code was already accessing the O* registers correctly for QEMU,
> but had copied the kernel code for accessing the I* registers off the
> userspace stack.  Replace this with direct accesses to fp and i7 in
> the CPU state, and add a comment explaining why we differ from the
> kernel code here.
> 
> This fix is sufficient to get bash to a shell prompt.
> 
> Signed-off-by: Peter Maydell 
> ---
> I'm really pretty unsure about our handling of SPARC register
> windows here. This fix works, but should we instead be
> ensuring that the flush_windows() call cpu_loop() does
> before handling this trap has written the I* regs to the
> stack ???
> ---

Ach, I was so close to being right the last time I tried to clean up this code.

Reviewed-by: Richard Henderson 

r~



[PATCH v3 4/9] s390x: Avoid variable size warning in ipl.h

2020-11-05 Thread Daniele Buono
S390IPLState contains two IplParameterBlock, each of which may in turn hold
either an IPLBlockPV or an IplBlockFcp, both ending with a variable
sized field (an array).

This causes a warning with clang 11 or greater, which checks that
variable sized types are only allocated at the end of the struct:

In file included from ../qemu-cfi-v3/target/s390x/diag.c:21:
../qemu-cfi-v3/hw/s390x/ipl.h:161:23: error: field 'iplb' with variable sized type 'IplParameterBlock' (aka 'union IplParameterBlock') not at the end of a struct or class is a GNU extension [-Werror,-Wgnu-variable-sized-type-not-at-end]
IplParameterBlock iplb;
  ^
../qemu-cfi-v3/hw/s390x/ipl.h:162:23: error: field 'iplb_pv' with variable sized type 'IplParameterBlock' (aka 'union IplParameterBlock') not at the end of a struct or class is a GNU extension [-Werror,-Wgnu-variable-sized-type-not-at-end]
IplParameterBlock iplb_pv;

In this case, however, the warning is a false positive, because
IPLBlockPV and IplBlockFcp are allocated in a union wrapped at 4K,
making the union non-variable sized.

Fix the warning by turning the two variable sized arrays into arrays
of size 0. This avoids the compiler error and should produce the
same code.

Signed-off-by: Daniele Buono 
---
There is the possibility of removing IplBlockFcp from
IplParameterBlock, since it is not actually used.
This would also allow removing the definition of IplBlockFcp
entirely, but we may want to keep it for completeness.

 hw/s390x/ipl.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/s390x/ipl.h b/hw/s390x/ipl.h
index 9e90169695..dfc6dfd89c 100644
--- a/hw/s390x/ipl.h
+++ b/hw/s390x/ipl.h
@@ -32,7 +32,7 @@ struct IPLBlockPV {
 uint32_t num_comp;  /* 0x74 */
 uint64_t pv_header_addr;/* 0x78 */
 uint64_t pv_header_len; /* 0x80 */
-struct IPLBlockPVComp components[];
+struct IPLBlockPVComp components[0];
 } QEMU_PACKED;
 typedef struct IPLBlockPV IPLBlockPV;
 
@@ -63,7 +63,7 @@ struct IplBlockFcp {
 uint64_t br_lba;
 uint32_t scp_data_len;
 uint8_t  reserved6[260];
-uint8_t  scp_data[];
+uint8_t  scp_data[0];
 } QEMU_PACKED;
 typedef struct IplBlockFcp IplBlockFcp;
 
-- 
2.17.1
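
The reason the warning is a false positive can be shown with a small sketch.
All type names below are hypothetical stand-ins for the ipl.h types, and the
zero-length array is a GNU extension, as in the patch itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_BLOCK_SIZE 4096

/* Stand-in for IplBlockFcp: a header followed by trailing data. */
struct DemoBlock {
    uint32_t data_len;
    uint8_t  data[0];   /* zero-length array (GNU extension) */
};

/* Stand-in for IplParameterBlock: padded to a fixed 4K size. */
union DemoParameterBlock {
    struct DemoBlock block;
    uint8_t          raw[DEMO_BLOCK_SIZE];
};

/* Stand-in for S390IPLState: two parameter blocks back to back. */
struct DemoState {
    union DemoParameterBlock first;
    union DemoParameterBlock second;
};

/* The union has a fixed size no matter how long the trailing data is. */
static_assert(sizeof(union DemoParameterBlock) == DEMO_BLOCK_SIZE,
              "parameter block must be exactly 4K");

/* The trailing data lives inside the 4K area, right after the header. */
static uint8_t *demo_payload(union DemoParameterBlock *pb)
{
    return pb->block.data;
}
```

Because the raw member fixes the size of the union, the second block always
starts 4096 bytes after the first, and the trailing array of the first block
can never run into it as long as the payload fits in the 4K area.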




[PATCH v3 1/9] fuzz: Make fork_fuzz.ld compatible with LLVM's LLD

2020-11-05 Thread Daniele Buono
LLVM's linker, LLD, supports the keyword "INSERT AFTER", starting with
version 11.
However, when multiple sections are defined in the same "INSERT AFTER",
they are added in a reversed order, compared to BFD's LD.

This patch makes fork_fuzz.ld generic enough to work with both linkers.
Each section now has its own "INSERT AFTER" keyword, so proper ordering is
defined between the sections added.

Signed-off-by: Daniele Buono 
---
 tests/qtest/fuzz/fork_fuzz.ld | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/tests/qtest/fuzz/fork_fuzz.ld b/tests/qtest/fuzz/fork_fuzz.ld
index bfb667ed06..cfb88b7fdb 100644
--- a/tests/qtest/fuzz/fork_fuzz.ld
+++ b/tests/qtest/fuzz/fork_fuzz.ld
@@ -16,6 +16,11 @@ SECTIONS
   /* Lowest stack counter */
   *(__sancov_lowest_stack);
   }
+}
+INSERT AFTER .data;
+
+SECTIONS
+{
   .data.fuzz_ordered :
   {
   /*
@@ -34,6 +39,11 @@ SECTIONS
*/
*(.bss._ZN6fuzzer3TPCE);
   }
+}
+INSERT AFTER .data.fuzz_start;
+
+SECTIONS
+{
   .data.fuzz_end : ALIGN(4K)
   {
   __FUZZ_COUNTERS_END = .;
@@ -43,4 +53,4 @@ SECTIONS
  * Don't overwrite the SECTIONS in the default linker script. Instead insert the
  * above into the default script
  */
-INSERT AFTER .data;
+INSERT AFTER .data.fuzz_ordered;
-- 
2.17.1




[PATCH v3 7/9] cfi: Initial support for cfi-icall in QEMU

2020-11-05 Thread Daniele Buono
LLVM/Clang supports runtime checks for forward-edge Control-Flow
Integrity (CFI).

CFI on indirect function calls (cfi-icall) ensures that, in indirect
function calls, the function called has the right signature for the
pointer type defined at compile time.

For this check to work, the code must always respect the function
signature when using function pointers, the function must be defined
at compile time, and it must be compiled with link-time optimization.

This rules out, for example, shared libraries that are dynamically loaded
(given that functions are not known at compile time), and code that is
dynamically generated at run-time.

This patch:

1) Introduces the CONFIG_CFI flag to support cfi in QEMU

2) Introduces a decorator to allow the definition of "sensitive"
functions, where a non-instrumented function may be called at runtime
through a pointer. The decorator will take care of disabling cfi-icall
checks on such functions, when cfi is enabled.

3) Marks functions currently in QEMU that exhibit such behavior,
in particular:
- The function in TCG that calls pre-compiled TBs
- The function in TCI that interprets instructions
- Functions in the plugin infrastructures that jump to callbacks
- Functions in util that directly call a signal handler

Signed-off-by: Daniele Buono 
Acked-by: Alex Bennée  for plugins
---
 accel/tcg/cpu-exec.c| 11 +++
 include/qemu/compiler.h | 12 
 plugins/core.c  | 37 +
 plugins/loader.c|  7 +++
 tcg/tci.c   |  7 +++
 util/main-loop.c| 11 +++
 util/oslib-posix.c  | 11 +++
 7 files changed, 96 insertions(+)

diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index 58aea605d8..ffe0e1e3fb 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -26,6 +26,7 @@
 #include "exec/exec-all.h"
 #include "tcg/tcg.h"
 #include "qemu/atomic.h"
+#include "qemu/compiler.h"
 #include "sysemu/qtest.h"
 #include "qemu/timer.h"
 #include "qemu/rcu.h"
@@ -144,6 +145,16 @@ static void init_delay_params(SyncClocks *sc, const CPUState *cpu)
 #endif /* CONFIG USER ONLY */
 
 /* Execute a TB, and fix up the CPU state afterwards if necessary */
+/*
+ * Disable CFI checks.
+ * TCG creates binary blobs at runtime, with the transformed code.
+ * A TB is a blob of binary code, created at runtime and called with an
+ * indirect function call. Since such function did not exist at compile time,
+ * the CFI runtime has no way to verify its signature and would fail.
+ * TCG is not considered a security-sensitive part of QEMU so this does not
+ * affect the impact of CFI in environment with high security requirements
+ */
+QEMU_DISABLE_CFI
 static inline tcg_target_ulong cpu_tb_exec(CPUState *cpu, TranslationBlock *itb)
 {
 CPUArchState *env = cpu->env_ptr;
diff --git a/include/qemu/compiler.h b/include/qemu/compiler.h
index c76281f354..c87c242063 100644
--- a/include/qemu/compiler.h
+++ b/include/qemu/compiler.h
@@ -243,4 +243,16 @@ extern void QEMU_NORETURN QEMU_ERROR("code path is reachable")
 #define qemu_build_not_reached()  g_assert_not_reached()
 #endif
 
+#ifdef CONFIG_CFI
+/*
+ * If CFI is enabled, use an attribute to disable cfi-icall on the following
+ * function
+ */
+#define QEMU_DISABLE_CFI __attribute__((no_sanitize("cfi-icall")))
+#else
+/* If CFI is not enabled, use an empty define to not change the behavior */
+#define QEMU_DISABLE_CFI
+#endif
+
+
 #endif /* COMPILER_H */
diff --git a/plugins/core.c b/plugins/core.c
index 51bfc94787..87b823bbc4 100644
--- a/plugins/core.c
+++ b/plugins/core.c
@@ -31,6 +31,7 @@
 #include "tcg/tcg-op.h"
 #include "trace/mem-internal.h" /* mem_info macros */
 #include "plugin.h"
+#include "qemu/compiler.h"
 
 struct qemu_plugin_cb {
 struct qemu_plugin_ctx *ctx;
@@ -90,6 +91,12 @@ void plugin_unregister_cb__locked(struct qemu_plugin_ctx *ctx,
 }
 }
 
+/*
+ * Disable CFI checks.
+ * The callback function has been loaded from an external library so we do not
+ * have type information
+ */
+QEMU_DISABLE_CFI
 static void plugin_vcpu_cb__simple(CPUState *cpu, enum qemu_plugin_event ev)
 {
 struct qemu_plugin_cb *cb, *next;
@@ -111,6 +118,12 @@ static void plugin_vcpu_cb__simple(CPUState *cpu, enum qemu_plugin_event ev)
 }
 }
 
+/*
+ * Disable CFI checks.
+ * The callback function has been loaded from an external library so we do not
+ * have type information
+ */
+QEMU_DISABLE_CFI
 static void plugin_cb__simple(enum qemu_plugin_event ev)
 {
 struct qemu_plugin_cb *cb, *next;
@@ -128,6 +141,12 @@ static void plugin_cb__simple(enum qemu_plugin_event ev)
 }
 }
 
+/*
+ * Disable CFI checks.
+ * The callback function has been loaded from an external library so we do not
+ * have type information
+ */
+QEMU_DISABLE_CFI
 static void plugin_cb__udata(enum qemu_plugin_event ev)
 {
 struct qemu_plugin_cb *cb, *next;
@@ -325,6 +344,12 @@ void plugin_register_vcpu_mem_cb(GArray **arr,
 dyn_cb->f.generic = c

[PATCH v3 0/9] Add support for Control-Flow Integrity

2020-11-05 Thread Daniele Buono
This series adds support for Control-Flow Integrity checks
on indirect function calls.

It requires the use of clang and link-time optimization.

Changes in v3:

- clang 11+ warnings are now handled directly at the source,
instead of disabling specific warnings for the whole code.
Some more work may be needed here to polish the patch, I
would kindly ask for a review from the corresponding
maintainers
- Remove configure-time checks for toolchain compatibility
with LTO.
- the decorator to disable cfi checks on functions has
been renamed and moved to include/qemu/compiler.h
- configure-time checks for cfi support and dependencies
have been moved from configure to meson

Link to v2: https://www.mail-archive.com/qemu-devel@nongnu.org/msg753675.html
Link to v1: https://www.mail-archive.com/qemu-devel@nongnu.org/msg718786.html

Daniele Buono (9):
  fuzz: Make fork_fuzz.ld compatible with LLVM's LLD
  s390x: fix clang 11 warnings in cpu_models.c
  hw/usb: reorder fields in UASStatus
  s390x: Avoid variable size warning in ipl.h
  scsi: fix overflow in scsi_disk_new_request_dump
  configure,meson: add option to enable LTO
  cfi: Initial support for cfi-icall in QEMU
  check-block: enable iotests with cfi-icall
  configure/meson: support Control-Flow Integrity

 accel/tcg/cpu-exec.c  | 11 +
 configure | 26 
 hw/s390x/ipl.h|  4 +--
 hw/scsi/scsi-disk.c   |  4 +++
 hw/usb/dev-uas.c  |  2 +-
 include/qemu/compiler.h   | 12 +
 meson.build   | 46 +++
 meson_options.txt |  4 +++
 plugins/core.c| 37 
 plugins/loader.c  |  7 ++
 target/s390x/cpu_models.c |  8 +++---
 tcg/tci.c |  7 ++
 tests/check-block.sh  | 18 --
 tests/qtest/fuzz/fork_fuzz.ld | 12 -
 util/main-loop.c  | 11 +
 util/oslib-posix.c| 11 +
 16 files changed, 205 insertions(+), 15 deletions(-)

-- 
2.17.1




Re: [PATCH for-5.2 1/3] linux-user/sparc: Fix errors in target_ucontext structures

2020-11-05 Thread Richard Henderson
On 11/5/20 1:23 PM, Peter Maydell wrote:
> The various structs that make up the SPARC target_ucontext had some
> errors:
>  * target structures must not include fields which are host pointers,
>which might be the wrong size.  These should be abi_ulong instead
>  * because we don't have the 'long double' part of the mcfpu_fregs
>union in our version of the target_mc_fpu struct, we need to
>manually force it to be 16-aligned
> 
> In particular, the lack of 16-alignment caused sparc64_get_context()
> and sparc64_set_context() to read and write all the registers at the
> wrong offset, which triggered a guest glibc stack check in
> siglongjmp:
>   *** longjmp causes uninitialized stack frame ***: terminated
> when trying to run bash.

Reviewed-by: Richard Henderson 

> +} __attribute__((aligned(16)));

Hmph, 96 uses of the attribute directly, 20 uses of QEMU_ALIGNED.  I suppose we
should just remove the wrapper...


r~
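
The effect of the missing 16-byte alignment on field offsets can be seen in a
small sketch. The types below are hypothetical and much simpler than the real
target_mc_fpu; they only show how the attribute moves the following data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* FP register area without any 16-byte-aligned member. */
union DemoFregs {
    uint32_t sregs[32];
    uint64_t dregs[32];
};

/* Same layout, with the alignment forced by hand. */
union DemoFregsAligned {
    uint32_t sregs[32];
    uint64_t dregs[32];
} __attribute__((aligned(16)));

/* A context that places the register area after an 8-byte field. */
struct DemoCtx {
    uint64_t        flags;
    union DemoFregs fregs;
};

struct DemoCtxAligned {
    uint64_t               flags;
    union DemoFregsAligned fregs;
};

static_assert(_Alignof(union DemoFregsAligned) == 16,
              "attribute must raise the alignment to 16");
```

In the unaligned variant the register area starts at offset 8, in the aligned
one at offset 16, so every register is read 8 bytes off if the two sides
disagree about the attribute.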



Re: [PATCH] target/openrisc: fix icount handling for timer instructions

2020-11-05 Thread Richard Henderson
On 11/5/20 3:54 AM, Pavel Dovgalyuk wrote:
> This patch adds icount handling to mfspr/mtspr instructions
> that may deal with hardware timers.
> 
> Signed-off-by: Pavel Dovgalyuk 
> ---
>  target/openrisc/translate.c |   15 +++
>  1 file changed, 15 insertions(+)

Looks correct, but it would be better not to duplicate the code from
trans_l_mtspr, and use an is_jmp code (called DISAS_UPDATE_EXIT in some other
targets).


r~



Re: [PATCH] target/alpha: fix icount handling for timer instructions

2020-11-05 Thread Richard Henderson
On 11/5/20 1:04 AM, Pavel Dovgalyuk wrote:
> This patch handles icount mode for timer read/write instructions,
> because it is required to call gen_io_start in such cases.
> 
> Signed-off-by: Pavel Dovgalyuk 
> ---
>  target/alpha/translate.c |9 +++--
>  1 file changed, 7 insertions(+), 2 deletions(-)

Reviewed-by: Richard Henderson 

r~



[PATCH for-5.2 3/3] linux-user/sparc: Don't zero high half of PC, NPC, PSR in sigreturn

2020-11-05 Thread Peter Maydell
The function do_sigreturn() tries to store the PC, NPC and PSR in
uint32_t local variables, which implicitly drops the high half of
these fields for 64-bit guests.

The usual effect was that a guest which used signals would crash on
return from a signal unless it was lucky enough to take it while the
PC was in the low 4GB of the address space.  In particular, Debian
/bin/dash and /bin/bash would segfault after executing external
commands.

Use abi_ulong, which is the type these fields all have in the
__siginfo_t struct.

Signed-off-by: Peter Maydell 
---
 linux-user/sparc/signal.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/linux-user/sparc/signal.c b/linux-user/sparc/signal.c
index c315704b389..d12adc8e6ff 100644
--- a/linux-user/sparc/signal.c
+++ b/linux-user/sparc/signal.c
@@ -247,7 +247,7 @@ long do_sigreturn(CPUSPARCState *env)
 {
 abi_ulong sf_addr;
 struct target_signal_frame *sf;
-uint32_t up_psr, pc, npc;
+abi_ulong up_psr, pc, npc;
 target_sigset_t set;
 sigset_t host_set;
 int i;
-- 
2.20.1
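
The truncation described in the commit message can be reproduced outside
QEMU. This sketch assumes a 64-bit guest, so the hypothetical demo_abi_ulong
is a 64-bit type; the helpers are illustrative and not part of
linux-user/sparc/signal.c.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: abi_ulong is 64 bits wide for a 64-bit SPARC guest. */
typedef uint64_t demo_abi_ulong;

static_assert(sizeof(demo_abi_ulong) == 8, "sketch models a 64-bit guest");

/* Old behaviour: the value passes through a uint32_t local. */
static uint64_t demo_roundtrip_u32(demo_abi_ulong pc)
{
    uint32_t tmp = (uint32_t)pc;   /* high 32 bits are dropped here */

    return tmp;
}

/* New behaviour: the local has the same width as the guest field. */
static uint64_t demo_roundtrip_abi(demo_abi_ulong pc)
{
    demo_abi_ulong tmp = pc;

    return tmp;
}
```

A PC below 4GB survives both round trips, which is why only some guests
crashed; any PC above 4GB comes back with its upper half cleared in the
uint32_t variant.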



