[PATCH] configure: Add 'mkdir build' check
The QEMU configure script goes into an infinite error-printing loop when run in a read-only directory, because the 'build' dir is never created. Checking that 'mkdir build' succeeds and that the directory is writeable prevents this error. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/321 Signed-off-by: Dinah Baum --- configure | 37 ++--- 1 file changed, 30 insertions(+), 7 deletions(-) diff --git a/configure b/configure index 64960c6000..fe9028991f 100755 --- a/configure +++ b/configure @@ -32,9 +32,11 @@ then fi mkdir build -touch $MARKER +if [ -d build ] && [ -w build ] +then +touch $MARKER -cat > GNUmakefile <<'EOF' +cat > GNUmakefile <<'EOF' # This file is auto-generated by configure to support in-source tree # 'make' command invocation @@ -56,8 +58,15 @@ force: ; GNUmakefile: ; EOF -cd build -exec "$source_path/configure" "$@" +cd build +exec "$source_path/configure" "$@" +elif ! [ -d build ] +then +echo "ERROR: Unable to create ./build dir, try using a ../qemu/configure build" +elif ! [ -w build ] +then +echo "ERROR: ./build dir not writeable, try using a ../qemu/configure build" +fi fi # Temporary directory used for files created while @@ -181,9 +190,12 @@ compile_prog() { # symbolically link $1 to $2. Portable version of "ln -sf". symlink() { - rm -rf "$2" - mkdir -p "$(dirname "$2")" - ln -s "$1" "$2" + if [ -d $source_path/build ] && [ -w $source_path/build ] + then + rm -rf "$2" + mkdir -p "$(dirname "$2")" + ln -s "$1" "$2" + fi } # check whether a command is available to this shell (may be either an @@ -2287,7 +2299,18 @@ fi ### # generate config-host.mak +if ! [ -d $source_path/build ] || ! [ -w $source_path/build ] +then +echo "ERROR: ./build dir unusable, exiting" +# cleanup +rm -f config.log +rm -f Makefile.prereqs +rm -r "$TMPDIR1" +exit 1 +fi + if ! (GIT="$git" "$source_path/scripts/git-submodule.sh" "$git_submodules_action" "$git_submodules"); then +echo "BAD" exit 1 fi -- 2.30.2
Re: [PATCH] KVM: dirty ring: check if vcpu is created before dirty_ring_reap_one
Sorry, this patch is wrong. kvm_dirty_ring_reap_locked holds slots_lock, which may result in a deadlock while a memory_region is being modified. I am looking for a better way to know that all vcpus have finished being created before waking the reaper up. > -Original Message- From: "Weinan Liu" Sent: 2023-02-05 00:08:08 > (Sunday) To: qemu-devel@nongnu.org Cc: pet...@redhat.com, dgilb...@redhat.com, > "Weinan Liu" Subject: [PATCH] KVM: dirty ring: check if vcpu > is created before dirty_ring_reap_one > > From: Weinan Liu > > Failed to assert '(dirty_gfns && ring_size)' in kvm_dirty_ring_reap_one if > the vcpu has not been finished to create yet. This bug occasionally occurs > when I open 200+ qemu instances on my 16G 6-cores x86 machine. And it must > be triggered if inserting a 'sleep(10)' into kvm_vcpu_thread_fn as below-- > > static void *kvm_vcpu_thread_fn(void *arg) > { > CPUState *cpu = arg; > int r; > > rcu_register_thread(); > > +sleep(10); > qemu_mutex_lock_iothread(); > qemu_thread_get_self(cpu->thread); > cpu->thread_id = qemu_get_thread_id(); > cpu->can_do_io = 1; > > where dirty ring reaper will wakeup but then a vcpu has not been finished > to create. > > Signed-off-by: Weinan Liu > --- > accel/kvm/kvm-all.c | 9 + > 1 file changed, 9 insertions(+) > > diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c > index 7e6a6076b1..840da7630e 100644 > --- a/accel/kvm/kvm-all.c > +++ b/accel/kvm/kvm-all.c > @@ -719,6 +719,15 @@ static uint64_t kvm_dirty_ring_reap_locked(KVMState *s, > CPUState* cpu) > total = kvm_dirty_ring_reap_one(s, cpu); > } else { > CPU_FOREACH(cpu) { > +/* > + * Must ensure kvm_init_vcpu is finished, so cpu->kvm_dirty_gfns > is > + * available. > + */ > +while (cpu->created == false) { > +qemu_mutex_unlock_iothread(); > +qemu_mutex_lock_iothread(); > +} > + > total += kvm_dirty_ring_reap_one(s, cpu); > } > } > -- > 2.25.1 -- Weinan Liu (刘炜楠) Department of Computer Science and Technology School of Informatics Xiamen University
[PATCH v2] KVM: dirty ring: check if vcpu is created before dirty_ring_reap_one
The assertion '(dirty_gfns && ring_size)' in kvm_dirty_ring_reap_one fails if the vcpu has not finished being created yet. This bug occasionally occurs when I start 200+ qemu instances on my 16G, 6-core x86 machine, and it is reliably triggered by inserting a 'sleep(10)' into kvm_vcpu_thread_fn as below: static void *kvm_vcpu_thread_fn(void *arg) { CPUState *cpu = arg; int r; rcu_register_thread(); +sleep(10); qemu_mutex_lock_iothread(); qemu_thread_get_self(cpu->thread); cpu->thread_id = qemu_get_thread_id(); cpu->can_do_io = 1; In that case the dirty ring reaper wakes up while a vcpu has not yet finished being created. Signed-off-by: Weinan Liu --- accel/kvm/kvm-all.c | 5 + 1 file changed, 5 insertions(+) diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c index 7e6a6076b1..0070ad72b8 100644 --- a/accel/kvm/kvm-all.c +++ b/accel/kvm/kvm-all.c @@ -1416,6 +1416,11 @@ static void *kvm_dirty_ring_reaper_thread(void *data) */ sleep(1); +/* ensure kvm_init_vcpu is finished, so cpu->kvm_dirty_gfns is ok */ +if (!phase_check(PHASE_MACHINE_READY)) { +continue; +} + /* keep sleeping so that dirtylimit not be interfered by reaper */ if (dirtylimit_in_service()) { continue; -- 2.25.1
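The approach in this v2 patch, skipping a reaper pass until machine setup is complete, can be sketched outside of QEMU roughly as follows. This is only an illustration under stated assumptions: FakeVcpu, machine_ready, mark_machine_ready() and reaper_tick() are hypothetical names, and the atomic flag merely stands in for the phase_check(PHASE_MACHINE_READY) test that the patch uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for per-vcpu state; this is not QEMU's CPUState. */
typedef struct {
    uint32_t *dirty_gfns;   /* NULL until the vcpu is fully initialized */
    size_t ring_size;       /* 0 until the vcpu is fully initialized */
} FakeVcpu;

/* Set once by the setup code after every vcpu has been initialized. */
static atomic_bool machine_ready;

void mark_machine_ready(void)
{
    atomic_store(&machine_ready, true);
}

/* Reap one ring; the assert mirrors the precondition that was failing. */
static size_t reap_one(const FakeVcpu *cpu)
{
    assert(cpu->dirty_gfns && cpu->ring_size);
    return cpu->ring_size;  /* pretend every entry was dirty */
}

/*
 * One iteration of the reaper loop body. Returns the number of entries
 * reaped, or 0 if the pass was skipped because setup is not finished.
 */
size_t reaper_tick(const FakeVcpu *cpus, size_t ncpus)
{
    size_t total = 0;

    if (!atomic_load(&machine_ready)) {
        return 0;           /* corresponds to the 'continue' in the patch */
    }
    for (size_t i = 0; i < ncpus; i++) {
        total += reap_one(&cpus[i]);
    }
    return total;
}
```

Because the check happens before any per-vcpu state is touched, the reaper never needs to wait on an individual vcpu while holding a lock, which is what made the first version of the patch deadlock-prone.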
[PATCH] linux-user: add support for xtensa FDPIC
Define xtensa-specific info_is_fdpic and fill in FDPIC-specific registers in the xtensa version of init_thread. Signed-off-by: Max Filippov --- include/elf.h | 1 + linux-user/elfload.c | 16 +++- 2 files changed, 16 insertions(+), 1 deletion(-) diff --git a/include/elf.h b/include/elf.h index 8bf1e72720d5..e8bfe38a9fbd 100644 --- a/include/elf.h +++ b/include/elf.h @@ -1619,6 +1619,7 @@ typedef struct elf64_shdr { #define ELFOSABI_MODESTO 11 /* Novell Modesto. */ #define ELFOSABI_OPENBSD 12 /* OpenBSD. */ #define ELFOSABI_ARM_FDPIC 65 /* ARM FDPIC */ +#define ELFOSABI_XTENSA_FDPIC 65 /* Xtensa FDPIC */ #define ELFOSABI_ARM 97 /* ARM */ #define ELFOSABI_STANDALONE 255 /* Standalone (embedded) application */ diff --git a/linux-user/elfload.c b/linux-user/elfload.c index 5928c14dfc97..150d1d450396 100644 --- a/linux-user/elfload.c +++ b/linux-user/elfload.c @@ -1748,6 +1748,15 @@ static inline void init_thread(struct target_pt_regs *regs, regs->windowstart = 1; regs->areg[1] = infop->start_stack; regs->pc = infop->entry; +if (info_is_fdpic(infop)) { +regs->areg[4] = infop->loadmap_addr; +regs->areg[5] = infop->interpreter_loadmap_addr; +if (infop->interpreter_loadmap_addr) { +regs->areg[6] = infop->interpreter_pt_dynamic_addr; +} else { +regs->areg[6] = infop->pt_dynamic_addr; +} +} } /* See linux kernel: arch/xtensa/include/asm/elf.h. */ @@ -2207,11 +2216,16 @@ static void zero_bss(abi_ulong elf_bss, abi_ulong last_bss, int prot) } } -#ifdef TARGET_ARM +#if defined(TARGET_ARM) static int elf_is_fdpic(struct elfhdr *exec) { return exec->e_ident[EI_OSABI] == ELFOSABI_ARM_FDPIC; } +#elif defined(TARGET_XTENSA) +static int elf_is_fdpic(struct elfhdr *exec) +{ +return exec->e_ident[EI_OSABI] == ELFOSABI_XTENSA_FDPIC; +} #else /* Default implementation, always false. */ static int elf_is_fdpic(struct elfhdr *exec) -- 2.30.2
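For readers who want to see the two decisions in this patch in isolation, here is a small standalone sketch. It assumes nothing from QEMU: ident_is_fdpic() and pick_dynamic_addr() are hypothetical helpers, and the *_SKETCH constants are local copies (the OS/ABI byte sits at index 7 of e_ident, and 65 is the value the patch defines for Xtensa FDPIC).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local copies of the relevant ELF identification constants. */
#define EI_NIDENT_SKETCH   16
#define EI_OSABI_SKETCH    7    /* index of the OS/ABI byte in e_ident */
#define OSABI_FDPIC_SKETCH 65   /* value the patch uses for Xtensa FDPIC */

/* Hypothetical helper: true if the header's OS/ABI byte marks FDPIC. */
bool ident_is_fdpic(const uint8_t *e_ident, size_t len)
{
    if (len < EI_NIDENT_SKETCH) {
        return false;                       /* too short to be e_ident */
    }
    if (memcmp(e_ident, "\177ELF", 4) != 0) {
        return false;                       /* missing ELF magic */
    }
    return e_ident[EI_OSABI_SKETCH] == OSABI_FDPIC_SKETCH;
}

/*
 * Hypothetical helper mirroring the register choice in init_thread:
 * when an interpreter load map exists, pass the interpreter's dynamic
 * section address, otherwise the executable's own.
 */
uint32_t pick_dynamic_addr(uint32_t interp_loadmap_addr,
                           uint32_t interp_pt_dynamic_addr,
                           uint32_t exec_pt_dynamic_addr)
{
    return interp_loadmap_addr ? interp_pt_dynamic_addr
                               : exec_pt_dynamic_addr;
}
```

Note that the patch gives ELFOSABI_XTENSA_FDPIC the same numeric value as ELFOSABI_ARM_FDPIC, so the OS/ABI byte alone does not tell the two apart; the target architecture does.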
[PATCH 10/10] docs/fuzz: remove mentions of fork-based fuzzing
Signed-off-by: Alexander Bulekov --- docs/devel/fuzzing.rst | 22 ++ 1 file changed, 2 insertions(+), 20 deletions(-) diff --git a/docs/devel/fuzzing.rst b/docs/devel/fuzzing.rst index 715330c856..3bfcb33fc4 100644 --- a/docs/devel/fuzzing.rst +++ b/docs/devel/fuzzing.rst @@ -19,11 +19,6 @@ responsibility to ensure that state is reset between fuzzing-runs. Building the fuzzers -*NOTE*: If possible, build a 32-bit binary. When forking, the 32-bit fuzzer is -much faster, since the page-map has a smaller size. This is due to the fact that -AddressSanitizer maps ~20TB of memory, as part of its detection. This results -in a large page-map, and a much slower ``fork()``. - To build the fuzzers, install a recent version of clang: Configure with (substitute the clang binaries with the version you installed). Here, enable-sanitizers, is optional but it allows us to reliably detect bugs @@ -296,10 +291,9 @@ input. It is also responsible for manually calling ``main_loop_wait`` to ensure that bottom halves are executed and any cleanup required before the next input. Since the same process is reused for many fuzzing runs, QEMU state needs to -be reset at the end of each run. There are currently two implemented -options for resetting state: +be reset at the end of each run. For example, this can be done by rebooting the +VM, after each run. -- Reboot the guest between runs. - *Pros*: Straightforward and fast for simple fuzz targets. - *Cons*: Depending on the device, does not reset all device state. If the @@ -308,15 +302,3 @@ options for resetting state: reboot. - *Example target*: ``i440fx-qtest-reboot-fuzz`` - -- Run each test case in a separate forked process and copy the coverage - information back to the parent. This is fairly similar to AFL's "deferred" - fork-server mode [3] - - - *Pros*: Relatively fast. Devices only need to be initialized once. No need to -do slow reboots or vmloads. - - - *Cons*: Not officially supported by libfuzzer. 
Does not work well for - devices that rely on dedicated threads. - - - *Example target*: ``virtio-net-fork-fuzz`` -- 2.39.0
[PATCH 00/10] Retire Fork-Based Fuzzing
Hello,

This series removes fork-based fuzzing.

How does fork-based fuzzing work?
* A single parent process initializes QEMU
* We identify the devices we wish to fuzz (fuzzer-dependent)
* Use QTest to PCI enumerate the devices
* After that we start a fork-server which forks the process and executes fuzzer inputs inside the disposable children.

In a normal fuzzing process, everything happens in a single process.

Pros of fork-based fuzzing:
* We only need to do common configuration once (e.g. PCI enumeration).
* Fork provides a strong guarantee that fuzzer inputs will not interfere with each other
* The fuzzing process can continue even after a child process crashes
* We can apply our own timers to child processes to exit slow inputs early

Cons of fork-based fuzzing:
* Fork-based fuzzing is not supported by libfuzzer. We had to build our own fork-server and rely on tricks using linker-scripts and shared-memory to support fuzzing. ( https://physics.bu.edu/~alxndr/libfuzzer-forkserver/ )
* Fork-based fuzzing is currently the main blocker preventing us from enabling other fuzzers such as AFL++ on OSS-Fuzz
* Fork-based fuzzing may be a reason why coverage-builds are failing on OSS-Fuzz. Coverage is an important fuzzing metric which would allow us to find parts of the code that are not well-covered.
* Fork-based fuzzing has high overhead. fork() is an expensive system-call, especially for processes running ASAN (with large/complex VMA layouts).
* Fork prevents us from effectively fuzzing devices that rely on threads (e.g. qxl).

These patches remove fork-based fuzzing and replace it with reboot-based fuzzing for most cases.

Misc notes about this change:
* libfuzzer appears to be no longer in active development. As such, the current implementation of fork-based fuzzing (while having some nice advantages) is likely to hold us back in the future. If these changes are approved and appear to run successfully on OSS-Fuzz, we should be able to easily experiment with other fuzzing engines (AFL++).
* Some devices do not completely reset their state. This can lead to non-reproducible crashes. However, in my local tests, most crashes were reproducible. OSS-Fuzz shouldn't send us reports unless it can consistently reproduce a crash.
* In theory, the corpus-format should not change, so the existing corpus-inputs on OSS-Fuzz will transfer to the new reset()-able fuzzers.
* Each fuzzing process will now exit after a single crash is found. To continue the fuzzing process, use libfuzzer flags such as -jobs=-1
* We no longer control input-timeouts (those are handled by libfuzzer). Since timeouts on oss-fuzz can be many seconds long, I added a limit on the number of DMA bytes written.

Alexander Bulekov (10):
  hw/sparse-mem: clear memory on reset
  fuzz: add fuzz_reboot API
  fuzz/generic-fuzz: use reboots instead of forks to reset state
  fuzz/generic-fuzz: add a limit on DMA bytes written
  fuzz/virtio-scsi: remove fork-based fuzzer
  fuzz/virtio-net: remove fork-based fuzzer
  fuzz/virtio-blk: remove fork-based fuzzer
  fuzz/i440fx: remove fork-based fuzzer
  fuzz: remove fork-fuzzing scaffolding
  docs/fuzz: remove mentions of fork-based fuzzing

 docs/devel/fuzzing.rst | 22 +-
 hw/mem/sparse-mem.c | 13 +++-
 meson.build | 4 -
 tests/qtest/fuzz/fork_fuzz.c | 41 --
 tests/qtest/fuzz/fork_fuzz.h | 23 --
 tests/qtest/fuzz/fork_fuzz.ld | 56 --
 tests/qtest/fuzz/fuzz.c | 6 ++
 tests/qtest/fuzz/fuzz.h | 2 +-
 tests/qtest/fuzz/generic_fuzz.c | 111 +++-
 tests/qtest/fuzz/i440fx_fuzz.c | 27 +--
 tests/qtest/fuzz/meson.build | 6 +-
 tests/qtest/fuzz/virtio_blk_fuzz.c | 51 ++---
 tests/qtest/fuzz/virtio_net_fuzz.c | 54 ++
 tests/qtest/fuzz/virtio_scsi_fuzz.c | 51 ++---
 14 files changed, 72 insertions(+), 395 deletions(-)
 delete mode 100644 tests/qtest/fuzz/fork_fuzz.c
 delete mode 100644 tests/qtest/fuzz/fork_fuzz.h
 delete mode 100644 tests/qtest/fuzz/fork_fuzz.ld

--
2.39.0
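The fork-per-input model described in this cover letter can be illustrated with a minimal POSIX sketch. This is not QEMU's fork-server and it does not share coverage counters with the parent; run_input_forked(), toy_target() and device_state are hypothetical names chosen for the example. It only shows the two properties listed under "Pros": state changed by an input stays in the disposable child, and a crashing input does not take the parent down.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

typedef void (*fuzz_fn)(const unsigned char *data, size_t size);

/* Toy "device state" that every input modifies. */
int device_state;

/* Toy target: dirties the state and "crashes" on inputs starting with 'X'. */
void toy_target(const unsigned char *data, size_t size)
{
    device_state++;
    if (size > 0 && data[0] == 'X') {
        abort();
    }
}

/*
 * Run one input in a disposable child process. Returns 0 if the child
 * exited normally, the signal number if it was killed by a signal, and
 * -1 if fork() or waitpid() failed.
 */
int run_input_forked(fuzz_fn fn, const unsigned char *data, size_t size)
{
    int status = 0;
    pid_t pid = fork();

    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        fn(data, size);     /* any state changes die with the child */
        _Exit(0);
    }
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

The cost side of the trade-off is also visible here: every input pays for a fork() and a waitpid(), which is the overhead the series removes by resetting state in-process instead.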
[PATCH 03/10] fuzz/generic-fuzz: use reboots instead of forks to reset state
Signed-off-by: Alexander Bulekov --- tests/qtest/fuzz/generic_fuzz.c | 106 +++- 1 file changed, 23 insertions(+), 83 deletions(-) diff --git a/tests/qtest/fuzz/generic_fuzz.c b/tests/qtest/fuzz/generic_fuzz.c index 7326f6840b..c2e5642150 100644 --- a/tests/qtest/fuzz/generic_fuzz.c +++ b/tests/qtest/fuzz/generic_fuzz.c @@ -18,7 +18,6 @@ #include "tests/qtest/libqtest.h" #include "tests/qtest/libqos/pci-pc.h" #include "fuzz.h" -#include "fork_fuzz.h" #include "string.h" #include "exec/memory.h" #include "exec/ramblock.h" @@ -29,6 +28,8 @@ #include "generic_fuzz_configs.h" #include "hw/mem/sparse-mem.h" +static void pci_enum(gpointer pcidev, gpointer bus); + /* * SEPARATOR is used to separate "operations" in the fuzz input */ @@ -589,30 +590,6 @@ static void op_disable_pci(QTestState *s, const unsigned char *data, size_t len) pci_disabled = true; } -static void handle_timeout(int sig) -{ -if (qtest_log_enabled) { -fprintf(stderr, "[Timeout]\n"); -fflush(stderr); -} - -/* - * If there is a crash, libfuzzer/ASAN forks a child to run an - * "llvm-symbolizer" process for printing out a pretty stacktrace. It - * communicates with this child using a pipe. If we timeout+Exit, while - * libfuzzer is still communicating with the llvm-symbolizer child, we will - * be left with an orphan llvm-symbolizer process. Sometimes, this appears - * to lead to a deadlock in the forkserver. Use waitpid to check if there - * are any waitable children. If so, exit out of the signal-handler, and - * let libfuzzer finish communicating with the child, and exit, on its own. - */ -if (waitpid(-1, NULL, WNOHANG) == 0) { -return; -} - -_Exit(0); -} - /* * Here, we interpret random bytes from the fuzzer, as a sequence of commands. 
* Some commands can be variable-width, so we use a separator, SEPARATOR, to @@ -669,64 +646,34 @@ static void generic_fuzz(QTestState *s, const unsigned char *Data, size_t Size) size_t cmd_len; uint8_t op; -if (fork() == 0) { -struct sigaction sact; -struct itimerval timer; -sigset_t set; -/* - * Sometimes the fuzzer will find inputs that take quite a long time to - * process. Often times, these inputs do not result in new coverage. - * Even if these inputs might be interesting, they can slow down the - * fuzzer, overall. Set a timeout for each command to avoid hurting - * performance, too much - */ -if (timeout) { - -sigemptyset(&sact.sa_mask); -sact.sa_flags = SA_NODEFER; -sact.sa_handler = handle_timeout; -sigaction(SIGALRM, &sact, NULL); - -sigemptyset(&set); -sigaddset(&set, SIGALRM); -pthread_sigmask(SIG_UNBLOCK, &set, NULL); - -memset(&timer, 0, sizeof(timer)); -timer.it_value.tv_sec = timeout / USEC_IN_SEC; -timer.it_value.tv_usec = timeout % USEC_IN_SEC; -} +op_clear_dma_patterns(s, NULL, 0); +pci_disabled = false; -op_clear_dma_patterns(s, NULL, 0); -pci_disabled = false; +QPCIBus *pcibus = qpci_new_pc(s, NULL); +g_ptr_array_foreach(fuzzable_pci_devices, pci_enum, pcibus); +qpci_free_pc(pcibus); -while (cmd && Size) { -/* Reset the timeout, each time we run a new command */ -if (timeout) { -setitimer(ITIMER_REAL, &timer, NULL); -} +while (cmd && Size) { +/* Reset the timeout, each time we run a new command */ -/* Get the length until the next command or end of input */ -nextcmd = memmem(cmd, Size, SEPARATOR, strlen(SEPARATOR)); -cmd_len = nextcmd ? nextcmd - cmd : Size; +/* Get the length until the next command or end of input */ +nextcmd = memmem(cmd, Size, SEPARATOR, strlen(SEPARATOR)); +cmd_len = nextcmd ? 
nextcmd - cmd : Size; -if (cmd_len > 0) { -/* Interpret the first byte of the command as an opcode */ -op = *cmd % (sizeof(ops) / sizeof((ops)[0])); -ops[op](s, cmd + 1, cmd_len - 1); +if (cmd_len > 0) { +/* Interpret the first byte of the command as an opcode */ +op = *cmd % (sizeof(ops) / sizeof((ops)[0])); +ops[op](s, cmd + 1, cmd_len - 1); -/* Run the main loop */ -flush_events(s); -} -/* Advance to the next command */ -cmd = nextcmd ? nextcmd + sizeof(SEPARATOR) - 1 : nextcmd; -Size = Size - (cmd_len + sizeof(SEPARATOR) - 1); -g_array_set_size(dma_regions, 0); +/* Run the main loop */ +flush_events(s); } -_Exit(0); -} else { -flush_events(s); -wait(0); +/* Advance to the next command *
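The command loop touched by this patch splits the fuzzer input on a separator and uses the first byte of each chunk as an opcode. A self-contained sketch of that parsing scheme follows. It makes several assumptions: the separator value "FUZZ", the find_sep() helper (a portable stand-in for the memmem() call in the original), and the toy operations are all hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SEP "FUZZ"                  /* assumed separator value */
#define SEP_LEN (sizeof(SEP) - 1)

typedef void (*op_fn)(const unsigned char *args, size_t len, void *ctx);

/* Portable stand-in for memmem(): first occurrence of SEP, or NULL. */
static const unsigned char *find_sep(const unsigned char *hay, size_t len)
{
    for (size_t i = 0; i + SEP_LEN <= len; i++) {
        if (memcmp(hay + i, SEP, SEP_LEN) == 0) {
            return hay + i;
        }
    }
    return NULL;
}

/* Split the input on SEP and dispatch each non-empty command.
 * Returns the number of commands that were executed. */
size_t dispatch_commands(const unsigned char *data, size_t size,
                         const op_fn *ops, size_t nops, void *ctx)
{
    const unsigned char *cmd = data;
    size_t ran = 0;

    assert(nops > 0);
    while (cmd && size > 0) {
        const unsigned char *next = find_sep(cmd, size);
        size_t cmd_len = next ? (size_t)(next - cmd) : size;

        if (cmd_len > 0) {
            /* The first byte selects the operation, the rest are its args. */
            ops[cmd[0] % nops](cmd + 1, cmd_len - 1, ctx);
            ran++;
        }
        if (!next) {
            break;                  /* no separator left: input consumed */
        }
        cmd = next + SEP_LEN;
        size -= cmd_len + SEP_LEN;
    }
    return ran;
}

/* Toy operations: add or subtract the argument length from a counter. */
void op_add(const unsigned char *args, size_t len, void *ctx)
{
    (void)args;
    *(long *)ctx += (long)len;
}

void op_sub(const unsigned char *args, size_t len, void *ctx)
{
    (void)args;
    *(long *)ctx -= (long)len;
}

const op_fn toy_ops[] = { op_add, op_sub };
```

Taking the opcode modulo the table size means every byte value maps to some operation, so the fuzzer never wastes an input on an "invalid opcode" path.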
[PATCH 05/10] fuzz/virtio-scsi: remove fork-based fuzzer
Signed-off-by: Alexander Bulekov --- tests/qtest/fuzz/virtio_scsi_fuzz.c | 51 - 1 file changed, 7 insertions(+), 44 deletions(-) diff --git a/tests/qtest/fuzz/virtio_scsi_fuzz.c b/tests/qtest/fuzz/virtio_scsi_fuzz.c index b3220ef6cb..8b26e951ae 100644 --- a/tests/qtest/fuzz/virtio_scsi_fuzz.c +++ b/tests/qtest/fuzz/virtio_scsi_fuzz.c @@ -20,7 +20,6 @@ #include "standard-headers/linux/virtio_pci.h" #include "standard-headers/linux/virtio_scsi.h" #include "fuzz.h" -#include "fork_fuzz.h" #include "qos_fuzz.h" #define PCI_SLOT0x02 @@ -132,48 +131,24 @@ static void virtio_scsi_fuzz(QTestState *s, QVirtioSCSIQueues* queues, } } -static void virtio_scsi_fork_fuzz(QTestState *s, -const unsigned char *Data, size_t Size) -{ -QVirtioSCSI *scsi = fuzz_qos_obj; -static QVirtioSCSIQueues *queues; -if (!queues) { -queues = qvirtio_scsi_init(scsi->vdev, 0); -} -if (fork() == 0) { -virtio_scsi_fuzz(s, queues, Data, Size); -flush_events(s); -_Exit(0); -} else { -flush_events(s); -wait(NULL); -} -} - static void virtio_scsi_with_flag_fuzz(QTestState *s, const unsigned char *Data, size_t Size) { QVirtioSCSI *scsi = fuzz_qos_obj; static QVirtioSCSIQueues *queues; -if (fork() == 0) { -if (Size >= sizeof(uint64_t)) { -queues = qvirtio_scsi_init(scsi->vdev, *(uint64_t *)Data); -virtio_scsi_fuzz(s, queues, - Data + sizeof(uint64_t), Size - sizeof(uint64_t)); -flush_events(s); -} -_Exit(0); -} else { +if (Size >= sizeof(uint64_t)) { +queues = qvirtio_scsi_init(scsi->vdev, *(uint64_t *)Data); +virtio_scsi_fuzz(s, queues, +Data + sizeof(uint64_t), Size - sizeof(uint64_t)); flush_events(s); -wait(NULL); } +fuzz_reboot(s); } static void virtio_scsi_pre_fuzz(QTestState *s) { qos_init_path(s); -counter_shm_init(); } static void *virtio_scsi_test_setup(GString *cmd_line, void *arg) @@ -189,22 +164,10 @@ static void *virtio_scsi_test_setup(GString *cmd_line, void *arg) static void register_virtio_scsi_fuzz_targets(void) { -fuzz_add_qos_target(&(FuzzTarget){ -.name = "virtio-scsi-fuzz", 
-.description = "Fuzz the virtio-scsi virtual queues, forking " -"for each fuzz run", -.pre_vm_init = &counter_shm_init, -.pre_fuzz = &virtio_scsi_pre_fuzz, -.fuzz = virtio_scsi_fork_fuzz,}, -"virtio-scsi", -&(QOSGraphTestOptions){.before = virtio_scsi_test_setup} -); - fuzz_add_qos_target(&(FuzzTarget){ .name = "virtio-scsi-flags-fuzz", -.description = "Fuzz the virtio-scsi virtual queues, forking " -"for each fuzz run (also fuzzes the virtio flags)", -.pre_vm_init = &counter_shm_init, +.description = "Fuzz the virtio-scsi virtual queues. " +"Also fuzzes the virtio flags", .pre_fuzz = &virtio_scsi_pre_fuzz, .fuzz = virtio_scsi_with_flag_fuzz,}, "virtio-scsi", -- 2.39.0
[PATCH 07/10] fuzz/virtio-blk: remove fork-based fuzzer
Signed-off-by: Alexander Bulekov --- tests/qtest/fuzz/virtio_blk_fuzz.c | 51 -- 1 file changed, 7 insertions(+), 44 deletions(-) diff --git a/tests/qtest/fuzz/virtio_blk_fuzz.c b/tests/qtest/fuzz/virtio_blk_fuzz.c index a9fb9ecf6c..82575a11d9 100644 --- a/tests/qtest/fuzz/virtio_blk_fuzz.c +++ b/tests/qtest/fuzz/virtio_blk_fuzz.c @@ -19,7 +19,6 @@ #include "standard-headers/linux/virtio_pci.h" #include "standard-headers/linux/virtio_blk.h" #include "fuzz.h" -#include "fork_fuzz.h" #include "qos_fuzz.h" #define TEST_IMAGE_SIZE (64 * 1024 * 1024) @@ -128,48 +127,24 @@ static void virtio_blk_fuzz(QTestState *s, QVirtioBlkQueues* queues, } } -static void virtio_blk_fork_fuzz(QTestState *s, -const unsigned char *Data, size_t Size) -{ -QVirtioBlk *blk = fuzz_qos_obj; -static QVirtioBlkQueues *queues; -if (!queues) { -queues = qvirtio_blk_init(blk->vdev, 0); -} -if (fork() == 0) { -virtio_blk_fuzz(s, queues, Data, Size); -flush_events(s); -_Exit(0); -} else { -flush_events(s); -wait(NULL); -} -} - static void virtio_blk_with_flag_fuzz(QTestState *s, const unsigned char *Data, size_t Size) { QVirtioBlk *blk = fuzz_qos_obj; static QVirtioBlkQueues *queues; -if (fork() == 0) { -if (Size >= sizeof(uint64_t)) { -queues = qvirtio_blk_init(blk->vdev, *(uint64_t *)Data); -virtio_blk_fuzz(s, queues, - Data + sizeof(uint64_t), Size - sizeof(uint64_t)); -flush_events(s); -} -_Exit(0); -} else { +if (Size >= sizeof(uint64_t)) { +queues = qvirtio_blk_init(blk->vdev, *(uint64_t *)Data); +virtio_blk_fuzz(s, queues, +Data + sizeof(uint64_t), Size - sizeof(uint64_t)); flush_events(s); -wait(NULL); } +fuzz_reboot(s); } static void virtio_blk_pre_fuzz(QTestState *s) { qos_init_path(s); -counter_shm_init(); } static void drive_destroy(void *path) @@ -208,22 +183,10 @@ static void *virtio_blk_test_setup(GString *cmd_line, void *arg) static void register_virtio_blk_fuzz_targets(void) { -fuzz_add_qos_target(&(FuzzTarget){ -.name = "virtio-blk-fuzz", -.description = "Fuzz the virtio-blk virtual 
queues, forking " -"for each fuzz run", -.pre_vm_init = &counter_shm_init, -.pre_fuzz = &virtio_blk_pre_fuzz, -.fuzz = virtio_blk_fork_fuzz,}, -"virtio-blk", -&(QOSGraphTestOptions){.before = virtio_blk_test_setup} -); - fuzz_add_qos_target(&(FuzzTarget){ .name = "virtio-blk-flags-fuzz", -.description = "Fuzz the virtio-blk virtual queues, forking " -"for each fuzz run (also fuzzes the virtio flags)", -.pre_vm_init = &counter_shm_init, +.description = "Fuzz the virtio-blk virtual queues. " +"Also fuzzes the virtio flags", .pre_fuzz = &virtio_blk_pre_fuzz, .fuzz = virtio_blk_with_flag_fuzz,}, "virtio-blk", -- 2.39.0
[PATCH 09/10] fuzz: remove fork-fuzzing scaffolding
Fork-fuzzing provides a few pros, but our implementation prevents us from using fuzzers other than libFuzzer, and may be causing issues such as coverage-failure builds on OSS-Fuzz. It is not a great long-term solution as it depends on internal implementation details of libFuzzer (which is no longer in active development). Remove it in favor of other methods of resetting state between inputs. Signed-off-by: Alexander Bulekov --- meson.build | 4 --- tests/qtest/fuzz/fork_fuzz.c | 41 - tests/qtest/fuzz/fork_fuzz.h | 23 -- tests/qtest/fuzz/fork_fuzz.ld | 56 --- tests/qtest/fuzz/meson.build | 6 ++-- 5 files changed, 3 insertions(+), 127 deletions(-) delete mode 100644 tests/qtest/fuzz/fork_fuzz.c delete mode 100644 tests/qtest/fuzz/fork_fuzz.h delete mode 100644 tests/qtest/fuzz/fork_fuzz.ld diff --git a/meson.build b/meson.build index 6d3b665629..8be27c2408 100644 --- a/meson.build +++ b/meson.build @@ -215,10 +215,6 @@ endif # Specify linker-script with add_project_link_arguments so that it is not placed # within a linker --start-group/--end-group pair if get_option('fuzzing') - add_project_link_arguments(['-Wl,-T,', - (meson.current_source_dir() / 'tests/qtest/fuzz/fork_fuzz.ld')], - native: false, language: all_languages) - # Specify a filter to only instrument code that is directly related to # virtual-devices. configure_file(output: 'instrumentation-filter', diff --git a/tests/qtest/fuzz/fork_fuzz.c b/tests/qtest/fuzz/fork_fuzz.c deleted file mode 100644 index 6ffb2a7937..00 --- a/tests/qtest/fuzz/fork_fuzz.c +++ /dev/null @@ -1,41 +0,0 @@ -/* - * Fork-based fuzzing helpers - * - * Copyright Red Hat Inc., 2019 - * - * Authors: - * Alexander Bulekov - * - * This work is licensed under the terms of the GNU GPL, version 2 or later. - * See the COPYING file in the top-level directory. - * - */ - -#include "qemu/osdep.h" -#include "fork_fuzz.h" - - -void counter_shm_init(void) -{ -/* Copy what's in the counter region to a temporary buffer.. 
*/ -void *copy = malloc(&__FUZZ_COUNTERS_END - &__FUZZ_COUNTERS_START); -memcpy(copy, - &__FUZZ_COUNTERS_START, - &__FUZZ_COUNTERS_END - &__FUZZ_COUNTERS_START); - -/* Map a shared region over the counter region */ -if (mmap(&__FUZZ_COUNTERS_START, - &__FUZZ_COUNTERS_END - &__FUZZ_COUNTERS_START, - PROT_READ | PROT_WRITE, MAP_SHARED | MAP_FIXED | MAP_ANONYMOUS, - 0, 0) == MAP_FAILED) { -perror("Error: "); -exit(1); -} - -/* Copy the original data back to the counter-region */ -memcpy(&__FUZZ_COUNTERS_START, copy, - &__FUZZ_COUNTERS_END - &__FUZZ_COUNTERS_START); -free(copy); -} - - diff --git a/tests/qtest/fuzz/fork_fuzz.h b/tests/qtest/fuzz/fork_fuzz.h deleted file mode 100644 index 9ecb8b58ef..00 --- a/tests/qtest/fuzz/fork_fuzz.h +++ /dev/null @@ -1,23 +0,0 @@ -/* - * Fork-based fuzzing helpers - * - * Copyright Red Hat Inc., 2019 - * - * Authors: - * Alexander Bulekov - * - * This work is licensed under the terms of the GNU GPL, version 2 or later. - * See the COPYING file in the top-level directory. - * - */ - -#ifndef FORK_FUZZ_H -#define FORK_FUZZ_H - -extern uint8_t __FUZZ_COUNTERS_START; -extern uint8_t __FUZZ_COUNTERS_END; - -void counter_shm_init(void); - -#endif - diff --git a/tests/qtest/fuzz/fork_fuzz.ld b/tests/qtest/fuzz/fork_fuzz.ld deleted file mode 100644 index cfb88b7fdb..00 --- a/tests/qtest/fuzz/fork_fuzz.ld +++ /dev/null @@ -1,56 +0,0 @@ -/* - * We adjust linker script modification to place all of the stuff that needs to - * persist across fuzzing runs into a contiguous section of memory. Then, it is - * easy to re-map the counter-related memory as shared. - */ - -SECTIONS -{ - .data.fuzz_start : ALIGN(4K) - { - __FUZZ_COUNTERS_START = .; - __start___sancov_cntrs = .; - *(_*sancov_cntrs); - __stop___sancov_cntrs = .; - - /* Lowest stack counter */ - *(__sancov_lowest_stack); - } -} -INSERT AFTER .data; - -SECTIONS -{ - .data.fuzz_ordered : - { - /* - * Coverage counters. 
They're not necessary for fuzzing, but are useful - * for analyzing the fuzzing performance - */ - __start___llvm_prf_cnts = .; - *(*llvm_prf_cnts); - __stop___llvm_prf_cnts = .; - - /* Internal Libfuzzer TracePC object which contains the ValueProfileMap */ - FuzzerTracePC*(.bss*); - /* - * In case the above line fails, explicitly specify the (mangled) name of - * the object we care about - */ - *(.bss._ZN6fuzzer3TPCE); - } -} -INSERT AFTER .data.fuzz_start; - -SECTIONS -{ - .data.fuzz_end : ALIGN(4K) - { - __FUZZ_COUNTERS_END = .; - } -} -/* - * Don't overwrite the SECTIONS in the default linker script. Instead insert the - * above into the def
[PATCH 08/10] fuzz/i440fx: remove fork-based fuzzer
Signed-off-by: Alexander Bulekov --- tests/qtest/fuzz/i440fx_fuzz.c | 27 +-- 1 file changed, 1 insertion(+), 26 deletions(-) diff --git a/tests/qtest/fuzz/i440fx_fuzz.c b/tests/qtest/fuzz/i440fx_fuzz.c index b17fc725df..5d6a703481 100644 --- a/tests/qtest/fuzz/i440fx_fuzz.c +++ b/tests/qtest/fuzz/i440fx_fuzz.c @@ -18,7 +18,6 @@ #include "tests/qtest/libqos/pci-pc.h" #include "fuzz.h" #include "qos_fuzz.h" -#include "fork_fuzz.h" #define I440FX_PCI_HOST_BRIDGE_CFG 0xcf8 @@ -89,6 +88,7 @@ static void i440fx_fuzz_qtest(QTestState *s, size_t Size) { ioport_fuzz_qtest(s, Data, Size); +fuzz_reboot(s); } static void pciconfig_fuzz_qos(QTestState *s, QPCIBus *bus, @@ -145,17 +145,6 @@ static void i440fx_fuzz_qos(QTestState *s, pciconfig_fuzz_qos(s, bus, Data, Size); } -static void i440fx_fuzz_qos_fork(QTestState *s, -const unsigned char *Data, size_t Size) { -if (fork() == 0) { -i440fx_fuzz_qos(s, Data, Size); -_Exit(0); -} else { -flush_events(s); -wait(NULL); -} -} - static const char *i440fx_qtest_argv = TARGET_NAME " -machine accel=qtest" " -m 0 -display none"; static GString *i440fx_argv(FuzzTarget *t) @@ -163,10 +152,6 @@ static GString *i440fx_argv(FuzzTarget *t) return g_string_new(i440fx_qtest_argv); } -static void fork_init(void) -{ -counter_shm_init(); -} static void register_pci_fuzz_targets(void) { @@ -178,16 +163,6 @@ static void register_pci_fuzz_targets(void) .get_init_cmdline = i440fx_argv, .fuzz = i440fx_fuzz_qtest}); -/* Uses libqos and forks to prevent state leakage */ -fuzz_add_qos_target(&(FuzzTarget){ -.name = "i440fx-qos-fork-fuzz", -.description = "Fuzz the i440fx using raw qtest commands and " - "rebooting after each run", -.pre_vm_init = &fork_init, -.fuzz = i440fx_fuzz_qos_fork,}, -"i440FX-pcihost", -&(QOSGraphTestOptions){} -); /* * Uses libqos. Doesn't do anything to reset state. Note that if we were to -- 2.39.0
[PATCH 04/10] fuzz/generic-fuzz: add a limit on DMA bytes written
As we have replaced fork-based fuzzing with reboots, we can no longer use a timeout+exit() to avoid slow inputs. Libfuzzer has its own timer that it uses to catch slow inputs; however, these timeouts are usually seconds to minutes long: more than enough to bog down the fuzzing process. However, I found that slow inputs often attempt to fill overly large DMA requests. Thus, we can mitigate most timeouts by setting a cap on the total number of DMA bytes written by an input. Signed-off-by: Alexander Bulekov --- tests/qtest/fuzz/generic_fuzz.c | 5 + 1 file changed, 5 insertions(+) diff --git a/tests/qtest/fuzz/generic_fuzz.c b/tests/qtest/fuzz/generic_fuzz.c index c2e5642150..eab92cbc23 100644 --- a/tests/qtest/fuzz/generic_fuzz.c +++ b/tests/qtest/fuzz/generic_fuzz.c @@ -52,6 +52,7 @@ enum cmds { #define USEC_IN_SEC 10 #define MAX_DMA_FILL_SIZE 0x1 +#define MAX_TOTAL_DMA_SIZE 0x1000 #define PCI_HOST_BRIDGE_CFG 0xcf8 #define PCI_HOST_BRIDGE_DATA 0xcfc @@ -64,6 +65,7 @@ typedef struct { static useconds_t timeout = DEFAULT_TIMEOUT_US; static bool qtest_log_enabled; +size_t dma_bytes_written; MemoryRegion *sparse_mem_mr; @@ -197,6 +199,7 @@ void fuzz_dma_read_cb(size_t addr, size_t len, MemoryRegion *mr) */ if (dma_patterns->len == 0 || len == 0 +|| dma_bytes_written > MAX_TOTAL_DMA_SIZE || (mr != current_machine->ram && mr != sparse_mem_mr)) { return; } @@ -269,6 +272,7 @@ void fuzz_dma_read_cb(size_t addr, size_t len, MemoryRegion *mr) fflush(stderr); } qtest_memwrite(qts_global, addr, buf, l); +dma_bytes_written += l; } len -= l; buf += l; @@ -648,6 +652,7 @@ static void generic_fuzz(QTestState *s, const unsigned char *Data, size_t Size) op_clear_dma_patterns(s, NULL, 0); pci_disabled = false; +dma_bytes_written = 0; QPCIBus *pcibus = qpci_new_pc(s, NULL); g_ptr_array_foreach(fuzzable_pci_devices, pci_enum, pcibus); -- 2.39.0
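The per-input byte budget described above boils down to a counter that is reset before each input and checked before each fill. A minimal sketch, with assumptions stated up front: DMA_BUDGET_SKETCH is an arbitrary illustrative cap (not the patch's MAX_TOTAL_DMA_SIZE), and dma_budget_reset() and dma_fill() are hypothetical helpers that write into a plain buffer instead of guest memory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DMA_BUDGET_SKETCH 4096  /* illustrative cap, not the patch's value */

/* Total bytes written on behalf of the current input. */
static size_t dma_bytes_written;

/* Called once at the start of every fuzz input. */
void dma_budget_reset(void)
{
    dma_bytes_written = 0;
}

/*
 * Fill 'len' bytes at 'dst' unless the budget is already exceeded.
 * Returns the number of bytes written (0 if the fill was skipped).
 * As in the patch, the check happens before the write, so the last
 * accepted fill may push the total past the cap.
 */
size_t dma_fill(unsigned char *dst, size_t len, unsigned char pattern)
{
    if (len == 0 || dma_bytes_written > DMA_BUDGET_SKETCH) {
        return 0;
    }
    memset(dst, pattern, len);
    dma_bytes_written += len;
    return len;
}
```

Unlike a wall-clock timeout, this bound is deterministic: the same input always stops filling at the same point, which helps keep crashes reproducible.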
[PATCH 02/10] fuzz: add fuzz_reboot API
As we are converting most fuzzers to rely on reboots to reset state, introduce an API to make sure reboots are invoked in a consistent manner. Signed-off-by: Alexander Bulekov --- tests/qtest/fuzz/fuzz.c | 6 ++ tests/qtest/fuzz/fuzz.h | 2 +- 2 files changed, 7 insertions(+), 1 deletion(-) diff --git a/tests/qtest/fuzz/fuzz.c b/tests/qtest/fuzz/fuzz.c index eb7520544b..c2d07a4c7e 100644 --- a/tests/qtest/fuzz/fuzz.c +++ b/tests/qtest/fuzz/fuzz.c @@ -51,6 +51,12 @@ void flush_events(QTestState *s) } } +void fuzz_reboot(QTestState *s) +{ +qemu_system_reset(SHUTDOWN_CAUSE_GUEST_RESET); +main_loop_wait(true); +} + static QTestState *qtest_setup(void) { qtest_server_set_send_handler(&qtest_client_inproc_recv, &fuzz_qts); diff --git a/tests/qtest/fuzz/fuzz.h b/tests/qtest/fuzz/fuzz.h index 327c1c5a55..69e2b3877f 100644 --- a/tests/qtest/fuzz/fuzz.h +++ b/tests/qtest/fuzz/fuzz.h @@ -103,7 +103,7 @@ typedef struct FuzzTarget { } FuzzTarget; void flush_events(QTestState *); -void reboot(QTestState *); +void fuzz_reboot(QTestState *); /* Use the QTest ASCII protocol or call address_space API directly?*/ void fuzz_qtest_set_serialize(bool option); -- 2.39.0
[PATCH 06/10] fuzz/virtio-net: remove fork-based fuzzer
Signed-off-by: Alexander Bulekov --- tests/qtest/fuzz/virtio_net_fuzz.c | 54 +++--- 1 file changed, 5 insertions(+), 49 deletions(-) diff --git a/tests/qtest/fuzz/virtio_net_fuzz.c b/tests/qtest/fuzz/virtio_net_fuzz.c index c2c15f07f0..d245ee66a1 100644 --- a/tests/qtest/fuzz/virtio_net_fuzz.c +++ b/tests/qtest/fuzz/virtio_net_fuzz.c @@ -16,7 +16,6 @@ #include "tests/qtest/libqtest.h" #include "tests/qtest/libqos/virtio-net.h" #include "fuzz.h" -#include "fork_fuzz.h" #include "qos_fuzz.h" @@ -115,36 +114,18 @@ static void virtio_net_fuzz_multi(QTestState *s, } } -static void virtio_net_fork_fuzz(QTestState *s, -const unsigned char *Data, size_t Size) -{ -if (fork() == 0) { -virtio_net_fuzz_multi(s, Data, Size, false); -flush_events(s); -_Exit(0); -} else { -flush_events(s); -wait(NULL); -} -} -static void virtio_net_fork_fuzz_check_used(QTestState *s, +static void virtio_net_fuzz_check_used(QTestState *s, const unsigned char *Data, size_t Size) { -if (fork() == 0) { -virtio_net_fuzz_multi(s, Data, Size, true); -flush_events(s); -_Exit(0); -} else { -flush_events(s); -wait(NULL); -} +virtio_net_fuzz_multi(s, Data, Size, true); +flush_events(s); +fuzz_reboot(s); } static void virtio_net_pre_fuzz(QTestState *s) { qos_init_path(s); -counter_shm_init(); } static void *virtio_net_test_setup_socket(GString *cmd_line, void *arg) @@ -158,23 +139,8 @@ static void *virtio_net_test_setup_socket(GString *cmd_line, void *arg) return arg; } -static void *virtio_net_test_setup_user(GString *cmd_line, void *arg) -{ -g_string_append_printf(cmd_line, " -netdev user,id=hs0 "); -return arg; -} - static void register_virtio_net_fuzz_targets(void) { -fuzz_add_qos_target(&(FuzzTarget){ -.name = "virtio-net-socket", -.description = "Fuzz the virtio-net virtual queues. 
Fuzz incoming " -"traffic using the socket backend", -.pre_fuzz = &virtio_net_pre_fuzz, -.fuzz = virtio_net_fork_fuzz,}, -"virtio-net", -&(QOSGraphTestOptions){.before = virtio_net_test_setup_socket} -); fuzz_add_qos_target(&(FuzzTarget){ .name = "virtio-net-socket-check-used", @@ -182,20 +148,10 @@ static void register_virtio_net_fuzz_targets(void) "descriptors to be used. Timeout may indicate improperly handled " "input", .pre_fuzz = &virtio_net_pre_fuzz, -.fuzz = virtio_net_fork_fuzz_check_used,}, +.fuzz = virtio_net_fuzz_check_used,}, "virtio-net", &(QOSGraphTestOptions){.before = virtio_net_test_setup_socket} ); -fuzz_add_qos_target(&(FuzzTarget){ -.name = "virtio-net-slirp", -.description = "Fuzz the virtio-net virtual queues with the slirp " -" backend. Warning: May result in network traffic emitted from the " -" process. Run in an isolated network environment.", -.pre_fuzz = &virtio_net_pre_fuzz, -.fuzz = virtio_net_fork_fuzz,}, -"virtio-net", -&(QOSGraphTestOptions){.before = virtio_net_test_setup_user} -); } fuzz_target_init(register_virtio_net_fuzz_targets); -- 2.39.0
[PATCH 01/10] hw/sparse-mem: clear memory on reset
We use sparse-mem for fuzzing. For long-running fuzzing processes, we eventually end up with many allocated sparse-mem pages. To avoid this, clear the allocated pages on system-reset. Signed-off-by: Alexander Bulekov --- hw/mem/sparse-mem.c | 13 - 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/hw/mem/sparse-mem.c b/hw/mem/sparse-mem.c index e6640eb8e7..72f038d47d 100644 --- a/hw/mem/sparse-mem.c +++ b/hw/mem/sparse-mem.c @@ -77,6 +77,13 @@ static void sparse_mem_write(void *opaque, hwaddr addr, uint64_t v, } +static void sparse_mem_enter_reset(Object *obj, ResetType type) +{ +SparseMemState *s = SPARSE_MEM(obj); +g_hash_table_remove_all(s->mapped); +return; +} + static const MemoryRegionOps sparse_mem_ops = { .read = sparse_mem_read, .write = sparse_mem_write, @@ -123,7 +130,8 @@ static void sparse_mem_realize(DeviceState *dev, Error **errp) assert(s->baseaddr + s->length > s->baseaddr); -s->mapped = g_hash_table_new(NULL, NULL); +s->mapped = g_hash_table_new_full(NULL, NULL, NULL, + (GDestroyNotify)g_free); memory_region_init_io(&s->mmio, OBJECT(s), &sparse_mem_ops, s, "sparse-mem", s->length); sysbus_init_mmio(sbd, &s->mmio); @@ -131,12 +139,15 @@ static void sparse_mem_realize(DeviceState *dev, Error **errp) static void sparse_mem_class_init(ObjectClass *klass, void *data) { +ResettableClass *rc = RESETTABLE_CLASS(klass); DeviceClass *dc = DEVICE_CLASS(klass); device_class_set_props(dc, sparse_mem_properties); dc->desc = "Sparse Memory Device"; dc->realize = sparse_mem_realize; + +rc->phases.enter = sparse_mem_enter_reset; } static const TypeInfo sparse_mem_types[] = { -- 2.39.0
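The patch relies on the value-destroy callback of GLib's g_hash_table_new_full() so that g_hash_table_remove_all() frees every page on reset. The ownership idea can be sketched without GLib. Everything below is hypothetical and only for illustration: a small fixed-size slot array stands in for the hash table, and the sparse_mem_* names do not match the QEMU device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SPARSE_PAGE_SIZE 4096u
#define SPARSE_MAX_PAGES 16 /* hypothetical small bound for the sketch */

typedef struct {
    uint64_t page_index[SPARSE_MAX_PAGES];
    uint8_t *page[SPARSE_MAX_PAGES]; /* NULL means the slot is unused */
} sparse_mem;

/* Start with no pages allocated. */
static void sparse_mem_init(sparse_mem *s)
{
    assert(s != NULL);
    for (int i = 0; i < SPARSE_MAX_PAGES; i++) {
        s->page_index[i] = 0;
        s->page[i] = NULL;
    }
}

/* Return the zero-filled backing page for addr, allocating on first touch. */
static uint8_t *sparse_mem_get_page(sparse_mem *s, uint64_t addr)
{
    uint64_t idx = addr / SPARSE_PAGE_SIZE;
    int free_slot = -1;

    for (int i = 0; i < SPARSE_MAX_PAGES; i++) {
        if (s->page[i] != NULL && s->page_index[i] == idx) {
            return s->page[i];
        }
        if (s->page[i] == NULL && free_slot < 0) {
            free_slot = i;
        }
    }
    if (free_slot < 0) {
        return NULL; /* table full */
    }
    s->page[free_slot] = calloc(1, SPARSE_PAGE_SIZE);
    if (s->page[free_slot] != NULL) {
        s->page_index[free_slot] = idx;
    }
    return s->page[free_slot];
}

/* Number of pages currently allocated. */
static int sparse_mem_allocated(const sparse_mem *s)
{
    int n = 0;
    for (int i = 0; i < SPARSE_MAX_PAGES; i++) {
        n += (s->page[i] != NULL);
    }
    return n;
}

/* Reset handler: free every page so the next run starts from empty memory. */
static void sparse_mem_reset(sparse_mem *s)
{
    for (int i = 0; i < SPARSE_MAX_PAGES; i++) {
        free(s->page[i]);
        s->page[i] = NULL;
    }
}
```

The point of the design is that the container owns the pages: without a destroy step tied to reset, a long-running fuzzing process keeps every page any input ever touched.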
[PATCH v6 4/4] hw: replace most qemu_bh_new calls with qemu_bh_new_guarded
This protects devices from bh->mmio reentrancy issues. Reviewed-by: Darren Kenny Reviewed-by: Stefan Hajnoczi Signed-off-by: Alexander Bulekov --- hw/9pfs/xen-9p-backend.c| 4 +++- hw/block/dataplane/virtio-blk.c | 3 ++- hw/block/dataplane/xen-block.c | 5 +++-- hw/char/virtio-serial-bus.c | 3 ++- hw/display/qxl.c| 9 ++--- hw/display/virtio-gpu.c | 6 -- hw/ide/ahci.c | 3 ++- hw/ide/core.c | 3 ++- hw/misc/imx_rngc.c | 6 -- hw/misc/macio/mac_dbdma.c | 2 +- hw/net/virtio-net.c | 3 ++- hw/nvme/ctrl.c | 6 -- hw/scsi/mptsas.c| 3 ++- hw/scsi/scsi-bus.c | 3 ++- hw/scsi/vmw_pvscsi.c| 3 ++- hw/usb/dev-uas.c| 3 ++- hw/usb/hcd-dwc2.c | 3 ++- hw/usb/hcd-ehci.c | 3 ++- hw/usb/hcd-uhci.c | 2 +- hw/usb/host-libusb.c| 6 -- hw/usb/redirect.c | 6 -- hw/usb/xen-usb.c| 3 ++- hw/virtio/virtio-balloon.c | 5 +++-- hw/virtio/virtio-crypto.c | 3 ++- 24 files changed, 63 insertions(+), 33 deletions(-) diff --git a/hw/9pfs/xen-9p-backend.c b/hw/9pfs/xen-9p-backend.c index 65c4979c3c..f077c1b255 100644 --- a/hw/9pfs/xen-9p-backend.c +++ b/hw/9pfs/xen-9p-backend.c @@ -441,7 +441,9 @@ static int xen_9pfs_connect(struct XenLegacyDevice *xendev) xen_9pdev->rings[i].ring.out = xen_9pdev->rings[i].data + XEN_FLEX_RING_SIZE(ring_order); -xen_9pdev->rings[i].bh = qemu_bh_new(xen_9pfs_bh, &xen_9pdev->rings[i]); +xen_9pdev->rings[i].bh = qemu_bh_new_guarded(xen_9pfs_bh, + &xen_9pdev->rings[i], + &DEVICE(xen_9pdev)->mem_reentrancy_guard); xen_9pdev->rings[i].out_cons = 0; xen_9pdev->rings[i].out_size = 0; xen_9pdev->rings[i].inprogress = false; diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c index b28d81737e..a6202997ee 100644 --- a/hw/block/dataplane/virtio-blk.c +++ b/hw/block/dataplane/virtio-blk.c @@ -127,7 +127,8 @@ bool virtio_blk_data_plane_create(VirtIODevice *vdev, VirtIOBlkConf *conf, } else { s->ctx = qemu_get_aio_context(); } -s->bh = aio_bh_new(s->ctx, notify_guest_bh, s); +s->bh = aio_bh_new_guarded(s->ctx, notify_guest_bh, s, + 
&DEVICE(vdev)->mem_reentrancy_guard); s->batch_notify_vqs = bitmap_new(conf->num_queues); *dataplane = s; diff --git a/hw/block/dataplane/xen-block.c b/hw/block/dataplane/xen-block.c index 2785b9e849..e31806b317 100644 --- a/hw/block/dataplane/xen-block.c +++ b/hw/block/dataplane/xen-block.c @@ -632,8 +632,9 @@ XenBlockDataPlane *xen_block_dataplane_create(XenDevice *xendev, } else { dataplane->ctx = qemu_get_aio_context(); } -dataplane->bh = aio_bh_new(dataplane->ctx, xen_block_dataplane_bh, - dataplane); +dataplane->bh = aio_bh_new_guarded(dataplane->ctx, xen_block_dataplane_bh, + dataplane, + &DEVICE(xendev)->mem_reentrancy_guard); return dataplane; } diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c index 7d4601cb5d..dd619f0731 100644 --- a/hw/char/virtio-serial-bus.c +++ b/hw/char/virtio-serial-bus.c @@ -985,7 +985,8 @@ static void virtser_port_device_realize(DeviceState *dev, Error **errp) return; } -port->bh = qemu_bh_new(flush_queued_data_bh, port); +port->bh = qemu_bh_new_guarded(flush_queued_data_bh, port, + &dev->mem_reentrancy_guard); port->elem = NULL; } diff --git a/hw/display/qxl.c b/hw/display/qxl.c index ec712d3ca2..c0460c4ef1 100644 --- a/hw/display/qxl.c +++ b/hw/display/qxl.c @@ -2201,11 +2201,14 @@ static void qxl_realize_common(PCIQXLDevice *qxl, Error **errp) qemu_add_vm_change_state_handler(qxl_vm_change_state_handler, qxl); -qxl->update_irq = qemu_bh_new(qxl_update_irq_bh, qxl); +qxl->update_irq = qemu_bh_new_guarded(qxl_update_irq_bh, qxl, + &DEVICE(qxl)->mem_reentrancy_guard); qxl_reset_state(qxl); -qxl->update_area_bh = qemu_bh_new(qxl_render_update_area_bh, qxl); -qxl->ssd.cursor_bh = qemu_bh_new(qemu_spice_cursor_refresh_bh, &qxl->ssd); +qxl->update_area_bh = qemu_bh_new_guarded(qxl_render_update_area_bh, qxl, + &DEVICE(qxl)->mem_reentrancy_guard); +qxl->ssd.cursor_bh = qemu_bh_new_guarded(qemu_spice_cursor_refresh_bh, &qxl->ssd, + &DEVICE(qxl)->mem_reentrancy_guard); } static void qxl_realize_primary(PCIDevice 
*dev, Error **errp) diff --git a/hw/display/virtio-gp
[PATCH v6 3/4] checkpatch: add qemu_bh_new/aio_bh_new checks
Advise authors to use the _guarded versions of the APIs, instead. Reviewed-by: Darren Kenny Signed-off-by: Alexander Bulekov --- scripts/checkpatch.pl | 8 1 file changed, 8 insertions(+) diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl index 6ecabfb2b5..fbb71c70f8 100755 --- a/scripts/checkpatch.pl +++ b/scripts/checkpatch.pl @@ -2865,6 +2865,14 @@ sub process { if ($line =~ /\bsignal\s*\(/ && !($line =~ /SIG_(?:IGN|DFL)/)) { ERROR("use sigaction to establish signal handlers; signal is not portable\n" . $herecurr); } +# recommend qemu_bh_new_guarded instead of qemu_bh_new +if ($realfile =~ /.*\/hw\/.*/ && $line =~ /\bqemu_bh_new\s*\(/) { + ERROR("use qemu_bh_new_guarded() instead of qemu_bh_new() to avoid reentrancy problems\n" . $herecurr); + } +# recommend aio_bh_new_guarded instead of aio_bh_new +if ($realfile =~ /.*\/hw\/.*/ && $line =~ /\baio_bh_new\s*\(/) { + ERROR("use aio_bh_new_guarded() instead of aio_bh_new() to avoid reentrancy problems\n" . $herecurr); + } # check for module_init(), use category-specific init macros explicitly please if ($line =~ /^module_init\s*\(/) { ERROR("please use block_init(), type_init() etc. instead of module_init()\n" . $herecurr); -- 2.39.0
[PATCH v6 2/4] async: Add an optional reentrancy guard to the BH API
Devices can pass their MemReentrancyGuard (from their DeviceState) when creating new BHs. Then, the async API will toggle the guard before/after calling the BH callback. This prevents bh->mmio reentrancy issues.

Reviewed-by: Darren Kenny
Signed-off-by: Alexander Bulekov
--- docs/devel/multiple-iothreads.txt | 7 +++ include/block/aio.h | 18 -- include/qemu/main-loop.h | 7 +-- tests/unit/ptimer-test-stubs.c| 3 ++- util/async.c | 18 +- util/main-loop.c | 5 +++-- util/trace-events | 1 + 7 files changed, 51 insertions(+), 8 deletions(-) diff --git a/docs/devel/multiple-iothreads.txt b/docs/devel/multiple-iothreads.txt index 343120f2ef..a3e949f6b3 100644 --- a/docs/devel/multiple-iothreads.txt +++ b/docs/devel/multiple-iothreads.txt @@ -61,6 +61,7 @@ There are several old APIs that use the main loop AioContext: * LEGACY qemu_aio_set_event_notifier() - monitor an event notifier * LEGACY timer_new_ms() - create a timer * LEGACY qemu_bh_new() - create a BH + * LEGACY qemu_bh_new_guarded() - create a BH with a device re-entrancy guard * LEGACY qemu_aio_wait() - run an event loop iteration Since they implicitly work on the main loop they cannot be used in code that @@ -72,8 +73,14 @@ Instead, use the AioContext functions directly (see include/block/aio.h): * aio_set_event_notifier() - monitor an event notifier * aio_timer_new() - create a timer * aio_bh_new() - create a BH + * aio_bh_new_guarded() - create a BH with a device re-entrancy guard * aio_poll() - run an event loop iteration +The qemu_bh_new_guarded/aio_bh_new_guarded APIs accept a "MemReentrancyGuard" +argument, which is used to check for and prevent re-entrancy problems. For +BHs associated with devices, the reentrancy-guard is contained in the +corresponding DeviceState and named "mem_reentrancy_guard". + The AioContext can be obtained from the IOThread using iothread_get_aio_context() or for the main loop using qemu_get_aio_context(). 
Code that takes an AioContext argument works both in IOThreads or the main diff --git a/include/block/aio.h b/include/block/aio.h index 8fba6a3584..3e3bdb9352 100644 --- a/include/block/aio.h +++ b/include/block/aio.h @@ -23,6 +23,8 @@ #include "qemu/thread.h" #include "qemu/timer.h" #include "block/graph-lock.h" +#include "hw/qdev-core.h" + typedef struct BlockAIOCB BlockAIOCB; typedef void BlockCompletionFunc(void *opaque, int ret); @@ -331,9 +333,11 @@ void aio_bh_schedule_oneshot_full(AioContext *ctx, QEMUBHFunc *cb, void *opaque, * is opaque and must be allocated prior to its use. * * @name: A human-readable identifier for debugging purposes. + * @reentrancy_guard: A guard set when entering a cb to prevent + * device-reentrancy issues */ QEMUBH *aio_bh_new_full(AioContext *ctx, QEMUBHFunc *cb, void *opaque, -const char *name); +const char *name, MemReentrancyGuard *reentrancy_guard); /** * aio_bh_new: Allocate a new bottom half structure @@ -342,7 +346,17 @@ QEMUBH *aio_bh_new_full(AioContext *ctx, QEMUBHFunc *cb, void *opaque, * string. */ #define aio_bh_new(ctx, cb, opaque) \ -aio_bh_new_full((ctx), (cb), (opaque), (stringify(cb))) +aio_bh_new_full((ctx), (cb), (opaque), (stringify(cb)), NULL) + +/** + * aio_bh_new_guarded: Allocate a new bottom half structure with a + * reentrancy_guard + * + * A convenience wrapper for aio_bh_new_full() that uses the cb as the name + * string. + */ +#define aio_bh_new_guarded(ctx, cb, opaque, guard) \ +aio_bh_new_full((ctx), (cb), (opaque), (stringify(cb)), guard) /** * aio_notify: Force processing of pending events. 
diff --git a/include/qemu/main-loop.h b/include/qemu/main-loop.h index c25f390696..84d1ce57f0 100644 --- a/include/qemu/main-loop.h +++ b/include/qemu/main-loop.h @@ -389,9 +389,12 @@ void qemu_cond_timedwait_iothread(QemuCond *cond, int ms); void qemu_fd_register(int fd); +#define qemu_bh_new_guarded(cb, opaque, guard) \ +qemu_bh_new_full((cb), (opaque), (stringify(cb)), guard) #define qemu_bh_new(cb, opaque) \ -qemu_bh_new_full((cb), (opaque), (stringify(cb))) -QEMUBH *qemu_bh_new_full(QEMUBHFunc *cb, void *opaque, const char *name); +qemu_bh_new_full((cb), (opaque), (stringify(cb)), NULL) +QEMUBH *qemu_bh_new_full(QEMUBHFunc *cb, void *opaque, const char *name, + MemReentrancyGuard *reentrancy_guard); void qemu_bh_schedule_idle(QEMUBH *bh); enum { diff --git a/tests/unit/ptimer-test-stubs.c b/tests/unit/ptimer-test-stubs.c index f5e75a96b6..24d5413f9d 100644 --- a/tests/unit/ptimer-test-stubs.c +++ b/tests/unit/ptimer-test-stubs.c @@ -107,7 +107,8 @@ int64_t qemu_clock_deadline_ns_all(QEMUClockType type, int attr_mask) return deadline; } -QEMUBH *qemu_bh_new_full(QEMUBHFunc *cb, void *opaque, const char *name) +QEMUBH *qemu_bh_new_full(QEMUBHFunc *cb,
[PATCH v6 1/4] memory: prevent dma-reentracy issues
Add a flag to the DeviceState, when a device is engaged in PIO/MMIO/DMA. This flag is set/checked prior to calling a device's MemoryRegion handlers, and set when device code initiates DMA. The purpose of this flag is to prevent two types of DMA-based reentrancy issues: 1.) mmio -> dma -> mmio case 2.) bh -> dma write -> mmio case These issues have led to problems such as stack-exhaustion and use-after-frees. Summary of the problem from Peter Maydell: https://lore.kernel.org/qemu-devel/cafeaca_23vc7he3iam-jva6w38lk4hjowae5kcknhprd5fp...@mail.gmail.com Resolves: https://gitlab.com/qemu-project/qemu/-/issues/62 Resolves: https://gitlab.com/qemu-project/qemu/-/issues/540 Resolves: https://gitlab.com/qemu-project/qemu/-/issues/541 Resolves: https://gitlab.com/qemu-project/qemu/-/issues/556 Resolves: https://gitlab.com/qemu-project/qemu/-/issues/557 Resolves: https://gitlab.com/qemu-project/qemu/-/issues/827 Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1282 Reviewed-by: Darren Kenny Reviewed-by: Stefan Hajnoczi Signed-off-by: Alexander Bulekov Acked-by: Peter Xu --- include/hw/qdev-core.h | 7 +++ softmmu/memory.c | 17 + softmmu/trace-events | 1 + 3 files changed, 25 insertions(+) diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h index 35fddb19a6..8858195262 100644 --- a/include/hw/qdev-core.h +++ b/include/hw/qdev-core.h @@ -162,6 +162,10 @@ struct NamedClockList { QLIST_ENTRY(NamedClockList) node; }; +typedef struct { +bool engaged_in_io; +} MemReentrancyGuard; + /** * DeviceState: * @realized: Indicates whether the device has been fully constructed. @@ -194,6 +198,9 @@ struct DeviceState { int alias_required_for_version; ResettableState reset; GSList *unplug_blockers; + +/* Is the device currently in mmio/pio/dma? 
Used to prevent re-entrancy */ +MemReentrancyGuard mem_reentrancy_guard; }; struct DeviceListener { diff --git a/softmmu/memory.c b/softmmu/memory.c index 9d64efca26..eefeeae317 100644 --- a/softmmu/memory.c +++ b/softmmu/memory.c @@ -533,6 +533,7 @@ static MemTxResult access_with_adjusted_size(hwaddr addr, uint64_t access_mask; unsigned access_size; unsigned i; +DeviceState *dev = NULL; MemTxResult r = MEMTX_OK; if (!access_size_min) { @@ -542,6 +543,19 @@ static MemTxResult access_with_adjusted_size(hwaddr addr, access_size_max = 4; } +/* Do not allow more than one simultanous access to a device's IO Regions */ +if (mr->owner && +!mr->ram_device && !mr->ram && !mr->rom_device && !mr->readonly) { +dev = (DeviceState *) object_dynamic_cast(mr->owner, TYPE_DEVICE); +if (dev) { +if (dev->mem_reentrancy_guard.engaged_in_io) { +trace_memory_region_reentrant_io(get_cpu_index(), mr, addr, size); +return MEMTX_ERROR; +} +dev->mem_reentrancy_guard.engaged_in_io = true; +} +} + /* FIXME: support unaligned access? 
*/ access_size = MAX(MIN(size, access_size_max), access_size_min); access_mask = MAKE_64BIT_MASK(0, access_size * 8); @@ -556,6 +570,9 @@ static MemTxResult access_with_adjusted_size(hwaddr addr, access_mask, attrs); } } +if (dev) { +dev->mem_reentrancy_guard.engaged_in_io = false; +} return r; } diff --git a/softmmu/trace-events b/softmmu/trace-events index 22606dc27b..62d04ea9a7 100644 --- a/softmmu/trace-events +++ b/softmmu/trace-events @@ -13,6 +13,7 @@ memory_region_ops_read(int cpu_index, void *mr, uint64_t addr, uint64_t value, u memory_region_ops_write(int cpu_index, void *mr, uint64_t addr, uint64_t value, unsigned size, const char *name) "cpu %d mr %p addr 0x%"PRIx64" value 0x%"PRIx64" size %u name '%s'" memory_region_subpage_read(int cpu_index, void *mr, uint64_t offset, uint64_t value, unsigned size) "cpu %d mr %p offset 0x%"PRIx64" value 0x%"PRIx64" size %u" memory_region_subpage_write(int cpu_index, void *mr, uint64_t offset, uint64_t value, unsigned size) "cpu %d mr %p offset 0x%"PRIx64" value 0x%"PRIx64" size %u" +memory_region_reentrant_io(int cpu_index, void *mr, uint64_t offset, unsigned size) "cpu %d mr %p offset 0x%"PRIx64" size %u" memory_region_ram_device_read(int cpu_index, void *mr, uint64_t addr, uint64_t value, unsigned size) "cpu %d mr %p addr 0x%"PRIx64" value 0x%"PRIx64" size %u" memory_region_ram_device_write(int cpu_index, void *mr, uint64_t addr, uint64_t value, unsigned size) "cpu %d mr %p addr 0x%"PRIx64" value 0x%"PRIx64" size %u" memory_region_sync_dirty(const char *mr, const char *listener, int global) "mr '%s' listener '%s' synced (global=%d)" -- 2.39.0
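The mmio -> dma -> mmio case handled by this patch can be reproduced in a few lines of standalone C. This is a sketch under stated assumptions, not QEMU code: device, device_access and self_dma_handler are hypothetical names, and where the patch returns MEMTX_ERROR and emits a trace event, the sketch simply returns false and counts the rejection.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same shape as the guard added to DeviceState by the patch. */
typedef struct {
    bool engaged_in_io;
} reentrancy_guard;

typedef struct device device;
typedef void (*io_handler)(device *dev);

struct device {
    reentrancy_guard guard;
    io_handler handler;
    int handled;  /* accesses that reached the handler */
    int rejected; /* accesses refused by the guard */
};

/*
 * Dispatch one access to the device. Returns false if the device is
 * already in the middle of an access, i.e. the call is re-entrant.
 */
static bool device_access(device *dev)
{
    assert(dev != NULL && dev->handler != NULL);
    if (dev->guard.engaged_in_io) {
        dev->rejected++;
        return false;
    }
    dev->guard.engaged_in_io = true;
    dev->handled++;
    dev->handler(dev);
    dev->guard.engaged_in_io = false;
    return true;
}

/* A handler whose "DMA" lands back on the device's own registers. */
static void self_dma_handler(device *dev)
{
    (void)device_access(dev);
}
```

With self_dma_handler installed, one outer call to device_access() runs the handler once and refuses the nested access, instead of recursing until the stack is exhausted.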
[PATCH v6 0/4] memory: prevent dma-reentracy issues
These patches aim to solve two types of DMA-reentrancy issues:

1.) mmio -> dma -> mmio case
To solve this, we track whether the device is engaged in io by checking/setting a reentrancy-guard within APIs used for MMIO access.

2.) bh -> dma write -> mmio case
This case is trickier, since we don't have a generic way to associate a bh with the underlying Device/DeviceState. Thus, this version allows a device to associate a reentrancy-guard with a bh, when creating it. (Instead of calling qemu_bh_new, you call qemu_bh_new_guarded)

I replaced most of the qemu_bh_new invocations with the guarded analog, except for the ones where the DeviceState was not trivially accessible.

v5 -> v6:
- Only apply checkpatch checks to code in paths containing "/hw/" (/hw/ and include/hw/)
- Fix a bug in a _guarded call added to hw/block/virtio-blk.c

v4 -> v5:
- Add corresponding checkpatch checks
- Save/restore reentrancy-flag when entering/exiting BHs
- Improve documentation
- Check object_dynamic_cast return value

v3 -> v4:
Instead of changing all of the DMA APIs, add an optional reentrancy guard to the BH API.

v2 -> v3:
Bite the bullet and modify the DMA APIs, rather than attempting to guess DeviceStates in BHs.

Alexander Bulekov (4): memory: prevent dma-reentracy issues async: Add an optional reentrancy guard to the BH API checkpatch: add qemu_bh_new/aio_bh_new checks hw: replace most qemu_bh_new calls with qemu_bh_new_guarded docs/devel/multiple-iothreads.txt | 7 +++ hw/9pfs/xen-9p-backend.c | 4 +++- hw/block/dataplane/virtio-blk.c | 3 ++- hw/block/dataplane/xen-block.c| 5 +++-- hw/char/virtio-serial-bus.c | 3 ++- hw/display/qxl.c | 9 ++--- hw/display/virtio-gpu.c | 6 -- hw/ide/ahci.c | 3 ++- hw/ide/core.c | 3 ++- hw/misc/imx_rngc.c| 6 -- hw/misc/macio/mac_dbdma.c | 2 +- hw/net/virtio-net.c | 3 ++- hw/nvme/ctrl.c| 6 -- hw/scsi/mptsas.c | 3 ++- hw/scsi/scsi-bus.c| 3 ++- hw/scsi/vmw_pvscsi.c | 3 ++- hw/usb/dev-uas.c | 3 ++- hw/usb/hcd-dwc2.c | 3 ++- hw/usb/hcd-ehci.c | 3 ++- hw/usb/hcd-uhci.c | 2 +- hw/usb/host-libusb.c | 6 -- hw/usb/redirect.c | 6 -- hw/usb/xen-usb.c | 3 ++- hw/virtio/virtio-balloon.c| 5 +++-- hw/virtio/virtio-crypto.c | 3 ++- include/block/aio.h | 18 -- include/hw/qdev-core.h| 7 +++ include/qemu/main-loop.h | 7 +-- scripts/checkpatch.pl | 8 softmmu/memory.c | 17 + softmmu/trace-events | 1 + tests/unit/ptimer-test-stubs.c| 3 ++- util/async.c | 18 +- util/main-loop.c | 5 +++-- util/trace-events | 1 + 35 files changed, 147 insertions(+), 41 deletions(-) -- 2.39.0
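The bh -> dma write -> mmio case and the "save/restore reentrancy-flag" item from the changelog can be sketched as follows. The util/async.c hunk is not quoted in this thread, so this is only a guess at the idea and not the actual implementation; bottom_half, bh_call, probe_device and probe_cb are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    bool engaged_in_io;
} reentrancy_guard;

typedef void (*bh_func)(void *opaque);

typedef struct {
    bh_func cb;
    void *opaque;
    reentrancy_guard *guard; /* NULL for an unguarded bottom half */
} bottom_half;

/*
 * Run a bottom half. If it carries a guard, mark the device as engaged
 * for the duration of the callback and restore the previous value
 * afterwards, so that a guard that was already set stays set.
 */
static void bh_call(bottom_half *bh)
{
    bool saved = false;

    assert(bh != NULL && bh->cb != NULL);
    if (bh->guard != NULL) {
        saved = bh->guard->engaged_in_io;
        bh->guard->engaged_in_io = true;
    }
    bh->cb(bh->opaque);
    if (bh->guard != NULL) {
        bh->guard->engaged_in_io = saved;
    }
}

/* Hypothetical device that records what its callback observed. */
typedef struct {
    reentrancy_guard guard;
    bool seen_engaged;
} probe_device;

static void probe_cb(void *opaque)
{
    probe_device *p = opaque;
    p->seen_engaged = p->guard.engaged_in_io;
}
```

While a guarded callback runs, any MMIO dispatch that checks the same guard would see engaged_in_io set and refuse the access, which is how the BH case ends up covered by the check from patch 1.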
Re: [PULL 00/11] Net patches
On 2/4/23 15:57, Peter Maydell wrote:
> On Thu, 2 Feb 2023 at 06:21, Jason Wang wrote:
>> The following changes since commit 13356edb87506c148b163b8c7eb0695647d00c2a:
>>
>> Merge tag 'block-pull-request' of https://gitlab.com/stefanha/qemu into staging (2023-01-24 09:45:33 +)
>>
>> are available in the git repository at:
>>
>> https://github.com/jasowang/qemu.git tags/net-pull-request
>>
>> for you to fetch changes up to 2bd492bca521ee8594f1d5db8dc9aac126fc4f85:
>>
>> vdpa: fix VHOST_BACKEND_F_IOTLB_ASID flag check (2023-02-02 14:16:48 +0800)
>
> Something weird has happened here -- this pullreq is trying to add tests/qtest/netdev-socket.c, but it already exists in the tree and doesn't have the same contents as the version in your pull request.
>
> Can you look at what's happened here and fix it up, please?

Thomas and Jason have queued the patch:

tests/qtest: netdev: test stream and dgram backends

For Jason it's because it's needed by

net: stream: add a new option to automatically reconnect

For me, both patches (in tree and Jason's one) are identical to my v7 (except the one that is merged does not have Thomas' acked-by).

Jason, you can remove PULL 09/11 from your pull request as it is already merged [1]

Thanks,
Laurent

[1] c95031a19f0d ("tests/qtest: netdev: test stream and dgram backends")
Re: [PULL 00/22] Linux user for 8.0 patches
On Sat, 4 Feb 2023 at 16:08, Laurent Vivier wrote:
>
> The following changes since commit 13356edb87506c148b163b8c7eb0695647d00c2a:
>
> Merge tag 'block-pull-request' of https://gitlab.com/stefanha/qemu into
> staging (2023-01-24 09:45:33 +)
>
> are available in the Git repository at:
>
> https://gitlab.com/laurent_vivier/qemu.git
> tags/linux-user-for-8.0-pull-request
>
> for you to fetch changes up to 3f0744f98b07c6fd2ce9d5840726d0915b2ae7c1:
>
> linux-user: Allow sendmsg() without IOV (2023-02-03 22:55:12 +0100)
>
>
> linux-user branch pull request 20230204
>
> Implement execveat()
> un-parent OBJECT(cpu) when closing thread
> Revert fix for glibc >= 2.36 sys/mount.h
> Fix/update strace
> move target_flat.h to target subdirs
> Fix SO_ERROR return code of getsockopt()
> Fix /proc/cpuinfo output for hppa
> Add emulation for MADV_WIPEONFORK and MADV_KEEPONFORK in madvise()
> Implement SOL_ALG encryption support
> linux-user: Allow sendmsg() without IOV
>

Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/8.0 for any user-visible changes.

-- PMM
Re: [PATCH] hw/ppc/pegasos2: Fix a typo in a comment
Queued in gitlab.com/danielhb/qemu/tree/ppc-next. Thanks, Daniel On 2/3/23 16:43, BALATON Zoltan wrote: Reported-by: Stefan Weil Signed-off-by: BALATON Zoltan --- hw/ppc/pegasos2.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/ppc/pegasos2.c b/hw/ppc/pegasos2.c index 1a13632ba6..a9563f4fb2 100644 --- a/hw/ppc/pegasos2.c +++ b/hw/ppc/pegasos2.c @@ -564,7 +564,7 @@ static void dt_isa(PCIBus *bus, PCIDevice *d, FDTInfo *fi) qemu_fdt_setprop_string(fi->fdt, fi->path, "device_type", "isa"); qemu_fdt_setprop_string(fi->fdt, fi->path, "name", "isa"); -/* addional devices */ +/* additional devices */ g_string_printf(name, "%s/lpt@i3bc", fi->path); qemu_fdt_add_subnode(fi->fdt, name->str); qemu_fdt_setprop_cell(fi->fdt, name->str, "clock-frequency", 0);
Re: [PATCH] tcg: Init temp_subindex in liveness_pass_2
On 3/2/23 23:59, Richard Henderson wrote: Correctly handle large types while lowering. Fixes: fac87bd2a49b ("tcg: Add temp_subindex to TCGTemp") Signed-off-by: Richard Henderson --- tcg/tcg.c | 1 + 1 file changed, 1 insertion(+) diff --git a/tcg/tcg.c b/tcg/tcg.c index fd557d55d3..bc60fd0fe8 100644 --- a/tcg/tcg.c +++ b/tcg/tcg.c @@ -3063,6 +3063,7 @@ static bool liveness_pass_2(TCGContext *s) TCGTemp *dts = tcg_temp_alloc(s); dts->type = its->type; dts->base_type = its->base_type; +dts->temp_subindex = its->temp_subindex; dts->kind = TEMP_EBB; its->state_ptr = dts; } else { Reviewed-by: Philippe Mathieu-Daudé
Re: [PATCH 4/4] pcie: add trace-poing for power indicator transitions
Oops, sorry. Both [PATCH 4/4] mails are identical, except that this one has a typo in the subject.

--
Best regards,
Vladimir
[PATCH 3/4] pcie: drop unused PCIExpressIndicator
The structure type is unused. Also, it's the only user of corresponding macros, so drop them too. Signed-off-by: Vladimir Sementsov-Ogievskiy --- include/hw/pci/pcie.h | 8 include/hw/pci/pcie_regs.h | 5 - 2 files changed, 13 deletions(-) diff --git a/include/hw/pci/pcie.h b/include/hw/pci/pcie.h index 798a262a0a..3cc2b15957 100644 --- a/include/hw/pci/pcie.h +++ b/include/hw/pci/pcie.h @@ -27,14 +27,6 @@ #include "hw/pci/pcie_sriov.h" #include "hw/hotplug.h" -typedef enum { -/* for attention and power indicator */ -PCI_EXP_HP_IND_RESERVED = PCI_EXP_SLTCTL_IND_RESERVED, -PCI_EXP_HP_IND_ON = PCI_EXP_SLTCTL_IND_ON, -PCI_EXP_HP_IND_BLINK= PCI_EXP_SLTCTL_IND_BLINK, -PCI_EXP_HP_IND_OFF = PCI_EXP_SLTCTL_IND_OFF, -} PCIExpressIndicator; - typedef enum { /* these bits must match the bits in Slot Control/Status registers. * PCI_EXP_HP_EV_xxx = PCI_EXP_SLTCTL_xxxE = PCI_EXP_SLTSTA_xxx diff --git a/include/hw/pci/pcie_regs.h b/include/hw/pci/pcie_regs.h index 00b595a82e..1fe0bdd25b 100644 --- a/include/hw/pci/pcie_regs.h +++ b/include/hw/pci/pcie_regs.h @@ -66,11 +66,6 @@ typedef enum PCIExpLinkWidth { #define PCI_EXP_SLTCAP_PSN_SHIFTctz32(PCI_EXP_SLTCAP_PSN) -#define PCI_EXP_SLTCTL_IND_RESERVED 0x0 -#define PCI_EXP_SLTCTL_IND_ON 0x1 -#define PCI_EXP_SLTCTL_IND_BLINK0x2 -#define PCI_EXP_SLTCTL_IND_OFF 0x3 - #define PCI_EXP_SLTCTL_SUPPORTED\ (PCI_EXP_SLTCTL_ABPE | \ PCI_EXP_SLTCTL_PDCE | \ -- 2.34.1
[PATCH 2/4] pcie_regs: drop duplicated indicator value macros
We already have indicator values in include/standard-headers/linux/pci_regs.h , no reason to reinvent them in include/hw/pci/pcie_regs.h. (and we already have usage of PCI_EXP_SLTCTL_PWR_IND_BLINK and PCI_EXP_SLTCTL_PWR_IND_OFF in hw/pci/pcie.c, so let's be consistent) Signed-off-by: Vladimir Sementsov-Ogievskiy --- include/hw/pci/pcie_regs.h | 9 - hw/pci/pcie.c | 13 +++-- 2 files changed, 7 insertions(+), 15 deletions(-) diff --git a/include/hw/pci/pcie_regs.h b/include/hw/pci/pcie_regs.h index 963dc2e170..00b595a82e 100644 --- a/include/hw/pci/pcie_regs.h +++ b/include/hw/pci/pcie_regs.h @@ -70,15 +70,6 @@ typedef enum PCIExpLinkWidth { #define PCI_EXP_SLTCTL_IND_ON 0x1 #define PCI_EXP_SLTCTL_IND_BLINK0x2 #define PCI_EXP_SLTCTL_IND_OFF 0x3 -#define PCI_EXP_SLTCTL_AIC_SHIFTctz32(PCI_EXP_SLTCTL_AIC) -#define PCI_EXP_SLTCTL_AIC_OFF \ -(PCI_EXP_SLTCTL_IND_OFF << PCI_EXP_SLTCTL_AIC_SHIFT) - -#define PCI_EXP_SLTCTL_PIC_SHIFTctz32(PCI_EXP_SLTCTL_PIC) -#define PCI_EXP_SLTCTL_PIC_OFF \ -(PCI_EXP_SLTCTL_IND_OFF << PCI_EXP_SLTCTL_PIC_SHIFT) -#define PCI_EXP_SLTCTL_PIC_ON \ -(PCI_EXP_SLTCTL_IND_ON << PCI_EXP_SLTCTL_PIC_SHIFT) #define PCI_EXP_SLTCTL_SUPPORTED\ (PCI_EXP_SLTCTL_ABPE | \ diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c index 82ef723983..ccdb2377e1 100644 --- a/hw/pci/pcie.c +++ b/hw/pci/pcie.c @@ -634,8 +634,8 @@ void pcie_cap_slot_init(PCIDevice *dev, PCIESlot *s) PCI_EXP_SLTCTL_PIC | PCI_EXP_SLTCTL_AIC); pci_word_test_and_set_mask(dev->config + pos + PCI_EXP_SLTCTL, - PCI_EXP_SLTCTL_PIC_OFF | - PCI_EXP_SLTCTL_AIC_OFF); + PCI_EXP_SLTCTL_PWR_IND_OFF | + PCI_EXP_SLTCTL_ATTN_IND_OFF); pci_word_test_and_set_mask(dev->wmask + pos + PCI_EXP_SLTCTL, PCI_EXP_SLTCTL_PIC | PCI_EXP_SLTCTL_AIC | @@ -679,7 +679,7 @@ void pcie_cap_slot_reset(PCIDevice *dev) PCI_EXP_SLTCTL_PDCE | PCI_EXP_SLTCTL_ABPE); pci_word_test_and_set_mask(exp_cap + PCI_EXP_SLTCTL, - PCI_EXP_SLTCTL_AIC_OFF); + PCI_EXP_SLTCTL_ATTN_IND_OFF); if (dev->cap_present & QEMU_PCIE_SLTCAP_PCP) { /* Downstream ports 
enforce device number 0. */ @@ -694,7 +694,8 @@ void pcie_cap_slot_reset(PCIDevice *dev) PCI_EXP_SLTCTL_PCC); } -pic = populated ? PCI_EXP_SLTCTL_PIC_ON : PCI_EXP_SLTCTL_PIC_OFF; +pic = populated ? +PCI_EXP_SLTCTL_PWR_IND_ON : PCI_EXP_SLTCTL_PWR_IND_OFF; pci_word_test_and_set_mask(exp_cap + PCI_EXP_SLTCTL, pic); } @@ -770,9 +771,9 @@ void pcie_cap_slot_write_config(PCIDevice *dev, * control of powered off slots before powering them on. */ if ((sltsta & PCI_EXP_SLTSTA_PDS) && (val & PCI_EXP_SLTCTL_PCC) && -(val & PCI_EXP_SLTCTL_PIC) == PCI_EXP_SLTCTL_PIC_OFF && +(val & PCI_EXP_SLTCTL_PIC) == PCI_EXP_SLTCTL_PWR_IND_OFF && (!(old_slt_ctl & PCI_EXP_SLTCTL_PCC) || -(old_slt_ctl & PCI_EXP_SLTCTL_PIC) != PCI_EXP_SLTCTL_PIC_OFF)) { +(old_slt_ctl & PCI_EXP_SLTCTL_PIC) != PCI_EXP_SLTCTL_PWR_IND_OFF)) { pcie_cap_slot_do_unplug(dev); } pcie_cap_update_power(dev); -- 2.34.1
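The claim that the removed shifted macros duplicate the standard header values can be checked with a little arithmetic. In the sketch below, the raw 2-bit encodings (0x1, 0x2, 0x3) are the ones visible in the diff above, while the PCI_EXP_SLTCTL_* field masks and indicator values are quoted from memory of Linux's pci_regs.h and should be verified against the header. mask_shift() is a hypothetical stand-in for QEMU's ctz32(), and indicator_field() is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values from Linux's pci_regs.h; verify against the header. */
#define PCI_EXP_SLTCTL_AIC            0x00c0 /* Attention Indicator Control */
#define PCI_EXP_SLTCTL_ATTN_IND_OFF   0x00c0
#define PCI_EXP_SLTCTL_PIC            0x0300 /* Power Indicator Control */
#define PCI_EXP_SLTCTL_PWR_IND_ON     0x0100
#define PCI_EXP_SLTCTL_PWR_IND_BLINK  0x0200
#define PCI_EXP_SLTCTL_PWR_IND_OFF    0x0300

/* Raw 2-bit encodings, as in the removed PCI_EXP_SLTCTL_IND_* macros. */
#define IND_ON    0x1
#define IND_BLINK 0x2
#define IND_OFF   0x3

/* Count trailing zero bits of a non-zero mask. */
static unsigned mask_shift(uint32_t mask)
{
    unsigned shift = 0;

    assert(mask != 0);
    while ((mask & 1u) == 0) {
        mask >>= 1;
        shift++;
    }
    return shift;
}

/* Place a raw indicator value into the register field selected by mask. */
static uint16_t indicator_field(uint32_t raw, uint32_t mask)
{
    return (uint16_t)((raw << mask_shift(mask)) & mask);
}
```

For example, the power indicator field starts at bit 8, so IND_OFF (0x3) shifted into place gives 0x0300, the same value as PCI_EXP_SLTCTL_PWR_IND_OFF. The old PCI_EXP_SLTCTL_PIC_OFF macro computed exactly that.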
[PATCH 4/4] pcie: add trace-point for power indicator transitions
Signed-off-by: Vladimir Sementsov-Ogievskiy --- hw/pci/pcie.c | 20 hw/pci/trace-events | 3 +++ 2 files changed, 23 insertions(+) diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c index ccdb2377e1..1a19368994 100644 --- a/hw/pci/pcie.c +++ b/hw/pci/pcie.c @@ -28,6 +28,7 @@ #include "hw/pci/pcie_regs.h" #include "hw/pci/pcie_port.h" #include "qemu/range.h" +#include "trace.h" //#define DEBUG_PCIE #ifdef DEBUG_PCIE @@ -718,6 +719,20 @@ void pcie_cap_slot_get(PCIDevice *dev, uint16_t *slt_ctl, uint16_t *slt_sta) *slt_sta = pci_get_word(exp_cap + PCI_EXP_SLTSTA); } +static const char *pcie_sltctl_pic_str(uint16_t sltctl) +{ +switch (sltctl & PCI_EXP_SLTCTL_PIC) { +case PCI_EXP_SLTCTL_PWR_IND_ON: +return "on"; +case PCI_EXP_SLTCTL_PWR_IND_BLINK: +return "blink"; +case PCI_EXP_SLTCTL_PWR_IND_OFF: +return "off"; +default: +return "?"; +} +} + void pcie_cap_slot_write_config(PCIDevice *dev, uint16_t old_slt_ctl, uint16_t old_slt_sta, uint32_t addr, uint32_t val, int len) @@ -762,6 +777,11 @@ void pcie_cap_slot_write_config(PCIDevice *dev, sltsta); } +if ((val & PCI_EXP_SLTCTL_PIC) != (old_slt_ctl & PCI_EXP_SLTCTL_PIC)) { +trace_pcie_power_indicator(pcie_sltctl_pic_str(old_slt_ctl), + pcie_sltctl_pic_str(val)); +} + /* * If the slot is populated, power indicator is off and power * controller is off, it is safe to detach the devices. 
diff --git a/hw/pci/trace-events b/hw/pci/trace-events index aaf46bc92d..ec4a5ff43d 100644 --- a/hw/pci/trace-events +++ b/hw/pci/trace-events @@ -15,3 +15,6 @@ msix_write_config(char *name, bool enabled, bool masked) "dev %s enabled %d mask sriov_register_vfs(const char *name, int slot, int function, int num_vfs) "%s %02x:%x: creating %d vf devs" sriov_unregister_vfs(const char *name, int slot, int function, int num_vfs) "%s %02x:%x: Unregistering %d vf devs" sriov_config_write(const char *name, int slot, int fun, uint32_t offset, uint32_t val, uint32_t len) "%s %02x:%x: sriov offset 0x%x val 0x%x len %d" + +# pcie.c +pcie_power_indicator(const char *old, const char *new) "%s -> %s" -- 2.34.1
[PATCH 1/4] pcie: pcie_cap_slot_write_config(): use correct macro
PCI_EXP_SLTCTL_PIC_OFF is a value, and PCI_EXP_SLTCTL_PIC is a mask. Happily PCI_EXP_SLTCTL_PIC_OFF is a maximum value for this mask and is equal to the mask itself. Still the code looks like a bug. Let's make it more reader-friendly. Signed-off-by: Vladimir Sementsov-Ogievskiy --- hw/pci/pcie.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c index 924fdabd15..82ef723983 100644 --- a/hw/pci/pcie.c +++ b/hw/pci/pcie.c @@ -770,9 +770,9 @@ void pcie_cap_slot_write_config(PCIDevice *dev, * control of powered off slots before powering them on. */ if ((sltsta & PCI_EXP_SLTSTA_PDS) && (val & PCI_EXP_SLTCTL_PCC) && -(val & PCI_EXP_SLTCTL_PIC_OFF) == PCI_EXP_SLTCTL_PIC_OFF && +(val & PCI_EXP_SLTCTL_PIC) == PCI_EXP_SLTCTL_PIC_OFF && (!(old_slt_ctl & PCI_EXP_SLTCTL_PCC) || -(old_slt_ctl & PCI_EXP_SLTCTL_PIC_OFF) != PCI_EXP_SLTCTL_PIC_OFF)) { +(old_slt_ctl & PCI_EXP_SLTCTL_PIC) != PCI_EXP_SLTCTL_PIC_OFF)) { pcie_cap_slot_do_unplug(dev); } pcie_cap_update_power(dev); -- 2.34.1
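To make the mask/value distinction concrete, here is a small standalone sketch. The short macro names are local to this example (they are not the QEMU or Linux header names); the numeric values are the usual Slot Control power indicator encoding, with the field in bits 9:8. The last function is a deliberately wrong variant, added only to show why using a value as a mask is fragile.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the Slot Control power indicator field. */
#define PIC_MASK   0x0300  /* field mask: bits 9:8 */
#define PIC_ON     0x0100
#define PIC_BLINK  0x0200
#define PIC_OFF    0x0300  /* both field bits set, so numerically equal to the mask */

/* Reader-friendly form: isolate the field with the mask, compare with a value. */
static inline bool pic_is_off(uint16_t sltctl)
{
    return (sltctl & PIC_MASK) == PIC_OFF;
}

/* Old form: the value doubles as the mask. Correct only because OFF == mask. */
static inline bool pic_is_off_old(uint16_t sltctl)
{
    return (sltctl & PIC_OFF) == PIC_OFF;
}

/* The same pattern with ON is wrong: OFF (0x0300) also has bit 8 set. */
static inline bool pic_is_on_wrong(uint16_t sltctl)
{
    return (sltctl & PIC_ON) == PIC_ON;
}
```

pic_is_off() and pic_is_off_old() agree for every input, which is why the old code was not an actual bug; pic_is_on_wrong() reports "on" for a register whose indicator is off.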
[PATCH 0/4] pcie: cleanup code and add trace point
Hi all! Here is a tiny code cleanup + one trace point to track power indicator changes (which may help to analyze the "Hot-unplug failed: guest is busy (power indicator blinking)" error message). Vladimir Sementsov-Ogievskiy (4): pcie: pcie_cap_slot_write_config(): use correct macro pcie_regs: drop duplicated indicator value macros pcie: drop unused PCIExpressIndicator pcie: add trace-point for power indicator transitions include/hw/pci/pcie.h | 8 include/hw/pci/pcie_regs.h | 14 -- hw/pci/pcie.c | 33 +++-- hw/pci/trace-events| 3 +++ 4 files changed, 30 insertions(+), 28 deletions(-) -- 2.34.1
Re: [PULL 0/1] M68k next patches
On Wed, 1 Feb 2023 at 09:54, Laurent Vivier wrote: > > The following changes since commit 13356edb87506c148b163b8c7eb0695647d00c2a: > > Merge tag 'block-pull-request' of https://gitlab.com/stefanha/qemu into > staging (2023-01-24 09:45:33 +) > > are available in the Git repository at: > > https://github.com/vivier/qemu-m68k.git tags/m68k-next-pull-request > > for you to fetch changes up to c1fc91b82545a2b8ab73f81e5b7b6b0fec292ea1: > > m68k: fix 'bkpt' instruction in softmmu mode (2023-02-01 10:18:21 +0100) > > > m68k pull request 20230201 > > fix 'bkpt' instruction in softmmu mode > > > > Laurent Vivier (1): > m68k: fix 'bkpt' instruction in softmmu mode > Applied, thanks. Please update the changelog at https://wiki.qemu.org/ChangeLog/8.0 for any user-visible changes. -- PMM
Re: pixman_blt on aarch64
This has just bounced, I hoped to still be able to post after moderation but now I'm resending it after subscribing to the pixman list. Meanwhile I've found this ticket as well: https://gitlab.freedesktop.org/pixman/pixman/-/merge_requests/71 See the rest of the message below. Looks like this is being worked on but I'm not sure how far is it from getting resolved. Any info on that? On Sat, 4 Feb 2023, BALATON Zoltan wrote: Hello, I'm trying to involve the pixman list in this thread on qemu-devel list started with subject "Display update issue on M1 Macs". See here: https://lists.nongnu.org/archive/html/qemu-devel/2023-02/msg01033.html We have found that on aarch64 Macs running macOS the pixman_blt and pixman_fill functions are disabled without fallback due to not being able to compile the needed assembly code. See detailed discussion below. Is there a way to fix this in pixman in the near future or provide a fallback for this in pixman? Or do I need to add a fallback in QEMU or try using something else instead of pixman for these functions? Thank you, BALATON Zoltan On Sat, 4 Feb 2023, Akihiko Odaki wrote: On 2023/02/03 22:45, BALATON Zoltan wrote: On Fri, 3 Feb 2023, Akihiko Odaki wrote: I finally reproduced the issue with MorphOS and ati-vga and figured out its cause. The problem is that pixman_blt() is disabled because its backend is written in GNU assembly, and GNU assembler is not available on macOS. There is no fallback written in C, unfortunately. The issue is tracked by the upstream at: https://gitlab.freedesktop.org/pixman/pixman/-/issues/59 Hm, OK but that ticket is just about compile error and suggests to disable it and does not say it won't work then. Are they aware this is a problem? Maybe we should write to their mailing list after we're sure what's happening. That's a good idea. They may prioritize the issue if they realize that disables pixman_blt(). I hit the same problem on Asahi Linux, which is based on Arch Linux ARM. 
It is because Arch Linux copied PKGBUILD from x86 Arch Linux, which disables Arm backends. It is easy to enable the backend for the platform so I proposed a change at: https://github.com/archlinuxarm/PKGBUILDs/pull/1985 On macOS one source of pixman most people use is brew.sh where this seems to be disabled: https://github.com/Homebrew/homebrew-core/blob/master/Formula/pixman.rb another source is macports which has an older version and no such options: https://github.com/macports/macports-ports/blob/master/graphics/libpixman-devel/Portfile I wonder if it compiles from macports on aarch64 then. It's more likely that it is just outdated. It does not carry a patch to fix the issue. I wait if I can get some more test results and try to check pixman but its source is not too clear to me and there are no docs either so maybe the best way is to ask on their list. If this is a pixman issue I hope it can be fixed there and we don't need to implement a fallback in QEMU. This is certainly a pixman issue. If you read the source, you can see pixman_blt() calls _pixman_implementation_blt(). _pixman_implementation_blt() calls blt member of pixman_implementation_t in turn. Grepping for "blt =" tells it is only assigned in: pixman/pixman-arm-neon.c pixman/pixman-arm-simd.c pixman/pixman-mips-dspr2.c pixman/pixman-mmx.c pixman/pixman-sse2.c For AArch64, only pixman/pixman-arm-neon.c is relevant, and it needs to be disabled to build the library on macOS. Regards, Akihiko Odaki Regards, BALATON Zoltan
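For reference, if a fallback did end up on the QEMU side, a plain C blit is not much code. The following is only a sketch under assumptions: fallback_blt is a hypothetical name, not a pixman or QEMU function, and its argument list is simplified (byte pointers, strides in bytes, a single bpp for both surfaces) rather than matching pixman_blt(). It handles overlapping copies within one buffer by choosing the row order and using memmove().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical plain-C blit (not pixman API).
 * Strides are in bytes; bpp is bits per pixel and must be a multiple of 8.
 * Returns false for arguments it does not support, so a caller can notice
 * that nothing was copied.
 */
static inline bool fallback_blt(uint8_t *src, uint8_t *dst,
                                size_t src_stride, size_t dst_stride,
                                int bpp,
                                int src_x, int src_y,
                                int dst_x, int dst_y,
                                int width, int height)
{
    assert(src != NULL && dst != NULL);

    if (bpp <= 0 || bpp % 8 != 0 || width < 0 || height < 0 ||
        src_x < 0 || src_y < 0 || dst_x < 0 || dst_y < 0) {
        return false;
    }

    size_t bytes_pp = (size_t)bpp / 8;
    size_t row_bytes = (size_t)width * bytes_pp;

    for (int y = 0; y < height; y++) {
        /*
         * When moving a region downwards inside the same buffer, copy the
         * bottom row first so that source rows are read before they are
         * overwritten. memmove() covers overlap within a single row.
         */
        int row = (src == dst && dst_y > src_y) ? height - 1 - y : y;
        uint8_t *s = src + (size_t)(src_y + row) * src_stride
                         + (size_t)src_x * bytes_pp;
        uint8_t *d = dst + (size_t)(dst_y + row) * dst_stride
                         + (size_t)dst_x * bytes_pp;
        memmove(d, s, row_bytes);
    }
    return true;
}
```

This does no clipping against the surface size; a real fallback would have to validate the rectangles the same way the callers of pixman_blt() already do.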
[PULL 10/40] include/qemu/int128: Use Int128 structure for TCI
We are about to allow passing Int128 to/from tcg helper functions, but libffi doesn't support __int128_t, so use the structure. In order for atomic128.h to continue working, we must provide a mechanism to frob between real __int128_t and the structure. Provide a new union, Int128Alias, for this. We cannot modify Int128 itself, as any changed alignment would also break libffi. Reviewed-by: Alex Bennée Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- include/qemu/atomic128.h | 29 +-- include/qemu/int128.h| 25 +--- util/int128.c| 42 3 files changed, 87 insertions(+), 9 deletions(-) diff --git a/include/qemu/atomic128.h b/include/qemu/atomic128.h index adb9a1a260..d0ba0b9c65 100644 --- a/include/qemu/atomic128.h +++ b/include/qemu/atomic128.h @@ -44,13 +44,23 @@ #if defined(CONFIG_ATOMIC128) static inline Int128 atomic16_cmpxchg(Int128 *ptr, Int128 cmp, Int128 new) { -return qatomic_cmpxchg__nocheck(ptr, cmp, new); +Int128Alias r, c, n; + +c.s = cmp; +n.s = new; +r.i = qatomic_cmpxchg__nocheck((__int128_t *)ptr, c.i, n.i); +return r.s; } # define HAVE_CMPXCHG128 1 #elif defined(CONFIG_CMPXCHG128) static inline Int128 atomic16_cmpxchg(Int128 *ptr, Int128 cmp, Int128 new) { -return __sync_val_compare_and_swap_16(ptr, cmp, new); +Int128Alias r, c, n; + +c.s = cmp; +n.s = new; +r.i = __sync_val_compare_and_swap_16((__int128_t *)ptr, c.i, n.i); +return r.s; } # define HAVE_CMPXCHG128 1 #elif defined(__aarch64__) @@ -89,12 +99,18 @@ Int128 QEMU_ERROR("unsupported atomic") #if defined(CONFIG_ATOMIC128) static inline Int128 atomic16_read(Int128 *ptr) { -return qatomic_read__nocheck(ptr); +Int128Alias r; + +r.i = qatomic_read__nocheck((__int128_t *)ptr); +return r.s; } static inline void atomic16_set(Int128 *ptr, Int128 val) { -qatomic_set__nocheck(ptr, val); +Int128Alias v; + +v.s = val; +qatomic_set__nocheck((__int128_t *)ptr, v.i); } # define HAVE_ATOMIC128 1 @@ -132,7 +148,8 @@ static inline void atomic16_set(Int128 *ptr, Int128 val) static inline 
Int128 atomic16_read(Int128 *ptr) { /* Maybe replace 0 with 0, returning the old value. */ -return atomic16_cmpxchg(ptr, 0, 0); +Int128 z = int128_make64(0); +return atomic16_cmpxchg(ptr, z, z); } static inline void atomic16_set(Int128 *ptr, Int128 val) @@ -141,7 +158,7 @@ static inline void atomic16_set(Int128 *ptr, Int128 val) do { cmp = old; old = atomic16_cmpxchg(ptr, cmp, val); -} while (old != cmp); +} while (int128_ne(old, cmp)); } # define HAVE_ATOMIC128 1 diff --git a/include/qemu/int128.h b/include/qemu/int128.h index d2b76ca6ac..f62a46b48c 100644 --- a/include/qemu/int128.h +++ b/include/qemu/int128.h @@ -3,7 +3,12 @@ #include "qemu/bswap.h" -#ifdef CONFIG_INT128 +/* + * With TCI, we need to use libffi for interfacing with TCG helpers. + * But libffi does not support __int128_t, and therefore cannot pass + * or return values of this type, force use of the Int128 struct. + */ +#if defined(CONFIG_INT128) && !defined(CONFIG_TCG_INTERPRETER) typedef __int128_t Int128; static inline Int128 int128_make64(uint64_t a) @@ -460,8 +465,7 @@ Int128 int128_divu(Int128, Int128); Int128 int128_remu(Int128, Int128); Int128 int128_divs(Int128, Int128); Int128 int128_rems(Int128, Int128); - -#endif /* CONFIG_INT128 */ +#endif /* CONFIG_INT128 && !CONFIG_TCG_INTERPRETER */ static inline void bswap128s(Int128 *s) { @@ -472,4 +476,19 @@ static inline void bswap128s(Int128 *s) #define INT128_MAX int128_make128(UINT64_MAX, INT64_MAX) #define INT128_MIN int128_make128(0, INT64_MIN) +/* + * When compiler supports a 128-bit type, define a combination of + * a possible structure and the native types. Ease parameter passing + * via use of the transparent union extension. 
+ */ +#ifdef CONFIG_INT128 +typedef union { +Int128 s; +__int128_t i; +__uint128_t u; +} Int128Alias __attribute__((transparent_union)); +#else +typedef Int128 Int128Alias; +#endif /* CONFIG_INT128 */ + #endif /* INT128_H */ diff --git a/util/int128.c b/util/int128.c index ed8f25fef1..df6c6331bd 100644 --- a/util/int128.c +++ b/util/int128.c @@ -144,4 +144,46 @@ Int128 int128_rems(Int128 a, Int128 b) return r; } +#elif defined(CONFIG_TCG_INTERPRETER) + +Int128 int128_divu(Int128 a_s, Int128 b_s) +{ +Int128Alias r, a, b; + +a.s = a_s; +b.s = b_s; +r.u = a.u / b.u; +return r.s; +} + +Int128 int128_remu(Int128 a_s, Int128 b_s) +{ +Int128Alias r, a, b; + +a.s = a_s; +b.s = b_s; +r.u = a.u % b.u; +return r.s; +} + +Int128 int128_divs(Int128 a_s, Int128 b_s) +{ +Int128Alias r, a, b; + +a.s = a_s; +b.s = b_s; +r.i = a.i / b.i; +return r.s; +} + +Int128 int128_rems(Int128 a_s, Int128 b_s) +{ +Int128Alias r, a, b; + +a.s = a_s; +
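The conversion through the union can be tried outside of QEMU. The sketch below is an illustration only: Int128S, Int128AliasSketch, sketch_make128 and sketch_divu are names made up for this example, the struct merely imitates the two-word layout of the structure form of Int128, and the transparent_union attribute is left out. It assumes a compiler that provides __int128_t (GCC or Clang on a 64-bit host).

```c
#include <assert.h>
#include <stdint.h>

/* Two-word stand-in for the struct form; field order follows host endianness. */
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
typedef struct { int64_t hi; uint64_t lo; } Int128S;
#else
typedef struct { uint64_t lo; int64_t hi; } Int128S;
#endif

/* Same idea as Int128Alias: one storage location, three views. */
typedef union {
    Int128S s;
    __int128_t i;
    __uint128_t u;
} Int128AliasSketch;

static inline Int128S sketch_make128(uint64_t lo, uint64_t hi)
{
    Int128S v = { .lo = lo, .hi = (int64_t)hi };
    return v;
}

/* Unsigned division done on the native type, passed around as the struct. */
static inline Int128S sketch_divu(Int128S a_s, Int128S b_s)
{
    Int128AliasSketch r, a, b;

    a.s = a_s;
    b.s = b_s;
    assert(b.u != 0);
    r.u = a.u / b.u;
    return r.s;
}
```

The struct keeps its 8-byte alignment, while the union picks up the alignment of the native type; that matches the remark in the commit message that Int128 itself must not change.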
[PULL 36/40] target/s390x: Implement CC_OP_NZ in gen_op_calc_cc
This case is trivial to implement inline. Reviewed-by: David Hildenbrand Signed-off-by: Richard Henderson --- target/s390x/tcg/translate.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c index 9ea28b3e52..ac5bd98f04 100644 --- a/target/s390x/tcg/translate.c +++ b/target/s390x/tcg/translate.c @@ -625,6 +625,9 @@ static void gen_op_calc_cc(DisasContext *s) /* env->cc_op already is the cc value */ break; case CC_OP_NZ: +tcg_gen_setcondi_i64(TCG_COND_NE, cc_dst, cc_dst, 0); +tcg_gen_extrl_i64_i32(cc_op, cc_dst); +break; case CC_OP_ABS_64: case CC_OP_NABS_64: case CC_OP_ABS_32: -- 2.34.1
[PULL 33/40] target/s390x: Use Int128 for returning float128
Acked-by: David Hildenbrand Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- v2: Remove extraneous return_low128. --- target/s390x/helper.h| 22 +++--- target/s390x/tcg/insn-data.h.inc | 20 ++--- target/s390x/tcg/fpu_helper.c| 29 +- target/s390x/tcg/translate.c | 51 +--- 4 files changed, 63 insertions(+), 59 deletions(-) diff --git a/target/s390x/helper.h b/target/s390x/helper.h index b4170a4256..d40aeb471f 100644 --- a/target/s390x/helper.h +++ b/target/s390x/helper.h @@ -31,32 +31,32 @@ DEF_HELPER_4(clcle, i32, env, i32, i64, i32) DEF_HELPER_4(clclu, i32, env, i32, i64, i32) DEF_HELPER_3(cegb, i64, env, s64, i32) DEF_HELPER_3(cdgb, i64, env, s64, i32) -DEF_HELPER_3(cxgb, i64, env, s64, i32) +DEF_HELPER_3(cxgb, i128, env, s64, i32) DEF_HELPER_3(celgb, i64, env, i64, i32) DEF_HELPER_3(cdlgb, i64, env, i64, i32) -DEF_HELPER_3(cxlgb, i64, env, i64, i32) +DEF_HELPER_3(cxlgb, i128, env, i64, i32) DEF_HELPER_4(cdsg, void, env, i64, i32, i32) DEF_HELPER_4(cdsg_parallel, void, env, i64, i32, i32) DEF_HELPER_4(csst, i32, env, i32, i64, i64) DEF_HELPER_4(csst_parallel, i32, env, i32, i64, i64) DEF_HELPER_FLAGS_3(aeb, TCG_CALL_NO_WG, i64, env, i64, i64) DEF_HELPER_FLAGS_3(adb, TCG_CALL_NO_WG, i64, env, i64, i64) -DEF_HELPER_FLAGS_5(axb, TCG_CALL_NO_WG, i64, env, i64, i64, i64, i64) +DEF_HELPER_FLAGS_5(axb, TCG_CALL_NO_WG, i128, env, i64, i64, i64, i64) DEF_HELPER_FLAGS_3(seb, TCG_CALL_NO_WG, i64, env, i64, i64) DEF_HELPER_FLAGS_3(sdb, TCG_CALL_NO_WG, i64, env, i64, i64) -DEF_HELPER_FLAGS_5(sxb, TCG_CALL_NO_WG, i64, env, i64, i64, i64, i64) +DEF_HELPER_FLAGS_5(sxb, TCG_CALL_NO_WG, i128, env, i64, i64, i64, i64) DEF_HELPER_FLAGS_3(deb, TCG_CALL_NO_WG, i64, env, i64, i64) DEF_HELPER_FLAGS_3(ddb, TCG_CALL_NO_WG, i64, env, i64, i64) -DEF_HELPER_FLAGS_5(dxb, TCG_CALL_NO_WG, i64, env, i64, i64, i64, i64) +DEF_HELPER_FLAGS_5(dxb, TCG_CALL_NO_WG, i128, env, i64, i64, i64, i64) DEF_HELPER_FLAGS_3(meeb, TCG_CALL_NO_WG, i64, env, i64, i64) DEF_HELPER_FLAGS_3(mdeb, 
TCG_CALL_NO_WG, i64, env, i64, i64) DEF_HELPER_FLAGS_3(mdb, TCG_CALL_NO_WG, i64, env, i64, i64) -DEF_HELPER_FLAGS_5(mxb, TCG_CALL_NO_WG, i64, env, i64, i64, i64, i64) -DEF_HELPER_FLAGS_4(mxdb, TCG_CALL_NO_WG, i64, env, i64, i64, i64) +DEF_HELPER_FLAGS_5(mxb, TCG_CALL_NO_WG, i128, env, i64, i64, i64, i64) +DEF_HELPER_FLAGS_4(mxdb, TCG_CALL_NO_WG, i128, env, i64, i64, i64) DEF_HELPER_FLAGS_2(ldeb, TCG_CALL_NO_WG, i64, env, i64) DEF_HELPER_FLAGS_4(ldxb, TCG_CALL_NO_WG, i64, env, i64, i64, i32) -DEF_HELPER_FLAGS_2(lxdb, TCG_CALL_NO_WG, i64, env, i64) -DEF_HELPER_FLAGS_2(lxeb, TCG_CALL_NO_WG, i64, env, i64) +DEF_HELPER_FLAGS_2(lxdb, TCG_CALL_NO_WG, i128, env, i64) +DEF_HELPER_FLAGS_2(lxeb, TCG_CALL_NO_WG, i128, env, i64) DEF_HELPER_FLAGS_3(ledb, TCG_CALL_NO_WG, i64, env, i64, i32) DEF_HELPER_FLAGS_4(lexb, TCG_CALL_NO_WG, i64, env, i64, i64, i32) DEF_HELPER_FLAGS_3(ceb, TCG_CALL_NO_WG_SE, i32, env, i64, i64) @@ -79,7 +79,7 @@ DEF_HELPER_3(clfdb, i64, env, i64, i32) DEF_HELPER_4(clfxb, i64, env, i64, i64, i32) DEF_HELPER_FLAGS_3(fieb, TCG_CALL_NO_WG, i64, env, i64, i32) DEF_HELPER_FLAGS_3(fidb, TCG_CALL_NO_WG, i64, env, i64, i32) -DEF_HELPER_FLAGS_4(fixb, TCG_CALL_NO_WG, i64, env, i64, i64, i32) +DEF_HELPER_FLAGS_4(fixb, TCG_CALL_NO_WG, i128, env, i64, i64, i32) DEF_HELPER_FLAGS_4(maeb, TCG_CALL_NO_WG, i64, env, i64, i64, i64) DEF_HELPER_FLAGS_4(madb, TCG_CALL_NO_WG, i64, env, i64, i64, i64) DEF_HELPER_FLAGS_4(mseb, TCG_CALL_NO_WG, i64, env, i64, i64, i64) @@ -89,7 +89,7 @@ DEF_HELPER_FLAGS_3(tcdb, TCG_CALL_NO_RWG_SE, i32, env, i64, i64) DEF_HELPER_FLAGS_4(tcxb, TCG_CALL_NO_RWG_SE, i32, env, i64, i64, i64) DEF_HELPER_FLAGS_2(sqeb, TCG_CALL_NO_WG, i64, env, i64) DEF_HELPER_FLAGS_2(sqdb, TCG_CALL_NO_WG, i64, env, i64) -DEF_HELPER_FLAGS_3(sqxb, TCG_CALL_NO_WG, i64, env, i64, i64) +DEF_HELPER_FLAGS_3(sqxb, TCG_CALL_NO_WG, i128, env, i64, i64) DEF_HELPER_FLAGS_1(cvd, TCG_CALL_NO_RWG_SE, i64, s32) DEF_HELPER_FLAGS_4(pack, TCG_CALL_NO_WG, void, env, i32, i64, i64) 
DEF_HELPER_FLAGS_4(pka, TCG_CALL_NO_WG, void, env, i64, i64, i32) diff --git a/target/s390x/tcg/insn-data.h.inc b/target/s390x/tcg/insn-data.h.inc index d0814cb218..517a4500ae 100644 --- a/target/s390x/tcg/insn-data.h.inc +++ b/target/s390x/tcg/insn-data.h.inc @@ -306,10 +306,10 @@ /* CONVERT FROM FIXED */ F(0xb394, CEFBR, RRF_e, Z, 0, r2_32s, new, e1, cegb, 0, IF_BFP) F(0xb395, CDFBR, RRF_e, Z, 0, r2_32s, new, f1, cdgb, 0, IF_BFP) -F(0xb396, CXFBR, RRF_e, Z, 0, r2_32s, new_P, x1, cxgb, 0, IF_BFP) +F(0xb396, CXFBR, RRF_e, Z, 0, r2_32s, new_x, x1, cxgb, 0, IF_BFP) F(0xb3a4, CEGBR, RRF_e, Z, 0, r2_o, new, e1, cegb, 0, IF_BFP) F(0xb3a5, CDGBR, RRF_e, Z, 0, r2_o, new, f1, cdgb, 0, IF_BFP) -F(0xb3a6, CXGBR, RRF_e, Z, 0, r2_o, new_P, x1, cxgb, 0, IF_BFP) +F(0xb3a6, CXGBR, RRF_e, Z, 0, r2_o, new_x, x1, cxgb, 0, IF_B
[PULL 01/40] accel/tcg: Test CPUJumpCache in tb_jmp_cache_clear_page
From: Eric Auger After commit 4e4fa6c12d ("accel/tcg: Complete cpu initialization before registration"), it looks like the CPUJumpCache pointer can be NULL. This causes a SIGSEGV when running the debug-wp-migration kvm unit test. In the first place it should be clarified why this TCG code is called with KVM acceleration. This may hide another bug. Fixes: 4e4fa6c12d ("accel/tcg: Complete cpu initialization before registration") Signed-off-by: Eric Auger Message-Id: <20230203171510.2867451-1-eric.au...@redhat.com> Signed-off-by: Richard Henderson --- accel/tcg/cputlb.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c index 4e040a1cb9..04e270742e 100644 --- a/accel/tcg/cputlb.c +++ b/accel/tcg/cputlb.c @@ -100,9 +100,14 @@ static void tlb_window_reset(CPUTLBDesc *desc, int64_t ns, static void tb_jmp_cache_clear_page(CPUState *cpu, target_ulong page_addr) { -int i, i0 = tb_jmp_cache_hash_page(page_addr); CPUJumpCache *jc = cpu->tb_jmp_cache; +int i, i0; +if (unlikely(!jc)) { +return; +} + +i0 = tb_jmp_cache_hash_page(page_addr); for (i = 0; i < TB_JMP_PAGE_SIZE; i++) { qatomic_set(&jc->array[i0 + i].tb, NULL); } -- 2.34.1
[PULL 30/40] target/s390x: Use Int128 for return from CKSM
Acked-by: Ilya Leoshkevich Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- target/s390x/helper.h | 2 +- target/s390x/tcg/mem_helper.c | 7 +++ target/s390x/tcg/translate.c | 6 -- 3 files changed, 8 insertions(+), 7 deletions(-) diff --git a/target/s390x/helper.h b/target/s390x/helper.h index 25c2dd0b3c..03b29efa3e 100644 --- a/target/s390x/helper.h +++ b/target/s390x/helper.h @@ -103,7 +103,7 @@ DEF_HELPER_4(tre, i64, env, i64, i64, i64) DEF_HELPER_4(trt, i32, env, i32, i64, i64) DEF_HELPER_4(trtr, i32, env, i32, i64, i64) DEF_HELPER_5(trXX, i32, env, i32, i32, i32, i32) -DEF_HELPER_4(cksm, i64, env, i64, i64, i64) +DEF_HELPER_4(cksm, i128, env, i64, i64, i64) DEF_HELPER_FLAGS_5(calc_cc, TCG_CALL_NO_RWG_SE, i32, env, i32, i64, i64, i64) DEF_HELPER_FLAGS_2(sfpc, TCG_CALL_NO_WG, void, env, i64) DEF_HELPER_FLAGS_2(sfas, TCG_CALL_NO_WG, void, env, i64) diff --git a/target/s390x/tcg/mem_helper.c b/target/s390x/tcg/mem_helper.c index 9be42851d8..b0b403e23a 100644 --- a/target/s390x/tcg/mem_helper.c +++ b/target/s390x/tcg/mem_helper.c @@ -1350,8 +1350,8 @@ uint32_t HELPER(clclu)(CPUS390XState *env, uint32_t r1, uint64_t a2, } /* checksum */ -uint64_t HELPER(cksm)(CPUS390XState *env, uint64_t r1, - uint64_t src, uint64_t src_len) +Int128 HELPER(cksm)(CPUS390XState *env, uint64_t r1, +uint64_t src, uint64_t src_len) { uintptr_t ra = GETPC(); uint64_t max_len, len; @@ -1392,8 +1392,7 @@ uint64_t HELPER(cksm)(CPUS390XState *env, uint64_t r1, env->cc_op = (len == src_len ? 0 : 3); /* Return both cksm and processed length. 
*/ -env->retxl = cksm; -return len; +return int128_make128(cksm, len); } void HELPER(pack)(CPUS390XState *env, uint32_t len, uint64_t dest, uint64_t src) diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c index 8397fe2bd8..1a7aa9e4ae 100644 --- a/target/s390x/tcg/translate.c +++ b/target/s390x/tcg/translate.c @@ -2041,11 +2041,13 @@ static DisasJumpType op_cxlgb(DisasContext *s, DisasOps *o) static DisasJumpType op_cksm(DisasContext *s, DisasOps *o) { int r2 = get_field(s, r2); +TCGv_i128 pair = tcg_temp_new_i128(); TCGv_i64 len = tcg_temp_new_i64(); -gen_helper_cksm(len, cpu_env, o->in1, o->in2, regs[r2 + 1]); +gen_helper_cksm(pair, cpu_env, o->in1, o->in2, regs[r2 + 1]); set_cc_static(s); -return_low128(o->out); +tcg_gen_extr_i128_i64(o->out, len, pair); +tcg_temp_free_i128(pair); tcg_gen_add_i64(regs[r2], regs[r2], len); tcg_gen_sub_i64(regs[r2 + 1], regs[r2 + 1], len); -- 2.34.1
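The shape of this change, returning both results in one 128-bit value instead of one return value plus the env->retxl side channel, can be shown with a toy helper. Everything below is a stand-in: Pair128, make128, extr128 and toy_cksm are made-up names, and the byte sum is not the real CKSM algorithm. Only the low/high ordering is meant to match the patch, where int128_make128(cksm, len) puts the checksum in the low half and the processed length in the high half.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Two 64-bit halves standing in for a 128-bit return value. */
typedef struct { uint64_t lo, hi; } Pair128;

static inline Pair128 make128(uint64_t lo, uint64_t hi)
{
    Pair128 p = { lo, hi };
    return p;
}

/* Split the pair back into its halves, low half first. */
static inline void extr128(uint64_t *lo, uint64_t *hi, Pair128 p)
{
    *lo = p.lo;
    *hi = p.hi;
}

/*
 * Toy helper: sums at most max_len bytes of src and returns
 * (sum, number of bytes processed) as one value.
 */
static inline Pair128 toy_cksm(const uint8_t *src, uint64_t src_len,
                               uint64_t max_len)
{
    assert(src != NULL || src_len == 0);

    uint64_t len = src_len < max_len ? src_len : max_len;
    uint64_t sum = 0;

    for (uint64_t i = 0; i < len; i++) {
        sum += src[i];
    }
    return make128(sum, len);
}
```

The caller unpacks both halves at the call site, so no state has to survive in a global structure between the helper returning and the second value being read.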
[PULL 00/40] tcg patch queue
The following changes since commit 579510e196a544b42bd8bca9cc61688d4d1211ac: Merge tag 'pull-monitor-2023-02-03-v2' of https://repo.or.cz/qemu/armbru into staging (2023-02-04 10:19:55 +) are available in the Git repository at: https://gitlab.com/rth7680/qemu.git tags/pull-tcg-20230204 for you to fetch changes up to a2495ede07498ee36b18b03e7038ba30c9871bb2: tcg/aarch64: Fix patching of LDR in tb_target_set_jmp_target (2023-02-04 06:19:43 -1000) tcg: Add support for TCGv_i128 in parameters and returns. tcg: Add support for TCGv_i128 in cmpxchg. tcg: Test CPUJumpCache in tb_jmp_cache_clear_page tcg: Split out tcg_gen_nonatomic_cmpxchg_i{32,64} tcg/aarch64: Fix patching of LDR in tb_target_set_jmp_target target/arm: Use tcg_gen_atomic_cmpxchg_i128 target/i386: Use tcg_gen_atomic_cmpxchg_i128 target/i386: Use tcg_gen_nonatomic_cmpxchg_i{32,64} target/s390x: Use tcg_gen_atomic_cmpxchg_i128 target/s390x: Use TCGv_i128 in passing and returning float128 target/s390x: Implement CC_OP_NZ in gen_op_calc_cc Eric Auger (1): accel/tcg: Test CPUJumpCache in tb_jmp_cache_clear_page Ilya Leoshkevich (3): tests/tcg/s390x: Add div.c tests/tcg/s390x: Add clst.c tests/tcg/s390x: Add cdsg.c Richard Henderson (36): tcg: Init temp_subindex in liveness_pass_2 tcg: Define TCG_TYPE_I128 and related helper macros tcg: Handle dh_typecode_i128 with TCG_CALL_{RET,ARG}_NORMAL tcg: Allocate objects contiguously in temp_allocate_frame tcg: Introduce tcg_out_addi_ptr tcg: Add TCG_CALL_{RET,ARG}_BY_REF tcg: Introduce tcg_target_call_oarg_reg tcg: Add TCG_CALL_RET_BY_VEC include/qemu/int128: Use Int128 structure for TCI tcg/i386: Add TCG_TARGET_CALL_{RET,ARG}_I128 tcg/tci: Fix big-endian return register ordering tcg/tci: Add TCG_TARGET_CALL_{RET,ARG}_I128 tcg: Add TCG_TARGET_CALL_{RET,ARG}_I128 tcg: Add temp allocation for TCGv_i128 tcg: Add basic data movement for TCGv_i128 tcg: Add guest load/store primitives for TCGv_i128 tcg: Add tcg_gen_{non}atomic_cmpxchg_i128 tcg: Split out 
tcg_gen_nonatomic_cmpxchg_i{32,64} target/arm: Use tcg_gen_atomic_cmpxchg_i128 for STXP target/arm: Use tcg_gen_atomic_cmpxchg_i128 for CASP target/ppc: Use tcg_gen_atomic_cmpxchg_i128 for STQCX tests/tcg/s390x: Add long-double.c target/s390x: Use a single return for helper_divs32/u32 target/s390x: Use a single return for helper_divs64/u64 target/s390x: Use Int128 for return from CLST target/s390x: Use Int128 for return from CKSM target/s390x: Use Int128 for return from TRE target/s390x: Copy wout_x1 to wout_x1_P target/s390x: Use Int128 for returning float128 target/s390x: Use Int128 for passing float128 target/s390x: Use tcg_gen_atomic_cmpxchg_i128 for CDSG target/s390x: Implement CC_OP_NZ in gen_op_calc_cc target/i386: Split out gen_cmpxchg8b, gen_cmpxchg16b target/i386: Inline cmpxchg8b target/i386: Inline cmpxchg16b tcg/aarch64: Fix patching of LDR in tb_target_set_jmp_target accel/tcg/tcg-runtime.h | 11 ++ include/exec/cpu_ldst.h | 10 + include/exec/helper-head.h | 7 + include/qemu/atomic128.h | 29 ++- include/qemu/int128.h| 25 ++- include/tcg/tcg-op.h | 15 ++ include/tcg/tcg.h| 49 - target/arm/helper-a64.h | 8 - target/i386/helper.h | 6 - target/ppc/helper.h | 2 - target/s390x/helper.h| 54 +++--- tcg/aarch64/tcg-target.h | 2 + tcg/arm/tcg-target.h | 2 + tcg/i386/tcg-target.h| 10 + tcg/loongarch64/tcg-target.h | 2 + tcg/mips/tcg-target.h| 2 + tcg/riscv/tcg-target.h | 3 + tcg/s390x/tcg-target.h | 2 + tcg/sparc64/tcg-target.h | 2 + tcg/tcg-internal.h | 17 ++ tcg/tci/tcg-target.h | 3 + target/s390x/tcg/insn-data.h.inc | 60 +++--- accel/tcg/cputlb.c | 119 +++- accel/tcg/user-exec.c| 66 +++ target/arm/helper-a64.c | 147 --- target/arm/translate-a64.c | 121 ++-- target/i386/tcg/mem_helper.c | 126 - target/i386/tcg/translate.c | 126 +++-- target/ppc/mem_helper.c | 44 - target/ppc/translate.c | 102 +- target/s390x/tcg/fpu_helper.c| 103 +- target/s390x/tcg/int_helper.c| 64 +++ target/s390x/tcg/mem_helper.c| 77 +--- target/s390x/tcg/translate.c | 212 ++--- tcg/tcg-op.c 
| 393 +-- tcg/tcg.c| 308 ++
[PULL 02/40] tcg: Init temp_subindex in liveness_pass_2
Correctly handle large types while lowering. Fixes: fac87bd2a49b ("tcg: Add temp_subindex to TCGTemp") Signed-off-by: Richard Henderson --- tcg/tcg.c | 1 + 1 file changed, 1 insertion(+) diff --git a/tcg/tcg.c b/tcg/tcg.c index fd557d55d3..bc60fd0fe8 100644 --- a/tcg/tcg.c +++ b/tcg/tcg.c @@ -3063,6 +3063,7 @@ static bool liveness_pass_2(TCGContext *s) TCGTemp *dts = tcg_temp_alloc(s); dts->type = its->type; dts->base_type = its->base_type; +dts->temp_subindex = its->temp_subindex; dts->kind = TEMP_EBB; its->state_ptr = dts; } else { -- 2.34.1
[PULL 13/40] tcg/tci: Add TCG_TARGET_CALL_{RET,ARG}_I128
Fill in the parameters for libffi for Int128. Adjust the interpreter to allow for 16-byte return values. Adjust tcg_out_call to record the return value length. Call parameters are no longer all the same size, so we cannot reuse the same call_slots array for every function. Compute it each time now, but only fill in slots required for the call we're about to make. Reviewed-by: Alex Bennée Signed-off-by: Richard Henderson --- tcg/tci/tcg-target.h | 3 +++ tcg/tcg.c| 19 + tcg/tci.c| 44 tcg/tci/tcg-target.c.inc | 10 - 4 files changed, 49 insertions(+), 27 deletions(-) diff --git a/tcg/tci/tcg-target.h b/tcg/tci/tcg-target.h index 1414ab4d5b..7140a76a73 100644 --- a/tcg/tci/tcg-target.h +++ b/tcg/tci/tcg-target.h @@ -160,10 +160,13 @@ typedef enum { #if TCG_TARGET_REG_BITS == 32 # define TCG_TARGET_CALL_ARG_I32TCG_CALL_ARG_EVEN # define TCG_TARGET_CALL_ARG_I64TCG_CALL_ARG_EVEN +# define TCG_TARGET_CALL_ARG_I128 TCG_CALL_ARG_EVEN #else # define TCG_TARGET_CALL_ARG_I32TCG_CALL_ARG_NORMAL # define TCG_TARGET_CALL_ARG_I64TCG_CALL_ARG_NORMAL +# define TCG_TARGET_CALL_ARG_I128 TCG_CALL_ARG_NORMAL #endif +#define TCG_TARGET_CALL_RET_I128TCG_CALL_RET_NORMAL #define HAVE_TCG_QEMU_TB_EXEC #define TCG_TARGET_NEED_POOL_LABELS diff --git a/tcg/tcg.c b/tcg/tcg.c index 098be83b00..865ed5ea0f 100644 --- a/tcg/tcg.c +++ b/tcg/tcg.c @@ -570,6 +570,22 @@ static GHashTable *helper_table; #ifdef CONFIG_TCG_INTERPRETER static ffi_type *typecode_to_ffi(int argmask) { +/* + * libffi does not support __int128_t, so we have forced Int128 + * to use the structure definition instead of the builtin type. 
+ */ +static ffi_type *ffi_type_i128_elements[3] = { +&ffi_type_uint64, +&ffi_type_uint64, +NULL +}; +static ffi_type ffi_type_i128 = { +.size = 16, +.alignment = __alignof__(Int128), +.type = FFI_TYPE_STRUCT, +.elements = ffi_type_i128_elements, +}; + switch (argmask) { case dh_typecode_void: return &ffi_type_void; @@ -583,6 +599,8 @@ static ffi_type *typecode_to_ffi(int argmask) return &ffi_type_sint64; case dh_typecode_ptr: return &ffi_type_pointer; +case dh_typecode_i128: +return &ffi_type_i128; } g_assert_not_reached(); } @@ -613,6 +631,7 @@ static void init_ffi_layouts(void) /* Ignoring the return type, find the last non-zero field. */ nargs = 32 - clz32(typemask >> 3); nargs = DIV_ROUND_UP(nargs, 3); +assert(nargs <= MAX_CALL_IARGS); ca = g_malloc0(sizeof(*ca) + nargs * sizeof(ffi_type *)); ca->cif.rtype = typecode_to_ffi(typemask & 7); diff --git a/tcg/tci.c b/tcg/tci.c index eeccdde8bc..022fe9d0f8 100644 --- a/tcg/tci.c +++ b/tcg/tci.c @@ -470,12 +470,9 @@ uintptr_t QEMU_DISABLE_CFI tcg_qemu_tb_exec(CPUArchState *env, tcg_target_ulong regs[TCG_TARGET_NB_REGS]; uint64_t stack[(TCG_STATIC_CALL_ARGS_SIZE + TCG_STATIC_FRAME_SIZE) / sizeof(uint64_t)]; -void *call_slots[TCG_STATIC_CALL_ARGS_SIZE / sizeof(uint64_t)]; regs[TCG_AREG0] = (tcg_target_ulong)env; regs[TCG_REG_CALL_STACK] = (uintptr_t)stack; -/* Other call_slots entries initialized at first use (see below). */ -call_slots[0] = NULL; tci_assert(tb_ptr); for (;;) { @@ -498,26 +495,26 @@ uintptr_t QEMU_DISABLE_CFI tcg_qemu_tb_exec(CPUArchState *env, switch (opc) { case INDEX_op_call: -/* - * Set up the ffi_avalue array once, delayed until now - * because many TB's do not make any calls. In tcg_gen_callN, - * we arranged for every real argument to be "left-aligned" - * in each 64-bit slot. 
- */ -if (unlikely(call_slots[0] == NULL)) { -for (int i = 0; i < ARRAY_SIZE(call_slots); ++i) { -call_slots[i] = &stack[i]; -} -} - -tci_args_nl(insn, tb_ptr, &len, &ptr); - -/* Helper functions may need to access the "return address" */ -tci_tb_ptr = (uintptr_t)tb_ptr; - { -void **pptr = ptr; -ffi_call(pptr[1], pptr[0], stack, call_slots); +void *call_slots[MAX_CALL_IARGS]; +ffi_cif *cif; +void *func; +unsigned i, s, n; + +tci_args_nl(insn, tb_ptr, &len, &ptr); +func = ((void **)ptr)[0]; +cif = ((void **)ptr)[1]; + +n = cif->nargs; +for (i = s = 0; i < n; ++i) { +ffi_type *t = cif->arg_types[i]; +call_slots[i] = &stack[s]; +s += DIV_ROUND_UP(t->size, 8); +} + +/* Helper func
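The per-call slot computation from the new INDEX_op_call code can be isolated from libffi. In this sketch the argument sizes come from a plain array instead of cif->arg_types[i]->size, and assign_call_slots and SKETCH_MAX_ARGS are names invented for the example; the loop itself follows the one in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_ARGS 7  /* assumed limit, for this example only */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Point slots[i] at the first 64-bit stack word of argument i and
 * return the total number of stack words consumed. An argument of
 * up to 8 bytes takes one word, a 16-byte argument takes two.
 */
static inline unsigned assign_call_slots(void **slots, uint64_t *stack,
                                         const size_t *arg_sizes,
                                         unsigned nargs)
{
    unsigned i, s;

    assert(nargs <= SKETCH_MAX_ARGS);
    for (i = s = 0; i < nargs; ++i) {
        slots[i] = &stack[s];
        s += (unsigned)DIV_ROUND_UP(arg_sizes[i], 8);
    }
    return s;
}
```

With argument sizes 8, 16 and 4 the slots land on stack words 0, 1 and 3, which is why a fixed one-slot-per-word table no longer works once a 16-byte argument appears.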
[PULL 31/40] target/s390x: Use Int128 for return from TRE
Acked-by: Ilya Leoshkevich
Reviewed-by: Philippe Mathieu-Daudé
Signed-off-by: Richard Henderson
---
 target/s390x/helper.h | 2 +-
 target/s390x/tcg/mem_helper.c | 7 +++
 target/s390x/tcg/translate.c | 7 +--
 3 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/target/s390x/helper.h b/target/s390x/helper.h
index 03b29efa3e..b4170a4256 100644
--- a/target/s390x/helper.h
+++ b/target/s390x/helper.h
@@ -99,7 +99,7 @@ DEF_HELPER_FLAGS_4(unpka, TCG_CALL_NO_WG, i32, env, i64, i32, i64)
 DEF_HELPER_FLAGS_4(unpku, TCG_CALL_NO_WG, i32, env, i64, i32, i64)
 DEF_HELPER_FLAGS_3(tp, TCG_CALL_NO_WG, i32, env, i64, i32)
 DEF_HELPER_FLAGS_4(tr, TCG_CALL_NO_WG, void, env, i32, i64, i64)
-DEF_HELPER_4(tre, i64, env, i64, i64, i64)
+DEF_HELPER_4(tre, i128, env, i64, i64, i64)
 DEF_HELPER_4(trt, i32, env, i32, i64, i64)
 DEF_HELPER_4(trtr, i32, env, i32, i64, i64)
 DEF_HELPER_5(trXX, i32, env, i32, i32, i32, i32)
diff --git a/target/s390x/tcg/mem_helper.c b/target/s390x/tcg/mem_helper.c
index b0b403e23a..49969abda7 100644
--- a/target/s390x/tcg/mem_helper.c
+++ b/target/s390x/tcg/mem_helper.c
@@ -1632,8 +1632,8 @@ void HELPER(tr)(CPUS390XState *env, uint32_t len, uint64_t array,
     do_helper_tr(env, len, array, trans, GETPC());
 }
 
-uint64_t HELPER(tre)(CPUS390XState *env, uint64_t array,
-                     uint64_t len, uint64_t trans)
+Int128 HELPER(tre)(CPUS390XState *env, uint64_t array,
+                   uint64_t len, uint64_t trans)
 {
     uintptr_t ra = GETPC();
     uint8_t end = env->regs[0] & 0xff;
@@ -1668,8 +1668,7 @@ uint64_t HELPER(tre)(CPUS390XState *env, uint64_t array,
     }
 
     env->cc_op = cc;
-    env->retxl = len - i;
-    return array + i;
+    return int128_make128(len - i, array + i);
 }
 
 static inline uint32_t do_helper_trt(CPUS390XState *env, int len,
diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c
index 1a7aa9e4ae..f3e4b70ed9 100644
--- a/target/s390x/tcg/translate.c
+++ b/target/s390x/tcg/translate.c
@@ -4905,8 +4905,11 @@ static DisasJumpType op_tr(DisasContext *s, DisasOps *o)
 
 static DisasJumpType op_tre(DisasContext *s, DisasOps *o)
 {
-    gen_helper_tre(o->out, cpu_env, o->out, o->out2, o->in2);
-    return_low128(o->out2);
+    TCGv_i128 pair = tcg_temp_new_i128();
+
+    gen_helper_tre(pair, cpu_env, o->out, o->out2, o->in2);
+    tcg_gen_extr_i128_i64(o->out2, o->out, pair);
+    tcg_temp_free_i128(pair);
     set_cc_static(s);
     return DISAS_NEXT;
 }
-- 
2.34.1
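The TRE change returns both results in one 128-bit value instead of routing the second one through env->retxl. A minimal sketch of that packing in plain C follows. It assumes a two-field struct as a stand-in for QEMU's Int128; the names Pair128, pair_make128, pair_getlo, pair_gethi and tre_result are hypothetical and not QEMU API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for Int128: two 64-bit halves. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} Pair128;

/* Same argument order as int128_make128(lo, hi) in the patch. */
static inline Pair128 pair_make128(uint64_t lo, uint64_t hi)
{
    Pair128 r = { lo, hi };
    return r;
}

static inline uint64_t pair_getlo(Pair128 p) { return p.lo; }
static inline uint64_t pair_gethi(Pair128 p) { return p.hi; }

/*
 * Model of the helper's final step: after consuming i bytes, the
 * remaining length goes in the low half and the advanced array
 * address in the high half.
 */
static inline Pair128 tre_result(uint64_t array, uint64_t len, uint64_t i)
{
    return pair_make128(len - i, array + i);
}
```

On the translator side the patch splits the pair with tcg_gen_extr_i128_i64(o->out2, o->out, pair), so the low half (length) lands in out2 and the high half (address) in out.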
[PULL 40/40] tcg/aarch64: Fix patching of LDR in tb_target_set_jmp_target
'offset' should be bits [23:5] of LDR instruction, rather than [4:0].

Fixes: d59d83a1c388 ("tcg/aarch64: Reorg goto_tb implementation")
Reviewed-by: Zenghui Yu
Reported-by: Zenghui Yu
Signed-off-by: Richard Henderson
---
 tcg/aarch64/tcg-target.c.inc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
index fde3b30ad1..a091326f84 100644
--- a/tcg/aarch64/tcg-target.c.inc
+++ b/tcg/aarch64/tcg-target.c.inc
@@ -1914,7 +1914,7 @@ void tb_target_set_jmp_target(const TranslationBlock *tb, int n,
         ptrdiff_t i_offset = i_addr - jmp_rx;
 
         /* Note that we asserted this in range in tcg_out_goto_tb. */
-        insn = deposit32(I3305_LDR | TCG_REG_TMP, 0, 5, i_offset >> 2);
+        insn = deposit32(I3305_LDR | TCG_REG_TMP, 5, 19, i_offset >> 2);
     }
     qatomic_set((uint32_t *)jmp_rw, insn);
     flush_idcache_range(jmp_rx, jmp_rw, 4);
-- 
2.34.1
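To see why the (start, length) pair matters, here is a small self-contained sketch of the field deposit. It assumes 0x58000000 as the opcode for the 64-bit LDR (literal) form and re-implements a deposit helper locally; deposit32_model, encode_ldr_literal and encode_ldr_literal_buggy are illustrative names, not the QEMU functions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed opcode for LDR (literal), 64-bit variant. */
#define LDR_LITERAL_X 0x58000000u

/* Replace bits [start, start + length) of value with fieldval. */
static inline uint32_t deposit32_model(uint32_t value, int start,
                                       int length, uint32_t fieldval)
{
    uint32_t mask;

    assert(start >= 0 && length > 0 && length <= 32 - start);
    mask = (~0u >> (32 - length)) << start;
    return (value & ~mask) | ((fieldval << start) & mask);
}

/* Correct placement: the word offset goes into imm19, bits [23:5]. */
static inline uint32_t encode_ldr_literal(unsigned rt, int32_t byte_offset)
{
    /* imm19 counts 4-byte words; dividing avoids shifting a negative value. */
    uint32_t imm19 = (uint32_t)(byte_offset / 4);

    return deposit32_model(LDR_LITERAL_X | rt, 5, 19, imm19);
}

/* The pre-fix placement: the offset overwrites the Rt field, bits [4:0]. */
static inline uint32_t encode_ldr_literal_buggy(unsigned rt, int32_t byte_offset)
{
    uint32_t imm19 = (uint32_t)(byte_offset / 4);

    return deposit32_model(LDR_LITERAL_X | rt, 0, 5, imm19);
}
```

With rt = 30 and an offset of 8 bytes, the correct form keeps the destination register and stores 2 in imm19, while the buggy form leaves imm19 at zero and turns the destination register into register 2.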
[PULL 03/40] tcg: Define TCG_TYPE_I128 and related helper macros
Begin staging in support for TCGv_i128 with Int128. Define the type enumerator, the typedef, and the helper-head.h macros. This cannot yet be used, because you can't allocate temporaries of this new type. Reviewed-by: Alex Bennée Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- include/exec/helper-head.h | 7 +++ include/tcg/tcg.h | 17 ++--- 2 files changed, 17 insertions(+), 7 deletions(-) diff --git a/include/exec/helper-head.h b/include/exec/helper-head.h index bc6698b19f..b8d1140dc7 100644 --- a/include/exec/helper-head.h +++ b/include/exec/helper-head.h @@ -26,6 +26,7 @@ #define dh_alias_int i32 #define dh_alias_i64 i64 #define dh_alias_s64 i64 +#define dh_alias_i128 i128 #define dh_alias_f16 i32 #define dh_alias_f32 i32 #define dh_alias_f64 i64 @@ -40,6 +41,7 @@ #define dh_ctype_int int #define dh_ctype_i64 uint64_t #define dh_ctype_s64 int64_t +#define dh_ctype_i128 Int128 #define dh_ctype_f16 uint32_t #define dh_ctype_f32 float32 #define dh_ctype_f64 float64 @@ -71,6 +73,7 @@ #define dh_retvar_decl0_noreturn void #define dh_retvar_decl0_i32 TCGv_i32 retval #define dh_retvar_decl0_i64 TCGv_i64 retval +#define dh_retval_decl0_i128 TCGv_i128 retval #define dh_retvar_decl0_ptr TCGv_ptr retval #define dh_retvar_decl0(t) glue(dh_retvar_decl0_, dh_alias(t)) @@ -78,6 +81,7 @@ #define dh_retvar_decl_noreturn #define dh_retvar_decl_i32 TCGv_i32 retval, #define dh_retvar_decl_i64 TCGv_i64 retval, +#define dh_retvar_decl_i128 TCGv_i128 retval, #define dh_retvar_decl_ptr TCGv_ptr retval, #define dh_retvar_decl(t) glue(dh_retvar_decl_, dh_alias(t)) @@ -85,6 +89,7 @@ #define dh_retvar_noreturn NULL #define dh_retvar_i32 tcgv_i32_temp(retval) #define dh_retvar_i64 tcgv_i64_temp(retval) +#define dh_retvar_i128 tcgv_i128_temp(retval) #define dh_retvar_ptr tcgv_ptr_temp(retval) #define dh_retvar(t) glue(dh_retvar_, dh_alias(t)) @@ -95,6 +100,7 @@ #define dh_typecode_i64 4 #define dh_typecode_s64 5 #define dh_typecode_ptr 6 +#define dh_typecode_i128 7 
#define dh_typecode_int dh_typecode_s32 #define dh_typecode_f16 dh_typecode_i32 #define dh_typecode_f32 dh_typecode_i32 @@ -104,6 +110,7 @@ #define dh_callflag_i32 0 #define dh_callflag_i64 0 +#define dh_callflag_i128 0 #define dh_callflag_ptr 0 #define dh_callflag_void 0 #define dh_callflag_noreturn TCG_CALL_NO_RETURN diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h index c5112da0ef..4d7e4107a9 100644 --- a/include/tcg/tcg.h +++ b/include/tcg/tcg.h @@ -270,6 +270,7 @@ typedef struct TCGPool { typedef enum TCGType { TCG_TYPE_I32, TCG_TYPE_I64, +TCG_TYPE_I128, TCG_TYPE_V64, TCG_TYPE_V128, @@ -351,13 +352,14 @@ typedef tcg_target_ulong TCGArg; in tcg/README. Target CPU front-end code uses these types to deal with TCG variables as it emits TCG code via the tcg_gen_* functions. They come in several flavours: -* TCGv_i32 : 32 bit integer type -* TCGv_i64 : 64 bit integer type -* TCGv_ptr : a host pointer type -* TCGv_vec : a host vector type; the exact size is not exposed - to the CPU front-end code. -* TCGv : an integer type the same size as target_ulong - (an alias for either TCGv_i32 or TCGv_i64) +* TCGv_i32 : 32 bit integer type +* TCGv_i64 : 64 bit integer type +* TCGv_i128 : 128 bit integer type +* TCGv_ptr : a host pointer type +* TCGv_vec : a host vector type; the exact size is not exposed + to the CPU front-end code. +* TCGv : an integer type the same size as target_ulong + (an alias for either TCGv_i32 or TCGv_i64) The compiler's type checking will complain if you mix them up and pass the wrong sized TCGv to a function. @@ -377,6 +379,7 @@ typedef tcg_target_ulong TCGArg; typedef struct TCGv_i32_d *TCGv_i32; typedef struct TCGv_i64_d *TCGv_i64; +typedef struct TCGv_i128_d *TCGv_i128; typedef struct TCGv_ptr_d *TCGv_ptr; typedef struct TCGv_vec_d *TCGv_vec; typedef TCGv_ptr TCGv_env; -- 2.34.1
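The helper-head.h tables above are selected by token pasting. The following reduced sketch shows the mechanism on a few entries only. Int128Model and the dh_bits table are hypothetical additions for illustration, and the dispatcher macros are written here with a local two-level glue; the real header may define them differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for Int128. */
typedef struct { uint64_t lo, hi; } Int128Model;

/* Two-level paste so that arguments are macro-expanded first. */
#define glue_(a, b) a##b
#define glue(a, b)  glue_(a, b)

/* Alias table: several helper types share one TCG type. */
#define dh_alias_i64  i64
#define dh_alias_s64  i64
#define dh_alias_i128 i128
#define dh_alias(t)   glue(dh_alias_, t)

/* C type table. */
#define dh_ctype_i64  uint64_t
#define dh_ctype_s64  int64_t
#define dh_ctype_i128 Int128Model
#define dh_ctype(t)   glue(dh_ctype_, t)

/* Typecode table, values as in the patch. */
#define dh_typecode_i64  4
#define dh_typecode_s64  5
#define dh_typecode_i128 7
#define dh_typecode(t)   glue(dh_typecode_, t)

/* Hypothetical per-alias table, looked up through dh_alias(). */
#define dh_bits_i64  64
#define dh_bits_i128 128
#define dh_bits(t)   glue(dh_bits_, dh_alias(t))

static inline int typecode_i128(void) { return dh_typecode(i128); }
static inline int bits_of_s64(void) { return dh_bits(s64); }
static inline unsigned long ctype_size_i128(void)
{
    return (unsigned long)sizeof(dh_ctype(i128));
}
```

Because each lookup is a paste of a prefix and a type name, adding a new type means adding one line per table with exactly the prefix the dispatcher pastes; a line with a different prefix is simply never found.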
[PULL 26/40] tests/tcg/s390x: Add cdsg.c
From: Ilya Leoshkevich Add a simple test to prevent regressions. Signed-off-by: Ilya Leoshkevich Message-Id: <20230201133257.3223115-1-...@linux.ibm.com> Signed-off-by: Richard Henderson --- tests/tcg/s390x/cdsg.c | 93 + tests/tcg/s390x/Makefile.target | 4 ++ 2 files changed, 97 insertions(+) create mode 100644 tests/tcg/s390x/cdsg.c diff --git a/tests/tcg/s390x/cdsg.c b/tests/tcg/s390x/cdsg.c new file mode 100644 index 00..800618ff4b --- /dev/null +++ b/tests/tcg/s390x/cdsg.c @@ -0,0 +1,93 @@ +/* + * Test CDSG instruction. + * + * Increment the first half of aligned_quadword by 1, and the second half by 2 + * from 2 threads. Verify that the result is consistent. + * + * SPDX-License-Identifier: GPL-2.0-or-later + */ +#include +#include +#include +#include + +static volatile bool start; +typedef unsigned long aligned_quadword[2] __attribute__((__aligned__(16))); +static aligned_quadword val; +static const int n_iterations = 100; + +static inline int cdsg(unsigned long *orig0, unsigned long *orig1, + unsigned long new0, unsigned long new1, + aligned_quadword *mem) +{ +register unsigned long r0 asm("r0"); +register unsigned long r1 asm("r1"); +register unsigned long r2 asm("r2"); +register unsigned long r3 asm("r3"); +int cc; + +r0 = *orig0; +r1 = *orig1; +r2 = new0; +r3 = new1; +asm("cdsg %[r0],%[r2],%[db2]\n" +"ipm %[cc]" +: [r0] "+r" (r0) +, [r1] "+r" (r1) +, [db2] "+m" (*mem) +, [cc] "=r" (cc) +: [r2] "r" (r2) +, [r3] "r" (r3) +: "cc"); +*orig0 = r0; +*orig1 = r1; + +return (cc >> 28) & 3; +} + +void *cdsg_loop(void *arg) +{ +unsigned long orig0, orig1, new0, new1; +int cc; +int i; + +while (!start) { +} + +orig0 = val[0]; +orig1 = val[1]; +for (i = 0; i < n_iterations;) { +new0 = orig0 + 1; +new1 = orig1 + 2; + +cc = cdsg(&orig0, &orig1, new0, new1, &val); + +if (cc == 0) { +orig0 = new0; +orig1 = new1; +i++; +} else { +assert(cc == 1); +} +} + +return NULL; +} + +int main(void) +{ +pthread_t thread; +int ret; + +ret = pthread_create(&thread, NULL, cdsg_loop, 
NULL); +assert(ret == 0); +start = true; +cdsg_loop(NULL); +ret = pthread_join(thread, NULL); +assert(ret == 0); + +assert(val[0] == n_iterations * 2); +assert(val[1] == n_iterations * 4); + +return EXIT_SUCCESS; +} diff --git a/tests/tcg/s390x/Makefile.target b/tests/tcg/s390x/Makefile.target index 1d454270c0..72ad309b27 100644 --- a/tests/tcg/s390x/Makefile.target +++ b/tests/tcg/s390x/Makefile.target @@ -27,6 +27,10 @@ TESTS+=noexec TESTS+=div TESTS+=clst TESTS+=long-double +TESTS+=cdsg + +cdsg: CFLAGS+=-pthread +cdsg: LDFLAGS+=-pthread Z13_TESTS=vistr $(Z13_TESTS): CFLAGS+=-march=z13 -O2 -- 2.34.1
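The retry loop in cdsg_loop can be illustrated on any host with C11 atomics. This is only an analogue, not the s390x instruction: it assumes two 32-bit counters packed into one 64-bit word in place of the 16-byte quadword, and the names packed_val, pack and pair_increment_loop are made up for the sketch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Two 32-bit counters in one word, standing in for the quadword. */
static _Atomic uint64_t packed_val;

static inline uint64_t pack(uint32_t first, uint32_t second)
{
    return ((uint64_t)second << 32) | first;
}

/* Add 1 to the first half and 2 to the second half, n_iterations times. */
static void pair_increment_loop(int n_iterations)
{
    uint64_t orig = atomic_load(&packed_val);
    int i = 0;

    while (i < n_iterations) {
        uint32_t first = (uint32_t)orig + 1;
        uint32_t second = (uint32_t)(orig >> 32) + 2;
        uint64_t desired = pack(first, second);

        /*
         * On failure the compare-exchange refreshes orig with the current
         * memory value, much like cdsg does when it sets cc == 1.
         */
        if (atomic_compare_exchange_weak(&packed_val, &orig, desired)) {
            orig = desired;
            i++;
        }
    }
}
```

As in the test above, both halves are updated in a single compare-and-swap, so no interleaving of two callers can leave the halves inconsistent with each other.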
[PULL 16/40] tcg: Add basic data movement for TCGv_i128
Add code generation functions for data movement between TCGv_i128 (mov)
and to/from TCGv_i64 (concat, extract).

Reviewed-by: Alex Bennée
Reviewed-by: Philippe Mathieu-Daudé
Signed-off-by: Richard Henderson
---
 include/tcg/tcg-op.h | 4
 tcg/tcg-internal.h | 13 +
 tcg/tcg-op.c | 20
 3 files changed, 37 insertions(+)

diff --git a/include/tcg/tcg-op.h b/include/tcg/tcg-op.h
index 79b1cf786f..c4276767d1 100644
--- a/include/tcg/tcg-op.h
+++ b/include/tcg/tcg-op.h
@@ -712,6 +712,10 @@ void tcg_gen_extrh_i64_i32(TCGv_i32 ret, TCGv_i64 arg);
 void tcg_gen_extr_i64_i32(TCGv_i32 lo, TCGv_i32 hi, TCGv_i64 arg);
 void tcg_gen_extr32_i64(TCGv_i64 lo, TCGv_i64 hi, TCGv_i64 arg);
 
+void tcg_gen_mov_i128(TCGv_i128 dst, TCGv_i128 src);
+void tcg_gen_extr_i128_i64(TCGv_i64 lo, TCGv_i64 hi, TCGv_i128 arg);
+void tcg_gen_concat_i64_i128(TCGv_i128 ret, TCGv_i64 lo, TCGv_i64 hi);
+
 static inline void tcg_gen_concat32_i64(TCGv_i64 ret, TCGv_i64 lo, TCGv_i64 hi)
 {
     tcg_gen_deposit_i64(ret, lo, hi, 32, 32);
diff --git a/tcg/tcg-internal.h b/tcg/tcg-internal.h
index 33f1d8b411..e542a4e9b7 100644
--- a/tcg/tcg-internal.h
+++ b/tcg/tcg-internal.h
@@ -117,4 +117,17 @@ extern TCGv_i32 TCGV_LOW(TCGv_i64) QEMU_ERROR("32-bit code path is reachable");
 extern TCGv_i32 TCGV_HIGH(TCGv_i64) QEMU_ERROR("32-bit code path is reachable");
 #endif
 
+static inline TCGv_i64 TCGV128_LOW(TCGv_i128 t)
+{
+    /* For 32-bit, offset by 2, which may then have TCGV_{LOW,HIGH} applied. */
+    int o = HOST_BIG_ENDIAN ? 64 / TCG_TARGET_REG_BITS : 0;
+    return temp_tcgv_i64(tcgv_i128_temp(t) + o);
+}
+
+static inline TCGv_i64 TCGV128_HIGH(TCGv_i128 t)
+{
+    int o = HOST_BIG_ENDIAN ? 0 : 64 / TCG_TARGET_REG_BITS;
+    return temp_tcgv_i64(tcgv_i128_temp(t) + o);
+}
+
 #endif /* TCG_INTERNAL_H */
diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index 326a9180ef..cb83d2375d 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -2747,6 +2747,26 @@ void tcg_gen_extr32_i64(TCGv_i64 lo, TCGv_i64 hi, TCGv_i64 arg)
     tcg_gen_shri_i64(hi, arg, 32);
 }
 
+void tcg_gen_extr_i128_i64(TCGv_i64 lo, TCGv_i64 hi, TCGv_i128 arg)
+{
+    tcg_gen_mov_i64(lo, TCGV128_LOW(arg));
+    tcg_gen_mov_i64(hi, TCGV128_HIGH(arg));
+}
+
+void tcg_gen_concat_i64_i128(TCGv_i128 ret, TCGv_i64 lo, TCGv_i64 hi)
+{
+    tcg_gen_mov_i64(TCGV128_LOW(ret), lo);
+    tcg_gen_mov_i64(TCGV128_HIGH(ret), hi);
+}
+
+void tcg_gen_mov_i128(TCGv_i128 dst, TCGv_i128 src)
+{
+    if (dst != src) {
+        tcg_gen_mov_i64(TCGV128_LOW(dst), TCGV128_LOW(src));
+        tcg_gen_mov_i64(TCGV128_HIGH(dst), TCGV128_HIGH(src));
+    }
+}
+
 /* QEMU specific operations. */
 
 void tcg_gen_exit_tb(const TranslationBlock *tb, unsigned idx)
-- 
2.34.1
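The offset arithmetic in TCGV128_LOW and TCGV128_HIGH can be checked in isolation. The sketch below lifts just the index computation into plain functions, with host endianness and register width passed as parameters; i128_low_index and i128_high_index are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Index of the sub-temporary holding the low 64 bits. */
static inline int i128_low_index(bool host_big_endian, int reg_bits)
{
    return host_big_endian ? 64 / reg_bits : 0;
}

/* Index of the sub-temporary holding the high 64 bits. */
static inline int i128_high_index(bool host_big_endian, int reg_bits)
{
    return host_big_endian ? 0 : 64 / reg_bits;
}
```

On a 64-bit host the two halves sit at indexes 0 and 1; on a 32-bit host each half spans two sub-temporaries, so the halves sit at indexes 0 and 2, which is the offset by 2 mentioned in the comment.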
[PULL 39/40] target/i386: Inline cmpxchg16b
Use tcg_gen_atomic_cmpxchg_i128 for the atomic case, and tcg_gen_qemu_ld/st_i128 otherwise. Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- target/i386/helper.h | 4 --- target/i386/tcg/mem_helper.c | 69 target/i386/tcg/translate.c | 44 --- 3 files changed, 39 insertions(+), 78 deletions(-) diff --git a/target/i386/helper.h b/target/i386/helper.h index 2df8049f91..e627a93107 100644 --- a/target/i386/helper.h +++ b/target/i386/helper.h @@ -66,10 +66,6 @@ DEF_HELPER_1(rsm, void, env) #endif /* !CONFIG_USER_ONLY */ DEF_HELPER_2(into, void, env, int) -#ifdef TARGET_X86_64 -DEF_HELPER_2(cmpxchg16b_unlocked, void, env, tl) -DEF_HELPER_2(cmpxchg16b, void, env, tl) -#endif DEF_HELPER_FLAGS_1(single_step, TCG_CALL_NO_WG, noreturn, env) DEF_HELPER_1(rechecking_single_step, void, env) DEF_HELPER_1(cpuid, void, env) diff --git a/target/i386/tcg/mem_helper.c b/target/i386/tcg/mem_helper.c index 814786bb87..3ef84e90d9 100644 --- a/target/i386/tcg/mem_helper.c +++ b/target/i386/tcg/mem_helper.c @@ -27,75 +27,6 @@ #include "tcg/tcg.h" #include "helper-tcg.h" -#ifdef TARGET_X86_64 -void helper_cmpxchg16b_unlocked(CPUX86State *env, target_ulong a0) -{ -uintptr_t ra = GETPC(); -Int128 oldv, cmpv, newv; -uint64_t o0, o1; -int eflags; -bool success; - -if ((a0 & 0xf) != 0) { -raise_exception_ra(env, EXCP0D_GPF, GETPC()); -} -eflags = cpu_cc_compute_all(env, CC_OP); - -cmpv = int128_make128(env->regs[R_EAX], env->regs[R_EDX]); -newv = int128_make128(env->regs[R_EBX], env->regs[R_ECX]); - -o0 = cpu_ldq_data_ra(env, a0 + 0, ra); -o1 = cpu_ldq_data_ra(env, a0 + 8, ra); - -oldv = int128_make128(o0, o1); -success = int128_eq(oldv, cmpv); -if (!success) { -newv = oldv; -} - -cpu_stq_data_ra(env, a0 + 0, int128_getlo(newv), ra); -cpu_stq_data_ra(env, a0 + 8, int128_gethi(newv), ra); - -if (success) { -eflags |= CC_Z; -} else { -env->regs[R_EAX] = int128_getlo(oldv); -env->regs[R_EDX] = int128_gethi(oldv); -eflags &= ~CC_Z; -} -CC_SRC = eflags; -} - -void 
helper_cmpxchg16b(CPUX86State *env, target_ulong a0) -{ -uintptr_t ra = GETPC(); - -if ((a0 & 0xf) != 0) { -raise_exception_ra(env, EXCP0D_GPF, ra); -} else if (HAVE_CMPXCHG128) { -int eflags = cpu_cc_compute_all(env, CC_OP); - -Int128 cmpv = int128_make128(env->regs[R_EAX], env->regs[R_EDX]); -Int128 newv = int128_make128(env->regs[R_EBX], env->regs[R_ECX]); - -int mem_idx = cpu_mmu_index(env, false); -MemOpIdx oi = make_memop_idx(MO_TE | MO_128 | MO_ALIGN, mem_idx); -Int128 oldv = cpu_atomic_cmpxchgo_le_mmu(env, a0, cmpv, newv, oi, ra); - -if (int128_eq(oldv, cmpv)) { -eflags |= CC_Z; -} else { -env->regs[R_EAX] = int128_getlo(oldv); -env->regs[R_EDX] = int128_gethi(oldv); -eflags &= ~CC_Z; -} -CC_SRC = eflags; -} else { -cpu_loop_exit_atomic(env_cpu(env), ra); -} -} -#endif - void helper_boundw(CPUX86State *env, target_ulong a0, int v) { int low, high; diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c index b542b084a6..9d9392b009 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -3053,15 +3053,49 @@ static void gen_cmpxchg8b(DisasContext *s, CPUX86State *env, int modrm) #ifdef TARGET_X86_64 static void gen_cmpxchg16b(DisasContext *s, CPUX86State *env, int modrm) { +MemOp mop = MO_TE | MO_128 | MO_ALIGN; +TCGv_i64 t0, t1; +TCGv_i128 cmp, val; + gen_lea_modrm(env, s, modrm); -if ((s->prefix & PREFIX_LOCK) && -(tb_cflags(s->base.tb) & CF_PARALLEL)) { -gen_helper_cmpxchg16b(cpu_env, s->A0); +cmp = tcg_temp_new_i128(); +val = tcg_temp_new_i128(); +tcg_gen_concat_i64_i128(cmp, cpu_regs[R_EAX], cpu_regs[R_EDX]); +tcg_gen_concat_i64_i128(val, cpu_regs[R_EBX], cpu_regs[R_ECX]); + +/* Only require atomic with LOCK; non-parallel handled in generator. 
*/ +if (s->prefix & PREFIX_LOCK) { +tcg_gen_atomic_cmpxchg_i128(val, s->A0, cmp, val, s->mem_index, mop); } else { -gen_helper_cmpxchg16b_unlocked(cpu_env, s->A0); +tcg_gen_nonatomic_cmpxchg_i128(val, s->A0, cmp, val, s->mem_index, mop); } -set_cc_op(s, CC_OP_EFLAGS); + +tcg_gen_extr_i128_i64(s->T0, s->T1, val); +tcg_temp_free_i128(cmp); +tcg_temp_free_i128(val); + +/* Determine success after the fact. */ +t0 = tcg_temp_new_i64(); +t1 = tcg_temp_new_i64(); +tcg_gen_xor_i64(t0, s->T0, cpu_regs[R_EAX]); +tcg_gen_xor_i64(t1, s->T1, cpu_regs[R_EDX]); +tcg_gen_or_i64(t0, t0, t1); +tcg_temp_free_i64(t1); + +/* Update Z. */ +gen_compute_eflags(s); +tcg_gen_setcondi_i64(TCG_COND_EQ, t0, t0, 0); +tcg_gen_deposit_tl(cpu_cc_src, cpu_cc_src,
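The "determine success after the fact" step above can be modelled in plain C. This sketch assumes a non-atomic memory model and a two-field struct for the 128-bit value; U128, cmpxchg128_nonatomic, cmpxchg_succeeded and update_zf are illustrative names, and ZF_MASK reflects ZF being bit 6 of EFLAGS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ZF_MASK 0x40u   /* ZF is bit 6 of EFLAGS */

typedef struct { uint64_t lo, hi; } U128;

/* Non-atomic compare-and-exchange; always returns the old memory value. */
static U128 cmpxchg128_nonatomic(U128 *mem, U128 cmp, U128 newv)
{
    U128 old = *mem;

    if (old.lo == cmp.lo && old.hi == cmp.hi) {
        *mem = newv;
    }
    return old;
}

/* Success test as in the patch: XOR each half, OR them, compare with 0. */
static bool cmpxchg_succeeded(U128 old, U128 cmp)
{
    uint64_t t0 = old.lo ^ cmp.lo;
    uint64_t t1 = old.hi ^ cmp.hi;

    return (t0 | t1) == 0;
}

/* Deposit the success bit into the ZF position, leaving other flags alone. */
static uint32_t update_zf(uint32_t eflags, bool success)
{
    return (eflags & ~ZF_MASK) | (success ? ZF_MASK : 0);
}
```

The point of the approach is that the exchange operation itself only needs to return the old value; the comparison that sets ZF is recomputed from that value and the expected one.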
[PULL 08/40] tcg: Introduce tcg_target_call_oarg_reg
Replace the flat array tcg_target_call_oarg_regs[] with a function call including the TCGCallReturnKind. Extend the set of registers for ARM to r0-r3 to match the ABI: https://github.com/ARM-software/abi-aa/blob/main/aapcs32/aapcs32.rst#result-return Reviewed-by: Alex Bennée Reviewed-by: Daniel Henrique Barboza Signed-off-by: Richard Henderson --- tcg/tcg.c| 9 ++--- tcg/aarch64/tcg-target.c.inc | 10 +++--- tcg/arm/tcg-target.c.inc | 10 +++--- tcg/i386/tcg-target.c.inc| 16 ++-- tcg/loongarch64/tcg-target.c.inc | 10 ++ tcg/mips/tcg-target.c.inc| 10 ++ tcg/ppc/tcg-target.c.inc | 10 ++ tcg/riscv/tcg-target.c.inc | 10 ++ tcg/s390x/tcg-target.c.inc | 9 ++--- tcg/sparc64/tcg-target.c.inc | 12 ++-- tcg/tci/tcg-target.c.inc | 12 ++-- 11 files changed, 72 insertions(+), 46 deletions(-) diff --git a/tcg/tcg.c b/tcg/tcg.c index 123cde7000..a77483eee8 100644 --- a/tcg/tcg.c +++ b/tcg/tcg.c @@ -151,6 +151,7 @@ static bool tcg_out_sti(TCGContext *s, TCGType type, TCGArg val, TCGReg base, intptr_t ofs); static void tcg_out_call(TCGContext *s, const tcg_insn_unit *target, const TCGHelperInfo *info); +static TCGReg tcg_target_call_oarg_reg(TCGCallReturnKind kind, int slot); static bool tcg_target_const_match(int64_t val, TCGType type, int ct); #ifdef TCG_TARGET_NEED_LDST_LABELS static int tcg_out_ldst_finalize(TCGContext *s); @@ -740,14 +741,16 @@ static void init_call_layout(TCGHelperInfo *info) case dh_typecode_s64: info->nr_out = 64 / TCG_TARGET_REG_BITS; info->out_kind = TCG_CALL_RET_NORMAL; -assert(info->nr_out <= ARRAY_SIZE(tcg_target_call_oarg_regs)); +/* Query the last register now to trigger any assert early. 
*/ +tcg_target_call_oarg_reg(info->out_kind, info->nr_out - 1); break; case dh_typecode_i128: info->nr_out = 128 / TCG_TARGET_REG_BITS; info->out_kind = TCG_CALL_RET_NORMAL; /* TODO */ switch (/* TODO */ TCG_CALL_RET_NORMAL) { case TCG_CALL_RET_NORMAL: -assert(info->nr_out <= ARRAY_SIZE(tcg_target_call_oarg_regs)); +/* Query the last register now to trigger any assert early. */ +tcg_target_call_oarg_reg(info->out_kind, info->nr_out - 1); break; case TCG_CALL_RET_BY_REF: /* @@ -4592,7 +4595,7 @@ static void tcg_reg_alloc_call(TCGContext *s, TCGOp *op) case TCG_CALL_RET_NORMAL: for (i = 0; i < nb_oargs; i++) { TCGTemp *ts = arg_temp(op->args[i]); -TCGReg reg = tcg_target_call_oarg_regs[i]; +TCGReg reg = tcg_target_call_oarg_reg(TCG_CALL_RET_NORMAL, i); /* ENV should not be modified. */ tcg_debug_assert(!temp_readonly(ts)); diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc index bd6da72678..fde3b30ad1 100644 --- a/tcg/aarch64/tcg-target.c.inc +++ b/tcg/aarch64/tcg-target.c.inc @@ -63,9 +63,13 @@ static const int tcg_target_call_iarg_regs[8] = { TCG_REG_X0, TCG_REG_X1, TCG_REG_X2, TCG_REG_X3, TCG_REG_X4, TCG_REG_X5, TCG_REG_X6, TCG_REG_X7 }; -static const int tcg_target_call_oarg_regs[1] = { -TCG_REG_X0 -}; + +static TCGReg tcg_target_call_oarg_reg(TCGCallReturnKind kind, int slot) +{ +tcg_debug_assert(kind == TCG_CALL_RET_NORMAL); +tcg_debug_assert(slot >= 0 && slot <= 1); +return TCG_REG_X0 + slot; +} #define TCG_REG_TMP TCG_REG_X30 #define TCG_VEC_TMP TCG_REG_V31 diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc index 6e9e9b9b3f..d06ac60c15 100644 --- a/tcg/arm/tcg-target.c.inc +++ b/tcg/arm/tcg-target.c.inc @@ -79,9 +79,13 @@ static const int tcg_target_reg_alloc_order[] = { static const int tcg_target_call_iarg_regs[4] = { TCG_REG_R0, TCG_REG_R1, TCG_REG_R2, TCG_REG_R3 }; -static const int tcg_target_call_oarg_regs[2] = { -TCG_REG_R0, TCG_REG_R1 -}; + +static TCGReg tcg_target_call_oarg_reg(TCGCallReturnKind kind, int slot) +{ 
+tcg_debug_assert(kind == TCG_CALL_RET_NORMAL); +tcg_debug_assert(slot >= 0 && slot <= 3); +return TCG_REG_R0 + slot; +} #define TCG_REG_TMP TCG_REG_R12 #define TCG_VEC_TMP TCG_REG_Q15 diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc index 7b573bd287..2f0a9521bf 100644 --- a/tcg/i386/tcg-target.c.inc +++ b/tcg/i386/tcg-target.c.inc @@ -109,12 +109,16 @@ static const int tcg_target_call_iarg_regs[] = { #endif }; -static const int tcg_target_call_oarg_regs[] = { -TCG_REG_EAX, -#if TCG_TARGET_REG_BITS == 32 -TCG_REG_EDX -#endif -}; +static TCGReg tcg_target_call_oarg_reg(TCGCallReturnKind kind, int slot) +{ +switch (kind) { +case TCG_CALL_RET_NORMAL: +tcg_debug_assert(slot >= 0 && slot <= 1); +return slot ? TCG_REG_EDX : TCG_REG_EAX; +default: +g_assert_
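The move from a flat array to a function lets the lookup depend on the return kind and check its arguments. A reduced model, loosely following the ARM variant above, might look like this; ReturnKind, Reg and call_oarg_reg are hypothetical names and plain assert stands in for tcg_debug_assert.

```c
#include <assert.h>

typedef enum { RET_NORMAL, RET_BY_REF } ReturnKind;
typedef enum { REG_R0, REG_R1, REG_R2, REG_R3 } Reg;

/* Map an output slot to a return register for a given return kind. */
static Reg call_oarg_reg(ReturnKind kind, int slot)
{
    /* Only register returns are modelled here. */
    assert(kind == RET_NORMAL);
    /* Four return registers, so that 128 bits fit on a 32-bit host. */
    assert(slot >= 0 && slot <= 3);
    return (Reg)(REG_R0 + slot);
}
```

Querying the last slot at layout time, as init_call_layout does, then trips the range assertion early if a backend cannot return that many registers.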
[PULL 11/40] tcg/i386: Add TCG_TARGET_CALL_{RET,ARG}_I128
Fill in the parameters for the host ABI for Int128. Adjust tcg_target_call_oarg_reg for _WIN64, and tcg_out_call for i386 sysv. Allow TCG_TYPE_V128 stores without AVX enabled. Reviewed-by: Alex Bennée Signed-off-by: Richard Henderson --- tcg/i386/tcg-target.h | 10 ++ tcg/i386/tcg-target.c.inc | 30 +- 2 files changed, 39 insertions(+), 1 deletion(-) diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h index 5797a55ea0..d4f2a6f8c2 100644 --- a/tcg/i386/tcg-target.h +++ b/tcg/i386/tcg-target.h @@ -100,6 +100,16 @@ typedef enum { #endif #define TCG_TARGET_CALL_ARG_I32 TCG_CALL_ARG_NORMAL #define TCG_TARGET_CALL_ARG_I64 TCG_CALL_ARG_NORMAL +#if defined(_WIN64) +# define TCG_TARGET_CALL_ARG_I128TCG_CALL_ARG_BY_REF +# define TCG_TARGET_CALL_RET_I128TCG_CALL_RET_BY_VEC +#elif TCG_TARGET_REG_BITS == 64 +# define TCG_TARGET_CALL_ARG_I128TCG_CALL_ARG_NORMAL +# define TCG_TARGET_CALL_RET_I128TCG_CALL_RET_NORMAL +#else +# define TCG_TARGET_CALL_ARG_I128TCG_CALL_ARG_NORMAL +# define TCG_TARGET_CALL_RET_I128TCG_CALL_RET_BY_REF +#endif extern bool have_bmi1; extern bool have_popcnt; diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc index 2f0a9521bf..883ced8168 100644 --- a/tcg/i386/tcg-target.c.inc +++ b/tcg/i386/tcg-target.c.inc @@ -115,6 +115,11 @@ static TCGReg tcg_target_call_oarg_reg(TCGCallReturnKind kind, int slot) case TCG_CALL_RET_NORMAL: tcg_debug_assert(slot >= 0 && slot <= 1); return slot ? TCG_REG_EDX : TCG_REG_EAX; +#ifdef _WIN64 +case TCG_CALL_RET_BY_VEC: +tcg_debug_assert(slot == 0); +return TCG_REG_XMM0; +#endif default: g_assert_not_reached(); } @@ -1188,9 +1193,16 @@ static void tcg_out_st(TCGContext *s, TCGType type, TCGReg arg, * The gvec infrastructure is asserts that v128 vector loads * and stores use a 16-byte aligned offset. Validate that the * final pointer is aligned by using an insn that will SIGSEGV. + * + * This specific instance is also used by TCG_CALL_RET_BY_VEC, + * for _WIN64, which must have SSE2 but may not have AVX. 
*/ tcg_debug_assert(arg >= 16); -tcg_out_vex_modrm_offset(s, OPC_MOVDQA_WxVx, arg, 0, arg1, arg2); +if (have_avx1) { +tcg_out_vex_modrm_offset(s, OPC_MOVDQA_WxVx, arg, 0, arg1, arg2); +} else { +tcg_out_modrm_offset(s, OPC_MOVDQA_WxVx, arg, arg1, arg2); +} break; case TCG_TYPE_V256: /* @@ -1677,6 +1689,22 @@ static void tcg_out_call(TCGContext *s, const tcg_insn_unit *dest, const TCGHelperInfo *info) { tcg_out_branch(s, 1, dest); + +#ifndef _WIN32 +if (TCG_TARGET_REG_BITS == 32 && info->out_kind == TCG_CALL_RET_BY_REF) { +/* + * The sysv i386 abi for struct return places a reference as the + * first argument of the stack, and pops that argument with the + * return statement. Since we want to retain the aligned stack + * pointer for the callee, we do not want to actually push that + * argument before the call but rely on the normal store to the + * stack slot. But we do need to compensate for the pop in order + * to reset our correct stack pointer value. + * Pushing a garbage value back onto the stack is quickest. + */ +tcg_out_push(s, TCG_REG_EAX); +} +#endif } static void tcg_out_jmp(TCGContext *s, const tcg_insn_unit *dest) -- 2.34.1
[PULL 14/40] tcg: Add TCG_TARGET_CALL_{RET,ARG}_I128
Fill in the parameters for the host ABI for Int128 for those backends which require no extra modification. Reviewed-by: Alex Bennée Reviewed-by: Daniel Henrique Barboza Signed-off-by: Richard Henderson --- tcg/aarch64/tcg-target.h | 2 ++ tcg/arm/tcg-target.h | 2 ++ tcg/loongarch64/tcg-target.h | 2 ++ tcg/mips/tcg-target.h| 2 ++ tcg/riscv/tcg-target.h | 3 +++ tcg/s390x/tcg-target.h | 2 ++ tcg/sparc64/tcg-target.h | 2 ++ tcg/tcg.c| 6 +++--- tcg/ppc/tcg-target.c.inc | 3 +++ 9 files changed, 21 insertions(+), 3 deletions(-) diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h index 8d244292aa..c0b0f614ba 100644 --- a/tcg/aarch64/tcg-target.h +++ b/tcg/aarch64/tcg-target.h @@ -54,6 +54,8 @@ typedef enum { #define TCG_TARGET_CALL_STACK_OFFSET0 #define TCG_TARGET_CALL_ARG_I32 TCG_CALL_ARG_NORMAL #define TCG_TARGET_CALL_ARG_I64 TCG_CALL_ARG_NORMAL +#define TCG_TARGET_CALL_ARG_I128TCG_CALL_ARG_EVEN +#define TCG_TARGET_CALL_RET_I128TCG_CALL_RET_NORMAL /* optional instructions */ #define TCG_TARGET_HAS_div_i32 1 diff --git a/tcg/arm/tcg-target.h b/tcg/arm/tcg-target.h index 91b8954804..def2a189e6 100644 --- a/tcg/arm/tcg-target.h +++ b/tcg/arm/tcg-target.h @@ -91,6 +91,8 @@ extern bool use_neon_instructions; #define TCG_TARGET_CALL_STACK_OFFSET 0 #define TCG_TARGET_CALL_ARG_I32 TCG_CALL_ARG_NORMAL #define TCG_TARGET_CALL_ARG_I64 TCG_CALL_ARG_EVEN +#define TCG_TARGET_CALL_ARG_I128TCG_CALL_ARG_EVEN +#define TCG_TARGET_CALL_RET_I128TCG_CALL_RET_BY_REF /* optional instructions */ #define TCG_TARGET_HAS_ext8s_i321 diff --git a/tcg/loongarch64/tcg-target.h b/tcg/loongarch64/tcg-target.h index 8b151e7f6f..17b8193aa5 100644 --- a/tcg/loongarch64/tcg-target.h +++ b/tcg/loongarch64/tcg-target.h @@ -92,6 +92,8 @@ typedef enum { #define TCG_TARGET_CALL_STACK_OFFSET0 #define TCG_TARGET_CALL_ARG_I32 TCG_CALL_ARG_NORMAL #define TCG_TARGET_CALL_ARG_I64 TCG_CALL_ARG_NORMAL +#define TCG_TARGET_CALL_ARG_I128TCG_CALL_ARG_NORMAL +#define TCG_TARGET_CALL_RET_I128TCG_CALL_RET_NORMAL /* 
optional instructions */ #define TCG_TARGET_HAS_movcond_i32 1 diff --git a/tcg/mips/tcg-target.h b/tcg/mips/tcg-target.h index 7bc8e15293..68b11e4d48 100644 --- a/tcg/mips/tcg-target.h +++ b/tcg/mips/tcg-target.h @@ -89,6 +89,8 @@ typedef enum { # define TCG_TARGET_CALL_ARG_I64 TCG_CALL_ARG_NORMAL #endif #define TCG_TARGET_CALL_ARG_I32 TCG_CALL_ARG_NORMAL +#define TCG_TARGET_CALL_ARG_I128 TCG_CALL_ARG_EVEN +#define TCG_TARGET_CALL_RET_I128 TCG_CALL_RET_NORMAL /* MOVN/MOVZ instructions detection */ #if (defined(__mips_isa_rev) && (__mips_isa_rev >= 1)) || \ diff --git a/tcg/riscv/tcg-target.h b/tcg/riscv/tcg-target.h index 1337bc1f1e..0deb33701f 100644 --- a/tcg/riscv/tcg-target.h +++ b/tcg/riscv/tcg-target.h @@ -85,9 +85,12 @@ typedef enum { #define TCG_TARGET_CALL_ARG_I32 TCG_CALL_ARG_NORMAL #if TCG_TARGET_REG_BITS == 32 #define TCG_TARGET_CALL_ARG_I64 TCG_CALL_ARG_EVEN +#define TCG_TARGET_CALL_ARG_I128TCG_CALL_ARG_EVEN #else #define TCG_TARGET_CALL_ARG_I64 TCG_CALL_ARG_NORMAL +#define TCG_TARGET_CALL_ARG_I128TCG_CALL_ARG_NORMAL #endif +#define TCG_TARGET_CALL_RET_I128TCG_CALL_RET_NORMAL /* optional instructions */ #define TCG_TARGET_HAS_movcond_i32 0 diff --git a/tcg/s390x/tcg-target.h b/tcg/s390x/tcg-target.h index e597e47e60..a05b473117 100644 --- a/tcg/s390x/tcg-target.h +++ b/tcg/s390x/tcg-target.h @@ -169,6 +169,8 @@ extern uint64_t s390_facilities[3]; #define TCG_TARGET_CALL_STACK_OFFSET 160 #define TCG_TARGET_CALL_ARG_I32 TCG_CALL_ARG_EXTEND #define TCG_TARGET_CALL_ARG_I64 TCG_CALL_ARG_NORMAL +#define TCG_TARGET_CALL_ARG_I128TCG_CALL_ARG_BY_REF +#define TCG_TARGET_CALL_RET_I128TCG_CALL_RET_BY_REF #define TCG_TARGET_HAS_MEMORY_BSWAP 1 diff --git a/tcg/sparc64/tcg-target.h b/tcg/sparc64/tcg-target.h index 1d6a5c8b07..ffe22b1d21 100644 --- a/tcg/sparc64/tcg-target.h +++ b/tcg/sparc64/tcg-target.h @@ -73,6 +73,8 @@ typedef enum { #define TCG_TARGET_CALL_STACK_OFFSET(128 + 6*8 + TCG_TARGET_STACK_BIAS) #define TCG_TARGET_CALL_ARG_I32 TCG_CALL_ARG_EXTEND #define 
TCG_TARGET_CALL_ARG_I64 TCG_CALL_ARG_NORMAL +#define TCG_TARGET_CALL_ARG_I128TCG_CALL_ARG_NORMAL +#define TCG_TARGET_CALL_RET_I128TCG_CALL_RET_NORMAL #if defined(__VIS__) && __VIS__ >= 0x300 #define use_vis3_instructions 1 diff --git a/tcg/tcg.c b/tcg/tcg.c index 865ed5ea0f..163913c95f 100644 --- a/tcg/tcg.c +++ b/tcg/tcg.c @@ -765,8 +765,8 @@ static void init_call_layout(TCGHelperInfo *info) break; case dh_typecode_i128: info->nr_out = 128 / TCG_TARGET_REG_BITS; -info->out_kind = TCG_CALL_RET_NORMAL; /* TODO */ -switch (/* TODO */ TCG_CALL_RET_NORMAL)
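The difference between the NORMAL and EVEN argument kinds is where a 128-bit argument may start. A minimal sketch, assuming EVEN means "round the next slot up to an even index" and that the argument then takes 128 / register-width slots; ArgKind, Cumulative and place_i128_arg are illustrative names only.

```c
#include <assert.h>

typedef enum { ARG_NORMAL, ARG_EVEN } ArgKind;

/* Running state of the argument layout: next free slot. */
typedef struct { int arg_slot; } Cumulative;

/* Place one 128-bit argument; return its first slot and advance cum. */
static int place_i128_arg(Cumulative *cum, ArgKind kind, int reg_bits)
{
    int first;

    if (kind == ARG_EVEN) {
        cum->arg_slot += cum->arg_slot & 1;   /* skip one slot if odd */
    }
    first = cum->arg_slot;
    cum->arg_slot += 128 / reg_bits;
    return first;
}
```

Starting from slot 1 on a 64-bit host, EVEN places the argument in slots 2 and 3 and leaves slot 1 unused, while NORMAL places it in slots 1 and 2.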
[PULL 21/40] target/arm: Use tcg_gen_atomic_cmpxchg_i128 for CASP
Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell Message-Id: <20221112042555.2622152-3-richard.hender...@linaro.org> --- target/arm/helper-a64.h| 2 -- target/arm/helper-a64.c| 43 --- target/arm/translate-a64.c | 61 +++--- 3 files changed, 18 insertions(+), 88 deletions(-) diff --git a/target/arm/helper-a64.h b/target/arm/helper-a64.h index 94065d1917..ff56807247 100644 --- a/target/arm/helper-a64.h +++ b/target/arm/helper-a64.h @@ -50,8 +50,6 @@ DEF_HELPER_FLAGS_2(frecpx_f16, TCG_CALL_NO_RWG, f16, f16, ptr) DEF_HELPER_FLAGS_2(fcvtx_f64_to_f32, TCG_CALL_NO_RWG, f32, f64, env) DEF_HELPER_FLAGS_3(crc32_64, TCG_CALL_NO_RWG_SE, i64, i64, i64, i32) DEF_HELPER_FLAGS_3(crc32c_64, TCG_CALL_NO_RWG_SE, i64, i64, i64, i32) -DEF_HELPER_5(casp_le_parallel, void, env, i32, i64, i64, i64) -DEF_HELPER_5(casp_be_parallel, void, env, i32, i64, i64, i64) DEF_HELPER_FLAGS_3(advsimd_maxh, TCG_CALL_NO_RWG, f16, f16, f16, ptr) DEF_HELPER_FLAGS_3(advsimd_minh, TCG_CALL_NO_RWG, f16, f16, f16, ptr) DEF_HELPER_FLAGS_3(advsimd_maxnumh, TCG_CALL_NO_RWG, f16, f16, f16, ptr) diff --git a/target/arm/helper-a64.c b/target/arm/helper-a64.c index 7dbdb2c233..0972a4bdd0 100644 --- a/target/arm/helper-a64.c +++ b/target/arm/helper-a64.c @@ -505,49 +505,6 @@ uint64_t HELPER(crc32c_64)(uint64_t acc, uint64_t val, uint32_t bytes) return crc32c(acc, buf, bytes) ^ 0x; } -/* Writes back the old data into Rs. 
*/ -void HELPER(casp_le_parallel)(CPUARMState *env, uint32_t rs, uint64_t addr, - uint64_t new_lo, uint64_t new_hi) -{ -Int128 oldv, cmpv, newv; -uintptr_t ra = GETPC(); -int mem_idx; -MemOpIdx oi; - -assert(HAVE_CMPXCHG128); - -mem_idx = cpu_mmu_index(env, false); -oi = make_memop_idx(MO_LE | MO_128 | MO_ALIGN, mem_idx); - -cmpv = int128_make128(env->xregs[rs], env->xregs[rs + 1]); -newv = int128_make128(new_lo, new_hi); -oldv = cpu_atomic_cmpxchgo_le_mmu(env, addr, cmpv, newv, oi, ra); - -env->xregs[rs] = int128_getlo(oldv); -env->xregs[rs + 1] = int128_gethi(oldv); -} - -void HELPER(casp_be_parallel)(CPUARMState *env, uint32_t rs, uint64_t addr, - uint64_t new_hi, uint64_t new_lo) -{ -Int128 oldv, cmpv, newv; -uintptr_t ra = GETPC(); -int mem_idx; -MemOpIdx oi; - -assert(HAVE_CMPXCHG128); - -mem_idx = cpu_mmu_index(env, false); -oi = make_memop_idx(MO_LE | MO_128 | MO_ALIGN, mem_idx); - -cmpv = int128_make128(env->xregs[rs + 1], env->xregs[rs]); -newv = int128_make128(new_lo, new_hi); -oldv = cpu_atomic_cmpxchgo_be_mmu(env, addr, cmpv, newv, oi, ra); - -env->xregs[rs + 1] = int128_getlo(oldv); -env->xregs[rs] = int128_gethi(oldv); -} - /* * AdvSIMD half-precision */ diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 951b64c9b1..da9f877476 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -2709,53 +2709,28 @@ static void gen_compare_and_swap_pair(DisasContext *s, int rs, int rt, tcg_gen_extr32_i64(s2, s1, cmp); } tcg_temp_free_i64(cmp); -} else if (tb_cflags(s->base.tb) & CF_PARALLEL) { -if (HAVE_CMPXCHG128) { -TCGv_i32 tcg_rs = tcg_constant_i32(rs); -if (s->be_data == MO_LE) { -gen_helper_casp_le_parallel(cpu_env, tcg_rs, -clean_addr, t1, t2); -} else { -gen_helper_casp_be_parallel(cpu_env, tcg_rs, -clean_addr, t1, t2); -} -} else { -gen_helper_exit_atomic(cpu_env); -s->base.is_jmp = DISAS_NORETURN; -} } else { -TCGv_i64 d1 = tcg_temp_new_i64(); -TCGv_i64 d2 = tcg_temp_new_i64(); -TCGv_i64 a2 = 
tcg_temp_new_i64(); -TCGv_i64 c1 = tcg_temp_new_i64(); -TCGv_i64 c2 = tcg_temp_new_i64(); -TCGv_i64 zero = tcg_constant_i64(0); +TCGv_i128 cmp = tcg_temp_new_i128(); +TCGv_i128 val = tcg_temp_new_i128(); -/* Load the two words, in memory order. */ -tcg_gen_qemu_ld_i64(d1, clean_addr, memidx, -MO_64 | MO_ALIGN_16 | s->be_data); -tcg_gen_addi_i64(a2, clean_addr, 8); -tcg_gen_qemu_ld_i64(d2, a2, memidx, MO_64 | s->be_data); +if (s->be_data == MO_LE) { +tcg_gen_concat_i64_i128(val, t1, t2); +tcg_gen_concat_i64_i128(cmp, s1, s2); +} else { +tcg_gen_concat_i64_i128(val, t2, t1); +tcg_gen_concat_i64_i128(cmp, s2, s1); +} -/* Compare the two words, also in memory order. */ -tcg_gen_setcond_i64(TCG_COND_EQ, c1, d1, s1); -tcg_gen_setcond_i64(TCG_COND_EQ, c2, d2, s2); -tcg_gen_and_i64(c2, c2, c1); +tcg_gen_atomic_cmpxchg_i128(cmp, clean_addr, cmp, val, memidx, +
[PULL 04/40] tcg: Handle dh_typecode_i128 with TCG_CALL_{RET, ARG}_NORMAL
Many hosts pass and return 128-bit quantities like sequential 64-bit quantities. Treat this just like we currently break down 64-bit quantities for a 32-bit host. Reviewed-by: Alex Bennée Signed-off-by: Richard Henderson --- tcg/tcg.c | 37 + 1 file changed, 33 insertions(+), 4 deletions(-) diff --git a/tcg/tcg.c b/tcg/tcg.c index bc60fd0fe8..bc7198e5d0 100644 --- a/tcg/tcg.c +++ b/tcg/tcg.c @@ -707,11 +707,22 @@ static void init_call_layout(TCGHelperInfo *info) case dh_typecode_s64: info->nr_out = 64 / TCG_TARGET_REG_BITS; info->out_kind = TCG_CALL_RET_NORMAL; +assert(info->nr_out <= ARRAY_SIZE(tcg_target_call_oarg_regs)); +break; +case dh_typecode_i128: +info->nr_out = 128 / TCG_TARGET_REG_BITS; +info->out_kind = TCG_CALL_RET_NORMAL; /* TODO */ +switch (/* TODO */ TCG_CALL_RET_NORMAL) { +case TCG_CALL_RET_NORMAL: +assert(info->nr_out <= ARRAY_SIZE(tcg_target_call_oarg_regs)); +break; +default: +qemu_build_not_reached(); +} break; default: g_assert_not_reached(); } -assert(info->nr_out <= ARRAY_SIZE(tcg_target_call_oarg_regs)); /* * Parse and place function arguments. 
@@ -733,6 +744,9 @@ static void init_call_layout(TCGHelperInfo *info) case dh_typecode_ptr: type = TCG_TYPE_PTR; break; +case dh_typecode_i128: +type = TCG_TYPE_I128; +break; default: g_assert_not_reached(); } @@ -772,6 +786,19 @@ static void init_call_layout(TCGHelperInfo *info) } break; +case TCG_TYPE_I128: +switch (/* TODO */ TCG_CALL_ARG_NORMAL) { +case TCG_CALL_ARG_EVEN: +layout_arg_even(&cum); +/* fall through */ +case TCG_CALL_ARG_NORMAL: +layout_arg_normal_n(&cum, info, 128 / TCG_TARGET_REG_BITS); +break; +default: +qemu_build_not_reached(); +} +break; + default: g_assert_not_reached(); } @@ -1692,11 +1719,13 @@ void tcg_gen_callN(void *func, TCGTemp *ret, int nargs, TCGTemp **args) op->args[pi++] = temp_arg(ret); break; case 2: +case 4: tcg_debug_assert(ret != NULL); -tcg_debug_assert(ret->base_type == ret->type + 1); +tcg_debug_assert(ret->base_type == ret->type + ctz32(n)); tcg_debug_assert(ret->temp_subindex == 0); -op->args[pi++] = temp_arg(ret); -op->args[pi++] = temp_arg(ret + 1); +for (i = 0; i < n; ++i) { +op->args[pi++] = temp_arg(ret + i); +} break; default: g_assert_not_reached(); -- 2.34.1
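As a reading aid for the layout change above, here is a rough standalone sketch of what "treat i128 like sequential 64-bit quantities" amounts to. The names split_u128 and log2_parts are made up for illustration and are not TCG functions. The low-part-first ordering is an assumption of the sketch; the real backends are expected to use host-endian ordering.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative only: split a 128-bit value, given as two 64-bit halves,
 * into host-register-sized parts.  Returns the number of parts, which is
 * 128 / reg_bits, the same quantity the patch stores in info->nr_out.
 * Assumption of this sketch: parts are written low part first.
 */
static unsigned split_u128(uint64_t lo, uint64_t hi, unsigned reg_bits,
                           uint64_t out[4])
{
    assert(reg_bits == 32 || reg_bits == 64);

    if (reg_bits == 64) {
        out[0] = lo;
        out[1] = hi;
        return 2;
    }
    out[0] = lo & 0xffffffffu;
    out[1] = lo >> 32;
    out[2] = hi & 0xffffffffu;
    out[3] = hi >> 32;
    return 4;
}

/*
 * Trailing-zero count of a power of two: 2 -> 1, 4 -> 2.  This is the
 * step that ctz32(n) contributes in the tcg_gen_callN assertion.
 */
static unsigned log2_parts(unsigned n)
{
    unsigned r = 0;

    assert(n != 0 && (n & (n - 1)) == 0);
    while ((n & 1) == 0) {
        n >>= 1;
        r++;
    }
    return r;
}
```

With two parts the step is 1 and with four parts it is 2, which is presumably why the assertion moved from "type + 1" to "type + ctz32(n)".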
[PULL 17/40] tcg: Add guest load/store primitives for TCGv_i128
These are not yet considering atomicity of the 16-byte value; this is a direct replacement for the current target code which uses a pair of 8-byte operations. Reviewed-by: Alex Bennée Signed-off-by: Richard Henderson --- include/exec/cpu_ldst.h | 10 +++ include/tcg/tcg-op.h| 2 + accel/tcg/cputlb.c | 112 + accel/tcg/user-exec.c | 66 tcg/tcg-op.c| 134 5 files changed, 324 insertions(+) diff --git a/include/exec/cpu_ldst.h b/include/exec/cpu_ldst.h index d0c7c0d5fe..09b55cc0ee 100644 --- a/include/exec/cpu_ldst.h +++ b/include/exec/cpu_ldst.h @@ -220,6 +220,11 @@ uint32_t cpu_ldl_le_mmu(CPUArchState *env, abi_ptr ptr, uint64_t cpu_ldq_le_mmu(CPUArchState *env, abi_ptr ptr, MemOpIdx oi, uintptr_t ra); +Int128 cpu_ld16_be_mmu(CPUArchState *env, abi_ptr addr, + MemOpIdx oi, uintptr_t ra); +Int128 cpu_ld16_le_mmu(CPUArchState *env, abi_ptr addr, + MemOpIdx oi, uintptr_t ra); + void cpu_stb_mmu(CPUArchState *env, abi_ptr ptr, uint8_t val, MemOpIdx oi, uintptr_t ra); void cpu_stw_be_mmu(CPUArchState *env, abi_ptr ptr, uint16_t val, @@ -235,6 +240,11 @@ void cpu_stl_le_mmu(CPUArchState *env, abi_ptr ptr, uint32_t val, void cpu_stq_le_mmu(CPUArchState *env, abi_ptr ptr, uint64_t val, MemOpIdx oi, uintptr_t ra); +void cpu_st16_be_mmu(CPUArchState *env, abi_ptr addr, Int128 val, + MemOpIdx oi, uintptr_t ra); +void cpu_st16_le_mmu(CPUArchState *env, abi_ptr addr, Int128 val, + MemOpIdx oi, uintptr_t ra); + uint32_t cpu_atomic_cmpxchgb_mmu(CPUArchState *env, target_ulong addr, uint32_t cmpv, uint32_t newv, MemOpIdx oi, uintptr_t retaddr); diff --git a/include/tcg/tcg-op.h b/include/tcg/tcg-op.h index c4276767d1..e5f5b63c37 100644 --- a/include/tcg/tcg-op.h +++ b/include/tcg/tcg-op.h @@ -845,6 +845,8 @@ void tcg_gen_qemu_ld_i32(TCGv_i32, TCGv, TCGArg, MemOp); void tcg_gen_qemu_st_i32(TCGv_i32, TCGv, TCGArg, MemOp); void tcg_gen_qemu_ld_i64(TCGv_i64, TCGv, TCGArg, MemOp); void tcg_gen_qemu_st_i64(TCGv_i64, TCGv, TCGArg, MemOp); +void tcg_gen_qemu_ld_i128(TCGv_i128, TCGv, TCGArg, 
MemOp); +void tcg_gen_qemu_st_i128(TCGv_i128, TCGv, TCGArg, MemOp); static inline void tcg_gen_qemu_ld8u(TCGv ret, TCGv addr, int mem_index) { diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c index 04e270742e..4812d83961 100644 --- a/accel/tcg/cputlb.c +++ b/accel/tcg/cputlb.c @@ -2192,6 +2192,64 @@ uint64_t cpu_ldq_le_mmu(CPUArchState *env, abi_ptr addr, return cpu_load_helper(env, addr, oi, ra, helper_le_ldq_mmu); } +Int128 cpu_ld16_be_mmu(CPUArchState *env, abi_ptr addr, + MemOpIdx oi, uintptr_t ra) +{ +MemOp mop = get_memop(oi); +int mmu_idx = get_mmuidx(oi); +MemOpIdx new_oi; +unsigned a_bits; +uint64_t h, l; + +tcg_debug_assert((mop & (MO_BSWAP|MO_SSIZE)) == (MO_BE|MO_128)); +a_bits = get_alignment_bits(mop); + +/* Handle CPU specific unaligned behaviour */ +if (addr & ((1 << a_bits) - 1)) { +cpu_unaligned_access(env_cpu(env), addr, MMU_DATA_LOAD, + mmu_idx, ra); +} + +/* Construct an unaligned 64-bit replacement MemOpIdx. */ +mop = (mop & ~(MO_SIZE | MO_AMASK)) | MO_64 | MO_UNALN; +new_oi = make_memop_idx(mop, mmu_idx); + +h = helper_be_ldq_mmu(env, addr, new_oi, ra); +l = helper_be_ldq_mmu(env, addr + 8, new_oi, ra); + +qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); +return int128_make128(l, h); +} + +Int128 cpu_ld16_le_mmu(CPUArchState *env, abi_ptr addr, + MemOpIdx oi, uintptr_t ra) +{ +MemOp mop = get_memop(oi); +int mmu_idx = get_mmuidx(oi); +MemOpIdx new_oi; +unsigned a_bits; +uint64_t h, l; + +tcg_debug_assert((mop & (MO_BSWAP|MO_SSIZE)) == (MO_LE|MO_128)); +a_bits = get_alignment_bits(mop); + +/* Handle CPU specific unaligned behaviour */ +if (addr & ((1 << a_bits) - 1)) { +cpu_unaligned_access(env_cpu(env), addr, MMU_DATA_LOAD, + mmu_idx, ra); +} + +/* Construct an unaligned 64-bit replacement MemOpIdx. 
*/ +mop = (mop & ~(MO_SIZE | MO_AMASK)) | MO_64 | MO_UNALN; +new_oi = make_memop_idx(mop, mmu_idx); + +l = helper_le_ldq_mmu(env, addr, new_oi, ra); +h = helper_le_ldq_mmu(env, addr + 8, new_oi, ra); + +qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); +return int128_make128(l, h); +} + /* * Store Helpers */ @@ -2546,6 +2604,60 @@ void cpu_stq_le_mmu(CPUArchState *env, target_ulong addr, uint64_t val, cpu_store_helper(env, addr, val, oi, retaddr, helper_le_stq_mmu); } +void cpu_st16_be_mmu(CPUArchState *env, abi_ptr addr, Int128 val, + MemOpIdx o
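The load helpers above differ only in which 8-byte word becomes the high half. A minimal sketch of that ordering on a plain byte buffer follows. U128, ldq_be, ldq_le, ld16_be and ld16_le are hypothetical stand-ins, not the QEMU helpers, and the sketch ignores alignment checks, the MMU and plugin callbacks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for Int128: explicit low and high halves. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} U128;

/* Read 8 bytes as a big-endian 64-bit value. */
static uint64_t ldq_be(const uint8_t *p)
{
    uint64_t v = 0;

    for (size_t i = 0; i < 8; i++) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* Read 8 bytes as a little-endian 64-bit value. */
static uint64_t ldq_le(const uint8_t *p)
{
    uint64_t v = 0;

    for (size_t i = 0; i < 8; i++) {
        v |= (uint64_t)p[i] << (8 * i);
    }
    return v;
}

/* Big-endian: the word at the lower address is the high half. */
static U128 ld16_be(const uint8_t *p)
{
    U128 r;

    assert(p != NULL);
    r.hi = ldq_be(p);
    r.lo = ldq_be(p + 8);
    return r;
}

/* Little-endian: the word at the lower address is the low half. */
static U128 ld16_le(const uint8_t *p)
{
    U128 r;

    assert(p != NULL);
    r.lo = ldq_le(p);
    r.hi = ldq_le(p + 8);
    return r;
}
```

As in the patch, this is a pair of 8-byte accesses and says nothing about atomicity of the 16-byte value.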
[PULL 12/40] tcg/tci: Fix big-endian return register ordering
We expect the backend to require register pairs in host-endian ordering, thus for big-endian the first register of a pair contains the high part. We were forcing R0 to contain the low part for calls. Reviewed-by: Alex Bennée Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- tcg/tci.c | 21 +++-- 1 file changed, 11 insertions(+), 10 deletions(-) diff --git a/tcg/tci.c b/tcg/tci.c index 05a24163d3..eeccdde8bc 100644 --- a/tcg/tci.c +++ b/tcg/tci.c @@ -520,27 +520,28 @@ uintptr_t QEMU_DISABLE_CFI tcg_qemu_tb_exec(CPUArchState *env, ffi_call(pptr[1], pptr[0], stack, call_slots); } -/* Any result winds up "left-aligned" in the stack[0] slot. */ switch (len) { case 0: /* void */ break; case 1: /* uint32_t */ /* + * The result winds up "left-aligned" in the stack[0] slot. * Note that libffi has an odd special case in that it will * always widen an integral result to ffi_arg. */ -if (sizeof(ffi_arg) == 4) { -regs[TCG_REG_R0] = *(uint32_t *)stack; -break; -} -/* fall through */ -case 2: /* uint64_t */ -if (TCG_TARGET_REG_BITS == 32) { -tci_write_reg64(regs, TCG_REG_R1, TCG_REG_R0, stack[0]); +if (sizeof(ffi_arg) == 8) { +regs[TCG_REG_R0] = (uint32_t)stack[0]; } else { -regs[TCG_REG_R0] = stack[0]; +regs[TCG_REG_R0] = *(uint32_t *)stack; } break; +case 2: /* uint64_t */ +/* + * For TCG_TARGET_REG_BITS == 32, the register pair + * must stay in host memory order. + */ +memcpy(&regs[TCG_REG_R0], stack, 8); +break; default: g_assert_not_reached(); } -- 2.34.1
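To see why a plain memcpy gives host-endian pair ordering, the following small sketch may help. It copies a 64-bit result into two 32-bit slots and checks which slot gets the high part. host_is_big_endian and write_ret_pair are names invented for this note, and the two-element array only stands in for a 32-bit R0/R1 pair.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Probe host byte order at run time: first byte of the value 1. */
static int host_is_big_endian(void)
{
    const uint16_t probe = 1;
    uint8_t first;

    memcpy(&first, &probe, 1);
    return first == 0;
}

/*
 * Copy a 64-bit result into a pair of 32-bit register slots without
 * reordering, so the pair stays in host memory order.
 */
static void write_ret_pair(uint32_t regs[2], uint64_t result)
{
    assert(regs != NULL);
    memcpy(regs, &result, sizeof(result));
}
```

On a little-endian host regs[0] ends up holding the low half, on a big-endian host the high half, which is the behaviour the commit message asks for.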
[PULL 15/40] tcg: Add temp allocation for TCGv_i128
This enables allocation of i128. The type is not yet usable, as we have not yet added data movement ops. Reviewed-by: Alex Bennée Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- include/tcg/tcg.h | 32 + tcg/tcg.c | 60 +-- 2 files changed, 74 insertions(+), 18 deletions(-) diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h index 4d7e4107a9..59854f95b1 100644 --- a/include/tcg/tcg.h +++ b/include/tcg/tcg.h @@ -687,6 +687,11 @@ static inline TCGTemp *tcgv_i64_temp(TCGv_i64 v) return tcgv_i32_temp((TCGv_i32)v); } +static inline TCGTemp *tcgv_i128_temp(TCGv_i128 v) +{ +return tcgv_i32_temp((TCGv_i32)v); +} + static inline TCGTemp *tcgv_ptr_temp(TCGv_ptr v) { return tcgv_i32_temp((TCGv_i32)v); @@ -707,6 +712,11 @@ static inline TCGArg tcgv_i64_arg(TCGv_i64 v) return temp_arg(tcgv_i64_temp(v)); } +static inline TCGArg tcgv_i128_arg(TCGv_i128 v) +{ +return temp_arg(tcgv_i128_temp(v)); +} + static inline TCGArg tcgv_ptr_arg(TCGv_ptr v) { return temp_arg(tcgv_ptr_temp(v)); @@ -728,6 +738,11 @@ static inline TCGv_i64 temp_tcgv_i64(TCGTemp *t) return (TCGv_i64)temp_tcgv_i32(t); } +static inline TCGv_i128 temp_tcgv_i128(TCGTemp *t) +{ +return (TCGv_i128)temp_tcgv_i32(t); +} + static inline TCGv_ptr temp_tcgv_ptr(TCGTemp *t) { return (TCGv_ptr)temp_tcgv_i32(t); @@ -853,6 +868,11 @@ static inline void tcg_temp_free_i64(TCGv_i64 arg) tcg_temp_free_internal(tcgv_i64_temp(arg)); } +static inline void tcg_temp_free_i128(TCGv_i128 arg) +{ +tcg_temp_free_internal(tcgv_i128_temp(arg)); +} + static inline void tcg_temp_free_ptr(TCGv_ptr arg) { tcg_temp_free_internal(tcgv_ptr_temp(arg)); @@ -901,6 +921,18 @@ static inline TCGv_i64 tcg_temp_local_new_i64(void) return temp_tcgv_i64(t); } +static inline TCGv_i128 tcg_temp_new_i128(void) +{ +TCGTemp *t = tcg_temp_new_internal(TCG_TYPE_I128, false); +return temp_tcgv_i128(t); +} + +static inline TCGv_i128 tcg_temp_local_new_i128(void) +{ +TCGTemp *t = tcg_temp_new_internal(TCG_TYPE_I128, true); +return 
temp_tcgv_i128(t); +} + static inline TCGv_ptr tcg_global_mem_new_ptr(TCGv_ptr reg, intptr_t offset, const char *name) { diff --git a/tcg/tcg.c b/tcg/tcg.c index 163913c95f..a4a3da6804 100644 --- a/tcg/tcg.c +++ b/tcg/tcg.c @@ -1273,26 +1273,45 @@ TCGTemp *tcg_temp_new_internal(TCGType type, bool temp_local) tcg_debug_assert(ts->base_type == type); tcg_debug_assert(ts->kind == kind); } else { +int i, n; + +switch (type) { +case TCG_TYPE_I32: +case TCG_TYPE_V64: +case TCG_TYPE_V128: +case TCG_TYPE_V256: +n = 1; +break; +case TCG_TYPE_I64: +n = 64 / TCG_TARGET_REG_BITS; +break; +case TCG_TYPE_I128: +n = 128 / TCG_TARGET_REG_BITS; +break; +default: +g_assert_not_reached(); +} + ts = tcg_temp_alloc(s); -if (TCG_TARGET_REG_BITS == 32 && type == TCG_TYPE_I64) { -TCGTemp *ts2 = tcg_temp_alloc(s); +ts->base_type = type; +ts->temp_allocated = 1; +ts->kind = kind; -ts->base_type = type; -ts->type = TCG_TYPE_I32; -ts->temp_allocated = 1; -ts->kind = kind; - -tcg_debug_assert(ts2 == ts + 1); -ts2->base_type = TCG_TYPE_I64; -ts2->type = TCG_TYPE_I32; -ts2->temp_allocated = 1; -ts2->temp_subindex = 1; -ts2->kind = kind; -} else { -ts->base_type = type; +if (n == 1) { ts->type = type; -ts->temp_allocated = 1; -ts->kind = kind; +} else { +ts->type = TCG_TYPE_REG; + +for (i = 1; i < n; ++i) { +TCGTemp *ts2 = tcg_temp_alloc(s); + +tcg_debug_assert(ts2 == ts + i); +ts2->base_type = type; +ts2->type = TCG_TYPE_REG; +ts2->temp_allocated = 1; +ts2->temp_subindex = i; +ts2->kind = kind; +} } } @@ -3384,9 +3403,14 @@ static void temp_allocate_frame(TCGContext *s, TCGTemp *ts) case TCG_TYPE_V64: align = 8; break; +case TCG_TYPE_I128: case TCG_TYPE_V128: case TCG_TYPE_V256: -/* Note that we do not require aligned storage for V256. */ +/* + * Note that we do not require aligned storage for V256, + * and that we provide alignment for I128 to match V128, + * even if that's above what the host ABI requires. + */ align = 16; break; default: -- 2.34.1
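The reworked allocation loop can be pictured with a toy pool: a value wider than a host register takes n consecutive slots that share a base type and carry subindex 0..n-1. The sketch below is an illustration under that assumption only. Ty, Temp, Pool, pool_init, parts_for and temp_new are hypothetical names, and the fixed 16-slot pool has no counterpart in TCG.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { TY_I32, TY_I64, TY_I128 } Ty;   /* hypothetical type codes */

typedef struct {
    Ty base_type;    /* type of the whole value */
    int subindex;    /* which part of the value this slot holds */
    int allocated;
} Temp;

typedef struct {
    Temp temps[16];
    size_t nb_temps;
} Pool;

static void pool_init(Pool *pool)
{
    pool->nb_temps = 0;
}

/* Number of host-register-sized parts needed for a type. */
static size_t parts_for(Ty type, unsigned reg_bits)
{
    switch (type) {
    case TY_I32:
        return 1;
    case TY_I64:
        return 64 / reg_bits;
    case TY_I128:
        return 128 / reg_bits;
    }
    return 0;
}

/* Allocate n consecutive slots; returns the first, or NULL if full. */
static Temp *temp_new(Pool *pool, Ty type, unsigned reg_bits)
{
    const size_t cap = sizeof(pool->temps) / sizeof(pool->temps[0]);
    size_t n;
    Temp *first;

    assert(reg_bits == 32 || reg_bits == 64);
    n = parts_for(type, reg_bits);
    if (n == 0 || pool->nb_temps + n > cap) {
        return NULL;
    }

    first = &pool->temps[pool->nb_temps];
    for (size_t i = 0; i < n; i++) {
        first[i].base_type = type;
        first[i].subindex = (int)i;
        first[i].allocated = 1;
    }
    pool->nb_temps += n;
    return first;
}
```

An i128 on a 64-bit host takes two slots and on a 32-bit host four, matching n = 128 / TCG_TARGET_REG_BITS in the patch.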
[PULL 35/40] target/s390x: Use tcg_gen_atomic_cmpxchg_i128 for CDSG
Acked-by: Ilya Leoshkevich Signed-off-by: Richard Henderson --- target/s390x/helper.h| 2 -- target/s390x/tcg/insn-data.h.inc | 2 +- target/s390x/tcg/mem_helper.c| 52 -- target/s390x/tcg/translate.c | 55 +++- 4 files changed, 33 insertions(+), 78 deletions(-) diff --git a/target/s390x/helper.h b/target/s390x/helper.h index bccd3bfca6..341bc51ec2 100644 --- a/target/s390x/helper.h +++ b/target/s390x/helper.h @@ -35,8 +35,6 @@ DEF_HELPER_3(cxgb, i128, env, s64, i32) DEF_HELPER_3(celgb, i64, env, i64, i32) DEF_HELPER_3(cdlgb, i64, env, i64, i32) DEF_HELPER_3(cxlgb, i128, env, i64, i32) -DEF_HELPER_4(cdsg, void, env, i64, i32, i32) -DEF_HELPER_4(cdsg_parallel, void, env, i64, i32, i32) DEF_HELPER_4(csst, i32, env, i32, i64, i64) DEF_HELPER_4(csst_parallel, i32, env, i32, i64, i64) DEF_HELPER_FLAGS_3(aeb, TCG_CALL_NO_WG, i64, env, i64, i64) diff --git a/target/s390x/tcg/insn-data.h.inc b/target/s390x/tcg/insn-data.h.inc index 893f4b48db..9d2d35f084 100644 --- a/target/s390x/tcg/insn-data.h.inc +++ b/target/s390x/tcg/insn-data.h.inc @@ -276,7 +276,7 @@ /* COMPARE DOUBLE AND SWAP */ D(0xbb00, CDS, RS_a, Z, r3_D32, r1_D32, new, r1_D32, cs, 0, MO_TEUQ) D(0xeb31, CDSY,RSY_a, LD, r3_D32, r1_D32, new, r1_D32, cs, 0, MO_TEUQ) -C(0xeb3e, CDSG,RSY_a, Z, 0, 0, 0, 0, cdsg, 0) +C(0xeb3e, CDSG,RSY_a, Z, la2, r3_D64, 0, r1_D64, cdsg, 0) /* COMPARE AND SWAP AND STORE */ C(0xc802, CSST,SSF, CASS, la1, a2, 0, 0, csst, 0) diff --git a/target/s390x/tcg/mem_helper.c b/target/s390x/tcg/mem_helper.c index 49969abda7..d6725fd18c 100644 --- a/target/s390x/tcg/mem_helper.c +++ b/target/s390x/tcg/mem_helper.c @@ -1771,58 +1771,6 @@ uint32_t HELPER(trXX)(CPUS390XState *env, uint32_t r1, uint32_t r2, return cc; } -void HELPER(cdsg)(CPUS390XState *env, uint64_t addr, - uint32_t r1, uint32_t r3) -{ -uintptr_t ra = GETPC(); -Int128 cmpv = int128_make128(env->regs[r1 + 1], env->regs[r1]); -Int128 newv = int128_make128(env->regs[r3 + 1], env->regs[r3]); -Int128 oldv; -uint64_t oldh, oldl; -bool fail; - 
-check_alignment(env, addr, 16, ra); - -oldh = cpu_ldq_data_ra(env, addr + 0, ra); -oldl = cpu_ldq_data_ra(env, addr + 8, ra); - -oldv = int128_make128(oldl, oldh); -fail = !int128_eq(oldv, cmpv); -if (fail) { -newv = oldv; -} - -cpu_stq_data_ra(env, addr + 0, int128_gethi(newv), ra); -cpu_stq_data_ra(env, addr + 8, int128_getlo(newv), ra); - -env->cc_op = fail; -env->regs[r1] = int128_gethi(oldv); -env->regs[r1 + 1] = int128_getlo(oldv); -} - -void HELPER(cdsg_parallel)(CPUS390XState *env, uint64_t addr, - uint32_t r1, uint32_t r3) -{ -uintptr_t ra = GETPC(); -Int128 cmpv = int128_make128(env->regs[r1 + 1], env->regs[r1]); -Int128 newv = int128_make128(env->regs[r3 + 1], env->regs[r3]); -int mem_idx; -MemOpIdx oi; -Int128 oldv; -bool fail; - -assert(HAVE_CMPXCHG128); - -mem_idx = cpu_mmu_index(env, false); -oi = make_memop_idx(MO_TE | MO_128 | MO_ALIGN, mem_idx); -oldv = cpu_atomic_cmpxchgo_be_mmu(env, addr, cmpv, newv, oi, ra); -fail = !int128_eq(oldv, cmpv); - -env->cc_op = fail; -env->regs[r1] = int128_gethi(oldv); -env->regs[r1 + 1] = int128_getlo(oldv); -} - static uint32_t do_csst(CPUS390XState *env, uint32_t r3, uint64_t a1, uint64_t a2, bool parallel) { diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c index d422a1e62b..9ea28b3e52 100644 --- a/target/s390x/tcg/translate.c +++ b/target/s390x/tcg/translate.c @@ -2224,31 +2224,25 @@ static DisasJumpType op_cs(DisasContext *s, DisasOps *o) static DisasJumpType op_cdsg(DisasContext *s, DisasOps *o) { int r1 = get_field(s, r1); -int r3 = get_field(s, r3); -int d2 = get_field(s, d2); -int b2 = get_field(s, b2); -DisasJumpType ret = DISAS_NEXT; -TCGv_i64 addr; -TCGv_i32 t_r1, t_r3; -/* Note that R1:R1+1 = expected value and R3:R3+1 = new value. 
*/ -addr = get_address(s, 0, b2, d2); -t_r1 = tcg_const_i32(r1); -t_r3 = tcg_const_i32(r3); -if (!(tb_cflags(s->base.tb) & CF_PARALLEL)) { -gen_helper_cdsg(cpu_env, addr, t_r1, t_r3); -} else if (HAVE_CMPXCHG128) { -gen_helper_cdsg_parallel(cpu_env, addr, t_r1, t_r3); -} else { -gen_helper_exit_atomic(cpu_env); -ret = DISAS_NORETURN; -} -tcg_temp_free_i64(addr); -tcg_temp_free_i32(t_r1); -tcg_temp_free_i32(t_r3); +o->out_128 = tcg_temp_new_i128(); +tcg_gen_concat_i64_i128(o->out_128, regs[r1 + 1], regs[r1]); -set_cc_static(s); -return ret; +/* Note out (R1:R1+1) = expected value and in2 (R3:R3+1) = new value. */ +tcg_gen_atomic_cmpxchg_i128(o->out_128, o->addr1, o->out_128, o->in2_128, +get_mem_index(s), MO
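For reference, the guest-visible effect of CDSG that the removed helpers computed can be modelled in a few lines. This is a non-atomic sketch, not the translator code. Cpu and cdsg_model are hypothetical names, mem[0] and mem[1] stand for the doublewords at the address and at address + 8 after big-endian conversion, and the even-register assertion is an assumption of the sketch. Unlike the removed serial helper, the model skips the store on a mismatch instead of writing the old value back.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical guest state: sixteen 64-bit general registers. */
typedef struct {
    uint64_t regs[16];
} Cpu;

/*
 * R1:R1+1 hold the expected value, R3:R3+1 the new value.
 * Returns the condition code: 0 if the swap happened, 1 otherwise.
 * R1:R1+1 always receive the old memory contents.
 */
static int cdsg_model(Cpu *cpu, unsigned r1, unsigned r3, uint64_t mem[2])
{
    uint64_t old_hi, old_lo;
    int fail;

    assert(r1 < 16 && r3 < 16 && r1 % 2 == 0 && r3 % 2 == 0);

    old_hi = mem[0];
    old_lo = mem[1];
    fail = old_hi != cpu->regs[r1] || old_lo != cpu->regs[r1 + 1];

    if (!fail) {
        mem[0] = cpu->regs[r3];
        mem[1] = cpu->regs[r3 + 1];
    }
    cpu->regs[r1] = old_hi;
    cpu->regs[r1 + 1] = old_lo;
    return fail;
}
```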
[PULL 37/40] target/i386: Split out gen_cmpxchg8b, gen_cmpxchg16b
Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- target/i386/tcg/translate.c | 48 - 1 file changed, 31 insertions(+), 17 deletions(-) diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c index 7e0b2a709a..a82131d635 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -2993,6 +2993,34 @@ static void gen_sty_env_A0(DisasContext *s, int offset, bool align) #include "emit.c.inc" #include "decode-new.c.inc" +static void gen_cmpxchg8b(DisasContext *s, CPUX86State *env, int modrm) +{ +gen_lea_modrm(env, s, modrm); + +if ((s->prefix & PREFIX_LOCK) && +(tb_cflags(s->base.tb) & CF_PARALLEL)) { +gen_helper_cmpxchg8b(cpu_env, s->A0); +} else { +gen_helper_cmpxchg8b_unlocked(cpu_env, s->A0); +} +set_cc_op(s, CC_OP_EFLAGS); +} + +#ifdef TARGET_X86_64 +static void gen_cmpxchg16b(DisasContext *s, CPUX86State *env, int modrm) +{ +gen_lea_modrm(env, s, modrm); + +if ((s->prefix & PREFIX_LOCK) && +(tb_cflags(s->base.tb) & CF_PARALLEL)) { +gen_helper_cmpxchg16b(cpu_env, s->A0); +} else { +gen_helper_cmpxchg16b_unlocked(cpu_env, s->A0); +} +set_cc_op(s, CC_OP_EFLAGS); +} +#endif + /* convert one instruction. s->base.is_jmp is set if the translation must be stopped. 
Return the next pc value */ static bool disas_insn(DisasContext *s, CPUState *cpu) @@ -3844,28 +3872,14 @@ static bool disas_insn(DisasContext *s, CPUState *cpu) if (!(s->cpuid_ext_features & CPUID_EXT_CX16)) { goto illegal_op; } -gen_lea_modrm(env, s, modrm); -if ((s->prefix & PREFIX_LOCK) && -(tb_cflags(s->base.tb) & CF_PARALLEL)) { -gen_helper_cmpxchg16b(cpu_env, s->A0); -} else { -gen_helper_cmpxchg16b_unlocked(cpu_env, s->A0); -} -set_cc_op(s, CC_OP_EFLAGS); +gen_cmpxchg16b(s, env, modrm); break; } -#endif +#endif if (!(s->cpuid_features & CPUID_CX8)) { goto illegal_op; } -gen_lea_modrm(env, s, modrm); -if ((s->prefix & PREFIX_LOCK) && -(tb_cflags(s->base.tb) & CF_PARALLEL)) { -gen_helper_cmpxchg8b(cpu_env, s->A0); -} else { -gen_helper_cmpxchg8b_unlocked(cpu_env, s->A0); -} -set_cc_op(s, CC_OP_EFLAGS); +gen_cmpxchg8b(s, env, modrm); break; case 7: /* RDSEED */ -- 2.34.1
[PULL 09/40] tcg: Add TCG_CALL_RET_BY_VEC
This will be used by _WIN64 to return i128. Not yet used, because allocation is not yet enabled. Reviewed-by: Alex Bennée Signed-off-by: Richard Henderson --- tcg/tcg-internal.h | 1 + tcg/tcg.c | 19 +++ 2 files changed, 20 insertions(+) diff --git a/tcg/tcg-internal.h b/tcg/tcg-internal.h index 2ec1ea01df..33f1d8b411 100644 --- a/tcg/tcg-internal.h +++ b/tcg/tcg-internal.h @@ -37,6 +37,7 @@ typedef enum { TCG_CALL_RET_NORMAL, /* by registers */ TCG_CALL_RET_BY_REF, /* for i128, by reference */ +TCG_CALL_RET_BY_VEC, /* for i128, by vector register */ } TCGCallReturnKind; typedef enum { diff --git a/tcg/tcg.c b/tcg/tcg.c index a77483eee8..098be83b00 100644 --- a/tcg/tcg.c +++ b/tcg/tcg.c @@ -752,6 +752,10 @@ static void init_call_layout(TCGHelperInfo *info) /* Query the last register now to trigger any assert early. */ tcg_target_call_oarg_reg(info->out_kind, info->nr_out - 1); break; +case TCG_CALL_RET_BY_VEC: +/* Query the single register now to trigger any assert early. */ +tcg_target_call_oarg_reg(TCG_CALL_RET_BY_VEC, 0); +break; case TCG_CALL_RET_BY_REF: /* * Allocate the first argument to the output. @@ -4605,6 +4609,21 @@ static void tcg_reg_alloc_call(TCGContext *s, TCGOp *op) } break; +case TCG_CALL_RET_BY_VEC: +{ +TCGTemp *ts = arg_temp(op->args[0]); + +tcg_debug_assert(ts->base_type == TCG_TYPE_I128); +tcg_debug_assert(ts->temp_subindex == 0); +if (!ts->mem_allocated) { +temp_allocate_frame(s, ts); +} +tcg_out_st(s, TCG_TYPE_V128, + tcg_target_call_oarg_reg(TCG_CALL_RET_BY_VEC, 0), + ts->mem_base->reg, ts->mem_offset); +} +/* fall through to mark all parts in memory */ + case TCG_CALL_RET_BY_REF: /* The callee has performed a write through the reference. */ for (i = 0; i < nb_oargs; i++) { -- 2.34.1
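The three return kinds can be summarised as a small table of "how many output registers" and "is there a hidden pointer argument". The sketch below encodes one reading of the patch and is not the TCG implementation. RetKind, RetLayout and i128_return_layout are hypothetical names, and the hidden-argument count for the by-reference case is inferred from the "Allocate the first argument to the output" comment.

```c
#include <assert.h>

/* Simplified mirror of the TCG_CALL_RET_* kinds, for illustration. */
typedef enum { RET_NORMAL, RET_BY_REF, RET_BY_VEC } RetKind;

typedef struct {
    unsigned out_regs;     /* registers carrying the result */
    unsigned hidden_args;  /* extra pointer arguments for the result */
} RetLayout;

static RetLayout i128_return_layout(RetKind kind, unsigned reg_bits)
{
    RetLayout l = { 0, 0 };

    assert(reg_bits == 32 || reg_bits == 64);

    switch (kind) {
    case RET_NORMAL:
        /* Result split across sequential integer registers. */
        l.out_regs = 128 / reg_bits;
        break;
    case RET_BY_REF:
        /* Callee writes through a pointer passed as first argument. */
        l.hidden_args = 1;
        break;
    case RET_BY_VEC:
        /* Result arrives in a single vector register. */
        l.out_regs = 1;
        break;
    }
    return l;
}
```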
[PULL 38/40] target/i386: Inline cmpxchg8b
Use tcg_gen_atomic_cmpxchg_i64 for the atomic case, and tcg_gen_nonatomic_cmpxchg_i64 otherwise. Reviewed-by: Alex Bennée Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- target/i386/helper.h | 2 -- target/i386/tcg/mem_helper.c | 57 target/i386/tcg/translate.c | 54 ++ 3 files changed, 49 insertions(+), 64 deletions(-) diff --git a/target/i386/helper.h b/target/i386/helper.h index b7de5429ef..2df8049f91 100644 --- a/target/i386/helper.h +++ b/target/i386/helper.h @@ -66,8 +66,6 @@ DEF_HELPER_1(rsm, void, env) #endif /* !CONFIG_USER_ONLY */ DEF_HELPER_2(into, void, env, int) -DEF_HELPER_2(cmpxchg8b_unlocked, void, env, tl) -DEF_HELPER_2(cmpxchg8b, void, env, tl) #ifdef TARGET_X86_64 DEF_HELPER_2(cmpxchg16b_unlocked, void, env, tl) DEF_HELPER_2(cmpxchg16b, void, env, tl) diff --git a/target/i386/tcg/mem_helper.c b/target/i386/tcg/mem_helper.c index e3cdafd2d4..814786bb87 100644 --- a/target/i386/tcg/mem_helper.c +++ b/target/i386/tcg/mem_helper.c @@ -27,63 +27,6 @@ #include "tcg/tcg.h" #include "helper-tcg.h" -void helper_cmpxchg8b_unlocked(CPUX86State *env, target_ulong a0) -{ -uintptr_t ra = GETPC(); -uint64_t oldv, cmpv, newv; -int eflags; - -eflags = cpu_cc_compute_all(env, CC_OP); - -cmpv = deposit64(env->regs[R_EAX], 32, 32, env->regs[R_EDX]); -newv = deposit64(env->regs[R_EBX], 32, 32, env->regs[R_ECX]); - -oldv = cpu_ldq_data_ra(env, a0, ra); -newv = (cmpv == oldv ? 
newv : oldv); -/* always do the store */ -cpu_stq_data_ra(env, a0, newv, ra); - -if (oldv == cmpv) { -eflags |= CC_Z; -} else { -env->regs[R_EAX] = (uint32_t)oldv; -env->regs[R_EDX] = (uint32_t)(oldv >> 32); -eflags &= ~CC_Z; -} -CC_SRC = eflags; -} - -void helper_cmpxchg8b(CPUX86State *env, target_ulong a0) -{ -#ifdef CONFIG_ATOMIC64 -uint64_t oldv, cmpv, newv; -int eflags; - -eflags = cpu_cc_compute_all(env, CC_OP); - -cmpv = deposit64(env->regs[R_EAX], 32, 32, env->regs[R_EDX]); -newv = deposit64(env->regs[R_EBX], 32, 32, env->regs[R_ECX]); - -{ -uintptr_t ra = GETPC(); -int mem_idx = cpu_mmu_index(env, false); -MemOpIdx oi = make_memop_idx(MO_TEUQ, mem_idx); -oldv = cpu_atomic_cmpxchgq_le_mmu(env, a0, cmpv, newv, oi, ra); -} - -if (oldv == cmpv) { -eflags |= CC_Z; -} else { -env->regs[R_EAX] = (uint32_t)oldv; -env->regs[R_EDX] = (uint32_t)(oldv >> 32); -eflags &= ~CC_Z; -} -CC_SRC = eflags; -#else -cpu_loop_exit_atomic(env_cpu(env), GETPC()); -#endif /* CONFIG_ATOMIC64 */ -} - #ifdef TARGET_X86_64 void helper_cmpxchg16b_unlocked(CPUX86State *env, target_ulong a0) { diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c index a82131d635..b542b084a6 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -2995,15 +2995,59 @@ static void gen_sty_env_A0(DisasContext *s, int offset, bool align) static void gen_cmpxchg8b(DisasContext *s, CPUX86State *env, int modrm) { +TCGv_i64 cmp, val, old; +TCGv Z; + gen_lea_modrm(env, s, modrm); -if ((s->prefix & PREFIX_LOCK) && -(tb_cflags(s->base.tb) & CF_PARALLEL)) { -gen_helper_cmpxchg8b(cpu_env, s->A0); +cmp = tcg_temp_new_i64(); +val = tcg_temp_new_i64(); +old = tcg_temp_new_i64(); + +/* Construct the comparison values from the register pair. */ +tcg_gen_concat_tl_i64(cmp, cpu_regs[R_EAX], cpu_regs[R_EDX]); +tcg_gen_concat_tl_i64(val, cpu_regs[R_EBX], cpu_regs[R_ECX]); + +/* Only require atomic with LOCK; non-parallel handled in generator. 
*/ +if (s->prefix & PREFIX_LOCK) { +tcg_gen_atomic_cmpxchg_i64(old, s->A0, cmp, val, s->mem_index, MO_TEUQ); } else { -gen_helper_cmpxchg8b_unlocked(cpu_env, s->A0); +tcg_gen_nonatomic_cmpxchg_i64(old, s->A0, cmp, val, + s->mem_index, MO_TEUQ); } -set_cc_op(s, CC_OP_EFLAGS); +tcg_temp_free_i64(val); + +/* Set tmp0 to match the required value of Z. */ +tcg_gen_setcond_i64(TCG_COND_EQ, cmp, old, cmp); +Z = tcg_temp_new(); +tcg_gen_trunc_i64_tl(Z, cmp); +tcg_temp_free_i64(cmp); + +/* + * Extract the result values for the register pair. + * For 32-bit, we may do this unconditionally, because on success (Z=1), + * the old value matches the previous value in EDX:EAX. For x86_64, + * the store must be conditional, because we must leave the source + * registers unchanged on success, and zero-extend the writeback + * on failure (Z=0). + */ +if (TARGET_LONG_BITS == 32) { +tcg_gen_extr_i64_tl(cpu_regs[R_EAX], cpu_regs[R_EDX], old); +} else { +TCGv zero = tcg_constant_tl(0); + +tcg_gen_extr_i64_tl(s->T0, s->T1, old); +tcg_gen_movcond_tl(TCG_COND_EQ, cpu_regs[R_EAX], Z, zero, +
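The semantics the inlined sequence has to preserve are those of the removed helpers: compare EDX:EAX with memory, store ECX:EBX and set Z on a match, otherwise load EDX:EAX from memory and clear Z. A non-atomic C model with 32-bit registers is sketched below. X86Regs and cmpxchg8b_model are hypothetical names, only the Z flag is tracked, and the removed helper's unconditional store is omitted here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register file: just what CMPXCHG8B touches. */
typedef struct {
    uint32_t eax, ebx, ecx, edx;
    int zf;
} X86Regs;

static void cmpxchg8b_model(X86Regs *r, uint64_t *mem)
{
    uint64_t cmp, val, old;

    assert(r != NULL && mem != NULL);

    /* Build the 64-bit operands from the register pairs. */
    cmp = ((uint64_t)r->edx << 32) | r->eax;
    val = ((uint64_t)r->ecx << 32) | r->ebx;
    old = *mem;

    if (old == cmp) {
        *mem = val;
        r->zf = 1;
    } else {
        /* Failure: EDX:EAX receive the value found in memory. */
        r->eax = (uint32_t)old;
        r->edx = (uint32_t)(old >> 32);
        r->zf = 0;
    }
}
```

With 32-bit registers the writeback could just as well be unconditional, since on success the old value equals EDX:EAX already; that is the shortcut the patch comment describes for TARGET_LONG_BITS == 32.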
[PULL 20/40] target/arm: Use tcg_gen_atomic_cmpxchg_i128 for STXP
Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell Message-Id: <20221112042555.2622152-2-richard.hender...@linaro.org> --- target/arm/helper-a64.h| 6 --- target/arm/helper-a64.c| 104 - target/arm/translate-a64.c | 60 - 3 files changed, 35 insertions(+), 135 deletions(-) diff --git a/target/arm/helper-a64.h b/target/arm/helper-a64.h index 7b706571bb..94065d1917 100644 --- a/target/arm/helper-a64.h +++ b/target/arm/helper-a64.h @@ -50,12 +50,6 @@ DEF_HELPER_FLAGS_2(frecpx_f16, TCG_CALL_NO_RWG, f16, f16, ptr) DEF_HELPER_FLAGS_2(fcvtx_f64_to_f32, TCG_CALL_NO_RWG, f32, f64, env) DEF_HELPER_FLAGS_3(crc32_64, TCG_CALL_NO_RWG_SE, i64, i64, i64, i32) DEF_HELPER_FLAGS_3(crc32c_64, TCG_CALL_NO_RWG_SE, i64, i64, i64, i32) -DEF_HELPER_FLAGS_4(paired_cmpxchg64_le, TCG_CALL_NO_WG, i64, env, i64, i64, i64) -DEF_HELPER_FLAGS_4(paired_cmpxchg64_le_parallel, TCG_CALL_NO_WG, - i64, env, i64, i64, i64) -DEF_HELPER_FLAGS_4(paired_cmpxchg64_be, TCG_CALL_NO_WG, i64, env, i64, i64, i64) -DEF_HELPER_FLAGS_4(paired_cmpxchg64_be_parallel, TCG_CALL_NO_WG, - i64, env, i64, i64, i64) DEF_HELPER_5(casp_le_parallel, void, env, i32, i64, i64, i64) DEF_HELPER_5(casp_be_parallel, void, env, i32, i64, i64, i64) DEF_HELPER_FLAGS_3(advsimd_maxh, TCG_CALL_NO_RWG, f16, f16, f16, ptr) diff --git a/target/arm/helper-a64.c b/target/arm/helper-a64.c index 77a8502b6b..7dbdb2c233 100644 --- a/target/arm/helper-a64.c +++ b/target/arm/helper-a64.c @@ -505,110 +505,6 @@ uint64_t HELPER(crc32c_64)(uint64_t acc, uint64_t val, uint32_t bytes) return crc32c(acc, buf, bytes) ^ 0xffffffff; } -uint64_t HELPER(paired_cmpxchg64_le)(CPUARMState *env, uint64_t addr, - uint64_t new_lo, uint64_t new_hi) -{ -Int128 cmpv = int128_make128(env->exclusive_val, env->exclusive_high); -Int128 newv = int128_make128(new_lo, new_hi); -Int128 oldv; -uintptr_t ra = GETPC(); -uint64_t o0, o1; -bool success; -int mem_idx = cpu_mmu_index(env, false); -MemOpIdx oi0 = make_memop_idx(MO_LEUQ | MO_ALIGN_16, mem_idx); -MemOpIdx oi1 = 
make_memop_idx(MO_LEUQ, mem_idx); - -o0 = cpu_ldq_le_mmu(env, addr + 0, oi0, ra); -o1 = cpu_ldq_le_mmu(env, addr + 8, oi1, ra); -oldv = int128_make128(o0, o1); - -success = int128_eq(oldv, cmpv); -if (success) { -cpu_stq_le_mmu(env, addr + 0, int128_getlo(newv), oi1, ra); -cpu_stq_le_mmu(env, addr + 8, int128_gethi(newv), oi1, ra); -} - -return !success; -} - -uint64_t HELPER(paired_cmpxchg64_le_parallel)(CPUARMState *env, uint64_t addr, - uint64_t new_lo, uint64_t new_hi) -{ -Int128 oldv, cmpv, newv; -uintptr_t ra = GETPC(); -bool success; -int mem_idx; -MemOpIdx oi; - -assert(HAVE_CMPXCHG128); - -mem_idx = cpu_mmu_index(env, false); -oi = make_memop_idx(MO_LE | MO_128 | MO_ALIGN, mem_idx); - -cmpv = int128_make128(env->exclusive_val, env->exclusive_high); -newv = int128_make128(new_lo, new_hi); -oldv = cpu_atomic_cmpxchgo_le_mmu(env, addr, cmpv, newv, oi, ra); - -success = int128_eq(oldv, cmpv); -return !success; -} - -uint64_t HELPER(paired_cmpxchg64_be)(CPUARMState *env, uint64_t addr, - uint64_t new_lo, uint64_t new_hi) -{ -/* - * High and low need to be switched here because this is not actually a - * 128bit store but two doublewords stored consecutively - */ -Int128 cmpv = int128_make128(env->exclusive_high, env->exclusive_val); -Int128 newv = int128_make128(new_hi, new_lo); -Int128 oldv; -uintptr_t ra = GETPC(); -uint64_t o0, o1; -bool success; -int mem_idx = cpu_mmu_index(env, false); -MemOpIdx oi0 = make_memop_idx(MO_BEUQ | MO_ALIGN_16, mem_idx); -MemOpIdx oi1 = make_memop_idx(MO_BEUQ, mem_idx); - -o1 = cpu_ldq_be_mmu(env, addr + 0, oi0, ra); -o0 = cpu_ldq_be_mmu(env, addr + 8, oi1, ra); -oldv = int128_make128(o0, o1); - -success = int128_eq(oldv, cmpv); -if (success) { -cpu_stq_be_mmu(env, addr + 0, int128_gethi(newv), oi1, ra); -cpu_stq_be_mmu(env, addr + 8, int128_getlo(newv), oi1, ra); -} - -return !success; -} - -uint64_t HELPER(paired_cmpxchg64_be_parallel)(CPUARMState *env, uint64_t addr, - uint64_t new_lo, uint64_t new_hi) -{ -Int128 oldv, cmpv, 
newv; -uintptr_t ra = GETPC(); -bool success; -int mem_idx; -MemOpIdx oi; - -assert(HAVE_CMPXCHG128); - -mem_idx = cpu_mmu_index(env, false); -oi = make_memop_idx(MO_BE | MO_128 | MO_ALIGN, mem_idx); - -/* - * High and low need to be switched here because this is not actually a - * 128bit store but two doublewords stored consecutively - */ -cmpv = int128_make128(env->exclusive_high, env->exclusive_val); -newv = int128_make128(ne
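The "high and low need to be switched" comments in the removed helpers come down to how two consecutive doublewords map onto a single 128-bit value. A small sketch of that mapping, under the assumption that "first" is the doubleword at the lower address and "second" the one at address + 8, is given here. U128, pack_pair and u128_eq are hypothetical names, not QEMU functions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for Int128. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} U128;

/*
 * Combine two consecutive doublewords into one 128-bit value.
 * Little-endian: the first doubleword is the low half.
 * Big-endian: the first doubleword is the high half.
 */
static U128 pack_pair(uint64_t first, uint64_t second, int big_endian)
{
    U128 v;

    assert(big_endian == 0 || big_endian == 1);
    if (big_endian) {
        v.hi = first;
        v.lo = second;
    } else {
        v.lo = first;
        v.hi = second;
    }
    return v;
}

static int u128_eq(U128 a, U128 b)
{
    return a.lo == b.lo && a.hi == b.hi;
}
```

The same pair of doublewords therefore yields different 128-bit values for the two byte orders, which is why both the compare value and the new value have to be built with the halves swapped for big-endian.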
[PULL 19/40] tcg: Split out tcg_gen_nonatomic_cmpxchg_i{32,64}
Normally this is automatically handled by the CF_PARALLEL checks within tcg_gen_atomic_cmpxchg_i{32,64}, but x86 has a special case of !PREFIX_LOCK where it always wants the non-atomic version. Split these out so that x86 does not have to roll its own. Reviewed-by: Alex Bennée Signed-off-by: Richard Henderson --- include/tcg/tcg-op.h | 4 ++ tcg/tcg-op.c | 154 +++ 2 files changed, 101 insertions(+), 57 deletions(-) diff --git a/include/tcg/tcg-op.h b/include/tcg/tcg-op.h index 31bf3d287e..839d91c0c7 100644 --- a/include/tcg/tcg-op.h +++ b/include/tcg/tcg-op.h @@ -910,6 +910,10 @@ void tcg_gen_atomic_cmpxchg_i64(TCGv_i64, TCGv, TCGv_i64, TCGv_i64, void tcg_gen_atomic_cmpxchg_i128(TCGv_i128, TCGv, TCGv_i128, TCGv_i128, TCGArg, MemOp); +void tcg_gen_nonatomic_cmpxchg_i32(TCGv_i32, TCGv, TCGv_i32, TCGv_i32, + TCGArg, MemOp); +void tcg_gen_nonatomic_cmpxchg_i64(TCGv_i64, TCGv, TCGv_i64, TCGv_i64, + TCGArg, MemOp); void tcg_gen_nonatomic_cmpxchg_i128(TCGv_i128, TCGv, TCGv_i128, TCGv_i128, TCGArg, MemOp); diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c index 5811ecd3e7..c581ae77c4 100644 --- a/tcg/tcg-op.c +++ b/tcg/tcg-op.c @@ -3325,82 +3325,122 @@ static void * const table_cmpxchg[(MO_SIZE | MO_BSWAP) + 1] = { WITH_ATOMIC128([MO_128 | MO_BE] = gen_helper_atomic_cmpxchgo_be) }; +void tcg_gen_nonatomic_cmpxchg_i32(TCGv_i32 retv, TCGv addr, TCGv_i32 cmpv, + TCGv_i32 newv, TCGArg idx, MemOp memop) +{ +TCGv_i32 t1 = tcg_temp_new_i32(); +TCGv_i32 t2 = tcg_temp_new_i32(); + +tcg_gen_ext_i32(t2, cmpv, memop & MO_SIZE); + +tcg_gen_qemu_ld_i32(t1, addr, idx, memop & ~MO_SIGN); +tcg_gen_movcond_i32(TCG_COND_EQ, t2, t1, t2, newv, t1); +tcg_gen_qemu_st_i32(t2, addr, idx, memop); +tcg_temp_free_i32(t2); + +if (memop & MO_SIGN) { +tcg_gen_ext_i32(retv, t1, memop); +} else { +tcg_gen_mov_i32(retv, t1); +} +tcg_temp_free_i32(t1); +} + void tcg_gen_atomic_cmpxchg_i32(TCGv_i32 retv, TCGv addr, TCGv_i32 cmpv, TCGv_i32 newv, TCGArg idx, MemOp memop) { -memop = tcg_canonicalize_memop(memop, 0, 0); 
+gen_atomic_cx_i32 gen; +MemOpIdx oi; if (!(tcg_ctx->gen_tb->cflags & CF_PARALLEL)) { -TCGv_i32 t1 = tcg_temp_new_i32(); -TCGv_i32 t2 = tcg_temp_new_i32(); - -tcg_gen_ext_i32(t2, cmpv, memop & MO_SIZE); - -tcg_gen_qemu_ld_i32(t1, addr, idx, memop & ~MO_SIGN); -tcg_gen_movcond_i32(TCG_COND_EQ, t2, t1, t2, newv, t1); -tcg_gen_qemu_st_i32(t2, addr, idx, memop); -tcg_temp_free_i32(t2); - -if (memop & MO_SIGN) { -tcg_gen_ext_i32(retv, t1, memop); -} else { -tcg_gen_mov_i32(retv, t1); -} -tcg_temp_free_i32(t1); -} else { -gen_atomic_cx_i32 gen; -MemOpIdx oi; - -gen = table_cmpxchg[memop & (MO_SIZE | MO_BSWAP)]; -tcg_debug_assert(gen != NULL); - -oi = make_memop_idx(memop & ~MO_SIGN, idx); -gen(retv, cpu_env, addr, cmpv, newv, tcg_constant_i32(oi)); - -if (memop & MO_SIGN) { -tcg_gen_ext_i32(retv, retv, memop); -} +tcg_gen_nonatomic_cmpxchg_i32(retv, addr, cmpv, newv, idx, memop); +return; } + +memop = tcg_canonicalize_memop(memop, 0, 0); +gen = table_cmpxchg[memop & (MO_SIZE | MO_BSWAP)]; +tcg_debug_assert(gen != NULL); + +oi = make_memop_idx(memop & ~MO_SIGN, idx); +gen(retv, cpu_env, addr, cmpv, newv, tcg_constant_i32(oi)); + +if (memop & MO_SIGN) { +tcg_gen_ext_i32(retv, retv, memop); +} +} + +void tcg_gen_nonatomic_cmpxchg_i64(TCGv_i64 retv, TCGv addr, TCGv_i64 cmpv, + TCGv_i64 newv, TCGArg idx, MemOp memop) +{ +TCGv_i64 t1, t2; + +if (TCG_TARGET_REG_BITS == 32 && (memop & MO_SIZE) < MO_64) { +tcg_gen_nonatomic_cmpxchg_i32(TCGV_LOW(retv), addr, TCGV_LOW(cmpv), + TCGV_LOW(newv), idx, memop); +if (memop & MO_SIGN) { +tcg_gen_sari_i32(TCGV_HIGH(retv), TCGV_LOW(retv), 31); +} else { +tcg_gen_movi_i32(TCGV_HIGH(retv), 0); +} +return; +} + +t1 = tcg_temp_new_i64(); +t2 = tcg_temp_new_i64(); + +tcg_gen_ext_i64(t2, cmpv, memop & MO_SIZE); + +tcg_gen_qemu_ld_i64(t1, addr, idx, memop & ~MO_SIGN); +tcg_gen_movcond_i64(TCG_COND_EQ, t2, t1, t2, newv, t1); +tcg_gen_qemu_st_i64(t2, addr, idx, memop); +tcg_temp_free_i64(t2); + +if (memop & MO_SIGN) { +tcg_gen_ext_i64(retv, t1, 
memop); +} else { +tcg_gen_mov_i64(retv, t1); +} +tcg_temp_free_i64(t1); } void tcg_gen_atomic_cmpxchg_i64(TCGv_i64 retv, TCGv addr, TCGv_i64 cmpv, TCGv
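For readers following the patch, the op sequence emitted by tcg_gen_nonatomic_cmpxchg_i32 (load, movcond, store, then return the loaded value) can be modelled in plain C. This is only an illustrative sketch, not QEMU code: nonatomic_cmpxchg_u32 and nonatomic_cmpxchg_example are hypothetical names, an ordinary pointer stands in for guest memory, and the MemOp size and sign handling is left out.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of a non-atomic compare-and-exchange.
 * An ordinary pointer stands in for the guest address.
 */
static uint32_t nonatomic_cmpxchg_u32(uint32_t *addr, uint32_t cmpv,
                                      uint32_t newv)
{
    uint32_t old = *addr;                        /* load */
    uint32_t val = (old == cmpv) ? newv : old;   /* movcond on equality */

    *addr = val;    /* the store is unconditional, as in the patch */
    return old;     /* the caller always sees the previous contents */
}

/* Example use: one matching and one non-matching compare value. */
static void nonatomic_cmpxchg_example(void)
{
    uint32_t mem = 5;

    assert(nonatomic_cmpxchg_u32(&mem, 5, 9) == 5 && mem == 9);
    assert(nonatomic_cmpxchg_u32(&mem, 5, 7) == 9 && mem == 9);
}
```

Note that the store happens even when the comparison fails; it simply writes back the value that was just loaded.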
[PULL 18/40] tcg: Add tcg_gen_{non}atomic_cmpxchg_i128
This will allow targets to avoid rolling their own. Reviewed-by: Alex Bennée Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- accel/tcg/tcg-runtime.h | 11 + include/tcg/tcg-op.h | 5 +++ tcg/tcg-op.c | 85 +++ accel/tcg/atomic_common.c.inc | 45 +++ 4 files changed, 146 insertions(+) diff --git a/accel/tcg/tcg-runtime.h b/accel/tcg/tcg-runtime.h index 37cbd722bf..e141a6ab24 100644 --- a/accel/tcg/tcg-runtime.h +++ b/accel/tcg/tcg-runtime.h @@ -55,6 +55,17 @@ DEF_HELPER_FLAGS_5(atomic_cmpxchgq_be, TCG_CALL_NO_WG, DEF_HELPER_FLAGS_5(atomic_cmpxchgq_le, TCG_CALL_NO_WG, i64, env, tl, i64, i64, i32) #endif +#ifdef CONFIG_CMPXCHG128 +DEF_HELPER_FLAGS_5(atomic_cmpxchgo_be, TCG_CALL_NO_WG, + i128, env, tl, i128, i128, i32) +DEF_HELPER_FLAGS_5(atomic_cmpxchgo_le, TCG_CALL_NO_WG, + i128, env, tl, i128, i128, i32) +#endif + +DEF_HELPER_FLAGS_5(nonatomic_cmpxchgo_be, TCG_CALL_NO_WG, + i128, env, tl, i128, i128, i32) +DEF_HELPER_FLAGS_5(nonatomic_cmpxchgo_le, TCG_CALL_NO_WG, + i128, env, tl, i128, i128, i32) #ifdef CONFIG_ATOMIC64 #define GEN_ATOMIC_HELPERS(NAME) \ diff --git a/include/tcg/tcg-op.h b/include/tcg/tcg-op.h index e5f5b63c37..31bf3d287e 100644 --- a/include/tcg/tcg-op.h +++ b/include/tcg/tcg-op.h @@ -907,6 +907,11 @@ void tcg_gen_atomic_cmpxchg_i32(TCGv_i32, TCGv, TCGv_i32, TCGv_i32, TCGArg, MemOp); void tcg_gen_atomic_cmpxchg_i64(TCGv_i64, TCGv, TCGv_i64, TCGv_i64, TCGArg, MemOp); +void tcg_gen_atomic_cmpxchg_i128(TCGv_i128, TCGv, TCGv_i128, TCGv_i128, + TCGArg, MemOp); + +void tcg_gen_nonatomic_cmpxchg_i128(TCGv_i128, TCGv, TCGv_i128, TCGv_i128, +TCGArg, MemOp); void tcg_gen_atomic_xchg_i32(TCGv_i32, TCGv, TCGv_i32, TCGArg, MemOp); void tcg_gen_atomic_xchg_i64(TCGv_i64, TCGv, TCGv_i64, TCGArg, MemOp); diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c index 33ef325f6e..5811ecd3e7 100644 --- a/tcg/tcg-op.c +++ b/tcg/tcg-op.c @@ -3295,6 +3295,8 @@ typedef void (*gen_atomic_cx_i32)(TCGv_i32, TCGv_env, TCGv, TCGv_i32, TCGv_i32, TCGv_i32); typedef void 
(*gen_atomic_cx_i64)(TCGv_i64, TCGv_env, TCGv, TCGv_i64, TCGv_i64, TCGv_i32); +typedef void (*gen_atomic_cx_i128)(TCGv_i128, TCGv_env, TCGv, + TCGv_i128, TCGv_i128, TCGv_i32); typedef void (*gen_atomic_op_i32)(TCGv_i32, TCGv_env, TCGv, TCGv_i32, TCGv_i32); typedef void (*gen_atomic_op_i64)(TCGv_i64, TCGv_env, TCGv, @@ -3305,6 +3307,11 @@ typedef void (*gen_atomic_op_i64)(TCGv_i64, TCGv_env, TCGv, #else # define WITH_ATOMIC64(X) #endif +#ifdef CONFIG_CMPXCHG128 +# define WITH_ATOMIC128(X) X, +#else +# define WITH_ATOMIC128(X) +#endif static void * const table_cmpxchg[(MO_SIZE | MO_BSWAP) + 1] = { [MO_8] = gen_helper_atomic_cmpxchgb, @@ -3314,6 +3321,8 @@ static void * const table_cmpxchg[(MO_SIZE | MO_BSWAP) + 1] = { [MO_32 | MO_BE] = gen_helper_atomic_cmpxchgl_be, WITH_ATOMIC64([MO_64 | MO_LE] = gen_helper_atomic_cmpxchgq_le) WITH_ATOMIC64([MO_64 | MO_BE] = gen_helper_atomic_cmpxchgq_be) +WITH_ATOMIC128([MO_128 | MO_LE] = gen_helper_atomic_cmpxchgo_le) +WITH_ATOMIC128([MO_128 | MO_BE] = gen_helper_atomic_cmpxchgo_be) }; void tcg_gen_atomic_cmpxchg_i32(TCGv_i32 retv, TCGv addr, TCGv_i32 cmpv, @@ -3412,6 +3421,82 @@ void tcg_gen_atomic_cmpxchg_i64(TCGv_i64 retv, TCGv addr, TCGv_i64 cmpv, } } +void tcg_gen_nonatomic_cmpxchg_i128(TCGv_i128 retv, TCGv addr, TCGv_i128 cmpv, +TCGv_i128 newv, TCGArg idx, MemOp memop) +{ +if (TCG_TARGET_REG_BITS == 32) { +/* Inline expansion below is simply too large for 32-bit hosts. */ +gen_atomic_cx_i128 gen = ((memop & MO_BSWAP) == MO_LE + ? 
gen_helper_nonatomic_cmpxchgo_le + : gen_helper_nonatomic_cmpxchgo_be); +MemOpIdx oi = make_memop_idx(memop, idx); + +tcg_debug_assert((memop & MO_SIZE) == MO_128); +tcg_debug_assert((memop & MO_SIGN) == 0); + +gen(retv, cpu_env, addr, cmpv, newv, tcg_constant_i32(oi)); +} else { +TCGv_i128 oldv = tcg_temp_new_i128(); +TCGv_i128 tmpv = tcg_temp_new_i128(); +TCGv_i64 t0 = tcg_temp_new_i64(); +TCGv_i64 t1 = tcg_temp_new_i64(); +TCGv_i64 z = tcg_constant_i64(0); + +tcg_gen_qemu_ld_i128(oldv, addr, idx, memop); + +/* Compare i128 */ +tcg_gen_xor_i64(t0, TCGV128_LOW(oldv), TCGV128_LOW(cmpv)); +tcg_gen_xor_i64(t1, TCGV128_HIGH(oldv), TCGV128_HIGH(cmpv)); +tcg_gen_or_i64(t0, t0, t1); + +/* tmpv = equal
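The "Compare i128" step above (two xors followed by an or) relies on the fact that two 128-bit values are equal exactly when the or of the per-half xors is zero. A small C sketch of that idea follows; u128_pair, u128_equal and u128_equal_example are hypothetical names used for illustration only and are not QEMU types or functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical two-word stand-in for a 128-bit value. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} u128_pair;

/* Equal iff no bit differs in either half. */
static bool u128_equal(u128_pair a, u128_pair b)
{
    uint64_t t0 = a.lo ^ b.lo;   /* differing bits in the low half */
    uint64_t t1 = a.hi ^ b.hi;   /* differing bits in the high half */

    return (t0 | t1) == 0;
}

/* Example use: identical values, then a difference in the high half. */
static void u128_equal_example(void)
{
    u128_pair a = { 1, 2 };
    u128_pair b = { 1, 2 };
    u128_pair c = { 1, 3 };

    assert(u128_equal(a, b));
    assert(!u128_equal(a, c));
}
```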
[PULL 25/40] tests/tcg/s390x: Add long-double.c
Acked-by: Ilya Leoshkevich Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- tests/tcg/s390x/long-double.c | 24 tests/tcg/s390x/Makefile.target | 1 + 2 files changed, 25 insertions(+) create mode 100644 tests/tcg/s390x/long-double.c diff --git a/tests/tcg/s390x/long-double.c b/tests/tcg/s390x/long-double.c new file mode 100644 index 00..757a6262fd --- /dev/null +++ b/tests/tcg/s390x/long-double.c @@ -0,0 +1,24 @@ +/* + * Perform some basic arithmetic with long double, as a sanity check. + * With small integral numbers, we can cross-check with integers. + */ + +#include <assert.h> + +int main() +{ +int i, j; + +for (i = 1; i < 5; i++) { +for (j = 1; j < 5; j++) { +long double la = (long double)i + j; +long double lm = (long double)i * j; +long double ls = (long double)i - j; + +assert(la == i + j); +assert(lm == i * j); +assert(ls == i - j); +} +} +return 0; +} diff --git a/tests/tcg/s390x/Makefile.target b/tests/tcg/s390x/Makefile.target index 79250f31dd..1d454270c0 100644 --- a/tests/tcg/s390x/Makefile.target +++ b/tests/tcg/s390x/Makefile.target @@ -26,6 +26,7 @@ TESTS+=branch-relative-long TESTS+=noexec TESTS+=div TESTS+=clst +TESTS+=long-double Z13_TESTS=vistr $(Z13_TESTS): CFLAGS+=-march=z13 -O2 -- 2.34.1
[PULL 07/40] tcg: Add TCG_CALL_{RET,ARG}_BY_REF
These will be used by some hosts, both 32 and 64-bit, to pass and return i128. Not yet used, because allocation is not yet enabled. Reviewed-by: Alex Bennée Signed-off-by: Richard Henderson --- tcg/tcg-internal.h | 3 + tcg/tcg.c | 135 - 2 files changed, 135 insertions(+), 3 deletions(-) diff --git a/tcg/tcg-internal.h b/tcg/tcg-internal.h index 6e50aeba3a..2ec1ea01df 100644 --- a/tcg/tcg-internal.h +++ b/tcg/tcg-internal.h @@ -36,6 +36,7 @@ */ typedef enum { TCG_CALL_RET_NORMAL, /* by registers */ +TCG_CALL_RET_BY_REF, /* for i128, by reference */ } TCGCallReturnKind; typedef enum { @@ -44,6 +45,8 @@ typedef enum { TCG_CALL_ARG_EXTEND, /* for i32, as a sign/zero-extended i64 */ TCG_CALL_ARG_EXTEND_U, /* ... as a zero-extended i64 */ TCG_CALL_ARG_EXTEND_S, /* ... as a sign-extended i64 */ +TCG_CALL_ARG_BY_REF, /* for i128, by reference, first */ +TCG_CALL_ARG_BY_REF_N, /* ... by reference, subsequent */ } TCGCallArgumentKind; typedef struct TCGCallArgumentLoc { diff --git a/tcg/tcg.c b/tcg/tcg.c index 8923b52044..123cde7000 100644 --- a/tcg/tcg.c +++ b/tcg/tcg.c @@ -104,8 +104,7 @@ static void tcg_out_ld(TCGContext *s, TCGType type, TCGReg ret, TCGReg arg1, static bool tcg_out_mov(TCGContext *s, TCGType type, TCGReg ret, TCGReg arg); static void tcg_out_movi(TCGContext *s, TCGType type, TCGReg ret, tcg_target_long arg); -static void tcg_out_addi_ptr(TCGContext *s, TCGReg, TCGReg, tcg_target_long) -__attribute__((unused)); +static void tcg_out_addi_ptr(TCGContext *s, TCGReg, TCGReg, tcg_target_long); static void tcg_out_exit_tb(TCGContext *s, uintptr_t arg); static void tcg_out_goto_tb(TCGContext *s, int which); static void tcg_out_op(TCGContext *s, TCGOpcode opc, @@ -683,6 +682,38 @@ static void layout_arg_normal_n(TCGCumulativeArgs *cum, cum->arg_slot += n; } +static void layout_arg_by_ref(TCGCumulativeArgs *cum, TCGHelperInfo *info) +{ +TCGCallArgumentLoc *loc = &info->in[cum->info_in_idx]; +int n = 128 / TCG_TARGET_REG_BITS; + +/* The first subindex carries the 
pointer. */ +layout_arg_1(cum, info, TCG_CALL_ARG_BY_REF); + +/* + * The callee is allowed to clobber memory associated with + * structure pass by-reference. Therefore we must make copies. + * Allocate space from "ref_slot", which will be adjusted to + * follow the parameters on the stack. + */ +loc[0].ref_slot = cum->ref_slot; + +/* + * Subsequent words also go into the reference slot, but + * do not accumulate into the regular arguments. + */ +for (int i = 1; i < n; ++i) { +loc[i] = (TCGCallArgumentLoc){ +.kind = TCG_CALL_ARG_BY_REF_N, +.arg_idx = cum->arg_idx, +.tmp_subindex = i, +.ref_slot = cum->ref_slot + i, +}; +} +cum->info_in_idx += n; +cum->ref_slot += n; +} + static void init_call_layout(TCGHelperInfo *info) { int max_reg_slots = ARRAY_SIZE(tcg_target_call_iarg_regs); @@ -718,6 +749,14 @@ static void init_call_layout(TCGHelperInfo *info) case TCG_CALL_RET_NORMAL: assert(info->nr_out <= ARRAY_SIZE(tcg_target_call_oarg_regs)); break; +case TCG_CALL_RET_BY_REF: +/* + * Allocate the first argument to the output. + * We don't need to store this anywhere, just make it + * unavailable for use in the input loop below. + */ +cum.arg_slot = 1; +break; default: qemu_build_not_reached(); } @@ -796,6 +835,9 @@ static void init_call_layout(TCGHelperInfo *info) case TCG_CALL_ARG_NORMAL: layout_arg_normal_n(&cum, info, 128 / TCG_TARGET_REG_BITS); break; +case TCG_CALL_ARG_BY_REF: +layout_arg_by_ref(&cum, info); +break; default: qemu_build_not_reached(); } @@ -811,7 +853,39 @@ static void init_call_layout(TCGHelperInfo *info) assert(cum.info_in_idx <= ARRAY_SIZE(info->in)); /* Validate the backend has enough argument space. */ assert(cum.arg_slot <= max_reg_slots + max_stk_slots); -assert(cum.ref_slot <= max_stk_slots); + +/* + * Relocate the "ref_slot" area to the end of the parameters. + * Minimizing this stack offset helps code size for x86, + * which has a signed 8-bit offset encoding. 
+ */ +if (cum.ref_slot != 0) { +int ref_base = 0; + +if (cum.arg_slot > max_reg_slots) { +int align = __alignof(Int128) / sizeof(tcg_target_long); + +ref_base = cum.arg_slot - max_reg_slots; +if (align > 1) { +ref_base = ROUND_UP(ref_base, align); +} +} +assert(ref_base + cum.ref_slot <=
[PULL 05/40] tcg: Allocate objects contiguously in temp_allocate_frame
When allocating a temp to the stack frame, consider the base type and allocate all parts at once. Reviewed-by: Alex Bennée Signed-off-by: Richard Henderson --- tcg/tcg.c | 34 ++ 1 file changed, 26 insertions(+), 8 deletions(-) diff --git a/tcg/tcg.c b/tcg/tcg.c index bc7198e5d0..cdfc50b164 100644 --- a/tcg/tcg.c +++ b/tcg/tcg.c @@ -3267,11 +3267,12 @@ static bool liveness_pass_2(TCGContext *s) static void temp_allocate_frame(TCGContext *s, TCGTemp *ts) { -int size = tcg_type_size(ts->type); -int align; intptr_t off; +int size, align; -switch (ts->type) { +/* When allocating an object, look at the full type. */ +size = tcg_type_size(ts->base_type); +switch (ts->base_type) { case TCG_TYPE_I32: align = 4; break; @@ -3302,13 +3303,30 @@ static void temp_allocate_frame(TCGContext *s, TCGTemp *ts) tcg_raise_tb_overflow(s); } s->current_frame_offset = off + size; - -ts->mem_offset = off; #if defined(__sparc__) -ts->mem_offset += TCG_TARGET_STACK_BIAS; +off += TCG_TARGET_STACK_BIAS; #endif -ts->mem_base = s->frame_temp; -ts->mem_allocated = 1; + +/* If the object was subdivided, assign memory to all the parts. */ +if (ts->base_type != ts->type) { +int part_size = tcg_type_size(ts->type); +int part_count = size / part_size; + +/* + * Each part is allocated sequentially in tcg_temp_new_internal. + * Jump back to the first part by subtracting the current index. + */ +ts -= ts->temp_subindex; +for (int i = 0; i < part_count; ++i) { +ts[i].mem_offset = off + i * part_size; +ts[i].mem_base = s->frame_temp; +ts[i].mem_allocated = 1; +} +} else { +ts->mem_offset = off; +ts->mem_base = s->frame_temp; +ts->mem_allocated = 1; +} } /* Assign @reg to @ts, and update reg_to_temp[]. */ -- 2.34.1
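The "jump back to the first part, then assign every part" step in this patch can be illustrated outside of TCG. The sketch below is a simplified model under stated assumptions: part_t is a hypothetical stand-in for TCGTemp with only the fields needed here, and assign_parts is not a QEMU function.

```c
#include <assert.h>

/* Hypothetical, stripped-down stand-in for TCGTemp. */
typedef struct {
    int subindex;        /* position of this part within its object */
    long mem_offset;     /* frame offset assigned to this part */
    int mem_allocated;   /* set once the part has a frame slot */
} part_t;

/*
 * Assign consecutive frame offsets to every part of the object that
 * contains `p`.  The parts are assumed to be adjacent array elements,
 * so subtracting the subindex yields the first part.
 */
static void assign_parts(part_t *p, long off, int part_size, int part_count)
{
    p -= p->subindex;
    for (int i = 0; i < part_count; i++) {
        p[i].mem_offset = off + (long)i * part_size;
        p[i].mem_allocated = 1;
    }
}

/* Example use: a two-part object reached through its second part. */
static void assign_parts_example(void)
{
    part_t parts[2] = { { 0, 0, 0 }, { 1, 0, 0 } };

    assign_parts(&parts[1], 64, 8, 2);
    assert(parts[0].mem_offset == 64 && parts[0].mem_allocated);
    assert(parts[1].mem_offset == 72 && parts[1].mem_allocated);
}
```

Whichever part triggers the allocation, all parts end up contiguous and in subindex order.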
[PULL 22/40] target/ppc: Use tcg_gen_atomic_cmpxchg_i128 for STQCX
Note that the previous direct reference to reserve_val, - tcg_gen_ld_i64(t1, cpu_env, (ctx->le_mode -? offsetof(CPUPPCState, reserve_val2) -: offsetof(CPUPPCState, reserve_val))); was incorrect because all references should have gone through cpu_reserve_val. Create a cpu_reserve_val2 tcg temp to fix this. Signed-off-by: Richard Henderson Reviewed-by: Daniel Henrique Barboza Message-Id: <20221112061122.2720163-2-richard.hender...@linaro.org> --- target/ppc/helper.h | 2 - target/ppc/mem_helper.c | 44 - target/ppc/translate.c | 102 ++-- 3 files changed, 47 insertions(+), 101 deletions(-) diff --git a/target/ppc/helper.h b/target/ppc/helper.h index 8dd22a35e4..0beaca5c7a 100644 --- a/target/ppc/helper.h +++ b/target/ppc/helper.h @@ -818,6 +818,4 @@ DEF_HELPER_FLAGS_5(stq_le_parallel, TCG_CALL_NO_WG, void, env, tl, i64, i64, i32) DEF_HELPER_FLAGS_5(stq_be_parallel, TCG_CALL_NO_WG, void, env, tl, i64, i64, i32) -DEF_HELPER_5(stqcx_le_parallel, i32, env, tl, i64, i64, i32) -DEF_HELPER_5(stqcx_be_parallel, i32, env, tl, i64, i64, i32) #endif diff --git a/target/ppc/mem_helper.c b/target/ppc/mem_helper.c index d1163f316c..1578887a8f 100644 --- a/target/ppc/mem_helper.c +++ b/target/ppc/mem_helper.c @@ -413,50 +413,6 @@ void helper_stq_be_parallel(CPUPPCState *env, target_ulong addr, val = int128_make128(lo, hi); cpu_atomic_sto_be_mmu(env, addr, val, opidx, GETPC()); } - -uint32_t helper_stqcx_le_parallel(CPUPPCState *env, target_ulong addr, - uint64_t new_lo, uint64_t new_hi, - uint32_t opidx) -{ -bool success = false; - -/* We will have raised EXCP_ATOMIC from the translator. 
*/ -assert(HAVE_CMPXCHG128); - -if (likely(addr == env->reserve_addr)) { -Int128 oldv, cmpv, newv; - -cmpv = int128_make128(env->reserve_val2, env->reserve_val); -newv = int128_make128(new_lo, new_hi); -oldv = cpu_atomic_cmpxchgo_le_mmu(env, addr, cmpv, newv, - opidx, GETPC()); -success = int128_eq(oldv, cmpv); -} -env->reserve_addr = -1; -return env->so + success * CRF_EQ_BIT; -} - -uint32_t helper_stqcx_be_parallel(CPUPPCState *env, target_ulong addr, - uint64_t new_lo, uint64_t new_hi, - uint32_t opidx) -{ -bool success = false; - -/* We will have raised EXCP_ATOMIC from the translator. */ -assert(HAVE_CMPXCHG128); - -if (likely(addr == env->reserve_addr)) { -Int128 oldv, cmpv, newv; - -cmpv = int128_make128(env->reserve_val2, env->reserve_val); -newv = int128_make128(new_lo, new_hi); -oldv = cpu_atomic_cmpxchgo_be_mmu(env, addr, cmpv, newv, - opidx, GETPC()); -success = int128_eq(oldv, cmpv); -} -env->reserve_addr = -1; -return env->so + success * CRF_EQ_BIT; -} #endif /*/ diff --git a/target/ppc/translate.c b/target/ppc/translate.c index edb3daa9b5..1c17d5a558 100644 --- a/target/ppc/translate.c +++ b/target/ppc/translate.c @@ -72,6 +72,7 @@ static TCGv cpu_cfar; static TCGv cpu_xer, cpu_so, cpu_ov, cpu_ca, cpu_ov32, cpu_ca32; static TCGv cpu_reserve; static TCGv cpu_reserve_val; +static TCGv cpu_reserve_val2; static TCGv cpu_fpscr; static TCGv_i32 cpu_access_type; @@ -141,8 +142,11 @@ void ppc_translate_init(void) offsetof(CPUPPCState, reserve_addr), "reserve_addr"); cpu_reserve_val = tcg_global_mem_new(cpu_env, - offsetof(CPUPPCState, reserve_val), - "reserve_val"); + offsetof(CPUPPCState, reserve_val), + "reserve_val"); +cpu_reserve_val2 = tcg_global_mem_new(cpu_env, + offsetof(CPUPPCState, reserve_val2), + "reserve_val2"); cpu_fpscr = tcg_global_mem_new(cpu_env, offsetof(CPUPPCState, fpscr), "fpscr"); @@ -3998,78 +4002,66 @@ static void gen_lqarx(DisasContext *ctx) /* stqcx. 
*/ static void gen_stqcx_(DisasContext *ctx) { +TCGLabel *lab_fail, *lab_over; int rs = rS(ctx->opcode); -TCGv EA, hi, lo; +TCGv EA, t0, t1; +TCGv_i128 cmp, val; if (unlikely(rs & 1)) { gen_inval_exception(ctx, POWERPC_EXCP_INVAL_INVAL); return; } +lab_fail = gen_new_label(); +lab_over = gen_new_label(); + gen_set_access_type(ctx, ACCESS_RES); EA = tcg_temp_new(); gen_addr_reg_index(ctx, EA); +tcg_gen_brcon
[PULL 34/40] target/s390x: Use Int128 for passing float128
Acked-by: David Hildenbrand Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- v2: Fix SPEC_in1_x1. --- target/s390x/helper.h| 32 ++-- target/s390x/tcg/insn-data.h.inc | 30 +-- target/s390x/tcg/fpu_helper.c| 88 ++-- target/s390x/tcg/translate.c | 76 ++- 4 files changed, 121 insertions(+), 105 deletions(-) diff --git a/target/s390x/helper.h b/target/s390x/helper.h index d40aeb471f..bccd3bfca6 100644 --- a/target/s390x/helper.h +++ b/target/s390x/helper.h @@ -41,55 +41,55 @@ DEF_HELPER_4(csst, i32, env, i32, i64, i64) DEF_HELPER_4(csst_parallel, i32, env, i32, i64, i64) DEF_HELPER_FLAGS_3(aeb, TCG_CALL_NO_WG, i64, env, i64, i64) DEF_HELPER_FLAGS_3(adb, TCG_CALL_NO_WG, i64, env, i64, i64) -DEF_HELPER_FLAGS_5(axb, TCG_CALL_NO_WG, i128, env, i64, i64, i64, i64) +DEF_HELPER_FLAGS_3(axb, TCG_CALL_NO_WG, i128, env, i128, i128) DEF_HELPER_FLAGS_3(seb, TCG_CALL_NO_WG, i64, env, i64, i64) DEF_HELPER_FLAGS_3(sdb, TCG_CALL_NO_WG, i64, env, i64, i64) -DEF_HELPER_FLAGS_5(sxb, TCG_CALL_NO_WG, i128, env, i64, i64, i64, i64) +DEF_HELPER_FLAGS_3(sxb, TCG_CALL_NO_WG, i128, env, i128, i128) DEF_HELPER_FLAGS_3(deb, TCG_CALL_NO_WG, i64, env, i64, i64) DEF_HELPER_FLAGS_3(ddb, TCG_CALL_NO_WG, i64, env, i64, i64) -DEF_HELPER_FLAGS_5(dxb, TCG_CALL_NO_WG, i128, env, i64, i64, i64, i64) +DEF_HELPER_FLAGS_3(dxb, TCG_CALL_NO_WG, i128, env, i128, i128) DEF_HELPER_FLAGS_3(meeb, TCG_CALL_NO_WG, i64, env, i64, i64) DEF_HELPER_FLAGS_3(mdeb, TCG_CALL_NO_WG, i64, env, i64, i64) DEF_HELPER_FLAGS_3(mdb, TCG_CALL_NO_WG, i64, env, i64, i64) -DEF_HELPER_FLAGS_5(mxb, TCG_CALL_NO_WG, i128, env, i64, i64, i64, i64) -DEF_HELPER_FLAGS_4(mxdb, TCG_CALL_NO_WG, i128, env, i64, i64, i64) +DEF_HELPER_FLAGS_3(mxb, TCG_CALL_NO_WG, i128, env, i128, i128) +DEF_HELPER_FLAGS_3(mxdb, TCG_CALL_NO_WG, i128, env, i128, i64) DEF_HELPER_FLAGS_2(ldeb, TCG_CALL_NO_WG, i64, env, i64) -DEF_HELPER_FLAGS_4(ldxb, TCG_CALL_NO_WG, i64, env, i64, i64, i32) +DEF_HELPER_FLAGS_3(ldxb, TCG_CALL_NO_WG, i64, env, i128, 
i32) DEF_HELPER_FLAGS_2(lxdb, TCG_CALL_NO_WG, i128, env, i64) DEF_HELPER_FLAGS_2(lxeb, TCG_CALL_NO_WG, i128, env, i64) DEF_HELPER_FLAGS_3(ledb, TCG_CALL_NO_WG, i64, env, i64, i32) -DEF_HELPER_FLAGS_4(lexb, TCG_CALL_NO_WG, i64, env, i64, i64, i32) +DEF_HELPER_FLAGS_3(lexb, TCG_CALL_NO_WG, i64, env, i128, i32) DEF_HELPER_FLAGS_3(ceb, TCG_CALL_NO_WG_SE, i32, env, i64, i64) DEF_HELPER_FLAGS_3(cdb, TCG_CALL_NO_WG_SE, i32, env, i64, i64) -DEF_HELPER_FLAGS_5(cxb, TCG_CALL_NO_WG_SE, i32, env, i64, i64, i64, i64) +DEF_HELPER_FLAGS_3(cxb, TCG_CALL_NO_WG_SE, i32, env, i128, i128) DEF_HELPER_FLAGS_3(keb, TCG_CALL_NO_WG, i32, env, i64, i64) DEF_HELPER_FLAGS_3(kdb, TCG_CALL_NO_WG, i32, env, i64, i64) -DEF_HELPER_FLAGS_5(kxb, TCG_CALL_NO_WG, i32, env, i64, i64, i64, i64) +DEF_HELPER_FLAGS_3(kxb, TCG_CALL_NO_WG, i32, env, i128, i128) DEF_HELPER_3(cgeb, i64, env, i64, i32) DEF_HELPER_3(cgdb, i64, env, i64, i32) -DEF_HELPER_4(cgxb, i64, env, i64, i64, i32) +DEF_HELPER_3(cgxb, i64, env, i128, i32) DEF_HELPER_3(cfeb, i64, env, i64, i32) DEF_HELPER_3(cfdb, i64, env, i64, i32) -DEF_HELPER_4(cfxb, i64, env, i64, i64, i32) +DEF_HELPER_3(cfxb, i64, env, i128, i32) DEF_HELPER_3(clgeb, i64, env, i64, i32) DEF_HELPER_3(clgdb, i64, env, i64, i32) -DEF_HELPER_4(clgxb, i64, env, i64, i64, i32) +DEF_HELPER_3(clgxb, i64, env, i128, i32) DEF_HELPER_3(clfeb, i64, env, i64, i32) DEF_HELPER_3(clfdb, i64, env, i64, i32) -DEF_HELPER_4(clfxb, i64, env, i64, i64, i32) +DEF_HELPER_3(clfxb, i64, env, i128, i32) DEF_HELPER_FLAGS_3(fieb, TCG_CALL_NO_WG, i64, env, i64, i32) DEF_HELPER_FLAGS_3(fidb, TCG_CALL_NO_WG, i64, env, i64, i32) -DEF_HELPER_FLAGS_4(fixb, TCG_CALL_NO_WG, i128, env, i64, i64, i32) +DEF_HELPER_FLAGS_3(fixb, TCG_CALL_NO_WG, i128, env, i128, i32) DEF_HELPER_FLAGS_4(maeb, TCG_CALL_NO_WG, i64, env, i64, i64, i64) DEF_HELPER_FLAGS_4(madb, TCG_CALL_NO_WG, i64, env, i64, i64, i64) DEF_HELPER_FLAGS_4(mseb, TCG_CALL_NO_WG, i64, env, i64, i64, i64) DEF_HELPER_FLAGS_4(msdb, TCG_CALL_NO_WG, i64, env, 
i64, i64, i64) DEF_HELPER_FLAGS_3(tceb, TCG_CALL_NO_RWG_SE, i32, env, i64, i64) DEF_HELPER_FLAGS_3(tcdb, TCG_CALL_NO_RWG_SE, i32, env, i64, i64) -DEF_HELPER_FLAGS_4(tcxb, TCG_CALL_NO_RWG_SE, i32, env, i64, i64, i64) +DEF_HELPER_FLAGS_3(tcxb, TCG_CALL_NO_RWG_SE, i32, env, i128, i64) DEF_HELPER_FLAGS_2(sqeb, TCG_CALL_NO_WG, i64, env, i64) DEF_HELPER_FLAGS_2(sqdb, TCG_CALL_NO_WG, i64, env, i64) -DEF_HELPER_FLAGS_3(sqxb, TCG_CALL_NO_WG, i128, env, i64, i64) +DEF_HELPER_FLAGS_2(sqxb, TCG_CALL_NO_WG, i128, env, i128) DEF_HELPER_FLAGS_1(cvd, TCG_CALL_NO_RWG_SE, i64, s32) DEF_HELPER_FLAGS_4(pack, TCG_CALL_NO_WG, void, env, i32, i64, i64) DEF_HELPER_FLAGS_4(pka, TCG_CALL_NO_WG, void, env, i64, i64, i32) diff --git a/target/s390x/tcg/insn-data.h.inc b/target/s390x/tcg/insn-data.h.inc index 517a4500ae..893f4b48db 100644 --- a/target/s390x/tcg/insn-data.h.inc +++ b/targ
[PULL 23/40] tests/tcg/s390x: Add div.c
From: Ilya Leoshkevich Add a basic test to prevent regressions. Signed-off-by: Ilya Leoshkevich Message-Id: <2022110300.2539919-1-...@linux.ibm.com> Signed-off-by: Richard Henderson --- tests/tcg/s390x/div.c | 40 + tests/tcg/s390x/Makefile.target | 1 + 2 files changed, 41 insertions(+) create mode 100644 tests/tcg/s390x/div.c diff --git a/tests/tcg/s390x/div.c b/tests/tcg/s390x/div.c new file mode 100644 index 00..5807295614 --- /dev/null +++ b/tests/tcg/s390x/div.c @@ -0,0 +1,40 @@ +#include <assert.h> +#include <stdint.h> + +static void test_dr(void) +{ +register int32_t r0 asm("r0") = -1; +register int32_t r1 asm("r1") = -4241; +int32_t b = 101, q, r; + +asm("dr %[r0],%[b]" +: [r0] "+r" (r0), [r1] "+r" (r1) +: [b] "r" (b) +: "cc"); +q = r1; +r = r0; +assert(q == -41); +assert(r == -100); +} + +static void test_dlr(void) +{ +register uint32_t r0 asm("r0") = 0; +register uint32_t r1 asm("r1") = 4243; +uint32_t b = 101, q, r; + +asm("dlr %[r0],%[b]" +: [r0] "+r" (r0), [r1] "+r" (r1) +: [b] "r" (b) +: "cc"); +q = r1; +r = r0; +assert(q == 42); +assert(r == 1); +} + +int main(void) +{ +test_dr(); +test_dlr(); +} diff --git a/tests/tcg/s390x/Makefile.target b/tests/tcg/s390x/Makefile.target index 07fcc6d0ce..ab7a3bcfb2 100644 --- a/tests/tcg/s390x/Makefile.target +++ b/tests/tcg/s390x/Makefile.target @@ -24,6 +24,7 @@ TESTS+=trap TESTS+=signals-s390x TESTS+=branch-relative-long TESTS+=noexec +TESTS+=div Z13_TESTS=vistr $(Z13_TESTS): CFLAGS+=-march=z13 -O2 -- 2.34.1
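The expected values in test_dr can be cross-checked in portable C, assuming the r0:r1 pair forms the 64-bit dividend (r0 high, r1 low) and that, as the test reads them back, the quotient lands in r1 and the remainder in r0. The helper names below are hypothetical and exist only for this sketch; the conversion back to int64_t assumes the usual two's-complement behaviour of mainstream compilers.

```c
#include <assert.h>
#include <stdint.h>

/* Combine the register pair into the 64-bit dividend (r0 high, r1 low). */
static int64_t pair_to_s64(int32_t r0, int32_t r1)
{
    uint64_t bits = ((uint64_t)(uint32_t)r0 << 32) | (uint32_t)r1;

    /* Assumes two's-complement conversion, as on mainstream compilers. */
    return (int64_t)bits;
}

/* C division truncates toward zero, so the remainder keeps the dividend's sign. */
static int32_t dr_quotient(int32_t r0, int32_t r1, int32_t b)
{
    return (int32_t)(pair_to_s64(r0, r1) / b);
}

static int32_t dr_remainder(int32_t r0, int32_t r1, int32_t b)
{
    return (int32_t)(pair_to_s64(r0, r1) % b);
}

/* Example use: the same operands as test_dr. */
static void dr_example(void)
{
    assert(pair_to_s64(-1, -4241) == -4241);
    assert(dr_quotient(-1, -4241, 101) == -41);
    assert(dr_remainder(-1, -4241, 101) == -100);
}
```

The unsigned case in test_dlr follows the same arithmetic: 4243 = 42 * 101 + 1.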
[PULL 29/40] target/s390x: Use Int128 for return from CLST
Reviewed-by: Philippe Mathieu-Daudé Acked-by: Ilya Leoshkevich Signed-off-by: Richard Henderson --- target/s390x/helper.h | 2 +- target/s390x/tcg/mem_helper.c | 11 --- target/s390x/tcg/translate.c | 8 ++-- 3 files changed, 11 insertions(+), 10 deletions(-) diff --git a/target/s390x/helper.h b/target/s390x/helper.h index 593f3c8bee..25c2dd0b3c 100644 --- a/target/s390x/helper.h +++ b/target/s390x/helper.h @@ -16,7 +16,7 @@ DEF_HELPER_FLAGS_3(divs64, TCG_CALL_NO_WG, i128, env, s64, s64) DEF_HELPER_FLAGS_4(divu64, TCG_CALL_NO_WG, i128, env, i64, i64, i64) DEF_HELPER_3(srst, void, env, i32, i32) DEF_HELPER_3(srstu, void, env, i32, i32) -DEF_HELPER_4(clst, i64, env, i64, i64, i64) +DEF_HELPER_4(clst, i128, env, i64, i64, i64) DEF_HELPER_FLAGS_4(mvn, TCG_CALL_NO_WG, void, env, i32, i64, i64) DEF_HELPER_FLAGS_4(mvo, TCG_CALL_NO_WG, void, env, i32, i64, i64) DEF_HELPER_FLAGS_4(mvpg, TCG_CALL_NO_WG, i32, env, i64, i32, i32) diff --git a/target/s390x/tcg/mem_helper.c b/target/s390x/tcg/mem_helper.c index cb82cd1c1d..9be42851d8 100644 --- a/target/s390x/tcg/mem_helper.c +++ b/target/s390x/tcg/mem_helper.c @@ -886,7 +886,7 @@ void HELPER(srstu)(CPUS390XState *env, uint32_t r1, uint32_t r2) } /* unsigned string compare (c is string terminator) */ -uint64_t HELPER(clst)(CPUS390XState *env, uint64_t c, uint64_t s1, uint64_t s2) +Int128 HELPER(clst)(CPUS390XState *env, uint64_t c, uint64_t s1, uint64_t s2) { uintptr_t ra = GETPC(); uint32_t len; @@ -904,23 +904,20 @@ uint64_t HELPER(clst)(CPUS390XState *env, uint64_t c, uint64_t s1, uint64_t s2) if (v1 == c) { /* Equal. CC=0, and don't advance the registers. */ env->cc_op = 0; -env->retxl = s2; -return s1; +return int128_make128(s2, s1); } } else { /* Unequal. CC={1,2}, and advance the registers. Note that the terminator need not be zero, but the string that contains the terminator is by definition "low". */ env->cc_op = (v1 == c ? 1 : v2 == c ? 2 : v1 < v2 ? 
1 : 2); -env->retxl = s2 + len; -return s1 + len; +return int128_make128(s2 + len, s1 + len); } } /* CPU-determined bytes equal; advance the registers. */ env->cc_op = 3; -env->retxl = s2 + len; -return s1 + len; +return int128_make128(s2 + len, s1 + len); } /* move page */ diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c index 6953b81de7..8397fe2bd8 100644 --- a/target/s390x/tcg/translate.c +++ b/target/s390x/tcg/translate.c @@ -2164,9 +2164,13 @@ static DisasJumpType op_clm(DisasContext *s, DisasOps *o) static DisasJumpType op_clst(DisasContext *s, DisasOps *o) { -gen_helper_clst(o->in1, cpu_env, regs[0], o->in1, o->in2); +TCGv_i128 pair = tcg_temp_new_i128(); + +gen_helper_clst(pair, cpu_env, regs[0], o->in1, o->in2); +tcg_gen_extr_i128_i64(o->in2, o->in1, pair); +tcg_temp_free_i128(pair); + set_cc_static(s); -return_low128(o->in2); return DISAS_NEXT; } -- 2.34.1
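The change from env->retxl to an Int128 return amounts to handing back both updated addresses in one value and splitting it at the call site. A rough C model of that pattern is sketched below; pair128, make128, extr128 and advance_both are hypothetical names for this illustration, with make128 taking its arguments in (low, high) order to mirror how the patch calls int128_make128(s2 + len, s1 + len).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical two-word stand-in for a 128-bit return value. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} pair128;

static pair128 make128(uint64_t lo, uint64_t hi)
{
    pair128 v = { lo, hi };
    return v;
}

/* Split the pair back into its low and high halves. */
static void extr128(uint64_t *lo, uint64_t *hi, pair128 v)
{
    *lo = v.lo;
    *hi = v.hi;
}

/* Return both advanced addresses at once: s2 in the low half, s1 in the high. */
static pair128 advance_both(uint64_t s1, uint64_t s2, uint64_t len)
{
    return make128(s2 + len, s1 + len);
}

/* Example use: unpack in the same (low, high) order used when packing. */
static void advance_both_example(void)
{
    uint64_t in1, in2;

    extr128(&in2, &in1, advance_both(0x1000, 0x2000, 16));
    assert(in1 == 0x1010);
    assert(in2 == 0x2010);
}
```

No side channel in the CPU state is needed once both results travel in the return value.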
[PULL 28/40] target/s390x: Use a single return for helper_divs64/u64
Pack the quotient and remainder into a single Int128. Use the divu128 primitive to remove the cpu_abort on 32-bit hosts. Reviewed-by: Philippe Mathieu-Daudé Acked-by: Ilya Leoshkevich Signed-off-by: Richard Henderson --- v2: Extended div test case to cover these insns. --- target/s390x/helper.h | 4 ++-- target/s390x/tcg/int_helper.c | 38 +-- target/s390x/tcg/translate.c | 14 + tests/tcg/s390x/div.c | 35 4 files changed, 56 insertions(+), 35 deletions(-) diff --git a/target/s390x/helper.h b/target/s390x/helper.h index bc828d976b..593f3c8bee 100644 --- a/target/s390x/helper.h +++ b/target/s390x/helper.h @@ -12,8 +12,8 @@ DEF_HELPER_3(clcl, i32, env, i32, i32) DEF_HELPER_FLAGS_4(clm, TCG_CALL_NO_WG, i32, env, i32, i32, i64) DEF_HELPER_FLAGS_3(divs32, TCG_CALL_NO_WG, i64, env, s64, s64) DEF_HELPER_FLAGS_3(divu32, TCG_CALL_NO_WG, i64, env, i64, i64) -DEF_HELPER_FLAGS_3(divs64, TCG_CALL_NO_WG, s64, env, s64, s64) -DEF_HELPER_FLAGS_4(divu64, TCG_CALL_NO_WG, i64, env, i64, i64, i64) +DEF_HELPER_FLAGS_3(divs64, TCG_CALL_NO_WG, i128, env, s64, s64) +DEF_HELPER_FLAGS_4(divu64, TCG_CALL_NO_WG, i128, env, i64, i64, i64) DEF_HELPER_3(srst, void, env, i32, i32) DEF_HELPER_3(srstu, void, env, i32, i32) DEF_HELPER_4(clst, i64, env, i64, i64, i64) diff --git a/target/s390x/tcg/int_helper.c b/target/s390x/tcg/int_helper.c index 7260583cf2..eb8e6dd1b5 100644 --- a/target/s390x/tcg/int_helper.c +++ b/target/s390x/tcg/int_helper.c @@ -76,46 +76,26 @@ uint64_t HELPER(divu32)(CPUS390XState *env, uint64_t a, uint64_t b64) } /* 64/64 -> 64 signed division */ -int64_t HELPER(divs64)(CPUS390XState *env, int64_t a, int64_t b) +Int128 HELPER(divs64)(CPUS390XState *env, int64_t a, int64_t b) { /* Catch divide by zero, and non-representable quotient (MIN / -1). 
*/ if (b == 0 || (b == -1 && a == (1ll << 63))) { tcg_s390_program_interrupt(env, PGM_FIXPT_DIVIDE, GETPC()); } -env->retxl = a % b; -return a / b; +return int128_make128(a / b, a % b); } /* 128 -> 64/64 unsigned division */ -uint64_t HELPER(divu64)(CPUS390XState *env, uint64_t ah, uint64_t al, -uint64_t b) +Int128 HELPER(divu64)(CPUS390XState *env, uint64_t ah, uint64_t al, uint64_t b) { -uint64_t ret; -/* Signal divide by zero. */ -if (b == 0) { -tcg_s390_program_interrupt(env, PGM_FIXPT_DIVIDE, GETPC()); -} -if (ah == 0) { -/* 64 -> 64/64 case */ -env->retxl = al % b; -ret = al / b; -} else { -/* ??? Move i386 idivq helper to host-utils. */ -#ifdef CONFIG_INT128 -__uint128_t a = ((__uint128_t)ah << 64) | al; -__uint128_t q = a / b; -env->retxl = a % b; -ret = q; -if (ret != q) { -tcg_s390_program_interrupt(env, PGM_FIXPT_DIVIDE, GETPC()); +if (b != 0) { +uint64_t r = divu128(&al, &ah, b); +if (ah == 0) { +return int128_make128(al, r); } -#else -/* 32-bit hosts would need special wrapper functionality - just abort if - we encounter such a case; it's very unlikely anyways. 
*/ -cpu_abort(env_cpu(env), "128 -> 64/64 division not implemented\n"); -#endif } -return ret; +/* divide by zero or overflow */ +tcg_s390_program_interrupt(env, PGM_FIXPT_DIVIDE, GETPC()); } uint64_t HELPER(cvd)(int32_t reg) diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c index 169f7ee1b2..6953b81de7 100644 --- a/target/s390x/tcg/translate.c +++ b/target/s390x/tcg/translate.c @@ -2409,15 +2409,21 @@ static DisasJumpType op_divu32(DisasContext *s, DisasOps *o) static DisasJumpType op_divs64(DisasContext *s, DisasOps *o) { -gen_helper_divs64(o->out2, cpu_env, o->in1, o->in2); -return_low128(o->out); +TCGv_i128 t = tcg_temp_new_i128(); + +gen_helper_divs64(t, cpu_env, o->in1, o->in2); +tcg_gen_extr_i128_i64(o->out2, o->out, t); +tcg_temp_free_i128(t); return DISAS_NEXT; } static DisasJumpType op_divu64(DisasContext *s, DisasOps *o) { -gen_helper_divu64(o->out2, cpu_env, o->out, o->out2, o->in2); -return_low128(o->out); +TCGv_i128 t = tcg_temp_new_i128(); + +gen_helper_divu64(t, cpu_env, o->out, o->out2, o->in2); +tcg_gen_extr_i128_i64(o->out2, o->out, t); +tcg_temp_free_i128(t); return DISAS_NEXT; } diff --git a/tests/tcg/s390x/div.c b/tests/tcg/s390x/div.c index 5807295614..6ad9900e08 100644 --- a/tests/tcg/s390x/div.c +++ b/tests/tcg/s390x/div.c @@ -33,8 +33,43 @@ static void test_dlr(void) assert(r == 1); } +static void test_dsgr(void) +{ +register int64_t r0 asm("r0") = -1; +register int64_t r1 asm("r1") = -4241; +int64_t b = 101, q, r; + +asm("dsgr %[r0],%[b]" +: [r0] "+r" (r0), [r1] "+r" (r1) +: [b] "r" (b) +: "cc"); +q = r1; +r = r0; +assert(q == -41); +assert(r == -
[PULL 27/40] target/s390x: Use a single return for helper_divs32/u32
Pack the quotient and remainder into a single uint64_t. Reviewed-by: Philippe Mathieu-Daudé Reviewed-by: David Hildenbrand Signed-off-by: Richard Henderson --- v2: Fix operand ordering; use tcg_extr32_i64. --- target/s390x/helper.h | 2 +- target/s390x/tcg/int_helper.c | 26 +- target/s390x/tcg/translate.c | 8 3 files changed, 18 insertions(+), 18 deletions(-) diff --git a/target/s390x/helper.h b/target/s390x/helper.h index 93923ca153..bc828d976b 100644 --- a/target/s390x/helper.h +++ b/target/s390x/helper.h @@ -10,7 +10,7 @@ DEF_HELPER_FLAGS_4(clc, TCG_CALL_NO_WG, i32, env, i32, i64, i64) DEF_HELPER_3(mvcl, i32, env, i32, i32) DEF_HELPER_3(clcl, i32, env, i32, i32) DEF_HELPER_FLAGS_4(clm, TCG_CALL_NO_WG, i32, env, i32, i32, i64) -DEF_HELPER_FLAGS_3(divs32, TCG_CALL_NO_WG, s64, env, s64, s64) +DEF_HELPER_FLAGS_3(divs32, TCG_CALL_NO_WG, i64, env, s64, s64) DEF_HELPER_FLAGS_3(divu32, TCG_CALL_NO_WG, i64, env, i64, i64) DEF_HELPER_FLAGS_3(divs64, TCG_CALL_NO_WG, s64, env, s64, s64) DEF_HELPER_FLAGS_4(divu64, TCG_CALL_NO_WG, i64, env, i64, i64, i64) diff --git a/target/s390x/tcg/int_helper.c b/target/s390x/tcg/int_helper.c index 954542388a..7260583cf2 100644 --- a/target/s390x/tcg/int_helper.c +++ b/target/s390x/tcg/int_helper.c @@ -34,45 +34,45 @@ #endif /* 64/32 -> 32 signed division */ -int64_t HELPER(divs32)(CPUS390XState *env, int64_t a, int64_t b64) +uint64_t HELPER(divs32)(CPUS390XState *env, int64_t a, int64_t b64) { -int32_t ret, b = b64; -int64_t q; +int32_t b = b64; +int64_t q, r; if (b == 0) { tcg_s390_program_interrupt(env, PGM_FIXPT_DIVIDE, GETPC()); } -ret = q = a / b; -env->retxl = a % b; +q = a / b; +r = a % b; /* Catch non-representable quotient. 
*/ -if (ret != q) { +if (q != (int32_t)q) { tcg_s390_program_interrupt(env, PGM_FIXPT_DIVIDE, GETPC()); } -return ret; +return deposit64(q, 32, 32, r); } /* 64/32 -> 32 unsigned division */ uint64_t HELPER(divu32)(CPUS390XState *env, uint64_t a, uint64_t b64) { -uint32_t ret, b = b64; -uint64_t q; +uint32_t b = b64; +uint64_t q, r; if (b == 0) { tcg_s390_program_interrupt(env, PGM_FIXPT_DIVIDE, GETPC()); } -ret = q = a / b; -env->retxl = a % b; +q = a / b; +r = a % b; /* Catch non-representable quotient. */ -if (ret != q) { +if (q != (uint32_t)q) { tcg_s390_program_interrupt(env, PGM_FIXPT_DIVIDE, GETPC()); } -return ret; +return deposit64(q, 32, 32, r); } /* 64/64 -> 64 signed division */ diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c index a339b277e9..169f7ee1b2 100644 --- a/target/s390x/tcg/translate.c +++ b/target/s390x/tcg/translate.c @@ -2395,15 +2395,15 @@ static DisasJumpType op_diag(DisasContext *s, DisasOps *o) static DisasJumpType op_divs32(DisasContext *s, DisasOps *o) { -gen_helper_divs32(o->out2, cpu_env, o->in1, o->in2); -return_low128(o->out); +gen_helper_divs32(o->out, cpu_env, o->in1, o->in2); +tcg_gen_extr32_i64(o->out2, o->out, o->out); return DISAS_NEXT; } static DisasJumpType op_divu32(DisasContext *s, DisasOps *o) { -gen_helper_divu32(o->out2, cpu_env, o->in1, o->in2); -return_low128(o->out); +gen_helper_divu32(o->out, cpu_env, o->in1, o->in2); +tcg_gen_extr32_i64(o->out2, o->out, o->out); return DISAS_NEXT; } -- 2.34.1
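The packing used here places the 32-bit quotient in the low half of the returned 64-bit value and the remainder in the high half, and the translator then splits the two halves again. The following C sketch models that round trip; pack_q_r, unpack_low and unpack_high are hypothetical helpers written for this example, not the QEMU deposit64 or tcg_gen_extr32_i64 implementations.

```c
#include <assert.h>
#include <stdint.h>

/* Keep the low 32 bits of q and place the low 32 bits of r at bits 32..63. */
static uint64_t pack_q_r(int64_t q, int64_t r)
{
    return ((uint64_t)(uint32_t)r << 32) | (uint32_t)q;
}

static uint32_t unpack_low(uint64_t v)
{
    return (uint32_t)v;           /* quotient */
}

static uint32_t unpack_high(uint64_t v)
{
    return (uint32_t)(v >> 32);   /* remainder */
}

/* Example use: the values expected by the div.c test. */
static void pack_q_r_example(void)
{
    uint64_t v = pack_q_r(-41, -100);

    assert(unpack_low(v) == (uint32_t)-41);
    assert(unpack_high(v) == (uint32_t)-100);
}
```

Because the quotient has already been checked to fit in 32 bits before packing, truncating it to the low half loses no information.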
[PULL 06/40] tcg: Introduce tcg_out_addi_ptr
Implement the function for arm, i386, and s390x, which will use it. Add stubs for all other backends. Reviewed-by: Alex Bennée Reviewed-by: Daniel Henrique Barboza Signed-off-by: Richard Henderson --- tcg/tcg.c| 2 ++ tcg/aarch64/tcg-target.c.inc | 7 +++ tcg/arm/tcg-target.c.inc | 20 tcg/i386/tcg-target.c.inc| 8 tcg/loongarch64/tcg-target.c.inc | 7 +++ tcg/mips/tcg-target.c.inc| 7 +++ tcg/ppc/tcg-target.c.inc | 7 +++ tcg/riscv/tcg-target.c.inc | 7 +++ tcg/s390x/tcg-target.c.inc | 7 +++ tcg/sparc64/tcg-target.c.inc | 7 +++ tcg/tci/tcg-target.c.inc | 7 +++ 11 files changed, 86 insertions(+) diff --git a/tcg/tcg.c b/tcg/tcg.c index cdfc50b164..8923b52044 100644 --- a/tcg/tcg.c +++ b/tcg/tcg.c @@ -104,6 +104,8 @@ static void tcg_out_ld(TCGContext *s, TCGType type, TCGReg ret, TCGReg arg1, static bool tcg_out_mov(TCGContext *s, TCGType type, TCGReg ret, TCGReg arg); static void tcg_out_movi(TCGContext *s, TCGType type, TCGReg ret, tcg_target_long arg); +static void tcg_out_addi_ptr(TCGContext *s, TCGReg, TCGReg, tcg_target_long) +__attribute__((unused)); static void tcg_out_exit_tb(TCGContext *s, uintptr_t arg); static void tcg_out_goto_tb(TCGContext *s, int which); static void tcg_out_op(TCGContext *s, TCGOpcode opc, diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc index 330d26b395..bd6da72678 100644 --- a/tcg/aarch64/tcg-target.c.inc +++ b/tcg/aarch64/tcg-target.c.inc @@ -1102,6 +1102,13 @@ static void tcg_out_movi(TCGContext *s, TCGType type, TCGReg rd, tcg_out_insn(s, 3305, LDR, 0, rd); } +static void tcg_out_addi_ptr(TCGContext *s, TCGReg rd, TCGReg rs, + tcg_target_long imm) +{ +/* This function is only used for passing structs by reference. */ +g_assert_not_reached(); +} + /* Define something more legible for general use. 
*/ #define tcg_out_ldst_r tcg_out_insn_3310 diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc index 0f5f9f4925..6e9e9b9b3f 100644 --- a/tcg/arm/tcg-target.c.inc +++ b/tcg/arm/tcg-target.c.inc @@ -2581,6 +2581,26 @@ static void tcg_out_movi(TCGContext *s, TCGType type, tcg_out_movi32(s, COND_AL, ret, arg); } +static void tcg_out_addi_ptr(TCGContext *s, TCGReg rd, TCGReg rs, + tcg_target_long imm) +{ +int enc, opc = ARITH_ADD; + +/* All of the easiest immediates to encode are positive. */ +if (imm < 0) { +imm = -imm; +opc = ARITH_SUB; +} +enc = encode_imm(imm); +if (enc >= 0) { +tcg_out_dat_imm(s, COND_AL, opc, rd, rs, enc); +} else { +tcg_out_movi32(s, COND_AL, TCG_REG_TMP, imm); +tcg_out_dat_reg(s, COND_AL, opc, rd, rs, +TCG_REG_TMP, SHIFT_IMM_LSL(0)); +} +} + /* Type is always V128, with I64 elements. */ static void tcg_out_dup2_vec(TCGContext *s, TCGReg rd, TCGReg rl, TCGReg rh) { diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc index c71c3e664d..7b573bd287 100644 --- a/tcg/i386/tcg-target.c.inc +++ b/tcg/i386/tcg-target.c.inc @@ -1069,6 +1069,14 @@ static void tcg_out_movi(TCGContext *s, TCGType type, } } +static void tcg_out_addi_ptr(TCGContext *s, TCGReg rd, TCGReg rs, + tcg_target_long imm) +{ +/* This function is only used for passing structs by reference. */ +tcg_debug_assert(TCG_TARGET_REG_BITS == 32); +tcg_out_modrm_offset(s, OPC_LEA, rd, rs, imm); +} + static inline void tcg_out_pushi(TCGContext *s, tcg_target_long val) { if (val == (int8_t)val) { diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc index ce4a153887..b6e2ff6213 100644 --- a/tcg/loongarch64/tcg-target.c.inc +++ b/tcg/loongarch64/tcg-target.c.inc @@ -417,6 +417,13 @@ static void tcg_out_addi(TCGContext *s, TCGType type, TCGReg rd, } } +static void tcg_out_addi_ptr(TCGContext *s, TCGReg rd, TCGReg rs, + tcg_target_long imm) +{ +/* This function is only used for passing structs by reference. 
*/ +g_assert_not_reached(); +} + static void tcg_out_ext8u(TCGContext *s, TCGReg ret, TCGReg arg) { tcg_out_opc_andi(s, ret, arg, 0xff); diff --git a/tcg/mips/tcg-target.c.inc b/tcg/mips/tcg-target.c.inc index 6e000d8e69..d419c4c1fc 100644 --- a/tcg/mips/tcg-target.c.inc +++ b/tcg/mips/tcg-target.c.inc @@ -550,6 +550,13 @@ static void tcg_out_movi(TCGContext *s, TCGType type, } } +static void tcg_out_addi_ptr(TCGContext *s, TCGReg rd, TCGReg rs, + tcg_target_long imm) +{ +/* This function is only used for passing structs by reference. */ +g_assert_not_reached(); +} + static void tcg_out_bswap16(TCGContext *s, TCGReg ret, TCGReg arg, int flags) { /* ret and arg can't be register tmp0 */ diff --git a/t
[PULL 24/40] tests/tcg/s390x: Add clst.c
From: Ilya Leoshkevich Add a basic test to prevent regressions. Signed-off-by: Ilya Leoshkevich Message-Id: <20221025213008.2209006-2-...@linux.ibm.com> Signed-off-by: Richard Henderson --- tests/tcg/s390x/clst.c | 82 + tests/tcg/s390x/Makefile.target | 1 + 2 files changed, 83 insertions(+) create mode 100644 tests/tcg/s390x/clst.c diff --git a/tests/tcg/s390x/clst.c b/tests/tcg/s390x/clst.c new file mode 100644 index 00..ed2fe7326c --- /dev/null +++ b/tests/tcg/s390x/clst.c @@ -0,0 +1,82 @@ +#define _GNU_SOURCE +#include +#include + +static int clst(char sep, const char **s1, const char **s2) +{ +const char *r1 = *s1; +const char *r2 = *s2; +int cc; + +do { +register int r0 asm("r0") = sep; + +asm("clst %[r1],%[r2]\n" +"ipm %[cc]\n" +"srl %[cc],28" +: [r1] "+r" (r1), [r2] "+r" (r2), "+r" (r0), [cc] "=r" (cc) +: +: "cc"); +*s1 = r1; +*s2 = r2; +} while (cc == 3); + +return cc; +} + +static const struct test { +const char *name; +char sep; +const char *s1; +const char *s2; +int exp_cc; +int exp_off; +} tests[] = { +{ +.name = "cc0", +.sep = 0, +.s1 = "aa", +.s2 = "aa", +.exp_cc = 0, +.exp_off = 0, +}, +{ +.name = "cc1", +.sep = 1, +.s1 = "a\x01", +.s2 = "aa\x01", +.exp_cc = 1, +.exp_off = 1, +}, +{ +.name = "cc2", +.sep = 2, +.s1 = "abc\x02", +.s2 = "abb\x02", +.exp_cc = 2, +.exp_off = 2, +}, +}; + +int main(void) +{ +const struct test *t; +const char *s1, *s2; +size_t i; +int cc; + +for (i = 0; i < sizeof(tests) / sizeof(tests[0]); i++) { +t = &tests[i]; +s1 = t->s1; +s2 = t->s2; +cc = clst(t->sep, &s1, &s2); +if (cc != t->exp_cc || +s1 != t->s1 + t->exp_off || +s2 != t->s2 + t->exp_off) { +fprintf(stderr, "%s\n", t->name); +return EXIT_FAILURE; +} +} + +return EXIT_SUCCESS; +} diff --git a/tests/tcg/s390x/Makefile.target b/tests/tcg/s390x/Makefile.target index ab7a3bcfb2..79250f31dd 100644 --- a/tests/tcg/s390x/Makefile.target +++ b/tests/tcg/s390x/Makefile.target @@ -25,6 +25,7 @@ TESTS+=signals-s390x TESTS+=branch-relative-long TESTS+=noexec TESTS+=div 
+TESTS+=clst Z13_TESTS=vistr $(Z13_TESTS): CFLAGS+=-march=z13 -O2 -- 2.34.1
[PULL 32/40] target/s390x: Copy wout_x1 to wout_x1_P
Make a copy of wout_x1 before modifying it, as wout_x1_P emphasizing that it operates on the out/out2 pair. The insns that use x1_P are data movement that will not change to Int128. Acked-by: Ilya Leoshkevich Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- target/s390x/tcg/insn-data.h.inc | 12 ++-- target/s390x/tcg/translate.c | 8 2 files changed, 14 insertions(+), 6 deletions(-) diff --git a/target/s390x/tcg/insn-data.h.inc b/target/s390x/tcg/insn-data.h.inc index 79c6ab509a..d0814cb218 100644 --- a/target/s390x/tcg/insn-data.h.inc +++ b/target/s390x/tcg/insn-data.h.inc @@ -422,7 +422,7 @@ F(0x3800, LER, RR_a, Z, 0, e2, 0, cond_e1e2, mov2, 0, IF_AFP1 | IF_AFP2) F(0x7800, LE, RX_a, Z, 0, m2_32u, 0, e1, mov2, 0, IF_AFP1) F(0xed64, LEY, RXY_a, LD, 0, m2_32u, 0, e1, mov2, 0, IF_AFP1) -F(0xb365, LXR, RRE, Z, x2h, x2l, 0, x1, movx, 0, IF_AFP1) +F(0xb365, LXR, RRE, Z, x2h, x2l, 0, x1_P, movx, 0, IF_AFP1) /* LOAD IMMEDIATE */ C(0xc001, LGFI,RIL_a, EI, 0, i2, 0, r1, mov2, 0) /* LOAD RELATIVE LONG */ @@ -461,7 +461,7 @@ C(0xe332, LTGF,RXY_a, GIE, 0, a2, r1, 0, ld32s, s64) F(0xb302, LTEBR, RRE, Z, 0, e2, 0, cond_e1e2, mov2, f32, IF_BFP) F(0xb312, LTDBR, RRE, Z, 0, f2, 0, f1, mov2, f64, IF_BFP) -F(0xb342, LTXBR, RRE, Z, x2h, x2l, 0, x1, movx, f128, IF_BFP) +F(0xb342, LTXBR, RRE, Z, x2h, x2l, 0, x1_P, movx, f128, IF_BFP) /* LOAD AND TRAP */ C(0xe39f, LAT, RXY_a, LAT, 0, m2_32u, r1, 0, lat, 0) C(0xe385, LGAT,RXY_a, LAT, 0, a2, r1, 0, lgat, 0) @@ -483,7 +483,7 @@ C(0xb913, LCGFR, RRE, Z, 0, r2_32s, r1, 0, neg, neg64) F(0xb303, LCEBR, RRE, Z, 0, e2, new, e1, negf32, f32, IF_BFP) F(0xb313, LCDBR, RRE, Z, 0, f2, new, f1, negf64, f64, IF_BFP) -F(0xb343, LCXBR, RRE, Z, x2h, x2l, new_P, x1, negf128, f128, IF_BFP) +F(0xb343, LCXBR, RRE, Z, x2h, x2l, new_P, x1_P, negf128, f128, IF_BFP) F(0xb373, LCDFR, RRE, FPSSH, 0, f2, new, f1, negf64, 0, IF_AFP1 | IF_AFP2) /* LOAD COUNT TO BLOCK BOUNDARY */ C(0xe727, LCBB,RXE, V, la2, 0, r1, 0, lcbb, 0) @@ -552,7 +552,7 @@ 
C(0xb911, LNGFR, RRE, Z, 0, r2_32s, r1, 0, nabs, nabs64) F(0xb301, LNEBR, RRE, Z, 0, e2, new, e1, nabsf32, f32, IF_BFP) F(0xb311, LNDBR, RRE, Z, 0, f2, new, f1, nabsf64, f64, IF_BFP) -F(0xb341, LNXBR, RRE, Z, x2h, x2l, new_P, x1, nabsf128, f128, IF_BFP) +F(0xb341, LNXBR, RRE, Z, x2h, x2l, new_P, x1_P, nabsf128, f128, IF_BFP) F(0xb371, LNDFR, RRE, FPSSH, 0, f2, new, f1, nabsf64, 0, IF_AFP1 | IF_AFP2) /* LOAD ON CONDITION */ C(0xb9f2, LOCR,RRF_c, LOC, r1, r2, new, r1_32, loc, 0) @@ -577,7 +577,7 @@ C(0xb910, LPGFR, RRE, Z, 0, r2_32s, r1, 0, abs, abs64) F(0xb300, LPEBR, RRE, Z, 0, e2, new, e1, absf32, f32, IF_BFP) F(0xb310, LPDBR, RRE, Z, 0, f2, new, f1, absf64, f64, IF_BFP) -F(0xb340, LPXBR, RRE, Z, x2h, x2l, new_P, x1, absf128, f128, IF_BFP) +F(0xb340, LPXBR, RRE, Z, x2h, x2l, new_P, x1_P, absf128, f128, IF_BFP) F(0xb370, LPDFR, RRE, FPSSH, 0, f2, new, f1, absf64, 0, IF_AFP1 | IF_AFP2) /* LOAD REVERSED */ C(0xb91f, LRVR,RRE, Z, 0, r2_32u, new, r1_32, rev32, 0) @@ -588,7 +588,7 @@ /* LOAD ZERO */ F(0xb374, LZER,RRE, Z, 0, 0, 0, e1, zero, 0, IF_AFP1) F(0xb375, LZDR,RRE, Z, 0, 0, 0, f1, zero, 0, IF_AFP1) -F(0xb376, LZXR,RRE, Z, 0, 0, 0, x1, zero2, 0, IF_AFP1) +F(0xb376, LZXR,RRE, Z, 0, 0, 0, x1_P, zero2, 0, IF_AFP1) /* LOAD FPC */ F(0xb29d, LFPC,S, Z, 0, m2_32u, 0, 0, sfpc, 0, IF_BFP) diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c index f3e4b70ed9..d25b6f3c03 100644 --- a/target/s390x/tcg/translate.c +++ b/target/s390x/tcg/translate.c @@ -5518,6 +5518,14 @@ static void wout_x1(DisasContext *s, DisasOps *o) } #define SPEC_wout_x1 SPEC_r1_f128 +static void wout_x1_P(DisasContext *s, DisasOps *o) +{ +int f1 = get_field(s, r1); +store_freg(f1, o->out); +store_freg(f1 + 2, o->out2); +} +#define SPEC_wout_x1_P SPEC_r1_f128 + static void wout_cond_r1r2_32(DisasContext *s, DisasOps *o) { if (get_field(s, r1) != get_field(s, r2)) { -- 2.34.1
[PATCH] KVM: dirty ring: check if vcpu is created before dirty_ring_reap_one
From: Weinan Liu The assertion '(dirty_gfns && ring_size)' in kvm_dirty_ring_reap_one fails if the vcpu has not finished being created yet. This bug occasionally occurs when I open 200+ qemu instances on my 16G 6-core x86 machine. It can be triggered reliably by inserting a 'sleep(10)' into kvm_vcpu_thread_fn as below-- static void *kvm_vcpu_thread_fn(void *arg) { CPUState *cpu = arg; int r; rcu_register_thread(); +sleep(10); qemu_mutex_lock_iothread(); qemu_thread_get_self(cpu->thread); cpu->thread_id = qemu_get_thread_id(); cpu->can_do_io = 1; where the dirty ring reaper wakes up before a vcpu has finished being created. Signed-off-by: Weinan Liu --- accel/kvm/kvm-all.c | 9 + 1 file changed, 9 insertions(+) diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c index 7e6a6076b1..840da7630e 100644 --- a/accel/kvm/kvm-all.c +++ b/accel/kvm/kvm-all.c @@ -719,6 +719,15 @@ static uint64_t kvm_dirty_ring_reap_locked(KVMState *s, CPUState* cpu) total = kvm_dirty_ring_reap_one(s, cpu); } else { CPU_FOREACH(cpu) { +/* + * Must ensure kvm_init_vcpu is finished, so cpu->kvm_dirty_gfns is + * available. + */ +while (cpu->created == false) { +qemu_mutex_unlock_iothread(); +qemu_mutex_lock_iothread(); +} + total += kvm_dirty_ring_reap_one(s, cpu); } } -- 2.25.1
[PULL 16/22] linux-user: Fix /proc/cpuinfo output for hppa
From: Helge Deller The hppa architecture provides its own output for the emulated /proc/cpuinfo file. Some userspace applications count (even if that's not the recommended way) the number of lines which start with "processor:" and assume that this number then reflects the number of online CPUs. Since hppa doesn't provide any such line, applications may assume "0" CPUs. One such issue can be seen in debian bug report: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1024653 Avoid such issues by adding a "processor:" line for each of the online CPUs. Signed-off-by: Helge Deller Reviewed-by: Philippe Mathieu-Daudé Reviewed-by: Richard Henderson Reviewed-by: Laurent Vivier Message-Id: Signed-off-by: Laurent Vivier --- linux-user/syscall.c | 16 +++- 1 file changed, 11 insertions(+), 5 deletions(-) diff --git a/linux-user/syscall.c b/linux-user/syscall.c index 1c42df651801..55d53b344b84 100644 --- a/linux-user/syscall.c +++ b/linux-user/syscall.c @@ -8232,11 +8232,17 @@ static int open_cpuinfo(CPUArchState *cpu_env, int fd) #if defined(TARGET_HPPA) static int open_cpuinfo(CPUArchState *cpu_env, int fd) { -dprintf(fd, "cpu family\t: PA-RISC 1.1e\n"); -dprintf(fd, "cpu\t\t: PA7300LC (PCX-L2)\n"); -dprintf(fd, "capabilities\t: os32\n"); -dprintf(fd, "model\t\t: 9000/778/B160L\n"); -dprintf(fd, "model name\t: Merlin L2 160 QEMU (9000/778/B160L)\n"); +int i, num_cpus; + +num_cpus = sysconf(_SC_NPROCESSORS_ONLN); +for (i = 0; i < num_cpus; i++) { +dprintf(fd, "processor\t: %d\n", i); +dprintf(fd, "cpu family\t: PA-RISC 1.1e\n"); +dprintf(fd, "cpu\t\t: PA7300LC (PCX-L2)\n"); +dprintf(fd, "capabilities\t: os32\n"); +dprintf(fd, "model\t\t: 9000/778/B160L - " +"Merlin L2 160 QEMU (9000/778/B160L)\n\n"); +} return 0; } #endif -- 2.39.1
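To illustrate why the per-CPU "processor" line matters, here is a small standalone sketch. It is not the QEMU implementation: format_cpuinfo() and count_processor_lines() are hypothetical helpers, the stanza is shortened to two fields, and the counting function only approximates what a line-counting userspace tool might do.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write one shortened cpuinfo stanza per CPU into buf.
 * Returns the number of bytes written, not counting the trailing NUL.
 */
static size_t format_cpuinfo(char *buf, size_t size, int num_cpus)
{
    size_t off = 0;

    assert(buf != NULL && size > 0);
    buf[0] = '\0';
    for (int i = 0; i < num_cpus; i++) {
        int n = snprintf(buf + off, size - off,
                         "processor\t: %d\n"
                         "cpu family\t: PA-RISC 1.1e\n\n", i);
        if (n < 0 || (size_t)n >= size - off) {
            buf[off] = '\0';    /* drop the truncated stanza and stop */
            break;
        }
        off += (size_t)n;
    }
    return off;
}

/* Count lines that begin with "processor", as some applications do. */
static int count_processor_lines(const char *text)
{
    int count = 0;
    const char *line = text;

    while (line != NULL && *line != '\0') {
        if (strncmp(line, "processor", 9) == 0) {
            count++;
        }
        line = strchr(line, '\n');
        if (line != NULL) {
            line++;             /* step past the newline */
        }
    }
    return count;
}
```

With no "processor" line in the output, a counter like this reports zero CPUs, which is the failure mode the patch avoids.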
[PULL 03/22] linux-user/strace: Add output for execveat() syscall
From: Drew DeVault Signed-off-by: Drew DeVault Message-Id: <20221104081015.706009-1-...@cmpwn.com> Suggested-by: Helge Deller [PMD: Split of bigger patch] Signed-off-by: Philippe Mathieu-Daudé Reviewed-by: Laurent Vivier Message-Id: <20221104173632.1052-4-phi...@linaro.org> Signed-off-by: Laurent Vivier --- linux-user/strace.c| 23 +++ linux-user/strace.list | 2 +- 2 files changed, 24 insertions(+), 1 deletion(-) diff --git a/linux-user/strace.c b/linux-user/strace.c index 3d11d2f75978..7bccb4f0c067 100644 --- a/linux-user/strace.c +++ b/linux-user/strace.c @@ -1104,6 +1104,16 @@ UNUSED static const struct flags clone_flags[] = { FLAG_END, }; +UNUSED static const struct flags execveat_flags[] = { +#ifdef AT_EMPTY_PATH +FLAG_GENERIC(AT_EMPTY_PATH), +#endif +#ifdef AT_SYMLINK_NOFOLLOW +FLAG_GENERIC(AT_SYMLINK_NOFOLLOW), +#endif +FLAG_END, +}; + UNUSED static const struct flags msg_flags[] = { /* send */ FLAG_GENERIC(MSG_CONFIRM), @@ -1976,6 +1986,19 @@ print_execve(CPUArchState *cpu_env, const struct syscallname *name, print_syscall_epilogue(name); } +static void +print_execveat(CPUArchState *cpu_env, const struct syscallname *name, + abi_long arg1, abi_long arg2, abi_long arg3, + abi_long arg4, abi_long arg5, abi_long arg6) +{ +print_syscall_prologue(name); +print_at_dirfd(arg1, 0); +print_string(arg2, 0); +print_execve_argv(arg3, 0); +print_flags(execveat_flags, arg5, 1); +print_syscall_epilogue(name); +} + #if defined(TARGET_NR_faccessat) || defined(TARGET_NR_faccessat2) static void print_faccessat(CPUArchState *cpu_env, const struct syscallname *name, diff --git a/linux-user/strace.list b/linux-user/strace.list index 3a898e2532d3..bb21c054148e 100644 --- a/linux-user/strace.list +++ b/linux-user/strace.list @@ -164,7 +164,7 @@ { TARGET_NR_execve, "execve" , NULL, print_execve, NULL }, #endif #ifdef TARGET_NR_execveat -{ TARGET_NR_execveat, "execveat" , NULL, NULL, NULL }, +{ TARGET_NR_execveat, "execveat" , NULL, print_execveat, NULL }, #endif #ifdef 
TARGET_NR_exec_with_loader { TARGET_NR_exec_with_loader, "exec_with_loader" , NULL, NULL, NULL }, -- 2.39.1
[PULL 02/22] linux-user/strace: Extract print_execve_argv() from print_execve()
From: Drew DeVault In order to add print_execveat() which re-use common code from print_execve(), extract print_execve_argv() from it. Signed-off-by: Drew DeVault Message-Id: <20221104081015.706009-1-...@cmpwn.com> [PMD: Split of bigger patch, filled description, fixed style] Signed-off-by: Philippe Mathieu-Daudé Reviewed-by: Laurent Vivier Message-Id: <20221104173632.1052-3-phi...@linaro.org> Signed-off-by: Laurent Vivier --- linux-user/strace.c | 71 + 1 file changed, 39 insertions(+), 32 deletions(-) diff --git a/linux-user/strace.c b/linux-user/strace.c index 25c47f03160d..3d11d2f75978 100644 --- a/linux-user/strace.c +++ b/linux-user/strace.c @@ -616,38 +616,6 @@ print_semctl(CPUArchState *cpu_env, const struct syscallname *name, } #endif -static void -print_execve(CPUArchState *cpu_env, const struct syscallname *name, - abi_long arg1, abi_long arg2, abi_long arg3, - abi_long arg4, abi_long arg5, abi_long arg6) -{ -abi_ulong arg_ptr_addr; -char *s; - -if (!(s = lock_user_string(arg1))) -return; -qemu_log("%s(\"%s\",{", name->name, s); -unlock_user(s, arg1, 0); - -for (arg_ptr_addr = arg2; ; arg_ptr_addr += sizeof(abi_ulong)) { -abi_ulong *arg_ptr, arg_addr; - -arg_ptr = lock_user(VERIFY_READ, arg_ptr_addr, sizeof(abi_ulong), 1); -if (!arg_ptr) -return; -arg_addr = tswapal(*arg_ptr); -unlock_user(arg_ptr, arg_ptr_addr, 0); -if (!arg_addr) -break; -if ((s = lock_user_string(arg_addr))) { -qemu_log("\"%s\",", s); -unlock_user(s, arg_addr, 0); -} -} - -qemu_log("NULL})"); -} - #ifdef TARGET_NR_ipc static void print_ipc(CPUArchState *cpu_env, const struct syscallname *name, @@ -1969,6 +1937,45 @@ print_execv(CPUArchState *cpu_env, const struct syscallname *name, } #endif +static void +print_execve_argv(abi_long argv, int last) +{ +abi_ulong arg_ptr_addr; +char *s; + +qemu_log("{"); +for (arg_ptr_addr = argv; ; arg_ptr_addr += sizeof(abi_ulong)) { +abi_ulong *arg_ptr, arg_addr; + +arg_ptr = lock_user(VERIFY_READ, arg_ptr_addr, sizeof(abi_ulong), 1); +if (!arg_ptr) { 
+return; +} +arg_addr = tswapal(*arg_ptr); +unlock_user(arg_ptr, arg_ptr_addr, 0); +if (!arg_addr) { +break; +} +s = lock_user_string(arg_addr); +if (s) { +qemu_log("\"%s\",", s); +unlock_user(s, arg_addr, 0); +} +} +qemu_log("NULL}%s", get_comma(last)); +} + +static void +print_execve(CPUArchState *cpu_env, const struct syscallname *name, + abi_long arg1, abi_long arg2, abi_long arg3, + abi_long arg4, abi_long arg5, abi_long arg6) +{ +print_syscall_prologue(name); +print_string(arg1, 0); +print_execve_argv(arg2, 1); +print_syscall_epilogue(name); +} + #if defined(TARGET_NR_faccessat) || defined(TARGET_NR_faccessat2) static void print_faccessat(CPUArchState *cpu_env, const struct syscallname *name, -- 2.39.1
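The output shape produced by print_execve_argv() can be sketched outside QEMU. The code below is a simplified illustration that walks a host-side NULL-terminated argv instead of guest memory; appendf() and format_argv() are hypothetical helpers, and the `last` handling assumes that get_comma(last) yields an empty string for the final argument and "," otherwise.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Append formatted text to the NUL-terminated string in buf, if room remains. */
static void appendf(char *buf, size_t size, const char *fmt, ...)
{
    size_t used = strlen(buf);
    va_list ap;

    if (used + 1 >= size) {
        return;
    }
    va_start(ap, fmt);
    vsnprintf(buf + used, size - used, fmt, ap);
    va_end(ap);
}

/* Render argv as {"a","b",NULL} followed by a comma unless it is the last argument. */
static void format_argv(char *buf, size_t size, const char *const *argv,
                        int last)
{
    assert(buf != NULL && size > 0 && argv != NULL);
    buf[0] = '\0';
    appendf(buf, size, "{");
    for (size_t i = 0; argv[i] != NULL; i++) {
        appendf(buf, size, "\"%s\",", argv[i]);
    }
    appendf(buf, size, "NULL}%s", last ? "" : ",");
}
```

Factoring the argv walk into its own function is what allows execve() to print it as the last argument while execveat() prints it in the middle of the argument list.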
[PULL 17/22] linux-user: Improve strace output of personality() and sysinfo()
From: Helge Deller Make the strace look nicer for those two syscalls. Signed-off-by: Helge Deller Reviewed-by: Richard Henderson Reviewed-by: Laurent Vivier Message-Id: Signed-off-by: Laurent Vivier --- linux-user/strace.list | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/linux-user/strace.list b/linux-user/strace.list index cf291d02edfe..3a1f61803a39 100644 --- a/linux-user/strace.list +++ b/linux-user/strace.list @@ -1049,7 +1049,8 @@ { TARGET_NR_perfctr, "perfctr" , NULL, NULL, NULL }, #endif #ifdef TARGET_NR_personality -{ TARGET_NR_personality, "personality" , NULL, NULL, NULL }, +{ TARGET_NR_personality, "personality" , "%s(0x"TARGET_ABI_FMT_lx")", NULL, + print_syscall_ret_addr }, #endif #ifdef TARGET_NR_pipe { TARGET_NR_pipe, "pipe" , NULL, NULL, NULL }, @@ -1504,7 +1505,7 @@ { TARGET_NR_sysfs, "sysfs" , NULL, NULL, NULL }, #endif #ifdef TARGET_NR_sysinfo -{ TARGET_NR_sysinfo, "sysinfo" , NULL, NULL, NULL }, +{ TARGET_NR_sysinfo, "sysinfo" , "%s(%p)", NULL, NULL }, #endif #ifdef TARGET_NR_sys_kexec_load { TARGET_NR_sys_kexec_load, "sys_kexec_load" , NULL, NULL, NULL }, -- 2.39.1
[PULL 12/22] linux-user: Add strace output for clock_getres_time64() and futex_time64()
From: Helge Deller Add the two syscalls to strace output to avoid "Unknown syscall" message. Signed-off-by: Helge Deller Reviewed-by: Laurent Vivier Message-Id: <20230115113517.25143-1-del...@gmx.de> Signed-off-by: Laurent Vivier --- linux-user/strace.list | 6 ++ 1 file changed, 6 insertions(+) diff --git a/linux-user/strace.list b/linux-user/strace.list index bb21c054148e..64db8e6b8412 100644 --- a/linux-user/strace.list +++ b/linux-user/strace.list @@ -86,6 +86,9 @@ { TARGET_NR_clock_getres, "clock_getres" , NULL, print_clock_getres, print_syscall_ret_clock_getres }, #endif +#ifdef TARGET_NR_clock_getres_time64 +{ TARGET_NR_clock_getres_time64, "clock_getres_time64" , NULL, NULL, NULL }, +#endif #ifdef TARGET_NR_clock_gettime { TARGET_NR_clock_gettime, "clock_gettime" , NULL, print_clock_gettime, print_syscall_ret_clock_gettime }, @@ -275,6 +278,9 @@ #ifdef TARGET_NR_futex { TARGET_NR_futex, "futex" , NULL, print_futex, NULL }, #endif +#ifdef TARGET_NR_futex_time64 +{ TARGET_NR_futex_time64, "futex_time64" , NULL, NULL, NULL }, +#endif #ifdef TARGET_NR_futimesat { TARGET_NR_futimesat, "futimesat" , NULL, print_futimesat, NULL }, #endif -- 2.39.1
[PULL 19/22] linux-user: Show 4th argument of rt_sigprocmask() in strace
From: Helge Deller Add output for the missing 4th parameter (size_t sigsetsize). Signed-off-by: Helge Deller Reviewed-by: Richard Henderson Reviewed-by: Laurent Vivier Message-Id: Signed-off-by: Laurent Vivier --- linux-user/strace.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/linux-user/strace.c b/linux-user/strace.c index f38227ba5db5..340010661c4f 100644 --- a/linux-user/strace.c +++ b/linux-user/strace.c @@ -3224,7 +3224,8 @@ print_rt_sigprocmask(CPUArchState *cpu_env, const struct syscallname *name, } qemu_log("%s,", how); print_pointer(arg1, 0); -print_pointer(arg2, 1); +print_pointer(arg2, 0); +print_raw_param("%u", arg3, 1); print_syscall_epilogue(name); } #endif -- 2.39.1
[PULL 06/22] linux-user: Add missing MAP_HUGETLB and MAP_STACK flags in strace
From: Helge Deller Add two missing mmap flags. Signed-off-by: Helge Deller Reviewed-by: Laurent Vivier Message-Id: Signed-off-by: Laurent Vivier --- linux-user/strace.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/linux-user/strace.c b/linux-user/strace.c index 7bccb4f0c067..5027289bdde4 100644 --- a/linux-user/strace.c +++ b/linux-user/strace.c @@ -1057,6 +1057,8 @@ UNUSED static const struct flags mmap_flags[] = { #ifdef TARGET_MAP_UNINITIALIZED FLAG_TARGET(MAP_UNINITIALIZED), #endif +FLAG_TARGET(MAP_HUGETLB), +FLAG_TARGET(MAP_STACK), FLAG_END, }; -- 2.39.1
[PULL 15/22] linux-user: Fix SO_ERROR return code of getsockopt()
From: Helge Deller Add translation for the host error return code of: getsockopt(19, SOL_SOCKET, SO_ERROR, [ECONNREFUSED], [4]) = 0 This fixes the testsuite of the cockpit debian package with a hppa-linux guest on an x86-64 host. Signed-off-by: Helge Deller Reviewed-by: Richard Henderson Reviewed-by: Laurent Vivier Message-Id: Signed-off-by: Laurent Vivier --- linux-user/syscall.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/linux-user/syscall.c b/linux-user/syscall.c index 210db5f0be94..1c42df651801 100644 --- a/linux-user/syscall.c +++ b/linux-user/syscall.c @@ -2758,8 +2758,13 @@ get_timeout: ret = get_errno(getsockopt(sockfd, level, optname, &val, &lv)); if (ret < 0) return ret; -if (optname == SO_TYPE) { +switch (optname) { +case SO_TYPE: val = host_to_target_sock_type(val); +break; +case SO_ERROR: +val = host_to_target_errno(val); +break; } if (len > lv) len = lv; -- 2.39.1
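The idea behind this fix is that SO_ERROR returns an errno value as option data, so it needs the same host-to-target mapping as a syscall return code. A minimal sketch follows; every name in it is hypothetical, and SKETCH_TARGET_ECONNREFUSED is a made-up number that stands in for whatever value the guest ABI actually uses.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical option identifiers; the real code switches on SO_* constants. */
enum sketch_optname { SKETCH_SO_TYPE, SKETCH_SO_ERROR, SKETCH_SO_OTHER };

/* Made-up guest errno value, for illustration only. */
#define SKETCH_TARGET_ECONNREFUSED 1111

/* Map a host errno to the guest's numbering; unknown values pass through. */
static int sketch_host_to_target_errno(int host_err)
{
    static const struct { int host; int target; } map[] = {
        { ECONNREFUSED, SKETCH_TARGET_ECONNREFUSED },
    };

    for (size_t i = 0; i < sizeof(map) / sizeof(map[0]); i++) {
        if (map[i].host == host_err) {
            return map[i].target;
        }
    }
    return host_err;
}

/* Translate the value returned by getsockopt() according to the option. */
static int sketch_translate_optval(enum sketch_optname optname, int val)
{
    assert(val >= 0);
    switch (optname) {
    case SKETCH_SO_ERROR:
        return sketch_host_to_target_errno(val);
    case SKETCH_SO_TYPE:    /* socket type translation omitted in this sketch */
    case SKETCH_SO_OTHER:
    default:
        return val;
    }
}
```

Without the SO_ERROR case, a guest whose errno numbering differs from the host's would read the pending error as a different error.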
[PULL 09/22] linux-user: add more netlink protocol constants
From: Letu Ren Currently, qemu strace only prints four protocol constants. This patch adds others listed in "linux/netlink.h". Signed-off-by: Letu Ren Message-Id: <20230101141105.12024-1-fantasq...@gmail.com> Signed-off-by: Laurent Vivier --- linux-user/strace.c | 48 + 1 file changed, 48 insertions(+) diff --git a/linux-user/strace.c b/linux-user/strace.c index 081fc87344ca..f38227ba5db5 100644 --- a/linux-user/strace.c +++ b/linux-user/strace.c @@ -506,21 +506,69 @@ print_socket_protocol(int domain, int type, int protocol) case NETLINK_ROUTE: qemu_log("NETLINK_ROUTE"); break; +case NETLINK_UNUSED: +qemu_log("NETLINK_UNUSED"); +break; +case NETLINK_USERSOCK: +qemu_log("NETLINK_USERSOCK"); +break; +case NETLINK_FIREWALL: +qemu_log("NETLINK_FIREWALL"); +break; +case NETLINK_SOCK_DIAG: +qemu_log("NETLINK_SOCK_DIAG"); +break; +case NETLINK_NFLOG: +qemu_log("NETLINK_NFLOG"); +break; +case NETLINK_XFRM: +qemu_log("NETLINK_XFRM"); +break; +case NETLINK_SELINUX: +qemu_log("NETLINK_SELINUX"); +break; +case NETLINK_ISCSI: +qemu_log("NETLINK_ISCSI"); +break; case NETLINK_AUDIT: qemu_log("NETLINK_AUDIT"); break; +case NETLINK_FIB_LOOKUP: +qemu_log("NETLINK_FIB_LOOKUP"); +break; +case NETLINK_CONNECTOR: +qemu_log("NETLINK_CONNECTOR"); +break; case NETLINK_NETFILTER: qemu_log("NETLINK_NETFILTER"); break; +case NETLINK_IP6_FW: +qemu_log("NETLINK_IP6_FW"); +break; +case NETLINK_DNRTMSG: +qemu_log("NETLINK_DNRTMSG"); +break; case NETLINK_KOBJECT_UEVENT: qemu_log("NETLINK_KOBJECT_UEVENT"); break; +case NETLINK_GENERIC: +qemu_log("NETLINK_GENERIC"); +break; +case NETLINK_SCSITRANSPORT: +qemu_log("NETLINK_SCSITRANSPORT"); +break; +case NETLINK_ECRYPTFS: +qemu_log("NETLINK_ECRYPTFS"); +break; case NETLINK_RDMA: qemu_log("NETLINK_RDMA"); break; case NETLINK_CRYPTO: qemu_log("NETLINK_CRYPTO"); break; +case NETLINK_SMC: +qemu_log("NETLINK_SMC"); +break; default: qemu_log("%d", protocol); break; -- 2.39.1
[PULL 18/22] linux-user: Add emulation for MADV_WIPEONFORK and MADV_KEEPONFORK in madvise()
From: Helge Deller Both parameters have a different value on the parisc platform, so first translate the target value into a host value for usage in the native madvise() syscall. Those parameters are often used by security sensitive applications (e.g. tor browser, boringssl, ...) which expect the call to return a proper return code on failure, so return -EINVAL if qemu fails to forward the syscall to the host OS. While touching this code, enhance the comments about MADV_DONTNEED. Tested with testcase of tor browser when running hppa-linux guest on x86-64 host. Signed-off-by: Helge Deller Acked-by: Ilya Leoshkevich Reviewed-by: Laurent Vivier Message-Id: Signed-off-by: Laurent Vivier --- linux-user/mmap.c | 56 --- 1 file changed, 43 insertions(+), 13 deletions(-) diff --git a/linux-user/mmap.c b/linux-user/mmap.c index 10f5079331c3..28135c9e6aa9 100644 --- a/linux-user/mmap.c +++ b/linux-user/mmap.c @@ -857,7 +857,7 @@ abi_long target_mremap(abi_ulong old_addr, abi_ulong old_size, return new_addr; } -static bool can_passthrough_madv_dontneed(abi_ulong start, abi_ulong end) +static bool can_passthrough_madvise(abi_ulong start, abi_ulong end) { ulong addr; @@ -901,23 +901,53 @@ abi_long target_madvise(abi_ulong start, abi_ulong len_in, int advice) return -TARGET_EINVAL; } +/* Translate for some architectures which have different MADV_xxx values */ +switch (advice) { +case TARGET_MADV_DONTNEED: /* alpha */ +advice = MADV_DONTNEED; +break; +case TARGET_MADV_WIPEONFORK:/* parisc */ +advice = MADV_WIPEONFORK; +break; +case TARGET_MADV_KEEPONFORK:/* parisc */ +advice = MADV_KEEPONFORK; +break; +/* we do not care about the other MADV_xxx values yet */ +} + /* - * A straight passthrough may not be safe because qemu sometimes turns - * private file-backed mappings into anonymous mappings. + * Most advice values are hints, so ignoring and returning success is ok. 
+ * + * However, some advice values such as MADV_DONTNEED, MADV_WIPEONFORK and + * MADV_KEEPONFORK are not hints and need to be emulated. * - * This is a hint, so ignoring and returning success is ok. + * A straight passthrough for those may not be safe because qemu sometimes + * turns private file-backed mappings into anonymous mappings. + * can_passthrough_madvise() helps to check if a passthrough is possible by + * comparing mappings that are known to have the same semantics in the host + * and the guest. In this case passthrough is safe. * - * This breaks MADV_DONTNEED, completely implementing which is quite - * complicated. However, there is one low-hanging fruit: mappings that are - * known to have the same semantics in the host and the guest. In this case - * passthrough is safe, so do it. + * We pass through MADV_WIPEONFORK and MADV_KEEPONFORK if possible and + * return failure if not. + * + * MADV_DONTNEED is passed through as well, if possible. + * If passthrough isn't possible, we nevertheless (wrongly!) return + * success, which is broken but some userspace programs fail to work + * otherwise. Completely implementing such emulation is quite complicated + * though. */ mmap_lock(); -if (advice == TARGET_MADV_DONTNEED && -can_passthrough_madv_dontneed(start, end)) { -ret = get_errno(madvise(g2h_untagged(start), len, MADV_DONTNEED)); -if (ret == 0) { -page_reset_target_data(start, start + len); +switch (advice) { +case MADV_WIPEONFORK: +case MADV_KEEPONFORK: +ret = -EINVAL; +/* fall through */ +case MADV_DONTNEED: +if (can_passthrough_madvise(start, end)) { +ret = get_errno(madvise(g2h_untagged(start), len, advice)); +if ((advice == MADV_DONTNEED) && (ret == 0)) { +page_reset_target_data(start, start + len); +} } } mmap_unlock(); -- 2.39.1
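The control flow of the new switch in target_madvise() is easy to misread because of the fall-through, so here is a reduced model of it. This is a sketch under assumptions: the enum values and emulate_madvise() are hypothetical, the host madvise() result is passed in as a parameter instead of being obtained from a real call, and `ret` is assumed to start at 0 as for a plain hint.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, already-translated advice values. */
enum sketch_advice {
    SKETCH_DONTNEED,
    SKETCH_WIPEONFORK,
    SKETCH_KEEPONFORK,
    SKETCH_OTHER_HINT
};

/*
 * Return what the guest would see. host_result is the value a passthrough
 * madvise() call would have produced (0 or a negative errno).
 */
static int emulate_madvise(enum sketch_advice advice, bool can_passthrough,
                           int host_result)
{
    int ret = 0;    /* hints are ignored and reported as success */

    assert(host_result <= 0);
    switch (advice) {
    case SKETCH_WIPEONFORK:
    case SKETCH_KEEPONFORK:
        ret = -EINVAL;  /* not a hint: fail unless passthrough succeeds */
        /* fall through */
    case SKETCH_DONTNEED:
        if (can_passthrough) {
            ret = host_result;
        }
        break;
    default:
        break;
    }
    return ret;
}
```

The asymmetry is deliberate in the patch: MADV_DONTNEED keeps returning success when passthrough is impossible, while the two fork-related values report -EINVAL so that security-sensitive callers notice.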
[PULL 07/22] linux-user: un-parent OBJECT(cpu) when closing thread
From: Richard Henderson This reinstates commit 52f0c1607671293afcdb2acc2f83e9bccbfa74bb: While forcing the CPU to unrealize by hand does trigger the clean-up code we never fully free resources because refcount never reaches zero. This is because QOM automatically added objects without an explicit parent to /unattached/, incrementing the refcount. Instead of manually triggering unrealization just unparent the object and let the device machinery deal with that for us. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/866 Signed-off-by: Alex Bennée Reviewed-by: Laurent Vivier Message-Id: <20220811151413.3350684-2-alex.ben...@linaro.org> The original patch tickled a problem in target/arm, and was reverted. But that problem is fixed as of commit 3b07a936d3bf. Signed-off-by: Richard Henderson Message-Id: <20230124201019.3935934-1-richard.hender...@linaro.org> Signed-off-by: Laurent Vivier --- linux-user/syscall.c | 13 +++-- 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/linux-user/syscall.c b/linux-user/syscall.c index 3e72bd333ede..dbf51e500b4f 100644 --- a/linux-user/syscall.c +++ b/linux-user/syscall.c @@ -8756,7 +8756,13 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1, if (CPU_NEXT(first_cpu)) { TaskState *ts = cpu->opaque; -object_property_set_bool(OBJECT(cpu), "realized", false, NULL); +if (ts->child_tidptr) { +put_user_u32(0, ts->child_tidptr); +do_sys_futex(g2h(cpu, ts->child_tidptr), + FUTEX_WAKE, INT_MAX, NULL, NULL, 0); +} + +object_unparent(OBJECT(cpu)); object_unref(OBJECT(cpu)); /* * At this point the CPU should be unrealized and removed @@ -8766,11 +8772,6 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1, pthread_mutex_unlock(&clone_lock); -if (ts->child_tidptr) { -put_user_u32(0, ts->child_tidptr); -do_sys_futex(g2h(cpu, ts->child_tidptr), - FUTEX_WAKE, INT_MAX, NULL, NULL, 0); -} thread_cpu = NULL; g_free(ts); rcu_unregister_thread(); -- 2.39.1
[PULL 08/22] linux-user: fix strace build w/out munlockall
From: Mike Frysinger Signed-off-by: Mike Frysinger Reviewed-by: Philippe Mathieu-Daudé Message-Id: <20230118090144.31155-1-vap...@gentoo.org> Signed-off-by: Laurent Vivier --- linux-user/strace.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/linux-user/strace.c b/linux-user/strace.c index 5027289bdde4..081fc87344ca 100644 --- a/linux-user/strace.c +++ b/linux-user/strace.c @@ -1360,7 +1360,8 @@ UNUSED static const struct flags termios_lflags[] = { FLAG_END, }; -UNUSED static const struct flags mlockall_flags[] = { +#ifdef TARGET_NR_mlockall +static const struct flags mlockall_flags[] = { FLAG_TARGET(MCL_CURRENT), FLAG_TARGET(MCL_FUTURE), #ifdef MCL_ONFAULT @@ -1368,6 +1369,7 @@ UNUSED static const struct flags mlockall_flags[] = { #endif FLAG_END, }; +#endif /* IDs of the various system clocks */ #define TARGET_CLOCK_REALTIME 0 -- 2.39.1
[PULL 22/22] linux-user: Allow sendmsg() without IOV
From: Helge Deller

Applications do call sendmsg() without any IOV, e.g.:

sendmsg(4, {msg_name=NULL, msg_namelen=0, msg_iov=NULL, msg_iovlen=0, msg_control=[{cmsg_len=36, cmsg_level=SOL_ALG, cmsg_type=0x2}], msg_controllen=40, msg_flags=0}, MSG_MORE) = 0
sendmsg(4, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="The quick brown fox jumps over t"..., iov_len=183}], msg_iovlen=1, msg_control=[{cmsg_len=20, cmsg_level=SOL_ALG, cmsg_type=0x3}], msg_controllen=24, msg_flags=0}, 0) = 183

The function do_sendrecvmsg_locked() is used for sendmsg() and
recvmsg() and calls lock_iovec() to lock the IOV into memory. For the
first sendmsg() above it returns NULL and thus wrongly skips the call
to the host sendmsg() syscall, which will break the calling
application.

Fix this issue by:
- allowing sendmsg() even with empty IOV
- skip recvmsg() if IOV is NULL
- skip both if the return code of do_sendrecvmsg_locked() != 0, which
  indicates some failure like EFAULT on the IOV

Tested with the debian "ell" package with hppa guest on x86_64 host.

Signed-off-by: Helge Deller
Reviewed-by: Laurent Vivier
Message-Id: <20221212173416.90590-2-del...@gmx.de>
Signed-off-by: Laurent Vivier
---
 linux-user/syscall.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index a0d2beddaa4e..1e868e9b0e27 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -3293,7 +3293,10 @@ static abi_long do_sendrecvmsg_locked(int fd, struct target_msghdr *msgp,
 target_vec, count, send);
 if (vec == NULL) {
 ret = -host_to_target_errno(errno);
-goto out2;
+/* allow sending packet without any iov, e.g. with MSG_MORE flag */
+if (!send || ret) {
+goto out2;
+}
 }
 msg.msg_iovlen = count;
 msg.msg_iov = vec;
@@ -3345,7 +3348,9 @@ static abi_long do_sendrecvmsg_locked(int fd, struct target_msghdr *msgp,
 }
 out:
-unlock_iovec(vec, target_vec, count, !send);
+if (vec) {
+unlock_iovec(vec, target_vec, count, !send);
+}
 out2:
 return ret;
 }
--
2.39.1
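The early-exit rule this patch introduces can be restated as a small predicate. This is only an illustrative sketch: `skip_host_call` is a hypothetical helper, not a function in QEMU, and it assumes `ret` already holds either 0 (the guest passed no IOV) or the negative errno recorded when locking the IOV failed.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of QEMU): decide whether the host
 * sendmsg()/recvmsg() call has to be skipped.
 *
 *   vec  - locked iovec, NULL if the guest passed none or locking failed
 *   send - true for sendmsg(), false for recvmsg()
 *   ret  - 0, or a negative errno recorded when locking the iovec failed
 */
static bool skip_host_call(const void *vec, bool send, long ret)
{
    if (vec != NULL) {
        return false;   /* normal case: there is data to transfer */
    }
    /* No iovec: only a sendmsg() without a recorded error may proceed. */
    return !send || ret != 0;
}
```

With this rule the first sendmsg() from the strace excerpt (msg_iov=NULL, control data only) still reaches the host, while a recvmsg() without an IOV, or any call whose IOV could not be locked, is skipped.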
[PULL 04/22] linux-user/syscall: Extract do_execve() from do_syscall1()
From: Drew DeVault execve() is a particular case of execveat(). In order to add do_execveat(), first factor do_execve() out. Signed-off-by: Drew DeVault Message-Id: <20221104081015.706009-1-...@cmpwn.com> [PMD: Split of bigger patch, filled description, fixed style] Signed-off-by: Philippe Mathieu-Daudé Reviewed-by: Laurent Vivier Message-Id: <20221104173632.1052-5-phi...@linaro.org> Signed-off-by: Laurent Vivier --- linux-user/syscall.c | 211 +++ 1 file changed, 114 insertions(+), 97 deletions(-) diff --git a/linux-user/syscall.c b/linux-user/syscall.c index 1f8c10f8ef94..11236d16a372 100644 --- a/linux-user/syscall.c +++ b/linux-user/syscall.c @@ -8357,6 +8357,119 @@ static int do_openat(CPUArchState *cpu_env, int dirfd, const char *pathname, int return safe_openat(dirfd, path(pathname), flags, mode); } +static int do_execve(CPUArchState *cpu_env, + abi_long pathname, abi_long guest_argp, + abi_long guest_envp) +{ +int ret; +char **argp, **envp; +int argc, envc; +abi_ulong gp; +abi_ulong addr; +char **q; +void *p; + +argc = 0; + +for (gp = guest_argp; gp; gp += sizeof(abi_ulong)) { +if (get_user_ual(addr, gp)) { +return -TARGET_EFAULT; +} +if (!addr) { +break; +} +argc++; +} +envc = 0; +for (gp = guest_envp; gp; gp += sizeof(abi_ulong)) { +if (get_user_ual(addr, gp)) { +return -TARGET_EFAULT; +} +if (!addr) { +break; +} +envc++; +} + +argp = g_new0(char *, argc + 1); +envp = g_new0(char *, envc + 1); + +for (gp = guest_argp, q = argp; gp; gp += sizeof(abi_ulong), q++) { +if (get_user_ual(addr, gp)) { +goto execve_efault; +} +if (!addr) { +break; +} +*q = lock_user_string(addr); +if (!*q) { +goto execve_efault; +} +} +*q = NULL; + +for (gp = guest_envp, q = envp; gp; gp += sizeof(abi_ulong), q++) { +if (get_user_ual(addr, gp)) { +goto execve_efault; +} +if (!addr) { +break; +} +*q = lock_user_string(addr); +if (!*q) { +goto execve_efault; +} +} +*q = NULL; + +/* + * Although execve() is not an interruptible syscall it is + * a special case where we must use the 
safe_syscall wrapper: + * if we allow a signal to happen before we make the host + * syscall then we will 'lose' it, because at the point of + * execve the process leaves QEMU's control. So we use the + * safe syscall wrapper to ensure that we either take the + * signal as a guest signal, or else it does not happen + * before the execve completes and makes it the other + * program's problem. + */ +p = lock_user_string(pathname); +if (!p) { +goto execve_efault; +} + +if (is_proc_myself(p, "exe")) { +ret = get_errno(safe_execve(exec_path, argp, envp)); +} else { +ret = get_errno(safe_execve(p, argp, envp)); +} + +unlock_user(p, pathname, 0); + +goto execve_end; + +execve_efault: +ret = -TARGET_EFAULT; + +execve_end: +for (gp = guest_argp, q = argp; *q; gp += sizeof(abi_ulong), q++) { +if (get_user_ual(addr, gp) || !addr) { +break; +} +unlock_user(*q, addr, 0); +} +for (gp = guest_envp, q = envp; *q; gp += sizeof(abi_ulong), q++) { +if (get_user_ual(addr, gp) || !addr) { +break; +} +unlock_user(*q, addr, 0); +} + +g_free(argp); +g_free(envp); +return ret; +} + #define TIMER_MAGIC 0x0caf #define TIMER_MAGIC_MASK 0x @@ -8867,103 +8980,7 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1, return ret; #endif case TARGET_NR_execve: -{ -char **argp, **envp; -int argc, envc; -abi_ulong gp; -abi_ulong guest_argp; -abi_ulong guest_envp; -abi_ulong addr; -char **q; - -argc = 0; -guest_argp = arg2; -for (gp = guest_argp; gp; gp += sizeof(abi_ulong)) { -if (get_user_ual(addr, gp)) -return -TARGET_EFAULT; -if (!addr) -break; -argc++; -} -envc = 0; -guest_envp = arg3; -for (gp = guest_envp; gp; gp += sizeof(abi_ulong)) { -if (get_user_ual(addr, gp)) -return -TARGET_EFAULT; -if (!addr) -break; -envc++; -} - -argp = g_new0(char *, argc + 1); -envp = g_new0(char *, envc + 1); - -for (gp = guest_argp, q = argp; gp; -
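The two-pass handling of argv and envp in do_execve() (count the entries first, then allocate count + 1 slots so the copy stays NULL-terminated) can be sketched on ordinary host pointers. The names `count_vector` and `copy_vector` are hypothetical, and the sketch deliberately leaves out the guest-memory access (get_user_ual(), lock_user_string()) that the real code needs.

```c
#include <stddef.h>
#include <stdlib.h>

/* First pass: count the entries of a NULL-terminated pointer vector. */
static size_t count_vector(const char *const *vec)
{
    size_t n = 0;

    /* A NULL vector counts as empty, like a zero guest pointer. */
    while (vec != NULL && vec[n] != NULL) {
        n++;
    }
    return n;
}

/*
 * Second pass: allocate n + 1 slots and copy the pointers.
 * Returns NULL on allocation failure; the caller frees the array
 * (but not the strings it points to).
 */
static const char **copy_vector(const char *const *vec)
{
    size_t n = count_vector(vec);
    const char **out = calloc(n + 1, sizeof(*out));

    if (out == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < n; i++) {
        out[i] = vec[i];
    }
    return out;     /* out[n] is already NULL because calloc() zeroes */
}
```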
[PULL 20/22] linux-user: Enhance strace output for various syscalls
From: Helge Deller Add appropriate strace printf formats for various Linux syscalls. Signed-off-by: Helge Deller Reviewed-by: Philippe Mathieu-Daudé Message-Id: Signed-off-by: Laurent Vivier --- linux-user/strace.list | 43 ++ 1 file changed, 23 insertions(+), 20 deletions(-) diff --git a/linux-user/strace.list b/linux-user/strace.list index 3a1f61803a39..d8acbeec6093 100644 --- a/linux-user/strace.list +++ b/linux-user/strace.list @@ -343,7 +343,7 @@ { TARGET_NR_getpagesize, "getpagesize" , NULL, NULL, NULL }, #endif #ifdef TARGET_NR_getpeername -{ TARGET_NR_getpeername, "getpeername" , NULL, NULL, NULL }, +{ TARGET_NR_getpeername, "getpeername" , "%s(%d,%p,%p)", NULL, NULL }, #endif #ifdef TARGET_NR_getpgid { TARGET_NR_getpgid, "getpgid" , "%s(%u)", NULL, NULL }, @@ -367,19 +367,19 @@ { TARGET_NR_getrandom, "getrandom", "%s(%p,%u,%u)", NULL, NULL }, #endif #ifdef TARGET_NR_getresgid -{ TARGET_NR_getresgid, "getresgid" , NULL, NULL, NULL }, +{ TARGET_NR_getresgid, "getresgid" , "%s(%p,%p,%p)", NULL, NULL }, #endif #ifdef TARGET_NR_getresgid32 { TARGET_NR_getresgid32, "getresgid32" , NULL, NULL, NULL }, #endif #ifdef TARGET_NR_getresuid -{ TARGET_NR_getresuid, "getresuid" , NULL, NULL, NULL }, +{ TARGET_NR_getresuid, "getresuid" , "%s(%p,%p,%p)", NULL, NULL }, #endif #ifdef TARGET_NR_getresuid32 { TARGET_NR_getresuid32, "getresuid32" , NULL, NULL, NULL }, #endif #ifdef TARGET_NR_getrlimit -{ TARGET_NR_getrlimit, "getrlimit" , NULL, NULL, NULL }, +{ TARGET_NR_getrlimit, "getrlimit" , "%s(%d,%p)", NULL, NULL }, #endif #ifdef TARGET_NR_get_robust_list { TARGET_NR_get_robust_list, "get_robust_list" , NULL, NULL, NULL }, @@ -391,10 +391,10 @@ { TARGET_NR_getsid, "getsid" , "%s(%d)", NULL, NULL }, #endif #ifdef TARGET_NR_getsockname -{ TARGET_NR_getsockname, "getsockname" , NULL, NULL, NULL }, +{ TARGET_NR_getsockname, "getsockname" , "%s(%d,%p,%p)", NULL, NULL }, #endif #ifdef TARGET_NR_getsockopt -{ TARGET_NR_getsockopt, "getsockopt" , NULL, NULL, NULL }, +{ 
TARGET_NR_getsockopt, "getsockopt" , "%s(%d,%d,%d,%p,%p)", NULL, NULL }, #endif #ifdef TARGET_NR_get_thread_area #if defined(TARGET_I386) && defined(TARGET_ABI32) @@ -1059,10 +1059,10 @@ { TARGET_NR_pivot_root, "pivot_root" , NULL, NULL, NULL }, #endif #ifdef TARGET_NR_poll -{ TARGET_NR_poll, "poll" , NULL, NULL, NULL }, +{ TARGET_NR_poll, "poll" , "%s(%p,%u,%d)", NULL, NULL }, #endif #ifdef TARGET_NR_ppoll -{ TARGET_NR_ppoll, "ppoll" , NULL, NULL, NULL }, +{ TARGET_NR_ppoll, "ppoll" , "%s(%p,%u,%p,%p)", NULL, NULL }, #endif #ifdef TARGET_NR_prctl { TARGET_NR_prctl, "prctl" , NULL, NULL, NULL }, @@ -1131,7 +1131,7 @@ { TARGET_NR_reboot, "reboot" , NULL, NULL, NULL }, #endif #ifdef TARGET_NR_recv -{ TARGET_NR_recv, "recv" , NULL, NULL, NULL }, +{ TARGET_NR_recv, "recv" , "%s(%d,%p,%u,%d)", NULL, NULL }, #endif #ifdef TARGET_NR_recvfrom { TARGET_NR_recvfrom, "recvfrom" , NULL, NULL, NULL }, @@ -1191,7 +1191,7 @@ { TARGET_NR_rt_sigqueueinfo, "rt_sigqueueinfo" , NULL, print_rt_sigqueueinfo, NULL }, #endif #ifdef TARGET_NR_rt_sigreturn -{ TARGET_NR_rt_sigreturn, "rt_sigreturn" , NULL, NULL, NULL }, +{ TARGET_NR_rt_sigreturn, "rt_sigreturn" , "%s(%p)", NULL, NULL }, #endif #ifdef TARGET_NR_rt_sigsuspend { TARGET_NR_rt_sigsuspend, "rt_sigsuspend" , NULL, NULL, NULL }, @@ -1203,16 +1203,19 @@ { TARGET_NR_rt_tgsigqueueinfo, "rt_tgsigqueueinfo" , NULL, print_rt_tgsigqueueinfo, NULL }, #endif #ifdef TARGET_NR_sched_getaffinity -{ TARGET_NR_sched_getaffinity, "sched_getaffinity" , NULL, NULL, NULL }, +{ TARGET_NR_sched_getaffinity, "sched_getaffinity" , "%s(%d,%u,%p)", NULL, NULL }, #endif #ifdef TARGET_NR_sched_get_affinity { TARGET_NR_sched_get_affinity, "sched_get_affinity" , NULL, NULL, NULL }, #endif #ifdef TARGET_NR_sched_getattr -{ TARGET_NR_sched_getattr, "sched_getattr" , NULL, NULL, NULL }, +{ TARGET_NR_sched_getattr, "sched_getattr" , "%s(%d,%p,%u,%u)", NULL, NULL }, +#endif +#ifdef TARGET_NR_sched_setattr +{ TARGET_NR_sched_setattr, "sched_setattr" , "%s(%p,%p)", 
NULL, NULL }, #endif #ifdef TARGET_NR_sched_getparam -{ TARGET_NR_sched_getparam, "sched_getparam" , NULL, NULL, NULL }, +{ TARGET_NR_sched_getparam, "sched_getparam" , "%s(%d,%p)", NULL, NULL }, #endif #ifdef TARGET_NR_sched_get_priority_max { TARGET_NR_sched_get_priority_max, "sched_get_priority_max" , NULL, NULL, NULL }, @@ -1227,7 +1230,7 @@ { TARGET_NR_sched_rr_get_interval, "sched_rr_get_interval" , NULL, NULL, NULL }, #endif #ifdef TARGET_NR_sched_setaffinity -{ TARGET_NR_sched_setaffinity, "sched_setaffinity" , NULL, NULL, NULL }, +{ TARGET_NR_sched_setaffinity, "sched_setaffinity" , "%s(%d,%u,%p)", NULL, NULL }, #endif #ifdef TARGET_NR_sched_setatt { TARGET_NR_sched_setatt, "sched_setatt" , NULL, NULL, NULL }, @@ -1360,23 +1363,23 @@ { TARGET_NR_setreuid32, "setreuid32" , "%s(%u,%u)", NULL, NULL }, #endif #ifdef TARGET_NR_setrlimit -{ TARGET_NR_setrlimit, "setrlim
[PULL 05/22] linux-user/syscall: Implement execveat()
From: Drew DeVault

References: https://gitlab.com/qemu-project/qemu/-/issues/1007
Signed-off-by: Drew DeVault
Reviewed-by: Laurent Vivier
Message-Id: <20221104081015.706009-1-...@cmpwn.com>
Signed-off-by: Philippe Mathieu-Daudé
Message-Id: <20221104173632.1052-6-phi...@linaro.org>
Signed-off-by: Laurent Vivier
---
 linux-user/syscall.c | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 11236d16a372..3e72bd333ede 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -696,7 +696,8 @@ safe_syscall4(pid_t, wait4, pid_t, pid, int *, status, int, options, \
 #endif
 safe_syscall5(int, waitid, idtype_t, idtype, id_t, id, siginfo_t *, infop, \
 int, options, struct rusage *, rusage)
-safe_syscall3(int, execve, const char *, filename, char **, argv, char **, envp)
+safe_syscall5(int, execveat, int, dirfd, const char *, filename,
+ char **, argv, char **, envp, int, flags)
 #if defined(TARGET_NR_select) || defined(TARGET_NR__newselect) || \
 defined(TARGET_NR_pselect6) || defined(TARGET_NR_pselect6_time64)
 safe_syscall6(int, pselect6, int, nfds, fd_set *, readfds, fd_set *, writefds, \
@@ -8357,9 +8358,9 @@ static int do_openat(CPUArchState *cpu_env, int dirfd, const char *pathname, int
 return safe_openat(dirfd, path(pathname), flags, mode);
 }
-static int do_execve(CPUArchState *cpu_env,
+static int do_execveat(CPUArchState *cpu_env, int dirfd,
 abi_long pathname, abi_long guest_argp,
- abi_long guest_envp)
+ abi_long guest_envp, int flags)
 {
 int ret;
 char **argp, **envp;
@@ -8439,9 +8440,9 @@ static int do_execve(CPUArchState *cpu_env,
 }
 if (is_proc_myself(p, "exe")) {
-ret = get_errno(safe_execve(exec_path, argp, envp));
+ret = get_errno(safe_execveat(dirfd, exec_path, argp, envp, flags));
 } else {
-ret = get_errno(safe_execve(p, argp, envp));
+ret = get_errno(safe_execveat(dirfd, p, argp, envp, flags));
 }
 unlock_user(p, pathname, 0);
@@ -8979,8 +8980,10 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1,
 unlock_user(p, arg2, 0);
 return ret;
 #endif
+case TARGET_NR_execveat:
+return do_execveat(cpu_env, arg1, arg2, arg3, arg4, arg5);
 case TARGET_NR_execve:
-return do_execve(cpu_env, arg1, arg2, arg3);
+return do_execveat(cpu_env, AT_FDCWD, arg1, arg2, arg3, 0);
 case TARGET_NR_chdir:
 if (!(p = lock_user_string(arg1)))
 return -TARGET_EFAULT;
--
2.39.1
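The last hunk treats execve() as the special case of execveat() with dirfd = AT_FDCWD and flags = 0. A minimal sketch of that argument mapping follows; `struct execveat_args` and `from_execve` are hypothetical names used only for illustration, and the guest addresses are modelled as plain `long` values.

```c
#include <fcntl.h>      /* AT_FDCWD on POSIX.1-2008 systems */

#ifndef AT_FDCWD
#define AT_FDCWD (-100) /* Linux value, used only if <fcntl.h> hides it */
#endif

/* Hypothetical container for the five execveat() arguments. */
struct execveat_args {
    int dirfd;
    long pathname;
    long argv;
    long envp;
    int flags;
};

/* Map the three execve() arguments onto the execveat() form. */
static struct execveat_args from_execve(long pathname, long argv, long envp)
{
    struct execveat_args a = { AT_FDCWD, pathname, argv, envp, 0 };

    return a;
}
```

Because a relative path with AT_FDCWD resolves against the current working directory and no flags are set, the resulting call behaves like the original execve().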
[PULL 11/22] Revert "linux-user: fix compat with glibc >= 2.36 sys/mount.h"
From: Daniel P. Berrangé

This reverts commit 3cd3df2a9584e6f753bb62a0028bd67124ab5532.

glibc has fixed (in 2.36.9000-40-g774058d729) the problem that caused
a clash when both sys/mount.h and linux/mount.h are included, and
backported this to the 2.36 stable release too:

https://sourceware.org/glibc/wiki/Release/2.36#Usage_of_.3Clinux.2Fmount.h.3E_and_.3Csys.2Fmount.h.3E

It is saner for QEMU to remove the workaround it applied for glibc
2.36 and expect distros to ship the 2.36 maint release with the fix.
This avoids needing to add a further workaround to QEMU to deal with
the fact that linux/btrfs.h now also pulls in linux/mount.h via
linux/fs.h since Linux 6.1

Signed-off-by: Daniel P. Berrangé
Reviewed-by: Marc-André Lureau
Message-Id: <20230110174901.2580297-3-berra...@redhat.com>
Signed-off-by: Laurent Vivier
---
 linux-user/syscall.c | 18 --
 meson.build | 2 --
 2 files changed, 20 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index b88f8ee96f0f..210db5f0be94 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -95,25 +95,7 @@
 #include
 #include
 #include
-
-#ifdef HAVE_SYS_MOUNT_FSCONFIG
-/*
- * glibc >= 2.36 linux/mount.h conflicts with sys/mount.h,
- * which in turn prevents use of linux/fs.h. So we have to
- * define the constants ourselves for now.
- */
-#define FS_IOC_GETFLAGS _IOR('f', 1, long)
-#define FS_IOC_SETFLAGS _IOW('f', 2, long)
-#define FS_IOC_GETVERSION _IOR('v', 1, long)
-#define FS_IOC_SETVERSION _IOW('v', 2, long)
-#define FS_IOC_FIEMAP _IOWR('f', 11, struct fiemap)
-#define FS_IOC32_GETFLAGS _IOR('f', 1, int)
-#define FS_IOC32_SETFLAGS _IOW('f', 2, int)
-#define FS_IOC32_GETVERSION _IOR('v', 1, int)
-#define FS_IOC32_SETVERSION _IOW('v', 2, int)
-#else
 #include
-#endif
 #include
 #if defined(CONFIG_FIEMAP)
 #include
diff --git a/meson.build b/meson.build
index 6d3b66562975..cccd19f864e3 100644
--- a/meson.build
+++ b/meson.build
@@ -2046,8 +2046,6 @@ config_host_data.set('HAVE_OPTRESET',
 cc.has_header_symbol('getopt.h', 'optreset'))
 config_host_data.set('HAVE_IPPROTO_MPTCP',
 cc.has_header_symbol('netinet/in.h', 'IPPROTO_MPTCP'))
-config_host_data.set('HAVE_SYS_MOUNT_FSCONFIG',
- cc.has_header_symbol('sys/mount.h', 'FSCONFIG_SET_FLAG'))
 # has_member
 config_host_data.set('HAVE_SIGEV_NOTIFY_THREAD_ID',
--
2.39.1
[PULL 10/22] Revert "linux-user: add more compat ioctl definitions"
From: Daniel P. Berrangé

This reverts commit c5495f4ecb0cdaaf2e9dddeb48f1689cdb520ca0.

glibc has fixed (in 2.36.9000-40-g774058d729) the problem that caused
a clash when both sys/mount.h and linux/mount.h are included, and
backported this to the 2.36 stable release too:

https://sourceware.org/glibc/wiki/Release/2.36#Usage_of_.3Clinux.2Fmount.h.3E_and_.3Csys.2Fmount.h.3E

It is saner for QEMU to remove the workaround it applied for glibc
2.36 and expect distros to ship the 2.36 maint release with the fix.
This avoids needing to add a further workaround to QEMU to deal with
the fact that linux/btrfs.h now also pulls in linux/mount.h via
linux/fs.h since Linux 6.1

Signed-off-by: Daniel P. Berrangé
Reviewed-by: Marc-André Lureau
Message-Id: <20230110174901.2580297-2-berra...@redhat.com>
Signed-off-by: Laurent Vivier
---
 linux-user/syscall.c | 25 -
 1 file changed, 25 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index dbf51e500b4f..b88f8ee96f0f 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -111,31 +111,6 @@
 #define FS_IOC32_SETFLAGS _IOW('f', 2, int)
 #define FS_IOC32_GETVERSION _IOR('v', 1, int)
 #define FS_IOC32_SETVERSION _IOW('v', 2, int)
-
-#define BLKGETSIZE64 _IOR(0x12,114,size_t)
-#define BLKDISCARD _IO(0x12,119)
-#define BLKIOMIN _IO(0x12,120)
-#define BLKIOOPT _IO(0x12,121)
-#define BLKALIGNOFF _IO(0x12,122)
-#define BLKPBSZGET _IO(0x12,123)
-#define BLKDISCARDZEROES _IO(0x12,124)
-#define BLKSECDISCARD _IO(0x12,125)
-#define BLKROTATIONAL _IO(0x12,126)
-#define BLKZEROOUT _IO(0x12,127)
-
-#define FIBMAP _IO(0x00,1)
-#define FIGETBSZ _IO(0x00,2)
-
-struct file_clone_range {
-__s64 src_fd;
-__u64 src_offset;
-__u64 src_length;
-__u64 dest_offset;
-};
-
-#define FICLONE _IOW(0x94, 9, int)
-#define FICLONERANGE _IOW(0x94, 13, struct file_clone_range)
-
 #else
 #include
 #endif
--
2.39.1
[PULL 01/22] linux-user/strace: Constify struct flags
From: Philippe Mathieu-Daudé print_flags() takes a const pointer. Signed-off-by: Philippe Mathieu-Daudé Reviewed-by: Laurent Vivier Message-Id: <20221104173632.1052-2-phi...@linaro.org> Signed-off-by: Laurent Vivier --- linux-user/strace.c | 40 1 file changed, 20 insertions(+), 20 deletions(-) diff --git a/linux-user/strace.c b/linux-user/strace.c index 9ae5a812cd71..25c47f03160d 100644 --- a/linux-user/strace.c +++ b/linux-user/strace.c @@ -945,7 +945,7 @@ print_syscall_ret_ioctl(CPUArchState *cpu_env, const struct syscallname *name, } #endif -UNUSED static struct flags access_flags[] = { +UNUSED static const struct flags access_flags[] = { FLAG_GENERIC(F_OK), FLAG_GENERIC(R_OK), FLAG_GENERIC(W_OK), @@ -953,7 +953,7 @@ UNUSED static struct flags access_flags[] = { FLAG_END, }; -UNUSED static struct flags at_file_flags[] = { +UNUSED static const struct flags at_file_flags[] = { #ifdef AT_EACCESS FLAG_GENERIC(AT_EACCESS), #endif @@ -963,14 +963,14 @@ UNUSED static struct flags at_file_flags[] = { FLAG_END, }; -UNUSED static struct flags unlinkat_flags[] = { +UNUSED static const struct flags unlinkat_flags[] = { #ifdef AT_REMOVEDIR FLAG_GENERIC(AT_REMOVEDIR), #endif FLAG_END, }; -UNUSED static struct flags mode_flags[] = { +UNUSED static const struct flags mode_flags[] = { FLAG_GENERIC(S_IFSOCK), FLAG_GENERIC(S_IFLNK), FLAG_GENERIC(S_IFREG), @@ -981,14 +981,14 @@ UNUSED static struct flags mode_flags[] = { FLAG_END, }; -UNUSED static struct flags open_access_flags[] = { +UNUSED static const struct flags open_access_flags[] = { FLAG_TARGET(O_RDONLY), FLAG_TARGET(O_WRONLY), FLAG_TARGET(O_RDWR), FLAG_END, }; -UNUSED static struct flags open_flags[] = { +UNUSED static const struct flags open_flags[] = { FLAG_TARGET(O_APPEND), FLAG_TARGET(O_CREAT), FLAG_TARGET(O_DIRECTORY), @@ -1019,7 +1019,7 @@ UNUSED static struct flags open_flags[] = { FLAG_END, }; -UNUSED static struct flags mount_flags[] = { +UNUSED static const struct flags mount_flags[] = { #ifdef MS_BIND 
FLAG_GENERIC(MS_BIND), #endif @@ -1044,7 +1044,7 @@ UNUSED static struct flags mount_flags[] = { FLAG_END, }; -UNUSED static struct flags umount2_flags[] = { +UNUSED static const struct flags umount2_flags[] = { #ifdef MNT_FORCE FLAG_GENERIC(MNT_FORCE), #endif @@ -1057,7 +1057,7 @@ UNUSED static struct flags umount2_flags[] = { FLAG_END, }; -UNUSED static struct flags mmap_prot_flags[] = { +UNUSED static const struct flags mmap_prot_flags[] = { FLAG_GENERIC(PROT_NONE), FLAG_GENERIC(PROT_EXEC), FLAG_GENERIC(PROT_READ), @@ -1068,7 +1068,7 @@ UNUSED static struct flags mmap_prot_flags[] = { FLAG_END, }; -UNUSED static struct flags mmap_flags[] = { +UNUSED static const struct flags mmap_flags[] = { FLAG_TARGET(MAP_SHARED), FLAG_TARGET(MAP_PRIVATE), FLAG_TARGET(MAP_ANONYMOUS), @@ -1092,7 +1092,7 @@ UNUSED static struct flags mmap_flags[] = { FLAG_END, }; -UNUSED static struct flags clone_flags[] = { +UNUSED static const struct flags clone_flags[] = { FLAG_GENERIC(CLONE_VM), FLAG_GENERIC(CLONE_FS), FLAG_GENERIC(CLONE_FILES), @@ -1136,7 +1136,7 @@ UNUSED static struct flags clone_flags[] = { FLAG_END, }; -UNUSED static struct flags msg_flags[] = { +UNUSED static const struct flags msg_flags[] = { /* send */ FLAG_GENERIC(MSG_CONFIRM), FLAG_GENERIC(MSG_DONTROUTE), @@ -1156,7 +1156,7 @@ UNUSED static struct flags msg_flags[] = { FLAG_END, }; -UNUSED static struct flags statx_flags[] = { +UNUSED static const struct flags statx_flags[] = { #ifdef AT_EMPTY_PATH FLAG_GENERIC(AT_EMPTY_PATH), #endif @@ -1178,7 +1178,7 @@ UNUSED static struct flags statx_flags[] = { FLAG_END, }; -UNUSED static struct flags statx_mask[] = { +UNUSED static const struct flags statx_mask[] = { /* This must come first, because it includes everything. 
*/ #ifdef STATX_ALL FLAG_GENERIC(STATX_ALL), @@ -1226,7 +1226,7 @@ UNUSED static struct flags statx_mask[] = { FLAG_END, }; -UNUSED static struct flags falloc_flags[] = { +UNUSED static const struct flags falloc_flags[] = { FLAG_GENERIC(FALLOC_FL_KEEP_SIZE), FLAG_GENERIC(FALLOC_FL_PUNCH_HOLE), #ifdef FALLOC_FL_NO_HIDE_STALE @@ -1246,7 +1246,7 @@ UNUSED static struct flags falloc_flags[] = { #endif }; -UNUSED static struct flags termios_iflags[] = { +UNUSED static const struct flags termios_iflags[] = { FLAG_TARGET(IGNBRK), FLAG_TARGET(BRKINT), FLAG_TARGET(IGNPAR), @@ -1265,7 +1265,7 @@ UNUSED static struct flags termios_iflags[] = { FLAG_END, }; -UNUSED static struct flags termios_oflags[] = { +UNUSED static const struct flags termios_oflags[] = { FLAG_TARGET(OPOST), FLAG_TARGET(OLCUC), FLAG_TARGET(ONLCR), @@ -1349,7 +1349,7 @@ UNUSED static struct enums termios_cflags_CS
[PULL 21/22] linux-user: Implement SOL_ALG encryption support
From: Helge Deller

Add support to handle SOL_ALG packets via sendmsg() and recvmsg().
This allows emulated userspace to use encryption functionality.

Tested with the debian ell package with hppa guest on x86_64 host.

Signed-off-by: Helge Deller
Reviewed-by: Laurent Vivier
Message-Id: <20221212173416.90590-1-del...@gmx.de>
Signed-off-by: Laurent Vivier
---
 linux-user/syscall.c | 8
 1 file changed, 8 insertions(+)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 55d53b344b84..a0d2beddaa4e 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -1829,6 +1829,14 @@ static inline abi_long target_to_host_cmsg(struct msghdr *msgh,
 __get_user(cred->pid, &target_cred->pid);
 __get_user(cred->uid, &target_cred->uid);
 __get_user(cred->gid, &target_cred->gid);
+} else if (cmsg->cmsg_level == SOL_ALG) {
+uint32_t *dst = (uint32_t *)data;
+
+memcpy(dst, target_data, len);
+/* fix endianess of first 32-bit word */
+if (len >= sizeof(uint32_t)) {
+*dst = tswap32(*dst);
+}
 } else {
 qemu_log_mask(LOG_UNIMP, "Unsupported ancillary data: %d/%d\n",
 cmsg->cmsg_level, cmsg->cmsg_type);
--
2.39.1
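The SOL_ALG branch copies the control payload verbatim and then fixes up only its first 32-bit word. A standalone sketch of that step is shown here under one loud assumption: guest and host always differ in byte order (as with the hppa guest on x86_64 host used for testing), so the swap is unconditional, whereas QEMU's tswap32() swaps only when the two actually differ. `bswap32_sketch` and `copy_alg_payload` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reverse the byte order of a 32-bit value. */
static uint32_t bswap32_sketch(uint32_t v)
{
    return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8) |
           ((v & 0x00ff0000u) >> 8)  | ((v & 0xff000000u) >> 24);
}

/*
 * Copy len bytes of control payload, then byte-swap the first 32-bit
 * word if the payload is long enough to contain one.
 * Assumes a cross-endian guest/host pair. Returns dst.
 */
static void *copy_alg_payload(void *dst, const void *src, size_t len)
{
    memcpy(dst, src, len);
    if (len >= sizeof(uint32_t)) {
        uint32_t first;

        /* memcpy avoids alignment assumptions about dst. */
        memcpy(&first, dst, sizeof(first));
        first = bswap32_sketch(first);
        memcpy(dst, &first, sizeof(first));
    }
    return dst;
}
```

Payloads shorter than four bytes are copied unchanged, which mirrors the `len >= sizeof(uint32_t)` guard in the patch.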
[PULL 00/22] Linux user for 8.0 patches
The following changes since commit 13356edb87506c148b163b8c7eb0695647d00c2a:

  Merge tag 'block-pull-request' of https://gitlab.com/stefanha/qemu into staging (2023-01-24 09:45:33 +)

are available in the Git repository at:

  https://gitlab.com/laurent_vivier/qemu.git tags/linux-user-for-8.0-pull-request

for you to fetch changes up to 3f0744f98b07c6fd2ce9d5840726d0915b2ae7c1:

  linux-user: Allow sendmsg() without IOV (2023-02-03 22:55:12 +0100)

linux-user branch pull request 20230204

Implement execveat()
un-parent OBJECT(cpu) when closing thread
Revert fix for glibc >= 2.36 sys/mount.h
Fix/update strace
move target_flat.h to target subdirs
Fix SO_ERROR return code of getsockopt()
Fix /proc/cpuinfo output for hppa
Add emulation for MADV_WIPEONFORK and MADV_KEEPONFORK in madvise()
Implement SOL_ALG encryption support
linux-user: Allow sendmsg() without IOV

Daniel P. Berrangé (2):
  Revert "linux-user: add more compat ioctl definitions"
  Revert "linux-user: fix compat with glibc >= 2.36 sys/mount.h"

Drew DeVault (4):
  linux-user/strace: Extract print_execve_argv() from print_execve()
  linux-user/strace: Add output for execveat() syscall
  linux-user/syscall: Extract do_execve() from do_syscall1()
  linux-user/syscall: Implement execveat()

Helge Deller (11):
  linux-user: Add missing MAP_HUGETLB and MAP_STACK flags in strace
  linux-user: Add strace output for clock_getres_time64() and futex_time64()
  linux-user: Improve strace output of getgroups() and setgroups()
  linux-user: Fix SO_ERROR return code of getsockopt()
  linux-user: Fix /proc/cpuinfo output for hppa
  linux-user: Improve strace output of personality() and sysinfo()
  linux-user: Add emulation for MADV_WIPEONFORK and MADV_KEEPONFORK in madvise()
  linux-user: Show 4th argument of rt_sigprocmask() in strace
  linux-user: Enhance strace output for various syscalls
  linux-user: Implement SOL_ALG encryption support
  linux-user: Allow sendmsg() without IOV

Letu Ren (1):
  linux-user: add more netlink protocol constants

Mike Frysinger (2):
  linux-user: fix strace build w/out munlockall
  linux-user: move target_flat.h to target subdirs

Philippe Mathieu-Daudé (1):
  linux-user/strace: Constify struct flags

Richard Henderson (1):
  linux-user: un-parent OBJECT(cpu) when closing thread

 linux-user/aarch64/target_flat.h | 1 +
 linux-user/arm/target_flat.h | 1 +
 linux-user/{ => generic}/target_flat.h | 0
 linux-user/m68k/target_flat.h | 1 +
 linux-user/microblaze/target_flat.h | 1 +
 linux-user/mmap.c | 56 +++--
 linux-user/sh4/target_flat.h | 1 +
 linux-user/strace.c | 189 ++-
 linux-user/strace.list | 64 ++---
 linux-user/syscall.c | 312 +
 meson.build | 2 -
 11 files changed, 378 insertions(+), 250 deletions(-)
 create mode 100644 linux-user/aarch64/target_flat.h
 create mode 100644 linux-user/arm/target_flat.h
 rename linux-user/{ => generic}/target_flat.h (100%)
 create mode 100644 linux-user/m68k/target_flat.h
 create mode 100644 linux-user/microblaze/target_flat.h
 create mode 100644 linux-user/sh4/target_flat.h

--
2.39.1
[PULL 13/22] linux-user: Improve strace output of getgroups() and setgroups()
From: Helge Deller

Make the strace look nicer for those syscalls.

Signed-off-by: Helge Deller
Reviewed-by: Laurent Vivier
Message-Id: <20230115210057.445132-1-del...@gmx.de>
Signed-off-by: Laurent Vivier
---
 linux-user/strace.list | 8
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/linux-user/strace.list b/linux-user/strace.list
index 64db8e6b8412..cf291d02edfe 100644
--- a/linux-user/strace.list
+++ b/linux-user/strace.list
@@ -321,10 +321,10 @@
 { TARGET_NR_getgid32, "getgid32" , NULL, NULL, NULL },
 #endif
 #ifdef TARGET_NR_getgroups
-{ TARGET_NR_getgroups, "getgroups" , NULL, NULL, NULL },
+{ TARGET_NR_getgroups, "getgroups" , "%s(%d,%p)", NULL, NULL },
 #endif
 #ifdef TARGET_NR_getgroups32
-{ TARGET_NR_getgroups32, "getgroups32" , NULL, NULL, NULL },
+{ TARGET_NR_getgroups32, "getgroups32" , "%s(%d,%p)", NULL, NULL },
 #endif
 #ifdef TARGET_NR_gethostname
 { TARGET_NR_gethostname, "gethostname" , NULL, NULL, NULL },
@@ -1304,10 +1304,10 @@
 { TARGET_NR_setgid32, "setgid32" , "%s(%u)", NULL, NULL },
 #endif
 #ifdef TARGET_NR_setgroups
-{ TARGET_NR_setgroups, "setgroups" , NULL, NULL, NULL },
+{ TARGET_NR_setgroups, "setgroups" , "%s(%d,%p)", NULL, NULL },
 #endif
 #ifdef TARGET_NR_setgroups32
-{ TARGET_NR_setgroups32, "setgroups32" , NULL, NULL, NULL },
+{ TARGET_NR_setgroups32, "setgroups32" , "%s(%d,%p)", NULL, NULL },
 #endif
 #ifdef TARGET_NR_sethae
 { TARGET_NR_sethae, "sethae" , NULL, NULL, NULL },
--
2.39.1
[PULL 14/22] linux-user: move target_flat.h to target subdirs
From: Mike Frysinger

This makes target_flat.h behave like every other target_xxx.h header.
It also makes it actually work -- while the current header says adding
a header to the target subdir overrides the common one, it doesn't.
This is for two reasons:
* meson.build adds -Ilinux-user before -Ilinux-user/$arch
* the compiler search path for "target_flat.h" looks in the same dir
  as the source file before searching -I paths.

This can be seen with the xtensa port -- the subdir settings aren't
used which breaks stack setup.

Move it to the generic/ subdir and add include stubs like every
other target_xxx.h header is handled.

Signed-off-by: Mike Frysinger
Reviewed-by: Richard Henderson
Message-Id: <20230129004625.11228-1-vap...@gentoo.org>
Signed-off-by: Laurent Vivier
---
 linux-user/aarch64/target_flat.h | 1 +
 linux-user/arm/target_flat.h | 1 +
 linux-user/{ => generic}/target_flat.h | 0
 linux-user/m68k/target_flat.h | 1 +
 linux-user/microblaze/target_flat.h | 1 +
 linux-user/sh4/target_flat.h | 1 +
 6 files changed, 5 insertions(+)
 create mode 100644 linux-user/aarch64/target_flat.h
 create mode 100644 linux-user/arm/target_flat.h
 rename linux-user/{ => generic}/target_flat.h (100%)
 create mode 100644 linux-user/m68k/target_flat.h
 create mode 100644 linux-user/microblaze/target_flat.h
 create mode 100644 linux-user/sh4/target_flat.h

diff --git a/linux-user/aarch64/target_flat.h b/linux-user/aarch64/target_flat.h
new file mode 100644
index ..bc83224cea12
--- /dev/null
+++ b/linux-user/aarch64/target_flat.h
@@ -0,0 +1 @@
+#include "../generic/target_flat.h"
diff --git a/linux-user/arm/target_flat.h b/linux-user/arm/target_flat.h
new file mode 100644
index ..bc83224cea12
--- /dev/null
+++ b/linux-user/arm/target_flat.h
@@ -0,0 +1 @@
+#include "../generic/target_flat.h"
diff --git a/linux-user/target_flat.h b/linux-user/generic/target_flat.h
similarity index 100%
rename from linux-user/target_flat.h
rename to linux-user/generic/target_flat.h
diff --git a/linux-user/m68k/target_flat.h b/linux-user/m68k/target_flat.h
new file mode 100644
index ..bc83224cea12
--- /dev/null
+++ b/linux-user/m68k/target_flat.h
@@ -0,0 +1 @@
+#include "../generic/target_flat.h"
diff --git a/linux-user/microblaze/target_flat.h b/linux-user/microblaze/target_flat.h
new file mode 100644
index ..bc83224cea12
--- /dev/null
+++ b/linux-user/microblaze/target_flat.h
@@ -0,0 +1 @@
+#include "../generic/target_flat.h"
diff --git a/linux-user/sh4/target_flat.h b/linux-user/sh4/target_flat.h
new file mode 100644
index ..bc83224cea12
--- /dev/null
+++ b/linux-user/sh4/target_flat.h
@@ -0,0 +1 @@
+#include "../generic/target_flat.h"
--
2.39.1
pixman_blt on aarch64
Hello,

I'm trying to involve the pixman list in this thread on qemu-devel list started with subject "Display update issue on M1 Macs". See here:

https://lists.nongnu.org/archive/html/qemu-devel/2023-02/msg01033.html

We have found that on aarch64 Macs running macOS the pixman_blt and pixman_fill functions are disabled without fallback due to not being able to compile the needed assembly code. See detailed discussion below.

Is there a way to fix this in pixman in the near future or provide a fallback for this in pixman? Or do I need to add a fallback in QEMU or try using something else instead of pixman for these functions?

Thank you,
BALATON Zoltan

On Sat, 4 Feb 2023, Akihiko Odaki wrote:
On 2023/02/03 22:45, BALATON Zoltan wrote:
On Fri, 3 Feb 2023, Akihiko Odaki wrote:

I finally reproduced the issue with MorphOS and ati-vga and figured out its cause. The problem is that pixman_blt() is disabled because its backend is written in GNU assembly, and GNU assembler is not available on macOS. There is no fallback written in C, unfortunately. The issue is tracked by the upstream at:
https://gitlab.freedesktop.org/pixman/pixman/-/issues/59

Hm, OK but that ticket is just about compile error and suggests to disable it and does not say it won't work then. Are they aware this is a problem? Maybe we should write to their mailing list after we're sure what's happening.

That's a good idea. They may prioritize the issue if they realize that disables pixman_blt().

I hit the same problem on Asahi Linux, which is based on Arch Linux ARM. It is because Arch Linux copied PKGBUILD from x86 Arch Linux, which disables Arm backends. It is easy to enable the backend for the platform so I proposed a change at:
https://github.com/archlinuxarm/PKGBUILDs/pull/1985

On macOS one source of pixman most people use is brew.sh where this seems to be disabled:
https://github.com/Homebrew/homebrew-core/blob/master/Formula/pixman.rb
another source is macports which has an older version and no such options:
https://github.com/macports/macports-ports/blob/master/graphics/libpixman-devel/Portfile
I wonder if it compiles from macports on aarch64 then.

It's more likely that it is just outdated. It does not carry a patch to fix the issue.

I wait if I can get some more test results and try to check pixman but its source is not too clear to me and there are no docs either so maybe the best way is to ask on their list. If this is a pixman issue I hope it can be fixed there and we don't need to implement a fallback in QEMU.

This is certainly a pixman issue. If you read the source, you can see pixman_blt() calls _pixman_implementation_blt(). _pixman_implementation_blt() calls blt member of pixman_implementation_t in turn. Grepping for "blt =" tells it is only assigned in:
pixman/pixman-arm-neon.c
pixman/pixman-arm-simd.c
pixman/pixman-mips-dspr2.c
pixman/pixman-mmx.c
pixman/pixman-sse2.c

For AArch64, only pixman/pixman-arm-neon.c is relevant, and it needs to be disabled to build the library on macOS.

Regards,
Akihiko Odaki

Regards,
BALATON Zoltan
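The kind of caller-side fallback asked about above could look roughly like the following sketch. Everything in it is hypothetical: `blt32_fn`, `blt32_c` and `blt32` are illustrative names, not pixman or QEMU API, the code handles only 32 bpp buffers with strides counted in uint32_t units, and it assumes the rectangles lie inside both buffers and that source and destination do not overlap.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical signature of an accelerated 32 bpp blit backend. */
typedef bool (*blt32_fn)(const uint32_t *src, uint32_t *dst,
                         int src_stride, int dst_stride,
                         int src_x, int src_y, int dst_x, int dst_y,
                         int width, int height);

/* Plain C fallback: copy the rectangle one row at a time. */
static bool blt32_c(const uint32_t *src, uint32_t *dst,
                    int src_stride, int dst_stride,
                    int src_x, int src_y, int dst_x, int dst_y,
                    int width, int height)
{
    if (width <= 0 || height <= 0) {
        return false;
    }
    for (int y = 0; y < height; y++) {
        const uint32_t *s = src + (ptrdiff_t)(src_y + y) * src_stride + src_x;
        uint32_t *d = dst + (ptrdiff_t)(dst_y + y) * dst_stride + dst_x;

        memcpy(d, s, (size_t)width * sizeof(uint32_t));
    }
    return true;
}

/* Try the accelerated backend first; use the C path if it is absent
 * (NULL) or reports that it could not handle the request. */
static bool blt32(blt32_fn accel,
                  const uint32_t *src, uint32_t *dst,
                  int src_stride, int dst_stride,
                  int src_x, int src_y, int dst_x, int dst_y,
                  int width, int height)
{
    if (accel != NULL &&
        accel(src, dst, src_stride, dst_stride,
              src_x, src_y, dst_x, dst_y, width, height)) {
        return true;
    }
    return blt32_c(src, dst, src_stride, dst_stride,
                   src_x, src_y, dst_x, dst_y, width, height);
}
```

The design choice here is to treat a missing backend the same way as a backend that declines the request, so callers never see a blit silently dropped.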