Re: [Qemu-devel] [PATCH 4/9] target/ppc: Add kvmppc_hpt_needs_host_contiguous_pages() helper
On 06/18/2018 08:36 AM, David Gibson wrote:
> KVM HV has a restriction that for HPT mode guests, guest pages must be hpa
> contiguous as well as gpa contiguous. We have to account for that in
> various places. We determine whether we're subject to this restriction
> from the SMMU information exposed by KVM.
>
> Planned cleanups to the way we handle this will require knowing whether
> this restriction is in play in wider parts of the code. So, expose a
> helper function which returns it.
>
> This does mean some redundant calls to kvm_get_smmu_info(), but they'll go
> away again with future cleanups.
>
> Signed-off-by: David Gibson

Reviewed-by: Cédric Le Goater

but this patch is already committed it seems.

C.

> ---
>  target/ppc/kvm.c     | 17 +++--
>  target/ppc/kvm_ppc.h |  6 ++
>  2 files changed, 21 insertions(+), 2 deletions(-)
>
> diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
> index 5c0e313ca6..50b5d01432 100644
> --- a/target/ppc/kvm.c
> +++ b/target/ppc/kvm.c
> @@ -406,9 +406,22 @@ target_ulong kvmppc_configure_v3_mmu(PowerPCCPU *cpu,
>      }
>  }
>
> +bool kvmppc_hpt_needs_host_contiguous_pages(void)
> +{
> +    PowerPCCPU *cpu = POWERPC_CPU(first_cpu);
> +    static struct kvm_ppc_smmu_info smmu_info;
> +
> +    if (!kvm_enabled()) {
> +        return false;
> +    }
> +
> +    kvm_get_smmu_info(cpu, &smmu_info);
> +    return !!(smmu_info.flags & KVM_PPC_PAGE_SIZES_REAL);
> +}
> +
>  static bool kvm_valid_page_size(uint32_t flags, long rampgsize, uint32_t shift)
>  {
> -    if (!(flags & KVM_PPC_PAGE_SIZES_REAL)) {
> +    if (!kvmppc_hpt_needs_host_contiguous_pages()) {
>          return true;
>      }
>
> @@ -445,7 +458,7 @@ static void kvm_fixup_page_sizes(PowerPCCPU *cpu)
>      /* If we have HV KVM, we need to forbid CI large pages if our
>       * host page size is smaller than 64K.
>       */
> -    if (smmu_info.flags & KVM_PPC_PAGE_SIZES_REAL) {
> +    if (kvmppc_hpt_needs_host_contiguous_pages()) {
>          if (getpagesize() >= 0x10000) {
>              cpu->hash64_opts->flags |= PPC_HASH64_CI_LARGEPAGE;
>          } else {
> diff --git a/target/ppc/kvm_ppc.h b/target/ppc/kvm_ppc.h
> index e2840e1d33..a7ddb8a5d6 100644
> --- a/target/ppc/kvm_ppc.h
> +++ b/target/ppc/kvm_ppc.h
> @@ -70,6 +70,7 @@ int kvmppc_resize_hpt_prepare(PowerPCCPU *cpu, target_ulong flags, int shift);
>  int kvmppc_resize_hpt_commit(PowerPCCPU *cpu, target_ulong flags, int shift);
>  bool kvmppc_pvr_workaround_required(PowerPCCPU *cpu);
>
> +bool kvmppc_hpt_needs_host_contiguous_pages(void);
>  bool kvmppc_is_mem_backend_page_size_ok(const char *obj_path);
>
>  #else
> @@ -222,6 +223,11 @@ static inline uint64_t kvmppc_rma_size(uint64_t current_size,
>      return ram_size;
>  }
>
> +static inline bool kvmppc_hpt_needs_host_contiguous_pages(void)
> +{
> +    return false;
> +}
> +
>  static inline bool kvmppc_is_mem_backend_page_size_ok(const char *obj_path)
>  {
>      return true;
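For readers following the logic of the helper in this patch, here is a minimal standalone sketch of the same decision, not the QEMU code itself. All `sketch_`/`SKETCH_` names are hypothetical stand-ins, the KVM query is replaced by plain parameters, and the behaviour of the `else` branch (clearing the option) is an assumption, since the quoted diff does not show it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the KVM flag and the CPU option bit. */
#define SKETCH_PAGE_SIZES_REAL   (1u << 0)
#define SKETCH_CI_LARGEPAGE      (1u << 1)

/* True when guest pages must also be contiguous in host physical memory. */
static bool sketch_needs_host_contiguous_pages(bool kvm_on, uint32_t smmu_flags)
{
    if (!kvm_on) {
        return false;   /* the restriction only exists under KVM */
    }
    return !!(smmu_flags & SKETCH_PAGE_SIZES_REAL);
}

/*
 * Returns the updated option bits.  When the restriction applies, the
 * CI large page option is kept only if host pages are at least 64 KiB.
 * Clearing the bit otherwise is an assumption of this sketch.
 */
static uint32_t sketch_fixup_ci_largepage(uint32_t opts, bool kvm_on,
                                          uint32_t smmu_flags,
                                          long host_page_size)
{
    assert(host_page_size > 0);
    if (!sketch_needs_host_contiguous_pages(kvm_on, smmu_flags)) {
        return opts;    /* no restriction: leave the options untouched */
    }
    if (host_page_size >= 0x10000) {
        return opts | SKETCH_CI_LARGEPAGE;
    }
    return opts & ~SKETCH_CI_LARGEPAGE;
}
```

Passing the flags in as parameters keeps the sketch testable without a KVM host; the real helper instead queries KVM through the first CPU.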
Re: [Qemu-devel] [PATCH] hw/i386: Deprecate the machine types pc-0.10 and pc-0.11
Eduardo Habkost writes:

> On Mon, Jun 11, 2018 at 05:41:04AM +0200, Thomas Huth wrote:
>> The oldest machine type which is still used in a maintained distribution
>> is a pc-0.12 based machine type in RHEL6, so everything that is older
>> than pc-0.12 should not be used anymore. Thus let's deprecate pc-0.10
>> and pc-0.11 so that we can finally remove them in a future release.
>>
>> Signed-off-by: Thomas Huth
>> ---
>>  This is based on a patch that I already sent in 2017. But back then, we
>>  were still in progress of discussing our deprecation policies (e.g.
>>  automatic deprecation for old machine types), and there was no clear consensus
>>  whether we should deprecate 0.10 - 0.11, all 0.x or even up to version 1.2.
>>  After some iterations and too much discussion, I've forgotten about this
>>  patch. Anyway, I think we agreed that at least 0.10 and 0.11 can certainly
>>  be removed nowadays, so let's finally get at least those two machine types
>>  marked as deprecated! If that works fine and we will finally have removed
>>  these two types in v3.2, we can resume the discussion about newer machine
>>  types afterwards.
>
> Thanks!
>
>>
>>  hw/i386/pc_piix.c   | 2 ++
>>  include/hw/boards.h | 1 +
>>  qemu-doc.texi       | 5 +
>>  vl.c                | 9 +++--
>>  4 files changed, 15 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
>> index 3d81136..fa61dc3 100644
>> --- a/hw/i386/pc_piix.c
>> +++ b/hw/i386/pc_piix.c
>> @@ -955,6 +955,8 @@ static void pc_i440fx_0_11_machine_options(MachineClass *m)
>>  {
>>      pc_i440fx_0_12_machine_options(m);
>>      m->hw_version = "0.11";
>> +    m->deprecation_msg = "Old and unsupported machine version, "
>> +                         "use a newer machine type instead.";
>
> Sounds simple enough to me, but see comment about QMP below.
>
>>      SET_MACHINE_COMPAT(m, PC_COMPAT_0_11);
>>  }
>>
>> [...]
>> diff --git a/vl.c b/vl.c
>> index 0603171..096814c 100644
>> --- a/vl.c
>> +++ b/vl.c
>> @@ -2560,8 +2560,9 @@ static gint machine_class_cmp(gconstpointer a, gconstpointer b)
>>          if (mc->alias) {
>>              printf("%-20s %s (alias of %s)\n", mc->alias, mc->desc,
>>                     mc->name);
>>          }
>> -        printf("%-20s %s%s\n", mc->name, mc->desc,
>> -               mc->is_default ? " (default)" : "");
>> +        printf("%-20s %s%s%s\n", mc->name, mc->desc,
>> +               mc->is_default ? " (default)" : "",
>> +               mc->deprecation_msg ? " (deprecated)" : "");
>>      }
>>  }
>>
>> @@ -3952,6 +3953,10 @@ int main(int argc, char **argv, char **envp)
>>      }
>>
>>      machine_class = select_machine();
>> +    if (machine_class->deprecation_msg) {
>> +        error_report("Machine type '%s' is deprecated: %s",
>> +                     machine_class->name, machine_class->deprecation_msg);
>> +    }
>
> Do you plan to add this info to 'query-machines' QMP command?
>
> If we do that, maybe we should represent the common "this machine
> type is too old, but there's a new version" case in a more
> machine-friendly way? Maybe a 'deprecation_reason' enum would be
> better than a 'deprecation_msg' field?

QMP needs a generic mechanism to communicate "FOO is deprecated, use BAR
instead".

> (Note that I don't think any discussions about the QMP interface
> should block this patch from being merged. We can deprecate the
> machines first, and decide about QMP later.)

Yes.

>>
>>      set_memory_options(&ram_slots, &maxram_size, machine_class);
>>
>> --
>> 1.8.3.1
>>
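To make the listing change in this patch concrete, here is a small standalone sketch of how a NULL-or-string deprecation message can drive the " (deprecated)" suffix. The `SketchMachine` type and `sketch_format_machine()` are hypothetical names for illustration only; they mirror the `printf()` format from the quoted diff but are not QEMU's `MachineClass` API.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical, trimmed-down stand-in for a machine class. */
typedef struct SketchMachine {
    const char *name;
    const char *desc;
    int is_default;
    const char *deprecation_msg;   /* NULL when the machine is not deprecated */
} SketchMachine;

/*
 * Format one machine listing line into buf, using the same format string
 * as the patch.  Returns the snprintf() result (length of the full line).
 */
static int sketch_format_machine(char *buf, size_t len, const SketchMachine *mc)
{
    assert(buf != NULL && mc != NULL);
    return snprintf(buf, len, "%-20s %s%s%s", mc->name, mc->desc,
                    mc->is_default ? " (default)" : "",
                    mc->deprecation_msg ? " (deprecated)" : "");
}
```

A free-form message like this is convenient for humans; the enum suggested in the review would be easier for management tools to act on, at the cost of a fixed set of reasons.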
Re: [Qemu-devel] [PATCH] nbd/client: add x-block-status hack for testing server
Hi,

This series failed docker-mingw@fedora build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

Type: series
Message-id: 20180621032539.134944-1-ebl...@redhat.com
Subject: [Qemu-devel] [PATCH] nbd/client: add x-block-status hack for testing server

=== TEST SCRIPT BEGIN ===
#!/bin/bash
set -e
git submodule update --init dtc
# Let docker tests dump environment info
export SHOW_ENV=1
export J=8
time make docker-test-mingw@fedora
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
a8c78c2842 nbd/client: add x-block-status hack for testing server

=== OUTPUT BEGIN ===
Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc'
Cloning into '/var/tmp/patchew-tester-tmp-ecxi0nux/src/dtc'...
Submodule path 'dtc': checked out 'e54388015af1fb4bf04d0bca99caba1074d9cc42'
  BUILD   fedora
make[1]: Entering directory '/var/tmp/patchew-tester-tmp-ecxi0nux/src'
  GEN     /var/tmp/patchew-tester-tmp-ecxi0nux/src/docker-src.2018-06-21-01.51.06.25070/qemu.tar
Cloning into '/var/tmp/patchew-tester-tmp-ecxi0nux/src/docker-src.2018-06-21-01.51.06.25070/qemu.tar.vroot'...
done.
Your branch is up-to-date with 'origin/test'.
Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc'
Cloning into '/var/tmp/patchew-tester-tmp-ecxi0nux/src/docker-src.2018-06-21-01.51.06.25070/qemu.tar.vroot/dtc'...
Submodule path 'dtc': checked out 'e54388015af1fb4bf04d0bca99caba1074d9cc42'
Submodule 'ui/keycodemapdb' (git://git.qemu.org/keycodemapdb.git) registered for path 'ui/keycodemapdb'
Cloning into '/var/tmp/patchew-tester-tmp-ecxi0nux/src/docker-src.2018-06-21-01.51.06.25070/qemu.tar.vroot/ui/keycodemapdb'...
Submodule path 'ui/keycodemapdb': checked out '6b3d716e2b6472eb7189d3220552280ef3d832ce' COPYRUNNER RUN test-mingw in qemu:fedora Packages installed: SDL2-devel-2.0.8-5.fc28.x86_64 bc-1.07.1-5.fc28.x86_64 bison-3.0.4-9.fc28.x86_64 bluez-libs-devel-5.49-3.fc28.x86_64 brlapi-devel-0.6.7-12.fc28.x86_64 bzip2-1.0.6-26.fc28.x86_64 bzip2-devel-1.0.6-26.fc28.x86_64 ccache-3.4.2-2.fc28.x86_64 clang-6.0.0-5.fc28.x86_64 device-mapper-multipath-devel-0.7.4-2.git07e7bd5.fc28.x86_64 findutils-4.6.0-19.fc28.x86_64 flex-2.6.1-7.fc28.x86_64 gcc-8.1.1-1.fc28.x86_64 gcc-c++-8.1.1-1.fc28.x86_64 gettext-0.19.8.1-14.fc28.x86_64 git-2.17.1-2.fc28.x86_64 glib2-devel-2.56.1-3.fc28.x86_64 glusterfs-api-devel-4.0.2-1.fc28.x86_64 gnutls-devel-3.6.2-1.fc28.x86_64 gtk3-devel-3.22.30-1.fc28.x86_64 hostname-3.20-3.fc28.x86_64 libaio-devel-0.3.110-11.fc28.x86_64 libasan-8.1.1-1.fc28.x86_64 libattr-devel-2.4.47-23.fc28.x86_64 libcap-devel-2.25-9.fc28.x86_64 libcap-ng-devel-0.7.9-1.fc28.x86_64 libcurl-devel-7.59.0-3.fc28.x86_64 libfdt-devel-1.4.6-4.fc28.x86_64 libpng-devel-1.6.34-3.fc28.x86_64 librbd-devel-12.2.5-1.fc28.x86_64 libssh2-devel-1.8.0-7.fc28.x86_64 libubsan-8.1.1-1.fc28.x86_64 libusbx-devel-1.0.21-6.fc28.x86_64 libxml2-devel-2.9.7-4.fc28.x86_64 llvm-6.0.0-11.fc28.x86_64 lzo-devel-2.08-12.fc28.x86_64 make-4.2.1-6.fc28.x86_64 mingw32-SDL2-2.0.5-3.fc27.noarch mingw32-bzip2-1.0.6-9.fc27.noarch mingw32-curl-7.57.0-1.fc28.noarch mingw32-glib2-2.54.1-1.fc28.noarch mingw32-gmp-6.1.2-2.fc27.noarch mingw32-gnutls-3.5.13-2.fc27.noarch mingw32-gtk3-3.22.16-1.fc27.noarch mingw32-libjpeg-turbo-1.5.1-3.fc27.noarch mingw32-libpng-1.6.29-2.fc27.noarch mingw32-libssh2-1.8.0-3.fc27.noarch mingw32-libtasn1-4.13-1.fc28.noarch mingw32-nettle-3.3-3.fc27.noarch mingw32-pixman-0.34.0-3.fc27.noarch mingw32-pkg-config-0.28-9.fc27.x86_64 mingw64-SDL2-2.0.5-3.fc27.noarch mingw64-bzip2-1.0.6-9.fc27.noarch mingw64-curl-7.57.0-1.fc28.noarch mingw64-glib2-2.54.1-1.fc28.noarch mingw64-gmp-6.1.2-2.fc27.noarch 
mingw64-gnutls-3.5.13-2.fc27.noarch mingw64-gtk3-3.22.16-1.fc27.noarch mingw64-libjpeg-turbo-1.5.1-3.fc27.noarch mingw64-libpng-1.6.29-2.fc27.noarch mingw64-libssh2-1.8.0-3.fc27.noarch mingw64-libtasn1-4.13-1.fc28.noarch mingw64-nettle-3.3-3.fc27.noarch mingw64-pixman-0.34.0-3.fc27.noarch mingw64-pkg-config-0.28-9.fc27.x86_64 ncurses-devel-6.1-5.20180224.fc28.x86_64 nettle-devel-3.4-2.fc28.x86_64 nss-devel-3.36.1-1.1.fc28.x86_64 numactl-devel-2.0.11-8.fc28.x86_64 package PyYAML is not installed package libjpeg-devel is not installed perl-5.26.2-411.fc28.x86_64 pixman-devel-0.34.0-8.fc28.x86_64 python3-3.6.5-1.fc28.x86_64 snappy-devel-1.1.7-5.fc28.x86_64 sparse-0.5.2-1.fc28.x86_64 spice-server-devel-0.14.0-4.fc28.x86_64 systemtap-sdt-devel-3.2-11.fc28.x86_64 tar-1.30-3.fc28.x86_64 usbredir-devel-0.7.1-7.fc28.x86_64 virglrenderer-devel-0.6.0-4.20170210git76b3da97b.fc28.x86_64 vte3-devel-0.36.5-6.fc28.x86_64 which-2.21-8.fc28.x86_64 xen-devel-4.10.1-3.fc28.x86_64 zlib-devel-1.2.11-8.fc28.x86_64 Environment variables: TARGET_LIST= PACKAGES=ccache gettext git tar PyYAML sparse flex bison python3 bzip2 hostname gcc gcc-c++ llvm clang make perl which bc findutils glib2-devel
Re: [Qemu-devel] [PULL 0/7] bitmap export over NBD
Hi,

This series failed docker-mingw@fedora build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

Type: series
Message-id: 20180621031957.134718-1-ebl...@redhat.com
Subject: [Qemu-devel] [PULL 0/7] bitmap export over NBD

=== TEST SCRIPT BEGIN ===
#!/bin/bash
set -e
git submodule update --init dtc
# Let docker tests dump environment info
export SHOW_ENV=1
export J=8
time make docker-test-mingw@fedora
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
c22707306f docs/interop: add nbd.txt
4c0d1f2119 qapi: new qmp command nbd-server-add-bitmap
822d9ac324 nbd/server: implement dirty bitmap export
89d48246d1 nbd/server: add nbd_meta_empty_or_pattern helper
9666d89233 nbd/server: refactor NBDExportMetaContexts
f57941b5a4 nbd/server: fix trace
3683ca1d05 tests: Simplify .gitignore

=== OUTPUT BEGIN ===
Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc'
Cloning into '/var/tmp/patchew-tester-tmp-kndg27b2/src/dtc'...
Submodule path 'dtc': checked out 'e54388015af1fb4bf04d0bca99caba1074d9cc42'
  BUILD   fedora
make[1]: Entering directory '/var/tmp/patchew-tester-tmp-kndg27b2/src'
  GEN     /var/tmp/patchew-tester-tmp-kndg27b2/src/docker-src.2018-06-21-01.33.16.21343/qemu.tar
Cloning into '/var/tmp/patchew-tester-tmp-kndg27b2/src/docker-src.2018-06-21-01.33.16.21343/qemu.tar.vroot'...
done.
Your branch is up-to-date with 'origin/test'.
Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc'
Cloning into '/var/tmp/patchew-tester-tmp-kndg27b2/src/docker-src.2018-06-21-01.33.16.21343/qemu.tar.vroot/dtc'...
Submodule path 'dtc': checked out 'e54388015af1fb4bf04d0bca99caba1074d9cc42' Submodule 'ui/keycodemapdb' (git://git.qemu.org/keycodemapdb.git) registered for path 'ui/keycodemapdb' Cloning into '/var/tmp/patchew-tester-tmp-kndg27b2/src/docker-src.2018-06-21-01.33.16.21343/qemu.tar.vroot/ui/keycodemapdb'... Submodule path 'ui/keycodemapdb': checked out '6b3d716e2b6472eb7189d3220552280ef3d832ce' COPYRUNNER RUN test-mingw in qemu:fedora Packages installed: SDL2-devel-2.0.8-5.fc28.x86_64 bc-1.07.1-5.fc28.x86_64 bison-3.0.4-9.fc28.x86_64 bluez-libs-devel-5.49-3.fc28.x86_64 brlapi-devel-0.6.7-12.fc28.x86_64 bzip2-1.0.6-26.fc28.x86_64 bzip2-devel-1.0.6-26.fc28.x86_64 ccache-3.4.2-2.fc28.x86_64 clang-6.0.0-5.fc28.x86_64 device-mapper-multipath-devel-0.7.4-2.git07e7bd5.fc28.x86_64 findutils-4.6.0-19.fc28.x86_64 flex-2.6.1-7.fc28.x86_64 gcc-8.1.1-1.fc28.x86_64 gcc-c++-8.1.1-1.fc28.x86_64 gettext-0.19.8.1-14.fc28.x86_64 git-2.17.1-2.fc28.x86_64 glib2-devel-2.56.1-3.fc28.x86_64 glusterfs-api-devel-4.0.2-1.fc28.x86_64 gnutls-devel-3.6.2-1.fc28.x86_64 gtk3-devel-3.22.30-1.fc28.x86_64 hostname-3.20-3.fc28.x86_64 libaio-devel-0.3.110-11.fc28.x86_64 libasan-8.1.1-1.fc28.x86_64 libattr-devel-2.4.47-23.fc28.x86_64 libcap-devel-2.25-9.fc28.x86_64 libcap-ng-devel-0.7.9-1.fc28.x86_64 libcurl-devel-7.59.0-3.fc28.x86_64 libfdt-devel-1.4.6-4.fc28.x86_64 libpng-devel-1.6.34-3.fc28.x86_64 librbd-devel-12.2.5-1.fc28.x86_64 libssh2-devel-1.8.0-7.fc28.x86_64 libubsan-8.1.1-1.fc28.x86_64 libusbx-devel-1.0.21-6.fc28.x86_64 libxml2-devel-2.9.7-4.fc28.x86_64 llvm-6.0.0-11.fc28.x86_64 lzo-devel-2.08-12.fc28.x86_64 make-4.2.1-6.fc28.x86_64 mingw32-SDL2-2.0.5-3.fc27.noarch mingw32-bzip2-1.0.6-9.fc27.noarch mingw32-curl-7.57.0-1.fc28.noarch mingw32-glib2-2.54.1-1.fc28.noarch mingw32-gmp-6.1.2-2.fc27.noarch mingw32-gnutls-3.5.13-2.fc27.noarch mingw32-gtk3-3.22.16-1.fc27.noarch mingw32-libjpeg-turbo-1.5.1-3.fc27.noarch mingw32-libpng-1.6.29-2.fc27.noarch mingw32-libssh2-1.8.0-3.fc27.noarch 
mingw32-libtasn1-4.13-1.fc28.noarch mingw32-nettle-3.3-3.fc27.noarch mingw32-pixman-0.34.0-3.fc27.noarch mingw32-pkg-config-0.28-9.fc27.x86_64 mingw64-SDL2-2.0.5-3.fc27.noarch mingw64-bzip2-1.0.6-9.fc27.noarch mingw64-curl-7.57.0-1.fc28.noarch mingw64-glib2-2.54.1-1.fc28.noarch mingw64-gmp-6.1.2-2.fc27.noarch mingw64-gnutls-3.5.13-2.fc27.noarch mingw64-gtk3-3.22.16-1.fc27.noarch mingw64-libjpeg-turbo-1.5.1-3.fc27.noarch mingw64-libpng-1.6.29-2.fc27.noarch mingw64-libssh2-1.8.0-3.fc27.noarch mingw64-libtasn1-4.13-1.fc28.noarch mingw64-nettle-3.3-3.fc27.noarch mingw64-pixman-0.34.0-3.fc27.noarch mingw64-pkg-config-0.28-9.fc27.x86_64 ncurses-devel-6.1-5.20180224.fc28.x86_64 nettle-devel-3.4-2.fc28.x86_64 nss-devel-3.36.1-1.1.fc28.x86_64 numactl-devel-2.0.11-8.fc28.x86_64 package PyYAML is not installed package libjpeg-devel is not installed perl-5.26.2-411.fc28.x86_64 pixman-devel-0.34.0-8.fc28.x86_64 python3-3.6.5-1.fc28.x86_64 snappy-devel-1.1.7-5.fc28.x86_64 sparse-0.5.2-1.fc28.x86_64 spice-server-devel-0.14.0-4.fc28.x86_64 systemtap-sdt-devel-3.2-11.fc28.x86_64 tar-1.30-3.fc28.x86_64 usbredir-devel-0.7.1-7.fc28.x86_64 virglrenderer-devel-0.6.0-4.20170210git76b3da97b.fc28.x86_64 vte3-devel-0.36.5-6.fc28.x86_64 which-2.21-8.fc28.x86_64 xen-devel-4.10.1-3.fc28.x86_64
Re: [Qemu-devel] [PATCH 3/9] spapr: Add cpu_apply hook to capabilities
On 06/18/2018 08:36 AM, David Gibson wrote:
> spapr capabilities have an apply hook to actually activate (or deactivate)
> the feature in the system at reset time. However, a number of capabilities
> affect the setup of cpus, and need to be applied to each of them -
> including hotplugged cpus for extra complication. To make this simpler,
> add an optional cpu_apply hook that is called from spapr_cpu_reset().
>
> Signed-off-by: David Gibson

Reviewed-by: Cédric Le Goater

Thanks,

C.

> ---
>  hw/ppc/spapr_caps.c     | 19 +++
>  hw/ppc/spapr_cpu_core.c |  2 ++
>  include/hw/ppc/spapr.h  |  1 +
>  3 files changed, 22 insertions(+)
>
> diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
> index dabed817d1..68a4243efc 100644
> --- a/hw/ppc/spapr_caps.c
> +++ b/hw/ppc/spapr_caps.c
> @@ -59,6 +59,8 @@ typedef struct sPAPRCapabilityInfo {
>      sPAPRCapPossible *possible;
>      /* Make sure the virtual hardware can support this capability */
>      void (*apply)(sPAPRMachineState *spapr, uint8_t val, Error **errp);
> +    void (*cpu_apply)(sPAPRMachineState *spapr, PowerPCCPU *cpu,
> +                      uint8_t val, Error **errp);
>  } sPAPRCapabilityInfo;
>
>  static void spapr_cap_get_bool(Object *obj, Visitor *v, const char *name,
> @@ -472,6 +474,23 @@ void spapr_caps_apply(sPAPRMachineState *spapr)
>      }
>  }
>
> +void spapr_caps_cpu_apply(sPAPRMachineState *spapr, PowerPCCPU *cpu)
> +{
> +    int i;
> +
> +    for (i = 0; i < SPAPR_CAP_NUM; i++) {
> +        sPAPRCapabilityInfo *info = &capability_table[i];
> +
> +        /*
> +         * If the apply function can't set the desired level and thinks it's
> +         * fatal, it should cause that.
> +         */
> +        if (info->cpu_apply) {
> +            info->cpu_apply(spapr, cpu, spapr->eff.caps[i], &error_fatal);
> +        }
> +    }
> +}
> +
>  void spapr_caps_add_properties(sPAPRMachineClass *smc, Error **errp)
>  {
>      Error *local_err = NULL;
> diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
> index aef3be33a3..324623190d 100644
> --- a/hw/ppc/spapr_cpu_core.c
> +++ b/hw/ppc/spapr_cpu_core.c
> @@ -76,6 +76,8 @@ static void spapr_cpu_reset(void *opaque)
>      spapr_cpu->slb_shadow_size = 0;
>      spapr_cpu->dtl_addr = 0;
>      spapr_cpu->dtl_size = 0;
> +
> +    spapr_caps_cpu_apply(SPAPR_MACHINE(qdev_get_machine()), cpu);
>  }
>
>  void spapr_cpu_set_entry_state(PowerPCCPU *cpu, target_ulong nip, target_ulong r3)
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 9dbd6010f5..9dd46a72f6 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -800,6 +800,7 @@ static inline uint8_t spapr_get_cap(sPAPRMachineState *spapr, int cap)
>
>  void spapr_caps_init(sPAPRMachineState *spapr);
>  void spapr_caps_apply(sPAPRMachineState *spapr);
> +void spapr_caps_cpu_apply(sPAPRMachineState *spapr, PowerPCCPU *cpu);
>  void spapr_caps_add_properties(sPAPRMachineClass *smc, Error **errp);
>  int spapr_caps_post_migration(sPAPRMachineState *spapr);
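The optional per-CPU hook described in this patch can be illustrated with a small standalone sketch. Everything here (`SketchCpu`, `SketchCapInfo`, `sketch_table`, `sketch_caps_cpu_apply()`) is hypothetical and heavily simplified; in particular, error reporting via `Error **` is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CAP_NUM 3

/* Hypothetical CPU state; the counters exist only to observe the hook. */
typedef struct SketchCpu {
    int applied;
    uint8_t last_val;
} SketchCpu;

typedef struct SketchCapInfo {
    const char *name;
    /* Optional: NULL when the capability needs no per-CPU setup. */
    void (*cpu_apply)(SketchCpu *cpu, uint8_t val);
} SketchCapInfo;

static void sketch_record(SketchCpu *cpu, uint8_t val)
{
    cpu->applied++;
    cpu->last_val = val;
}

static const SketchCapInfo sketch_table[SKETCH_CAP_NUM] = {
    { "cap-a", NULL },
    { "cap-b", sketch_record },
    { "cap-c", NULL },
};

/*
 * Meant to be called from each CPU's reset path, so that CPUs created
 * later (e.g. hotplugged ones) get the same treatment as boot CPUs.
 */
static void sketch_caps_cpu_apply(SketchCpu *cpu,
                                  const uint8_t eff[SKETCH_CAP_NUM])
{
    assert(cpu != NULL && eff != NULL);
    for (size_t i = 0; i < SKETCH_CAP_NUM; i++) {
        const SketchCapInfo *info = &sketch_table[i];

        if (info->cpu_apply) {
            info->cpu_apply(cpu, eff[i]);
        }
    }
}
```

Keeping the hook optional means capabilities with no per-CPU state need no changes; only table entries that set `cpu_apply` are visited.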
Re: [Qemu-devel] [PATCH 2/9] spapr: Compute effective capability values earlier
On 06/18/2018 08:35 AM, David Gibson wrote:
> Previously, the effective values of the various spapr capability flags
> were only determined at machine reset time. That was a lazy way of making
> sure it was after cpu initialization so it could use the cpu object to
> inform the defaults.
>
> But we've now improved the compat checking code so that we don't need to
> instantiate the cpus to use it. That lets us move the resolution of the
> capability defaults much earlier.
>
> This is going to be necessary for some future capabilities.
>
> Signed-off-by: David Gibson

Reviewed-by: Cédric Le Goater

Thanks,

C.

> ---
>  hw/ppc/spapr.c         | 6 --
>  hw/ppc/spapr_caps.c    | 9 ++---
>  include/hw/ppc/spapr.h | 3 ++-
>  3 files changed, 12 insertions(+), 6 deletions(-)
>
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index b0b94fc1f0..40858d047c 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1612,7 +1612,7 @@ static void spapr_machine_reset(void)
>      void *fdt;
>      int rc;
>
> -    spapr_caps_reset(spapr);
> +    spapr_caps_apply(spapr);
>
>      first_ppc_cpu = POWERPC_CPU(first_cpu);
>      if (kvm_enabled() && kvmppc_has_cap_mmu_radix() &&
> @@ -2526,7 +2526,9 @@ static void spapr_machine_init(MachineState *machine)
>      QLIST_INIT(&spapr->phbs);
>      QTAILQ_INIT(&spapr->pending_dimm_unplugs);
>
> -    /* Check HPT resizing availability */
> +    /* Determine capabilities to run with */
> +    spapr_caps_init(spapr);
> +
>      kvmppc_check_papr_resize_hpt(&resize_hpt_err);
>      if (spapr->resize_hpt == SPAPR_RESIZE_HPT_DEFAULT) {
>          /*
> diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
> index 469f38f0ef..dabed817d1 100644
> --- a/hw/ppc/spapr_caps.c
> +++ b/hw/ppc/spapr_caps.c
> @@ -439,12 +439,12 @@ SPAPR_CAP_MIG_STATE(cfpc, SPAPR_CAP_CFPC);
>  SPAPR_CAP_MIG_STATE(sbbc, SPAPR_CAP_SBBC);
>  SPAPR_CAP_MIG_STATE(ibs, SPAPR_CAP_IBS);
>
> -void spapr_caps_reset(sPAPRMachineState *spapr)
> +void spapr_caps_init(sPAPRMachineState *spapr)
>  {
>      sPAPRCapabilities default_caps;
>      int i;
>
> -    /* First compute the actual set of caps we're running with.. */
> +    /* Compute the actual set of caps we should run with */
>      default_caps = default_caps_with_cpu(spapr, MACHINE(spapr)->cpu_type);
>
>      for (i = 0; i < SPAPR_CAP_NUM; i++) {
> @@ -455,8 +455,11 @@ void spapr_caps_reset(sPAPRMachineState *spapr)
>              spapr->eff.caps[i] = default_caps.caps[i];
>          }
>      }
> +}
>
> -    /* .. then apply those caps to the virtual hardware */
> +void spapr_caps_apply(sPAPRMachineState *spapr)
> +{
> +    int i;
>
>      for (i = 0; i < SPAPR_CAP_NUM; i++) {
>          sPAPRCapabilityInfo *info = &capability_table[i];
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 3388750fc7..9dbd6010f5 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -798,7 +798,8 @@ static inline uint8_t spapr_get_cap(sPAPRMachineState *spapr, int cap)
>      return spapr->eff.caps[cap];
>  }
>
> -void spapr_caps_reset(sPAPRMachineState *spapr);
> +void spapr_caps_init(sPAPRMachineState *spapr);
> +void spapr_caps_apply(sPAPRMachineState *spapr);
>  void spapr_caps_add_properties(sPAPRMachineClass *smc, Error **errp);
>  int spapr_caps_post_migration(sPAPRMachineState *spapr);
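The split this patch makes, resolving effective values once at machine init and merely applying them at reset, can be sketched in isolation as follows. The `SketchMachine` layout and both function names are hypothetical, and the rule "a user-supplied value overrides the default" is an assumption about the part of the loop that the quoted hunk elides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CAP_NUM 2

typedef struct SketchMachine {
    bool    user_set[SKETCH_CAP_NUM];  /* was the cap given explicitly? */
    uint8_t user_val[SKETCH_CAP_NUM];
    uint8_t eff[SKETCH_CAP_NUM];       /* resolved (effective) values */
    uint8_t hw[SKETCH_CAP_NUM];        /* what the modelled hardware runs with */
} SketchMachine;

/* Machine init: resolve effective values; no CPU object is needed. */
static void sketch_caps_init(SketchMachine *m,
                             const uint8_t defaults[SKETCH_CAP_NUM])
{
    assert(m != NULL && defaults != NULL);
    for (size_t i = 0; i < SKETCH_CAP_NUM; i++) {
        m->eff[i] = m->user_set[i] ? m->user_val[i] : defaults[i];
    }
}

/* Machine reset: push the already-resolved values to the hardware model. */
static void sketch_caps_apply(SketchMachine *m)
{
    assert(m != NULL);
    for (size_t i = 0; i < SKETCH_CAP_NUM; i++) {
        m->hw[i] = m->eff[i];
    }
}
```

With this split, code that runs between init and the first reset can already read the effective values, which is what the commit message says later capabilities will need.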
Re: [Qemu-devel] [PATCH 1/9] target/ppc: Allow cpu compatiblity checks based on type, not instance
On 06/18/2018 08:35 AM, David Gibson wrote:
> ppc_check_compat() is used in a number of places to check if a cpu object
> supports a certain compatiblity mode, subject to various constraints.
>
> It takes a PowerPCCPU *, however it really only depends on the cpu's class.
> We have upcoming cases where it would be useful to make compatibility
> checks before we fully instantiate the cpu objects.
>
> ppc_type_check_compat() will now make an equivalent check, but based on a
> CPU's QOM typename instead of an instantiated CPU object.
>
> We make use of the new interface in several places in spapr, where we're
> essentially making a global check, rather than one specific to a particular
> cpu. This avoids some ugly uses of first_cpu to grab a "representative"
> instance.
>
> Signed-off-by: David Gibson

Reviewed-by: Cédric Le Goater

Looks good to me,

Thanks,

C.

> ---
>  hw/ppc/spapr.c      | 10 --
>  hw/ppc/spapr_caps.c | 19 +--
>  target/ppc/compat.c | 27 +--
>  target/ppc/cpu.h    |  4
>  4 files changed, 38 insertions(+), 22 deletions(-)
>
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index db0fb385d4..b0b94fc1f0 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1616,8 +1616,8 @@ static void spapr_machine_reset(void)
>
>      first_ppc_cpu = POWERPC_CPU(first_cpu);
>      if (kvm_enabled() && kvmppc_has_cap_mmu_radix() &&
> -        ppc_check_compat(first_ppc_cpu, CPU_POWERPC_LOGICAL_3_00, 0,
> -                         spapr->max_compat_pvr)) {
> +        ppc_type_check_compat(machine->cpu_type, CPU_POWERPC_LOGICAL_3_00, 0,
> +                              spapr->max_compat_pvr)) {
>          /* If using KVM with radix mode available, VCPUs can be started
>           * without a HPT because KVM will start them in radix mode.
>           * Set the GR bit in PATB so that we know there is no HPT. */
> @@ -2520,7 +2520,6 @@ static void spapr_machine_init(MachineState *machine)
>      long load_limit, fw_size;
>      char *filename;
>      Error *resize_hpt_err = NULL;
> -    PowerPCCPU *first_ppc_cpu;
>
>      msi_nonbroken = true;
>
> @@ -2618,10 +2617,9 @@ static void spapr_machine_init(MachineState *machine)
>      /* init CPUs */
>      spapr_init_cpus(spapr);
>
> -    first_ppc_cpu = POWERPC_CPU(first_cpu);
>      if ((!kvm_enabled() || kvmppc_has_cap_mmu_radix()) &&
> -        ppc_check_compat(first_ppc_cpu, CPU_POWERPC_LOGICAL_3_00, 0,
> -                         spapr->max_compat_pvr)) {
> +        ppc_type_check_compat(machine->cpu_type, CPU_POWERPC_LOGICAL_3_00, 0,
> +                              spapr->max_compat_pvr)) {
>          /* KVM and TCG always allow GTSE with radix... */
>          spapr_ovec_set(spapr->ov5, OV5_MMU_RADIX_GTSE);
>      }
> diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
> index 00e43a9ba7..469f38f0ef 100644
> --- a/hw/ppc/spapr_caps.c
> +++ b/hw/ppc/spapr_caps.c
> @@ -327,27 +327,26 @@ sPAPRCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
>  };
>
>  static sPAPRCapabilities default_caps_with_cpu(sPAPRMachineState *spapr,
> -                                               CPUState *cs)
> +                                               const char *cputype)
>  {
>      sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
> -    PowerPCCPU *cpu = POWERPC_CPU(cs);
>      sPAPRCapabilities caps;
>
>      caps = smc->default_caps;
>
> -    if (!ppc_check_compat(cpu, CPU_POWERPC_LOGICAL_2_07,
> -                          0, spapr->max_compat_pvr)) {
> +    if (!ppc_type_check_compat(cputype, CPU_POWERPC_LOGICAL_2_07,
> +                               0, spapr->max_compat_pvr)) {
>          caps.caps[SPAPR_CAP_HTM] = SPAPR_CAP_OFF;
>          caps.caps[SPAPR_CAP_CFPC] = SPAPR_CAP_BROKEN;
>      }
>
> -    if (!ppc_check_compat(cpu, CPU_POWERPC_LOGICAL_2_06_PLUS,
> -                          0, spapr->max_compat_pvr)) {
> +    if (!ppc_type_check_compat(cputype, CPU_POWERPC_LOGICAL_2_06_PLUS,
> +                               0, spapr->max_compat_pvr)) {
>          caps.caps[SPAPR_CAP_SBBC] = SPAPR_CAP_BROKEN;
>      }
>
> -    if (!ppc_check_compat(cpu, CPU_POWERPC_LOGICAL_2_06,
> -                          0, spapr->max_compat_pvr)) {
> +    if (!ppc_type_check_compat(cputype, CPU_POWERPC_LOGICAL_2_06,
> +                               0, spapr->max_compat_pvr)) {
>          caps.caps[SPAPR_CAP_VSX] = SPAPR_CAP_OFF;
>          caps.caps[SPAPR_CAP_DFP] = SPAPR_CAP_OFF;
>          caps.caps[SPAPR_CAP_IBS] = SPAPR_CAP_BROKEN;
> @@ -384,7 +383,7 @@ int spapr_caps_post_migration(sPAPRMachineState *spapr)
>      sPAPRCapabilities dstcaps = spapr->eff;
>      sPAPRCapabilities srccaps;
>
> -    srccaps = default_caps_with_cpu(spapr, first_cpu);
> +    srccaps = default_caps_with_cpu(spapr, MACHINE(spapr)->cpu_type);
>      for (i = 0; i < SPAPR_CAP_NUM; i++) {
>          /* If not default value then assume came in with the migration */
>          if (spapr->mig.caps[i] !=
Re: [Qemu-devel] [PATCH v5 00/35] target/arm SVE patches
Hi, This series seems to have some coding style problems. See output below for more information: Type: series Message-id: 20180621015359.12018-1-richard.hender...@linaro.org Subject: [Qemu-devel] [PATCH v5 00/35] target/arm SVE patches === TEST SCRIPT BEGIN === #!/bin/bash BASE=base n=1 total=$(git log --oneline $BASE.. | wc -l) failed=0 git config --local diff.renamelimit 0 git config --local diff.renames True git config --local diff.algorithm histogram commits="$(git log --format=%H --reverse $BASE..)" for c in $commits; do echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..." if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then failed=1 echo fi n=$((n+1)) done exit $failed === TEST SCRIPT END === Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384 From https://github.com/patchew-project/qemu * [new tag] patchew/20180621015359.12018-1-richard.hender...@linaro.org -> patchew/20180621015359.12018-1-richard.hender...@linaro.org Switched to a new branch 'test' 20da78ae3f target/arm: Implement ARMv8.2-DotProd 0cb7019b8e target/arm: Enable SVE for aarch64-linux-user 3994a5311f target/arm: Implement SVE dot product (indexed) 68544c457f target/arm: Implement SVE dot product (vectors) d4de43a8d5 target/arm: Implement SVE fp complex multiply add (indexed) 2c2c7d3891 target/arm: Pass index to AdvSIMD FCMLA (indexed) a0e69d3f69 target/arm: Implement SVE fp complex multiply add 5f64a686f3 target/arm: Implement SVE floating-point complex add dad639bc3d target/arm: Implement SVE MOVPRFX 52b01c3388 target/arm: Implement SVE floating-point unary operations a19d3bf926 target/arm: Implement SVE floating-point round to integral value 3f24de5407 target/arm: Implement SVE floating-point convert to integer 05a1fefa84 target/arm: Implement SVE floating-point convert precision 7162d18555 target/arm: Implement SVE floating-point trig multiply-add coefficient a0db6af977 target/arm: Implement SVE FP Compare with Zero Group 473295b570 target/arm: Implement 
SVE Floating Point Unary Operations - Unpredicated Group e4c7547715 target/arm: Implement SVE FP Fast Reduction Group 5c8c7bc112 target/arm: Implement SVE Floating Point Multiply Indexed Group e25c13b2f2 target/arm: Implement SVE floating-point arithmetic with immediate bd5f8b5b58 target/arm: Implement SVE floating-point compare vectors 0f3b43fd17 target/arm: Implement SVE scatter store vector immediate 33a026c41d target/arm: Implement SVE first-fault gather loads 91830cd0ff target/arm: Implement SVE gather loads bd1b3f8926 target/arm: Implement SVE prefetches e56bc2e4b2 target/arm: Implement SVE scatter stores 0a1f63356b target/arm: Implement SVE store vector/predicate register c8768a98d4 target/arm: Implement SVE load and broadcast element 2248737380 target/arm: Implement SVE Floating Point Accumulating Reduction Group eb09ba5657 target/arm: Implement SVE FP Multiply-Add Group 4acac93526 target/arm: Implement SVE floating-point arithmetic (predicated) 690d0c157f target/arm: Implement SVE integer convert to floating-point cd9ddaedd0 target/arm: Implement SVE load and broadcast quadword 96f8560153 target/arm: Implement SVE Memory Contiguous Store Group 786b7911df target/arm: Implement SVE Contiguous Load, first-fault and no-fault 3a8acf3218 target/arm: Implement SVE Memory Contiguous Load Group === OUTPUT BEGIN === Checking PATCH 1/35: target/arm: Implement SVE Memory Contiguous Load Group... 
ERROR: space prohibited before that close parenthesis ')' #241: FILE: target/arm/sve_helper.c:2931: +DO_LD1(sve_ld1bdu_r, cpu_ldub_data_ra, uint64_t, uint8_t, ) ERROR: space prohibited before that close parenthesis ')' #242: FILE: target/arm/sve_helper.c:2932: +DO_LD1(sve_ld1bds_r, cpu_ldsb_data_ra, uint64_t, int8_t, ) ERROR: space prohibited before that close parenthesis ')' #246: FILE: target/arm/sve_helper.c:2936: +DO_LD1(sve_ld1hdu_r, cpu_lduw_data_ra, uint64_t, uint16_t, ) ERROR: space prohibited before that close parenthesis ')' #247: FILE: target/arm/sve_helper.c:2937: +DO_LD1(sve_ld1hds_r, cpu_ldsw_data_ra, uint64_t, int16_t, ) ERROR: space prohibited before that close parenthesis ')' #249: FILE: target/arm/sve_helper.c:2939: +DO_LD1(sve_ld1sdu_r, cpu_ldl_data_ra, uint64_t, uint32_t, ) ERROR: space prohibited before that close parenthesis ')' #250: FILE: target/arm/sve_helper.c:2940: +DO_LD1(sve_ld1sds_r, cpu_ldl_data_ra, uint64_t, int32_t, ) ERROR: space prohibited before that close parenthesis ')' #267: FILE: target/arm/sve_helper.c:2957: +DO_LD1(sve_ld1dd_r, cpu_ldq_data_ra, uint64_t, uint64_t, ) ERROR: space prohibited before that close parenthesis ')' #268: FILE: target/arm/sve_helper.c:2958: +DO_LD2(sve_ld2dd_r, cpu_ldq_data_ra, uint64_t, uint64_t, ) ERROR: space prohibited before that close parenthesis ')' #269: FILE: target/arm/sve_helper.c:2959: +DO_LD3(sve_ld3dd_r, cpu_ldq_data_ra, uint64_t, uint64_t, ) ERROR: space prohibited before that close parenthesis ')' #270: FILE:
[Qemu-devel] Fwd: port forward not work when using with macvtap
Hi all,

I created a VM with a macvtap network and a NAT (user-mode) network.

```
netdev user,id=fl.1,hostfwd=tcp::-:22 \
-device e1000,netdev=fl.1 \
-net nic,model=virtio,macaddr=$(< /sys/class/net/macvtap0/address) \
-net tap,fd=3 3<>/dev/tap$(< /sys/class/net/macvtap0/ifindex)
```

I created two network cards because the macvtap IP is obtained from DHCP, and I would have to log in through VNC to find it. So I added a NAT card to use port forwarding for logging in over SSH.

The macvtap interface works well with a DHCP IP. However, the port forwarding on the NAT card does not work. If I remove the macvtap options from the qemu command, the port forwarding works well.

On the host, the port is listening:

tcp0 0 0.0.0.0:0.0.0.0:* LISTEN 13218/qemu-system-x

But I cannot log in over SSH using ` ssh -p 192.168.17.61`; it hangs. `ssh -p 127.0.0.1 ` succeeds.

Does anyone know how to fix this issue? Many thanks.

Thanks,

--
The SmartX email address is only for business purpose. Any sent message that is not related to the business is not authorized or permitted by SmartX.
Re: [Qemu-devel] [Qemu-block] [PATCH] [RFC v2] aio: properly bubble up errors from initialization
On 20.06.2018 [12:34:52 -0700], Nishanth Aravamudan wrote: > On 20.06.2018 [11:57:42 +0200], Kevin Wolf wrote: > > Am 20.06.2018 um 00:54 hat Nishanth Aravamudan geschrieben: > > > On 19.06.2018 [15:35:57 -0700], Nishanth Aravamudan wrote: > > > > On 19.06.2018 [13:14:51 -0700], Nishanth Aravamudan wrote: > > > > > On 19.06.2018 [14:35:33 -0500], Eric Blake wrote: > > > > > > On 06/15/2018 12:47 PM, Nishanth Aravamudan via Qemu-devel wrote: > > > > > > > > > > > > > > > > > > > } else if (s->use_linux_aio) { > > > > > > > +int rc; > > > > > > > +rc = aio_setup_linux_aio(bdrv_get_aio_context(bs)); > > > > > > > +if (rc != 0) { > > > > > > > +error_report("Unable to use native AIO, falling > > > > > > > back to " > > > > > > > + "thread pool."); > > > > > > > > > > > > In general, error_report() should not output a trailing '.'. > > > > > > > > > > Will fix. > > > > > > > > > > > > +s->use_linux_aio = 0; > > > > > > > +return rc; > > > > > > > > > > > > Wait - the message claims we are falling back, but the non-zero > > > > > > return code > > > > > > sounds like we are returning an error instead of falling back. (My > > > > > > preference - if the user requested something and we can't do it, > > > > > > it's better > > > > > > to error than to fall back to something that does not match the > > > > > > user's > > > > > > request). > > > > > > > > > > I think that makes sense, I hadn't tested this specific case (in my > > > > > reading of the code, it wasn't clear to me if raw_co_prw() could be > > > > > called before raw_aio_plug() had been called, but I think returning > > > > > the > > > > > error code up should be handled correctly. What about the cases where > > > > > there is no error handling (the other two changes in the patch)? > > > > > > > > While looking at doing these changes, I realized that I'm not quite sure > > > > what the right approach is here. 
My original rationale for returning > > > > non-zero was that AIO was requested but could not be completed. I > > > > haven't fully tracked back the calling paths, but I assumed it would get > > > > retried at the top level, and since we indicated to not use AIO on > > > > subsequent calls, it will succeed and use threads then (note, that I do > > > > now realize this means a mismatch between the qemu command-line and the > > > > in-use AIO model). > > > > > > > > In practice, with my v2 patch, where I do return a non-zero error-code > > > > from this function, qemu does not exit (nor is any logging other than > > > > that I added emitted on the monitor). If I do not fall back, I imagine we > > > > would just continuously see this error message and IO might not actually > > > > ever occur? Reworking all of the callpath to fail on non-zero returns > > > > from raw_co_prw() seems like a fair bit of work, but if that is what is > > > > being requested, I can try that (it will just take a while). > > > > Alternatively, I can produce a v3 quickly that does not bubble the > > > > actual errno all the way up (since it does seem like it is ignored > > > > anyways?). > > > > > > Sorry for the noise, but I had one more thought. Would it be appropriate > > > to push the _setup() call up to when we parse the arguments about > > > aio=native? E.g., we already check there if cache=directsync is > > > specified and error out if not. > > > > We already do this: > > Right, I stated above it already is done, I simply meant adding a second > check here that we can obtain and setup the AIO context successfully. > > > /* Currently Linux does AIO only for files opened with O_DIRECT */ > > if (s->use_linux_aio && !(s->open_flags & O_DIRECT)) { > > error_setg(errp, "aio=native was specified, but it requires " > > "cache.direct=on, which was not specified."); > > ret = -EINVAL; > > goto fail; > > } > > > > laio_init() is about other types of errors.
But anyway, yes, calling > > laio_init() already in .bdrv_open() is possible. Returning errors from > > .bdrv_open() is nice and easy and we should do it. > > Ack. > > > However, we may also need to call laio_init() again when switching to a > > different I/O thread after the image is already opened. This is what I > > meant when I commented on v1 that you should do this in the > > .bdrv_attach_aio_context callback. The problem here is that we can't > > return an error there and the guest is already using the image. In this > > case, logging an error and falling back to the thread pool seems to be > > the best option we have. > > Is this a request for new functionality? Just trying to understand, > because aiui, block/file-posix.c does not implement the > bdrv_attach_aio_context callback currently. Instead, aio_get_linux_aio() > is called from three places, raw_co_prw, raw_aio_plug and > raw_aio_unplug, which calls into
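The policy discussed in this thread (report the error while the image is being opened, but fall back to the thread pool when the AIO context changes under a running guest) could be sketched roughly as below. This is only an illustration under assumptions: `struct image_state`, `setup_native_aio()`, `image_open()` and `image_attach_context()` are hypothetical names, not QEMU functions, and the `kernel_has_room` flag merely simulates whether native AIO setup succeeds.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical per-image state; only the flag discussed in the thread. */
struct image_state {
    bool use_linux_aio;
};

/* Stand-in for native AIO setup: 0 on success, negative errno on failure. */
int setup_native_aio(bool kernel_has_room)
{
    return kernel_has_room ? 0 : -EAGAIN;
}

/* Open time: the caller can still be refused, so report the error. */
int image_open(struct image_state *s, bool kernel_has_room)
{
    assert(s != NULL);
    if (s->use_linux_aio) {
        int rc = setup_native_aio(kernel_has_room);
        if (rc < 0) {
            return rc; /* the user asked for aio=native and cannot get it */
        }
    }
    return 0;
}

/* Context switch with a running guest: no error path, so fall back. */
void image_attach_context(struct image_state *s, bool kernel_has_room)
{
    assert(s != NULL);
    if (s->use_linux_aio && setup_native_aio(kernel_has_room) < 0) {
        fprintf(stderr,
                "Unable to use native AIO, falling back to thread pool\n");
        s->use_linux_aio = false;
    }
}
```

In this sketch the open path leaves `use_linux_aio` untouched on failure, so the mismatch between the command line and the AIO model actually in use can only arise on the attach path, where it is logged.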
[Qemu-devel] [PATCH] nbd/client: add x-block-status hack for testing server
In order to test that the NBD server is properly advertising dirty bitmaps, we need a bare minimum client that can request and read the context. This patch is a hack (hence the use of the x- prefix) that serves two purposes: first, it lets the client pass a request of more than one context at a time to the server, to test the reaction of the server to various contexts (via the list command). Second, whatever the first context in the user's list becomes the context wired up to the results visible in bdrv_block_status(); this has the result that if you pass in 'qemu:dirty-bitmap:b' instead of the usual 'base:allocation', and the server is currently serving a named bitmap 'b', then commands like 'qemu-img map' now output status corresponding to the dirty bitmap (dirty sections look like holes, while clean sections look like data, based on how the status bits are mapped over the NBD protocol). Since the hack corrupts the meaning of bdrv_block_status(), I would NOT try to run 'qemu-img convert' or any other program that might misbehave based on thinking clusters have a different status than what the normal 'base:allocation' would provide. The hack uses a semicolon-separated list embedded in a single string, as that was easier to wire into the nbd block driver than figuring out the right incantation of flattened QDict to represent an array via the command line. Oh well, just one more reason that this hack deserves the 'x-' prefix. 
As a demo, I was able to prove things work with the following sequence: $ qemu-img info file image: file file format: qcow2 virtual size: 2.0M (2097152 bytes) disk size: 2.0M cluster_size: 65536 Format specific information: compat: 1.1 lazy refcounts: false refcount bits: 16 corrupt: false $ ./x86_64-softmmu/qemu-system-x86_64 -nodefaults -nographic -qmp stdio {"QMP": {"version": {"qemu": {"micro": 50, "minor": 12, "major": 2}, "package": "v2.12.0-1531-g3ab98aa673d"}, "capabilities": []}} {'execute':'qmp_capabilities'} {"return": {}} {'execute':'blockdev-add','arguments':{'driver':'qcow2','node-name':'n','file':{'driver':'file','filename':'file'}}} {"return": {}} {'execute':'block-dirty-bitmap-add','arguments':{'node':'n','name':'b','persistent':true}} {"return": {}} {'execute':'quit'} {"return": {}} {"timestamp": {"seconds": 1529548814, "microseconds": 472828}, "event": "SHUTDOWN", "data": {"guest": false}} $ ./qemu-io -f qcow2 file qemu-io> r -v 0 1 : 01 . read 1/1 bytes at offset 0 1 bytes, 1 ops; 0.0001 sec (4.957 KiB/sec and 5076.1421 ops/sec) qemu-io> w -P 1 0 1 wrote 1/1 bytes at offset 0 1 bytes, 1 ops; 0.0078 sec (127.502231 bytes/sec and 127.5022 ops/sec) qemu-io> q $ ./x86_64-softmmu/qemu-system-x86_64 -nodefaults -nographic -qmp stdio {"QMP": {"version": {"qemu": {"micro": 50, "minor": 12, "major": 2}, "package": "v2.12.0-1531-g3ab98aa673d"}, "capabilities": []}} {'execute':'qmp_capabilities'} {"return": {}} {'execute':'nbd-server-start','arguments':{'addr':{'type':'inet','data':{'host':'localhost','port':'10809' {"return": {}} {'execute':'blockdev-add','arguments':{'driver':'qcow2','node-name':'n','file':{'driver':'file','filename':'file'}}} {"return": {}} {'execute':'nbd-server-add','arguments':{'device':'n'}} {"return": {}} {'execute':'x-nbd-server-add-bitmap','arguments':{'name':'n','bitmap':'b'}} {"error": {"class": "GenericError", "desc": "Bitmap 'b' is enabled"}} {'execute':'x-block-dirty-bitmap-disable','arguments':{'node':'n','name':'b'}} 
{"return": {}} {'execute':'x-nbd-server-add-bitmap','arguments':{'name':'n','bitmap':'b'}} {"return": {}} ... leave running $ ./qemu-img map --output=json --image-opts driver=nbd,export=n,server.type=inet,server.host=localhost,server.port=10809 [{ "start": 0, "length": 1114112, "depth": 0, "zero": false, "data": true}, { "start": 1114112, "length": 458752, "depth": 0, "zero": true, "data": false}, { "start": 1572864, "length": 524288, "depth": 0, "zero": false, "data": true}] $ ./qemu-img map --output=json --image-opts driver=nbd,export=n,server.type=inet,server.host=localhost,server.port=10809,x-block-status=qemu:dirty-bitmap:b [{ "start": 0, "length": 65536, "depth": 0, "zero": false, "data": false}, { "start": 65536, "length": 2031616, "depth": 0, "zero": false, "data": true}] The difference between the two runs shows that base:allocation status is thus different from the contents of dirty bitmap 'b'; and that the dirty bitmap 'b' indeed tracked the first 64k of the file as being dirty due to the qemu-io write at offset 0 performed between the creation of bitmap b in the first qemu, and the disabling it prior to exporting it in the second qemu. Signed-off-by: Eric Blake --- Based-on: <20180621031957.134718-1-ebl...@redhat.com> ([PULL 0/7] bitmap export over NBD) qapi/block-core.json | 12 - block/nbd-client.h | 1 + include/block/nbd.h | 1 + block/nbd-client.c | 2 + block/nbd.c | 9 +++- nbd/client.c | 130
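The remark that dirty sections look like holes follows from the flag values: NBD_STATE_DIRTY occupies the same bit as NBD_STATE_HOLE. A minimal sketch of that reinterpretation follows, assuming the client simply reads the returned flags as if they were base:allocation flags. `struct map_view` and `view_as_allocation()` are hypothetical names for illustration, not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag values from include/block/nbd.h in the bitmap export series. */
#define NBD_STATE_HOLE  (1 << 0)  /* base:allocation */
#define NBD_STATE_ZERO  (1 << 1)  /* base:allocation */
#define NBD_STATE_DIRTY (1 << 0)  /* qemu:dirty-bitmap */

/* Hypothetical view of one 'qemu-img map' entry; not a QEMU type. */
struct map_view {
    bool data;
    bool zero;
};

/*
 * Interpret extent flags as base:allocation flags, which is the
 * assumption this sketch makes about the hacked client.
 */
struct map_view view_as_allocation(uint32_t flags)
{
    struct map_view v;

    /* Only the two base:allocation bits are understood here. */
    assert((flags & ~(uint32_t)(NBD_STATE_HOLE | NBD_STATE_ZERO)) == 0);
    v.data = !(flags & NBD_STATE_HOLE);
    v.zero = !!(flags & NBD_STATE_ZERO);
    return v;
}
```

Under this reading a dirty extent comes out as "data": false, "zero": false and a clean one as "data": true, which agrees with the second `qemu-img map` run above.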
[Qemu-devel] [PULL 6/7] qapi: new qmp command nbd-server-add-bitmap
From: Vladimir Sementsov-Ogievskiy For now, the actual command is x-nbd-server-add-bitmap, reflecting the fact that we are still working on libvirt code that proves the command works as needed, and also the fact that we may remove bitmap-export-name (and just require that the exported name be the bitmap name). Signed-off-by: Vladimir Sementsov-Ogievskiy Message-Id: <20180609151758.17343-6-vsement...@virtuozzo.com> Reviewed-by: Eric Blake [eblake: make the command experimental by adding x- prefix] Signed-off-by: Eric Blake --- qapi/block.json | 23 +++ blockdev-nbd.c | 23 +++ 2 files changed, 46 insertions(+) diff --git a/qapi/block.json b/qapi/block.json index c6945240029..ca807f176ae 100644 --- a/qapi/block.json +++ b/qapi/block.json @@ -268,6 +268,29 @@ { 'command': 'nbd-server-remove', 'data': {'name': 'str', '*mode': 'NbdServerRemoveMode'} } +## +# @x-nbd-server-add-bitmap: +# +# Expose a dirty bitmap associated with the selected export. The bitmap search +# starts at the device attached to the export, and includes all backing files. +# The exported bitmap is then locked until the NBD export is removed. +# +# @name: Export name. +# +# @bitmap: Bitmap name to search for. +# +# @bitmap-export-name: How the bitmap will be seen by nbd clients +# (default @bitmap) +# +# Note: the client must use NBD_OPT_SET_META_CONTEXT with a query of +# "qemu:dirty-bitmap:NAME" (where NAME matches @bitmap-export-name) to access +# the exposed bitmap.
+# +# Since: 3.0 +## + { 'command': 'x-nbd-server-add-bitmap', +'data': {'name': 'str', 'bitmap': 'str', '*bitmap-export-name': 'str'} } + ## # @nbd-server-stop: # diff --git a/blockdev-nbd.c b/blockdev-nbd.c index 65a84739edc..1ef11041a73 100644 --- a/blockdev-nbd.c +++ b/blockdev-nbd.c @@ -220,3 +220,26 @@ void qmp_nbd_server_stop(Error **errp) nbd_server_free(nbd_server); nbd_server = NULL; } + +void qmp_x_nbd_server_add_bitmap(const char *name, const char *bitmap, + bool has_bitmap_export_name, + const char *bitmap_export_name, + Error **errp) +{ +NBDExport *exp; + +if (!nbd_server) { +error_setg(errp, "NBD server not running"); +return; +} + +exp = nbd_export_find(name); +if (exp == NULL) { +error_setg(errp, "Export '%s' is not found", name); +return; +} + +nbd_export_bitmap(exp, bitmap, + has_bitmap_export_name ? bitmap_export_name : bitmap, + errp); +} -- 2.14.4
[Qemu-devel] [PULL 5/7] nbd/server: implement dirty bitmap export
From: Vladimir Sementsov-Ogievskiy Handle a new NBD meta namespace: "qemu", and corresponding queries: "qemu:dirty-bitmap:". With the new metadata context negotiated, BLOCK_STATUS query will reply with dirty-bitmap data, converted to extents. The new public function nbd_export_bitmap selects which bitmap to export. For now, only one bitmap may be exported. Signed-off-by: Vladimir Sementsov-Ogievskiy Message-Id: <20180609151758.17343-5-vsement...@virtuozzo.com> Reviewed-by: Eric Blake [eblake: wording tweaks, minor cleanups, additional tracing] Signed-off-by: Eric Blake --- include/block/nbd.h | 8 +- nbd/server.c| 277 +++- nbd/trace-events| 1 + 3 files changed, 261 insertions(+), 25 deletions(-) diff --git a/include/block/nbd.h b/include/block/nbd.h index fcdcd545023..8bb9606c39b 100644 --- a/include/block/nbd.h +++ b/include/block/nbd.h @@ -229,11 +229,13 @@ enum { #define NBD_REPLY_TYPE_ERROR NBD_REPLY_ERR(1) #define NBD_REPLY_TYPE_ERROR_OFFSET NBD_REPLY_ERR(2) -/* Flags for extents (NBDExtent.flags) of NBD_REPLY_TYPE_BLOCK_STATUS, - * for base:allocation meta context */ +/* Extent flags for base:allocation in NBD_REPLY_TYPE_BLOCK_STATUS */ #define NBD_STATE_HOLE (1 << 0) #define NBD_STATE_ZERO (1 << 1) +/* Extent flags for qemu:dirty-bitmap in NBD_REPLY_TYPE_BLOCK_STATUS */ +#define NBD_STATE_DIRTY (1 << 0) + static inline bool nbd_reply_type_is_error(int type) { return type & (1 << 15); @@ -315,6 +317,8 @@ void nbd_client_put(NBDClient *client); void nbd_server_start(SocketAddress *addr, const char *tls_creds, Error **errp); +void nbd_export_bitmap(NBDExport *exp, const char *bitmap, + const char *bitmap_export_name, Error **errp); /* nbd_read * Reads @size bytes from @ioc. Returns 0 on success. 
diff --git a/nbd/server.c b/nbd/server.c index cea5192addb..f7f1fda4b3f 100644 --- a/nbd/server.c +++ b/nbd/server.c @@ -23,6 +23,13 @@ #include "nbd-internal.h" #define NBD_META_ID_BASE_ALLOCATION 0 +#define NBD_META_ID_DIRTY_BITMAP 1 + +/* NBD_MAX_BITMAP_EXTENTS: 1 mb of extents data. An empirical + * constant. If an increase is needed, note that the NBD protocol + * recommends no larger than 32 mb, so that the client won't consider + * the reply as a denial of service attack. */ +#define NBD_MAX_BITMAP_EXTENTS (0x100000 / 8) static int system_errno_to_nbd_errno(int err) { @@ -80,6 +87,9 @@ struct NBDExport { BlockBackend *eject_notifier_blk; Notifier eject_notifier; + +BdrvDirtyBitmap *export_bitmap; +char *export_bitmap_context; }; static QTAILQ_HEAD(, NBDExport) exports = QTAILQ_HEAD_INITIALIZER(exports); @@ -92,6 +102,7 @@ typedef struct NBDExportMetaContexts { bool valid; /* means that negotiation of the option finished without errors */ bool base_allocation; /* export base:allocation context (block status) */ +bool bitmap; /* export qemu:dirty-bitmap: */ } NBDExportMetaContexts; struct NBDClient { @@ -814,6 +825,56 @@ static int nbd_meta_base_query(NBDClient *client, NBDExportMetaContexts *meta, &meta->base_allocation, errp); } +/* nbd_meta_bitmap_query + * + * Handle query to 'qemu:' namespace. + * @len is the amount of text remaining to be read from the current name, after + * the 'qemu:' portion has been stripped. + * + * Return -errno on I/O error, 0 if option was completely handled by + * sending a reply about inconsistent lengths, or 1 on success.
*/ +static int nbd_meta_qemu_query(NBDClient *client, NBDExportMetaContexts *meta, + uint32_t len, Error **errp) +{ +bool dirty_bitmap = false; +size_t dirty_bitmap_len = strlen("dirty-bitmap:"); +int ret; + +if (!meta->exp->export_bitmap) { +trace_nbd_negotiate_meta_query_skip("no dirty-bitmap exported"); +return nbd_opt_skip(client, len, errp); +} + +if (len == 0) { +if (client->opt == NBD_OPT_LIST_META_CONTEXT) { +meta->bitmap = true; +} +trace_nbd_negotiate_meta_query_parse("empty"); +return 1; +} + +if (len < dirty_bitmap_len) { +trace_nbd_negotiate_meta_query_skip("not dirty-bitmap:"); +return nbd_opt_skip(client, len, errp); +} + +len -= dirty_bitmap_len; +ret = nbd_meta_pattern(client, "dirty-bitmap:", &dirty_bitmap, errp); +if (ret <= 0) { +return ret; +} +if (!dirty_bitmap) { +trace_nbd_negotiate_meta_query_skip("not dirty-bitmap:"); +return nbd_opt_skip(client, len, errp); +} + +trace_nbd_negotiate_meta_query_parse("dirty-bitmap:"); + +return nbd_meta_empty_or_pattern( +client, meta->exp->export_bitmap_context + +strlen("qemu:dirty_bitmap:"), len, &meta->bitmap, errp); +} + /* nbd_negotiate_meta_query * * Parse namespace name and call corresponding function to parse body of the @@ -829,9
[Qemu-devel] [PULL 2/7] nbd/server: fix trace
From: Vladimir Sementsov-Ogievskiy Return code = 1 doesn't mean that we parsed base:allocation. Use correct traces in both -parsed and -skipped cases. Signed-off-by: Vladimir Sementsov-Ogievskiy Message-Id: <20180609151758.17343-2-vsement...@virtuozzo.com> Reviewed-by: Eric Blake [eblake: comment tweaks] Signed-off-by: Eric Blake --- nbd/server.c | 14 ++ 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/nbd/server.c b/nbd/server.c index 9e1f2271784..e71301b8cd7 100644 --- a/nbd/server.c +++ b/nbd/server.c @@ -736,12 +736,16 @@ static int nbd_negotiate_send_meta_context(NBDClient *client, /* nbd_meta_base_query * - * Handle query to 'base' namespace. For now, only base:allocation context is - * available in it. 'len' is the amount of text remaining to be read from + * Handle queries to 'base' namespace. For now, only the base:allocation + * context is available. 'len' is the amount of text remaining to be read from * the current name, after the 'base:' portion has been stripped. * * Return -errno on I/O error, 0 if option was completely handled by - * sending a reply about inconsistent lengths, or 1 on success. */ + * sending a reply about inconsistent lengths, or 1 on success. + * + * Note: return code = 1 doesn't mean that we've parsed the "base:allocation" + * namespace. It only means that there are no errors. + */ static int nbd_meta_base_query(NBDClient *client, NBDExportMetaContexts *meta, uint32_t len, Error **errp) { @@ -768,10 +772,12 @@ static int nbd_meta_base_query(NBDClient *client, NBDExportMetaContexts *meta, } if (strncmp(query, "allocation", alen) == 0) { +trace_nbd_negotiate_meta_query_parse("base:allocation"); meta->base_allocation = true; +} else { +trace_nbd_negotiate_meta_query_skip("not base:allocation"); } -trace_nbd_negotiate_meta_query_parse("base:allocation"); return 1; } -- 2.14.4
[Qemu-devel] [PULL 4/7] nbd/server: add nbd_meta_empty_or_pattern helper
From: Vladimir Sementsov-Ogievskiy Add nbd_meta_pattern() and nbd_meta_empty_or_pattern() helpers for metadata query parsing. nbd_meta_pattern() will be reused for the "qemu" namespace in following patches. Signed-off-by: Vladimir Sementsov-Ogievskiy Message-Id: <20180609151758.17343-4-vsement...@virtuozzo.com> Reviewed-by: Eric Blake [eblake: comment tweaks] Signed-off-by: Eric Blake --- nbd/server.c | 101 --- 1 file changed, 68 insertions(+), 33 deletions(-) diff --git a/nbd/server.c b/nbd/server.c index bbdc3c01b9f..cea5192addb 100644 --- a/nbd/server.c +++ b/nbd/server.c @@ -733,6 +733,71 @@ static int nbd_negotiate_send_meta_context(NBDClient *client, return qio_channel_writev_all(client->ioc, iov, 2, errp) < 0 ? -EIO : 0; } +/* Read strlen(@pattern) bytes, and set @match to true if they match @pattern. + * @match is never set to false. + * + * Return -errno on I/O error, 0 if option was completely handled by + * sending a reply about inconsistent lengths, or 1 on success. + * + * Note: return code = 1 doesn't mean that we've read exactly @pattern. + * It only means that there are no errors. + */ +static int nbd_meta_pattern(NBDClient *client, const char *pattern, bool *match, +Error **errp) +{ +int ret; +char *query; +size_t len = strlen(pattern); + +assert(len); + +query = g_malloc(len); +ret = nbd_opt_read(client, query, len, errp); +if (ret <= 0) { +g_free(query); +return ret; +} + +if (strncmp(query, pattern, len) == 0) { +trace_nbd_negotiate_meta_query_parse(pattern); +*match = true; +} else { +trace_nbd_negotiate_meta_query_skip("pattern not matched"); +} +g_free(query); + +return 1; +} + +/* + * Read @len bytes, and set @match to true if they match @pattern, or if @len + * is 0 and the client is performing _LIST_. @match is never set to false. + * + * Return -errno on I/O error, 0 if option was completely handled by + * sending a reply about inconsistent lengths, or 1 on success. 
+ * + * Note: return code = 1 doesn't mean that we've read exactly @pattern. + * It only means that there are no errors. + */ +static int nbd_meta_empty_or_pattern(NBDClient *client, const char *pattern, + uint32_t len, bool *match, Error **errp) +{ +if (len == 0) { +if (client->opt == NBD_OPT_LIST_META_CONTEXT) { +*match = true; +} +trace_nbd_negotiate_meta_query_parse("empty"); +return 1; +} + +if (len != strlen(pattern)) { +trace_nbd_negotiate_meta_query_skip("different lengths"); +return nbd_opt_skip(client, len, errp); +} + +return nbd_meta_pattern(client, pattern, match, errp); +} + /* nbd_meta_base_query * * Handle queries to 'base' namespace. For now, only the base:allocation @@ -741,43 +806,12 @@ static int nbd_negotiate_send_meta_context(NBDClient *client, * * Return -errno on I/O error, 0 if option was completely handled by * sending a reply about inconsistent lengths, or 1 on success. - * - * Note: return code = 1 doesn't mean that we've parsed the "base:allocation" - * namespace. It only means that there are no errors. 
*/ static int nbd_meta_base_query(NBDClient *client, NBDExportMetaContexts *meta, uint32_t len, Error **errp) { -int ret; -char query[sizeof("allocation") - 1]; -size_t alen = strlen("allocation"); - -if (len == 0) { -if (client->opt == NBD_OPT_LIST_META_CONTEXT) { -meta->base_allocation = true; -} -trace_nbd_negotiate_meta_query_parse("base:"); -return 1; -} - -if (len != alen) { -trace_nbd_negotiate_meta_query_skip("not base:allocation"); -return nbd_opt_skip(client, len, errp); -} - -ret = nbd_opt_read(client, query, len, errp); -if (ret <= 0) { -return ret; -} - -if (strncmp(query, "allocation", alen) == 0) { -trace_nbd_negotiate_meta_query_parse("base:allocation"); -meta->base_allocation = true; -} else { -trace_nbd_negotiate_meta_query_skip("not base:allocation"); -} - -return 1; +return nbd_meta_empty_or_pattern(client, "allocation", len, + &meta->base_allocation, errp); } /* nbd_negotiate_meta_query @@ -823,6 +857,7 @@ static int nbd_negotiate_meta_query(NBDClient *client, return nbd_opt_skip(client, len, errp); } +trace_nbd_negotiate_meta_query_parse("base:"); return nbd_meta_base_query(client, meta, len, errp); } -- 2.14.4
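The decision made by nbd_meta_empty_or_pattern() can be shown without any socket I/O. The sketch below is an in-memory analogue under assumptions: `enum query_opt` and `meta_empty_or_pattern()` are hypothetical names, the query text is passed in a buffer instead of being read from the client, and the tri-state return code of the real helper is dropped because nothing here can fail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the two NBD options that carry queries. */
enum query_opt { OPT_LIST, OPT_SET };

/*
 * In-memory analogue of nbd_meta_empty_or_pattern(): @rest/@len is the
 * query text left after the namespace prefix has been stripped.
 * Sets *match to true on a hit and never clears it, like the original.
 */
void meta_empty_or_pattern(enum query_opt opt, const char *rest, size_t len,
                           const char *pattern, bool *match)
{
    assert(pattern[0] != '\0');

    if (len == 0) {
        /* A bare namespace selects everything, but only when listing. */
        if (opt == OPT_LIST) {
            *match = true;
        }
        return;
    }
    if (len != strlen(pattern)) {
        return; /* different lengths: the real code skips the bytes */
    }
    if (memcmp(rest, pattern, len) == 0) {
        *match = true;
    }
}
```

Because the flag is only ever set, a caller can run several queries against the same flag and ask afterwards whether any of them selected the context.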
[Qemu-devel] [PULL 0/7] bitmap export over NBD
The following changes since commit 46012db666990ff2eed1d3dc199ab8006439a93b: Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20180619' into staging (2018-06-20 09:51:30 +0100) are available in the Git repository at: git://repo.or.cz/qemu/ericb.git tags/pull-nbd-2018-06-20 for you to fetch changes up to 5bde4bbbd1e217c323551d2785a0efff6340d840: docs/interop: add nbd.txt (2018-06-20 21:20:05 -0500) nbd patches for 2018-06-20 Add experimental x-nbd-server-add-bitmap to expose a disabled bitmap over NBD, in preparation for a pull model incremental backup scheme. - Eric Blake: tests: Simplify .gitignore - Vladimir Sementsov-Ogievskiy: 0/6 NBD export bitmaps Eric Blake (1): tests: Simplify .gitignore Vladimir Sementsov-Ogievskiy (6): nbd/server: fix trace nbd/server: refactor NBDExportMetaContexts nbd/server: add nbd_meta_empty_or_pattern helper nbd/server: implement dirty bitmap export qapi: new qmp command nbd-server-add-bitmap docs/interop: add nbd.txt docs/interop/nbd.txt | 38 ++ qapi/block.json | 23 include/block/nbd.h | 8 +- blockdev-nbd.c | 23 nbd/server.c | 369 --- MAINTAINERS | 1 + nbd/trace-events | 1 + tests/.gitignore | 93 + 8 files changed, 417 insertions(+), 139 deletions(-) create mode 100644 docs/interop/nbd.txt -- 2.14.4
[Qemu-devel] [PULL 3/7] nbd/server: refactor NBDExportMetaContexts
From: Vladimir Sementsov-Ogievskiy Use NBDExport pointer instead of just export name: there is no need to store a duplicated name in the struct; moreover, NBDExport will be used further. Signed-off-by: Vladimir Sementsov-Ogievskiy Message-Id: <20180609151758.17343-3-vsement...@virtuozzo.com> Reviewed-by: Eric Blake [eblake: commit message grammar tweak] Signed-off-by: Eric Blake --- nbd/server.c | 23 +++ 1 file changed, 11 insertions(+), 12 deletions(-) diff --git a/nbd/server.c b/nbd/server.c index e71301b8cd7..bbdc3c01b9f 100644 --- a/nbd/server.c +++ b/nbd/server.c @@ -88,7 +88,7 @@ static QTAILQ_HEAD(, NBDExport) exports = QTAILQ_HEAD_INITIALIZER(exports); * as selected by NBD_OPT_SET_META_CONTEXT. Also used for * NBD_OPT_LIST_META_CONTEXT. */ typedef struct NBDExportMetaContexts { -char export_name[NBD_MAX_NAME_SIZE + 1]; +NBDExport *exp; bool valid; /* means that negotiation of the option finished without errors */ bool base_allocation; /* export base:allocation context (block status) */ @@ -399,10 +399,9 @@ static int nbd_negotiate_handle_list(NBDClient *client, Error **errp) return nbd_negotiate_send_rep(client, NBD_REP_ACK, errp); } -static void nbd_check_meta_export_name(NBDClient *client) +static void nbd_check_meta_export(NBDClient *client) { -client->export_meta.valid &= !strcmp(client->exp->name, - client->export_meta.export_name); +client->export_meta.valid &= client->exp == client->export_meta.exp; } /* Send a reply to NBD_OPT_EXPORT_NAME. 
@@ -456,7 +455,7 @@ static int nbd_negotiate_handle_export_name(NBDClient *client, QTAILQ_INSERT_TAIL(&client->exp->clients, client, next); nbd_export_get(client->exp); -nbd_check_meta_export_name(client); +nbd_check_meta_export(client); return 0; } @@ -650,7 +649,7 @@ static int nbd_negotiate_handle_info(NBDClient *client, uint16_t myflags, client->exp = exp; QTAILQ_INSERT_TAIL(&client->exp->clients, client, next); nbd_export_get(client->exp); -nbd_check_meta_export_name(client); +nbd_check_meta_export(client); rc = 1; } return rc; @@ -835,7 +834,7 @@ static int nbd_negotiate_meta_queries(NBDClient *client, NBDExportMetaContexts *meta, Error **errp) { int ret; -NBDExport *exp; +char export_name[NBD_MAX_NAME_SIZE + 1]; NBDExportMetaContexts local_meta; uint32_t nb_queries; int i; @@ -854,15 +853,15 @@ static int nbd_negotiate_meta_queries(NBDClient *client, memset(meta, 0, sizeof(*meta)); -ret = nbd_opt_read_name(client, meta->export_name, NULL, errp); +ret = nbd_opt_read_name(client, export_name, NULL, errp); if (ret <= 0) { return ret; } -exp = nbd_export_find(meta->export_name); -if (exp == NULL) { +meta->exp = nbd_export_find(export_name); +if (meta->exp == NULL) { return nbd_opt_drop(client, NBD_REP_ERR_UNKNOWN, errp, -"export '%s' not present", meta->export_name); +"export '%s' not present", export_name); } ret = nbd_opt_read(client, &nb_queries, sizeof(nb_queries), errp); @@ -871,7 +870,7 @@ static int nbd_negotiate_meta_queries(NBDClient *client, } cpu_to_be32s(&nb_queries); trace_nbd_negotiate_meta_context(nbd_opt_lookup(client->opt), - meta->export_name, nb_queries); + export_name, nb_queries); if (client->opt == NBD_OPT_LIST_META_CONTEXT && !nb_queries) { /* enable all known contexts */ -- 2.14.4
[Qemu-devel] [PULL 7/7] docs/interop: add nbd.txt
From: Vladimir Sementsov-Ogievskiy Describe new metadata namespace: "qemu". Signed-off-by: Vladimir Sementsov-Ogievskiy Message-Id: <20180609151758.17343-7-vsement...@virtuozzo.com> Reviewed-by: Eric Blake [eblake: grammar tweaks] Signed-off-by: Eric Blake --- docs/interop/nbd.txt | 38 ++ MAINTAINERS | 1 + 2 files changed, 39 insertions(+) create mode 100644 docs/interop/nbd.txt diff --git a/docs/interop/nbd.txt b/docs/interop/nbd.txt new file mode 100644 index 000..77b5f459111 --- /dev/null +++ b/docs/interop/nbd.txt @@ -0,0 +1,38 @@ +Qemu supports the NBD protocol, and has an internal NBD client (see +block/nbd.c), an internal NBD server (see blockdev-nbd.c), and an +external NBD server tool (see qemu-nbd.c). The common code is placed +in nbd/*. + +The NBD protocol is specified here: +https://github.com/NetworkBlockDevice/nbd/blob/master/doc/proto.md + +The following paragraphs describe some specific properties of NBD +protocol realization in Qemu. + += Metadata namespaces = + +Qemu supports the "base:allocation" metadata context as defined in the +NBD protocol specification, and also defines an additional metadata +namespace "qemu". + + +== "qemu" namespace == + +The "qemu" namespace currently contains only one type of context, +related to exposing the contents of a dirty bitmap alongside the +associated disk contents. That context has the following form: + +qemu:dirty-bitmap: + +Each dirty-bitmap metadata context defines only one flag for extents +in reply for NBD_CMD_BLOCK_STATUS: + +bit 0: NBD_STATE_DIRTY, means that the extent is "dirty" + +For NBD_OPT_LIST_META_CONTEXT the following queries are supported +in addition to "qemu:dirty-bitmap:": + +* "qemu:" - returns list of all available metadata contexts in the +namespace. +* "qemu:dirty-bitmap:" - returns list of all available dirty-bitmap + metadata contexts. 
diff --git a/MAINTAINERS b/MAINTAINERS index da91501c7a6..efb17e6ac0f 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1972,6 +1972,7 @@ F: nbd/ F: include/block/nbd* F: qemu-nbd.* F: blockdev-nbd.c +F: docs/interop/nbd.txt T: git git://repo.or.cz/qemu/ericb.git nbd NFS -- 2.14.4
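As an illustration of the single flag documented in nbd.txt, a client that has received dirty-bitmap extents could total the dirty bytes as in the sketch below. `struct extent` and `dirty_bytes()` are hypothetical names chosen for this example (the layout of a length followed by flags is an assumption, not taken from the document), and the values in the usage note are sample data only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* bit 0 of the extent flags in a qemu:dirty-bitmap context */
#define NBD_STATE_DIRTY (1 << 0)

/* Hypothetical extent record: a length in bytes plus status flags. */
struct extent {
    uint32_t length;
    uint32_t flags;
};

/* Sum the bytes covered by extents whose dirty bit is set. */
uint64_t dirty_bytes(const struct extent *ext, size_t n)
{
    uint64_t total = 0;

    assert(ext != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (ext[i].flags & NBD_STATE_DIRTY) {
            total += ext[i].length;
        }
    }
    return total;
}
```

For a 2 MiB export whose first 64 KiB cluster is dirty, the extents `{65536, NBD_STATE_DIRTY}` and `{2031616, 0}` would yield 65536 dirty bytes.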
[Qemu-devel] [PULL 1/7] tests: Simplify .gitignore
Commit 0bcc8e5b was yet another instance of 'git status' reporting dirty files after an in-tree build, thanks to the new binary tests/check-block-qdict. Instead of piecemeal exemptions of each new binary as they are added, let's use git's negative globbing feature to exempt ALL files that have a 'test-' or 'check-' prefix, except for the ones ending in '.c' or '.sh'. We still have a couple of generated files that then need (re-)exclusion, but the overall list is a LOT shorter, and less prone to needing future edits. Signed-off-by: Eric Blake Message-Id: <20180619203918.65450-1-ebl...@redhat.com> Reviewed-by: Philippe Mathieu-Daudé --- tests/.gitignore | 93 +++- 1 file changed, 5 insertions(+), 88 deletions(-) diff --git a/tests/.gitignore b/tests/.gitignore index 2bc61a9a58d..08e2df1ce1f 100644 --- a/tests/.gitignore +++ b/tests/.gitignore @@ -2,101 +2,18 @@ atomic_add-bench benchmark-crypto-cipher benchmark-crypto-hash benchmark-crypto-hmac -check-qdict -check-qnum -check-qjson -check-qlist -check-qlit -check-qnull -check-qobject -check-qstring -check-qom-interface -check-qom-proplist +check-* +!check-*.c +!check-*.sh qht-bench rcutorture -test-aio -test-aio-multithread -test-arm-mptimer -test-base64 -test-bdrv-drain -test-bitops -test-bitcnt -test-block-backend -test-blockjob -test-blockjob-txn -test-bufferiszero -test-char -test-clone-visitor -test-coroutine -test-crypto-afsplit -test-crypto-block -test-crypto-cipher -test-crypto-hash -test-crypto-hmac -test-crypto-ivgen -test-crypto-pbkdf -test-crypto-secret -test-crypto-tlscredsx509 -test-crypto-tlscredsx509-work/ -test-crypto-tlscredsx509-certs/ -test-crypto-tlssession -test-crypto-tlssession-work/ -test-crypto-tlssession-client/ -test-crypto-tlssession-server/ -test-crypto-xts -test-cutils -test-hbitmap -test-hmp -test-int128 -test-iov -test-io-channel-buffer -test-io-channel-command -test-io-channel-command.fifo -test-io-channel-file -test-io-channel-file.txt -test-io-channel-socket -test-io-channel-tls 
-test-io-task -test-keyval -test-logging -test-mul64 -test-opts-visitor +test-* +!test-*.c test-qapi-commands.[ch] test-qapi-events.[ch] test-qapi-types.[ch] -test-qapi-util test-qapi-visit.[ch] -test-qdev-global-props -test-qemu-opts -test-qdist -test-qga -test-qht -test-qht-par -test-qmp-cmds -test-qmp-event -test-qobject-input-strict -test-qobject-input-visitor test-qapi-introspect.[ch] -test-qobject-output-visitor -test-rcu-list -test-replication -test-shift128 -test-string-input-visitor -test-string-output-visitor -test-thread-pool -test-throttle -test-timed-average -test-uuid -test-util-sockets -test-visitor-serialization -test-vmstate -test-write-threshold -test-x86-cpuid -test-x86-cpuid-compat -test-xbzrle -test-netfilter -test-filter-mirror -test-filter-redirector *-test qapi-schema/*.test.* vm/*.img -- 2.14.4
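The effect of the negated patterns in this patch can be modelled in a few lines: rules are checked in order, the last one that matches decides, and a leading '!' turns a match into "not ignored". The sketch below is a deliberately simplified model that uses POSIX fnmatch() on bare file names. Real gitignore matching also handles directories, slashes and other cases, and `rules` and `is_ignored()` are hypothetical names holding only a subset of the new tests/.gitignore.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stdbool.h>
#include <stddef.h>

/* Subset of the patterns introduced by the patch, in file order. */
static const char *const rules[] = {
    "check-*", "!check-*.c", "!check-*.sh",
    "test-*", "!test-*.c",
    "test-qapi-types.[ch]",
};

/* Simplified model: the last matching rule wins, '!' negates it. */
bool is_ignored(const char *name)
{
    bool ignored = false;

    assert(name != NULL);
    for (size_t i = 0; i < sizeof(rules) / sizeof(rules[0]); i++) {
        const char *pat = rules[i];
        bool negate = (pat[0] == '!');

        if (fnmatch(negate ? pat + 1 : pat, name, 0) == 0) {
            ignored = !negate;
        }
    }
    return ignored;
}
```

In this model a binary such as test-aio is ignored, the source file test-aio.c is not, and test-qapi-types.c is ignored again because the later generated-file rule matches after the negation. That is the "(re-)exclusion" the commit message mentions.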
[Qemu-devel] [PATCH v3] hw/i386: Deprecate the machine types pc-0.10 and pc-0.11
The oldest machine type which is still used in a still maintained distro
is a pc-0.12 based machine type in RHEL6, so everything that is older
than pc-0.12 should not be used anymore. Thus let's deprecate pc-0.10
and pc-0.11 so that we can finally remove them in a future release.

Signed-off-by: Thomas Huth
---
 This is based on a patch that I already sent in 2017. But back then, we
 were still in the process of discussing our deprecation policies (e.g.
 automatic deprecation for old machine types), and there was no clear
 consensus whether we should deprecate 0.10 - 0.11, all 0.x or even up
 to version 1.2. After some iterations and too much discussion, I forgot
 about this patch. Anyway, I think we agreed that at least 0.10 and 0.11
 can certainly be removed nowadays, so let's finally get at least those
 two machine types marked as deprecated! If that works fine and we have
 finally removed these two types in v3.2, we can resume the discussion
 about newer machine types afterwards.

 Note: I don't want to add a QMP interface for this in this patch here,
 let's keep this small and simple! If we decide that we need a QMP
 interface, we can do that with a separate patch later.
v3: - Do not print the deprecation messages if qtest_enabled() v2: - Renamed deprecation_msg to deprecation_reason - Added information about that field to the MachineClass comment hw/i386/pc_piix.c | 2 ++ include/hw/boards.h | 4 qemu-doc.texi | 5 + vl.c| 10 -- 4 files changed, 19 insertions(+), 2 deletions(-) diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c index e9b6f06..b4fd164 100644 --- a/hw/i386/pc_piix.c +++ b/hw/i386/pc_piix.c @@ -956,6 +956,8 @@ static void pc_i440fx_0_11_machine_options(MachineClass *m) { pc_i440fx_0_12_machine_options(m); m->hw_version = "0.11"; +m->deprecation_reason = "Old and unsupported machine version, " +"use a newer machine type instead."; SET_MACHINE_COMPAT(m, PC_COMPAT_0_11); } diff --git a/include/hw/boards.h b/include/hw/boards.h index ef7457f..6926928 100644 --- a/include/hw/boards.h +++ b/include/hw/boards.h @@ -107,6 +107,9 @@ typedef struct { /** * MachineClass: + * @deprecation_reason: If set, the machine is marked as deprecated. The + *string should give some information about why the machine is deprecated, + *and must provide some clear information about what to use instead. * @max_cpus: maximum number of CPUs supported. Default: 1 * @min_cpus: minimum number of CPUs supported. Default: 1 * @default_cpus: number of CPUs instantiated if none are specified. Default: 1 @@ -166,6 +169,7 @@ struct MachineClass { char *name; const char *alias; const char *desc; +const char *deprecation_reason; void (*init)(MachineState *state); void (*reset)(void); diff --git a/qemu-doc.texi b/qemu-doc.texi index 282bc3d..16fcb47 100644 --- a/qemu-doc.texi +++ b/qemu-doc.texi @@ -2943,6 +2943,11 @@ support page sizes < 4096 any longer. @section System emulator machines +@subsection pc-0.10 and pc-0.11 (since 3.0) + +These machine types are very old and likely can not be used for live migration +from old QEMU versions anymore. A newer machine type should be used instead. 
+ @section Device options @subsection Block device options diff --git a/vl.c b/vl.c index b3426e0..6d54bc6 100644 --- a/vl.c +++ b/vl.c @@ -2560,8 +2560,9 @@ static gint machine_class_cmp(gconstpointer a, gconstpointer b) if (mc->alias) { printf("%-20s %s (alias of %s)\n", mc->alias, mc->desc, mc->name); } -printf("%-20s %s%s\n", mc->name, mc->desc, - mc->is_default ? " (default)" : ""); +printf("%-20s %s%s%s\n", mc->name, mc->desc, + mc->is_default ? " (default)" : "", + mc->deprecation_reason ? " (deprecated)" : ""); } } @@ -4257,6 +4258,11 @@ int main(int argc, char **argv, char **envp) configure_accelerator(current_machine); +if (!qtest_enabled() && machine_class->deprecation_reason) { +error_report("Machine type '%s' is deprecated: %s", + machine_class->name, machine_class->deprecation_reason); +} + /* * Register all the global properties, including accel properties, * machine properties, and user-specified ones. -- 1.8.3.1
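The v3 change boils down to "warn only when a reason string is set and we are not running under qtest". The following stand-alone sketch models just that decision. The struct machine_info type and the format_deprecation_warning function are hypothetical stand-ins for MachineClass and the error_report() call in vl.c, and the under_qtest flag stands in for qtest_enabled().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, trimmed-down stand-in for QEMU's MachineClass. */
struct machine_info {
    const char *name;
    const char *deprecation_reason;   /* NULL means "not deprecated" */
};

/* Writes the warning into buf and returns its length, or returns 0 and
 * writes an empty string when no warning should be printed. */
static int format_deprecation_warning(const struct machine_info *mc,
                                      bool under_qtest,
                                      char *buf, size_t len)
{
    if (under_qtest || !mc->deprecation_reason) {
        if (len > 0) {
            buf[0] = '\0';
        }
        return 0;
    }
    return snprintf(buf, len, "Machine type '%s' is deprecated: %s",
                    mc->name, mc->deprecation_reason);
}
```

Keeping the qtest check next to the NULL check mirrors the intent stated in the v3 changelog: the message is for users, and test runs should stay quiet.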
Re: [Qemu-devel] [PATCH] tests: Simplify .gitignore
On 06/19/2018 04:10 PM, Philippe Mathieu-Daudé wrote:
> On 06/19/2018 05:39 PM, Eric Blake wrote:
>> Commit 0bcc8e5b was yet another instance of 'git status' reporting
>> dirty files after an in-tree build, thanks to the new binary
>> tests/check-block-qdict. Instead of piecemeal exemptions of each
>> new binary as they are added, let's use git's negative globbing
>> feature to exempt ALL files that have a 'test-' or 'check-' prefix,
>> except for the ones ending in '.c' or '.sh'. We still have a couple
>> of generated files that then need (re-)exclusion, but the overall
>> list is a LOT shorter, and less prone to needing future edits.
>
> Finally :)
>
>> Signed-off-by: Eric Blake
>
> Reviewed-by: Philippe Mathieu-Daudé

Thanks; including it in my NBD queue for a pull request.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org
[Qemu-devel] [Bug 1777969] [NEW] Crash with UEFI, q35, AHCI, and <= SystemRescueCD 4.3.0
Public bug reported:

I am getting a crash when booting SystemRescueCD 4.3.0 or earlier in UEFI
mode with the q35 machine type and from an AHCI device, with qemu 2.11.1
and 2.12.0. The crash doesn't occur if I compile with
--enable-trace-backends=simple or if I use virtio-scsi.

The original crash was noticed on Gentoo with hardened gcc 6.4.0 and an
Intel CPU; the test system used to reproduce the crash is on Gentoo with
non-hardened gcc 5.4.0 and an Intel CPU. The OVMF version is from Gentoo:
edk2-ovmf-2017_p20180211-bin.tar.xz

Here are the commands I ran on qemu 2.12.0 to reproduce the issue (it
also crashes with accel=kvm removed):

./configure --target-list="x86_64-softmmu"
make
qemu-system-x86_64 -nodefaults -machine q35,accel=kvm -cpu qemu64 \
  -drive if=pflash,format=raw,unit=0,file=/usr/share/edk2-ovmf/OVMF_CODE.fd,readonly=on \
  -drive if=pflash,format=raw,unit=1,file=OVMF_VARS.fd \
  -m 512 \
  -drive file=systemrescuecd-x86-4.3.0.iso,if=none,id=cdrom-sysresc,readonly=on \
  -device ide-cd,bus=ide.0,unit=0,drive=cdrom-sysresc,bootindex=5 \
  -device VGA -display gtk

Valgrind says "Bad permissions for mapped region at address 0x4C022FE0"
for the crash.

Here is a backtrace from gdb:

Program received signal SIGSEGV, Segmentation fault.
0x7f42dcbc5833 in malloc () from /lib64/libc.so.6 (gdb) bt #0 0x7f42dcbc5833 in malloc () from /lib64/libc.so.6 #1 0x7f42e10117d9 in g_malloc () from /usr/lib64/libglib-2.0.so.0 #2 0x55a3ff9def8f in qemu_aio_get (aiocb_info=aiocb_info@entry=0x55a4001b39a0 , bs=bs@entry=0x0, cb=cb@entry=0x55a3ff9dfe20 , opaque=opaque@entry=0x7f42961e30b0) at util/aiocb.c:33 #3 0x55a3ff9e0249 in thread_pool_submit_aio (pool=pool@entry=0x55a400c038d0, func=func@entry=0x55a3ff956620 , arg=arg@entry=0x55a400bd30b0, cb=cb@entry=0x55a3ff9dfe20 , opaque=opaque@entry=0x7f42961e30b0) at util/thread-pool.c:251 #4 0x55a3ff9e0423 in thread_pool_submit_co (pool=0x55a400c038d0, func=func@entry=0x55a3ff956620 , arg=arg@entry=0x55a400bd30b0) at util/thread-pool.c:289 #5 0x55a3ff956b50 in paio_submit_co (bs=0x55a400bff180, fd=, offset=362702848, qiov=, bytes=2048, type=1) at block/file-posix.c:1536 #6 0x55a3ff95c82a in bdrv_driver_preadv (bs=bs@entry=0x55a400bff180, offset=offset@entry=362702848, bytes=bytes@entry=2048, qiov=qiov@entry=0x7f42961e3650, flags=0) at block/io.c:924 #7 0x55a3ff960154 in bdrv_aligned_preadv (child=child@entry=0x55a400c03a20, req=req@entry=0x7f42961e32e0, offset=offset@entry=362702848, bytes=bytes@entry=2048, align=align@entry=1, qiov=qiov@entry=0x7f42961e3650, flags=0) at block/io.c:1228 #8 0x55a3ff960434 in bdrv_co_preadv (child=0x55a400c03a20, offset=362702848, bytes=2048, qiov=0x7f42961e3650, flags=0) at block/io.c:1324 #9 0x55a3ff95c82a in bdrv_driver_preadv (bs=bs@entry=0x55a400bf8e50, offset=offset@entry=362702848, bytes=bytes@entry=2048, qiov=qiov@entry=0x7f42961e3650, flags=0) at block/io.c:924 #10 0x55a3ff960154 in bdrv_aligned_preadv (child=child@entry=0x55a400be92c0, req=req@entry=0x7f42961e3510, offset=offset@entry=362702848, bytes=bytes@entry=2048, align=align@entry=512, qiov=qiov@entry=0x7f42961e3650, flags=0) at block/io.c:1228 #11 0x55a3ff960434 in bdrv_co_preadv (child=0x55a400be92c0, offset=offset@entry=362702848, bytes=bytes@entry=2048, 
qiov=qiov@entry=0x7f42961e3650, flags=flags@entry=0) at block/io.c:1324 #12 0x55a3ff94f4ce in blk_co_preadv (blk=0x55a400bf8ba0, offset=362702848, bytes=2048, qiov=0x7f42961e3650, flags=0) at block/block-backend.c:1158 #13 0x55a3ff94f5ac in blk_read_entry (opaque=0x7f42961e3670) at block/block-backend.c:1206 #14 0x55a3ff94e000 in blk_prw (blk=0x55a400bf8ba0, offset=362702848, buf=, bytes=bytes@entry=2048, co_entry=co_entry@entry=0x55a3ff94f590 , flags=flags@entry=0) at block/block-backend.c:1243 #15 0x55a3ff94f076 in blk_pread (blk=, offset=, buf=, count=count@entry=2048) at block/block-backend.c:1409 #16 0x55a3ff7d8b93 in cd_read_sector_sync (s=0x55a401a0faa0) at hw/ide/atapi.c:124 #17 ide_atapi_cmd_reply_end (s=0x55a401a0faa0) at hw/ide/atapi.c:269 #18 0x55a3ff7dde0e in ahci_start_transfer (dma=0x55a401a0f9f0) at hw/ide/ahci.c:1325 #19 0x55a3ff7d870c in ide_atapi_cmd_reply_end (s=0x55a401a0faa0) at hw/ide/atapi.c:285 #20 0x55a3ff7dde0e in ahci_start_transfer (dma=0x55a401a0f9f0) at hw/ide/ahci.c:1325 #21 0x55a3ff7d870c in ide_atapi_cmd_reply_end (s=0x55a401a0faa0) at hw/ide/atapi.c:285 #22 0x55a3ff7dde0e in ahci_start_transfer (dma=0x55a401a0f9f0) at hw/ide/ahci.c:1325 #23 0x55a3ff7d870c in ide_atapi_cmd_reply_end (s=0x55a401a0faa0) at hw/ide/atapi.c:285 #24 0x55a3ff7dde0e in ahci_start_transfer (dma=0x55a401a0f9f0) at hw/ide/ahci.c:1325 #25 0x55a3ff7d870c in ide_atapi_cmd_reply_end (s=0x55a401a0faa0) at hw/ide/atapi.c:285 #26 0x55a3ff7dde0e in ahci_start_transfer (dma=0x55a401a0f9f0) at hw/ide/ahci.c:1325 #27 0x55a3ff7d870c in ide_atapi_cmd_reply_end (s=0x55a401a0faa0) at
Re: [Qemu-devel] [PATCH v2] hw/i386: Deprecate the machine types pc-0.10 and pc-0.11
On 21.06.2018 04:18, Thomas Huth wrote:
> The oldest machine type which is still used in a still maintained distro
> is a pc-0.12 based machine type in RHEL6, so everything that is older
> than pc-0.12 should not be used anymore. Thus let's deprecate pc-0.10
> and pc-0.11 so that we can finally remove them in a future release.
[...]
> @@ -3953,6 +3954,10 @@ int main(int argc, char **argv, char **envp)
>      }
>
>      machine_class = select_machine();
> +    if (machine_class->deprecation_reason) {
> +        error_report("Machine type '%s' is deprecated: %s",
> +                     machine_class->name, machine_class->deprecation_reason);
> +    }

I just noticed that I need to check for !qtest_enabled() here, too,
otherwise these messages show up during "make check". I'll send a v3...

 Thomas
[Qemu-devel] [PATCH v5 34/35] target/arm: Enable SVE for aarch64-linux-user
Enable ARM_FEATURE_SVE for the generic "max" cpu. Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/cpu.c | 7 +++ target/arm/cpu64.c | 1 + 2 files changed, 8 insertions(+) diff --git a/target/arm/cpu.c b/target/arm/cpu.c index e1de45e904..8e4f4d8c21 100644 --- a/target/arm/cpu.c +++ b/target/arm/cpu.c @@ -164,6 +164,13 @@ static void arm_cpu_reset(CPUState *s) env->cp15.sctlr_el[1] |= SCTLR_UCT | SCTLR_UCI | SCTLR_DZE; /* and to the FP/Neon instructions */ env->cp15.cpacr_el1 = deposit64(env->cp15.cpacr_el1, 20, 2, 3); +/* and to the SVE instructions */ +env->cp15.cpacr_el1 = deposit64(env->cp15.cpacr_el1, 16, 2, 3); +env->cp15.cptr_el[3] |= CPTR_EZ; +/* with maximum vector length */ +env->vfp.zcr_el[1] = ARM_MAX_VQ - 1; +env->vfp.zcr_el[2] = ARM_MAX_VQ - 1; +env->vfp.zcr_el[3] = ARM_MAX_VQ - 1; #else /* Reset into the highest available EL */ if (arm_feature(env, ARM_FEATURE_EL3)) { diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c index c50dcd4077..0360d7efc5 100644 --- a/target/arm/cpu64.c +++ b/target/arm/cpu64.c @@ -252,6 +252,7 @@ static void aarch64_max_initfn(Object *obj) set_feature(>env, ARM_FEATURE_V8_RDM); set_feature(>env, ARM_FEATURE_V8_FP16); set_feature(>env, ARM_FEATURE_V8_FCMA); +set_feature(>env, ARM_FEATURE_SVE); /* For usermode -cpu max we can use a larger and more efficient DCZ * blocksize since we don't have to follow what the hardware does. */ -- 2.17.1
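For readers less familiar with the register fields touched in arm_cpu_reset(), the sketch below models the two deposit64() calls and the ZCR length setting in plain C. deposit64_sketch is a re-implementation written for this note, not QEMU's own helper, and the value 16 used for ARM_MAX_VQ in the usage comment is an assumption about the tree at the time. The field positions (FPEN at CPACR_EL1 bits [21:20], ZEN at bits [17:16]) and the ZCR rule (vector length is (LEN + 1) * 128 bits) come from the Arm architecture.

```c
#include <stdint.h>

/* Illustrative re-implementation: replace 'length' bits of 'value'
 * starting at bit 'start' with 'field'. Valid for 1 <= length <= 64
 * and start + length <= 64. */
static uint64_t deposit64_sketch(uint64_t value, int start, int length,
                                 uint64_t field)
{
    uint64_t mask = (~0ULL >> (64 - length)) << start;

    return (value & ~mask) | ((field << start) & mask);
}

/* Mirror of the two calls in the patch: FPEN = 0b11, then ZEN = 0b11,
 * which turns off trapping of FP/Neon and SVE instructions. */
static uint64_t cpacr_enable_fp_sve_sketch(uint64_t cpacr)
{
    cpacr = deposit64_sketch(cpacr, 20, 2, 3);   /* FPEN, bits [21:20] */
    cpacr = deposit64_sketch(cpacr, 16, 2, 3);   /* ZEN,  bits [17:16] */
    return cpacr;
}

/* ZCR_ELx.LEN holds (number of 128-bit quadwords - 1). */
static unsigned sve_vector_bytes_sketch(unsigned zcr_len)
{
    return (zcr_len + 1) * 16;
}
```

With ARM_MAX_VQ assumed to be 16, writing ARM_MAX_VQ - 1 into each zcr_el[] entry corresponds to sve_vector_bytes_sketch(15), i.e. 256-byte (2048-bit) vectors, the architectural maximum.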
[Qemu-devel] [PATCH v5 32/35] target/arm: Implement SVE dot product (vectors)
Signed-off-by: Richard Henderson --- target/arm/helper.h| 5 +++ target/arm/translate-sve.c | 17 ++ target/arm/vec_helper.c| 67 ++ target/arm/sve.decode | 3 ++ 4 files changed, 92 insertions(+) diff --git a/target/arm/helper.h b/target/arm/helper.h index 8607077dda..e23ce7ff19 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -583,6 +583,11 @@ DEF_HELPER_FLAGS_5(gvec_qrdmlah_s32, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(gvec_qrdmlsh_s32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_sdot_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_udot_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_sdot_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_udot_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(gvec_fcaddh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_fcadds, TCG_CALL_NO_RWG, diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 209a69cd76..aa109208e5 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3423,6 +3423,23 @@ DO_ZZI(UMIN, umin) #undef DO_ZZI +static bool trans_DOT_zzz(DisasContext *s, arg_DOT_zzz *a, uint32_t insn) +{ +static gen_helper_gvec_3 * const fns[2][2] = { +{ gen_helper_gvec_sdot_b, gen_helper_gvec_sdot_h }, +{ gen_helper_gvec_udot_b, gen_helper_gvec_udot_h } +}; + +if (sve_access_check(s)) { +unsigned vsz = vec_full_reg_size(s); +tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vsz, vsz, 0, fns[a->u][a->sz]); +} +return true; +} + /* *** SVE Floating Point Multiply-Add Indexed Group */ diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index db5aeb9f24..c16a30c3b5 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -194,6 +194,73 @@ void HELPER(gvec_qrdmlsh_s32)(void *vd, void *vn, void *vm, clear_tail(d, opr_sz, simd_maxsz(desc)); } +/* Integer 8 and 16-bit 
dot-product. + * + * Note that for the loops herein, host endianness does not matter + * with respect to the ordering of data within the 64-bit lanes. + * All elements are treated equally, no matter where they are. + */ + +void HELPER(gvec_sdot_b)(void *vd, void *vn, void *vm, uint32_t desc) +{ +intptr_t i, opr_sz = simd_oprsz(desc); +uint32_t *d = vd; +int8_t *n = vn, *m = vm; + +for (i = 0; i < opr_sz / 4; ++i) { +d[i] += n[i * 4 + 0] * m[i * 4 + 0] + + n[i * 4 + 1] * m[i * 4 + 1] + + n[i * 4 + 2] * m[i * 4 + 2] + + n[i * 4 + 3] * m[i * 4 + 3]; +} +clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_udot_b)(void *vd, void *vn, void *vm, uint32_t desc) +{ +intptr_t i, opr_sz = simd_oprsz(desc); +uint32_t *d = vd; +uint8_t *n = vn, *m = vm; + +for (i = 0; i < opr_sz / 4; ++i) { +d[i] += n[i * 4 + 0] * m[i * 4 + 0] + + n[i * 4 + 1] * m[i * 4 + 1] + + n[i * 4 + 2] * m[i * 4 + 2] + + n[i * 4 + 3] * m[i * 4 + 3]; +} +clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_sdot_h)(void *vd, void *vn, void *vm, uint32_t desc) +{ +intptr_t i, opr_sz = simd_oprsz(desc); +uint64_t *d = vd; +int16_t *n = vn, *m = vm; + +for (i = 0; i < opr_sz / 8; ++i) { +d[i] += (int64_t)n[i * 4 + 0] * m[i * 4 + 0] + + (int64_t)n[i * 4 + 1] * m[i * 4 + 1] + + (int64_t)n[i * 4 + 2] * m[i * 4 + 2] + + (int64_t)n[i * 4 + 3] * m[i * 4 + 3]; +} +clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_udot_h)(void *vd, void *vn, void *vm, uint32_t desc) +{ +intptr_t i, opr_sz = simd_oprsz(desc); +uint64_t *d = vd; +uint16_t *n = vn, *m = vm; + +for (i = 0; i < opr_sz / 8; ++i) { +d[i] += (uint64_t)n[i * 4 + 0] * m[i * 4 + 0] + + (uint64_t)n[i * 4 + 1] * m[i * 4 + 1] + + (uint64_t)n[i * 4 + 2] * m[i * 4 + 2] + + (uint64_t)n[i * 4 + 3] * m[i * 4 + 3]; +} +clear_tail(d, opr_sz, simd_maxsz(desc)); +} + void HELPER(gvec_fcaddh)(void *vd, void *vn, void *vm, void *vfpst, uint32_t desc) { diff --git a/target/arm/sve.decode b/target/arm/sve.decode index b578d104c4..0b29da9f3a 
100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -721,6 +721,9 @@ UMIN_zzi00100101 .. 101 011 110 . @rdn_i8u # SVE integer multiply immediate (unpredicated) MUL_zzi 00100101 .. 110 000 110 . @rdn_i8s +# SVE integer dot product (unpredicated) +DOT_zzz 01000100 1 sz:1 0 rm:5 0 u:1 rn:5 rd:5 + # SVE floating-point complex add
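The arithmetic performed by the byte-sized helpers can be tried outside QEMU with plain arrays. The function below is a minimal sketch of the signed 8-bit case only: sdot_b_sketch is a name invented here, the lane count is passed directly instead of being decoded from desc, and the clear_tail() step is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Each 32-bit lane of d accumulates the dot product of the four signed
 * bytes of n and m that share the lane (illustrative sketch). */
static void sdot_b_sketch(uint32_t *d, const int8_t *n, const int8_t *m,
                          size_t lanes)
{
    for (size_t i = 0; i < lanes; i++) {
        int32_t sum = 0;

        for (size_t k = 0; k < 4; k++) {
            sum += (int32_t)n[i * 4 + k] * m[i * 4 + k];
        }
        /* Accumulate modulo 2^32, as the uint32_t destination does. */
        d[i] += (uint32_t)sum;
    }
}
```

For example, with d = {10, 0}, n = {1, 2, 3, 4, -1, -1, -1, -1} and m = {5, 6, 7, 8, 1, 2, 3, 4}, lane 0 becomes 10 + 70 = 80 and lane 1 becomes the 32-bit two's complement encoding of -10.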
[Qemu-devel] [PATCH v5 33/35] target/arm: Implement SVE dot product (indexed)
Signed-off-by: Richard Henderson --- target/arm/helper.h| 5 ++ target/arm/translate-sve.c | 18 +++ target/arm/vec_helper.c| 96 ++ target/arm/sve.decode | 8 +++- 4 files changed, 126 insertions(+), 1 deletion(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index e23ce7ff19..59e8c3bd1b 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -588,6 +588,11 @@ DEF_HELPER_FLAGS_4(gvec_udot_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(gvec_sdot_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(gvec_udot_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_sdot_idx_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_udot_idx_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_sdot_idx_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_udot_idx_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(gvec_fcaddh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_fcadds, TCG_CALL_NO_RWG, diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index aa109208e5..af2958be10 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3440,6 +3440,24 @@ static bool trans_DOT_zzz(DisasContext *s, arg_DOT_zzz *a, uint32_t insn) return true; } +static bool trans_DOT_zzx(DisasContext *s, arg_DOT_zzx *a, uint32_t insn) +{ +static gen_helper_gvec_3 * const fns[2][2] = { +{ gen_helper_gvec_sdot_idx_b, gen_helper_gvec_sdot_idx_h }, +{ gen_helper_gvec_udot_idx_b, gen_helper_gvec_udot_idx_h } +}; + +if (sve_access_check(s)) { +unsigned vsz = vec_full_reg_size(s); +tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vsz, vsz, a->index, fns[a->u][a->sz]); +} +return true; +} + + /* *** SVE Floating Point Multiply-Add Indexed Group */ diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index c16a30c3b5..3117ee39cd 100644 --- 
a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -261,6 +261,102 @@ void HELPER(gvec_udot_h)(void *vd, void *vn, void *vm, uint32_t desc) clear_tail(d, opr_sz, simd_maxsz(desc)); } +void HELPER(gvec_sdot_idx_b)(void *vd, void *vn, void *vm, uint32_t desc) +{ +intptr_t i, j, opr_sz = simd_oprsz(desc), opr_sz_4 = opr_sz / 4; +intptr_t index = simd_data(desc); +uint32_t *d = vd; +int8_t *n = vn, *m = vm; + +for (i = 0; i < opr_sz_4; i = j) { +int8_t m0 = m[(i + index) * 4 + 0]; +int8_t m1 = m[(i + index) * 4 + 1]; +int8_t m2 = m[(i + index) * 4 + 2]; +int8_t m3 = m[(i + index) * 4 + 3]; + +j = i; +do { +d[j] += n[j * 4 + 0] * m0 + + n[j * 4 + 1] * m1 + + n[j * 4 + 2] * m2 + + n[j * 4 + 3] * m3; +} while (++j < MIN(i + 4, opr_sz_4)); +} +clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_udot_idx_b)(void *vd, void *vn, void *vm, uint32_t desc) +{ +intptr_t i, j, opr_sz = simd_oprsz(desc), opr_sz_4 = opr_sz / 4; +intptr_t index = simd_data(desc); +uint32_t *d = vd; +uint8_t *n = vn, *m = vm; + +for (i = 0; i < opr_sz_4; i = j) { +uint8_t m0 = m[(i + index) * 4 + 0]; +uint8_t m1 = m[(i + index) * 4 + 1]; +uint8_t m2 = m[(i + index) * 4 + 2]; +uint8_t m3 = m[(i + index) * 4 + 3]; + +j = i; +do { +d[j] += n[j * 4 + 0] * m0 + + n[j * 4 + 1] * m1 + + n[j * 4 + 2] * m2 + + n[j * 4 + 3] * m3; +} while (++j < MIN(i + 4, opr_sz_4)); +} +clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_sdot_idx_h)(void *vd, void *vn, void *vm, uint32_t desc) +{ +intptr_t i, j, opr_sz = simd_oprsz(desc), opr_sz_8 = opr_sz / 8; +intptr_t index = simd_data(desc); +uint64_t *d = vd; +int16_t *n = vn, *m = vm; + +for (i = 0; i < opr_sz_8; i = j) { +int64_t m0 = m[(i + index) * 4 + 0]; +int64_t m1 = m[(i + index) * 4 + 1]; +int64_t m2 = m[(i + index) * 4 + 2]; +int64_t m3 = m[(i + index) * 4 + 3]; + +j = i; +do { +d[j] += n[j * 4 + 0] * m0 + + n[j * 4 + 1] * m1 + + n[j * 4 + 2] * m2 + + n[j * 4 + 3] * m3; +} while (++j < MIN(i + 2, opr_sz_8)); +} +clear_tail(d, 
opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_udot_idx_h)(void *vd, void *vn, void *vm, uint32_t desc) +{ +intptr_t i, j, opr_sz = simd_oprsz(desc), opr_sz_8 = opr_sz / 8; +intptr_t index = simd_data(desc); +uint64_t *d = vd; +uint16_t *n = vn, *m = vm; + +for (i = 0; i < opr_sz_8; i = j) { +
[Qemu-devel] [PATCH v5 30/35] target/arm: Pass index to AdvSIMD FCMLA (indexed)
The original commit failed to pass, or use, the index. Fixes: d17b7cdcf4ea Signed-off-by: Richard Henderson --- target/arm/translate-a64.c | 21 - target/arm/vec_helper.c| 10 ++ 2 files changed, 18 insertions(+), 13 deletions(-) diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 8d8a4cecb0..038e48278f 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -12669,15 +12669,18 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) case 0x13: /* FCMLA #90 */ case 0x15: /* FCMLA #180 */ case 0x17: /* FCMLA #270 */ -tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, rd), - vec_full_reg_offset(s, rn), - vec_reg_offset(s, rm, index, size), fpst, - is_q ? 16 : 8, vec_full_reg_size(s), - extract32(insn, 13, 2), /* rot */ - size == MO_64 - ? gen_helper_gvec_fcmlas_idx - : gen_helper_gvec_fcmlah_idx); -tcg_temp_free_ptr(fpst); +{ +int rot = extract32(insn, 13, 2); +int data = index * 4 + rot; +tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, rd), + vec_full_reg_offset(s, rn), + vec_reg_offset(s, rm, index, size), fpst, + is_q ? 16 : 8, vec_full_reg_size(s), data, + size == MO_64 + ? gen_helper_gvec_fcmlas_idx + : gen_helper_gvec_fcmlah_idx); +tcg_temp_free_ptr(fpst); +} return; } diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index 073e5c58e7..8f2dc4b989 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -317,10 +317,11 @@ void HELPER(gvec_fcmlah_idx)(void *vd, void *vn, void *vm, float_status *fpst = vfpst; intptr_t flip = extract32(desc, SIMD_DATA_SHIFT, 1); uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1); +intptr_t index = extract32(desc, SIMD_DATA_SHIFT + 2, 2); uint32_t neg_real = flip ^ neg_imag; uintptr_t i; -float16 e1 = m[H2(flip)]; -float16 e3 = m[H2(1 - flip)]; +float16 e1 = m[H2(2 * index + flip)]; +float16 e3 = m[H2(2 * index + 1 - flip)]; /* Shift boolean to the sign bit so we can xor to negate. 
*/ neg_real <<= 15; @@ -377,10 +378,11 @@ void HELPER(gvec_fcmlas_idx)(void *vd, void *vn, void *vm, float_status *fpst = vfpst; intptr_t flip = extract32(desc, SIMD_DATA_SHIFT, 1); uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1); +intptr_t index = extract32(desc, SIMD_DATA_SHIFT + 2, 2); uint32_t neg_real = flip ^ neg_imag; uintptr_t i; -float32 e1 = m[H4(flip)]; -float32 e3 = m[H4(1 - flip)]; +float32 e1 = m[H4(2 * index + flip)]; +float32 e3 = m[H4(2 * index + 1 - flip)]; /* Shift boolean to the sign bit so we can xor to negate. */ neg_real <<= 31; -- 2.17.1
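The fix packs the element index above the two rotation bits (data = index * 4 + rot) and the helpers unpack it again. The sketch below reproduces that packing on a bare integer so the bit layout can be checked in isolation. The struct and function names are invented for this note, and the real helpers read the value out of desc at SIMD_DATA_SHIFT rather than receiving it directly.

```c
/* Fields recovered from the packed 'data' value (illustrative). */
struct fcmla_fields {
    unsigned flip;       /* bit 0 of rot */
    unsigned neg_imag;   /* bit 1 of rot */
    unsigned neg_real;   /* flip ^ neg_imag, as in the helper */
    unsigned index;      /* which complex pair of Vm is used */
    unsigned e1_pos;     /* element of m loaded as e1: 2 * index + flip */
    unsigned e3_pos;     /* element of m loaded as e3: 2 * index + 1 - flip */
};

/* Translator side: rot occupies bits [1:0], index sits above it. */
static unsigned fcmla_pack_sketch(unsigned index, unsigned rot)
{
    return index * 4 + rot;
}

/* Helper side: undo the packing. */
static struct fcmla_fields fcmla_unpack_sketch(unsigned data)
{
    struct fcmla_fields f;

    f.flip = data & 1;
    f.neg_imag = (data >> 1) & 1;
    f.index = (data >> 2) & 3;
    f.neg_real = f.flip ^ f.neg_imag;
    f.e1_pos = 2 * f.index + f.flip;
    f.e3_pos = 2 * f.index + 1 - f.flip;
    return f;
}
```

Before the fix the helpers always read elements 0 and 1 of m, which matches this sketch only when index is 0.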
[Qemu-devel] [PATCH v5 28/35] target/arm: Implement SVE floating-point complex add
Signed-off-by: Richard Henderson --- target/arm/helper-sve.h| 7 +++ target/arm/sve_helper.c| 100 + target/arm/translate-sve.c | 24 + target/arm/sve.decode | 4 ++ 4 files changed, 135 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 891346a5ac..0bd9fe2f28 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -1092,6 +1092,13 @@ DEF_HELPER_FLAGS_6(sve_facgt_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_6(sve_facgt_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcadd_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcadd_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcadd_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32) DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32) DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 5309cf0866..ee7fc23bb9 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3629,6 +3629,106 @@ void HELPER(sve_ftmad_d)(void *vd, void *vn, void *vm, void *vs, uint32_t desc) } } +/* + * FP Complex Add + */ + +void HELPER(sve_fcadd_h)(void *vd, void *vn, void *vm, void *vg, + void *vs, uint32_t desc) +{ +intptr_t j, i = simd_oprsz(desc); +uint64_t *g = vg; +float16 neg_imag = float16_set_sign(0, simd_data(desc)); +float16 neg_real = float16_chs(neg_imag); + +do { +uint64_t pg = g[(i - 1) >> 6]; +do { +float16 e0, e1, e2, e3; + +/* I holds the real index; J holds the imag index. 
*/ +j = i - sizeof(float16); +i -= 2 * sizeof(float16); + +e0 = *(float16 *)(vn + H1_2(i)); +e1 = *(float16 *)(vm + H1_2(j)) ^ neg_real; +e2 = *(float16 *)(vn + H1_2(j)); +e3 = *(float16 *)(vm + H1_2(i)) ^ neg_imag; + +if (likely((pg >> (i & 63)) & 1)) { +*(float16 *)(vd + H1_2(i)) = float16_add(e0, e1, vs); +} +if (likely((pg >> (j & 63)) & 1)) { +*(float16 *)(vd + H1_2(j)) = float16_add(e2, e3, vs); +} +} while (i & 63); +} while (i != 0); +} + +void HELPER(sve_fcadd_s)(void *vd, void *vn, void *vm, void *vg, + void *vs, uint32_t desc) +{ +intptr_t j, i = simd_oprsz(desc); +uint64_t *g = vg; +float32 neg_imag = float32_set_sign(0, simd_data(desc)); +float32 neg_real = float32_chs(neg_imag); + +do { +uint64_t pg = g[(i - 1) >> 6]; +do { +float32 e0, e1, e2, e3; + +/* I holds the real index; J holds the imag index. */ +j = i - sizeof(float32); +i -= 2 * sizeof(float32); + +e0 = *(float32 *)(vn + H1_2(i)); +e1 = *(float32 *)(vm + H1_2(j)) ^ neg_real; +e2 = *(float32 *)(vn + H1_2(j)); +e3 = *(float32 *)(vm + H1_2(i)) ^ neg_imag; + +if (likely((pg >> (i & 63)) & 1)) { +*(float32 *)(vd + H1_2(i)) = float32_add(e0, e1, vs); +} +if (likely((pg >> (j & 63)) & 1)) { +*(float32 *)(vd + H1_2(j)) = float32_add(e2, e3, vs); +} +} while (i & 63); +} while (i != 0); +} + +void HELPER(sve_fcadd_d)(void *vd, void *vn, void *vm, void *vg, + void *vs, uint32_t desc) +{ +intptr_t j, i = simd_oprsz(desc); +uint64_t *g = vg; +float64 neg_imag = float64_set_sign(0, simd_data(desc)); +float64 neg_real = float64_chs(neg_imag); + +do { +uint64_t pg = g[(i - 1) >> 6]; +do { +float64 e0, e1, e2, e3; + +/* I holds the real index; J holds the imag index. 
*/ +j = i - sizeof(float64); +i -= 2 * sizeof(float64); + +e0 = *(float64 *)(vn + H1_2(i)); +e1 = *(float64 *)(vm + H1_2(j)) ^ neg_real; +e2 = *(float64 *)(vn + H1_2(j)); +e3 = *(float64 *)(vm + H1_2(i)) ^ neg_imag; + +if (likely((pg >> (i & 63)) & 1)) { +*(float64 *)(vd + H1_2(i)) = float64_add(e0, e1, vs); +} +if (likely((pg >> (j & 63)) & 1)) { +*(float64 *)(vd + H1_2(j)) = float64_add(e2, e3, vs); +} +} while (i & 63); +} while (i != 0); +} + /* * Load contiguous data, protected by a governing predicate. */ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 067c219b54..7a39be9bdd 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3895,6 +3895,30 @@ DO_FPCMP(FACGT, facgt) #undef DO_FPCMP
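The helpers implement the rotation by XOR-ing a sign-bit mask into the operand taken from Zm before a normal addition. The following sketch shows the same trick on one complex pair of host floats. It assumes float is IEEE-754 binary32, leaves out predication and the softfloat status argument, and uses invented names (xor_sign, fcadd_pair_sketch).

```c
#include <stdint.h>
#include <string.h>

/* Flip the sign of x when mask has the sign bit set (binary32 assumed). */
static float xor_sign(float x, uint32_t mask)
{
    uint32_t bits;

    memcpy(&bits, &x, sizeof(bits));
    bits ^= mask;
    memcpy(&x, &bits, sizeof(bits));
    return x;
}

/* One complex pair, stored as {real, imag}.
 * rot270 == 0: d = n + i*m (rotate m by 90 degrees)
 * rot270 == 1: d = n - i*m (rotate m by 270 degrees) */
static void fcadd_pair_sketch(float d[2], const float n[2], const float m[2],
                              int rot270)
{
    uint32_t neg_imag = rot270 ? 0x80000000u : 0;
    uint32_t neg_real = neg_imag ^ 0x80000000u;
    float re = n[0] + xor_sign(m[1], neg_real);
    float im = n[1] + xor_sign(m[0], neg_imag);

    d[0] = re;
    d[1] = im;
}
```

With n = 1 + 2i and m = 3 + 4i, the 90 degree form gives -3 + 5i and the 270 degree form gives 5 - 1i, matching n + i*m and n - i*m.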
[Qemu-devel] [PATCH v5 27/35] target/arm: Implement SVE MOVPRFX
Signed-off-by: Richard Henderson
---
 target/arm/translate-sve.c | 60 +-
 target/arm/sve.decode      |  7 +
 2 files changed, 66 insertions(+), 1 deletion(-)

diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 308c04de89..067c219b54 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -351,6 +351,23 @@ static bool do_zpzz_ool(DisasContext *s, arg_rprr_esz *a, gen_helper_gvec_4 *fn)
     return true;
 }
 
+/* Select active elements from Zn and inactive elements from Zm,
+ * storing the result in Zd.
+ */
+static void do_sel_z(DisasContext *s, int rd, int rn, int rm, int pg, int esz)
+{
+    static gen_helper_gvec_4 * const fns[4] = {
+        gen_helper_sve_sel_zpzz_b, gen_helper_sve_sel_zpzz_h,
+        gen_helper_sve_sel_zpzz_s, gen_helper_sve_sel_zpzz_d
+    };
+    unsigned vsz = vec_full_reg_size(s);
+    tcg_gen_gvec_4_ool(vec_full_reg_offset(s, rd),
+                       vec_full_reg_offset(s, rn),
+                       vec_full_reg_offset(s, rm),
+                       pred_full_reg_offset(s, pg),
+                       vsz, vsz, 0, fns[esz]);
+}
+
 #define DO_ZPZZ(NAME, name) \
 static bool trans_##NAME##_zpzz(DisasContext *s, arg_rprr_esz *a, \
                                 uint32_t insn)                    \
@@ -401,7 +418,13 @@ static bool trans_UDIV_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn)
     return do_zpzz_ool(s, a, fns[a->esz]);
 }
 
-DO_ZPZZ(SEL, sel)
+static bool trans_SEL_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn)
+{
+    if (sve_access_check(s)) {
+        do_sel_z(s, a->rd, a->rn, a->rm, a->pg, a->esz);
+    }
+    return true;
+}
 
 #undef DO_ZPZZ
 
@@ -5038,3 +5061,38 @@ static bool trans_PRF_rr(DisasContext *s, arg_PRF_rr *a, uint32_t insn)
     sve_access_check(s);
     return true;
 }
+
+/*
+ * Move Prefix
+ *
+ * TODO: The implementation so far could handle predicated merging movprfx.
+ * The helper functions as written take an extra source register to
+ * use in the operation, but the result is only written when predication
+ * succeeds. For unpredicated movprfx, we need to rearrange the helpers
+ * to allow the final write back to the destination to be unconditional.
+ * For predicated zeroing movprfx, we need to rearrange the helpers to
+ * allow the final write back to zero inactives.
+ *
+ * In the meantime, just emit the moves.
+ */
+
+static bool trans_MOVPRFX(DisasContext *s, arg_MOVPRFX *a, uint32_t insn)
+{
+    return do_mov_z(s, a->rd, a->rn);
+}
+
+static bool trans_MOVPRFX_m(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    if (sve_access_check(s)) {
+        do_sel_z(s, a->rd, a->rn, a->rd, a->pg, a->esz);
+    }
+    return true;
+}
+
+static bool trans_MOVPRFX_z(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    if (sve_access_check(s)) {
+        do_movz_zpz(s, a->rd, a->rn, a->pg, a->esz);
+    }
+    return true;
+}
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 94d7b157b4..85f2b39776 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -266,6 +266,10 @@ ORV 0100 .. 011 000 001 ... . . @rd_pg_rn
 EORV 0100 .. 011 001 001 ... . . @rd_pg_rn
 ANDV 0100 .. 011 010 001 ... . . @rd_pg_rn
 
+# SVE constructive prefix (predicated)
+MOVPRFX_z 0100 .. 010 000 001 ... . . @rd_pg_rn
+MOVPRFX_m 0100 .. 010 001 001 ... . . @rd_pg_rn
+
 # SVE integer add reduction (predicated)
 # Note that saddv requires size != 3.
 UADDV 0100 .. 000 001 001 ... . . @rd_pg_rn
@@ -414,6 +418,9 @@ ADR_p64 0100 11 1 . 1010 .. . . @rd_rn_msz_rm
 
 ### SVE Integer Misc - Unpredicated Group
 
+# SVE constructive prefix (unpredicated)
+MOVPRFX 0100 00 1 0 10 rn:5 rd:5
+
 # SVE floating-point exponential accelerator
 # Note esz != 0
 FEXPA 0100 .. 1 0 101110 . . @rd_rn
-- 
2.17.1
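The three MOVPRFX forms in this patch reduce to a plain move, a predicated select that merges into the destination, and a predicated move that zeroes inactive elements. The sketch below models the two predicated forms on arrays of 32-bit elements. It simplifies the predicate to one bool per element, whereas SVE predicates carry one bit per byte, and all function names are invented for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* SEL: take n where the predicate is true, m elsewhere (sketch). */
static void sel_sketch(uint32_t *d, const uint32_t *n, const uint32_t *m,
                       const bool *pg, size_t elems)
{
    for (size_t i = 0; i < elems; i++) {
        d[i] = pg[i] ? n[i] : m[i];
    }
}

/* Merging MOVPRFX: inactive elements keep the old destination value,
 * which is SEL with the destination also passed as the "else" source. */
static void movprfx_m_sketch(uint32_t *d, const uint32_t *n,
                             const bool *pg, size_t elems)
{
    sel_sketch(d, n, d, pg, elems);
}

/* Zeroing MOVPRFX: inactive elements become zero. */
static void movprfx_z_sketch(uint32_t *d, const uint32_t *n,
                             const bool *pg, size_t elems)
{
    for (size_t i = 0; i < elems; i++) {
        d[i] = pg[i] ? n[i] : 0;
    }
}
```

Passing the destination as the third operand is the same trick trans_MOVPRFX_m uses when it calls do_sel_z(s, a->rd, a->rn, a->rd, a->pg, a->esz).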
[Qemu-devel] [PATCH v2] hw/i386: Deprecate the machine types pc-0.10 and pc-0.11
The oldest machine type which is still used in a still maintained distro
is a pc-0.12 based machine type in RHEL6, so everything that is older
than pc-0.12 should not be used anymore. Thus let's deprecate pc-0.10
and pc-0.11 so that we can finally remove them in a future release.

Signed-off-by: Thomas Huth
---
 This is based on a patch that I already sent in 2017. But back then, we
 were still in the process of discussing our deprecation policies (e.g.
 automatic deprecation for old machine types), and there was no clear
 consensus whether we should deprecate 0.10 - 0.11, all 0.x or even up
 to version 1.2. After some iterations and too much discussion, I forgot
 about this patch. Anyway, I think we agreed that at least 0.10 and 0.11
 can certainly be removed nowadays, so let's finally get at least those
 two machine types marked as deprecated! If that works fine and we have
 finally removed these two types in v3.2, we can resume the discussion
 about newer machine types afterwards.

 Note: I don't want to add a QMP interface for this in this patch here,
 let's keep this small and simple! If we decide that we need a QMP
 interface, we can do that with a separate patch later.
 v2:
 - Renamed deprecation_msg to deprecation_reason
 - Added information about that field to the MachineClass comment

 hw/i386/pc_piix.c   | 2 ++
 include/hw/boards.h | 4 ++++
 qemu-doc.texi       | 5 +++++
 vl.c                | 9 +++++++--
 4 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index e9b6f06..b4fd164 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -956,6 +956,8 @@ static void pc_i440fx_0_11_machine_options(MachineClass *m)
 {
     pc_i440fx_0_12_machine_options(m);
     m->hw_version = "0.11";
+    m->deprecation_reason = "Old and unsupported machine version, "
+                            "use a newer machine type instead.";
     SET_MACHINE_COMPAT(m, PC_COMPAT_0_11);
 }
 
diff --git a/include/hw/boards.h b/include/hw/boards.h
index ef7457f..6926928 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -107,6 +107,9 @@ typedef struct {
 
 /**
  * MachineClass:
+ * @deprecation_reason: If set, the machine is marked as deprecated. The
+ *    string should give some information about why the machine is deprecated,
+ *    and must provide some clear information about what to use instead.
  * @max_cpus: maximum number of CPUs supported. Default: 1
  * @min_cpus: minimum number of CPUs supported. Default: 1
  * @default_cpus: number of CPUs instantiated if none are specified. Default: 1
@@ -166,6 +169,7 @@ struct MachineClass {
     char *name;
     const char *alias;
     const char *desc;
+    const char *deprecation_reason;
 
     void (*init)(MachineState *state);
     void (*reset)(void);
diff --git a/qemu-doc.texi b/qemu-doc.texi
index 282bc3d..f1dab7e 100644
--- a/qemu-doc.texi
+++ b/qemu-doc.texi
@@ -2943,6 +2943,11 @@ support page sizes < 4096 any longer.
 
 @section System emulator machines
 
+@subsection pc-0.10 and pc-0.11 (since 3.0)
+
+These machine types are very old and likely can not be used for live migration
+from old QEMU versions anymore. A newer machine type should be used instead.
+
 @section Device options
 
 @subsection Block device options
diff --git a/vl.c b/vl.c
index b3426e0..7eda6f0 100644
--- a/vl.c
+++ b/vl.c
@@ -2560,8 +2560,9 @@ static gint machine_class_cmp(gconstpointer a, gconstpointer b)
         if (mc->alias) {
             printf("%-20s %s (alias of %s)\n", mc->alias, mc->desc, mc->name);
         }
-        printf("%-20s %s%s\n", mc->name, mc->desc,
-               mc->is_default ? " (default)" : "");
+        printf("%-20s %s%s%s\n", mc->name, mc->desc,
+               mc->is_default ? " (default)" : "",
+               mc->deprecation_reason ? " (deprecated)" : "");
     }
 }
 
@@ -3953,6 +3954,10 @@ int main(int argc, char **argv, char **envp)
     }
 
     machine_class = select_machine();
+    if (machine_class->deprecation_reason) {
+        error_report("Machine type '%s' is deprecated: %s",
+                     machine_class->name, machine_class->deprecation_reason);
+    }
     set_memory_options(&ram_slots, &maxram_size, machine_class);
 
-- 
1.8.3.1
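
For readers who want to see the effect of the new field in isolation, here is a minimal standalone sketch of the two places the patch touches: the machine list line and the startup warning. It is not QEMU code. `DemoMachine`, `demo_format_machine()` and `demo_format_warning()` are hypothetical names made up for this note, and the sketch formats into a buffer where the patch calls printf() and error_report().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the few MachineClass fields the patch uses. */
typedef struct {
    const char *name;
    const char *desc;
    const char *deprecation_reason;   /* NULL means "not deprecated" */
    int is_default;
} DemoMachine;

/* Build one machine-list line, using the same format string as the patch. */
static int demo_format_machine(char *buf, size_t len, const DemoMachine *mc)
{
    assert(buf != NULL && mc != NULL);
    return snprintf(buf, len, "%-20s %s%s%s", mc->name, mc->desc,
                    mc->is_default ? " (default)" : "",
                    mc->deprecation_reason ? " (deprecated)" : "");
}

/*
 * Build the startup warning. Returns 0 and leaves buf empty when the
 * machine is not deprecated, otherwise the snprintf() result.
 */
static int demo_format_warning(char *buf, size_t len, const DemoMachine *mc)
{
    assert(buf != NULL && len > 0 && mc != NULL);
    buf[0] = '\0';
    if (!mc->deprecation_reason) {
        return 0;
    }
    return snprintf(buf, len, "Machine type '%s' is deprecated: %s",
                    mc->name, mc->deprecation_reason);
}
```

The point of keeping the reason as a plain string is that a NULL check doubles as the "is deprecated" flag, so no separate boolean is needed.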
[Qemu-devel] [PATCH v5 26/35] target/arm: Implement SVE floating-point unary operations
Signed-off-by: Richard Henderson --- target/arm/helper-sve.h| 14 ++ target/arm/sve_helper.c| 8 target/arm/translate-sve.c | 26 ++ target/arm/sve.decode | 4 4 files changed, 52 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 36168c5bb2..891346a5ac 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -999,6 +999,20 @@ DEF_HELPER_FLAGS_5(sve_frintx_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve_frintx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_frecpx_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_frecpx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_frecpx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_fsqrt_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fsqrt_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fsqrt_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG, diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 3b7a2f58c1..5309cf0866 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3270,6 +3270,14 @@ DO_ZPZ_FP(sve_frintx_h, uint16_t, H1_2, float16_round_to_int) DO_ZPZ_FP(sve_frintx_s, uint32_t, H1_4, float32_round_to_int) DO_ZPZ_FP(sve_frintx_d, uint64_t, , float64_round_to_int) +DO_ZPZ_FP(sve_frecpx_h, uint16_t, H1_2, helper_frecpx_f16) +DO_ZPZ_FP(sve_frecpx_s, uint32_t, H1_4, helper_frecpx_f32) +DO_ZPZ_FP(sve_frecpx_d, uint64_t, , helper_frecpx_f64) + +DO_ZPZ_FP(sve_fsqrt_h, uint16_t, H1_2, float16_sqrt) +DO_ZPZ_FP(sve_fsqrt_s, uint32_t, H1_4, float32_sqrt) +DO_ZPZ_FP(sve_fsqrt_d, uint64_t, , float64_sqrt) + DO_ZPZ_FP(sve_scvt_hh, uint16_t, H1_2, int16_to_float16) DO_ZPZ_FP(sve_scvt_sh, uint32_t, H1_4, int32_to_float16) DO_ZPZ_FP(sve_scvt_ss, uint32_t, H1_4, 
int32_to_float32) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 15e4f2888b..308c04de89 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -4117,6 +4117,32 @@ static bool trans_FRINTA(DisasContext *s, arg_rpr_esz *a, uint32_t insn) return do_frint_mode(s, a, float_round_ties_away); } +static bool trans_FRECPX(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ +static gen_helper_gvec_3_ptr * const fns[3] = { +gen_helper_sve_frecpx_h, +gen_helper_sve_frecpx_s, +gen_helper_sve_frecpx_d +}; +if (a->esz == 0) { +return false; +} +return do_zpz_ptr(s, a->rd, a->rn, a->pg, a->esz == MO_16, fns[a->esz - 1]); +} + +static bool trans_FSQRT(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ +static gen_helper_gvec_3_ptr * const fns[3] = { +gen_helper_sve_fsqrt_h, +gen_helper_sve_fsqrt_s, +gen_helper_sve_fsqrt_d +}; +if (a->esz == 0) { +return false; +} +return do_zpz_ptr(s, a->rd, a->rn, a->pg, a->esz == MO_16, fns[a->esz - 1]); +} + static bool trans_SCVTF_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) { return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_scvt_hh); diff --git a/target/arm/sve.decode b/target/arm/sve.decode index eeab3d485f..94d7b157b4 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -849,6 +849,10 @@ FRINTA 01100101 .. 000 100 101 ... . . @rd_pg_rn FRINTX 01100101 .. 000 110 101 ... . . @rd_pg_rn FRINTI 01100101 .. 000 111 101 ... . . @rd_pg_rn +# SVE floating-point unary operations +FRECPX 01100101 .. 001 100 101 ... . . @rd_pg_rn +FSQRT 01100101 .. 001 101 101 ... . . @rd_pg_rn + # SVE integer convert to floating-point SCVTF_hh01100101 01 010 01 0 101 ... . .@rd_pg_rn_e0 SCVTF_sh01100101 01 010 10 0 101 ... . .@rd_pg_rn_e0 -- 2.17.1
[Qemu-devel] [PATCH v5 31/35] target/arm: Implement SVE fp complex multiply add (indexed)
Enhance the existing helpers to support SVE, which takes the index from each 128-bit segment. The change has no effect for AdvSIMD, since there is only one such segment. Signed-off-by: Richard Henderson --- target/arm/translate-sve.c | 23 ++ target/arm/vec_helper.c| 50 +++--- target/arm/sve.decode | 6 + 3 files changed, 59 insertions(+), 20 deletions(-) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 6487fe760a..209a69cd76 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -4005,6 +4005,29 @@ static bool trans_FCMLA_zpzzz(DisasContext *s, return true; } +static bool trans_FCMLA_zzxz(DisasContext *s, arg_FCMLA_zzxz *a, uint32_t insn) +{ +static gen_helper_gvec_3_ptr * const fns[2] = { +gen_helper_gvec_fcmlah_idx, +gen_helper_gvec_fcmlas_idx, +}; + +tcg_debug_assert(a->esz == 1 || a->esz == 2); +tcg_debug_assert(a->rd == a->ra); +if (sve_access_check(s)) { +unsigned vsz = vec_full_reg_size(s); +TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16); +tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + status, vsz, vsz, + a->index * 4 + a->rot, + fns[a->esz - 1]); +tcg_temp_free_ptr(status); +} +return true; +} + /* *** SVE Floating Point Unary Operations Prediated Group */ diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index 8f2dc4b989..db5aeb9f24 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -319,22 +319,27 @@ void HELPER(gvec_fcmlah_idx)(void *vd, void *vn, void *vm, uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1); intptr_t index = extract32(desc, SIMD_DATA_SHIFT + 2, 2); uint32_t neg_real = flip ^ neg_imag; -uintptr_t i; -float16 e1 = m[H2(2 * index + flip)]; -float16 e3 = m[H2(2 * index + 1 - flip)]; +intptr_t elements = opr_sz / sizeof(float16); +intptr_t eltspersegment = 16 / sizeof(float16); +intptr_t i, j; /* Shift boolean to the sign bit so we can xor to negate. 
*/ neg_real <<= 15; neg_imag <<= 15; -e1 ^= neg_real; -e3 ^= neg_imag; -for (i = 0; i < opr_sz / 2; i += 2) { -float16 e2 = n[H2(i + flip)]; -float16 e4 = e2; +for (i = 0; i < elements; i += eltspersegment) { +float16 mr = m[H2(i + 2 * index + 0)]; +float16 mi = m[H2(i + 2 * index + 1)]; +float16 e1 = neg_real ^ (flip ? mi : mr); +float16 e3 = neg_imag ^ (flip ? mr : mi); -d[H2(i)] = float16_muladd(e2, e1, d[H2(i)], 0, fpst); -d[H2(i + 1)] = float16_muladd(e4, e3, d[H2(i + 1)], 0, fpst); +for (j = i; j < i + eltspersegment; j += 2) { +float16 e2 = n[H2(j + flip)]; +float16 e4 = e2; + +d[H2(j)] = float16_muladd(e2, e1, d[H2(j)], 0, fpst); +d[H2(j + 1)] = float16_muladd(e4, e3, d[H2(j + 1)], 0, fpst); +} } clear_tail(d, opr_sz, simd_maxsz(desc)); } @@ -380,22 +385,27 @@ void HELPER(gvec_fcmlas_idx)(void *vd, void *vn, void *vm, uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1); intptr_t index = extract32(desc, SIMD_DATA_SHIFT + 2, 2); uint32_t neg_real = flip ^ neg_imag; -uintptr_t i; -float32 e1 = m[H4(2 * index + flip)]; -float32 e3 = m[H4(2 * index + 1 - flip)]; +intptr_t elements = opr_sz / sizeof(float32); +intptr_t eltspersegment = 16 / sizeof(float32); +intptr_t i, j; /* Shift boolean to the sign bit so we can xor to negate. */ neg_real <<= 31; neg_imag <<= 31; -e1 ^= neg_real; -e3 ^= neg_imag; -for (i = 0; i < opr_sz / 4; i += 2) { -float32 e2 = n[H4(i + flip)]; -float32 e4 = e2; +for (i = 0; i < elements; i += eltspersegment) { +float32 mr = m[H4(i + 2 * index + 0)]; +float32 mi = m[H4(i + 2 * index + 1)]; +float32 e1 = neg_real ^ (flip ? mi : mr); +float32 e3 = neg_imag ^ (flip ? 
mr : mi); -d[H4(i)] = float32_muladd(e2, e1, d[H4(i)], 0, fpst); -d[H4(i + 1)] = float32_muladd(e4, e3, d[H4(i + 1)], 0, fpst); +for (j = i; j < i + eltspersegment; j += 2) { +float32 e2 = n[H4(j + flip)]; +float32 e4 = e2; + +d[H4(j)] = float32_muladd(e2, e1, d[H4(j)], 0, fpst); +d[H4(j + 1)] = float32_muladd(e4, e3, d[H4(j + 1)], 0, fpst); +} } clear_tail(d, opr_sz, simd_maxsz(desc)); } diff --git a/target/arm/sve.decode b/target/arm/sve.decode index da89697700..b578d104c4 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -729,6 +729,12 @@ FCADD 01100100 esz:2 0 rot:1 100 pg:3 rm:5 rd:5 \ FCMLA_zpzzz 01100100 esz:2 0 rm:5 0 rot:2 pg:3 rn:5
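
The per-segment indexing described in the commit message can be sketched outside of QEMU roughly as follows. This is only an illustration under stated assumptions: it uses host `float` arithmetic (not softfloat, and not a fused multiply-add), ignores host-endian element swizzling (the H4() macro), and `demo_fcmla_idx()` is a hypothetical name. The loop structure is meant to mirror the new gvec_fcmlas_idx: the indexed complex element of m is re-read for every 128-bit segment.

```c
#include <assert.h>
#include <stddef.h>

enum { DEMO_SEG_BYTES = 16 };   /* one 128-bit segment */

/*
 * d, n, m hold `elements` floats laid out as (real, imag) pairs.
 * `index` selects a complex pair within each segment of m;
 * `flip` and `neg_imag` play the same roles as in the patch.
 */
static void demo_fcmla_idx(float *d, const float *n, const float *m,
                           size_t elements, unsigned index,
                           int flip, int neg_imag)
{
    const size_t per_seg = DEMO_SEG_BYTES / sizeof(float);
    const int neg_real = flip ^ neg_imag;
    size_t i, j;

    assert(elements % per_seg == 0);
    assert(2 * (size_t)index + 1 < per_seg);

    for (i = 0; i < elements; i += per_seg) {
        /* The multiplier comes from the current segment, not segment 0. */
        float mr = m[i + 2 * index + 0];
        float mi = m[i + 2 * index + 1];
        float e1 = flip ? mi : mr;
        float e3 = flip ? mr : mi;

        if (neg_real) {
            e1 = -e1;
        }
        if (neg_imag) {
            e3 = -e3;
        }
        for (j = i; j < i + per_seg; j += 2) {
            float e2 = n[j + flip];

            d[j] += e2 * e1;
            d[j + 1] += e2 * e3;
        }
    }
}
```

With a single segment (the AdvSIMD case) the outer loop runs once and the result is the same as hoisting the multiplier out of the loop, which matches the "no effect for AdvSIMD" remark above.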
[Qemu-devel] [PATCH v5 21/35] target/arm: Implement SVE FP Compare with Zero Group
Signed-off-by: Richard Henderson --- target/arm/helper-sve.h| 42 + target/arm/sve_helper.c| 43 ++ target/arm/translate-sve.c | 43 ++ target/arm/sve.decode | 10 + 4 files changed, 138 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index ff69d143a0..44a98440c9 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -767,6 +767,48 @@ DEF_HELPER_FLAGS_5(sve_fadda_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve_fadda_d, TCG_CALL_NO_RWG, i64, i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmge0_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmge0_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmge0_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_fcmgt0_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmgt0_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmgt0_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_fcmlt0_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmlt0_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmlt0_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_fcmle0_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmle0_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmle0_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_fcmeq0_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmeq0_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmeq0_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_fcmne0_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmne0_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmne0_d, TCG_CALL_NO_RWG, + 
void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_6(sve_fadd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_6(sve_fadd_s, TCG_CALL_NO_RWG, diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index befea9ba54..06e963e9c2 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3362,6 +3362,8 @@ void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, \ #define DO_FCMGE(TYPE, X, Y, ST) TYPE##_compare(Y, X, ST) <= 0 #define DO_FCMGT(TYPE, X, Y, ST) TYPE##_compare(Y, X, ST) < 0 +#define DO_FCMLE(TYPE, X, Y, ST) TYPE##_compare(X, Y, ST) <= 0 +#define DO_FCMLT(TYPE, X, Y, ST) TYPE##_compare(X, Y, ST) < 0 #define DO_FCMEQ(TYPE, X, Y, ST) TYPE##_compare_quiet(X, Y, ST) == 0 #define DO_FCMNE(TYPE, X, Y, ST) TYPE##_compare_quiet(X, Y, ST) != 0 #define DO_FCMUO(TYPE, X, Y, ST) \ @@ -3385,6 +3387,47 @@ DO_FPCMP_PPZZ_ALL(sve_facgt, DO_FACGT) #undef DO_FPCMP_PPZZ_H #undef DO_FPCMP_PPZZ +/* One operand floating-point comparison against zero, controlled + * by a predicate. + */ +#define DO_FPCMP_PPZ0(NAME, TYPE, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vg,\ + void *status, uint32_t desc) \ +{ \ +intptr_t i = simd_oprsz(desc), j = (i - 1) >> 6; \ +uint64_t *d = vd, *g = vg; \ +do { \ +uint64_t out = 0, pg = g[j]; \ +do { \ +i -= sizeof(TYPE), out <<= sizeof(TYPE); \ +if ((pg >> (i & 63)) & 1) {\ +TYPE nn = *(TYPE *)(vn + H(i));\ +out |= OP(TYPE, nn, 0, status);\ +} \ +} while (i & 63); \ +d[j--] = out; \ +} while (i > 0); \ +} + +#define DO_FPCMP_PPZ0_H(NAME, OP) \ +DO_FPCMP_PPZ0(NAME##_h, float16, H1_2, OP) +#define DO_FPCMP_PPZ0_S(NAME, OP) \ +DO_FPCMP_PPZ0(NAME##_s, float32, H1_4, OP) +#define DO_FPCMP_PPZ0_D(NAME, OP) \ +DO_FPCMP_PPZ0(NAME##_d, float64, , OP) + +#define DO_FPCMP_PPZ0_ALL(NAME, OP) \ +DO_FPCMP_PPZ0_H(NAME, OP) \ +DO_FPCMP_PPZ0_S(NAME, OP) \ +
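
As a rough illustration of the predicate layout that DO_FPCMP_PPZ0 works with (one predicate bit per vector byte, with only the lowest bit of each element being significant), here is a simplified sketch for a single 64-bit predicate word and `float` elements. `demo_fcmgt0()` is a hypothetical name. Unlike the macro, it sets result bits directly instead of shifting `out`, and it ignores host-endian adjustments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Compare each active float element against zero (greater-than) and
 * return the result predicate. Bit i of pg governs the element that
 * starts at byte offset i; inactive elements produce a 0 bit.
 */
static uint64_t demo_fcmgt0(const float *vn, uint64_t pg, size_t oprsz_bytes)
{
    uint64_t out = 0;
    size_t i = oprsz_bytes;

    assert(oprsz_bytes <= 64 && oprsz_bytes % sizeof(float) == 0);

    while (i > 0) {
        i -= sizeof(float);
        if (((pg >> i) & 1) && vn[i / sizeof(float)] > 0.0f) {
            out |= (uint64_t)1 << i;
        }
    }
    return out;
}
```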
[Qemu-devel] [PATCH v5 35/35] target/arm: Implement ARMv8.2-DotProd
We've already added the helpers with an SVE patch, all that remains is to wire up the aa64 and aa32 translators. Enable the feature within -cpu max for CONFIG_USER_ONLY. Signed-off-by: Richard Henderson --- target/arm/cpu.h | 1 + linux-user/elfload.c | 1 + target/arm/cpu.c | 1 + target/arm/cpu64.c | 1 + target/arm/translate-a64.c | 36 + target/arm/translate.c | 81 ++ 6 files changed, 96 insertions(+), 25 deletions(-) diff --git a/target/arm/cpu.h b/target/arm/cpu.h index 8488273c5b..23098474e1 100644 --- a/target/arm/cpu.h +++ b/target/arm/cpu.h @@ -1480,6 +1480,7 @@ enum arm_features { ARM_FEATURE_V8_SM4, /* implements SM4 part of v8 Crypto Extensions */ ARM_FEATURE_V8_ATOMICS, /* ARMv8.1-Atomics feature */ ARM_FEATURE_V8_RDM, /* implements v8.1 simd round multiply */ +ARM_FEATURE_V8_DOTPROD, /* implements v8.2 simd dot product */ ARM_FEATURE_V8_FP16, /* implements v8.2 half-precision float */ ARM_FEATURE_V8_FCMA, /* has complex number part of v8.3 extensions. */ }; diff --git a/linux-user/elfload.c b/linux-user/elfload.c index 13bc78d0c8..bdb023b477 100644 --- a/linux-user/elfload.c +++ b/linux-user/elfload.c @@ -583,6 +583,7 @@ static uint32_t get_elf_hwcap(void) ARM_HWCAP_A64_FPHP | ARM_HWCAP_A64_ASIMDHP); GET_FEATURE(ARM_FEATURE_V8_ATOMICS, ARM_HWCAP_A64_ATOMICS); GET_FEATURE(ARM_FEATURE_V8_RDM, ARM_HWCAP_A64_ASIMDRDM); +GET_FEATURE(ARM_FEATURE_V8_DOTPROD, ARM_HWCAP_A64_ASIMDDP); GET_FEATURE(ARM_FEATURE_V8_FCMA, ARM_HWCAP_A64_FCMA); #undef GET_FEATURE diff --git a/target/arm/cpu.c b/target/arm/cpu.c index 8e4f4d8c21..95ac9c064d 100644 --- a/target/arm/cpu.c +++ b/target/arm/cpu.c @@ -1794,6 +1794,7 @@ static void arm_max_initfn(Object *obj) set_feature(>env, ARM_FEATURE_V8_PMULL); set_feature(>env, ARM_FEATURE_CRC); set_feature(>env, ARM_FEATURE_V8_RDM); +set_feature(>env, ARM_FEATURE_V8_DOTPROD); set_feature(>env, ARM_FEATURE_V8_FCMA); #endif } diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c index 0360d7efc5..3b4bc73ffa 100644 --- a/target/arm/cpu64.c 
+++ b/target/arm/cpu64.c @@ -250,6 +250,7 @@ static void aarch64_max_initfn(Object *obj) set_feature(>env, ARM_FEATURE_CRC); set_feature(>env, ARM_FEATURE_V8_ATOMICS); set_feature(>env, ARM_FEATURE_V8_RDM); +set_feature(>env, ARM_FEATURE_V8_DOTPROD); set_feature(>env, ARM_FEATURE_V8_FP16); set_feature(>env, ARM_FEATURE_V8_FCMA); set_feature(>env, ARM_FEATURE_SVE); diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 038e48278f..903d6233d3 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -640,6 +640,16 @@ static void gen_gvec_op3(DisasContext *s, bool is_q, int rd, vec_full_reg_size(s), gvec_op); } +/* Expand a 3-operand operation using an out-of-line helper. */ +static void gen_gvec_op3_ool(DisasContext *s, bool is_q, int rd, + int rn, int rm, int data, gen_helper_gvec_3 *fn) +{ +tcg_gen_gvec_3_ool(vec_full_reg_offset(s, rd), + vec_full_reg_offset(s, rn), + vec_full_reg_offset(s, rm), + is_q ? 16 : 8, vec_full_reg_size(s), data, fn); +} + /* Expand a 3-operand + env pointer operation using * an out-of-line helper. */ @@ -11336,6 +11346,14 @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn) } feature = ARM_FEATURE_V8_RDM; break; +case 0x02: /* SDOT (vector) */ +case 0x12: /* UDOT (vector) */ +if (size != MO_32) { +unallocated_encoding(s); +return; +} +feature = ARM_FEATURE_V8_DOTPROD; +break; case 0x8: /* FCMLA, #0 */ case 0x9: /* FCMLA, #90 */ case 0xa: /* FCMLA, #180 */ @@ -11389,6 +11407,11 @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn) } return; +case 0x2: /* SDOT / UDOT */ +gen_gvec_op3_ool(s, is_q, rd, rn, rm, 0, + u ? 
gen_helper_gvec_udot_b : gen_helper_gvec_sdot_b); +return; + case 0x8: /* FCMLA, #0 */ case 0x9: /* FCMLA, #90 */ case 0xa: /* FCMLA, #180 */ @@ -12568,6 +12591,13 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) return; } break; +case 0x0e: /* SDOT */ +case 0x1e: /* UDOT */ +if (size != MO_32 || !arm_dc_feature(s, ARM_FEATURE_V8_DOTPROD)) { +unallocated_encoding(s); +return; +} +break; case 0x11: /* FCMLA #0 */ case 0x13: /* FCMLA #90 */ case 0x15: /* FCMLA #180 */ @@ -12665,6 +12695,12 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) } switch (16 * u + opcode) { +case 0x0e: /*
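
For context, the operation that the translator hands off to gen_helper_gvec_sdot_b can be described with a small standalone sketch: each 32-bit lane accumulates the sum of four signed 8-bit products. The helper itself is not part of this patch, so the code below is an assumption-labelled illustration of the architectural behaviour, not the QEMU helper; `demo_sdot_b()` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Signed dot product: for each 32-bit lane, multiply four int8 pairs
 * and add the sum to the accumulator. The addition is done in uint32_t
 * so that wraparound does not invoke signed-overflow undefined behaviour.
 */
static void demo_sdot_b(int32_t *d, const int8_t *n, const int8_t *m,
                        size_t lanes)
{
    size_t i;

    assert(d != NULL && n != NULL && m != NULL);

    for (i = 0; i < lanes; i++) {
        int32_t sum = (int32_t)n[4 * i + 0] * m[4 * i + 0]
                    + (int32_t)n[4 * i + 1] * m[4 * i + 1]
                    + (int32_t)n[4 * i + 2] * m[4 * i + 2]
                    + (int32_t)n[4 * i + 3] * m[4 * i + 3];

        d[i] = (int32_t)((uint32_t)d[i] + (uint32_t)sum);
    }
}
```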
[Qemu-devel] [PATCH v5 24/35] target/arm: Implement SVE floating-point convert to integer
Signed-off-by: Richard Henderson --- target/arm/helper-sve.h| 30 + target/arm/helper.h| 12 +++--- target/arm/helper.c| 2 +- target/arm/sve_helper.c| 88 ++ target/arm/translate-sve.c | 70 ++ target/arm/sve.decode | 16 +++ 6 files changed, 211 insertions(+), 7 deletions(-) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 4c379dbb05..37fa9eb9bb 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -955,6 +955,36 @@ DEF_HELPER_FLAGS_5(sve_fcvt_hd, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve_fcvt_sd, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzs_hh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzs_hs, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzs_ss, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzs_ds, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzs_hd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzs_sd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzs_dd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_fcvtzu_hh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzu_hs, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzu_ss, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzu_ds, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzu_hd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzu_sd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzu_dd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG, diff --git a/target/arm/helper.h b/target/arm/helper.h index ad9cb6c7d5..8607077dda 100644 --- 
a/target/arm/helper.h +++ b/target/arm/helper.h @@ -134,12 +134,12 @@ DEF_HELPER_2(vfp_touid, i32, f64, ptr) DEF_HELPER_2(vfp_touizh, i32, f16, ptr) DEF_HELPER_2(vfp_touizs, i32, f32, ptr) DEF_HELPER_2(vfp_touizd, i32, f64, ptr) -DEF_HELPER_2(vfp_tosih, i32, f16, ptr) -DEF_HELPER_2(vfp_tosis, i32, f32, ptr) -DEF_HELPER_2(vfp_tosid, i32, f64, ptr) -DEF_HELPER_2(vfp_tosizh, i32, f16, ptr) -DEF_HELPER_2(vfp_tosizs, i32, f32, ptr) -DEF_HELPER_2(vfp_tosizd, i32, f64, ptr) +DEF_HELPER_2(vfp_tosih, s32, f16, ptr) +DEF_HELPER_2(vfp_tosis, s32, f32, ptr) +DEF_HELPER_2(vfp_tosid, s32, f64, ptr) +DEF_HELPER_2(vfp_tosizh, s32, f16, ptr) +DEF_HELPER_2(vfp_tosizs, s32, f32, ptr) +DEF_HELPER_2(vfp_tosizd, s32, f64, ptr) DEF_HELPER_3(vfp_toshs_round_to_zero, i32, f32, i32, ptr) DEF_HELPER_3(vfp_tosls_round_to_zero, i32, f32, i32, ptr) diff --git a/target/arm/helper.c b/target/arm/helper.c index 1248d84e6f..a36f5b1899 100644 --- a/target/arm/helper.c +++ b/target/arm/helper.c @@ -11360,7 +11360,7 @@ ftype HELPER(name)(uint32_t x, void *fpstp) \ } #define CONV_FTOI(name, ftype, fsz, sign, round)\ -uint32_t HELPER(name)(ftype x, void *fpstp) \ +sign##int32_t HELPER(name)(ftype x, void *fpstp)\ { \ float_status *fpst = fpstp; \ if (float##fsz##_is_any_nan(x)) { \ diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index e1bbe1f550..497efbf3a8 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3167,6 +3167,78 @@ static inline float16 sve_f64_to_f16(float64 f, float_status *s) return float64_to_float16(f, true, s); } +static inline int16_t vfp_float16_to_int16_rtz(float16 f, float_status *s) +{ +if (float16_is_any_nan(f)) { +float_raise(float_flag_invalid, s); +return 0; +} +return float16_to_int16_round_to_zero(f, s); +} + +static inline int64_t vfp_float16_to_int64_rtz(float16 f, float_status *s) +{ +if (float16_is_any_nan(f)) { +float_raise(float_flag_invalid, s); +return 0; +} +return float16_to_int64_round_to_zero(f, s); +} + +static inline int64_t 
vfp_float32_to_int64_rtz(float32 f, float_status *s) +{ +if (float32_is_any_nan(f)) { +float_raise(float_flag_invalid, s); +return 0; +} +return float32_to_int64_round_to_zero(f, s); +} + +static inline int64_t vfp_float64_to_int64_rtz(float64 f, float_status *s) +{ +if
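
The vfp_*_rtz wrappers above all follow one pattern: a NaN input raises Invalid and converts to 0, and everything else is left to the round-to-zero conversion. A host-float sketch of that pattern is below. `demo_float_to_int16_rtz()` and `demo_invalid_raised` are hypothetical names, and the explicit saturation branches are an assumption about what the underlying softfloat conversion does for out-of-range inputs; the patch itself only shows the NaN check.

```c
#include <stdint.h>

/* Hypothetical stand-in for raising float_flag_invalid in float_status. */
static int demo_invalid_raised;

static int16_t demo_float_to_int16_rtz(float f)
{
    if (f != f) {                 /* NaN is the only value unequal to itself */
        demo_invalid_raised = 1;
        return 0;
    }
    if (f >= 32768.0f) {          /* too large: saturate (assumed behaviour) */
        demo_invalid_raised = 1;
        return INT16_MAX;
    }
    if (f <= -32769.0f) {         /* too small: saturate (assumed behaviour) */
        demo_invalid_raised = 1;
        return INT16_MIN;
    }
    return (int16_t)f;            /* C conversion truncates toward zero */
}
```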
[Qemu-devel] [PATCH v5 25/35] target/arm: Implement SVE floating-point round to integral value
Signed-off-by: Richard Henderson --- target/arm/helper-sve.h| 14 +++ target/arm/sve_helper.c| 8 target/arm/translate-sve.c | 77 ++ target/arm/sve.decode | 9 + 4 files changed, 108 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 37fa9eb9bb..36168c5bb2 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -985,6 +985,20 @@ DEF_HELPER_FLAGS_5(sve_fcvtzu_sd, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve_fcvtzu_dd, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_frint_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_frint_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_frint_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_frintx_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_frintx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_frintx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG, diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 497efbf3a8..3b7a2f58c1 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3262,6 +3262,14 @@ DO_ZPZ_FP(sve_fcvtzu_sd, uint64_t, , vfp_float32_to_uint64_rtz) DO_ZPZ_FP(sve_fcvtzu_ds, uint64_t, , helper_vfp_touizd) DO_ZPZ_FP(sve_fcvtzu_dd, uint64_t, , vfp_float64_to_uint64_rtz) +DO_ZPZ_FP(sve_frint_h, uint16_t, H1_2, helper_advsimd_rinth) +DO_ZPZ_FP(sve_frint_s, uint32_t, H1_4, helper_rints) +DO_ZPZ_FP(sve_frint_d, uint64_t, , helper_rintd) + +DO_ZPZ_FP(sve_frintx_h, uint16_t, H1_2, float16_round_to_int) +DO_ZPZ_FP(sve_frintx_s, uint32_t, H1_4, float32_round_to_int) +DO_ZPZ_FP(sve_frintx_d, uint64_t, , float64_round_to_int) + DO_ZPZ_FP(sve_scvt_hh, uint16_t, H1_2, int16_to_float16) DO_ZPZ_FP(sve_scvt_sh, uint32_t, H1_4, int32_to_float16) DO_ZPZ_FP(sve_scvt_ss, 
uint32_t, H1_4, int32_to_float32) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index fd94c5337e..15e4f2888b 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -4040,6 +4040,83 @@ static bool trans_FCVTZU_dd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzu_dd); } +static gen_helper_gvec_3_ptr * const frint_fns[3] = { +gen_helper_sve_frint_h, +gen_helper_sve_frint_s, +gen_helper_sve_frint_d +}; + +static bool trans_FRINTI(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ +if (a->esz == 0) { +return false; +} +return do_zpz_ptr(s, a->rd, a->rn, a->pg, a->esz == MO_16, + frint_fns[a->esz - 1]); +} + +static bool trans_FRINTX(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ +static gen_helper_gvec_3_ptr * const fns[3] = { +gen_helper_sve_frintx_h, +gen_helper_sve_frintx_s, +gen_helper_sve_frintx_d +}; +if (a->esz == 0) { +return false; +} +return do_zpz_ptr(s, a->rd, a->rn, a->pg, a->esz == MO_16, fns[a->esz - 1]); +} + +static bool do_frint_mode(DisasContext *s, arg_rpr_esz *a, int mode) +{ +if (a->esz == 0) { +return false; +} +if (sve_access_check(s)) { +unsigned vsz = vec_full_reg_size(s); +TCGv_i32 tmode = tcg_const_i32(mode); +TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16); + +gen_helper_set_rmode(tmode, tmode, status); + +tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + pred_full_reg_offset(s, a->pg), + status, vsz, vsz, 0, frint_fns[a->esz - 1]); + +gen_helper_set_rmode(tmode, tmode, status); +tcg_temp_free_i32(tmode); +tcg_temp_free_ptr(status); +} +return true; +} + +static bool trans_FRINTN(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ +return do_frint_mode(s, a, float_round_nearest_even); +} + +static bool trans_FRINTP(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ +return do_frint_mode(s, a, float_round_up); +} + +static bool trans_FRINTM(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ 
+return do_frint_mode(s, a, float_round_down); +} + +static bool trans_FRINTZ(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ +return do_frint_mode(s, a, float_round_to_zero); +} + +static bool trans_FRINTA(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ +return do_frint_mode(s, a, float_round_ties_away); +} + static bool trans_SCVTF_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) { return do_zpz_ptr(s, a->rd, a->rn,
[Qemu-devel] [PATCH v5 19/35] target/arm: Implement SVE FP Fast Reduction Group
Signed-off-by: Richard Henderson --- target/arm/helper-sve.h| 35 ++ target/arm/sve_helper.c| 61 ++ target/arm/translate-sve.c | 57 +++ target/arm/sve.decode | 8 + 4 files changed, 161 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 087819ec2b..ff69d143a0 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -725,6 +725,41 @@ DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_faddv_h, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_faddv_s, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_faddv_d, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_fmaxnmv_h, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_fmaxnmv_s, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_fmaxnmv_d, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_fminnmv_h, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_fminnmv_s, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_fminnmv_d, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_fmaxv_h, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_fmaxv_s, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_fmaxv_d, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_fminv_h, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_fminv_s, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_fminv_d, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_fadda_h, TCG_CALL_NO_RWG, i64, i64, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_fadda_s, TCG_CALL_NO_RWG, diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index a40df62414..befea9ba54 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2852,6 +2852,67 @@ 
uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc) return predtest_ones(d, oprsz, esz_mask); } +/* Recursive reduction on a function; + * C.f. the ARM ARM function ReducePredicated. + * + * While it would be possible to write this without the DATA temporary, + * it is much simpler to process the predicate register this way. + * The recursion is bounded to depth 7 (128 fp16 elements), so there's + * little to gain with a more complex non-recursive form. + */ +#define DO_REDUCE(NAME, TYPE, H, FUNC, IDENT) \ +static TYPE NAME##_reduce(TYPE *data, float_status *status, uintptr_t n) \ +{ \ +if (n == 1) { \ +return *data; \ +} else { \ +uintptr_t half = n / 2; \ +TYPE lo = NAME##_reduce(data, status, half); \ +TYPE hi = NAME##_reduce(data + half, status, half); \ +return TYPE##_##FUNC(lo, hi, status); \ +} \ +} \ +uint64_t HELPER(NAME)(void *vn, void *vg, void *vs, uint32_t desc)\ +{ \ +uintptr_t i, oprsz = simd_oprsz(desc), maxsz = simd_maxsz(desc); \ +TYPE data[sizeof(ARMVectorReg) / sizeof(TYPE)]; \ +for (i = 0; i < oprsz; ) {\ +uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \ +do { \ +TYPE nn = *(TYPE *)(vn + H(i)); \ +*(TYPE *)((void *)data + i) = (pg & 1 ? nn : IDENT); \ +i += sizeof(TYPE), pg >>= sizeof(TYPE); \ +} while (i & 15); \ +} \ +for (; i < maxsz; i += sizeof(TYPE)) {\ +*(TYPE *)((void *)data + i) = IDENT; \ +} \ +return NAME##_reduce(data, vs, maxsz / sizeof(TYPE)); \ +} + +DO_REDUCE(sve_faddv_h,
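
The shape of DO_REDUCE may be easier to follow without the macro plumbing. The sketch below uses host `float` addition, counts sizes in elements instead of bytes, and takes the predicate as one byte per element; all of those are simplifications, and `demo_reduce()`, `demo_faddv()` and `DEMO_MAX_ELTS` are hypothetical names. What it keeps from the patch is the idea: copy active elements into a temporary, fill inactive and tail elements with the identity value, then reduce pairwise by recursion.

```c
#include <assert.h>
#include <stddef.h>

enum { DEMO_MAX_ELTS = 64 };

/* Pairwise (tree) reduction; n must be a power of two. */
static float demo_reduce(const float *data, size_t n)
{
    if (n == 1) {
        return data[0];
    } else {
        size_t half = n / 2;
        float lo = demo_reduce(data, half);
        float hi = demo_reduce(data + half, half);

        return lo + hi;
    }
}

/*
 * oprsz: number of elements actually in the vector.
 * maxsz: power-of-two size the reduction tree is built over.
 * active[i] != 0 marks element i as selected by the predicate.
 */
static float demo_faddv(const float *vn, const unsigned char *active,
                        size_t oprsz, size_t maxsz)
{
    float data[DEMO_MAX_ELTS];
    size_t i;

    assert(oprsz <= maxsz && maxsz <= DEMO_MAX_ELTS);
    assert(maxsz != 0 && (maxsz & (maxsz - 1)) == 0);

    for (i = 0; i < oprsz; i++) {
        data[i] = active[i] ? vn[i] : 0.0f;   /* 0 is the identity for add */
    }
    for (; i < maxsz; i++) {
        data[i] = 0.0f;                       /* pad the tail as well */
    }
    return demo_reduce(data, maxsz);
}
```

Padding to a power of two is what lets the recursion split evenly at every level, so the order of additions is fixed regardless of which elements are active.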
[Qemu-devel] [PATCH v5 22/35] target/arm: Implement SVE floating-point trig multiply-add coefficient
Signed-off-by: Richard Henderson --- target/arm/helper-sve.h| 4 +++ target/arm/sve_helper.c| 70 ++ target/arm/translate-sve.c | 27 +++ target/arm/sve.decode | 3 ++ 4 files changed, 104 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 44a98440c9..aca137fc37 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -1037,6 +1037,10 @@ DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32) DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32) DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_5(sve_ftmad_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_ftmad_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_ftmad_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(sve_ld1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ld2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ld3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 06e963e9c2..3dc35d8b12 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3428,6 +3428,76 @@ DO_FPCMP_PPZ0_ALL(sve_fcmlt0, DO_FCMLT) DO_FPCMP_PPZ0_ALL(sve_fcmeq0, DO_FCMEQ) DO_FPCMP_PPZ0_ALL(sve_fcmne0, DO_FCMNE) +/* FP Trig Multiply-Add. 
*/ + +void HELPER(sve_ftmad_h)(void *vd, void *vn, void *vm, void *vs, uint32_t desc) +{ +static const float16 coeff[16] = { +0x3c00, 0xb155, 0x2030, 0x, 0x, 0x, 0x, 0x, +0x3c00, 0xb800, 0x293a, 0x, 0x, 0x, 0x, 0x, +}; +intptr_t i, opr_sz = simd_oprsz(desc) / sizeof(float16); +intptr_t x = simd_data(desc); +float16 *d = vd, *n = vn, *m = vm; +for (i = 0; i < opr_sz; i++) { +float16 mm = m[i]; +intptr_t xx = x; +if (float16_is_neg(mm)) { +mm = float16_abs(mm); +xx += 8; +} +d[i] = float16_muladd(n[i], mm, coeff[xx], 0, vs); +} +} + +void HELPER(sve_ftmad_s)(void *vd, void *vn, void *vm, void *vs, uint32_t desc) +{ +static const float32 coeff[16] = { +0x3f80, 0xbe2b, 0x3c06, 0xb95008b9, +0x36369d6d, 0x, 0x, 0x, +0x3f80, 0xbf00, 0x3d26, 0xbab60705, +0x37cd37cc, 0x, 0x, 0x, +}; +intptr_t i, opr_sz = simd_oprsz(desc) / sizeof(float32); +intptr_t x = simd_data(desc); +float32 *d = vd, *n = vn, *m = vm; +for (i = 0; i < opr_sz; i++) { +float32 mm = m[i]; +intptr_t xx = x; +if (float32_is_neg(mm)) { +mm = float32_abs(mm); +xx += 8; +} +d[i] = float32_muladd(n[i], mm, coeff[xx], 0, vs); +} +} + +void HELPER(sve_ftmad_d)(void *vd, void *vn, void *vm, void *vs, uint32_t desc) +{ +static const float64 coeff[16] = { +0x3ff0ull, 0xbfc55543ull, +0x3f80f30cull, 0xbf2a01a019b92fc6ull, +0x3ec71de351f3d22bull, 0xbe5ae5e2b60f7b91ull, +0x3de5d8408868552full, 0xull, +0x3ff0ull, 0xbfe0ull, +0x3fa55536ull, 0xbf56c16c16c13a0bull, +0x3efa01a019b1e8d8ull, 0xbe927e4f7282f468ull, +0x3e21ee96d2641b13ull, 0xbda8f76380fbb401ull, +}; +intptr_t i, opr_sz = simd_oprsz(desc) / sizeof(float64); +intptr_t x = simd_data(desc); +float64 *d = vd, *n = vn, *m = vm; +for (i = 0; i < opr_sz; i++) { +float64 mm = m[i]; +intptr_t xx = x; +if (float64_is_neg(mm)) { +mm = float64_abs(mm); +xx += 8; +} +d[i] = float64_muladd(n[i], mm, coeff[xx], 0, vs); +} +} + /* * Load contiguous data, protected by a governing predicate. 
*/ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 0c55501935..fb225d56a1 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3597,6 +3597,33 @@ DO_PPZ(FCMNE_ppz0, fcmne0) #undef DO_PPZ +/* + *** SVE floating-point trig multiply-add coefficient + */ + +static bool trans_FTMAD(DisasContext *s, arg_FTMAD *a, uint32_t insn) +{ +static gen_helper_gvec_3_ptr * const fns[3] = { +gen_helper_sve_ftmad_h, +gen_helper_sve_ftmad_s, +gen_helper_sve_ftmad_d, +}; + +if (a->esz == 0) { +return false; +} +if (sve_access_check(s)) { +unsigned vsz = vec_full_reg_size(s); +TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16); +tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + status, vsz, vsz, a->imm, fns[a->esz - 1]); +tcg_temp_free_ptr(status); +
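As a reading aid for the FTMAD helpers above, a minimal sketch of the per-element step: the immediate selects a coefficient, and a negative multiplier is replaced by its absolute value while the index moves to the second half of the table. This assumes native float, an unfused multiply-add (the patch uses the softfloat fused muladd), and a placeholder table; demo_coeff and ftmad_element are hypothetical names, and the table values are not the architectural coefficients.

```c
#include <assert.h>
#include <math.h>

/*
 * Placeholder table, NOT the real coefficients: entries 0..7 are used
 * for a non-negative multiplier, entries 8..15 for a negative one.
 */
static const float demo_coeff[16] = {
    0.0f, 1.0f, 2.0f,  3.0f,  4.0f,  5.0f,  6.0f,  7.0f,
    8.0f, 9.0f, 10.0f, 11.0f, 12.0f, 13.0f, 14.0f, 15.0f,
};

/* One element of the operation: n * |m| + coeff[imm (+ 8 if m < 0)]. */
float ftmad_element(float n, float m, unsigned imm)
{
    unsigned idx = imm;

    assert(imm < 8);
    if (signbit(m)) {
        m = -m;         /* take the absolute value */
        idx += 8;       /* switch to the second half of the table */
    }
    /* Unfused here for simplicity; the helper uses a fused muladd. */
    return n * m + demo_coeff[idx];
}
```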
[Qemu-devel] [PATCH v5 23/35] target/arm: Implement SVE floating-point convert precision
Signed-off-by: Richard Henderson --- target/arm/helper-sve.h| 13 + target/arm/sve_helper.c| 27 +++ target/arm/translate-sve.c | 30 ++ target/arm/sve.decode | 8 4 files changed, 78 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index aca137fc37..4c379dbb05 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -942,6 +942,19 @@ DEF_HELPER_FLAGS_6(sve_fmins_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_6(sve_fmins_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvt_sh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvt_dh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvt_hs, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvt_ds, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvt_hd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvt_sd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG, diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 3dc35d8b12..e1bbe1f550 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3147,6 +3147,33 @@ void HELPER(NAME)(void *vd, void *vn, void *vg, void *status, uint32_t desc) \ } while (i != 0); \ } +static inline float32 sve_f16_to_f32(float16 f, float_status *s) +{ +return float16_to_float32(f, true, s); +} + +static inline float64 sve_f16_to_f64(float16 f, float_status *s) +{ +return float16_to_float64(f, true, s); +} + +static inline float16 sve_f32_to_f16(float32 f, float_status *s) +{ +return float32_to_float16(f, true, s); +} + +static inline float16 sve_f64_to_f16(float64 f, float_status *s) +{ +return float64_to_float16(f, true, s); +} + +DO_ZPZ_FP(sve_fcvt_sh, uint32_t, H1_4, sve_f32_to_f16) +DO_ZPZ_FP(sve_fcvt_hs, uint32_t, H1_4, sve_f16_to_f32) 
+DO_ZPZ_FP(sve_fcvt_dh, uint64_t, , sve_f64_to_f16) +DO_ZPZ_FP(sve_fcvt_hd, uint64_t, , sve_f16_to_f64) +DO_ZPZ_FP(sve_fcvt_ds, uint64_t, , float64_to_float32) +DO_ZPZ_FP(sve_fcvt_sd, uint64_t, , float32_to_float64) + DO_ZPZ_FP(sve_scvt_hh, uint16_t, H1_2, int16_to_float16) DO_ZPZ_FP(sve_scvt_sh, uint32_t, H1_4, int32_to_float16) DO_ZPZ_FP(sve_scvt_ss, uint32_t, H1_4, int32_to_float32) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index fb225d56a1..c08c1263f1 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3940,6 +3940,36 @@ static bool do_zpz_ptr(DisasContext *s, int rd, int rn, int pg, return true; } +static bool trans_FCVT_sh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ +return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvt_sh); +} + +static bool trans_FCVT_hs(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ +return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvt_hs); +} + +static bool trans_FCVT_dh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ +return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvt_dh); +} + +static bool trans_FCVT_hd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ +return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvt_hd); +} + +static bool trans_FCVT_ds(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ +return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvt_ds); +} + +static bool trans_FCVT_sd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ +return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvt_sd); +} + static bool trans_SCVTF_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) { return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_scvt_hh); diff --git a/target/arm/sve.decode b/target/arm/sve.decode index e29c598783..fd45f51029 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -816,6 +816,14 @@ FNMLS_zpzzz 01100101 .. 1 . 111 ... . . 
@rdn_pg_rm_ra ### SVE FP Unary Operations Predicated Group +# SVE floating-point convert precision +FCVT_sh 01100101 10 0010 00 101 ... . . @rd_pg_rn_e0 +FCVT_hs 01100101 10 0010 01 101 ... . . @rd_pg_rn_e0 +FCVT_dh 01100101 11 0010 00 101 ... . . @rd_pg_rn_e0 +FCVT_hd 01100101 11 0010 01 101 ... . . @rd_pg_rn_e0 +FCVT_ds 01100101 11 0010 10 101 ... . . @rd_pg_rn_e0 +FCVT_sd 01100101 11 0010 11 101 ... . . @rd_pg_rn_e0 + # SVE integer convert to floating-point SCVTF_hh
[Qemu-devel] [PATCH v5 16/35] target/arm: Implement SVE floating-point compare vectors
Signed-off-by: Richard Henderson --- target/arm/helper-sve.h| 49 ++ target/arm/sve_helper.c| 62 ++ target/arm/translate-sve.c | 40 target/arm/sve.decode | 11 +++ 4 files changed, 162 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 55e8a908d4..6089b3a53f 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -839,6 +839,55 @@ DEF_HELPER_FLAGS_5(sve_ucvt_ds, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve_ucvt_dd, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcmge_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcmge_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcmge_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fcmgt_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcmgt_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcmgt_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fcmeq_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcmeq_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcmeq_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fcmne_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcmne_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcmne_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fcmuo_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcmuo_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcmuo_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_facge_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_facge_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, 
ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_facge_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_facgt_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_facgt_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_facgt_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32) DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32) DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 38f7cc2274..2fd34d722b 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3193,6 +3193,68 @@ void HELPER(sve_fnmls_zpzzz_d)(CPUARMState *env, void *vg, uint32_t desc) do_fmla_zpzzz_d(env, vg, desc, 0, INT64_MIN); } +/* Two operand floating-point comparison controlled by a predicate. + * Unlike the integer version, we are not allowed to optimistically + * compare operands, since the comparison may have side effects wrt + * the FPSR. + */ +#define DO_FPCMP_PPZZ(NAME, TYPE, H, OP)\ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, \ + void *status, uint32_t desc) \ +{ \ +intptr_t i = simd_oprsz(desc), j = (i - 1) >> 6;\ +uint64_t *d = vd, *g = vg; \ +do {\ +uint64_t out = 0, pg = g[j];\ +do {\ +i -= sizeof(TYPE), out <<= sizeof(TYPE);\ +if (likely((pg >> (i & 63)) & 1)) { \ +TYPE nn = *(TYPE *)(vn + H(i)); \ +TYPE mm = *(TYPE *)(vm + H(i)); \ +out |= OP(TYPE, nn, mm, status);\ +} \ +} while (i & 63); \ +d[j--] = out; \ +} while (i > 0);\
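To illustrate the point made in the DO_FPCMP_PPZZ comment above, a minimal sketch of a predicated compare that only evaluates active elements, since an FP comparison may raise exception flags. It assumes native float, a single 64-bit predicate word, and a little-endian element layout; fp_compare_pred and cmp_ge are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int (*fp_cmp_fn)(float a, float b);

int cmp_ge(float a, float b)
{
    return a >= b;
}

/*
 * Compare up to 16 float elements (one 64-bit predicate word).
 * Predicate bits are per byte, so element i is governed by bit 4 * i,
 * and the result for element i is written to that same bit.
 */
uint64_t fp_compare_pred(const float *n, const float *m, uint64_t pg,
                         size_t count, fp_cmp_fn cmp)
{
    uint64_t out = 0;

    assert(count <= 16);
    for (size_t i = 0; i < count; i++) {
        unsigned bit = (unsigned)(i * sizeof(float));

        if ((pg >> bit) & 1) {
            /* Inactive elements are never passed to cmp(). */
            out |= (uint64_t)(cmp(n[i], m[i]) != 0) << bit;
        }
    }
    return out;
}
```

An integer compare could evaluate every lane and mask afterwards; here the test of the predicate bit has to come first.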
[Qemu-devel] [PATCH v5 20/35] target/arm: Implement SVE Floating Point Unary Operations - Unpredicated Group
Signed-off-by: Richard Henderson
---
 target/arm/helper.h        |  8 +++
 target/arm/translate-sve.c | 47 ++
 target/arm/vec_helper.c    | 20
 target/arm/sve.decode      |  5
 4 files changed, 80 insertions(+)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index 56439ac1e4..ad9cb6c7d5 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -601,6 +601,14 @@ DEF_HELPER_FLAGS_5(gvec_fcmlas_idx, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(gvec_fcmlad, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_4(gvec_frecpe_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_frecpe_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_frecpe_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(gvec_frsqrte_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_frsqrte_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_frsqrte_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(gvec_fadd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(gvec_fadd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(gvec_fadd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 47d64f2fc7..d7957cddbd 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -3507,6 +3507,53 @@ DO_VPZ(FMAXNMV, fmaxnmv)
 DO_VPZ(FMINV, fminv)
 DO_VPZ(FMAXV, fmaxv)
 
+/*
+ *** SVE Floating Point Unary Operations - Unpredicated Group
+ */
+
+static void do_zz_fp(DisasContext *s, arg_rr_esz *a, gen_helper_gvec_2_ptr *fn)
+{
+    unsigned vsz = vec_full_reg_size(s);
+    TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16);
+
+    tcg_gen_gvec_2_ptr(vec_full_reg_offset(s, a->rd),
+                       vec_full_reg_offset(s, a->rn),
+                       status, vsz, vsz, 0, fn);
+    tcg_temp_free_ptr(status);
+}
+
+static bool trans_FRECPE(DisasContext *s, arg_rr_esz *a, uint32_t insn)
+{
+    static gen_helper_gvec_2_ptr * const fns[3] = {
+        gen_helper_gvec_frecpe_h,
+        gen_helper_gvec_frecpe_s,
+        gen_helper_gvec_frecpe_d,
+    };
+    if (a->esz == 0) {
+        return false;
+    }
+    if (sve_access_check(s)) {
+        do_zz_fp(s, a, fns[a->esz - 1]);
+    }
+    return true;
+}
+
+static bool trans_FRSQRTE(DisasContext *s, arg_rr_esz *a, uint32_t insn)
+{
+    static gen_helper_gvec_2_ptr * const fns[3] = {
+        gen_helper_gvec_frsqrte_h,
+        gen_helper_gvec_frsqrte_s,
+        gen_helper_gvec_frsqrte_d,
+    };
+    if (a->esz == 0) {
+        return false;
+    }
+    if (sve_access_check(s)) {
+        do_zz_fp(s, a, fns[a->esz - 1]);
+    }
+    return true;
+}
+
 /*
  *** SVE Floating Point Accumulating Reduction Group
  */
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index 97af75a61b..073e5c58e7 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -427,6 +427,26 @@ void HELPER(gvec_fcmlad)(void *vd, void *vn, void *vm,
     clear_tail(d, opr_sz, simd_maxsz(desc));
 }
 
+#define DO_2OP(NAME, FUNC, TYPE) \
+void HELPER(NAME)(void *vd, void *vn, void *stat, uint32_t desc)  \
+{                                                                 \
+    intptr_t i, oprsz = simd_oprsz(desc);                         \
+    TYPE *d = vd, *n = vn;                                        \
+    for (i = 0; i < oprsz / sizeof(TYPE); i++) {                  \
+        d[i] = FUNC(n[i], stat);                                  \
+    }                                                             \
+}
+
+DO_2OP(gvec_frecpe_h, helper_recpe_f16, float16)
+DO_2OP(gvec_frecpe_s, helper_recpe_f32, float32)
+DO_2OP(gvec_frecpe_d, helper_recpe_f64, float64)
+
+DO_2OP(gvec_frsqrte_h, helper_rsqrte_f16, float16)
+DO_2OP(gvec_frsqrte_s, helper_rsqrte_f32, float32)
+DO_2OP(gvec_frsqrte_d, helper_rsqrte_f64, float64)
+
+#undef DO_2OP
+
 /* Floating-point trigonometric starting value.
  * See the ARM ARM pseudocode function FPTrigSMul.
  */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 39a803621f..191be9463d 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -739,6 +739,11 @@ FMINNMV 01100101 .. 000 101 001 ... . . @rd_pg_rn
 FMAXV 01100101 .. 000 110 001 ... . . @rd_pg_rn
 FMINV 01100101 .. 000 111 001 ... . . @rd_pg_rn
 
+## SVE Floating Point Unary Operations - Unpredicated Group
+
+FRECPE 01100101 .. 001 110 001100 . . @rd_rn
+FRSQRTE 01100101 .. 001 111 001100 . . @rd_rn
+
 ### SVE FP Accumulating Reduction Group
 
 # SVE floating-point serial reduction (predicated)
--
2.17.1
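A minimal standalone sketch of the DO_2OP pattern from this patch, for readers who want to try it outside QEMU: one macro stamps out an element-wise unary helper per element type, and the loop bound is the operand size in bytes divided by the element size. It assumes native float and double in place of softfloat types, drops the float_status argument, and uses a halving operation as a stand-in for the reciprocal estimates; DEMO_2OP, demo_half_s and demo_half_d are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Generate an unpredicated, element-wise unary helper.
 * oprsz is the operand size in bytes, as with simd_oprsz() in the patch.
 * EXPR is evaluated with x bound to the current input element.
 */
#define DEMO_2OP(NAME, TYPE, EXPR)                          \
void NAME(void *vd, const void *vn, size_t oprsz)           \
{                                                           \
    TYPE *d = vd;                                           \
    const TYPE *n = vn;                                     \
    for (size_t i = 0; i < oprsz / sizeof(TYPE); i++) {     \
        TYPE x = n[i];                                      \
        d[i] = (EXPR);                                      \
    }                                                       \
}

/* Stand-in operations; the patch plugs in helper_recpe_f32 and friends. */
DEMO_2OP(demo_half_s, float, x * 0.5f)
DEMO_2OP(demo_half_d, double, x * 0.5)

#undef DEMO_2OP
```

Each element is read before it is written, so calling a helper with vd equal to vn works in place.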
[Qemu-devel] [PATCH v5 18/35] target/arm: Implement SVE Floating Point Multiply Indexed Group
Signed-off-by: Richard Henderson --- target/arm/helper.h| 14 +++ target/arm/translate-sve.c | 50 ++ target/arm/vec_helper.c| 48 target/arm/sve.decode | 19 +++ 4 files changed, 131 insertions(+) diff --git a/target/arm/helper.h b/target/arm/helper.h index 879a7229e9..56439ac1e4 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -620,6 +620,20 @@ DEF_HELPER_FLAGS_5(gvec_ftsmul_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(gvec_ftsmul_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_fmul_idx_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_fmul_idx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_fmul_idx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(gvec_fmla_idx_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(gvec_fmla_idx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(gvec_fmla_idx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + #ifdef TARGET_AARCH64 #include "helper-a64.h" #include "helper-sve.h" diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 12cfadf4e9..e4ba84cadd 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3400,6 +3400,56 @@ DO_ZZI(UMIN, umin) #undef DO_ZZI +/* + *** SVE Floating Point Multiply-Add Indexed Group + */ + +static bool trans_FMLA_zzxz(DisasContext *s, arg_FMLA_zzxz *a, uint32_t insn) +{ +static gen_helper_gvec_4_ptr * const fns[3] = { +gen_helper_gvec_fmla_idx_h, +gen_helper_gvec_fmla_idx_s, +gen_helper_gvec_fmla_idx_d, +}; + +if (sve_access_check(s)) { +unsigned vsz = vec_full_reg_size(s); +TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16); +tcg_gen_gvec_4_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vec_full_reg_offset(s, a->ra), + status, vsz, vsz, a->index * 2 + a->sub, + fns[a->esz - 1]); +tcg_temp_free_ptr(status); +} +return true; +} + 
+/* + *** SVE Floating Point Multiply Indexed Group + */ + +static bool trans_FMUL_zzx(DisasContext *s, arg_FMUL_zzx *a, uint32_t insn) +{ +static gen_helper_gvec_3_ptr * const fns[3] = { +gen_helper_gvec_fmul_idx_h, +gen_helper_gvec_fmul_idx_s, +gen_helper_gvec_fmul_idx_d, +}; + +if (sve_access_check(s)) { +unsigned vsz = vec_full_reg_size(s); +TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16); +tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + status, vsz, vsz, a->index, fns[a->esz - 1]); +tcg_temp_free_ptr(status); +} +return true; +} + /* *** SVE Floating Point Accumulating Reduction Group */ diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index f504dd53c8..97af75a61b 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -495,3 +495,51 @@ DO_3OP(gvec_rsqrts_d, helper_rsqrtsf_f64, float64) #endif #undef DO_3OP + +/* For the indexed ops, SVE applies the index per 128-bit vector segment. + * For AdvSIMD, there is of course only one such vector segment. + */ + +#define DO_MUL_IDX(NAME, TYPE, H) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \ +{ \ +intptr_t i, j, oprsz = simd_oprsz(desc), segment = 16 / sizeof(TYPE); \ +intptr_t idx = simd_data(desc);\ +TYPE *d = vd, *n = vn, *m = vm;\ +for (i = 0; i < oprsz / sizeof(TYPE); i += segment) { \ +TYPE mm = m[H(i + idx)]; \ +for (j = 0; j < segment; j++) {\ +d[i + j] = TYPE##_mul(n[i + j], mm, stat); \ +} \ +} \ +} + +DO_MUL_IDX(gvec_fmul_idx_h, float16, H2) +DO_MUL_IDX(gvec_fmul_idx_s, float32, H4) +DO_MUL_IDX(gvec_fmul_idx_d, float64, ) + +#undef DO_MUL_IDX + +#define DO_FMLA_IDX(NAME, TYPE, H) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *va, \ + void *stat, uint32_t desc) \ +{
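A minimal sketch of the per-segment indexing described in the DO_MUL_IDX comment above: the index picks one element out of each 128-bit segment of the second operand, and that element multiplies every element of the same segment of the first operand. It assumes native float and a little-endian host, so the H() adjustment is omitted; fmul_indexed is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/*
 * d[i] = n[i] * m[segment_base(i) + idx], with 128-bit segments.
 * oprsz is the operand size in bytes and must be a multiple of 16.
 */
void fmul_indexed(float *d, const float *n, const float *m,
                  size_t oprsz, size_t idx)
{
    const size_t segment = 16 / sizeof(float);  /* 4 floats per 128 bits */

    assert(idx < segment && oprsz % 16 == 0);
    for (size_t i = 0; i < oprsz / sizeof(float); i += segment) {
        /* Read the multiplier first, in case d aliases m. */
        float mm = m[i + idx];

        for (size_t j = 0; j < segment; j++) {
            d[i + j] = n[i + j] * mm;
        }
    }
}
```

With a 16-byte operand this degenerates to the AdvSIMD case, where there is only one segment.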
[Qemu-devel] [PATCH v5 12/35] target/arm: Implement SVE prefetches
Signed-off-by: Richard Henderson
---
 target/arm/translate-sve.c | 21 +
 target/arm/sve.decode      | 23 +++
 2 files changed, 44 insertions(+)

diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 6e1907cedd..c054e3268b 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -4302,3 +4302,24 @@ static bool trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn)
                cpu_reg_sp(s, a->rn), fn);
     return true;
 }
+
+/*
+ * Prefetches
+ */
+
+static bool trans_PRF(DisasContext *s, arg_PRF *a, uint32_t insn)
+{
+    /* Prefetch is a nop within QEMU. */
+    sve_access_check(s);
+    return true;
+}
+
+static bool trans_PRF_rr(DisasContext *s, arg_PRF_rr *a, uint32_t insn)
+{
+    if (a->rm == 31) {
+        return false;
+    }
+    /* Prefetch is a nop within QEMU. */
+    sve_access_check(s);
+    return true;
+}
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 2ca0fd85e6..a20f98b70c 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -793,6 +793,29 @@ LD1RQ_zprr 1010010 .. 00 . 000 ... . . \
 LD1RQ_zpri 1010010 .. 00 0 001 ... . . \
     @rpri_load_msz nreg=0
 
+# SVE 32-bit gather prefetch (scalar plus 32-bit scaled offsets)
+PRF 110 00 -1 - 0-- --- - 0
+
+# SVE 32-bit gather prefetch (vector plus immediate)
+PRF 110 -- 00 - 111 --- - 0
+
+# SVE contiguous prefetch (scalar plus immediate)
+PRF 110 11 1- - 0-- --- - 0
+
+# SVE contiguous prefetch (scalar plus scalar)
+PRF_rr 110 -- 00 rm:5 110 --- - 0
+
+### SVE Memory 64-bit Gather Group
+
+# SVE 64-bit gather prefetch (scalar plus 64-bit scaled offsets)
+PRF 1100010 00 11 - 1-- --- - 0
+
+# SVE 64-bit gather prefetch (scalar plus unpacked 32-bit scaled offsets)
+PRF 1100010 00 -1 - 0-- --- - 0
+
+# SVE 64-bit gather prefetch (vector plus immediate)
+PRF 1100010 -- 00 - 111 --- - 0
+
 ### SVE Memory Store Group
 
 # SVE store predicate register
--
2.17.1
[Qemu-devel] [PATCH v5 14/35] target/arm: Implement SVE first-fault gather loads
Signed-off-by: Richard Henderson --- target/arm/helper-sve.h| 67 target/arm/sve_helper.c| 88 ++ target/arm/translate-sve.c | 126 - 3 files changed, 236 insertions(+), 45 deletions(-) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index aeb62afc34..55e8a908d4 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -1026,6 +1026,73 @@ DEF_HELPER_FLAGS_6(sve_ldhds_zd, TCG_CALL_NO_WG, DEF_HELPER_FLAGS_6(sve_ldsds_zd, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffbsu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffhsu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffssu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffbss_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffhss_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_ldffbsu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffhsu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffssu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffbss_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffhss_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_ldffbdu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffhdu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffsdu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffddu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffbds_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffhds_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffsds_zsu, TCG_CALL_NO_WG, + 
void, env, ptr, ptr, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_ldffbdu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffhdu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffsdu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffddu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffbds_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffhds_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffsds_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_ldffbdu_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffhdu_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffsdu_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffddu_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffbds_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffhds_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffsds_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + DEF_HELPER_FLAGS_6(sve_stbs_zsu, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr, tl, i32) DEF_HELPER_FLAGS_6(sve_sths_zsu, TCG_CALL_NO_WG, diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 9b99718156..38f7cc2274 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3790,6 +3790,94 @@ DO_LD1_ZPZ_D(sve_ldbds_zd, uint64_t, int8_t, cpu_ldub_data_ra) DO_LD1_ZPZ_D(sve_ldhds_zd, uint64_t, int16_t, cpu_lduw_data_ra) DO_LD1_ZPZ_D(sve_ldsds_zd, uint64_t, int32_t, cpu_ldl_data_ra) +/* First fault loads with a vector index. 
*/ + +#ifdef CONFIG_USER_ONLY + +#define DO_LDFF1_ZPZ(NAME, TYPEE, TYPEI, TYPEM, FN, H) \ +void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm, \ + target_ulong base, uint32_t desc) \ +{ \ +intptr_t i, oprsz = simd_oprsz(desc); \ +unsigned scale = simd_data(desc);
[Qemu-devel] [PATCH v5 17/35] target/arm: Implement SVE floating-point arithmetic with immediate
Signed-off-by: Richard Henderson --- target/arm/helper-sve.h| 56 target/arm/sve_helper.c| 69 +++ target/arm/translate-sve.c | 75 ++ target/arm/sve.decode | 14 +++ 4 files changed, 214 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 6089b3a53f..087819ec2b 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -809,6 +809,62 @@ DEF_HELPER_FLAGS_6(sve_fmulx_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_6(sve_fmulx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fadds_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fadds_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fadds_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fsubs_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fsubs_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fsubs_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fmuls_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmuls_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmuls_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fsubrs_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fsubrs_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fsubrs_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fmaxnms_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmaxnms_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmaxnms_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fminnms_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fminnms_s, TCG_CALL_NO_RWG, + 
void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fminnms_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fmaxs_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmaxs_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmaxs_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fmins_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmins_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmins_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) + DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG, diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 2fd34d722b..a40df62414 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2997,6 +2997,75 @@ DO_ZPZZ_FP(sve_fmulx_d, uint64_t, , helper_vfp_mulxd) #undef DO_ZPZZ_FP +/* Three-operand expander, with one scalar operand, controlled by + * a predicate, with the extra float_status parameter. + */ +#define DO_ZPZS_FP(NAME, TYPE, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vg, uint64_t scalar, \ + void *status, uint32_t desc)\ +{ \ +intptr_t i = simd_oprsz(desc);\ +uint64_t *g = vg; \ +TYPE mm = scalar; \ +do { \ +uint64_t pg = g[(i - 1) >> 6];\ +do { \ +i -= sizeof(TYPE);\ +if (likely((pg >> (i & 63)) & 1)) { \ +TYPE nn = *(TYPE *)(vn + H(i)); \ +*(TYPE *)(vd + H(i)) = OP(nn, mm, status);\ +} \ +} while (i & 63); \ +} while (i != 0); \ +} + +DO_ZPZS_FP(sve_fadds_h, float16, H1_2, float16_add)
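A minimal sketch of what the DO_ZPZS_FP expander above does for one operation: each active element is combined with a single scalar operand, and inactive elements of the destination keep their previous contents. It assumes native float, a single 64-bit predicate word, and a little-endian host; fadd_scalar_pred is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * d[i] = n[i] + scalar for every active element i.
 * Predicate bits are per byte, so element i is governed by bit 4 * i.
 * The loop runs from the top element down, as the helper macro does.
 */
void fadd_scalar_pred(float *d, const float *n, uint64_t pg,
                      float scalar, size_t count)
{
    assert(count <= 16);
    for (size_t i = count; i-- > 0; ) {
        unsigned bit = (unsigned)(i * sizeof(float));

        if ((pg >> bit) & 1) {
            d[i] = n[i] + scalar;
        }
        /* Inactive elements of d are left untouched. */
    }
}
```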
[Qemu-devel] [PATCH v5 29/35] target/arm: Implement SVE fp complex multiply add
Signed-off-by: Richard Henderson --- target/arm/helper-sve.h| 4 + target/arm/sve_helper.c| 162 + target/arm/translate-sve.c | 37 + target/arm/sve.decode | 4 + 4 files changed, 207 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 0bd9fe2f28..023952a9a4 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -1115,6 +1115,10 @@ DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32) DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32) DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fcmla_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fcmla_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fcmla_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32) + DEF_HELPER_FLAGS_5(sve_ftmad_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_ftmad_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_ftmad_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index ee7fc23bb9..cd3dfc8b26 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3729,6 +3729,168 @@ void HELPER(sve_fcadd_d)(void *vd, void *vn, void *vm, void *vg, } while (i != 0); } +/* + * FP Complex Multiply + */ + +QEMU_BUILD_BUG_ON(SIMD_DATA_SHIFT + 22 > 32); + +void HELPER(sve_fcmla_zpzzz_h)(CPUARMState *env, void *vg, uint32_t desc) +{ +intptr_t j, i = simd_oprsz(desc); +unsigned rd = extract32(desc, SIMD_DATA_SHIFT, 5); +unsigned rn = extract32(desc, SIMD_DATA_SHIFT + 5, 5); +unsigned rm = extract32(desc, SIMD_DATA_SHIFT + 10, 5); +unsigned ra = extract32(desc, SIMD_DATA_SHIFT + 15, 5); +unsigned rot = extract32(desc, SIMD_DATA_SHIFT + 20, 2); +bool flip = rot & 1; +float16 neg_imag, neg_real; +void *vd = >vfp.zregs[rd]; +void *vn = >vfp.zregs[rn]; +void *vm = >vfp.zregs[rm]; +void *va = >vfp.zregs[ra]; +uint64_t *g = 
vg; + +neg_imag = float16_set_sign(0, (rot & 2) != 0); +neg_real = float16_set_sign(0, rot == 1 || rot == 2); + +do { +uint64_t pg = g[(i - 1) >> 6]; +do { +float16 e1, e2, e3, e4, nr, ni, mr, mi, d; + +/* I holds the real index; J holds the imag index. */ +j = i - sizeof(float16); +i -= 2 * sizeof(float16); + +nr = *(float16 *)(vn + H1_2(i)); +ni = *(float16 *)(vn + H1_2(j)); +mr = *(float16 *)(vm + H1_2(i)); +mi = *(float16 *)(vm + H1_2(j)); + +e2 = (flip ? ni : nr); +e1 = (flip ? mi : mr) ^ neg_real; +e4 = e2; +e3 = (flip ? mr : mi) ^ neg_imag; + +if (likely((pg >> (i & 63)) & 1)) { +d = *(float16 *)(va + H1_2(i)); +d = float16_muladd(e2, e1, d, 0, >vfp.fp_status_f16); +*(float16 *)(vd + H1_2(i)) = d; +} +if (likely((pg >> (j & 63)) & 1)) { +d = *(float16 *)(va + H1_2(j)); +d = float16_muladd(e4, e3, d, 0, >vfp.fp_status_f16); +*(float16 *)(vd + H1_2(j)) = d; +} +} while (i & 63); +} while (i != 0); +} + +void HELPER(sve_fcmla_zpzzz_s)(CPUARMState *env, void *vg, uint32_t desc) +{ +intptr_t j, i = simd_oprsz(desc); +unsigned rd = extract32(desc, SIMD_DATA_SHIFT, 5); +unsigned rn = extract32(desc, SIMD_DATA_SHIFT + 5, 5); +unsigned rm = extract32(desc, SIMD_DATA_SHIFT + 10, 5); +unsigned ra = extract32(desc, SIMD_DATA_SHIFT + 15, 5); +unsigned rot = extract32(desc, SIMD_DATA_SHIFT + 20, 2); +bool flip = rot & 1; +float32 neg_imag, neg_real; +void *vd = >vfp.zregs[rd]; +void *vn = >vfp.zregs[rn]; +void *vm = >vfp.zregs[rm]; +void *va = >vfp.zregs[ra]; +uint64_t *g = vg; + +neg_imag = float32_set_sign(0, (rot & 2) != 0); +neg_real = float32_set_sign(0, rot == 1 || rot == 2); + +do { +uint64_t pg = g[(i - 1) >> 6]; +do { +float32 e1, e2, e3, e4, nr, ni, mr, mi, d; + +/* I holds the real index; J holds the imag index. */ +j = i - sizeof(float32); +i -= 2 * sizeof(float32); + +nr = *(float32 *)(vn + H1_2(i)); +ni = *(float32 *)(vn + H1_2(j)); +mr = *(float32 *)(vm + H1_2(i)); +mi = *(float32 *)(vm + H1_2(j)); + +e2 = (flip ? ni : nr); +e1 = (flip ? 
mi : mr) ^ neg_real; +e4 = e2; +e3 = (flip ? mr : mi) ^ neg_imag; + +if (likely((pg >> (i & 63)) & 1)) { +d = *(float32 *)(va + H1_2(i)); +d = float32_muladd(e2, e1, d, 0, &env->vfp.fp_status); +*(float32 *)(vd + H1_2(i)) = d; +} +if (likely((pg
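For readers following the rotation handling in the FCMLA helpers, here is a minimal sketch in plain C of how the 2-bit rot field selects operands and signs for one complex element. It is an illustration under assumptions, not the helper itself: the cplx type and the names fcmla_step and cmul_acc are hypothetical, ordinary float arithmetic stands in for the softfloat fused muladd, and predication is left out.

```c
#include <stdbool.h>

/* One complex element: real part first, imaginary part second. */
typedef struct {
    float re, im;
} cplx;

/*
 * Accumulate one FCMLA step into acc.  rot is the 2-bit rotation
 * field (0 = 0 degrees, 1 = 90, 2 = 180, 3 = 270).
 * Assumption: plain multiply-then-add replaces the fused muladd.
 */
static cplx fcmla_step(cplx acc, cplx n, cplx m, unsigned rot)
{
    bool flip = rot & 1;
    bool neg_imag = (rot & 2) != 0;
    bool neg_real = (rot == 1 || rot == 2);

    float e2 = flip ? n.im : n.re;   /* shared first multiplicand */
    float e1 = flip ? m.im : m.re;   /* feeds the real result */
    float e3 = flip ? m.re : m.im;   /* feeds the imaginary result */

    if (neg_real) {
        e1 = -e1;
    }
    if (neg_imag) {
        e3 = -e3;
    }

    acc.re += e2 * e1;
    acc.im += e2 * e3;
    return acc;
}

/* A full complex multiply-accumulate is rot 0 followed by rot 1. */
static cplx cmul_acc(cplx acc, cplx n, cplx m)
{
    acc = fcmla_step(acc, n, m, 0);
    return fcmla_step(acc, n, m, 1);
}
```

With n = 1 + 2i and m = 3 + 4i, the two steps together accumulate the usual product (nr*mr - ni*mi) + i(nr*mi + ni*mr).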
[Qemu-devel] [PATCH v5 10/35] target/arm: Implement SVE store vector/predicate register
Signed-off-by: Richard Henderson --- target/arm/translate-sve.c | 103 + target/arm/sve.decode | 6 +++ 2 files changed, 109 insertions(+) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 954d6653d3..50f1ff75ef 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3762,6 +3762,89 @@ static void do_ldr(DisasContext *s, uint32_t vofs, uint32_t len, tcg_temp_free_i64(t0); } +/* Similarly for stores. */ +static void do_str(DisasContext *s, uint32_t vofs, uint32_t len, + int rn, int imm) +{ +uint32_t len_align = QEMU_ALIGN_DOWN(len, 8); +uint32_t len_remain = len % 8; +uint32_t nparts = len / 8 + ctpop8(len_remain); +int midx = get_mem_index(s); +TCGv_i64 addr, t0; + +addr = tcg_temp_new_i64(); +t0 = tcg_temp_new_i64(); + +/* Note that unpredicated load/store of vector/predicate registers + * are defined as a stream of bytes, which equates to little-endian + * operations on larger quantities. There is no nice way to force + * a little-endian store for aarch64_be-linux-user out of line. + * + * Attempt to keep code expansion to a minimum by limiting the + * amount of unrolling done. + */ +if (nparts <= 4) { +int i; + +for (i = 0; i < len_align; i += 8) { +tcg_gen_ld_i64(t0, cpu_env, vofs + i); +tcg_gen_addi_i64(addr, cpu_reg_sp(s, rn), imm + i); +tcg_gen_qemu_st_i64(t0, addr, midx, MO_LEQ); +} +} else { +TCGLabel *loop = gen_new_label(); +TCGv_ptr t2, i = tcg_const_local_ptr(0); + +gen_set_label(loop); + +t2 = tcg_temp_new_ptr(); +tcg_gen_add_ptr(t2, cpu_env, i); +tcg_gen_ld_i64(t0, t2, vofs); + +/* Minimize the number of local temps that must be re-read from + * the stack each iteration. Instead, re-compute values other + * than the loop counter. 
+ */ +tcg_gen_addi_ptr(t2, i, imm); +tcg_gen_extu_ptr_i64(addr, t2); +tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, rn)); +tcg_temp_free_ptr(t2); + +tcg_gen_qemu_st_i64(t0, addr, midx, MO_LEQ); + +tcg_gen_addi_ptr(i, i, 8); + +tcg_gen_brcondi_ptr(TCG_COND_LTU, i, len_align, loop); +tcg_temp_free_ptr(i); +} + +/* Predicate register stores can be any multiple of 2. */ +if (len_remain) { +tcg_gen_ld_i64(t0, cpu_env, vofs + len_align); +tcg_gen_addi_i64(addr, cpu_reg_sp(s, rn), imm + len_align); + +switch (len_remain) { +case 2: +case 4: +case 8: +tcg_gen_qemu_st_i64(t0, addr, midx, MO_LE | ctz32(len_remain)); +break; + +case 6: +tcg_gen_qemu_st_i64(t0, addr, midx, MO_LEUL); +tcg_gen_addi_i64(addr, addr, 4); +tcg_gen_shri_i64(t0, t0, 32); +tcg_gen_qemu_st_i64(t0, addr, midx, MO_LEUW); +break; + +default: +g_assert_not_reached(); +} +} +tcg_temp_free_i64(addr); +tcg_temp_free_i64(t0); +} + static bool trans_LDR_zri(DisasContext *s, arg_rri *a, uint32_t insn) { if (sve_access_check(s)) { @@ -3782,6 +3865,26 @@ static bool trans_LDR_pri(DisasContext *s, arg_rri *a, uint32_t insn) return true; } +static bool trans_STR_zri(DisasContext *s, arg_rri *a, uint32_t insn) +{ +if (sve_access_check(s)) { +int size = vec_full_reg_size(s); +int off = vec_full_reg_offset(s, a->rd); +do_str(s, off, size, a->rn, a->imm * size); +} +return true; +} + +static bool trans_STR_pri(DisasContext *s, arg_rri *a, uint32_t insn) +{ +if (sve_access_check(s)) { +int size = pred_full_reg_size(s); +int off = pred_full_reg_offset(s, a->rd); +do_str(s, off, size, a->rn, a->imm * size); +} +return true; +} + /* *** SVE Memory - Contiguous Load Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 56039e2193..c088e51493 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -792,6 +792,12 @@ LD1RQ_zpri 1010010 .. 00 0 001 ... . . \ ### SVE Memory Store Group +# SVE store predicate register +STR_pri 1110010 11 0. . 000 ... . 
0 @pd_rn_i9 + +# SVE store vector register +STR_zri 1110010 11 0. . 010 ... . . @rd_rn_i9 + # SVE contiguous store (scalar plus immediate) # ST1B, ST1H, ST1W, ST1D; require msz <= esz ST_zpri 1110010 .. esz:2 0 111 ... . . \ -- 2.17.1
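The way do_str splits a register length into 8-byte parts plus a remainder can be checked in isolation. The sketch below is plain C under assumptions: str_plan, plan_store and popcount8 are hypothetical names, with popcount8 standing in for QEMU's ctpop8. It only reproduces the arithmetic that decides between the unrolled path (nparts <= 4) and the loop.

```c
#include <stdint.h>

/* Number of set bits in an 8-bit value (stand-in for ctpop8). */
static unsigned popcount8(uint8_t v)
{
    unsigned n = 0;

    while (v) {
        n += v & 1;
        v >>= 1;
    }
    return n;
}

typedef struct {
    uint32_t len_align;   /* bytes covered by whole 8-byte stores */
    uint32_t len_remain;  /* trailing bytes: 0, 2, 4 or 6 */
    uint32_t nparts;      /* total number of memory operations */
} str_plan;

static str_plan plan_store(uint32_t len)
{
    str_plan p;

    p.len_align = len & ~7u;
    p.len_remain = len % 8;
    /* A 6-byte tail needs two stores (4 + 2), hence the popcount. */
    p.nparts = len / 8 + popcount8((uint8_t)p.len_remain);
    return p;
}
```

For example, a 30-byte predicate register gives three 8-byte stores plus a 4-byte and a 2-byte store, so nparts is 5 and the loop form would be chosen.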
[Qemu-devel] [PATCH v5 11/35] target/arm: Implement SVE scatter stores
Signed-off-by: Richard Henderson --- target/arm/helper-sve.h| 41 + target/arm/sve_helper.c| 62 target/arm/translate-sve.c | 74 ++ target/arm/sve.decode | 39 4 files changed, 216 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index a5d3bb121c..8880128f9c 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -958,3 +958,44 @@ DEF_HELPER_FLAGS_4(sve_st1hs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_st1hd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_st1sd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_stbs_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_sths_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_stss_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_stbs_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_sths_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_stss_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_stbd_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_sthd_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_stsd_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_stdd_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_stbd_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_sthd_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_stsd_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_stdd_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_stbd_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_sthd_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) 
+DEF_HELPER_FLAGS_6(sve_stsd_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_stdd_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index a9c98bca32..ed4861a292 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3712,3 +3712,65 @@ void HELPER(sve_st4dd_r)(CPUARMState *env, void *vg, addr += 4 * 8; } } + +/* Stores with a vector index. */ + +#define DO_ST1_ZPZ_S(NAME, TYPEI, FN) \ +void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm, \ + target_ulong base, uint32_t desc) \ +{ \ +intptr_t i, oprsz = simd_oprsz(desc) / 8; \ +unsigned scale = simd_data(desc); \ +uintptr_t ra = GETPC(); \ +uint32_t *d = vd; TYPEI *m = vm; uint8_t *pg = vg; \ +for (i = 0; i < oprsz; i++) { \ +uint8_t pp = pg[H1(i)]; \ +if (pp & 0x01) {\ +target_ulong off = (target_ulong)m[H4(i * 2)] << scale; \ +FN(env, base + off, d[H4(i * 2)], ra); \ +} \ +if (pp & 0x10) {\ +target_ulong off = (target_ulong)m[H4(i * 2 + 1)] << scale; \ +FN(env, base + off, d[H4(i * 2 + 1)], ra); \ +} \ +} \ +} + +#define DO_ST1_ZPZ_D(NAME, TYPEI, FN) \ +void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm, \ + target_ulong base, uint32_t desc) \ +{ \ +intptr_t i, oprsz = simd_oprsz(desc) / 8; \ +unsigned scale = simd_data(desc); \ +uintptr_t ra = GETPC(); \ +uint64_t *d = vd, *m = vm; uint8_t *pg = vg;\ +for (i = 0; i < oprsz; i++) {
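As a rough illustration of the DO_ST1_ZPZ_S pattern, the following plain C sketch scatters 32-bit elements under a byte-granular predicate. Several things are assumptions made for the example: the name scatter_store32 is hypothetical, guest memory is modelled as a flat byte array, only unsigned offsets are handled, and the host-endian H macros are dropped (little-endian layout assumed).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Predicate byte i governs vector bytes 8*i .. 8*i+7, so bit 0
 * controls 32-bit element 2*i and bit 4 controls element 2*i+1.
 */
static void scatter_store32(uint8_t *mem, uint64_t base,
                            const uint32_t *d, const uint32_t *offs,
                            const uint8_t *pg, size_t oprsz_bytes,
                            unsigned scale)
{
    for (size_t i = 0; i < oprsz_bytes / 8; i++) {
        uint8_t pp = pg[i];

        if (pp & 0x01) {
            uint64_t off = (uint64_t)offs[i * 2] << scale;
            memcpy(mem + base + off, &d[i * 2], sizeof(uint32_t));
        }
        if (pp & 0x10) {
            uint64_t off = (uint64_t)offs[i * 2 + 1] << scale;
            memcpy(mem + base + off, &d[i * 2 + 1], sizeof(uint32_t));
        }
    }
}
```

Inactive elements leave memory untouched, which is the property the predicate tests in the macro are there to guarantee.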
[Qemu-devel] [PATCH v5 15/35] target/arm: Implement SVE scatter store vector immediate
Signed-off-by: Richard Henderson --- target/arm/translate-sve.c | 83 ++ target/arm/sve.decode | 11 + 2 files changed, 69 insertions(+), 25 deletions(-) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 11c1bf112c..fdc3231e63 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -4395,31 +4395,33 @@ static bool trans_LD1_zpiz(DisasContext *s, arg_LD1_zpiz *a, uint32_t insn) return true; } +/* Indexed by [xs][msz]. */ +static gen_helper_gvec_mem_scatter * const scatter_store_fn32[2][3] = { +{ gen_helper_sve_stbs_zsu, + gen_helper_sve_sths_zsu, + gen_helper_sve_stss_zsu, }, +{ gen_helper_sve_stbs_zss, + gen_helper_sve_sths_zss, + gen_helper_sve_stss_zss, }, +}; + +static gen_helper_gvec_mem_scatter * const scatter_store_fn64[3][4] = { +{ gen_helper_sve_stbd_zsu, + gen_helper_sve_sthd_zsu, + gen_helper_sve_stsd_zsu, + gen_helper_sve_stdd_zsu, }, +{ gen_helper_sve_stbd_zss, + gen_helper_sve_sthd_zss, + gen_helper_sve_stsd_zss, + gen_helper_sve_stdd_zss, }, +{ gen_helper_sve_stbd_zd, + gen_helper_sve_sthd_zd, + gen_helper_sve_stsd_zd, + gen_helper_sve_stdd_zd, }, +}; + static bool trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn) { -/* Indexed by [xs][msz]. 
*/ -static gen_helper_gvec_mem_scatter * const fn32[2][3] = { -{ gen_helper_sve_stbs_zsu, - gen_helper_sve_sths_zsu, - gen_helper_sve_stss_zsu, }, -{ gen_helper_sve_stbs_zss, - gen_helper_sve_sths_zss, - gen_helper_sve_stss_zss, }, -}; -static gen_helper_gvec_mem_scatter * const fn64[3][4] = { -{ gen_helper_sve_stbd_zsu, - gen_helper_sve_sthd_zsu, - gen_helper_sve_stsd_zsu, - gen_helper_sve_stdd_zsu, }, -{ gen_helper_sve_stbd_zss, - gen_helper_sve_sthd_zss, - gen_helper_sve_stsd_zss, - gen_helper_sve_stdd_zss, }, -{ gen_helper_sve_stbd_zd, - gen_helper_sve_sthd_zd, - gen_helper_sve_stsd_zd, - gen_helper_sve_stdd_zd, }, -}; gen_helper_gvec_mem_scatter *fn; if (a->esz < a->msz || (a->msz == 0 && a->scale)) { @@ -4430,10 +4432,10 @@ static bool trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn) } switch (a->esz) { case MO_32: -fn = fn32[a->xs][a->msz]; +fn = scatter_store_fn32[a->xs][a->msz]; break; case MO_64: -fn = fn64[a->xs][a->msz]; +fn = scatter_store_fn64[a->xs][a->msz]; break; default: g_assert_not_reached(); @@ -4443,6 +4445,37 @@ static bool trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn) return true; } +static bool trans_ST1_zpiz(DisasContext *s, arg_ST1_zpiz *a, uint32_t insn) +{ +gen_helper_gvec_mem_scatter *fn = NULL; +TCGv_i64 imm; + +if (a->esz < a->msz) { +return false; +} +if (!sve_access_check(s)) { +return true; +} + +switch (a->esz) { +case MO_32: +fn = scatter_store_fn32[0][a->msz]; +break; +case MO_64: +fn = scatter_store_fn64[2][a->msz]; +break; +} +assert(fn != NULL); + +/* Treat ST1_zpiz (zn[x] + imm) the same way as ST1_zprz (rn + zm[x]) + * by loading the immediate into the scalar parameter. 
+ */ +imm = tcg_const_i64(a->imm << a->msz); +do_mem_zpz(s, a->rd, a->pg, a->rn, 0, imm, fn); +tcg_temp_free_i64(imm); +return true; +} + /* * Prefetches */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index fb73a22c0e..311ae6dbdf 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -83,6 +83,7 @@ &rprr_gather_load rd pg rn rm esz msz u ff xs scale &rpri_gather_load rd pg rn imm esz msz u ff &rprr_scatter_store rd pg rn rm esz msz xs scale +&rpri_scatter_store rd pg rn imm esz msz ### # Named instruction formats. These are generally used to @@ -215,6 +216,8 @@ _store nreg=0 @rprr_scatter_store ... msz:2 .. rm:5 ... pg:3 rn:5 rd:5 \ &rprr_scatter_store +@rpri_scatter_store ... msz:2 .. imm:5 ... pg:3 rn:5 rd:5 \ +&rpri_scatter_store ### # Instruction patterns. Grouped according to the SVE encodingindex.xhtml. @@ -927,6 +930,14 @@ ST1_zprz 1110010 .. 01 . 101 ... . . \ ST1_zprz 1110010 .. 00 . 101 ... . . \ @rprr_scatter_store xs=2 esz=3 scale=0 +# SVE 64-bit scatter store (vector plus immediate) +ST1_zpiz 1110010 .. 10 . 101 ... . . \ +@rpri_scatter_store esz=3 + +# SVE 32-bit scatter store (vector plus immediate) +ST1_zpiz 1110010 .. 11 . 101 ... . . \ +
[Qemu-devel] [PATCH v5 07/35] target/arm: Implement SVE FP Multiply-Add Group
Signed-off-by: Richard Henderson --- target/arm/helper-sve.h| 16 target/arm/sve_helper.c| 158 + target/arm/translate-sve.c | 49 target/arm/sve.decode | 17 4 files changed, 240 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 4097b55f0e..eb0645dd43 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -827,6 +827,22 @@ DEF_HELPER_FLAGS_5(sve_ucvt_ds, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve_ucvt_dd, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_fmls_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fmls_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fmls_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_fnmla_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fnmla_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fnmla_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32) + DEF_HELPER_FLAGS_4(sve_ld1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ld2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ld3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 6bf30a3e66..aeb4ccadd9 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2938,6 +2938,164 @@ DO_ZPZ_FP(sve_ucvt_dd, uint64_t, , uint64_to_float64) #undef DO_ZPZ_FP +/* 4-operand predicated multiply-add. 
This requires 7 operands to pass + * "properly", so we need to encode some of the registers into DESC. + */ +QEMU_BUILD_BUG_ON(SIMD_DATA_SHIFT + 20 > 32); + +static void do_fmla_zpzzz_h(CPUARMState *env, void *vg, uint32_t desc, +uint16_t neg1, uint16_t neg3) +{ +intptr_t i = simd_oprsz(desc); +unsigned rd = extract32(desc, SIMD_DATA_SHIFT, 5); +unsigned rn = extract32(desc, SIMD_DATA_SHIFT + 5, 5); +unsigned rm = extract32(desc, SIMD_DATA_SHIFT + 10, 5); +unsigned ra = extract32(desc, SIMD_DATA_SHIFT + 15, 5); +void *vd = &env->vfp.zregs[rd]; +void *vn = &env->vfp.zregs[rn]; +void *vm = &env->vfp.zregs[rm]; +void *va = &env->vfp.zregs[ra]; +uint64_t *g = vg; + +do { +uint64_t pg = g[(i - 1) >> 6]; +do { +i -= 2; +if (likely((pg >> (i & 63)) & 1)) { +float16 e1, e2, e3, r; + +e1 = *(uint16_t *)(vn + H1_2(i)) ^ neg1; +e2 = *(uint16_t *)(vm + H1_2(i)); +e3 = *(uint16_t *)(va + H1_2(i)) ^ neg3; +r = float16_muladd(e1, e2, e3, 0, &env->vfp.fp_status); +*(uint16_t *)(vd + H1_2(i)) = r; +} +} while (i & 63); +} while (i != 0); +} + +void HELPER(sve_fmla_zpzzz_h)(CPUARMState *env, void *vg, uint32_t desc) +{ +do_fmla_zpzzz_h(env, vg, desc, 0, 0); +} + +void HELPER(sve_fmls_zpzzz_h)(CPUARMState *env, void *vg, uint32_t desc) +{ +do_fmla_zpzzz_h(env, vg, desc, 0x8000, 0); +} + +void HELPER(sve_fnmla_zpzzz_h)(CPUARMState *env, void *vg, uint32_t desc) +{ +do_fmla_zpzzz_h(env, vg, desc, 0x8000, 0x8000); +} + +void HELPER(sve_fnmls_zpzzz_h)(CPUARMState *env, void *vg, uint32_t desc) +{ +do_fmla_zpzzz_h(env, vg, desc, 0, 0x8000); +} + +static void do_fmla_zpzzz_s(CPUARMState *env, void *vg, uint32_t desc, +uint32_t neg1, uint32_t neg3) +{ +intptr_t i = simd_oprsz(desc); +unsigned rd = extract32(desc, SIMD_DATA_SHIFT, 5); +unsigned rn = extract32(desc, SIMD_DATA_SHIFT + 5, 5); +unsigned rm = extract32(desc, SIMD_DATA_SHIFT + 10, 5); +unsigned ra = extract32(desc, SIMD_DATA_SHIFT + 15, 5); +void *vd = &env->vfp.zregs[rd]; +void *vn = &env->vfp.zregs[rn]; +void *vm = &env->vfp.zregs[rm]; +void *va = &env->vfp.zregs[ra]; 
+uint64_t *g = vg; + +do { +uint64_t pg = g[(i - 1) >> 6]; +do { +i -= 4; +if (likely((pg >> (i & 63)) & 1)) { +float32 e1, e2, e3, r; + +e1 = *(uint32_t *)(vn + H1_4(i)) ^ neg1; +e2 = *(uint32_t *)(vm + H1_4(i)); +e3 = *(uint32_t *)(va + H1_4(i)) ^ neg3; +r = float32_muladd(e1, e2, e3, 0, &env->vfp.fp_status); +*(uint32_t *)(vd + H1_4(i)) = r; +} +} while (i & 63); +} while (i != 0); +} + +void HELPER(sve_fmla_zpzzz_s)(CPUARMState *env, void *vg, uint32_t desc) +{ +
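The four multiply-add variants differ only in which sign bits are XORed into the first multiplicand and the addend. A small host-side sketch of that idea, assuming IEEE single precision and using hypothetical names (fmla_variant, fmla, fmls, fnmla, fnmls, SIGN32); it uses an unfused multiply and add rather than softfloat's fused muladd, so rounding may differ from the real helper:

```c
#include <stdint.h>
#include <string.h>

#define SIGN32 0x80000000u

/* Reinterpret float bits without violating aliasing rules. */
static uint32_t f2u(float f)
{
    uint32_t u;

    memcpy(&u, &f, sizeof(u));
    return u;
}

static float u2f(uint32_t u)
{
    float f;

    memcpy(&f, &u, sizeof(f));
    return f;
}

/* (n ^ neg1) * m + (a ^ neg3), where neg1/neg3 are 0 or SIGN32. */
static float fmla_variant(float n, float m, float a,
                          uint32_t neg1, uint32_t neg3)
{
    float e1 = u2f(f2u(n) ^ neg1);
    float e3 = u2f(f2u(a) ^ neg3);

    return e1 * m + e3;   /* unfused: an assumption of this sketch */
}

static float fmla(float n, float m, float a)
{
    return fmla_variant(n, m, a, 0, 0);            /*  n*m + a */
}

static float fmls(float n, float m, float a)
{
    return fmla_variant(n, m, a, SIGN32, 0);       /* -n*m + a */
}

static float fnmla(float n, float m, float a)
{
    return fmla_variant(n, m, a, SIGN32, SIGN32);  /* -n*m - a */
}

static float fnmls(float n, float m, float a)
{
    return fmla_variant(n, m, a, 0, SIGN32);       /*  n*m - a */
}
```

Flipping the sign bit rather than negating through arithmetic keeps NaN payloads and zero signs under direct control, which is presumably why the helpers take the masks as parameters.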
[Qemu-devel] [PATCH v5 13/35] target/arm: Implement SVE gather loads
Signed-off-by: Richard Henderson --- target/arm/helper-sve.h| 67 target/arm/sve_helper.c| 77 +++ target/arm/translate-sve.c | 104 + target/arm/sve.decode | 53 +++ 4 files changed, 301 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 8880128f9c..aeb62afc34 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -959,6 +959,73 @@ DEF_HELPER_FLAGS_4(sve_st1hd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_st1sd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldbsu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldhsu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldssu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldbss_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldhss_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_ldbsu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldhsu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldssu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldbss_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldhss_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_ldbdu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldhdu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldsdu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldddu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldbds_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldhds_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldsds_zsu, TCG_CALL_NO_WG, + void, env, 
ptr, ptr, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_ldbdu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldhdu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldsdu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldddu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldbds_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldhds_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldsds_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_ldbdu_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldhdu_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldsdu_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldddu_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldbds_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldhds_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldsds_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + DEF_HELPER_FLAGS_6(sve_stbs_zsu, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr, tl, i32) DEF_HELPER_FLAGS_6(sve_sths_zsu, TCG_CALL_NO_WG, diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index ed4861a292..9b99718156 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3713,6 +3713,83 @@ void HELPER(sve_st4dd_r)(CPUARMState *env, void *vg, } } +/* Loads with a vector index. */ + +#define DO_LD1_ZPZ_S(NAME, TYPEI, TYPEM, FN)\ +void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm, \ + target_ulong base, uint32_t desc) \ +{ \ +intptr_t i, oprsz = simd_oprsz(desc); \ +unsigned scale = simd_data(desc); \ +uintptr_t ra = GETPC(); \ +for (i = 0; i < oprsz; i++) { \ +
[Qemu-devel] [PATCH v5 09/35] target/arm: Implement SVE load and broadcast element
Signed-off-by: Richard Henderson --- target/arm/helper-sve.h| 5 +++ target/arm/sve_helper.c| 41 + target/arm/translate-sve.c | 62 ++ target/arm/sve.decode | 5 +++ 4 files changed, 113 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 68e55a8d03..a5d3bb121c 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -274,6 +274,11 @@ DEF_HELPER_FLAGS_3(sve_clr_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_clr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_clr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_movz_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_movz_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_movz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_movz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(sve_asr_zpzi_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_asr_zpzi_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_asr_zpzi_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 990e5f3900..a9c98bca32 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -995,6 +995,47 @@ void HELPER(sve_clr_d)(void *vd, void *vg, uint32_t desc) } } +/* Copy Zn into Zd, and store zero into inactive elements. 
*/ +void HELPER(sve_movz_b)(void *vd, void *vn, void *vg, uint32_t desc) +{ +intptr_t i, opr_sz = simd_oprsz(desc) / 8; +uint64_t *d = vd, *n = vn; +uint8_t *pg = vg; +for (i = 0; i < opr_sz; i += 1) { +d[i] = n[i] & expand_pred_b(pg[H1(i)]); +} +} + +void HELPER(sve_movz_h)(void *vd, void *vn, void *vg, uint32_t desc) +{ +intptr_t i, opr_sz = simd_oprsz(desc) / 8; +uint64_t *d = vd, *n = vn; +uint8_t *pg = vg; +for (i = 0; i < opr_sz; i += 1) { +d[i] = n[i] & expand_pred_h(pg[H1(i)]); +} +} + +void HELPER(sve_movz_s)(void *vd, void *vn, void *vg, uint32_t desc) +{ +intptr_t i, opr_sz = simd_oprsz(desc) / 8; +uint64_t *d = vd, *n = vn; +uint8_t *pg = vg; +for (i = 0; i < opr_sz; i += 1) { +d[i] = n[i] & expand_pred_s(pg[H1(i)]); +} +} + +void HELPER(sve_movz_d)(void *vd, void *vn, void *vg, uint32_t desc) +{ +intptr_t i, opr_sz = simd_oprsz(desc) / 8; +uint64_t *d = vd, *n = vn; +uint8_t *pg = vg; +for (i = 0; i < opr_sz; i += 1) { +d[i] = n[i] & -(uint64_t)(pg[H1(i)] & 1); +} +} + /* Three-operand expander, immediate operand, controlled by a predicate. */ #define DO_ZPZI(NAME, TYPE, H, OP) \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 483ad33179..954d6653d3 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -606,6 +606,20 @@ static bool do_clr_zp(DisasContext *s, int rd, int pg, int esz) return true; } +/* Copy Zn into Zd, storing zeros into inactive elements. 
*/ +static void do_movz_zpz(DisasContext *s, int rd, int rn, int pg, int esz) +{ +static gen_helper_gvec_3 * const fns[4] = { +gen_helper_sve_movz_b, gen_helper_sve_movz_h, +gen_helper_sve_movz_s, gen_helper_sve_movz_d, +}; +unsigned vsz = vec_full_reg_size(s); +tcg_gen_gvec_3_ool(vec_full_reg_offset(s, rd), + vec_full_reg_offset(s, rn), + pred_full_reg_offset(s, pg), + vsz, vsz, 0, fns[esz]); +} + static bool do_zpzi_ool(DisasContext *s, arg_rpri_esz *a, gen_helper_gvec_3 *fn) { @@ -3999,6 +4013,54 @@ static bool trans_LD1RQ_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn) return true; } +/* Load and broadcast element. */ +static bool trans_LD1R_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn) +{ +if (!sve_access_check(s)) { +return true; +} + +unsigned vsz = vec_full_reg_size(s); +unsigned psz = pred_full_reg_size(s); +unsigned esz = dtype_esz[a->dtype]; +TCGLabel *over = gen_new_label(); +TCGv_i64 temp; + +/* If the guarding predicate has no bits set, no load occurs. */ +if (psz <= 8) { +/* Reduce the pred_esz_masks value simply to reduce the + * size of the code generated here. + */ +uint64_t psz_mask = MAKE_64BIT_MASK(0, psz * 8); +temp = tcg_temp_new_i64(); +tcg_gen_ld_i64(temp, cpu_env, pred_full_reg_offset(s, a->pg)); +tcg_gen_andi_i64(temp, temp, pred_esz_masks[esz] & psz_mask); +tcg_gen_brcondi_i64(TCG_COND_EQ, temp, 0, over); +tcg_temp_free_i64(temp); +} else { +TCGv_i32 t32 = tcg_temp_new_i32(); +find_last_active(s, t32, esz, a->pg); +tcg_gen_brcondi_i32(TCG_COND_LT, t32, 0, over); +tcg_temp_free_i32(t32); +} + +/* Load the data. */ +
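The zeroing move used by this patch comes down to ANDing each element with an all-ones or all-zeros mask derived from the predicate. A minimal plain C sketch for the 64-bit element case, assuming a little-endian layout so the H1 index swizzle can be dropped; the name movz_d and its signature are hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Copy n into d, zeroing inactive 64-bit elements.  For 64-bit
 * elements, predicate byte i covers element i and only bit 0 matters.
 */
static void movz_d(uint64_t *d, const uint64_t *n,
                   const uint8_t *pg, size_t nelem)
{
    for (size_t i = 0; i < nelem; i++) {
        /* -(uint64_t)1 is all ones; -(uint64_t)0 is zero. */
        uint64_t mask = -(uint64_t)(pg[i] & 1);

        d[i] = n[i] & mask;
    }
}
```

Note that every destination element takes its value from the source element with the same index; only the mask varies.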
[Qemu-devel] [PATCH v5 08/35] target/arm: Implement SVE Floating Point Accumulating Reduction Group
Signed-off-by: Richard Henderson --- target/arm/helper-sve.h| 7 + target/arm/sve_helper.c| 56 ++ target/arm/translate-sve.c | 45 ++ target/arm/sve.decode | 5 4 files changed, 113 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index eb0645dd43..68e55a8d03 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -720,6 +720,13 @@ DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fadda_h, TCG_CALL_NO_RWG, + i64, i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fadda_s, TCG_CALL_NO_RWG, + i64, i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fadda_d, TCG_CALL_NO_RWG, + i64, i64, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_6(sve_fadd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_6(sve_fadd_s, TCG_CALL_NO_RWG, diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index aeb4ccadd9..990e5f3900 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2811,6 +2811,62 @@ uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc) return predtest_ones(d, oprsz, esz_mask); } +uint64_t HELPER(sve_fadda_h)(uint64_t nn, void *vm, void *vg, + void *status, uint32_t desc) +{ +intptr_t i = 0, opr_sz = simd_oprsz(desc); +float16 result = nn; + +do { +uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); +do { +if (pg & 1) { +float16 mm = *(float16 *)(vm + H1_2(i)); +result = float16_add(result, mm, status); +} +i += sizeof(float16), pg >>= sizeof(float16); +} while (i & 15); +} while (i < opr_sz); + +return result; +} + +uint64_t HELPER(sve_fadda_s)(uint64_t nn, void *vm, void *vg, + void *status, uint32_t desc) +{ +intptr_t i = 0, opr_sz = simd_oprsz(desc); +float32 result = nn; + +do { +uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); +do { +if (pg & 1) { +float32 mm = *(float32 *)(vm + H1_2(i)); +result = float32_add(result, mm, status); +} +i += sizeof(float32), pg >>= 
sizeof(float32); +} while (i & 15); +} while (i < opr_sz); + +return result; +} + +uint64_t HELPER(sve_fadda_d)(uint64_t nn, void *vm, void *vg, + void *status, uint32_t desc) +{ +intptr_t i = 0, opr_sz = simd_oprsz(desc) / 8; +uint64_t *m = vm; +uint8_t *pg = vg; + +for (i = 0; i < opr_sz; i++) { +if (pg[H1(i)] & 1) { +nn = float64_add(nn, m[i], status); +} +} + +return nn; +} + /* Fully general three-operand expander, controlled by a predicate, * With the extra float_status parameter. */ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index acad6374ef..483ad33179 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3383,6 +3383,51 @@ DO_ZZI(UMIN, umin) #undef DO_ZZI +/* + *** SVE Floating Point Accumulating Reduction Group + */ + +static bool trans_FADDA(DisasContext *s, arg_rprr_esz *a, uint32_t insn) +{ +typedef void fadda_fn(TCGv_i64, TCGv_i64, TCGv_ptr, + TCGv_ptr, TCGv_ptr, TCGv_i32); +static fadda_fn * const fns[3] = { +gen_helper_sve_fadda_h, +gen_helper_sve_fadda_s, +gen_helper_sve_fadda_d, +}; +unsigned vsz = vec_full_reg_size(s); +TCGv_ptr t_rm, t_pg, t_fpst; +TCGv_i64 t_val; +TCGv_i32 t_desc; + +if (a->esz == 0) { +return false; +} +if (!sve_access_check(s)) { +return true; +} + +t_val = load_esz(cpu_env, vec_reg_offset(s, a->rn, 0, a->esz), a->esz); +t_rm = tcg_temp_new_ptr(); +t_pg = tcg_temp_new_ptr(); +tcg_gen_addi_ptr(t_rm, cpu_env, vec_full_reg_offset(s, a->rm)); +tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, a->pg)); +t_fpst = get_fpstatus_ptr(a->esz == MO_16); +t_desc = tcg_const_i32(simd_desc(vsz, vsz, 0)); + +fns[a->esz - 1](t_val, t_val, t_rm, t_pg, t_fpst, t_desc); + +tcg_temp_free_i32(t_desc); +tcg_temp_free_ptr(t_fpst); +tcg_temp_free_ptr(t_pg); +tcg_temp_free_ptr(t_rm); + +write_fp_dreg(s, a->rd, t_val); +tcg_temp_free_i64(t_val); +return true; +} + /* *** SVE Floating Point Arithmetic - Unpredicated Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 
70e5a3aeb5..ba10cddb8a 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -676,6 +676,11 @@ UMIN_zzi00100101 .. 101 011 110 . @rdn_i8u # SVE integer multiply immediate (unpredicated) MUL_zzi 00100101 .. 110 000 110
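FADDA is a strictly ordered reduction: active elements are added to the scalar accumulator from the lowest index upward, one at a time, because floating-point addition is not associative. The sketch below shows that ordering for 32-bit elements in plain C. It is an approximation under assumptions: fadda_s here is a hypothetical host function using native float addition instead of softfloat, and the predicate is read as one bit per vector byte on a little-endian layout.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Add the active elements of m to acc in index order.  The predicate
 * has one bit per vector byte, so 32-bit element i is governed by
 * predicate bit 4 * i.
 */
static float fadda_s(float acc, const float *m, const uint8_t *pg,
                     size_t nelem)
{
    for (size_t i = 0; i < nelem; i++) {
        size_t bit = i * sizeof(float);

        if ((pg[bit / 8] >> (bit % 8)) & 1) {
            acc += m[i];
        }
    }
    return acc;
}
```

A tree-style reduction would be faster but could round differently, which is why the helper walks the elements sequentially.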
[Qemu-devel] [PATCH v5 04/35] target/arm: Implement SVE load and broadcast quadword
Signed-off-by: Richard Henderson --- target/arm/translate-sve.c | 52 ++ target/arm/sve.decode | 9 +++ 2 files changed, 61 insertions(+) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index b25fe96b77..83de87ee0e 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3717,6 +3717,58 @@ static bool trans_LDNF1_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn) return true; } +static void do_ldrq(DisasContext *s, int zt, int pg, TCGv_i64 addr, int msz) +{ +static gen_helper_gvec_mem * const fns[4] = { +gen_helper_sve_ld1bb_r, gen_helper_sve_ld1hh_r, +gen_helper_sve_ld1ss_r, gen_helper_sve_ld1dd_r, +}; +unsigned vsz = vec_full_reg_size(s); +TCGv_ptr t_pg; +TCGv_i32 desc; + +/* Load the first quadword using the normal predicated load helpers. */ +desc = tcg_const_i32(simd_desc(16, 16, zt)); +t_pg = tcg_temp_new_ptr(); + +tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg)); +fns[msz](cpu_env, t_pg, addr, desc); + +tcg_temp_free_ptr(t_pg); +tcg_temp_free_i32(desc); + +/* Replicate that first quadword. 
*/ +if (vsz > 16) { +unsigned dofs = vec_full_reg_offset(s, zt); +tcg_gen_gvec_dup_mem(4, dofs + 16, dofs, vsz - 16, vsz - 16); +} +} + +static bool trans_LD1RQ_zprr(DisasContext *s, arg_rprr_load *a, uint32_t insn) +{ +if (a->rm == 31) { +return false; +} +if (sve_access_check(s)) { +int msz = dtype_msz(a->dtype); +TCGv_i64 addr = new_tmp_a64(s); +tcg_gen_shli_i64(addr, cpu_reg(s, a->rm), msz); +tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn)); +do_ldrq(s, a->rd, a->pg, addr, msz); +} +return true; +} + +static bool trans_LD1RQ_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn) +{ +if (sve_access_check(s)) { +TCGv_i64 addr = new_tmp_a64(s); +tcg_gen_addi_i64(addr, cpu_reg_sp(s, a->rn), a->imm * 16); +do_ldrq(s, a->rd, a->pg, addr, dtype_msz(a->dtype)); +} +return true; +} + static void do_st_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr, int msz, int esz, int nreg) { diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 6e159faaec..606c4f623c 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -715,6 +715,15 @@ LD_zprr 1010010 .. nreg:2 . 110 ... . . @rprr_load_msz # LD2B, LD2H, LD2W, LD2D; etc. LD_zpri 1010010 .. nreg:2 0 111 ... . . @rpri_load_msz +# SVE load and broadcast quadword (scalar plus scalar) +LD1RQ_zprr 1010010 .. 00 . 000 ... . . \ +@rprr_load_msz nreg=0 + +# SVE load and broadcast quadword (scalar plus immediate) +# LD1RQB, LD1RQH, LD1RQS, LD1RQD +LD1RQ_zpri 1010010 .. 00 0 001 ... . . \ +@rpri_load_msz nreg=0 + ### SVE Memory Store Group # SVE contiguous store (scalar plus immediate) -- 2.17.1
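The broadcast step of LD1RQ amounts to copying the first 16 bytes of the destination register over every following 16-byte chunk. A plain C sketch of that replication, assuming the register is a flat byte buffer; replicate_quadword is a hypothetical name and memcpy stands in for the gvec dup operation:

```c
#include <stddef.h>
#include <string.h>

/* Replicate the first quadword of zreg across the remaining vsz bytes. */
static void replicate_quadword(unsigned char *zreg, size_t vsz)
{
    for (size_t ofs = 16; ofs < vsz; ofs += 16) {
        memcpy(zreg + ofs, zreg, 16);
    }
}
```

When vsz is 16 the loop body never runs, matching the vsz > 16 guard in the patch.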
[Qemu-devel] [PATCH v5 05/35] target/arm: Implement SVE integer convert to floating-point
Signed-off-by: Richard Henderson --- target/arm/helper-sve.h| 30 + target/arm/sve_helper.c| 38 target/arm/translate-sve.c | 90 ++ target/arm/sve.decode | 22 ++ 4 files changed, 180 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index b768128951..185112e1d2 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -720,6 +720,36 @@ DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_scvt_dh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_scvt_ss, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_scvt_sd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_scvt_ds, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_scvt_dd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_ucvt_hh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_ucvt_sh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_ucvt_dh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_ucvt_ss, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_ucvt_sd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_ucvt_ds, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_ucvt_dd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(sve_ld1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ld2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ld3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index f20774e240..a2f034820a 100644 --- 
a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2811,6 +2811,44 @@ uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc) return predtest_ones(d, oprsz, esz_mask); } +/* Fully general two-operand expander, controlled by a predicate, + * With the extra float_status parameter. + */ +#define DO_ZPZ_FP(NAME, TYPE, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vg, void *status, uint32_t desc) \ +{ \ +intptr_t i = simd_oprsz(desc);\ +uint64_t *g = vg; \ +do { \ +uint64_t pg = g[(i - 1) >> 6];\ +do { \ +i -= sizeof(TYPE);\ +if (likely((pg >> (i & 63)) & 1)) { \ +TYPE nn = *(TYPE *)(vn + H(i)); \ +*(TYPE *)(vd + H(i)) = OP(nn, status);\ +} \ +} while (i & 63); \ +} while (i != 0); \ +} + +DO_ZPZ_FP(sve_scvt_hh, uint16_t, H1_2, int16_to_float16) +DO_ZPZ_FP(sve_scvt_sh, uint32_t, H1_4, int32_to_float16) +DO_ZPZ_FP(sve_scvt_ss, uint32_t, H1_4, int32_to_float32) +DO_ZPZ_FP(sve_scvt_sd, uint64_t, , int32_to_float64) +DO_ZPZ_FP(sve_scvt_dh, uint64_t, , int64_to_float16) +DO_ZPZ_FP(sve_scvt_ds, uint64_t, , int64_to_float32) +DO_ZPZ_FP(sve_scvt_dd, uint64_t, , int64_to_float64) + +DO_ZPZ_FP(sve_ucvt_hh, uint16_t, H1_2, uint16_to_float16) +DO_ZPZ_FP(sve_ucvt_sh, uint32_t, H1_4, uint32_to_float16) +DO_ZPZ_FP(sve_ucvt_ss, uint32_t, H1_4, uint32_to_float32) +DO_ZPZ_FP(sve_ucvt_sd, uint64_t, , uint32_to_float64) +DO_ZPZ_FP(sve_ucvt_dh, uint64_t, , uint64_to_float16) +DO_ZPZ_FP(sve_ucvt_ds, uint64_t, , uint64_to_float32) +DO_ZPZ_FP(sve_ucvt_dd, uint64_t, , uint64_to_float64) + +#undef DO_ZPZ_FP + /* * Load contiguous data, protected by a governing predicate. */ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 83de87ee0e..7639e589f5 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3425,6 +3425,96 @@ DO_FP3(FRSQRTS, rsqrts) #undef DO_FP3 + +/* + *** SVE Floating Point Unary Operations
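The DO_ZPZ_FP loop above walks the vector backwards, one 64-bit predicate word at a time, and tests the predicate bit that belongs to the lowest byte of each element. A minimal standalone sketch of the same walk for the int32 to float32 case follows. It assumes a little-endian host (so the H macros drop out) and uses a plain C cast in place of the softfloat int32_to_float32(nn, status) call; scvt_ss_sketch is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Predicated int32 -> float conversion over oprsz bytes.
 * One predicate bit per vector byte; only the bit for an element's
 * lowest byte is tested.  Inactive elements of vd are left unchanged.
 */
static void scvt_ss_sketch(void *vd, const void *vn,
                           const uint64_t *vg, size_t oprsz)
{
    size_t i = oprsz;

    assert(oprsz != 0 && oprsz % 16 == 0);
    do {
        uint64_t pg = vg[(i - 1) >> 6];   /* predicate word covering byte i-1 */
        do {
            i -= sizeof(int32_t);
            if ((pg >> (i & 63)) & 1) {
                int32_t nn;
                float r;
                memcpy(&nn, (const char *)vn + i, sizeof(nn));
                r = (float)nn;            /* stand-in for the softfloat call */
                memcpy((char *)vd + i, &r, sizeof(r));
            }
        } while (i & 63);
    } while (i != 0);
}
```

For a 16-byte vector with predicate 0x101, elements 0 and 2 are converted and elements 1 and 3 keep whatever the destination held before.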
[Qemu-devel] [PATCH v5 06/35] target/arm: Implement SVE floating-point arithmetic (predicated)
Signed-off-by: Richard Henderson --- target/arm/helper-sve.h| 77 + target/arm/sve_helper.c| 89 ++ target/arm/translate-sve.c | 46 target/arm/sve.decode | 17 4 files changed, 229 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 185112e1d2..4097b55f0e 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -720,6 +720,83 @@ DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fadd_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fadd_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fadd_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fsub_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fsub_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fsub_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fmul_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmul_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmul_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fdiv_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fdiv_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fdiv_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fmin_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmin_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmin_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fmax_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmax_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) 
+DEF_HELPER_FLAGS_6(sve_fmax_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fminnum_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fminnum_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fminnum_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fmaxnum_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmaxnum_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmaxnum_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fabd_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fabd_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fabd_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fscalbn_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fscalbn_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fscalbn_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fmulx_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmulx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmulx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG, diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index a2f034820a..6bf30a3e66 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2811,6 +2811,95 @@ uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc) return predtest_ones(d, oprsz, esz_mask); } +/* Fully general three-operand expander, controlled by a predicate, + * With the extra float_status parameter. 
+ */ +#define DO_ZPZZ_FP(NAME, TYPE, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, \ + void *status, uint32_t desc) \ +{
[Qemu-devel] [PATCH v5 03/35] target/arm: Implement SVE Memory Contiguous Store Group
Signed-off-by: Richard Henderson --- target/arm/helper-sve.h| 29 + target/arm/sve_helper.c| 211 + target/arm/translate-sve.c | 65 target/arm/sve.decode | 38 +++ 4 files changed, 343 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 7338abbbcf..b768128951 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -794,3 +794,32 @@ DEF_HELPER_FLAGS_4(sve_ldnf1sdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ldnf1sds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ldnf1dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_st1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st4bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_st1hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st2hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st3hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st4hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_st1ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st2ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st3ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st4ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_st1dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st2dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st3dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st4dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_st1bh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st1bs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st1bd_r, TCG_CALL_NO_WG, void, 
env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_st1hs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st1hd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_st1sd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 6e1b539ce3..f20774e240 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3119,3 +3119,214 @@ DO_LDNF1(sds_r) DO_LDNF1(dd_r) #undef DO_LDNF1 + +/* + * Store contiguous data, protected by a governing predicate. + */ +#define DO_ST1(NAME, FN, TYPEE, TYPEM, H) \ +void HELPER(NAME)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc)\ +{ \ +intptr_t i, oprsz = simd_oprsz(desc); \ +intptr_t ra = GETPC(); \ +unsigned rd = simd_data(desc); \ +void *vd = &env->vfp.zregs[rd];\ +for (i = 0; i < oprsz; ) { \ +uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3));\ +do { \ +if (pg & 1) { \ +TYPEM m = *(TYPEE *)(vd + H(i)); \ +FN(env, addr, m, ra); \ +} \ +i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \ +addr += sizeof(TYPEM); \ +} while (i & 15); \ +} \ +} + +#define DO_ST1_D(NAME, FN, TYPEM) \ +void HELPER(NAME)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc)\ +{ \ +intptr_t i, oprsz = simd_oprsz(desc) / 8; \ +intptr_t ra = GETPC(); \ +unsigned rd = simd_data(desc); \ +uint64_t *d = &env->vfp.zregs[rd].d[0];\ +uint8_t *pg = vg; \ +for (i = 0; i < oprsz; i += 1) { \ +if (pg[H1(i)] & 1) { \ +FN(env, addr, d[i], ra); \ +} \ +addr += sizeof(TYPEM); \ +} \ +} + +#define DO_ST2(NAME, FN, TYPEE, TYPEM, H) \ +void HELPER(NAME)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc)\ +{ \ +
[Qemu-devel] [PATCH v5 00/35] target/arm SVE patches
This is the remainder of the SVE enablement patches, with an extra
bonus patch to enable ARMv8.2-DotProd.


r~


Richard Henderson (35):
  target/arm: Implement SVE Memory Contiguous Load Group
  target/arm: Implement SVE Contiguous Load, first-fault and no-fault
  target/arm: Implement SVE Memory Contiguous Store Group
  target/arm: Implement SVE load and broadcast quadword
  target/arm: Implement SVE integer convert to floating-point
  target/arm: Implement SVE floating-point arithmetic (predicated)
  target/arm: Implement SVE FP Multiply-Add Group
  target/arm: Implement SVE Floating Point Accumulating Reduction Group
  target/arm: Implement SVE load and broadcast element
  target/arm: Implement SVE store vector/predicate register
  target/arm: Implement SVE scatter stores
  target/arm: Implement SVE prefetches
  target/arm: Implement SVE gather loads
  target/arm: Implement SVE first-fault gather loads
  target/arm: Implement SVE scatter store vector immediate
  target/arm: Implement SVE floating-point compare vectors
  target/arm: Implement SVE floating-point arithmetic with immediate
  target/arm: Implement SVE Floating Point Multiply Indexed Group
  target/arm: Implement SVE FP Fast Reduction Group
  target/arm: Implement SVE Floating Point Unary Operations-Unpredicated Group
  target/arm: Implement SVE FP Compare with Zero Group
  target/arm: Implement SVE floating-point trig multiply-add coefficient
  target/arm: Implement SVE floating-point convert precision
  target/arm: Implement SVE floating-point convert to integer
  target/arm: Implement SVE floating-point round to integral value
  target/arm: Implement SVE floating-point unary operations
  target/arm: Implement SVE MOVPRFX
  target/arm: Implement SVE floating-point complex add
  target/arm: Implement SVE fp complex multiply add
  target/arm: Pass index to AdvSIMD FCMLA (indexed)
  target/arm: Implement SVE fp complex multiply add (indexed)
  target/arm: Implement SVE dot product (vectors)
  target/arm: Implement SVE dot product (indexed)
  target/arm: Enable SVE for aarch64-linux-user
  target/arm: Implement ARMv8.2-DotProd

 target/arm/cpu.h           |    1 +
 target/arm/helper-sve.h    |  682 ++
 target/arm/helper.h        |   44 +-
 linux-user/elfload.c       |    1 +
 target/arm/cpu.c           |    8 +
 target/arm/cpu64.c         |    2 +
 target/arm/helper.c        |    2 +-
 target/arm/sve_helper.c    | 1827
 target/arm/translate-a64.c |   57 +-
 target/arm/translate-sve.c | 1691 -
 target/arm/translate.c     |   81 +-
 target/arm/vec_helper.c    |  283 +-
 target/arm/sve.decode      |  422 +
 13 files changed, 5039 insertions(+), 62 deletions(-)

--
2.17.1
[Qemu-devel] [PATCH v5 02/35] target/arm: Implement SVE Contiguous Load, first-fault and no-fault
Signed-off-by: Richard Henderson --- target/arm/helper-sve.h| 40 ++ target/arm/sve_helper.c| 156 + target/arm/translate-sve.c | 69 target/arm/sve.decode | 6 ++ 4 files changed, 271 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index fcc9ba5f50..7338abbbcf 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -754,3 +754,43 @@ DEF_HELPER_FLAGS_4(sve_ld1hds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ld1sdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ld1sds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ldff1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldff1bhu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldff1bsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldff1bdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldff1bhs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldff1bss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldff1bds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ldff1hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldff1hsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldff1hdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldff1hss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldff1hds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ldff1ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldff1sdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldff1sds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ldff1dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ldnf1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldnf1bhu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) 
+DEF_HELPER_FLAGS_4(sve_ldnf1bsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldnf1bdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldnf1bhs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldnf1bss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldnf1bds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ldnf1hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldnf1hsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldnf1hdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldnf1hss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldnf1hds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ldnf1ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldnf1sdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldnf1sds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ldnf1dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 4e6ad282f9..6e1b539ce3 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2963,3 +2963,159 @@ DO_LD4(sve_ld4dd_r, cpu_ldq_data_ra, uint64_t, uint64_t, ) #undef DO_LD2 #undef DO_LD3 #undef DO_LD4 + +/* + * Load contiguous data, first-fault and no-fault. + */ + +#ifdef CONFIG_USER_ONLY + +/* Fault on byte I. All bits in FFR from I are cleared. The vector + * result from I is CONSTRAINED UNPREDICTABLE; we choose the MERGE + * option, which leaves subsequent data unchanged. 
+ */ +static void __attribute__((cold)) +record_fault(CPUARMState *env, intptr_t i, intptr_t oprsz) +{ +uint64_t *ffr = env->vfp.pregs[FFR_PRED_NUM].p; +if (i & 63) { +ffr[i / 64] &= MAKE_64BIT_MASK(0, (i & 63) - 1); +i = ROUND_UP(i, 64); +} +for (; i < oprsz; i += 64) { +ffr[i / 64] = 0; +} +} + +/* Hold the mmap lock during the operation so that there is no race + * between page_check_range and the load operation. We expect the + * usual case to have no faults at all, so we check the whole range + * first and if successful defer to the normal load operation. + * + * TODO: Change mmap_lock to a rwlock so that multiple readers + * can run simultaneously. This will probably help other uses + * within QEMU as well. + */ +#define DO_LDFF1(PART, FN, TYPEE, TYPEM, H) \ +static void do_sve_ldff1##PART(CPUARMState *env, void *vd, void *vg,\ + target_ulong addr, intptr_t oprsz, \ + bool first, uintptr_t ra)\ +{ \ +intptr_t i = 0; \ +do {
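The record_fault comment above says that all FFR bits from byte I onward are cleared. A standalone sketch of that stated behaviour is below, with one predicate bit per vector byte so that the byte index doubles as the bit index. Note that ffr_clear_from is a hypothetical name and the mask expression is my own reading of the comment; it is not claimed to be identical to the MAKE_64BIT_MASK call in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Clear predicate bits [i, oprsz) in ffr, leaving bits [0, i) alone.
 * ffr must hold at least (oprsz + 63) / 64 words.
 */
static void ffr_clear_from(uint64_t *ffr, size_t i, size_t oprsz)
{
    assert(ffr != NULL && i < oprsz);
    if (i & 63) {
        /* Partial word: keep only the bits strictly below i. */
        ffr[i / 64] &= (UINT64_C(1) << (i & 63)) - 1;
        i = (i + 63) & ~(size_t)63;       /* round up to the next word */
    }
    for (; i < oprsz; i += 64) {
        ffr[i / 64] = 0;
    }
}
```

For a 128-byte vector with every bit set, a fault at byte 70 leaves word 0 intact and keeps only the low six bits of word 1.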
[Qemu-devel] [PATCH v5 01/35] target/arm: Implement SVE Memory Contiguous Load Group
Signed-off-by: Richard Henderson --- target/arm/helper-sve.h| 35 + target/arm/sve_helper.c| 153 + target/arm/translate-sve.c | 121 + target/arm/sve.decode | 34 + 4 files changed, 343 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 2e76084992..fcc9ba5f50 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -719,3 +719,38 @@ DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_ld1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld4bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ld1hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld2hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld3hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld4hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ld1ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld2ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld3ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld4ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ld1dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld2dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld3dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld4dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ld1bhu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1bsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1bdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1bhs_r, 
TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1bss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1bds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ld1hsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1hdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1hss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1hds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ld1sdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1sds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 128bbf9b04..4e6ad282f9 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2810,3 +2810,156 @@ uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc) return predtest_ones(d, oprsz, esz_mask); } + +/* + * Load contiguous data, protected by a governing predicate. + */ +#define DO_LD1(NAME, FN, TYPEE, TYPEM, H) \ +static void do_##NAME(CPUARMState *env, void *vd, void *vg, \ + target_ulong addr, intptr_t oprsz, \ + uintptr_t ra)\ +{ \ +intptr_t i = 0;\ +do { \ +uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3));\ +do { \ +TYPEM m = 0; \ +if (pg & 1) { \ +m = FN(env, addr, ra); \ +} \ +*(TYPEE *)(vd + H(i)) = m; \ +i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \ +addr += sizeof(TYPEM); \ +} while (i & 15); \ +} while (i < oprsz); \ +} \ +void HELPER(NAME)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc)\ +{ \ +do_##NAME(env, &env->vfp.zregs[simd_data(desc)], vg, \ + addr, simd_oprsz(desc), GETPC());\ +} + +#define DO_LD2(NAME, FN, TYPEE, TYPEM, H) \ +void HELPER(NAME)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc)\ +{ \ +intptr_t i, oprsz = simd_oprsz(desc); \ +intptr_t ra = GETPC();
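The DO_LD1 loop above consumes 16 predicate bits per 16 vector bytes, writes zero to inactive elements, and advances the memory address whether or not the element was active. A rough standalone sketch of that shape for 32-bit elements follows. It reads guest memory with memcpy from a host buffer instead of cpu_ld*_data_ra, ignores the H macros, and ld1_words_sketch is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Predicated contiguous load of 32-bit elements.
 * vg holds one predicate bit per vector byte; inactive elements
 * are written as zero and the source cursor always advances.
 */
static void ld1_words_sketch(void *vd, const uint8_t *vg,
                             const void *mem, size_t oprsz)
{
    const uint8_t *src = mem;
    size_t i = 0;

    assert(oprsz != 0 && oprsz % 16 == 0);
    do {
        /* 16 predicate bits cover the next 16 vector bytes. */
        uint16_t pg = (uint16_t)(vg[i >> 3] | (vg[(i >> 3) + 1] << 8));
        do {
            uint32_t m = 0;
            if (pg & 1) {
                memcpy(&m, src, sizeof(m));
            }
            memcpy((char *)vd + i, &m, sizeof(m));
            i += sizeof(uint32_t);
            pg >>= sizeof(uint32_t);      /* skip the element's other bits */
            src += sizeof(uint32_t);
        } while (i & 15);
    } while (i < oprsz);
}
```

With predicate bytes {0x11, 0x00} the first two words are loaded and the last two are zeroed.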
Re: [Qemu-devel] [PATCH v2 for-2.11.2] spapr: make pseries-2.11 the default machine type
On Wed, Jun 20, 2018 at 02:54:15PM +0200, Greg Kurz wrote: > The spapr capability framework was introduced in QEMU 2.12. It allows > to have an explicit control on how host features are exposed to the > guest. This is especially needed to handle migration between hetero- > geneous hosts (eg, POWER8 to POWER9). It is also used to expose fixes/ > workarounds against speculative execution vulnerabilities to guests. > The framework was hence backported to QEMU 2.11.1, especially these > commits: > > 0fac4aa93074 spapr: Add pseries-2.12 machine type > 9070f408f491 spapr: Treat Hardware Transactional Memory (HTM) as an > optional capability > > 0fac4aa93074 has the confusing effect of making pseries-2.12 the default > machine type for QEMU 2.11.1, instead of the expected pseries-2.11. This > patch changes the default machine back to pseries-2.11. > > Unfortunately, 9070f408f491 enforces the HTM capability for pseries-2.11 > to be enabled by default, ie, when not passing cap-htm on the command > line. This breaks several 'make check' testcases that run qemu-system-ppc64 > with TCG. > > The only sane way to fix this is to adapt the impacted testcases so that > they all pass cap-htm=off in this case. This patch does that as well. > > Signed-off-by: Greg Kurz > --- > v2: - have the testcases to pass cap-htm=off instead of violating the > capabilities logic. > > Upstream doesn't need anything like that since newer pseries machine types > start with HTM disabled by default. This is really a oneshot fix for 2.11.2, > and I've tried to make it as small as possible. > > This is a full replacement of the previous version. It is based on Mike's > staging tree for 2.11: Thanks for fixing this up Reviewed-by: David Gibson Btw, 2.11.z should probably have the 2.12 machine type removed entirely, as well as (obviously) not being the default. Not within scope for this patch, though. 
> > https://github.com/mdroth/qemu/commits/stable-2.11-staging 72cc467aabd1a2 > --- > hw/ppc/spapr.c |4 ++-- > tests/boot-serial-test.c |8 ++-- > tests/migration-test.c |4 ++-- > tests/prom-env-test.c|6 -- > tests/pxe-test.c | 10 +++--- > 5 files changed, 21 insertions(+), 11 deletions(-) > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c > index 1a2dd1f597d9..6499a867520f 100644 > --- a/hw/ppc/spapr.c > +++ b/hw/ppc/spapr.c > @@ -3820,7 +3820,7 @@ static void > spapr_machine_2_12_class_options(MachineClass *mc) > /* Defaults for the latest behaviour inherited from the base class */ > } > > -DEFINE_SPAPR_MACHINE(2_12, "2.12", true); > +DEFINE_SPAPR_MACHINE(2_12, "2.12", false); > > /* > * pseries-2.11 > @@ -3842,7 +3842,7 @@ static void > spapr_machine_2_11_class_options(MachineClass *mc) > SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_11); > } > > -DEFINE_SPAPR_MACHINE(2_11, "2.11", false); > +DEFINE_SPAPR_MACHINE(2_11, "2.11", true); > > /* > * pseries-2.10 > diff --git a/tests/boot-serial-test.c b/tests/boot-serial-test.c > index c935d69824bd..98c5462377f8 100644 > --- a/tests/boot-serial-test.c > +++ b/tests/boot-serial-test.c > @@ -73,18 +73,22 @@ static void test_machine(const void *data) > const testdef_t *test = data; > char tmpname[] = "/tmp/qtest-boot-serial-XX"; > int fd; > +const char *machine_props; > > fd = mkstemp(tmpname); > g_assert(fd != -1); > > +machine_props = strcmp(test->machine, "pseries") == 0 ? ",cap-htm=off" : > ""; > + > /* > * Make sure that this test uses tcg if available: It is used as a > * fast-enough smoketest for that. 
> */ > -global_qtest = qtest_startf("-M %s,accel=tcg:kvm " > +global_qtest = qtest_startf("-M %s%s,accel=tcg:kvm " > "-chardev file,id=serial0,path=%s " > "-no-shutdown -serial chardev:serial0 %s", > -test->machine, tmpname, test->extra); > +test->machine, machine_props, tmpname, > +test->extra); > unlink(tmpname); > > check_guest_output(test, fd); > diff --git a/tests/migration-test.c b/tests/migration-test.c > index be598d3257ba..906d29b38241 100644 > --- a/tests/migration-test.c > +++ b/tests/migration-test.c > @@ -460,12 +460,12 @@ static void test_migrate_start(QTestState **from, > QTestState **to, > /* On ppc64, the test only works with kvm-hv, but not with kvm-pr */ > accel = access("/sys/module/kvm_hv", F_OK) ? "tcg" : "kvm:tcg"; > init_bootfile_ppc(bootpath); > -cmd_src = g_strdup_printf("-machine accel=%s -m 256M" > +cmd_src = g_strdup_printf("-machine accel=%s,cap-htm=off -m 256M" >" -name pcsource,debug-threads=on" >" -serial file:%s/src_serial" >" -drive file=%s,if=pflash,format=raw", >
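As a small illustration of the boot-serial-test change quoted above: the machine argument gets ",cap-htm=off" appended only when the machine is "pseries". The sketch below does just that string construction with snprintf; build_machine_arg is a hypothetical helper and not part of the test suite, which passes the pieces to qtest_startf directly.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build the -M argument, adding cap-htm=off for pseries only. */
static int build_machine_arg(char *buf, size_t len, const char *machine)
{
    const char *props;

    assert(buf != NULL && len > 0 && machine != NULL);
    props = strcmp(machine, "pseries") == 0 ? ",cap-htm=off" : "";
    return snprintf(buf, len, "-M %s%s,accel=tcg:kvm", machine, props);
}
```

Other machine types are left untouched, so the extra property never reaches a machine that does not know about it.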
Re: [Qemu-devel] [PATCH 0/9] spapr: Clean up pagesize handling
On Mon, Jun 18, 2018 at 04:35:57PM +1000, David Gibson wrote:
> Currently the "pseries" machine type will (usually) advertise
> different pagesizes to the guest when running under KVM and TCG, which
> is not how things are supposed to work.
>
> This comes from poor handling of hardware limitations which mean that
> under KVM HV the guest is unable to use pagesizes larger than those
> backing the guest's RAM on the host side.
>
> The new scheme turns things around by having an explicit machine
> parameter controlling the largest page size that the guest is allowed
> to use. This limitation applies regardless of accelerator. When
> we're running on KVM HV we ensure that our backing pages are adequate
> to supply the requested guest page sizes, rather than adjusting the
> guest page sizes based on what KVM can supply.
>
> This means that in order to use hugepages in a PAPR guest it's
> necessary to add a "cap-hpt-max-page-size=16m" machine parameter as
> well as setting the mem-path correctly. This is a bit more work on
> the user and/or management side, but results in consistent behaviour
> so I think it's worth it.
>
> Longer term, we might also use this parameter to control IOMMU page
> sizes. But, I'm still working out how restrictions deriving from the
> guest kernel, host kernel and hardware capabilities all interact here.
>
> This applies on top of my ppc-for-3.0 tree.

Greg, Cédric, could you try to review this series pretty soon? I'd
really like to get it merged, because it's the basis for a number of
fixes for assorted problems with hugepage behaviour.

>
> Changes since RFC:
>   * Add preliminary cleanups to allow us to evaluate effective
>     capabilities levels earlier.
>   * Don't try to remove double resetting of cpus. It doesn't quite
>     work, and is no longer necessary with the above.
>   * Some user-friendliness improvements: use "hpt-max-page-size"
>     instead of the cryptic "hpt-mps", and take an actual page size
>     (allowing k/m/g suffixes) instead of a shift
>
> David Gibson (9):
>   target/ppc: Allow cpu compatiblity checks based on type, not instance
>   spapr: Compute effective capability values earlier
>   spapr: Add cpu_apply hook to capabilities
>   target/ppc: Add kvmppc_hpt_needs_host_contiguous_pages() helper
>   spapr: Maximum (HPT) pagesize property
>   spapr: Use maximum page size capability to simplify memory backend
>     checking
>   target/ppc: Add ppc_hash64_filter_pagesizes()
>   spapr: Limit available pagesizes to provide a consistent guest
>     environment
>   spapr: Don't rewrite mmu capabilities in KVM mode
>
>  hw/ppc/spapr.c          |  45 +++-
>  hw/ppc/spapr_caps.c     | 156
>  hw/ppc/spapr_cpu_core.c |   4 ++
>  include/hw/ppc/spapr.h  |  11 ++-
>  target/ppc/compat.c     |  27 +--
>  target/ppc/cpu.h        |   4 ++
>  target/ppc/kvm.c        | 146 ++---
>  target/ppc/kvm_ppc.h    |  11 ++-
>  target/ppc/mmu-hash64.c |  59 +++
>  target/ppc/mmu-hash64.h |   3 +
>  10 files changed, 349 insertions(+), 117 deletions(-)
>

--
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you. NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
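The changelog above mentions taking an actual page size with k/m/g suffixes instead of a shift. To make the relationship concrete, here is a hedged sketch that turns a string such as "16m" back into the corresponding shift. page_size_to_shift is a hypothetical helper written for this note; it is not how the series parses the property, and it does not guard against overflow on absurdly large inputs.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Return log2 of the size in str, or -1 if it is not a power-of-two size. */
static int page_size_to_shift(const char *str)
{
    char *end;
    unsigned long long val;
    int shift = 0;

    assert(str != NULL);
    val = strtoull(str, &end, 10);
    if (end == str) {
        return -1;                        /* no digits at all */
    }
    switch (tolower((unsigned char)*end)) {
    case 'k': val <<= 10; end++; break;
    case 'm': val <<= 20; end++; break;
    case 'g': val <<= 30; end++; break;
    case '\0': break;
    default: return -1;
    }
    if (*end != '\0' || val == 0 || (val & (val - 1)) != 0) {
        return -1;                        /* trailing junk or not a power of two */
    }
    while ((val >> shift) != 1) {
        shift++;
    }
    return shift;
}
```

So "cap-hpt-max-page-size=16m" from the cover letter corresponds to a shift of 24, and a 64k page to a shift of 16.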
Re: [Qemu-devel] [RFC 3/6] kvm: add kvm_get_max_vm_phys_shift
On Wed, Jun 20, 2018 at 03:07:30PM +0200, Eric Auger wrote:
> Add the kvm_get_max_vm_phys_shift() helper that returns the
> log of the maximum IPA size supported by KVM. This capability
> needs to be known to create the VM with a correct IPA max size
> (kvm_type passed along KVM_CREATE_VM ioctl).
>
> Signed-off-by: Eric Auger
> ---
>  accel/kvm/kvm-all.c    | 7 +++
>  accel/stubs/kvm-stub.c | 5 +
>  include/sysemu/kvm.h   | 1 +
>  3 files changed, 13 insertions(+)
>
> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index 0590986..137c38e 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -280,6 +280,13 @@ static int kvm_set_user_memory_region(KVMMemoryListener
> *kml, KVMSlot *slot)
>      return ret;
>  }
>
> +int kvm_get_max_vm_phys_shift(MachineState *ms)
> +{
> +    KVMState *s = KVM_STATE(ms->accelerator);
> +
> +    return kvm_ioctl(s, KVM_ARM_GET_MAX_VM_PHYS_SHIFT, 0);

From the ioctl() name, I'm assuming this is an ARM specific call. In
which case, shouldn't it be in target/arm/kvm.c, not kvm-all.c?

> +}
> +
>  int kvm_destroy_vcpu(CPUState *cpu)
>  {
>      KVMState *s = kvm_state;
> diff --git a/accel/stubs/kvm-stub.c b/accel/stubs/kvm-stub.c
> index 02d5170..7575ba7 100644
> --- a/accel/stubs/kvm-stub.c
> +++ b/accel/stubs/kvm-stub.c
> @@ -161,6 +161,11 @@ bool kvm_has_free_slot(MachineState *ms)
>      return false;
>  }
>
> +int kvm_get_max_vm_phys_shift(MachineState *ms)
> +{
> +    return 0;
> +}
> +
>  void kvm_init_cpu_signals(CPUState *cpu)
>  {
>      abort();
> diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
> index 0b64b8e..240e3d9 100644
> --- a/include/sysemu/kvm.h
> +++ b/include/sysemu/kvm.h
> @@ -206,6 +206,7 @@ extern KVMState *kvm_state;
>  /* external API */
>
>  bool kvm_has_free_slot(MachineState *ms);
> +int kvm_get_max_vm_phys_shift(MachineState *ms);
>  bool kvm_has_sync_mmu(void);
>  int kvm_has_vcpu_events(void);
>  int kvm_has_robust_singlestep(void);

--
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you. NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
Re: [Qemu-devel] [RFC 2/6] hw/boards: Add a MachineState parameter to kvm_type callback
On Wed, Jun 20, 2018 at 03:07:29PM +0200, Eric Auger wrote: > On ARM, the kvm_type will be resolved by querying the KVMState. > Let's add the MachineState handle to the callback so that we > can retrieve the KVMState handle. in kvm_init, when the callback > is called, the kvm_state variable is not yet set. > > Signed-off-by: Eric Auger

ppc parts Acked-by: David Gibson

> --- > accel/kvm/kvm-all.c | 2 +- > hw/ppc/mac_newworld.c | 2 +- > hw/ppc/mac_oldworld.c | 2 +- > hw/ppc/spapr.c| 2 +- > include/hw/boards.h | 2 +- > 5 files changed, 5 insertions(+), 5 deletions(-) > > diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c > index ffee68e..0590986 100644 > --- a/accel/kvm/kvm-all.c > +++ b/accel/kvm/kvm-all.c > @@ -1550,7 +1550,7 @@ static int kvm_init(MachineState *ms) > > kvm_type = qemu_opt_get(qemu_get_machine_opts(), "kvm-type"); > if (mc->kvm_type) { > -type = mc->kvm_type(kvm_type); > +type = mc->kvm_type(ms, kvm_type); > } else if (kvm_type) { > ret = -EINVAL; > fprintf(stderr, "Invalid argument kvm-type=%s\n", kvm_type); > diff --git a/hw/ppc/mac_newworld.c b/hw/ppc/mac_newworld.c > index 744acdf..1409d9e 100644 > --- a/hw/ppc/mac_newworld.c > +++ b/hw/ppc/mac_newworld.c > @@ -492,7 +492,7 @@ static void ppc_core99_init(MachineState *machine) > qemu_register_boot_set(fw_cfg_boot_set, fw_cfg); > } > > -static int core99_kvm_type(const char *arg) > +static int core99_kvm_type(MachineState *ms, const char *arg) > { > /* Always force PR KVM */ > return 2; > diff --git a/hw/ppc/mac_oldworld.c b/hw/ppc/mac_oldworld.c > index 4608bab..1211fcd 100644 > --- a/hw/ppc/mac_oldworld.c > +++ b/hw/ppc/mac_oldworld.c > @@ -363,7 +363,7 @@ static void ppc_heathrow_init(MachineState *machine) > qemu_register_boot_set(fw_cfg_boot_set, fw_cfg); > } > > -static int heathrow_kvm_type(const char *arg) > +static int heathrow_kvm_type(MachineState *ms, const char *arg) > { > /* Always force PR KVM */ > return 2; > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c > index 
f5d..faf078e 100644 > --- a/hw/ppc/spapr.c > +++ b/hw/ppc/spapr.c > @@ -2834,7 +2834,7 @@ static void spapr_machine_init(MachineState *machine) > } > } > > -static int spapr_kvm_type(const char *vm_type) > +static int spapr_kvm_type(MachineState *ms, const char *vm_type) > { > if (!vm_type) { > return 0; > diff --git a/include/hw/boards.h b/include/hw/boards.h > index ef7457f..78f90a1 100644 > --- a/include/hw/boards.h > +++ b/include/hw/boards.h > @@ -170,7 +170,7 @@ struct MachineClass { > void (*init)(MachineState *state); > void (*reset)(void); > void (*hot_add_cpu)(const int64_t id, Error **errp); > -int (*kvm_type)(const char *arg); > +int (*kvm_type)(MachineState *ms, const char *arg); > > BlockInterfaceType block_default_type; > int units_per_default_bus;
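For readers skimming the thread, the shape of the change can be sketched outside QEMU. The following is only an illustration under stated assumptions: the struct layouts, the `max_phys_shift` field and the `resolve_kvm_type()` helper are hypothetical stand-ins, not QEMU code. It shows a callback that ignores the new `ms` argument (as the ppc callbacks in the patch do) next to one that derives its answer from per-machine state, which is what the extra parameter makes possible.

```c
#include <assert.h>
#include <stddef.h>

typedef struct MachineState MachineState;

/* Hypothetical, trimmed-down class: only the callback discussed above. */
typedef struct MachineClass {
    int (*kvm_type)(MachineState *ms, const char *arg);
} MachineClass;

/* Hypothetical, trimmed-down state; max_phys_shift is an invented field. */
struct MachineState {
    const MachineClass *mc;
    int max_phys_shift;
};

/* Callback that ignores the machine, like the "always force" ppc ones. */
int fixed_kvm_type(MachineState *ms, const char *arg)
{
    (void)ms;
    (void)arg;
    return 2;
}

/* Callback that needs the machine handle to compute its answer. */
int per_machine_kvm_type(MachineState *ms, const char *arg)
{
    (void)arg;
    return ms->max_phys_shift;
}

/*
 * Mirrors the branch structure quoted from kvm_init(): use the callback
 * if there is one, otherwise reject an explicit kvm-type argument.
 * Returns 0 on success and -1 for the "invalid argument" case.
 */
int resolve_kvm_type(MachineState *ms, const char *arg, int *type)
{
    assert(ms && ms->mc && type);
    if (ms->mc->kvm_type) {
        *type = ms->mc->kvm_type(ms, arg);
        return 0;
    }
    if (arg) {
        return -1;
    }
    *type = 0;
    return 0;
}
```

Passing the handle explicitly avoids relying on a global that, per the commit message, is not yet set when the callback runs.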
Re: [Qemu-devel] [PATCH v2 03/10] qcow2/bitmap: cache bm_list
On 06/20/2018 09:04 AM, Vladimir Sementsov-Ogievskiy wrote: > 13.06.2018 05:06, John Snow wrote: >> We don't need to re-read this list every time, exactly. We can keep it >> cached >> and delete our copy when we flush to disk. >> >> Because we don't try to flush bitmaps on close if there's nothing to >> flush, >> add a new conditional to delete the state anyway for a clean exit. >> >> Signed-off-by: John Snow >> --- >> block/qcow2-bitmap.c | 74 >> ++-- >> block/qcow2.c | 2 ++ >> block/qcow2.h | 2 ++ >> 3 files changed, 52 insertions(+), 26 deletions(-) >> >> diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c >> index 85c1b5afe3..5ae9b17928 100644 >> --- a/block/qcow2-bitmap.c >> +++ b/block/qcow2-bitmap.c >> @@ -636,6 +636,34 @@ fail: >> return NULL; >> } >> +static Qcow2BitmapList *get_bitmap_list(BlockDriverState *bs, Error >> **errp) >> +{ >> + BDRVQcow2State *s = bs->opaque; >> + Qcow2BitmapList *bm_list; >> + >> + if (s->bitmap_list) { >> + return (Qcow2BitmapList *)s->bitmap_list; >> + } >> + >> + if (s->nb_bitmaps) { >> + bm_list = bitmap_list_load(bs, errp); >> + } else { >> + bm_list = bitmap_list_new(); >> + } >> + s->bitmap_list = bm_list; > > may be, we shouldn't cache it in inactive mode. However, I think we'll > finally will not load bitmaps in inactive mode and drop the on > inactivate, so it would not matter.. >

Do we not load bitmaps when BDRV_O_INACTIVE is set? It looks like we do?

(From your subsequent email):

> > really, now it would be a problem: we can start in inactive mode, load > nothing, and then we can't reload bitmaps; my fix in > https://lists.gnu.org/archive/html/qemu-devel/2018-06/msg03254.html > will not work after this patch. >

So we load nothing because, when we opened the image RO, we saw IN_USE bitmaps (or none at all) and decided not to load them. qcow2_do_open, however, records that it has "loaded the bitmaps."
Later, when we reopen RW, we have a cached bm_list that doesn't include this bitmap, so we don't mark it as IN_USE again or update the header. (Wait, if you are worried about the bitmap's data having been changed, why do we not reload the bitmap data here too?)

Now, patch 06 changes the cache so that we load all bitmaps and not just the ones that are not IN_USE. On RW reload, I treat any such IN_USE bitmaps as a reason to prohibit the RW reload. However, this is broken too, because I will miss any new flags that exist on-disk, so this function should never use the cached data.

I'm confused, though: the comment at the call to reopen_bitmaps_rw_hint says: "It's some kind of reopen. There are no known cases where we need to reload bitmaps in such a situation, so it's safer to skip them. Moreover, if we have some readonly bitmaps and we are reopening for rw we should reopen bitmaps correspondingly."

Why do we assume that bitmap data cannot change while BDRV_O_INACTIVE is set? Is that not wrong?

> Hm, I've understood the following problem: cache becomes incorrect after > failed update_ext_header_and_dir or update_ext_header_and_dir_in_place > operations. (after failed qcow2_remove_persistent_dirty_bitmap or > qcow2_reopen_bitmaps_rw_hint) > > And this comes from incomplete architecture after the patch: > On the one hand, we work with one singleton bitmap_list, and loading > part is refactored to do it safely. On the other hand, storing functions > still have old behavior, they work with bitmap list like with their own > local variable.

Yeah, I see it. Dropping the cache on such errors is fine.

> > So, we have safe mechanism to read list through the cache. We need also > safe mechanism to update list both in cache and in file. > > There are two possible variants: > > 1. drop cache after failed store > 2. rollback cache after failed store > > 1 looks simpler.. 
> > Also, we should drop cache on inactivate (we do this) and should not > create cache in inactive mode, because the other process may change the > image. > > Hm. may be, it is better to work with s->bitmap_list directly? In this > case it will be more obvious that it is the cache, not local variable. > And we will work with it like with other "parts of extension cache" > s->nb_bitmaps, s->bitmap_directory_offset ... > > After the patch, functions update_ext_header_and_dir* becomes strange: > > 1. before the patch, they take external parameter - bm_list, and by this > parameter they updated the file and cached s->nb_bitmaps, s->bitmap_*, .. > 2. after the patch, they take parameter (actually s->bitmap_list) of > same nature like s->nb_bitmap, and update s->nb_bitmap from it. > Yeah, if we do decide that keeping a cache is the right thing, some of the helper functions could be refactored or simplified a little to take advantage of the new paradigm. > Sorry for being late and for disordered stream of thoughts. Is this > patch really needed for the whole series? >
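Variant 1 from the discussion above ("drop cache after failed store") can be sketched in isolation. This is a minimal model under stated assumptions, not qcow2 code: `ImageState`, `BitmapList`, `get_list()`, `drop_cache()` and `add_bitmap()` are hypothetical names, and the "disk" is just an integer. The point it illustrates is that the cached list is modified first, and if the store fails the cache is discarded so the next reader reloads the last state that actually reached disk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the cached bitmap list. */
typedef struct BitmapList {
    int nb_bitmaps;
} BitmapList;

/* Hypothetical stand-in for the per-image state holding the cache. */
typedef struct ImageState {
    BitmapList *cached_list;   /* NULL means "not cached, reload" */
    int on_disk_nb_bitmaps;    /* pretend on-disk bitmap directory */
    bool fail_next_store;      /* hook to simulate a failed store */
} ImageState;

/* Return the cached list, loading it from "disk" on first use. */
BitmapList *get_list(ImageState *s)
{
    assert(s);
    if (!s->cached_list) {
        s->cached_list = malloc(sizeof(*s->cached_list));
        if (s->cached_list) {
            s->cached_list->nb_bitmaps = s->on_disk_nb_bitmaps;
        }
    }
    return s->cached_list;
}

/* Forget the cached copy; the next get_list() reloads from "disk". */
void drop_cache(ImageState *s)
{
    free(s->cached_list);
    s->cached_list = NULL;
}

/* Update the cache, then store; on a failed store, drop the cache. */
int add_bitmap(ImageState *s)
{
    BitmapList *l = get_list(s);

    if (!l) {
        return -1;
    }
    l->nb_bitmaps++;
    if (s->fail_next_store) {
        s->fail_next_store = false;
        drop_cache(s);          /* cache no longer matches the disk */
        return -1;
    }
    s->on_disk_nb_bitmaps = l->nb_bitmaps;
    return 0;
}
```

Variant 2 (rollback) would instead need to remember the pre-update state of the list, which is why dropping looks simpler.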
[Qemu-devel] [PATCH v15 12/12] migration: Stop sending whole pages through main channel
We have to flush the QEMUFile because we now send very little data through that channel. Signed-off-by: Juan Quintela Reviewed-by: Dr. David Alan Gilbert --- migration/ram.c | 11 +++ 1 file changed, 3 insertions(+), 8 deletions(-) diff --git a/migration/ram.c b/migration/ram.c index 57d3ad1c45..6d0782623c 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -1824,15 +1824,7 @@ static int ram_save_page(RAMState *rs, PageSearchStatus *pss, bool last_stage) static int ram_save_multifd_page(RAMState *rs, RAMBlock *block, ram_addr_t offset) { -uint8_t *p; - -p = block->host + offset; - -ram_counters.transferred += save_page_header(rs, rs->f, block, - offset | RAM_SAVE_FLAG_PAGE); multifd_queue_page(block, offset); -qemu_put_buffer(rs->f, p, TARGET_PAGE_SIZE); -ram_counters.transferred += TARGET_PAGE_SIZE; ram_counters.normal++; return 1; @@ -3073,6 +3065,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque) multifd_send_sync_main(); qemu_put_be64(f, RAM_SAVE_FLAG_EOS); +qemu_fflush(f); return 0; } @@ -3155,6 +3148,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque) multifd_send_sync_main(); out: qemu_put_be64(f, RAM_SAVE_FLAG_EOS); +qemu_fflush(f); ram_counters.transferred += 8; ret = qemu_file_get_error(f); @@ -3208,6 +3202,7 @@ static int ram_save_complete(QEMUFile *f, void *opaque) multifd_send_sync_main(); qemu_put_be64(f, RAM_SAVE_FLAG_EOS); +qemu_fflush(f); return 0; } -- 2.17.1
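Why the explicit flush matters can be shown with a toy buffered writer. This is a sketch under stated assumptions, not the QEMUFile implementation: `BufWriter`, `writer_put_be64()`, `writer_flush()` and the buffer size are all hypothetical. Once the page payload no longer goes through the main channel, an 8-byte marker on its own never fills the buffer, so without a flush it would simply sit there and the peer would not see it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BUF_SIZE 32768  /* assumed buffer size, for illustration only */

/* Hypothetical buffered writer; "delivered" models bytes on the wire. */
typedef struct BufWriter {
    uint8_t buf[BUF_SIZE];
    size_t used;        /* bytes waiting in the buffer */
    size_t delivered;   /* bytes handed to the (imaginary) socket */
} BufWriter;

/* Hand everything buffered so far to the socket. */
void writer_flush(BufWriter *w)
{
    w->delivered += w->used;
    w->used = 0;
}

/* Queue a 64-bit big-endian value; flushes only when the buffer is full. */
void writer_put_be64(BufWriter *w, uint64_t v)
{
    int i;

    if (w->used + 8 > BUF_SIZE) {
        writer_flush(w);
    }
    for (i = 7; i >= 0; i--) {
        w->buf[w->used++] = (uint8_t)(v >> (8 * i));
    }
    assert(w->used <= BUF_SIZE);
}
```

With large page payloads the buffer used to fill and drain on its own; with only small control words left, the writer has to be told when a logical unit is complete.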
[Qemu-devel] [PATCH v15 11/12] migration: Remove not needed semaphore and quit
We now quit by closing the QIO. Signed-off-by: Juan Quintela -- Add comment don't object_ref() twice. --- migration/ram.c | 17 ++--- 1 file changed, 6 insertions(+), 11 deletions(-) diff --git a/migration/ram.c b/migration/ram.c index 2c3a452a7d..57d3ad1c45 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -594,14 +594,10 @@ typedef struct { QemuThread thread; /* communication channel */ QIOChannel *c; -/* sem where to wait for more work */ -QemuSemaphore sem; /* this mutex protects the following parameters */ QemuMutex mutex; /* is this channel thread running */ bool running; -/* should this thread finish */ -bool quit; /* array of pages to receive */ MultiFDPages_t *pages; /* packet allocated len */ @@ -1152,8 +1148,12 @@ static void multifd_recv_terminate_threads(Error *err) MultiFDRecvParams *p = &multifd_recv_state->params[i]; qemu_mutex_lock(&p->mutex); -p->quit = true; -qemu_sem_post(&p->sem); +/* We could arrive here for two reasons: + - normal quit, i.e. everything went fine, just finished + - error quit: We close the channels so the channel threads + finish the qio_channel_read_all_eof() */ +object_unref(OBJECT(p->c)); +p->c = NULL; qemu_mutex_unlock(&p->mutex); } } @@ -1173,10 +1173,7 @@ int multifd_load_cleanup(Error **errp) if (p->running) { qemu_thread_join(&p->thread); } -object_unref(OBJECT(p->c)); -p->c = NULL; qemu_mutex_destroy(&p->mutex); -qemu_sem_destroy(&p->sem); qemu_sem_destroy(&p->sem_sync); g_free(p->name); p->name = NULL; @@ -1299,9 +1296,7 @@ int multifd_load_setup(void) MultiFDRecvParams *p = &multifd_recv_state->params[i]; qemu_mutex_init(&p->mutex); -qemu_sem_init(&p->sem, 0); qemu_sem_init(&p->sem_sync, 0); -p->quit = false; p->id = i; p->pages = multifd_pages_init(page_count); p->packet_len = sizeof(MultiFDPacket_t) -- 2.17.1
[Qemu-devel] [PATCH v15 10/12] migration: Wait for blocking IO
We have three conditions here: - channel fails -> error - we have to quit: we close the channel and the read fails - normal read that succeeds: we are in business. So forget the complications of waiting on a semaphore. Signed-off-by: Juan Quintela Reviewed-by: Dr. David Alan Gilbert --- migration/ram.c | 13 - 1 file changed, 13 deletions(-) diff --git a/migration/ram.c b/migration/ram.c index 09df573441..2c3a452a7d 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -602,8 +602,6 @@ typedef struct { bool running; /* should this thread finish */ bool quit; -/* thread has work to do */ -bool pending_job; /* array of pages to receive */ MultiFDPages_t *pages; /* packet allocated len */ @@ -1207,14 +1205,6 @@ static void multifd_recv_sync_main(void) for (i = 0; i < migrate_multifd_channels(); i++) { MultiFDRecvParams *p = &multifd_recv_state->params[i]; -trace_multifd_recv_sync_main_signal(p->id); -qemu_mutex_lock(&p->mutex); -p->pending_job = true; -qemu_mutex_unlock(&p->mutex); -} -for (i = 0; i < migrate_multifd_channels(); i++) { -MultiFDRecvParams *p = &multifd_recv_state->params[i]; - trace_multifd_recv_sync_main_wait(p->id); qemu_sem_wait(&multifd_recv_state->sem_sync); qemu_mutex_lock(&p->mutex); @@ -1227,7 +1217,6 @@ static void multifd_recv_sync_main(void) MultiFDRecvParams *p = &multifd_recv_state->params[i]; trace_multifd_recv_sync_main_signal(p->id); - qemu_sem_post(&p->sem_sync); } trace_multifd_recv_sync_main(atomic_read(&multifd_recv_state->packet_num)); @@ -1264,7 +1253,6 @@ static void *multifd_recv_thread(void *opaque) used = p->pages->used; flags = p->flags; trace_multifd_recv(p->id, p->packet_num, used, flags); -p->pending_job = false; p->num_packets++; p->num_pages += used; qemu_mutex_unlock(&p->mutex); @@ -1314,7 +1302,6 @@ int multifd_load_setup(void) qemu_sem_init(&p->sem, 0); qemu_sem_init(&p->sem_sync, 0); p->quit = false; -p->pending_job = false; p->id = i; p->pages = multifd_pages_init(page_count); p->packet_len = sizeof(MultiFDPacket_t) -- 2.17.1
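The three outcomes listed in the commit message map naturally onto a blocking "read everything or report EOF" helper. The sketch below uses plain POSIX `read()` on a file descriptor rather than a QIOChannel, and `read_all_eof()` is a hypothetical name; its 1 / 0 / -1 return convention is chosen to match how the receive thread in this series tests the result of `qio_channel_read_all_eof()` (0 for EOF, -1 for error).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Block until len bytes have been read from fd.
 * Returns  1 if the whole buffer was filled (normal read),
 *          0 if the peer closed before any byte arrived (clean EOF),
 *         -1 on a read error or on EOF in the middle of the buffer.
 */
int read_all_eof(int fd, void *buf, size_t len)
{
    char *p = buf;
    size_t done = 0;

    assert(buf || len == 0);
    while (done < len) {
        ssize_t n = read(fd, p + done, len - done);

        if (n < 0) {
            if (errno == EINTR) {
                continue;       /* interrupted, just retry */
            }
            return -1;
        }
        if (n == 0) {
            return done == 0 ? 0 : -1;
        }
        done += (size_t)n;
    }
    return 1;
}
```

A receive loop built on such a helper needs no "quit" flag or wake-up semaphore: closing the descriptor from another thread is enough to make the blocked read return, which is the simplification the patch relies on.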
[Qemu-devel] [PATCH v15 07/12] migration: Synchronize multifd threads with main thread
We synchronize all threads at each RAM_SAVE_FLAG_EOS. Bitmap synchronizations don't happen inside a ram section, so we are safe about two channels trying to overwrite the same memory. Signed-off-by: Juan Quintela -- seq needs to be atomic now, will also be accessed from main thread. Fix the if (true || ...) leftover --- migration/ram.c| 147 - migration/trace-events | 6 ++ 2 files changed, 122 insertions(+), 31 deletions(-) diff --git a/migration/ram.c b/migration/ram.c index 793f0dc5d3..516f347d24 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -510,6 +510,8 @@ exit: #define MULTIFD_MAGIC 0x11223344U #define MULTIFD_VERSION 1 +#define MULTIFD_FLAG_SYNC (1 << 0) + typedef struct { uint32_t magic; uint32_t version; @@ -577,6 +579,8 @@ typedef struct { uint32_t num_packets; /* pages sent through this channel */ uint32_t num_pages; +/* syncs main thread and channels */ +QemuSemaphore sem_sync; } MultiFDSendParams; typedef struct { @@ -614,6 +618,8 @@ typedef struct { uint32_t num_packets; /* pages sent through this channel */ uint32_t num_pages; +/* syncs main thread and channels */ +QemuSemaphore sem_sync; } MultiFDRecvParams; static int multifd_send_initial_packet(MultiFDSendParams *p, Error **errp) @@ -801,6 +807,10 @@ struct { int count; /* array of pages to sent */ MultiFDPages_t *pages; +/* syncs main thread and channels */ +QemuSemaphore sem_sync; +/* global number of generated multifd packets */ +uint64_t packet_num; } *multifd_send_state; static void multifd_send_terminate_threads(Error *err) @@ -848,6 +858,7 @@ int multifd_save_cleanup(Error **errp) p->c = NULL; qemu_mutex_destroy(&p->mutex); qemu_sem_destroy(&p->sem); +qemu_sem_destroy(&p->sem_sync); g_free(p->name); p->name = NULL; multifd_pages_clear(p->pages); @@ -856,6 +867,7 @@ int multifd_save_cleanup(Error **errp) g_free(p->packet); p->packet = NULL; } +qemu_sem_destroy(&multifd_send_state->sem_sync); g_free(multifd_send_state->params); multifd_send_state->params = NULL; 
multifd_pages_clear(multifd_send_state->pages); @@ -865,6 +877,33 @@ int multifd_save_cleanup(Error **errp) return ret; } +static void multifd_send_sync_main(void) +{ +int i; + +if (!migrate_use_multifd()) { +return; +} +for (i = 0; i < migrate_multifd_channels(); i++) { +MultiFDSendParams *p = &multifd_send_state->params[i]; + +trace_multifd_send_sync_main_signal(p->id); + +qemu_mutex_lock(&p->mutex); +p->flags |= MULTIFD_FLAG_SYNC; +p->pending_job++; +qemu_mutex_unlock(&p->mutex); +qemu_sem_post(&p->sem); +} +for (i = 0; i < migrate_multifd_channels(); i++) { +MultiFDSendParams *p = &multifd_send_state->params[i]; + +trace_multifd_send_sync_main_wait(p->id); +qemu_sem_wait(&multifd_send_state->sem_sync); +} +trace_multifd_send_sync_main(atomic_read(&multifd_send_state->packet_num)); +} + static void *multifd_send_thread(void *opaque) { MultiFDSendParams *p = opaque; @@ -901,15 +940,17 @@ static void *multifd_send_thread(void *opaque) qemu_mutex_lock(&p->mutex); p->pending_job--; qemu_mutex_unlock(&p->mutex); -continue; + +if (flags & MULTIFD_FLAG_SYNC) { +qemu_sem_post(&multifd_send_state->sem_sync); +} } else if (p->quit) { qemu_mutex_unlock(&p->mutex); break; +} else { +qemu_mutex_unlock(&p->mutex); +/* sometimes there are spurious wakeups */ } -qemu_mutex_unlock(&p->mutex); -/* this is impossible */ -error_setg(&local_err, "multifd_send_thread: Unknown command"); -break; } out: @@ -961,12 +1002,14 @@ int multifd_save_setup(void) multifd_send_state->params = g_new0(MultiFDSendParams, thread_count); atomic_set(&multifd_send_state->count, 0); multifd_send_state->pages = multifd_pages_init(page_count); +qemu_sem_init(&multifd_send_state->sem_sync, 0); for (i = 0; i < thread_count; i++) { MultiFDSendParams *p = &multifd_send_state->params[i]; qemu_mutex_init(&p->mutex); qemu_sem_init(&p->sem, 0); +qemu_sem_init(&p->sem_sync, 0); p->quit = false; p->pending_job = 0; p->id = i; @@ -991,6 +1034,10 @@ struct { MultiFDRecvParams *params; /* number of created threads */ int count; +/* syncs main thread and channels */ +QemuSemaphore sem_sync; +/* global number of 
generated multifd packets */ +uint64_t packet_num; } *multifd_recv_state; static void multifd_recv_terminate_threads(Error *err) @@ -1036,6 +1083,7 @@ int multifd_load_cleanup(Error **errp) p->c = NULL; qemu_mutex_destroy(&p->mutex); qemu_sem_destroy(&p->sem); +
[Qemu-devel] [PATCH v15 09/12] migration: Start sending messages
Signed-off-by: Juan Quintela Reviewed-by: Dr. David Alan Gilbert --- migration/ram.c | 29 - 1 file changed, 24 insertions(+), 5 deletions(-) diff --git a/migration/ram.c b/migration/ram.c index 71a33b73e7..09df573441 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -736,9 +736,6 @@ static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp) RAMBlock *block; int i; -/* ToDo: We can't use it until we haven't received a message */ -return 0; - be32_to_cpus(&packet->magic); if (packet->magic != MULTIFD_MAGIC) { error_setg(errp, "multifd: received packet " @@ -990,6 +987,7 @@ static void *multifd_send_thread(void *opaque) { MultiFDSendParams *p = opaque; Error *local_err = NULL; +int ret; trace_multifd_send_thread_start(p->id); @@ -1017,7 +1015,16 @@ static void *multifd_send_thread(void *opaque) trace_multifd_send(p->id, packet_num, used, flags); -/* ToDo: send packet here */ +ret = qio_channel_write_all(p->c, (void *)p->packet, +p->packet_len, &local_err); +if (ret != 0) { +break; +} + +ret = qio_channel_writev_all(p->c, p->pages->iov, used, &local_err); +if (ret != 0) { +break; +} qemu_mutex_lock(&p->mutex); p->pending_job--; @@ -1238,7 +1245,14 @@ static void *multifd_recv_thread(void *opaque) uint32_t used; uint32_t flags; -/* ToDo: recv packet here */ +ret = qio_channel_read_all_eof(p->c, (void *)p->packet, + p->packet_len, &local_err); +if (ret == 0) { /* EOF */ +break; +} +if (ret == -1) { /* Error */ +break; +} qemu_mutex_lock(&p->mutex); ret = multifd_recv_unfill_packet(p, &local_err); @@ -1255,6 +1269,11 @@ static void *multifd_recv_thread(void *opaque) p->num_pages += used; qemu_mutex_unlock(&p->mutex); +ret = qio_channel_readv_all(p->c, p->pages->iov, used, &local_err); +if (ret != 0) { +break; +} + if (flags & MULTIFD_FLAG_SYNC) { qemu_sem_post(&multifd_recv_state->sem_sync); qemu_sem_wait(&p->sem_sync); -- 2.17.1
[Qemu-devel] [PATCH v15 08/12] migration: Create ram_save_multifd_page
The function still doesn't use multifd, but we have simplified ram_save_page: the xbzrle and RDMA stuff is gone. We have added a new counter. Signed-off-by: Juan Quintela Reviewed-by: Dr. David Alan Gilbert -- Add last_page parameter Add comments for done and address Remove multifd field, it is the same as normal pages Merge next patch, now we send multiple pages at a time Remove counter for multifd pages, it is identical to normal pages Use iovec's instead of creating the equivalent. Clear memory used by pages (dave) Use g_new0(danp) define MULTIFD_CONTINUE now pages member is a pointer Fix off-by-one in number of pages in one packet Remove RAM_SAVE_FLAG_MULTIFD_PAGE s/multifd_pages_t/MultiFDPages_t/ add comment explaining what it means --- migration/ram.c | 112 +++- 1 file changed, 110 insertions(+), 2 deletions(-) diff --git a/migration/ram.c b/migration/ram.c index 516f347d24..71a33b73e7 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -55,6 +55,7 @@ #include "sysemu/sysemu.h" #include "qemu/uuid.h" #include "savevm.h" +#include "qemu/iov.h" /***/ /* ram save/restore */ @@ -811,8 +812,83 @@ struct { QemuSemaphore sem_sync; /* global number of generated multifd packets */ uint64_t packet_num; +/* send channels ready */ +QemuSemaphore channels_ready; } *multifd_send_state; +/* + * How we use multifd_send_state->pages and channel->pages? + * + * We create a pages for each channel, and a main one. Each time that + * we need to send a batch of pages we interchange the ones between + * multifd_send_state and the channel that is sending it. There are + * two reasons for that: + *- to not have to do so many mallocs during migration + *- to make easier to know what to free at the end of migration + * + * This way we always know who is the owner of each "pages" struct, + * and we don't need any loocking. It belongs to the migration thread + * or to the channel thread. 
Switching is safe because the migration + * thread is using the channel mutex when changing it, and the channel + * have to had finish with its own, otherwise pending_job can't be + * false. + */ + +static void multifd_send_pages(void) +{ +int i; +static int next_channel; +MultiFDSendParams *p = NULL; /* make happy gcc */ +MultiFDPages_t *pages = multifd_send_state->pages; + +qemu_sem_wait(&multifd_send_state->channels_ready); +for (i = next_channel;; i = (i + 1) % migrate_multifd_channels()) { +p = &multifd_send_state->params[i]; + +qemu_mutex_lock(&p->mutex); +if (!p->pending_job) { +p->pending_job++; +next_channel = (i + 1) % migrate_multifd_channels(); +break; +} +qemu_mutex_unlock(&p->mutex); +} +p->pages->used = 0; + +p->packet_num = atomic_inc_fetch(&multifd_send_state->packet_num); +p->pages->block = NULL; +multifd_send_state->pages = p->pages; +p->pages = pages; +qemu_mutex_unlock(&p->mutex); +qemu_sem_post(&p->sem); +} + +static void multifd_queue_page(RAMBlock *block, ram_addr_t offset) +{ +MultiFDPages_t *pages = multifd_send_state->pages; + +if (!pages->block) { +pages->block = block; +} + +if (pages->block == block) { +pages->offset[pages->used] = offset; +pages->iov[pages->used].iov_base = block->host + offset; +pages->iov[pages->used].iov_len = TARGET_PAGE_SIZE; +pages->used++; + +if (pages->used < pages->allocated) { +return; +} +} + +multifd_send_pages(); + +if (pages->block != block) { +multifd_queue_page(block, offset); +} +} + static void multifd_send_terminate_threads(Error *err) { int i; @@ -867,6 +943,7 @@ int multifd_save_cleanup(Error **errp) g_free(p->packet); p->packet = NULL; } +qemu_sem_destroy(&multifd_send_state->channels_ready); qemu_sem_destroy(&multifd_send_state->sem_sync); g_free(multifd_send_state->params); multifd_send_state->params = NULL; @@ -884,12 +961,17 @@ static void multifd_send_sync_main(void) if (!migrate_use_multifd()) { return; } +if (multifd_send_state->pages->used) { +multifd_send_pages(); +} for (i = 0; i < migrate_multifd_channels(); i++) { MultiFDSendParams *p 
= &multifd_send_state->params[i]; trace_multifd_send_sync_main_signal(p->id); qemu_mutex_lock(&p->mutex); + +p->packet_num = atomic_inc_fetch(&multifd_send_state->packet_num); p->flags |= MULTIFD_FLAG_SYNC; p->pending_job++; qemu_mutex_unlock(&p->mutex); @@ -944,6 +1026,7 @@ static void *multifd_send_thread(void *opaque) if (flags & MULTIFD_FLAG_SYNC) { qemu_sem_post(&multifd_send_state->sem_sync); } +qemu_sem_post(&multifd_send_state->channels_ready); } else if (p->quit) { qemu_mutex_unlock(&p->mutex);
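The batching and buffer-swapping scheme described in the patch comment can be modelled without threads. The sketch below is a single-threaded simplification under stated assumptions: `Batch`, `Sender`, `send_batch()`, `queue_page()` and the tiny `BATCH_MAX` are hypothetical, there is exactly one "channel", and the mutex and semaphores of the real code are left out. What it keeps is the control flow: pages accumulate while they belong to the same block, a full batch or a change of block triggers a send, and sending swaps two preallocated buffers instead of allocating.

```c
#include <assert.h>
#include <stddef.h>

#define BATCH_MAX 4   /* assumed tiny batch size, for illustration only */

/* Hypothetical stand-in for a "pages" struct. */
typedef struct Batch {
    const char *block;          /* block all queued offsets belong to */
    size_t used;                /* number of queued offsets */
    size_t offset[BATCH_MAX];
} Batch;

/* Hypothetical sender with one filling buffer and one channel buffer. */
typedef struct Sender {
    Batch *filling;             /* owned by the producer */
    Batch *sending;             /* owned by the (imaginary) channel */
    size_t batches_sent;
    size_t pages_sent;
} Sender;

/* Hand the filling batch to the channel by swapping the two pointers. */
void send_batch(Sender *s)
{
    Batch *full = s->filling;

    s->filling = s->sending;    /* take the channel's idle buffer */
    s->sending = full;          /* give it the one we just filled */
    s->filling->used = 0;
    s->filling->block = NULL;
    s->batches_sent++;
    s->pages_sent += full->used;
}

/* Queue one page; flush when the batch is full or the block changes. */
void queue_page(Sender *s, const char *block, size_t offset)
{
    Batch *b = s->filling;

    if (!b->block) {
        b->block = block;
    }
    if (b->block == block) {
        assert(b->used < BATCH_MAX);
        b->offset[b->used++] = offset;
        if (b->used < BATCH_MAX) {
            return;
        }
    }
    send_batch(s);
    /* b still points at the batch just sent; a different block means
     * the page was not queued yet and must go into the fresh batch. */
    if (b->block != block) {
        queue_page(s, block, offset);
    }
}
```

Because each buffer always has exactly one owner, either the producer or the channel, the swap is the only point where ownership changes, which is what lets the real code get away with taking the channel mutex only around that exchange.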
[Qemu-devel] [PATCH v15 05/12] migration: Multifd channels always wait on the sem
Either for quit, sync or packet, we first wake them. Signed-off-by: Juan Quintela Reviewed-by: Dr. David Alan Gilbert --- migration/ram.c | 13 +++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/migration/ram.c b/migration/ram.c index d7f8b0d989..617da76a2e 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -875,6 +875,7 @@ static void *multifd_send_thread(void *opaque) p->num_packets = 1; while (true) { +qemu_sem_wait(&p->sem); qemu_mutex_lock(&p->mutex); multifd_send_fill_packet(p); if (p->quit) { @@ -882,7 +883,9 @@ static void *multifd_send_thread(void *opaque) break; } qemu_mutex_unlock(&p->mutex); -qemu_sem_wait(&p->sem); +/* this is impossible */ +error_setg(&local_err, "multifd_send_thread: Unknown command"); +break; } out: @@ -1033,6 +1036,7 @@ static void *multifd_recv_thread(void *opaque) trace_multifd_recv_thread_start(p->id); while (true) { +qemu_sem_wait(&p->sem); qemu_mutex_lock(&p->mutex); if (false) { /* ToDo: Packet reception goes here */ @@ -1047,9 +1051,14 @@ static void *multifd_recv_thread(void *opaque) break; } qemu_mutex_unlock(&p->mutex); -qemu_sem_wait(&p->sem); +/* this is impossible */ +error_setg(&local_err, "multifd_recv_thread: Unknown command"); +break; } +if (local_err) { +multifd_recv_terminate_threads(local_err); +} qemu_mutex_lock(&p->mutex); p->running = false; qemu_mutex_unlock(&p->mutex); -- 2.17.1
[Qemu-devel] [PATCH v15 06/12] migration: Add block where to send/receive packets
While there, add tracepoints. Signed-off-by: Juan Quintela Reviewed-by: Dr. David Alan Gilbert --- migration/ram.c| 49 +- migration/trace-events | 2 ++ 2 files changed, 46 insertions(+), 5 deletions(-) diff --git a/migration/ram.c b/migration/ram.c index 617da76a2e..793f0dc5d3 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -560,6 +560,8 @@ typedef struct { bool running; /* should this thread finish */ bool quit; +/* thread has work to do */ +int pending_job; /* array of pages to sent */ MultiFDPages_t *pages; /* packet allocated len */ @@ -595,6 +597,8 @@ typedef struct { bool running; /* should this thread finish */ bool quit; +/* thread has work to do */ +bool pending_job; /* array of pages to receive */ MultiFDPages_t *pages; /* packet allocated len */ @@ -877,8 +881,28 @@ static void *multifd_send_thread(void *opaque) while (true) { qemu_sem_wait(&p->sem); qemu_mutex_lock(&p->mutex); -multifd_send_fill_packet(p); -if (p->quit) { + +if (p->pending_job) { +uint32_t used = p->pages->used; +uint32_t packet_num = p->packet_num; +uint32_t flags = p->flags; + +multifd_send_fill_packet(p); +p->flags = 0; +p->num_packets++; +p->num_pages += used; +p->pages->used = 0; +qemu_mutex_unlock(&p->mutex); + +trace_multifd_send(p->id, packet_num, used, flags); + +/* ToDo: send packet here */ + +qemu_mutex_lock(&p->mutex); +p->pending_job--; +qemu_mutex_unlock(&p->mutex); +continue; +} else if (p->quit) { qemu_mutex_unlock(&p->mutex); break; } @@ -944,6 +968,7 @@ int multifd_save_setup(void) qemu_mutex_init(&p->mutex); qemu_sem_init(&p->sem, 0); p->quit = false; +p->pending_job = 0; p->id = i; p->pages = multifd_pages_init(page_count); p->packet_len = sizeof(MultiFDPacket_t) @@ -1038,14 +1063,27 @@ static void *multifd_recv_thread(void *opaque) while (true) { qemu_sem_wait(&p->sem); qemu_mutex_lock(&p->mutex); -if (false) { -/* ToDo: Packet reception goes here */ +if (p->pending_job) { +uint32_t used; +uint32_t flags; +qemu_mutex_unlock(&p->mutex); +/* ToDo: recv packet here */ + +qemu_mutex_lock(&p->mutex); 
ret = multifd_recv_unfill_packet(p, &local_err); -qemu_mutex_unlock(&p->mutex); if (ret) { +qemu_mutex_unlock(&p->mutex); break; } + +used = p->pages->used; +flags = p->flags; +trace_multifd_recv(p->id, p->packet_num, used, flags); +p->pending_job = false; +p->num_packets++; +p->num_pages += used; +qemu_mutex_unlock(&p->mutex); } else if (p->quit) { qemu_mutex_unlock(&p->mutex); break; @@ -1088,6 +1126,7 @@ int multifd_load_setup(void) qemu_mutex_init(&p->mutex); qemu_sem_init(&p->sem, 0); p->quit = false; +p->pending_job = false; p->id = i; p->pages = multifd_pages_init(page_count); p->packet_len = sizeof(MultiFDPacket_t) diff --git a/migration/trace-events b/migration/trace-events index 6d499448b3..c667d98529 100644 --- a/migration/trace-events +++ b/migration/trace-events @@ -76,8 +76,10 @@ get_queued_page_not_dirty(const char *block_name, uint64_t tmp_offset, unsigned migration_bitmap_sync_start(void) "" migration_bitmap_sync_end(uint64_t dirty_pages) "dirty_pages %" PRIu64 migration_throttle(void) "" +multifd_recv(uint8_t id, uint64_t packet_num, uint32_t used, uint32_t flags) "channel %d packet number %ld pages %d flags 0x%x" multifd_recv_thread_end(uint8_t id, uint32_t packets, uint32_t pages) "channel %d packets %d pages %d" multifd_recv_thread_start(uint8_t id) "%d" +multifd_send(uint8_t id, uint64_t packet_num, uint32_t used, uint32_t flags) "channel %d packet_num %ld pages %d flags 0x%x" multifd_send_thread_end(uint8_t id, uint32_t packets, uint32_t pages) "channel %d packets %d pages %d" multifd_send_thread_start(uint8_t id) "%d" ram_discard_range(const char *rbname, uint64_t start, size_t len) "%s: start: %" PRIx64 " %zx" -- 2.17.1
[Qemu-devel] [PATCH v15 03/12] migration: Add multifd traces for start/end thread
We want to know how many pages/packets each channel has sent. Add counters for those. Signed-off-by: Juan Quintela Reviewed-by: Juan Quintela -- sort trace-events (dave) --- migration/ram.c| 22 ++ migration/trace-events | 4 2 files changed, 26 insertions(+) diff --git a/migration/ram.c b/migration/ram.c index 6504b492da..d146689d3a 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -570,6 +570,11 @@ typedef struct { uint32_t flags; /* global number of generated multifd packets */ uint64_t packet_num; +/* thread local variables */ +/* packets sent through this channel */ +uint32_t num_packets; +/* pages sent through this channel */ +uint32_t num_pages; } MultiFDSendParams; typedef struct { @@ -600,6 +605,11 @@ typedef struct { uint32_t flags; /* global number of generated multifd packets */ uint64_t packet_num; +/* thread local variables */ +/* packets sent through this channel */ +uint32_t num_packets; +/* pages sent through this channel */ +uint32_t num_pages; } MultiFDRecvParams; static int multifd_send_initial_packet(MultiFDSendParams *p, Error **errp) @@ -856,9 +866,13 @@ static void *multifd_send_thread(void *opaque) MultiFDSendParams *p = opaque; Error *local_err = NULL; +trace_multifd_send_thread_start(p->id); + if (multifd_send_initial_packet(p, &local_err) < 0) { goto out; } +/* initial packet */ +p->num_packets = 1; while (true) { qemu_mutex_lock(&p->mutex); @@ -880,6 +894,8 @@ out: p->running = false; qemu_mutex_unlock(&p->mutex); +trace_multifd_send_thread_end(p->id, p->num_packets, p->num_pages); + return NULL; } @@ -1007,6 +1023,8 @@ static void *multifd_recv_thread(void *opaque) Error *local_err = NULL; int ret; +trace_multifd_recv_thread_start(p->id); + while (true) { qemu_mutex_lock(&p->mutex); if (false) { @@ -1029,6 +1047,8 @@ static void *multifd_recv_thread(void *opaque) p->running = false; qemu_mutex_unlock(&p->mutex); +trace_multifd_recv_thread_end(p->id, p->num_packets, p->num_pages); + return NULL; } @@ -1094,6 +1114,8 @@ void 
multifd_recv_new_channel(QIOChannel *ioc) } p->c = ioc; object_ref(OBJECT(ioc)); +/* initial packet */ +p->num_packets = 1; p->running = true; qemu_thread_create(&p->thread, p->name, multifd_recv_thread, p, diff --git a/migration/trace-events b/migration/trace-events index 3f67758893..6d499448b3 100644 --- a/migration/trace-events +++ b/migration/trace-events @@ -76,6 +76,10 @@ get_queued_page_not_dirty(const char *block_name, uint64_t tmp_offset, unsigned migration_bitmap_sync_start(void) "" migration_bitmap_sync_end(uint64_t dirty_pages) "dirty_pages %" PRIu64 migration_throttle(void) "" +multifd_recv_thread_end(uint8_t id, uint32_t packets, uint32_t pages) "channel %d packets %d pages %d" +multifd_recv_thread_start(uint8_t id) "%d" +multifd_send_thread_end(uint8_t id, uint32_t packets, uint32_t pages) "channel %d packets %d pages %d" +multifd_send_thread_start(uint8_t id) "%d" ram_discard_range(const char *rbname, uint64_t start, size_t len) "%s: start: %" PRIx64 " %zx" ram_load_loop(const char *rbname, uint64_t addr, int flags, void *host) "%s: addr: 0x%" PRIx64 " flags: 0x%x host: %p" ram_load_postcopy_loop(uint64_t addr, int flags) "@%" PRIx64 " %x" -- 2.17.1
[Qemu-devel] [PATCH v15 02/12] migration: Create multifd packet
We still don't put anything there.

Signed-off-by: Juan Quintela
Reviewed-by: Juan Quintela

--
fix magic (dave)
check offset/ramblock (dave)
s/seq/packet_num/ and make it 64bit
---
 migration/ram.c | 145 +++-
 1 file changed, 144 insertions(+), 1 deletion(-)

diff --git a/migration/ram.c b/migration/ram.c
index ed4401ee46..6504b492da 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -517,6 +517,17 @@ typedef struct {
     uint8_t id;
 } __attribute__((packed)) MultiFDInit_t;

+typedef struct {
+    uint32_t magic;
+    uint32_t version;
+    uint32_t flags;
+    uint32_t size;
+    uint32_t used;
+    uint64_t packet_num;
+    char ramblock[256];
+    uint64_t offset[];
+} __attribute__((packed)) MultiFDPacket_t;
+
 typedef struct {
     /* number of used pages */
     uint32_t used;
@@ -551,6 +562,14 @@ typedef struct {
     bool quit;
     /* array of pages to sent */
     MultiFDPages_t *pages;
+    /* packet allocated len */
+    uint32_t packet_len;
+    /* pointer to the packet */
+    MultiFDPacket_t *packet;
+    /* multifd flags for each packet */
+    uint32_t flags;
+    /* global number of generated multifd packets */
+    uint64_t packet_num;
 } MultiFDSendParams;

 typedef struct {
@@ -573,6 +592,14 @@ typedef struct {
     bool quit;
     /* array of pages to receive */
     MultiFDPages_t *pages;
+    /* packet allocated len */
+    uint32_t packet_len;
+    /* pointer to the packet */
+    MultiFDPacket_t *packet;
+    /* multifd flags for each packet */
+    uint32_t flags;
+    /* global number of generated multifd packets */
+    uint64_t packet_num;
 } MultiFDRecvParams;

 static int multifd_send_initial_packet(MultiFDSendParams *p, Error **errp)
@@ -661,6 +688,99 @@ static void multifd_pages_clear(MultiFDPages_t *pages)
     g_free(pages);
 }

+static void multifd_send_fill_packet(MultiFDSendParams *p)
+{
+    MultiFDPacket_t *packet = p->packet;
+    int i;
+
+    packet->magic = cpu_to_be32(MULTIFD_MAGIC);
+    packet->version = cpu_to_be32(MULTIFD_VERSION);
+    packet->flags = cpu_to_be32(p->flags);
+    packet->size = cpu_to_be32(migrate_multifd_page_count());
+    packet->used = cpu_to_be32(p->pages->used);
+    packet->packet_num = cpu_to_be64(p->packet_num);
+
+    if (p->pages->block) {
+        strncpy(packet->ramblock, p->pages->block->idstr, 256);
+    }
+
+    for (i = 0; i < p->pages->used; i++) {
+        packet->offset[i] = cpu_to_be64(p->pages->offset[i]);
+    }
+}
+
+static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp)
+{
+    MultiFDPacket_t *packet = p->packet;
+    RAMBlock *block;
+    int i;
+
+    /* ToDo: We can't use it until we haven't received a message */
+    return 0;
+
+    be32_to_cpus(&packet->magic);
+    if (packet->magic != MULTIFD_MAGIC) {
+        error_setg(errp, "multifd: received packet "
+                   "magic %x and expected magic %x",
+                   packet->magic, MULTIFD_MAGIC);
+        return -1;
+    }
+
+    be32_to_cpus(&packet->version);
+    if (packet->version != MULTIFD_VERSION) {
+        error_setg(errp, "multifd: received packet "
+                   "version %d and expected version %d",
+                   packet->version, MULTIFD_VERSION);
+        return -1;
+    }
+
+    p->flags = be32_to_cpu(packet->flags);
+
+    be32_to_cpus(&packet->size);
+    if (packet->size > migrate_multifd_page_count()) {
+        error_setg(errp, "multifd: received packet "
+                   "with size %d and expected maximum size %d",
+                   packet->size, migrate_multifd_page_count()) ;
+        return -1;
+    }
+
+    p->pages->used = be32_to_cpu(packet->used);
+    if (p->pages->used > packet->size) {
+        error_setg(errp, "multifd: received packet "
+                   "with size %d and expected maximum size %d",
+                   p->pages->used, packet->size) ;
+        return -1;
+    }
+
+    p->packet_num = be64_to_cpu(packet->packet_num);
+
+    if (p->pages->used) {
+        /* make sure that ramblock is 0 terminated */
+        packet->ramblock[255] = 0;
+        block = qemu_ram_block_by_name(packet->ramblock);
+        if (!block) {
+            error_setg(errp, "multifd: unknown ram block %s",
+                       packet->ramblock);
+            return -1;
+        }
+    }
+
+    for (i = 0; i < p->pages->used; i++) {
+        ram_addr_t offset = be64_to_cpu(packet->offset[i]);
+
+        if (offset > (block->used_length - TARGET_PAGE_SIZE)) {
+            error_setg(errp, "multifd: offset too long %" PRId64
+                       " (max %" PRId64 ")",
+                       offset, block->max_length);
+            return -1;
+        }
+        p->pages->iov[i].iov_base = block->host + offset;
+        p->pages->iov[i].iov_len = TARGET_PAGE_SIZE;
+    }
+
+    return 0;
+}
+
 struct {
     MultiFDSendParams *params;
     /* number of created threads */
@@ -718,6 +838,9 @@ int multifd_save_cleanup(Error **errp)
         p->name = NULL;
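The fill/unfill pairing in this patch (write header fields in big-endian order, then convert back and validate on receipt) can be shown in a reduced, standalone form. The sketch below is not the QEMU code: `DemoHeader`, `DEMO_MAGIC`, `DEMO_VERSION`, `demo_fill()` and `demo_unfill()` are made-up names and values, and POSIX `htonl()`/`ntohl()` stand in for QEMU's `cpu_to_be32()`/`be32_to_cpu()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <arpa/inet.h>   /* htonl(), ntohl() (POSIX) */

#define DEMO_MAGIC   0x11223344u   /* illustrative value, not QEMU's */
#define DEMO_VERSION 1u            /* illustrative value, not QEMU's */

/* Reduced header: only the fields needed to show the checks. */
typedef struct {
    uint32_t magic;
    uint32_t version;
    uint32_t size;   /* pages the sender can put in one packet */
    uint32_t used;   /* pages actually present in this packet */
} DemoHeader;

/* Sender side: store every field in network (big-endian) order. */
static void demo_fill(DemoHeader *h, uint32_t size, uint32_t used)
{
    assert(h != NULL);
    h->magic = htonl(DEMO_MAGIC);
    h->version = htonl(DEMO_VERSION);
    h->size = htonl(size);
    h->used = htonl(used);
}

/*
 * Receiver side: convert back to host order and reject anything that
 * does not match what we expect. Returns 0 on success, -1 on error.
 */
static int demo_unfill(const DemoHeader *h, uint32_t max_size, uint32_t *used)
{
    uint32_t size, n;

    assert(h != NULL && used != NULL);
    if (ntohl(h->magic) != DEMO_MAGIC) {
        return -1;
    }
    if (ntohl(h->version) != DEMO_VERSION) {
        return -1;
    }
    size = ntohl(h->size);
    if (size > max_size) {
        return -1;
    }
    n = ntohl(h->used);
    if (n > size) {
        return -1;
    }
    *used = n;
    return 0;
}
```

The order of the checks mirrors the patch: magic, version, advertised size against the local maximum, then the used count against the advertised size.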
[Qemu-devel] [PATCH v15 04/12] migration: Calculate transferred ram correctly
On multifd we send data from more places than the main channel.

Signed-off-by: Juan Quintela

--
Add placeholder for packets size
---
 migration/migration.c | 12 ++--
 migration/ram.c       |  7 +++
 migration/ram.h       |  1 +
 3 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index e1eaa97df4..224629533b 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2728,12 +2728,20 @@ static void migration_update_counters(MigrationState *s,
 {
     uint64_t transferred, time_spent;
     double bandwidth;
+    uint64_t now;

     if (current_time < s->iteration_start_time + BUFFER_DELAY) {
         return;
     }

-    transferred = qemu_ftell(s->to_dst_file) - s->iteration_initial_bytes;
+    if (migrate_use_multifd()) {
+        now = ram_counters.normal * qemu_target_page_size()
+            + multifd_packets_size()
+            + qemu_ftell(s->to_dst_file);
+    } else {
+        now = qemu_ftell(s->to_dst_file);
+    }
+    transferred = now - s->iteration_initial_bytes;
     time_spent = current_time - s->iteration_start_time;
     bandwidth = (double)transferred / time_spent;
     s->threshold_size = bandwidth * s->parameters.downtime_limit;
@@ -2752,7 +2760,7 @@ static void migration_update_counters(MigrationState *s,
     qemu_file_reset_rate_limit(s->to_dst_file);

     s->iteration_start_time = current_time;
-    s->iteration_initial_bytes = qemu_ftell(s->to_dst_file);
+    s->iteration_initial_bytes = now;

     trace_migrate_transferred(transferred, time_spent,
                               bandwidth, s->threshold_size);
diff --git a/migration/ram.c b/migration/ram.c
index d146689d3a..d7f8b0d989 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -952,6 +952,13 @@ int multifd_save_setup(void)
     return 0;
 }

+/* Size in bytes of the page headers */
+int multifd_packets_size(void)
+{
+    /* We are not yet sending any data through channels */
+    return 0;
+}
+
 struct {
     MultiFDRecvParams *params;
     /* number of created threads */
diff --git a/migration/ram.h b/migration/ram.h
index d386f4d641..e1decb7418 100644
--- a/migration/ram.h
+++ b/migration/ram.h
@@ -47,6 +47,7 @@ int multifd_load_setup(void);
 int multifd_load_cleanup(Error **errp);
 bool multifd_recv_all_channels_created(void);
 void multifd_recv_new_channel(QIOChannel *ioc);
+int multifd_packets_size(void);

 uint64_t ram_pagesize_summary(void);
 int ram_save_queue_pages(const char *rbname, ram_addr_t start, ram_addr_t len);
--
2.17.1
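The arithmetic this patch changes can be restated as a small standalone sketch. The function names and parameters below (`demo_bytes_now()`, `demo_threshold()`) are hypothetical; they only mirror the expressions in `migration_update_counters()` and assume time values in milliseconds, as in the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes sent so far. With multifd, pages travel over the extra
 * channels, so they are added to what the main channel reports.
 */
static uint64_t demo_bytes_now(int use_multifd, uint64_t main_channel_bytes,
                               uint64_t normal_pages, uint64_t page_size,
                               uint64_t packet_header_bytes)
{
    if (!use_multifd) {
        return main_channel_bytes;
    }
    return normal_pages * page_size + packet_header_bytes
           + main_channel_bytes;
}

/*
 * Threshold below which the remaining data can be sent within the
 * downtime limit: bandwidth (bytes per ms) times the limit (ms).
 */
static double demo_threshold(uint64_t now, uint64_t initial,
                             uint64_t time_spent_ms,
                             uint64_t downtime_limit_ms)
{
    double bandwidth;

    assert(time_spent_ms > 0);
    assert(now >= initial);
    bandwidth = (double)(now - initial) / (double)time_spent_ms;
    return bandwidth * (double)downtime_limit_ms;
}
```

For example, 10 pages of 4096 bytes plus 64 bytes of headers plus 1000 bytes on the main channel give 42024 bytes; without multifd only the 1000 main-channel bytes are counted.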
[Qemu-devel] [PATCH v15 00/12] Multifd
Hi

This is v15 of the multifd patches. Changes from the previous version:

- fix compilation on 32bit (weird) platforms. Move from uint64_t to
  long for atomics.
- use shutdown for communication close instead of object_unref().
  (david suggestion)

I took some performance numbers (not the most scientific approach).
I tested on localhost (so I can have fast networking), a guest with:

- idle
- stress --vm 4 --vm-bytes 500M
- stress --vm 4 --vm-bytes 700M

Setup time for multifd is higher (around 1s). When idle the difference
is small, but precopy wins (we are not able to recover the setup
costs). With 500MB multifd is clearly faster (2/3 of the time and much
less downtime). With 700MB guests, normal precopy does not converge. It
calculates a 4 second downtime. Multifd ends without problems.

Yes, I know that the throughput of multifd is not reported correctly.
That is only for printing stats; the ones that are used to calculate
convergence work as expected. Will fix the transfer speed later.

Please, review the missing patch.

Later, Juan.
idle: precopy

Migration status: completed
total time: 1085 milliseconds
downtime: 276 milliseconds
setup: 20 milliseconds
transferred ram: 243676 kbytes
throughput: 1841.41 mbps
remaining ram: 0 kbytes
total ram: 3150664 kbytes
duplicate: 728780 pages
normal: 59202 pages
normal bytes: 236808 kbytes
dirty sync count: 3

idle: multifd

Migration status: completed
total time: 1799 milliseconds
downtime: 251 milliseconds
setup: 1051 milliseconds
transferred ram: 6431 kbytes
throughput: 30.25 mbps
remaining ram: 0 kbytes
total ram: 3150664 kbytes
duplicate: 731745 pages
skipped: 0 pages
normal: 56596 pages
normal bytes: 226384 kbytes
dirty sync count: 3
page size: 4 kbytes

stress --vm 4 --vm-bytes 500M: precopy

total time: 9477 milliseconds
downtime: 270 milliseconds
setup: 33 milliseconds
transferred ram: 6270075 kbytes
throughput: 5420.09 mbps
remaining ram: 0 kbytes
total ram: 3150664 kbytes
duplicate: 232478 pages
skipped: 0 pages
normal: 1563953 pages
normal bytes: 6255812 kbytes
dirty sync count: 10

stress --vm 4 --vm-bytes 500M: multifd

total time: 6168 milliseconds
downtime: 173 milliseconds
setup: 1005 milliseconds
transferred ram: 1984 kbytes
throughput: 2.92 mbps
remaining ram: 0 kbytes
total ram: 3150664 kbytes
duplicate: 225682 pages
skipped: 0 pages
normal: 1428939 pages
normal bytes: 5715756 kbytes
dirty sync count: 11

stress --vm 4 --vm-bytes 700M: precopy

I stopped after 87 seconds; notice that the expected downtime is around
4 seconds and not changing at all.

Migration status: active
total time: 87026 milliseconds
expected downtime: 4126 milliseconds
setup: 18 milliseconds
transferred ram: 50056699 kbytes
throughput: 4835.55 mbps
remaining ram: 2335628 kbytes
total ram: 3150664 kbytes
duplicate: 107856 pages
skipped: 0 pages
normal: 12489541 pages
normal bytes: 49958164 kbytes
dirty sync count: 23
page size: 4 kbytes
dirty pages rate: 186814 pages

stress --vm 4 --vm-bytes 700M: multifd

total time: 40971 milliseconds
downtime: 192 milliseconds
setup: 1017 milliseconds
transferred ram: 1144 kbytes
throughput: 0.28 mbps
remaining ram: 0 kbytes
total ram: 3150664 kbytes
duplicate: 129472 pages
skipped: 0 pages
normal: 11959938 pages
normal bytes: 47839752 kbytes
dirty sync count: 49
page size: 4 kbytes

This is v14 of the multifd patches. Changes from the previous submission:

- rename seq -> packet_num: make things easier to understand
- packet_num is now 64bit wide
- include the size of the packet headers in the transfer stats
  (dave noticed it)
- improve comments here and there.

All the patches except two are already reviewed-by. My understanding is
that I fixed the last issues with the two remaining ones, so I expect
to pull this once these two patches are reviewed.

Please review.

Thanks, Juan.

This is v13 of the multifd patches:

- several patches already integrated
- rebased to latest upstream
- addressed all the review comments.

Please review.

Thanks, Juan.

[v12]

Big news, it is not RFC anymore, it works reliably for me.

Changes:

- Locking changed completely (several times)
- We now send all pages through the channels. In a 2GB guest with 1
  disk and a network card, the amount of data sent for RAM was 80KB.
- This is not optimized yet, but it shows clear improvements over
  precopy. Testing over localhost networking I can get:
  - 2 VCPUs guest
  - 2GB RAM
  - run stress --vm 4 --vm 500GB (i.e. dirtying 2GB of RAM each second)
  - Total time: precopy ~50 seconds, multifd around 11 seconds
  - Bandwidth usage is around 273MB/s vs 71MB/s on the same hardware

This is very preliminary testing; I will send more numbers when I get
them. But it looks promising.

Things that will be improved later:

- Initial synchronization is too slow (around 1s)
- We synchronize all threads after each RAM section; we can move to
  only synchronizing them after we have done a bitmap synchronization
- We can improve bitmap walking (but that is independent of multifd)

Please
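As a quick check of the "2/3 of the time" remark in the v15 cover letter, the 500M figures quoted there can be put through a trivial ratio. The helper name `demo_time_ratio()` is made up for this illustration; the inputs are the total time and downtime values from the stats in this message.

```c
#include <assert.h>

/* Ratio of a multifd measurement to the matching precopy measurement. */
static double demo_time_ratio(unsigned multifd_ms, unsigned precopy_ms)
{
    assert(precopy_ms > 0);
    return (double)multifd_ms / (double)precopy_ms;
}
```

With 6168 ms against 9477 ms total time the ratio is roughly 0.65, and with 173 ms against 270 ms downtime it is roughly 0.64, which is consistent with the letter's wording.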
[Qemu-devel] [PATCH v15 01/12] migration: Create multipage support
We only create/destroy the page list here. We will use it later.

Signed-off-by: Juan Quintela
Reviewed-by: Dr. David Alan Gilbert
---
 migration/ram.c | 57 +
 1 file changed, 57 insertions(+)

diff --git a/migration/ram.c b/migration/ram.c
index cd5f55117d..ed4401ee46 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -517,6 +517,20 @@ typedef struct {
     uint8_t id;
 } __attribute__((packed)) MultiFDInit_t;

+typedef struct {
+    /* number of used pages */
+    uint32_t used;
+    /* number of allocated pages */
+    uint32_t allocated;
+    /* global number of generated multifd packets */
+    uint64_t packet_num;
+    /* offset of each page */
+    ram_addr_t *offset;
+    /* pointer to each page */
+    struct iovec *iov;
+    RAMBlock *block;
+} MultiFDPages_t;
+
 typedef struct {
     /* this fields are not changed once the thread is created */
     /* channel number */
@@ -535,6 +549,8 @@ typedef struct {
     bool running;
     /* should this thread finish */
     bool quit;
+    /* array of pages to sent */
+    MultiFDPages_t *pages;
 } MultiFDSendParams;

 typedef struct {
@@ -555,6 +571,8 @@ typedef struct {
     bool running;
     /* should this thread finish */
     bool quit;
+    /* array of pages to receive */
+    MultiFDPages_t *pages;
 } MultiFDRecvParams;

 static int multifd_send_initial_packet(MultiFDSendParams *p, Error **errp)
@@ -619,10 +637,36 @@ static int multifd_recv_initial_packet(QIOChannel *c, Error **errp)
     return msg.id;
 }

+static MultiFDPages_t *multifd_pages_init(size_t size)
+{
+    MultiFDPages_t *pages = g_new0(MultiFDPages_t, 1);
+
+    pages->allocated = size;
+    pages->iov = g_new0(struct iovec, size);
+    pages->offset = g_new0(ram_addr_t, size);
+
+    return pages;
+}
+
+static void multifd_pages_clear(MultiFDPages_t *pages)
+{
+    pages->used = 0;
+    pages->allocated = 0;
+    pages->packet_num = 0;
+    pages->block = NULL;
+    g_free(pages->iov);
+    pages->iov = NULL;
+    g_free(pages->offset);
+    pages->offset = NULL;
+    g_free(pages);
+}
+
 struct {
     MultiFDSendParams *params;
     /* number of created threads */
     int count;
+    /* array of pages to sent */
+    MultiFDPages_t *pages;
 } *multifd_send_state;

 static void multifd_send_terminate_threads(Error *err)
@@ -672,9 +716,13 @@ int multifd_save_cleanup(Error **errp)
         qemu_sem_destroy(&p->sem);
         g_free(p->name);
         p->name = NULL;
+        multifd_pages_clear(p->pages);
+        p->pages = NULL;
     }
     g_free(multifd_send_state->params);
     multifd_send_state->params = NULL;
+    multifd_pages_clear(multifd_send_state->pages);
+    multifd_send_state->pages = NULL;
     g_free(multifd_send_state);
     multifd_send_state = NULL;
     return ret;
@@ -735,6 +783,7 @@ static void multifd_new_send_channel_async(QIOTask *task, gpointer opaque)
 int multifd_save_setup(void)
 {
     int thread_count;
+    uint32_t page_count = migrate_multifd_page_count();
     uint8_t i;

     if (!migrate_use_multifd()) {
@@ -744,6 +793,8 @@ int multifd_save_setup(void)
     multifd_send_state = g_malloc0(sizeof(*multifd_send_state));
     multifd_send_state->params = g_new0(MultiFDSendParams, thread_count);
     atomic_set(&multifd_send_state->count, 0);
+    multifd_send_state->pages = multifd_pages_init(page_count);
+
     for (i = 0; i < thread_count; i++) {
         MultiFDSendParams *p = &multifd_send_state->params[i];

@@ -751,6 +802,7 @@ int multifd_save_setup(void)
         qemu_sem_init(&p->sem, 0);
         p->quit = false;
         p->id = i;
+        p->pages = multifd_pages_init(page_count);
         p->name = g_strdup_printf("multifdsend_%d", i);
         socket_send_channel_create(multifd_new_send_channel_async, p);
     }
@@ -808,6 +860,8 @@ int multifd_load_cleanup(Error **errp)
         qemu_sem_destroy(&p->sem);
         g_free(p->name);
         p->name = NULL;
+        multifd_pages_clear(p->pages);
+        p->pages = NULL;
     }
     g_free(multifd_recv_state->params);
     multifd_recv_state->params = NULL;
@@ -841,6 +895,7 @@ static void *multifd_recv_thread(void *opaque)
 int multifd_load_setup(void)
 {
     int thread_count;
+    uint32_t page_count = migrate_multifd_page_count();
     uint8_t i;

     if (!migrate_use_multifd()) {
@@ -850,6 +905,7 @@ int multifd_load_setup(void)
     multifd_recv_state = g_malloc0(sizeof(*multifd_recv_state));
     multifd_recv_state->params = g_new0(MultiFDRecvParams, thread_count);
     atomic_set(&multifd_recv_state->count, 0);
+
     for (i = 0; i < thread_count; i++) {
         MultiFDRecvParams *p = &multifd_recv_state->params[i];

@@ -857,6 +913,7 @@ int multifd_load_setup(void)
         qemu_sem_init(&p->sem, 0);
         p->quit = false;
         p->id = i;
+        p->pages = multifd_pages_init(page_count);
         p->name = g_strdup_printf("multifdrecv_%d", i);
     }
     return 0;
--
2.17.1
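The init/clear pairing for the page list can be tried outside QEMU with plain libc allocation. This is a sketch under assumptions, not the patch's code: `DemoPages`, `demo_pages_init()` and `demo_pages_clear()` are hypothetical names, `calloc()`/`free()` replace GLib's `g_new0()`/`g_free()`, and unlike the patch the clear function tolerates a NULL argument.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/uio.h>   /* struct iovec (POSIX) */

/* Reduced page list: counts plus the two per-page arrays. */
typedef struct {
    uint32_t used;        /* number of used pages */
    uint32_t allocated;   /* number of allocated pages */
    uint64_t *offset;     /* offset of each page */
    struct iovec *iov;    /* pointer to each page */
} DemoPages;

/* Allocate a zeroed list with room for 'size' pages; NULL on failure. */
static DemoPages *demo_pages_init(size_t size)
{
    DemoPages *pages;

    assert(size > 0 && size <= UINT32_MAX);
    pages = calloc(1, sizeof(*pages));
    if (!pages) {
        return NULL;
    }
    pages->iov = calloc(size, sizeof(*pages->iov));
    pages->offset = calloc(size, sizeof(*pages->offset));
    if (!pages->iov || !pages->offset) {
        free(pages->iov);
        free(pages->offset);
        free(pages);
        return NULL;
    }
    pages->allocated = (uint32_t)size;
    return pages;
}

/* Release both arrays and the list itself; accepts NULL. */
static void demo_pages_clear(DemoPages *pages)
{
    if (!pages) {
        return;
    }
    free(pages->iov);
    free(pages->offset);
    free(pages);
}
```

As in the patch, every owner of a list (each channel, plus the shared send state) would call the init function once at setup and the clear function once at cleanup.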
Re: [Qemu-devel] [PATCH] hw/i386: Deprecate the machine types pc-0.10 and pc-0.11
(CCing Markus and libvir-list)

On Wed, Jun 20, 2018 at 08:40:38PM +0200, Thomas Huth wrote:
> On 12.06.2018 00:18, Eduardo Habkost wrote:
> > On Mon, Jun 11, 2018 at 05:41:04AM +0200, Thomas Huth wrote:
> >> The oldest machine type which is still used in a maintained distribution
> >> is a pc-0.12 based machine type in RHEL6, so everything that is older
> >> than pc-0.12 should not be used anymore. Thus let's deprecate pc-0.10
> >> and pc-0.11 so that we can finally remove them in a future release.
> [...]
> >> @@ -3952,6 +3953,10 @@ int main(int argc, char **argv, char **envp)
> >>      }
> >>
> >>      machine_class = select_machine();
> >> +    if (machine_class->deprecation_msg) {
> >> +        error_report("Machine type '%s' is deprecated: %s",
> >> +                     machine_class->name, machine_class->deprecation_msg);
> >> +    }
> >
> > Do you plan to add this info to 'query-machines' QMP command?
>
> No, I'm not planning to add this. We'd need a request from upper layers
> (i.e. libvirt) for this first, otherwise it's just a dead interface that
> nobody is using.

I believe that useful information being available only through
stderr is at least as bad as being available only through HMP.
Should we extend QMP more proactively in cases like this, too?

(In either case, I don't think this should block your series)

--
Eduardo
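To make the stderr-versus-QMP point concrete, here is a minimal sketch of producing the deprecation text as data that a caller could either print or hand to some other interface. Everything in it is hypothetical: `DemoMachineClass` and `demo_deprecation_warning()` are not QEMU names, and the sample message in the usage note is invented; only the format string is taken from the quoted hunk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical reduced machine class: a name and an optional message. */
typedef struct {
    const char *name;
    const char *deprecation_msg;   /* NULL when the type is not deprecated */
} DemoMachineClass;

/*
 * Format the warning into 'buf' instead of printing it directly.
 * Returns 1 if the machine type is deprecated, 0 otherwise.
 */
static int demo_deprecation_warning(const DemoMachineClass *mc,
                                    char *buf, size_t len)
{
    assert(mc != NULL && buf != NULL && len > 0);
    if (!mc->deprecation_msg) {
        buf[0] = '\0';
        return 0;
    }
    snprintf(buf, len, "Machine type '%s' is deprecated: %s",
             mc->name, mc->deprecation_msg);
    return 1;
}
```

A caller that only wants the stderr behaviour from the patch would print `buf` when the function returns 1; a management-facing interface could return the same string as a field instead.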
Re: [Qemu-devel] [RFC PATCH 4/5] build-system: add clean-coverage target
On 06/20/2018 06:06 PM, Alex Bennée wrote: > > Philippe Mathieu-Daudé writes: > >> Hi Alex, >> >> On 06/20/2018 10:20 AM, Alex Bennée wrote: >>> This can be used to remove any stale coverage data before any >>> particular test run. This is useful for analysing individual tests. >>> >>> Signed-off-by: Alex Bennée >>> --- >>> Makefile | 11 +++ >>> docs/devel/testing.rst | 11 --- >>> 2 files changed, 19 insertions(+), 3 deletions(-) >>> >>> diff --git a/Makefile b/Makefile >>> index e46f2b625a..cb4af8bf80 100644 >>> --- a/Makefile >>> +++ b/Makefile >>> @@ -725,6 +725,14 @@ module_block.h: >>> $(SRC_PATH)/scripts/modules/module_block.py config-host.mak >>> $(addprefix $(SRC_PATH)/,$(patsubst %.mo,%.c,$(block-obj-m))), \ >>> "GEN","$@") >>> >>> +ifdef CONFIG_GCOV >>> +.PHONY: clean-coverage >>> +clean-coverage: >>> + $(call quiet-command, \ >>> + find . \( -name '*.gcda' -o -name '*.gcov' \) -type f -exec rm >>> {} +, \ >>> + "CLEAN", "coverage files") >> >> I also see ".gcno" files. >> From GCC man page: >> >> -ftest-coverage >>Produce a notes file that the gcov code-coverage >>utility can use to show program coverage. Each >>source file's note file is called auxname.gcno. > > I explicitly left that out - the gcno file is regenerated by the build. > There is no reason to wipe it between coverage runs. A full clean should > remove them however. Oh OK, fine then. 
> >> >> Tested-by: Philippe Mathieu-Daudé >> Adding gcno: >> Reviewed-by: Philippe Mathieu-Daudé >> >>> +endif >>> + >>> clean: >>> # avoid old build problems by removing potentially incorrect old files >>> rm -f config.mak op-i386.h opc-i386.h gen-op-i386.h op-arm.h opc-arm.h >>> gen-op-arm.h >>> @@ -1075,6 +1083,9 @@ endif >>> echo '') >>> @echo 'Cleaning targets:' >>> @echo ' clean - Remove most generated files but keep the >>> config' >>> +ifdef CONFIG_GCOV >>> + @echo ' clean-coverage - Remove coverage files' >>> +endif >>> @echo ' distclean - Remove all generated files' >>> @echo ' dist- Build a distributable tarball' >>> @echo '' >>> diff --git a/docs/devel/testing.rst b/docs/devel/testing.rst >>> index 66ef219f69..a3652aea14 100644 >>> --- a/docs/devel/testing.rst >>> +++ b/docs/devel/testing.rst >>> @@ -161,9 +161,14 @@ GCC gcov support >>> ``gcov`` is a GCC tool to analyze the testing coverage by >>> instrumenting the tested code. To use it, configure QEMU with >>> ``--enable-gcov`` option and build. Then run ``make check`` as usual. >>> -Reports can be obtained by running ``gcov`` command on the output >>> -files under ``$build_dir/tests/``, please read the ``gcov`` >>> -documentation for more information. >>> + >>> +If you want to gather coverage information on a single test the ``make >>> +clean-coverage`` target can be used to any existing coverage >>> +information before running a single test. >>> + >>> +Reports can be obtained by running ``gcov`` command >>> +on the output files under ``$build_dir/tests/``, please read the >>> +``gcov`` documentation for more information. >>> >>> QEMU iotests >>> >>> > > > -- > Alex Bennée >
Re: [Qemu-devel] [PATCH 00/113] Patch Round-up for stable 2.11.2, freeze on 2018-06-22
Quoting Michael Roth (2018-06-20 15:41:24) > Quoting Cornelia Huck (2018-06-19 02:42:48) > > On Mon, 18 Jun 2018 20:41:26 -0500 > > Michael Roth wrote: > > > > > Hi everyone, > > > > > > The following new patches are queued for QEMU stable v2.11.2: > > > > > > https://github.com/mdroth/qemu/commits/stable-2.11-staging > > > > > > The release is planned for 2018-06-22: > > > > > > https://wiki.qemu.org/Planning/2.11 > > > > > > Please respond here or CC qemu-sta...@nongnu.org on any patches you > > > think should be included in the release. > > > > > > Thanks! > > > > > > > > > > > > The following changes since commit > > > 7c1beb52ed86191d9e965444d934adaa2531710f: > > > > > > Update version for 2.11.1 release (2018-02-14 14:41:05 -0600) > > > > > > are available in the git repository at: > > > > > > git://github.com/mdroth/qemu.git > > > > > > for you to fetch changes up to acb3571f90885a2e206044b3bdc8d1dd2a0389c0: > > > > > > arm_gicv3_kvm: kvm_dist_get/put_priority: skip the registers banked by > > > GICR_IPRIORITYR (2018-06-16 07:47:00 -0500) > > > > > > > > > > Hi Michael, > > > > as this series includes some s390-ccw bios patches, it needs a rebuild > > of the s390-ccw bios as well, probably on top of your stable branch. > > (IIRC we have extra patches on master, so you probably don't want to > > cherry-pick the latest rebuild from there.). Let me know if one of us > > should provide a rebuild. > > > > Thanks Cornelia, I hadn't realized that. I think rebuild from one of the > maintainers would definitely be preferable. We'd also want the corresponding > patches for pc-bios/s390-ccw reflected in the 2.11.x tree. Er, sorry was a bit confused. I suppose that part is covered already if there's no additional patches needed in the rebuild other than what's in 2.11.x already. > another maintainer could put together a branch with those I can merge > those in directly.
Re: [Qemu-devel] [Qemu-stable] [PATCH 00/113] Patch Round-up for stable 2.11.2, freeze on 2018-06-22
Quoting Greg Kurz (2018-06-19 06:56:36) > On Mon, 18 Jun 2018 20:41:26 -0500 > Michael Roth wrote: > > > Hi everyone, > > > > The following new patches are queued for QEMU stable v2.11.2: > > > > https://github.com/mdroth/qemu/commits/stable-2.11-staging > > > > The release is planned for 2018-06-22: > > > > https://wiki.qemu.org/Planning/2.11 > > > > Please respond here or CC qemu-sta...@nongnu.org on any patches you > > think should be included in the release. > > > > Hi Mike, > > Please add the following commit to fix backward migration to QEMU 2.7 > and older: > > aef19c04bf88 spapr: don't migrate "spapr_option_vector_ov5_cas" to pre 2.8 > machines Do we still need this if we don't have the following patch? commit a324d6f166970f8f6a82c61ffd2356fbda81c8f4 Author: Bharata B Rao AuthorDate: Thu Apr 19 12:17:35 2018 +0530 Commit: David Gibson CommitDate: Fri Apr 27 18:05:23 2018 +1000 spapr: Support ibm,dynamic-memory-v2 property If so that one isn't part of 2.11.x. I have the patch tagged for 2.12.1 though. > > Cheers, > > -- > Greg > > > Thanks! 
> > > > > > > > The following changes since commit 7c1beb52ed86191d9e965444d934adaa2531710f: > > > > Update version for 2.11.1 release (2018-02-14 14:41:05 -0600) > > > > are available in the git repository at: > > > > git://github.com/mdroth/qemu.git > > > > for you to fetch changes up to acb3571f90885a2e206044b3bdc8d1dd2a0389c0: > > > > arm_gicv3_kvm: kvm_dist_get/put_priority: skip the registers banked by > > GICR_IPRIORITYR (2018-06-16 07:47:00 -0500) > > > > > > Alberto Garcia (2): > > specs/qcow2: Fix documentation of the compressed cluster descriptor > > throttle: Fix crash on reopen > > > > Alexandro Sanchez Bach (1): > > target/i386: Fix andn instruction > > > > Brijesh Singh (1): > > tap: set vhostfd passed from qemu cli to non-blocking > > > > Cornelia Huck (4): > > s390-ccw: force diag 308 subcode to unsigned long > > s390x/css: disabled subchannels cannot be status pending > > virtio-ccw: common reset handler > > s390x/ccw: make sure all ccw devices are properly reset > > > > Daniel P. 
Berrangé (1): > > i386: define the 'ssbd' CPUID feature bit (CVE-2018-3639) > > > > David Gibson (3): > > spapr: Allow some cases where we can't set VSMT mode in the kernel > > spapr: Adjust default VSMT value for better migration compatibility > > target/ppc: Clarify compat mode max_threads value > > > > Eric Blake (4): > > nbd: Honor server's advertised minimum block size > > nbd/client: Fix error messages during NBD_INFO_BLOCK_SIZE > > qemu-img: Fix assert when mapping unaligned raw file > > iotests: Add test 221 to catch qemu-img map regression > > > > Fam Zheng (1): > > raw: Check byte range uniformly > > > > Geert Uytterhoeven (1): > > device_tree: Increase FDT_MAX_SIZE to 1 MiB > > > > Gerd Hoffmann (3): > > sdl: workaround bug in sdl 2.0.8 headers > > qxl: fix local renderer crash > > vga: fix region calculation > > > > Greg Kurz (12): > > spapr: use spapr->vsmt to compute VCPU ids > > spapr: move VCPU calculation to core machine code > > spapr: rename spapr_vcpu_id() to spapr_get_vcpu_id() > > spapr: consolidate the VCPU id numbering logic in a single place > > spapr: fix missing CPU core nodes in DT when running with TCG > > spapr: register dummy ICPs later > > spapr: make pseries-2.11 the default machine type > > virtio_net: flush uncompleted TX on reset > > exec: fix memory leak in find_max_supported_pagesize() > > vfio-ccw: fix memory leaks in vfio_ccw_realize() > > target/ppc: always set PPC_MEM_TLBIE in pre 2.8 migration hack > > spapr: don't advertise radix GTSE if max-compat-cpu < power9 > > > > Henry Wertz (1): > > tcg/arm: Fix memory barrier encoding > > > > Jack Schwartz (4): > > multiboot: bss_end_addr can be zero > > multiboot: Remove unused variables from multiboot.c > > multiboot: Use header names when displaying fields > > multiboot: fprintf(stderr...) 
-> error_report() > > > > Jan Kiszka (1): > > hw/intc/arm_gicv3: Fix APxR register dispatching > > > > Jason Andryuk (1): > > ccid: Fix dwProtocols advertisement of T=0 > > > > John Snow (1): > > ahci: fix PxCI register race > > > > John Thomson (1): > > Fix libusb-1.0.22 deprecated libusb_set_debug with libusb_set_option > > > > KONRAD Frederic (1): > > sparc: fix leon3 casa instruction when MMU is disabled > > > > Kevin Wolf (7): > > rbd: Fix use after free in qemu_rbd_set_keypairs() error path > > multiboot: Reject kernels exceeding the address space > > multiboot: Check validity of mh_header_addr > >
Re: [Qemu-devel] [PATCH 00/113] Patch Round-up for stable 2.11.2, freeze on 2018-06-22
Quoting Michael Roth (2018-06-18 20:41:26) > Hi everyone, > > The following new patches are queued for QEMU stable v2.11.2: > > https://github.com/mdroth/qemu/commits/stable-2.11-staging > > The release is planned for 2018-06-22: > > https://wiki.qemu.org/Planning/2.11 > > Please respond here or CC qemu-sta...@nongnu.org on any patches you > think should be included in the release. The following additional patches have been queued for 2.11.2: tpm: lookup cancel path under tpm device class (Marc-André Lureau) tpm-passthrough: don't save guessed cancel_path in options (Marc-André Lureau) s390-ccw-virtio: allow for systems larger that 7.999TB (Christian Borntraeger) crypto: ensure we use a predictable TLS priority setting (Daniel P. Berrangé) qapi: ensure stable sort ordering when checking QAPI entities (Daniel P. Berrange) https://github.com/mdroth/qemu/commits/stable-2.11-staging Thank you everyone for the suggestions. > > Thanks! > > > > The following changes since commit 7c1beb52ed86191d9e965444d934adaa2531710f: > > Update version for 2.11.1 release (2018-02-14 14:41:05 -0600) > > are available in the git repository at: > > git://github.com/mdroth/qemu.git > > for you to fetch changes up to acb3571f90885a2e206044b3bdc8d1dd2a0389c0: > > arm_gicv3_kvm: kvm_dist_get/put_priority: skip the registers banked by > GICR_IPRIORITYR (2018-06-16 07:47:00 -0500) > > > Alberto Garcia (2): > specs/qcow2: Fix documentation of the compressed cluster descriptor > throttle: Fix crash on reopen > > Alexandro Sanchez Bach (1): > target/i386: Fix andn instruction > > Brijesh Singh (1): > tap: set vhostfd passed from qemu cli to non-blocking > > Cornelia Huck (4): > s390-ccw: force diag 308 subcode to unsigned long > s390x/css: disabled subchannels cannot be status pending > virtio-ccw: common reset handler > s390x/ccw: make sure all ccw devices are properly reset > > Daniel P. 
Berrangé (1): > i386: define the 'ssbd' CPUID feature bit (CVE-2018-3639) > > David Gibson (3): > spapr: Allow some cases where we can't set VSMT mode in the kernel > spapr: Adjust default VSMT value for better migration compatibility > target/ppc: Clarify compat mode max_threads value > > Eric Blake (4): > nbd: Honor server's advertised minimum block size > nbd/client: Fix error messages during NBD_INFO_BLOCK_SIZE > qemu-img: Fix assert when mapping unaligned raw file > iotests: Add test 221 to catch qemu-img map regression > > Fam Zheng (1): > raw: Check byte range uniformly > > Geert Uytterhoeven (1): > device_tree: Increase FDT_MAX_SIZE to 1 MiB > > Gerd Hoffmann (3): > sdl: workaround bug in sdl 2.0.8 headers > qxl: fix local renderer crash > vga: fix region calculation > > Greg Kurz (12): > spapr: use spapr->vsmt to compute VCPU ids > spapr: move VCPU calculation to core machine code > spapr: rename spapr_vcpu_id() to spapr_get_vcpu_id() > spapr: consolidate the VCPU id numbering logic in a single place > spapr: fix missing CPU core nodes in DT when running with TCG > spapr: register dummy ICPs later > spapr: make pseries-2.11 the default machine type > virtio_net: flush uncompleted TX on reset > exec: fix memory leak in find_max_supported_pagesize() > vfio-ccw: fix memory leaks in vfio_ccw_realize() > target/ppc: always set PPC_MEM_TLBIE in pre 2.8 migration hack > spapr: don't advertise radix GTSE if max-compat-cpu < power9 > > Henry Wertz (1): > tcg/arm: Fix memory barrier encoding > > Jack Schwartz (4): > multiboot: bss_end_addr can be zero > multiboot: Remove unused variables from multiboot.c > multiboot: Use header names when displaying fields > multiboot: fprintf(stderr...) 
-> error_report() > > Jan Kiszka (1): > hw/intc/arm_gicv3: Fix APxR register dispatching > > Jason Andryuk (1): > ccid: Fix dwProtocols advertisement of T=0 > > John Snow (1): > ahci: fix PxCI register race > > John Thomson (1): > Fix libusb-1.0.22 deprecated libusb_set_debug with libusb_set_option > > KONRAD Frederic (1): > sparc: fix leon3 casa instruction when MMU is disabled > > Kevin Wolf (7): > rbd: Fix use after free in qemu_rbd_set_keypairs() error path > multiboot: Reject kernels exceeding the address space > multiboot: Check validity of mh_header_addr > tests/multiboot: Test exit code for every qemu run > tests/multiboot: Add tests for the a.out kludge > tests/multiboot: Add .gitignore > gluster: Fix blockdev-add with server.N.type=unix > > Konrad Rzeszutek Wilk (2): > i386: Define the Virt SSBD MSR and handling of it (CVE-2018-3639) > i386: define the AMD
Re: [Qemu-devel] [PATCH 00/113] Patch Round-up for stable 2.11.2, freeze on 2018-06-22
Quoting Cornelia Huck (2018-06-19 02:42:48) > On Mon, 18 Jun 2018 20:41:26 -0500 > Michael Roth wrote: > > > Hi everyone, > > > > The following new patches are queued for QEMU stable v2.11.2: > > > > https://github.com/mdroth/qemu/commits/stable-2.11-staging > > > > The release is planned for 2018-06-22: > > > > https://wiki.qemu.org/Planning/2.11 > > > > Please respond here or CC qemu-sta...@nongnu.org on any patches you > > think should be included in the release. > > > > Thanks! > > > > > > > > The following changes since commit 7c1beb52ed86191d9e965444d934adaa2531710f: > > > > Update version for 2.11.1 release (2018-02-14 14:41:05 -0600) > > > > are available in the git repository at: > > > > git://github.com/mdroth/qemu.git > > > > for you to fetch changes up to acb3571f90885a2e206044b3bdc8d1dd2a0389c0: > > > > arm_gicv3_kvm: kvm_dist_get/put_priority: skip the registers banked by > > GICR_IPRIORITYR (2018-06-16 07:47:00 -0500) > > > > > > Hi Michael, > > as this series includes some s390-ccw bios patches, it needs a rebuild > of the s390-ccw bios as well, probably on top of your stable branch. > (IIRC we have extra patches on master, so you probably don't want to > cherry-pick the latest rebuild from there.). Let me know if one of us > should provide a rebuild. > Thanks Cornelia, I hadn't realized that. I think rebuild from one of the maintainers would definitely be preferable. We'd also want the corresponding patches for pc-bios/s390-ccw reflected in the 2.11.x tree. If you or another maintainer could put together a branch with those I can merge those in directly.
Re: [Qemu-devel] [RFC PATCH 4/5] build-system: add clean-coverage target
Philippe Mathieu-Daudé writes: > Hi Alex, > > On 06/20/2018 10:20 AM, Alex Bennée wrote: >> This can be used to remove any stale coverage data before any >> particular test run. This is useful for analysing individual tests. >> >> Signed-off-by: Alex Bennée >> --- >> Makefile | 11 +++ >> docs/devel/testing.rst | 11 --- >> 2 files changed, 19 insertions(+), 3 deletions(-) >> >> diff --git a/Makefile b/Makefile >> index e46f2b625a..cb4af8bf80 100644 >> --- a/Makefile >> +++ b/Makefile >> @@ -725,6 +725,14 @@ module_block.h: >> $(SRC_PATH)/scripts/modules/module_block.py config-host.mak >> $(addprefix $(SRC_PATH)/,$(patsubst %.mo,%.c,$(block-obj-m))), \ >> "GEN","$@") >> >> +ifdef CONFIG_GCOV >> +.PHONY: clean-coverage >> +clean-coverage: >> +$(call quiet-command, \ >> +find . \( -name '*.gcda' -o -name '*.gcov' \) -type f -exec rm >> {} +, \ >> +"CLEAN", "coverage files") > > I also see ".gcno" files. > From GCC man page: > > -ftest-coverage >Produce a notes file that the gcov code-coverage >utility can use to show program coverage. Each >source file's note file is called auxname.gcno. I explicitly left that out - the gcno file is regenerated by the build. There is no reason to wipe it between coverage runs. A full clean should remove them however. 
> > Tested-by: Philippe Mathieu-Daudé > Adding gcno: > Reviewed-by: Philippe Mathieu-Daudé > >> +endif >> + >> clean: >> # avoid old build problems by removing potentially incorrect old files >> rm -f config.mak op-i386.h opc-i386.h gen-op-i386.h op-arm.h opc-arm.h >> gen-op-arm.h >> @@ -1075,6 +1083,9 @@ endif >> echo '') >> @echo 'Cleaning targets:' >> @echo ' clean - Remove most generated files but keep the >> config' >> +ifdef CONFIG_GCOV >> +@echo ' clean-coverage - Remove coverage files' >> +endif >> @echo ' distclean - Remove all generated files' >> @echo ' dist- Build a distributable tarball' >> @echo '' >> diff --git a/docs/devel/testing.rst b/docs/devel/testing.rst >> index 66ef219f69..a3652aea14 100644 >> --- a/docs/devel/testing.rst >> +++ b/docs/devel/testing.rst >> @@ -161,9 +161,14 @@ GCC gcov support >> ``gcov`` is a GCC tool to analyze the testing coverage by >> instrumenting the tested code. To use it, configure QEMU with >> ``--enable-gcov`` option and build. Then run ``make check`` as usual. >> -Reports can be obtained by running ``gcov`` command on the output >> -files under ``$build_dir/tests/``, please read the ``gcov`` >> -documentation for more information. >> + >> +If you want to gather coverage information on a single test the ``make >> +clean-coverage`` target can be used to any existing coverage >> +information before running a single test. >> + >> +Reports can be obtained by running ``gcov`` command >> +on the output files under ``$build_dir/tests/``, please read the >> +``gcov`` documentation for more information. >> >> QEMU iotests >> >> -- Alex Bennée
Re: [Qemu-devel] [RFC PATCH 3/5] .travis.yml: add gcovr summary for GCOV build
Philippe Mathieu-Daudé writes: > On 06/20/2018 10:20 AM, Alex Bennée wrote: >> This gives a more useful summary, sorted by descending % coverage, >> after the tests have run. The final numbers will give an idea if our >> coverage is getting better or worse. >> >> As quite a lot of lines don't get covered at all we filter out all the >> 0% lines. If the file doesn't appear it is not being exercised. >> >> Signed-off-by: Alex Bennée >> --- >> .travis.yml | 3 +++ >> 1 file changed, 3 insertions(+) >> >> diff --git a/.travis.yml b/.travis.yml >> index fabfe9ec34..83e0577464 100644 >> --- a/.travis.yml >> +++ b/.travis.yml >> @@ -38,6 +38,7 @@ addons: >>- libvte-2.90-dev >>- sparse >>- uuid-dev >> + - gcovr >> >> # The channel name "irc.oftc.net#qemu" is encrypted against qemu/qemu >> # to prevent IRC notifications from forks. This was created using: >> @@ -81,6 +82,8 @@ matrix: >>compiler: clang >> # gprof/gcov are GCC features >> - env: CONFIG="--enable-gprof --enable-gcov --disable-pie >> --target-list=aarch64-softmmu,arm-softmmu,i386-softmmu,mips-softmmu,mips64-softmmu,ppc64-softmmu,riscv64-softmmu,s390x-softmmu,x86_64-softmmu" > > I just noticed the linux-user tests are not covered. I did try and calculate coverage of a risu run through SVE and didn't get any gcda files so I think there is something else that needs adding first. > > I'd duplicate this entry and use --disable-system --disable-bsd-user. > >> + after_success: >> +- gcovr -p | grep -v "0%" | sed s/[0-9]\*[,-]//g >>compiler: gcc >> # We manually include builds which we disable "make check" for >> - env: CONFIG="--enable-debug --enable-tcg-interpreter" >> -- Alex Bennée
Re: [Qemu-devel] [PATCH v2 3/7] s390x/tcg: properly implement the TOD
On 06/20/2018 10:33 AM, David Hildenbrand wrote: > On 20.06.2018 21:33, Richard Henderson wrote: >> On 06/20/2018 12:08 AM, David Hildenbrand wrote: >>> +/* Converts ns to s390's clock format */ >>> +static inline uint64_t time2tod(uint64_t ns) >>> +{ >>> +return (ns << 9) / 125; >>> +} >>> + >>> +/* Converts s390's clock format to ns */ >>> +static inline uint64_t tod2time(uint64_t t) >>> +{ >>> +return (t * 125) >> 9; >>> +} >> > > In this patch I'm only moving the code. If we find this is a problem, > this should go into a separate patch. Ah, right. >> How many significant bits on input here? > > Basically all are significant, and as it is a clock, we will reach these > bits at one point. > >> Do you in fact want to be using muldiv64? > > Looking at linux: > > arch/s390/include/asm/timex.h > > They have a lengthy documentation, resulting in (a split to avoid overflows) > > return ((todval >> 9) * 125) + (((todval & 0x1ff) * 125) >> 9); > > Maybe we should do the same? That would work too. r~
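For reference, the muldiv64 idea raised above can be sketched for the ns-to-TOD direction. This is only an illustration, not the code in the patch: muldiv64_sketch() and time2tod_wide() are hypothetical names, the helper is modelled on (but is not) QEMU's muldiv64(), and it assumes a GCC or Clang compiler that provides unsigned __int128. The point is that (ns << 9) loses the top bits once ns reaches 2^55, whereas a 128-bit intermediate does not.

```c
#include <stdint.h>

/*
 * Hypothetical helper: compute a * b / c with a 128-bit intermediate.
 * Assumes the compiler supports unsigned __int128 (GCC/Clang extension).
 * The result is truncated to 64 bits if the quotient does not fit.
 */
static inline uint64_t muldiv64_sketch(uint64_t a, uint32_t b, uint32_t c)
{
    return (uint64_t)(((unsigned __int128)a * b) / c);
}

/*
 * ns -> TOD units, i.e. ns * 512 / 125, without the intermediate
 * overflow that (ns << 9) / 125 hits for ns >= 2^55.
 */
static inline uint64_t time2tod_wide(uint64_t ns)
{
    return muldiv64_sketch(ns, 512, 125);
}
```

With this sketch, 125 ns maps to 512 TOD units as before, and an input such as 1ULL << 60 still yields a meaningful value instead of 0, which is what the shift-based version returns once the top bits fall off.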
Re: [Qemu-devel] [Qemu-block] [PATCH v5 6/6] docs/interop: add nbd.txt
On 06/20/2018 10:16 AM, Vladimir Sementsov-Ogievskiy wrote: > 20.06.2018 14:33, Eric Blake wrote: >> On 06/09/2018 10:17 AM, Vladimir Sementsov-Ogievskiy wrote: >>> Describe new metadata namespace: "qemu". >>> >>> Signed-off-by: Vladimir Sementsov-Ogievskiy >>> --- >>> docs/interop/nbd.txt | 37 + >>> MAINTAINERS | 1 + >>> 2 files changed, 38 insertions(+) >>> create mode 100644 docs/interop/nbd.txt >>> >>> diff --git a/docs/interop/nbd.txt b/docs/interop/nbd.txt >>> new file mode 100644 >>> index 00..7366269fc0 >>> --- /dev/null >>> +++ b/docs/interop/nbd.txt >>> @@ -0,0 +1,37 @@ >>> +Qemu supports NBD protocol, and has internal NBD client (look at >> >> s/supports/supports the/ >> >>> +block/nbd.c), internal NBD server (look at blockdev-nbd.c) as well as >> >> s/internal/an internal/2 >> >>> +external NBD server tool - qemu-nbd.c. The common code is placed in >> >> s/external/an external/ >> >>> +nbd/*. >>> + >>> +NBD protocol is specified here: >> >> s/NBD/The NBD/ >> >>> +https://github.com/NetworkBlockDevice/nbd/blob/master/doc/proto.md >>> + >>> +This following paragraphs describe some specific properties of NBD >>> +protocol realization in Qemu. >>> + >>> + >>> += Metadata namespaces = >>> + >>> +Qemu supports "base:allocation" metadata context as defined in the NBD >> >> s/supports/supports the/ >> >>> +protocol specification and defines own metadata namespace: "qemu". >> >> s/own/an additional/ >> >>> + >>> + >>> +== "qemu" namespace == >>> + >>> +For now, the only type of metadata context in the namespace is dirty >>> +bitmap. All available metadata contexts have the following form: >> >> maybe: >> >> The "qemu" namespace currently contains only one type of context, >> related to exposing the contents of a dirty bitmap alongside the >> associated disk contents. 
The available metadata context has the >> following form: > > Ok > >> >>> + >>> + qemu:dirty-bitmap: >>> + >>> +Each dirty-bitmap metadata context defines the only one flag for >>> +extents in reply for NBD_CMD_BLOCK_STATUS: >>> + >>> + bit 0: NBD_STATE_DIRTY, means that the extent is "dirty" >>> + >>> +For NBD_OPT_LIST_META_CONTEXT the following queries are supported >>> +additionally to "qemu:dirty-bitmap:": >> >> s/additionally/in addition/ >> >>> + >>> +* "qemu:" : returns list of all available metadata contexts in the >>> + namespace. >>> +* "qemu:dirty-bitmap:" : returns list of all available dirty-bitmap >>> + metadata contexts. >>> diff --git a/MAINTAINERS b/MAINTAINERS >>> index e187b1f18f..887b479440 100644 >>> --- a/MAINTAINERS >>> +++ b/MAINTAINERS >>> @@ -1923,6 +1923,7 @@ F: nbd/ >>> F: include/block/nbd* >>> F: qemu-nbd.* >>> F: blockdev-nbd.c >>> +F: docs/interop/nbd.txt >>> T: git git://repo.or.cz/qemu/ericb.git nbd >>> NFS >>> >> >> Reviewed-by: Eric Blake >> >> At this point, I think I'll touch up the issues I've spotted and >> submit a pull request, in order to make it easier for me to test my >> libvirt code. >> > > Ok, thank you! > ACK; the x- prefixes will help us get everything rolling together much faster and gives us some leeway to change things later as needed. Vladimir, can you jog our memories and let us know which series still need to hit QEMU for 3.0 for safe persistence/migration et al? (Not including any of my own qemu-img patches which I'll get to by freeze.)
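As a rough illustration of how a client might consume the "qemu:dirty-bitmap:" context described in the quoted text, here is a minimal sketch. It assumes each NBD_CMD_BLOCK_STATUS extent is delivered as a 32-bit length plus 32-bit flags pair; the struct and function names below are made up for the example and are not taken from QEMU. Only the bit 0 meaning (NBD_STATE_DIRTY) comes from the document under review.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 0 of the extent flags, as defined in the quoted nbd.txt. */
#define NBD_STATE_DIRTY (1u << 0)

/* Hypothetical in-memory form of one block-status extent. */
struct nbd_extent_sketch {
    uint32_t length;   /* extent length in bytes */
    uint32_t flags;    /* context-specific status flags */
};

/* True if the extent is marked dirty in the dirty-bitmap context. */
static inline bool extent_is_dirty(const struct nbd_extent_sketch *e)
{
    return (e->flags & NBD_STATE_DIRTY) != 0;
}

/* Sum the lengths of all dirty extents in a reply. */
static uint64_t count_dirty_bytes(const struct nbd_extent_sketch *ext, size_t n)
{
    uint64_t total = 0;

    for (size_t i = 0; i < n; i++) {
        if (extent_is_dirty(&ext[i])) {
            total += ext[i].length;
        }
    }
    return total;
}
```

A backup tool could use something like count_dirty_bytes() to estimate how much data an incremental copy needs to transfer.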
Re: [Qemu-devel] [PATCH v1] postcopy: drop ram_pages parameter from postcopy_ram_incoming_init()
Hi, This series failed docker-mingw@fedora build test. Please find the testing commands and their output below. If you have Docker installed, you can probably reproduce it locally. Type: series Message-id: 20180620152241.15772-1-da...@redhat.com Subject: [Qemu-devel] [PATCH v1] postcopy: drop ram_pages parameter from postcopy_ram_incoming_init() === TEST SCRIPT BEGIN === #!/bin/bash set -e git submodule update --init dtc # Let docker tests dump environment info export SHOW_ENV=1 export J=8 time make docker-test-mingw@fedora === TEST SCRIPT END === Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384 Switched to a new branch 'test' e7347b3a55 postcopy: drop ram_pages parameter from postcopy_ram_incoming_init() === OUTPUT BEGIN === Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc' Cloning into '/var/tmp/patchew-tester-tmp-8zzcpzkm/src/dtc'... Submodule path 'dtc': checked out 'e54388015af1fb4bf04d0bca99caba1074d9cc42' BUILD fedora make[1]: Entering directory '/var/tmp/patchew-tester-tmp-8zzcpzkm/src' GEN /var/tmp/patchew-tester-tmp-8zzcpzkm/src/docker-src.2018-06-20-15.02.33.29229/qemu.tar Cloning into '/var/tmp/patchew-tester-tmp-8zzcpzkm/src/docker-src.2018-06-20-15.02.33.29229/qemu.tar.vroot'... done. 
Checking out
files: 100% (6237/6237), done. Your branch is up-to-date with 'origin/test'. Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc' Cloning into '/var/tmp/patchew-tester-tmp-8zzcpzkm/src/docker-src.2018-06-20-15.02.33.29229/qemu.tar.vroot/dtc'... Submodule path 'dtc': checked out 'e54388015af1fb4bf04d0bca99caba1074d9cc42' Submodule 'ui/keycodemapdb' (git://git.qemu.org/keycodemapdb.git) registered for path 'ui/keycodemapdb' Cloning into '/var/tmp/patchew-tester-tmp-8zzcpzkm/src/docker-src.2018-06-20-15.02.33.29229/qemu.tar.vroot/ui/keycodemapdb'... Submodule path 'ui/keycodemapdb': checked out '6b3d716e2b6472eb7189d3220552280ef3d832ce' COPYRUNNER RUN test-mingw in qemu:fedora Packages installed: SDL2-devel-2.0.8-5.fc28.x86_64 bc-1.07.1-5.fc28.x86_64 bison-3.0.4-9.fc28.x86_64 bluez-libs-devel-5.49-3.fc28.x86_64 brlapi-devel-0.6.7-12.fc28.x86_64 bzip2-1.0.6-26.fc28.x86_64 bzip2-devel-1.0.6-26.fc28.x86_64 ccache-3.4.2-2.fc28.x86_64 clang-6.0.0-5.fc28.x86_64 device-mapper-multipath-devel-0.7.4-2.git07e7bd5.fc28.x86_64 findutils-4.6.0-19.fc28.x86_64 flex-2.6.1-7.fc28.x86_64 gcc-8.1.1-1.fc28.x86_64 gcc-c++-8.1.1-1.fc28.x86_64 gettext-0.19.8.1-14.fc28.x86_64 git-2.17.1-2.fc28.x86_64 glib2-devel-2.56.1-3.fc28.x86_64 glusterfs-api-devel-4.0.2-1.fc28.x86_64 gnutls-devel-3.6.2-1.fc28.x86_64 gtk3-devel-3.22.30-1.fc28.x86_64 hostname-3.20-3.fc28.x86_64 libaio-devel-0.3.110-11.fc28.x86_64 libasan-8.1.1-1.fc28.x86_64 libattr-devel-2.4.47-23.fc28.x86_64
Re: [Qemu-devel] [RFC] Add NRF51 RNG peripheral
Hi Fam, On 06/19/2018 10:20 AM, no-re...@patchew.org wrote: > Hi, > > This series failed build test on s390x host. Please find the details below. > > N/A. Internal error while reading log file Did you notice this error? Maybe this kind of error should only be sent to patchew-de...@redhat.com. > > > > --- > Email generated automatically by Patchew [http://patchew.org/]. > Please send your feedback to patchew-de...@redhat.com >
Re: [Qemu-devel] [RFC PATCH 5/5] build-system: add coverage-report target
Hi Alex, On 06/20/2018 10:20 AM, Alex Bennée wrote: > This will build a coverage report under the current directory in > reports/coverage. At the users option a report can be generated by > directly invoking something like: > > make foo/bar/coverage-report.html > > Signed-off-by: Alex Bennée > --- > Makefile | 13 + > docs/devel/testing.rst | 11 --- > 2 files changed, 21 insertions(+), 3 deletions(-) > > diff --git a/Makefile b/Makefile > index cb4af8bf80..7450e0b7b5 100644 > --- a/Makefile > +++ b/Makefile > @@ -988,6 +988,16 @@ docs/interop/qemu-qmp-ref.dvi > docs/interop/qemu-qmp-ref.html \ > docs/interop/qemu-qmp-ref.txt docs/interop/qemu-qmp-ref.7: \ > docs/interop/qemu-qmp-ref.texi docs/interop/qemu-qmp-qapi.texi > > +# Reports/Analysis > + > +%/coverage-report.html: What about the files in the root directory? > + @mkdir -p $* > + $(call quiet-command,\ > + gcovr -p --html --html-details -o $@, \ > + "GEN", "coverage-report.html") I think this also needs "-r $(SRC_PATH)" for out-of-tree builds. From GCOVR(1): -r ROOT, --root=ROOT Defines the root directory for source files. > + > +.PHONY: coverage-report > +coverage-report: $(CURDIR)/reports/coverage/coverage-report.html > > ifdef CONFIG_WIN32 > > @@ -1097,6 +1107,9 @@ endif > @echo 'Documentation targets:' > @echo ' html info pdf txt' > @echo ' - Build documentation in specified format' > +ifdef CONFIG_GCOV > + @echo ' coverage-report - Create code coverage report' > +endif > @echo '' > ifdef CONFIG_WIN32 > @echo 'Windows targets:' > diff --git a/docs/devel/testing.rst b/docs/devel/testing.rst > index a3652aea14..9dcdd19260 100644 > --- a/docs/devel/testing.rst > +++ b/docs/devel/testing.rst > @@ -166,9 +166,14 @@ If you want to gather coverage information on a single > test the ``make > clean-coverage`` target can be used to any existing coverage > information before running a single test. 
> > -Reports can be obtained by running ``gcov`` command > -on the output files under ``$build_dir/tests/``, please read the > -``gcov`` documentation for more information. > +You can generate a HTML coverage report by executing ``make > +coverage-report`` which will generate into > +./reports/coverage/coverage-report.html. If you want to generate it > +elsewhere simply execute ``make /foo/bar/baz/coverage-report.html``. $ make coverage-report GEN coverage-report.html /bin/sh: 1: gcovr: not found make: *** [Makefile:995: reports/coverage/coverage-report.html] Error 127 Can you add a line about this prerequisite? (I don't think it's worth a check in ./configure). > + > +Further analysis can be conducted by running the ``gcov`` command > +directly on the various .gcda output files. Please read the ``gcov`` > +documentation for more information. > > QEMU iotests > > Reviewed-by: Philippe Mathieu-Daudé Tested-by: Philippe Mathieu-Daudé
Re: [Qemu-devel] [RFC PATCH 3/5] .travis.yml: add gcovr summary for GCOV build
On 06/20/2018 10:20 AM, Alex Bennée wrote: > This gives a more useful summary, sorted by descending % coverage, > after the tests have run. The final numbers will give an idea if our > coverage is getting better or worse. > > As quite a lot of lines don't get covered at all we filter out all the > 0% lines. If the file doesn't appear it is not being exercised. > > Signed-off-by: Alex Bennée > --- > .travis.yml | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/.travis.yml b/.travis.yml > index fabfe9ec34..83e0577464 100644 > --- a/.travis.yml > +++ b/.travis.yml > @@ -38,6 +38,7 @@ addons: >- libvte-2.90-dev >- sparse >- uuid-dev > + - gcovr > > # The channel name "irc.oftc.net#qemu" is encrypted against qemu/qemu > # to prevent IRC notifications from forks. This was created using: > @@ -81,6 +82,8 @@ matrix: >compiler: clang > # gprof/gcov are GCC features > - env: CONFIG="--enable-gprof --enable-gcov --disable-pie > --target-list=aarch64-softmmu,arm-softmmu,i386-softmmu,mips-softmmu,mips64-softmmu,ppc64-softmmu,riscv64-softmmu,s390x-softmmu,x86_64-softmmu" I just noticed the linux-user tests are not covered. I'd duplicate this entry and use --disable-system --disable-bsd-user. > + after_success: > +- gcovr -p | grep -v "0%" | sed s/[0-9]\*[,-]//g >compiler: gcc > # We manually include builds which we disable "make check" for > - env: CONFIG="--enable-debug --enable-tcg-interpreter" >
Re: [Qemu-devel] [RFC PATCH 4/5] build-system: add clean-coverage target
Hi Alex, On 06/20/2018 10:20 AM, Alex Bennée wrote: > This can be used to remove any stale coverage data before any > particular test run. This is useful for analysing individual tests. > > Signed-off-by: Alex Bennée > --- > Makefile | 11 +++ > docs/devel/testing.rst | 11 --- > 2 files changed, 19 insertions(+), 3 deletions(-) > > diff --git a/Makefile b/Makefile > index e46f2b625a..cb4af8bf80 100644 > --- a/Makefile > +++ b/Makefile > @@ -725,6 +725,14 @@ module_block.h: > $(SRC_PATH)/scripts/modules/module_block.py config-host.mak > $(addprefix $(SRC_PATH)/,$(patsubst %.mo,%.c,$(block-obj-m))), \ > "GEN","$@") > > +ifdef CONFIG_GCOV > +.PHONY: clean-coverage > +clean-coverage: > + $(call quiet-command, \ > + find . \( -name '*.gcda' -o -name '*.gcov' \) -type f -exec rm > {} +, \ > + "CLEAN", "coverage files") I also see ".gcno" files. From GCC man page: -ftest-coverage Produce a notes file that the gcov code-coverage utility can use to show program coverage. Each source file's note file is called auxname.gcno. Tested-by: Philippe Mathieu-Daudé Adding gcno: Reviewed-by: Philippe Mathieu-Daudé > +endif > + > clean: > # avoid old build problems by removing potentially incorrect old files > rm -f config.mak op-i386.h opc-i386.h gen-op-i386.h op-arm.h opc-arm.h > gen-op-arm.h > @@ -1075,6 +1083,9 @@ endif > echo '') > @echo 'Cleaning targets:' > @echo ' clean - Remove most generated files but keep the > config' > +ifdef CONFIG_GCOV > + @echo ' clean-coverage - Remove coverage files' > +endif > @echo ' distclean - Remove all generated files' > @echo ' dist- Build a distributable tarball' > @echo '' > diff --git a/docs/devel/testing.rst b/docs/devel/testing.rst > index 66ef219f69..a3652aea14 100644 > --- a/docs/devel/testing.rst > +++ b/docs/devel/testing.rst > @@ -161,9 +161,14 @@ GCC gcov support > ``gcov`` is a GCC tool to analyze the testing coverage by > instrumenting the tested code. To use it, configure QEMU with > ``--enable-gcov`` option and build. 
Then run ``make check`` as usual. > -Reports can be obtained by running ``gcov`` command on the output > -files under ``$build_dir/tests/``, please read the ``gcov`` > -documentation for more information. > + > +If you want to gather coverage information on a single test the ``make > +clean-coverage`` target can be used to any existing coverage > +information before running a single test. > + > +Reports can be obtained by running ``gcov`` command > +on the output files under ``$build_dir/tests/``, please read the > +``gcov`` documentation for more information. > > QEMU iotests > >
Re: [Qemu-devel] [PATCH v2 3/7] s390x/tcg: properly implement the TOD
On 20.06.2018 21:33, Richard Henderson wrote: > On 06/20/2018 12:08 AM, David Hildenbrand wrote: >> +/* Converts ns to s390's clock format */ >> +static inline uint64_t time2tod(uint64_t ns) >> +{ >> +return (ns << 9) / 125; >> +} >> + >> +/* Converts s390's clock format to ns */ >> +static inline uint64_t tod2time(uint64_t t) >> +{ >> +return (t * 125) >> 9; >> +} > In this patch I'm only moving the code. If we find this is a problem, this should go into a separate patch. > How many significant bits on input here? Basically all are significant, and as it is a clock, we will reach these bits at one point. > Do you in fact want to be using muldiv64? Looking at linux: arch/s390/include/asm/timex.h They have a lengthy documentation, resulting in (a split to avoid overflows) return ((todval >> 9) * 125) + (((todval & 0x1ff) * 125) >> 9); Maybe we should do the same? > > > r~ > -- Thanks, David / dhildenb
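To make the overflow concern concrete, the two TOD-to-ns variants discussed above can be put side by side. This is a small sketch for comparison only; the function names tod2time_naive() and tod2time_split() are chosen here for illustration, while the two expressions themselves are the ones quoted in the thread. The split form is exact because t = (t >> 9) * 512 + (t & 0x1ff), so the first term divides by 512 without remainder and only the low 9 bits need rounding down.

```c
#include <stdint.h>

/*
 * Expression from the patch: t * 125 wraps around once
 * t exceeds UINT64_MAX / 125, so large clock values go wrong.
 */
static inline uint64_t tod2time_naive(uint64_t t)
{
    return (t * 125) >> 9;
}

/*
 * Expression quoted from arch/s390/include/asm/timex.h: handle the
 * high 55 bits and the low 9 bits separately. The largest intermediate
 * is (2^55 - 1) * 125, which fits comfortably in 64 bits.
 */
static inline uint64_t tod2time_split(uint64_t t)
{
    return ((t >> 9) * 125) + (((t & 0x1ff) * 125) >> 9);
}
```

Both forms agree for small inputs (for example 512 TOD units give 125 ns), but they diverge for values near the top of the 64-bit range, where only the split form stays correct.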
Re: [Qemu-devel] [RFC PATCH 3/5] .travis.yml: add gcovr summary for GCOV build
On 06/20/2018 10:20 AM, Alex Bennée wrote: > This gives a more useful summary, sorted by descending % coverage, > after the tests have run. The final numbers will give an idea if our > coverage is getting better or worse. > > As quite a lot of lines don't get covered at all we filter out all the > 0% lines. If the file doesn't appear it is not being exercised. > > Signed-off-by: Alex Bennée > --- > .travis.yml | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/.travis.yml b/.travis.yml > index fabfe9ec34..83e0577464 100644 > --- a/.travis.yml > +++ b/.travis.yml > @@ -38,6 +38,7 @@ addons: >- libvte-2.90-dev >- sparse >- uuid-dev > + - gcovr > > # The channel name "irc.oftc.net#qemu" is encrypted against qemu/qemu > # to prevent IRC notifications from forks. This was created using: > @@ -81,6 +82,8 @@ matrix: >compiler: clang > # gprof/gcov are GCC features > - env: CONFIG="--enable-gprof --enable-gcov --disable-pie > --target-list=aarch64-softmmu,arm-softmmu,i386-softmmu,mips-softmmu,mips64-softmmu,ppc64-softmmu,riscv64-softmmu,s390x-softmmu,x86_64-softmmu" > + after_success: > +- gcovr -p | grep -v "0%" | sed s/[0-9]\*[,-]//g Can we use 'fgrep -B 1 -v "0%"' (--before-context=1) to also remove the filepath line? Any idea why I get this? "(WARNING) Unrecognized GCOV output: '='" Reviewed-by: Philippe Mathieu-Daudé Tested-by: Philippe Mathieu-Daudé >compiler: gcc > # We manually include builds which we disable "make check" for > - env: CONFIG="--enable-debug --enable-tcg-interpreter" >
Re: [Qemu-devel] regression: sata cdrom boot broken
On 06/20/2018 08:18 AM, Paolo Bonzini wrote: > On 20/06/2018 13:07, Gerd Hoffmann wrote: >> Hi, >> >> $subject says all. Noticed while testing the upcoming seabios update. >> Reproducer: >> >> qemu-system-x86_64 -M q35 -m 4G -cdrom >> Fedora-Workstation-Live-x86_64-28-1.1.iso >> >> bisected to: >> >> commit 956556e131e35f387ac482ad7b41151576fef057 >> Author: John Snow >> Date: Wed Jun 6 15:09:50 2018 -0400 >> >> ahci: move PIO Setup FIS before transfer, fix it for ATAPI commands >> >> cheers, >> Gerd >> > > If you know where to look at, the spec is actually pretty clear with > respect to when the interrupt is generated: "A PIO Setup FIS has been > received with the ‘I’ bit set, it has been copied into system memory, > and the data related to that FIS has been transferred". > You're quoting AHCI 1.3.1 section 3.3.5 here, the documentation for the PxIS register. ...Oh. So the PIO Setup FIS ... gets generated before the data is sent, but we don't copy it to the HBA memory buffers and notify the client until afterwards, but this is per-DRQ, I think, and not per-IDE command. > However, after reading the SATA specification I believe that the PIO > Setup FIS should never generate an interrupt in SeaBIOS, because: > > If this is the first DRQ data block for this command, the Interrupt > bit shall be cleared to zero. If this is not the first DRQ data block > for this command, the Interrupt bit shall be set to one > ...and this gem is from SATA 3.2 section 11.9, "PIO data-out command protocol." I have long wondered what controlled that 'I' bit... > Putting things together, there are two bugs in QEMU; > > - the PIO Setup interrupt must be generated at the end of data transfer > Oops. > - the PIO Setup interrupt must not be generated for the ATAPI command > transfer > Oops again. I ought to have stopped you but I was long aware that the PIO Setup FIS ought to be generated "before" the transfer. I suppose what we were doing was more correct, though. 
> *But* because SeaBIOS always uses DMA for ATAPI commands, there should > never be more than one DRQ data block for each command, and it should be > possible to remove the fishy PIO Setup FIS handling in SeaBIOS. > > Paolo >
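The two conclusions above can be condensed into a small decision function. This is a sketch under assumptions, not QEMU's AHCI code: the enum, its values, and the function name are invented for the example, and the DRQ-block rule is the one quoted from the SATA "PIO data-out command protocol" text, so it may not carry over unchanged to other protocols.

```c
#include <stdbool.h>

/* Hypothetical classification of what a PIO Setup FIS announces. */
enum pio_xfer_kind_sketch {
    PIO_XFER_ATAPI_PACKET,  /* transfer of the ATAPI command packet itself */
    PIO_XFER_DATA_OUT       /* a DRQ data block of a PIO data-out command */
};

/*
 * Decide the 'I' (interrupt) bit of a PIO Setup FIS:
 *  - never for the ATAPI command packet transfer (second bug named above);
 *  - cleared for the first DRQ data block, set for every later one
 *    (rule quoted from the SATA specification for PIO data-out).
 */
static bool pio_setup_fis_i_bit_sketch(enum pio_xfer_kind_sketch kind,
                                       unsigned drq_block_index)
{
    if (kind == PIO_XFER_ATAPI_PACKET) {
        return false;
    }
    return drq_block_index > 0;
}
```

Under this reading, a command with a single DRQ data block never raises a PIO Setup interrupt, which matches the observation that SeaBIOS should not need to handle one.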
[Qemu-devel] [PATCH v2] postcopy: drop ram_pages parameter from postcopy_ram_incoming_init()
Not needed. Don't expose last_ram_page(). Signed-off-by: David Hildenbrand --- v1 -> v2: - Make "last_ram_page" static exec.c | 2 +- include/exec/ram_addr.h | 1 - migration/postcopy-ram.c | 4 ++-- migration/postcopy-ram.h | 2 +- migration/ram.c | 4 +--- 5 files changed, 5 insertions(+), 8 deletions(-) diff --git a/exec.c b/exec.c index 9f4706db19..cc1c102d95 100644 --- a/exec.c +++ b/exec.c @@ -1940,7 +1940,7 @@ static ram_addr_t find_ram_offset(ram_addr_t size) return offset; } -unsigned long last_ram_page(void) +static unsigned long last_ram_page(void) { RAMBlock *block; ram_addr_t last = 0; diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h index 9295c01a89..d6687690fb 100644 --- a/include/exec/ram_addr.h +++ b/include/exec/ram_addr.h @@ -71,7 +71,6 @@ static inline unsigned long int ramblock_recv_bitmap_offset(void *host_addr, } long qemu_getrampagesize(void); -unsigned long last_ram_page(void); RAMBlock *qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr, bool share, const char *mem_path, Error **errp); diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c index 48e51556a7..932f188949 100644 --- a/migration/postcopy-ram.c +++ b/migration/postcopy-ram.c @@ -500,7 +500,7 @@ static int cleanup_range(const char *block_name, void *host_addr, * postcopy later; must be called prior to any precopy. 
* called from arch_init's similarly named ram_postcopy_incoming_init */ -int postcopy_ram_incoming_init(MigrationIncomingState *mis, size_t ram_pages) +int postcopy_ram_incoming_init(MigrationIncomingState *mis) { if (qemu_ram_foreach_migratable_block(init_range, NULL)) { return -1; @@ -1265,7 +1265,7 @@ bool postcopy_ram_supported_by_host(MigrationIncomingState *mis) return false; } -int postcopy_ram_incoming_init(MigrationIncomingState *mis, size_t ram_pages) +int postcopy_ram_incoming_init(MigrationIncomingState *mis) { error_report("postcopy_ram_incoming_init: No OS support"); return -1; diff --git a/migration/postcopy-ram.h b/migration/postcopy-ram.h index d900d9c34f..9d55536fd1 100644 --- a/migration/postcopy-ram.h +++ b/migration/postcopy-ram.h @@ -27,7 +27,7 @@ int postcopy_ram_enable_notify(MigrationIncomingState *mis); * postcopy later; must be called prior to any precopy. * called from ram.c's similarly named ram_postcopy_incoming_init */ -int postcopy_ram_incoming_init(MigrationIncomingState *mis, size_t ram_pages); +int postcopy_ram_incoming_init(MigrationIncomingState *mis); /* * At the end of a migration where postcopy_ram_incoming_init was called. diff --git a/migration/ram.c b/migration/ram.c index cd5f55117d..8de7ab683e 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -3107,9 +3107,7 @@ static int ram_load_cleanup(void *opaque) */ int ram_postcopy_incoming_init(MigrationIncomingState *mis) { -unsigned long ram_pages = last_ram_page(); - -return postcopy_ram_incoming_init(mis, ram_pages); +return postcopy_ram_incoming_init(mis); } /** -- 2.17.1
Re: [Qemu-devel] [RFC PATCH 2/5] .gitignore: add .gcov files
On 06/20/2018 10:20 AM, Alex Bennée wrote: > These are temporary files generated on gcov runs and shouldn't be > included in the source tree. > > Signed-off-by: Alex Bennée Reviewed-by: Philippe Mathieu-Daudé Tested-by: Philippe Mathieu-Daudé > --- > .gitignore | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/.gitignore b/.gitignore > index 9da3b3e626..5668d02782 100644 > --- a/.gitignore > +++ b/.gitignore > @@ -155,6 +155,7 @@ > .sdk > *.gcda > *.gcno > +*.gcov > /pc-bios/bios-pq/status > /pc-bios/vgabios-pq/status > /pc-bios/optionrom/linuxboot.asm >
Re: [Qemu-devel] [RFC PATCH 1/5] build-system: remove per-test GCOV reporting
On 06/20/2018 10:20 AM, Alex Bennée wrote: > I'm not entirely sure who's using this information and certainly in a > CI environment it just washes over as additional noise. Later patches > will provide new reporting options so a user who wants to analyse > individual tests will be able to use that to get the information. > > Signed-off-by: Alex Bennée Reviewed-by: Philippe Mathieu-Daudé Tested-by: Philippe Mathieu-Daudé > --- > docs/devel/testing.rst | 11 +-- > tests/Makefile.include | 10 -- > 2 files changed, 5 insertions(+), 16 deletions(-) > > diff --git a/docs/devel/testing.rst b/docs/devel/testing.rst > index f33e5a8423..66ef219f69 100644 > --- a/docs/devel/testing.rst > +++ b/docs/devel/testing.rst > @@ -158,12 +158,11 @@ rarely used. See "QEMU iotests" section below for more > information. > GCC gcov support > > > -``gcov`` is a GCC tool to analyze the testing coverage by instrumenting the > -tested code. To use it, configure QEMU with ``--enable-gcov`` option and > build. > -Then run ``make check`` as usual. There will be additional ``gcov`` output as > -the testing goes on, showing the test coverage percentage numbers per > analyzed > -source file. More detailed reports can be obtained by running ``gcov`` > command > -on the output files under ``$build_dir/tests/``, please read the ``gcov`` > +``gcov`` is a GCC tool to analyze the testing coverage by > +instrumenting the tested code. To use it, configure QEMU with > +``--enable-gcov`` option and build. Then run ``make check`` as usual. > +Reports can be obtained by running ``gcov`` command on the output > +files under ``$build_dir/tests/``, please read the ``gcov`` > documentation for more information. 
> > QEMU iotests > diff --git a/tests/Makefile.include b/tests/Makefile.include > index ca91da26cb..55d54bd180 100644 > --- a/tests/Makefile.include > +++ b/tests/Makefile.include > @@ -891,26 +891,16 @@ GCOV_OPTIONS = -n $(if $(V),-f,) > > .PHONY: $(patsubst %, check-qtest-%, $(QTEST_TARGETS)) > $(patsubst %, check-qtest-%, $(QTEST_TARGETS)): check-qtest-%: > subdir-%-softmmu $(check-qtest-y) > - $(if $(CONFIG_GCOV),@rm -f *.gcda */*.gcda */*/*.gcda */*/*/*.gcda,) > $(call quiet-command,QTEST_QEMU_BINARY=$*-softmmu/qemu-system-$* \ > QTEST_QEMU_IMG=qemu-img$(EXESUF) \ > MALLOC_PERTURB_=$${MALLOC_PERTURB_:-$$(( $${RANDOM:-0} % 255 + > 1))} \ > gtester $(GTESTER_OPTIONS) -m=$(SPEED) $(check-qtest-$*-y) > $(check-qtest-generic-y),"GTESTER","$@") > - $(if $(CONFIG_GCOV),@for f in $(gcov-files-$*-y) > $(gcov-files-generic-y); do \ > - echo Gcov report for $$f:;\ > - $(GCOV) $(GCOV_OPTIONS) $$f -o `dirname $$f`; \ > - done,) > > .PHONY: $(patsubst %, check-%, $(check-unit-y) $(check-speed-y)) > $(patsubst %, check-%, $(check-unit-y) $(check-speed-y)): check-%: % > - $(if $(CONFIG_GCOV),@rm -f *.gcda */*.gcda */*/*.gcda */*/*/*.gcda,) > $(call quiet-command, \ > MALLOC_PERTURB_=$${MALLOC_PERTURB_:-$$(( $${RANDOM:-0} % 255 + > 1))} \ > gtester $(GTESTER_OPTIONS) -m=$(SPEED) $*,"GTESTER","$*") > - $(if $(CONFIG_GCOV),@for f in $(gcov-files-$(subst tests/,,$*)-y) > $(gcov-files-generic-y); do \ > - echo Gcov report for $$f:;\ > - $(GCOV) $(GCOV_OPTIONS) $$f -o `dirname $$f`; \ > - done,) > > # gtester tests with XML output > >
Re: [Qemu-devel] [PATCH v1] postcopy: drop ram_pages parameter from postcopy_ram_incoming_init()
Hi,

This series failed build test on s390x host. Please find the details below.

Type: series
Message-id: 20180620152241.15772-1-da...@redhat.com
Subject: [Qemu-devel] [PATCH v1] postcopy: drop ram_pages parameter from postcopy_ram_incoming_init()

=== TEST SCRIPT BEGIN ===
#!/bin/bash
# Testing script will be invoked under the git checkout with
# HEAD pointing to a commit that has the patches applied on top of "base"
# branch
set -e
echo "=== ENV ==="
env
echo "=== PACKAGES ==="
rpm -qa
echo "=== TEST BEGIN ==="
CC=$HOME/bin/cc
INSTALL=$PWD/install
BUILD=$PWD/build
echo -n "Using CC: "
realpath $CC
mkdir -p $BUILD $INSTALL
SRC=$PWD
cd $BUILD
$SRC/configure --cc=$CC --prefix=$INSTALL
make -j4
# XXX: we need reliable clean up
# make check -j4 V=1
make install
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
e7347b3a55 postcopy: drop ram_pages parameter from postcopy_ram_incoming_init()

=== OUTPUT BEGIN ===
=== ENV ===
LANG=en_US.UTF-8
XDG_SESSION_ID=240042
USER=fam
PWD=/var/tmp/patchew-tester-tmp-3bsppygl/src
HOME=/home/fam
SHELL=/bin/sh
SHLVL=2
PATCHEW=/home/fam/patchew/patchew-cli -s http://patchew.org --nodebug
LOGNAME=fam
DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1012/bus
XDG_RUNTIME_DIR=/run/user/1012
PATH=/usr/bin:/bin
_=/usr/bin/env
=== PACKAGES ===
gpg-pubkey-873529b8-54e386ff glibc-debuginfo-common-2.24-10.fc25.s390x fedora-release-26-1.noarch dejavu-sans-mono-fonts-2.35-4.fc26.noarch xemacs-filesystem-21.5.34-22.20170124hgf412e9f093d4.fc26.noarch bash-4.4.12-7.fc26.s390x libSM-1.2.2-5.fc26.s390x libmpc-1.0.2-6.fc26.s390x libaio-0.3.110-7.fc26.s390x libverto-0.2.6-7.fc26.s390x perl-Scalar-List-Utils-1.48-1.fc26.s390x iptables-libs-1.6.1-2.fc26.s390x tcl-8.6.6-2.fc26.s390x libxshmfence-1.2-4.fc26.s390x expect-5.45-23.fc26.s390x perl-Thread-Queue-3.12-1.fc26.noarch perl-encoding-2.19-6.fc26.s390x keyutils-1.5.10-1.fc26.s390x gmp-devel-6.1.2-4.fc26.s390x enchant-1.6.0-16.fc26.s390x
python-gobject-base-3.24.1-1.fc26.s390x python3-enchant-1.6.10-1.fc26.noarch python-lockfile-0.11.0-6.fc26.noarch python2-pyparsing-2.1.10-3.fc26.noarch python2-lxml-4.1.1-1.fc26.s390x librados2-10.2.7-2.fc26.s390x trousers-lib-0.3.13-7.fc26.s390x libdatrie-0.2.9-4.fc26.s390x libsoup-2.58.2-1.fc26.s390x passwd-0.79-9.fc26.s390x bind99-libs-9.9.10-3.P3.fc26.s390x python3-rpm-4.13.0.2-1.fc26.s390x systemd-233-7.fc26.s390x virglrenderer-0.6.0-1.20170210git76b3da97b.fc26.s390x s390utils-ziomon-1.36.1-3.fc26.s390x s390utils-osasnmpd-1.36.1-3.fc26.s390x libXrandr-1.5.1-2.fc26.s390x libglvnd-glx-1.0.0-1.fc26.s390x texlive-ifxetex-svn19685.0.5-33.fc26.2.noarch texlive-psnfss-svn33946.9.2a-33.fc26.2.noarch texlive-dvipdfmx-def-svn40328-33.fc26.2.noarch texlive-natbib-svn20668.8.31b-33.fc26.2.noarch texlive-xdvi-bin-svn40750-33.20160520.fc26.2.s390x texlive-cm-svn32865.0-33.fc26.2.noarch texlive-beton-svn15878.0-33.fc26.2.noarch texlive-fpl-svn15878.1.002-33.fc26.2.noarch texlive-mflogo-svn38628-33.fc26.2.noarch texlive-texlive-docindex-svn41430-33.fc26.2.noarch texlive-luaotfload-bin-svn34647.0-33.20160520.fc26.2.noarch texlive-koma-script-svn41508-33.fc26.2.noarch texlive-pst-tree-svn24142.1.12-33.fc26.2.noarch texlive-breqn-svn38099.0.98d-33.fc26.2.noarch texlive-xetex-svn41438-33.fc26.2.noarch gstreamer1-plugins-bad-free-1.12.3-1.fc26.s390x xorg-x11-font-utils-7.5-33.fc26.s390x ghostscript-fonts-5.50-36.fc26.noarch libXext-devel-1.3.3-5.fc26.s390x libusbx-devel-1.0.21-2.fc26.s390x libglvnd-devel-1.0.0-1.fc26.s390x emacs-25.3-3.fc26.s390x alsa-lib-devel-1.1.4.1-1.fc26.s390x kbd-2.0.4-2.fc26.s390x dconf-0.26.0-2.fc26.s390x mc-4.8.19-5.fc26.s390x doxygen-1.8.13-9.fc26.s390x dpkg-1.18.24-1.fc26.s390x libtdb-1.3.13-1.fc26.s390x python2-pynacl-1.1.1-1.fc26.s390x perl-Filter-1.58-1.fc26.s390x python2-pip-9.0.1-11.fc26.noarch dnf-2.7.5-2.fc26.noarch bind-license-9.11.2-1.P1.fc26.noarch libtasn1-4.13-1.fc26.s390x cpp-7.3.1-2.fc26.s390x pkgconf-1.3.12-2.fc26.s390x 
python2-fedora-0.10.0-1.fc26.noarch cmake-filesystem-3.10.1-11.fc26.s390x python3-requests-kerberos-0.12.0-1.fc26.noarch libmicrohttpd-0.9.59-1.fc26.s390x GeoIP-GeoLite-data-2018.01-1.fc26.noarch python2-libs-2.7.14-7.fc26.s390x libidn2-2.0.4-3.fc26.s390x p11-kit-devel-0.23.10-1.fc26.s390x perl-Errno-1.25-396.fc26.s390x libdrm-2.4.90-2.fc26.s390x sssd-common-1.16.1-1.fc26.s390x boost-random-1.63.0-11.fc26.s390x urw-fonts-2.4-24.fc26.noarch ccache-3.3.6-1.fc26.s390x glibc-debuginfo-2.24-10.fc25.s390x dejavu-fonts-common-2.35-4.fc26.noarch bind99-license-9.9.10-3.P3.fc26.noarch ncurses-libs-6.0-8.20170212.fc26.s390x libpng-1.6.28-2.fc26.s390x libICE-1.0.9-9.fc26.s390x perl-Text-ParseWords-3.30-366.fc26.noarch libtool-ltdl-2.4.6-17.fc26.s390x libselinux-utils-2.6-7.fc26.s390x userspace-rcu-0.9.3-2.fc26.s390x perl-Class-Inspector-1.31-3.fc26.noarch keyutils-libs-devel-1.5.10-1.fc26.s390x isl-0.16.1-1.fc26.s390x libsecret-0.18.5-3.fc26.s390x compat-openssl10-1.0.2m-1.fc26.s390x python3-iniparse-0.4-24.fc26.noarch