Re: [Qemu-devel] [RFC PATCH 12/13] intel_iommu: do replay when context invalidate
> Before this one we only invalidate context cache when we receive context
> entry invalidations. However it's possible that the invalidation also
> contains a domain switch (only if cache-mode is enabled for vIOMMU). In
> that case we need to notify all the registered components about the new
> mapping.
>
> Signed-off-by: Peter Xu
> ---
>  hw/i386/intel_iommu.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index 2fcd7af..0220e63 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -1188,6 +1188,7 @@ static void vtd_context_device_invalidate(IntelIOMMUState *s,
>                  trace_vtd_inv_desc_cc_device(bus_n, (devfn_it >> 3) & 0x1f,
>                                               devfn_it & 3);
>                  vtd_as->context_cache_entry.context_cache_gen = 0;
> +                memory_region_iommu_replay_all(&vtd_as->iommu);

Hi Peter,

It looks like every device context invalidation would result in a replay,
even if the device is not an assigned device. Is it necessary to do a replay
for a virtual device?

Regards,
Yi L

>              }
>          }
>      }
> --
> 2.7.4
Re: [Qemu-devel] [RFC PATCH 11/13] intel_iommu: provide its own replay() callback
> The default replay() doesn't work for VT-d since VT-d will have a huge
> default memory region which covers address range 0-(2^64-1). This will
> normally cause an endless loop when the guest starts.
>
> The solution is simple - we don't walk over all the regions. Instead, we
> jump over the regions when we find that the page directories are empty.
> It'll greatly reduce the time to walk the whole region.
>
> To achieve this, we provide a page walk helper to do that, invoking the
> corresponding hook function when we find a page we are interested in.
> vtd_page_walk_level() is the core logic for the page walking. Its
> interface is designed to suit further use cases, e.g., to invalidate a
> range of addresses.
>
> Signed-off-by: Peter Xu
> ---
>  hw/i386/intel_iommu.c | 212 --
>  hw/i386/trace-events  | 8 ++
>  include/exec/memory.h | 2 +
>  3 files changed, 217 insertions(+), 5 deletions(-)
>
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index 46b8a2f..2fcd7af 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -620,6 +620,22 @@ static inline uint32_t vtd_get_agaw_from_context_entry(VTDContextEntry *ce)
>      return 30 + (ce->hi & VTD_CONTEXT_ENTRY_AW) * 9;
>  }
>
> +static inline uint64_t vtd_iova_limit(VTDContextEntry *ce)
> +{
> +    uint32_t ce_agaw = vtd_get_agaw_from_context_entry(ce);
> +    return 1ULL << MIN(ce_agaw, VTD_MGAW);
> +}
> +
> +/* Return true if IOVA passes range check, otherwise false. */
> +static inline bool vtd_iova_range_check(uint64_t iova, VTDContextEntry *ce)
> +{
> +    /*
> +     * Check if @iova is above 2^X-1, where X is the minimum of MGAW
> +     * in CAP_REG and AW in context-entry.
> +     */
> +    return !(iova & ~(vtd_iova_limit(ce) - 1));
> +}
> +
>  static const uint64_t vtd_paging_entry_rsvd_field[] = {
>      [0] = ~0ULL,
>      /* For not large page */
> @@ -656,13 +672,9 @@ static int vtd_iova_to_slpte(VTDContextEntry *ce, uint64_t iova,
>      uint32_t level = vtd_get_level_from_context_entry(ce);
>      uint32_t offset;
>      uint64_t slpte;
> -    uint32_t ce_agaw = vtd_get_agaw_from_context_entry(ce);
>      uint64_t access_right_check = 0;
>
> -    /* Check if @iova is above 2^X-1, where X is the minimum of MGAW
> -     * in CAP_REG and AW in context-entry.
> -     */
> -    if (iova & ~((1ULL << MIN(ce_agaw, VTD_MGAW)) - 1)) {
> +    if (!vtd_iova_range_check(iova, ce)) {
>          error_report("IOVA 0x%"PRIx64 " exceeds limits", iova);
>          return -VTD_FR_ADDR_BEYOND_MGAW;
>      }
> @@ -718,6 +730,166 @@ static int vtd_iova_to_slpte(VTDContextEntry *ce, uint64_t iova,
>      }
>  }
>
> +typedef int (*vtd_page_walk_hook)(IOMMUTLBEntry *entry, void *private);
> +
> +/**
> + * vtd_page_walk_level - walk over specific level for IOVA range
> + *
> + * @addr: base GPA addr to start the walk
> + * @start: IOVA range start address
> + * @end: IOVA range end address (start <= addr < end)
> + * @hook_fn: hook func to be called when detected page
> + * @private: private data to be passed into hook func
> + * @read: whether parent level has read permission
> + * @write: whether parent level has write permission
> + * @skipped: accumulated skipped ranges
> + * @notify_unmap: whether we should notify invalid entries
> + */
> +static int vtd_page_walk_level(dma_addr_t addr, uint64_t start,
> +                               uint64_t end, vtd_page_walk_hook hook_fn,
> +                               void *private, uint32_t level,
> +                               bool read, bool write, uint64_t *skipped,
> +                               bool notify_unmap)
> +{
> +    bool read_cur, write_cur, entry_valid;
> +    uint32_t offset;
> +    uint64_t slpte;
> +    uint64_t subpage_size, subpage_mask;
> +    IOMMUTLBEntry entry;
> +    uint64_t iova = start;
> +    uint64_t iova_next;
> +    uint64_t skipped_local = 0;
> +    int ret = 0;
> +
> +    trace_vtd_page_walk_level(addr, level, start, end);
> +
> +    subpage_size = 1ULL << vtd_slpt_level_shift(level);
> +    subpage_mask = vtd_slpt_level_page_mask(level);
> +
> +    while (iova < end) {
> +        iova_next = (iova & subpage_mask) + subpage_size;
> +
> +        offset = vtd_iova_level_offset(iova, level);
> +        slpte = vtd_get_slpte(addr, offset);
> +
> +        /*
> +         * When one of the following case happens, we assume the whole
> +         * range is invalid:
> +         *
> +         * 1. read block failed
> +         * 2. reserved area non-zero
> +         * 3. both read & write flag are not set
> +         */
> +
> +        if (slpte == (uint64_t)-1) {
> +            trace_vtd_page_walk_skip_read(iova, iova_next);
> +            skipped_local++;
> +            goto next;
> +        }
> +
> +        if (vtd_slpte_nonzero_rsvd(slpte, level)) {
> +            trace_vtd_page_walk_skip_reserve(iova, iova_next);
> +            skipped_local++;
> +            goto next;
> +        }
> +
>
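The skip-empty-directories idea described in the commit message can be sketched in a few lines. This is only a toy model, not the patch's code: page tables are nested Python dicts instead of guest memory, the names `level_shift` and `page_walk` are made up for the example, and the 9-bit/4 KiB layout is an assumption chosen to mirror a typical multi-level table.

```python
# Toy model only: tables are nested dicts keyed by entry index, not guest memory.
LEVEL_BITS = 9    # assumed: 512 entries per table level
PAGE_SHIFT = 12   # assumed: 4 KiB leaf pages


def level_shift(level):
    """Number of address bits covered by one entry at this level (1 = leaf)."""
    return PAGE_SHIFT + (level - 1) * LEVEL_BITS


def page_walk(table, start, end, level, hook):
    """Call hook(iova, size, target) for every mapped page in [start, end).

    An absent entry means the whole sub-range is empty, so it is jumped
    over in one step instead of being walked page by page.
    Returns the number of sub-ranges skipped this way.
    """
    size = 1 << level_shift(level)
    skipped = 0
    iova = start
    while iova < end:
        # First address of the next entry at this level.
        iova_next = (iova & ~(size - 1)) + size
        index = (iova >> level_shift(level)) & ((1 << LEVEL_BITS) - 1)
        entry = table.get(index)
        if entry is None:
            skipped += 1
        elif level == 1:
            hook(iova & ~(size - 1), size, entry)
        else:
            skipped += page_walk(entry, iova, min(iova_next, end),
                                 level - 1, hook)
        iova = iova_next
    return skipped
```

With a single mapped page in a 48-bit space, the walk visits about 2048 entries in total rather than 2^36 pages, which is the point of jumping over empty directories.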
[Qemu-devel] Reducing guest cpu usage
There is a program that I run inside of QEMU that doesn't use the virtual CPU very efficiently. It causes QEMU to use 100% of the guest's CPU time. I was wondering if there were a way to reduce the amount of host CPU time that a guest CPU can use? This feature would help prevent laptops from heating up when running QEMU.
Re: [Qemu-devel] [PATCH 0/4] RFC: A VFIO based block driver for NVMe device
> From: Fam Zheng
> Sent: Wednesday, December 21, 2016 12:32 AM
>
> This series adds a new protocol driver that is intended to achieve about 20%
> better performance for latency bound workloads (i.e. synchronous I/O) than
> linux-aio when guest is exclusively accessing a NVMe device, by talking to the
> device directly instead of through kernel file system layers and its NVMe
> driver.
>

Curious... if the NVMe device is exclusively owned by the guest, why not pass
it through directly to the guest? Is it a tradeoff between performance (better
than linux-aio) and composability (snapshot and live migration, which are not
supported by direct passthrough)?

Thanks
Kevin
[Qemu-devel] [PATCH v2] doc/pcie: correct command line examples
Nit picking: Multi-function PCI Express Root Ports should mean that
'addr' property is mandatory, and slot is optional because it defaults to
0, and 'chassis' is mandatory for 2nd & 3rd root port because it defaults
to 0 too.

Bonus: fix a typo(2->3)

Signed-off-by: Cao jin
Reviewed-by: Marcel Apfelbaum
---
 docs/pcie.txt | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/docs/pcie.txt b/docs/pcie.txt
index 9fb20aaed9f4..5bada24a15ab 100644
--- a/docs/pcie.txt
+++ b/docs/pcie.txt
@@ -110,18 +110,18 @@ Plug only PCI Express devices into PCI Express Ports.
       -device ioh3420,id=root_port1,chassis=x,slot=y[,bus=pcie.0][,addr=z] \
       -device ,bus=root_port1
 2.2.2 Using multi-function PCI Express Root Ports:
-      -device ioh3420,id=root_port1,multifunction=on,chassis=x,slot=y[,bus=pcie.0][,addr=z.0] \
-      -device ioh3420,id=root_port2,chassis=x1,slot=y1[,bus=pcie.0][,addr=z.1] \
-      -device ioh3420,id=root_port3,chassis=x2,slot=y2[,bus=pcie.0][,addr=z.2] \
-2.2.2 Plugging a PCI Express device into a Switch:
+      -device ioh3420,id=root_port1,multifunction=on,chassis=x,addr=z.0[,slot=y][,bus=pcie.0] \
+      -device ioh3420,id=root_port2,chassis=x1,addr=z.1[,slot=y1][,bus=pcie.0] \
+      -device ioh3420,id=root_port3,chassis=x2,addr=z.2[,slot=y2][,bus=pcie.0] \
+2.2.3 Plugging a PCI Express device into a Switch:
       -device ioh3420,id=root_port1,chassis=x,slot=y[,bus=pcie.0][,addr=z] \
       -device x3130-upstream,id=upstream_port1,bus=root_port1[,addr=x] \
       -device xio3130-downstream,id=downstream_port1,bus=upstream_port1,chassis=x1,slot=y1[,addr=z1]] \
       -device ,bus=downstream_port1

 Notes:
-    - (slot, chassis) pair is mandatory and must be
-       unique for each PCI Express Root Port.
+    - (slot, chassis) pair is mandatory and must be unique for each
+      PCI Express Root Port. slot defaults to 0 when not specified.
     - 'addr' parameter can be 0 for all the examples above.
--
2.1.0
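As a rough illustration of the corrected rule (not part of the patch), the multi-function example can be produced by a small helper. The function names `root_port_arg` and `multifunction_root_ports` and the concrete chassis numbers are made up for this sketch; only the option names (`ioh3420`, `id`, `multifunction`, `chassis`, `addr`, `slot`, `bus`) come from docs/pcie.txt.

```python
def root_port_arg(port_id, chassis, addr, slot=None, bus=None,
                  multifunction=False):
    """Build one '-device ioh3420,...' argument (hypothetical helper).

    addr is required because every function needs its own z.N address;
    slot and bus are optional and left out when not given.
    """
    opts = ["ioh3420", "id=%s" % port_id]
    if multifunction:
        opts.append("multifunction=on")
    opts.append("chassis=%s" % chassis)
    opts.append("addr=%s" % addr)
    if slot is not None:
        opts.append("slot=%s" % slot)
    if bus is not None:
        opts.append("bus=%s" % bus)
    return "-device " + ",".join(opts)


def multifunction_root_ports(device_number, count=3):
    """Return arguments for `count` root ports sharing one device number.

    Each port gets a distinct chassis so that the (slot, chassis) pair
    stays unique even though slot is left at its default.
    """
    return [
        root_port_arg("root_port%d" % (fn + 1), chassis=fn + 1,
                      addr="%s.%d" % (device_number, fn),
                      multifunction=(fn == 0))
        for fn in range(count)
    ]
```

Only function 0 carries multifunction=on, matching the example in the patch.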
Re: [Qemu-devel] [PATCH] doc/pcie: correct command line examples
On 12/28/2016 11:21 PM, Andrew Jones wrote:
> On Wed, Dec 28, 2016 at 03:24:30PM +0200, Marcel Apfelbaum wrote:
>> On 12/27/2016 09:40 AM, Cao jin wrote:
>>> Nit picking: Multi-function PCI Express Root Ports should mean that
>>> 'addr' property is mandatory, and slot is optional because it is default
>>> to 0, and 'chassis' is mandatory for 2nd & 3rd root port because it is
>>> default to 0 too.
>>>
>>> Bonus: fix a typo(2->3)
>>> Signed-off-by: Cao jin
>>> ---
>>>  docs/pcie.txt | 12 ++--
>>>  1 file changed, 6 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/docs/pcie.txt b/docs/pcie.txt
>>> index 9fb20aaed9f4..54f05eaa71dc 100644
>>> --- a/docs/pcie.txt
>>> +++ b/docs/pcie.txt
>>> @@ -110,18 +110,18 @@ Plug only PCI Express devices into PCI Express Ports.
>>>       -device ioh3420,id=root_port1,chassis=x,slot=y[,bus=pcie.0][,addr=z] \
>>>       -device ,bus=root_port1
>>>  2.2.2 Using multi-function PCI Express Root Ports:
>>> -      -device ioh3420,id=root_port1,multifunction=on,chassis=x,slot=y[,bus=pcie.0][,addr=z.0] \
>>> -      -device ioh3420,id=root_port2,chassis=x1,slot=y1[,bus=pcie.0][,addr=z.1] \
>>> -      -device ioh3420,id=root_port3,chassis=x2,slot=y2[,bus=pcie.0][,addr=z.2] \
>>> -2.2.2 Plugging a PCI Express device into a Switch:
>>> +      -device ioh3420,id=root_port1,multifunction=on,chassis=x,addr=z.0[,slot=y][,bus=pcie.0] \
>>> +      -device ioh3420,id=root_port2,chassis=x1,addr=z.1[,slot=y1][,bus=pcie.0] \
>>> +      -device ioh3420,id=root_port3,chassis=x2,addr=z.2[,slot=y2][,bus=pcie.0] \
>>> +2.2.3 Plugging a PCI Express device into a Switch:
>>>       -device ioh3420,id=root_port1,chassis=x,slot=y[,bus=pcie.0][,addr=z] \
>>>       -device x3130-upstream,id=upstream_port1,bus=root_port1[,addr=x] \
>>>       -device xio3130-downstream,id=downstream_port1,bus=upstream_port1,chassis=x1,slot=y1[,addr=z1]] \
>>>       -device ,bus=downstream_port1
>>>
>>>  Notes:
>>> -    - (slot, chassis) pair is mandatory and must be
>>> -       unique for each PCI Express Root Port.
>>> +    - (slot, chassis) pair is mandatory and must be unique for each
>>> +      PCI Express Root Port. slot is default to 0 when doesn't specify it.
>
> Please rewrite last sentence as
>
>   slot defaults to 0 when not specified.

Thanks for pointing it out, v2 is on the way.

--
Sincerely,
Cao jin

>
>>>     - 'addr' parameter can be 0 for all the examples above.
>>>
>>>
>>>
>>
>> Reviewed-by: Marcel Apfelbaum
>>
>> Thanks,
>> Marcel
>>
>
> Thanks,
> drew
>
>
> .
>
Re: [Qemu-devel] [PULL v2 00/12] M68k for 2.9 patches
On 27 December 2016 at 17:53, Laurent Vivier wrote:
> The following changes since commit e5fdf663cf01f824f0e29701551a2c29554d80a4:
>
>   Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20161223' into
>   staging (2016-12-27 14:56:47 +)
>
> are available in the git repository at:
>
>   git://github.com/vivier/qemu-m68k.git tags/m68k-for-2.9-pull-request
>
> for you to fetch changes up to 2b5e2170678af36df48ab4b05dff81fe40b41a65:
>
>   target-m68k: free TCG variables that are not (2016-12-27 18:28:40 +0100)
>
>
> A series of patches queued since the beginning of the freeze period.
> Compared to the m68k-for-2.9 branch, 3 patches implementing bitfield
> ops are missing as they need new TCG functions. They will be pushed
> later.
> v2: remove warning for unused variables.
>

Applied, thanks.

-- PMM
Re: [Qemu-devel] QEMU Advent Calendar - Final day
On Dec 24, 2016 9:04 AM, "Thomas Huth" wrote:
> Hi all,
>
> the last door of the QEMU advent calendar 2016 can now be opened
> (http://www.qemu-advent-calendar.org/2016/index.html#day-24), so we'd
> now like to say thank you to everybody who has contributed to or
> followed the advent calendar! It was fun to come up with all these disk
> images and we hope that you've also found some surprises that you
> enjoyed.

Great job! It's a lot of work but brings joy to many people in the wider
community.

Stefan
[Qemu-devel] [PATCH v3] build: include sys/sysmacros.h for major() and minor()
The definitions of the major() and minor() macros are moving within glibc to
<sys/sysmacros.h>. Include this header when it is available to avoid the
following sorts of build-stopping messages:

qga/commands-posix.c: In function ‘dev_major_minor’:
qga/commands-posix.c:656:13: error: In the GNU C Library, "major" is defined
 by <sys/sysmacros.h>. For historical compatibility, it is
 currently defined by <sys/types.h> as well, but we plan to
 remove this soon. To use "major", include <sys/sysmacros.h>
 directly. If you did not intend to use a system-defined macro
 "major", you should undefine it after including <sys/types.h>. [-Werror]
         *devmajor = major(st.st_rdev);
             ^~
qga/commands-posix.c:657:13: error: In the GNU C Library, "minor" is defined
 by <sys/sysmacros.h>. For historical compatibility, it is
 currently defined by <sys/types.h> as well, but we plan to
 remove this soon. To use "minor", include <sys/sysmacros.h>
 directly. If you did not intend to use a system-defined macro
 "minor", you should undefine it after including <sys/types.h>. [-Werror]
         *devminor = minor(st.st_rdev);
             ^~

The additional include allows the build to complete on Fedora 26 (Rawhide)
with glibc version 2.24.90.

Signed-off-by: Christopher Covington
---
 configure                 | 18 ++
 include/sysemu/os-posix.h | 4
 2 files changed, 22 insertions(+)

diff --git a/configure b/configure
index 218df87d21..58a33c71ad 100755
--- a/configure
+++ b/configure
@@ -4746,6 +4746,20 @@ if test "$modules" = "yes" && test "$LD_REL_FLAGS" = ""; then
 fi

 ##
+# check for sysmacros.h
+
+have_sysmacros=no
+cat > $TMPC << EOF
+#include <sys/sysmacros.h>
+int main(void) {
+    return makedev(0, 0);
+}
+EOF
+if compile_prog "" "" ; then
+    have_sysmacros=yes
+fi
+
+##
 # End of CC checks
 # After here, no more $cc or $ld runs

@@ -5721,6 +5735,10 @@ if test "$have_af_vsock" = "yes" ; then
   echo "CONFIG_AF_VSOCK=y" >> $config_host_mak
 fi

+if test "$have_sysmacros" = "yes" ; then
+  echo "CONFIG_SYSMACROS=y" >> $config_host_mak
+fi
+
 # Hold two types of flag:
 #   CONFIG_THREAD_SETNAME_BYTHREAD - we've got a way of setting the name on
 #                                    a thread we have a handle to
diff --git a/include/sysemu/os-posix.h b/include/sysemu/os-posix.h
index b0a6c0695b..900bdcb45a 100644
--- a/include/sysemu/os-posix.h
+++ b/include/sysemu/os-posix.h
@@ -34,6 +34,10 @@
 #include
 #include

+#ifdef CONFIG_SYSMACROS
+#include <sys/sysmacros.h>
+#endif
+
 void os_set_line_buffering(void);
 void os_set_proc_name(const char *s);
 void os_setup_signal_handling(void);
--
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code
Aurora Forum, a Linux Foundation Collaborative Project.
Re: [Qemu-devel] [PATCH v2] build: include sys/sysmacros.h for major() and minor()
Hi Eric,

On 12/28/2016 11:10 AM, Eric Blake wrote:
> On 12/28/2016 08:53 AM, Christopher Covington wrote:
>> The definition of the major() and minor() macros are moving within glibc to
>> <sys/sysmacros.h>. Include this header to avoid the following sorts of
>> build-stopping messages:
>>
>
>> The additional include allows the build to complete on Fedora 26 (Rawhide)
>> with glibc version 2.24.90.
>>
>> Signed-off-by: Christopher Covington
>> ---
>>  include/sysemu/os-posix.h | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/include/sysemu/os-posix.h b/include/sysemu/os-posix.h
>> index b0a6c0695b..772d58f7ed 100644
>> --- a/include/sysemu/os-posix.h
>> +++ b/include/sysemu/os-posix.h
>> @@ -28,6 +28,7 @@
>>
>>  #include
>>  #include
>> +#include <sys/sysmacros.h>
>
> I repeat what I said on v1:
>
> Works for glibc; but <sys/sysmacros.h> is non-standard and not present
> on some other systems, so this may fail to build elsewhere.

I read your response to v1 but got stuck on this "some other systems"
statement, which seems too vague for me to act on. I see the following
operating systems checked in configure: Cygwin, mingw32, GNU/kFreeBSD,
FreeBSD, DragonFly, NetBSD, OpenBSD, Darwin, SunOS, AIX, Haiku, and Linux.
But I'm really not sure what list of C libraries and corresponding mkdev.h
versus sysmacros.h versus types.h usage this translates to.

> You'll probably need a configure probe.

I'm testing that now and will hopefully send it out as v3 shortly.

> Autoconf also says that some platforms have <sys/mkdev.h> instead of
> <sys/sysmacros.h> (per its AC_HEADER_MAJOR macro).

`git grep mkdev` returns no results for me, so I conclude that no currently
supported OS/libc requires it.

In case anyone wants to work around these messages, I'd like to highlight
the --disable-werror option to ./configure. If I had known about it this
morning, I probably would be happily authoring other changes right now.

Thanks,
Cov

--
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code
Aurora Forum, a Linux Foundation Collaborative Project.
[Qemu-devel] [PATCH] target-x86:Add GDB XML register description support
This patch implements XML target description support for the X86 and X86-64
architectures in the GDB stub, the same way as for ARM and PowerPC:

- gdb-xml/32bit-core.xml & gdb-xml/64bit-core.xml: Add the XML target
  description files; these files are taken from the GDB source code.
- configure: Define gdb_xml_files for X86 targets.
- target/i386/cpu.c: Define gdb_core_xml_file and gdb_arch_name to add
  XML awareness for this architecture, and modify gdb_num_core_regs to
  fit the number of registers defined in each XML file.

Signed-off-by: Abdallah Bouassida
---
 configure              | 2 ++
 gdb-xml/32bit-core.xml | 65
 gdb-xml/64bit-core.xml | 73 ++
 target/i386/cpu.c      | 21 ---
 4 files changed, 157 insertions(+), 4 deletions(-)
 create mode 100644 gdb-xml/32bit-core.xml
 create mode 100644 gdb-xml/64bit-core.xml

diff --git a/configure b/configure
index 218df87..b701d1e 100755
--- a/configure
+++ b/configure
@@ -5890,9 +5890,11 @@ TARGET_ABI_DIR=""

 case "$target_name" in
   i386)
+    gdb_xml_files="32bit-core.xml"
   ;;
   x86_64)
     TARGET_BASE_ARCH=i386
+    gdb_xml_files="64bit-core.xml"
   ;;
   alpha)
   ;;
diff --git a/gdb-xml/32bit-core.xml b/gdb-xml/32bit-core.xml
new file mode 100644
index 000..7aeeeca
--- /dev/null
+++ b/gdb-xml/32bit-core.xml
@@ -0,0 +1,65 @@
+[65 lines of XML not preserved in this archive copy]
diff --git a/gdb-xml/64bit-core.xml b/gdb-xml/64bit-core.xml
new file mode 100644
index 000..5088d84
--- /dev/null
+++ b/gdb-xml/64bit-core.xml
@@ -0,0 +1,73 @@
+[73 lines of XML not preserved in this archive copy]
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index b0640f1..d712e8b 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -2371,6 +2371,15 @@ static void x86_cpu_load_def(X86CPU *cpu, X86CPUDefinition *def, Error **errp)

 }

+static gchar *x86_gdb_arch_name(CPUState *cs)
+{
+#ifdef TARGET_X86_64
+    return g_strdup("i386:x86-64");
+#else
+    return g_strdup("i386");
+#endif
+}
+
 X86CPU *cpu_x86_init(const char *cpu_model)
 {
     return X86_CPU(cpu_generic_init(TYPE_X86_CPU, cpu_model));
@@ -3720,10 +3729,14 @@ static void x86_cpu_common_class_init(ObjectClass *oc, void *data)
     cc->write_elf32_qemunote = x86_cpu_write_elf32_qemunote;
     cc->vmsd = &vmstate_x86_cpu;
 #endif
-    /* CPU_NB_REGS * 2 = general regs + xmm regs
-     * 25 = eip, eflags, 6 seg regs, st[0-7], fctrl,...,fop, mxcsr.
-     */
-    cc->gdb_num_core_regs = CPU_NB_REGS * 2 + 25;
+    cc->gdb_arch_name = x86_gdb_arch_name;
+#ifdef TARGET_X86_64
+    cc->gdb_core_xml_file = "64bit-core.xml";
+    cc->gdb_num_core_regs = 40;
+#else
+    cc->gdb_core_xml_file = "32bit-core.xml";
+    cc->gdb_num_core_regs = 32;
+#endif
 #ifndef CONFIG_USER_ONLY
     cc->debug_excp_handler = breakpoint_handler;
 #endif
--
1.9.1
Re: [Qemu-devel] Can qemu reopen image files?
Hi Eric,

There is something I don't understand.

We are doing: "virsh save", "qemu-img convert", "qemu-img rebase" and
"virsh restore".
We only touch the backing chain by doing the rebase while the VM is down.
Is there any chance this procedure can destroy data?
If so, is there any difference between shutting down and just
saving/restoring the VM? Maybe save/restore keeps a cache?

Best regards,
Christopher.

On 19-Dec-16 13:24, Christopher Pereira wrote:
> Hi Eric,
>
> Thanks for your great answer.
>
> On 19-Dec-16 12:48, Eric Blake wrote:
>>> Then we do the rebase while the VM is suspended to make sure the image
>>> files are reopened.
>> That part is where you are liable to break things. Qemu does NOT have a
>> graceful way to reopen the backing chain, so rebasing snap3 to point to
>> snap2' behind qemu's back is asking for problems. Since qemu may be
>> caching things it has already learned about snap2, you have invalidated
>> that cached data by making snap3 point to snap2', but have no way to
>> force qemu to reread the backing chain to start reading from snap2'.
>
> We are actually doing a save, rebase and restore to reopen the backing
> chain. We only touch files (rebase) while the VM is down.
> Can you please confirm this is 100% safe?
>
>> Or, if you don't want to merge into "base'", you can use block-stream to
>> merge the other direction, so that "base <- snap1 <- snap2" is converted
>> into "snap2'" - but that depends on patches that were only barely added
>> in qemu 2.8 (intermediate block-commit has existed a lot longer than
>> intermediate block-stream). But the point remains that you are still
>> using qemu to do the work, and therefore with no external qemu-img
>> process interfering with the chain, you don't need any guest downtime or
>> any risk of breaking qemu operation by invalidating data it may have
>> cached.
>
> Right. Since images are backed up remotely, we don't want to merge into
> base nor touch the backing chain at all (only the active snapshot should
> be modified). This is to keep things simple and avoid re-syncing images
> (remote backups).
> Besides, we don't want to merge the whole backing chain, but an
> intermediate point, so it seems that the clean way is to use the
> "intermediate block-stream" feature.
> We didn't try it, because when we researched we got the impression that
> the patches were not stable yet or not included in the qemu versions
> shipped with CentOS, so we went with 'qemu-img convert' because we needed
> something known, simple and stable (we are dealing with critical
> information for gov. orgs.).
>
>> If block-commit and block-stream don't have enough power to do what you
>> want, then we should patch them to expose that power, rather than
>> worrying about how to use qemu-img to modify the backing chain behind
>> qemu's back.
>
> "intermediate block-stream" seems to be the right solution for our use
> case. Does it also allow QCOW2 compression?
> Compression is interesting, especially when files are sync'ed via
> network.
[Qemu-devel] CMSIS SVD based peripheral definitions
CMSIS SVD
---------

The latest release of GNU ARM Eclipse QEMU (2.8.0-20161227) introduced a
new technology for implementing peripherals, based on standard CMSIS SVD
definitions (http://www.keil.com/pack/doc/CMSIS/SVD/html/index.html).

The SVD files are large XML files produced by the silicon vendors, and
generally are considered the final hardware reference for the Cortex-M
devices, so they are expected to provide the most accurate peripheral
emulation.

For convenience, the original SVD files are converted to JSON files,
which are generally easier to parse. There is one file for each
sub-family; the files currently used by GNU ARM Eclipse QEMU are grouped
in the devices folders:

https://github.com/gnuarmeclipse/qemu/tree/gnuarmeclipse-dev/gnuarmeclipse/devices

Please note that some devices may include multiple instances of similar
peripherals, for example multiple timers, with each instance slightly
different from the others. The SVD files include all these differences,
so strictly following the content of the SVD files is mandatory;
extracting a definition common to all instances may seem attractive, but
it is not realistic, since it may not be accurate for all instances.

Peripheral/register/bitfield
----------------------------

The basic objects used to implement peripherals are the 'registers'; a
peripheral is actually an array of registers, each with its value.

For convenience, registers can be viewed as collections of bitfields;
bitfield objects do not have their own values. Reading a bitfield reads
the parent register, whose value is masked, shifted and finally returned.

Accessing bitfields is quite straightforward:

    const char *enabling_bit_name = "/machine/mcu/stm32/RCC/AHB1ENR/GPIOAEN";
    state->enabling_bit = OBJECT(cm_device_by_name(enabling_bit_name));

    // ...

    if (register_bitfield_is_non_zero(state->enabling_bit)) {
        // ...
    }

Generated code
--------------

In addition to using vendor SVD files, GNU ARM Eclipse QEMU goes one step
further, by generating most of the peripheral code, to the point that new
peripherals can be added simply by adding the generated files to the
project.

Examples of the files currently used are in subfolders of the
devices/support folder:

https://github.com/gnuarmeclipse/qemu/tree/gnuarmeclipse-dev/gnuarmeclipse/devices/support

As per the current STM32 implementation, to avoid redundancy, each
peripheral file includes definitions for all families; adding a new
device implies generating the code for the new device sub-family, and
copying and pasting the required code into the peripheral implementation.

An example of such a peripheral implementation is the SYSCFG:

https://github.com/gnuarmeclipse/qemu/blob/gnuarmeclipse-dev/hw/cortexm/stm32/syscfg.c

Supported devices
-----------------

GNU ARM Eclipse QEMU 2.8 supports the following boards:

  Maple                LeafLab Arduino-style STM32 microcontroller board (r5)
  NUCLEO-F103RB        ST Nucleo Development Board for STM32 F1 series
  NUCLEO-F411RE        ST Nucleo Development Board for STM32 F4 series
  NetduinoGo           Netduino GoBus Development Board with STM32F4
  NetduinoPlus2        Netduino Development Board with STM32F4
  OLIMEXINO-STM32      Olimex Maple (Arduino-like) Development Board
  STM32-E407           Olimex Development Board for STM32F407ZGT6
  STM32-H103           Olimex Header Board for STM32F103RBT6
  STM32-P103           Olimex Prototype Board for STM32F103RBT6
  STM32-P107           Olimex Prototype Board for STM32F107VCT6
  STM32F0-Discovery    ST Discovery kit for STM32F051 lines
  STM32F4-Discovery    ST Discovery kit for STM32F407/417 lines
  STM32F429I-Discovery ST Discovery kit for STM32F429/439 lines

Supported MCUs:

  STM32F051R8
  STM32F103RB
  STM32F107VC
  STM32F405RG
  STM32F407VG
  STM32F407ZG
  STM32F411RE
  STM32F429ZI

Functionally, the boards include animated LEDs and active push buttons
(reset and user).

Test projects (blinky) for all supported boards are available from

https://github.com/gnuarmeclipse/eclipse-qemu-test-projects

Conclusion
----------

Although implementing peripherals remains a challenge, using the SVD
definitions, plus the tools to generate code, significantly improves and
simplifies the process.

Unfortunately, SVD files are available only for Cortex-M devices, but I
think that, when not available, creating the JSON files and using the
automated code generator is still easier than manually implementing the
peripherals.

Personally I plan to use this technology to re-define the system Cortex-M
devices, which, right now, are improperly implemented.

Regards,

Liviu
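The register/bitfield relationship described above (a bitfield holds no value of its own; reads go to the parent register, which is masked and shifted) can be modelled in a few lines. This is only a sketch under assumptions: the classes `Register` and `Bitfield` and their methods are invented for illustration and are not the GNU ARM Eclipse QEMU API, and the bit positions in the usage note are example values.

```python
class Register:
    """A peripheral register: the only object that stores a value."""

    def __init__(self, name, value=0):
        self.name = name
        self.value = value


class Bitfield:
    """A view onto some bits of a parent register; stores no value itself."""

    def __init__(self, register, first_bit, width):
        self.register = register
        self.shift = first_bit
        self.mask = ((1 << width) - 1) << first_bit

    def read(self):
        # Read the parent register, mask out the field, shift it down.
        return (self.register.value & self.mask) >> self.shift

    def is_non_zero(self):
        return self.read() != 0
```

For instance, with `ahb1enr = Register("AHB1ENR", 0b101)` and a one-bit field `Bitfield(ahb1enr, 0, 1)`, `is_non_zero()` follows whatever is later written to `ahb1enr.value`, because the field never caches anything.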
[Qemu-devel] [PATCH v6 5/7] trace: [tcg] Do not generate TCG code to trace dynamically-disabled events
If an event is dynamically disabled, the TCG code that calls the
execution-time tracer is not generated.

Removes the overheads of execution-time tracers for dynamically disabled
events. As a bonus, also avoids checking the event state when the
execution-time tracer is called from TCG-generated code (since otherwise
TCG would simply not call it).

Signed-off-by: Lluís Vilanova
---
 scripts/tracetool/__init__.py            |    1 +
 scripts/tracetool/format/h.py            |   24 ++--
 scripts/tracetool/format/tcg_h.py        |   19 ---
 scripts/tracetool/format/tcg_helper_c.py |    3 ++-
 4 files changed, 37 insertions(+), 10 deletions(-)

diff --git a/scripts/tracetool/__init__.py b/scripts/tracetool/__init__.py
index 365446fa53..63168ccdf0 100644
--- a/scripts/tracetool/__init__.py
+++ b/scripts/tracetool/__init__.py
@@ -264,6 +264,7 @@ class Event(object):
         return self._FMT.findall(self.fmt)

     QEMU_TRACE = "trace_%(name)s"
+    QEMU_TRACE_NOCHECK = "_nocheck__" + QEMU_TRACE
     QEMU_TRACE_TCG = QEMU_TRACE + "_tcg"
     QEMU_DSTATE = "_TRACE_%(NAME)s_DSTATE"
     QEMU_EVENT = "_TRACE_%(NAME)s_EVENT"
diff --git a/scripts/tracetool/format/h.py b/scripts/tracetool/format/h.py
index 3682f4e6a8..a78e50ef35 100644
--- a/scripts/tracetool/format/h.py
+++ b/scripts/tracetool/format/h.py
@@ -49,6 +49,19 @@ def generate(events, backend, group):
     backend.generate_begin(events, group)

     for e in events:
+        # tracer without checks
+        out('',
+            'static inline void %(api)s(%(args)s)',
+            '{',
+            api=e.api(e.QEMU_TRACE_NOCHECK),
+            args=e.args)
+
+        if "disable" not in e.properties:
+            backend.generate(e, group)
+
+        out('}')
+
+        # tracer wrapper with checks (per-vCPU tracing)
         if "vcpu" in e.properties:
             trace_cpu = next(iter(e.args))[1]
             cond = "trace_event_get_vcpu_state(%(cpu)s,"\
@@ -63,16 +76,15 @@ def generate(events, backend, group):
             'static inline void %(api)s(%(args)s)',
             '{',
             '    if (%(cond)s) {',
+            '        %(api_nocheck)s(%(names)s);',
+            '    }',
+            '}',
             api=e.api(),
+            api_nocheck=e.api(e.QEMU_TRACE_NOCHECK),
             args=e.args,
+            names=", ".join(e.args.names()),
             cond=cond)

-        if "disable" not in e.properties:
-            backend.generate(e, group)
-
-        out('}',
-            '}')
-
     backend.generate_end(events, group)

     out('#endif /* TRACE_%s_GENERATED_TRACERS_H */' % group.upper())
diff --git a/scripts/tracetool/format/tcg_h.py b/scripts/tracetool/format/tcg_h.py
index 5f213f6cba..71b5c09432 100644
--- a/scripts/tracetool/format/tcg_h.py
+++ b/scripts/tracetool/format/tcg_h.py
@@ -41,7 +41,7 @@ def generate(events, backend, group):

     for e in events:
         # just keep one of them
-        if "tcg-trans" not in e.properties:
+        if "tcg-exec" not in e.properties:
             continue

         out('static inline void %(name_tcg)s(%(args)s)',
@@ -53,12 +53,25 @@ def generate(events, backend, group):
             args_trans = e.original.event_trans.args
             args_exec = tracetool.vcpu.transform_args(
                 "tcg_helper_c", e.original.event_exec, "wrapper")
+            if "vcpu" in e.properties:
+                trace_cpu = e.args.names()[0]
+                cond = "trace_event_get_vcpu_state(%(cpu)s,"\
+                       " TRACE_%(id)s)"\
+                       % dict(
+                           cpu=trace_cpu,
+                           id=e.original.event_exec.name.upper())
+            else:
+                cond = "true"
+
             out('    %(name_trans)s(%(argnames_trans)s);',
-                '    gen_helper_%(name_exec)s(%(argnames_exec)s);',
+                '    if (%(cond)s) {',
+                '        gen_helper_%(name_exec)s(%(argnames_exec)s);',
+                '    }',
                 name_trans=e.original.event_trans.api(e.QEMU_TRACE),
                 name_exec=e.original.event_exec.api(e.QEMU_TRACE),
                 argnames_trans=", ".join(args_trans.names()),
-                argnames_exec=", ".join(args_exec.names()))
+                argnames_exec=", ".join(args_exec.names()),
+                cond=cond)

         out('}')

diff --git a/scripts/tracetool/format/tcg_helper_c.py b/scripts/tracetool/format/tcg_helper_c.py
index cc26e03008..c2a05d756c 100644
--- a/scripts/tracetool/format/tcg_helper_c.py
+++ b/scripts/tracetool/format/tcg_helper_c.py
@@ -66,10 +66,11 @@ def generate(events, backend, group):

         out('void %(name_tcg)s(%(args_api)s)',
             '{',
+            # NOTE: the check was already performed at TCG-generation time
             '    %(name)s(%(args_call)s);',
             '}',
             name_tcg="helper_%s_proxy" % e.api(),
-
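To make the split between the unchecked tracer and its checking wrapper easier to see, here is a minimal stand-alone sketch of what the generated C could look like. It is not tracetool itself: `generate_tracer` and its parameters are hypothetical, and only the `trace_` and `_nocheck__` naming follows the patch.

```python
def generate_tracer(name, args, cond, body):
    """Return C source lines for an unchecked tracer plus a checked wrapper.

    name: event name; args: list of (ctype, argname) pairs;
    cond: C expression guarding the call; body: C statement doing the tracing.
    All of these are illustrative inputs, not tracetool's Event objects.
    """
    params = ", ".join("%s %s" % (ctype, arg) for ctype, arg in args) or "void"
    names = ", ".join(arg for _, arg in args)
    return [
        # Unchecked variant: callers must already know the event is enabled.
        "static inline void _nocheck__trace_%s(%s)" % (name, params),
        "{",
        "    %s" % body,
        "}",
        "",
        # Checked wrapper: tests the event state, then delegates.
        "static inline void trace_%s(%s)" % (name, params),
        "{",
        "    if (%s) {" % cond,
        "        _nocheck__trace_%s(%s);" % (name, names),
        "    }",
        "}",
    ]
```

Code that has already tested the event state (as the TCG translation path does in this series) can call the `_nocheck__` variant directly and skip the second test.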
[Qemu-devel] [PATCH v6 6/7] trace: [tcg, trivial] Re-align generated code
Last patch removed a nesting level in generated code. Re-align all code
generated by backends to be 4-column aligned.

Signed-off-by: Lluís Vilanova
---
 scripts/tracetool/backend/dtrace.py |    2 +-
 scripts/tracetool/backend/ftrace.py |   20 ++--
 scripts/tracetool/backend/log.py    |   17 +
 scripts/tracetool/backend/simple.py |    2 +-
 scripts/tracetool/backend/syslog.py |    6 +++---
 scripts/tracetool/backend/ust.py    |    2 +-
 6 files changed, 25 insertions(+), 24 deletions(-)

diff --git a/scripts/tracetool/backend/dtrace.py b/scripts/tracetool/backend/dtrace.py
index 79505c6b1a..b3a8645bf0 100644
--- a/scripts/tracetool/backend/dtrace.py
+++ b/scripts/tracetool/backend/dtrace.py
@@ -41,6 +41,6 @@ def generate_h_begin(events, group):


 def generate_h(event, group):
-    out('        QEMU_%(uppername)s(%(argnames)s);',
+    out('    QEMU_%(uppername)s(%(argnames)s);',
         uppername=event.name.upper(),
         argnames=", ".join(event.args.names()))
diff --git a/scripts/tracetool/backend/ftrace.py b/scripts/tracetool/backend/ftrace.py
index db9fe7ad57..dd0eda4441 100644
--- a/scripts/tracetool/backend/ftrace.py
+++ b/scripts/tracetool/backend/ftrace.py
@@ -29,17 +29,17 @@ def generate_h(event, group):
     if len(event.args) > 0:
         argnames = ", " + argnames

-    out('        {',
-        '            char ftrace_buf[MAX_TRACE_STRLEN];',
-        '            int unused __attribute__ ((unused));',
-        '            int trlen;',
-        '            if (trace_event_get_state(%(event_id)s)) {',
-        '                trlen = snprintf(ftrace_buf, MAX_TRACE_STRLEN,',
-        '                                 "%(name)s " %(fmt)s "\\n" %(argnames)s);',
-        '                trlen = MIN(trlen, MAX_TRACE_STRLEN - 1);',
-        '                unused = write(trace_marker_fd, ftrace_buf, trlen);',
-        '            }',
+    out('    {',
+        '        char ftrace_buf[MAX_TRACE_STRLEN];',
+        '        int unused __attribute__ ((unused));',
+        '        int trlen;',
+        '        if (trace_event_get_state(%(event_id)s)) {',
+        '            trlen = snprintf(ftrace_buf, MAX_TRACE_STRLEN,',
+        '                             "%(name)s " %(fmt)s "\\n" %(argnames)s);',
+        '            trlen = MIN(trlen, MAX_TRACE_STRLEN - 1);',
+        '            unused = write(trace_marker_fd, ftrace_buf, trlen);',
         '        }',
+        '    }',
         name=event.name,
         args=event.args,
         event_id="TRACE_" + event.name.upper(),
diff --git a/scripts/tracetool/backend/log.py b/scripts/tracetool/backend/log.py
index 4f4a4d38b1..7d2c3abe75 100644
--- a/scripts/tracetool/backend/log.py
+++ b/scripts/tracetool/backend/log.py
@@ -35,14 +35,15 @@ def generate_h(event, group):
     else:
         cond = "trace_event_get_state(%s)" % ("TRACE_" + event.name.upper())

-    out('        if (%(cond)s) {',
-        '            struct timeval _now;',
-        '            gettimeofday(&_now, NULL);',
-        '            qemu_log_mask(LOG_TRACE, "%%d@%%zd.%%06zd:%(name)s " %(fmt)s "\\n",',
-        '                          getpid(),',
-        '                          (size_t)_now.tv_sec, (size_t)_now.tv_usec',
-        '                          %(argnames)s);',
-        '        }',
+    out('    if (%(cond)s) {',
+        '        struct timeval _now;',
+        '        gettimeofday(&_now, NULL);',
+        '        qemu_log_mask(LOG_TRACE,',
+        '                      "%%d@%%zd.%%06zd:%(name)s " %(fmt)s "\\n",',
+        '                      getpid(),',
+        '                      (size_t)_now.tv_sec, (size_t)_now.tv_usec',
+        '                      %(argnames)s);',
+        '    }',
         cond=cond,
         name=event.name,
         fmt=event.fmt.rstrip("\n"),
diff --git a/scripts/tracetool/backend/simple.py b/scripts/tracetool/backend/simple.py
index 85f61028e2..a28460b1e4 100644
--- a/scripts/tracetool/backend/simple.py
+++ b/scripts/tracetool/backend/simple.py
@@ -37,7 +37,7 @@ def generate_h_begin(events, group):


 def generate_h(event, group):
-    out('        _simple_%(api)s(%(args)s);',
+    out('    _simple_%(api)s(%(args)s);',
         api=event.api(),
         args=", ".join(event.args.names()))

diff --git a/scripts/tracetool/backend/syslog.py b/scripts/tracetool/backend/syslog.py
index b8ff2790c4..1ce627f0fc 100644
--- a/scripts/tracetool/backend/syslog.py
+++ b/scripts/tracetool/backend/syslog.py
@@ -35,9 +35,9 @@ def generate_h(event, group):
     else:
         cond = "trace_event_get_state(%s)" % ("TRACE_" + event.name.upper())

-    out('        if (%(cond)s) {',
-        '            syslog(LOG_INFO, "%(name)s " %(fmt)s %(argnames)s);',
-        '        }',
+    out('    if (%(cond)s) {',
+        '        syslog(LOG_INFO, "%(name)s " %(fmt)s %(argnames)s);',
+        '    }',
         cond=cond,
         name=event.name,
         fmt=event.fmt.rstrip
[Qemu-devel] [PATCH v6 4/7] exec: [tcg] Use different TBs according to the vCPU's dynamic tracing state
Every vCPU now uses a separate set of TBs for each set of dynamic tracing event state values. Each set of TBs can be used by any number of vCPUs to maximize TB reuse when vCPUs have the same tracing state. This feature is later used by tracetool to optimize tracing of guest code events. The maximum number of TB sets is defined as 2^E, where E is the number of events that have the 'vcpu' property (their state is stored in CPUState->trace_dstate). For this to work, a change on the dynamic tracing state of a vCPU will force it to flush its virtual TB cache (which is only indexed by address), and fall back to the physical TB cache (which now contains the vCPU's dynamic tracing state as part of the hashing function). Signed-off-by: Lluís Vilanova --- cpu-exec.c| 26 +- include/exec/exec-all.h |5 + include/exec/tb-hash-xx.h |8 +++- include/exec/tb-hash.h|5 +++-- include/qemu-common.h |3 +++ tests/qht-bench.c |2 +- trace/control-target.c|3 +++ trace/control.h |3 +++ translate-all.c | 16 ++-- 9 files changed, 60 insertions(+), 11 deletions(-) diff --git a/cpu-exec.c b/cpu-exec.c index 1b7366efb0..a377505b9c 100644 --- a/cpu-exec.c +++ b/cpu-exec.c @@ -262,6 +262,7 @@ struct tb_desc { CPUArchState *env; tb_page_addr_t phys_page1; uint32_t flags; +TRACE_QHT_VCPU_DSTATE_TYPE trace_vcpu_dstate; }; static bool tb_cmp(const void *p, const void *d) @@ -273,6 +274,7 @@ static bool tb_cmp(const void *p, const void *d) tb->page_addr[0] == desc->phys_page1 && tb->cs_base == desc->cs_base && tb->flags == desc->flags && +tb->trace_vcpu_dstate == desc->trace_vcpu_dstate && !atomic_read(&tb->invalid)) { /* check next page if needed */ if (tb->page_addr[1] == -1) { @@ -294,7 +296,8 @@ static bool tb_cmp(const void *p, const void *d) static TranslationBlock *tb_htable_lookup(CPUState *cpu, target_ulong pc, target_ulong cs_base, - uint32_t flags) + uint32_t flags, + uint32_t trace_vcpu_dstate) { tb_page_addr_t phys_pc; struct tb_desc desc; @@ -303,10 +306,11 @@ static TranslationBlock 
*tb_htable_lookup(CPUState *cpu, desc.env = (CPUArchState *)cpu->env_ptr; desc.cs_base = cs_base; desc.flags = flags; +desc.trace_vcpu_dstate = trace_vcpu_dstate; desc.pc = pc; phys_pc = get_page_addr_code(desc.env, pc); desc.phys_page1 = phys_pc & TARGET_PAGE_MASK; -h = tb_hash_func(phys_pc, pc, flags); +h = tb_hash_func(phys_pc, pc, flags, trace_vcpu_dstate); return qht_lookup(&tcg_ctx.tb_ctx.htable, tb_cmp, &desc, h); } @@ -318,16 +322,24 @@ static inline TranslationBlock *tb_find(CPUState *cpu, TranslationBlock *tb; target_ulong cs_base, pc; uint32_t flags; +unsigned long trace_vcpu_dstate_bitmap; +TRACE_QHT_VCPU_DSTATE_TYPE trace_vcpu_dstate; bool have_tb_lock = false; +bitmap_copy(&trace_vcpu_dstate_bitmap, cpu->trace_dstate, +trace_get_vcpu_event_count()); +memcpy(&trace_vcpu_dstate, &trace_vcpu_dstate_bitmap, + sizeof(trace_vcpu_dstate)); + /* we record a subset of the CPU state. It will always be the same before a given translated block is executed. */ cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags); tb = atomic_rcu_read(&cpu->tb_jmp_cache[tb_jmp_cache_hash_func(pc)]); if (unlikely(!tb || tb->pc != pc || tb->cs_base != cs_base || - tb->flags != flags)) { -tb = tb_htable_lookup(cpu, pc, cs_base, flags); + tb->flags != flags || + tb->trace_vcpu_dstate != trace_vcpu_dstate)) { +tb = tb_htable_lookup(cpu, pc, cs_base, flags, trace_vcpu_dstate); if (!tb) { /* mmap_lock is needed by tb_gen_code, and mmap_lock must be @@ -341,7 +353,7 @@ static inline TranslationBlock *tb_find(CPUState *cpu, /* There's a chance that our desired tb has been translated while * taking the locks so we check again inside the lock. 
*/ -tb = tb_htable_lookup(cpu, pc, cs_base, flags); +tb = tb_htable_lookup(cpu, pc, cs_base, flags, trace_vcpu_dstate); if (!tb) { /* if no translated code available, then translate it now */ tb = tb_gen_code(cpu, pc, cs_base, flags, 0); @@ -465,6 +477,7 @@ static inline bool cpu_handle_exception(CPUState *cpu, int *ret) if (unlikely(atomic_read(&cpu->trace_dstate_delayed_req))) { bitmap_copy(cpu->trace_dstate, cpu->trace_dstate_delayed, trace_get_vcpu_event_count()); +tb_flush_jmp_cache_all(cpu
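The lookup change in this patch can be illustrated with a small stand-alone sketch. It is not QEMU code: `toy_tb`, `toy_hash`, `toy_insert` and `toy_lookup` are hypothetical names, the table is a simple open-addressing array rather than QHT, and the mixing function is an arbitrary stand-in for tb_hash_func(). The only point it tries to show is that the tracing state takes part in both the hash and the comparison, so blocks generated for different tracing states never alias.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a translation block. */
struct toy_tb {
    uint64_t pc;
    uint32_t flags;
    uint32_t trace_vcpu_dstate; /* tracing state the code was generated for */
    int in_use;
};

#define TOY_TABLE_SIZE 64 /* power of two, so masking works as a modulo */

static struct toy_tb toy_table[TOY_TABLE_SIZE];

/* Arbitrary mixing function (an assumption, not QEMU's hash). */
static uint32_t toy_hash(uint64_t pc, uint32_t flags, uint32_t dstate)
{
    uint64_t h = pc * 0x9e3779b97f4a7c15ULL;
    h ^= (uint64_t)flags * 0xc2b2ae3d27d4eb4fULL;
    h ^= (uint64_t)dstate * 0x165667b19e3779f9ULL;
    return (uint32_t)(h >> 32);
}

/* Insert with linear probing; returns NULL if the table is full. */
static struct toy_tb *toy_insert(uint64_t pc, uint32_t flags, uint32_t dstate)
{
    uint32_t idx = toy_hash(pc, flags, dstate) & (TOY_TABLE_SIZE - 1);

    for (size_t probe = 0; probe < TOY_TABLE_SIZE; probe++) {
        struct toy_tb *tb = &toy_table[(idx + probe) & (TOY_TABLE_SIZE - 1)];
        if (!tb->in_use) {
            tb->pc = pc;
            tb->flags = flags;
            tb->trace_vcpu_dstate = dstate;
            tb->in_use = 1;
            return tb;
        }
    }
    return NULL;
}

/* Lookup compares the tracing state as well as pc and flags. */
static struct toy_tb *toy_lookup(uint64_t pc, uint32_t flags, uint32_t dstate)
{
    uint32_t idx = toy_hash(pc, flags, dstate) & (TOY_TABLE_SIZE - 1);

    for (size_t probe = 0; probe < TOY_TABLE_SIZE; probe++) {
        struct toy_tb *tb = &toy_table[(idx + probe) & (TOY_TABLE_SIZE - 1)];
        if (!tb->in_use) {
            return NULL; /* empty slot ends the probe sequence */
        }
        if (tb->pc == pc && tb->flags == flags &&
            tb->trace_vcpu_dstate == dstate) {
            return tb;
        }
    }
    return NULL;
}
```

Two vCPUs that call toy_lookup() with the same (pc, flags, dstate) triple get the same block back, which is the reuse the commit message describes; a vCPU whose tracing state differs misses and would have to translate its own copy.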
[Qemu-devel] [PATCH v6 2/7] trace: Make trace_get_vcpu_event_count() inlinable
Later patches will make use of it. Signed-off-by: Lluís Vilanova --- trace/control-internal.h |5 + trace/control.c |9 ++--- trace/control.h |2 +- 3 files changed, 8 insertions(+), 8 deletions(-) diff --git a/trace/control-internal.h b/trace/control-internal.h index a9d395a587..beb98a0d2c 100644 --- a/trace/control-internal.h +++ b/trace/control-internal.h @@ -16,6 +16,7 @@ extern int trace_events_enabled_count; +extern uint32_t trace_next_vcpu_id; static inline bool trace_event_is_pattern(const char *str) @@ -82,6 +83,10 @@ static inline bool trace_event_get_vcpu_state_dynamic(CPUState *vcpu, return trace_event_get_vcpu_state_dynamic_by_vcpu_id(vcpu, vcpu_id); } +static inline uint32_t trace_get_vcpu_event_count(void) +{ +return trace_next_vcpu_id; +} void trace_event_register_group(TraceEvent **events); diff --git a/trace/control.c b/trace/control.c index 1a7bee6ddc..52d0e343fa 100644 --- a/trace/control.c +++ b/trace/control.c @@ -36,7 +36,7 @@ typedef struct TraceEventGroup { static TraceEventGroup *event_groups; static size_t nevent_groups; static uint32_t next_id; -static uint32_t next_vcpu_id; +uint32_t trace_next_vcpu_id; QemuOptsList qemu_trace_opts = { .name = "trace", @@ -65,7 +65,7 @@ void trace_event_register_group(TraceEvent **events) for (i = 0; events[i] != NULL; i++) { events[i]->id = next_id++; if (events[i]->vcpu_id != TRACE_VCPU_EVENT_NONE) { -events[i]->vcpu_id = next_vcpu_id++; +events[i]->vcpu_id = trace_next_vcpu_id++; } } event_groups = g_renew(TraceEventGroup, event_groups, nevent_groups + 1); @@ -299,8 +299,3 @@ char *trace_opt_parse(const char *optarg) return trace_file; } - -uint32_t trace_get_vcpu_event_count(void) -{ -return next_vcpu_id; -} diff --git a/trace/control.h b/trace/control.h index ccaeac8552..80d326c4d1 100644 --- a/trace/control.h +++ b/trace/control.h @@ -237,7 +237,7 @@ char *trace_opt_parse(const char *optarg); * * Return the number of known vcpu-specific events */ -uint32_t trace_get_vcpu_event_count(void); +static 
uint32_t trace_get_vcpu_event_count(void); #include "trace/control-internal.h"
[Qemu-devel] [PATCH v6 7/7] trace: [trivial] Statically enable all guest events
The optimizations in this series make it feasible to have these events available in all builds.

Signed-off-by: Lluís Vilanova
---
 trace-events |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/trace-events b/trace-events
index f74e1d3d22..0a0f4d9cd6 100644
--- a/trace-events
+++ b/trace-events
@@ -159,7 +159,7 @@ vcpu guest_cpu_reset(void)
 #
 # Mode: user, softmmu
 # Targets: TCG(all)
-disable vcpu tcg guest_mem_before(TCGv vaddr, uint8_t info) "info=%d", "vaddr=0x%016"PRIx64" info=%d"
+vcpu tcg guest_mem_before(TCGv vaddr, uint8_t info) "info=%d", "vaddr=0x%016"PRIx64" info=%d"
 
 # @num: System call number.
 # @arg*: System call argument value.
@@ -168,7 +168,7 @@ disable vcpu tcg guest_mem_before(TCGv vaddr, uint8_t info) "info=%d", "vaddr=0x
 #
 # Mode: user
 # Targets: TCG(all)
-disable vcpu guest_user_syscall(uint64_t num, uint64_t arg1, uint64_t arg2, uint64_t arg3, uint64_t arg4, uint64_t arg5, uint64_t arg6, uint64_t arg7, uint64_t arg8) "num=0x%016"PRIx64" arg1=0x%016"PRIx64" arg2=0x%016"PRIx64" arg3=0x%016"PRIx64" arg4=0x%016"PRIx64" arg5=0x%016"PRIx64" arg6=0x%016"PRIx64" arg7=0x%016"PRIx64" arg8=0x%016"PRIx64
+vcpu guest_user_syscall(uint64_t num, uint64_t arg1, uint64_t arg2, uint64_t arg3, uint64_t arg4, uint64_t arg5, uint64_t arg6, uint64_t arg7, uint64_t arg8) "num=0x%016"PRIx64" arg1=0x%016"PRIx64" arg2=0x%016"PRIx64" arg3=0x%016"PRIx64" arg4=0x%016"PRIx64" arg5=0x%016"PRIx64" arg6=0x%016"PRIx64" arg7=0x%016"PRIx64" arg8=0x%016"PRIx64
 
 # @num: System call number.
 # @ret: System call result value.
@@ -177,4 +177,4 @@ disable vcpu guest_user_syscall(uint64_t num, uint64_t arg1, uint64_t arg2, uint
 #
 # Mode: user
 # Targets: TCG(all)
-disable vcpu guest_user_syscall_ret(uint64_t num, uint64_t ret) "num=0x%016"PRIx64" ret=0x%016"PRIx64
+vcpu guest_user_syscall_ret(uint64_t num, uint64_t ret) "num=0x%016"PRIx64" ret=0x%016"PRIx64
[Qemu-devel] [PATCH v6 3/7] trace: [tcg] Delay changes to dynamic state when translating
This keeps consistency across all decisions taken during translation when the dynamic state of a vCPU is changed in the middle of translating some guest code. Signed-off-by: Lluís Vilanova --- cpu-exec.c | 26 ++ include/qom/cpu.h |7 +++ qom/cpu.c |4 trace/control-target.c | 11 +-- 4 files changed, 46 insertions(+), 2 deletions(-) diff --git a/cpu-exec.c b/cpu-exec.c index 4188fed3c6..1b7366efb0 100644 --- a/cpu-exec.c +++ b/cpu-exec.c @@ -33,6 +33,7 @@ #include "hw/i386/apic.h" #endif #include "sysemu/replay.h" +#include "trace/control.h" /* -icount align implementation. */ @@ -451,9 +452,21 @@ static inline bool cpu_handle_exception(CPUState *cpu, int *ret) #ifndef CONFIG_USER_ONLY } else if (replay_has_exception() && cpu->icount_decr.u16.low + cpu->icount_extra == 0) { +/* delay changes to this vCPU's dstate during translation */ +atomic_set(&cpu->trace_dstate_delayed_req, false); +atomic_set(&cpu->trace_dstate_must_delay, true); + /* try to cause an exception pending in the log */ cpu_exec_nocache(cpu, 1, tb_find(cpu, NULL, 0), true); *ret = -1; + +/* apply and disable delayed dstate changes */ +atomic_set(&cpu->trace_dstate_must_delay, false); +if (unlikely(atomic_read(&cpu->trace_dstate_delayed_req))) { +bitmap_copy(cpu->trace_dstate, cpu->trace_dstate_delayed, +trace_get_vcpu_event_count()); +} + return true; #endif } @@ -634,8 +647,21 @@ int cpu_exec(CPUState *cpu) for(;;) { cpu_handle_interrupt(cpu, &last_tb); + +/* delay changes to this vCPU's dstate during translation */ +atomic_set(&cpu->trace_dstate_delayed_req, false); +atomic_set(&cpu->trace_dstate_must_delay, true); + tb = tb_find(cpu, last_tb, tb_exit); cpu_loop_exec_tb(cpu, tb, &last_tb, &tb_exit, &sc); + +/* apply and disable delayed dstate changes */ +atomic_set(&cpu->trace_dstate_must_delay, false); +if (unlikely(atomic_read(&cpu->trace_dstate_delayed_req))) { +bitmap_copy(cpu->trace_dstate, cpu->trace_dstate_delayed, +trace_get_vcpu_event_count()); +} + /* Try to align the host and virtual 
clocks if the guest is in advance */ align_clocks(&sc, cpu); diff --git a/include/qom/cpu.h b/include/qom/cpu.h index 3f79a8e955..58255d06fa 100644 --- a/include/qom/cpu.h +++ b/include/qom/cpu.h @@ -295,6 +295,10 @@ struct qemu_work_item; * @kvm_fd: vCPU file descriptor for KVM. * @work_mutex: Lock to prevent multiple access to queued_work_*. * @queued_work_first: First asynchronous work pending. + * @trace_dstate_must_delay: Whether a change to trace_dstate must be delayed. + * @trace_dstate_delayed_req: Whether a change to trace_dstate was delayed. + * @trace_dstate_delayed: Delayed changes to trace_dstate (includes all changes + *to @trace_dstate). * @trace_dstate: Dynamic tracing state of events for this vCPU (bitmask). * * State of one CPU core or thread. @@ -370,6 +374,9 @@ struct CPUState { * Dynamically allocated based on bitmap requried to hold up to * trace_get_vcpu_event_count() entries. */ +bool trace_dstate_must_delay; +bool trace_dstate_delayed_req; +unsigned long *trace_dstate_delayed; unsigned long *trace_dstate; /* TODO Move common fields from CPUArchState here. 
*/ diff --git a/qom/cpu.c b/qom/cpu.c index 03d9190f8c..d56496d28d 100644 --- a/qom/cpu.c +++ b/qom/cpu.c @@ -367,6 +367,9 @@ static void cpu_common_initfn(Object *obj) QTAILQ_INIT(&cpu->breakpoints); QTAILQ_INIT(&cpu->watchpoints); +cpu->trace_dstate_must_delay = false; +cpu->trace_dstate_delayed_req = false; +cpu->trace_dstate_delayed = bitmap_new(trace_get_vcpu_event_count()); cpu->trace_dstate = bitmap_new(trace_get_vcpu_event_count()); cpu_exec_initfn(cpu); @@ -375,6 +378,7 @@ static void cpu_common_initfn(Object *obj) static void cpu_common_finalize(Object *obj) { CPUState *cpu = CPU(obj); +g_free(cpu->trace_dstate_delayed); g_free(cpu->trace_dstate); } diff --git a/trace/control-target.c b/trace/control-target.c index 7ebf6e0bcb..aba8db55de 100644 --- a/trace/control-target.c +++ b/trace/control-target.c @@ -69,13 +69,20 @@ void trace_event_set_vcpu_state_dynamic(CPUState *vcpu, if (state_pre != state) { if (state) { trace_events_enabled_count++; -set_bit(vcpu_id, vcpu->trace_dstate); +set_bit(vcpu_id, vcpu->trace_dstate_delayed); +if (!atomic_read(&vcpu->trace_dstate_must_delay)) { +set_bit(vcpu_id, vcpu->trace_dstate); +
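The delay protocol introduced by this patch can be sketched in isolation. This is a simplified model under stated assumptions, not the QEMU implementation: `toy_vcpu` and the `toy_*` functions are hypothetical, the bitmap is a single unsigned long, and the branch that records a delayed request is my reading of the part of the diff that is cut off above. It also ignores the cross-thread ordering questions a real implementation has to answer.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical, single-word model of the per-vCPU tracing state. */
struct toy_vcpu {
    atomic_bool must_delay;       /* set while a block is being translated */
    atomic_bool delayed_req;      /* a change arrived while must_delay was set */
    unsigned long dstate;         /* state the translator observes */
    unsigned long dstate_delayed; /* always receives every change */
};

/* Request a change of one event bit. */
static void toy_set_event(struct toy_vcpu *cpu, unsigned bit, bool on)
{
    assert(bit < sizeof(unsigned long) * CHAR_BIT);
    unsigned long mask = 1UL << bit;

    if (on) {
        cpu->dstate_delayed |= mask;
    } else {
        cpu->dstate_delayed &= ~mask;
    }

    if (!atomic_load(&cpu->must_delay)) {
        /* Not translating: apply the same change immediately. */
        if (on) {
            cpu->dstate |= mask;
        } else {
            cpu->dstate &= ~mask;
        }
    } else {
        /* Assumed behaviour: remember that an update is pending. */
        atomic_store(&cpu->delayed_req, true);
    }
}

/* Called before translation/execution of a block starts. */
static void toy_translate_begin(struct toy_vcpu *cpu)
{
    atomic_store(&cpu->delayed_req, false);
    atomic_store(&cpu->must_delay, true);
}

/* Called afterwards: apply whatever was postponed. */
static void toy_translate_end(struct toy_vcpu *cpu)
{
    atomic_store(&cpu->must_delay, false);
    if (atomic_load(&cpu->delayed_req)) {
        cpu->dstate = cpu->dstate_delayed;
    }
}
```

Between toy_translate_begin() and toy_translate_end() the value of `dstate` cannot change, which is the consistency property the commit message asks for: every decision taken while translating one block sees the same tracing state.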
[Qemu-devel] [PATCH v6 1/7] exec: [tcg] Refactor flush of per-CPU virtual TB cache
The function is reused in later patches. Signed-off-by: Lluís Vilanova --- cputlb.c|2 +- include/exec/exec-all.h |6 ++ translate-all.c | 14 +- 3 files changed, 16 insertions(+), 6 deletions(-) diff --git a/cputlb.c b/cputlb.c index 813279f3bc..9bf9960e1b 100644 --- a/cputlb.c +++ b/cputlb.c @@ -80,7 +80,7 @@ void tlb_flush(CPUState *cpu, int flush_global) memset(env->tlb_table, -1, sizeof(env->tlb_table)); memset(env->tlb_v_table, -1, sizeof(env->tlb_v_table)); -memset(cpu->tb_jmp_cache, 0, sizeof(cpu->tb_jmp_cache)); +tb_flush_jmp_cache_all(cpu); env->vtlb_index = 0; env->tlb_flush_addr = -1; diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h index a8c13cee66..57cd978578 100644 --- a/include/exec/exec-all.h +++ b/include/exec/exec-all.h @@ -256,6 +256,12 @@ struct TranslationBlock { }; void tb_free(TranslationBlock *tb); +/** + * tb_flush_jmp_cache_all: + * + * Flush the virtual translation block cache. + */ +void tb_flush_jmp_cache_all(CPUState *env); void tb_flush(CPUState *cpu); void tb_phys_invalidate(TranslationBlock *tb, tb_page_addr_t page_addr); diff --git a/translate-all.c b/translate-all.c index 3dd9214904..29ccb9e546 100644 --- a/translate-all.c +++ b/translate-all.c @@ -941,11 +941,7 @@ static void do_tb_flush(CPUState *cpu, run_on_cpu_data tb_flush_count) } CPU_FOREACH(cpu) { -int i; - -for (i = 0; i < TB_JMP_CACHE_SIZE; ++i) { -atomic_set(&cpu->tb_jmp_cache[i], NULL); -} +tb_flush_jmp_cache_all(cpu); } tcg_ctx.tb_ctx.nb_tbs = 0; @@ -1741,6 +1737,14 @@ void tb_check_watchpoint(CPUState *cpu) } } +void tb_flush_jmp_cache_all(CPUState *cpu) +{ +int i; +for (i = 0; i < TB_JMP_CACHE_SIZE; ++i) { +atomic_set(&cpu->tb_jmp_cache[i], NULL); +} +} + #ifndef CONFIG_USER_ONLY /* in deterministic execution mode, instructions doing device I/Os must be at the end of the TB */
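As a rough illustration of what the factored-out helper does, here is a stand-alone sketch using C11 atomics. The names `toy_cpu`, `TOY_JMP_CACHE_SIZE` and `toy_flush_jmp_cache_all` are hypothetical, and the relaxed store is my assumption about the ordering that QEMU's atomic_set() provides.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define TOY_JMP_CACHE_SIZE 16 /* hypothetical size, for illustration only */

/* Hypothetical stand-in for the per-vCPU, address-indexed TB cache. */
struct toy_cpu {
    _Atomic(void *) jmp_cache[TOY_JMP_CACHE_SIZE];
};

/*
 * Clear every slot with an individual atomic store, so that a concurrent
 * reader sees either the old pointer or NULL in each slot, never a torn
 * value.
 */
static void toy_flush_jmp_cache_all(struct toy_cpu *cpu)
{
    assert(cpu != NULL);
    for (size_t i = 0; i < TOY_JMP_CACHE_SIZE; i++) {
        atomic_store_explicit(&cpu->jmp_cache[i], NULL, memory_order_relaxed);
    }
}
```

Having one helper means the tlb_flush() path, which used memset() before this patch, and the do_tb_flush() path, which already used per-slot atomic stores, now clear the cache in the same way.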
[Qemu-devel] [PATCH v6 0/7] trace: [tcg] Optimize per-vCPU tracing states with separate TB caches
Optimizes tracing of events with the 'tcg' and 'vcpu' properties (e.g., memory
accesses), making it feasible to statically enable them by default on all QEMU
builds.

Some quick'n'dirty numbers with 400.perlbench (SPECcpu2006) on the train input
(medium size - suns.pl) and the guest_mem_before event:

* vanilla, statically disabled
real    0m2,259s
user    0m2,252s
sys     0m0,004s

* vanilla, statically enabled (overhead: 2.18x)
real    0m4,921s
user    0m4,912s
sys     0m0,008s

* multi-tb, statically disabled (overhead: 0.99x) [within noise range]
real    0m2,228s
user    0m2,216s
sys     0m0,008s

* multi-tb, statically enabled (overhead: 0.99x) [within noise range]
real    0m2,229s
user    0m2,224s
sys     0m0,004s

Right now, events with the 'tcg' property always generate TCG code to trace that
event at guest code execution time, where the event's dynamic state is checked.

This series adds a performance optimization where TCG code for events with the
'tcg' and 'vcpu' properties is not generated if the event is dynamically
disabled. This optimization raises two issues:

* An event can be dynamically disabled/enabled after the corresponding TCG code
  has been generated (i.e., a new TB with the corresponding code should be
  used).

* Each vCPU can have a different dynamic state for the same event (i.e., tracing
  the memory accesses of only one process pinned to a vCPU).

To handle both issues, this series integrates the dynamic tracing event state
into the TB hashing function, so that vCPUs tracing different events will use
separate TBs. Note that only events with the 'vcpu' property are used for
hashing (as stored in the bitmap of CPUState->trace_dstate).

This makes dynamic event state changes on vCPUs very efficient, since they can
use TBs produced by other vCPUs while on the same event state combination (or
produced by the same vCPU, earlier).

Discarded alternatives:

* Emitting TCG code to check if an event needs tracing, where we should still
  move the tracing call code to either a cold path (making tracing performance
  worse), or leave it inlined (making non-tracing performance worse).

* Eliding TCG code only when *zero* vCPUs are tracing an event, since enabling
  it on a single vCPU will impact the performance of all other vCPUs that are
  not tracing that event.

Signed-off-by: Lluís Vilanova
---

Changes in v6
=============

* Check hashing size error with QEMU_BUILD_BUG_ON [Richard Henderson].

Changes in v5
=============

* Move define into "qemu-common.h" to allow compilation of tests.

Changes in v4
=============

* Incorporate trace_dstate into the TB hashing function instead of using
  multiple physical TB caches [suggested by Richard Henderson].

Changes in v3
=============

* Rebase on 0737f32daf.
* Do not use reserved symbol prefixes ("__") [Stefan Hajnoczi].
* Refactor trace_get_vcpu_event_count() to be inlinable.
* Optimize cpu_tb_cache_set_requested() (hottest path).

Changes in v2
=============

* Fix bitmap copy in cpu_tb_cache_set_apply().
* Split generated code re-alignment into a separate patch [Daniel P. Berrange].

Lluís Vilanova (7):
      exec: [tcg] Refactor flush of per-CPU virtual TB cache
      trace: Make trace_get_vcpu_event_count() inlinable
      trace: [tcg] Delay changes to dynamic state when translating
      exec: [tcg] Use different TBs according to the vCPU's dynamic tracing state
      trace: [tcg] Do not generate TCG code to trace dinamically-disabled events
      trace: [tcg,trivial] Re-align generated code
      trace: [trivial] Statically enable all guest events

 cpu-exec.c                               |   52 +++---
 cputlb.c                                 |    2 +
 include/exec/exec-all.h                  |   11 ++
 include/exec/tb-hash-xx.h                |    8 -
 include/exec/tb-hash.h                   |    5 ++-
 include/qemu-common.h                    |    3 ++
 include/qom/cpu.h                        |    7 
 qom/cpu.c                                |    4 ++
 scripts/tracetool/__init__.py            |    1 +
 scripts/tracetool/backend/dtrace.py      |    2 +
 scripts/tracetool/backend/ftrace.py      |   20 ++--
 scripts/tracetool/backend/log.py         |   17 +-
 scripts/tracetool/backend/simple.py      |    2 +
 scripts/tracetool/backend/syslog.py      |    6 ++-
 scripts/tracetool/backend/ust.py         |    2 +
 scripts/tracetool/format/h.py            |   24 ++
 scripts/tracetool/format/tcg_h.py        |   19 +--
 scripts/tracetool/format/tcg_helper_c.py |    3 +-
 tests/qht-bench.c                        |    2 +
 trace-events                             |    6 ++-
 trace/control-internal.h                 |    5 +++
 trace/control-target.c                   |   14 +++-
 trace/control.c                          |    9 +
 trace/control.h                          |    5 ++-
 translate-all.c                          |   30 +
Re: [Qemu-devel] Looking for a linux-user mode test
On 28 December 2016 at 17:12, Sean Bruno wrote:
> On 12/28/16 10:05, Peter Maydell wrote:
>> Ideally all of that rework (including the support for properly
>> interrupting syscalls without races) should be ported over to
>> bsd-user at some point.
>
> If you have a moment to point me at the merge commit that pulled in the
> majority of this overhaul, I'll take a moment to review it for
> application to bsd-user.

Merges 430da7a81d356e3, 3e904d6ade7f36, b66e10e4c9ae7, d6550e9ed2e1a60
(listed here latest first but probably more helpfully examined the
other way round) have the bulk of it, there are probably some
bugfixes that got in via other merges.

thanks
-- PMM
[Qemu-devel] [PATCH] linux-user: always start with parallel_cpus set to true
We always need real atomics, as we can have shared memory between
processes.

A good test case is the example from futex(2), futex_demo.c:

the use case is

    mmap(...);
    fork();
    Parent and Child:
        while(...)
            __sync_bool_compare_and_swap(...)
        ...
        futex(...)

In this case we need real atomics in __sync_bool_compare_and_swap(),
but as parallel_cpus is set to 0, we don't have them.

We also revert "b67cb68 linux-user: enable parallel code generation on clone"
as parallel_cpus is unconditionally set now.

Of course, this doesn't fix atomics that are emulated using
cpu_loop_exit_atomic() as we can't stop virtual CPUs in other
processes.

Signed-off-by: Laurent Vivier
---
 linux-user/syscall.c | 8 --------
 translate-all.c      | 4 ++++
 2 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 7b77503..db697c0 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -6164,14 +6164,6 @@ static int do_fork(CPUArchState *env, unsigned int flags, abi_ulong newsp,
         sigfillset(&sigmask);
         sigprocmask(SIG_BLOCK, &sigmask, &info.sigmask);
 
-        /* If this is our first additional thread, we need to ensure we
-         * generate code for parallel execution and flush old translations.
-         */
-        if (!parallel_cpus) {
-            parallel_cpus = true;
-            tb_flush(cpu);
-        }
-
         ret = pthread_create(&info.thread, &attr, clone_func, &info);
         /* TODO: Free new CPU state if thread creation failed.  */
 
diff --git a/translate-all.c b/translate-all.c
index 3dd9214..0b0bb09 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -142,7 +142,11 @@ static void *l1_map[V_L1_MAX_SIZE];
 
 /* code generation context */
 TCGContext tcg_ctx;
+#ifdef CONFIG_USER_ONLY
+bool parallel_cpus = true;
+#else
 bool parallel_cpus;
+#endif
 
 /* translation block context */
 #ifdef CONFIG_USER_ONLY
-- 
2.7.4
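The mmap()/fork()/compare-and-swap pattern from the commit message can be reduced to the following sketch. It is not futex_demo.c itself: the futex() wait/wake part is left out, and `cas_increment` and `shared_counter_demo` are hypothetical names. It only shows why the compare-and-swap must be a real atomic operation even though each process has a single thread: the two processes update the same physical memory.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Atomically add 1 to *p using a compare-and-swap retry loop. */
static void cas_increment(int *p)
{
    int old;
    do {
        old = *(volatile int *)p; /* re-read on every attempt */
    } while (!__sync_bool_compare_and_swap(p, old, old + 1));
}

/*
 * Parent and child each increment a counter living in shared memory.
 * Returns the final counter value, or -1 on error.
 */
static int shared_counter_demo(int iterations)
{
    assert(iterations >= 0);

    int *counter = mmap(NULL, sizeof(*counter), PROT_READ | PROT_WRITE,
                        MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (counter == MAP_FAILED) {
        return -1;
    }
    *counter = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(counter, sizeof(*counter));
        return -1;
    }

    for (int i = 0; i < iterations; i++) {
        cas_increment(counter);
    }

    if (pid == 0) {
        _exit(0); /* child is done */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0) {
        munmap(counter, sizeof(*counter));
        return -1;
    }

    int result = *counter;
    munmap(counter, sizeof(*counter));
    return result;
}
```

With working atomics the result is exactly twice the iteration count. If the compare-and-swap were translated as a plain load, compare and store, which is the situation the patch describes for parallel_cpus == 0, updates from the two processes could be lost.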
Re: [Qemu-devel] Looking for a linux-user mode test
On 12/28/16 10:05, Peter Maydell wrote:
> On 28 December 2016 at 15:06, Sean Bruno wrote:
>> After some recent-ish changes to how user mode executes things/stuff,
>> I'm running into issues with the out of tree bsd-user mode code that
>> FreeBSD has been maintaining. It looks like the host_signal_handler()
>> is never executed or registered correctly in our code. I'm curious if
>> the linux-user code can handle this bit of configure script from m4.
>>
>> https://people.freebsd.org/~sbruno/stack.c
>
> Hmm. That code does:
>  * set up a SIGSEGV signal handler to run on its own stack
>  * go into an infinite recursion, expecting to run out of
>    stack and trigger a SEGV
> which is a bit of an obscure corner case of signal handling.
>
> We recently fixed a lot of signal handler related bugs in linux-user
> by doing a significant overhaul of that code. If bsd-user is still
> using the old broken approach it's probably still got lots of bugs
> in it. Alternatively, it's possible we changed some of the core
> code in that process and broke bsd-user by mistake.
>
> Ideally all of that rework (including the support for properly
> interrupting syscalls without races) should be ported over to
> bsd-user at some point.

If you have a moment to point me at the merge commit that pulled in the
majority of this overhaul, I'll take a moment to review it for
application to bsd-user.

>
>> If someone has the time/inclination, can this code be compiled for ARMv6
>> and executed in a linux chroot with the -strace argument applied? I see
>> the following, which after much debugging seems to indicate that the
>> host_signal_handler() code is never executed as this code is requesting
>> that SIGSEGV be masked to its own handler.
>
> Built for ARMv7 since I don't have an ARMv6 cross compiler
> or system, but it works ok for linux (also, built with -static
> rather than run in a chroot, for convenience):
>
> e104462:xenial:qemu$ ./build/arm-linux/arm-linux-user/qemu-arm -strace
> ~/linaro/qemu-misc-tests/stack
> 29798 uname(0xf6fff1f0) = 0
> 29798 brk(NULL) = 0x0007f000
> 29798 brk(0x0007fd00) = 0x0007fd00
> 29798 readlink("/proc/self/exe",0xf6ffe328,4096) = 43
> 29798 brk(0x000a0d00) = 0x000a0d00
> 29798 brk(0x000a1000) = 0x000a1000
> 29798 access("/etc/ld.so.nohwcap",F_OK) = -1 errno=2 (No such file or
> directory)
> 29798 sigaltstack(0xf6fff2e0,(nil)) = 0
> 29798 rt_sigaction(SIGSEGV,0xf6fff1b0,NULL) = 0
> --- SIGSEGV {si_signo=SIGSEGV, si_code=1, si_addr = 0xf67c} ---
> 29798 exit_group(0)
>
> (the enhancement to linux-user's strace to print the line on signal
> delivery is also a pretty new change.)
>

Thanks. This is what I expect to see.

>> https://people.freebsd.org/~sbruno/qemu-bsd-user-arm.txt
>>
>> Prior to 7e6c57e2957c7d868f74bd0d53b5e861b495e1c7 this DTRT for our
>> ARMv6 targets.
>
> This commit hash doesn't seem to be in QEMU master.
>

*sigh* ... that was the merge commit to the bsd-user branch I maintain.
Ignore it.

> thanks
> -- PMM
>
Re: [Qemu-devel] [PATCH v2] build: include sys/sysmacros.h for major() and minor()
On 28 December 2016 at 16:10, Eric Blake wrote:
> On 12/28/2016 08:53 AM, Christopher Covington wrote:
>> The definition of the major() and minor() macros are moving within glibc to
>> . Include this header to avoid the following sorts of
>> build-stopping messages:
>>
>
>> The additional include allows the build to complete on Fedora 26 (Rawhide)
>> with glibc version 2.24.90.
>>
>> Signed-off-by: Christopher Covington
>> ---
>> include/sysemu/os-posix.h | 1 +
>> 1 file changed, 1 insertion(+)
>>
>> diff --git a/include/sysemu/os-posix.h b/include/sysemu/os-posix.h
>> index b0a6c0695b..772d58f7ed 100644
>> --- a/include/sysemu/os-posix.h
>> +++ b/include/sysemu/os-posix.h
>> @@ -28,6 +28,7 @@
>>
>> #include
>> #include
>> +#include
>
> I repeat what I said on v1:
>
> Works for glibc; but is non-standard and not present
> on some other systems, so this may fail to build elsewhere. You'll
> probably need a configure probe. Autoconf also says that some platforms
> have instead of (per its AC_HEADER_MAJOR
> macro).

Also this seems straightforwardly like a bug in glibc: it shouldn't
be making this kind of breaking change. makedev(3) on my Linux box
says nothing about needing sysmacros.h for these.

thanks
-- PMM
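One way to picture the portability concern raised in the review is a guarded include. The sketch below is only an illustration under assumptions: it uses the compiler's `__has_include` (available in recent GCC and Clang, standardized in C23) in place of the configure probe Eric suggests, `sys/mkdev.h` is my assumption for the alternative header (it is the other location Autoconf's AC_HEADER_MAJOR looks at), and `toy_dev_major`/`toy_dev_minor` are hypothetical wrappers.

```c
#include <assert.h>
#include <sys/types.h>

/*
 * Pull in major()/minor() from wherever the platform keeps them.
 * A real project would more likely test a macro set by its configure
 * script than rely on __has_include.
 */
#if defined(__has_include)
#  if __has_include(<sys/sysmacros.h>)
#    include <sys/sysmacros.h>
#  elif __has_include(<sys/mkdev.h>)
#    include <sys/mkdev.h>
#  endif
#endif

/* Thin wrappers so callers do not depend on where the macros live. */
static unsigned int toy_dev_major(dev_t dev)
{
    return (unsigned int)major(dev);
}

static unsigned int toy_dev_minor(dev_t dev)
{
    return (unsigned int)minor(dev);
}
```

On a system where neither header exists, the macros are expected to come from sys/types.h as before, so the guarded include is a no-op there.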
Re: [Qemu-devel] [PATCH for-2.9] numa: make -numa parser dynamically allocate CPUs masks
On Fri, Nov 18, 2016 at 12:02:54PM +0100, Igor Mammedov wrote:
> so it won't impose an additional limits on max_cpus limits
> supported by different targets.
>
> It removes global MAX_CPUMASK_BITS constant and need to
> bump it up whenever max_cpus is being increased for
> a target above MAX_CPUMASK_BITS value.
>
> Use runtime max_cpus value instead to allocate sufficiently
> sized node_cpu bitmasks in numa parser.
>
> Signed-off-by: Igor Mammedov

Reviewed-by: Eduardo Habkost

As the cpu_index assignment code isn't obviously safe against setting
cpu_index > max_cpus, I would like to squash this into the patch. Is
that OK for you?

diff --git a/numa.c b/numa.c
index 1b6fa78..33f2fd4 100644
--- a/numa.c
+++ b/numa.c
@@ -401,6 +401,7 @@ void numa_post_machine_init(void)
 
     CPU_FOREACH(cpu) {
         for (i = 0; i < nb_numa_nodes; i++) {
+            assert(cpu->cpu_index < max_cpus);
             if (test_bit(cpu->cpu_index, numa_info[i].node_cpu)) {
                 cpu->numa_node = i;
             }
@@ -559,6 +560,8 @@ int numa_get_node_for_cpu(int idx)
 {
     int i;
 
+    assert(idx < max_cpus);
+
     for (i = 0; i < nb_numa_nodes; i++) {
         if (test_bit(idx, numa_info[i].node_cpu)) {
             break;

-- 
Eduardo
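The reason for the extra assertions is easier to see with the bitmaps sized at run time. The sketch below is a stand-alone model, not numa.c: `toy_bitmap_new`, `toy_set_bit`, `toy_test_bit` and `toy_node_for_cpu` are hypothetical helpers, and returning the node count when no node matches is my assumption about the convention of the function being patched.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

#define TOY_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Allocate a zeroed bitmap able to hold nbits bits. */
static unsigned long *toy_bitmap_new(size_t nbits)
{
    size_t nlongs = (nbits + TOY_BITS_PER_LONG - 1) / TOY_BITS_PER_LONG;
    return calloc(nlongs ? nlongs : 1, sizeof(unsigned long));
}

static void toy_set_bit(size_t nr, unsigned long *map)
{
    map[nr / TOY_BITS_PER_LONG] |= 1UL << (nr % TOY_BITS_PER_LONG);
}

static bool toy_test_bit(size_t nr, const unsigned long *map)
{
    return (map[nr / TOY_BITS_PER_LONG] >> (nr % TOY_BITS_PER_LONG)) & 1UL;
}

/*
 * Return the index of the node whose mask contains CPU idx, or nb_nodes
 * if none does. Each mask was allocated for exactly max_cpus bits, so an
 * index at or beyond max_cpus would read past the allocation; the
 * assertion turns that into an immediate, visible failure.
 */
static int toy_node_for_cpu(unsigned long **node_cpu, int nb_nodes,
                            int idx, int max_cpus)
{
    int i;

    assert(idx >= 0 && idx < max_cpus);

    for (i = 0; i < nb_nodes; i++) {
        if (toy_test_bit((size_t)idx, node_cpu[i])) {
            break;
        }
    }
    return i;
}
```

With a fixed-size mask an out-of-range index would at worst test a stale bit inside the array; once the mask is allocated from max_cpus, the same index is an out-of-bounds read, which is why the check is worth squashing into the patch.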
Re: [Qemu-devel] Looking for a linux-user mode test
On 28 December 2016 at 15:06, Sean Bruno wrote:
> After some recent-ish changes to how user mode executes things/stuff,
> I'm running into issues with the out of tree bsd-user mode code that
> FreeBSD has been maintaining. It looks like the host_signal_handler()
> is never executed or registered correctly in our code. I'm curious if
> the linux-user code can handle this bit of configure script from m4.
>
> https://people.freebsd.org/~sbruno/stack.c

Hmm. That code does:
 * set up a SIGSEGV signal handler to run on its own stack
 * go into an infinite recursion, expecting to run out of
   stack and trigger a SEGV
which is a bit of an obscure corner case of signal handling.

We recently fixed a lot of signal handler related bugs in linux-user
by doing a significant overhaul of that code. If bsd-user is still
using the old broken approach it's probably still got lots of bugs
in it. Alternatively, it's possible we changed some of the core
code in that process and broke bsd-user by mistake.

Ideally all of that rework (including the support for properly
interrupting syscalls without races) should be ported over to
bsd-user at some point.

> If someone has the time/inclination, can this code be compiled for ARMv6
> and executed in a linux chroot with the -strace argument applied? I see
> the following, which after much debugging seems to indicate that the
> host_signal_handler() code is never executed as this code is requesting
> that SIGSEGV be masked to its own handler.

Built for ARMv7 since I don't have an ARMv6 cross compiler
or system, but it works ok for linux (also, built with -static
rather than run in a chroot, for convenience):

e104462:xenial:qemu$ ./build/arm-linux/arm-linux-user/qemu-arm -strace ~/linaro/qemu-misc-tests/stack
29798 uname(0xf6fff1f0) = 0
29798 brk(NULL) = 0x0007f000
29798 brk(0x0007fd00) = 0x0007fd00
29798 readlink("/proc/self/exe",0xf6ffe328,4096) = 43
29798 brk(0x000a0d00) = 0x000a0d00
29798 brk(0x000a1000) = 0x000a1000
29798 access("/etc/ld.so.nohwcap",F_OK) = -1 errno=2 (No such file or directory)
29798 sigaltstack(0xf6fff2e0,(nil)) = 0
29798 rt_sigaction(SIGSEGV,0xf6fff1b0,NULL) = 0
--- SIGSEGV {si_signo=SIGSEGV, si_code=1, si_addr = 0xf67c} ---
29798 exit_group(0)

(the enhancement to linux-user's strace to print the line on signal
delivery is also a pretty new change.)

> https://people.freebsd.org/~sbruno/qemu-bsd-user-arm.txt
>
> Prior to 7e6c57e2957c7d868f74bd0d53b5e861b495e1c7 this DTRT for our
> ARMv6 targets.

This commit hash doesn't seem to be in QEMU master.

thanks
-- PMM
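For readers without access to the linked stack.c, the two steps described in this reply can be sketched as follows. This is not that file: it is a minimal reconstruction from the description, with hypothetical names (`on_segv`, `recurse`, `segv_on_altstack_demo`). Running the overflow in a forked child and lowering the child's stack limit are my additions so that the caller survives and the run time stays bounded.

```c
#define _DEFAULT_SOURCE /* sigaltstack() and SA_ONSTACK under strict -std= */
#include <assert.h>
#include <limits.h>
#include <signal.h>
#include <string.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs on the alternate stack; _exit() is async-signal-safe. */
static void on_segv(int sig)
{
    (void)sig;
    _exit(0);
}

static volatile int sink; /* keeps the recursion from being optimized away */

/* Consumes stack until the guard page is hit. */
static int recurse(unsigned int depth)
{
    volatile char pad[1024];

    if (depth == UINT_MAX) { /* never reached in practice */
        return 0;
    }
    pad[0] = (char)depth;
    sink = pad[0];
    return recurse(depth + 1) + pad[0];
}

/*
 * Returns 1 if a child process caught SIGSEGV on its alternate stack
 * after overflowing its main stack, 0 if it did not, -1 on error.
 */
static int segv_on_altstack_demo(void)
{
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }

    if (pid == 0) {
        static char altstack[64 * 1024];
        struct rlimit rl;
        stack_t ss;
        struct sigaction sa;

        /* Addition for this sketch: cap the stack so overflow comes quickly. */
        if (getrlimit(RLIMIT_STACK, &rl) == 0) {
            rl.rlim_cur = 1024 * 1024;
            setrlimit(RLIMIT_STACK, &rl);
        }

        memset(&ss, 0, sizeof(ss));
        ss.ss_sp = altstack;
        ss.ss_size = sizeof(altstack);

        memset(&sa, 0, sizeof(sa));
        sa.sa_handler = on_segv;
        sa.sa_flags = SA_ONSTACK; /* deliver on the alternate stack */
        sigemptyset(&sa.sa_mask);

        if (sigaltstack(&ss, NULL) != 0 ||
            sigaction(SIGSEGV, &sa, NULL) != 0) {
            _exit(2);
        }
        recurse(0);
        _exit(3); /* only reached if the overflow never happened */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    assert(WIFEXITED(status) || WIFSIGNALED(status));
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

The handler has to run on a separate stack because the main stack is exhausted at the moment the signal is raised; without SA_ONSTACK the kernel cannot build the signal frame and the process is simply killed. That makes this a useful probe of whether an emulator delivers the guest's SIGSEGV through its own host signal handler correctly.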
Re: [Qemu-devel] [PATCH] target/i386: Fix bad patch application to translate.c
On Sat, Dec 24, 2016 at 08:29:33PM +, Doug Evans wrote: > In commit c52ab08aee6f7d4717fc6b517174043126bd302f, > the patch snippet for the "syscall" insn got applied to "iret". > > Signed-off-by: Doug Evans Patch was corrupt, I have fixed line wrapping by hand and had to use git-am --ignore-whitespace to apply it. I suggest using git-send-email, as e-mail clients often break patch contents when copying&pasting. Fixed patch below, for reference: --- target/i386/translate.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/target/i386/translate.c b/target/i386/translate.c index 59e11fc..7adfff0 100644 --- a/target/i386/translate.c +++ b/target/i386/translate.c @@ -6435,10 +6435,7 @@ static target_ulong disas_insn(CPUX86State *env, DisasContext *s, tcg_const_i32(s->pc - s->cs_base)); set_cc_op(s, CC_OP_EFLAGS); } -/* TF handling for the syscall insn is different. The TF bit is checked - after the syscall insn completes. This allows #DB to not be - generated after one has entered CPL0 if TF is set in FMASK. */ -gen_eob_worker(s, false, true); +gen_eob(s); break; case 0xe8: /* call im */ { @@ -7119,7 +7116,10 @@ static target_ulong disas_insn(CPUX86State *env, DisasContext *s, gen_update_cc_op(s); gen_jmp_im(pc_start - s->cs_base); gen_helper_syscall(cpu_env, tcg_const_i32(s->pc - pc_start)); -gen_eob(s); +/* TF handling for the syscall insn is different. The TF bit is checked + after the syscall insn completes. This allows #DB to not be + generated after one has entered CPL0 if TF is set in FMASK. 
*/ +gen_eob_worker(s, false, true); break; case 0x107: /* sysret */ if (!s->pe) { -- 2.7.4 > --- > target/i386/translate.c | 10 +- > 1 file changed, 5 insertions(+), 5 deletions(-) > > diff --git a/target/i386/translate.c b/target/i386/translate.c > index 59e11fc..7e9d073 100644 > --- a/target/i386/translate.c > +++ b/target/i386/translate.c > @@ -6435,10 +6435,7 @@ static target_ulong disas_insn(CPUX86State *env, > DisasContext *s, >tcg_const_i32(s->pc - s->cs_base)); > set_cc_op(s, CC_OP_EFLAGS); > } > -/* TF handling for the syscall insn is different. The TF bit is > checked > - after the syscall insn completes. This allows #DB to not be > - generated after one has entered CPL0 if TF is set in FMASK. */ > -gen_eob_worker(s, false, true); > +gen_eob(s); > break; > case 0xe8: /* call im */ > { > @@ -7119,7 +7116,10 @@ static target_ulong disas_insn(CPUX86State *env, > DisasContext *s, > gen_update_cc_op(s); > gen_jmp_im(pc_start - s->cs_base); > gen_helper_syscall(cpu_env, tcg_const_i32(s->pc - pc_start)); > -gen_eob(s); > +/* TF handling for the syscall insn is different. The TF bit is > checked > + after the syscall insn completes. This allows #DB to not be > + generated after one has entered CPL0 if TF is set in FMASK. */ > +gen_eob_worker(s, false, true); > break; > case 0x107: /* sysret */ > if (!s->pe) { > -- > 2.8.0.rc3.226.g39d4020 > > -- Eduardo
[Qemu-devel] [PATCH v5 1/6] Pass generic CPUState to gen_intermediate_code()
Needed to implement a target-agnostic gen_intermediate_code() in the future. Signed-off-by: Lluís Vilanova Reviewed-by: David Gibson --- include/exec/exec-all.h |2 +- target-alpha/translate.c | 11 +-- target-arm/translate.c| 24 target-cris/translate.c | 17 - target-i386/translate.c | 13 ++--- target-lm32/translate.c | 22 +++--- target-m68k/translate.c | 15 +++ target-microblaze/translate.c | 22 +++--- target-mips/translate.c | 15 +++ target-moxie/translate.c | 14 +++--- target-openrisc/translate.c | 22 +++--- target-ppc/translate.c| 15 +++ target-s390x/translate.c | 13 ++--- target-sh4/translate.c| 15 +++ target-sparc/translate.c | 11 +-- target-tilegx/translate.c |7 +++ target-tricore/translate.c|9 - target-unicore32/translate.c | 17 - target-xtensa/translate.c | 13 ++--- translate-all.c |2 +- 20 files changed, 133 insertions(+), 146 deletions(-) diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h index a8c13cee66..0e45e1aedc 100644 --- a/include/exec/exec-all.h +++ b/include/exec/exec-all.h @@ -43,7 +43,7 @@ typedef ram_addr_t tb_page_addr_t; #include "qemu/log.h" -void gen_intermediate_code(CPUArchState *env, struct TranslationBlock *tb); +void gen_intermediate_code(CPUState *env, struct TranslationBlock *tb); void restore_state_to_opc(CPUArchState *env, struct TranslationBlock *tb, target_ulong *data); diff --git a/target-alpha/translate.c b/target-alpha/translate.c index 114927b751..6759ec28cc 100644 --- a/target-alpha/translate.c +++ b/target-alpha/translate.c @@ -2873,10 +2873,9 @@ static ExitStatus translate_one(DisasContext *ctx, uint32_t insn) return ret; } -void gen_intermediate_code(CPUAlphaState *env, struct TranslationBlock *tb) +void gen_intermediate_code(CPUState *cpu, struct TranslationBlock *tb) { -AlphaCPU *cpu = alpha_env_get_cpu(env); -CPUState *cs = CPU(cpu); +CPUAlphaState *env = cpu->env_ptr; DisasContext ctx, *ctxp = &ctx; target_ulong pc_start; target_ulong pc_mask; @@ -2891,7 +2890,7 @@ void gen_intermediate_code(CPUAlphaState 
*env, struct TranslationBlock *tb) ctx.pc = pc_start; ctx.mem_idx = cpu_mmu_index(env, false); ctx.implver = env->implver; -ctx.singlestep_enabled = cs->singlestep_enabled; +ctx.singlestep_enabled = cpu->singlestep_enabled; #ifdef CONFIG_USER_ONLY ctx.ir = cpu_std_ir; @@ -2934,7 +2933,7 @@ void gen_intermediate_code(CPUAlphaState *env, struct TranslationBlock *tb) tcg_gen_insn_start(ctx.pc); num_insns++; -if (unlikely(cpu_breakpoint_test(cs, ctx.pc, BP_ANY))) { +if (unlikely(cpu_breakpoint_test(cpu, ctx.pc, BP_ANY))) { ret = gen_excp(&ctx, EXCP_DEBUG, 0); /* The address covered by the breakpoint must be included in [tb->pc, tb->pc + tb->size) in order to for it to be @@ -2996,7 +2995,7 @@ void gen_intermediate_code(CPUAlphaState *env, struct TranslationBlock *tb) && qemu_log_in_addr_range(pc_start)) { qemu_log_lock(); qemu_log("IN: %s\n", lookup_symbol(pc_start)); -log_target_disas(cs, pc_start, ctx.pc - pc_start, 1); +log_target_disas(cpu, pc_start, ctx.pc - pc_start, 1); qemu_log("\n"); qemu_log_unlock(); } diff --git a/target-arm/translate.c b/target-arm/translate.c index 0ad9070b45..3aa766901c 100644 --- a/target-arm/translate.c +++ b/target-arm/translate.c @@ -11589,10 +11589,10 @@ static bool insn_crosses_page(CPUARMState *env, DisasContext *s) } /* generate intermediate code for basic block 'tb'. */ -void gen_intermediate_code(CPUARMState *env, TranslationBlock *tb) +void gen_intermediate_code(CPUState *cpu, TranslationBlock *tb) { -ARMCPU *cpu = arm_env_get_cpu(env); -CPUState *cs = CPU(cpu); +CPUARMState *env = cpu->env_ptr; +ARMCPU *arm_cpu = arm_env_get_cpu(env); DisasContext dc1, *dc = &dc1; target_ulong pc_start; target_ulong next_page_start; @@ -11606,7 +11606,7 @@ void gen_intermediate_code(CPUARMState *env, TranslationBlock *tb) * the A32/T32 complexity to do with conditional execution/IT blocks/etc. 
*/ if (ARM_TBFLAG_AARCH64_STATE(tb->flags)) { -gen_intermediate_code_a64(cpu, tb); +gen_intermediate_code_a64(arm_cpu, tb); return; } @@ -11616,7 +11616,7 @@ void gen_intermediate_code(CPUARMState *env, TranslationBlock *tb) dc->is_jmp = DISAS_NEXT; dc->pc = pc_start; -dc->singlestep_enabled = cs->singlestep_enabled; +dc->singlestep_enabled = cpu->singlestep_enabled; dc->condjmp = 0;
[Qemu-devel] [PATCH v5 5/6] target: [tcg, i386] Port to generic translation framework
Signed-off-by: Lluís Vilanova --- target-i386/translate.c | 304 ++- 1 file changed, 140 insertions(+), 164 deletions(-) diff --git a/target-i386/translate.c b/target-i386/translate.c index 61d73e286f..a63627b470 100644 --- a/target-i386/translate.c +++ b/target-i386/translate.c @@ -69,6 +69,10 @@ case (2 << 6) | (OP << 3) | 0 ... (2 << 6) | (OP << 3) | 7: \ case (3 << 6) | (OP << 3) | 0 ... (3 << 6) | (OP << 3) | 7 +#include "exec/translate-all_template.h" +#define DJ_JUMP (DJ_TARGET + 0) /* end of block due to call/jump */ +#define DJ_MISC (DJ_TARGET + 1) /* some other reason */ + //#define MACRO_TEST 1 /* global register indexes */ @@ -94,7 +98,10 @@ static TCGv_i64 cpu_tmp1_i64; static int x86_64_hregs; #endif + typedef struct DisasContext { +DisasContextBase base; + /* current insn context */ int override; /* -1 if no override */ int prefix; @@ -102,8 +109,6 @@ typedef struct DisasContext { TCGMemOp dflag; target_ulong pc_start; target_ulong pc; /* pc = eip + cs_base */ -int is_jmp; /* 1 = means jump (stop translation), 2 means CPU - static state change (stop translation) */ /* current block context */ target_ulong cs_base; /* base of CS segment */ int pe; /* protected mode */ @@ -124,12 +129,10 @@ typedef struct DisasContext { int cpl; int iopl; int tf; /* TF cpu flag */ -int singlestep_enabled; /* "hardware" single step enabled */ int jmp_opt; /* use direct block chaining for direct jumps */ int repz_opt; /* optimize jumps within repz instructions */ int mem_index; /* select memory access functions */ uint64_t flags; /* all execution flags */ -struct TranslationBlock *tb; int popl_esp_hack; /* for correct popl with esp base handling */ int rip_offset; /* only used in x86_64, but left for simplicity */ int cpuid_features; @@ -140,6 +143,8 @@ typedef struct DisasContext { int cpuid_xsave_features; } DisasContext; +#include "translate-all_template.h" + static void gen_eob(DisasContext *s); static void gen_jmp(DisasContext *s, target_ulong eip); static void 
gen_jmp_tb(DisasContext *s, target_ulong eip, int tb_num); @@ -1112,7 +1117,7 @@ static void gen_bpt_io(DisasContext *s, TCGv_i32 t_port, int ot) static inline void gen_ins(DisasContext *s, TCGMemOp ot) { -if (s->tb->cflags & CF_USE_ICOUNT) { +if (s->base.tb->cflags & CF_USE_ICOUNT) { gen_io_start(); } gen_string_movl_A0_EDI(s); @@ -1127,14 +1132,14 @@ static inline void gen_ins(DisasContext *s, TCGMemOp ot) gen_op_movl_T0_Dshift(ot); gen_op_add_reg_T0(s->aflag, R_EDI); gen_bpt_io(s, cpu_tmp2_i32, ot); -if (s->tb->cflags & CF_USE_ICOUNT) { +if (s->base.tb->cflags & CF_USE_ICOUNT) { gen_io_end(); } } static inline void gen_outs(DisasContext *s, TCGMemOp ot) { -if (s->tb->cflags & CF_USE_ICOUNT) { +if (s->base.tb->cflags & CF_USE_ICOUNT) { gen_io_start(); } gen_string_movl_A0_ESI(s); @@ -1147,7 +1152,7 @@ static inline void gen_outs(DisasContext *s, TCGMemOp ot) gen_op_movl_T0_Dshift(ot); gen_op_add_reg_T0(s->aflag, R_ESI); gen_bpt_io(s, cpu_tmp2_i32, ot); -if (s->tb->cflags & CF_USE_ICOUNT) { +if (s->base.tb->cflags & CF_USE_ICOUNT) { gen_io_end(); } } @@ -2130,7 +2135,7 @@ static inline int insn_const_size(TCGMemOp ot) static inline bool use_goto_tb(DisasContext *s, target_ulong pc) { #ifndef CONFIG_USER_ONLY -return (pc & TARGET_PAGE_MASK) == (s->tb->pc & TARGET_PAGE_MASK) || +return (pc & TARGET_PAGE_MASK) == (s->base.tb->pc & TARGET_PAGE_MASK) || (pc & TARGET_PAGE_MASK) == (s->pc_start & TARGET_PAGE_MASK); #else return true; @@ -2145,7 +2150,7 @@ static inline void gen_goto_tb(DisasContext *s, int tb_num, target_ulong eip) /* jump to same page: we can use a direct jump */ tcg_gen_goto_tb(tb_num); gen_jmp_im(eip); -tcg_gen_exit_tb((uintptr_t)s->tb + tb_num); +tcg_gen_exit_tb((uintptr_t)s->base.tb + tb_num); } else { /* jump to another page: currently not optimized */ gen_jmp_im(eip); @@ -2166,7 +2171,7 @@ static inline void gen_jcc(DisasContext *s, int b, gen_set_label(l1); gen_goto_tb(s, 1, val); -s->is_jmp = DISAS_TB_JUMP; +s->base.jmp_type = DJ_JUMP; } else { 
l1 = gen_new_label(); l2 = gen_new_label(); @@ -2237,11 +2242,11 @@ static void gen_movl_seg_T0(DisasContext *s, int seg_reg) stop as a special handling must be done to disable hardware interrupts for the next instruction */ if (seg_reg == R_SS || (s->code32 && seg_reg < R_FS)) -s->is_jmp = DISAS_TB_JUMP; +s->base.jmp_type = DJ_JUMP; } else { gen_op_movl_seg_T0_vm(seg_reg); if (seg_reg == R_S
[Qemu-devel] [PATCH v5 6/6] target: [tcg, arm] Port to generic translation framework
Signed-off-by: Lluís Vilanova --- target-arm/translate-a64.c | 346 ++--- target-arm/translate.c | 720 ++-- target-arm/translate.h | 42 ++- 3 files changed, 555 insertions(+), 553 deletions(-) diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c index 6dc27a6115..cd7a4282cb 100644 --- a/target-arm/translate-a64.c +++ b/target-arm/translate-a64.c @@ -296,17 +296,17 @@ static void gen_exception(int excp, uint32_t syndrome, uint32_t target_el) static void gen_exception_internal_insn(DisasContext *s, int offset, int excp) { -gen_a64_set_pc_im(s->pc - offset); +gen_a64_set_pc_im(s->base.pc_next - offset); gen_exception_internal(excp); -s->is_jmp = DISAS_EXC; +s->base.jmp_type = DJ_EXC; } static void gen_exception_insn(DisasContext *s, int offset, int excp, uint32_t syndrome, uint32_t target_el) { -gen_a64_set_pc_im(s->pc - offset); +gen_a64_set_pc_im(s->base.pc_next - offset); gen_exception(excp, syndrome, target_el); -s->is_jmp = DISAS_EXC; +s->base.jmp_type = DJ_EXC; } static void gen_ss_advance(DisasContext *s) @@ -334,7 +334,7 @@ static void gen_step_complete_exception(DisasContext *s) gen_ss_advance(s); gen_exception(EXCP_UDEF, syn_swstep(s->ss_same_el, 1, s->is_ldex), default_exception_el(s)); -s->is_jmp = DISAS_EXC; +s->base.jmp_type = DJ_EXC; } static inline bool use_goto_tb(DisasContext *s, int n, uint64_t dest) @@ -342,13 +342,14 @@ static inline bool use_goto_tb(DisasContext *s, int n, uint64_t dest) /* No direct tb linking with singlestep (either QEMU's or the ARM * debug architecture kind) or deterministic io */ -if (s->singlestep_enabled || s->ss_active || (s->tb->cflags & CF_LAST_IO)) { +if (s->base.singlestep_enabled || s->ss_active || +(s->base.tb->cflags & CF_LAST_IO)) { return false; } #ifndef CONFIG_USER_ONLY /* Only link tbs from inside the same guest page */ -if ((s->tb->pc & TARGET_PAGE_MASK) != (dest & TARGET_PAGE_MASK)) { +if ((s->base.tb->pc & TARGET_PAGE_MASK) != (dest & TARGET_PAGE_MASK)) { return false; } #endif @@ -360,21 
+361,21 @@ static inline void gen_goto_tb(DisasContext *s, int n, uint64_t dest) { TranslationBlock *tb; -tb = s->tb; +tb = s->base.tb; if (use_goto_tb(s, n, dest)) { tcg_gen_goto_tb(n); gen_a64_set_pc_im(dest); tcg_gen_exit_tb((intptr_t)tb + n); -s->is_jmp = DISAS_TB_JUMP; +s->base.jmp_type = DJ_TB_JUMP; } else { gen_a64_set_pc_im(dest); if (s->ss_active) { gen_step_complete_exception(s); -} else if (s->singlestep_enabled) { +} else if (s->base.singlestep_enabled) { gen_exception_internal(EXCP_DEBUG); } else { tcg_gen_exit_tb(0); -s->is_jmp = DISAS_TB_JUMP; +s->base.jmp_type = DJ_TB_JUMP; } } } @@ -405,11 +406,11 @@ static void unallocated_encoding(DisasContext *s) qemu_log_mask(LOG_UNIMP, \ "%s:%d: unsupported instruction encoding 0x%08x " \ "at pc=%016" PRIx64 "\n", \ - __FILE__, __LINE__, insn, s->pc - 4); \ + __FILE__, __LINE__, insn, s->base.pc_next - 4);\ unallocated_encoding(s); \ } while (0); -static void init_tmp_a64_array(DisasContext *s) +void init_tmp_a64_array(DisasContext *s) { #ifdef CONFIG_DEBUG_TCG int i; @@ -1223,11 +1224,11 @@ static inline AArch64DecodeFn *lookup_disas_fn(const AArch64DecodeTable *table, */ static void disas_uncond_b_imm(DisasContext *s, uint32_t insn) { -uint64_t addr = s->pc + sextract32(insn, 0, 26) * 4 - 4; +uint64_t addr = s->base.pc_next + sextract32(insn, 0, 26) * 4 - 4; if (insn & (1U << 31)) { /* C5.6.26 BL Branch with link */ -tcg_gen_movi_i64(cpu_reg(s, 30), s->pc); +tcg_gen_movi_i64(cpu_reg(s, 30), s->base.pc_next); } /* C5.6.20 B Branch / C5.6.26 BL Branch with link */ @@ -1250,7 +1251,7 @@ static void disas_comp_b_imm(DisasContext *s, uint32_t insn) sf = extract32(insn, 31, 1); op = extract32(insn, 24, 1); /* 0: CBZ; 1: CBNZ */ rt = extract32(insn, 0, 5); -addr = s->pc + sextract32(insn, 5, 19) * 4 - 4; +addr = s->base.pc_next + sextract32(insn, 5, 19) * 4 - 4; tcg_cmp = read_cpu_reg(s, rt, sf); label_match = gen_new_label(); @@ -1258,7 +1259,7 @@ static void disas_comp_b_imm(DisasContext *s, uint32_t insn) 
tcg_gen_brcondi_i64(op ? TCG_COND_NE : TCG_COND_EQ, tcg_cmp, 0, label_match); -gen_goto_tb(s, 0, s->pc); +gen_goto_tb(s, 0, s->base.pc_next); gen_set_labe
Re: [Qemu-devel] [RFC PATCH v4 0/6] translate: [tcg] Generic translation framework
no-reply writes: > Hi, > Your series failed automatic build test. Please find the testing commands and > their output below. If you have docker installed, you can probably reproduce > it > locally. Oh, my bad. I forgot to remove some of the "restrict" qualifiers I added in previous versions. Thanks, Lluis
[Qemu-devel] [PATCH v5 3/6] target: [tcg] Add generic translation framework
Signed-off-by: Lluís Vilanova --- include/exec/gen-icount.h |2 include/exec/translate-all_template.h | 73 include/qom/cpu.h | 22 translate-all_template.h | 204 + 4 files changed, 300 insertions(+), 1 deletion(-) create mode 100644 include/exec/translate-all_template.h create mode 100644 translate-all_template.h diff --git a/include/exec/gen-icount.h b/include/exec/gen-icount.h index 050de59b38..c91ac95ed7 100644 --- a/include/exec/gen-icount.h +++ b/include/exec/gen-icount.h @@ -45,7 +45,7 @@ static inline void gen_tb_start(TranslationBlock *tb) tcg_temp_free_i32(count); } -static void gen_tb_end(TranslationBlock *tb, int num_insns) +static inline void gen_tb_end(TranslationBlock *tb, int num_insns) { gen_set_label(exitreq_label); tcg_gen_exit_tb((uintptr_t)tb + TB_EXIT_REQUESTED); diff --git a/include/exec/translate-all_template.h b/include/exec/translate-all_template.h new file mode 100644 index 00..ea507f90c6 --- /dev/null +++ b/include/exec/translate-all_template.h @@ -0,0 +1,73 @@ +/* + * Generic intermediate code generation. + * + * Copyright (C) 2016 Lluís Vilanova + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ + +#ifndef EXEC__TRANSLATE_ALL_TEMPLATE_H +#define EXEC__TRANSLATE_ALL_TEMPLATE_H + +/* + * Include this header from a target-specific file, and add a + * + * DisasContextBase base; + * + * member in your target-specific DisasContext. + */ + + +#include "exec/exec-all.h" + + +/** + * BreakpointHitType: + * @BH_MISS: No hit + * @BH_HIT_INSN: Hit, but continue translating instruction + * @BH_HIT_TB: Hit, stop translating TB + * + * How to react to a breakpoint hit. 
+ */ +typedef enum BreakpointHitType { +BH_MISS, +BH_HIT_INSN, +BH_HIT_TB, +} BreakpointHitType; + +/** + * DisasJumpType: + * @DJ_NEXT: Next instruction in program order + * @DJ_TOO_MANY: Too many instructions executed + * @DJ_TARGET: Start of target-specific conditions + * + * What instruction to disassemble next. + */ +typedef enum DisasJumpType { +DJ_NEXT, +DJ_TOO_MANY, +DJ_TARGET, +} DisasJumpType; + +/** + * DisasContextBase: + * @tb: Translation block for this disassembly. + * @singlestep_enabled: "Hardware" single stepping enabled. + * @pc_first: Address of first guest instruction in this TB. + * @pc_next: Address of next guest instruction in this TB (current during + * disassembly). + * @num_insns: Number of translated instructions (including current). + * + * Architecture-agnostic disassembly context. + */ +typedef struct DisasContextBase { +TranslationBlock *tb; +bool singlestep_enabled; +target_ulong pc_first; +target_ulong pc_next; +DisasJumpType jmp_type; +unsigned int num_insns; +} DisasContextBase; + +#endif /* EXEC__TRANSLATE_ALL_TEMPLATE_H */ diff --git a/include/qom/cpu.h b/include/qom/cpu.h index 3f79a8e955..64a288b066 100644 --- a/include/qom/cpu.h +++ b/include/qom/cpu.h @@ -948,6 +948,28 @@ static inline bool cpu_breakpoint_test(CPUState *cpu, vaddr pc, int mask) return false; } +/* Get first breakpoint matching a PC */ +static inline CPUBreakpoint *cpu_breakpoint_get(CPUState *cpu, vaddr pc, +CPUBreakpoint *bp) +{ +if (likely(bp == NULL)) { +if (unlikely(!QTAILQ_EMPTY(&cpu->breakpoints))) { +QTAILQ_FOREACH(bp, &cpu->breakpoints, entry) { +if (bp->pc == pc) { +return bp; +} +} +} +} else { +QTAILQ_FOREACH_CONTINUE(bp, entry) { +if (bp->pc == pc) { +return bp; +} +} +} +return NULL; +} + int cpu_watchpoint_insert(CPUState *cpu, vaddr addr, vaddr len, int flags, CPUWatchpoint **watchpoint); int cpu_watchpoint_remove(CPUState *cpu, vaddr addr, diff --git a/translate-all_template.h b/translate-all_template.h new file mode 100644 index 
00..6208916d08 --- /dev/null +++ b/translate-all_template.h @@ -0,0 +1,204 @@ +/* + * Generic intermediate code generation. + * + * Copyright (C) 2016 Lluís Vilanova + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ + +#ifndef TRANSLATE_ALL_TEMPLATE_H +#define TRANSLATE_ALL_TEMPLATE_H + +/* + * Include this header from a target-specific file, which must define the + * target-specific functions declared below. + * + * These must be paired with instructions in "exec/translate-all_template.h". + */ + + +#include "cpu.h" +#include "qemu/error-report.h" + + +static void gen_intermediate_code_target_init_disas_context( +DisasContext *dc, CPUArchState *env); + +static void gen_intermediate_code_target_init_globals( +DisasContext *dc, CPUArchState *
[Qemu-devel] [PATCH v5 4/6] target: [tcg] Redefine DISAS_* onto the generic translation framework (DJ_*)
Temporarily redefine DISAS_* values based on DJ_TARGET. They should disappear as targets get ported to the generic framework. Signed-off-by: Lluís Vilanova --- include/exec/exec-all.h | 11 +++ target-arm/translate.h | 15 --- target-cris/translate.c | 3 ++- target-m68k/translate.c | 3 ++- target-s390x/translate.c | 3 ++- target-unicore32/translate.c | 3 ++- 6 files changed, 23 insertions(+), 15 deletions(-) diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h index 0e45e1aedc..169da5ebe0 100644 --- a/include/exec/exec-all.h +++ b/include/exec/exec-all.h @@ -36,10 +36,13 @@ typedef ram_addr_t tb_page_addr_t; #endif /* is_jmp field values */ -#define DISAS_NEXT 0 /* next instruction can be analyzed */ -#define DISAS_JUMP 1 /* only pc was modified dynamically */ -#define DISAS_UPDATE 2 /* cpu state was modified dynamically */ -#define DISAS_TB_JUMP 3 /* only pc was modified statically */ +/* TODO: delete after all targets are transitioned to generic translation */ +#include "exec/translate-all_template.h" +#define DISAS_NEXT DJ_NEXT /* next instruction can be analyzed */ +#define DISAS_JUMP (DJ_TARGET + 0) /* only pc was modified dynamically */ +#define DISAS_UPDATE (DJ_TARGET + 1) /* cpu state was modified dynamically */ +#define DISAS_TB_JUMP (DJ_TARGET + 2) /* only pc was modified statically */ +#define DISAS_TARGET (DJ_TARGET + 3) /* base for target-specific values */ #include "qemu/log.h" diff --git a/target-arm/translate.h b/target-arm/translate.h index 285e96f087..3dd4c4578e 100644 --- a/target-arm/translate.h +++ b/target-arm/translate.h @@ -105,21 +105,22 @@ static inline int default_exception_el(DisasContext *s) } /* target-specific extra values for is_jmp */ +/* TODO: rename as DJ_* when transitioning this target to generic translation */ /* These instructions trap after executing, so the A32/T32 decoder must * defer them until after the conditional execution state has been updated. * WFI also needs special handling when single-stepping.
*/ -#define DISAS_WFI 4 -#define DISAS_SWI 5 +#define DISAS_WFI (DISAS_TARGET + 0) +#define DISAS_SWI (DISAS_TARGET + 1) /* For instructions which unconditionally cause an exception we can skip * emitting unreachable code at the end of the TB in the A64 decoder */ -#define DISAS_EXC 6 +#define DISAS_EXC (DISAS_TARGET + 2) /* WFE */ -#define DISAS_WFE 7 -#define DISAS_HVC 8 -#define DISAS_SMC 9 -#define DISAS_YIELD 10 +#define DISAS_WFE (DISAS_TARGET + 3) +#define DISAS_HVC (DISAS_TARGET + 4) +#define DISAS_SMC (DISAS_TARGET + 5) +#define DISAS_YIELD (DISAS_TARGET + 6) #ifdef TARGET_AARCH64 void a64_translate_init(void); diff --git a/target-cris/translate.c b/target-cris/translate.c index ebcf7863bf..001714c7c1 100644 --- a/target-cris/translate.c +++ b/target-cris/translate.c @@ -50,7 +50,8 @@ #define BUG() (gen_BUG(dc, __FILE__, __LINE__)) #define BUG_ON(x) ({if (x) BUG();}) -#define DISAS_SWI 5 +/* TODO: rename as DJ_* when transitioning this target to generic translation */ +#define DISAS_SWI (DISAS_TARGET + 0) /* Used by the decoder. 
*/ #define EXTRACT_FIELD(src, start, end) \ diff --git a/target-m68k/translate.c b/target-m68k/translate.c index 6da6f2b51b..b2b0555c80 100644 --- a/target-m68k/translate.c +++ b/target-m68k/translate.c @@ -143,7 +143,8 @@ typedef struct DisasContext { int done_mac; } DisasContext; -#define DISAS_JUMP_NEXT 4 +/* TODO: rename as DJ_* when transitioning this target to generic translation */ +#define DISAS_JUMP_NEXT (DISAS_TARGET + 0) #if defined(CONFIG_USER_ONLY) #define IS_USER(s) 1 diff --git a/target-s390x/translate.c b/target-s390x/translate.c index a3992dae5a..75787e89e3 100644 --- a/target-s390x/translate.c +++ b/target-s390x/translate.c @@ -74,7 +74,8 @@ typedef struct { } u; } DisasCompare; -#define DISAS_EXCP 4 +/* TODO: rename as DJ_* when transitioning this target to generic translation */ +#define DISAS_EXCP (DISAS_TARGET + 0) #ifdef DEBUG_INLINE_BRANCHES static uint64_t inline_branch_hit[CC_OP_MAX]; diff --git a/target-unicore32/translate.c b/target-unicore32/translate.c index 39eaa76b50..de0a64e1c8 100644 --- a/target-unicore32/translate.c +++ b/target-unicore32/translate.c @@ -45,9 +45,10 @@ typedef struct DisasContext { #define IS_USER(s) 1 #endif +/* TODO: rename as DJ_* when transitioning this target to generic translation */ /* These instructions trap after executing, so defer them until after the conditional executions state has been updated. */ -#define DISAS_SYSCALL 5 +#define DISAS_SYSCALL (DISAS_TARGET + 0) static TCGv_env cpu_env; static TCGv_i32 cpu_R[32];
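The renumbering in this patch can be illustrated with a small stand-alone sketch. The enum and the DISAS_* macros below are copied from the patches in this series; the static assertions and the is_target_specific() helper are additions for illustration only and are not part of the series. The point is that the old literal values (DISAS_JUMP was 1) would collide with the generic DJ_TOO_MANY, so every legacy and target-specific value is rebased above DJ_TARGET.

```c
/* Generic values, as proposed in include/exec/translate-all_template.h. */
typedef enum DisasJumpType {
    DJ_NEXT,      /* next instruction in program order */
    DJ_TOO_MANY,  /* too many instructions executed */
    DJ_TARGET,    /* start of target-specific conditions */
} DisasJumpType;

/* Legacy names, rebased onto the generic enum as in exec-all.h. */
#define DISAS_NEXT    DJ_NEXT
#define DISAS_JUMP    (DJ_TARGET + 0)
#define DISAS_UPDATE  (DJ_TARGET + 1)
#define DISAS_TB_JUMP (DJ_TARGET + 2)
#define DISAS_TARGET  (DJ_TARGET + 3)

/* Two of the ARM-specific values, layered on DISAS_TARGET. */
#define DISAS_WFI (DISAS_TARGET + 0)
#define DISAS_SWI (DISAS_TARGET + 1)

/* Illustration only: the layers must not overlap. */
_Static_assert(DISAS_JUMP > DJ_TOO_MANY, "legacy values overlap generic ones");
_Static_assert(DISAS_WFI > DISAS_TB_JUMP, "target values overlap legacy ones");

/* Hypothetical helper: true for any value outside the generic range. */
static int is_target_specific(int jmp_type)
{
    return jmp_type >= DJ_TARGET;
}
```

With this layout a target can keep adding values at DISAS_TARGET + n without knowing how many generic values exist.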
[Qemu-devel] [PATCH v5 2/6] queue: Add macro for incremental traversal
Adds macro QTAILQ_FOREACH_CONTINUE to support incremental list traversal. Signed-off-by: Lluís Vilanova --- include/qemu/queue.h | 12 1 file changed, 12 insertions(+) diff --git a/include/qemu/queue.h b/include/qemu/queue.h index 342073fb4d..ea6130f1c9 100644 --- a/include/qemu/queue.h +++ b/include/qemu/queue.h @@ -415,6 +415,18 @@ struct { \ (var); \ (var) = ((var)->field.tqe_next)) +/** + * QTAILQ_FOREACH_CONTINUE: + * @var: Variable to resume iteration from. + * @field: Field in @var holding a QTAILQ_ENTRY for this queue. + * + * Resumes iteration on a queue from the element in @var. + */ +#define QTAILQ_FOREACH_CONTINUE(var, field) \ +for ((var) = ((var)->field.tqe_next); \ +(var); \ +(var) = ((var)->field.tqe_next)) + #define QTAILQ_FOREACH_SAFE(var, head, field, next_var) \ for ((var) = ((head)->tqh_first); \ (var) && ((next_var) = ((var)->field.tqe_next), 1); \
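As a rough sketch of how the new macro is meant to be used, the following stand-alone example resumes a search from a previously returned element, in the spirit of cpu_breakpoint_get() from patch 3/6. The macro body is the one from this patch; the Breakpoint type is a simplified stand-in that models only the tqe_next link of a real QTAILQ_ENTRY, and breakpoint_next() and breakpoint_count() are hypothetical names.

```c
#include <stddef.h>

/* Simplified stand-in for a QTAILQ element: only the forward link. */
typedef struct Breakpoint {
    unsigned long pc;
    struct { struct Breakpoint *tqe_next; } entry;
} Breakpoint;

/* Macro body as proposed in this patch. */
#define QTAILQ_FOREACH_CONTINUE(var, field)             \
    for ((var) = ((var)->field.tqe_next);               \
         (var);                                         \
         (var) = ((var)->field.tqe_next))

/*
 * Return the next breakpoint located at @pc. Pass NULL in @bp to start
 * at @first, or a previous result to resume the search after it.
 */
static Breakpoint *breakpoint_next(Breakpoint *first, unsigned long pc,
                                   Breakpoint *bp)
{
    if (bp == NULL) {
        for (bp = first; bp; bp = bp->entry.tqe_next) {
            if (bp->pc == pc) {
                return bp;
            }
        }
    } else {
        QTAILQ_FOREACH_CONTINUE(bp, entry) {
            if (bp->pc == pc) {
                return bp;
            }
        }
    }
    return NULL;
}

/* Count every breakpoint at @pc by calling breakpoint_next() repeatedly. */
static int breakpoint_count(Breakpoint *first, unsigned long pc)
{
    int n = 0;
    Breakpoint *bp = NULL;

    while ((bp = breakpoint_next(first, pc, bp)) != NULL) {
        n++;
    }
    return n;
}
```

Note that the macro advances past @var before testing it, so the element passed in is never visited again.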
[Qemu-devel] [RFC PATCH v5 0/6] translate: [tcg] Generic translation framework
This series proposes a generic (target-agnostic) instruction translation framework.

It basically provides a generic main loop for instruction disassembly, which calls target-specific functions when necessary. This generalization makes inserting new code in the main loop easier, and helps keep all targets in sync with respect to its contents.

This series also paves the way towards adding events to trace guest code execution (BBLs and instructions).

I've ported i386/x86-64 and arm/aarch64 as an example to see how it fits in the current organization, but will port the rest when this series gets merged.

Signed-off-by: Lluís Vilanova
---

Changes in v5
=============

* Remove stray uses of "restrict" keyword.

Changes in v4
=============

* Document new macro QTAILQ_FOREACH_CONTINUE [Peter Maydell].
* Fix coding style errors reported by checkpatch.
* Remove use of "restrict" in added functions; it makes older gcc versions fail with compilation errors.

Changes in v3
=============

* Rebase on 0737f32daf.

Changes in v2
=============

* Port ARM and AARCH64 targets.
* Fold single-stepping checks into "max_insns" [Richard Henderson].
* Move instruction start marks to target code [Richard Henderson].
* Add target hook for TB start.
* Check for TCG temporary leaks.
* Move instruction disassembly into a target hook.
* Make breakpoint_hit() return an enum to accommodate targets' needs (ARM).

Lluís Vilanova (6): Pass generic CPUState to gen_intermediate_code() queue: Add macro for incremental traversal target: [tcg] Add generic translation framework target: [tcg] Redefine DISAS_* onto the generic translation framework (DJ_*) target: [tcg,i386] Port to generic translation framework target: [tcg,arm] Port to generic translation framework include/exec/exec-all.h | 13 - include/exec/gen-icount.h |2 include/exec/translate-all_template.h | 73 +++ include/qemu/queue.h | 12 + include/qom/cpu.h | 22 + target-alpha/translate.c | 11 - target-arm/translate-a64.c| 346 target-arm/translate.c| 720 + target-arm/translate.h| 41 +- target-cris/translate.c | 20 - target-i386/translate.c | 305 ++ target-lm32/translate.c | 22 + target-m68k/translate.c | 18 - target-microblaze/translate.c | 22 + target-mips/translate.c | 15 - target-moxie/translate.c | 14 - target-openrisc/translate.c | 22 + target-ppc/translate.c| 15 - target-s390x/translate.c | 16 - target-sh4/translate.c| 15 - target-sparc/translate.c | 11 - target-tilegx/translate.c |7 target-tricore/translate.c|9 target-unicore32/translate.c | 20 - target-xtensa/translate.c | 13 - translate-all.c |2 translate-all_template.h | 204 + 27 files changed, 1137 insertions(+), 853 deletions(-) create mode 100644 include/exec/translate-all_template.h create mode 100644 translate-all_template.h To: qemu-devel@nongnu.org Cc: Paolo Bonzini Cc: Peter Crosthwaite Cc: Richard Henderson
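To make the "generic main loop calling target-specific functions" idea concrete, here is a heavily simplified sketch. It is not the code from translate-all_template.h (whose body is truncated in this archive): only the DisasJumpType values and the pc_first, pc_next, jmp_type and num_insns fields come from the series, while translate_block(), the disas_insn_fn hook type and the toy target are assumptions made for illustration.

```c
#include <stdint.h>

typedef enum { DJ_NEXT, DJ_TOO_MANY, DJ_TARGET } DisasJumpType;

/* Subset of the DisasContextBase fields proposed in the series. */
typedef struct DisasContextBase {
    uint64_t pc_first;       /* address of first guest instruction */
    uint64_t pc_next;        /* address of the instruction being decoded */
    DisasJumpType jmp_type;  /* why translation stopped, if it did */
    unsigned int num_insns;  /* instructions translated so far */
} DisasContextBase;

/*
 * Hypothetical target hook: decode the instruction at base->pc_next and
 * return the address of the following one. The hook may set
 * base->jmp_type to end the block.
 */
typedef uint64_t (*disas_insn_fn)(DisasContextBase *base);

/* Generic loop: the only target knowledge is behind the hook. */
static void translate_block(DisasContextBase *base, uint64_t pc,
                            unsigned int max_insns, disas_insn_fn disas_insn)
{
    base->pc_first = pc;
    base->pc_next = pc;
    base->jmp_type = DJ_NEXT;
    base->num_insns = 0;

    while (base->jmp_type == DJ_NEXT) {
        base->num_insns++;
        base->pc_next = disas_insn(base);
        if (base->jmp_type == DJ_NEXT && base->num_insns >= max_insns) {
            base->jmp_type = DJ_TOO_MANY;
        }
    }
}

/* Toy target: 4-byte instructions, with a "branch" at address 0x1008. */
static uint64_t toy_disas_insn(DisasContextBase *base)
{
    if (base->pc_next == 0x1008) {
        base->jmp_type = DJ_TARGET; /* target-specific end of block */
    }
    return base->pc_next + 4;
}
```

Anything added to translate_block(), for instance a tracing event per instruction, would then apply to every target that supplies a hook.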
Re: [Qemu-devel] [RFC PATCH v4 0/6] translate: [tcg] Generic translation framework
no-reply writes: > Hi, > Your series failed automatic build test. Please find the testing commands and > their output below. If you have docker installed, you can probably reproduce > it > locally. I did try to compile all targets and it works for me (gcc 6.2.1). I've also tried the oldest gcc I have (4.8.4) and it fails to link all programs on vanilla QEMU, but compiles all the files otherwise (including my series). Cheers, Lluis
Re: [Qemu-devel] [PATCH v5 4/7] exec: [tcg] Use different TBs according to the vCPU's dynamic tracing state
Richard Henderson writes: > On 12/28/2016 06:08 AM, Lluís Vilanova wrote: >> @@ -83,6 +85,13 @@ uint32_t tb_hash_func5(uint64_t a0, uint64_t b0, uint32_t >> e) >> h32 += e * PRIME32_3; >> h32 = rol32(h32, 17) * PRIME32_4; >> >> +if (sizeof(TRACE_QHT_VCPU_DSTATE_TYPE) == sizeof(uint32_t)) { >> +h32 += f * PRIME32_3; >> +h32 = rol32(h32, 17) * PRIME32_4; >> +} else { >> +abort(); >> +} >> + > QEMU_BUILD_BUG_ON. Right, thanks. Lluis
Re: [Qemu-devel] [PATCH v2] build: include sys/sysmacros.h for major() and minor()
On 12/28/2016 08:53 AM, Christopher Covington wrote: > The definition of the major() and minor() macros are moving within glibc to > <sys/sysmacros.h>. Include this header to avoid the following sorts of > build-stopping messages: > > The additional include allows the build to complete on Fedora 26 (Rawhide) > with glibc version 2.24.90. > > Signed-off-by: Christopher Covington > --- > include/sysemu/os-posix.h | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/include/sysemu/os-posix.h b/include/sysemu/os-posix.h > index b0a6c0695b..772d58f7ed 100644 > --- a/include/sysemu/os-posix.h > +++ b/include/sysemu/os-posix.h > @@ -28,6 +28,7 @@ > > #include > #include > +#include <sys/sysmacros.h> I repeat what I said on v1: Works for glibc; but <sys/sysmacros.h> is non-standard and not present on some other systems, so this may fail to build elsewhere. You'll probably need a configure probe. Autoconf also says that some platforms have <sys/mkdev.h> instead of <sys/sysmacros.h> (per its AC_HEADER_MAJOR macro). -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org
Re: [Qemu-devel] [PATCH v5 4/7] exec: [tcg] Use different TBs according to the vCPU's dynamic tracing state
On 12/28/2016 06:08 AM, Lluís Vilanova wrote: @@ -83,6 +85,13 @@ uint32_t tb_hash_func5(uint64_t a0, uint64_t b0, uint32_t e) h32 += e * PRIME32_3; h32 = rol32(h32, 17) * PRIME32_4; +if (sizeof(TRACE_QHT_VCPU_DSTATE_TYPE) == sizeof(uint32_t)) { +h32 += f * PRIME32_3; +h32 = rol32(h32, 17) * PRIME32_4; +} else { +abort(); +} + QEMU_BUILD_BUG_ON. r~
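The suggestion amounts to turning the run-time abort() into a build failure. A minimal sketch follows, with C11 _Static_assert standing in for QEMU's QEMU_BUILD_BUG_ON so that it compiles outside the QEMU tree. BUILD_BUG_ON, dstate_t (a stand-in for TRACE_QHT_VCPU_DSTATE_TYPE) and mix_dstate() are hypothetical names; the two mixing statements are the ones quoted from the patch, and the constants are the usual xxHash32 primes.

```c
#include <stdint.h>

/* Stand-in for QEMU_BUILD_BUG_ON(): fail the build when @cond is true. */
#define BUILD_BUG_ON(cond) _Static_assert(!(cond), #cond)

/* Hypothetical stand-in for TRACE_QHT_VCPU_DSTATE_TYPE. */
typedef uint32_t dstate_t;

#define PRIME32_3 3266489917U
#define PRIME32_4  668265263U

/* Rotate a 32-bit value left by n bits (0 < n < 32). */
static inline uint32_t rol32(uint32_t x, unsigned int n)
{
    return (x << n) | (x >> (32 - n));
}

/* Fold the tracing state into the hash; the size check costs nothing at run time. */
static uint32_t mix_dstate(uint32_t h32, dstate_t f)
{
    BUILD_BUG_ON(sizeof(dstate_t) != sizeof(uint32_t));

    h32 += f * PRIME32_3;
    h32 = rol32(h32, 17) * PRIME32_4;
    return h32;
}
```

If the type ever changes size, compilation stops at the check and no abort() path is left in the hash function.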
Re: [Qemu-devel] [RFC PATCH v4 0/6] translate: [tcg] Generic translation framework
Hi, Your series failed automatic build test. Please find the testing commands and their output below. If you have docker installed, you can probably reproduce it locally. Subject: [Qemu-devel] [RFC PATCH v4 0/6] translate: [tcg] Generic translation framework Type: series Message-id: 148293987753.31645.8166717498506500137.st...@fimbulvetr.bsc.es === TEST SCRIPT BEGIN === #!/bin/bash set -e git submodule update --init dtc # Let docker tests dump environment info export SHOW_ENV=1 export J=16 make docker-test-quick@centos6 make docker-test-mingw@fedora make docker-test-build@min-glib === TEST SCRIPT END === Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384 Switched to a new branch 'test' 379c21f target: [tcg, arm] Port to generic translation framework c955be5 target: [tcg, i386] Port to generic translation framework 02ac4cd target: [tcg] Redefine DISAS_* onto the generic translation framework (DJ_*) 9cb1c12 target: [tcg] Add generic translation framework d9d2d4d queue: Add macro for incremental traversal 8d9f6ec Pass generic CPUState to gen_intermediate_code() === OUTPUT BEGIN === Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc' Cloning into 'dtc'... 
Submodule path 'dtc': checked out '65cc4d2748a2c2e6f27f1cf39e07a5dbabd80ebf'
  BUILD   centos6
make[1]: Entering directory `/var/tmp/patchew-tester-tmp-ji3mp14i/src'
  ARCHIVE qemu.tgz
  ARCHIVE dtc.tgz
  COPY    RUNNER
    RUN test-quick in qemu:centos6
Packages installed:
SDL-devel-1.2.14-7.el6_7.1.x86_64
ccache-3.1.6-2.el6.x86_64
epel-release-6-8.noarch
gcc-4.4.7-17.el6.x86_64
git-1.7.1-4.el6_7.1.x86_64
glib2-devel-2.28.8-5.el6.x86_64
libfdt-devel-1.4.0-1.el6.x86_64
make-3.81-23.el6.x86_64
package g++ is not installed
pixman-devel-0.32.8-1.el6.x86_64
tar-1.23-15.el6_8.x86_64
zlib-devel-1.2.3-29.el6.x86_64

Environment variables:
PACKAGES=libfdt-devel ccache tar git make gcc g++ zlib-devel glib2-devel SDL-devel pixman-devel epel-release
HOSTNAME=a49476d3c1a8
TERM=xterm
MAKEFLAGS= -j16
HISTSIZE=1000
J=16
USER=root
CCACHE_DIR=/var/tmp/ccache
EXTRA_CONFIGURE_OPTS=
V=
SHOW_ENV=1
MAIL=/var/spool/mail/root
PATH=/usr/lib/ccache:/usr/lib64/ccache:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PWD=/
LANG=en_US.UTF-8
TARGET_LIST=
HISTCONTROL=ignoredups
SHLVL=1
HOME=/root
TEST_DIR=/tmp/qemu-test
LOGNAME=root
LESSOPEN=||/usr/bin/lesspipe.sh %s
FEATURES= dtc
DEBUG=
G_BROKEN_FILENAMES=1
CCACHE_HASHDIR=
_=/usr/bin/env

Configure options:
--enable-werror --target-list=x86_64-softmmu,aarch64-softmmu --prefix=/var/tmp/qemu-build/install
No C++ compiler available; disabling C++ specific optional code
Install prefix    /var/tmp/qemu-build/install
BIOS directory    /var/tmp/qemu-build/install/share/qemu
binary directory  /var/tmp/qemu-build/install/bin
library directory /var/tmp/qemu-build/install/lib
module directory  /var/tmp/qemu-build/install/lib/qemu
libexec directory /var/tmp/qemu-build/install/libexec
include directory /var/tmp/qemu-build/install/include
config directory  /var/tmp/qemu-build/install/etc
local state directory   /var/tmp/qemu-build/install/var
Manual directory  /var/tmp/qemu-build/install/share/man
ELF interp prefix /usr/gnemul/qemu-%M
Source path       /tmp/qemu-test/src
C compiler        cc
Host C compiler   cc
C++ compiler
Objective-C compiler cc
ARFLAGS           rv
CFLAGS            -O2 -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -g
QEMU_CFLAGS       -I/usr/include/pixman-1 -pthread -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include -fPIE -DPIE -m64 -mcx16 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes -Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing -fno-common -fwrapv -Wendif-labels -Wmissing-include-dirs -Wempty-body -Wnested-externs -Wformat-security -Wformat-y2k -Winit-self -Wignored-qualifiers -Wold-style-declaration -Wold-style-definition -Wtype-limits -fstack-protector-all
LDFLAGS           -Wl,--warn-common -Wl,-z,relro -Wl,-z,now -pie -m64 -g
make              make
install           install
python            python -B
smbd              /usr/sbin/smbd
module support    no
host CPU          x86_64
host big endian   no
target list       x86_64-softmmu aarch64-softmmu
tcg debug enabled no
gprof enabled     no
sparse enabled    no
strip binaries    yes
profiler          no
static build      no
pixman            system
SDL support       yes (1.2.14)
GTK support       no
GTK GL support    no
VTE support       no
TLS priority      NORMAL
GNUTLS support    no
GNUTLS rnd        no
libgcrypt         no
libgcrypt kdf     no
nettle            no
nettle kdf        no
libtasn1          no
curses support    no
virgl support     no
curl support      no
mingw32 support   no
Audio drivers     oss
Block whitelist (rw)
Block whitelist (ro)
VirtFS support    no
VNC support       yes
VNC SASL support  no
VNC JPEG support  no
VNC PNG support   no
xen support       no
brlapi support    no
bluez support     no
Doc
[Qemu-devel] [PATCH v4 5/6] target: [tcg, i386] Port to generic translation framework
Signed-off-by: Lluís Vilanova --- target-i386/translate.c | 304 ++- 1 file changed, 140 insertions(+), 164 deletions(-) diff --git a/target-i386/translate.c b/target-i386/translate.c index 61d73e286f..a63627b470 100644 --- a/target-i386/translate.c +++ b/target-i386/translate.c @@ -69,6 +69,10 @@ case (2 << 6) | (OP << 3) | 0 ... (2 << 6) | (OP << 3) | 7: \ case (3 << 6) | (OP << 3) | 0 ... (3 << 6) | (OP << 3) | 7 +#include "exec/translate-all_template.h" +#define DJ_JUMP (DJ_TARGET + 0) /* end of block due to call/jump */ +#define DJ_MISC (DJ_TARGET + 1) /* some other reason */ + //#define MACRO_TEST 1 /* global register indexes */ @@ -94,7 +98,10 @@ static TCGv_i64 cpu_tmp1_i64; static int x86_64_hregs; #endif + typedef struct DisasContext { +DisasContextBase base; + /* current insn context */ int override; /* -1 if no override */ int prefix; @@ -102,8 +109,6 @@ typedef struct DisasContext { TCGMemOp dflag; target_ulong pc_start; target_ulong pc; /* pc = eip + cs_base */ -int is_jmp; /* 1 = means jump (stop translation), 2 means CPU - static state change (stop translation) */ /* current block context */ target_ulong cs_base; /* base of CS segment */ int pe; /* protected mode */ @@ -124,12 +129,10 @@ typedef struct DisasContext { int cpl; int iopl; int tf; /* TF cpu flag */ -int singlestep_enabled; /* "hardware" single step enabled */ int jmp_opt; /* use direct block chaining for direct jumps */ int repz_opt; /* optimize jumps within repz instructions */ int mem_index; /* select memory access functions */ uint64_t flags; /* all execution flags */ -struct TranslationBlock *tb; int popl_esp_hack; /* for correct popl with esp base handling */ int rip_offset; /* only used in x86_64, but left for simplicity */ int cpuid_features; @@ -140,6 +143,8 @@ typedef struct DisasContext { int cpuid_xsave_features; } DisasContext; +#include "translate-all_template.h" + static void gen_eob(DisasContext *s); static void gen_jmp(DisasContext *s, target_ulong eip); static void 
gen_jmp_tb(DisasContext *s, target_ulong eip, int tb_num); @@ -1112,7 +1117,7 @@ static void gen_bpt_io(DisasContext *s, TCGv_i32 t_port, int ot) static inline void gen_ins(DisasContext *s, TCGMemOp ot) { -if (s->tb->cflags & CF_USE_ICOUNT) { +if (s->base.tb->cflags & CF_USE_ICOUNT) { gen_io_start(); } gen_string_movl_A0_EDI(s); @@ -1127,14 +1132,14 @@ static inline void gen_ins(DisasContext *s, TCGMemOp ot) gen_op_movl_T0_Dshift(ot); gen_op_add_reg_T0(s->aflag, R_EDI); gen_bpt_io(s, cpu_tmp2_i32, ot); -if (s->tb->cflags & CF_USE_ICOUNT) { +if (s->base.tb->cflags & CF_USE_ICOUNT) { gen_io_end(); } } static inline void gen_outs(DisasContext *s, TCGMemOp ot) { -if (s->tb->cflags & CF_USE_ICOUNT) { +if (s->base.tb->cflags & CF_USE_ICOUNT) { gen_io_start(); } gen_string_movl_A0_ESI(s); @@ -1147,7 +1152,7 @@ static inline void gen_outs(DisasContext *s, TCGMemOp ot) gen_op_movl_T0_Dshift(ot); gen_op_add_reg_T0(s->aflag, R_ESI); gen_bpt_io(s, cpu_tmp2_i32, ot); -if (s->tb->cflags & CF_USE_ICOUNT) { +if (s->base.tb->cflags & CF_USE_ICOUNT) { gen_io_end(); } } @@ -2130,7 +2135,7 @@ static inline int insn_const_size(TCGMemOp ot) static inline bool use_goto_tb(DisasContext *s, target_ulong pc) { #ifndef CONFIG_USER_ONLY -return (pc & TARGET_PAGE_MASK) == (s->tb->pc & TARGET_PAGE_MASK) || +return (pc & TARGET_PAGE_MASK) == (s->base.tb->pc & TARGET_PAGE_MASK) || (pc & TARGET_PAGE_MASK) == (s->pc_start & TARGET_PAGE_MASK); #else return true; @@ -2145,7 +2150,7 @@ static inline void gen_goto_tb(DisasContext *s, int tb_num, target_ulong eip) /* jump to same page: we can use a direct jump */ tcg_gen_goto_tb(tb_num); gen_jmp_im(eip); -tcg_gen_exit_tb((uintptr_t)s->tb + tb_num); +tcg_gen_exit_tb((uintptr_t)s->base.tb + tb_num); } else { /* jump to another page: currently not optimized */ gen_jmp_im(eip); @@ -2166,7 +2171,7 @@ static inline void gen_jcc(DisasContext *s, int b, gen_set_label(l1); gen_goto_tb(s, 1, val); -s->is_jmp = DISAS_TB_JUMP; +s->base.jmp_type = DJ_JUMP; } else { 
l1 = gen_new_label(); l2 = gen_new_label(); @@ -2237,11 +2242,11 @@ static void gen_movl_seg_T0(DisasContext *s, int seg_reg) stop as a special handling must be done to disable hardware interrupts for the next instruction */ if (seg_reg == R_SS || (s->code32 && seg_reg < R_FS)) -s->is_jmp = DISAS_TB_JUMP; +s->base.jmp_type = DJ_JUMP; } else { gen_op_movl_seg_T0_vm(seg_reg); if (seg_reg == R_S
[Qemu-devel] [PATCH v4 6/6] target: [tcg, arm] Port to generic translation framework
Signed-off-by: Lluís Vilanova --- target-arm/translate-a64.c | 346 ++--- target-arm/translate.c | 720 ++-- target-arm/translate.h | 42 ++- 3 files changed, 555 insertions(+), 553 deletions(-) diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c index 6dc27a6115..d6f5a65b5a 100644 --- a/target-arm/translate-a64.c +++ b/target-arm/translate-a64.c @@ -296,17 +296,17 @@ static void gen_exception(int excp, uint32_t syndrome, uint32_t target_el) static void gen_exception_internal_insn(DisasContext *s, int offset, int excp) { -gen_a64_set_pc_im(s->pc - offset); +gen_a64_set_pc_im(s->base.pc_next - offset); gen_exception_internal(excp); -s->is_jmp = DISAS_EXC; +s->base.jmp_type = DJ_EXC; } static void gen_exception_insn(DisasContext *s, int offset, int excp, uint32_t syndrome, uint32_t target_el) { -gen_a64_set_pc_im(s->pc - offset); +gen_a64_set_pc_im(s->base.pc_next - offset); gen_exception(excp, syndrome, target_el); -s->is_jmp = DISAS_EXC; +s->base.jmp_type = DJ_EXC; } static void gen_ss_advance(DisasContext *s) @@ -334,7 +334,7 @@ static void gen_step_complete_exception(DisasContext *s) gen_ss_advance(s); gen_exception(EXCP_UDEF, syn_swstep(s->ss_same_el, 1, s->is_ldex), default_exception_el(s)); -s->is_jmp = DISAS_EXC; +s->base.jmp_type = DJ_EXC; } static inline bool use_goto_tb(DisasContext *s, int n, uint64_t dest) @@ -342,13 +342,14 @@ static inline bool use_goto_tb(DisasContext *s, int n, uint64_t dest) /* No direct tb linking with singlestep (either QEMU's or the ARM * debug architecture kind) or deterministic io */ -if (s->singlestep_enabled || s->ss_active || (s->tb->cflags & CF_LAST_IO)) { +if (s->base.singlestep_enabled || s->ss_active || +(s->base.tb->cflags & CF_LAST_IO)) { return false; } #ifndef CONFIG_USER_ONLY /* Only link tbs from inside the same guest page */ -if ((s->tb->pc & TARGET_PAGE_MASK) != (dest & TARGET_PAGE_MASK)) { +if ((s->base.tb->pc & TARGET_PAGE_MASK) != (dest & TARGET_PAGE_MASK)) { return false; } #endif @@ -360,21 
+361,21 @@ static inline void gen_goto_tb(DisasContext *s, int n, uint64_t dest) { TranslationBlock *tb; -tb = s->tb; +tb = s->base.tb; if (use_goto_tb(s, n, dest)) { tcg_gen_goto_tb(n); gen_a64_set_pc_im(dest); tcg_gen_exit_tb((intptr_t)tb + n); -s->is_jmp = DISAS_TB_JUMP; +s->base.jmp_type = DJ_TB_JUMP; } else { gen_a64_set_pc_im(dest); if (s->ss_active) { gen_step_complete_exception(s); -} else if (s->singlestep_enabled) { +} else if (s->base.singlestep_enabled) { gen_exception_internal(EXCP_DEBUG); } else { tcg_gen_exit_tb(0); -s->is_jmp = DISAS_TB_JUMP; +s->base.jmp_type = DJ_TB_JUMP; } } } @@ -405,11 +406,11 @@ static void unallocated_encoding(DisasContext *s) qemu_log_mask(LOG_UNIMP, \ "%s:%d: unsupported instruction encoding 0x%08x " \ "at pc=%016" PRIx64 "\n", \ - __FILE__, __LINE__, insn, s->pc - 4); \ + __FILE__, __LINE__, insn, s->base.pc_next - 4);\ unallocated_encoding(s); \ } while (0); -static void init_tmp_a64_array(DisasContext *s) +void init_tmp_a64_array(DisasContext *s) { #ifdef CONFIG_DEBUG_TCG int i; @@ -1223,11 +1224,11 @@ static inline AArch64DecodeFn *lookup_disas_fn(const AArch64DecodeTable *table, */ static void disas_uncond_b_imm(DisasContext *s, uint32_t insn) { -uint64_t addr = s->pc + sextract32(insn, 0, 26) * 4 - 4; +uint64_t addr = s->base.pc_next + sextract32(insn, 0, 26) * 4 - 4; if (insn & (1U << 31)) { /* C5.6.26 BL Branch with link */ -tcg_gen_movi_i64(cpu_reg(s, 30), s->pc); +tcg_gen_movi_i64(cpu_reg(s, 30), s->base.pc_next); } /* C5.6.20 B Branch / C5.6.26 BL Branch with link */ @@ -1250,7 +1251,7 @@ static void disas_comp_b_imm(DisasContext *s, uint32_t insn) sf = extract32(insn, 31, 1); op = extract32(insn, 24, 1); /* 0: CBZ; 1: CBNZ */ rt = extract32(insn, 0, 5); -addr = s->pc + sextract32(insn, 5, 19) * 4 - 4; +addr = s->base.pc_next + sextract32(insn, 5, 19) * 4 - 4; tcg_cmp = read_cpu_reg(s, rt, sf); label_match = gen_new_label(); @@ -1258,7 +1259,7 @@ static void disas_comp_b_imm(DisasContext *s, uint32_t insn) 
tcg_gen_brcondi_i64(op ? TCG_COND_NE : TCG_COND_EQ, tcg_cmp, 0, label_match); -gen_goto_tb(s, 0, s->pc); +gen_goto_tb(s, 0, s->base.pc_next); gen_set_labe
[Qemu-devel] [PATCH v4 4/6] target: [tcg] Redefine DISAS_* onto the generic translation framework (DJ_*)
Temporarily redefine DISAS_* values based on DJ_TARGET. They should disappear as targets get ported to the generic framework. Signed-off-by: Lluís Vilanova --- include/exec/exec-all.h | 11 +++ target-arm/translate.h | 15 --- target-cris/translate.c |3 ++- target-m68k/translate.c |3 ++- target-s390x/translate.c |3 ++- target-unicore32/translate.c |3 ++- 6 files changed, 23 insertions(+), 15 deletions(-) diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h index 0e45e1aedc..169da5ebe0 100644 --- a/include/exec/exec-all.h +++ b/include/exec/exec-all.h @@ -36,10 +36,13 @@ typedef ram_addr_t tb_page_addr_t; #endif /* is_jmp field values */ -#define DISAS_NEXT0 /* next instruction can be analyzed */ -#define DISAS_JUMP1 /* only pc was modified dynamically */ -#define DISAS_UPDATE 2 /* cpu state was modified dynamically */ -#define DISAS_TB_JUMP 3 /* only pc was modified statically */ +/* TODO: delete after all targets are transitioned to generic translation */ +#include "exec/translate-all_template.h" +#define DISAS_NEXTDJ_NEXT /* next instruction can be analyzed */ +#define DISAS_JUMP(DJ_TARGET + 0) /* only pc was modified dynamically */ +#define DISAS_UPDATE (DJ_TARGET + 1) /* cpu state was modified dynamically */ +#define DISAS_TB_JUMP (DJ_TARGET + 2) /* only pc was modified statically */ +#define DISAS_TARGET (DJ_TARGET + 3) /* base for target-specific values */ #include "qemu/log.h" diff --git a/target-arm/translate.h b/target-arm/translate.h index 285e96f087..3dd4c4578e 100644 --- a/target-arm/translate.h +++ b/target-arm/translate.h @@ -105,21 +105,22 @@ static inline int default_exception_el(DisasContext *s) } /* target-specific extra values for is_jmp */ +/* TODO: rename as DJ_* when transitioning this target to generic translation */ /* These instructions trap after executing, so the A32/T32 decoder must * defer them until after the conditional execution state has been updated. * WFI also needs special handling when single-stepping. 
*/ -#define DISAS_WFI 4 -#define DISAS_SWI 5 +#define DISAS_WFI (DISAS_TARGET + 0) +#define DISAS_SWI (DISAS_TARGET + 1) /* For instructions which unconditionally cause an exception we can skip * emitting unreachable code at the end of the TB in the A64 decoder */ -#define DISAS_EXC 6 +#define DISAS_EXC (DISAS_TARGET + 2) /* WFE */ -#define DISAS_WFE 7 -#define DISAS_HVC 8 -#define DISAS_SMC 9 -#define DISAS_YIELD 10 +#define DISAS_WFE (DISAS_TARGET + 3) +#define DISAS_HVC (DISAS_TARGET + 4) +#define DISAS_SMC (DISAS_TARGET + 5) +#define DISAS_YIELD (DISAS_TARGET + 6) #ifdef TARGET_AARCH64 void a64_translate_init(void); diff --git a/target-cris/translate.c b/target-cris/translate.c index ebcf7863bf..001714c7c1 100644 --- a/target-cris/translate.c +++ b/target-cris/translate.c @@ -50,7 +50,8 @@ #define BUG() (gen_BUG(dc, __FILE__, __LINE__)) #define BUG_ON(x) ({if (x) BUG();}) -#define DISAS_SWI 5 +/* TODO: rename as DJ_* when transitioning this target to generic translation */ +#define DISAS_SWI (DISAS_TARGET + 0) /* Used by the decoder. 
*/ #define EXTRACT_FIELD(src, start, end) \ diff --git a/target-m68k/translate.c b/target-m68k/translate.c index 6da6f2b51b..b2b0555c80 100644 --- a/target-m68k/translate.c +++ b/target-m68k/translate.c @@ -143,7 +143,8 @@ typedef struct DisasContext { int done_mac; } DisasContext; -#define DISAS_JUMP_NEXT 4 +/* TODO: rename as DJ_* when transitioning this target to generic translation */ +#define DISAS_JUMP_NEXT (DISAS_TARGET + 0) #if defined(CONFIG_USER_ONLY) #define IS_USER(s) 1 diff --git a/target-s390x/translate.c b/target-s390x/translate.c index a3992dae5a..75787e89e3 100644 --- a/target-s390x/translate.c +++ b/target-s390x/translate.c @@ -74,7 +74,8 @@ typedef struct { } u; } DisasCompare; -#define DISAS_EXCP 4 +/* TODO: rename as DJ_* when transitioning this target to generic translation */ +#define DISAS_EXCP (DISAS_TARGET + 0) #ifdef DEBUG_INLINE_BRANCHES static uint64_t inline_branch_hit[CC_OP_MAX]; diff --git a/target-unicore32/translate.c b/target-unicore32/translate.c index 39eaa76b50..de0a64e1c8 100644 --- a/target-unicore32/translate.c +++ b/target-unicore32/translate.c @@ -45,9 +45,10 @@ typedef struct DisasContext { #define IS_USER(s) 1 #endif +/* TODO: rename as DJ_* when transitioning this target to generic translation */ /* These instructions trap after executing, so defer them until after the conditional executions state has been updated. */ -#define DISAS_SYSCALL 5 +#define DISAS_SYSCALL (DISAS_TARGET + 0) static TCGv_env cpu_env; static TCGv_i32 cpu_R[32];
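
For readers following along, here is a small standalone sketch of how the layered values end up numbered. The enum and the DISAS_* macros are copied from patches 3 and 4 of this series; the two helper functions are hypothetical and exist only to illustrate the split between generic and target-specific values.

```c
#include <stdbool.h>

/* Generic values, as introduced in include/exec/translate-all_template.h
 * by patch 3 of this series. */
typedef enum DisasJumpType {
    DJ_NEXT,      /* next instruction in program order */
    DJ_TOO_MANY,  /* too many instructions executed */
    DJ_TARGET,    /* start of target-specific conditions */
} DisasJumpType;

/* Legacy names, temporarily redefined on top of the generic ones. */
#define DISAS_NEXT    DJ_NEXT
#define DISAS_JUMP    (DJ_TARGET + 0)
#define DISAS_UPDATE  (DJ_TARGET + 1)
#define DISAS_TB_JUMP (DJ_TARGET + 2)
#define DISAS_TARGET  (DJ_TARGET + 3)

/* Target-specific extras (a subset of the ARM values in this patch). */
#define DISAS_WFI     (DISAS_TARGET + 0)
#define DISAS_SWI     (DISAS_TARGET + 1)
#define DISAS_EXC     (DISAS_TARGET + 2)

/* Hypothetical helper: true if the value is one of the generic DJ_* ones. */
static bool disas_is_generic(int jmp_type)
{
    return jmp_type < DJ_TARGET;
}

/* Hypothetical helper: true if the value was defined by a target. */
static bool disas_is_target_specific(int jmp_type)
{
    return jmp_type >= DISAS_TARGET;
}
```

Note that the numeric values shift under these definitions (DISAS_JUMP was 1 and becomes 2, DISAS_WFI was 4 and becomes 5), so any code comparing against literal numbers rather than the names would need attention.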
[Qemu-devel] [PATCH v4 1/6] Pass generic CPUState to gen_intermediate_code()
Needed to implement a target-agnostic gen_intermediate_code() in the future. Signed-off-by: Lluís Vilanova Reviewed-by: David Gibson --- include/exec/exec-all.h |2 +- target-alpha/translate.c | 11 +-- target-arm/translate.c| 24 target-cris/translate.c | 17 - target-i386/translate.c | 13 ++--- target-lm32/translate.c | 22 +++--- target-m68k/translate.c | 15 +++ target-microblaze/translate.c | 22 +++--- target-mips/translate.c | 15 +++ target-moxie/translate.c | 14 +++--- target-openrisc/translate.c | 22 +++--- target-ppc/translate.c| 15 +++ target-s390x/translate.c | 13 ++--- target-sh4/translate.c| 15 +++ target-sparc/translate.c | 11 +-- target-tilegx/translate.c |7 +++ target-tricore/translate.c|9 - target-unicore32/translate.c | 17 - target-xtensa/translate.c | 13 ++--- translate-all.c |2 +- 20 files changed, 133 insertions(+), 146 deletions(-) diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h index a8c13cee66..0e45e1aedc 100644 --- a/include/exec/exec-all.h +++ b/include/exec/exec-all.h @@ -43,7 +43,7 @@ typedef ram_addr_t tb_page_addr_t; #include "qemu/log.h" -void gen_intermediate_code(CPUArchState *env, struct TranslationBlock *tb); +void gen_intermediate_code(CPUState *env, struct TranslationBlock *tb); void restore_state_to_opc(CPUArchState *env, struct TranslationBlock *tb, target_ulong *data); diff --git a/target-alpha/translate.c b/target-alpha/translate.c index 114927b751..6759ec28cc 100644 --- a/target-alpha/translate.c +++ b/target-alpha/translate.c @@ -2873,10 +2873,9 @@ static ExitStatus translate_one(DisasContext *ctx, uint32_t insn) return ret; } -void gen_intermediate_code(CPUAlphaState *env, struct TranslationBlock *tb) +void gen_intermediate_code(CPUState *cpu, struct TranslationBlock *tb) { -AlphaCPU *cpu = alpha_env_get_cpu(env); -CPUState *cs = CPU(cpu); +CPUAlphaState *env = cpu->env_ptr; DisasContext ctx, *ctxp = &ctx; target_ulong pc_start; target_ulong pc_mask; @@ -2891,7 +2890,7 @@ void gen_intermediate_code(CPUAlphaState 
*env, struct TranslationBlock *tb) ctx.pc = pc_start; ctx.mem_idx = cpu_mmu_index(env, false); ctx.implver = env->implver; -ctx.singlestep_enabled = cs->singlestep_enabled; +ctx.singlestep_enabled = cpu->singlestep_enabled; #ifdef CONFIG_USER_ONLY ctx.ir = cpu_std_ir; @@ -2934,7 +2933,7 @@ void gen_intermediate_code(CPUAlphaState *env, struct TranslationBlock *tb) tcg_gen_insn_start(ctx.pc); num_insns++; -if (unlikely(cpu_breakpoint_test(cs, ctx.pc, BP_ANY))) { +if (unlikely(cpu_breakpoint_test(cpu, ctx.pc, BP_ANY))) { ret = gen_excp(&ctx, EXCP_DEBUG, 0); /* The address covered by the breakpoint must be included in [tb->pc, tb->pc + tb->size) in order to for it to be @@ -2996,7 +2995,7 @@ void gen_intermediate_code(CPUAlphaState *env, struct TranslationBlock *tb) && qemu_log_in_addr_range(pc_start)) { qemu_log_lock(); qemu_log("IN: %s\n", lookup_symbol(pc_start)); -log_target_disas(cs, pc_start, ctx.pc - pc_start, 1); +log_target_disas(cpu, pc_start, ctx.pc - pc_start, 1); qemu_log("\n"); qemu_log_unlock(); } diff --git a/target-arm/translate.c b/target-arm/translate.c index 0ad9070b45..3aa766901c 100644 --- a/target-arm/translate.c +++ b/target-arm/translate.c @@ -11589,10 +11589,10 @@ static bool insn_crosses_page(CPUARMState *env, DisasContext *s) } /* generate intermediate code for basic block 'tb'. */ -void gen_intermediate_code(CPUARMState *env, TranslationBlock *tb) +void gen_intermediate_code(CPUState *cpu, TranslationBlock *tb) { -ARMCPU *cpu = arm_env_get_cpu(env); -CPUState *cs = CPU(cpu); +CPUARMState *env = cpu->env_ptr; +ARMCPU *arm_cpu = arm_env_get_cpu(env); DisasContext dc1, *dc = &dc1; target_ulong pc_start; target_ulong next_page_start; @@ -11606,7 +11606,7 @@ void gen_intermediate_code(CPUARMState *env, TranslationBlock *tb) * the A32/T32 complexity to do with conditional execution/IT blocks/etc. 
*/ if (ARM_TBFLAG_AARCH64_STATE(tb->flags)) { -gen_intermediate_code_a64(cpu, tb); +gen_intermediate_code_a64(arm_cpu, tb); return; } @@ -11616,7 +11616,7 @@ void gen_intermediate_code(CPUARMState *env, TranslationBlock *tb) dc->is_jmp = DISAS_NEXT; dc->pc = pc_start; -dc->singlestep_enabled = cs->singlestep_enabled; +dc->singlestep_enabled = cpu->singlestep_enabled; dc->condjmp = 0;
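
As a rough illustration of the pattern this patch applies to every target, the sketch below takes the generic CPU object and derives the target state from its env_ptr member, rather than starting from the target state. All type names ending in "Sketch" are simplified stand-ins invented for this example; QEMU's real CPUState and CPUAlphaState carry many more fields.

```c
#include <stdbool.h>

/* Stand-in for a target-specific state structure (assumption). */
typedef struct CPUAlphaStateSketch {
    int implver;
} CPUAlphaStateSketch;

/* Stand-in for the generic CPU object (assumption). */
typedef struct CPUStateSketch {
    bool singlestep_enabled;
    void *env_ptr;           /* points at the target-specific state */
} CPUStateSketch;

/* Stand-in for a target's DisasContext (assumption). */
typedef struct DisasContextSketch {
    int implver;
    bool singlestep_enabled;
} DisasContextSketch;

/* New-style entry point: receives the generic object and looks up the
 * target-specific state from it. */
static void init_disas_context(DisasContextSketch *ctx, CPUStateSketch *cpu)
{
    CPUAlphaStateSketch *env = cpu->env_ptr;

    ctx->implver = env->implver;                       /* target-specific */
    ctx->singlestep_enabled = cpu->singlestep_enabled; /* generic */
}
```

With this shape the caller in translate-all.c only needs to know about the generic type, which is what a target-agnostic gen_intermediate_code() would require.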
[Qemu-devel] [PATCH v4 3/6] target: [tcg] Add generic translation framework
Signed-off-by: Lluís Vilanova --- include/exec/gen-icount.h |2 include/exec/translate-all_template.h | 73 include/qom/cpu.h | 22 translate-all_template.h | 204 + 4 files changed, 300 insertions(+), 1 deletion(-) create mode 100644 include/exec/translate-all_template.h create mode 100644 translate-all_template.h diff --git a/include/exec/gen-icount.h b/include/exec/gen-icount.h index 050de59b38..c91ac95ed7 100644 --- a/include/exec/gen-icount.h +++ b/include/exec/gen-icount.h @@ -45,7 +45,7 @@ static inline void gen_tb_start(TranslationBlock *tb) tcg_temp_free_i32(count); } -static void gen_tb_end(TranslationBlock *tb, int num_insns) +static inline void gen_tb_end(TranslationBlock *tb, int num_insns) { gen_set_label(exitreq_label); tcg_gen_exit_tb((uintptr_t)tb + TB_EXIT_REQUESTED); diff --git a/include/exec/translate-all_template.h b/include/exec/translate-all_template.h new file mode 100644 index 00..ea507f90c6 --- /dev/null +++ b/include/exec/translate-all_template.h @@ -0,0 +1,73 @@ +/* + * Generic intermediate code generation. + * + * Copyright (C) 2016 Lluís Vilanova + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ + +#ifndef EXEC__TRANSLATE_ALL_TEMPLATE_H +#define EXEC__TRANSLATE_ALL_TEMPLATE_H + +/* + * Include this header from a target-specific file, and add a + * + * DisasContextBase base; + * + * member in your target-specific DisasContext. + */ + + +#include "exec/exec-all.h" + + +/** + * BreakpointHitType: + * @BH_MISS: No hit + * @BH_HIT_INSN: Hit, but continue translating instruction + * @BH_HIT_TB: Hit, stop translating TB + * + * How to react to a breakpoint hit. 
+ */ +typedef enum BreakpointHitType { +BH_MISS, +BH_HIT_INSN, +BH_HIT_TB, +} BreakpointHitType; + +/** + * DisasJumpType: + * @DJ_NEXT: Next instruction in program order + * @DJ_TOO_MANY: Too many instructions executed + * @DJ_TARGET: Start of target-specific conditions + * + * What instruction to disassemble next. + */ +typedef enum DisasJumpType { +DJ_NEXT, +DJ_TOO_MANY, +DJ_TARGET, +} DisasJumpType; + +/** + * DisasContextBase: + * @tb: Translation block for this disassembly. + * @singlestep_enabled: "Hardware" single stepping enabled. + * @pc_first: Address of first guest instruction in this TB. + * @pc_next: Address of next guest instruction in this TB (current during + * disassembly). + * @num_insns: Number of translated instructions (including current). + * + * Architecture-agnostic disassembly context. + */ +typedef struct DisasContextBase { +TranslationBlock *tb; +bool singlestep_enabled; +target_ulong pc_first; +target_ulong pc_next; +DisasJumpType jmp_type; +unsigned int num_insns; +} DisasContextBase; + +#endif /* EXEC__TRANSLATE_ALL_TEMPLATE_H */ diff --git a/include/qom/cpu.h b/include/qom/cpu.h index 3f79a8e955..64a288b066 100644 --- a/include/qom/cpu.h +++ b/include/qom/cpu.h @@ -948,6 +948,28 @@ static inline bool cpu_breakpoint_test(CPUState *cpu, vaddr pc, int mask) return false; } +/* Get first breakpoint matching a PC */ +static inline CPUBreakpoint *cpu_breakpoint_get(CPUState *cpu, vaddr pc, +CPUBreakpoint *bp) +{ +if (likely(bp == NULL)) { +if (unlikely(!QTAILQ_EMPTY(&cpu->breakpoints))) { +QTAILQ_FOREACH(bp, &cpu->breakpoints, entry) { +if (bp->pc == pc) { +return bp; +} +} +} +} else { +QTAILQ_FOREACH_CONTINUE(bp, entry) { +if (bp->pc == pc) { +return bp; +} +} +} +return NULL; +} + int cpu_watchpoint_insert(CPUState *cpu, vaddr addr, vaddr len, int flags, CPUWatchpoint **watchpoint); int cpu_watchpoint_remove(CPUState *cpu, vaddr addr, diff --git a/translate-all_template.h b/translate-all_template.h new file mode 100644 index 
00..6208916d08 --- /dev/null +++ b/translate-all_template.h @@ -0,0 +1,204 @@ +/* + * Generic intermediate code generation. + * + * Copyright (C) 2016 Lluís Vilanova + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ + +#ifndef TRANSLATE_ALL_TEMPLATE_H +#define TRANSLATE_ALL_TEMPLATE_H + +/* + * Include this header from a target-specific file, which must define the + * target-specific functions declared below. + * + * These must be paired with instructions in "exec/translate-all_template.h". + */ + + +#include "cpu.h" +#include "qemu/error-report.h" + + +static void gen_intermediate_code_target_init_disas_context( +DisasContext *dc, CPUArchState *env); + +static void gen_intermediate_code_target_init_globals( +DisasContext *dc, CPUArchState *
[Qemu-devel] [PATCH v4 2/6] queue: Add macro for incremental traversal
Adds macro QTAILQ_FOREACH_CONTINUE to support incremental list traversal.

Signed-off-by: Lluís Vilanova
---
 include/qemu/queue.h |   12
 1 file changed, 12 insertions(+)

diff --git a/include/qemu/queue.h b/include/qemu/queue.h
index 342073fb4d..ea6130f1c9 100644
--- a/include/qemu/queue.h
+++ b/include/qemu/queue.h
@@ -415,6 +415,18 @@ struct { \
              (var); \
              (var) = ((var)->field.tqe_next))
 
+/**
+ * QTAILQ_FOREACH_CONTINUE:
+ * @var: Variable to resume iteration from.
+ * @field: Field in @var holding a QTAILQ_ENTRY for this queue.
+ *
+ * Resumes iteration on a queue from the element in @var.
+ */
+#define QTAILQ_FOREACH_CONTINUE(var, field) \
+    for ((var) = ((var)->field.tqe_next); \
+         (var); \
+         (var) = ((var)->field.tqe_next))
+
 #define QTAILQ_FOREACH_SAFE(var, head, field, next_var) \
         for ((var) = ((head)->tqh_first); \
              (var) && ((next_var) = ((var)->field.tqe_next), 1); \
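
A minimal sketch of how such a resumable traversal might be used, assuming a hand-rolled list node instead of the real include/qemu/queue.h. The macro body matches the one in the patch; struct bp and both helper functions are hypothetical. Note that, as written, the macro first advances to the successor of @var, so the element passed in is not itself visited.

```c
#include <stddef.h>

/* Minimal stand-in for an element with a QTAILQ_ENTRY: only the forward
 * link that the traversal macro touches (assumption, not QEMU's type). */
struct bp {
    unsigned long pc;
    struct { struct bp *tqe_next; } entry;
};

/* Same shape as the macro added by this patch. */
#define QTAILQ_FOREACH_CONTINUE(var, field)     \
    for ((var) = ((var)->field.tqe_next);       \
         (var);                                 \
         (var) = ((var)->field.tqe_next))

/* Hypothetical helper: return the next element after @prev whose pc
 * matches, or NULL when the list is exhausted. @prev must not be NULL. */
static struct bp *bp_next_match(struct bp *prev, unsigned long pc)
{
    struct bp *bp = prev;

    QTAILQ_FOREACH_CONTINUE(bp, entry) {
        if (bp->pc == pc) {
            return bp;
        }
    }
    return NULL;
}

/* Hypothetical helper: count the matches that follow @first, resuming
 * the walk from each hit instead of restarting at the list head. */
static int bp_count_matches_after(struct bp *first, unsigned long pc)
{
    int n = 0;
    struct bp *bp = first;

    while ((bp = bp_next_match(bp, pc)) != NULL) {
        n++;
    }
    return n;
}
```

This is the same idea cpu_breakpoint_get() relies on later in the series: a caller that already holds one matching breakpoint can ask for the next one without rescanning from the start.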
[Qemu-devel] [RFC PATCH v4 0/6] translate: [tcg] Generic translation framework
This series proposes a generic (target-agnostic) instruction translation
framework.

It provides a generic main loop for instruction disassembly, which calls
target-specific functions when necessary. This generalization makes it easier
to insert new code into the main loop, and helps keep all targets in sync as
to its contents.

This series also paves the way towards adding events to trace guest code
execution (BBLs and instructions).

I've ported i386/x86-64 and arm/aarch64 as an example to see how it fits in the
current organization, but will port the rest when this series gets merged.

Signed-off-by: Lluís Vilanova
---

Changes in v4
=============

* Document new macro QTAILQ_FOREACH_CONTINUE [Peter Maydell].
* Fix coding style errors reported by checkpatch.
* Remove use of "restrict" in added functions; it makes older gcc versions
  fail with compilation errors.


Changes in v3
=============

* Rebase on 0737f32daf.


Changes in v2
=============

* Port ARM and AARCH64 targets.
* Fold single-stepping checks into "max_insns" [Richard Henderson].
* Move instruction start marks to target code [Richard Henderson].
* Add target hook for TB start.
* Check for TCG temporary leaks.
* Move instruction disassembly into a target hook.
* Make breakpoint_hit() return an enum to accommodate target's needs (ARM).

Lluís Vilanova (6):
      Pass generic CPUState to gen_intermediate_code()
      queue: Add macro for incremental traversal
      target: [tcg] Add generic translation framework
      target: [tcg] Redefine DISAS_* onto the generic translation framework (DJ_*)
      target: [tcg,i386] Port to generic translation framework
      target: [tcg,arm] Port to generic translation framework


 include/exec/exec-all.h               |   13 -
 include/exec/gen-icount.h             |    2
 include/exec/translate-all_template.h |   73 +++
 include/qemu/queue.h                  |   12 +
 include/qom/cpu.h                     |   22 +
 target-arm/translate-a64.c            |  346
 target-alpha/translate.c              |   11 -
 target-arm/translate.c                |  720 +
 target-arm/translate.h                |   41 +-
 target-cris/translate.c               |   20 -
 target-i386/translate.c               |  305 ++
 target-lm32/translate.c               |   22 +
 target-m68k/translate.c               |   18 -
 target-microblaze/translate.c         |   22 +
 target-mips/translate.c               |   15 -
 target-moxie/translate.c              |   14 -
 target-openrisc/translate.c           |   22 +
 target-ppc/translate.c                |   15 -
 target-s390x/translate.c              |   16 -
 target-sh4/translate.c                |   15 -
 target-sparc/translate.c              |   11 -
 target-tilegx/translate.c             |    7
 target-tricore/translate.c            |    9
 target-unicore32/translate.c          |   20 -
 target-xtensa/translate.c             |   13 -
 translate-all.c                       |    2
 translate-all_template.h              |  204 +
 27 files changed, 1137 insertions(+), 853 deletions(-)
 create mode 100644 include/exec/translate-all_template.h
 create mode 100644 translate-all_template.h


To: qemu-devel@nongnu.org
Cc: Paolo Bonzini
Cc: Peter Crosthwaite
Cc: Richard Henderson
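
To make the "generic main loop that calls target-specific functions" idea concrete, here is a rough sketch under stated assumptions. The series itself uses a template header with statically declared target functions; this example swaps in a table of function pointers purely to stay self-contained. TargetOps, translate_block() and the toy_* target are hypothetical names, and DisasContextBase is cut down from the one in patch 3.

```c
#include <stddef.h>

typedef enum DisasJumpType {
    DJ_NEXT,      /* keep translating */
    DJ_TOO_MANY,  /* instruction budget exhausted */
    DJ_TARGET,    /* first target-specific value */
} DisasJumpType;

/* Cut-down version of the series' DisasContextBase. */
typedef struct DisasContextBase {
    unsigned long pc_first;
    unsigned long pc_next;
    DisasJumpType jmp_type;
    unsigned int num_insns;
} DisasContextBase;

/* Hypothetical hook table standing in for the per-target functions. */
typedef struct TargetOps {
    void (*tb_start)(DisasContextBase *db);
    /* Translate the instruction at db->pc_next; return the following pc. */
    unsigned long (*disas_insn)(DisasContextBase *db);
    void (*tb_stop)(DisasContextBase *db);
} TargetOps;

/* Generic loop: shared by all targets, stops when a hook changes
 * jmp_type or when max_insns instructions have been translated. */
static void translate_block(DisasContextBase *db, const TargetOps *ops,
                            unsigned long pc, unsigned int max_insns)
{
    db->pc_first = pc;
    db->pc_next = pc;
    db->jmp_type = DJ_NEXT;
    db->num_insns = 0;

    ops->tb_start(db);
    while (db->jmp_type == DJ_NEXT) {
        db->num_insns++;
        db->pc_next = ops->disas_insn(db);
        if (db->jmp_type == DJ_NEXT && db->num_insns >= max_insns) {
            db->jmp_type = DJ_TOO_MANY;
        }
    }
    ops->tb_stop(db);
}

/* Toy target: 4-byte instructions, with a branch at address 0x10. */
#define TOY_DJ_JUMP (DJ_TARGET + 0)

static void toy_tb_start(DisasContextBase *db) { (void)db; }
static void toy_tb_stop(DisasContextBase *db) { (void)db; }

static unsigned long toy_disas_insn(DisasContextBase *db)
{
    if (db->pc_next == 0x10) {
        db->jmp_type = TOY_DJ_JUMP;   /* end the block at the branch */
    }
    return db->pc_next + 4;
}

static const TargetOps toy_ops = {
    toy_tb_start, toy_disas_insn, toy_tb_stop
};
```

Anything added to translate_block() (tracing events, for instance) would then apply to every target that plugs into it, which is the maintenance benefit the cover letter describes.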
[Qemu-devel] [Bug 1649042] Re: Ubuntu 16.04.1 LightDM Resolution Not Correct
OK, if it works with -vga virtio, I think we should close this bug as
WONTFIX, since the -vga vmware code is pretty much unmaintained as far as I
know (if somebody is willing to fix this there, too, feel free to open this
bug again).

** Changed in: qemu
       Status: New => Won't Fix

--
You received this bug notification because you are a member of
qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1649042

Title:
  Ubuntu 16.04.1 LightDM Resolution Not Correct

Status in QEMU:
  Won't Fix

Bug description:
  My Specs:
  Slackware 14.2 x86_64 > Host
  Nvidia GPU GTX660M
  nvidia-driver-352.63
  QEMU 2.7.0
  Ubuntu 16.04.1 x86_64 > Guest
  Unity
  Xorg nouveau - 1:1.0.12-1build2

  These are the startup options for Ubuntu:

  qemu-system-x86_64 -drive format=raw,file=ubuntu.img \
  -cpu host \
  --enable-kvm \
  -smp 2 \
  -m 4096 \
  -vga vmware \
  -soundhw ac97 \
  -usbdevice tablet \
  -rtc base=localtime \
  -usbdevice host:0781:5575

  Unity desktop resolution set for 1440x900.

  I noticed when I come to the login screen to enter my password the
  LightDM resolution fills my entire desktop. I searched online and
  found this solution:

  cp ~/.config/monitor.xml /var/lib/lightdm/.config

  For now I'm assuming this step should not be needed and the resolution
  should be correctly detected and set?

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1649042/+subscriptions
Re: [Qemu-devel] [PATCH] doc/pcie: correct command line examples
On Wed, Dec 28, 2016 at 03:24:30PM +0200, Marcel Apfelbaum wrote: > On 12/27/2016 09:40 AM, Cao jin wrote: > > Nit picking: Multi-function PCI Express Root Ports should mean that > > 'addr' property is mandatory, and slot is optional because it is default > > to 0, and 'chassis' is mandatory for 2nd & 3rd root port because it is > > default to 0 too. > > > > Bonus: fix a typo(2->3) > > Signed-off-by: Cao jin > > --- > > docs/pcie.txt | 12 ++-- > > 1 file changed, 6 insertions(+), 6 deletions(-) > > > > diff --git a/docs/pcie.txt b/docs/pcie.txt > > index 9fb20aaed9f4..54f05eaa71dc 100644 > > --- a/docs/pcie.txt > > +++ b/docs/pcie.txt > > @@ -110,18 +110,18 @@ Plug only PCI Express devices into PCI Express Ports. > >-device > > ioh3420,id=root_port1,chassis=x,slot=y[,bus=pcie.0][,addr=z] \ > >-device ,bus=root_port1 > > 2.2.2 Using multi-function PCI Express Root Ports: > > - -device > > ioh3420,id=root_port1,multifunction=on,chassis=x,slot=y[,bus=pcie.0][,addr=z.0] > > \ > > - -device > > ioh3420,id=root_port2,chassis=x1,slot=y1[,bus=pcie.0][,addr=z.1] \ > > - -device > > ioh3420,id=root_port3,chassis=x2,slot=y2[,bus=pcie.0][,addr=z.2] \ > > -2.2.2 Plugging a PCI Express device into a Switch: > > + -device > > ioh3420,id=root_port1,multifunction=on,chassis=x,addr=z.0[,slot=y][,bus=pcie.0] > > \ > > + -device > > ioh3420,id=root_port2,chassis=x1,addr=z.1[,slot=y1][,bus=pcie.0] \ > > + -device > > ioh3420,id=root_port3,chassis=x2,addr=z.2[,slot=y2][,bus=pcie.0] \ > > +2.2.3 Plugging a PCI Express device into a Switch: > >-device ioh3420,id=root_port1,chassis=x,slot=y[,bus=pcie.0][,addr=z] > > \ > >-device x3130-upstream,id=upstream_port1,bus=root_port1[,addr=x] > > \ > >-device > > xio3130-downstream,id=downstream_port1,bus=upstream_port1,chassis=x1,slot=y1[,addr=z1]] > > \ > >-device ,bus=downstream_port1 > > > > Notes: > > - - (slot, chassis) pair is mandatory and must be > > - unique for each PCI Express Root Port. 
> > + - (slot, chassis) pair is mandatory and must be unique for each
> > +   PCI Express Root Port. slot is default to 0 when doesn't specify it.

Please rewrite last sentence as

 slot defaults to 0 when not specified.

> >    - 'addr' parameter can be 0 for all the examples above.
> >
> >
> >
>
> Reviewed-by: Marcel Apfelbaum
>
> Thanks,
> Marcel
>

Thanks,
drew
[Qemu-devel] Looking for a linux-user mode test
After some recent-ish changes to how user mode executes things, I'm running
into issues with the out-of-tree bsd-user mode code that FreeBSD has been
maintaining. It looks like host_signal_handler() is never executed, or is not
registered correctly, in our code.

I'm curious whether the linux-user code can handle this bit of configure
script from m4:

https://people.freebsd.org/~sbruno/stack.c

If someone has the time and inclination, could this code be compiled for
ARMv6 and executed in a Linux chroot with the -strace argument applied?

I see the following, which after much debugging seems to indicate that the
host_signal_handler() code is never executed, as this code is requesting that
SIGSEGV be masked to its own handler:

https://people.freebsd.org/~sbruno/qemu-bsd-user-arm.txt

Prior to 7e6c57e2957c7d868f74bd0d53b5e861b495e1c7 this DTRT for our ARMv6
targets.

sean
[Qemu-devel] [PATCH v2] build: include sys/sysmacros.h for major() and minor()
The definitions of the major() and minor() macros are moving within glibc to
<sys/sysmacros.h>. Include this header to avoid the following sorts of
build-stopping messages:

qga/commands-posix.c: In function ‘dev_major_minor’:
qga/commands-posix.c:656:13: error: In the GNU C Library, "major" is defined
 by <sys/sysmacros.h>. For historical compatibility, it is
 currently defined by <sys/types.h> as well, but we plan to
 remove this soon. To use "major", include <sys/sysmacros.h>
 directly. If you did not intend to use a system-defined macro
 "major", you should undefine it after including <sys/types.h>. [-Werror]
     *devmajor = major(st.st_rdev);
             ^~
qga/commands-posix.c:657:13: error: In the GNU C Library, "minor" is defined
 by <sys/sysmacros.h>. For historical compatibility, it is
 currently defined by <sys/types.h> as well, but we plan to
 remove this soon. To use "minor", include <sys/sysmacros.h>
 directly. If you did not intend to use a system-defined macro
 "minor", you should undefine it after including <sys/types.h>. [-Werror]
     *devminor = minor(st.st_rdev);
             ^~

The additional include allows the build to complete on Fedora 26 (Rawhide)
with glibc version 2.24.90.

Signed-off-by: Christopher Covington
---
 include/sysemu/os-posix.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/sysemu/os-posix.h b/include/sysemu/os-posix.h
index b0a6c0695b..772d58f7ed 100644
--- a/include/sysemu/os-posix.h
+++ b/include/sysemu/os-posix.h
@@ -28,6 +28,7 @@
 #include
 #include
+#include <sys/sysmacros.h>
 #include
 #include
 #include
--
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora
Forum, a Linux Foundation Collaborative Project.
[Qemu-devel] [PATCH v5 7/7] trace: [trivial] Statically enable all guest events
The optimizations of this series make it feasible to have all guest events
available on all builds.

Signed-off-by: Lluís Vilanova
---
 trace-events |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/trace-events b/trace-events
index f74e1d3d22..0a0f4d9cd6 100644
--- a/trace-events
+++ b/trace-events
@@ -159,7 +159,7 @@ vcpu guest_cpu_reset(void)
 #
 # Mode: user, softmmu
 # Targets: TCG(all)
-disable vcpu tcg guest_mem_before(TCGv vaddr, uint8_t info) "info=%d", "vaddr=0x%016"PRIx64" info=%d"
+vcpu tcg guest_mem_before(TCGv vaddr, uint8_t info) "info=%d", "vaddr=0x%016"PRIx64" info=%d"

 # @num: System call number.
 # @arg*: System call argument value.
@@ -168,7 +168,7 @@ disable vcpu tcg guest_mem_before(TCGv vaddr, uint8_t info) "info=%d", "vaddr=0x
 #
 # Mode: user
 # Targets: TCG(all)
-disable vcpu guest_user_syscall(uint64_t num, uint64_t arg1, uint64_t arg2, uint64_t arg3, uint64_t arg4, uint64_t arg5, uint64_t arg6, uint64_t arg7, uint64_t arg8) "num=0x%016"PRIx64" arg1=0x%016"PRIx64" arg2=0x%016"PRIx64" arg3=0x%016"PRIx64" arg4=0x%016"PRIx64" arg5=0x%016"PRIx64" arg6=0x%016"PRIx64" arg7=0x%016"PRIx64" arg8=0x%016"PRIx64
+vcpu guest_user_syscall(uint64_t num, uint64_t arg1, uint64_t arg2, uint64_t arg3, uint64_t arg4, uint64_t arg5, uint64_t arg6, uint64_t arg7, uint64_t arg8) "num=0x%016"PRIx64" arg1=0x%016"PRIx64" arg2=0x%016"PRIx64" arg3=0x%016"PRIx64" arg4=0x%016"PRIx64" arg5=0x%016"PRIx64" arg6=0x%016"PRIx64" arg7=0x%016"PRIx64" arg8=0x%016"PRIx64

 # @num: System call number.
 # @ret: System call result value.
@@ -177,4 +177,4 @@ disable vcpu guest_user_syscall(uint64_t num, uint64_t arg1, uint64_t arg2, uint
 #
 # Mode: user
 # Targets: TCG(all)
-disable vcpu guest_user_syscall_ret(uint64_t num, uint64_t ret) "num=0x%016"PRIx64" ret=0x%016"PRIx64
+vcpu guest_user_syscall_ret(uint64_t num, uint64_t ret) "num=0x%016"PRIx64" ret=0x%016"PRIx64
[Qemu-devel] [PATCH v5 4/7] exec: [tcg] Use different TBs according to the vCPU's dynamic tracing state
Every vCPU now uses a separate set of TBs for each set of dynamic tracing event state values. Each set of TBs can be used by any number of vCPUs to maximize TB reuse when vCPUs have the same tracing state. This feature is later used by tracetool to optimize tracing of guest code events. The maximum number of TB sets is defined as 2^E, where E is the number of events that have the 'vcpu' property (their state is stored in CPUState->trace_dstate). For this to work, a change on the dynamic tracing state of a vCPU will force it to flush its virtual TB cache (which is only indexed by address), and fall back to the physical TB cache (which now contains the vCPU's dynamic tracing state as part of the hashing function). Signed-off-by: Lluís Vilanova --- cpu-exec.c| 26 +- include/exec/exec-all.h |5 + include/exec/tb-hash-xx.h | 11 ++- include/exec/tb-hash.h|5 +++-- include/qemu-common.h |3 +++ tests/qht-bench.c |2 +- trace/control-target.c|3 +++ trace/control.h |3 +++ translate-all.c | 16 ++-- 9 files changed, 63 insertions(+), 11 deletions(-) diff --git a/cpu-exec.c b/cpu-exec.c index 1b7366efb0..a377505b9c 100644 --- a/cpu-exec.c +++ b/cpu-exec.c @@ -262,6 +262,7 @@ struct tb_desc { CPUArchState *env; tb_page_addr_t phys_page1; uint32_t flags; +TRACE_QHT_VCPU_DSTATE_TYPE trace_vcpu_dstate; }; static bool tb_cmp(const void *p, const void *d) @@ -273,6 +274,7 @@ static bool tb_cmp(const void *p, const void *d) tb->page_addr[0] == desc->phys_page1 && tb->cs_base == desc->cs_base && tb->flags == desc->flags && +tb->trace_vcpu_dstate == desc->trace_vcpu_dstate && !atomic_read(&tb->invalid)) { /* check next page if needed */ if (tb->page_addr[1] == -1) { @@ -294,7 +296,8 @@ static bool tb_cmp(const void *p, const void *d) static TranslationBlock *tb_htable_lookup(CPUState *cpu, target_ulong pc, target_ulong cs_base, - uint32_t flags) + uint32_t flags, + uint32_t trace_vcpu_dstate) { tb_page_addr_t phys_pc; struct tb_desc desc; @@ -303,10 +306,11 @@ static TranslationBlock 
*tb_htable_lookup(CPUState *cpu, desc.env = (CPUArchState *)cpu->env_ptr; desc.cs_base = cs_base; desc.flags = flags; +desc.trace_vcpu_dstate = trace_vcpu_dstate; desc.pc = pc; phys_pc = get_page_addr_code(desc.env, pc); desc.phys_page1 = phys_pc & TARGET_PAGE_MASK; -h = tb_hash_func(phys_pc, pc, flags); +h = tb_hash_func(phys_pc, pc, flags, trace_vcpu_dstate); return qht_lookup(&tcg_ctx.tb_ctx.htable, tb_cmp, &desc, h); } @@ -318,16 +322,24 @@ static inline TranslationBlock *tb_find(CPUState *cpu, TranslationBlock *tb; target_ulong cs_base, pc; uint32_t flags; +unsigned long trace_vcpu_dstate_bitmap; +TRACE_QHT_VCPU_DSTATE_TYPE trace_vcpu_dstate; bool have_tb_lock = false; +bitmap_copy(&trace_vcpu_dstate_bitmap, cpu->trace_dstate, +trace_get_vcpu_event_count()); +memcpy(&trace_vcpu_dstate, &trace_vcpu_dstate_bitmap, + sizeof(trace_vcpu_dstate)); + /* we record a subset of the CPU state. It will always be the same before a given translated block is executed. */ cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags); tb = atomic_rcu_read(&cpu->tb_jmp_cache[tb_jmp_cache_hash_func(pc)]); if (unlikely(!tb || tb->pc != pc || tb->cs_base != cs_base || - tb->flags != flags)) { -tb = tb_htable_lookup(cpu, pc, cs_base, flags); + tb->flags != flags || + tb->trace_vcpu_dstate != trace_vcpu_dstate)) { +tb = tb_htable_lookup(cpu, pc, cs_base, flags, trace_vcpu_dstate); if (!tb) { /* mmap_lock is needed by tb_gen_code, and mmap_lock must be @@ -341,7 +353,7 @@ static inline TranslationBlock *tb_find(CPUState *cpu, /* There's a chance that our desired tb has been translated while * taking the locks so we check again inside the lock. 
*/ -tb = tb_htable_lookup(cpu, pc, cs_base, flags); +tb = tb_htable_lookup(cpu, pc, cs_base, flags, trace_vcpu_dstate); if (!tb) { /* if no translated code available, then translate it now */ tb = tb_gen_code(cpu, pc, cs_base, flags, 0); @@ -465,6 +477,7 @@ static inline bool cpu_handle_exception(CPUState *cpu, int *ret) if (unlikely(atomic_read(&cpu->trace_dstate_delayed_req))) { bitmap_copy(cpu->trace_dstate, cpu->trace_dstate_delayed, trace_get_vcpu_event_count()); +tb_flush_jmp_cache_all(
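The lookup scheme described in this patch can be modelled outside QEMU. What follows is a minimal Python sketch, not QEMU code: the TBCache and VCPU classes, their dictionary-based tables and the translate callback are hypothetical stand-ins for tb_jmp_cache, the shared hash table and tb_gen_code(). It only illustrates two points from the commit message, under those assumptions: adding the per-vCPU tracing bitmap to the lookup key lets vCPUs with the same tracing state share a TB, and a state change only needs a flush of the address-indexed per-vCPU cache.

```python
# Illustrative model only; every name below is hypothetical, not a QEMU API.

class VCPU:
    def __init__(self):
        self.trace_dstate = 0   # bitmask, one bit per event with 'vcpu'
        self.jmp_cache = {}     # per-vCPU cache, indexed by address only

    def set_event(self, bit, enabled):
        if enabled:
            self.trace_dstate |= 1 << bit
        else:
            self.trace_dstate &= ~(1 << bit)
        # The address-indexed cache may hold TBs built for the old state,
        # so it is flushed, as the commit message describes.
        self.jmp_cache.clear()


class TBCache:
    """Toy shared TB table keyed by (pc, flags, trace_dstate)."""

    def __init__(self, translate):
        self.translate = translate  # stand-in for tb_gen_code()
        self.shared = {}            # stand-in for the shared hash table
        self.translations = 0

    def lookup(self, vcpu, pc, flags):
        key = (pc, flags, vcpu.trace_dstate)
        # Fast path: the per-vCPU cache is indexed by pc alone, so the
        # remaining key fields are compared explicitly.
        hit = vcpu.jmp_cache.get(pc)
        if hit is not None and hit["key"] == key:
            return hit
        # Slow path: shared table whose key includes the tracing state.
        tb = self.shared.get(key)
        if tb is None:
            tb = {"key": key, "code": self.translate(pc, vcpu.trace_dstate)}
            self.shared[key] = tb
            self.translations += 1
        vcpu.jmp_cache[pc] = tb
        return tb


def max_tb_sets(num_vcpu_events):
    """Upper bound on distinct TB sets: 2^E for E events with 'vcpu'."""
    return 1 << num_vcpu_events
```

In this model two vCPUs with identical trace_dstate get the same TB object back, and re-enabling a previously used state finds the older TB again without a new translation.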
[Qemu-devel] [PATCH v5 5/7] trace: [tcg] Do not generate TCG code to trace dinamically-disabled events
If an event is dynamically disabled, the TCG code that calls the execution-time tracer is not generated. Removes the overheads of execution-time tracers for dynamically disabled events. As a bonus, also avoids checking the event state when the execution-time tracer is called from TCG-generated code (since otherwise TCG would simply not call it). Signed-off-by: Lluís Vilanova --- scripts/tracetool/__init__.py|1 + scripts/tracetool/format/h.py| 24 ++-- scripts/tracetool/format/tcg_h.py| 19 --- scripts/tracetool/format/tcg_helper_c.py |3 ++- 4 files changed, 37 insertions(+), 10 deletions(-) diff --git a/scripts/tracetool/__init__.py b/scripts/tracetool/__init__.py index 365446fa53..63168ccdf0 100644 --- a/scripts/tracetool/__init__.py +++ b/scripts/tracetool/__init__.py @@ -264,6 +264,7 @@ class Event(object): return self._FMT.findall(self.fmt) QEMU_TRACE = "trace_%(name)s" +QEMU_TRACE_NOCHECK = "_nocheck__" + QEMU_TRACE QEMU_TRACE_TCG = QEMU_TRACE + "_tcg" QEMU_DSTATE = "_TRACE_%(NAME)s_DSTATE" QEMU_EVENT = "_TRACE_%(NAME)s_EVENT" diff --git a/scripts/tracetool/format/h.py b/scripts/tracetool/format/h.py index 3682f4e6a8..a78e50ef35 100644 --- a/scripts/tracetool/format/h.py +++ b/scripts/tracetool/format/h.py @@ -49,6 +49,19 @@ def generate(events, backend, group): backend.generate_begin(events, group) for e in events: +# tracer without checks +out('', +'static inline void %(api)s(%(args)s)', +'{', +api=e.api(e.QEMU_TRACE_NOCHECK), +args=e.args) + +if "disable" not in e.properties: +backend.generate(e, group) + +out('}') + +# tracer wrapper with checks (per-vCPU tracing) if "vcpu" in e.properties: trace_cpu = next(iter(e.args))[1] cond = "trace_event_get_vcpu_state(%(cpu)s,"\ @@ -63,16 +76,15 @@ def generate(events, backend, group): 'static inline void %(api)s(%(args)s)', '{', 'if (%(cond)s) {', +'%(api_nocheck)s(%(names)s);', +'}', +'}', api=e.api(), +api_nocheck=e.api(e.QEMU_TRACE_NOCHECK), args=e.args, +names=", ".join(e.args.names()), cond=cond) -if "disable" 
not in e.properties: -backend.generate(e, group) - -out('}', -'}') - backend.generate_end(events, group) out('#endif /* TRACE_%s_GENERATED_TRACERS_H */' % group.upper()) diff --git a/scripts/tracetool/format/tcg_h.py b/scripts/tracetool/format/tcg_h.py index 5f213f6cba..71b5c09432 100644 --- a/scripts/tracetool/format/tcg_h.py +++ b/scripts/tracetool/format/tcg_h.py @@ -41,7 +41,7 @@ def generate(events, backend, group): for e in events: # just keep one of them -if "tcg-trans" not in e.properties: +if "tcg-exec" not in e.properties: continue out('static inline void %(name_tcg)s(%(args)s)', @@ -53,12 +53,25 @@ def generate(events, backend, group): args_trans = e.original.event_trans.args args_exec = tracetool.vcpu.transform_args( "tcg_helper_c", e.original.event_exec, "wrapper") +if "vcpu" in e.properties: +trace_cpu = e.args.names()[0] +cond = "trace_event_get_vcpu_state(%(cpu)s,"\ + " TRACE_%(id)s)"\ + % dict( + cpu=trace_cpu, + id=e.original.event_exec.name.upper()) +else: +cond = "true" + out('%(name_trans)s(%(argnames_trans)s);', -'gen_helper_%(name_exec)s(%(argnames_exec)s);', +'if (%(cond)s) {', +'gen_helper_%(name_exec)s(%(argnames_exec)s);', +'}', name_trans=e.original.event_trans.api(e.QEMU_TRACE), name_exec=e.original.event_exec.api(e.QEMU_TRACE), argnames_trans=", ".join(args_trans.names()), -argnames_exec=", ".join(args_exec.names())) +argnames_exec=", ".join(args_exec.names()), +cond=cond) out('}') diff --git a/scripts/tracetool/format/tcg_helper_c.py b/scripts/tracetool/format/tcg_helper_c.py index cc26e03008..c2a05d756c 100644 --- a/scripts/tracetool/format/tcg_helper_c.py +++ b/scripts/tracetool/format/tcg_helper_c.py @@ -66,10 +66,11 @@ def generate(events, backend, group): out('void %(name_tcg)s(%(args_api)s)', '{', +# NOTE: the check was already performed at TCG-generation time '%(name)s(%(args_call)s);', '}', name_tcg="helper_%s_proxy" % e.api(), -
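The shape of the header code that format/h.py emits after this patch can be sketched as follows: an unchecked tracer named with the "_nocheck__" prefix, plus a checked wrapper that calls it. This is a standalone illustration, not tracetool itself; render_tracers() and its parameters are hypothetical, and the use of "true" as the condition for events without the 'vcpu' property is an assumption carried over from the tcg_h.py hunk above.

```python
# Illustrative sketch; render_tracers() is a hypothetical helper and is not
# part of scripts/tracetool.

def render_tracers(name, args, vcpu=False, backend_body=""):
    """Return C source for an unchecked tracer plus a checked wrapper.

    args is a list of (c_type, arg_name) pairs; for per-vCPU events the
    first argument is taken to be the CPU.
    """
    decl = ", ".join("%s %s" % (t, n) for t, n in args) or "void"
    names = ", ".join(n for _, n in args)
    nocheck = "_nocheck__trace_" + name
    if vcpu:
        cond = "trace_event_get_vcpu_state(%s, TRACE_%s)" % (
            args[0][1], name.upper())
    else:
        # Assumption: events without 'vcpu' use an always-true condition.
        cond = "true"
    lines = [
        "static inline void %s(%s)" % (nocheck, decl),
        "{",
    ]
    if backend_body:
        lines.append("    " + backend_body)
    lines += [
        "}",
        "",
        "static inline void trace_%s(%s)" % (name, decl),
        "{",
        "    if (%s) {" % cond,
        "        %s(%s);" % (nocheck, names),
        "    }",
        "}",
    ]
    return "\n".join(lines)
```

Splitting the tracer this way is what lets TCG-generated helpers call the unchecked variant directly once the state check has been done at translation time.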
[Qemu-devel] [PATCH v5 2/7] trace: Make trace_get_vcpu_event_count() inlinable
Later patches will make use of it. Signed-off-by: Lluís Vilanova --- trace/control-internal.h |5 + trace/control.c |9 ++--- trace/control.h |2 +- 3 files changed, 8 insertions(+), 8 deletions(-) diff --git a/trace/control-internal.h b/trace/control-internal.h index a9d395a587..beb98a0d2c 100644 --- a/trace/control-internal.h +++ b/trace/control-internal.h @@ -16,6 +16,7 @@ extern int trace_events_enabled_count; +extern uint32_t trace_next_vcpu_id; static inline bool trace_event_is_pattern(const char *str) @@ -82,6 +83,10 @@ static inline bool trace_event_get_vcpu_state_dynamic(CPUState *vcpu, return trace_event_get_vcpu_state_dynamic_by_vcpu_id(vcpu, vcpu_id); } +static inline uint32_t trace_get_vcpu_event_count(void) +{ +return trace_next_vcpu_id; +} void trace_event_register_group(TraceEvent **events); diff --git a/trace/control.c b/trace/control.c index 1a7bee6ddc..52d0e343fa 100644 --- a/trace/control.c +++ b/trace/control.c @@ -36,7 +36,7 @@ typedef struct TraceEventGroup { static TraceEventGroup *event_groups; static size_t nevent_groups; static uint32_t next_id; -static uint32_t next_vcpu_id; +uint32_t trace_next_vcpu_id; QemuOptsList qemu_trace_opts = { .name = "trace", @@ -65,7 +65,7 @@ void trace_event_register_group(TraceEvent **events) for (i = 0; events[i] != NULL; i++) { events[i]->id = next_id++; if (events[i]->vcpu_id != TRACE_VCPU_EVENT_NONE) { -events[i]->vcpu_id = next_vcpu_id++; +events[i]->vcpu_id = trace_next_vcpu_id++; } } event_groups = g_renew(TraceEventGroup, event_groups, nevent_groups + 1); @@ -299,8 +299,3 @@ char *trace_opt_parse(const char *optarg) return trace_file; } - -uint32_t trace_get_vcpu_event_count(void) -{ -return next_vcpu_id; -} diff --git a/trace/control.h b/trace/control.h index ccaeac8552..80d326c4d1 100644 --- a/trace/control.h +++ b/trace/control.h @@ -237,7 +237,7 @@ char *trace_opt_parse(const char *optarg); * * Return the number of known vcpu-specific events */ -uint32_t trace_get_vcpu_event_count(void); +static 
uint32_t trace_get_vcpu_event_count(void); #include "trace/control-internal.h"
[Qemu-devel] [PATCH v5 6/7] trace: [tcg, trivial] Re-align generated code
Last patch removed a nesting level in generated code. Re-align all code generated by backends to be 4-column aligned. Signed-off-by: Lluís Vilanova --- scripts/tracetool/backend/dtrace.py |2 +- scripts/tracetool/backend/ftrace.py | 20 ++-- scripts/tracetool/backend/log.py| 17 + scripts/tracetool/backend/simple.py |2 +- scripts/tracetool/backend/syslog.py |6 +++--- scripts/tracetool/backend/ust.py|2 +- 6 files changed, 25 insertions(+), 24 deletions(-) diff --git a/scripts/tracetool/backend/dtrace.py b/scripts/tracetool/backend/dtrace.py index 79505c6b1a..b3a8645bf0 100644 --- a/scripts/tracetool/backend/dtrace.py +++ b/scripts/tracetool/backend/dtrace.py @@ -41,6 +41,6 @@ def generate_h_begin(events, group): def generate_h(event, group): -out('QEMU_%(uppername)s(%(argnames)s);', +out('QEMU_%(uppername)s(%(argnames)s);', uppername=event.name.upper(), argnames=", ".join(event.args.names())) diff --git a/scripts/tracetool/backend/ftrace.py b/scripts/tracetool/backend/ftrace.py index db9fe7ad57..dd0eda4441 100644 --- a/scripts/tracetool/backend/ftrace.py +++ b/scripts/tracetool/backend/ftrace.py @@ -29,17 +29,17 @@ def generate_h(event, group): if len(event.args) > 0: argnames = ", " + argnames -out('{', -'char ftrace_buf[MAX_TRACE_STRLEN];', -'int unused __attribute__ ((unused));', -'int trlen;', -'if (trace_event_get_state(%(event_id)s)) {', -'trlen = snprintf(ftrace_buf, MAX_TRACE_STRLEN,', -' "%(name)s " %(fmt)s "\\n" %(argnames)s);', -'trlen = MIN(trlen, MAX_TRACE_STRLEN - 1);', -'unused = write(trace_marker_fd, ftrace_buf, trlen);', -'}', +out('{', +'char ftrace_buf[MAX_TRACE_STRLEN];', +'int unused __attribute__ ((unused));', +'int trlen;', +'if (trace_event_get_state(%(event_id)s)) {', +'trlen = snprintf(ftrace_buf, MAX_TRACE_STRLEN,', +' "%(name)s " %(fmt)s "\\n" %(argnames)s);', +'trlen = MIN(trlen, MAX_TRACE_STRLEN - 1);', +'unused = write(trace_marker_fd, ftrace_buf, trlen);', '}', +'}', name=event.name, args=event.args, event_id="TRACE_" + 
event.name.upper(), diff --git a/scripts/tracetool/backend/log.py b/scripts/tracetool/backend/log.py index 4f4a4d38b1..7d2c3abe75 100644 --- a/scripts/tracetool/backend/log.py +++ b/scripts/tracetool/backend/log.py @@ -35,14 +35,15 @@ def generate_h(event, group): else: cond = "trace_event_get_state(%s)" % ("TRACE_" + event.name.upper()) -out('if (%(cond)s) {', -'struct timeval _now;', -'gettimeofday(&_now, NULL);', -'qemu_log_mask(LOG_TRACE, "%%d@%%zd.%%06zd:%(name)s " %(fmt)s "\\n",', -' getpid(),', -' (size_t)_now.tv_sec, (size_t)_now.tv_usec', -' %(argnames)s);', -'}', +out('if (%(cond)s) {', +'struct timeval _now;', +'gettimeofday(&_now, NULL);', +'qemu_log_mask(LOG_TRACE,', +' "%%d@%%zd.%%06zd:%(name)s " %(fmt)s "\\n",', +' getpid(),', +' (size_t)_now.tv_sec, (size_t)_now.tv_usec', +' %(argnames)s);', +'}', cond=cond, name=event.name, fmt=event.fmt.rstrip("\n"), diff --git a/scripts/tracetool/backend/simple.py b/scripts/tracetool/backend/simple.py index 85f61028e2..a28460b1e4 100644 --- a/scripts/tracetool/backend/simple.py +++ b/scripts/tracetool/backend/simple.py @@ -37,7 +37,7 @@ def generate_h_begin(events, group): def generate_h(event, group): -out('_simple_%(api)s(%(args)s);', +out('_simple_%(api)s(%(args)s);', api=event.api(), args=", ".join(event.args.names())) diff --git a/scripts/tracetool/backend/syslog.py b/scripts/tracetool/backend/syslog.py index b8ff2790c4..1ce627f0fc 100644 --- a/scripts/tracetool/backend/syslog.py +++ b/scripts/tracetool/backend/syslog.py @@ -35,9 +35,9 @@ def generate_h(event, group): else: cond = "trace_event_get_state(%s)" % ("TRACE_" + event.name.upper()) -out('if (%(cond)s) {', -'syslog(LOG_INFO, "%(name)s " %(fmt)s %(argnames)s);', -'}', +out('if (%(cond)s) {', +'syslog(LOG_INFO, "%(name)s " %(fmt)s %(argnames)s);', +'}', cond=cond, name=event.name, fmt=event.fmt.rstrip
[Qemu-devel] [PATCH v5 1/7] exec: [tcg] Refactor flush of per-CPU virtual TB cache
The function is reused in later patches. Signed-off-by: Lluís Vilanova --- cputlb.c|2 +- include/exec/exec-all.h |6 ++ translate-all.c | 14 +- 3 files changed, 16 insertions(+), 6 deletions(-) diff --git a/cputlb.c b/cputlb.c index 813279f3bc..9bf9960e1b 100644 --- a/cputlb.c +++ b/cputlb.c @@ -80,7 +80,7 @@ void tlb_flush(CPUState *cpu, int flush_global) memset(env->tlb_table, -1, sizeof(env->tlb_table)); memset(env->tlb_v_table, -1, sizeof(env->tlb_v_table)); -memset(cpu->tb_jmp_cache, 0, sizeof(cpu->tb_jmp_cache)); +tb_flush_jmp_cache_all(cpu); env->vtlb_index = 0; env->tlb_flush_addr = -1; diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h index a8c13cee66..57cd978578 100644 --- a/include/exec/exec-all.h +++ b/include/exec/exec-all.h @@ -256,6 +256,12 @@ struct TranslationBlock { }; void tb_free(TranslationBlock *tb); +/** + * tb_flush_jmp_cache_all: + * + * Flush the virtual translation block cache. + */ +void tb_flush_jmp_cache_all(CPUState *env); void tb_flush(CPUState *cpu); void tb_phys_invalidate(TranslationBlock *tb, tb_page_addr_t page_addr); diff --git a/translate-all.c b/translate-all.c index 3dd9214904..29ccb9e546 100644 --- a/translate-all.c +++ b/translate-all.c @@ -941,11 +941,7 @@ static void do_tb_flush(CPUState *cpu, run_on_cpu_data tb_flush_count) } CPU_FOREACH(cpu) { -int i; - -for (i = 0; i < TB_JMP_CACHE_SIZE; ++i) { -atomic_set(&cpu->tb_jmp_cache[i], NULL); -} +tb_flush_jmp_cache_all(cpu); } tcg_ctx.tb_ctx.nb_tbs = 0; @@ -1741,6 +1737,14 @@ void tb_check_watchpoint(CPUState *cpu) } } +void tb_flush_jmp_cache_all(CPUState *cpu) +{ +int i; +for (i = 0; i < TB_JMP_CACHE_SIZE; ++i) { +atomic_set(&cpu->tb_jmp_cache[i], NULL); +} +} + #ifndef CONFIG_USER_ONLY /* in deterministic execution mode, instructions doing device I/Os must be at the end of the TB */
[Qemu-devel] [PATCH v5 3/7] trace: [tcg] Delay changes to dynamic state when translating
This keeps consistency across all decisions taken during translation when the dynamic state of a vCPU is changed in the middle of translating some guest code. Signed-off-by: Lluís Vilanova --- cpu-exec.c | 26 ++ include/qom/cpu.h |7 +++ qom/cpu.c |4 trace/control-target.c | 11 +-- 4 files changed, 46 insertions(+), 2 deletions(-) diff --git a/cpu-exec.c b/cpu-exec.c index 4188fed3c6..1b7366efb0 100644 --- a/cpu-exec.c +++ b/cpu-exec.c @@ -33,6 +33,7 @@ #include "hw/i386/apic.h" #endif #include "sysemu/replay.h" +#include "trace/control.h" /* -icount align implementation. */ @@ -451,9 +452,21 @@ static inline bool cpu_handle_exception(CPUState *cpu, int *ret) #ifndef CONFIG_USER_ONLY } else if (replay_has_exception() && cpu->icount_decr.u16.low + cpu->icount_extra == 0) { +/* delay changes to this vCPU's dstate during translation */ +atomic_set(&cpu->trace_dstate_delayed_req, false); +atomic_set(&cpu->trace_dstate_must_delay, true); + /* try to cause an exception pending in the log */ cpu_exec_nocache(cpu, 1, tb_find(cpu, NULL, 0), true); *ret = -1; + +/* apply and disable delayed dstate changes */ +atomic_set(&cpu->trace_dstate_must_delay, false); +if (unlikely(atomic_read(&cpu->trace_dstate_delayed_req))) { +bitmap_copy(cpu->trace_dstate, cpu->trace_dstate_delayed, +trace_get_vcpu_event_count()); +} + return true; #endif } @@ -634,8 +647,21 @@ int cpu_exec(CPUState *cpu) for(;;) { cpu_handle_interrupt(cpu, &last_tb); + +/* delay changes to this vCPU's dstate during translation */ +atomic_set(&cpu->trace_dstate_delayed_req, false); +atomic_set(&cpu->trace_dstate_must_delay, true); + tb = tb_find(cpu, last_tb, tb_exit); cpu_loop_exec_tb(cpu, tb, &last_tb, &tb_exit, &sc); + +/* apply and disable delayed dstate changes */ +atomic_set(&cpu->trace_dstate_must_delay, false); +if (unlikely(atomic_read(&cpu->trace_dstate_delayed_req))) { +bitmap_copy(cpu->trace_dstate, cpu->trace_dstate_delayed, +trace_get_vcpu_event_count()); +} + /* Try to align the host and virtual 
clocks if the guest is in advance */ align_clocks(&sc, cpu); diff --git a/include/qom/cpu.h b/include/qom/cpu.h index 3f79a8e955..58255d06fa 100644 --- a/include/qom/cpu.h +++ b/include/qom/cpu.h @@ -295,6 +295,10 @@ struct qemu_work_item; * @kvm_fd: vCPU file descriptor for KVM. * @work_mutex: Lock to prevent multiple access to queued_work_*. * @queued_work_first: First asynchronous work pending. + * @trace_dstate_must_delay: Whether a change to trace_dstate must be delayed. + * @trace_dstate_delayed_req: Whether a change to trace_dstate was delayed. + * @trace_dstate_delayed: Delayed changes to trace_dstate (includes all changes + *to @trace_dstate). * @trace_dstate: Dynamic tracing state of events for this vCPU (bitmask). * * State of one CPU core or thread. @@ -370,6 +374,9 @@ struct CPUState { * Dynamically allocated based on bitmap requried to hold up to * trace_get_vcpu_event_count() entries. */ +bool trace_dstate_must_delay; +bool trace_dstate_delayed_req; +unsigned long *trace_dstate_delayed; unsigned long *trace_dstate; /* TODO Move common fields from CPUArchState here. 
*/ diff --git a/qom/cpu.c b/qom/cpu.c index 03d9190f8c..d56496d28d 100644 --- a/qom/cpu.c +++ b/qom/cpu.c @@ -367,6 +367,9 @@ static void cpu_common_initfn(Object *obj) QTAILQ_INIT(&cpu->breakpoints); QTAILQ_INIT(&cpu->watchpoints); +cpu->trace_dstate_must_delay = false; +cpu->trace_dstate_delayed_req = false; +cpu->trace_dstate_delayed = bitmap_new(trace_get_vcpu_event_count()); cpu->trace_dstate = bitmap_new(trace_get_vcpu_event_count()); cpu_exec_initfn(cpu); @@ -375,6 +378,7 @@ static void cpu_common_initfn(Object *obj) static void cpu_common_finalize(Object *obj) { CPUState *cpu = CPU(obj); +g_free(cpu->trace_dstate_delayed); g_free(cpu->trace_dstate); } diff --git a/trace/control-target.c b/trace/control-target.c index 7ebf6e0bcb..aba8db55de 100644 --- a/trace/control-target.c +++ b/trace/control-target.c @@ -69,13 +69,20 @@ void trace_event_set_vcpu_state_dynamic(CPUState *vcpu, if (state_pre != state) { if (state) { trace_events_enabled_count++; -set_bit(vcpu_id, vcpu->trace_dstate); +set_bit(vcpu_id, vcpu->trace_dstate_delayed); +if (!atomic_read(&vcpu->trace_dstate_must_delay)) { +set_bit(vcpu_id, vcpu->trace_dstate); +
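The delay protocol in this patch can be modelled in a few lines. The sketch below is a toy Python model with hypothetical names (VCPUTraceState, run_block); integers stand in for the bitmaps. Because the hunk for trace/control-target.c is cut off above, the branch that records a request while must_delay is set is an assumption about the missing part, inferred from how cpu_exec() consumes trace_dstate_delayed_req.

```python
# Toy model; class and method names are hypothetical, not QEMU APIs.

class VCPUTraceState:
    def __init__(self):
        self.dstate = 0           # state consulted during translation
        self.delayed = 0          # always holds every requested change
        self.must_delay = False
        self.delayed_req = False

    def set_event(self, bit, enabled):
        mask = 1 << bit
        if enabled:
            self.delayed |= mask
        else:
            self.delayed &= ~mask
        if self.must_delay:
            # Assumption: while delaying, the change is only recorded.
            self.delayed_req = True
        elif enabled:
            self.dstate |= mask
        else:
            self.dstate &= ~mask

    def run_block(self, body):
        """Run body() with dstate frozen, then apply any delayed change."""
        self.delayed_req = False
        self.must_delay = True
        try:
            body()
        finally:
            self.must_delay = False
            if self.delayed_req:
                self.dstate = self.delayed
```

A change requested inside run_block() is invisible to the code running in that block and becomes visible immediately after it returns, which is the consistency property the commit message asks for.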
[Qemu-devel] [PATCH v5 0/7] trace: [tcg] Optimize per-vCPU tracing states with separate TB caches
Optimizes tracing of events with the 'tcg' and 'vcpu' properties (e.g., memory
accesses), making it feasible to statically enable them by default on all QEMU
builds.

Some quick'n'dirty numbers with 400.perlbench (SPECcpu2006) on the train input
(medium size - suns.pl) and the guest_mem_before event:

* vanilla, statically disabled
real    0m2,259s
user    0m2,252s
sys     0m0,004s

* vanilla, statically enabled (overhead: 2.18x)
real    0m4,921s
user    0m4,912s
sys     0m0,008s

* multi-tb, statically disabled (overhead: 0.99x) [within noise range]
real    0m2,228s
user    0m2,216s
sys     0m0,008s

* multi-tb, statically enabled (overhead: 0.99x) [within noise range]
real    0m2,229s
user    0m2,224s
sys     0m0,004s

Right now, events with the 'tcg' property always generate TCG code to trace
that event at guest code execution time, where the event's dynamic state is
checked.

This series adds a performance optimization where TCG code for events with the
'tcg' and 'vcpu' properties is not generated if the event is dynamically
disabled. This optimization raises two issues:

* An event can be dynamically disabled/enabled after the corresponding TCG code
  has been generated (i.e., a new TB with the corresponding code should be
  used).

* Each vCPU can have a different dynamic state for the same event (i.e., tracing
  the memory accesses of only one process pinned to a vCPU).

To handle both issues, this series integrates the dynamic tracing event state
into the TB hashing function, so that vCPUs tracing different events will use
separate TBs. Note that only events with the 'vcpu' property are used for
hashing (as stored in the bitmap of CPUState->trace_dstate).

This makes dynamic event state changes on vCPUs very efficient, since they can
use TBs produced by other vCPUs while on the same event state combination (or
produced by the same vCPU, earlier).
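The overhead factors quoted in the timing list above follow directly from the wall-clock ("real") figures. The short Python check below reproduces them; the dictionary keys and the overhead() helper are names chosen here for illustration, and the times are the reported values with the decimal comma written as a point.

```python
# Wall-clock ("real") times in seconds, copied from the timing list above.
real_seconds = {
    "vanilla, statically disabled": 2.259,
    "vanilla, statically enabled": 4.921,
    "multi-tb, statically disabled": 2.228,
    "multi-tb, statically enabled": 2.229,
}


def overhead(config, baseline="vanilla, statically disabled"):
    """Slowdown factor of a configuration relative to the baseline run."""
    return real_seconds[config] / real_seconds[baseline]


for config in real_seconds:
    print("%-30s %.2fx" % (config, overhead(config)))
```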
Discarded alternatives:

* Emitting TCG code to check if an event needs tracing, where we should still
  move the tracing call code to either a cold path (making tracing performance
  worse), or leave it inlined (making non-tracing performance worse).

* Eliding TCG code only when *zero* vCPUs are tracing an event, since enabling
  it on a single vCPU will impact the performance of all other vCPUs that are
  not tracing that event.

Signed-off-by: Lluís Vilanova
---

Changes in v5
=============

* Move define into "qemu-common.h" to allow compilation of tests.

Changes in v4
=============

* Incorporate trace_dstate into the TB hashing function instead of using
  multiple physical TB caches [suggested by Richard Henderson].

Changes in v3
=============

* Rebase on 0737f32daf.
* Do not use reserved symbol prefixes ("__") [Stefan Hajnoczi].
* Refactor trace_get_vcpu_event_count() to be inlinable.
* Optimize cpu_tb_cache_set_requested() (hottest path).

Changes in v2
=============

* Fix bitmap copy in cpu_tb_cache_set_apply().
* Split generated code re-alignment into a separate patch [Daniel P. Berrange].

Lluís Vilanova (7):
  exec: [tcg] Refactor flush of per-CPU virtual TB cache
  trace: Make trace_get_vcpu_event_count() inlinable
  trace: [tcg] Delay changes to dynamic state when translating
  exec: [tcg] Use different TBs according to the vCPU's dynamic tracing state
  trace: [tcg] Do not generate TCG code to trace dinamically-disabled events
  trace: [tcg,trivial] Re-align generated code
  trace: [trivial] Statically enable all guest events

 cpu-exec.c                               |   52 +++---
 cputlb.c                                 |    2 +
 include/exec/exec-all.h                  |   11 ++
 include/exec/tb-hash-xx.h                |   11 ++
 include/exec/tb-hash.h                   |    5 ++-
 include/qemu-common.h                    |    3 ++
 include/qom/cpu.h                        |    7
 qom/cpu.c                                |    4 ++
 scripts/tracetool/__init__.py            |    1 +
 scripts/tracetool/backend/dtrace.py      |    2 +
 scripts/tracetool/backend/ftrace.py      |   20 ++--
 scripts/tracetool/backend/log.py         |   17 +-
 scripts/tracetool/backend/simple.py      |    2 +
 scripts/tracetool/backend/syslog.py      |    6 ++-
 scripts/tracetool/backend/ust.py         |    2 +
 scripts/tracetool/format/h.py            |   24 ++
 scripts/tracetool/format/tcg_h.py        |   19 +--
 scripts/tracetool/format/tcg_helper_c.py |    3 +-
 tests/qht-bench.c                        |    2 +
 trace-events                             |    6 ++-
 trace/control-internal.h                 |    5 +++
 trace/control-target.c                   |   14 +++-
 trace/control.c                          |    9 +
 trace/control.h                          |    5 ++-
 translate-all.c                          |   30 +
 25 files changed, 198 insertions(+), 64 deletions(-)

To: qemu-devel@nongnu.org
Cc: Stefan Hajnoczi
Re: [Qemu-devel] [PATCH] doc/pcie: correct command line examples
On 12/27/2016 09:40 AM, Cao jin wrote:

Nit picking: Multi-function PCI Express Root Ports should mean that
'addr' property is mandatory, and slot is optional because it is default
to 0, and 'chassis' is mandatory for 2nd & 3rd root port because it is
default to 0 too.

Bonus: fix a typo(2->3)
Signed-off-by: Cao jin
---
 docs/pcie.txt | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/docs/pcie.txt b/docs/pcie.txt
index 9fb20aaed9f4..54f05eaa71dc 100644
--- a/docs/pcie.txt
+++ b/docs/pcie.txt
@@ -110,18 +110,18 @@ Plug only PCI Express devices into PCI Express Ports.
   -device ioh3420,id=root_port1,chassis=x,slot=y[,bus=pcie.0][,addr=z] \
   -device ,bus=root_port1
 2.2.2 Using multi-function PCI Express Root Ports:
-  -device ioh3420,id=root_port1,multifunction=on,chassis=x,slot=y[,bus=pcie.0][,addr=z.0] \
-  -device ioh3420,id=root_port2,chassis=x1,slot=y1[,bus=pcie.0][,addr=z.1] \
-  -device ioh3420,id=root_port3,chassis=x2,slot=y2[,bus=pcie.0][,addr=z.2] \
-2.2.2 Plugging a PCI Express device into a Switch:
+  -device ioh3420,id=root_port1,multifunction=on,chassis=x,addr=z.0[,slot=y][,bus=pcie.0] \
+  -device ioh3420,id=root_port2,chassis=x1,addr=z.1[,slot=y1][,bus=pcie.0] \
+  -device ioh3420,id=root_port3,chassis=x2,addr=z.2[,slot=y2][,bus=pcie.0] \
+2.2.3 Plugging a PCI Express device into a Switch:
   -device ioh3420,id=root_port1,chassis=x,slot=y[,bus=pcie.0][,addr=z] \
   -device x3130-upstream,id=upstream_port1,bus=root_port1[,addr=x] \
   -device xio3130-downstream,id=downstream_port1,bus=upstream_port1,chassis=x1,slot=y1[,addr=z1]] \
   -device ,bus=downstream_port1

 Notes:
-  - (slot, chassis) pair is mandatory and must be
-    unique for each PCI Express Root Port.
+  - (slot, chassis) pair is mandatory and must be unique for each
+    PCI Express Root Port. slot is default to 0 when doesn't specify it.
   - 'addr' parameter can be 0 for all the examples above.

Reviewed-by: Marcel Apfelbaum

Thanks,
Marcel
Re: [Qemu-devel] [PATCH 23/23] hw/arm/virt: Add board property to enable EL2
On Tue, Dec 13, 2016 at 10:36:24AM +, Peter Maydell wrote:
> Add a board level property to the virt board which will
> enable EL2 on the CPU if the user asks for it. The
> default is not to provide EL2. If EL2 is enabled then
> we will use SMC as our PSCI conduit, and report the
> virtualization support in the GICv3 device tree node.
>
> Signed-off-by: Peter Maydell
> ---
>  hw/arm/virt.c | 45 +++--
>  1 file changed, 43 insertions(+), 2 deletions(-)
>

Reviewed-by: Andrew Jones
Re: [Qemu-devel] [PATCH 21/23] hw/arm/virt: Support using SMC for PSCI
On Tue, Dec 13, 2016 at 10:36:22AM +, Peter Maydell wrote:
> If we are giving the guest a CPU with EL2, it is likely to
> want to use the HVC instruction itself, for instance for
> providing PSCI to inner guest VMs. This makes using HVC
> as the PSCI conduit for the outer QEMU a bad idea. We will
> want to use SMC instead in this case: this makes sense
> because QEMU's PSCI implementation is effectively an
> emulation of functionality provided by EL3 firmware.
>
> Add code to support selecting the PSCI conduit to use,
> rather than hardcoding use of HVC.
>
> Signed-off-by: Peter Maydell
> ---
>  hw/arm/virt.c | 29 ++---
>  1 file changed, 22 insertions(+), 7 deletions(-)
>

Reviewed-by: Andrew Jones
Re: [Qemu-devel] [PATCH 22/23] target-arm: Enable EL2 feature bit on A53 and A57
On Tue, Dec 13, 2016 at 10:36:23AM +, Peter Maydell wrote:
> Enable the ARM_FEATURE_EL2 bit on Cortex-A53 and
> Cortex-A57, since this is all now sufficiently implemented
> to work with the GICv3. We provide the usual CPU property
> to disable it for backwards compatibility with the older
> virt boards.
>
> In this commit, we disable the EL2 feature on the
> virt and ZynqMP boards, so there is no overall effect.
> Another commit will expose a board-level property to
> allow the user to enable EL2.
>
> Signed-off-by: Peter Maydell
> ---
>  target-arm/cpu.h     | 2 ++
>  hw/arm/virt.c        | 4
>  hw/arm/xlnx-zynqmp.c | 2 ++
>  target-arm/cpu.c     | 12
>  target-arm/cpu64.c   | 2 ++
>  5 files changed, 22 insertions(+)
>

Reviewed-by: Andrew Jones
Re: [Qemu-devel] [PATCH v4 0/7] trace: [tcg] Optimize per-vCPU tracing states with separate TB caches
Hi, Your series failed automatic build test. Please find the testing commands and their output below. If you have docker installed, you can probably reproduce it locally. Message-id: 148292774946.380.3638349228328753405.st...@fimbulvetr.bsc.es Subject: [Qemu-devel] [PATCH v4 0/7] trace: [tcg] Optimize per-vCPU tracing states with separate TB caches Type: series === TEST SCRIPT BEGIN === #!/bin/bash set -e git submodule update --init dtc # Let docker tests dump environment info export SHOW_ENV=1 export J=16 make docker-test-quick@centos6 make docker-test-mingw@fedora make docker-test-build@min-glib === TEST SCRIPT END === Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384 Switched to a new branch 'test' fa951db trace: [trivial] Statically enable all guest events 3965b43 trace: [tcg, trivial] Re-align generated code 9e50f3c trace: [tcg] Do not generate TCG code to trace dinamically-disabled events 0e45e8e exec: [tcg] Use different TBs according to the vCPU's dynamic tracing state bcbae22 trace: [tcg] Delay changes to dynamic state when translating ca96b75 trace: Make trace_get_vcpu_event_count() inlinable 2d7cc5e exec: [tcg] Refactor flush of per-CPU virtual TB cache === OUTPUT BEGIN === Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc' Cloning into 'dtc'... 
Submodule path 'dtc': checked out '65cc4d2748a2c2e6f27f1cf39e07a5dbabd80ebf' BUILD centos6 make[1]: Entering directory `/var/tmp/patchew-tester-tmp-cfk9gpom/src' ARCHIVE qemu.tgz ARCHIVE dtc.tgz COPYRUNNER RUN test-quick in qemu:centos6 Packages installed: SDL-devel-1.2.14-7.el6_7.1.x86_64 ccache-3.1.6-2.el6.x86_64 epel-release-6-8.noarch gcc-4.4.7-17.el6.x86_64 git-1.7.1-4.el6_7.1.x86_64 glib2-devel-2.28.8-5.el6.x86_64 libfdt-devel-1.4.0-1.el6.x86_64 make-3.81-23.el6.x86_64 package g++ is not installed pixman-devel-0.32.8-1.el6.x86_64 tar-1.23-15.el6_8.x86_64 zlib-devel-1.2.3-29.el6.x86_64 Environment variables: PACKAGES=libfdt-devel ccache tar git make gcc g++ zlib-devel glib2-devel SDL-devel pixman-devel epel-release HOSTNAME=b3cd0d1df384 TERM=xterm MAKEFLAGS= -j16 HISTSIZE=1000 J=16 USER=root CCACHE_DIR=/var/tmp/ccache EXTRA_CONFIGURE_OPTS= V= SHOW_ENV=1 MAIL=/var/spool/mail/root PATH=/usr/lib/ccache:/usr/lib64/ccache:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin PWD=/ LANG=en_US.UTF-8 TARGET_LIST= HISTCONTROL=ignoredups SHLVL=1 HOME=/root TEST_DIR=/tmp/qemu-test LOGNAME=root LESSOPEN=||/usr/bin/lesspipe.sh %s FEATURES= dtc DEBUG= G_BROKEN_FILENAMES=1 CCACHE_HASHDIR= _=/usr/bin/env Configure options: --enable-werror --target-list=x86_64-softmmu,aarch64-softmmu --prefix=/var/tmp/qemu-build/install No C++ compiler available; disabling C++ specific optional code Install prefix/var/tmp/qemu-build/install BIOS directory/var/tmp/qemu-build/install/share/qemu binary directory /var/tmp/qemu-build/install/bin library directory /var/tmp/qemu-build/install/lib module directory /var/tmp/qemu-build/install/lib/qemu libexec directory /var/tmp/qemu-build/install/libexec include directory /var/tmp/qemu-build/install/include config directory /var/tmp/qemu-build/install/etc local state directory /var/tmp/qemu-build/install/var Manual directory /var/tmp/qemu-build/install/share/man ELF interp prefix /usr/gnemul/qemu-%M Source path /tmp/qemu-test/src C compilercc 
Host C compiler cc C++ compiler Objective-C compiler cc ARFLAGS rv CFLAGS-O2 -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -g QEMU_CFLAGS -I/usr/include/pixman-1-pthread -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include -fPIE -DPIE -m64 -mcx16 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes -Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing -fno-common -fwrapv -Wendif-labels -Wmissing-include-dirs -Wempty-body -Wnested-externs -Wformat-security -Wformat-y2k -Winit-self -Wignored-qualifiers -Wold-style-declaration -Wold-style-definition -Wtype-limits -fstack-protector-all LDFLAGS -Wl,--warn-common -Wl,-z,relro -Wl,-z,now -pie -m64 -g make make install install pythonpython -B smbd /usr/sbin/smbd module supportno host CPU x86_64 host big endian no target list x86_64-softmmu aarch64-softmmu tcg debug enabled no gprof enabled no sparse enabledno strip binariesyes profiler no static build no pixmansystem SDL support yes (1.2.14) GTK support no GTK GL supportno VTE support no TLS priority NORMAL GNUTLS supportno GNUTLS rndno libgcrypt no libgcrypt kdf no nettleno nettle kdfno libtasn1 no curses supportno virgl support no curl support no mingw32 support no Audio drivers oss Block whitelist (rw) Block whitelist (ro) VirtFS supportno VNC support yes VNC SASL support n
[Qemu-devel] [PATCH v4 5/7] trace: [tcg] Do not generate TCG code to trace dinamically-disabled events
If an event is dynamically disabled, the TCG code that calls the execution-time tracer is not generated. Removes the overheads of execution-time tracers for dynamically disabled events. As a bonus, also avoids checking the event state when the execution-time tracer is called from TCG-generated code (since otherwise TCG would simply not call it). Signed-off-by: Lluís Vilanova --- scripts/tracetool/__init__.py|1 + scripts/tracetool/format/h.py| 24 ++-- scripts/tracetool/format/tcg_h.py| 19 --- scripts/tracetool/format/tcg_helper_c.py |3 ++- 4 files changed, 37 insertions(+), 10 deletions(-) diff --git a/scripts/tracetool/__init__.py b/scripts/tracetool/__init__.py index 365446fa53..63168ccdf0 100644 --- a/scripts/tracetool/__init__.py +++ b/scripts/tracetool/__init__.py @@ -264,6 +264,7 @@ class Event(object): return self._FMT.findall(self.fmt) QEMU_TRACE = "trace_%(name)s" +QEMU_TRACE_NOCHECK = "_nocheck__" + QEMU_TRACE QEMU_TRACE_TCG = QEMU_TRACE + "_tcg" QEMU_DSTATE = "_TRACE_%(NAME)s_DSTATE" QEMU_EVENT = "_TRACE_%(NAME)s_EVENT" diff --git a/scripts/tracetool/format/h.py b/scripts/tracetool/format/h.py index 3682f4e6a8..a78e50ef35 100644 --- a/scripts/tracetool/format/h.py +++ b/scripts/tracetool/format/h.py @@ -49,6 +49,19 @@ def generate(events, backend, group): backend.generate_begin(events, group) for e in events: +# tracer without checks +out('', +'static inline void %(api)s(%(args)s)', +'{', +api=e.api(e.QEMU_TRACE_NOCHECK), +args=e.args) + +if "disable" not in e.properties: +backend.generate(e, group) + +out('}') + +# tracer wrapper with checks (per-vCPU tracing) if "vcpu" in e.properties: trace_cpu = next(iter(e.args))[1] cond = "trace_event_get_vcpu_state(%(cpu)s,"\ @@ -63,16 +76,15 @@ def generate(events, backend, group): 'static inline void %(api)s(%(args)s)', '{', 'if (%(cond)s) {', +'%(api_nocheck)s(%(names)s);', +'}', +'}', api=e.api(), +api_nocheck=e.api(e.QEMU_TRACE_NOCHECK), args=e.args, +names=", ".join(e.args.names()), cond=cond) -if "disable" 
not in e.properties: -backend.generate(e, group) - -out('}', -'}') - backend.generate_end(events, group) out('#endif /* TRACE_%s_GENERATED_TRACERS_H */' % group.upper()) diff --git a/scripts/tracetool/format/tcg_h.py b/scripts/tracetool/format/tcg_h.py index 5f213f6cba..71b5c09432 100644 --- a/scripts/tracetool/format/tcg_h.py +++ b/scripts/tracetool/format/tcg_h.py @@ -41,7 +41,7 @@ def generate(events, backend, group): for e in events: # just keep one of them -if "tcg-trans" not in e.properties: +if "tcg-exec" not in e.properties: continue out('static inline void %(name_tcg)s(%(args)s)', @@ -53,12 +53,25 @@ def generate(events, backend, group): args_trans = e.original.event_trans.args args_exec = tracetool.vcpu.transform_args( "tcg_helper_c", e.original.event_exec, "wrapper") +if "vcpu" in e.properties: +trace_cpu = e.args.names()[0] +cond = "trace_event_get_vcpu_state(%(cpu)s,"\ + " TRACE_%(id)s)"\ + % dict( + cpu=trace_cpu, + id=e.original.event_exec.name.upper()) +else: +cond = "true" + out('%(name_trans)s(%(argnames_trans)s);', -'gen_helper_%(name_exec)s(%(argnames_exec)s);', +'if (%(cond)s) {', +'gen_helper_%(name_exec)s(%(argnames_exec)s);', +'}', name_trans=e.original.event_trans.api(e.QEMU_TRACE), name_exec=e.original.event_exec.api(e.QEMU_TRACE), argnames_trans=", ".join(args_trans.names()), -argnames_exec=", ".join(args_exec.names())) +argnames_exec=", ".join(args_exec.names()), +cond=cond) out('}') diff --git a/scripts/tracetool/format/tcg_helper_c.py b/scripts/tracetool/format/tcg_helper_c.py index cc26e03008..c2a05d756c 100644 --- a/scripts/tracetool/format/tcg_helper_c.py +++ b/scripts/tracetool/format/tcg_helper_c.py @@ -66,10 +66,11 @@ def generate(events, backend, group): out('void %(name_tcg)s(%(args_api)s)', '{', +# NOTE: the check was already performed at TCG-generation time '%(name)s(%(args_call)s);', '}', name_tcg="helper_%s_proxy" % e.api(), -
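The split described in this patch (an unchecked tracer, a checked wrapper, and a check made once at TCG-generation time) can be modelled in a few lines. This is a minimal sketch, not tracetool output: `trace_nocheck`, `trace_checked`, `translate` and `run` are hypothetical names, and the dynamic state is modelled as a plain set of enabled event names.

```python
# Toy model of generation-time elision. All names are hypothetical.

def trace_nocheck(log, event, *args):
    """Tracer without checks: records the event unconditionally."""
    log.append((event, args))


def trace_checked(log, dstate, event, *args):
    """Wrapper with checks: consults the dynamic state on every call."""
    if event in dstate:
        trace_nocheck(log, event, *args)


def translate(dstate, event, guest_ops):
    """Model of translation: the event state is checked once, here.

    A trace call is emitted into the generated code only if the event
    is enabled when the code is generated.
    """
    code = []
    for op in guest_ops:
        if event in dstate:
            code.append(("trace", event, op))
        code.append(("exec", op))
    return code


def run(code, log):
    """Model of execution: emitted trace calls skip the state check."""
    for insn in code:
        if insn[0] == "trace":
            trace_nocheck(log, insn[1], insn[2])
```

With the event disabled, `translate` emits no trace instructions at all, so execution pays nothing for it. The cost is that generated code becomes stale when the state changes, which is what the TB-related patches in this series address.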
[Qemu-devel] [PATCH v4 3/7] trace: [tcg] Delay changes to dynamic state when translating
This keeps consistency across all decisions taken during translation when the dynamic state of a vCPU is changed in the middle of translating some guest code. Signed-off-by: Lluís Vilanova --- cpu-exec.c | 26 ++ include/qom/cpu.h |7 +++ qom/cpu.c |4 trace/control-target.c | 11 +-- 4 files changed, 46 insertions(+), 2 deletions(-) diff --git a/cpu-exec.c b/cpu-exec.c index 4188fed3c6..1b7366efb0 100644 --- a/cpu-exec.c +++ b/cpu-exec.c @@ -33,6 +33,7 @@ #include "hw/i386/apic.h" #endif #include "sysemu/replay.h" +#include "trace/control.h" /* -icount align implementation. */ @@ -451,9 +452,21 @@ static inline bool cpu_handle_exception(CPUState *cpu, int *ret) #ifndef CONFIG_USER_ONLY } else if (replay_has_exception() && cpu->icount_decr.u16.low + cpu->icount_extra == 0) { +/* delay changes to this vCPU's dstate during translation */ +atomic_set(&cpu->trace_dstate_delayed_req, false); +atomic_set(&cpu->trace_dstate_must_delay, true); + /* try to cause an exception pending in the log */ cpu_exec_nocache(cpu, 1, tb_find(cpu, NULL, 0), true); *ret = -1; + +/* apply and disable delayed dstate changes */ +atomic_set(&cpu->trace_dstate_must_delay, false); +if (unlikely(atomic_read(&cpu->trace_dstate_delayed_req))) { +bitmap_copy(cpu->trace_dstate, cpu->trace_dstate_delayed, +trace_get_vcpu_event_count()); +} + return true; #endif } @@ -634,8 +647,21 @@ int cpu_exec(CPUState *cpu) for(;;) { cpu_handle_interrupt(cpu, &last_tb); + +/* delay changes to this vCPU's dstate during translation */ +atomic_set(&cpu->trace_dstate_delayed_req, false); +atomic_set(&cpu->trace_dstate_must_delay, true); + tb = tb_find(cpu, last_tb, tb_exit); cpu_loop_exec_tb(cpu, tb, &last_tb, &tb_exit, &sc); + +/* apply and disable delayed dstate changes */ +atomic_set(&cpu->trace_dstate_must_delay, false); +if (unlikely(atomic_read(&cpu->trace_dstate_delayed_req))) { +bitmap_copy(cpu->trace_dstate, cpu->trace_dstate_delayed, +trace_get_vcpu_event_count()); +} + /* Try to align the host and virtual 
clocks if the guest is in advance */ align_clocks(&sc, cpu); diff --git a/include/qom/cpu.h b/include/qom/cpu.h index 3f79a8e955..58255d06fa 100644 --- a/include/qom/cpu.h +++ b/include/qom/cpu.h @@ -295,6 +295,10 @@ struct qemu_work_item; * @kvm_fd: vCPU file descriptor for KVM. * @work_mutex: Lock to prevent multiple access to queued_work_*. * @queued_work_first: First asynchronous work pending. + * @trace_dstate_must_delay: Whether a change to trace_dstate must be delayed. + * @trace_dstate_delayed_req: Whether a change to trace_dstate was delayed. + * @trace_dstate_delayed: Delayed changes to trace_dstate (includes all changes + *to @trace_dstate). * @trace_dstate: Dynamic tracing state of events for this vCPU (bitmask). * * State of one CPU core or thread. @@ -370,6 +374,9 @@ struct CPUState { * Dynamically allocated based on bitmap requried to hold up to * trace_get_vcpu_event_count() entries. */ +bool trace_dstate_must_delay; +bool trace_dstate_delayed_req; +unsigned long *trace_dstate_delayed; unsigned long *trace_dstate; /* TODO Move common fields from CPUArchState here. 
*/ diff --git a/qom/cpu.c b/qom/cpu.c index 03d9190f8c..d56496d28d 100644 --- a/qom/cpu.c +++ b/qom/cpu.c @@ -367,6 +367,9 @@ static void cpu_common_initfn(Object *obj) QTAILQ_INIT(&cpu->breakpoints); QTAILQ_INIT(&cpu->watchpoints); +cpu->trace_dstate_must_delay = false; +cpu->trace_dstate_delayed_req = false; +cpu->trace_dstate_delayed = bitmap_new(trace_get_vcpu_event_count()); cpu->trace_dstate = bitmap_new(trace_get_vcpu_event_count()); cpu_exec_initfn(cpu); @@ -375,6 +378,7 @@ static void cpu_common_initfn(Object *obj) static void cpu_common_finalize(Object *obj) { CPUState *cpu = CPU(obj); +g_free(cpu->trace_dstate_delayed); g_free(cpu->trace_dstate); } diff --git a/trace/control-target.c b/trace/control-target.c index 7ebf6e0bcb..aba8db55de 100644 --- a/trace/control-target.c +++ b/trace/control-target.c @@ -69,13 +69,20 @@ void trace_event_set_vcpu_state_dynamic(CPUState *vcpu, if (state_pre != state) { if (state) { trace_events_enabled_count++; -set_bit(vcpu_id, vcpu->trace_dstate); +set_bit(vcpu_id, vcpu->trace_dstate_delayed); +if (!atomic_read(&vcpu->trace_dstate_must_delay)) { +set_bit(vcpu_id, vcpu->trace_dstate); +
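The delay mechanism in this patch can be sketched as a small state machine. The model below is an illustration only: the class and method names are hypothetical, bitmaps are plain lists of booleans, and the branch that records a delayed request is an assumption, since the corresponding part of the diff is cut off above.

```python
# Toy model of delayed dynamic-state changes. Names are hypothetical.

class VCPU:
    def __init__(self, nevents):
        self.dstate = [False] * nevents          # state seen by translation
        self.dstate_delayed = [False] * nevents  # always holds every change
        self.must_delay = False
        self.delayed_req = False

    def set_event(self, vcpu_id, state):
        """Request a state change for one per-vCPU event."""
        self.dstate_delayed[vcpu_id] = state
        if self.must_delay:
            # Assumption: remember that a change is pending.
            self.delayed_req = True
        else:
            self.dstate[vcpu_id] = state

    def begin_translation(self):
        """Open the window in which changes must be delayed."""
        self.delayed_req = False
        self.must_delay = True

    def end_translation(self):
        """Close the window and apply any delayed changes."""
        self.must_delay = False
        if self.delayed_req:
            self.dstate = list(self.dstate_delayed)
```

A change requested between `begin_translation()` and `end_translation()` only becomes visible in `dstate` after the window closes, so every decision taken while translating one block sees the same state.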
[Qemu-devel] [PATCH v4 6/7] trace: [tcg, trivial] Re-align generated code
Last patch removed a nesting level in generated code. Re-align all code generated by backends to be 4-column aligned. Signed-off-by: Lluís Vilanova --- scripts/tracetool/backend/dtrace.py |2 +- scripts/tracetool/backend/ftrace.py | 20 ++-- scripts/tracetool/backend/log.py| 17 + scripts/tracetool/backend/simple.py |2 +- scripts/tracetool/backend/syslog.py |6 +++--- scripts/tracetool/backend/ust.py|2 +- 6 files changed, 25 insertions(+), 24 deletions(-) diff --git a/scripts/tracetool/backend/dtrace.py b/scripts/tracetool/backend/dtrace.py index 79505c6b1a..b3a8645bf0 100644 --- a/scripts/tracetool/backend/dtrace.py +++ b/scripts/tracetool/backend/dtrace.py @@ -41,6 +41,6 @@ def generate_h_begin(events, group): def generate_h(event, group): -out('QEMU_%(uppername)s(%(argnames)s);', +out('QEMU_%(uppername)s(%(argnames)s);', uppername=event.name.upper(), argnames=", ".join(event.args.names())) diff --git a/scripts/tracetool/backend/ftrace.py b/scripts/tracetool/backend/ftrace.py index db9fe7ad57..dd0eda4441 100644 --- a/scripts/tracetool/backend/ftrace.py +++ b/scripts/tracetool/backend/ftrace.py @@ -29,17 +29,17 @@ def generate_h(event, group): if len(event.args) > 0: argnames = ", " + argnames -out('{', -'char ftrace_buf[MAX_TRACE_STRLEN];', -'int unused __attribute__ ((unused));', -'int trlen;', -'if (trace_event_get_state(%(event_id)s)) {', -'trlen = snprintf(ftrace_buf, MAX_TRACE_STRLEN,', -' "%(name)s " %(fmt)s "\\n" %(argnames)s);', -'trlen = MIN(trlen, MAX_TRACE_STRLEN - 1);', -'unused = write(trace_marker_fd, ftrace_buf, trlen);', -'}', +out('{', +'char ftrace_buf[MAX_TRACE_STRLEN];', +'int unused __attribute__ ((unused));', +'int trlen;', +'if (trace_event_get_state(%(event_id)s)) {', +'trlen = snprintf(ftrace_buf, MAX_TRACE_STRLEN,', +' "%(name)s " %(fmt)s "\\n" %(argnames)s);', +'trlen = MIN(trlen, MAX_TRACE_STRLEN - 1);', +'unused = write(trace_marker_fd, ftrace_buf, trlen);', '}', +'}', name=event.name, args=event.args, event_id="TRACE_" + 
event.name.upper(), diff --git a/scripts/tracetool/backend/log.py b/scripts/tracetool/backend/log.py index 4f4a4d38b1..7d2c3abe75 100644 --- a/scripts/tracetool/backend/log.py +++ b/scripts/tracetool/backend/log.py @@ -35,14 +35,15 @@ def generate_h(event, group): else: cond = "trace_event_get_state(%s)" % ("TRACE_" + event.name.upper()) -out('if (%(cond)s) {', -'struct timeval _now;', -'gettimeofday(&_now, NULL);', -'qemu_log_mask(LOG_TRACE, "%%d@%%zd.%%06zd:%(name)s " %(fmt)s "\\n",', -' getpid(),', -' (size_t)_now.tv_sec, (size_t)_now.tv_usec', -' %(argnames)s);', -'}', +out('if (%(cond)s) {', +'struct timeval _now;', +'gettimeofday(&_now, NULL);', +'qemu_log_mask(LOG_TRACE,', +' "%%d@%%zd.%%06zd:%(name)s " %(fmt)s "\\n",', +' getpid(),', +' (size_t)_now.tv_sec, (size_t)_now.tv_usec', +' %(argnames)s);', +'}', cond=cond, name=event.name, fmt=event.fmt.rstrip("\n"), diff --git a/scripts/tracetool/backend/simple.py b/scripts/tracetool/backend/simple.py index 85f61028e2..a28460b1e4 100644 --- a/scripts/tracetool/backend/simple.py +++ b/scripts/tracetool/backend/simple.py @@ -37,7 +37,7 @@ def generate_h_begin(events, group): def generate_h(event, group): -out('_simple_%(api)s(%(args)s);', +out('_simple_%(api)s(%(args)s);', api=event.api(), args=", ".join(event.args.names())) diff --git a/scripts/tracetool/backend/syslog.py b/scripts/tracetool/backend/syslog.py index b8ff2790c4..1ce627f0fc 100644 --- a/scripts/tracetool/backend/syslog.py +++ b/scripts/tracetool/backend/syslog.py @@ -35,9 +35,9 @@ def generate_h(event, group): else: cond = "trace_event_get_state(%s)" % ("TRACE_" + event.name.upper()) -out('if (%(cond)s) {', -'syslog(LOG_INFO, "%(name)s " %(fmt)s %(argnames)s);', -'}', +out('if (%(cond)s) {', +'syslog(LOG_INFO, "%(name)s " %(fmt)s %(argnames)s);', +'}', cond=cond, name=event.name, fmt=event.fmt.rstrip
[Qemu-devel] [PATCH v4 4/7] exec: [tcg] Use different TBs according to the vCPU's dynamic tracing state
Every vCPU now uses a separate set of TBs for each set of dynamic tracing event state values. Each set of TBs can be used by any number of vCPUs to maximize TB reuse when vCPUs have the same tracing state. This feature is later used by tracetool to optimize tracing of guest code events. The maximum number of TB sets is defined as 2^E, where E is the number of events that have the 'vcpu' property (their state is stored in CPUState->trace_dstate). For this to work, a change on the dynamic tracing state of a vCPU will force it to flush its virtual TB cache (which is only indexed by address), and fall back to the physical TB cache (which now contains the vCPU's dynamic tracing state as part of the hashing function). Signed-off-by: Lluís Vilanova --- cpu-exec.c| 26 +- include/exec/exec-all.h |7 +++ include/exec/tb-hash-xx.h | 10 +- include/exec/tb-hash.h|5 +++-- tests/qht-bench.c |2 +- trace/control-target.c|3 +++ trace/control.h |3 +++ translate-all.c | 16 ++-- 8 files changed, 61 insertions(+), 11 deletions(-) diff --git a/cpu-exec.c b/cpu-exec.c index 1b7366efb0..a377505b9c 100644 --- a/cpu-exec.c +++ b/cpu-exec.c @@ -262,6 +262,7 @@ struct tb_desc { CPUArchState *env; tb_page_addr_t phys_page1; uint32_t flags; +TRACE_QHT_VCPU_DSTATE_TYPE trace_vcpu_dstate; }; static bool tb_cmp(const void *p, const void *d) @@ -273,6 +274,7 @@ static bool tb_cmp(const void *p, const void *d) tb->page_addr[0] == desc->phys_page1 && tb->cs_base == desc->cs_base && tb->flags == desc->flags && +tb->trace_vcpu_dstate == desc->trace_vcpu_dstate && !atomic_read(&tb->invalid)) { /* check next page if needed */ if (tb->page_addr[1] == -1) { @@ -294,7 +296,8 @@ static bool tb_cmp(const void *p, const void *d) static TranslationBlock *tb_htable_lookup(CPUState *cpu, target_ulong pc, target_ulong cs_base, - uint32_t flags) + uint32_t flags, + uint32_t trace_vcpu_dstate) { tb_page_addr_t phys_pc; struct tb_desc desc; @@ -303,10 +306,11 @@ static TranslationBlock *tb_htable_lookup(CPUState *cpu, 
desc.env = (CPUArchState *)cpu->env_ptr; desc.cs_base = cs_base; desc.flags = flags; +desc.trace_vcpu_dstate = trace_vcpu_dstate; desc.pc = pc; phys_pc = get_page_addr_code(desc.env, pc); desc.phys_page1 = phys_pc & TARGET_PAGE_MASK; -h = tb_hash_func(phys_pc, pc, flags); +h = tb_hash_func(phys_pc, pc, flags, trace_vcpu_dstate); return qht_lookup(&tcg_ctx.tb_ctx.htable, tb_cmp, &desc, h); } @@ -318,16 +322,24 @@ static inline TranslationBlock *tb_find(CPUState *cpu, TranslationBlock *tb; target_ulong cs_base, pc; uint32_t flags; +unsigned long trace_vcpu_dstate_bitmap; +TRACE_QHT_VCPU_DSTATE_TYPE trace_vcpu_dstate; bool have_tb_lock = false; +bitmap_copy(&trace_vcpu_dstate_bitmap, cpu->trace_dstate, +trace_get_vcpu_event_count()); +memcpy(&trace_vcpu_dstate, &trace_vcpu_dstate_bitmap, + sizeof(trace_vcpu_dstate)); + /* we record a subset of the CPU state. It will always be the same before a given translated block is executed. */ cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags); tb = atomic_rcu_read(&cpu->tb_jmp_cache[tb_jmp_cache_hash_func(pc)]); if (unlikely(!tb || tb->pc != pc || tb->cs_base != cs_base || - tb->flags != flags)) { -tb = tb_htable_lookup(cpu, pc, cs_base, flags); + tb->flags != flags || + tb->trace_vcpu_dstate != trace_vcpu_dstate)) { +tb = tb_htable_lookup(cpu, pc, cs_base, flags, trace_vcpu_dstate); if (!tb) { /* mmap_lock is needed by tb_gen_code, and mmap_lock must be @@ -341,7 +353,7 @@ static inline TranslationBlock *tb_find(CPUState *cpu, /* There's a chance that our desired tb has been translated while * taking the locks so we check again inside the lock. 
*/ -tb = tb_htable_lookup(cpu, pc, cs_base, flags); +tb = tb_htable_lookup(cpu, pc, cs_base, flags, trace_vcpu_dstate); if (!tb) { /* if no translated code available, then translate it now */ tb = tb_gen_code(cpu, pc, cs_base, flags, 0); @@ -465,6 +477,7 @@ static inline bool cpu_handle_exception(CPUState *cpu, int *ret) if (unlikely(atomic_read(&cpu->trace_dstate_delayed_req))) { bitmap_copy(cpu->trace_dstate, cpu->trace_dstate_delayed, trace_get_vcpu_event_count()); +tb_flush_jmp_cache_all(cpu); } return tr
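The lookup scheme described in this patch can be illustrated with a toy model: a per-vCPU virtual cache indexed by address only, backed by a shared table whose key includes the tracing state. This is a sketch under simplifying assumptions (dictionaries instead of QHT, no physical addresses or `cs_base`, the state packed into an integer bitmask); `TBCache`, `find` and `max_tb_sets` are hypothetical names.

```python
# Toy model of TB lookup keyed on the dynamic tracing state.

def max_tb_sets(num_vcpu_events):
    """Upper bound on TB sets: 2^E for E events with the 'vcpu' property."""
    return 2 ** num_vcpu_events


class TBCache:
    def __init__(self):
        self.htable = {}       # shared cache, keyed by (pc, flags, dstate)
        self.translations = 0  # number of blocks generated so far

    def find(self, jmp_cache, pc, flags, dstate):
        """Return a TB for pc that matches flags and tracing state."""
        tb = jmp_cache.get(pc)  # virtual cache: indexed by address only
        if tb is None or tb["flags"] != flags or tb["dstate"] != dstate:
            key = (pc, flags, dstate)
            tb = self.htable.get(key)
            if tb is None:
                # No matching block yet: "translate" a new one.
                self.translations += 1
                tb = {"pc": pc, "flags": flags, "dstate": dstate}
                self.htable[key] = tb
            jmp_cache[pc] = tb
        return tb
```

Two vCPUs with the same tracing state share one block, while a vCPU that switches state falls through to the shared table and either reuses an existing block for that state or generates a new one.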
[Qemu-devel] [PATCH v4 2/7] trace: Make trace_get_vcpu_event_count() inlinable
Later patches will make use of it. Signed-off-by: Lluís Vilanova --- trace/control-internal.h |5 + trace/control.c |9 ++--- trace/control.h |2 +- 3 files changed, 8 insertions(+), 8 deletions(-) diff --git a/trace/control-internal.h b/trace/control-internal.h index a9d395a587..beb98a0d2c 100644 --- a/trace/control-internal.h +++ b/trace/control-internal.h @@ -16,6 +16,7 @@ extern int trace_events_enabled_count; +extern uint32_t trace_next_vcpu_id; static inline bool trace_event_is_pattern(const char *str) @@ -82,6 +83,10 @@ static inline bool trace_event_get_vcpu_state_dynamic(CPUState *vcpu, return trace_event_get_vcpu_state_dynamic_by_vcpu_id(vcpu, vcpu_id); } +static inline uint32_t trace_get_vcpu_event_count(void) +{ +return trace_next_vcpu_id; +} void trace_event_register_group(TraceEvent **events); diff --git a/trace/control.c b/trace/control.c index 1a7bee6ddc..52d0e343fa 100644 --- a/trace/control.c +++ b/trace/control.c @@ -36,7 +36,7 @@ typedef struct TraceEventGroup { static TraceEventGroup *event_groups; static size_t nevent_groups; static uint32_t next_id; -static uint32_t next_vcpu_id; +uint32_t trace_next_vcpu_id; QemuOptsList qemu_trace_opts = { .name = "trace", @@ -65,7 +65,7 @@ void trace_event_register_group(TraceEvent **events) for (i = 0; events[i] != NULL; i++) { events[i]->id = next_id++; if (events[i]->vcpu_id != TRACE_VCPU_EVENT_NONE) { -events[i]->vcpu_id = next_vcpu_id++; +events[i]->vcpu_id = trace_next_vcpu_id++; } } event_groups = g_renew(TraceEventGroup, event_groups, nevent_groups + 1); @@ -299,8 +299,3 @@ char *trace_opt_parse(const char *optarg) return trace_file; } - -uint32_t trace_get_vcpu_event_count(void) -{ -return next_vcpu_id; -} diff --git a/trace/control.h b/trace/control.h index ccaeac8552..80d326c4d1 100644 --- a/trace/control.h +++ b/trace/control.h @@ -237,7 +237,7 @@ char *trace_opt_parse(const char *optarg); * * Return the number of known vcpu-specific events */ -uint32_t trace_get_vcpu_event_count(void); +static 
uint32_t trace_get_vcpu_event_count(void); #include "trace/control-internal.h"
[Qemu-devel] [PATCH v4 7/7] trace: [trivial] Statically enable all guest events
The optimizations in this series make it feasible to have them available on all builds. Signed-off-by: Lluís Vilanova --- trace-events |6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/trace-events b/trace-events index f74e1d3d22..0a0f4d9cd6 100644 --- a/trace-events +++ b/trace-events @@ -159,7 +159,7 @@ vcpu guest_cpu_reset(void) # # Mode: user, softmmu # Targets: TCG(all) -disable vcpu tcg guest_mem_before(TCGv vaddr, uint8_t info) "info=%d", "vaddr=0x%016"PRIx64" info=%d" +vcpu tcg guest_mem_before(TCGv vaddr, uint8_t info) "info=%d", "vaddr=0x%016"PRIx64" info=%d" # @num: System call number. # @arg*: System call argument value. @@ -168,7 +168,7 @@ disable vcpu tcg guest_mem_before(TCGv vaddr, uint8_t info) "info=%d", "vaddr=0x # # Mode: user # Targets: TCG(all) -disable vcpu guest_user_syscall(uint64_t num, uint64_t arg1, uint64_t arg2, uint64_t arg3, uint64_t arg4, uint64_t arg5, uint64_t arg6, uint64_t arg7, uint64_t arg8) "num=0x%016"PRIx64" arg1=0x%016"PRIx64" arg2=0x%016"PRIx64" arg3=0x%016"PRIx64" arg4=0x%016"PRIx64" arg5=0x%016"PRIx64" arg6=0x%016"PRIx64" arg7=0x%016"PRIx64" arg8=0x%016"PRIx64 +vcpu guest_user_syscall(uint64_t num, uint64_t arg1, uint64_t arg2, uint64_t arg3, uint64_t arg4, uint64_t arg5, uint64_t arg6, uint64_t arg7, uint64_t arg8) "num=0x%016"PRIx64" arg1=0x%016"PRIx64" arg2=0x%016"PRIx64" arg3=0x%016"PRIx64" arg4=0x%016"PRIx64" arg5=0x%016"PRIx64" arg6=0x%016"PRIx64" arg7=0x%016"PRIx64" arg8=0x%016"PRIx64 # @num: System call number. # @ret: System call result value. @@ -177,4 +177,4 @@ disable vcpu guest_user_syscall(uint64_t num, uint64_t arg1, uint64_t arg2, uint # # Mode: user # Targets: TCG(all) -disable vcpu guest_user_syscall_ret(uint64_t num, uint64_t ret) "num=0x%016"PRIx64" ret=0x%016"PRIx64 +vcpu guest_user_syscall_ret(uint64_t num, uint64_t ret) "num=0x%016"PRIx64" ret=0x%016"PRIx64
[Qemu-devel] [PATCH v4 1/7] exec: [tcg] Refactor flush of per-CPU virtual TB cache
The function is reused in later patches. Signed-off-by: Lluís Vilanova --- cputlb.c|2 +- include/exec/exec-all.h |6 ++ translate-all.c | 14 +- 3 files changed, 16 insertions(+), 6 deletions(-) diff --git a/cputlb.c b/cputlb.c index 813279f3bc..9bf9960e1b 100644 --- a/cputlb.c +++ b/cputlb.c @@ -80,7 +80,7 @@ void tlb_flush(CPUState *cpu, int flush_global) memset(env->tlb_table, -1, sizeof(env->tlb_table)); memset(env->tlb_v_table, -1, sizeof(env->tlb_v_table)); -memset(cpu->tb_jmp_cache, 0, sizeof(cpu->tb_jmp_cache)); +tb_flush_jmp_cache_all(cpu); env->vtlb_index = 0; env->tlb_flush_addr = -1; diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h index a8c13cee66..57cd978578 100644 --- a/include/exec/exec-all.h +++ b/include/exec/exec-all.h @@ -256,6 +256,12 @@ struct TranslationBlock { }; void tb_free(TranslationBlock *tb); +/** + * tb_flush_jmp_cache_all: + * + * Flush the virtual translation block cache. + */ +void tb_flush_jmp_cache_all(CPUState *env); void tb_flush(CPUState *cpu); void tb_phys_invalidate(TranslationBlock *tb, tb_page_addr_t page_addr); diff --git a/translate-all.c b/translate-all.c index 3dd9214904..29ccb9e546 100644 --- a/translate-all.c +++ b/translate-all.c @@ -941,11 +941,7 @@ static void do_tb_flush(CPUState *cpu, run_on_cpu_data tb_flush_count) } CPU_FOREACH(cpu) { -int i; - -for (i = 0; i < TB_JMP_CACHE_SIZE; ++i) { -atomic_set(&cpu->tb_jmp_cache[i], NULL); -} +tb_flush_jmp_cache_all(cpu); } tcg_ctx.tb_ctx.nb_tbs = 0; @@ -1741,6 +1737,14 @@ void tb_check_watchpoint(CPUState *cpu) } } +void tb_flush_jmp_cache_all(CPUState *cpu) +{ +int i; +for (i = 0; i < TB_JMP_CACHE_SIZE; ++i) { +atomic_set(&cpu->tb_jmp_cache[i], NULL); +} +} + #ifndef CONFIG_USER_ONLY /* in deterministic execution mode, instructions doing device I/Os must be at the end of the TB */
[Qemu-devel] [PATCH v4 0/7] trace: [tcg] Optimize per-vCPU tracing states with separate TB caches
Optimizes tracing of events with the 'tcg' and 'vcpu' properties (e.g., memory accesses), making it feasible to statically enable them by default on all QEMU builds.

Some quick'n'dirty numbers with 400.perlbench (SPECcpu2006) on the train input (medium size - suns.pl) and the guest_mem_before event:

* vanilla, statically disabled
    real    0m2,259s
    user    0m2,252s
    sys     0m0,004s

* vanilla, statically enabled (overhead: 2.18x)
    real    0m4,921s
    user    0m4,912s
    sys     0m0,008s

* multi-tb, statically disabled (overhead: 0.99x) [within noise range]
    real    0m2,228s
    user    0m2,216s
    sys     0m0,008s

* multi-tb, statically enabled (overhead: 0.99x) [within noise range]
    real    0m2,229s
    user    0m2,224s
    sys     0m0,004s

Right now, events with the 'tcg' property always generate TCG code to trace that event at guest code execution time, where the event's dynamic state is checked.

This series adds a performance optimization where TCG code for events with the 'tcg' and 'vcpu' properties is not generated if the event is dynamically disabled. This optimization raises two issues:

* An event can be dynamically disabled/enabled after the corresponding TCG code has been generated (i.e., a new TB with the corresponding code should be used).

* Each vCPU can have a different dynamic state for the same event (i.e., tracing the memory accesses of only one process pinned to a vCPU).

To handle both issues, this series integrates the dynamic tracing event state into the TB hashing function, so that vCPUs tracing different events will use separate TBs. Note that only events with the 'vcpu' property are used for hashing (as stored in the bitmap of CPUState->trace_dstate).

This makes dynamic event state changes on vCPUs very efficient, since they can use TBs produced by other vCPUs while on the same event state combination (or produced by the same vCPU, earlier).

Discarded alternatives:

* Emitting TCG code to check if an event needs tracing, where we should still move the tracing call code to either a cold path (making tracing performance worse), or leave it inlined (making non-tracing performance worse).

* Eliding TCG code only when *zero* vCPUs are tracing an event, since enabling it on a single vCPU will impact the performance of all other vCPUs that are not tracing that event.

Signed-off-by: Lluís Vilanova
---

Changes in v4
=============

* Incorporate trace_dstate into the TB hashing function instead of using multiple physical TB caches [suggested by Richard Henderson].

Changes in v3
=============

* Rebase on 0737f32daf.
* Do not use reserved symbol prefixes ("__") [Stefan Hajnoczi].
* Refactor trace_get_vcpu_event_count() to be inlinable.
* Optimize cpu_tb_cache_set_requested() (hottest path).

Changes in v2
=============

* Fix bitmap copy in cpu_tb_cache_set_apply().
* Split generated code re-alignment into a separate patch [Daniel P. Berrange].

Lluís Vilanova (7):
  exec: [tcg] Refactor flush of per-CPU virtual TB cache
  trace: Make trace_get_vcpu_event_count() inlinable
  trace: [tcg] Delay changes to dynamic state when translating
  exec: [tcg] Use different TBs according to the vCPU's dynamic tracing state
  trace: [tcg] Do not generate TCG code to trace dinamically-disabled events
  trace: [tcg,trivial] Re-align generated code
  trace: [trivial] Statically enable all guest events

 cpu-exec.c                               | 52
 cputlb.c                                 |  2
 include/exec/exec-all.h                  | 13
 include/exec/tb-hash-xx.h                | 10
 include/exec/tb-hash.h                   |  5
 include/qom/cpu.h                        |  7
 qom/cpu.c                                |  4
 scripts/tracetool/__init__.py            |  1
 scripts/tracetool/backend/dtrace.py      |  2
 scripts/tracetool/backend/ftrace.py      | 20
 scripts/tracetool/backend/log.py         | 17
 scripts/tracetool/backend/simple.py      |  2
 scripts/tracetool/backend/syslog.py      |  6
 scripts/tracetool/backend/ust.py         |  2
 scripts/tracetool/format/h.py            | 24
 scripts/tracetool/format/tcg_h.py        | 19
 scripts/tracetool/format/tcg_helper_c.py |  3
 tests/qht-bench.c                        |  2
 trace-events                             |  6
 trace/control-internal.h                 |  5
 trace/control-target.c                   | 14
 trace/control.c                          |  9
 trace/control.h                          |  5
 translate-all.c                          | 30
 24 files changed, 196 insertions(+), 64 deletions(-)

To: qemu-devel@nongnu.org
Cc: Stefan Hajnoczi
Cc: Eduardo Habkost
Cc: Eric Blake
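The overhead factors quoted in the cover letter can be reproduced from the wall-clock ("real") times. The short sketch below assumes that each factor is the run's real time divided by the "vanilla, statically disabled" baseline, rounded to two decimals; the variable names are illustrative only.

```python
# Recompute the quoted overhead factors from the "real" times (seconds).
# Assumption: overhead = real time / real time of the vanilla disabled run.

baseline = 2.259  # vanilla, statically disabled

real_times = {
    "vanilla, statically enabled": 4.921,
    "multi-tb, statically disabled": 2.228,
    "multi-tb, statically enabled": 2.229,
}

overhead = {name: round(t / baseline, 2) for name, t in real_times.items()}

for name, factor in overhead.items():
    print(f"{name}: {factor:.2f}x")
```

Under that assumption the ratios come out at 2.18x, 0.99x and 0.99x, matching the figures in the benchmark list.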