Re: [Qemu-devel] [PATCHv2 02/11] iscsi: read unmap info from block limits vpd page
Right. I don't see any problem with your patch. On Sat, Jul 6, 2013 at 3:15 PM, Peter Lieven wrote: > ok, to sum up you see no potential problem with my patch to optimize write > zeroes by > unmap iff lbprz==1 and lbpme == 1 ? ACK > > the alternative would be to use writesame16 and sent a zero block. this would > allow > an optimization also if lbprz == 0. in this case i would not set the unmap > bit. > > peter > > > Am 05.07.2013 um 09:11 schrieb ronnie sahlberg : > >> The device MIGHT map or anchor the blocks after the unmap but it may >> only do so if the blocks that become mapped are all zero. >> >> So I think you can safely assume that if lbprz==1 then it will always >> read back as zero no matter what happens internally in the target. >> >> Either the block becomes unmapped/deallocated and then it will read >> back as zero, >> or the blocks is automatically re-mapped to anchored/mapped again >> but this can only happen if the mapped block is all zero so once again >> it will still read back as all zero. >> >> === >> >> 4.7.3.5 Autonomous LBA transitions >> A device server may perform the following actions at any time: >> a) transition any deallocated LBA to mapped; >> b) transition any anchored LBA to mapped; or >> c) transition any deallocated LBA to anchored. >> If the LBPRZ bit in the READ CAPACITY (16) parameter data (see 5.16.2) >> is set to one, and a mapped LBA >> references a logical block that contains: >> a) user data with all bits set to zero; and >> Working Draft SCSI Block Commands – 3 (SBC-3) >> 27T10/BSR INCITS 514 Revision 35d >> 15 May 2013 >> b) protection information, if any, set to ___h, >> then the device server may transition that mapped LBA to anchored or >> deallocated at any time. >> The logical block provisioning st >> >> On Thu, Jul 4, 2013 at 2:07 PM, Peter Lieven wrote: >>> >>> Am 04.07.2013 um 14:37 schrieb Paolo Bonzini : >>> Il 03/07/2013 23:23, Peter Lieven ha scritto: > BDC is not used. I had an implementation that sent multiple descriptors > out, but > at least for my storage the maximum unmap counts not for each > descriptors, but for all > together. So in this case we do not need the field at all. I forgot to > remove it. > > discard and write_zeroes will both only send one request up to max_unmap > in size. > > apropos write_zeroes: do you know if UNMAP is guaranteed to unmap data if > lbprz == 1? Yes. On the other hand note that WRITE_SAME should be guaranteed _not_ to unmap if lbprz == 0 and you do WRITE_SAME with UNMAP and a zero payload, but I suspect there may be buggy targets here. > I have read in the specs something that the target might unmap the blocks > or not touch them at all. > Maybe you have more information. That's even true of UNMAP itself, actually. :) The storage can always "upgrade" a block from unmapped to anchored and from anchored to allocated, so UNMAP can be a no-op and still comply with the standard. >>> >>> My concern was, if I UNMAP a block and lbprz == 1 is it guaranteed that it >>> reads >>> as zero afterwards? Regardless if the target decides to "upgrade" the block >>> or do >>> not unmap the block? >>> >>> Peter >>> >
Re: [Qemu-devel] [PATCH RFC qom-cpu 33/41] cpu: Turn cpu_get_phys_page_debug() into a CPUClass hook
On Sun, Jul 7, 2013 at 12:19 AM, Andreas Färber wrote: > Am 29.06.2013 22:01, schrieb Andreas Färber: >> Since all targets now assign a softmmu-only field, we can drop helpers >> cpu_class_set_{do_unassigned_access,vmsd}() and device_class_set_vmsd(). >> >> Prepares for changing cpu_memory_rw_debug() argument to CPUState. >> >> Signed-off-by: Andreas Färber > > The following fixup for xtensa is required. This demonstrates how > dangerous it is to pass void* arguments directly into functions - please > always use a typed local variable to enforce type checks. With this fix it works, so for xtensa part Acked-by: Max Filippov My question about having configuration pointer cached in the env still remains though. -- Thanks. -- Max
Re: [Qemu-devel] [PATCHv2 02/11] iscsi: read unmap info from block limits vpd page
ok, to sum up you see no potential problem with my patch to optimize write zeroes by unmap iff lbprz==1 and lbpme == 1 ? the alternative would be to use writesame16 and sent a zero block. this would allow an optimization also if lbprz == 0. in this case i would not set the unmap bit. peter Am 05.07.2013 um 09:11 schrieb ronnie sahlberg : > The device MIGHT map or anchor the blocks after the unmap but it may > only do so if the blocks that become mapped are all zero. > > So I think you can safely assume that if lbprz==1 then it will always > read back as zero no matter what happens internally in the target. > > Either the block becomes unmapped/deallocated and then it will read > back as zero, > or the blocks is automatically re-mapped to anchored/mapped again > but this can only happen if the mapped block is all zero so once again > it will still read back as all zero. > > === > > 4.7.3.5 Autonomous LBA transitions > A device server may perform the following actions at any time: > a) transition any deallocated LBA to mapped; > b) transition any anchored LBA to mapped; or > c) transition any deallocated LBA to anchored. > If the LBPRZ bit in the READ CAPACITY (16) parameter data (see 5.16.2) > is set to one, and a mapped LBA > references a logical block that contains: > a) user data with all bits set to zero; and > Working Draft SCSI Block Commands – 3 (SBC-3) > 27T10/BSR INCITS 514 Revision 35d > 15 May 2013 > b) protection information, if any, set to ___h, > then the device server may transition that mapped LBA to anchored or > deallocated at any time. > The logical block provisioning st > > On Thu, Jul 4, 2013 at 2:07 PM, Peter Lieven wrote: >> >> Am 04.07.2013 um 14:37 schrieb Paolo Bonzini : >> >>> Il 03/07/2013 23:23, Peter Lieven ha scritto: BDC is not used. I had an implementation that sent multiple descriptors out, but at least for my storage the maximum unmap counts not for each descriptors, but for all together. So in this case we do not need the field at all. I forgot to remove it. discard and write_zeroes will both only send one request up to max_unmap in size. apropos write_zeroes: do you know if UNMAP is guaranteed to unmap data if lbprz == 1? >>> >>> Yes. On the other hand note that WRITE_SAME should be guaranteed _not_ >>> to unmap if lbprz == 0 and you do WRITE_SAME with UNMAP and a zero >>> payload, but I suspect there may be buggy targets here. >>> I have read in the specs something that the target might unmap the blocks or not touch them at all. Maybe you have more information. >>> >>> That's even true of UNMAP itself, actually. :) >>> >>> The storage can always "upgrade" a block from unmapped to anchored and >>> from anchored to allocated, so UNMAP can be a no-op and still comply >>> with the standard. >> >> My concern was, if I UNMAP a block and lbprz == 1 is it guaranteed that it >> reads >> as zero afterwards? Regardless if the target decides to "upgrade" the block >> or do >> not unmap the block? >> >> Peter >>
[Qemu-devel] [PATCH] linux-user: Fix target_stat and target_stat64 for OpenRISC
OpenRISC uses the asm-generic versions of target_stat and target_stat64, but it was incorrectly using the x86/ARM/etc version due to a misplaced defined(TARGET_OPENRISC). The previously unused OpenRISC section of the ifdef ladder also defined an incorrect target_stat and omitted the target_stat64 definition. Fix target_stat, provide target_stat64, and add a comment noting that these are the asm-generic versions for the benefit of future ports. Signed-off-by: Peter Maydell --- This fixes basic problems like "ls -l output is nonsense" and "shell thinks programs aren't executable and won't run them". linux-user/syscall_defs.h | 49 ++--- 1 file changed, 37 insertions(+), 12 deletions(-) diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h index 92c01a9..cb6341f 100644 --- a/linux-user/syscall_defs.h +++ b/linux-user/syscall_defs.h @@ -1138,8 +1138,7 @@ struct target_winsize { #endif #if (defined(TARGET_I386) && defined(TARGET_ABI32)) || defined(TARGET_ARM) \ -|| defined(TARGET_CRIS) || defined(TARGET_UNICORE32) \ -|| defined(TARGET_OPENRISC) +|| defined(TARGET_CRIS) || defined(TARGET_UNICORE32) struct target_stat { unsigned short st_dev; unsigned short __pad1; @@ -1837,29 +1836,55 @@ struct target_stat { abi_ulong __unused[3]; }; #elif defined(TARGET_OPENRISC) + +/* These are the asm-generic versions of the stat and stat64 structures */ + struct target_stat { abi_ulong st_dev; abi_ulong st_ino; -abi_ulong st_nlink; - unsigned int st_mode; +unsigned int st_nlink; unsigned int st_uid; unsigned int st_gid; -unsigned int __pad0; abi_ulong st_rdev; +abi_ulong __pad1; abi_long st_size; -abi_long st_blksize; -abi_long st_blocks;/* Number 512-byte blocks allocated. */ - -abi_ulong target_st_atime; +int st_blksize; +int __pad2; +abi_long st_blocks; +abi_long target_st_atime; abi_ulong target_st_atime_nsec; -abi_ulong target_st_mtime; +abi_long target_st_mtime; abi_ulong target_st_mtime_nsec; -abi_ulong target_st_ctime; +abi_long target_st_ctime; abi_ulong target_st_ctime_nsec; +unsigned int __unused4; +unsigned int __unused5; +}; -abi_long __unused[3]; +struct target_stat64 { +uint64_t st_dev; +uint64_t st_ino; +unsigned int st_mode; +unsigned int st_nlink; +unsigned int st_uid; +unsigned int st_gid; +uint64_t st_rdev; +uint64_t __pad1; +int64_t st_size; +int st_blksize; +int __pad2; +int64_t st_blocks; +int target_st_atime; +unsigned int target_st_atime_nsec; +int target_st_mtime; +unsigned int target_st_mtime_nsec; +int target_st_ctime; +unsigned int target_st_ctime_nsec; +unsigned int __unused4; +unsigned int __unused5; }; + #else #error unsupported CPU #endif -- 1.7.9.5
Re: [Qemu-devel] [PATCH RFC qom-cpu 33/41] cpu: Turn cpu_get_phys_page_debug() into a CPUClass hook
Am 29.06.2013 22:01, schrieb Andreas Färber: > Since all targets now assign a softmmu-only field, we can drop helpers > cpu_class_set_{do_unassigned_access,vmsd}() and device_class_set_vmsd(). > > Prepares for changing cpu_memory_rw_debug() argument to CPUState. > > Signed-off-by: Andreas Färber The following fixup for xtensa is required. This demonstrates how dangerous it is to pass void* arguments directly into functions - please always use a typed local variable to enforce type checks. Andreas diff --git a/hw/xtensa/xtensa_lx60.c b/hw/xtensa/xtensa_lx60.c index 650dd31..a079774 100644 --- a/hw/xtensa/xtensa_lx60.c +++ b/hw/xtensa/xtensa_lx60.c @@ -144,9 +144,11 @@ static void lx60_net_init(MemoryRegion *address_space, memory_region_add_subregion(address_space, buffers, ram); } -static uint64_t translate_phys_addr(void *env, uint64_t addr) +static uint64_t translate_phys_addr(void *opaque, uint64_t addr) { -return cpu_get_phys_page_debug(env, addr); +XtensaCPU *cpu = opaque; + +return cpu_get_phys_page_debug(CPU(cpu), addr); } static void lx60_reset(void *opaque) @@ -252,7 +254,7 @@ static void lx_init(const LxBoardDesc *board, QEMUMachineInitArgs *args) } uint64_t elf_entry; uint64_t elf_lowaddr; -int success = load_elf(kernel_filename, translate_phys_addr, env, +int success = load_elf(kernel_filename, translate_phys_addr, cpu, &elf_entry, &elf_lowaddr, NULL, be, ELF_MACHINE, 0); if (success > 0) { env->pc = elf_entry; diff --git a/hw/xtensa/xtensa_sim.c b/hw/xtensa/xtensa_sim.c index 5241f8d..a4194cfb 100644 --- a/hw/xtensa/xtensa_sim.c +++ b/hw/xtensa/xtensa_sim.c @@ -32,9 +32,11 @@ #include "exec/memory.h" #include "exec/address-spaces.h" -static uint64_t translate_phys_addr(void *env, uint64_t addr) +static uint64_t translate_phys_addr(void *opaque, uint64_t addr) { -return cpu_get_phys_page_debug(env, addr); +XtensaCPU *cpu = opaque; + +return cpu_get_phys_page_debug(CPU(cpu), addr); } static void sim_reset(void *opaque) @@ -88,10 +90,10 @@ static void xtensa_sim_init(QEMUMachineInitArgs *args) uint64_t elf_entry; uint64_t elf_lowaddr; #ifdef TARGET_WORDS_BIGENDIAN -int success = load_elf(kernel_filename, translate_phys_addr, env, +int success = load_elf(kernel_filename, translate_phys_addr, cpu, &elf_entry, &elf_lowaddr, NULL, 1, ELF_MACHINE, 0); #else -int success = load_elf(kernel_filename, translate_phys_addr, env, +int success = load_elf(kernel_filename, translate_phys_addr, cpu, &elf_entry, &elf_lowaddr, NULL, 0, ELF_MACHINE, 0); #endif if (success > 0) { -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Re: [Qemu-devel] [PATCH RFC qom-cpu 39/41] target-xtensa: Introduce XtensaCPU subclasses
On Sat, Jul 6, 2013 at 11:12 PM, Andreas Färber wrote: > Am 06.07.2013 20:39, schrieb Max Filippov: >> On Sat, Jul 6, 2013 at 10:01 PM, Max Filippov wrote: >>> On Sat, Jul 6, 2013 at 4:55 PM, Andreas Färber wrote: Am 29.06.2013 22:01, schrieb Andreas Färber: > Register a CPU type per core registered. Save the XtensaConfig in > XtensaCPUClass instead of CPUXtensaState. > > Prepares for storing per-class GDB register count. > > Signed-off-by: Andreas Färber Ping! Can you ack? (It did not seem to break your test image.) >>> >>> Hi Andreas, >>> >>> I tried make check -C tests/tcg/xtensa with the branch you recommended >>> and it segfaults on elf loading >> >> ...and maybe a stupid question, but why moving configuration pointer away >> from env and then changing every place that used to access it? > > Xtensa is the only target trying to implicitly access an "env" variable > through a macro to obtain the number of registers for gdbstub. That's > what I'm trying to fix with 40/41 in order to get rid of the #ifdeffery. > > The number of registers is not accessed by TCG, so it could go into > XtensaCPU instead of CPUXtensaState. > Further it does not change during vCPU runtime, so it no longer belongs > in CPUXtensaState nor XtensaCPU but in XtensaCPUClass, which is shared > among CPU cores and can be accessed statically. I'm concerned about two things here: adding boilerplate code to access CPU class and doing more work at runtime than just following a pointer. I mean that env->config is almost always used together with env itself, could we consider env->config to be a cache for xcc->config? > However we only had one XtensaCPUClass but multiple XtensaConfigs. > Therefore this patch registers one XtensaCPUClass per XtensaConfig. > > You might remember that I once tried to place the XtensaConfig fields > directly into XtensaCPUClass but that didn't work out nicely back then. > This is by comparison a slim/minimal conversion to subclasses, leaving > XtensaConfig and your custom registration mechanisms in place. Cleaning > that up to use type_init() and type_register_static() in the respective > source files and changing -cpu ? to iterate QOM types would be a nice > follow-up, but not needed for this gdbstub refactoring series. > > Another background is that I'm trying to get rid of cpu_init() and > anything that blocks using -device or device_add with QOM CPUs. -- Thanks. -- Max
Re: [Qemu-devel] [PATCH RFC qom-cpu 39/41] target-xtensa: Introduce XtensaCPU subclasses
Am 06.07.2013 20:01, schrieb Max Filippov: > On Sat, Jul 6, 2013 at 4:55 PM, Andreas Färber wrote: >> Max, >> >> Am 29.06.2013 22:01, schrieb Andreas Färber: >>> Register a CPU type per core registered. Save the XtensaConfig in >>> XtensaCPUClass instead of CPUXtensaState. >>> >>> Prepares for storing per-class GDB register count. >>> >>> Signed-off-by: Andreas Färber >> >> Ping! Can you ack? (It did not seem to break your test image.) > > Hi Andreas, > > I tried make check -C tests/tcg/xtensa with the branch you recommended > and it segfaults on elf loading: First bad commit seems to be "cpu: Turn cpu_get_phys_page_debug() into a CPUClass hook", investigating. Andreas -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Re: [Qemu-devel] [PATCH RFC qom-cpu 39/41] target-xtensa: Introduce XtensaCPU subclasses
On Sat, Jul 6, 2013 at 10:45 PM, Andreas Färber wrote: > Hi Max, > > Am 06.07.2013 20:01, schrieb Max Filippov: >> On Sat, Jul 6, 2013 at 4:55 PM, Andreas Färber wrote: >>> Am 29.06.2013 22:01, schrieb Andreas Färber: Register a CPU type per core registered. Save the XtensaConfig in XtensaCPUClass instead of CPUXtensaState. Prepares for storing per-class GDB register count. Signed-off-by: Andreas Färber >>> >>> Ping! Can you ack? (It did not seem to break your test image.) >> >> I tried make check -C tests/tcg/xtensa with the branch you recommended >> and it segfaults on elf loading: > > That's weird, I'm seeing a similar breakage with the test image on > today's qom-cpu-11 branch. It is not caused by this commit though, so > must be something I reordered last minute... sorry. > > The make command above was not working, I've sent a patch for > out-of-tree builds and to update it to the current tree layout with tcg/ > subdirectory. I have actually always been using it in-tree, adjusting PATH to pick up qemu-system-xtensa from the correct build tree... I've made a couple more fixes to your patch so that it actually builds and runs tests out-of-tree. -- Thanks. -- Max
[Qemu-devel] [PATCH] tests/tcg/xtensa: Fix out-of-tree build
From: Andreas Färber Signed-off-by: Andreas Färber Signed-off-by: Max Filippov --- configure | 5 +++-- tests/tcg/xtensa/Makefile | 20 +++- 2 files changed, 14 insertions(+), 11 deletions(-) diff --git a/configure b/configure index 0e0adde..7b60efd 100755 --- a/configure +++ b/configure @@ -4469,13 +4469,14 @@ if [ "$dtc_internal" = "yes" ]; then fi # build tree in object directory in case the source is not in the current directory -DIRS="tests tests/tcg tests/tcg/cris tests/tcg/lm32 tests/libqos" +DIRS="tests tests/tcg tests/tcg/cris tests/tcg/lm32 tests/tcg/xtensa" +DIRS="$DIRS tests/libqos" DIRS="$DIRS pc-bios/optionrom pc-bios/spapr-rtas pc-bios/s390-ccw" DIRS="$DIRS roms/seabios roms/vgabios" DIRS="$DIRS qapi-generated" FILES="Makefile tests/tcg/Makefile qdict-test-data.txt" FILES="$FILES tests/tcg/cris/Makefile tests/tcg/cris/.gdbinit" -FILES="$FILES tests/tcg/lm32/Makefile po/Makefile" +FILES="$FILES tests/tcg/lm32/Makefile tests/tcg/xtensa/Makefile po/Makefile" FILES="$FILES pc-bios/optionrom/Makefile pc-bios/keymaps" FILES="$FILES pc-bios/spapr-rtas/Makefile" FILES="$FILES pc-bios/s390-ccw/Makefile" diff --git a/tests/tcg/xtensa/Makefile b/tests/tcg/xtensa/Makefile index 002fd87..dacafe0 100644 --- a/tests/tcg/xtensa/Makefile +++ b/tests/tcg/xtensa/Makefile @@ -1,9 +1,9 @@ --include ../../config-host.mak +-include ../../../config-host.mak CROSS=xtensa-dc232b-elf- ifndef XT -SIM = qemu-system-xtensa +SIM = ../../../xtensa-softmmu/qemu-system-xtensa SIMFLAGS = -M sim -cpu dc232b -nographic -semihosting $(EXTFLAGS) -kernel SIMDEBUG = -s -S else @@ -13,10 +13,12 @@ SIMDEBUG = --gdbserve=0 endif CC = $(CROSS)gcc -AS = $(CROSS)gcc -x assembler +AS = $(CROSS)gcc -x assembler-with-cpp LD = $(CROSS)ld -LDFLAGS = -Tlinker.ld +XTENSA_SRC_PATH = $(SRC_PATH)/tests/tcg/xtensa + +LDFLAGS = -T$(XTENSA_SRC_PATH)/linker.ld CRT= crt.o vectors.o @@ -52,13 +54,13 @@ TESTCASES += test_windowed.tst all: build -%.o: $(SRC_PATH)/tests/xtensa/%.c - $(CC) $(CFLAGS) -c $< -o $@ +%.o: $(XTENSA_SRC_PATH)/%.c + $(CC) -I$(XTENSA_SRC_PATH) $(CFLAGS) -c $< -o $@ -%.o: $(SRC_PATH)/tests/xtensa/%.S - $(AS) $(ASFLAGS) -c $< -o $@ +%.o: $(XTENSA_SRC_PATH)/%.S + $(AS) -Wa,-I,$(XTENSA_SRC_PATH) $(ASFLAGS) -c $< -o $@ -%.tst: %.o macros.inc $(CRT) Makefile +%.tst: %.o $(XTENSA_SRC_PATH)/macros.inc $(CRT) Makefile $(LD) $(LDFLAGS) $(NOSTDFLAGS) $(CRT) $< -o $@ build: $(TESTCASES) -- 1.8.1.4
Re: [Qemu-devel] [PATCH RFC qom-cpu 41/41] cpu: Introduce CPUClass::gdb_{read, write}_register()
Am 01.07.2013 20:07, schrieb Richard Henderson: > On 06/29/2013 01:01 PM, Andreas Färber wrote: >> Replace GET_REG*() macros by gdb_get_reg*() inline functions for >> clarity and drop breaks after return. >> >> Allows to move target-specific code to new target-*/gdbstub.c files. >> >> Signed-off-by: Andreas Färber > > This one's quite hard to read. True. Apart from the IMO too generically named GET_REG*() macros, I needed to fix an awful number of Coding Style issues, which we might want to do upfront. I could also do the GET_REG*() -> gdb_get_reg*() sweep in place and split off the break cleanups, too. > I wonder if there's value in splitting out functions one target at a time into > the new gdbstub.c files, and only afterward performing a few minor textual > changes to transform those functions into the CPUClass methods? That's not so easy, but I'll think of something. #include'ing the C files until the final change to CPUClass hooks might work... Andreas -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Re: [Qemu-devel] [PATCH RFC qom-cpu 39/41] target-xtensa: Introduce XtensaCPU subclasses
Am 06.07.2013 20:39, schrieb Max Filippov: > On Sat, Jul 6, 2013 at 10:01 PM, Max Filippov wrote: >> On Sat, Jul 6, 2013 at 4:55 PM, Andreas Färber wrote: >>> Am 29.06.2013 22:01, schrieb Andreas Färber: Register a CPU type per core registered. Save the XtensaConfig in XtensaCPUClass instead of CPUXtensaState. Prepares for storing per-class GDB register count. Signed-off-by: Andreas Färber >>> >>> Ping! Can you ack? (It did not seem to break your test image.) >> >> Hi Andreas, >> >> I tried make check -C tests/tcg/xtensa with the branch you recommended >> and it segfaults on elf loading > > ...and maybe a stupid question, but why moving configuration pointer away > from env and then changing every place that used to access it? Xtensa is the only target trying to implicitly access an "env" variable through a macro to obtain the number of registers for gdbstub. That's what I'm trying to fix with 40/41 in order to get rid of the #ifdeffery. The number of registers is not accessed by TCG, so it could go into XtensaCPU instead of CPUXtensaState. Further it does not change during vCPU runtime, so it no longer belongs in CPUXtensaState nor XtensaCPU but in XtensaCPUClass, which is shared among CPU cores and can be accessed statically. However we only had one XtensaCPUClass but multiple XtensaConfigs. Therefore this patch registers one XtensaCPUClass per XtensaConfig. You might remember that I once tried to place the XtensaConfig fields directly into XtensaCPUClass but that didn't work out nicely back then. This is by comparison a slim/minimal conversion to subclasses, leaving XtensaConfig and your custom registration mechanisms in place. Cleaning that up to use type_init() and type_register_static() in the respective source files and changing -cpu ? to iterate QOM types would be a nice follow-up, but not needed for this gdbstub refactoring series. Another background is that I'm trying to get rid of cpu_init() and anything that blocks using -device or device_add with QOM CPUs. Regards, Andreas -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Re: [Qemu-devel] [PATCH RFC qom-cpu 39/41] target-xtensa: Introduce XtensaCPU subclasses
Hi Max, Am 06.07.2013 20:01, schrieb Max Filippov: > On Sat, Jul 6, 2013 at 4:55 PM, Andreas Färber wrote: >> Am 29.06.2013 22:01, schrieb Andreas Färber: >>> Register a CPU type per core registered. Save the XtensaConfig in >>> XtensaCPUClass instead of CPUXtensaState. >>> >>> Prepares for storing per-class GDB register count. >>> >>> Signed-off-by: Andreas Färber >> >> Ping! Can you ack? (It did not seem to break your test image.) > > I tried make check -C tests/tcg/xtensa with the branch you recommended > and it segfaults on elf loading: That's weird, I'm seeing a similar breakage with the test image on today's qom-cpu-11 branch. It is not caused by this commit though, so must be something I reordered last minute... sorry. The make command above was not working, I've sent a patch for out-of-tree builds and to update it to the current tree layout with tcg/ subdirectory. Andreas > #0 object_class_dynamic_cast_assert (class=0x0, > typename=typename@entry=0x5573a85e "cpu", > file=file@entry=0x557349c8 > "/home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/include/qom/cpu.h", > line=line@entry=290, func=func@entry= > 0x55735770 <__func__.17127> "cpu_get_phys_page_debug") at > /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/qom/object.c:535 > #1 0x556b884b in cpu_get_phys_page_debug (addr=3489660928, > cpu=0x56275a30) at > /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/include/qom/cpu.h:290 > #2 translate_phys_addr (env=0x56275a30, addr=3489660928) at > /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/hw/xtensa/xtensa_sim.c:37 > #3 0x55602b9b in load_elf32 (clear_lsb=0, elf_machine=94, > highaddr=0x0, lowaddr=0x7fffd3b0, pentry=0x7fffd3a8, > must_swab=, translate_opaque=0x56275a30, > translate_fn= > 0x556b8800 , fd=10, name=0x56267df0 > "./test_b.tst") at > /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/include/hw/elf_ops.h:269 > #4 load_elf (filename=filename@entry=0x56267df0 "./test_b.tst", > translate_fn=translate_fn@entry=0x556b8800 , > translate_opaque=translate_opaque@entry=0x56275a30, > pentry=pentry@entry= > 0x7fffd3a8, lowaddr=lowaddr@entry=0x7fffd3b0, > highaddr=highaddr@entry=0x0, big_endian=big_endian@entry=0, > elf_machine=elf_machine@entry=94, clear_lsb=clear_lsb@entry=0) > at /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/hw/core/loader.c:326 > #5 0x556b8a23 in xtensa_sim_init (args=) at > /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/hw/xtensa/xtensa_sim.c:94 > #6 0x5559cf6f in main (argc=, argv= out>, envp=) at > /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/vl.c:4286 > > The mainline is ok. > -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
[Qemu-devel] [PATCH] tests/tcg/xtensa: Fix out-of-tree build
Signed-off-by: Andreas Färber --- configure | 5 +++-- tests/tcg/xtensa/Makefile | 8 2 files changed, 7 insertions(+), 6 deletions(-) diff --git a/configure b/configure index 0e0adde..7b60efd 100755 --- a/configure +++ b/configure @@ -4469,13 +4469,14 @@ if [ "$dtc_internal" = "yes" ]; then fi # build tree in object directory in case the source is not in the current directory -DIRS="tests tests/tcg tests/tcg/cris tests/tcg/lm32 tests/libqos" +DIRS="tests tests/tcg tests/tcg/cris tests/tcg/lm32 tests/tcg/xtensa" +DIRS="$DIRS tests/libqos" DIRS="$DIRS pc-bios/optionrom pc-bios/spapr-rtas pc-bios/s390-ccw" DIRS="$DIRS roms/seabios roms/vgabios" DIRS="$DIRS qapi-generated" FILES="Makefile tests/tcg/Makefile qdict-test-data.txt" FILES="$FILES tests/tcg/cris/Makefile tests/tcg/cris/.gdbinit" -FILES="$FILES tests/tcg/lm32/Makefile po/Makefile" +FILES="$FILES tests/tcg/lm32/Makefile tests/tcg/xtensa/Makefile po/Makefile" FILES="$FILES pc-bios/optionrom/Makefile pc-bios/keymaps" FILES="$FILES pc-bios/spapr-rtas/Makefile" FILES="$FILES pc-bios/s390-ccw/Makefile" diff --git a/tests/tcg/xtensa/Makefile b/tests/tcg/xtensa/Makefile index 002fd87..c77ff76 100644 --- a/tests/tcg/xtensa/Makefile +++ b/tests/tcg/xtensa/Makefile @@ -1,4 +1,4 @@ --include ../../config-host.mak +-include ../../../config-host.mak CROSS=xtensa-dc232b-elf- @@ -52,13 +52,13 @@ TESTCASES += test_windowed.tst all: build -%.o: $(SRC_PATH)/tests/xtensa/%.c +%.o: $(SRC_PATH)/tests/tcg/xtensa/%.c $(CC) $(CFLAGS) -c $< -o $@ -%.o: $(SRC_PATH)/tests/xtensa/%.S +%.o: $(SRC_PATH)/tests/tcg/xtensa/%.S $(AS) $(ASFLAGS) -c $< -o $@ -%.tst: %.o macros.inc $(CRT) Makefile +%.tst: %.o $(SRC_PATH)/tests/tcg/xtensa/macros.inc $(CRT) Makefile $(LD) $(LDFLAGS) $(NOSTDFLAGS) $(CRT) $< -o $@ build: $(TESTCASES) -- 1.8.1.4
Re: [Qemu-devel] [PATCH RFC qom-cpu 39/41] target-xtensa: Introduce XtensaCPU subclasses
On Sat, Jul 6, 2013 at 10:01 PM, Max Filippov wrote: > On Sat, Jul 6, 2013 at 4:55 PM, Andreas Färber wrote: >> Max, >> >> Am 29.06.2013 22:01, schrieb Andreas Färber: >>> Register a CPU type per core registered. Save the XtensaConfig in >>> XtensaCPUClass instead of CPUXtensaState. >>> >>> Prepares for storing per-class GDB register count. >>> >>> Signed-off-by: Andreas Färber >> >> Ping! Can you ack? (It did not seem to break your test image.) > > Hi Andreas, > > I tried make check -C tests/tcg/xtensa with the branch you recommended > and it segfaults on elf loading ...and maybe a stupid question, but why moving configuration pointer away from env and then changing every place that used to access it? -- Thanks. -- Max
[Qemu-devel] does anybody have a working set of linux-m68k/coldfire binaries?
I'd like to test some changes I'm making which affect the qemu linux-m68k target, but I can't find a working set of binaries to test them with. Everything I've found online (including the ones in the linux-user test tarball on the qemu wiki) just dies pretty quickly with an illegal instruction (possibly because the binaries want some CPU features/insns our coldfire CPU emulation doesn't support?) Does anybody have some test binaries that actually work? Otherwise I'll just make the m68k changes that look to be right, on the basis that they clean things up and if it's already broken it can't get worse... thanks -- PMM
[Qemu-devel] [PATCHv2] [RFC] aio/async: Add timed bottom-halves
Add timed bottom halves. A timed bottom half is a bottom half that will not execute until a given time has passed (qemu_bh_schedule_at) or a given interval has passed (qemu_bh_schedule_in). Any qemu clock can be used, and times are specified in nanoseconds. Timed bottom halves can be used where timers cannot. For instance, in block drivers where there is no mainloop that calls timers (qemu-nbd, qemu-img), or where (per stefa...@redhat.com) the aio code loops internally and thus timers never get called. Changes since v1: * aio_ctx_prepare should cope with wait<0 * aio_ctx_prepare should round up wait time Signed-off-by: Alex Bligh --- async.c | 53 +-- include/block/aio.h | 33 tests/test-aio.c| 47 + 3 files changed, 127 insertions(+), 6 deletions(-) diff --git a/async.c b/async.c index 90fe906..d93b1ab 100644 --- a/async.c +++ b/async.c @@ -35,6 +35,8 @@ struct QEMUBH { QEMUBHFunc *cb; void *opaque; QEMUBH *next; +QEMUClock *clock; +int64_t time; bool scheduled; bool idle; bool deleted; @@ -52,6 +54,11 @@ QEMUBH *aio_bh_new(AioContext *ctx, QEMUBHFunc *cb, void *opaque) return bh; } +static inline int64_t qemu_bh_ready_in(QEMUBH *bh) +{ +return (bh->clock) ? (bh->time - qemu_get_clock_ns(bh->clock)) : 0; +} + int aio_bh_poll(AioContext *ctx) { QEMUBH *bh, **bhp, *next; @@ -62,8 +69,10 @@ int aio_bh_poll(AioContext *ctx) ret = 0; for (bh = ctx->first_bh; bh; bh = next) { next = bh->next; -if (!bh->deleted && bh->scheduled) { +if (!bh->deleted && bh->scheduled && qemu_bh_ready_in(bh) <= 0) { bh->scheduled = 0; +bh->clock = NULL; +bh->time = 0; if (!bh->idle) ret = 1; bh->idle = 0; @@ -96,6 +105,8 @@ void qemu_bh_schedule_idle(QEMUBH *bh) return; bh->scheduled = 1; bh->idle = 1; +bh->clock = NULL; +bh->time = 0; } void qemu_bh_schedule(QEMUBH *bh) @@ -104,18 +115,39 @@ void qemu_bh_schedule(QEMUBH *bh) return; bh->scheduled = 1; bh->idle = 0; +bh->clock = NULL; +bh->time = 0; aio_notify(bh->ctx); } +void qemu_bh_schedule_at(QEMUBH *bh, QEMUClock *clock, int64_t time) +{ +/* Allow rescheduling if already scheduled */ +bh->scheduled = 1; +bh->idle = 0; +bh->clock = clock; +bh->time = time; +aio_notify(bh->ctx); /*FIXME: is this right?*/ +} + +void qemu_bh_schedule_in(QEMUBH *bh, QEMUClock *clock, int64_t time) +{ +qemu_bh_schedule_at(bh, clock, qemu_get_clock_ns(clock) + time); +} + void qemu_bh_cancel(QEMUBH *bh) { bh->scheduled = 0; +bh->clock = NULL; +bh->time = 0; } void qemu_bh_delete(QEMUBH *bh) { bh->scheduled = 0; bh->deleted = 1; +bh->clock = NULL; +bh->time = 0; } static gboolean @@ -126,13 +158,22 @@ aio_ctx_prepare(GSource *source, gint*timeout) for (bh = ctx->first_bh; bh; bh = bh->next) { if (!bh->deleted && bh->scheduled) { -if (bh->idle) { +int64_t wait = qemu_bh_ready_in(bh) / SCALE_MS; +if (!wait && bh->idle) { /* idle bottom halves will be polled at least * every 10ms */ -*timeout = 10; +wait = 10; +} +if (wait) { +/* Use the minimum wait across all bottom + * halves */ +if (*timeout == -1 || *timeout > wait) { +*timeout = wait; +} } else { -/* non-idle bottom halves will be executed - * immediately */ +/* non-idle bottom halves or timed bottom + * halves which are ready to run will be + * executed immediately */ *timeout = 0; return true; } @@ -149,7 +190,7 @@ aio_ctx_check(GSource *source) QEMUBH *bh; for (bh = ctx->first_bh; bh; bh = bh->next) { -if (!bh->deleted && bh->scheduled) { +if (!bh->deleted && bh->scheduled && qemu_bh_ready_in(bh) <= 0) { return true; } } diff --git a/include/block/aio.h b/include/block/aio.h index 1836793..ff26a3b 100644 --- a/include/block/aio.h +++ b/include/block/aio.h @@ -20,6 +20,7 @@ typedef struct BlockDriverAIOCB BlockDriverAIOCB; typedef void BlockDriverCompletionFunc(void *opaque, int ret); +typedef struct QEMUClock QEMUClock; typedef struct AIOCBInfo { void (*cancel)(BlockDriverAIOCB *acb); @@ -145,6 +146,38 @@ int aio_bh_poll(AioContext *ctx); void qemu_bh_schedule(QEMUBH *bh); /** + * qemu_bh_schedule_at: Schedule a bottom half at a future time + * + * Scheduling a bottom half interrupts the main loop and causes the + * execution of the callback that was passed t
Re: [Qemu-devel] [PATCH RFC qom-cpu 39/41] target-xtensa: Introduce XtensaCPU subclasses
On Sat, Jul 6, 2013 at 4:55 PM, Andreas Färber wrote: > Max, > > Am 29.06.2013 22:01, schrieb Andreas Färber: >> Register a CPU type per core registered. Save the XtensaConfig in >> XtensaCPUClass instead of CPUXtensaState. >> >> Prepares for storing per-class GDB register count. >> >> Signed-off-by: Andreas Färber > > Ping! Can you ack? (It did not seem to break your test image.) Hi Andreas, I tried make check -C tests/tcg/xtensa with the branch you recommended and it segfaults on elf loading: #0 object_class_dynamic_cast_assert (class=0x0, typename=typename@entry=0x5573a85e "cpu", file=file@entry=0x557349c8 "/home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/include/qom/cpu.h", line=line@entry=290, func=func@entry= 0x55735770 <__func__.17127> "cpu_get_phys_page_debug") at /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/qom/object.c:535 #1 0x556b884b in cpu_get_phys_page_debug (addr=3489660928, cpu=0x56275a30) at /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/include/qom/cpu.h:290 #2 translate_phys_addr (env=0x56275a30, addr=3489660928) at /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/hw/xtensa/xtensa_sim.c:37 #3 0x55602b9b in load_elf32 (clear_lsb=0, elf_machine=94, highaddr=0x0, lowaddr=0x7fffd3b0, pentry=0x7fffd3a8, must_swab=, translate_opaque=0x56275a30, translate_fn= 0x556b8800 , fd=10, name=0x56267df0 "./test_b.tst") at /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/include/hw/elf_ops.h:269 #4 load_elf (filename=filename@entry=0x56267df0 "./test_b.tst", translate_fn=translate_fn@entry=0x556b8800 , translate_opaque=translate_opaque@entry=0x56275a30, pentry=pentry@entry= 0x7fffd3a8, lowaddr=lowaddr@entry=0x7fffd3b0, highaddr=highaddr@entry=0x0, big_endian=big_endian@entry=0, elf_machine=elf_machine@entry=94, clear_lsb=clear_lsb@entry=0) at /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/hw/core/loader.c:326 #5 0x556b8a23 in xtensa_sim_init (args=) at /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/hw/xtensa/xtensa_sim.c:94 #6 0x5559cf6f in main (argc=, argv=, envp=) at /home/jcmvbkbc/ws/m/awt/emu/xtensa/qemu/vl.c:4286 The mainline is ok. -- Thanks. -- Max
[Qemu-devel] [PATCH] linux-user: Fix pipe syscall return for SPARC
SPARC is one of the CPUs which has a funny syscall ABI for the pipe syscall; add it to the set of special cases in do_pipe(). Signed-off-by: Peter Maydell --- bash is much more useful with this patch -- without it, it will close() stdin instead of one end of its pipe to a child process, so after the first time you run a program bash will exit... It's this kind of "basic stuff doesn't work" that makes me happier about being a bit cavalier with code cleanups, enabling NPTL, etc for some of the minor target archs in linux-user -- it's clear that some of them have simply never been tested for anything more serious than "run some trivial binary"... linux-user/syscall.c |3 +++ 1 file changed, 3 insertions(+) diff --git a/linux-user/syscall.c b/linux-user/syscall.c index cdd0c28..aea9be4 100644 --- a/linux-user/syscall.c +++ b/linux-user/syscall.c @@ -1030,6 +1030,9 @@ static abi_long do_pipe(void *cpu_env, abi_ulong pipedes, #elif defined(TARGET_SH4) ((CPUSH4State*)cpu_env)->gregs[1] = host_pipe[1]; return host_pipe[0]; +#elif defined(TARGET_SPARC) +((CPUSPARCState*)cpu_env)->regwptr[1] = host_pipe[1]; +return host_pipe[0]; #endif } -- 1.7.9.5
Re: [Qemu-devel] [PATCH] [RFC] aio/async: Add timed bottom-halves
--On 6 July 2013 17:24:57 +0100 Alex Bligh wrote: Add timed bottom halves. A timed bottom half is a bottom half that will not execute until a given time has passed (qemu_bh_schedule_at) or a given interval has passed (qemu_bh_schedule_in). Any qemu clock can be used, and times are specified in nanoseconds. For background, this patch resulted from a discussion with Kevin & Stefan on IRC as to the best way to use timers (or rather avoid using timers) in block drivers. There is one thing I'm not entirely sure about: +void qemu_bh_schedule_at(QEMUBH *bh, QEMUClock *clock, int64_t time) +{ +/* Allow rescheduling if already scheduled */ +bh->scheduled = 1; +bh->idle = 0; +bh->clock = clock; +bh->time = time; +aio_notify(bh->ctx); /*FIXME: is this right?*/ +} qemu_bh_schedule calls aio_notify, but qemu_bh_schedule_idle does not. I am not quite sure why the latter doesn't - possibly through not fully understanding the aio system. Should qemu_bh_schedule_at be calling this, when I don't know whether the bh is scheduled for 1 nanosecond ahead or 1 second? -- Alex Bligh
[Qemu-devel] [PATCH] [RFC] aio/async: Add timed bottom-halves
Add timed bottom halves. A timed bottom half is a bottom half that will not execute until a given time has passed (qemu_bh_schedule_at) or a given interval has passed (qemu_bh_schedule_in). Any qemu clock can be used, and times are specified in nanoseconds. Timed bottom halves can be used where timers cannot. For instance, in block drivers where there is no mainloop that calls timers (qemu-nbd, qemu-img), or where (per stefa...@redhat.com) the aio code loops internally and thus timers never get called. Signed-off-by: Alex Bligh --- async.c | 53 +-- include/block/aio.h | 33 tests/test-aio.c| 47 + 3 files changed, 127 insertions(+), 6 deletions(-) diff --git a/async.c b/async.c index 90fe906..d93b1ab 100644 --- a/async.c +++ b/async.c @@ -35,6 +35,8 @@ struct QEMUBH { QEMUBHFunc *cb; void *opaque; QEMUBH *next; +QEMUClock *clock; +int64_t time; bool scheduled; bool idle; bool deleted; @@ -52,6 +54,11 @@ QEMUBH *aio_bh_new(AioContext *ctx, QEMUBHFunc *cb, void *opaque) return bh; } +static inline int64_t qemu_bh_ready_in(QEMUBH *bh) +{ +return (bh->clock) ? (bh->time - qemu_get_clock_ns(bh->clock)) : 0; +} + int aio_bh_poll(AioContext *ctx) { QEMUBH *bh, **bhp, *next; @@ -62,8 +69,10 @@ int aio_bh_poll(AioContext *ctx) ret = 0; for (bh = ctx->first_bh; bh; bh = next) { next = bh->next; -if (!bh->deleted && bh->scheduled) { +if (!bh->deleted && bh->scheduled && qemu_bh_ready_in(bh) <= 0) { bh->scheduled = 0; +bh->clock = NULL; +bh->time = 0; if (!bh->idle) ret = 1; bh->idle = 0; @@ -96,6 +105,8 @@ void qemu_bh_schedule_idle(QEMUBH *bh) return; bh->scheduled = 1; bh->idle = 1; +bh->clock = NULL; +bh->time = 0; } void qemu_bh_schedule(QEMUBH *bh) @@ -104,18 +115,39 @@ void qemu_bh_schedule(QEMUBH *bh) return; bh->scheduled = 1; bh->idle = 0; +bh->clock = NULL; +bh->time = 0; aio_notify(bh->ctx); } +void qemu_bh_schedule_at(QEMUBH *bh, QEMUClock *clock, int64_t time) +{ +/* Allow rescheduling if already scheduled */ +bh->scheduled = 1; +bh->idle = 0; +bh->clock = clock; +bh->time = time; +aio_notify(bh->ctx); /*FIXME: is this right?*/ +} + +void qemu_bh_schedule_in(QEMUBH *bh, QEMUClock *clock, int64_t time) +{ +qemu_bh_schedule_at(bh, clock, qemu_get_clock_ns(clock) + time); +} + void qemu_bh_cancel(QEMUBH *bh) { bh->scheduled = 0; +bh->clock = NULL; +bh->time = 0; } void qemu_bh_delete(QEMUBH *bh) { bh->scheduled = 0; bh->deleted = 1; +bh->clock = NULL; +bh->time = 0; } static gboolean @@ -126,13 +158,22 @@ aio_ctx_prepare(GSource *source, gint*timeout) for (bh = ctx->first_bh; bh; bh = bh->next) { if (!bh->deleted && bh->scheduled) { -if (bh->idle) { +int64_t wait = qemu_bh_ready_in(bh) / SCALE_MS; +if (!wait && bh->idle) { /* idle bottom halves will be polled at least * every 10ms */ -*timeout = 10; +wait = 10; +} +if (wait) { +/* Use the minimum wait across all bottom + * halves */ +if (*timeout == -1 || *timeout > wait) { +*timeout = wait; +} } else { -/* non-idle bottom halves will be executed - * immediately */ +/* non-idle bottom halves or timed bottom + * halves which are ready to run will be + * executed immediately */ *timeout = 0; return true; } @@ -149,7 +190,7 @@ aio_ctx_check(GSource *source) QEMUBH *bh; for (bh = ctx->first_bh; bh; bh = bh->next) { -if (!bh->deleted && bh->scheduled) { +if (!bh->deleted && bh->scheduled && qemu_bh_ready_in(bh) <= 0) { return true; } } diff --git a/include/block/aio.h b/include/block/aio.h index 1836793..ff26a3b 100644 --- a/include/block/aio.h +++ b/include/block/aio.h @@ -20,6 +20,7 @@ typedef struct BlockDriverAIOCB BlockDriverAIOCB; typedef void BlockDriverCompletionFunc(void *opaque, int ret); +typedef struct QEMUClock QEMUClock; typedef struct AIOCBInfo { void (*cancel)(BlockDriverAIOCB *acb); @@ -145,6 +146,38 @@ int aio_bh_poll(AioContext *ctx); void qemu_bh_schedule(QEMUBH *bh); /** + * qemu_bh_schedule_at: Schedule a bottom half at a future time + * + * Scheduling a bottom half interrupts the main loop and causes the + * execution of the callback that was passed to qemu_bh_new. + * + * Bottom halves that are scheduled from a bottom half handler are instantly + * invo
Re: [Qemu-devel] [PATCH RFC 03/15] cpu/a9mpcore: Embed GICState
Am 01.07.2013 00:13, schrieb Peter Maydell: > On 30 June 2013 22:00, Andreas Färber wrote: >> From: Andreas Färber >> >> Prepares for conversion to QOM realize. >> >> Signed-off-by: Andreas Färber >> --- >> hw/cpu/a9mpcore.c | 25 - >> 1 file changed, 16 insertions(+), 9 deletions(-) >> >> diff --git a/hw/cpu/a9mpcore.c b/hw/cpu/a9mpcore.c >> index 63a4eb1..6340b0f 100644 >> --- a/hw/cpu/a9mpcore.c >> +++ b/hw/cpu/a9mpcore.c >> @@ -9,6 +9,7 @@ >> */ >> >> #include "hw/sysbus.h" >> +#include "hw/intc/gic_internal.h" > > This doesn't look right -- the GIC internals are supposed > to be internal to the GIC implementation (and its subclasses). > They shouldn't be exposed to users. (That's why the header > is in hw/intc and not in include/.) I can easily add an include/hw/intc/arm_gic.h header to address that. (Previously all device headers were in hw/.) >> #define TYPE_A9MPCORE_PRIV "a9mpcore_priv" >> #define A9MPCORE_PRIV(obj) \ >> @@ -23,15 +24,17 @@ typedef struct A9MPPrivState { >> MemoryRegion container; >> DeviceState *mptimer; >> DeviceState *wdt; >> -DeviceState *gic; >> DeviceState *scu; >> uint32_t num_irq; >> + >> +GICState gic; >> } A9MPPrivState; > > So we sort of had a discussion about this before, but I > still don't think we have a sensible way of doing > embedding of devices into container device structures > like this properly (ie without exposing implementation > internals that qdev doesn't require you to expose). We didn't have a discussion, we had you complaining about how great qdev was and how little you like QOM. Anthony told you quite clearly that this is how it's supposed to be done with his QOM: * in TypeInfo::instance_init initialize child devices with object_initialize(), which is not supposed to fail, * property setters can report failure via **errp and may therefore dynamically allocate children - you implemented your array static properties as a consequence, * DeviceClass::realize shall not create new devices as they would be hidden from management tools in the future QMP phase (-S) before realization and would break recursive realization due to devices appearing while iterating. So when we have one or a fixed number of children, it seems clear to me that they shall be embedded into the struct as done in this series. You had several months to come up with a solution of your liking, but apparently you didn't step forward with any patchset or proposal. ;) I'd be more than happy if you refactored ARM code yourself, that would mean less work for me and/or Tao. Saying no is not constructive though. The only way I can think of to shield struct internals while still reserving space to avoid dynamic allocation would be: uint8_t gic[sizeof(GICState)]; which would still need access to the real GICState struct through a header, while prohibiting access so s->gic.foo. This could be hidden in a macro CHILD(typename, varname) to assure proper alignment. CC'ing Anthony and Paolo. But casting to the struct type is so simple to do that we can just use the struct in the first place and let the compiler handle alignment for us. As you can see from my Tegra2 patches, I am doing nothing bad with a9mpcore_priv, not accessing its internals at all! And since you're reviewing all ARM patches anyway, you could still reject any patches accessing device internals. Me being prompted by AArch64, we need an improvement over qdev code *now* before its patterns get copied into further devices, creating even more obstacles to implementing recursive realization. > If we want to do struct-embedding then I think we should > come up with a standard way of writing that code (eg > a .h file under include/hw with type name macros and a > struct with only the public-facing bits and internal > members hidden behind a typedef somehow, That is simply not possible in C. There are no public vs. internal fields since everything can be accessed via pointer dereference anyway, whatever way we use to declare pointers and fields. By Anthony's QOM conventions, the /*< private >*/ gtk-doc marker hides stuff from documentation completely (i.e., the parent_obj field not to be accessed), whereas device fields should be documented for use within the device, thus we need to keep them /*< public >*/ there. Personally, I consider all device fields private and consider direct access to random devices' fields as an abstraction breach and prefer to use accessor functions for that (Data Caging). But I remember you rejecting the accessor functions Anthony introduced for interacting with QOM pins as "boilerplate", so this argument feels double-tongued. > and a .priv.h > in hw/whatever/ with the actual struct that the device > needs itself), and then start using that. Otherwise we > should just keep using pointers since they can happily > be opaque about the details of the struct they point to. Again, pointers in instance_init break the QOM allocation model. T
[Qemu-devel] [PATCH 08/19] pseries: savevm support for PAPR VIO logical tty
From: David Gibson This patch adds the necessary VMStateDescription information to support savevm/loadvm for the spapr_tty (PAPR logical serial) device. Signed-off-by: David Gibson Signed-off-by: Alexey Kardashevskiy --- hw/char/spapr_vty.c | 16 1 file changed, 16 insertions(+) diff --git a/hw/char/spapr_vty.c b/hw/char/spapr_vty.c index 2993848..a799721 100644 --- a/hw/char/spapr_vty.c +++ b/hw/char/spapr_vty.c @@ -142,6 +142,21 @@ static Property spapr_vty_properties[] = { DEFINE_PROP_END_OF_LIST(), }; +static const VMStateDescription vmstate_spapr_vty = { +.name = "spapr_vty", +.version_id = 1, +.minimum_version_id = 1, +.minimum_version_id_old = 1, +.fields = (VMStateField []) { +VMSTATE_SPAPR_VIO(sdev, VIOsPAPRVTYDevice), + +VMSTATE_UINT32(in, VIOsPAPRVTYDevice), +VMSTATE_UINT32(out, VIOsPAPRVTYDevice), +VMSTATE_BUFFER(buf, VIOsPAPRVTYDevice), +VMSTATE_END_OF_LIST() +}, +}; + static void spapr_vty_class_init(ObjectClass *klass, void *data) { DeviceClass *dc = DEVICE_CLASS(klass); @@ -152,6 +167,7 @@ static void spapr_vty_class_init(ObjectClass *klass, void *data) k->dt_type = "serial"; k->dt_compatible = "hvterm1"; dc->props = spapr_vty_properties; +dc->vmsd = &vmstate_spapr_vty; } static const TypeInfo spapr_vty_info = { -- 1.8.3.2
[Qemu-devel] [PATCH 19/19] spapr-pci: rework MSI/MSIX
The specific of sPAPR platform is that the guest allocates MSI/MSIX vectors via RTAS hypercalls and only operates with global IRQ numbers. In the real hardware, PHB is expected to convert MSIMessage to an IRQ number. So it is up to the host kernel to setup correct MSIMessage in a real device and a PHB where a device sits on. Therefore MSIMessage handling is completely hidden in QEMU. Previously every PCI host bridge implemented its own MSI memory window to catch msi_notify()/msix_notify() calls from QEMU devices (virtio-pci or vfio) and redirect them to the guest via qemu_pulse_irq(). MSIMessage encoding was: * .addr - address within the PHB MSI window; * .data - the device index on PHB plus vector number. The MSI MR write function translated this MSIMessage to a global VIRQ number and called qemu_pulse_irq(). However the total number of IRQs is not really big (at the moment it is 1024 IRQs starting from 4096) and even 16bit data field of MSIMessage seems to be enough to store a VIRQ number there so no decoding will be needed. The patch does: 1. remove MSI windows from a PHB; 2. add a single memory region for all MSIs in the guest; 3. encode MSIMessage as: * .addr - a fixed address of SPAPR_PCI_MSI_WINDOW==0x400ULL; * .data as a IRQ number. 4. change IRQ allocator to align first IRQ number for MSI as it uses lowest .data bits to put a vector number; this is not required for MSI-X though as it has a per vector .data field. Signed-off-by: Alexey Kardashevskiy --- hw/ppc/spapr.c | 29 -- hw/ppc/spapr_pci.c | 94 +++-- include/hw/pci-host/spapr.h | 8 ++-- include/hw/ppc/spapr.h | 4 +- 4 files changed, 73 insertions(+), 62 deletions(-) diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index 5ecd81b..29d2be5 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -89,6 +89,9 @@ int spapr_allocate_irq(int hint, bool lsi) if (hint) { irq = hint; +if (hint >= spapr->next_irq) { +spapr->next_irq = hint + 1; +} /* FIXME: we should probably check for collisions somehow */ } else { irq = spapr->next_irq++; @@ -104,22 +107,39 @@ int spapr_allocate_irq(int hint, bool lsi) return irq; } -/* Allocate block of consequtive IRQs, returns a number of the first */ -int spapr_allocate_irq_block(int num, bool lsi) +/* + * Allocate block of consequtive IRQs, returns a number of the first. + * If msi==true, aligns the first IRQ number to num. + */ +int spapr_allocate_irq_block(int num, bool lsi, bool msi) { int first = -1; -int i; +int i, hint = 0; + +/* + * MSIMesage::data is used for storing VIRQ so + * it has to be aligned to num to support multiple + * MSI vectors. MSI-X is not affected by this. + * The hint is used for the first IRQ, the rest should + * be allocated continously. + */ +if (msi) { +assert((num == 1) || (num == 2) || (num == 4) || + (num == 8) || (num == 16) || (num == 32)); +hint = (spapr->next_irq + num - 1) & ~(num - 1); +} for (i = 0; i < num; ++i) { int irq; -irq = spapr_allocate_irq(0, lsi); +irq = spapr_allocate_irq(hint, lsi); if (!irq) { return -1; } if (0 == i) { first = irq; +hint = 0; } /* If the above doesn't create a consecutive block then that's @@ -1256,6 +1276,7 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args) spapr_create_nvram(spapr); /* Set up PCI */ +spapr_pci_msi_init(spapr, SPAPR_PCI_MSI_WINDOW); spapr_pci_rtas_init(); phb = spapr_create_phb(spapr, 0); diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c index 4d8e3cd..23dbc0e 100644 --- a/hw/ppc/spapr_pci.c +++ b/hw/ppc/spapr_pci.c @@ -253,30 +253,6 @@ static int spapr_msicfg_find(sPAPRPHBState *phb, uint32_t config_addr, return -1; } -/* - * Set MSI/MSIX message data. - * This is required for msi_notify()/msix_notify() which - * will write at the addresses via spapr_msi_write(). - */ -static void spapr_msi_setmsg(PCIDevice *pdev, hwaddr addr, - bool msix, unsigned req_num) -{ -unsigned i; -MSIMessage msg = { .address = addr, .data = 0 }; - -if (!msix) { -msi_set_message(pdev, msg); -trace_spapr_pci_msi_setup(pdev->name, 0, msg.address); -return; -} - -for (i = 0; i < req_num; ++i) { -msg.address = addr | (i << 2); -msix_set_message(pdev, i, msg); -trace_spapr_pci_msi_setup(pdev->name, i, msg.address); -} -} - static void rtas_ibm_change_msi(PowerPCCPU *cpu, sPAPREnvironment *spapr, uint32_t token, uint32_t nargs, target_ulong args, uint32_t nret, @@ -288,9 +264,10 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, sPAPREnvironment *spapr, unsigned int re
[Qemu-devel] [PATCH 18/19] target-ppc: Enhance the CPU node labels for the guest device tree for pseries.
From: Prerna Saxena In absence of a -CPU parameter in the qemu command line, the nodes of KVM-enabled guest device tree look like this : /proc/device-tree/cpus/HOST@0/... /proc/device-tree/cpus/HOST@4/... This patch replaces this obscure 'HOST' label with a more descriptive label. This is gathered by first identifying the PVR of the host, and then determining the host CPU alias which corresponds to the model indicated by this PVR. Sample Final outcome for an KVM-enabled pseries guest running on POWER7: /proc/device-tree/cpus/PowerPC,POWER7@0/... /proc/device-tree/cpus/PowerPC,POWER7@4/... This also helps userspace tools like ppc64_cpu, which expect the device tree to be in this format in the guest. Signed-off-by: Prerna Saxena Signed-off-by: Alexey Kardashevskiy --- hw/ppc/spapr.c | 17 ++--- target-ppc/cpu-qom.h| 1 + target-ppc/translate_init.c | 28 3 files changed, 43 insertions(+), 3 deletions(-) diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index 26dd3f7..5ecd81b 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -80,6 +80,7 @@ #define HTAB_SIZE(spapr)(1ULL << ((spapr)->htab_shift)) +#define PPC_DEVTREE_STR "PowerPC," sPAPREnvironment *spapr; int spapr_allocate_irq(int hint, bool lsi) @@ -296,9 +297,12 @@ static void *spapr_create_fdt_skel(const char *cpu_model, _FDT((fdt_property_cell(fdt, "#address-cells", 0x1))); _FDT((fdt_property_cell(fdt, "#size-cells", 0x0))); -modelname = g_strdup(cpu_model); +/* device tree nodes must look like this : + * PowerPC,CPU_ALIAS@0 + */ +modelname = g_strdup_printf(PPC_DEVTREE_STR "%s", cpu_model); -for (i = 0; i < strlen(modelname); i++) { +for (i = strlen(PPC_DEVTREE_STR); i < strlen(modelname); i++) { modelname[i] = toupper(modelname[i]); } @@ -1112,7 +1116,7 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args) MemoryRegion *sysmem = get_system_memory(); MemoryRegion *ram = g_new(MemoryRegion, 1); hwaddr rma_alloc_size; -uint32_t initrd_base = 0; +uint32_t initrd_base = 0, pvr = 0; long kernel_size = 0, initrd_size = 0; long load_limit, rtas_limit, fw_size; char *filename; @@ -1342,6 +1346,13 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args) register_savevm_live(NULL, "spapr/htab", -1, 1, &savevm_htab_handlers, spapr); +/* Ensure that cpu_model is correctly reflected for a KVM guest */ +if (kvm_enabled() && !strcmp(cpu_model, "host")) { +asm ("mfpvr %0" +: "=r"(pvr)); +cpu_model = ppc_cpu_alias_by_pvr(pvr); +} + /* Prepare the device tree */ spapr->fdt_skel = spapr_create_fdt_skel(cpu_model, initrd_base, initrd_size, diff --git a/target-ppc/cpu-qom.h b/target-ppc/cpu-qom.h index a14a3d9..2ad45c2 100644 --- a/target-ppc/cpu-qom.h +++ b/target-ppc/cpu-qom.h @@ -99,6 +99,7 @@ static inline PowerPCCPU *ppc_env_get_cpu(CPUPPCState *env) #define ENV_OFFSET offsetof(PowerPCCPU, env) PowerPCCPUClass *ppc_cpu_class_by_pvr(uint32_t pvr); +const char *ppc_cpu_alias_by_pvr(uint32_t pvr); void ppc_cpu_do_interrupt(CPUState *cpu); void ppc_cpu_dump_state(CPUState *cpu, FILE *f, fprintf_function cpu_fprintf, diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c index c4b466b..2b013c2 100644 --- a/target-ppc/translate_init.c +++ b/target-ppc/translate_init.c @@ -7947,6 +7947,34 @@ PowerPCCPUClass *ppc_cpu_class_by_pvr(uint32_t pvr) return pcc; } +const char *ppc_cpu_alias_by_pvr(uint32_t pvr) +{ +int i; +const char *cpu_alias; +char *offset, *model; + +cpu_alias = object_class_get_name(OBJECT_CLASS +(ppc_cpu_class_by_pvr(pvr))); + +/* Replace the full class name in cpu_alias with the CPU alias + * Eg, POWER7_V2.3-POWERPC64-CPU can simply be called + * POWER7 + */ + +offset = strstr(cpu_alias, "-" TYPE_POWERPC_CPU); +if (offset) { +model = g_strndup(cpu_alias, offset - cpu_alias); +for (i = 0; ppc_cpu_aliases[i].model != NULL; i++) { +if (strcmp(ppc_cpu_aliases[i].model, model) == 0) { +g_free(model); +return ppc_cpu_aliases[i].alias; +} +} +g_free(model); +} +return NULL; +} + static gint ppc_cpu_compare_class_name(gconstpointer a, gconstpointer b) { ObjectClass *oc = (ObjectClass *)a; -- 1.8.3.2
[Qemu-devel] [PATCH 17/19] target-ppc: Add POWER8 v1.0 CPU model
From: Prerna Saxena This patch adds CPU PVR definition for POWER8, and enables QEMU to launch guests on POWER8 hardware. Signed-off-by: Prerna Saxena Signed-off-by: Alexey Kardashevskiy Reviewed-by: Paul Mackerras Reviewed-by: Andreas Farber --- Changes: 2013/07/04: * version 0.1 fixed to 1.0 Signed-off-by: Alexey Kardashevskiy --- target-ppc/cpu-models.c | 3 +++ target-ppc/cpu-models.h | 1 + target-ppc/translate_init.c | 34 ++ 3 files changed, 38 insertions(+) diff --git a/target-ppc/cpu-models.c b/target-ppc/cpu-models.c index 9bb68c8..623ad29 100644 --- a/target-ppc/cpu-models.c +++ b/target-ppc/cpu-models.c @@ -1145,6 +1145,8 @@ "POWER7 v2.1") POWERPC_DEF("POWER7_v2.3", CPU_POWERPC_POWER7_v23, POWER7, "POWER7 v2.3") +POWERPC_DEF("POWER8_v1.0", CPU_POWERPC_POWER8_v10, POWER8, +"POWER8 v1.0") POWERPC_DEF("970", CPU_POWERPC_970,970, "PowerPC 970") POWERPC_DEF("970fx_v1.0",CPU_POWERPC_970FX_v10, 970FX, @@ -1390,6 +1392,7 @@ PowerPCCPUAlias ppc_cpu_aliases[] = { { "Dino", "POWER3" }, { "POWER3+", "631" }, { "POWER7", "POWER7_v2.3" }, +{ "POWER8", "POWER8_v1.0" }, { "970fx", "970fx_v3.1" }, { "970mp", "970mp_v1.1" }, { "Apache", "RS64" }, diff --git a/target-ppc/cpu-models.h b/target-ppc/cpu-models.h index ae8f7c7..5458529 100644 --- a/target-ppc/cpu-models.h +++ b/target-ppc/cpu-models.h @@ -556,6 +556,7 @@ enum { CPU_POWERPC_POWER7_v20 = 0x003F0200, CPU_POWERPC_POWER7_v21 = 0x003F0201, CPU_POWERPC_POWER7_v23 = 0x003F0203, +CPU_POWERPC_POWER8_v10 = 0x004B0100, CPU_POWERPC_970= 0x00390202, CPU_POWERPC_970FX_v10 = 0x00391100, CPU_POWERPC_970FX_v20 = 0x003C0200, diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c index 02f3825..c4b466b 100644 --- a/target-ppc/translate_init.c +++ b/target-ppc/translate_init.c @@ -7011,6 +7011,40 @@ POWERPC_FAMILY(POWER7)(ObjectClass *oc, void *data) pcc->l1_dcache_size = 0x8000; pcc->l1_icache_size = 0x8000; } + +POWERPC_FAMILY(POWER8)(ObjectClass *oc, void *data) +{ +DeviceClass *dc = DEVICE_CLASS(oc); +PowerPCCPUClass *pcc = POWERPC_CPU_CLASS(oc); + +dc->desc = "POWER8"; +pcc->init_proc = init_proc_POWER7; +pcc->check_pow = check_pow_nocheck; +pcc->insns_flags = PPC_INSNS_BASE | PPC_STRING | PPC_MFTB | + PPC_FLOAT | PPC_FLOAT_FSEL | PPC_FLOAT_FRES | + PPC_FLOAT_FSQRT | PPC_FLOAT_FRSQRTE | + PPC_FLOAT_STFIWX | + PPC_CACHE | PPC_CACHE_ICBI | PPC_CACHE_DCBZ | + PPC_MEM_SYNC | PPC_MEM_EIEIO | + PPC_MEM_TLBIE | PPC_MEM_TLBSYNC | + PPC_64B | PPC_ALTIVEC | + PPC_SEGMENT_64B | PPC_SLBI | + PPC_POPCNTB | PPC_POPCNTWD; +pcc->insns_flags2 = PPC2_VSX | PPC2_DFP | PPC2_DBRX; +pcc->msr_mask = 0x8204FF36ULL; +pcc->mmu_model = POWERPC_MMU_2_06; +#if defined(CONFIG_SOFTMMU) +pcc->handle_mmu_fault = ppc_hash64_handle_mmu_fault; +#endif +pcc->excp_model = POWERPC_EXCP_POWER7; +pcc->bus_model = PPC_FLAGS_INPUT_POWER7; +pcc->bfd_mach = bfd_mach_ppc64; +pcc->flags = POWERPC_FLAG_VRE | POWERPC_FLAG_SE | + POWERPC_FLAG_BE | POWERPC_FLAG_PMM | + POWERPC_FLAG_BUS_CLK | POWERPC_FLAG_CFAR; +pcc->l1_dcache_size = 0x8000; +pcc->l1_icache_size = 0x8000; +} #endif /* defined (TARGET_PPC64) */ -- 1.8.3.2
[Qemu-devel] [PATCH 13/19] pseries: savevm support for PCI host bridge
From: David Gibson This adds the necessary support for saving the state of the PAPR virtual PCI host bridge (or host bridges). Signed-off-by: David Gibson Signed-off-by: Alexey Kardashevskiy --- hw/ppc/spapr_pci.c | 49 + include/hw/pci-host/spapr.h | 6 +++--- 2 files changed, 52 insertions(+), 3 deletions(-) diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c index c8c12c8..4d8e3cd 100644 --- a/hw/ppc/spapr_pci.c +++ b/hw/ppc/spapr_pci.c @@ -696,6 +696,54 @@ static Property spapr_phb_properties[] = { DEFINE_PROP_END_OF_LIST(), }; +static const VMStateDescription vmstate_spapr_pci_lsi = { +.name = "spapr_pci/lsi", +.version_id = 1, +.minimum_version_id = 1, +.minimum_version_id_old = 1, +.fields = (VMStateField []) { +VMSTATE_UINT32_EQUAL(irq, struct spapr_pci_lsi), + +VMSTATE_END_OF_LIST() +}, +}; + +static const VMStateDescription vmstate_spapr_pci_msi = { +.name = "spapr_pci/lsi", +.version_id = 1, +.minimum_version_id = 1, +.minimum_version_id_old = 1, +.fields = (VMStateField []) { +VMSTATE_UINT32(config_addr, struct spapr_pci_msi), +VMSTATE_UINT32(irq, struct spapr_pci_msi), +VMSTATE_UINT32(nvec, struct spapr_pci_msi), + +VMSTATE_END_OF_LIST() +}, +}; + +static const VMStateDescription vmstate_spapr_pci = { +.name = "spapr_pci", +.version_id = 1, +.minimum_version_id = 1, +.minimum_version_id_old = 1, +.fields = (VMStateField []) { +VMSTATE_UINT64_EQUAL(buid, sPAPRPHBState), +VMSTATE_UINT32_EQUAL(dma_liobn, sPAPRPHBState), +VMSTATE_UINT64_EQUAL(mem_win_addr, sPAPRPHBState), +VMSTATE_UINT64_EQUAL(mem_win_size, sPAPRPHBState), +VMSTATE_UINT64_EQUAL(io_win_addr, sPAPRPHBState), +VMSTATE_UINT64_EQUAL(io_win_size, sPAPRPHBState), +VMSTATE_UINT64_EQUAL(msi_win_addr, sPAPRPHBState), +VMSTATE_STRUCT_ARRAY(lsi_table, sPAPRPHBState, PCI_NUM_PINS, 0, + vmstate_spapr_pci_lsi, struct spapr_pci_lsi), +VMSTATE_STRUCT_ARRAY(msi_table, sPAPRPHBState, SPAPR_MSIX_MAX_DEVS, 0, + vmstate_spapr_pci_msi, struct spapr_pci_msi), + +VMSTATE_END_OF_LIST() +}, +}; + static void spapr_phb_class_init(ObjectClass *klass, void *data) { SysBusDeviceClass *sdc = SYS_BUS_DEVICE_CLASS(klass); @@ -704,6 +752,7 @@ static void spapr_phb_class_init(ObjectClass *klass, void *data) sdc->init = spapr_phb_init; dc->props = spapr_phb_properties; dc->reset = spapr_phb_reset; +dc->vmsd = &vmstate_spapr_pci; } static const TypeInfo spapr_phb_info = { diff --git a/include/hw/pci-host/spapr.h b/include/hw/pci-host/spapr.h index 1e23dbf..93f9511 100644 --- a/include/hw/pci-host/spapr.h +++ b/include/hw/pci-host/spapr.h @@ -52,14 +52,14 @@ typedef struct sPAPRPHBState { sPAPRTCETable *tcet; AddressSpace iommu_as; -struct { +struct spapr_pci_lsi { uint32_t irq; } lsi_table[PCI_NUM_PINS]; -struct { +struct spapr_pci_msi { uint32_t config_addr; uint32_t irq; -int nvec; +uint32_t nvec; } msi_table[SPAPR_MSIX_MAX_DEVS]; QLIST_ENTRY(sPAPRPHBState) list; -- 1.8.3.2
[Qemu-devel] [PATCH 16/19] pseries: savevm support with KVM
From: David Gibson At present, the savevm / migration support for the pseries machine will not work when KVM is enabled. That's because KVM manages the guest's hash page table in the host kernel, so qemu has no visibility of it. This patch fixes this by using new kernel interfaces to extract and reinsert the guest's hash table during the migration process. Signed-off-by: David Gibson Signed-off-by: Alexey Kardashevskiy --- hw/ppc/spapr.c | 106 +++-- include/hw/ppc/spapr.h | 1 + target-ppc/kvm.c | 69 target-ppc/kvm_ppc.h | 22 ++ 4 files changed, 176 insertions(+), 22 deletions(-) diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index 961f2f7..26dd3f7 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -744,17 +744,27 @@ static int htab_save_setup(QEMUFile *f, void *opaque) { sPAPREnvironment *spapr = opaque; -spapr->htab_save_index = 0; -spapr->htab_first_pass = true; - /* "Iteration" header */ qemu_put_be32(f, spapr->htab_shift); +if (spapr->htab) { +spapr->htab_save_index = 0; +spapr->htab_first_pass = true; +} else { +assert(kvm_enabled()); + +spapr->htab_fd = kvmppc_get_htab_fd(false); +if (spapr->htab_fd < 0) { +fprintf(stderr, "Unable to open fd for reading hash table from KVM: %s\n", +strerror(errno)); +return -1; +} +} + + return 0; } -#define MAX_ITERATION_NS500 /* 5 ms */ - static void htab_save_first_pass(QEMUFile *f, sPAPREnvironment *spapr, int64_t max_ns) { @@ -805,8 +815,8 @@ static void htab_save_first_pass(QEMUFile *f, sPAPREnvironment *spapr, spapr->htab_save_index = index; } -static bool htab_save_later_pass(QEMUFile *f, sPAPREnvironment *spapr, - int64_t max_ns) +static int htab_save_later_pass(QEMUFile *f, sPAPREnvironment *spapr, +int64_t max_ns) { bool final = max_ns < 0; int htabslots = HTAB_SIZE(spapr) / HASH_PTE_SIZE_64; @@ -879,21 +889,32 @@ static bool htab_save_later_pass(QEMUFile *f, sPAPREnvironment *spapr, spapr->htab_save_index = index; -return (examined >= htabslots) && (sent == 0); +return (examined >= htabslots) && (sent == 0) ? 1 : 0; } +#define MAX_ITERATION_NS500 /* 5 ms */ +#define MAX_KVM_BUF_SIZE2048 + static int htab_save_iterate(QEMUFile *f, void *opaque) { sPAPREnvironment *spapr = opaque; -bool nothingleft = false;; +int rc = 0; /* Iteration header */ qemu_put_be32(f, 0); -if (spapr->htab_first_pass) { +if (!spapr->htab) { +assert(kvm_enabled()); + +rc = kvmppc_save_htab(f, spapr->htab_fd, + MAX_KVM_BUF_SIZE, MAX_ITERATION_NS); +if (rc < 0) { +return rc; +} +} else if (spapr->htab_first_pass) { htab_save_first_pass(f, spapr, MAX_ITERATION_NS); } else { -nothingleft = htab_save_later_pass(f, spapr, MAX_ITERATION_NS); +rc = htab_save_later_pass(f, spapr, MAX_ITERATION_NS); } /* End marker */ @@ -901,7 +922,7 @@ static int htab_save_iterate(QEMUFile *f, void *opaque) qemu_put_be16(f, 0); qemu_put_be16(f, 0); -return nothingleft ? 1 : 0; +return rc; } static int htab_save_complete(QEMUFile *f, void *opaque) @@ -911,7 +932,20 @@ static int htab_save_complete(QEMUFile *f, void *opaque) /* Iteration header */ qemu_put_be32(f, 0); -htab_save_later_pass(f, spapr, -1); +if (!spapr->htab) { +int rc; + +assert(kvm_enabled()); + +rc = kvmppc_save_htab(f, spapr->htab_fd, MAX_KVM_BUF_SIZE, -1); +if (rc < 0) { +return rc; +} +close(spapr->htab_fd); +spapr->htab_fd = -1; +} else { +htab_save_later_pass(f, spapr, -1); +} /* End marker */ qemu_put_be32(f, 0); @@ -925,6 +959,7 @@ static int htab_load(QEMUFile *f, void *opaque, int version_id) { sPAPREnvironment *spapr = opaque; uint32_t section_hdr; +int fd = -1; if (version_id < 1 || version_id > 1) { fprintf(stderr, "htab_load() bad version\n"); @@ -941,6 +976,16 @@ static int htab_load(QEMUFile *f, void *opaque, int version_id) return 0; } +if (!spapr->htab) { +assert(kvm_enabled()); + +fd = kvmppc_get_htab_fd(true); +if (fd < 0) { +fprintf(stderr, "Unable to open fd to restore KVM hash table: %s\n", +strerror(errno)); +} +} + while (true) { uint32_t index; uint16_t n_valid, n_invalid; @@ -954,24 +999,41 @@ static int htab_load(QEMUFile *f, void *opaque, int version_id) break; } -if ((index + n_valid + n_invalid) >= +if ((index + n_valid + n_inval
[Qemu-devel] [PATCH 14/19] target-ppc: Add helper for KVM_PPC_RTAS_DEFINE_TOKEN
From: David Gibson Recent PowerKVM allows the kernel to intercept some RTAS calls from the guest directly. This is used to implement the more efficient in-kernel XICS for example. qemu is still responsible for assigning the RTAS token numbers however, and needs to tell the kernel which RTAS function name is assigned to a given token value. This patch adds a convenience wrapper for the KVM_PPC_RTAS_DEFINE_TOKEN ioctl() which is used for this purpose. Signed-off-by: David Gibson Signed-off-by: Alexey Kardashevskiy --- target-ppc/kvm.c | 14 ++ target-ppc/kvm_ppc.h | 7 +++ 2 files changed, 21 insertions(+) diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c index c89dd58..33ddf63 100644 --- a/target-ppc/kvm.c +++ b/target-ppc/kvm.c @@ -1787,6 +1787,20 @@ static int kvm_ppc_register_host_cpu_type(void) return 0; } +int kvmppc_define_rtas_token(uint32_t token, const char *function) +{ +struct kvm_rtas_token_args args = { +.token = token, +}; + +if (!kvm_check_extension(kvm_state, KVM_CAP_PPC_RTAS)) { +return -ENOENT; +} + +strncpy(args.name, function, sizeof(args.name)); + +return kvm_vm_ioctl(kvm_state, KVM_PPC_RTAS_DEFINE_TOKEN, &args); +} bool kvm_arch_stop_on_emulation_error(CPUState *cpu) { diff --git a/target-ppc/kvm_ppc.h b/target-ppc/kvm_ppc.h index 771cfbe..21939a8 100644 --- a/target-ppc/kvm_ppc.h +++ b/target-ppc/kvm_ppc.h @@ -38,6 +38,7 @@ uint64_t kvmppc_rma_size(uint64_t current_size, unsigned int hash_shift); #endif /* !CONFIG_USER_ONLY */ int kvmppc_fixup_cpu(PowerPCCPU *cpu); bool kvmppc_has_cap_epr(void); +int kvmppc_define_rtas_token(uint32_t token, const char *function); #else @@ -159,6 +160,12 @@ static inline bool kvmppc_has_cap_epr(void) { return false; } + +static inline int kvmppc_define_rtas_token(uint32_t token, + const char *function) +{ +return -1; +} #endif #ifndef CONFIG_KVM -- 1.8.3.2
[Qemu-devel] [PATCH 12/19] pseries: savevm support for pseries machine
From: David Gibson This adds the necessary pieces to implement savevm / migration for the pseries machine. The most complex part here is migrating the hash table - for the paravirtualized pseries machine the guest's hash page table is not stored within guest memory, but externally and the guest accesses it via hypercalls. This patch uses a hypervisor reserved bit of the HPTE as a dirty bit (tracking changes to the HPTE itself, not the page it references). This is used to implement a live migration style incremental save and restore of the hash table contents. In addition it adds VMStateDescription information to save and restore the (few) remaining pieces of state information needed by the pseries machine. Signed-off-by: David Gibson Signed-off-by: Alexey Kardashevskiy --- hw/ppc/spapr.c | 269 - hw/ppc/spapr_hcall.c | 8 +- include/hw/ppc/spapr.h | 12 ++- 3 files changed, 281 insertions(+), 8 deletions(-) diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index d8f1614..bf348c7 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -32,6 +32,7 @@ #include "sysemu/cpus.h" #include "sysemu/kvm.h" #include "kvm_ppc.h" +#include "mmu-hash64.h" #include "hw/boards.h" #include "hw/ppc/ppc.h" @@ -667,7 +668,7 @@ static void spapr_cpu_reset(void *opaque) env->spr[SPR_HIOR] = 0; -env->external_htab = spapr->htab; +env->external_htab = (uint8_t *)spapr->htab; env->htab_base = -1; env->htab_mask = HTAB_SIZE(spapr) - 1; env->spr[SPR_SDR1] = (target_ulong)(uintptr_t)spapr->htab | @@ -719,6 +720,268 @@ static int spapr_vga_init(PCIBus *pci_bus) } } +static const VMStateDescription vmstate_spapr = { +.name = "spapr", +.version_id = 1, +.minimum_version_id = 1, +.minimum_version_id_old = 1, +.fields = (VMStateField []) { +VMSTATE_UINT32(next_irq, sPAPREnvironment), + +/* RTC offset */ +VMSTATE_UINT64(rtc_offset, sPAPREnvironment), + +VMSTATE_END_OF_LIST() +}, +}; + +#define HPTE(_table, _i) (void *)(((uint64_t *)(_table)) + ((_i) * 2)) +#define HPTE_VALID(_hpte) (tswap64(*((uint64_t *)(_hpte))) & HPTE64_V_VALID) +#define HPTE_DIRTY(_hpte) (tswap64(*((uint64_t *)(_hpte))) & HPTE64_V_HPTE_DIRTY) +#define CLEAN_HPTE(_hpte) ((*(uint64_t *)(_hpte)) &= tswap64(~HPTE64_V_HPTE_DIRTY)) + +static int htab_save_setup(QEMUFile *f, void *opaque) +{ +sPAPREnvironment *spapr = opaque; + +spapr->htab_save_index = 0; +spapr->htab_first_pass = true; + +/* "Iteration" header */ +qemu_put_be32(f, spapr->htab_shift); + +return 0; +} + +#define MAX_ITERATION_NS500 /* 5 ms */ + +static void htab_save_first_pass(QEMUFile *f, sPAPREnvironment *spapr, + int64_t max_ns) +{ +int htabslots = HTAB_SIZE(spapr) / HASH_PTE_SIZE_64; +int index = spapr->htab_save_index; +int64_t starttime = qemu_get_clock_ns(rt_clock); + +assert(spapr->htab_first_pass); + +do { +int chunkstart; + +/* Consume invalid HPTEs */ +while ((index < htabslots) + && !HPTE_VALID(HPTE(spapr->htab, index))) { +index++; +CLEAN_HPTE(HPTE(spapr->htab, index)); +} + +/* Consume valid HPTEs */ +chunkstart = index; +while ((index < htabslots) + && HPTE_VALID(HPTE(spapr->htab, index))) { +index++; +CLEAN_HPTE(HPTE(spapr->htab, index)); +} + +if (index > chunkstart) { +int n_valid = index - chunkstart; + +qemu_put_be32(f, chunkstart); +qemu_put_be16(f, n_valid); +qemu_put_be16(f, 0); +qemu_put_buffer(f, HPTE(spapr->htab, chunkstart), +HASH_PTE_SIZE_64 * n_valid); + +if ((qemu_get_clock_ns(rt_clock) - starttime) > max_ns) { +break; +} +} +} while ((index < htabslots) && !qemu_file_rate_limit(f)); + +if (index >= htabslots) { +assert(index == htabslots); +index = 0; +spapr->htab_first_pass = false; +} +spapr->htab_save_index = index; +} + +static bool htab_save_later_pass(QEMUFile *f, sPAPREnvironment *spapr, + int64_t max_ns) +{ +bool final = max_ns < 0; +int htabslots = HTAB_SIZE(spapr) / HASH_PTE_SIZE_64; +int examined = 0, sent = 0; +int index = spapr->htab_save_index; +int64_t starttime = qemu_get_clock_ns(rt_clock); + +assert(!spapr->htab_first_pass); + +do { +int chunkstart, invalidstart; + +/* Consume non-dirty HPTEs */ +while ((index < htabslots) + && !HPTE_DIRTY(HPTE(spapr->htab, index))) { +index++; +examined++; +} + +chunkstart = index; +/* Consume valid dirty HPTEs */ +while ((index < htabslots) + && HPTE_DIRTY(HPTE(spapr->htab, index)) +
[Qemu-devel] [PATCH 11/19] pseries: savevm support for PAPR virtual SCSI
From: David Gibson This patch adds the necessary support for saving the state of the PAPR VIO virtual SCSI device. This also saves and restores active SCSI requests. [aik: implemented vscsi_req save/restore] Signed-off-by: Alexey Kardashevskiy Cc: David Gibson Signed-off-by: Alexey Kardashevskiy --- hw/scsi/spapr_vscsi.c | 82 ++- 1 file changed, 81 insertions(+), 1 deletion(-) diff --git a/hw/scsi/spapr_vscsi.c b/hw/scsi/spapr_vscsi.c index 1e93102..4db3a47 100644 --- a/hw/scsi/spapr_vscsi.c +++ b/hw/scsi/spapr_vscsi.c @@ -579,6 +579,69 @@ static void vscsi_request_cancelled(SCSIRequest *sreq) vscsi_put_req(req); } +static const VMStateDescription vmstate_spapr_vscsi_req = { +.name = "spapr_vscsi_req", +.version_id = 1, +.minimum_version_id = 1, +.minimum_version_id_old = 1, +.fields = (VMStateField []) { +VMSTATE_BUFFER(crq.raw, vscsi_req), +VMSTATE_BUFFER(iu.srp.reserved, vscsi_req), +VMSTATE_UINT32(qtag, vscsi_req), +VMSTATE_BOOL(active, vscsi_req), +VMSTATE_UINT32(data_len, vscsi_req), +VMSTATE_BOOL(writing, vscsi_req), +VMSTATE_UINT32(senselen, vscsi_req), +VMSTATE_BUFFER(sense, vscsi_req), +VMSTATE_UINT8(dma_fmt, vscsi_req), +VMSTATE_UINT16(local_desc, vscsi_req), +VMSTATE_UINT16(total_desc, vscsi_req), +VMSTATE_UINT16(cdb_offset, vscsi_req), + /*Restart SCSI request from the beginning for now */ + /*VMSTATE_UINT16(cur_desc_num, vscsi_req), +VMSTATE_UINT16(cur_desc_offset, vscsi_req),*/ +VMSTATE_END_OF_LIST() +}, +}; + +static void vscsi_save_request(QEMUFile *f, SCSIRequest *sreq) +{ +vscsi_req *req = sreq->hba_private; +assert(req->active); + +vmstate_save_state(f, &vmstate_spapr_vscsi_req, req); + +dprintf("VSCSI: saving tag=%u, current desc#%d, offset=%x\n", +req->qtag, req->cur_desc_num, req->cur_desc_offset); +} + +static void *vscsi_load_request(QEMUFile *f, SCSIRequest *sreq) +{ +SCSIBus *bus = sreq->bus; +VSCSIState *s = VIO_SPAPR_VSCSI_DEVICE(bus->qbus.parent); +vscsi_req *req; +int rc; + +assert(sreq->tag < VSCSI_REQ_LIMIT); +req = &s->reqs[sreq->tag]; +assert(!req->active); + +memset(req, 0, sizeof(*req)); +rc = vmstate_load_state(f, &vmstate_spapr_vscsi_req, req, 1); +if (rc) { +fprintf(stderr, "VSCSI: failed loading request tag#%u\n", sreq->tag); +return NULL; +} +assert(req->active); + +req->sreq = scsi_req_ref(sreq); + +dprintf("VSCSI: restoring tag=%u, current desc#%d, offset=%x\n", +req->qtag, req->cur_desc_num, req->cur_desc_offset); + +return req; +} + static void vscsi_process_login(VSCSIState *s, vscsi_req *req) { union viosrp_iu *iu = &req->iu; @@ -933,7 +996,9 @@ static const struct SCSIBusInfo vscsi_scsi_info = { .transfer_data = vscsi_transfer_data, .complete = vscsi_command_complete, -.cancel = vscsi_request_cancelled +.cancel = vscsi_request_cancelled, +.save_request = vscsi_save_request, +.load_request = vscsi_load_request, }; static void spapr_vscsi_reset(VIOsPAPRDevice *dev) @@ -992,6 +1057,20 @@ static Property spapr_vscsi_properties[] = { DEFINE_PROP_END_OF_LIST(), }; +static const VMStateDescription vmstate_spapr_vscsi = { +.name = "spapr_vscsi", +.version_id = 1, +.minimum_version_id = 1, +.minimum_version_id_old = 1, +.fields = (VMStateField []) { +VMSTATE_SPAPR_VIO(vdev, VSCSIState), +/* VSCSI state */ +/* */ + +VMSTATE_END_OF_LIST() +}, +}; + static void spapr_vscsi_class_init(ObjectClass *klass, void *data) { DeviceClass *dc = DEVICE_CLASS(klass); @@ -1006,6 +1085,7 @@ static void spapr_vscsi_class_init(ObjectClass *klass, void *data) k->signal_mask = 0x0001; dc->props = spapr_vscsi_properties; k->rtce_window_size = 0x1000; +dc->vmsd = &vmstate_spapr_vscsi; } static const TypeInfo spapr_vscsi_info = { -- 1.8.3.2
[Qemu-devel] [PATCH 07/19] pseries: savevm support for PAPR VIO logical lan
From: David Gibson This patch adds the necessary VMStateDescription information to support savevm/loadvm for the spapr_llan (PAPR logical lan) device. Signed-off-by: David Gibson Signed-off-by: Alexey Kardashevskiy --- hw/net/spapr_llan.c | 24 ++-- 1 file changed, 22 insertions(+), 2 deletions(-) diff --git a/hw/net/spapr_llan.c b/hw/net/spapr_llan.c index 03a09f2..46f7d5f 100644 --- a/hw/net/spapr_llan.c +++ b/hw/net/spapr_llan.c @@ -81,9 +81,9 @@ typedef struct VIOsPAPRVLANDevice { VIOsPAPRDevice sdev; NICConf nicconf; NICState *nic; -int isopen; +bool isopen; target_ulong buf_list; -int add_buf_ptr, use_buf_ptr, rx_bufs; +uint32_t add_buf_ptr, use_buf_ptr, rx_bufs; target_ulong rxq_ptr; } VIOsPAPRVLANDevice; @@ -500,6 +500,25 @@ static Property spapr_vlan_properties[] = { DEFINE_PROP_END_OF_LIST(), }; +static const VMStateDescription vmstate_spapr_llan = { +.name = "spapr_llan", +.version_id = 1, +.minimum_version_id = 1, +.minimum_version_id_old = 1, +.fields = (VMStateField []) { +VMSTATE_SPAPR_VIO(sdev, VIOsPAPRVLANDevice), +/* LLAN state */ +VMSTATE_BOOL(isopen, VIOsPAPRVLANDevice), +VMSTATE_UINTTL(buf_list, VIOsPAPRVLANDevice), +VMSTATE_UINT32(add_buf_ptr, VIOsPAPRVLANDevice), +VMSTATE_UINT32(use_buf_ptr, VIOsPAPRVLANDevice), +VMSTATE_UINT32(rx_bufs, VIOsPAPRVLANDevice), +VMSTATE_UINTTL(rxq_ptr, VIOsPAPRVLANDevice), + +VMSTATE_END_OF_LIST() +}, +}; + static void spapr_vlan_class_init(ObjectClass *klass, void *data) { DeviceClass *dc = DEVICE_CLASS(klass); @@ -514,6 +533,7 @@ static void spapr_vlan_class_init(ObjectClass *klass, void *data) k->signal_mask = 0x1; dc->props = spapr_vlan_properties; k->rtce_window_size = 0x1000; +dc->vmsd = &vmstate_spapr_llan; } static const TypeInfo spapr_vlan_info = { -- 1.8.3.2
[Qemu-devel] [PATCH 03/19] savevm: Implement VMS_DIVIDE flag
From: David Gibson The vmstate infrastructure includes a VMS_MULTIPY flag, and associated VMSTATE_VBUFFER_MULTIPLY helper macro. These can be used to save a variably sized buffer where the size in bytes of the buffer isn't directly accessible as a structure field, but an element count from which the size can be derived is. This patch adds an analogous VMS_DIVIDE option, which handles a variably sized buffer whose size is a submultiple of a field, rather than a multiple. For example a buffer containing per-page structures whose size is derived from a field storing the total address space described by the structures could use this construct. Signed-off-by: David Gibson Signed-off-by: Alexey Kardashevskiy --- include/migration/vmstate.h | 13 + savevm.c| 8 2 files changed, 21 insertions(+) diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h index 1c31b5d..672b0a7 100644 --- a/include/migration/vmstate.h +++ b/include/migration/vmstate.h @@ -100,6 +100,7 @@ enum VMStateFlags { VMS_MULTIPLY = 0x200, /* multiply "size" field by field_size */ VMS_VARRAY_UINT8 = 0x400, /* Array with size in uint8_t field*/ VMS_VARRAY_UINT32= 0x800, /* Array with size in uint32_t field*/ +VMS_DIVIDE = 0x1000, /* divide "size" field by field_size */ }; typedef struct { @@ -422,6 +423,18 @@ extern const VMStateInfo vmstate_info_bitmap; .start= (_start),\ } +#define VMSTATE_VBUFFER_DIVIDE(_field, _state, _version, _test, _start, _field_size, _divide) { \ +.name = (stringify(_field)), \ +.version_id = (_version), \ +.field_exists = (_test), \ +.size_offset = vmstate_offset_value(_state, _field_size, uint32_t),\ +.size = (_divide), \ +.info = &vmstate_info_buffer,\ +.flags= VMS_VBUFFER|VMS_POINTER|VMS_DIVIDE, \ +.offset = offsetof(_state, _field),\ +.start= (_start),\ +} + #define VMSTATE_VBUFFER(_field, _state, _version, _test, _start, _field_size) { \ .name = (stringify(_field)), \ .version_id = (_version), \ diff --git a/savevm.c b/savevm.c index e0491e7..788af85 100644 --- a/savevm.c +++ b/savevm.c @@ -1700,6 +1700,10 @@ int vmstate_load_state(QEMUFile *f, const VMStateDescription *vmsd, if (field->flags & VMS_MULTIPLY) { size *= field->size; } +if (field->flags & VMS_DIVIDE) { +assert((size % field->size) == 0); +size /= field->size; +} } if (field->flags & VMS_ARRAY) { n_elems = field->num; @@ -1764,6 +1768,10 @@ void vmstate_save_state(QEMUFile *f, const VMStateDescription *vmsd, if (field->flags & VMS_MULTIPLY) { size *= field->size; } +if (field->flags & VMS_DIVIDE) { +assert((size % field->size) == 0); +size /= field->size; +} } if (field->flags & VMS_ARRAY) { n_elems = field->num; -- 1.8.3.2
[Qemu-devel] [PATCH 10/19] pseries: rework PAPR virtual SCSI
The patch reimplements handling of indirect requests in order to simplify upcoming live migration support. - all pointers (except SCSIRequest*) were replaces with integer indexes and offsets; - DMA'ed srp_direct_buf kept untouched (ie. BE format); - vscsi_fetch_desc() is added, now it is the only place where descriptors are fetched and byteswapped; - vscsi_req struct fields converted to migration-friendly types; - many dprintf()'s fixed. This also removed an unused field 'lun' from the spapr_vscsi device which is assigned, but never used. So, remove it. [David Gibson: removed unused 'lun'] Signed-off-by: Alexey Kardashevskiy Cc: David Gibson Signed-off-by: Alexey Kardashevskiy --- hw/scsi/spapr_vscsi.c | 224 +- 1 file changed, 131 insertions(+), 93 deletions(-) diff --git a/hw/scsi/spapr_vscsi.c b/hw/scsi/spapr_vscsi.c index e8978bf..1e93102 100644 --- a/hw/scsi/spapr_vscsi.c +++ b/hw/scsi/spapr_vscsi.c @@ -75,20 +75,19 @@ typedef struct vscsi_req { /* SCSI request tracking */ SCSIRequest *sreq; uint32_tqtag; /* qemu tag != srp tag */ -int lun; -int active; -longdata_len; -int writing; -int senselen; +boolactive; +uint32_tdata_len; +boolwriting; +uint32_tsenselen; uint8_t sense[SCSI_SENSE_BUF_SIZE]; /* RDMA related bits */ uint8_t dma_fmt; -struct srp_direct_buf ext_desc; -struct srp_direct_buf *cur_desc; -struct srp_indirect_buf *ind_desc; -int local_desc; -int total_desc; +uint16_tlocal_desc; +uint16_ttotal_desc; +uint16_tcdb_offset; +uint16_tcur_desc_num; +uint16_tcur_desc_offset; } vscsi_req; #define TYPE_VIO_SPAPR_VSCSI_DEVICE "spapr-vscsi" @@ -264,93 +263,139 @@ static int vscsi_send_rsp(VSCSIState *s, vscsi_req *req, return 0; } -static inline void vscsi_swap_desc(struct srp_direct_buf *desc) +static inline struct srp_direct_buf vscsi_swap_desc(struct srp_direct_buf desc) { -desc->va = be64_to_cpu(desc->va); -desc->len = be32_to_cpu(desc->len); +desc.va = be64_to_cpu(desc.va); +desc.len = be32_to_cpu(desc.len); +return desc; +} + +static int vscsi_fetch_desc(VSCSIState *s, struct vscsi_req *req, +unsigned n, unsigned buf_offset, +struct srp_direct_buf *ret) +{ +struct srp_cmd *cmd = &req->iu.srp.cmd; + +switch (req->dma_fmt) { +case SRP_NO_DATA_DESC: { +dprintf("VSCSI: no data descriptor\n"); +return 0; +} +case SRP_DATA_DESC_DIRECT: { +*ret = *(struct srp_direct_buf *)(cmd->add_data + req->cdb_offset); +assert(req->cur_desc_num == 0); +dprintf("VSCSI: direct segment"); +break; +} +case SRP_DATA_DESC_INDIRECT: { +struct srp_indirect_buf *tmp = (struct srp_indirect_buf *) + (cmd->add_data + req->cdb_offset); +if (n < req->local_desc) { +*ret = tmp->desc_list[n]; +dprintf("VSCSI: indirect segment local tag=0x%x desc#%d/%d", +req->qtag, n, req->local_desc); + +} else if (n < req->total_desc) { +int rc; +struct srp_direct_buf tbl_desc = vscsi_swap_desc(tmp->table_desc); +unsigned desc_offset = (n - req->local_desc) * +sizeof(struct srp_direct_buf); + +if (desc_offset > tbl_desc.len) { +dprintf("VSCSI: #%d is ouf of range (%d bytes)\n", +n, desc_offset); +return -1; +} +rc = spapr_vio_dma_read(&s->vdev, tbl_desc.va + desc_offset, +ret, sizeof(struct srp_direct_buf)); +if (rc) { +dprintf("VSCSI: spapr_vio_dma_read -> %d reading ext_desc\n", +rc); +return rc; +} +dprintf("VSCSI: indirect segment ext. tag=0x%x desc#%d/%d { va=%"PRIx64" len=%x }", +req->qtag, n, req->total_desc, tbl_desc.va, tbl_desc.len); +} else { +dprintf("VSCSI: Out of descriptors !\n"); +return 0; +} +break; +} +default: +fprintf(stderr, "VSCSI: Unknown format %x\n", req->dma_fmt); +return -1; +} + +*ret = vscsi_swap_desc(*ret); +if (buf_offset > ret->len) { +dprintf(" offset=%x is out of a descriptor #%d boundary=%x\n", +buf_offset, req->cur_desc_num, ret->len); +return -1; +} +ret->va += buf_offset; +ret->len -= buf_
[Qemu-devel] [PATCH 09/19] pseries: savevm support for PAPR TCE tables
From: David Gibson This patch adds the necessary VMStateDescription information to save the state of PAPR TCE tables (that is, the PAPR specified IOMMU). Signed-off-by: David Gibson Signed-off-by: Alexey Kardashevskiy --- hw/ppc/spapr_iommu.c | 25 + 1 file changed, 25 insertions(+) diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c index 91bc8e4..ba1f7b6 100644 --- a/hw/ppc/spapr_iommu.c +++ b/hw/ppc/spapr_iommu.c @@ -112,6 +112,25 @@ static IOMMUTLBEntry spapr_tce_translate_iommu(MemoryRegion *iommu, hwaddr addr) }; } +static const VMStateDescription vmstate_spapr_tce_table = { +.name = "spapr_iommu", +.version_id = 1, +.minimum_version_id = 1, +.minimum_version_id_old = 1, +.fields = (VMStateField []) { +/* Sanity check */ +VMSTATE_UINT32_EQUAL(liobn, sPAPRTCETable), +VMSTATE_UINT32_EQUAL(window_size, sPAPRTCETable), + +/* IOMMU state */ +VMSTATE_BOOL(bypass, sPAPRTCETable), +VMSTATE_VBUFFER_DIVIDE(table, sPAPRTCETable, 0, NULL, 0, window_size, + SPAPR_TCE_PAGE_SIZE / sizeof(sPAPRTCE)), + +VMSTATE_END_OF_LIST() +}, +}; + static MemoryRegionIOMMUOps spapr_iommu_ops = { .translate = spapr_tce_translate_iommu, }; @@ -156,6 +175,8 @@ sPAPRTCETable *spapr_tce_new_table(uint32_t liobn, size_t window_size) QLIST_INSERT_HEAD(&spapr_tce_tables, tcet, list); +vmstate_register(NULL, tcet->liobn, &vmstate_spapr_tce_table, tcet); + return tcet; } @@ -163,6 +184,10 @@ void spapr_tce_free(sPAPRTCETable *tcet) { QLIST_REMOVE(tcet, list); +vmstate_unregister(NULL, &vmstate_spapr_tce_table, tcet); + +QLIST_REMOVE(tcet, list); + if (!kvm_enabled() || (kvmppc_remove_spapr_tce(tcet->table, tcet->fd, tcet->window_size) != 0)) { -- 1.8.3.2
[Qemu-devel] [PATCH 15/19] pseries: Support for in-kernel XICS interrupt controller
From: David Gibson Recent (host) kernels support emulating the PAPR defined "XICS" interrupt controller system within KVM. This patch allows qemu to initialize and configure the in-kernel XICS, and keep its state in sync with qemu's XICS state as necessary. This should give considerable performance improvements. e.g. on a simple IPI ping-pong test between hardware threads, using qemu XICS gives us around 5,000 irqs/second, whereas the in-kernel XICS gives us around 70,000 irqs/s on the same hardware configuration. [Mike Qiu : fixed mistype which caused ics_set_kvm_state() to fail] Signed-off-by: David Gibson [aik: moved to a separate device] --- Changes: 2013/07/01 * fixed VMState names in order to support xics-kvm migration to xics and vice versa Signed-off-by: Alexey Kardashevskiy --- default-configs/ppc64-softmmu.mak | 1 + hw/intc/Makefile.objs | 1 + hw/intc/xics_kvm.c| 445 ++ hw/ppc/spapr.c| 32 ++- include/hw/ppc/xics.h | 13 ++ 5 files changed, 489 insertions(+), 3 deletions(-) create mode 100644 hw/intc/xics_kvm.c diff --git a/default-configs/ppc64-softmmu.mak b/default-configs/ppc64-softmmu.mak index 69a9f8d..5b995f9 100644 --- a/default-configs/ppc64-softmmu.mak +++ b/default-configs/ppc64-softmmu.mak @@ -48,5 +48,6 @@ CONFIG_OPENPIC_KVM=$(and $(CONFIG_E500),$(CONFIG_KVM)) # For pSeries CONFIG_PCI_HOTPLUG=y CONFIG_XICS=$(CONFIG_PSERIES) +CONFIG_XICS_KVM=$(and $(CONFIG_PSERIES),$(CONFIG_KVM)) # For PReP CONFIG_MC146818RTC=y diff --git a/hw/intc/Makefile.objs b/hw/intc/Makefile.objs index abe8f80..9e77afe 100644 --- a/hw/intc/Makefile.objs +++ b/hw/intc/Makefile.objs @@ -23,3 +23,4 @@ obj-$(CONFIG_OPENPIC) += openpic.o obj-$(CONFIG_OPENPIC_KVM) += openpic_kvm.o obj-$(CONFIG_SH4) += sh_intc.o obj-$(CONFIG_XICS) += xics.o +obj-$(CONFIG_XICS_KVM) += xics_kvm.o diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c new file mode 100644 index 000..b630150 --- /dev/null +++ b/hw/intc/xics_kvm.c @@ -0,0 +1,445 @@ +/* + * QEMU PowerPC pSeries Logical Partition (aka sPAPR) hardware System Emulator + * + * PAPR Virtualized Interrupt System, aka ICS/ICP aka xics, in-kernel emulation + * + * Copyright (c) 2013 David Gibson, IBM Corporation. + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the "Software"), to deal + * in the Software without restriction, including without limitation the rights + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + * copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN + * THE SOFTWARE. + * + */ + +#include "hw/hw.h" +#include "trace.h" +#include "hw/ppc/spapr.h" +#include "hw/ppc/xics.h" +#include "kvm_ppc.h" +#include "qemu/config-file.h" + +#include + +struct icp_state_kvm { +struct icp_state parent; + +uint32_t set_xive_token; +uint32_t get_xive_token; +uint32_t int_off_token; +uint32_t int_on_token; +int kernel_xics_fd; +}; + +static void icp_get_kvm_state(struct icp_server_state *ss) +{ +uint64_t state; +struct kvm_one_reg reg = { +.id = KVM_REG_PPC_ICP_STATE, +.addr = (uintptr_t)&state, +}; +int ret; + +if (!ss->cs) { +return; /* kernel irqchip not in use */ +} + +ret = kvm_vcpu_ioctl(ss->cs, KVM_GET_ONE_REG, ®); +if (ret != 0) { +fprintf(stderr, "Unable to retrieve KVM interrupt controller state" +" for CPU %d: %s\n", ss->cs->cpu_index, strerror(errno)); +exit(1); +} + +ss->xirr = state >> KVM_REG_PPC_ICP_XISR_SHIFT; +ss->mfrr = (state >> KVM_REG_PPC_ICP_MFRR_SHIFT) +& KVM_REG_PPC_ICP_MFRR_MASK; +ss->pending_priority = (state >> KVM_REG_PPC_ICP_PPRI_SHIFT) +& KVM_REG_PPC_ICP_PPRI_MASK; +} + +static int icp_set_kvm_state(struct icp_server_state *ss) +{ +uint64_t state; +struct kvm_one_reg reg = { +.id = KVM_REG_PPC_ICP_STATE, +.addr = (uintptr_t)&state, +}; +int ret; + +if (!ss->cs) { +return 0; /* kernel irqchip not in use */ +} + +state = ((uint64_t)ss->xirr << KVM_REG_PPC_ICP_XISR_SHIFT) +| ((uint64_t)ss->mfrr << KVM_REG_PPC_ICP_MFRR_SHIFT) +
[Qemu-devel] [PATCH 05/19] pseries: savevm support for XICS interrupt controller
From: David Gibson This patch adds the necessary VMStateDescription information to support savevm/loadvm for the XICS interrupt controller used on the pseries machine. Signed-off-by: David Gibson [aik: added ics_resend() on post_load] Signed-off-by: Alexey Kardashevskiy --- hw/intc/xics.c | 63 ++ 1 file changed, 63 insertions(+) diff --git a/hw/intc/xics.c b/hw/intc/xics.c index 0e374c8..3e8f48f 100644 --- a/hw/intc/xics.c +++ b/hw/intc/xics.c @@ -497,6 +497,61 @@ static void xics_reset(DeviceState *d) xics_common_reset(XICS(d)); } +static int ics_post_load(void *opaque, int version_id) +{ +int i; +struct ics_state *ics = opaque; + +for (i = 0; i < ics->icp->nr_servers; i++) { +icp_resend(ics->icp, i); +} + +return 0; +} + +const VMStateDescription vmstate_icp_server = { +.name = "icp/server", +.version_id = 1, +.minimum_version_id = 1, +.minimum_version_id_old = 1, +.fields = (VMStateField []) { +/* Sanity check */ +VMSTATE_UINT32(xirr, struct icp_server_state), +VMSTATE_UINT8(pending_priority, struct icp_server_state), +VMSTATE_UINT8(mfrr, struct icp_server_state), +VMSTATE_END_OF_LIST() +}, +}; + +static const VMStateDescription vmstate_ics_irq = { +.name = "ics/irq", +.version_id = 1, +.minimum_version_id = 1, +.minimum_version_id_old = 1, +.fields = (VMStateField []) { +VMSTATE_UINT32(server, struct ics_irq_state), +VMSTATE_UINT8(priority, struct ics_irq_state), +VMSTATE_UINT8(saved_priority, struct ics_irq_state), +VMSTATE_UINT8(status, struct ics_irq_state), +VMSTATE_END_OF_LIST() +}, +}; + +const VMStateDescription vmstate_ics = { +.name = "ics", +.version_id = 1, +.minimum_version_id = 1, +.minimum_version_id_old = 1, +.post_load = ics_post_load, +.fields = (VMStateField []) { +/* Sanity check */ +VMSTATE_UINT32_EQUAL(nr_irqs, struct ics_state), + +VMSTATE_STRUCT_VARRAY_POINTER_UINT32(irqs, struct ics_state, nr_irqs, vmstate_ics_irq, struct ics_irq_state), +VMSTATE_END_OF_LIST() +}, +}; + void xics_common_cpu_setup(struct icp_state *icp, PowerPCCPU *cpu) { CPUState *cs = CPU(cpu); @@ -523,7 +578,11 @@ void xics_common_cpu_setup(struct icp_state *icp, PowerPCCPU *cpu) void xics_cpu_setup(struct icp_state *icp, PowerPCCPU *cpu) { +CPUState *cs = CPU(cpu); +struct icp_server_state *ss = &icp->ss[cs->cpu_index]; + xics_common_cpu_setup(icp, cpu); +vmstate_register(NULL, cs->cpu_index, &vmstate_icp_server, ss); } void xics_common_init(struct icp_state *icp, qemu_irq_handler handler) @@ -555,6 +614,10 @@ static void xics_realize(DeviceState *dev, Error **errp) spapr_rtas_register("ibm,int-off", rtas_int_off); spapr_rtas_register("ibm,int-on", rtas_int_on); +/* We use each the ICS's offset into the global irq number space + * as an instance id. This means we can extend to multiple ICS + * instances without needing to change the savevm format */ +vmstate_register(NULL, icp->ics->offset, &vmstate_ics, icp->ics); } static Property xics_properties[] = { -- 1.8.3.2
[Qemu-devel] [PATCH 06/19] pseries: savevm support for VIO devices
From: David Gibson This patch adds helpers to allow PAPR VIO devices to save state common to all VIO devices during savevm. Signed-off-by: David Gibson Signed-off-by: Alexey Kardashevskiy --- hw/ppc/spapr_vio.c | 20 include/hw/ppc/spapr_vio.h | 5 + 2 files changed, 25 insertions(+) diff --git a/hw/ppc/spapr_vio.c b/hw/ppc/spapr_vio.c index 9c18741..565d883 100644 --- a/hw/ppc/spapr_vio.c +++ b/hw/ppc/spapr_vio.c @@ -542,6 +542,26 @@ static const TypeInfo spapr_vio_bridge_info = { .class_init= spapr_vio_bridge_class_init, }; +const VMStateDescription vmstate_spapr_vio = { +.name = "spapr_vio", +.version_id = 1, +.minimum_version_id = 1, +.minimum_version_id_old = 1, +.fields = (VMStateField []) { +/* Sanity check */ +VMSTATE_UINT32_EQUAL(reg, VIOsPAPRDevice), +VMSTATE_UINT32_EQUAL(irq, VIOsPAPRDevice), + +/* General VIO device state */ +VMSTATE_UINTTL(signal_state, VIOsPAPRDevice), +VMSTATE_UINT64(crq.qladdr, VIOsPAPRDevice), +VMSTATE_UINT32(crq.qsize, VIOsPAPRDevice), +VMSTATE_UINT32(crq.qnext, VIOsPAPRDevice), + +VMSTATE_END_OF_LIST() +}, +}; + static void vio_spapr_device_class_init(ObjectClass *klass, void *data) { DeviceClass *k = DEVICE_CLASS(klass); diff --git a/include/hw/ppc/spapr_vio.h b/include/hw/ppc/spapr_vio.h index 3609327..46edc2a 100644 --- a/include/hw/ppc/spapr_vio.h +++ b/include/hw/ppc/spapr_vio.h @@ -134,4 +134,9 @@ VIOsPAPRDevice *spapr_vty_get_default(VIOsPAPRBus *bus); void spapr_vio_quiesce(void); +extern const VMStateDescription vmstate_spapr_vio; + +#define VMSTATE_SPAPR_VIO(_f, _s) \ +VMSTATE_STRUCT(_f, _s, 0, vmstate_spapr_vio, VIOsPAPRDevice) + #endif /* _HW_SPAPR_VIO_H */ -- 1.8.3.2
[Qemu-devel] [PATCH 02/19] pseries: rework XICS
Currently XICS interrupt controller is not a QEMU device. As we are going to support in-kernel emulated XICS which is a part of KVM, it make sense not to extend the existing XICS and have multiple KVM stub functions but to create yet another device and share pieces between fully emulated XICS and in-kernel XICS. The rework includes: * port to QOM * made few functions public to use from in-kernel XICS implementation * made VMStateDescription public to be used for in-kernel XICS migration * move xics_system_init() to spapr.c, it tries creating fully-emulated XICS now and will try in-kernel XICS in upcoming patches. Signed-off-by: Alexey Kardashevskiy --- hw/intc/xics.c| 109 ++ hw/ppc/spapr.c| 28 + include/hw/ppc/xics.h | 59 +-- 3 files changed, 141 insertions(+), 55 deletions(-) diff --git a/hw/intc/xics.c b/hw/intc/xics.c index 091912e..0e374c8 100644 --- a/hw/intc/xics.c +++ b/hw/intc/xics.c @@ -34,13 +34,6 @@ * ICP: Presentation layer */ -struct icp_server_state { -uint32_t xirr; -uint8_t pending_priority; -uint8_t mfrr; -qemu_irq output; -}; - #define XISR_MASK 0x00ff #define CPPR_MASK 0xff00 @@ -49,12 +42,6 @@ struct icp_server_state { struct ics_state; -struct icp_state { -long nr_servers; -struct icp_server_state *ss; -struct ics_state *ics; -}; - static void ics_reject(struct ics_state *ics, int nr); static void ics_resend(struct ics_state *ics); static void ics_eoi(struct ics_state *ics, int nr); @@ -171,27 +158,6 @@ static void icp_irq(struct icp_state *icp, int server, int nr, uint8_t priority) /* * ICS: Source layer */ - -struct ics_irq_state { -int server; -uint8_t priority; -uint8_t saved_priority; -#define XICS_STATUS_ASSERTED 0x1 -#define XICS_STATUS_SENT 0x2 -#define XICS_STATUS_REJECTED 0x4 -#define XICS_STATUS_MASKED_PENDING 0x8 -uint8_t status; -}; - -struct ics_state { -int nr_irqs; -int offset; -qemu_irq *qirqs; -bool *islsi; -struct ics_irq_state *irqs; -struct icp_state *icp; -}; - static int ics_valid_irq(struct ics_state *ics, uint32_t nr) { return (nr >= ics->offset) @@ -506,9 +472,8 @@ static void rtas_int_on(PowerPCCPU *cpu, sPAPREnvironment *spapr, rtas_st(rets, 0, 0); /* Success */ } -static void xics_reset(void *opaque) +void xics_common_reset(struct icp_state *icp) { -struct icp_state *icp = (struct icp_state *)opaque; struct ics_state *ics = icp->ics; int i; @@ -527,7 +492,12 @@ static void xics_reset(void *opaque) } } -void xics_cpu_setup(struct icp_state *icp, PowerPCCPU *cpu) +static void xics_reset(DeviceState *d) +{ +xics_common_reset(XICS(d)); +} + +void xics_common_cpu_setup(struct icp_state *icp, PowerPCCPU *cpu) { CPUState *cs = CPU(cpu); CPUPPCState *env = &cpu->env; @@ -551,37 +521,72 @@ void xics_cpu_setup(struct icp_state *icp, PowerPCCPU *cpu) } } -struct icp_state *xics_system_init(int nr_servers, int nr_irqs) +void xics_cpu_setup(struct icp_state *icp, PowerPCCPU *cpu) { -struct icp_state *icp; -struct ics_state *ics; +xics_common_cpu_setup(icp, cpu); +} + +void xics_common_init(struct icp_state *icp, qemu_irq_handler handler) +{ +struct ics_state *ics = icp->ics; -icp = g_malloc0(sizeof(*icp)); -icp->nr_servers = nr_servers; icp->ss = g_malloc0(icp->nr_servers*sizeof(struct icp_server_state)); ics = g_malloc0(sizeof(*ics)); -ics->nr_irqs = nr_irqs; +ics->nr_irqs = icp->nr_irqs; ics->offset = XICS_IRQ_BASE; -ics->irqs = g_malloc0(nr_irqs * sizeof(struct ics_irq_state)); -ics->islsi = g_malloc0(nr_irqs * sizeof(bool)); +ics->irqs = g_malloc0(ics->nr_irqs * sizeof(struct ics_irq_state)); +ics->islsi = g_malloc0(ics->nr_irqs * sizeof(bool)); icp->ics = ics; ics->icp = icp; -ics->qirqs = qemu_allocate_irqs(ics_set_irq, ics, nr_irqs); +ics->qirqs = qemu_allocate_irqs(handler, ics, ics->nr_irqs); +} -spapr_register_hypercall(H_CPPR, h_cppr); -spapr_register_hypercall(H_IPI, h_ipi); -spapr_register_hypercall(H_XIRR, h_xirr); -spapr_register_hypercall(H_EOI, h_eoi); +static void xics_realize(DeviceState *dev, Error **errp) +{ +struct icp_state *icp = XICS(dev); + +xics_common_init(icp, ics_set_irq); spapr_rtas_register("ibm,set-xive", rtas_set_xive); spapr_rtas_register("ibm,get-xive", rtas_get_xive); spapr_rtas_register("ibm,int-off", rtas_int_off); spapr_rtas_register("ibm,int-on", rtas_int_on); -qemu_register_reset(xics_reset, icp); +} + +static Property xics_properties[] = { +DEFINE_PROP_UINT32("nr_servers", struct icp_state, nr_servers, -1), +DEFINE_PROP_UINT32("nr_irqs", struct icp_state, nr_irqs, -1), +DEFINE_PROP_END_OF_LIST(), +}; + +static void xics_class_init(ObjectClass *oc, void *data) +{ +
[Qemu-devel] [PATCH 04/19] target-ppc: Convert ppc cpu savevm to VMStateDescription
From: David Gibson The savevm code for the powerpc cpu emulation is currently based around the old register_savevm() rather than register_vmstate() method. It's also rather broken, missing some important state on some CPU models. This patch completely rewrites the savevm for target-ppc, using the new VMStateDescription approach. Exactly what needs to be saved in what configurations has been more carefully examined, too. This introduces a new version (5) of the cpu save format. The old load function is retained to support version 4 images. Signed-off-by: David Gibson [aik: ppc cpu savevm convertion fixed to use PowerPCCPU instead of CPUPPCState] Signed-off-by: Alexey Kardashevskiy Signed-off-by: Alexey Kardashevskiy --- target-ppc/cpu-qom.h| 4 + target-ppc/cpu.h| 8 +- target-ppc/machine.c| 533 target-ppc/translate_init.c | 2 + 4 files changed, 454 insertions(+), 93 deletions(-) diff --git a/target-ppc/cpu-qom.h b/target-ppc/cpu-qom.h index 84ba105..a14a3d9 100644 --- a/target-ppc/cpu-qom.h +++ b/target-ppc/cpu-qom.h @@ -106,4 +106,8 @@ void ppc_cpu_dump_state(CPUState *cpu, FILE *f, fprintf_function cpu_fprintf, void ppc_cpu_dump_statistics(CPUState *cpu, FILE *f, fprintf_function cpu_fprintf, int flags); +#ifndef CONFIG_USER_ONLY +extern const struct VMStateDescription vmstate_ppc_cpu; +#endif + #endif diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h index 0ede077..f30577d 100644 --- a/target-ppc/cpu.h +++ b/target-ppc/cpu.h @@ -948,7 +948,7 @@ struct CPUPPCState { #if defined(TARGET_PPC64) /* PowerPC 64 SLB area */ ppc_slb_t slb[64]; -int slb_nr; +int32_t slb_nr; #endif /* segment registers */ hwaddr htab_base; @@ -957,11 +957,11 @@ struct CPUPPCState { /* externally stored hash table */ uint8_t *external_htab; /* BATs */ -int nb_BATs; +uint32_t nb_BATs; target_ulong DBAT[2][8]; target_ulong IBAT[2][8]; /* PowerPC TLB registers (for 4xx, e500 and 60x software driven TLBs) */ -int nb_tlb; /* Total number of TLB */ +int32_t nb_tlb; /* Total number of TLB */ int tlb_per_way; /* Speed-up helper: used to avoid divisions at run time */ int nb_ways; /* Number of ways in the TLB set*/ int last_way;/* Last used way used to allocate TLB in a LRU way */ @@ -1176,8 +1176,6 @@ static inline CPUPPCState *cpu_init(const char *cpu_model) #define cpu_signal_handler cpu_ppc_signal_handler #define cpu_list ppc_cpu_list -#define CPU_SAVE_VERSION 4 - /* MMU modes definitions */ #define MMU_MODE0_SUFFIX _user #define MMU_MODE1_SUFFIX _kernel diff --git a/target-ppc/machine.c b/target-ppc/machine.c index 2d10adb..1fcc6bc 100644 --- a/target-ppc/machine.c +++ b/target-ppc/machine.c @@ -1,96 +1,12 @@ #include "hw/hw.h" #include "hw/boards.h" #include "sysemu/kvm.h" +#include "helper_regs.h" -void cpu_save(QEMUFile *f, void *opaque) +static int cpu_load_old(QEMUFile *f, void *opaque, int version_id) { -CPUPPCState *env = (CPUPPCState *)opaque; -unsigned int i, j; -uint32_t fpscr; -target_ulong xer; - -for (i = 0; i < 32; i++) -qemu_put_betls(f, &env->gpr[i]); -#if !defined(TARGET_PPC64) -for (i = 0; i < 32; i++) -qemu_put_betls(f, &env->gprh[i]); -#endif -qemu_put_betls(f, &env->lr); -qemu_put_betls(f, &env->ctr); -for (i = 0; i < 8; i++) -qemu_put_be32s(f, &env->crf[i]); -xer = cpu_read_xer(env); -qemu_put_betls(f, &xer); -qemu_put_betls(f, &env->reserve_addr); -qemu_put_betls(f, &env->msr); -for (i = 0; i < 4; i++) -qemu_put_betls(f, &env->tgpr[i]); -for (i = 0; i < 32; i++) { -union { -float64 d; -uint64_t l; -} u; -u.d = env->fpr[i]; -qemu_put_be64(f, u.l); -} -fpscr = env->fpscr; -qemu_put_be32s(f, &fpscr); -qemu_put_sbe32s(f, &env->access_type); -#if defined(TARGET_PPC64) -qemu_put_betls(f, &env->spr[SPR_ASR]); -qemu_put_sbe32s(f, &env->slb_nr); -#endif -qemu_put_betls(f, &env->spr[SPR_SDR1]); -for (i = 0; i < 32; i++) -qemu_put_betls(f, &env->sr[i]); -for (i = 0; i < 2; i++) -for (j = 0; j < 8; j++) -qemu_put_betls(f, &env->DBAT[i][j]); -for (i = 0; i < 2; i++) -for (j = 0; j < 8; j++) -qemu_put_betls(f, &env->IBAT[i][j]); -qemu_put_sbe32s(f, &env->nb_tlb); -qemu_put_sbe32s(f, &env->tlb_per_way); -qemu_put_sbe32s(f, &env->nb_ways); -qemu_put_sbe32s(f, &env->last_way); -qemu_put_sbe32s(f, &env->id_tlbs); -qemu_put_sbe32s(f, &env->nb_pids); -if (env->tlb.tlb6) { -// XXX assumes 6xx -for (i = 0; i < env->nb_tlb; i++) { -qemu_put_betls(f, &env->tlb.tlb6[i].pte0); -qemu_put_betls(f, &env->tlb
[Qemu-devel] [PATCH 01/19] pseries: move interrupt controllers to hw/intc/
Signed-off-by: Alexey Kardashevskiy Reviewed-by: Andreas Färber --- default-configs/ppc64-softmmu.mak | 1 + hw/intc/Makefile.objs | 1 + hw/{ppc => intc}/xics.c | 0 hw/ppc/Makefile.objs | 2 +- 4 files changed, 3 insertions(+), 1 deletion(-) rename hw/{ppc => intc}/xics.c (100%) diff --git a/default-configs/ppc64-softmmu.mak b/default-configs/ppc64-softmmu.mak index cb279cb..69a9f8d 100644 --- a/default-configs/ppc64-softmmu.mak +++ b/default-configs/ppc64-softmmu.mak @@ -47,5 +47,6 @@ CONFIG_E500=y CONFIG_OPENPIC_KVM=$(and $(CONFIG_E500),$(CONFIG_KVM)) # For pSeries CONFIG_PCI_HOTPLUG=y +CONFIG_XICS=$(CONFIG_PSERIES) # For PReP CONFIG_MC146818RTC=y diff --git a/hw/intc/Makefile.objs b/hw/intc/Makefile.objs index 2ba49d0..abe8f80 100644 --- a/hw/intc/Makefile.objs +++ b/hw/intc/Makefile.objs @@ -22,3 +22,4 @@ obj-$(CONFIG_OMAP) += omap_intc.o obj-$(CONFIG_OPENPIC) += openpic.o obj-$(CONFIG_OPENPIC_KVM) += openpic_kvm.o obj-$(CONFIG_SH4) += sh_intc.o +obj-$(CONFIG_XICS) += xics.o diff --git a/hw/ppc/xics.c b/hw/intc/xics.c similarity index 100% rename from hw/ppc/xics.c rename to hw/intc/xics.c diff --git a/hw/ppc/Makefile.objs b/hw/ppc/Makefile.objs index be00d1d..7a1cd5d 100644 --- a/hw/ppc/Makefile.objs +++ b/hw/ppc/Makefile.objs @@ -1,7 +1,7 @@ # shared objects obj-y += ppc.o ppc_booke.o # IBM pSeries (sPAPR) -obj-$(CONFIG_PSERIES) += spapr.o xics.o spapr_vio.o spapr_events.o +obj-$(CONFIG_PSERIES) += spapr.o spapr_vio.o spapr_events.o obj-$(CONFIG_PSERIES) += spapr_hcall.o spapr_iommu.o spapr_rtas.o obj-$(CONFIG_PSERIES) += spapr_pci.o # PowerPC 4xx boards -- 1.8.3.2
[Qemu-devel] [PATCH 00/19 v4] spapr: migration, pci, msi, power8
New patch ("target-ppc: Enhance the CPU node labels for the guest device tree for pseries") and "savevm for VIO TTY" is separated from "savevm for VIO LAN". The series was also rebased on top of current master from qemu.org. Besides that, no more changes. Alexey Kardashevskiy (4): pseries: move interrupt controllers to hw/intc/ pseries: rework XICS pseries: rework PAPR virtual SCSI spapr-pci: rework MSI/MSIX David Gibson (13): savevm: Implement VMS_DIVIDE flag target-ppc: Convert ppc cpu savevm to VMStateDescription pseries: savevm support for XICS interrupt controller pseries: savevm support for VIO devices pseries: savevm support for PAPR VIO logical lan pseries: savevm support for PAPR VIO logical tty pseries: savevm support for PAPR TCE tables pseries: savevm support for PAPR virtual SCSI pseries: savevm support for pseries machine pseries: savevm support for PCI host bridge target-ppc: Add helper for KVM_PPC_RTAS_DEFINE_TOKEN pseries: Support for in-kernel XICS interrupt controller pseries: savevm support with KVM Prerna Saxena (2): target-ppc: Add POWER8 v1.0 CPU model target-ppc: Enhance the CPU node labels for the guest device tree for pseries. default-configs/ppc64-softmmu.mak | 2 + hw/char/spapr_vty.c | 16 ++ hw/intc/Makefile.objs | 2 + hw/{ppc => intc}/xics.c | 172 hw/intc/xics_kvm.c| 445 +++ hw/net/spapr_llan.c | 24 +- hw/ppc/Makefile.objs | 2 +- hw/ppc/spapr.c| 435 ++- hw/ppc/spapr_hcall.c | 8 +- hw/ppc/spapr_iommu.c | 25 ++ hw/ppc/spapr_pci.c| 141 ++ hw/ppc/spapr_vio.c| 20 ++ hw/scsi/spapr_vscsi.c | 306 +++--- include/hw/pci-host/spapr.h | 14 +- include/hw/ppc/spapr.h| 17 +- include/hw/ppc/spapr_vio.h| 5 + include/hw/ppc/xics.h | 72 - include/migration/vmstate.h | 13 + savevm.c | 8 + target-ppc/cpu-models.c | 3 + target-ppc/cpu-models.h | 1 + target-ppc/cpu-qom.h | 5 + target-ppc/cpu.h | 8 +- target-ppc/kvm.c | 83 ++ target-ppc/kvm_ppc.h | 29 +++ target-ppc/machine.c | 533 +++--- target-ppc/translate_init.c | 64 + 27 files changed, 2131 insertions(+), 322 deletions(-) rename hw/{ppc => intc}/xics.c (80%) create mode 100644 hw/intc/xics_kvm.c -- 1.8.3.2
Re: [Qemu-devel] [PATCH RFC 07/15] timer/arm_mptimer: Convert to QOM realize
Am 01.07.2013 11:33, schrieb Peter Crosthwaite: > Hi Andreas, > > On Mon, Jul 1, 2013 at 7:00 AM, Andreas Färber wrote: >> From: Andreas Färber >> >> Split the SysBusDevice initfn into instance_init and realizefn. >> >> Signed-off-by: Andreas Färber >> --- >> hw/timer/arm_mptimer.c | 25 +++-- >> 1 file changed, 15 insertions(+), 10 deletions(-) >> >> diff --git a/hw/timer/arm_mptimer.c b/hw/timer/arm_mptimer.c >> index 588e34b..a19ffa3 100644 >> --- a/hw/timer/arm_mptimer.c >> +++ b/hw/timer/arm_mptimer.c >> @@ -226,8 +226,18 @@ static void arm_mptimer_reset(DeviceState *dev) >> } >> } >> >> -static int arm_mptimer_init(SysBusDevice *dev) >> +static void arm_mptimer_init(Object *obj) >> { >> +ARMMPTimerState *s = ARM_MP_TIMER(obj); >> + >> +memory_region_init_io(&s->iomem, &arm_thistimer_ops, s, >> + "arm_mptimer_timer", 0x20); >> +sysbus_init_mmio(SYS_BUS_DEVICE(obj), &s->iomem); >> +} > > Splitting off only one of the memory region setups to init seems a bit > awkward. Is it really worth it given that we need to wait for > realization for the rest to occur anyway? Well, my main interest wrt MemoryRegions here is to have the *mpcore_priv container MemoryRegion mappable before realize and to avoid having to implement unnecessary cleanups in unrealize. An alternative would be for PMM to use his array properties to dynamically allocate the MemoryRegions before realize, but so far he has failed to come up with a solution... Another solution, since this is a fixed-length array, would be to always initialize all MemoryRegions and just keep adding all 3(?) in realize. FWIU adding subregions is desired in instance_init as long as it affects only containing devices and not a global address space. Regards, Andreas >> + >> +static void arm_mptimer_realize(DeviceState *dev, Error **errp) >> +{ >> +SysBusDevice *sbd = SYS_BUS_DEVICE(dev); >> ARMMPTimerState *s = ARM_MP_TIMER(dev); >> int i; >> >> @@ -244,19 +254,14 @@ static int arm_mptimer_init(SysBusDevice *dev) >> * * timer for core 1 >> * and so on. >> */ >> -memory_region_init_io(&s->iomem, &arm_thistimer_ops, s, >> - "arm_mptimer_timer", 0x20); >> -sysbus_init_mmio(dev, &s->iomem); >> for (i = 0; i < s->num_cpu; i++) { >> TimerBlock *tb = &s->timerblock[i]; >> tb->timer = qemu_new_timer_ns(vm_clock, timerblock_tick, tb); >> -sysbus_init_irq(dev, &tb->irq); >> +sysbus_init_irq(sbd, &tb->irq); >> memory_region_init_io(&tb->iomem, &timerblock_ops, tb, >>"arm_mptimer_timerblock", 0x20); >> -sysbus_init_mmio(dev, &tb->iomem); >> +sysbus_init_mmio(sbd, &tb->iomem); >> } >> - >> -return 0; >> } >> >> static const VMStateDescription vmstate_timerblock = { >> @@ -293,9 +298,8 @@ static Property arm_mptimer_properties[] = { >> static void arm_mptimer_class_init(ObjectClass *klass, void *data) >> { >> DeviceClass *dc = DEVICE_CLASS(klass); >> -SysBusDeviceClass *sbc = SYS_BUS_DEVICE_CLASS(klass); >> >> -sbc->init = arm_mptimer_init; >> +dc->realize = arm_mptimer_realize; >> dc->vmsd = &vmstate_arm_mptimer; >> dc->reset = arm_mptimer_reset; >> dc->no_user = 1; >> @@ -306,6 +310,7 @@ static const TypeInfo arm_mptimer_info = { >> .name = TYPE_ARM_MP_TIMER, >> .parent= TYPE_SYS_BUS_DEVICE, >> .instance_size = sizeof(ARMMPTimerState), >> +.instance_init = arm_mptimer_init, >> .class_init= arm_mptimer_class_init, >> }; >> >> -- >> 1.8.1.4 >> >> -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Re: [Qemu-devel] [PATCH RFC 06/15] timer/arm_mptimer: QOM cast cleanup
Am 01.07.2013 11:29, schrieb Peter Crosthwaite: > On Mon, Jul 1, 2013 at 7:00 AM, Andreas Färber wrote: >> From: Andreas Färber >> >> Introduce type constant and cast macro and rename >> ARMMPTimerState::busdev to enforce its use. >> >> Signed-off-by: Andreas Färber >> --- >> hw/timer/arm_mptimer.c | 18 +- >> 1 file changed, 13 insertions(+), 5 deletions(-) >> >> diff --git a/hw/timer/arm_mptimer.c b/hw/timer/arm_mptimer.c >> index 317f5e4..588e34b 100644 >> --- a/hw/timer/arm_mptimer.c >> +++ b/hw/timer/arm_mptimer.c >> @@ -40,8 +40,15 @@ typedef struct { >> MemoryRegion iomem; >> } TimerBlock; >> >> +#define TYPE_ARM_MP_TIMER "arm_mptimer" > > The _ in "MP_TIMER" seems inconsistent with the type string and > "A9MPCORE" as introduced in P1. It is consistent with "PC_NET" though. ;) I can surely change it to "MPTIMER" if preferred. Cheers, Andreas -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Re: [Qemu-devel] [PATCH 3/9] linux-user: Don't reset a new thread's CPU
Am 06.07.2013 14:44, schrieb Peter Maydell: > On 6 July 2013 13:40, Andreas Färber wrote: >> softmmu would do it after the future QMP qom-set phase. The mess there >> is reset handler registration order: We cannot have most CPUs register a >> reset handler themselves yet because some machines (including most ARM >> ones) register reset handlers to tweak registers before the CPU would >> have reset in that future scenario. > > I'm not really a fan of that "use reset handler to simulate > bootloader firmware" code, so if you have a cleaner solution > to suggest I'd be happy to move to that. I once suggested a secondary reset hook in CPUClass, to be invoked from CPUClass::reset, that boards could set to piggy-back initializations, but IIRC Anthony didn't like that. For PReP I am trying to avoid NIP tweaking sneaking in with the ELF loading requests since I know how hard it'll be to keep working. Andreas > (if the cleaner solution is "provide a firmware blob for > all boards" that might be too much work though :-)) > > -- PMM -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Re: [Qemu-devel] [PATCH RFC qom-cpu 39/41] target-xtensa: Introduce XtensaCPU subclasses
Max, Am 29.06.2013 22:01, schrieb Andreas Färber: > Register a CPU type per core registered. Save the XtensaConfig in > XtensaCPUClass instead of CPUXtensaState. > > Prepares for storing per-class GDB register count. > > Signed-off-by: Andreas Färber Ping! Can you ack? (It did not seem to break your test image.) Thanks, Andreas > --- > gdbstub.c | 17 --- > hw/xtensa/pic_cpu.c | 47 -- > target-xtensa/cpu-qom.h | 3 ++ > target-xtensa/cpu.c | 30 ++-- > target-xtensa/cpu.h | 22 + > target-xtensa/helper.c| 93 +++ > target-xtensa/op_helper.c | 121 > -- > target-xtensa/translate.c | 12 +++-- > 8 files changed, 236 insertions(+), 109 deletions(-) > > diff --git a/gdbstub.c b/gdbstub.c > index 4ebe9e0..d08cfd3 100644 > --- a/gdbstub.c > +++ b/gdbstub.c > @@ -1692,14 +1692,16 @@ static int cpu_gdb_write_register(CPULM32State *env, > uint8_t *mem_buf, int n) > * reset bit 0 in the 'flags' field of the registers definitions in the > * gdb/xtensa-config.c inside gdb source tree or inside gdb overlay. > */ > -#define NUM_CORE_REGS (env->config->gdb_regmap.num_regs) > +#define NUM_CORE_REGS \ > + > (XTENSA_CPU_GET_CLASS(xtensa_env_get_cpu(env))->config->gdb_regmap.num_regs) > #define num_g_regs NUM_CORE_REGS > > static int cpu_gdb_read_register(CPUXtensaState *env, uint8_t *mem_buf, int > n) > { > -const XtensaGdbReg *reg = env->config->gdb_regmap.reg + n; > +XtensaCPUClass *xcc = XTENSA_CPU_GET_CLASS(xtensa_env_get_cpu(env)); > +const XtensaGdbReg *reg = xcc->config->gdb_regmap.reg + n; > > -if (n < 0 || n >= env->config->gdb_regmap.num_regs) { > +if (n < 0 || n >= xcc->config->gdb_regmap.num_regs) { > return 0; > } > > @@ -1710,7 +1712,7 @@ static int cpu_gdb_read_register(CPUXtensaState *env, > uint8_t *mem_buf, int n) > > case 1: /*ar*/ > xtensa_sync_phys_from_window(env); > -GET_REG32(env->phys_regs[(reg->targno & 0xff) % env->config->nareg]); > +GET_REG32(env->phys_regs[(reg->targno & 0xff) % xcc->config->nareg]); > break; > > case 2: /*SR*/ > @@ -1738,10 +1740,11 @@ static int cpu_gdb_read_register(CPUXtensaState *env, > uint8_t *mem_buf, int n) > > static int cpu_gdb_write_register(CPUXtensaState *env, uint8_t *mem_buf, int > n) > { > +XtensaCPUClass *xcc = XTENSA_CPU_GET_CLASS(xtensa_env_get_cpu(env)); > uint32_t tmp; > -const XtensaGdbReg *reg = env->config->gdb_regmap.reg + n; > +const XtensaGdbReg *reg = xcc->config->gdb_regmap.reg + n; > > -if (n < 0 || n >= env->config->gdb_regmap.num_regs) { > +if (n < 0 || n >= xcc->config->gdb_regmap.num_regs) { > return 0; > } > > @@ -1753,7 +1756,7 @@ static int cpu_gdb_write_register(CPUXtensaState *env, > uint8_t *mem_buf, int n) > break; > > case 1: /*ar*/ > -env->phys_regs[(reg->targno & 0xff) % env->config->nareg] = tmp; > +env->phys_regs[(reg->targno & 0xff) % xcc->config->nareg] = tmp; > xtensa_sync_window_from_phys(env); > break; > > diff --git a/hw/xtensa/pic_cpu.c b/hw/xtensa/pic_cpu.c > index 7f015ff..048038d 100644 > --- a/hw/xtensa/pic_cpu.c > +++ b/hw/xtensa/pic_cpu.c > @@ -31,13 +31,15 @@ > > void xtensa_advance_ccount(CPUXtensaState *env, uint32_t d) > { > +XtensaCPU *cpu = xtensa_env_get_cpu(env); > +XtensaCPUClass *xcc = XTENSA_CPU_GET_CLASS(cpu); > uint32_t old_ccount = env->sregs[CCOUNT]; > > env->sregs[CCOUNT] += d; > > -if (xtensa_option_enabled(env->config, XTENSA_OPTION_TIMER_INTERRUPT)) { > +if (xtensa_option_enabled(xcc->config, XTENSA_OPTION_TIMER_INTERRUPT)) { > int i; > -for (i = 0; i < env->config->nccompare; ++i) { > +for (i = 0; i < xcc->config->nccompare; ++i) { > if (env->sregs[CCOMPARE + i] - old_ccount <= d) { > xtensa_timer_irq(env, i, 1); > } > @@ -47,7 +49,9 @@ void xtensa_advance_ccount(CPUXtensaState *env, uint32_t d) > > void check_interrupts(CPUXtensaState *env) > { > -CPUState *cs = CPU(xtensa_env_get_cpu(env)); > +XtensaCPU *cpu = xtensa_env_get_cpu(env); > +CPUState *cs = CPU(cpu); > +XtensaCPUClass *xcc = XTENSA_CPU_GET_CLASS(cpu); > int minlevel = xtensa_get_cintlevel(env); > uint32_t int_set_enabled = env->sregs[INTSET] & env->sregs[INTENABLE]; > int level; > @@ -60,11 +64,11 @@ void check_interrupts(CPUXtensaState *env) > > xtensa_advance_ccount(env, > muldiv64(now - env->halt_clock, > -env->config->clock_freq_khz, 100)); > +xcc->config->clock_freq_khz, 100)); > env->halt_clock = now; > } > -for (level = env->config->nlevel; level > minlevel; --level) { > -if (env->config->level_mask[level] & int_set_enabled) {
[Qemu-devel] [Bug 584143] Re: qemu fails to set hdd serial number
** No longer affects: qemu-kvm (Ubuntu Lucid) ** No longer affects: qemu-kvm (Ubuntu Maverick) -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/584143 Title: qemu fails to set hdd serial number Status in QEMU: Fix Released Status in “qemu-kvm” package in Ubuntu: Fix Released Status in “qemu-kvm” package in Debian: Fix Released Bug description: = SRU Justification: 1. Impact: 'qemu -drive ...,serial=xyz' does not work 2. How addressed: a patch from upstream fixes bug that sizeof was called on the wrong thing. 3. patch: is in the description 4. to reproduce: use '-drive ...,serial=xyz' option to qemu 5. regression potential: this only changes one line which called sizeof on the wrong thing, so should not impact any other code. = The -drive ...,serial=xyz option is broken, at least in 0.12. See Debian bug#573439, http://bugs.debian.org/cgi- bin/bugreport.cgi?bug=573439 for details. The proposed fix from the original reporter: --- qemu-kvm-0.12.3+dfsg/vl.c 2010-02-26 11:34:00.0 +0900 +++ qemu-kvm-0.12.3+dfsg.old/vl.c 2010-03-11 02:26:00.134217787 +0900 @@ -2397,7 +2397,7 @@ dinfo->on_write_error = on_write_error; dinfo->opts = opts; if (serial) -strncpy(dinfo->serial, serial, sizeof(serial)); +strncpy(dinfo->serial, serial, sizeof(dinfo->serial)); QTAILQ_INSERT_TAIL(&drives, dinfo, next); if (is_extboot) { extboot_drive = dinfo; To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/584143/+subscriptions
Re: [Qemu-devel] [PATCH 3/9] linux-user: Don't reset a new thread's CPU
On 6 July 2013 13:40, Andreas Färber wrote: > softmmu would do it after the future QMP qom-set phase. The mess there > is reset handler registration order: We cannot have most CPUs register a > reset handler themselves yet because some machines (including most ARM > ones) register reset handlers to tweak registers before the CPU would > have reset in that future scenario. I'm not really a fan of that "use reset handler to simulate bootloader firmware" code, so if you have a cleaner solution to suggest I'd be happy to move to that. (if the cleaner solution is "provide a firmware blob for all boards" that might be too much work though :-)) -- PMM
Re: [Qemu-devel] [PATCH 3/9] linux-user: Don't reset a new thread's CPU
Am 06.07.2013 12:31, schrieb Peter Maydell: > On 6 July 2013 01:36, Alexander Graf wrote: >> When we create a new thread, there is no reason to reset it. I'm fairly sure >> the code has mostly been left in there because nobody understood why it was >> there in the first place. > > We had a big discussion on this on IRC. This code is here > because of commit b4558d748, which attempted to fix a breakage > introduced when commit b55a37c9 made these CPUs no longer do > a reset as part of their cpu_init(). And as I said on IRC, x86 has been fixed to reset as part of cpu_init(): http://repo.or.cz/w/qemu.git/commit/77868120cfe93ad7816dfac6546684e5a6c6e256?f=linux-user/main.c So since this patch only removes x86 reset and leaves ppc and sparc in place, I think this patch is more correct than what we have now. > However, it's in the > wrong place -- this reset needs to go into cpu_copy(), so > it doesn't undo the copying work done by that function. The remaining question Eduardo raised was whether not resetting after fiddling with fields might cause some kind of inconsistency in the CPU. But since the current code effectively undoes the effects of cpu_copy(), I'm willing to pick up this patch as follow-up to my earlier refactoring (and 6/9 if ack'ed as follow-up to Peter's refactoring) for qom-cpu. The commit message would need to be reworked though to reference why this can be done for x86. And we'd still need a follow-up solution for ppc and sparc. Peter's suggestion was that we run LTP in a chroot to verify whether our respective solutions break anything badly: http://wiki.qemu.org/Testing/LTP (BTW the May 2013 stable version is available now) >> Remove the reset. A new thread's kernel sided state should be identical to >> the old one's. > > Just deleting the reset isn't right. We should standardize > whether we think cpu_init() ought to give you a cleanly > reset CPU or not (I think it should); +1 Specifically as part of QOM realize, which - once cpu_init() is gone - linux-user/bsd-user would trivially invoke directly after object_new(). softmmu would do it after the future QMP qom-set phase. The mess there is reset handler registration order: We cannot have most CPUs register a reset handler themselves yet because some machines (including most ARM ones) register reset handlers to tweak registers before the CPU would have reset in that future scenario. > until then, we should > put a cpu_reset() immediately after cpu_init() in cpu_copy(). > It doesn't need to be ifdef-guarded either. Will test and post my patch to that effect. Regards, Andreas -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
[Qemu-devel] [PATCH 9/9] linux-user: Unlock mmap_lock when resuming guest from page_unprotect
The page_unprotect() function is running everything locked. Before every potential exit path of the function mmap_unlock() gets called to make sure we don't leak the lock. However, the function calls tb_invalidate_phys_page() which again can exit a signal through longjmp, leaving our mmap_unlock() attempts in vain. Add a hint to tb_invalidate_phys_page() that we need to unlock before we can leave back into guest context, so that we don't leak the lock. This fixes 16-bit i386 wine programs running in linux-user for me. Signed-off-by: Alexander Graf --- translate-all.c | 10 +++--- 1 files changed, 7 insertions(+), 3 deletions(-) diff --git a/translate-all.c b/translate-all.c index e8683d2..3b5fc7c 100644 --- a/translate-all.c +++ b/translate-all.c @@ -1148,7 +1148,8 @@ void tb_invalidate_phys_page_fast(tb_page_addr_t start, int len) #if !defined(CONFIG_SOFTMMU) static void tb_invalidate_phys_page(tb_page_addr_t addr, -uintptr_t pc, void *puc) +uintptr_t pc, void *puc, +bool locked) { TranslationBlock *tb; PageDesc *p; @@ -1206,6 +1207,9 @@ static void tb_invalidate_phys_page(tb_page_addr_t addr, itself */ cpu->current_tb = NULL; tb_gen_code(env, current_pc, current_cs_base, current_flags, 1); +if (locked) { +mmap_unlock(); +} cpu_resume_from_signal(env, puc); } #endif @@ -1723,7 +1727,7 @@ void page_set_flags(target_ulong start, target_ulong end, int flags) if (!(p->flags & PAGE_WRITE) && (flags & PAGE_WRITE) && p->first_tb) { -tb_invalidate_phys_page(addr, 0, NULL); +tb_invalidate_phys_page(addr, 0, NULL, false); } p->flags = flags; } @@ -1818,7 +1822,7 @@ int page_unprotect(target_ulong address, uintptr_t pc, void *puc) /* and since the content will be modified, we must invalidate the corresponding translated code. */ -tb_invalidate_phys_page(addr, pc, puc); +tb_invalidate_phys_page(addr, pc, puc, true); #ifdef DEBUG_TB_CHECK tb_invalidate_check(addr); #endif -- 1.6.0.2
[Qemu-devel] [PATCH 5/9] linux-user: Fix epoll on ARM hosts
The epoll emulation uses data structures without packing them, so the compiler might choose to add padding inside. This patch makes the most offending one (target_epoll_event) a packed structure to make sure we don't pad it by accident. ARM would pad it, so declare the padding mandatory for ARM targets. This fixes i386-on-ARM epoll emulation for me. Signed-off-by: Alexander Graf --- linux-user/syscall_defs.h |5 - 1 files changed, 4 insertions(+), 1 deletions(-) diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h index 84da7f7..c99aed6 100644 --- a/linux-user/syscall_defs.h +++ b/linux-user/syscall_defs.h @@ -2454,8 +2454,11 @@ typedef union target_epoll_data { struct target_epoll_event { uint32_t events; +#ifdef TARGET_ARM +uint32_t __pad; +#endif target_epoll_data_t data; -}; +} QEMU_PACKED; #endif struct target_rlimit64 { uint64_t rlim_cur; -- 1.6.0.2
[Qemu-devel] [PATCH 1/9] linux-user: fix segmentation fault passing with h2g(x) != x
When forwarding a segmentation fault into the guest process, we were passing the host's address directly into the guest process's signal descriptor. That obviously confused the guest process, since it didn't know what to make of the (usually 32-bit truncated) address. Passing in h2g(address) makes the guest process a lot happier. To make the code more obvious, introduce a h2g_nocheck() macro that does the same as h2g(), but allows us to convert addresses that may be outside of guest mapped range into the guest's view of address space. This fixes java running in arm-linux-user for me. Signed-off-by: Alexander Graf --- include/exec/cpu-all.h |8 ++-- user-exec.c|4 2 files changed, 10 insertions(+), 2 deletions(-) diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h index 6499cd0..320e3c4 100644 --- a/include/exec/cpu-all.h +++ b/include/exec/cpu-all.h @@ -209,11 +209,15 @@ extern unsigned long reserved_va; }) #endif -#define h2g(x) ({ \ +#define h2g_nocheck(x) ({ \ unsigned long __ret = (unsigned long)(x) - GUEST_BASE; \ +(abi_ulong)__ret; \ +}) + +#define h2g(x) ({ \ /* Check if given address fits target address space */ \ assert(h2g_valid(x)); \ -(abi_ulong)__ret; \ +h2g_nocheck(x); \ }) #define saddr(x) g2h(x) diff --git a/user-exec.c b/user-exec.c index 26cde7c..ed15f1e 100644 --- a/user-exec.c +++ b/user-exec.c @@ -94,6 +94,10 @@ static inline int handle_cpu_signal(uintptr_t pc, unsigned long address, return 1; } +/* Convert forcefully to guest address space, invalid addresses + are still valid segv ones */ +address = h2g_nocheck(address); + env = current_cpu->env_ptr; /* see if it is an MMU fault */ ret = cpu_handle_mmu_fault(env, address, is_write, MMU_USER_IDX); -- 1.6.0.2
[Qemu-devel] [PATCH 2/9] user-exec.c: Set is_write correctly in the ARM cpu_signal_handler()
From: Peter Maydell In the ARM implementation of cpu_signal_handler(), set is_write correctly using the FSR value which the kernel passes us in the error_code field of uc_mcontext. Since the WnR bit of the FSR was only introduced in ARMv6, this means that v5 cores will continue to behave as before this patch, but they are not really supported as hosts for linux-user mode anyway since they do not have the modern behaviour for unaligned accesses. Signed-off-by: Peter Maydell Acked-by: Alexander Graf Signed-off-by: Alexander Graf --- user-exec.c |8 ++-- 1 files changed, 6 insertions(+), 2 deletions(-) diff --git a/user-exec.c b/user-exec.c index ed15f1e..82bfa66 100644 --- a/user-exec.c +++ b/user-exec.c @@ -20,6 +20,7 @@ #include "cpu.h" #include "disas/disas.h" #include "tcg.h" +#include "qemu/bitops.h" #undef EAX #undef ECX @@ -446,8 +447,11 @@ int cpu_signal_handler(int host_signum, void *pinfo, #else pc = uc->uc_mcontext.arm_pc; #endif -/* XXX: compute is_write */ -is_write = 0; + +/* error_code is the FSR value, in which bit 11 is WnR (assuming a v6 or + * later processor; on v5 we will always report this as a read). + */ +is_write = extract32(uc->uc_mcontext.error_code, 11, 1); return handle_cpu_signal(pc, (unsigned long)info->si_addr, is_write, &uc->uc_sigmask, puc); -- 1.6.0.2
[Qemu-devel] [PATCH 6/9] linux-user: Add i386 TLS setter
We can easily set the TLS on i386. Add code to do so. Signed-off-by: Alexander Graf --- linux-user/i386/target_cpu.h | 12 ++-- linux-user/syscall.c |2 +- 2 files changed, 11 insertions(+), 3 deletions(-) diff --git a/linux-user/i386/target_cpu.h b/linux-user/i386/target_cpu.h index abcac79..1170d84 100644 --- a/linux-user/i386/target_cpu.h +++ b/linux-user/i386/target_cpu.h @@ -28,6 +28,14 @@ static inline void cpu_clone_regs(CPUX86State *env, target_ulong newsp) env->regs[R_EAX] = 0; } -/* TODO: need to implement cpu_set_tls() */ +#if defined(TARGET_ABI32) +abi_long do_set_thread_area(CPUX86State *env, abi_ulong ptr); -#endif +static inline void cpu_set_tls(CPUX86State *env, target_ulong newtls) +{ +do_set_thread_area(env, newtls); +cpu_x86_load_seg(env, R_GS, env->segs[R_GS].selector); +} +#endif /* defined(TARGET_ABI32) */ + +#endif /* !defined(TARGET_CPU_H) */ diff --git a/linux-user/syscall.c b/linux-user/syscall.c index 8eb5c31..e7d5143 100644 --- a/linux-user/syscall.c +++ b/linux-user/syscall.c @@ -3983,7 +3983,7 @@ static abi_long do_modify_ldt(CPUX86State *env, int func, abi_ulong ptr, } #if defined(TARGET_I386) && defined(TARGET_ABI32) -static abi_long do_set_thread_area(CPUX86State *env, abi_ulong ptr) +abi_long do_set_thread_area(CPUX86State *env, abi_ulong ptr) { uint64_t *gdt_table = g2h(env->gdt.base); struct target_modify_ldt_ldt_s ldt_info; -- 1.6.0.2
[Qemu-devel] [PATCH 8/9] linux-user: Default to 64k guest base
Most kernels these days have protection code in place to forbid user space to access low memory. The barrier varies between architectures though. For this purpose we have the guest base option that allows us to offset guest visible memory from host memory, so that the guest process thinks it can access lower memory than it really can access. Set the default for the guest base to 64k which should be good enough on any host system. This fixes running i386 wine on ARM for me. Signed-off-by: Alexander Graf --- linux-user/main.c |6 +++--- 1 files changed, 3 insertions(+), 3 deletions(-) diff --git a/linux-user/main.c b/linux-user/main.c index 7f15d3d..a246cff 100644 --- a/linux-user/main.c +++ b/linux-user/main.c @@ -45,8 +45,8 @@ envlist_t *envlist; const char *cpu_model; unsigned long mmap_min_addr; #if defined(CONFIG_USE_GUEST_BASE) -unsigned long guest_base; -int have_guest_base; +unsigned long guest_base = 64 * 1024; +int have_guest_base = 1; #if (TARGET_LONG_BITS == 32) && (HOST_LONG_BITS == 64) /* * When running 32-on-64 we should make sure we can fit all of the possible @@ -3294,7 +3294,7 @@ static void handle_arg_cpu(const char *arg) static void handle_arg_guest_base(const char *arg) { guest_base = strtol(arg, NULL, 0); -have_guest_base = 1; +have_guest_base = guest_base ? 1 : 0; } static void handle_arg_reserved_va(const char *arg) -- 1.6.0.2
[Qemu-devel] [PATCH 3/9] linux-user: Reset copied CPUs in cpu_copy() always
When a new thread gets created, we need to reset non arch specific state to get the new CPU into clean state. However this reset should happen before the arch specific CPU contents get copied over. Otherwise we end up having clean reset state in our newly created thread. Signed-off-by: Alexander Graf --- exec.c |4 linux-user/syscall.c |3 --- 2 files changed, 4 insertions(+), 3 deletions(-) diff --git a/exec.c b/exec.c index 4e20143..7eaa3a0 100644 --- a/exec.c +++ b/exec.c @@ -638,6 +638,10 @@ CPUArchState *cpu_copy(CPUArchState *env) CPUWatchpoint *wp; #endif +/* Reset non arch specific state */ +cpu_reset(ENV_GET_CPU(new_env)); + +/* Copy arch specific state into the new CPU */ memcpy(new_env, env, sizeof(CPUArchState)); /* Clone all break/watchpoints. diff --git a/linux-user/syscall.c b/linux-user/syscall.c index 433d3ba..89b7698 100644 --- a/linux-user/syscall.c +++ b/linux-user/syscall.c @@ -4234,9 +4234,6 @@ static int do_fork(CPUArchState *env, unsigned int flags, abi_ulong newsp, init_task_state(ts); /* we create a new CPU instance. */ new_env = cpu_copy(env); -#if defined(TARGET_I386) || defined(TARGET_SPARC) || defined(TARGET_PPC) -cpu_reset(ENV_GET_CPU(new_env)); -#endif /* Init regs that differ from the parent. */ cpu_clone_regs(new_env, newsp); new_env->opaque = ts; -- 1.6.0.2
[Qemu-devel] [PATCH 0/9] Wine enablement patch set v2
Howdy, It's been a while since I've tried to run wine in QEMU's i386 linux-user target, so I figured I'd give it a go again. Obviously I've hit a bunch of obstacles, some of which were old, some of which we only introduced recently. However, with this patch set I am able to successfully run Civilization II in wine on my ARM Chromebook: http://db.tt/AvHJOKSg Beware that this patch set is based on Andreas' qom-cpu branch. If you just want a working git tree to try this out on, check out: git://github.com/agraf/qemu.git linux-user-x86-tls-v1 Alex v1 -> v2: - user Peter's patch for ARM is_write support - add h2g_unchecked() macro - rewrite cpu reset on cpu clone code - rewrite sendrcvmsg() patch Alexander Graf (8): linux-user: fix segmentation fault passing with h2g(x) != x linux-user: Reset copied CPUs in cpu_copy() always linux-user: Clean up sendrecvmsg message parsing linux-user: Fix epoll on ARM hosts linux-user: Add i386 TLS setter linux-user: Enable NPTL for i386 linux-user: Default to 64k guest base linux-user: Unlock mmap_lock when resuming guest from page_unprotect Peter Maydell (1): user-exec.c: Set is_write correctly in the ARM cpu_signal_handler() configure|1 + exec.c |4 ++ include/exec/cpu-all.h |8 - linux-user/i386/target_cpu.h | 12 ++- linux-user/main.c|6 ++-- linux-user/syscall.c | 30 +++ linux-user/syscall_defs.h| 63 - translate-all.c | 10 -- user-exec.c | 12 ++- 9 files changed, 101 insertions(+), 45 deletions(-)
[Qemu-devel] [PATCH 7/9] linux-user: Enable NPTL for i386
The i386 target is now able to properly handle NPTL. Enable it. Signed-off-by: Alexander Graf --- configure |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/configure b/configure index 0e0adde..61f2770 100755 --- a/configure +++ b/configure @@ -4156,6 +4156,7 @@ TARGET_ABI_DIR="" case "$target_name" in i386) +target_nptl="yes" ;; x86_64) TARGET_BASE_ARCH=i386 -- 1.6.0.2
[Qemu-devel] [PATCH 4/9] linux-user: Clean up sendrecvmsg message parsing
The cmsg handling in the linux-user code is very hard to read as it tries to follow glibc's unreadable code closely. Let's clean it up, converting all helpers into static inline functions, so that QEMU developers have a chance to understand what's going on. While at it, this also allows us to make the next target header lookup more obvious and GUEST_BASE safe. We only compare host mapped pointers between each other now. During the rewrite I also saw that we never advance our target cmsg structure to the next one. This behavior is identical to the previous code, but looks very bogus to me. Signed-off-by: Alexander Graf --- linux-user/syscall.c | 25 --- linux-user/syscall_defs.h | 58 ++-- 2 files changed, 55 insertions(+), 28 deletions(-) diff --git a/linux-user/syscall.c b/linux-user/syscall.c index 89b7698..8eb5c31 100644 --- a/linux-user/syscall.c +++ b/linux-user/syscall.c @@ -1120,6 +1120,7 @@ static inline abi_long target_to_host_cmsg(struct msghdr *msgh, abi_long msg_controllen; abi_ulong target_cmsg_addr; struct target_cmsghdr *target_cmsg; +struct target_cmsghdr *first_target_cmsg; socklen_t space = 0; msg_controllen = tswapal(target_msgh->msg_controllen); @@ -1129,13 +1130,14 @@ static inline abi_long target_to_host_cmsg(struct msghdr *msgh, target_cmsg = lock_user(VERIFY_READ, target_cmsg_addr, msg_controllen, 1); if (!target_cmsg) return -TARGET_EFAULT; +first_target_cmsg = target_cmsg; while (cmsg && target_cmsg) { void *data = CMSG_DATA(cmsg); -void *target_data = TARGET_CMSG_DATA(target_cmsg); +void *target_data = target_cmsg_data(target_cmsg); int len = tswapal(target_cmsg->cmsg_len) - - TARGET_CMSG_ALIGN(sizeof (struct target_cmsghdr)); + - target_cmsg_align(sizeof (struct target_cmsghdr)); space += CMSG_SPACE(len); if (space > msgh->msg_controllen) { @@ -1161,7 +1163,8 @@ static inline abi_long target_to_host_cmsg(struct msghdr *msgh, } cmsg = CMSG_NXTHDR(msgh, cmsg); -target_cmsg = TARGET_CMSG_NXTHDR(target_msgh, target_cmsg); +target_cmsg = target_cmsg_nxthdr(target_msgh, target_cmsg, + first_target_cmsg); } unlock_user(target_cmsg, target_cmsg_addr, 0); the_end: @@ -1176,6 +1179,7 @@ static inline abi_long host_to_target_cmsg(struct target_msghdr *target_msgh, abi_long msg_controllen; abi_ulong target_cmsg_addr; struct target_cmsghdr *target_cmsg; +struct target_cmsghdr *first_target_cmsg; socklen_t space = 0; msg_controllen = tswapal(target_msgh->msg_controllen); @@ -1183,25 +1187,27 @@ static inline abi_long host_to_target_cmsg(struct target_msghdr *target_msgh, goto the_end; target_cmsg_addr = tswapal(target_msgh->msg_control); target_cmsg = lock_user(VERIFY_WRITE, target_cmsg_addr, msg_controllen, 0); -if (!target_cmsg) +if (!target_cmsg) { return -TARGET_EFAULT; +} +first_target_cmsg = target_cmsg; while (cmsg && target_cmsg) { void *data = CMSG_DATA(cmsg); -void *target_data = TARGET_CMSG_DATA(target_cmsg); +void *target_data = target_cmsg_data(target_cmsg); int len = cmsg->cmsg_len - CMSG_ALIGN(sizeof (struct cmsghdr)); -space += TARGET_CMSG_SPACE(len); +space += target_cmsg_space(len); if (space > msg_controllen) { -space -= TARGET_CMSG_SPACE(len); +space -= target_cmsg_space(len); gemu_log("Target cmsg overflow\n"); break; } target_cmsg->cmsg_level = tswap32(cmsg->cmsg_level); target_cmsg->cmsg_type = tswap32(cmsg->cmsg_type); -target_cmsg->cmsg_len = tswapal(TARGET_CMSG_LEN(len)); +target_cmsg->cmsg_len = tswapal(target_cmsg_len(len)); if ((cmsg->cmsg_level == TARGET_SOL_SOCKET) && (cmsg->cmsg_type == SCM_RIGHTS)) { @@ -1228,7 +1234,8 @@ static inline abi_long host_to_target_cmsg(struct target_msghdr *target_msgh, } cmsg = CMSG_NXTHDR(msgh, cmsg); -target_cmsg = TARGET_CMSG_NXTHDR(target_msgh, target_cmsg); +target_cmsg = target_cmsg_nxthdr(target_msgh, target_cmsg, + first_target_cmsg); } unlock_user(target_cmsg, target_cmsg_addr, space); the_end: diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h index 92c01a9..84da7f7 100644 --- a/linux-user/syscall_defs.h +++ b/linux-user/syscall_defs.h @@ -199,26 +199,46 @@ struct target_cmsghdr { int cmsg_type; }; -#define TARGET_CMSG_DATA(cmsg) ((unsigned char *) ((struct target_cmsghdr *) (cmsg) + 1)) -#define TARGET_CMSG_NXTHDR(mhdr, cmsg) __target_cmsg_nxthdr (mhdr, cmsg) -#define TARGET_CMSG_ALIGN(len) (((len) + sizeof
Re: [Qemu-devel] [PATCH 7/9] linux-user: Enable NPTL for i386
On 06.07.2013, at 12:48, Peter Maydell wrote: > On 6 July 2013 01:36, Alexander Graf wrote: >> The i386 target is now able to properly handle NPTL. Enable it. > > This will conflict with the on-list series which reverses > the default for target_nptl in this bit of configure > (though the correction is obviously trivial). It's here for completeness only. It's such a trivial change that I'm sure we can easily redo it once your series is in :). Alex
Re: [Qemu-devel] [PATCH 7/9] linux-user: Enable NPTL for i386
On 6 July 2013 01:36, Alexander Graf wrote: > The i386 target is now able to properly handle NPTL. Enable it. This will conflict with the on-list series which reverses the default for target_nptl in this bit of configure (though the correction is obviously trivial). thanks -- PMM
Re: [Qemu-devel] [PATCH 5/9] linux-user: Fix epoll on ARM hosts
On 06.07.2013, at 12:45, Peter Maydell wrote: > On 6 July 2013 01:36, Alexander Graf wrote: >> diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h >> index 8b06a19..fbc3cac 100644 >> --- a/linux-user/syscall_defs.h >> +++ b/linux-user/syscall_defs.h >> @@ -2434,8 +2434,11 @@ typedef union target_epoll_data { >> >> struct target_epoll_event { >> uint32_t events; >> +#ifdef TARGET_ARM >> +uint32_t __pad; >> +#endif >> target_epoll_data_t data; >> -}; >> +} QEMU_PACKED; >> #endif >> struct target_rlimit64 { >> uint64_t rlim_cur; > > Is ARM really the only arch that needs the pad field? It's the only one I definitely know about. Other targets may add it as they see fit. It shouldn't be more broken than before really, where we just took random host alignment. Alex
Re: [Qemu-devel] [PATCH 4/9] linux-user: Fix sendrecvmsg() with QEMU_GUEST_BASE
On 06.07.2013, at 12:42, Peter Maydell wrote: > On 6 July 2013 01:36, Alexander Graf wrote: >> While looking for cmsg entries, we want to compare guest pointers to see >> whether we're at the end of the passed in array. >> >> However, what we really do is we compare our in-use host pointer with the >> to-be-the-end guest pointer. This comparison is obviously bogus. >> >> Change the comparison to compare guest pointer with guest pointer. >> >> Signed-off-by: Alexander Graf >> --- >> linux-user/syscall_defs.h |2 +- >> 1 files changed, 1 insertions(+), 1 deletions(-) >> >> diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h >> index 92c01a9..8b06a19 100644 >> --- a/linux-user/syscall_defs.h >> +++ b/linux-user/syscall_defs.h >> @@ -214,7 +214,7 @@ __target_cmsg_nxthdr (struct target_msghdr *__mhdr, >> struct target_cmsghdr *__cms >> >> __ptr = (struct target_cmsghdr *)((unsigned char *) __cmsg >> + TARGET_CMSG_ALIGN >> (tswapal(__cmsg->cmsg_len))); >> - if ((unsigned long)((char *)(__ptr+1) - (char >> *)(size_t)tswapal(__mhdr->msg_control)) >> + if ((unsigned long)((char *)(h2g(__ptr+1)) - (char >> *)(size_t)tswapal(__mhdr->msg_control)) >>> tswapal(__mhdr->msg_controllen)) >> /* No more entries. */ >> return (struct target_cmsghdr *)0; > > I don't think this is right. The passed in __cmsg (and thus the > __ptr we calculate) isn't a guest address -- it's the address > we get back from calling lock_user() on a guest address. ... which makes it a host address we want to convert into guest address space, so we can do a guest <-> guest comparison. > That can't be validly compared with anything except another > address derived by arithmetic from the same lock_user() > return value (because if DEBUG_REMAP is defined then the Ah, ok. I didn't know about that debug flag. That might break, yes. > value you get back from lock_user() is the result of calling > malloc()). What we ought to be comparing __ptr+1 against > is not tswapal(__mhdr->msg_control) but the initial value > of target_cmsg returned from lock_user(). Ok :). Alex
Re: [Qemu-devel] [PATCH 5/9] linux-user: Fix epoll on ARM hosts
On 6 July 2013 01:36, Alexander Graf wrote: > diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h > index 8b06a19..fbc3cac 100644 > --- a/linux-user/syscall_defs.h > +++ b/linux-user/syscall_defs.h > @@ -2434,8 +2434,11 @@ typedef union target_epoll_data { > > struct target_epoll_event { > uint32_t events; > +#ifdef TARGET_ARM > +uint32_t __pad; > +#endif > target_epoll_data_t data; > -}; > +} QEMU_PACKED; > #endif > struct target_rlimit64 { > uint64_t rlim_cur; Is ARM really the only arch that needs the pad field? -- PMM
Re: [Qemu-devel] [PATCH 4/9] linux-user: Fix sendrecvmsg() with QEMU_GUEST_BASE
On 6 July 2013 01:36, Alexander Graf wrote: > While looking for cmsg entries, we want to compare guest pointers to see > whether we're at the end of the passed in array. > > However, what we really do is we compare our in-use host pointer with the > to-be-the-end guest pointer. This comparison is obviously bogus. > > Change the comparison to compare guest pointer with guest pointer. > > Signed-off-by: Alexander Graf > --- > linux-user/syscall_defs.h |2 +- > 1 files changed, 1 insertions(+), 1 deletions(-) > > diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h > index 92c01a9..8b06a19 100644 > --- a/linux-user/syscall_defs.h > +++ b/linux-user/syscall_defs.h > @@ -214,7 +214,7 @@ __target_cmsg_nxthdr (struct target_msghdr *__mhdr, > struct target_cmsghdr *__cms > >__ptr = (struct target_cmsghdr *)((unsigned char *) __cmsg > + TARGET_CMSG_ALIGN > (tswapal(__cmsg->cmsg_len))); > - if ((unsigned long)((char *)(__ptr+1) - (char > *)(size_t)tswapal(__mhdr->msg_control)) > + if ((unsigned long)((char *)(h2g(__ptr+1)) - (char > *)(size_t)tswapal(__mhdr->msg_control)) >> tswapal(__mhdr->msg_controllen)) > /* No more entries. */ > return (struct target_cmsghdr *)0; I don't think this is right. The passed in __cmsg (and thus the __ptr we calculate) isn't a guest address -- it's the address we get back from calling lock_user() on a guest address. That can't be validly compared with anything except another address derived by arithmetic from the same lock_user() return value (because if DEBUG_REMAP is defined then the value you get back from lock_user() is the result of calling malloc()). What we ought to be comparing __ptr+1 against is not tswapal(__mhdr->msg_control) but the initial value of target_cmsg returned from lock_user(). thanks -- PMM
Re: [Qemu-devel] [PATCH 3/9] linux-user: Don't reset a new thread's CPU
On 6 July 2013 01:36, Alexander Graf wrote: > When we create a new thread, there is no reason to reset it. I'm fairly sure > the code has mostly been left in there because nobody understood why it was > there in the first place. We had a big discussion on this on IRC. This code is here because of commit b4558d748, which attempted to fix a breakage introduced when commit b55a37c9 made these CPUs no longer do a reset as part of their cpu_init(). However, it's in the wrong place -- this reset needs to go into cpu_copy(), so it doesn't undo the copying work done by that function. > Remove the reset. A new thread's kernel sided state should be identical to > the old one's. Just deleting the reset isn't right. We should standardize whether we think cpu_init() ought to give you a cleanly reset CPU or not (I think it should); until then, we should put a cpu_reset() immediately after cpu_init() in cpu_copy(). It doesn't need to be ifdef-guarded either. thanks -- PMM
Re: [Qemu-devel] [PATCH 1/9] linux-user: fix segmentation fault passing with h2g(x) != x
On 6 July 2013 01:36, Alexander Graf wrote: > When forwarding a segmentation fault into the guest process, we were passing > the host's address directly into the guest process's signal descriptor. > > That obviously confused the guest process, since it didn't know what to make > of the (usually 32-bit truncated) address. Passing in h2g(address) makes the > guest process a lot happier. Commit message says it uses h2g, code doesn't. Maybe we should have an h2g_unchecked() for this sort of use? thanks -- PMM
Re: [Qemu-devel] [PATCH 2/9] linux-user: Add is_write segfault check for ARM hosts
On 06.07.2013, at 12:24, Peter Maydell wrote: > On 6 July 2013 01:36, Alexander Graf wrote: >> When we get a segmentation fault we check whether the fault was a write. If >> it was a write, it might be a fault because we tried to modify a code region. >> >> This logic does not work on ARM hosts, because they don't evaluate whether a >> segementation fault is due to a write. Instead they always declare it a read. >> >> So self modifying code fails with a segmentation fault whenever it tries to >> modify itself. >> >> Add the is_write evaluation based on what the kernel tells us as fault >> reason. >> >> Signed-off-by: Alexander Graf > > We've already got a patch for this on list : > http://patchwork.ozlabs.org/patch/248590/ Ah, seems like we wrote both patches at about the same time. I prefer yours though - it has a nice comment going with it :). Alex
Re: [Qemu-devel] [PATCH] user-exec.c: Set is_write correctly in the ARM cpu_signal_handler()
On 04.06.2013, at 15:31, Peter Maydell wrote: > In the ARM implementation of cpu_signal_handler(), set is_write > correctly using the FSR value which the kernel passes us in the > error_code field of uc_mcontext. Since the WnR bit of the FSR was > only introduced in ARMv6, this means that v5 cores will continue > to behave as before this patch, but they are not really supported > as hosts for linux-user mode anyway since they do not have the > modern behaviour for unaligned accesses. > > Signed-off-by: Peter Maydell Acked-by: Alexander Graf Alex > --- > Without this linux-user won't work very well. In particular after > fork() bash will segfault, with this in the QEMU_STRACE output > immediately preceding: > sigreturn(18,4390912,1082130608,0,0,0) = -1 errno=255 (Unknown error 255) > at least for PPC and MIPSEL guests. > > user-exec.c |8 ++-- > 1 file changed, 6 insertions(+), 2 deletions(-) > > diff --git a/user-exec.c b/user-exec.c > index 71bd6c5..336ac70 100644 > --- a/user-exec.c > +++ b/user-exec.c > @@ -20,6 +20,7 @@ > #include "cpu.h" > #include "disas/disas.h" > #include "tcg.h" > +#include "qemu/bitops.h" > > #undef EAX > #undef ECX > @@ -441,8 +442,11 @@ int cpu_signal_handler(int host_signum, void *pinfo, > #else > pc = uc->uc_mcontext.arm_pc; > #endif > -/* XXX: compute is_write */ > -is_write = 0; > + > +/* error_code is the FSR value, in which bit 11 is WnR (assuming a v6 or > + * later processor; on v5 we will always report this as a read). > + */ > +is_write = extract32(uc->uc_mcontext.error_code, 11, 1); > return handle_cpu_signal(pc, (unsigned long)info->si_addr, > is_write, > &uc->uc_sigmask, puc); > -- > 1.7.9.5 > >
Re: [Qemu-devel] [PATCH 2/9] linux-user: Add is_write segfault check for ARM hosts
On 6 July 2013 01:36, Alexander Graf wrote: > When we get a segmentation fault we check whether the fault was a write. If > it was a write, it might be a fault because we tried to modify a code region. > > This logic does not work on ARM hosts, because they don't evaluate whether a > segementation fault is due to a write. Instead they always declare it a read. > > So self modifying code fails with a segmentation fault whenever it tries to > modify itself. > > Add the is_write evaluation based on what the kernel tells us as fault reason. > > Signed-off-by: Alexander Graf We've already got a patch for this on list : http://patchwork.ozlabs.org/patch/248590/ thanks -- PMM
[Qemu-devel] Qemu-IO : Indentify Guest File access Types
Hi, Is it possible from qemu userspace to identify what type of file access ( DIRECT or SYNC or ASYNC )the guest is performing on its files also along with their file-names if possible. I tried to trace bdrv events but I am not able to correctly identify. Regards Sen