Re: [Qemu-devel] A Question
On 5 May 2011 00:16, Rob Landley r...@landley.net wrote:
> I note that I have a half-dozen prebuilt system images at
> http://landley.net/aboriginal/downloads/binaries and the build scripts
> and such are in the directories above that.

I'm afraid I don't entirely understand your file naming system there -- it seems to say which architecture the system images are for but not what board?

Perhaps we should have a wiki page with links to useful third-party system images? I also know of Aurelien's images at http://people.debian.org/~aurel32/qemu/ and no doubt there are others.

-- PMM
[Qemu-devel] [Bug 723871] Re: qemu-kvm-0.14.0 Aborts with -vga qxl
What is ITP?

--
You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/723871

Title:
  qemu-kvm-0.14.0 Aborts with -vga qxl

Status in QEMU: Confirmed
Status in “libvirt” package in Ubuntu: Triaged
Status in “qemu-kvm” package in Ubuntu: Fix Released

Bug description:
  Host CPU is Core i7 Q820. KVM is from 2.6.35-gentoo-r5 kernel (x86_64). Host has spice-0.7.2 and spice-protocol-0.7.0. Guest is Windows XP SP3 with qxl driver 0.6.1, virtio-serial 1.1.6 and vdagent 0.6.3.

  qemu-kvm is started like so:

  qemu-system-x86_64 -cpu host -enable-kvm -pidfile /home/rick/qemu/hds/wxp.pid -drive file=/home/rick/qemu/hds/wxp.raw,if=virtio,media=disk,aio=native,snapshot=on -m 768 -name WinXP -net nic,model=virtio -net user -localtime -usb -vga qxl -device virtio-serial -chardev spicevmc,name=vdagent,id=vdagent -device virtserialport,chardev=vdagent,name=com.redhat.spice.0 -spice port=1234,disable-ticketing -monitor stdio

  and crashes with:

  qemu-system-x86_64: /home/rick/qemu/src/qemu-kvm-0.14.0/qemu-kvm.c:1724: kvm_mutex_unlock: Assertion `!cpu_single_env' failed.
  Aborted

  If I use -no-kvm, it works fine. If I use -vga std, it works fine. -enable-kvm and -vga qxl crashes.
[Qemu-devel] [PATCH] s390x: fix smp support for kvm
Currently smp support for kvm does not work. Qemu does a kvm run even on secondary CPUs which don't have a sane state (initial psw == 0), triggering some program faults. Architecturally these cpus are in the stopped state, so we should not do the kvm run ioctl. (These CPUs will be started by a SIGP restart later during the boot process.)

We need to tell the loop that this cpu should not run. Jan Kiszka pointed out that kvm_arch_process_async_events is the right place to do this.

Signed-off-by: Christian Borntraeger borntrae...@de.ibm.com

--- a/target-s390x/kvm.c
+++ b/target-s390x/kvm.c
@@ -172,7 +172,7 @@ void kvm_arch_post_run(CPUState *env, struct kvm_run *run)
 
 int kvm_arch_process_async_events(CPUState *env)
 {
-    return 0;
+    return env->halted;
 }
 
 void kvm_s390_interrupt_internal(CPUState *env, int type, uint32_t parm,
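The effect of the one-line change can be illustrated outside QEMU. The sketch below is not QEMU code: VCpu, vcpu_run_once() and sigp_restart() are hypothetical stand-ins, and it only assumes that a non-zero result from the async-events hook makes the loop skip the run for that CPU.

```c
#include <stdbool.h>

/* Hypothetical, stripped-down stand-in for QEMU's CPUState. */
typedef struct VCpu {
    int index;
    bool halted;    /* true while the CPU is architecturally stopped */
} VCpu;

/* Mirrors the idea of the patch: a non-zero result means "skip the run". */
static int process_async_events(const VCpu *cpu)
{
    return cpu->halted;
}

/* Returns 1 if the guest would have been entered, 0 if the CPU was skipped. */
static int vcpu_run_once(const VCpu *cpu)
{
    if (process_async_events(cpu)) {
        return 0;   /* stopped CPU: do not enter the guest */
    }
    /* The real code would issue the KVM run ioctl at this point. */
    return 1;
}

/* A SIGP restart is modelled here as simply clearing the halted flag. */
static void sigp_restart(VCpu *cpu)
{
    cpu->halted = false;
}
```

With this shape, a secondary CPU that starts out halted is never run until something like sigp_restart() clears the flag, which is the behaviour the commit message describes.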
Re: [Qemu-devel] [PATCH] qemu-kvm: Add CPUID support for VIA CPU
Hi, the subject's tag (qemu-kvm) is misleading. This is actually targeting the uq/master patch queue, i.e. the upstream kvm staging area. On 2011-05-05 05:03, brill...@viatech.com.cn wrote: When KVM is running on VIA CPU with host cpu's model, the feautures of VIA CPU will be passed into kvm guest by calling the CPUID instruction for Centaur. Signed-off-by: BrillyWubrill...@viatech.com.cn Signed-off-by: KaryJinkary...@viatech.com.cn --- target-i386/cpu.h |7 +++ target-i386/cpuid.c | 48 +++- You patch is unfortunately line-wrapped. target-i386/kvm.c | 15 +++ 3 files changed, 69 insertions(+), 1 deletion(-) --- a/target-i386/cpu.h 2011-05-05 09:01:11.742328398 +0800 +++ b/target-i386/cpu.h 2011-05-05 10:47:32.112329696 +0800 @@ -441,6 +441,10 @@ #define CPUID_VENDOR_AMD_2 0x69746e65 /* enti */ #define CPUID_VENDOR_AMD_3 0x444d4163 /* cAMD */ +#define CPUID_VENDOR_VIA_1 0x746e6543 /* Cent */ +#define CPUID_VENDOR_VIA_2 0x48727561 /* aurH */ +#define CPUID_VENDOR_VIA_3 0x736c7561 /* auls */ + #define CPUID_MWAIT_IBE (1 1) /* Interrupts can exit capability */ #define CPUID_MWAIT_EMX (1 0) /* enumeration supported */ @@ -721,6 +725,9 @@ typedef struct CPUX86State { uint32_t cpuid_ext3_features; uint32_t cpuid_apic_id; int cpuid_vendor_override; +/*Store the results of Centaur's CPUID instructions*/ Please format comments like this /* comment text */, ie. with blanks after/before the /* / */. 
+uint32_t cpuid_xlevel2; +uint32_t cpuid_ext4_features; /* MTRRs */ uint64_t mtrr_fixed[11]; --- a/target-i386/cpuid.c 2011-05-05 09:01:05.352331142 +0800 +++ b/target-i386/cpuid.c 2011-05-05 10:47:41.102330705 +0800 @@ -230,6 +230,9 @@ typedef struct x86_def_t { char model_id[48]; int vendor_override; uint32_t flags; +/*Store the results of Centaur's CPUID instructions*/ +uint32_t ext4_features; +uint32_t xlevel2; } x86_def_t; #define I486_FEATURES (CPUID_FP87 | CPUID_VME | CPUID_PSE) @@ -522,6 +525,17 @@ static int cpu_x86_fill_host(x86_def_t * cpu_x86_fill_model_id(x86_cpu_def-model_id); x86_cpu_def-vendor_override = 0; +/* Call Centaur's CPUID instruction. */ +if (x86_cpu_def-vendor1 == CPUID_VENDOR_VIA_1 + x86_cpu_def-vendor2 == CPUID_VENDOR_VIA_2 + x86_cpu_def-vendor3 == CPUID_VENDOR_VIA_3) { + host_cpuid(0xC000, 0, eax, ebx, ecx, edx); + if (eax = 0xC001) { + x86_cpu_def-xlevel2 = eax; /*support VIA max extended level*/ + host_cpuid(0xC001, 0, eax, ebx, ecx, edx); + x86_cpu_def-ext4_features = edx; + } +} /* * Every SVM feature requires emulation support in KVM - so we can't just @@ -855,6 +869,8 @@ int cpu_x86_register (CPUX86State *env, env-cpuid_xlevel = def-xlevel; env-cpuid_kvm_features = def-kvm_features; env-cpuid_svm_features = def-svm_features; +env-cpuid_ext4_features = def-ext4_features; +env-cpuid_xlevel2 = def-xlevel2; if (!kvm_enabled()) { env-cpuid_features = TCG_FEATURES; env-cpuid_ext_features = TCG_EXT_FEATURES; @@ -1034,7 +1050,15 @@ void cpu_x86_cpuid(CPUX86State *env, uin uint32_t *ecx, uint32_t *edx) { /* test if maximum index reached */ -if (index 0x8000) { +if ((index 0xC000) == 0xC000) { + /* Handle the Centaur's CPUID instruction.* + * If cpuid_xlevel2 is 0, then put into the* + * default case. */ + if (env-cpuid_xlevel2 == 0) + index = 0xF000; + else if (index env-cpuid_xlevel2) + index = env-cpuid_xlevel2; Please validate your patch before posting with scripts/checkpatch.pl. 
+} else if (index 0x8000) { if (index env-cpuid_xlevel) index = env-cpuid_level; } else { @@ -1256,6 +1280,28 @@ void cpu_x86_cpuid(CPUX86State *env, uin *edx = 0; } break; +case 0xC000: + *eax = env-cpuid_xlevel2; + *ebx = 0; + *ecx = 0; + *edx = 0; + break; +case 0xC001: + /* Support for VIA CPU's CPUID instruction */ + *eax = env-cpuid_version; + *ebx = 0; + *ecx = 0; + *edx = env-cpuid_ext4_features; + break; +case 0xC002: +case 0xC003: +case 0xC004: + /*Reserved for the future, and now filled with zero*/ + *eax = 0; + *ebx = 0; + *ecx = 0; + *edx = 0; + break; default: /* reserved values: zero */ *eax = 0; --- a/target-i386/kvm.c 2011-05-05 09:01:17.182326246 +0800 +++ b/target-i386/kvm.c 2011-05-05 10:47:48.312331989 +0800 @@ -496,6 +496,21 @@ int kvm_arch_init_vcpu(CPUState *env) cpu_x86_cpuid(env, i, 0, c-eax, c-ebx, c-ecx,
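For readers following the vendor check and the leaf clamping in the quoted patch, here is a small standalone sketch. It assumes the Centaur leaves live at 0xC0000000 and above and that the "no Centaur leaves" case is redirected to 0xF0000000 (the archived text appears to have lost digits from these constants, so both values are assumptions). vendor_string() and clamp_centaur_leaf() are illustrative names, not functions from the patch.

```c
#include <stdint.h>
#include <string.h>

#define CPUID_VENDOR_VIA_1 0x746e6543u /* "Cent" */
#define CPUID_VENDOR_VIA_2 0x48727561u /* "aurH" */
#define CPUID_VENDOR_VIA_3 0x736c7561u /* "auls" */

/* Builds the 12-character vendor string from EBX, EDX, ECX (in that order). */
static void vendor_string(uint32_t ebx, uint32_t edx, uint32_t ecx, char out[13])
{
    const uint32_t regs[3] = { ebx, edx, ecx };
    for (int i = 0; i < 3; i++) {
        for (int b = 0; b < 4; b++) {
            /* Each register holds four ASCII bytes, least significant first. */
            out[i * 4 + b] = (char)((regs[i] >> (8 * b)) & 0xff);
        }
    }
    out[12] = '\0';
}

/*
 * Clamp a requested leaf in the (assumed) Centaur range to the highest
 * supported one. xlevel2 == 0 means the CPU model has no Centaur leaves;
 * the index is then moved to an unhandled value so it hits the default case.
 */
static uint32_t clamp_centaur_leaf(uint32_t index, uint32_t xlevel2)
{
    if ((index & 0xC0000000u) != 0xC0000000u) {
        return index;           /* not a Centaur leaf, leave untouched */
    }
    if (xlevel2 == 0) {
        return 0xF0000000u;     /* assumed "fall through to default" value */
    }
    return index > xlevel2 ? xlevel2 : index;
}
```

Decoding the three constants this way yields "CentaurHauls", which is why the patch compares vendor1, vendor2 and vendor3 against them before issuing the extra CPUID calls.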
Re: [Qemu-devel] [PATCH v3 5/5] hpet 'driftfix': add code in hpet_timer() to compensate delayed callbacks and coalesced interrupts
Hi Marcelo,

> Other than that, shouldn't reset accounting variables to init state on
> write to GLOBAL_ENABLE_CFG / writes to main counter?

I'd suggest initializing/resetting the driftfix-related fields in the 'HPETTimer' structure (including the backlog of unaccounted ticks) in the following situations.

- When the guest o/s sets the 'CFG_ENABLE' bit (overall enable) in the General Configuration Register.

  This is the place in hpet_ram_writel():

      case HPET_CFG:
          ...
          if (activating_bit(old_val, new_val, HPET_CFG_ENABLE)) {
              ...
              for (i = 0; i < s->num_timers; i++) {
                  if ((&s->timer[i])->cmp != ~0ULL) {
                      // initialize / reset fields here
                      hpet_set_timer(&s->timer[i]);

- When the guest o/s sets the 'TN_ENABLE' bit (timer N interrupt enable) in the Timer N Configuration and Capabilities Register.

  This is the place in hpet_ram_writel():

      case HPET_TN_CFG:
          ...
          if (activating_bit(old_val, new_val, HPET_TN_ENABLE)) {
              // initialize / reset fields here
              hpet_set_timer(timer);

This should cover cases such as ...

- first time initialization of HPET timers during guest o/s boot (includes kexec and reboot).

- if a guest o/s stops and restarts the timer (for example, to change the comparator register value or to switch a timer between periodic mode and non-periodic mode).

Regards,
Uli
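A minimal sketch of the reset-on-activation idea, for illustration only. The DriftfixState fields and on_config_write() are hypothetical names (the real patch keeps its fields in HPETTimer), and the bit values are assumed from the HPET register layout: overall enable is bit 0 of the general configuration register and the per-timer interrupt enable is bit 2 of the timer configuration register.

```c
#include <stdbool.h>
#include <stdint.h>

#define HPET_CFG_ENABLE 0x001   /* assumed: bit 0, overall enable */
#define HPET_TN_ENABLE  0x004   /* assumed: bit 2, timer N interrupt enable */

/* Hypothetical driftfix bookkeeping; the field names are illustrative only. */
typedef struct DriftfixState {
    uint64_t ticks_backlog;     /* unaccounted ticks still to be injected */
    uint32_t irq_coalesced;     /* interrupts reported as coalesced */
} DriftfixState;

/* True only on a 0 -> 1 transition of the masked bit. */
static bool activating_bit(uint64_t old_val, uint64_t new_val, uint64_t mask)
{
    return !(old_val & mask) && (new_val & mask);
}

static void driftfix_reset(DriftfixState *d)
{
    d->ticks_backlog = 0;
    d->irq_coalesced = 0;
}

/* Called on a register write; returns true if the state was reset. */
static bool on_config_write(DriftfixState *d, uint64_t old_val,
                            uint64_t new_val, uint64_t enable_mask)
{
    if (activating_bit(old_val, new_val, enable_mask)) {
        driftfix_reset(d);
        return true;
    }
    return false;
}
```

Because the reset is tied to the 0 -> 1 transition, rewriting a register that already has the enable bit set leaves the backlog alone, while a stop/restart sequence by the guest clears it.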
[Qemu-devel] Error handling while loading ROM
Hi all,

I'm new to the list, so hopefully I'm not retracing old ground (I did try to search the archives, but maybe I missed something).

The problem I have is that when using the Stellaris ARMv7M target, if I load an ELF file as my kernel and some of the ELF segments are outside the range of memory, then things fail silently rather than raising any error.

I think it would be very useful if there was some indication of this being an error, rather than failing silently. I'm not sure if this current behaviour is by design or simply an oversight. Or just as likely there is some bit of the design that I don't fully grok right now.

The problem code is in the cpu_physical_write_rom function:

    if ((pd & ~TARGET_PAGE_MASK) != IO_MEM_RAM &&
        (pd & ~TARGET_PAGE_MASK) != IO_MEM_ROM &&
        !(pd & IO_MEM_ROMD)) {
        /* do nothing */
    } else {
        unsigned long addr1;
        addr1 = (pd & TARGET_PAGE_MASK) + (addr & ~TARGET_PAGE_MASK);
        /* ROM/RAM case */
        ptr = qemu_get_ram_ptr(addr1);
        memcpy(ptr, buf, l);
    }

The 'do nothing' case is where I think it would be useful for some warning to be made, or better yet, some error to be raised. Or possibly this function could return an error to the caller (in this case rom_reset), and the caller could then decide if printing an error is reasonable or not.

I'd appreciate any guidance on the best way to add some useful diagnostics for this case.

Thanks,

Benno
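One way the "return an error to the caller" option could look, as a rough standalone sketch. Nothing here is QEMU's real data model: Region and write_rom_checked() are hypothetical, and for simplicity the sketch requires a write to fit entirely inside a single RAM/ROM region.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical memory map entry; not QEMU's real data structure. */
typedef struct Region {
    uint32_t base;
    uint32_t size;
    uint8_t *host;   /* backing storage; NULL if the range is not RAM or ROM */
} Region;

/*
 * Copies len bytes to guest address addr.
 * Returns 0 on success, -1 if the range is not fully backed by one
 * RAM/ROM region (the case that is currently a silent "do nothing").
 */
static int write_rom_checked(const Region *map, size_t n, uint32_t addr,
                             const uint8_t *buf, uint32_t len)
{
    for (size_t i = 0; i < n; i++) {
        const Region *r = &map[i];
        if (r->host && addr >= r->base && len <= r->size &&
            addr - r->base <= r->size - len) {
            memcpy(r->host + (addr - r->base), buf, len);
            return 0;
        }
    }
    return -1;
}
```

A caller in the position of rom_reset could then check the result and decide whether to print a warning naming the segment that fell outside memory.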
[Qemu-devel] Allow ARMv7M to be started without a kernel
Hi all,

For some current software development I'm doing, I've found it easiest to use Qemu in the following manner:

    qemu-system-arm -M lm3s811evb -s -S
    arm-eabi-gdb

From GDB I then load any code I want to debug and test, and run it.

For this to work, however, I needed to make a small change to armv7m.c to enable starting the simulator without specifying a -kernel option.

Is there any reason why such a change is a bad idea? (If not, I'll submit an appropriate patch.)

FWIW, the reason why I'm not using -kernel is that the current way the armv7m code works, it expects the provided kernel to be a full flash image including appropriate vector table, whereas right now I just want to debug some stand-alone code, not the full system, which the above gdb approach works perfectly for.

Cheers,

Benno
Re: [Qemu-devel] [RFC 19/28] target-xtensa: implement RST2 group (32 bit mul/div/rem)
>>      case 2: /*RST2*/
>> -        TBD();
>> +        if (_OP2 >= 12) {
>> +            HAS_OPTION(XTENSA_OPTION_32_BIT_IDIV);
>> +            int label = gen_new_label();
>> +            tcg_gen_brcondi_i32(TCG_COND_NE, cpu_R[RRR_T], 0, label);
>> +            gen_exception_cause(dc, INTEGER_DIVIE_BY_ZERO_CAUSE);
>
> DIVIE?

Oops (: Looks like I overuse ^n. Thanks.

-- Max
Re: [Qemu-devel] [RFC 12/28] target-xtensa: implement shifts (ST1 and RST1 groups)
>> To track immediate values written to SAR? You mean that there may be some
>> performance difference of fixed size shift vs indirect shift and TCG is
>> able to tell them apart?
>
> Well, not really fixed vs indirect, but if you know that the value in
> the SAR register is in the right range, you can avoid using a 64-bit shift.
>
> For instance,
>
>     SSL ar2
>     SLL ar0, ar1
>
> could be implemented with
>
>     tcg_gen_sll_i32(ar0, ar1, ar2);
>
> assuming we have enough context. Let us decompose the SAR register into
> two parts, storing both the true value, and 32-value.
>
>     struct DisasContext {
>         // Current Stuff
>         // ...
>
>         // When valid, holds 32-SAR.
>         TCGv sar_m32;
>         bool sar_m32_alloc;
>         bool sar_m32_valid;
>         bool sar_5bit;
>     };
>
> At the beginning of the TB:
>
>     TCGV_UNUSED_I32(dc->sar_m32);
>     dc->sar_m32_alloc = false;
>     dc->sar_m32_valid = false;
>     dc->sar_5bit = false;
>
>     static void gen_set_sra_m32(DisasContext *dc, TCGv val)
>     {
>         if (!dc->sar_m32_alloc) {
>             dc->sar_m32_alloc = true;
>             dc->sar_m32 = tcg_temp_local_new_i32();
>         }
>         dc->sar_m32_valid = true;
>
>         /* Clear 5 bit because the SAR value could be 32. */
>         dc->sar_5bit = false;
>
>         tcg_gen_movi_i32(cpu_SR[SAR], 32);
>         tcg_gen_sub_i32(cpu_SR[SAR], cpu_SR[SAR], val);
>         tcg_gen_mov_i32(dc->sar_m32, val);
>     }
>
>     static void gen_set_sra(DisasContext *dc, TCGv val, bool is_5bit)
>     {
>         if (dc->sar_m32_alloc && dc->sar_m32_valid) {
>             tcg_gen_discard_i32(dc->sar_m32);
>         }
>         dc->sar_m32_valid = false;
>         dc->sar_5bit = is_5bit;
>         tcg_gen_mov_i32(cpu_SR[SAR], val);
>     }
>
>     /* SSL */
>         tcg_gen_andi_i32(tmp, cpu_R[AS], 31);
>         gen_set_sra_m32(dc, tmp);
>         break;
>
>     /* SSR */
>         tcg_gen_andi_i32(tmp, cpu_R[AS], 31);
>         gen_set_sra(dc, tmp, true);
>         break;
>
>     /* WSR.SAR */
>         tcg_gen_andi_i32(tmp, cpu_R[AS], 63);
>         gen_set_sra(dc, tmp, false);
>         break;
>
>     /* SSAI */
>         tcg_gen_movi_i32(tmp, constant);
>         gen_set_sra(dc, tmp, true);
>         break;
>
>     /* SLL */
>         if (dc->sar_m32_valid) {
>             tcg_gen_sll_i32(cpu_R[AR], cpu_R[AS], dc->sar_m32);
>         } else {
>             /* your existing 64-bit shift emulation. */
>         }
>         break;
>
>     /* SRL */
>         if (dc->sar_5bit) {
>             tcg_gen_srl_i32(cpu_R[AR], cpu_R[AS], cpu_SR[SAR]);
>         } else {
>             /* your existing 64-bit shift emulation. */
>         }
>
> A couple of points: The use of the local temp avoids problems with
> intervening insns that might generate branch opcodes. For the simplest
> cases, as with the case at the start of the message, we ought to be
> able to propagate the values into the TCG shift insn directly.
>
> Does that make sense?

Yes it does. Thanks for the good explanation.

I tried to keep it all as simple as possible to have a working prototype quickly. Now that it works, optimizations should be no problem.

Thanks.

-- Max
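The equivalence the optimization relies on can be checked in plain C. This is only a sketch of the arithmetic, not translator code: ssl(), sll_64bit() and sll_32bit() are illustrative names, and it assumes SSL stores 32 - (as & 31) in SAR and SLL takes the low 32 bits of the 64-bit value (as:0) shifted right by SAR.

```c
#include <stdint.h>

/* SSL: SAR = 32 - (as & 31), so SAR lies in 1..32. */
static uint32_t ssl(uint32_t as)
{
    return 32 - (as & 31);
}

/* SLL via the generic 64-bit funnel shift: low 32 bits of ((v:0) >> SAR). */
static uint32_t sll_64bit(uint32_t v, uint32_t sar)
{
    return (uint32_t)((((uint64_t)v) << 32) >> sar);
}

/* SLL when 32 - SAR (a value in 0..31) is still known: a plain 32-bit shift. */
static uint32_t sll_32bit(uint32_t v, uint32_t sar_m32)
{
    return v << sar_m32;
}
```

For every shift amount n in 0..31, sll_64bit(v, ssl(n)) and sll_32bit(v, n) give the same result, which is why tracking the "32 - SAR" value lets the 64-bit emulation be skipped.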
[Qemu-devel] [RFC] darwin: work around sigfd
When running qemu-system on Darwin, the vcpu processes guest code, but I don't get to see anything on the cocoa screen. When running a guest with -nographic, time stands still for the guest:

[0.00] Detected 2659.508 MHz processor.
[0.000756] Calibrating delay loop (skipped), value calculated using timer frequency.. 5319.01 BogoMIPS (lpj=2659508)
[0.000999] pid_max: default: 32768 minimum: 301
[0.000999] Security Framework initialized
[0.000999] AppArmor: AppArmor initialized
[...]
[0.000999] Kernel panic - not syncing: IO-APIC + timer doesn't work! Boot with apic=debug and send a report. Then try booting with the 'noapic' option.
[0.000999]
[...]
[0.000999] [81b3ec92] kernel_init+0x8f/0x206
[0.000999] [81003d74] kernel_thread_helper+0x4/0x10

This patch makes qemu-system work again on Darwin, but is obviously just a hack. I'd really like to see some more clever people find out what exactly is going wrong to find a real solution!

Reported-by: Andreas Färber andreas.faer...@web.de
(no signed-off-by on purpose - it's an RFC!)
---
 cpus.c |    9 +++++++++
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/cpus.c b/cpus.c
index 1fc34b7..ef604bf 100644
--- a/cpus.c
+++ b/cpus.c
@@ -388,6 +388,15 @@ static int qemu_signal_init(void)
     int sigfd;
     sigset_t set;
 
+#ifdef CONFIG_DARWIN
+    /* Darwin breaks for me with sigfd. I don't know why, but it just sits
+       there hanging. The vcpu does process things, so that one's good, but
+       there is no output. Doing the same as win32 works for me. */
+    if (1) {
+        return 0;
+    }
+#endif
+
 #ifdef CONFIG_IOTHREAD
     /* SIGUSR2 used by posix-aio-compat.c */
     sigemptyset(&set);
-- 
1.7.1
Re: [Qemu-devel] virtio-scsi spec, first public draft
On Thu, May 05, 2011 at 12:28:31AM +0200, Paolo Bonzini wrote: Virtio SCSI Controller Device Spec == The virtio controller device groups together one or more simple virtual devices (ie. disk), and allows communicating to these devices using the SCSI protocol. A controller device represents a SCSI host with many targets attached. The virtio controller services two kinds of requests: - command requests for a logical unit; - task management functions related to a logical unit, target or command. The controller is also able to send out notifications about added and removed devices. v4: First public version Configuration - Subsystem Device ID TBD Virtqueues 0..n-1:one requestq per target n:control transmitq n+1:control receiveq 1 requestq per target makes it harder to support large numbers or dynamic targets. You mention detaching targets so is there a way to add a target? The following would be simpler: 0:requestq 1:control transmitq 2:control receiveq Requests must include a target port identifier/name so that they can be delivered to the correct target. Adding or removing targets is easy with a single requestq since the virtqueues don't change. Feature bits VIRTIO_SCSI_F_INOUT - Whether a single request can include both read-only and write-only data buffers. Why make this an optional feature? Device configuration layout struct virtio_scsi_config { u32 num_targets; } num_targets is the number of targets, and the id of the virtqueue used for the control receiveq. Device initialization - The initialization routine should first of all discover the controller's control virtqueues. The driver should then place at least a buffer in the control receiveq. Buffers returned by the device on the control receiveq may be referred to as events in the rest of the document. The driver can immediately issue requests (for example, INQUIRY or REPORT LUNS) or task management functions (for example, I_T RESET). 
Device operation: request queue --- The driver queues requests to the virtqueue, and they are used by the device (not necessarily in order). Requests have the following format: struct virtio_scsi_req { u32 type; ... u8 response; } #define VIRTIO_SCSI_T_BARRIER 0x8000 The type identifies the remaining fields. The value VIRTIO_SCSI_T_BARRIER can be ORed in the type as well. This bit indicates that this request acts as a barrier and that all preceding requests must be complete before this one, and all following requests must not be started until this is complete. Note that a barrier does not flush caches in the underlying backend device in host, and thus does not serve as data consistency guarantee. The driver must send a SYNCHRONIZE CACHE command to flush the host cache. Why are these barrier semantics needed? Valid response values are defined separately for each command. - Task management function #define VIRTIO_SCSI_T_TMF 0 #define VIRTIO_SCSI_T_TMF_ABORT_TASK_SET 1 #define VIRTIO_SCSI_T_TMF_I_T_NEXUS_RESET 4 #define VIRTIO_SCSI_T_TMF_LOGICAL_UNIT_RESET 5 #define VIRTIO_SCSI_T_TMF_QUERY_TASK_SET 7 #define VIRTIO_SCSI_T_TMF_LOGICAL_UNIT_DETACH (1 24) struct virtio_scsi_req_tmf { u32 subtype; u8 lun[8]; u8 additional[]; u8 response; } /* command-specific response values */ #define VIRTIO_SCSI_S_FUNCTION_COMPLETE0 #define VIRTIO_SCSI_S_NO_TARGET1 #define VIRTIO_SCSI_S_TARGET_FAILURE 2 #define VIRTIO_SCSI_S_FUNCTION_SUCCEEDED 3 #define VIRTIO_SCSI_S_FUNCTION_REJECTED4 #define VIRTIO_SCSI_S_INCORRECT_LUN5 The type is VIRTIO_SCSI_T_LUN_INFO, possibly with the VIRTIO_SCSI_T_BARRIER bit ORed in. Did you mean type is VIRTIO_SCSI_T_TMF? The subtype and lun field are filled in by the driver, the additional and response field is filled in by the device. Unknown LUNs are ignored; also, the lun field is ignored for the I_T NEXUS RESET command. 
In/out buffers must be separate in virtio so I think it makes sense to split apart a struct virtio_scsi_tmf_req and struct virtio_scsi_tmf_resp. Task management functions accepting an I_T_L_Q nexus (ABORT TASK, QUERY TASK) are only accessible through the control transmitq. Task management functions not in the above list are not accessible in this version of the specification. Future versions may allow access to them through additional features. VIRTIO_SCSI_T_TMF_LOGICAL_UNIT_DETACH asks the device to make the logical unit (and the target
Re: [Qemu-devel] Allow ARMv7M to be started without a kernel
On 5 May 2011 09:23, Ben Leslie be...@benno.id.au wrote:
> FWIW, the reason why I'm not using -kernel is that the current way
> the armv7m code works, it expects the provided kernel to be a full
> flash image including appropriate vector table, whereas right now
> I just want to debug some stand-alone code, not the full system, which
> the above gdb approach works perfectly for.

It would probably be better for the -kernel option to honour the entry point in the ELF file rather than insisting on full reset (and to try to load the reset SP from the vector table but not insist on that working). That is, we should support both "load this ELF image which is a full system image with a vector table" and "load this ELF image which is just a bare-metal (possibly semihosting) application".

The combination of v7M reset with image loading and the possibility of a debugger altering the pc/sp while the core is in reset is a bit complicated, though :-)

As an aside: I think QEMU should have an option which is just "load a plain ELF or raw binary, with no funny Linux-kernel-specific behaviour" rather than overloading -kernel to mean "if it's a raw image it's Linux and if it's an ELF file it's not".

-- PMM
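A rough sketch of what "honour the ELF entry point, take the SP from the vector table if it is there" might look like. This is not the armv7m.c code: StartState and choose_start() are hypothetical, a zero first vector word is treated as "no SP available" purely as an assumption of this sketch, and the only architectural facts relied on are that word 0 of an ARMv7-M vector table is the initial SP, word 1 is the reset vector, and bit 0 of a branch target selects Thumb state.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type: where the core should start executing. */
typedef struct StartState {
    uint32_t sp;
    uint32_t pc;
    bool have_sp;   /* false if no usable initial SP was found */
} StartState;

/*
 * vectors[0] is the initial SP, vectors[1] the reset vector.
 * is_elf says whether the image carried its own entry point (elf_entry).
 */
static StartState choose_start(const uint32_t vectors[2], bool is_elf,
                               uint32_t elf_entry)
{
    StartState s;

    /* Try to load the reset SP, but do not insist on it being present. */
    s.have_sp = vectors[0] != 0;
    s.sp = vectors[0];

    /* Prefer the ELF entry point; otherwise fall back to the reset vector. */
    uint32_t target = is_elf ? elf_entry : vectors[1];
    s.pc = target & ~1u;    /* bit 0 only selects Thumb state */
    return s;
}
```

A bare-metal ELF without a vector table would then start at its entry point with have_sp false, leaving the stack pointer for the application or the debugger to set up.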
Re: [Qemu-devel] [PATCH 6/6] PPC: Qdev'ify e500 pci
On 04.05.2011, at 21:08, Blue Swirl wrote:

> On Mon, May 2, 2011 at 6:03 PM, Alexander Graf ag...@suse.de wrote:
>> The e500 PCI controller isn't qdev'ified yet. This leads to severe issues
>> when running with -drive.
>>
>> To be able to use a virtio disk with an e500 VM, let's convert the PCI
>> controller over to qdev.
>>
>> Signed-off-by: Alexander Graf ag...@suse.de
>>
>> ---
>>
>> v2 -> v3:
>>
>>  - rebase to current code base
>>  - fix endian issue
>>  - use sysbus helpers
>> ---
>>  hw/ppce500_pci.c |  112 +++---
>>  1 files changed, 73 insertions(+), 39 deletions(-)
>>
>> diff --git a/hw/ppce500_pci.c b/hw/ppce500_pci.c
>> index 83a20e4..88bc759 100644
>> --- a/hw/ppce500_pci.c
>> +++ b/hw/ppce500_pci.c
>> @@ -73,11 +73,11 @@ struct pci_inbound {
>>  };
>>
>>  struct PPCE500PCIState {
>> +    PCIHostState pci_state;
>>      struct pci_outbound pob[PPCE500_PCI_NR_POBS];
>>      struct pci_inbound pib[PPCE500_PCI_NR_PIBS];
>>      uint32_t gasket_time;
>> -    PCIHostState pci_state;
>> -    PCIDevice *pci_dev;
>> +    uint64_t base_addr;
>
> This does not seem to be used. Also devices shouldn't care about their
> base addresses.

Oops - must have missed that one :). Thanks!

Alex
[Qemu-devel] Cavium-Octeon support in QEMU
Hi,

I have sent corrected patches for MIPS64 user mode emulation with Octeon support, but I have received no further review of them. The patches were mailed on 29 April. The subjects of my mails are as follows:

[PATCH 1/3](Corrected version) linux-user:Support for MIPS64 user mode emulation in QEMU
http://lists.gnu.org/archive/html/qemu-devel/2011-04/msg02545.html

[PATCH 2/3] target-mips:Support for Cavium-Octeon specific instructions
http://lists.gnu.org/archive/html/qemu-devel/2011-04/msg02543.html

Please give comments on these patches.
[Qemu-devel] [PATCH V15 00/18] Xen device model support
From: Anthony PERARD anthony.per...@citrix.com Hi all, Here is an update on the series that add the support of a Xen HVM guest to QEMU. change v14-v15: - add a patch to not initialise vmport with Xen. The change v13-v14: - Remove of ram_size parameter from pc_memory_init - set both below/above_4g_mem_size at the same place in the code. Change v12-v13: - There are few changes in the xen init code. A xen_hvm_init function is new in this patch set and is call from xenfv:machine-init. - So -xen-create -M xenpv will continue to work as before this patch series. - There is a new reset handler to set env-halted = 0 on the first vcpu. - One change have been made to pc_memory_init, the calculation of below/above_4g_mem_size have been moved to pc_init1. This is to remove a random if (xen()) return; in pc_memory_init. - xen_map_block is a new function to map RAMBlock that belong to a ROM/RAM of a device. - fix cpu_physical_memory_unmap with mapcache, Because qemu_get_ram_ptr can be called more than one time in cpu_physical_memory_map, qemu_put_ram_ptr need to be called the same amount of time. - Add some trace_* call. This series depends on the series Introduce machine QemuOpts. You can find a git tree here: git://xenbits.xen.org/people/aperard/qemu-dm.git qemu-dm-v15 Anthony PERARD (14): xen: Replace some tab-indents with spaces (clean-up). xen: Make Xen build once. xen: Support new libxc calls from xen unstable. xen: Add initialisation of Xen pc_memory_init: Move memory calculation to the caller. xen: Add xenfv machine pc, Disable vmport initialisation with Xen. piix_pci: Introduces Xen specific call for irq. xen: Introduce Xen Interrupt Controller Introduce qemu_put_ram_ptr configure: Always use 64bits target physical addresses with xen enabled. vl.c: Introduce getter for shutdown_requested and reset_requested. xen: Set running state in xenstore. xen: Add Xen hypercall for sleep state in the cmos_s3 callback. 
Arun Sharma (1): xen: Initialize event channels and io rings John Baboval (2): xen: Adds a cap to the number of map cache entries. pci: Use of qemu_put_ram_ptr in pci_add_option_rom. Jun Nakajima (1): xen: Introduce the Xen mapcache Makefile.target | 14 +- configure| 71 ++- cpu-common.h |1 + exec.c | 86 +++- hw/pc.c | 28 +-- hw/pc.h | 11 +- hw/pc_piix.c | 71 ++- hw/pci.c |2 + hw/piix_pci.c| 47 - hw/xen.h | 41 hw/xen_backend.c | 421 +++ hw/xen_backend.h |6 +- hw/xen_common.h | 106 -- hw/xen_disk.c| 496 ++--- hw/xen_domainbuild.c |3 +- hw/xen_machine_pv.c |1 + hw/xen_nic.c | 265 -- sysemu.h |2 + trace-events | 13 + vl.c | 12 + xen-all.c| 605 ++ xen-mapcache-stub.c | 44 xen-mapcache.c | 375 +++ xen-mapcache.h | 37 +++ xen-stub.c | 41 25 files changed, 2198 insertions(+), 601 deletions(-) create mode 100644 xen-all.c create mode 100644 xen-mapcache-stub.c create mode 100644 xen-mapcache.c create mode 100644 xen-mapcache.h create mode 100644 xen-stub.c -- 1.7.2.5
[Qemu-devel] [PATCH V15 06/18] xen: Add xenfv machine
From: Anthony PERARD anthony.per...@citrix.com Introduce the Xen FV (Fully Virtualized) machine to Qemu, some more Xen specific call will be added in further patches. Signed-off-by: Anthony PERARD anthony.per...@citrix.com --- hw/pc_piix.c | 41 +++-- hw/xen.h |6 ++ xen-all.c| 24 3 files changed, 69 insertions(+), 2 deletions(-) diff --git a/hw/pc_piix.c b/hw/pc_piix.c index 23a6bfb..aba3d58 100644 --- a/hw/pc_piix.c +++ b/hw/pc_piix.c @@ -38,6 +38,10 @@ #include arch_init.h #include blockdev.h #include smbus.h +#include xen.h +#ifdef CONFIG_XEN +# include xen/hvm/hvm_info_table.h +#endif #define MAX_IDE_BUS 2 @@ -101,8 +105,10 @@ static void pc_init1(ram_addr_t ram_size, } /* allocate ram and load rom/bios */ -pc_memory_init(kernel_filename, kernel_cmdline, initrd_filename, - below_4g_mem_size, above_4g_mem_size); +if (!xen_enabled()) { +pc_memory_init(kernel_filename, kernel_cmdline, initrd_filename, + below_4g_mem_size, above_4g_mem_size); +} cpu_irq = pc_allocate_cpu_irq(); i8259 = i8259_init(cpu_irq[0]); @@ -221,6 +227,24 @@ static void pc_init_isa(ram_addr_t ram_size, initrd_filename, cpu_model, 0, 1); } +#ifdef CONFIG_XEN +static void pc_xen_hvm_init(ram_addr_t ram_size, +const char *boot_device, +const char *kernel_filename, +const char *kernel_cmdline, +const char *initrd_filename, +const char *cpu_model) +{ +if (xen_hvm_init() != 0) { +hw_error(xen hardware virtual machine initialisation failed); +} +pc_init_pci_no_kvmclock(ram_size, boot_device, +kernel_filename, kernel_cmdline, +initrd_filename, cpu_model); +xen_vcpu_init(); +} +#endif + static QEMUMachine pc_machine = { .name = pc-0.14, .alias = pc, @@ -385,6 +409,16 @@ static QEMUMachine isapc_machine = { .max_cpus = 1, }; +#ifdef CONFIG_XEN +static QEMUMachine xenfv_machine = { +.name = xenfv, +.desc = Xen Fully-virtualized PC, +.init = pc_xen_hvm_init, +.max_cpus = HVM_MAX_VCPUS, +.default_machine_opts = accel=xen, +}; +#endif + static void pc_machine_init(void) { qemu_register_machine(pc_machine); 
@@ -393,6 +427,9 @@ static void pc_machine_init(void) qemu_register_machine(pc_machine_v0_11); qemu_register_machine(pc_machine_v0_10); qemu_register_machine(isapc_machine); +#ifdef CONFIG_XEN +qemu_register_machine(xenfv_machine); +#endif } machine_init(pc_machine_init); diff --git a/hw/xen.h b/hw/xen.h index 1fefe3a..bb4dcb5 100644 --- a/hw/xen.h +++ b/hw/xen.h @@ -30,5 +30,11 @@ static inline int xen_enabled(void) } int xen_init(void); +int xen_hvm_init(void); +void xen_vcpu_init(void); + +#if defined(CONFIG_XEN) CONFIG_XEN_CTRL_INTERFACE_VERSION 400 +# define HVM_MAX_VCPUS 32 +#endif #endif /* QEMU_HW_XEN_H */ diff --git a/xen-all.c b/xen-all.c index e2872f9..0b984b2 100644 --- a/xen-all.c +++ b/xen-all.c @@ -9,6 +9,25 @@ #include hw/xen_common.h #include hw/xen_backend.h +/* VCPU Operations, MMIO, IO ring ... */ + +static void xen_reset_vcpu(void *opaque) +{ +CPUState *env = opaque; + +env-halted = 1; +} + +void xen_vcpu_init(void) +{ +CPUState *first_cpu; + +if ((first_cpu = qemu_get_cpu(0))) { +qemu_register_reset(xen_reset_vcpu, first_cpu); +xen_reset_vcpu(first_cpu); +} +} + /* Initialise Xen */ int xen_init(void) @@ -21,3 +40,8 @@ int xen_init(void) return 0; } + +int xen_hvm_init(void) +{ +return 0; +} -- 1.7.2.5
[Qemu-devel] [PATCH V15 04/18] xen: Add initialisation of Xen
From: Anthony PERARD anthony.per...@citrix.com The xenpv machine use the common init function. Signed-off-by: Anthony PERARD anthony.per...@citrix.com Acked-by: Alexander Graf ag...@suse.de --- Makefile.target |9 + hw/xen.h| 13 + hw/xen_backend.c|3 +-- hw/xen_machine_pv.c |1 + vl.c|2 ++ xen-all.c | 23 +++ xen-stub.c | 15 +++ 7 files changed, 64 insertions(+), 2 deletions(-) create mode 100644 xen-all.c create mode 100644 xen-stub.c diff --git a/Makefile.target b/Makefile.target index 6873f76..fd84d46 100644 --- a/Makefile.target +++ b/Makefile.target @@ -208,6 +208,15 @@ QEMU_CFLAGS += $(VNC_PNG_CFLAGS) # xen backend driver support obj-i386-$(CONFIG_XEN) += xen_machine_pv.o xen_domainbuild.o +ifeq ($(TARGET_BASE_ARCH), i386) + CONFIG_NO_XEN = $(if $(subst n,,$(CONFIG_XEN)),n,y) +else + CONFIG_NO_XEN = y +endif +# xen support +obj-i386-$(CONFIG_XEN) += xen-all.o +obj-$(CONFIG_NO_XEN) += xen-stub.o + # Inter-VM PCI shared memory CONFIG_IVSHMEM = ifeq ($(CONFIG_KVM), y) diff --git a/hw/xen.h b/hw/xen.h index 780dcf7..1fefe3a 100644 --- a/hw/xen.h +++ b/hw/xen.h @@ -18,4 +18,17 @@ enum xen_mode { extern uint32_t xen_domid; extern enum xen_mode xen_mode; +extern int xen_allowed; + +static inline int xen_enabled(void) +{ +#ifdef CONFIG_XEN +return xen_allowed; +#else +return 0; +#endif +} + +int xen_init(void); + #endif /* QEMU_HW_XEN_H */ diff --git a/hw/xen_backend.c b/hw/xen_backend.c index 5f58a3f..d881fa2 100644 --- a/hw/xen_backend.c +++ b/hw/xen_backend.c @@ -665,9 +665,8 @@ int xen_be_init(void) goto err; } -xen_xc = xen_xc_interface_open(0, 0, 0); if (xen_xc == XC_HANDLER_INITIAL_VALUE) { -xen_be_printf(NULL, 0, can't open xen interface\n); +/* Check if xen_init() have been called */ goto err; } return 0; diff --git a/hw/xen_machine_pv.c b/hw/xen_machine_pv.c index 0d7f73e..7985d11 100644 --- a/hw/xen_machine_pv.c +++ b/hw/xen_machine_pv.c @@ -113,6 +113,7 @@ static QEMUMachine xenpv_machine = { .desc = Xen Para-virtualized PC, .init = xen_init_pv, .max_cpus = 
1, +.default_machine_opts = accel=xen, }; static void xenpv_machine_init(void) diff --git a/vl.c b/vl.c index 4632469..3ed6855 100644 --- a/vl.c +++ b/vl.c @@ -259,6 +259,7 @@ static NotifierList machine_init_done_notifiers = static int tcg_allowed = 1; int kvm_allowed = 0; +int xen_allowed = 0; uint32_t xen_domid; enum xen_mode xen_mode = XEN_EMULATE; @@ -1890,6 +1891,7 @@ static struct { int *allowed; } accel_list[] = { { tcg, tcg, tcg_available, tcg_init, tcg_allowed }, +{ xen, Xen, xen_available, xen_init, xen_allowed }, { kvm, KVM, kvm_available, kvm_init, kvm_allowed }, }; diff --git a/xen-all.c b/xen-all.c new file mode 100644 index 000..e2872f9 --- /dev/null +++ b/xen-all.c @@ -0,0 +1,23 @@ +/* + * Copyright (C) 2010 Citrix Ltd. + * + * This work is licensed under the terms of the GNU GPL, version 2. See + * the COPYING file in the top-level directory. + * + */ + +#include hw/xen_common.h +#include hw/xen_backend.h + +/* Initialise Xen */ + +int xen_init(void) +{ +xen_xc = xen_xc_interface_open(0, 0, 0); +if (xen_xc == XC_HANDLER_INITIAL_VALUE) { +xen_be_printf(NULL, 0, can't open xen interface\n); +return -1; +} + +return 0; +} diff --git a/xen-stub.c b/xen-stub.c new file mode 100644 index 000..beb982f --- /dev/null +++ b/xen-stub.c @@ -0,0 +1,15 @@ +/* + * Copyright (C) 2010 Citrix Ltd. + * + * This work is licensed under the terms of the GNU GPL, version 2. See + * the COPYING file in the top-level directory. + * + */ + +#include qemu-common.h +#include hw/xen.h + +int xen_init(void) +{ +return -ENOSYS; +} -- 1.7.2.5
[Qemu-devel] [PATCH V15 05/18] pc_memory_init: Move memory calculation to the caller.
From: Anthony PERARD <anthony.per...@citrix.com>

This patch moves the calculation of above_4g_mem_size and
below_4g_mem_size into the caller of pc_memory_init (pc_init1). The
prototype of pc_memory_init changes accordingly, because the pointer
parameters and the ram_size parameter are no longer needed.

Signed-off-by: Anthony PERARD <anthony.per...@citrix.com>
---
 hw/pc.c      |   17 +++--------------
 hw/pc.h      |    7 +++----
 hw/pc_piix.c |   12 ++++++++++--
 3 files changed, 16 insertions(+), 20 deletions(-)

diff --git a/hw/pc.c b/hw/pc.c
index 6939c04..ebdf3b0 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -957,29 +957,18 @@ void pc_cpus_init(const char *cpu_model)
     }
 }

-void pc_memory_init(ram_addr_t ram_size,
-                    const char *kernel_filename,
+void pc_memory_init(const char *kernel_filename,
                     const char *kernel_cmdline,
                     const char *initrd_filename,
-                    ram_addr_t *below_4g_mem_size_p,
-                    ram_addr_t *above_4g_mem_size_p)
+                    ram_addr_t below_4g_mem_size,
+                    ram_addr_t above_4g_mem_size)
 {
     char *filename;
     int ret, linux_boot, i;
     ram_addr_t ram_addr, bios_offset, option_rom_offset;
-    ram_addr_t below_4g_mem_size, above_4g_mem_size = 0;
     int bios_size, isa_bios_size;
     void *fw_cfg;

-    if (ram_size >= 0xe0000000 ) {
-        above_4g_mem_size = ram_size - 0xe0000000;
-        below_4g_mem_size = 0xe0000000;
-    } else {
-        below_4g_mem_size = ram_size;
-    }
-    *above_4g_mem_size_p = above_4g_mem_size;
-    *below_4g_mem_size_p = below_4g_mem_size;
-
     linux_boot = (kernel_filename != NULL);

     /* allocate RAM */
diff --git a/hw/pc.h b/hw/pc.h
index feb8a7a..b7ee7f8 100644
--- a/hw/pc.h
+++ b/hw/pc.h
@@ -129,12 +129,11 @@ void pc_cmos_set_s3_resume(void *opaque, int irq, int level);
 void pc_acpi_smi_interrupt(void *opaque, int irq, int level);

 void pc_cpus_init(const char *cpu_model);
-void pc_memory_init(ram_addr_t ram_size,
-                    const char *kernel_filename,
+void pc_memory_init(const char *kernel_filename,
                     const char *kernel_cmdline,
                     const char *initrd_filename,
-                    ram_addr_t *below_4g_mem_size_p,
-                    ram_addr_t *above_4g_mem_size_p);
+                    ram_addr_t below_4g_mem_size,
+                    ram_addr_t above_4g_mem_size);
 qemu_irq *pc_allocate_cpu_irq(void);
 void pc_vga_init(PCIBus *pci_bus);
 void pc_basic_device_init(qemu_irq *isa_irq,
diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index a85214b..23a6bfb 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -92,9 +92,17 @@ static void pc_init1(ram_addr_t ram_size,
         kvmclock_create();
     }

+    if (ram_size >= 0xe0000000 ) {
+        above_4g_mem_size = ram_size - 0xe0000000;
+        below_4g_mem_size = 0xe0000000;
+    } else {
+        above_4g_mem_size = 0;
+        below_4g_mem_size = ram_size;
+    }
+
     /* allocate ram and load rom/bios */
-    pc_memory_init(ram_size, kernel_filename, kernel_cmdline, initrd_filename,
-                   &below_4g_mem_size, &above_4g_mem_size);
+    pc_memory_init(kernel_filename, kernel_cmdline, initrd_filename,
+                   below_4g_mem_size, above_4g_mem_size);

     cpu_irq = pc_allocate_cpu_irq();
     i8259 = i8259_init(cpu_irq[0]);
-- 
1.7.2.5
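The split that this patch moves into pc_init1 can be illustrated with a small standalone function. This is only a sketch: the struct ram_split and the function split_ram() are hypothetical names, and the threshold is assumed to be 0xe0000000 (3.5 GiB), since the archived text appears to have lost some of the zeros in that constant.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed split point (3.5 GiB); RAM beyond it is placed above 4 GiB. */
#define BELOW_4G_LIMIT UINT64_C(0xe0000000)

struct ram_split {
    uint64_t below_4g;
    uint64_t above_4g;
};

/* Divide the guest RAM size into the part mapped below 4 GiB and the
 * remainder mapped above 4 GiB. */
static struct ram_split split_ram(uint64_t ram_size)
{
    struct ram_split s;

    if (ram_size >= BELOW_4G_LIMIT) {
        s.below_4g = BELOW_4G_LIMIT;
        s.above_4g = ram_size - BELOW_4G_LIMIT;
    } else {
        s.below_4g = ram_size;
        s.above_4g = 0;
    }
    /* The two parts always add up to the requested size. */
    assert(s.below_4g + s.above_4g == ram_size);
    return s;
}
```

Computing both values in the caller means pc_memory_init can take them by value, which is what lets the patch drop the pointer parameters.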
[Qemu-devel] [PATCH V15 02/18] xen: Make Xen build once.
From: Anthony PERARD <anthony.per...@citrix.com>

xen_domainbuild and xen_machine_pv are built only for i386 targets.

Signed-off-by: Anthony PERARD <anthony.per...@citrix.com>
---
 Makefile.target |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/Makefile.target b/Makefile.target
index 21f864a..6873f76 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -206,7 +206,7 @@ QEMU_CFLAGS += $(VNC_JPEG_CFLAGS)
 QEMU_CFLAGS += $(VNC_PNG_CFLAGS)

 # xen backend driver support
-obj-$(CONFIG_XEN) += xen_machine_pv.o xen_domainbuild.o
+obj-i386-$(CONFIG_XEN) += xen_machine_pv.o xen_domainbuild.o

 # Inter-VM PCI shared memory
 CONFIG_IVSHMEM =
-- 
1.7.2.5
[Qemu-devel] [PATCH V15 09/18] xen: Introduce Xen Interrupt Controller
From: Anthony PERARD <anthony.per...@citrix.com>

Every set_irq call makes a Xen hypercall.

Signed-off-by: Anthony PERARD <anthony.per...@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabell...@eu.citrix.com>
---
 hw/pc_piix.c |    8 ++++++--
 hw/xen.h     |    2 ++
 xen-all.c    |   12 ++++++++++++
 xen-stub.c   |    5 +++++
 4 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index 9353b91..62cdf71 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -110,8 +110,12 @@ static void pc_init1(ram_addr_t ram_size,
                        below_4g_mem_size, above_4g_mem_size);
     }

-    cpu_irq = pc_allocate_cpu_irq();
-    i8259 = i8259_init(cpu_irq[0]);
+    if (!xen_enabled()) {
+        cpu_irq = pc_allocate_cpu_irq();
+        i8259 = i8259_init(cpu_irq[0]);
+    } else {
+        i8259 = xen_interrupt_controller_init();
+    }
     isa_irq_state = qemu_mallocz(sizeof(*isa_irq_state));
     isa_irq_state->i8259 = i8259;
     if (pci_enabled) {
diff --git a/hw/xen.h b/hw/xen.h
index a4096ca..9f00c0b 100644
--- a/hw/xen.h
+++ b/hw/xen.h
@@ -35,6 +35,8 @@ int xen_pci_slot_get_pirq(PCIDevice *pci_dev, int irq_num);
 void xen_piix3_set_irq(void *opaque, int irq_num, int level);
 void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int len);

+qemu_irq *xen_interrupt_controller_init(void);
+
 int xen_init(void);
 int xen_hvm_init(void);
 void xen_vcpu_init(void);
diff --git a/xen-all.c b/xen-all.c
index acb051c..bb809ef 100644
--- a/xen-all.c
+++ b/xen-all.c
@@ -40,6 +40,18 @@ void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int len)
     }
 }

+/* Xen Interrupt Controller */
+
+static void xen_set_irq(void *opaque, int irq, int level)
+{
+    xc_hvm_set_isa_irq_level(xen_xc, xen_domid, irq, level);
+}
+
+qemu_irq *xen_interrupt_controller_init(void)
+{
+    return qemu_allocate_irqs(xen_set_irq, NULL, 16);
+}
+
 /* VCPU Operations, MMIO, IO ring ... */

 static void xen_reset_vcpu(void *opaque)
diff --git a/xen-stub.c b/xen-stub.c
index dc90f10..3a8449c 100644
--- a/xen-stub.c
+++ b/xen-stub.c
@@ -22,6 +22,11 @@ void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int len)
 {
 }

+qemu_irq *xen_interrupt_controller_init(void)
+{
+    return NULL;
+}
+
 int xen_init(void)
 {
     return -ENOSYS;
-- 
1.7.2.5
[Qemu-devel] [PATCH V15 10/18] xen: Introduce the Xen mapcache
From: Jun Nakajima jun.nakaj...@intel.com On IA32 host or IA32 PAE host, at present, generally, we can't create an HVM guest with more than 2G memory, because generally it's almost impossible for Qemu to find a large enough and consecutive virtual address space to map an HVM guest's whole physical address space. The attached patch fixes this issue using dynamic mapping based on little blocks of memory. Each call to qemu_get_ram_ptr makes a call to qemu_map_cache with the lock option, so mapcache will not unmap these ram_ptr. Blocks that do not belong to the RAM, but usually to a device ROM or to a framebuffer, are handled in a separate function. So the whole RAMBlock can be map. Signed-off-by: Jun Nakajima jun.nakaj...@intel.com Signed-off-by: Anthony PERARD anthony.per...@citrix.com Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com --- Makefile.target |3 + configure |3 + exec.c | 48 +++- hw/xen.h| 13 ++ hw/xen_common.h |9 ++ trace-events| 10 ++ xen-all.c | 66 ++ xen-mapcache-stub.c | 44 +++ xen-mapcache.c | 349 +++ xen-mapcache.h | 37 ++ xen-stub.c |4 + 11 files changed, 582 insertions(+), 4 deletions(-) create mode 100644 xen-mapcache-stub.c create mode 100644 xen-mapcache.c create mode 100644 xen-mapcache.h diff --git a/Makefile.target b/Makefile.target index fd84d46..2e281a4 100644 --- a/Makefile.target +++ b/Makefile.target @@ -214,8 +214,11 @@ else CONFIG_NO_XEN = y endif # xen support +CONFIG_NO_XEN_MAPCACHE = $(if $(subst n,,$(CONFIG_XEN_MAPCACHE)),n,y) obj-i386-$(CONFIG_XEN) += xen-all.o obj-$(CONFIG_NO_XEN) += xen-stub.o +obj-i386-$(CONFIG_XEN_MAPCACHE) += xen-mapcache.o +obj-$(CONFIG_NO_XEN_MAPCACHE) += xen-mapcache-stub.o # Inter-VM PCI shared memory CONFIG_IVSHMEM = diff --git a/configure b/configure index 5df84d2..6fc2bdd 100755 --- a/configure +++ b/configure @@ -3299,6 +3299,9 @@ case $target_arch2 in i386|x86_64) if test $xen = yes -a $target_softmmu = yes ; then echo CONFIG_XEN=y $config_target_mak + if test $cpu = i386 -o $cpu = 
x86_64; then + echo CONFIG_XEN_MAPCACHE=y $config_target_mak + fi fi esac case $target_arch2 in diff --git a/exec.c b/exec.c index a718d74..19707c5 100644 --- a/exec.c +++ b/exec.c @@ -32,6 +32,7 @@ #include hw/qdev.h #include osdep.h #include kvm.h +#include hw/xen.h #include qemu-timer.h #if defined(CONFIG_USER_ONLY) #include qemu.h @@ -51,6 +52,8 @@ #include libutil.h #endif #endif +#else /* !CONFIG_USER_ONLY */ +#include xen-mapcache.h #endif //#define DEBUG_TB_INVALIDATE @@ -2879,6 +2882,7 @@ ram_addr_t qemu_ram_alloc_from_ptr(DeviceState *dev, const char *name, } } +new_block-offset = find_ram_offset(size); if (host) { new_block-host = host; new_block-flags |= RAM_PREALLOC_MASK; @@ -2901,13 +2905,15 @@ ram_addr_t qemu_ram_alloc_from_ptr(DeviceState *dev, const char *name, PROT_EXEC|PROT_READ|PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0); #else -new_block-host = qemu_vmalloc(size); +if (xen_mapcache_enabled()) { +xen_ram_alloc(new_block-offset, size); +} else { +new_block-host = qemu_vmalloc(size); +} #endif qemu_madvise(new_block-host, size, QEMU_MADV_MERGEABLE); } } - -new_block-offset = find_ram_offset(size); new_block-length = size; QLIST_INSERT_HEAD(ram_list.blocks, new_block, next); @@ -2952,7 +2958,11 @@ void qemu_ram_free(ram_addr_t addr) #if defined(TARGET_S390X) defined(CONFIG_KVM) munmap(block-host, block-length); #else -qemu_vfree(block-host); +if (xen_mapcache_enabled()) { +qemu_invalidate_entry(block-host); +} else { +qemu_vfree(block-host); +} #endif } qemu_free(block); @@ -3041,6 +3051,16 @@ void *qemu_get_ram_ptr(ram_addr_t addr) QLIST_REMOVE(block, next); QLIST_INSERT_HEAD(ram_list.blocks, block, next); } +if (xen_mapcache_enabled()) { +/* We need to check if the requested address is in the RAM + * because we don't want to map the entire memory in QEMU. 
+ */ +if (block-offset == 0) { +return qemu_map_cache(addr, 0, 1); +} else if (block-host == NULL) { +block-host = xen_map_block(block-offset, block-length); +} +} return block-host + (addr - block-offset); } } @@ -3060,6 +3080,16 @@ void *qemu_safe_ram_ptr(ram_addr_t addr)
[Qemu-devel] [PATCH V15 07/18] pc, Disable vmport initialisation with Xen.
From: Anthony PERARD <anthony.per...@citrix.com>

The vcpu registers are not synchronised between Xen and QEMU, so
vmport cannot work properly under Xen. This patch introduces a
no_vmport parameter to pc_basic_device_init.

Signed-off-by: Anthony PERARD <anthony.per...@citrix.com>
---
 hw/pc.c      |   11 ++++++++---
 hw/pc.h      |    3 ++-
 hw/pc_piix.c |    2 +-
 3 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/hw/pc.c b/hw/pc.c
index ebdf3b0..8106197 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -1082,7 +1082,8 @@ static void cpu_request_exit(void *opaque, int irq, int level)
 }

 void pc_basic_device_init(qemu_irq *isa_irq,
-                          ISADevice **rtc_state)
+                          ISADevice **rtc_state,
+                          bool no_vmport)
 {
     int i;
     DriveInfo *fd[MAX_FD];
@@ -1127,8 +1128,12 @@ void pc_basic_device_init(qemu_irq *isa_irq,
     a20_line = qemu_allocate_irqs(handle_a20_line_change, first_cpu, 2);
     i8042 = isa_create_simple("i8042");
     i8042_setup_a20_line(i8042, a20_line[0]);
-    vmport_init();
-    vmmouse = isa_try_create("vmmouse");
+    if (!no_vmport) {
+        vmport_init();
+        vmmouse = isa_try_create("vmmouse");
+    } else {
+        vmmouse = NULL;
+    }
     if (vmmouse) {
         qdev_prop_set_ptr(&vmmouse->qdev, "ps2_mouse", i8042);
         qdev_init_nofail(&vmmouse->qdev);
diff --git a/hw/pc.h b/hw/pc.h
index b7ee7f8..6d5730b 100644
--- a/hw/pc.h
+++ b/hw/pc.h
@@ -137,7 +137,8 @@ void pc_memory_init(const char *kernel_filename,
 qemu_irq *pc_allocate_cpu_irq(void);
 void pc_vga_init(PCIBus *pci_bus);
 void pc_basic_device_init(qemu_irq *isa_irq,
-                          ISADevice **rtc_state);
+                          ISADevice **rtc_state,
+                          bool no_vmport);
 void pc_init_ne2k_isa(NICInfo *nd);
 void pc_cmos_init(ram_addr_t ram_size, ram_addr_t above_4g_mem_size,
                   const char *boot_device,
diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index aba3d58..e814f00 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -133,7 +133,7 @@ static void pc_init1(ram_addr_t ram_size,
     pc_vga_init(pci_enabled? pci_bus: NULL);

     /* init basic PC hardware */
-    pc_basic_device_init(isa_irq, &rtc_state);
+    pc_basic_device_init(isa_irq, &rtc_state, xen_enabled());

     for(i = 0; i < nb_nics; i++) {
         NICInfo *nd = &nd_table[i];
-- 
1.7.2.5
[Qemu-devel] [PATCH V15 03/18] xen: Support new libxc calls from xen unstable.
From: Anthony PERARD anthony.per...@citrix.com This patch updates the libxenctrl calls in Qemu to use the new interface, otherwise Qemu wouldn't be able to build against new versions of the library. We check libxenctrl version in configure, from Xen 3.3.0 to Xen unstable. Signed-off-by: Anthony PERARD anthony.per...@citrix.com Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com Acked-by: Alexander Graf ag...@suse.de --- configure| 67 ++- hw/xen_backend.c | 21 ++- hw/xen_backend.h |6 ++-- hw/xen_common.h | 95 ++--- hw/xen_disk.c|4 +- hw/xen_domainbuild.c |3 +- 6 files changed, 164 insertions(+), 32 deletions(-) diff --git a/configure b/configure index 6f75e2e..5df84d2 100755 --- a/configure +++ b/configure @@ -127,6 +127,7 @@ vnc_jpeg= vnc_png= vnc_thread=no xen= +xen_ctrl_version= linux_aio= attr= vhost_net= @@ -1180,20 +1181,81 @@ fi if test $xen != no ; then xen_libs=-lxenstore -lxenctrl -lxenguest + + # Xen unstable cat $TMPC EOF #include xenctrl.h #include xs.h -int main(void) { xs_daemon_open(); xc_interface_open(); return 0; } +#include stdint.h +#include xen/hvm/hvm_info_table.h +#if !defined(HVM_MAX_VCPUS) +# error HVM_MAX_VCPUS not defined +#endif +int main(void) { + xc_interface *xc; + xs_daemon_open(); + xc = xc_interface_open(0, 0, 0); + xc_hvm_set_mem_type(0, 0, HVMMEM_ram_ro, 0, 0); + xc_gnttab_open(NULL, 0); + return 0; +} EOF if compile_prog $xen_libs ; then +xen_ctrl_version=410 xen=yes -libs_softmmu=$xen_libs $libs_softmmu + + # Xen 4.0.0 + elif ( + cat $TMPC EOF +#include xenctrl.h +#include xs.h +#include stdint.h +#include xen/hvm/hvm_info_table.h +#if !defined(HVM_MAX_VCPUS) +# error HVM_MAX_VCPUS not defined +#endif +int main(void) { + xs_daemon_open(); + xc_interface_open(); + xc_gnttab_open(); + xc_hvm_set_mem_type(0, 0, HVMMEM_ram_ro, 0, 0); + return 0; +} +EOF + compile_prog $xen_libs +) ; then +xen_ctrl_version=400 +xen=yes + + # Xen 3.3.0, 3.4.0 + elif ( + cat $TMPC EOF +#include xenctrl.h +#include xs.h +int main(void) { 
+ xs_daemon_open(); + xc_interface_open(); + xc_gnttab_open(); + xc_hvm_set_mem_type(0, 0, HVMMEM_ram_ro, 0, 0); + return 0; +} +EOF + compile_prog $xen_libs +) ; then +xen_ctrl_version=330 +xen=yes + + # Xen not found or unsupported else if test $xen = yes ; then feature_not_found xen fi xen=no fi + + if test $xen = yes; then +libs_softmmu=$xen_libs $libs_softmmu + fi fi ## @@ -2855,6 +2917,7 @@ if test $bluez = yes ; then fi if test $xen = yes ; then echo CONFIG_XEN=y $config_host_mak + echo CONFIG_XEN_CTRL_INTERFACE_VERSION=$xen_ctrl_version $config_host_mak fi if test $io_thread = yes ; then echo CONFIG_IOTHREAD=y $config_host_mak diff --git a/hw/xen_backend.c b/hw/xen_backend.c index 9f4ec4b..5f58a3f 100644 --- a/hw/xen_backend.c +++ b/hw/xen_backend.c @@ -43,7 +43,8 @@ /* - */ /* public */ -int xen_xc; +XenXC xen_xc = XC_HANDLER_INITIAL_VALUE; +XenGnttab xen_xcg = XC_HANDLER_INITIAL_VALUE; struct xs_handle *xenstore = NULL; const char *xen_protocol; @@ -214,8 +215,8 @@ static struct XenDevice *xen_be_get_xendev(const char *type, int dom, int dev, xendev-debug = debug; xendev-local_port = -1; -xendev-evtchndev = xc_evtchn_open(); -if (xendev-evtchndev 0) { +xendev-evtchndev = xen_xc_evtchn_open(NULL, 0); +if (xendev-evtchndev == XC_HANDLER_INITIAL_VALUE) { xen_be_printf(NULL, 0, can't open evtchn device\n); qemu_free(xendev); return NULL; @@ -223,15 +224,15 @@ static struct XenDevice *xen_be_get_xendev(const char *type, int dom, int dev, fcntl(xc_evtchn_fd(xendev-evtchndev), F_SETFD, FD_CLOEXEC); if (ops-flags DEVOPS_FLAG_NEED_GNTDEV) { -xendev-gnttabdev = xc_gnttab_open(); -if (xendev-gnttabdev 0) { +xendev-gnttabdev = xen_xc_gnttab_open(NULL, 0); +if (xendev-gnttabdev == XC_HANDLER_INITIAL_VALUE) { xen_be_printf(NULL, 0, can't open gnttab device\n); xc_evtchn_close(xendev-evtchndev); qemu_free(xendev); return NULL; } } else { -xendev-gnttabdev = -1; +xendev-gnttabdev = XC_HANDLER_INITIAL_VALUE; } QTAILQ_INSERT_TAIL(xendevs, xendev, next); @@ -277,10 +278,10 
@@ static struct XenDevice *xen_be_del_xendev(int dom, int dev) qemu_free(xendev-fe); } -if (xendev-evtchndev = 0) { +if (xendev-evtchndev != XC_HANDLER_INITIAL_VALUE) { xc_evtchn_close(xendev-evtchndev); } -if (xendev-gnttabdev = 0) { +if (xendev-gnttabdev != XC_HANDLER_INITIAL_VALUE) {
[Qemu-devel] [PATCH V15 11/18] xen: Adds a cap to the number of map cache entries.
From: John Baboval <john.babo...@virtualcomputer.com>

Adds a cap to the number of map cache entries. This prevents the map
cache from overwhelming system memory.

I also removed the bitmap macros and #included bitmap.h instead.

Signed-off-By: John Baboval <john.babo...@virtualcomputer.com>
Signed-off-by: Anthony PERARD <anthony.per...@citrix.com>
---
 xen-mapcache.c |   37 +++++++++++++++----------------------
 1 files changed, 15 insertions(+), 22 deletions(-)

diff --git a/xen-mapcache.c b/xen-mapcache.c
index a539358..2ca18ce 100644
--- a/xen-mapcache.c
+++ b/xen-mapcache.c
@@ -12,6 +12,7 @@

 #include "hw/xen_backend.h"
 #include "blockdev.h"
+#include "bitmap.h"

 #include <xen/hvm/params.h>
 #include <sys/mman.h>
@@ -32,15 +33,13 @@

 #if defined(__i386__)
 #  define MCACHE_BUCKET_SHIFT 16
+#  define MCACHE_MAX_SIZE     (1UL<<31) /* 2GB Cap */
 #elif defined(__x86_64__)
 #  define MCACHE_BUCKET_SHIFT 20
+#  define MCACHE_MAX_SIZE     (1UL<<35) /* 32GB Cap */
 #endif
 #define MCACHE_BUCKET_SIZE (1UL << MCACHE_BUCKET_SHIFT)

-#define BITS_PER_LONG (sizeof(long) * 8)
-#define BITS_TO_LONGS(bits) (((bits) + BITS_PER_LONG - 1) / BITS_PER_LONG)
-#define DECLARE_BITMAP(name, bits) unsigned long name[BITS_TO_LONGS(bits)]
-
 typedef struct MapCacheEntry {
     target_phys_addr_t paddr_index;
     uint8_t *vaddr_base;
@@ -69,11 +68,6 @@ typedef struct MapCache {

 static MapCache *mapcache;

-static inline int test_bit(unsigned int bit, const unsigned long *map)
-{
-    return !!((map)[(bit) / BITS_PER_LONG] & (1UL << ((bit) % BITS_PER_LONG)));
-}
-
 void qemu_map_cache_init(void)
 {
     unsigned long size;
@@ -85,9 +79,14 @@ void qemu_map_cache_init(void)
     mapcache->last_address_index = -1;

     getrlimit(RLIMIT_AS, &rlimit_as);
-    rlimit_as.rlim_cur = rlimit_as.rlim_max;
+    if (rlimit_as.rlim_max < MCACHE_MAX_SIZE) {
+        rlimit_as.rlim_cur = rlimit_as.rlim_max;
+    } else {
+        rlimit_as.rlim_cur = MCACHE_MAX_SIZE;
+    }
+
     setrlimit(RLIMIT_AS, &rlimit_as);
-    mapcache->max_mcache_size = rlimit_as.rlim_max;
+    mapcache->max_mcache_size = rlimit_as.rlim_cur;

     mapcache->nr_buckets =
         (((mapcache->max_mcache_size >> XC_PAGE_SHIFT) +
@@ -107,7 +106,7 @@ static void qemu_remap_bucket(MapCacheEntry *entry,
     uint8_t *vaddr_base;
     xen_pfn_t *pfns;
     int *err;
-    unsigned int i, j;
+    unsigned int i;
     target_phys_addr_t nb_pfn = size >> XC_PAGE_SHIFT;

     trace_qemu_remap_bucket(address_index);
@@ -136,17 +135,11 @@ static void qemu_remap_bucket(MapCacheEntry *entry,
     entry->vaddr_base = vaddr_base;
     entry->paddr_index = address_index;

-    for (i = 0; i < nb_pfn; i += BITS_PER_LONG) {
-        unsigned long word = 0;
-        if ((i + BITS_PER_LONG) > nb_pfn) {
-            j = nb_pfn % BITS_PER_LONG;
-        } else {
-            j = BITS_PER_LONG;
-        }
-        while (j > 0) {
-            word = (word << 1) | !err[i + --j];
+    bitmap_zero(entry->valid_mapping, nb_pfn);
+    for (i = 0; i < nb_pfn; i++) {
+        if (!err[i]) {
+            bitmap_set(entry->valid_mapping, i, 1);
         }
-        entry->valid_mapping[i / BITS_PER_LONG] = word;
     }

     qemu_free(pfns);
-- 
1.7.2.5
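The per-page bitmap update at the end of this patch can be sketched in isolation. The helpers below are hypothetical replacements for the bitmap.h functions the patch uses (bitmap_zero, bitmap_set), written out so the example stands alone; the rule they implement, "page i is valid when err[i] is zero", is the one visible in the new loop.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define BITS_PER_WORD   (sizeof(unsigned long) * CHAR_BIT)
#define WORDS_FOR(bits) (((bits) + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Set a single bit in the bitmap. */
static void set_bit_in(unsigned long *map, size_t bit)
{
    map[bit / BITS_PER_WORD] |= 1UL << (bit % BITS_PER_WORD);
}

/* Return 1 if the bit is set, 0 otherwise. */
static int test_bit_in(const unsigned long *map, size_t bit)
{
    return (int)((map[bit / BITS_PER_WORD] >> (bit % BITS_PER_WORD)) & 1UL);
}

/* Clear the bitmap, then mark page i as valid when err[i] == 0,
 * i.e. when mapping that page did not report an error. */
static void fill_valid_mapping(unsigned long *map, const int *err,
                               size_t nb_pfn)
{
    size_t i;

    assert(map != NULL && err != NULL);
    memset(map, 0, WORDS_FOR(nb_pfn) * sizeof(unsigned long));
    for (i = 0; i < nb_pfn; i++) {
        if (!err[i]) {
            set_bit_in(map, i);
        }
    }
}
```

Setting one bit per page is slower than assembling whole words as the removed loop did, but it is much easier to check by eye.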
[Qemu-devel] [PATCH V15 17/18] xen: Set running state in xenstore.
From: Anthony PERARD <anthony.per...@citrix.com>

This tells the Xen management tool that the machine can begin running.

Signed-off-by: Anthony PERARD <anthony.per...@citrix.com>
Acked-by: Alexander Graf <ag...@suse.de>
---
 xen-all.c |   23 +++++++++++++++++++++++
 1 files changed, 23 insertions(+), 0 deletions(-)

diff --git a/xen-all.c b/xen-all.c
index e849a38..19c2fe1 100644
--- a/xen-all.c
+++ b/xen-all.c
@@ -64,6 +64,8 @@ typedef struct XenIOState {
     /* which vcpu we are serving */
     int send_vcpu;

+    struct xs_handle *xenstore;
+
     Notifier exit;
 } XenIOState;

@@ -450,6 +452,17 @@ static void cpu_handle_ioreq(void *opaque)
     }
 }

+static void xenstore_record_dm_state(XenIOState *s, const char *state)
+{
+    char path[50];
+
+    snprintf(path, sizeof (path), "/local/domain/0/device-model/%u/state", xen_domid);
+    if (!xs_write(s->xenstore, XBT_NULL, path, state, strlen(state))) {
+        fprintf(stderr, "error recording dm state\n");
+        exit(1);
+    }
+}
+
 static void xen_main_loop_prepare(XenIOState *state)
 {
     int evtchn_fd = -1;
@@ -465,6 +478,9 @@ static void xen_main_loop_prepare(XenIOState *state)
     if (evtchn_fd != -1) {
         qemu_set_fd_handler(evtchn_fd, cpu_handle_ioreq, NULL, state);
     }
+
+    /* record state running */
+    xenstore_record_dm_state(state, "running");
 }


@@ -483,6 +499,7 @@ static void xen_exit_notifier(Notifier *n)
     XenIOState *state = container_of(n, XenIOState, exit);

     xc_evtchn_close(state->xce_handle);
+    xs_daemon_close(state->xenstore);
 }

 int xen_init(void)
@@ -510,6 +527,12 @@ int xen_hvm_init(void)
         return -errno;
     }

+    state->xenstore = xs_daemon_open();
+    if (state->xenstore == NULL) {
+        perror("xen: xenstore open");
+        return -errno;
+    }
+
     state->exit.notify = xen_exit_notifier;
     qemu_add_exit_notifier(&state->exit);

-- 
1.7.2.5
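The way the device-model state key is built and written can be tried outside QEMU with a stand-in for the xenstore write. In the sketch below, fake_store_write() and record_dm_state() are hypothetical names; the real patch calls xs_write() on an open xenstore handle. The path format and the 50-byte buffer are taken from the patch; the truncation check is an addition for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Captures the most recent write so it can be inspected. */
static char last_path[64];
static char last_value[32];

/* Hypothetical stand-in for xs_write(): returns 1 on success, 0 on failure. */
static int fake_store_write(const char *path, const void *data, size_t len)
{
    if (strlen(path) >= sizeof(last_path) || len >= sizeof(last_value)) {
        return 0;
    }
    strcpy(last_path, path);
    memcpy(last_value, data, len);
    last_value[len] = '\0';
    return 1;
}

/* Build /local/domain/0/device-model/<domid>/state and store the state
 * string there. Returns 0 on success, -1 on truncation or write failure. */
static int record_dm_state(unsigned int domid, const char *state)
{
    char path[50];
    int n;

    assert(state != NULL);
    n = snprintf(path, sizeof(path),
                 "/local/domain/0/device-model/%u/state", domid);
    if (n < 0 || (size_t)n >= sizeof(path)) {
        return -1; /* path did not fit in the buffer */
    }
    return fake_store_write(path, state, strlen(state)) ? 0 : -1;
}
```

The fixed prefix is 29 characters and the suffix 6, so a 50-byte buffer has room for any 32-bit domain id plus the terminating NUL.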
[Qemu-devel] [PATCH V15 08/18] piix_pci: Introduces Xen specific call for irq.
From: Anthony PERARD anthony.per...@citrix.com This patch introduces Xen specific call in piix_pci. The specific part for Xen is in write_config, set_irq and get_pirq. Signed-off-by: Anthony PERARD anthony.per...@citrix.com Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com Acked-by: Alexander Graf ag...@suse.de --- hw/pc.h |1 + hw/pc_piix.c |6 +- hw/piix_pci.c | 47 --- hw/xen.h |6 ++ xen-all.c | 31 +++ xen-stub.c| 13 + 6 files changed, 100 insertions(+), 4 deletions(-) diff --git a/hw/pc.h b/hw/pc.h index 6d5730b..0dcbee7 100644 --- a/hw/pc.h +++ b/hw/pc.h @@ -176,6 +176,7 @@ struct PCII440FXState; typedef struct PCII440FXState PCII440FXState; PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix_devfn, qemu_irq *pic, ram_addr_t ram_size); +PCIBus *i440fx_xen_init(PCII440FXState **pi440fx_state, int *piix3_devfn, qemu_irq *pic, ram_addr_t ram_size); void i440fx_init_memory_mappings(PCII440FXState *d); /* piix4.c */ diff --git a/hw/pc_piix.c b/hw/pc_piix.c index e814f00..9353b91 100644 --- a/hw/pc_piix.c +++ b/hw/pc_piix.c @@ -120,7 +120,11 @@ static void pc_init1(ram_addr_t ram_size, isa_irq = qemu_allocate_irqs(isa_irq_handler, isa_irq_state, 24); if (pci_enabled) { -pci_bus = i440fx_init(i440fx_state, piix3_devfn, isa_irq, ram_size); +if (!xen_enabled()) { +pci_bus = i440fx_init(i440fx_state, piix3_devfn, isa_irq, ram_size); +} else { +pci_bus = i440fx_xen_init(i440fx_state, piix3_devfn, isa_irq, ram_size); +} } else { pci_bus = NULL; i440fx_state = NULL; diff --git a/hw/piix_pci.c b/hw/piix_pci.c index 358da58..c11a7f6 100644 --- a/hw/piix_pci.c +++ b/hw/piix_pci.c @@ -29,6 +29,7 @@ #include isa.h #include sysbus.h #include range.h +#include xen.h /* * I440FX chipset data sheet. 
@@ -151,6 +152,13 @@ static void i440fx_write_config(PCIDevice *dev, } } +static void i440fx_write_config_xen(PCIDevice *dev, +uint32_t address, uint32_t val, int len) +{ +xen_piix_pci_write_config_client(address, val, len); +i440fx_write_config(dev, address, val, len); +} + static int i440fx_load_old(QEMUFile* f, void *opaque, int version_id) { PCII440FXState *d = opaque; @@ -216,7 +224,10 @@ static int i440fx_initfn(PCIDevice *dev) return 0; } -PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix3_devfn, qemu_irq *pic, ram_addr_t ram_size) +static PCIBus *i440fx_common_init(const char *device_name, + PCII440FXState **pi440fx_state, + int *piix3_devfn, + qemu_irq *pic, ram_addr_t ram_size) { DeviceState *dev; PCIBus *b; @@ -230,13 +241,13 @@ PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix3_devfn, qemu_irq * s-bus = b; qdev_init_nofail(dev); -d = pci_create_simple(b, 0, i440FX); +d = pci_create_simple(b, 0, device_name); *pi440fx_state = DO_UPCAST(PCII440FXState, dev, d); piix3 = DO_UPCAST(PIIX3State, dev, pci_create_simple_multifunction(b, -1, true, PIIX3)); piix3-pic = pic; -pci_bus_irqs(b, piix3_set_irq, pci_slot_get_pirq, piix3, 4); + (*pi440fx_state)-piix3 = piix3; *piix3_devfn = piix3-dev.devfn; @@ -249,6 +260,28 @@ PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix3_devfn, qemu_irq * return b; } +PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix3_devfn, +qemu_irq *pic, ram_addr_t ram_size) +{ +PCIBus *b; + +b = i440fx_common_init(i440FX, pi440fx_state, piix3_devfn, pic, ram_size); +pci_bus_irqs(b, piix3_set_irq, pci_slot_get_pirq, (*pi440fx_state)-piix3, 4); + +return b; +} + +PCIBus *i440fx_xen_init(PCII440FXState **pi440fx_state, int *piix3_devfn, +qemu_irq *pic, ram_addr_t ram_size) +{ +PCIBus *b; + +b = i440fx_common_init(i440FX-xen, pi440fx_state, piix3_devfn, pic, ram_size); +pci_bus_irqs(b, xen_piix3_set_irq, xen_pci_slot_get_pirq, (*pi440fx_state)-piix3, 4); + +return b; +} + /* PIIX3 PCI to ISA bridge */ 
static void piix3_set_irq(void *opaque, int irq_num, int level) @@ -352,6 +385,14 @@ static PCIDeviceInfo i440fx_info[] = { .init = i440fx_initfn, .config_write = i440fx_write_config, },{ +.qdev.name= i440FX-xen, +.qdev.desc= Host bridge, +.qdev.size= sizeof(PCII440FXState), +.qdev.vmsd= vmstate_i440fx, +.qdev.no_user = 1, +.init = i440fx_initfn, +.config_write = i440fx_write_config_xen, +},{ .qdev.name= PIIX3, .qdev.desc= ISA bridge, .qdev.size= sizeof(PIIX3State), diff --git a/hw/xen.h b/hw/xen.h
[Qemu-devel] [PATCH V15 13/18] configure: Always use 64bits target physical addresses with xen enabled.
From: Anthony PERARD <anthony.per...@citrix.com>

With MapCache, we can handle a 64-bit target even with a 32-bit
host/QEMU, so target_phys_addr_t needs to be 64 bits wide.

Signed-off-by: Anthony PERARD <anthony.per...@citrix.com>
Acked-by: Alexander Graf <ag...@suse.de>
---
 configure |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/configure b/configure
index 6fc2bdd..3ba6401 100755
--- a/configure
+++ b/configure
@@ -3298,6 +3298,7 @@ echo "TARGET_ABI_DIR=$TARGET_ABI_DIR" >> $config_target_mak
 case "$target_arch2" in
   i386|x86_64)
     if test "$xen" = "yes" -a "$target_softmmu" = "yes" ; then
+      target_phys_bits=64
       echo "CONFIG_XEN=y" >> $config_target_mak
       if test "$cpu" = "i386" -o "$cpu" = "x86_64"; then
         echo "CONFIG_XEN_MAPCACHE=y" >> $config_target_mak
-- 
1.7.2.5
[Qemu-devel] [PATCH V15 12/18] Introduce qemu_put_ram_ptr
From: Anthony PERARD anthony.per...@citrix.com This function allows to unlock a ram_ptr give by qemu_get_ram_ptr. After a call to qemu_put_ram_ptr, the pointer may be unmap from QEMU when used with Xen. Signed-off-by: Anthony PERARD anthony.per...@citrix.com Acked-by: Alexander Graf ag...@suse.de --- cpu-common.h |1 + exec.c | 38 +++--- trace-events |3 +++ xen-mapcache.c | 33 + 4 files changed, 72 insertions(+), 3 deletions(-) diff --git a/cpu-common.h b/cpu-common.h index 96c02ae..1d4fdbf 100644 --- a/cpu-common.h +++ b/cpu-common.h @@ -56,6 +56,7 @@ void *qemu_get_ram_ptr(ram_addr_t addr); /* Same but slower, to use for migration, where the order of * RAMBlocks must not change. */ void *qemu_safe_ram_ptr(ram_addr_t addr); +void qemu_put_ram_ptr(void *addr); /* This should not be used by devices. */ int qemu_ram_addr_from_host(void *ptr, ram_addr_t *ram_addr); ram_addr_t qemu_ram_addr_from_host_nofail(void *ptr); diff --git a/exec.c b/exec.c index 19707c5..063d2f6 100644 --- a/exec.c +++ b/exec.c @@ -3100,6 +3100,27 @@ void *qemu_safe_ram_ptr(ram_addr_t addr) return NULL; } +void qemu_put_ram_ptr(void *addr) +{ +trace_qemu_put_ram_ptr(addr); + +if (xen_mapcache_enabled()) { +RAMBlock *block; + +QLIST_FOREACH(block, ram_list.blocks, next) { +if (addr == block-host) { +break; +} +} +if (block block-host) { +xen_unmap_block(block-host, block-length); +block-host = NULL; +} else { +qemu_map_cache_unlock(addr); +} +} +} + int qemu_ram_addr_from_host(void *ptr, ram_addr_t *ram_addr) { RAMBlock *block; @@ -3815,6 +3836,7 @@ void cpu_physical_memory_rw(target_phys_addr_t addr, uint8_t *buf, cpu_physical_memory_set_dirty_flags( addr1, (0xff ~CODE_DIRTY_FLAG)); } +qemu_put_ram_ptr(ptr); } } else { if ((pd ~TARGET_PAGE_MASK) IO_MEM_ROM @@ -3842,9 +3864,9 @@ void cpu_physical_memory_rw(target_phys_addr_t addr, uint8_t *buf, } } else { /* RAM case */ -ptr = qemu_get_ram_ptr(pd TARGET_PAGE_MASK) + -(addr ~TARGET_PAGE_MASK); -memcpy(buf, ptr, l); +ptr = qemu_get_ram_ptr(pd 
TARGET_PAGE_MASK); +memcpy(buf, ptr + (addr ~TARGET_PAGE_MASK), l); +qemu_put_ram_ptr(ptr); } } len -= l; @@ -3885,6 +3907,7 @@ void cpu_physical_memory_write_rom(target_phys_addr_t addr, /* ROM/RAM case */ ptr = qemu_get_ram_ptr(addr1); memcpy(ptr, buf, l); +qemu_put_ram_ptr(ptr); } len -= l; buf += l; @@ -4026,6 +4049,15 @@ void cpu_physical_memory_unmap(void *buffer, target_phys_addr_t len, access_len -= l; } } +if (xen_mapcache_enabled()) { +uint8_t *buffer1 = buffer; +uint8_t *end_buffer = buffer + len; + +while (buffer1 end_buffer) { +qemu_put_ram_ptr(buffer1); +buffer1 += TARGET_PAGE_SIZE; +} +} return; } if (is_write) { diff --git a/trace-events b/trace-events index d703347..a00b63c 100644 --- a/trace-events +++ b/trace-events @@ -371,3 +371,6 @@ disable qemu_remap_bucket(uint64_t index) index %#PRIx64 disable qemu_map_cache_return(void* ptr) %p disable xen_map_block(uint64_t phys_addr, uint64_t size) %#PRIx64, size %#PRIx64 disable xen_unmap_block(void* addr, unsigned long size) %p, size %#lx + +# exec.c +disable qemu_put_ram_ptr(void* addr) %p diff --git a/xen-mapcache.c b/xen-mapcache.c index 2ca18ce..349cc62 100644 --- a/xen-mapcache.c +++ b/xen-mapcache.c @@ -196,6 +196,39 @@ uint8_t *qemu_map_cache(target_phys_addr_t phys_addr, target_phys_addr_t size, u return mapcache-last_address_vaddr + address_offset; } +void qemu_map_cache_unlock(void *buffer) +{ +MapCacheEntry *entry = NULL, *pentry = NULL; +MapCacheRev *reventry; +target_phys_addr_t paddr_index; +int found = 0; + +QTAILQ_FOREACH(reventry, mapcache-locked_entries, next) { +if (reventry-vaddr_req == buffer) { +paddr_index = reventry-paddr_index; +found = 1; +break; +} +} +if (!found) { +return; +} +QTAILQ_REMOVE(mapcache-locked_entries, reventry, next); +qemu_free(reventry); + +entry = mapcache-entry[paddr_index % mapcache-nr_buckets]; +while (entry entry-paddr_index != paddr_index) { +pentry = entry; +entry = entry-next; +} +if (!entry) { +return; +} +if (entry-lock 0) { +entry-lock--; +} +} + 
ram_addr_t
[Qemu-devel] [PATCH V15 15/18] vl.c: Introduce getter for shutdown_requested and reset_requested.
From: Anthony PERARD <anthony.per...@citrix.com>

Introduce two functions, qemu_shutdown_requested_get and
qemu_reset_requested_get, which return the value of
shutdown_requested/reset_requested without resetting it.

Signed-off-by: Anthony PERARD <anthony.per...@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabell...@eu.citrix.com>
Acked-by: Alexander Graf <ag...@suse.de>
---
 sysemu.h |    2 ++
 vl.c     |   10 ++++++++++
 2 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/sysemu.h b/sysemu.h
index b0296a0..7e70daa 100644
--- a/sysemu.h
+++ b/sysemu.h
@@ -42,6 +42,8 @@ void qemu_system_shutdown_request(void);
 void qemu_system_powerdown_request(void);
 void qemu_system_debug_request(void);
 void qemu_system_vmstop_request(int reason);
+int qemu_shutdown_requested_get(void);
+int qemu_reset_requested_get(void);
 int qemu_shutdown_requested(void);
 int qemu_reset_requested(void);
 int qemu_powerdown_requested(void);
diff --git a/vl.c b/vl.c
index 3ed6855..bffba69 100644
--- a/vl.c
+++ b/vl.c
@@ -1161,6 +1161,16 @@ static int powerdown_requested;
 static int debug_requested;
 static int vmstop_requested;

+int qemu_shutdown_requested_get(void)
+{
+    return shutdown_requested;
+}
+
+int qemu_reset_requested_get(void)
+{
+    return reset_requested;
+}
+
 int qemu_shutdown_requested(void)
 {
     int r = shutdown_requested;
-- 
1.7.2.5
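The difference between the new getters and the existing functions is the difference between peeking at a flag and consuming it. The following standalone sketch uses hypothetical names (the real functions are the qemu_*_requested ones in vl.c), and it assumes, as the commit message implies, that the existing accessor clears the flag after reading it.

```c
#include <assert.h>

static int shutdown_requested;

/* Raise the request, as a device model or monitor command might. */
static void system_shutdown_request(void)
{
    shutdown_requested = 1;
}

/* Peek: report the flag and leave it untouched. */
static int shutdown_requested_get(void)
{
    return shutdown_requested;
}

/* Consume: report the flag and clear it, so the request is seen once. */
static int shutdown_requested_take(void)
{
    int r = shutdown_requested;

    shutdown_requested = 0;
    assert(shutdown_requested_get() == 0); /* flag is clear after a take */
    return r;
}
```

A caller that only wants to observe a pending request, without hiding it from the main loop that will act on it later, needs the peeking variant.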
[Qemu-devel] [PATCH V15 14/18] pci: Use of qemu_put_ram_ptr in pci_add_option_rom.
From: John Baboval <john.babo...@virtualcomputer.com>

Prevent a deadlock caused by leaving a map cache bucket locked by the
preceding qemu_get_ram_ptr() call.

Signed-off-By: John Baboval <john.babo...@virtualcomputer.com>
Signed-off-by: Anthony PERARD <anthony.per...@citrix.com>
---
 hw/pci.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/hw/pci.c b/hw/pci.c
index 6b577e1..2b24dd4 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -1897,6 +1897,8 @@ static int pci_add_option_rom(PCIDevice *pdev, bool is_default_rom)
         pci_patch_ids(pdev, ptr, size);
     }

+    qemu_put_ram_ptr(ptr);
+
     pci_register_bar(pdev, PCI_ROM_SLOT, size,
                      0, pci_map_option_rom);

-- 
1.7.2.5
[Qemu-devel] [PATCH V15 16/18] xen: Initialize event channels and io rings
From: Arun Sharma arun.sha...@intel.com Open and bind event channels; map ioreq and buffered ioreq rings. Signed-off-by: Arun Sharma arun.sha...@intel.com Signed-off-by: Anthony PERARD anthony.per...@citrix.com Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com Acked-by: Alexander Graf ag...@suse.de --- hw/xen_common.h |2 + xen-all.c | 417 +++ 2 files changed, 419 insertions(+), 0 deletions(-) diff --git a/hw/xen_common.h b/hw/xen_common.h index dd3e896..a1958a0 100644 --- a/hw/xen_common.h +++ b/hw/xen_common.h @@ -107,4 +107,6 @@ static inline int xc_fd(xc_interface *xen_xc) } #endif +void destroy_hvm_domain(void); + #endif /* QEMU_HW_XEN_COMMON_H */ diff --git a/xen-all.c b/xen-all.c index cb01ab9..e849a38 100644 --- a/xen-all.c +++ b/xen-all.c @@ -6,6 +6,8 @@ * */ +#include sys/mman.h + #include hw/pci.h #include hw/xen_common.h #include hw/xen_backend.h @@ -13,6 +15,58 @@ #include xen-mapcache.h #include trace.h +#include xen/hvm/ioreq.h +#include xen/hvm/params.h + +//#define DEBUG_XEN + +#ifdef DEBUG_XEN +#define DPRINTF(fmt, ...) \ +do { fprintf(stderr, xen: fmt, ## __VA_ARGS__); } while (0) +#else +#define DPRINTF(fmt, ...) 
\ +do { } while (0) +#endif + +/* Compatibility with older version */ +#if __XEN_LATEST_INTERFACE_VERSION__ 0x0003020a +static inline uint32_t xen_vcpu_eport(shared_iopage_t *shared_page, int i) +{ +return shared_page-vcpu_iodata[i].vp_eport; +} +static inline ioreq_t *xen_vcpu_ioreq(shared_iopage_t *shared_page, int vcpu) +{ +return shared_page-vcpu_iodata[vcpu].vp_ioreq; +} +# define FMT_ioreq_size PRIx64 +#else +static inline uint32_t xen_vcpu_eport(shared_iopage_t *shared_page, int i) +{ +return shared_page-vcpu_ioreq[i].vp_eport; +} +static inline ioreq_t *xen_vcpu_ioreq(shared_iopage_t *shared_page, int vcpu) +{ +return shared_page-vcpu_ioreq[vcpu]; +} +# define FMT_ioreq_size u +#endif + +#define BUFFER_IO_MAX_DELAY 100 + +typedef struct XenIOState { +shared_iopage_t *shared_page; +buffered_iopage_t *buffered_io_page; +QEMUTimer *buffered_io_timer; +/* the evtchn port for polling the notification, */ +evtchn_port_t *ioreq_local_port; +/* the evtchn fd for polling */ +XenEvtchn xce_handle; +/* which vcpu we are serving */ +int send_vcpu; + +Notifier exit; +} XenIOState; + /* Xen specific function for piix pci */ int xen_pci_slot_get_pirq(PCIDevice *pci_dev, int irq_num) @@ -133,8 +187,304 @@ void xen_vcpu_init(void) } } +/* get the ioreq packets from share mem */ +static ioreq_t *cpu_get_ioreq_from_shared_memory(XenIOState *state, int vcpu) +{ +ioreq_t *req = xen_vcpu_ioreq(state-shared_page, vcpu); + +if (req-state != STATE_IOREQ_READY) { +DPRINTF(I/O request not ready: +%x, ptr: %x, port: %PRIx64, +data: %PRIx64, count: % FMT_ioreq_size , size: % FMT_ioreq_size \n, +req-state, req-data_is_ptr, req-addr, +req-data, req-count, req-size); +return NULL; +} + +xen_rmb(); /* see IOREQ_READY /then/ read contents of ioreq */ + +req-state = STATE_IOREQ_INPROCESS; +return req; +} + +/* use poll to get the port notification */ +/* ioreq_vec--out,the */ +/* retval--the number of ioreq packet */ +static ioreq_t *cpu_get_ioreq(XenIOState *state) +{ +int i; +evtchn_port_t 
port; + +port = xc_evtchn_pending(state-xce_handle); +if (port != -1) { +for (i = 0; i smp_cpus; i++) { +if (state-ioreq_local_port[i] == port) { +break; +} +} + +if (i == smp_cpus) { +hw_error(Fatal error while trying to get io event!\n); +} + +/* unmask the wanted port again */ +xc_evtchn_unmask(state-xce_handle, port); + +/* get the io packet from shared memory */ +state-send_vcpu = i; +return cpu_get_ioreq_from_shared_memory(state, i); +} + +/* read error or read nothing */ +return NULL; +} + +static uint32_t do_inp(pio_addr_t addr, unsigned long size) +{ +switch (size) { +case 1: +return cpu_inb(addr); +case 2: +return cpu_inw(addr); +case 4: +return cpu_inl(addr); +default: +hw_error(inp: bad size: %04FMT_pioaddr %lx, addr, size); +} +} + +static void do_outp(pio_addr_t addr, +unsigned long size, uint32_t val) +{ +switch (size) { +case 1: +return cpu_outb(addr, val); +case 2: +return cpu_outw(addr, val); +case 4: +return cpu_outl(addr, val); +default: +hw_error(outp: bad size: %04FMT_pioaddr %lx, addr, size); +} +} + +static void cpu_ioreq_pio(ioreq_t *req) +{ +int i, sign; + +sign = req-df ? -1 : 1; + +if (req-dir == IOREQ_READ) { +if (!req-data_is_ptr) { +
[Qemu-devel] [PATCH V15 18/18] xen: Add Xen hypercall for sleep state in the cmos_s3 callback.
From: Anthony PERARD anthony.per...@citrix.com Signed-off-by: Anthony PERARD anthony.per...@citrix.com --- hw/pc_piix.c |6 +- hw/xen.h |1 + xen-all.c|9 + xen-stub.c |4 4 files changed, 19 insertions(+), 1 deletions(-) diff --git a/hw/pc_piix.c b/hw/pc_piix.c index 62cdf71..9a22a8a 100644 --- a/hw/pc_piix.c +++ b/hw/pc_piix.c @@ -179,7 +179,11 @@ static void pc_init1(ram_addr_t ram_size, if (pci_enabled acpi_enabled) { i2c_bus *smbus; -cmos_s3 = qemu_allocate_irqs(pc_cmos_set_s3_resume, rtc_state, 1); +if (!xen_enabled()) { +cmos_s3 = qemu_allocate_irqs(pc_cmos_set_s3_resume, rtc_state, 1); +} else { +cmos_s3 = qemu_allocate_irqs(xen_cmos_set_s3_resume, rtc_state, 1); +} smi_irq = qemu_allocate_irqs(pc_acpi_smi_interrupt, first_cpu, 1); /* TODO: Populate SPD eeprom data. */ smbus = piix4_pm_init(pci_bus, piix3_devfn + 3, 0xb100, diff --git a/hw/xen.h b/hw/xen.h index 6245b38..d435ca0 100644 --- a/hw/xen.h +++ b/hw/xen.h @@ -43,6 +43,7 @@ static inline int xen_mapcache_enabled(void) int xen_pci_slot_get_pirq(PCIDevice *pci_dev, int irq_num); void xen_piix3_set_irq(void *opaque, int irq_num, int level); void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int len); +void xen_cmos_set_s3_resume(void *opaque, int irq, int level); qemu_irq *xen_interrupt_controller_init(void); diff --git a/xen-all.c b/xen-all.c index 19c2fe1..0eac202 100644 --- a/xen-all.c +++ b/xen-all.c @@ -9,6 +9,7 @@ #include sys/mman.h #include hw/pci.h +#include hw/pc.h #include hw/xen_common.h #include hw/xen_backend.h @@ -99,6 +100,14 @@ void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int len) } } +void xen_cmos_set_s3_resume(void *opaque, int irq, int level) +{ +pc_cmos_set_s3_resume(opaque, irq, level); +if (level) { +xc_set_hvm_param(xen_xc, xen_domid, HVM_PARAM_ACPI_S_STATE, 3); +} +} + /* Xen Interrupt Controller */ static void xen_set_irq(void *opaque, int irq, int level) diff --git a/xen-stub.c b/xen-stub.c index 8d2fa54..a4f35a1 100644 --- 
a/xen-stub.c +++ b/xen-stub.c @@ -22,6 +22,10 @@ void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int len) { } +void xen_cmos_set_s3_resume(void *opaque, int irq, int level) +{ +} + void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size) { } -- 1.7.2.5
Re: [Qemu-devel] [PATCH v2 0/4]usb: implement Interface Association Descriptor support
On 04/03/11 07:33, Brad Hards wrote: These descriptors are covered in Section 9.6.4 of the USB 3.0 spec, but there is a better description in the Intel IAD whitepaper (www.usb.org/developers/whitepapers/iadclasscode_r10.pdf). The implementation basically introduces the concept of a group of interfaces (with an IAD header), and support for sending it to the device. Queued up in the usb patch queue. thanks, Gerd
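As a rough illustration of what an IAD header looks like on the wire, here is a minimal sketch. The field names follow the USB specification's descriptor layout (eight one-byte fields, descriptor type 0x0B); the macro name and the helpers iad_make() and iad_write() are made up for this example and are not taken from the patch series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define USB_DT_INTERFACE_ASSOC 0x0B  /* descriptor type of an IAD */

/* Interface Association Descriptor. Every field is a single byte, so
 * the struct has no padding and matches the 8-byte wire layout. */
struct usb_iad {
    uint8_t bLength;            /* always 8 */
    uint8_t bDescriptorType;    /* USB_DT_INTERFACE_ASSOC */
    uint8_t bFirstInterface;    /* first interface number in the group */
    uint8_t bInterfaceCount;    /* number of contiguous interfaces */
    uint8_t bFunctionClass;
    uint8_t bFunctionSubClass;
    uint8_t bFunctionProtocol;
    uint8_t iFunction;          /* string descriptor index, 0 if none */
};

/* Hypothetical helper: build the header for a group of interfaces. */
struct usb_iad iad_make(uint8_t first, uint8_t count, uint8_t cls,
                        uint8_t subcls, uint8_t proto, uint8_t str_index)
{
    struct usb_iad iad = {
        .bLength = sizeof(struct usb_iad),
        .bDescriptorType = USB_DT_INTERFACE_ASSOC,
        .bFirstInterface = first,
        .bInterfaceCount = count,
        .bFunctionClass = cls,
        .bFunctionSubClass = subcls,
        .bFunctionProtocol = proto,
        .iFunction = str_index,
    };
    return iad;
}

/* Hypothetical helper: copy the header into a descriptor buffer.
 * Returns the number of bytes written, or 0 if the buffer is too small. */
size_t iad_write(uint8_t *dest, size_t len, const struct usb_iad *iad)
{
    assert(dest != NULL && iad != NULL);
    if (len < sizeof(*iad)) {
        return 0;
    }
    memcpy(dest, iad, sizeof(*iad));
    return sizeof(*iad);
}
```

In a configuration descriptor the IAD would be emitted immediately before the first interface descriptor of the group it describes.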
Re: [Qemu-devel] Allow ARMv7M to be started without a kernel
On Thu, May 5, 2011 at 19:56, Peter Maydell peter.mayd...@linaro.org wrote: On 5 May 2011 09:23, Ben Leslie be...@benno.id.au wrote: FWIW, the reason why I'm not using -kernel is that the current way the armv7m code works, it expects the provided kernel to be a full flash image including appropriate vector table, whereas right now I just want to debug some stand-alone code, not the full system, which the above gdb approach works perfectly for. It would probably be better for the -kernel option to honour the entry point in the ELF file rather than insisting on full reset (and to try to load the reset SP from the vector table but not insist on that working). That is, we should support both load this ELF image which is a full system image with a vector table and load this ELF image which is just a bare-metal (possibly semihosting) application. Ideally that should work too, well in fact it would be more important to get that working, and I guess the utility of my suggestion is significantly lowered if that worked correctly. I still think it is somewhat nice that the simulator target can work just like a blank board though, and then connect GDB to it either directly for the sim or via JTAG for a real board. Then it is the same work flow for simulated or real hardware. (And the code change is just one if statement). I guess the 'against' case is that if you start up like that, and aren't running GDB it may be slightly confusing for the user. However you would just get the same behaviour as you get now when you pass in an image without a vector table. (As an aside it would probably be good to implement the lockup mechanism of ARMv7M so that with an invalid reset vector it goes into lockup and something useful could be displayed on stderr.) 
The combination of v7M reset with image loading and the possibility of a debugger altering the pc/sp while the core is in reset is a bit complicated, though :-) I'm not sure if it is that complicated, I would assume that in armv7_reset after calling cpu_reset the PC is set to the entry point in a similar way to how it is done in arm_boot.c. I don't think GDB gets a chance to touch the registers until after CPU reset so I'm not sure if that will cause an issue. (Another aside: ARMv7M is meant to cause a Hard Fault if something tries to set the 'thumb' bit to zero, which would also help avoid some of the silly problems I had when getting my code to run.) As an aside: I think QEMU should have an option which is just load a plain ELF or raw binary, with no funny Linux-kernel-specific behaviour rather than overloading -kernel to mean if it's a raw image it's Linux and if it's an ELF file it's not. I agree, as far as I can tell right now (at least on arm), kernel can mean: 1: A Linux kernel in linux kernel format. 2: An ELF file (which may or may not have entry honoured) 3: A raw binary. OK, I think I may have forked this conversation with too many asides. Unless there is a better suggestion I'll put together two patches, one that allows -kernel to go unspecified, and a second that honours the ELF entry point after reset. (Unless someone else wants to do it first!) Cheers, Ben
Re: [Qemu-devel] Bug #757654: UHCI fails to signal stall response patch
diff --git a/hw/usb-uhci.c b/hw/usb-uhci.c index 346db3e..a51d89b 100644 --- a/hw/usb-uhci.c +++ b/hw/usb-uhci.c @@ -732,11 +732,21 @@ out: case USB_RET_STALL: td->ctrl |= TD_CTRL_STALL; td->ctrl &= ~TD_CTRL_ACTIVE; +s->status |= UHCI_STS_USBERR; Just this line should be enough. case USB_RET_BABBLE: td->ctrl |= TD_CTRL_BABBLE | TD_CTRL_STALL; td->ctrl &= ~TD_CTRL_ACTIVE; +s->status |= UHCI_STS_USBERR; Likewise. Tried that? cheers, Gerd
Re: [Qemu-devel] [PATCH 0/7] pci: initialize ids in pci common code
On Fri, Apr 08, 2011 at 09:52:59PM +0900, Isaku Yamahata wrote: vendor id/device id... in configuration space are read-only registers which are commonly defined for all pci devices. So initialize them in common code and it simplifies the initialization a bit. I converted some of them. If this is the right direction, I'll convert the remaining devices. So I agree about device vendor id and revision but not header type: devices ideally should supply device type (bridge or not) and we will fill it in correctly. Isaku Yamahata (7): pci: move ids of config space into PCIDeviceInfo usb-uhci: convert to PCIDEviceInfo to initialize ids eepro100: convert to PCIDeviceInfo to initialize ids dec_pci: convert to PCIDeviceInfo to initialize ids apb_pci: convert to PCIDeviceInfo to initialize ids ide/piix: convert to PCIDeviceInfo to initialize ids vmware_vga.c: convert to PCIDeviceInfo to initialize ids hw/apb_pci.c| 13 +-- hw/dec_pci.c| 27 +++- hw/eepro100.c | 60 -- hw/ide/piix.c | 29 +++--- hw/pci.c| 46 + hw/pci.h|9 hw/usb-uhci.c | 38 ++ hw/vmware_vga.c | 13 +-- 8 files changed, 107 insertions(+), 128 deletions(-)
Re: [Qemu-devel] [PATCH 1/7] pci: move ids of config space into PCIDeviceInfo
So the benefit as I see it would be that qemu will be able to list supported devices by vendor id etc. lspci has a database of readable vendor/device strings, maybe we can import that. And we could sort by device type, that's also helpful. header type/prog interface - not so sure. On Fri, Apr 08, 2011 at 09:53:00PM +0900, Isaku Yamahata wrote: diff --git a/hw/pci.h b/hw/pci.h index c6a6eb6..f945798 100644 --- a/hw/pci.h +++ b/hw/pci.h @@ -433,6 +433,15 @@ typedef struct { PCIConfigReadFunc *config_read; PCIConfigWriteFunc *config_write; +uint16_t vendor_id; +uint16_t device_id; +uint8_t revision; This is good. +uint8_t prog_interface; Not sure about this one. What is wrong +uint16_t class_id; This is good. +uint8_t header_type; We have a flag for bridge already, right? Let's fill this in automatically then. +uint16_t subsystem_vendor_id; /* only for header type = 0 */ +uint16_t subsystem_id; /* only for header type = 0 */ add an assert then? + /* * pci-to-pci bridge or normal device. * This doesn't mean pci host switch. -- 1.7.1.1
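To make the review comments concrete, here is one way the common code could fill in the IDs, derive the header type from a bridge flag, and assert that subsystem IDs are only given for header type 0. This is only a sketch under stated assumptions: struct pci_ids, pci_fill_ids() and the CFG_* names are invented for the example and are not the PCIDeviceInfo layout from the patch. The register offsets are the standard PCI configuration header offsets.

```c
#include <assert.h>
#include <stdint.h>

/* Standard PCI configuration space offsets (type 0 header). */
enum {
    CFG_VENDOR_ID        = 0x00,
    CFG_DEVICE_ID        = 0x02,
    CFG_REVISION         = 0x08,
    CFG_CLASS_DEVICE     = 0x0a,
    CFG_HEADER_TYPE      = 0x0e,
    CFG_SUBSYS_VENDOR_ID = 0x2c,
    CFG_SUBSYS_ID        = 0x2e,
};

/* Hypothetical per-device description, loosely modelled on the
 * fields discussed in the review. */
struct pci_ids {
    uint16_t vendor_id;
    uint16_t device_id;
    uint8_t  revision;
    uint16_t class_id;
    int      is_bridge;            /* header type is derived from this */
    uint16_t subsystem_vendor_id;  /* only for header type 0 */
    uint16_t subsystem_id;         /* only for header type 0 */
};

/* Configuration space is little-endian. */
static void put_le16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)(v >> 8);
}

void pci_fill_ids(uint8_t *config, const struct pci_ids *ids)
{
    /* A bridge (header type 1) has no subsystem ID registers. */
    assert(!ids->is_bridge ||
           (ids->subsystem_vendor_id == 0 && ids->subsystem_id == 0));

    put_le16(config + CFG_VENDOR_ID, ids->vendor_id);
    put_le16(config + CFG_DEVICE_ID, ids->device_id);
    config[CFG_REVISION] = ids->revision;
    put_le16(config + CFG_CLASS_DEVICE, ids->class_id);

    /* Keep bit 7 (multifunction) and fill in the layout bits. */
    config[CFG_HEADER_TYPE] = (uint8_t)((config[CFG_HEADER_TYPE] & 0x80) |
                                        (ids->is_bridge ? 0x01 : 0x00));

    if (!ids->is_bridge) {
        put_le16(config + CFG_SUBSYS_VENDOR_ID, ids->subsystem_vendor_id);
        put_le16(config + CFG_SUBSYS_ID, ids->subsystem_id);
    }
}
```

With this shape a device never supplies header_type itself, which is the direction suggested in the review.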
Re: [Qemu-devel] virtio-scsi spec, first public draft
Virtqueues 0..n-1:one requestq per target n:control transmitq n+1:control receiveq 1 requestq per target makes it harder to support large numbers or dynamic targets. I chose 1 requestq per target so that, with MSI-X support, each target can be associated to one MSI-X vector. If you want a large number of units, you can subdivide targets into logical units, or use multiple adapters if you prefer. We can have 20-odd SCSI adapters, each with 65534 targets. I think we're way beyond the practical limits even before LUN support is added to QEMU. For comparison, Windows supports up to 1024 targets per adapter (split across 8 channels); IBM vSCSI provides up to 128; VMware supports a maximum of 15 SCSI targets per adapter and 4 adapters per VM. You mention detaching targets so is there a way to add a target? Yes, just add the first LUN to it (it will be LUN0 which must be there anyway). The target's existence will be reported on the control receiveq. Feature bits VIRTIO_SCSI_F_INOUT - Whether a single request can include both read-only and write-only data buffers. Why make this an optional feature? Because QEMU does not support it so far. The type identifies the remaining fields. The value VIRTIO_SCSI_T_BARRIER can be ORed in the type as well. This bit indicates that this request acts as a barrier and that all preceding requests must be complete before this one, and all following requests must not be started until this is complete. Note that a barrier does not flush caches in the underlying backend device in host, and thus does not serve as data consistency guarantee. The driver must send a SYNCHRONIZE CACHE command to flush the host cache. Why are these barrier semantics needed? They are a convenience that I took from virtio-blk. They are not needed in upstream Linux (which uses flush/FUA instead), so I'm not wedded to it, but they may be useful if virtio-scsi is ever ported to the stable 2.6.32 series. 
The type is VIRTIO_SCSI_T_LUN_INFO, possibly with the VIRTIO_SCSI_T_BARRIER bit ORed in. Did you mean type is VIRTIO_SCSI_T_TMF? Yes, of course. Will fix. The subtype and lun field are filled in by the driver, the additional and response field is filled in by the device. Unknown LUNs are ignored; also, the lun field is ignored for the I_T NEXUS RESET command. In/out buffers must be separate in virtio so I think it makes sense to split apart a struct virtio_scsi_tmf_req and struct virtio_scsi_tmf_resp. Here I was using the same standard used by the existing virtio specs, which place both kinds of buffers in the same struct. I am fine with separating the two (and similarly for the other requests), but I'd rather not make virtio-scsi the only different one. VIRTIO_SCSI_T_TMF_LOGICAL_UNIT_DETACH asks the device to make the logical unit (and the target as well if this is the last logical unit) disappear. It takes an I_T_L nexus. This non-standard TMF should be used in response to a host request to shutdown a target or LUN, after having placed the LUN in a clean state. Do we need an initiator-driven detach? If the initiator doesn't care about a device anymore it simply doesn't communicate with it or allocate resources for it. I think the real detach should be performed on the target side (e.g. QEMU monitor command removes the target from the SCSI bus). So I guess I'm asking what is the real use-case for this function? It is not really an initiator-driven detach, it is the initiator's acknowledgement of a target-driven detach. The target needs to know when the initiator is ready so that it can free resources attached to the logical unit (this is particularly important if the LU is a physical disk and it is opened with exclusive access). 
- SCSI command #define VIRTIO_SCSI_T_CMD 1 struct virtio_scsi_req_cmd { u32 type; u32 ioprio; u8 lun[8]; u64 id; u32 num_dataout, num_datain; char cdb[]; char data[][num_dataout+num_datain]; u8 sense[]; u32 sense_len; u32 residual; u8 status; u8 response; }; We don't need explicit buffer size fields since virtqueue elements include sizes. For example: size_t sense_len = elem-in_sg[sense_idx].iov_len; memcpy(elem-in_sg[sense_idx].iov_buf, sense_buf, MIN(sense_len, sizeof(sense_buf))); I think only the total length is written in the used ring, letting the driver figure out the number of bytes written to the sense buffer is harder than just writing it. Paolo
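The two positions can coexist: the device can bound the copy by the size of the guest-supplied buffer and still report the number of sense bytes explicitly, so the driver does not have to work it out from the total length in the used ring. A minimal sketch follows; struct sg_buf and copy_sense() are hypothetical names for this example and are not part of the draft spec or of QEMU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one device-writable buffer of a request. */
struct sg_buf {
    void  *base;
    size_t len;
};

/* Copy at most dst->len bytes of sense data into the guest buffer and
 * record how many bytes were actually written in *sense_len. */
size_t copy_sense(const struct sg_buf *dst, const uint8_t *sense,
                  size_t sense_size, uint32_t *sense_len)
{
    size_t n = sense_size < dst->len ? sense_size : dst->len;

    assert(dst->base != NULL && sense != NULL && sense_len != NULL);
    memcpy(dst->base, sense, n);
    *sense_len = (uint32_t)n;   /* explicit length for the driver */
    return n;
}
```

The driver then reads sense_len instead of inferring it from buffer sizes.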
Re: [Qemu-devel] virtio-scsi spec, first public draft
#define VIRTIO_SCSI_T_BARRIER 0x8000 The type identifies the remaining fields. The value VIRTIO_SCSI_T_BARRIER can be ORed in the type as well. This bit indicates that this request acts as a barrier and that all preceding requests must be complete before this one, and all following requests must not be started until this is complete. Note that a barrier does not flush caches in the underlying backend device in host, and thus does not serve as data consistency guarantee. The driver must send a SYNCHRONIZE CACHE command to flush the host cache. Please don't repeat the barrier mistake done in the Xen and virtio-blk/lguest protocols. It really doesn't make sense to put this kind of strict ordering in. If we really want ordering let's do it using SCSI ordered tags at least to use a standard implementation. And SCSI already supports the FUA bit to force a write to be writethrough, even if the QEMU SCSI code doesn't implement it. Let's just make virtio-scsi purely a transport and not add magic features to it.
Re: [Qemu-devel] virtio-scsi spec, first public draft
On 05/05/2011 02:50 PM, Christoph Hellwig wrote: Please don't repeat the barrier mistake done in the Xen and virtio-blk/lguest protocols. It really doesn't make sense to put this kind of strict ordering in. If we really want ordering let's do it using SCSI ordered tags at least to use a standard implementation. And SCSI already supports the FUA bit to force a write to be writethrough, even if the QEMU SCSI code doesn't implement it. You're right, I reviewed the history of barriers and you can consider this gone. Paolo
Re: [Qemu-devel] [RFC] darwin: work around sigfd
On 05/05/2011 11:36 AM, Alexander Graf wrote: When running qemu-system on Darwin, the vcpu processes guest code, but I don't get to see anything on the cocoa screen. Out of curiosity, does it work with iothread? Paolo
Re: [Qemu-devel] [PATCH v2 04/10] eepro100: Pad received short frames
On Sat, Apr 30, 2011 at 10:40:07PM +0200, Stefan Weil wrote: QEMU sends frames smaller than 60 bytes to ethernet nics. This should be fixed in the networking code because normally such frames are rejected by real NICs and their emulations. To avoid this behaviour, other NIC emulations pad received frames. This patch enables this workaround for eepro100, too. All related code is marked with CONFIG_PAD_RECEIVED_FRAMES, so emulation of the correct handling for short frames can be restored as soon as QEMU's networking code is fixed. Signed-off-by: Stefan Weil w...@mail.berlios.de Applied, I tweaked the comment a bit as we don't intend to change it in qemu. But go ahead and keep the ifdef around if you like. --- hw/eepro100.c | 25 - 1 files changed, 24 insertions(+), 1 deletions(-) diff --git a/hw/eepro100.c b/hw/eepro100.c index 82c6369..2104bc0 100644 --- a/hw/eepro100.c +++ b/hw/eepro100.c @@ -48,6 +48,14 @@ #include eeprom93xx.h #include sysemu.h +/* QEMU sends frames smaller than 60 bytes to ethernet nics. + * This should be fixed in the networking code because normally + * such frames are rejected by real nics and their emulations. + * To avoid this behaviour, other nic emulations pad received + * frames. The following definition enables this workaround for + * eepro100, too. */ +#define CONFIG_PAD_RECEIVED_FRAMES + #define KiB 1024 /* Debug EEPRO100 card. 
*/ @@ -1656,19 +1664,32 @@ static ssize_t nic_receive(VLANClientState *nc, const uint8_t * buf, size_t size */ EEPRO100State *s = DO_UPCAST(NICState, nc, nc)-opaque; uint16_t rfd_status = 0xa000; +#if defined(CONFIG_PAD_RECEIVED_FRAMES) +uint8_t min_buf[60]; +#endif static const uint8_t broadcast_macaddr[6] = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff }; +#if defined(CONFIG_PAD_RECEIVED_FRAMES) +/* Pad to minimum Ethernet frame length */ +if (size sizeof(min_buf)) { +memcpy(min_buf, buf, size); +memset(min_buf[size], 0, sizeof(min_buf) - size); +buf = min_buf; +size = sizeof(min_buf); +} +#endif + if (s-configuration[8] 0x80) { /* CSMA is disabled. */ logout(%p received while CSMA is disabled\n, s); return -1; +#if !defined(CONFIG_PAD_RECEIVED_FRAMES) } else if (size 64 (s-configuration[7] BIT(0))) { /* Short frame and configuration byte 7/0 (discard short receive) set: * Short frame is discarded */ logout(%p received short frame (%zu byte)\n, s, size); s-statistics.rx_short_frame_errors++; -#if 0 return -1; #endif } else if ((size MAX_ETH_FRAME_SIZE + 4) !(s-configuration[18] BIT(3))) { @@ -1747,9 +1768,11 @@ static ssize_t nic_receive(VLANClientState *nc, const uint8_t * buf, size_t size (%zu bytes); data truncated\n, rfd_size, size); size = rfd_size; } +#if !defined(CONFIG_PAD_RECEIVED_FRAMES) if (size 64) { rfd_status |= 0x0080; } +#endif TRACE(OTHER, logout(command 0x%04x, link 0x%08x, addr 0x%08x, size %u\n, rfd_command, rx.link, rx.rx_buf_addr, rfd_size)); stw_phys(s-ru_base + s-ru_offset + offsetof(eepro100_rx_t, status), -- 1.7.2.5
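The padding workaround in the patch boils down to the following standalone sketch. The 60-byte minimum comes from the patch; the function name pad_short_frame() and its signature are made up for this example, whereas the patch does the same thing inline in nic_receive().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MIN_FRAME_LEN 60  /* minimum Ethernet frame length without FCS */

/* If the frame is shorter than MIN_FRAME_LEN, copy it into min_buf,
 * zero-fill the tail and return min_buf with *size updated.
 * Otherwise return the original buffer unchanged. */
const uint8_t *pad_short_frame(const uint8_t *buf, size_t *size,
                               uint8_t min_buf[MIN_FRAME_LEN])
{
    assert(buf != NULL && size != NULL && min_buf != NULL);

    if (*size < MIN_FRAME_LEN) {
        memcpy(min_buf, buf, *size);
        memset(&min_buf[*size], 0, MIN_FRAME_LEN - *size);
        *size = MIN_FRAME_LEN;
        return min_buf;
    }
    return buf;
}
```

Because the padded copy lives in a caller-provided buffer, the caller has to keep min_buf alive for as long as it uses the returned pointer, which is why the patch declares it on the stack of nic_receive().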
Re: [Qemu-devel] [PULL v2] eepro100: Update of patch series (fixes and enhancements)
On Sat, Apr 30, 2011 at 10:40:03PM +0200, Stefan Weil wrote: Hi, this is the second version of a series of patches for eepro100 which mainly fix endianness issues and enhance register access. There was a bug report on qemu-devel recently which is fixed by these enhancements, see http://lists.nongnu.org/archive/html/qemu-devel/2011-03/msg02109.html. Changes in v2: * The 2nd patch is new. * Patches are sorted in a different order. The first 4 patches and the rest are independent, so it's possible to apply parts of the series. * The endianness patch was updated to address the feedback which I received. I still use local functions to access physical memory - mainly because I want to use cpu_physical_memory_read / cpu_physical_memory_write as long as I am not sure whether the alignment requirements for the suggested open coded variant are met. The prefix is e100 - shorter and more up-to-date than eepro100. When I started this device emulation, linux still used a module called eepro100. Today, the only linux module is called e100. So my final goal is renaming all eepro100 to e100. I did not change the patch which adds padding to short received frames, because I'd like to keep the preprocessor statement (CONFIG_PAD_RECEIVED_FRAMES) as some kind of documentation (even if QEMU's network code won't be modified in the near future to fully support a real ethernet emulation). Kind regards, Stefan W. Applied with a small tweak, thanks!
Re: [Qemu-devel] [PATCH v2] MSI: Robust resource release
On Mon, May 02, 2011 at 08:00:47PM +0200, Jan Kiszka wrote: msi_init may fail, so we need to check on uninit if the cap was actually installed. This also avoids that the users need to check. Signed-off-by: Jan Kiszka jan.kis...@siemens.com Applied, thanks! --- hw/ide/ich.c |5 + hw/intel-hda.c |4 +--- hw/msi.c | 12 ++-- 3 files changed, 12 insertions(+), 9 deletions(-) v2: Push cap check into msi_uninit (just like msix), remove check from users. diff --git a/hw/ide/ich.c b/hw/ide/ich.c index a3d475c..8c1ff6c 100644 --- a/hw/ide/ich.c +++ b/hw/ide/ich.c @@ -110,10 +110,7 @@ static int pci_ich9_uninit(PCIDevice *dev) struct AHCIPCIState *d; d = DO_UPCAST(struct AHCIPCIState, card, dev); -if (msi_enabled(dev)) { -msi_uninit(dev); -} - +msi_uninit(dev); qemu_unregister_reset(ahci_reset, d); ahci_uninit(d-ahci); diff --git a/hw/intel-hda.c b/hw/intel-hda.c index b0b1d12..093c4b9 100644 --- a/hw/intel-hda.c +++ b/hw/intel-hda.c @@ -1174,9 +1174,7 @@ static int intel_hda_exit(PCIDevice *pci) { IntelHDAState *d = DO_UPCAST(IntelHDAState, pci, pci); -if (d-msi) { -msi_uninit(d-pci); -} +msi_uninit(d-pci); cpu_unregister_io_memory(d-mmio_addr); return 0; } diff --git a/hw/msi.c b/hw/msi.c index 0cbf89f..3141804 100644 --- a/hw/msi.c +++ b/hw/msi.c @@ -183,9 +183,17 @@ int msi_init(struct PCIDevice *dev, uint8_t offset, void msi_uninit(struct PCIDevice *dev) { -uint16_t flags = pci_get_word(dev-config + msi_flags_off(dev)); -uint8_t cap_size = msi_cap_sizeof(flags); +uint16_t flags; +uint8_t cap_size; + +if (!(dev-cap_present QEMU_PCI_CAP_MSI)) { +return; +} +flags = pci_get_word(dev-config + msi_flags_off(dev)); +cap_size = msi_cap_sizeof(flags); pci_del_capability(dev, PCI_CAP_ID_MSIX, cap_size); +dev-cap_present = ~QEMU_PCI_CAP_MSI; + MSI_DEV_PRINTF(dev, uninit\n); } -- 1.7.1
Re: [Qemu-devel] [PATCH] pci: Add class 0x403 as 'audio controller'
On Mon, May 02, 2011 at 08:01:37PM +0200, Jan Kiszka wrote: Used by HD audio controllers like our intel-hda. Signed-off-by: Jan Kiszka jan.kis...@siemens.com Applied, thanks! --- hw/pci.c |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/hw/pci.c b/hw/pci.c index 6b577e1..87f1b0c 100644 --- a/hw/pci.c +++ b/hw/pci.c @@ -1145,6 +1145,7 @@ static const pci_class_desc pci_class_descriptions[] = { 0x0400, Video controller, video}, { 0x0401, Audio controller, sound}, { 0x0402, Phone}, +{ 0x0403, Audio controller, sound}, { 0x0480, Multimedia controller}, { 0x0500, RAM controller, memory}, { 0x0501, Flash controller, flash}, -- 1.7.1
Re: [Qemu-devel] [RFC] darwin: work around sigfd
On 05.05.2011, at 14:56, Paolo Bonzini wrote: On 05/05/2011 11:36 AM, Alexander Graf wrote: When running qemu-system on Darwin, the vcpu processes guest code, but I don't get to see anything on the cocoa screen. Out of curiosity, does it work with iothread? Seems to work with -nographic, yes. With cocoa it doesn't seem as happy :o. It certainly gets a lot further than without. Alex
Re: [Qemu-devel] Allow ARMv7M to be started without a kernel
On 5 May 2011 13:03, Ben Leslie be...@benno.id.au wrote: I still think it is somewhat nice that the simulator target can work just like a blank board though, and then connect GDB to it either directly for the sim or via JTAG for a real board. Then it is the same work flow for simulated or real hardware. (And the code change is just one if statement). I would personally be happy with that (or with a no really I don't want an image option or something), but I would like consistency across targets rather than an armv7m specific change. (As an aside it would probably be good to implement the lockup mechanism of ARMv7M so that with an invalid reset vector it goes into lockup and something useful could be displayed on stderr.) There are a few known bugs in QEMU's v7m exception model, and yes, not implementing lockup is one of them. Incidentally, technically it's possible to write an architecturally compliant program that puts the core into lockup and then rescues itself via an NMI handler, but gone into lockup would be a useful thing to trace if we had a consistent mechanism for enabling emit trace for events indicating likely OS bugs. I don't have any time personally to look at v7M issues (Linaro's focus is A profile cores) but I'll review patches if anybody submits them. [The best way to think of Lockup is as a continual attempt to execute an instruction until either it works or you get a reset or suitably high priority exception; so you assert the lockup signal for the things the spec says cause lockup, and you deassert lockup if you find you have managed to successfully execute an instruction. 
Implementing this in QEMU is left as an exercise for the reader :-)] The combination of v7M reset with image loading and the possibility of a debugger altering the pc/sp while the core is in reset is a bit complicated, though :-) I'm not sure if it is that complicated, I would assume that in armv7_reset after calling cpu_reset the PC is set to the entry point in a similar way to to how it is done in arm_boot.c I don't think GDB gets a chance to touch the registers until after CPU reset so I'm not sure if that will cause an issue. The trouble is that unlike AR profiles (where the reset PC is a constant value), the reset PC/SP are loaded from memory. So PC/SP aren't actually set when the core goes into reset, but as the first thing that happens when we come out of reset and start doing work. So if QEMU does the load initial PC/SP in reset then (a) this isn't what the hardware does and (b) trying to load an image with a vector table via the debugger won't work (because we read PC from RAM before the debugger wrote to it). If you do the load PC/SP after reset, then any change the user made to PC/SP in gdb gets overridden (unless you take special measures to avoid that). (See also the comment in target-arm/helper.c:cpu_reset() about another case we're not getting right.) I think you need to work out a consistent way everything should behave first, rather than trying to generate patches to make point fixes to cases that are causing problems. (Another aside: ARMv7M is meant to cause a Hard Fault if something tries to set the 'thumb' bit to zero, which would also help avoid some of the silly problems I had when getting my code to run.) Strictly, it should generate a UsageFault (with UFSR.INVSTATE set) when the core next tries to execute an instruction with the T bit clear. (If this happens immediately after reset then the UsageFault will be escalated to HardFault; if your HardFault handler also tries to execute in ARM mode then we go into Lockup.) 
QEMU might not get the right fault status bits, but it should certainly generate some kind of fault because disas_arm_insn() causes an exception to be generated if it's invoked for an M profile core. -- PMM
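For anyone following the reset discussion, the M-profile behaviour can be sketched as follows: on leaving reset the core reads the initial SP from word 0 of the vector table and the reset vector from word 1, and bit 0 of the reset vector supplies the Thumb bit. This is a simplified model, not QEMU code. The struct and function names are invented, the image is assumed to be little-endian with the vector table at offset 0, and the masking of the low bits reflects my reading of the architecture's reset behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct v7m_reset_state {
    uint32_t sp;     /* initial main stack pointer */
    uint32_t pc;     /* address execution starts at */
    int      thumb;  /* T bit taken from bit 0 of the reset vector */
};

static uint32_t load_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the image is too short to contain the
 * first two vector table entries (the "blank board" case). */
int v7m_load_reset_state(const uint8_t *image, size_t len,
                         struct v7m_reset_state *out)
{
    uint32_t vector;

    assert(out != NULL);
    if (image == NULL || len < 8) {
        return -1;
    }
    out->sp = load_le32(image) & ~UINT32_C(3);  /* word 0: initial SP */
    vector = load_le32(image + 4);              /* word 1: reset vector */
    out->pc = vector & ~UINT32_C(1);
    out->thumb = (int)(vector & 1);             /* 0 here faults later */
    return 0;
}
```

Since the load happens when the core comes out of reset, a debugger that writes the vector table into RAM while the core is held in reset is still honoured, which is the ordering problem described above.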
Re: [Qemu-devel] [PATCH v2 3/3] CPUPhysMemoryClient: Batch contiguous addresses when playing catchup
On Tue, May 03, 2011 at 12:36:58PM -0600, Alex Williamson wrote: When a phys memory client registers and we play catchup by walking the page tables, we can make a huge improvement in the number of times the set_memory callback is called by batching contiguous pages together. With a 4G guest, this reduces the number of callbacks at registration from 1048866 to 296. Signed-off-by: Alex Williamson alex.william...@redhat.com --- exec.c | 38 -- 1 files changed, 32 insertions(+), 6 deletions(-) diff --git a/exec.c b/exec.c index bbd5c86..a0678a4 100644 --- a/exec.c +++ b/exec.c @@ -1741,14 +1741,21 @@ static int cpu_notify_migration_log(int enable) return 0; } +struct last_map { +target_phys_addr_t start_addr; +ram_addr_t size; A bit worried that ram_addr_t size might conceivably overflow (it's just a long, could be a 4G ram). Break it out when it fills up? +ram_addr_t phys_offset; +}; + /* The l1_phys_map provides the upper P_L1_BITs of the guest physical * address. Each intermediate table provides the next L2_BITs of guest * physical address space. The number of levels vary based on host and * guest configuration, making it efficient to build the final guest * physical address by seeding the L1 offset and shifting and adding in * each L2 offset as we recurse through them. 
*/ -static void phys_page_for_each_1(CPUPhysMemoryClient *client, - int level, void **lp, target_phys_addr_t addr) +static void phys_page_for_each_1(CPUPhysMemoryClient *client, int level, + void **lp, target_phys_addr_t addr, + struct last_map *map) { int i; @@ -1760,15 +1767,29 @@ static void phys_page_for_each_1(CPUPhysMemoryClient *client, addr <<= L2_BITS + TARGET_PAGE_BITS; for (i = 0; i < L2_SIZE; ++i) { if (pd[i].phys_offset != IO_MEM_UNASSIGNED) { -client->set_memory(client, addr | i << TARGET_PAGE_BITS, - TARGET_PAGE_SIZE, pd[i].phys_offset); +target_phys_addr_t start_addr = addr | i << TARGET_PAGE_BITS; + +if (map->size && +start_addr == map->start_addr + map->size && +pd[i].phys_offset == map->phys_offset + map->size) { + +map->size += TARGET_PAGE_SIZE; +continue; +} else if (map->size) { +client->set_memory(client, map->start_addr, + map->size, map->phys_offset); +} + +map->start_addr = start_addr; +map->size = TARGET_PAGE_SIZE; +map->phys_offset = pd[i].phys_offset; } } } else { void **pp = *lp; for (i = 0; i < L2_SIZE; ++i) { phys_page_for_each_1(client, level - 1, pp + i, - (addr << L2_BITS) | i); + (addr << L2_BITS) | i, map); } } } @@ -1776,9 +1797,14 @@ static void phys_page_for_each_1(CPUPhysMemoryClient *client, static void phys_page_for_each(CPUPhysMemoryClient *client) { int i; +struct last_map map = { 0 }; + Nit: just {} is enough. for (i = 0; i < P_L1_SIZE; ++i) { phys_page_for_each_1(client, P_L1_SHIFT / L2_BITS - 1, - l1_phys_map + i, i); + l1_phys_map + i, i, &map); +} +if (map.size) { +client->set_memory(client, map.start_addr, map.size, map.phys_offset); } }
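The batching idea in the patch can be tried outside QEMU with a sketch like the following. It is a simplification under stated assumptions: the multi-level page table is replaced by a flat array of per-page offsets, and `walk_pages`, `count_cb`, `PAGE_SIZE` and `UNASSIGNED` are hypothetical stand-ins, not QEMU identifiers. Only `struct last_map` and the extend-or-flush logic follow the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE  4096u        /* stand-in for TARGET_PAGE_SIZE */
#define UNASSIGNED UINT64_MAX   /* stand-in for IO_MEM_UNASSIGNED */

struct last_map {
    uint64_t start_addr;
    uint64_t size;
    uint64_t phys_offset;
};

typedef void (*set_memory_fn)(void *opaque, uint64_t start_addr,
                              uint64_t size, uint64_t phys_offset);

/* Walk a flat array of per-page offsets and emit one callback per run. */
static void walk_pages(const uint64_t *phys_offset, size_t npages,
                       set_memory_fn cb, void *opaque)
{
    struct last_map map = { 0 };
    size_t i;

    for (i = 0; i < npages; i++) {
        uint64_t start_addr = (uint64_t)i * PAGE_SIZE;

        if (phys_offset[i] == UNASSIGNED) {
            continue;
        }
        /* Extend the pending run if address and offset are both contiguous. */
        if (map.size &&
            start_addr == map.start_addr + map.size &&
            phys_offset[i] == map.phys_offset + map.size) {
            map.size += PAGE_SIZE;
            continue;
        }
        /* Otherwise flush the pending run, if any, and start a new one. */
        if (map.size) {
            cb(opaque, map.start_addr, map.size, map.phys_offset);
        }
        map.start_addr = start_addr;
        map.size = PAGE_SIZE;
        map.phys_offset = phys_offset[i];
    }
    /* Flush the final run, as the patch does after the outer loop. */
    if (map.size) {
        cb(opaque, map.start_addr, map.size, map.phys_offset);
    }
}

/* Example callback that only counts how often it is invoked. */
static void count_cb(void *opaque, uint64_t start_addr,
                     uint64_t size, uint64_t phys_offset)
{
    (void)start_addr;
    (void)size;
    (void)phys_offset;
    ++*(unsigned *)opaque;
}
```

Feeding it three contiguous pages, one unassigned page and two more contiguous pages yields two callbacks instead of five, which is the same effect the patch reports at larger scale.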
Re: [Qemu-devel] [PATCH v2 0/3] CPUPhysMemoryClient: Fixes and batching
On Tue, May 03, 2011 at 12:36:19PM -0600, Alex Williamson wrote: This series pulls together several related patches for bugs and performance that I found last week. Only the 2nd patch is actually modified from initial posting, adding the comments suggested by Markus. The 1st two patches fix pretty serious brokenness in the CPUPhysMemoryClient interface. Of the two current users, kvm and vhost, only vhost is actually affected by these bugs. Please apply. Thanks, Alex Applied first two, thanks! --- Alex Williamson (3): CPUPhysMemoryClient: Batch contiguous addresses when playing catchup CPUPhysMemoryClient: Pass guest physical address not region offset CPUPhysMemoryClient: Fix typo in phys memory client registration exec.c | 46 -- 1 files changed, 40 insertions(+), 6 deletions(-)
Re: [Qemu-devel] [RFC] darwin: work around sigfd
On 05/05/2011 03:15 PM, Alexander Graf wrote: On 05.05.2011, at 14:56, Paolo Bonzini wrote: On 05/05/2011 11:36 AM, Alexander Graf wrote: When running qemu-system on Darwin, the vcpu processes guest code, but I don't get to see anything on the cocoa screen. Out of curiosity, does it work with iothread? Seems to work with -nographic, yes. With cocoa it doesn't seem as happy :o. It certainly gets a lot further than without. And SDL? Paolo
Re: [Qemu-devel] [RFC] darwin: work around sigfd
On 05.05.2011, at 15:23, Paolo Bonzini wrote: On 05/05/2011 03:15 PM, Alexander Graf wrote: On 05.05.2011, at 14:56, Paolo Bonzini wrote: On 05/05/2011 11:36 AM, Alexander Graf wrote: When running qemu-system on Darwin, the vcpu processes guest code, but I don't get to see anything on the cocoa screen. Out of curiosity, does it work with iothread? Seems to work with -nographic, yes. With cocoa it doesn't seem as happy :o. It certainly gets a lot further than without. And SDL? SDL doesn't compile on Mac OS X :). Otherwise we wouldn't have the cocoa backend. Alex
Re: [Qemu-devel] [Bug 723871] Re: qemu-kvm-0.14.0 Aborts with -vga qxl
Quoting Boris Derzhavets (723...@bugs.launchpad.net): What is ITP ? ITP is an 'Intent to Package', as outlined at https://wiki.ubuntu.com/UbuntuDevelopment/NewPackages. It's a type of bug to open in order to get packages into the universe archive. -- https://bugs.launchpad.net/bugs/723871 Title: qemu-kvm-0.14.0 Aborts with -vga qxl
Re: [Qemu-devel] [PATCH v2 3/3] CPUPhysMemoryClient: Batch contiguous addresses when playing catchup
On Thu, 2011-05-05 at 16:21 +0300, Michael S. Tsirkin wrote: On Tue, May 03, 2011 at 12:36:58PM -0600, Alex Williamson wrote: When a phys memory client registers and we play catchup by walking the page tables, we can make a huge improvement in the number of times the set_memory callback is called by batching contiguous pages together. With a 4G guest, this reduces the number of callbacks at registration from 1048866 to 296. Signed-off-by: Alex Williamson alex.william...@redhat.com --- exec.c | 38 -- 1 files changed, 32 insertions(+), 6 deletions(-) diff --git a/exec.c b/exec.c index bbd5c86..a0678a4 100644 --- a/exec.c +++ b/exec.c @@ -1741,14 +1741,21 @@ static int cpu_notify_migration_log(int enable) return 0; } +struct last_map { +target_phys_addr_t start_addr; +ram_addr_t size; A bit worried that ram_addr_t size might thinkably overflow (it's just a long, could be a 4G ram). Break it out when it fills up? struct CPUPhysMemoryClient { void (*set_memory)(struct CPUPhysMemoryClient *client, target_phys_addr_t start_addr, ram_addr_t size, ram_addr_t phys_offset); ram_addr_t seems to be the standard for describing these types of things. It's an unsigned long, so 4G is only concern for 32b builds, which don't support that much memory anyway. Please apply. Thanks, Alex
Re: [Qemu-devel] virtio-scsi spec, first public draft
Hi all, On 05/05/2011 02:49 PM, Paolo Bonzini wrote: Virtqueues 0..n-1:one requestq per target n:control transmitq n+1:control receiveq 1 requestq per target makes it harder to support large numbers or dynamic targets. I chose 1 requestq per target so that, with MSI-X support, each target can be associated to one MSI-X vector. If you want a large number of units, you can subdivide targets into logical units, or use multiple adapters if you prefer. We can have 20-odd SCSI adapters, each with 65534 targets. I think we're way beyond the practical limits even before LUN support is added to QEMU. But this will make queue full tracking harder. If we have one queue per LUN the SCSI stack is able to track QUEUE FULL states and will adjust the queue depth accordingly. When we have only one queue per target we cannot track QUEUE FULL anymore and have to rely on the static per-host 'can_queue' setting. Which doesn't work as well, especially in a virtualized environment where the queue full conditions might change at any time. But read on: For comparison, Windows supports up to 1024 targets per adapter (split across 8 channels); IBM vSCSI provides up to 128; VMware supports a maximum of 15 SCSI targets per adapter and 4 adapters per VM. We don't have to impose any hard limits here. The virtio scsi transport would need to be able to detect the targets, and we would be using whatever targets have been found. You mention detaching targets so is there a way to add a target? Yes, just add the first LUN to it (it will be LUN0 which must be there anyway). The target's existence will be reported on the control receiveq. ?? How is this supposed to work? How can I detect the existence of a virtqueue ? For this I actually like the MSI-X idea: If we were to rely on MSI-X to refer to the virtqueues we could just parse the MSI-X structure and create virtqueues for each entry found to be valid. 
And to be consistent with the SCSI layer the virtqueues then in fact would need to map the SCSI targets; LUNs would be detected from the SCSI midlayer outside the control of the virtio-scsi HBA. Feature bits VIRTIO_SCSI_F_INOUT - Whether a single request can include both read-only and write-only data buffers. Why make this an optional feature? Because QEMU does not support it so far. The type identifies the remaining fields. The value VIRTIO_SCSI_T_BARRIER can be ORed in the type as well. This bit indicates that this request acts as a barrier and that all preceding requests must be complete before this one, and all following requests must not be started until this is complete. Note that a barrier does not flush caches in the underlying backend device in host, and thus does not serve as data consistency guarantee. The driver must send a SYNCHRONIZE CACHE command to flush the host cache. Why are these barrier semantics needed? They are a convenience that I took from virtio-blk. They are not needed in upstream Linux (which uses flush/FUA instead), so I'm not wedded to it, but they may be useful if virtio-scsi is ever ported to the stable 2.6.32 series. As mentioned by hch; just drop this. [ .. ] VIRTIO_SCSI_T_TMF_LOGICAL_UNIT_DETACH asks the device to make the logical unit (and the target as well if this is the last logical unit) disappear. It takes an I_T_L nexus. This non-standard TMF should be used in response to a host request to shutdown a target or LUN, after having placed the LUN in a clean state. Do we need an initiator-driven detach? If the initiator doesn't care about a device anymore it simply doesn't communicate with it or allocate resources for it. I think the real detach should be performed on the target side (e.g. QEMU monitor command removes the target from the SCSI bus). So I guess I'm asking what is the real use-case for this function? It is not really an initiator-driven detach, it is the initiator's acknowledgement of a target-driven detach. 
The target needs to know when the initiator is ready so that it can free resources attached to the logical unit (this is particularly important if the LU is a physical disk and it is opened with exclusive access). Not required. The target can detach any LUN at any time and can rely on the initiator to handle this situation. Multipath handles this just fine. - SCSI command #define VIRTIO_SCSI_T_CMD 1 struct virtio_scsi_req_cmd { u32 type; u32 ioprio; u8 lun[8]; u64 id; u32 num_dataout, num_datain; char cdb[]; char data[][num_dataout+num_datain]; u8 sense[]; u32 sense_len; u32 residual; u8 status; u8 response; }; We don't need explicit buffer size fields since virtqueue elements include sizes. For example: size_t sense_len = elem->in_sg[sense_idx].iov_len; memcpy(elem->in_sg[sense_idx].iov_buf, sense_buf, MIN(sense_len, sizeof(sense_buf))); I think only the total length is written in the used ring, letting the driver figure out the number of bytes written to the sense buffer is harder than just writing it. Yes. The sense
Re: [Qemu-devel] [PATCH v2 3/3] CPUPhysMemoryClient: Batch contiguous addresses when playing catchup
On 05/05/11 16:21, Alex Williamson wrote: A bit worried that ram_addr_t size might thinkably overflow (it's just a long, could be a 4G ram). Break it out when it fills up? struct CPUPhysMemoryClient { void (*set_memory)(struct CPUPhysMemoryClient *client, target_phys_addr_t start_addr, ram_addr_t size, ram_addr_t phys_offset); ram_addr_t seems to be the standard for describing these types of things. It's an unsigned long, so 4G is only concern for 32b builds, which don't support that much memory anyway. Please apply. Thanks, A memory size can obviously not be bigger than the maximum physical address, so I find it really hard to see how this could overflow. It seems fair to use it for the size here. Acked-by: Jes Sorensen jes.soren...@redhat.com
Re: [Qemu-devel] virtio-scsi spec, first public draft
On 05/05/2011 04:29 PM, Hannes Reinecke wrote: I chose 1 requestq per target so that, with MSI-X support, each target can be associated to one MSI-X vector. If you want a large number of units, you can subdivide targets into logical units, or use multiple adapters if you prefer. We can have 20-odd SCSI adapters, each with 65534 targets. I think we're way beyond the practical limits even before LUN support is added to QEMU. But this will make queue full tracking harder. If we have one queue per LUN the SCSI stack is able to track QUEUE FULL states and will adjust the queue depth accordingly. When we have only one queue per target we cannot track QUEUE FULL anymore and have to rely on the static per-host 'can_queue' setting. Which doesn't work as well, especially in a virtualized environment where the queue full conditions might change at any time. So you want one virtqueue per LUN? I had it in the first version, but then you had to associate a (target, 8-byte LUN) pair to each virtqueue manually. That was very hairy, so I changed it to one target per queue. But read on: For comparison, Windows supports up to 1024 targets per adapter (split across 8 channels); IBM vSCSI provides up to 128; VMware supports a maximum of 15 SCSI targets per adapter and 4 adapters per VM. We don't have to impose any hard limits here. The virtio scsi transport would need to be able to detect the targets, and we would be using whatever targets have been found. Yes, that's what I wrote above. Right now detect the targets means send INQUIRY for LUN0 and/or REPORT LUNS to each virtqueue, thanks to the 1:1 relationship. In my first version it would mean: - associate each target's LUN0 to a virtqueue - if needed, send INQUIRY for LUN0 and/or REPORT LUNS - if needed, deassociate the LUN0 and the virtqueue Really, it was ugly. It also raises a lot more questions, such as what to do if a virtqueue has pending requests at deassociation time. 
Yes, just add the first LUN to it (it will be LUN0 which must be there anyway). The target's existence will be reported on the control receiveq. ?? How is this supposed to work? How can I detect the existence of a virtqueue ? Config space tells you how many virtqueue exist. That gives how many targets you can address at most. If some of them are empty at the beginning of the guest's life, their LUN0 will fail to answer INQUIRY and REPORT LUNS. (It is the same for vmw_pvscsi by the way, except simpler: the maximum # of targets is not configurable, and there is just one queue + one interrupt). And to be consistent with the SCSI layer the virtqueues then in fact would need to map the SCSI targets; LUNs would be detected from the SCSI midlayer outside the control of the virtio-scsi HBA. Exactly, that was my point! It seemed so clean compared to a dynamic assignment between LUNs and virtqueues. VIRTIO_SCSI_T_TMF_LOGICAL_UNIT_DETACH asks the device to make the logical unit (and the target as well if this is the last logical unit) disappear. It takes an I_T_L nexus. This non-standard TMF should be used in response to a host request to shutdown a target or LUN, after having placed the LUN in a clean state. It is not really an initiator-driven detach, it is the initiator's acknowledgement of a target-driven detach. The target needs to know when the initiator is ready so that it can free resources attached to the logical unit (this is particularly important if the LU is a physical disk and it is opened with exclusive access). Not required. The target can detach any LUN at any time and can rely on the initiator to handle this situation. Multipath handles this just fine. I didn't invent this, we had a customer request this feature for Xen guests in the past (a soft target detach where the filesystem is unmounted cleanly). But I guess I can drop it since KVM guests have agents like Matahari that will take care of this. 
They will use out-of-band channels to start an initiator-driven detach, and I guess it's better this way. :) BTW, with barriers gone, I think I can also drop the per-target TMF command. Thanks for the review. Paolo
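As a rough illustration of the one-requestq-per-target layout from the draft discussed in this thread (requestqs at 0..n-1, control transmitq at n, control receiveq at n+1), the index arithmetic could look like the sketch below. All function names are hypothetical, and the assumption that the queue count reported by the device includes the two control queues is mine, not something the draft text quoted here states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical index helpers for the draft layout:
 *   queues 0..n-1 : one requestq per target
 *   queue  n      : control transmitq
 *   queue  n+1    : control receiveq
 */
static uint32_t vscsi_requestq(uint32_t target)
{
    return target;
}

static uint32_t vscsi_control_txq(uint32_t num_targets)
{
    return num_targets;
}

static uint32_t vscsi_control_rxq(uint32_t num_targets)
{
    return num_targets + 1;
}

/*
 * Assumption: num_queues counts the request queues plus the two
 * control queues. Returns false if that cannot hold.
 */
static bool vscsi_max_targets(uint32_t num_queues, uint32_t *max_targets)
{
    if (num_queues < 2) {
        return false;
    }
    *max_targets = num_queues - 2;
    return true;
}
```

With this fixed mapping a driver never has to associate a (target, LUN) pair with a queue at run time, which is the simplification Paolo argues for; per-LUN QUEUE FULL tracking, the concern Hannes raises, is not addressed by it.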
Re: [Qemu-devel] [PATCH v2 3/3] CPUPhysMemoryClient: Batch contiguous addresses when playing catchup
On Thu, May 05, 2011 at 04:30:57PM +0200, Jes Sorensen wrote: On 05/05/11 16:21, Alex Williamson wrote: A bit worried that ram_addr_t size might thinkably overflow (it's just a long, could be a 4G ram). Break it out when it fills up? struct CPUPhysMemoryClient { void (*set_memory)(struct CPUPhysMemoryClient *client, target_phys_addr_t start_addr, ram_addr_t size, ram_addr_t phys_offset); ram_addr_t seems to be the standard for describing these types of things. It's an unsigned long, so 4G is only concern for 32b builds, which don't support that much memory anyway. Please apply. Thanks, A memory size can obviously not be bigger than the maximum physical address, so I find it really hard to see how this could overflow. For example, a 4G size does not fit in 32 bits. It seems fair to use it for the size here. Acked-by: Jes Sorensen jes.soren...@redhat.com
Re: [Qemu-devel] [PATCH v2 3/3] CPUPhysMemoryClient: Batch contiguous addresses when playing catchup
On Thu, May 05, 2011 at 08:21:06AM -0600, Alex Williamson wrote: On Thu, 2011-05-05 at 16:21 +0300, Michael S. Tsirkin wrote: On Tue, May 03, 2011 at 12:36:58PM -0600, Alex Williamson wrote: When a phys memory client registers and we play catchup by walking the page tables, we can make a huge improvement in the number of times the set_memory callback is called by batching contiguous pages together. With a 4G guest, this reduces the number of callbacks at registration from 1048866 to 296. Signed-off-by: Alex Williamson alex.william...@redhat.com --- exec.c | 38 -- 1 files changed, 32 insertions(+), 6 deletions(-) diff --git a/exec.c b/exec.c index bbd5c86..a0678a4 100644 --- a/exec.c +++ b/exec.c @@ -1741,14 +1741,21 @@ static int cpu_notify_migration_log(int enable) return 0; } +struct last_map { +target_phys_addr_t start_addr; +ram_addr_t size; A bit worried that ram_addr_t size might thinkably overflow (it's just a long, could be a 4G ram). Break it out when it fills up? struct CPUPhysMemoryClient { void (*set_memory)(struct CPUPhysMemoryClient *client, target_phys_addr_t start_addr, ram_addr_t size, ram_addr_t phys_offset); ram_addr_t seems to be the standard for describing these types of things. It's an unsigned long, so 4G is only concern for 32b builds, which don't support that much memory anyway. Please apply. Thanks, Alex OK, I don't think it's a problem in practice. I dislike the use of _addr for sizes, we should have _size_t, but that's a separate problem, this patch is consistent. I'll give people a bit of time to review and reply though, there seems to be no rush.
Re: [Qemu-devel] [PATCH v2 3/3] CPUPhysMemoryClient: Batch contiguous addresses when playing catchup
On 05/05/11 17:18, Michael S. Tsirkin wrote: A memory size can obviously not be bigger than the maximum physical address, so I find it really hard to see how this could overflow. For example, a 4G size does not fit in 32 bits. That is the only corner case - you can handle that by -1 if you like. Jes
Re: [Qemu-devel] [PATCH v2 3/3] CPUPhysMemoryClient: Batch contiguous addresses when playing catchup
On Thu, May 05, 2011 at 05:36:04PM +0200, Jes Sorensen wrote: On 05/05/11 17:18, Michael S. Tsirkin wrote: A memory size can obviously not be bigger than the maximum physical address, so I find it really hard to see how this could overflow. For example, a 4G size does not fit in 32 bits. That is the only corner case True. you can handle that by -1 if you like. But then all users need to be updated. Seems easier to break out of the loop. It's likely not a real problem, certainly not on a pc, don't know about other systems. Jes
Re: [Qemu-devel] [PATCH v2 3/3] CPUPhysMemoryClient: Batch contiguous addresses when playing catchup
On 05/05/11 17:38, Michael S. Tsirkin wrote: On Thu, May 05, 2011 at 05:36:04PM +0200, Jes Sorensen wrote: On 05/05/11 17:18, Michael S. Tsirkin wrote: A memory size can obviously not be bigger than the maximum physical address, so I find it really hard to see how this could overflow. For example, a 4G size does not fit in 32 bits. That is the only corner case True. you can handle that by -1 if you like. But then all users need to be updated. Seems easier to break out of the loop. It's likely not a real problem, certainly not on a pc, don't know about other systems. I think it is quite fair to limit the amount of memory we support when running 32 bit qemu binaries. I would expect more things to break than just this if we tried to support 4GB of RAM on a 32 bit host. Cheers, Jes
Re: [Qemu-devel] [PATCH v2 3/3] CPUPhysMemoryClient: Batch contiguous addresses when playing catchup
On Thu, May 05, 2011 at 05:40:19PM +0200, Jes Sorensen wrote: On 05/05/11 17:38, Michael S. Tsirkin wrote: On Thu, May 05, 2011 at 05:36:04PM +0200, Jes Sorensen wrote: On 05/05/11 17:18, Michael S. Tsirkin wrote: A memory size can obviously not be bigger than the maximum physical address, so I find it really hard to see how this could overflow. For example, a 4G size does not fit in 32 bits. That is the only corner case True. you can handle that by -1 if you like. But then all users need to be updated. Seems easier to break out of the loop. It's likely not a real problem, certainly not on a pc, don't know about other systems. I think it is quite fair to limit the amount of memory we support when running 32 bit qemu binaries. I would expect more things to break than just this if we tried to support 4GB of RAM on a 32 bit host. Cheers, Jes Fair enough.
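The corner case debated in this thread can be reproduced in isolation. The sketch below is not QEMU code: `ram_addr32_t` is a hypothetical 32-bit stand-in for `ram_addr_t` as it would be on a 32-bit host, and `run_is_full` is one possible form of the "break out when it fills up" check, shown only to make the arithmetic concrete.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE 4096u   /* stand-in for TARGET_PAGE_SIZE */

/* Hypothetical 32-bit stand-in for ram_addr_t on a 32-bit host. */
typedef uint32_t ram_addr32_t;

/* Accumulate the size of npages contiguous pages with no overflow check. */
static ram_addr32_t unchecked_total(uint32_t npages)
{
    ram_addr32_t size = 0;
    uint32_t i;

    for (i = 0; i < npages; i++) {
        size += PAGE_SIZE;   /* unsigned arithmetic wraps modulo 2^32 */
    }
    return size;
}

/* True if adding one more page would wrap the 32-bit size field. */
static bool run_is_full(ram_addr32_t size)
{
    return size > UINT32_MAX - PAGE_SIZE;
}
```

Accumulating exactly 4G worth of 4K pages wraps the 32-bit size to 0, which is the single case Michael points out; a check like `run_is_full()` before extending a run would flush one page early instead. As the thread concludes, this only matters for 32-bit builds.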
[Qemu-devel] [PULL] piix, pci, msi, memory, vhost, eepro100
The following changes since commit d2d979c628e4b2c4a3cb71a31841875795c79043: NBD: Avoid leaking a couple of strings when the NBD device is closed (2011-05-03 11:29:21 +0200) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/mst/qemu.git for_anthony Alex Williamson (2): CPUPhysMemoryClient: Fix typo in phys memory client registration CPUPhysMemoryClient: Pass guest physical address not region offset Avi Kivity (10): pci: add pci_register_bar_simple() API rtl8139: convert to pci_register_bar_simple() cirrus-vga: convert to pci_register_bar_simple() eepro100: convert to pci_register_bar_simple() hda-intel: convert to pci_register_bar_simple() hda-intel: convert to pci_register_bar_simple() (partial) ich/ahci: convert to pci_register_bar_simple() pcnet-pci: convert to pci_register_bar_simple() usb-ohci: convert to pci_register_bar_simple() wdt_i6300esb: convert to pci_register_bar_simple() Isaku Yamahata (4): pci: add accessor function to get irq levels piix_pci: eliminate PIIX3State::pci_irq_levels piix_pci: optimize set irq path piix_pci: load path clean up Jan Kiszka (2): MSI: Robust resource release pci: Add class 0x403 as 'audio controller' Michael S. 
Tsirkin (6): cpu: add set_memory flag to request dirty logging kvm: halve number of set memory calls for vga vhost: skip memory which needs dirty logging vhost: optimize out no-change assignment cirrus_vga: flag on-device ram for dirty logging Merge remote branch 'origin/master' into pci Stefan Weil (11): cirrus_vga: remove unneeded reset eepro100: Avoid duplicate debug messages eepro100: Remove type casts which are no longer needed eepro100: Remove unused structure element eepro100: Pad received short frames eepro100: Fix endianness issues eepro100: Support byte/word writes to port address eepro100: Support byte/word writes to pointer register eepro100: Support byte/word read/write access to MDI control register eepro100: Support byte read access to general control register eepro100: Support 32 bit read/write access to flash register cpu-common.h | 22 +++- exec.c| 30 +++-- hw/cirrus_vga.c | 30 ++--- hw/eepro100.c | 342 - hw/ide/ahci.c |9 -- hw/ide/ahci.h |3 - hw/ide/ich.c |8 +- hw/intel-hda.c| 15 +-- hw/lsi53c895a.c | 12 +-- hw/msi.c | 12 ++- hw/pci.c | 25 hw/pci.h |4 + hw/pcnet-pci.c| 16 +--- hw/piix_pci.c | 129 hw/rtl8139.c | 11 +-- hw/usb-ohci.c | 10 +-- hw/vhost.c| 61 +- hw/wdt_i6300esb.c | 42 +++ kvm-all.c | 62 ++ 19 files changed, 545 insertions(+), 298 deletions(-)
Re: [Qemu-devel] [PATCH 0/7] pci: initialize ids in pci common code
On Thu, May 05, 2011 at 03:41:57PM +0300, Michael S. Tsirkin wrote: On Fri, Apr 08, 2011 at 09:52:59PM +0900, Isaku Yamahata wrote: vendor id/device id... in configuration space are read-only registers which are commonly defined for all pci devices. So initialize them in common code and it simplifies the initialization a bit. I converted some of them. If this is the right direction, I'll convert the remaining devices. So I agree about device vendor id and revision but not header type: devices ideally should supply device type (bridge or not) and we will fill it in correctly. Okay, with the next spin, I'll drop program interface and header type and add assert for subsystem id/vendor id. -- yamahata
Re: [Qemu-devel] [PATCH v2 04/10] eepro100: Pad received short frames
Am 05.05.2011 15:00, schrieb Michael S. Tsirkin: On Sat, Apr 30, 2011 at 10:40:07PM +0200, Stefan Weil wrote: QEMU sends frames smaller than 60 bytes to ethernet nics. This should be fixed in the networking code because normally such frames are rejected by real NICs and their emulations. To avoid this behaviour, other NIC emulations pad received frames. This patch enables this workaround for eepro100, too. All related code is marked with CONFIG_PAD_RECEIVED_FRAMES, so emulation of the correct handling for short frames can be restored as soon as QEMU's networking code is fixed. Signed-off-by: Stefan Weilw...@mail.berlios.de Applied, I tweaked the comment a bit as we don't intend to change it in qemu. But go ahead and keep the ifdef around if you like. The new comment is ok, thanks. Cheers, Stefan W.
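For readers unfamiliar with the workaround, padding a short received frame amounts to something like the following sketch. It is not the eepro100 code: `pad_received_frame` and `MIN_FRAME_SIZE` are hypothetical names, and the 60-byte minimum is taken from the patch description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MIN_FRAME_SIZE 60   /* minimum frame length named in the patch text */

/*
 * Copy a received frame into out, zero-padding it to MIN_FRAME_SIZE
 * bytes if it is shorter. Returns the resulting length, or 0 if out
 * is too small to hold the (padded) frame.
 */
static size_t pad_received_frame(uint8_t *out, size_t out_size,
                                 const uint8_t *frame, size_t len)
{
    size_t padded = len < MIN_FRAME_SIZE ? MIN_FRAME_SIZE : len;

    if (padded > out_size) {
        return 0;
    }
    memcpy(out, frame, len);
    memset(out + len, 0, padded - len);   /* no-op when len >= 60 */
    return padded;
}
```

A 42-byte frame comes out as 60 bytes with 18 trailing zero bytes; frames of 60 bytes or more are copied unchanged.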
[Qemu-devel] [Bug 778032] [NEW] qemu spinning on serial port writes
Public bug reported: As originally found at http://www.mail-archive.com/k...@vger.kernel.org/msg08745.html from 3 years ago! Basically qemu seizes up in the event that the file descriptor for its emulated serial port has a full buffer, i.e. write() returns EAGAIN. For me, this happened when the serial port was being directed through a UNIX socket, with a default-sized 4KB buffer. Just the normal output from a Linux kernel boot caused it to seize up, and stop the main emulation / select loop. My suggestion is to remove the detection of EAGAIN in qemu-char.c:521, so that if the buffer is full, KVM discards the byte(s) it was trying to write. This is surely a better outcome than the process spinning forever. I will submit a separate patch to control the buffer sizes when creating UNIX sockets, which will help allow slow-reading processes to tune things so that they don't miss any output. Additionally, in the context of a hosted environment, if the -serial option is used, this could be a small security issue. An untrusted user of a guest system, knowing their serial output is going via a small buffer, could spew output to their /dev/ttyS0 at a rate fast enough to trigger this bug and eat a CPU core on the host. To quote David S. Ahern's original bug report (mine was the same, only with the latest version from git, so line numbers may have changed - my suggested fix above is accurate though): I am trying to redirect a guest's boot output through the host's serial port. Shortly after launching qemu, the main thread is spinning on: write(9, 0, 1) = -1 EAGAIN (Resource temporarily unavailable) fd 9 is the serial port, ttyS0. 
The backtrace for the thread is: #0 0x2ac3433f8c0b in write () from /lib64/libpthread.so.0 #1 0x00475df9 in send_all (fd=9, buf=<value optimized out>, len1=1) at qemu-char.c:477 #2 0x0043a102 in serial_xmit (opaque=<value optimized out>) at /root/kvm-81/qemu/hw/serial.c:311 #3 0x0043a591 in serial_ioport_write (opaque=0x14971790, addr=<value optimized out>, val=48) at /root/kvm-81/qemu/hw/serial.c:366 #4 0x410eeedc in ?? () #5 0x00129000 in ?? () #6 0x14821fa0 in ?? () #7 0x0007 in ?? () #8 0x004a54c5 in tlb_set_page_exec (env=0x10ab4, vaddr=46912496956816, paddr=1, prot=-1, mmu_idx=0, is_softmmu=1) at /root/kvm-81/qemu/exec.c:388 #9 0x00512f3b in tlb_fill (addr=345446292, is_write=1, mmu_idx=-1, retaddr=0x0) at /root/kvm-81/qemu/target-i386/op_helper.c:4690 #10 0x004a6bd2 in __ldb_cmmu (addr=9, mmu_idx=0) at /root/kvm-81/qemu/softmmu_template.h:135 #11 0x004a879b in cpu_x86_exec (env1=<value optimized out>) at /root/kvm-81/qemu/cpu-exec.c:628 #12 0x0040ba29 in main (argc=12, argv=0x7fff67f7a398) at /root/kvm-81/qemu/vl.c:3816 send_all() invokes unix_write() which by design is not breaking out on EAGAIN. The following command is enough to show the problem: qemu-system-x86_64 -m 256 -smp 1 -no-kvm \ -drive file=/dev/cciss/c0d0,if=scsi,cache=off,boot=on \ -vnc :1 -serial /dev/ttyS0 The guest is running RHEL3 with the parameter 'console=ttyS0' added to grub.conf; the problem appears to be with qemu, so I would expect it to show with any linux guest. This particular host is running RHEL5.2 with kvm-81, but I have also seen the problem with Fedora-9 as the host OS. Yes, the serial port of the server is connected to another system via a null modem. If I change the serial argument to '-serial udp::4555' and use 'nc -u -l localhost 4555 > /dev/ttyS0' I see the guest's boot output show up on the second system as expected. I'd prefer to be able to use the serial port connection directly without nc as a proxy. Suggestions? 
** Affects: qemu
Importance: Undecided
Status: New

** Tags: eagain serial

--
You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/778032

Title: qemu spinning on serial port writes

Status in QEMU: New
Re: [Qemu-devel] [PULL] spice: fix locking
On 05/03/2011 10:06 AM, Gerd Hoffmann wrote:

Hi,

This is the current spice patch queue ready for pull. Patches were posted a few days ago for review. A minor issue (leftover debug bit, spotted by Alon) has been fixed and the patch queue has been rebased to latest master, otherwise it is unmodified.

Please pull.

Pulled. Thanks.

Regards,
Anthony Liguori

cheers,
Gerd

The following changes since commit d2d979c628e4b2c4a3cb71a31841875795c79043:

NBD: Avoid leaking a couple of strings when the NBD device is closed (2011-05-03 11:29:21 +0200)

are available in the git repository at:

git://anongit.freedesktop.org/spice/qemu spice.v35

Gerd Hoffmann (3):
      spice: don't create updates in spice server context.
      spice: don't call displaystate callbacks from spice server context.
      spice: drop obsolete iothread locking

Jes Sorensen (1):
      Make spice dummy functions inline to fix calls not checking return values

hw/qxl-render.c    | 25 ++---
hw/qxl.c           | 27 +++
ui/qemu-spice.h    | 12 -
ui/spice-display.c | 61 +--
ui/spice-display.h | 24
5 files changed, 93 insertions(+), 56 deletions(-)
Re: [Qemu-devel] [PULL] piix, pci, msi, memory, vhost, eepro100
On 05/05/2011 10:45 AM, Michael S. Tsirkin wrote:

The following changes since commit d2d979c628e4b2c4a3cb71a31841875795c79043:

NBD: Avoid leaking a couple of strings when the NBD device is closed (2011-05-03 11:29:21 +0200)

are available in the git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/mst/qemu.git for_anthony

Pulled. Thanks.

Regards,
Anthony Liguori

Alex Williamson (2):
      CPUPhysMemoryClient: Fix typo in phys memory client registration
      CPUPhysMemoryClient: Pass guest physical address not region offset

Avi Kivity (10):
      pci: add pci_register_bar_simple() API
      rtl8139: convert to pci_register_bar_simple()
      cirrus-vga: convert to pci_register_bar_simple()
      eepro100: convert to pci_register_bar_simple()
      hda-intel: convert to pci_register_bar_simple()
      hda-intel: convert to pci_register_bar_simple() (partial)
      ich/ahci: convert to pci_register_bar_simple()
      pcnet-pci: convert to pci_register_bar_simple()
      usb-ohci: convert to pci_register_bar_simple()
      wdt_i6300esb: convert to pci_register_bar_simple()

Isaku Yamahata (4):
      pci: add accessor function to get irq levels
      piix_pci: eliminate PIIX3State::pci_irq_levels
      piix_pci: optimize set irq path
      piix_pci: load path clean up

Jan Kiszka (2):
      MSI: Robust resource release
      pci: Add class 0x403 as 'audio controller'

Michael S. Tsirkin (6):
      cpu: add set_memory flag to request dirty logging
      kvm: halve number of set memory calls for vga
      vhost: skip memory which needs dirty logging
      vhost: optimize out no-change assignment
      cirrus_vga: flag on-device ram for dirty logging
      Merge remote branch 'origin/master' into pci

Stefan Weil (11):
      cirrus_vga: remove unneeded reset
      eepro100: Avoid duplicate debug messages
      eepro100: Remove type casts which are no longer needed
      eepro100: Remove unused structure element
      eepro100: Pad received short frames
      eepro100: Fix endianness issues
      eepro100: Support byte/word writes to port address
      eepro100: Support byte/word writes to pointer register
      eepro100: Support byte/word read/write access to MDI control register
      eepro100: Support byte read access to general control register
      eepro100: Support 32 bit read/write access to flash register

cpu-common.h      | 22 +++-
exec.c            | 30 +++--
hw/cirrus_vga.c   | 30 ++---
hw/eepro100.c     | 342 -
hw/ide/ahci.c     | 9 --
hw/ide/ahci.h     | 3 -
hw/ide/ich.c      | 8 +-
hw/intel-hda.c    | 15 +--
hw/lsi53c895a.c   | 12 +--
hw/msi.c          | 12 ++-
hw/pci.c          | 25
hw/pci.h          | 4 +
hw/pcnet-pci.c    | 16 +---
hw/piix_pci.c     | 129
hw/rtl8139.c      | 11 +--
hw/usb-ohci.c     | 10 +--
hw/vhost.c        | 61 +-
hw/wdt_i6300esb.c | 42 +++
kvm-all.c         | 62 ++
19 files changed, 545 insertions(+), 298 deletions(-)
Re: [Qemu-devel] [PULL] usb patch queue
On 05/04/2011 10:41 AM, Gerd Hoffmann wrote:

Hi,

The USB patch queue is back! I'm still busy catching up with the backlog, and I know I haven't picked up everything from the list yet. If in doubt it doesn't hurt to resend usb related patches, with me being Cc'ed.

This pull brings old stuff; most of the patches are several months old already. Finally the usb-host fixes from Hans are queued up for merge. Some async packet handling cleanups are in there too. Oh, and one more bugfix for the usb mass storage device.

please pull,
Gerd

Pulled. Thanks.

Regards,
Anthony Liguori

The following changes since commit d2d979c628e4b2c4a3cb71a31841875795c79043:

NBD: Avoid leaking a couple of strings when the NBD device is closed (2011-05-03 11:29:21 +0200)

are available in the git repository at:

git://git.kraxel.org/qemu usb.7.pull

Gerd Hoffmann (6):
      uhci: switch to QTAILQ
      uhci: keep uhci state pointer in async packet struct.
      ohci: get ohci state via container_of()
      musb: get musb state via container_of()
      usb: move complete callback to port ops
      usb: mass storage fix

Hans de Goede (8):
      usb-linux: introduce a usb_linux_alt_setting function
      usb-linux: Get the alt. setting from sysfs rather then asking the dev
      usb-linux: Add support for buffering iso usb packets
      usb-linux: Refuse packets for endpoints which are not in the usb descriptor
      usb-linux: Refuse iso packets when max packet size is 0 (alt setting 0)
      usb-linux: We only need to keep track of 15 endpoints
      usb-linux: Add support for buffering iso out usb packets
      usb: control buffer fixes

hw/usb-hub.c  | 14 ++
hw/usb-msd.c  | 5 +-
hw/usb-musb.c | 75 ++-
hw/usb-ohci.c | 9 +-
hw/usb-uhci.c | 82
hw/usb.c      | 6 +
hw/usb.h      | 9 +-
usb-linux.c   | 394 ++---
8 files changed, 445 insertions(+), 149 deletions(-)
[Qemu-devel] [PATCH] target-arm: Fix VMLA, VMLS, VNMLS, VNMLA handling of NaNs
Correct handling of NaNs for VFP VMLA, VMLS, VNMLS and VNMLA requires that we implement the set of negations and additions specified by the ARM ARM; plausible looking simplifications like turning (-A + B) into (B - A) or computing (A + B) rather than (B + A) result in selecting the wrong NaN or returning a NaN with the wrong sign bit.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
 target-arm/translate.c | 53 ---
 1 files changed, 40 insertions(+), 13 deletions(-)

diff --git a/target-arm/translate.c b/target-arm/translate.c
index a1af436..3c38364 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -909,6 +909,26 @@ VFP_OP2(div)
 
 #undef VFP_OP2
 
+static inline void gen_vfp_F1_mul(int dp)
+{
+    /* Like gen_vfp_mul() but put result in F1 */
+    if (dp) {
+        gen_helper_vfp_muld(cpu_F1d, cpu_F0d, cpu_F1d, cpu_env);
+    } else {
+        gen_helper_vfp_muls(cpu_F1s, cpu_F0s, cpu_F1s, cpu_env);
+    }
+}
+
+static inline void gen_vfp_F1_neg(int dp)
+{
+    /* Like gen_vfp_neg() but put result in F1 */
+    if (dp) {
+        gen_helper_vfp_negd(cpu_F1d, cpu_F0d);
+    } else {
+        gen_helper_vfp_negs(cpu_F1s, cpu_F0s);
+    }
+}
+
 static inline void gen_vfp_abs(int dp)
 {
     if (dp)
@@ -3021,27 +3041,34 @@ static int disas_vfp_insn(CPUState * env, DisasContext *s, uint32_t insn)
                 for (;;) {
                     /* Perform the calculation. */
                     switch (op) {
-                    case 0: /* mac: fd + (fn * fm) */
-                        gen_vfp_mul(dp);
-                        gen_mov_F1_vreg(dp, rd);
+                    case 0: /* VMLA: fd + (fn * fm) */
+                        /* Note that order of inputs to the add matters for NaNs */
+                        gen_vfp_F1_mul(dp);
+                        gen_mov_F0_vreg(dp, rd);
                         gen_vfp_add(dp);
                         break;
-                    case 1: /* nmac: fd - (fn * fm) */
+                    case 1: /* VMLS: fd + -(fn * fm) */
                         gen_vfp_mul(dp);
-                        gen_vfp_neg(dp);
-                        gen_mov_F1_vreg(dp, rd);
+                        gen_vfp_F1_neg(dp);
+                        gen_mov_F0_vreg(dp, rd);
                         gen_vfp_add(dp);
                         break;
-                    case 2: /* msc: -fd + (fn * fm) */
-                        gen_vfp_mul(dp);
-                        gen_mov_F1_vreg(dp, rd);
-                        gen_vfp_sub(dp);
+                    case 2: /* VNMLS: -fd + (fn * fm) */
+                        /* Note that it isn't valid to replace (-A + B) with (B - A)
+                         * or similar plausible looking simplifications
+                         * because this will give wrong results for NaNs.
+                         */
+                        gen_vfp_F1_mul(dp);
+                        gen_mov_F0_vreg(dp, rd);
+                        gen_vfp_neg(dp);
+                        gen_vfp_add(dp);
                         break;
-                    case 3: /* nmsc: -fd - (fn * fm) */
+                    case 3: /* VNMLA: -fd + -(fn * fm) */
                         gen_vfp_mul(dp);
+                        gen_vfp_F1_neg(dp);
+                        gen_mov_F0_vreg(dp, rd);
                         gen_vfp_neg(dp);
-                        gen_mov_F1_vreg(dp, rd);
-                        gen_vfp_sub(dp);
+                        gen_vfp_add(dp);
                         break;
                     case 4: /* mul: fn * fm */
                         gen_vfp_mul(dp);
--
1.7.1
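To make the commit message's point concrete, here is a small toy model of why (-A + B) and (B - A) can differ when A is a NaN, and why operand order matters when both inputs are NaNs. It is a sketch under stated assumptions, not QEMU's softfloat code: the names f32_neg() and f32_pick_nan() are invented for this example, and the selection rule (a signalling NaN beats a quiet one, otherwise the first NaN operand wins, and a subtract does not flip the sign of a propagated NaN) is my reading of the behaviour the patch relies on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define F32_SIGN  0x80000000u
#define F32_EXP   0x7f800000u
#define F32_FRAC  0x007fffffu
#define F32_QUIET 0x00400000u   /* top fraction bit: set for quiet NaNs */

static int f32_is_nan(uint32_t x)
{
    return (x & F32_EXP) == F32_EXP && (x & F32_FRAC) != 0;
}

static int f32_is_snan(uint32_t x)
{
    return f32_is_nan(x) && !(x & F32_QUIET);
}

/* Negation only flips the sign bit, and it does so for NaNs too. */
static uint32_t f32_neg(uint32_t x)
{
    return x ^ F32_SIGN;
}

/*
 * Toy model of NaN selection for a two-operand add or subtract
 * (hypothetical helper): a signalling NaN wins over a quiet one,
 * otherwise the first NaN operand wins. The chosen NaN is quieted and
 * keeps its own sign and payload.
 * Returns 1 and stores the result if either operand is a NaN, else 0.
 */
static int f32_pick_nan(uint32_t op1, uint32_t op2, uint32_t *out)
{
    assert(out != NULL);
    if (f32_is_snan(op1)) {
        *out = op1 | F32_QUIET;
    } else if (f32_is_snan(op2)) {
        *out = op2 | F32_QUIET;
    } else if (f32_is_nan(op1)) {
        *out = op1;
    } else if (f32_is_nan(op2)) {
        *out = op2;
    } else {
        return 0;   /* no NaN involved: ordinary arithmetic applies */
    }
    return 1;
}
```

In this model, computing (-A + B) with A = 0x7fc00001 yields 0xffc00001 because the negation has already flipped the sign, whereas (B - A) hands the untouched A to the subtract and yields 0x7fc00001. Likewise f32_pick_nan(A, C) and f32_pick_nan(C, A) return different payloads when both are quiet NaNs.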
Re: [Qemu-devel] A Question
On 05/05/2011 02:01 AM, Peter Maydell wrote: On 5 May 2011 00:16, Rob Landley r...@landley.net wrote: I note that I have a half-dozen prebuilt system images at http://landley.net/aboriginal/downloads/binaries and the build scripts and such are in the directories above that. I'm afraid I don't entirely understand your file naming system there -- it seems to say which architecture the system images are for but not what board? Exactly. An armv5l root filesystem will run on dozens of boards. You need to rebuild the kernel for a specific board, but not the root filesystem or toolchain. The point of these system images is to encourage native development (I.E. building software natively under qemu, optionally using distcc to call out to a compatible cross compiler on the host). All it needs to do this is _a_ kernel that qemu is capable of booting that can run that software with appropriate peripherals (serial I/O, network card, block device, RTC, etc). It includes an example kernel built to do that under qemu, and a shell script to launch qemu. But these are not kernels you'd install on the actual hardware, there are dozens of those for each root filesystem. Perhaps we should have a wiki page with links to useful third-party system images? I also know of Aurelien's images at http://people.debian.org/~aurel32/qemu/ and no doubt there are others. There used to be one, but it's impossible to be complete. Rob
Re: [Qemu-devel] A Question
On 5 May 2011 23:13, Rob Landley r...@landley.net wrote: On 05/05/2011 02:01 AM, Peter Maydell wrote: I'm afraid I don't entirely understand your file naming system there -- it seems to say which architecture the system images are for but not what board? Exactly. An armv5l root filesystem will run on dozens of boards. You need to rebuild the kernel for a specific board, but not the root filesystem or toolchain. Doh, I should have read the readme a bit more carefully. I usually take system image to mean complete disk image including bootloader, kernel and initrd as well as rootfs, which obviously does include the board-specific bits. On the other hand the readme does say the tarball includes a kernel so in that sense it is board-specific, presumably. (ARM kernels having alas not yet got to the point where you can build a single kernel that will boot on everything.) -- PMM
Re: [Qemu-devel] A Question
On 05/05/2011 05:32 PM, Peter Maydell wrote:

On 5 May 2011 23:13, Rob Landley r...@landley.net wrote:

On 05/05/2011 02:01 AM, Peter Maydell wrote:

I'm afraid I don't entirely understand your file naming system there -- it seems to say which architecture the system images are for but not what board?

Exactly. An armv5l root filesystem will run on dozens of boards. You need to rebuild the kernel for a specific board, but not the root filesystem or toolchain.

Doh, I should have read the readme a bit more carefully.

I'm working on documentation but every time I sit down and properly document everything it turns into a giant BOOK ala http://landley.net/aboriginal/downloads/presentation.pdf

I've tried it as a faq, I've tried it as both tutorial and reference, I need to do... I dunno, podcasts or something.

I usually take system image to mean complete disk image including bootloader, kernel and initrd as well as rootfs,

In this case qemu's -kernel option is the bootloader. I have it load an init script out of the final root filesystem instead of using an initrd (although the build scripts have an initrd packaging option so the root filesystem can _be_ an initrd, either as a cpio image or bundled into a kernel, but that imposes size limitations and requires the emulated system to allocate more physical memory, which is a pain on mips and such).

which obviously does include the board-specific bits. On the other hand the readme does say the tarball includes a kernel so in that sense it is board-specific, presumably.

Yes, but it's just some random board qemu is capable of emulating. It doesn't matter WHICH, it matters that ./run-emulator.sh can boot qemu to a shell prompt and get reasonable performance with the required I/O device feature list.

I'm actually trying to get it more generic with device trees and virtio and such, but those really aren't baked yet.
(And armv6l bit-rotted recently because I was patching the kernel to stick an armv6l cpu emulation into a Versatile board, which doesn't happen in nature and stopped working with my patch around 2.6.33 or so. I need to swap in a real armv6l board emulation there. It's a todo item.) (ARM kernels having alas not yet got to the point where you can build a single kernel that will boot on everything.) Grant Likely's working on making it happen via device trees. Here's my bookmark out of my todo list on the current status of using qemu with that: http://lists.ozlabs.org/pipermail/devicetree-discuss/2011-March/005112.html Haven't had a chance to play with it yet, though... It still won't be a _single_ kernel because you'll still need armv4tl, armv5l, armv6l, armv7l, thumb2, and a couple of not quite floating point, not quite mmx extensions ala neon... (Well, ok, armv4tl will boot on everything but it'd be slow which means poor battery life. Good enough for doing native compiles, of course...) Rob
Re: [Qemu-devel] Allow ARMv7M to be started without a kernel
On 05.05.2011, at 11:56, Peter Maydell wrote: On 5 May 2011 09:23, Ben Leslie be...@benno.id.au wrote: FWIW, the reason why I'm not using -kernel is that the current way the armv7m code works, it expects the provided kernel to be a full flash image including appropriate vector table, whereas right now I just want to debug some stand-alone code, not the full system, which the above gdb approach works perfectly for. It would probably be better for the -kernel option to honour the entry point in the ELF file rather than insisting on full reset (and to try to load the reset SP from the vector table but not insist on that working). That is, we should support both load this ELF image which is a full system image with a vector table and load this ELF image which is just a bare-metal (possibly semihosting) application. The combination of v7M reset with image loading and the possibility of a debugger altering the pc/sp while the core is in reset is a bit complicated, though :-) As an aside: I think QEMU should have an option which is just load a plain ELF or raw binary, with no funny Linux-kernel-specific behaviour rather than overloading -kernel to mean if it's a raw image it's Linux and if it's an ELF file it's not. Traditionally, -bios has been that one. -kernel is more of a real bootloader replacement, including all the weirdness a bootloader does :). Alex
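The suggestion that a loader should honour the entry point in the ELF file can be illustrated with a short header parser. This is a minimal sketch, assuming only the standard ELF header layout (magic in bytes 0-3, class in byte 4, data encoding in byte 5, e_entry at offset 24); the function names read_uint() and elf_entry_point() are hypothetical and this is not QEMU's ELF loader. A return of -1 is where a loader could fall back to treating the file as a raw image.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read an n-byte unsigned integer in the given byte order. */
static uint64_t read_uint(const uint8_t *p, size_t n, int big_endian)
{
    uint64_t v = 0;
    for (size_t i = 0; i < n; i++) {
        size_t idx = big_endian ? i : n - 1 - i;
        v = (v << 8) | p[idx];
    }
    return v;
}

/*
 * Hypothetical helper: extract e_entry from the start of an ELF file.
 * hdr points at the first len bytes of the image.
 * Returns 0 and stores the entry address on success, or -1 if the
 * buffer is not a recognisable ELF header.
 */
static int elf_entry_point(const uint8_t *hdr, size_t len, uint64_t *entry)
{
    size_t width;

    assert(hdr != NULL && entry != NULL);
    if (len < 24 || hdr[0] != 0x7f || hdr[1] != 'E' ||
        hdr[2] != 'L' || hdr[3] != 'F') {
        return -1;                      /* no ELF magic */
    }
    /* EI_CLASS: 1 = 32-bit (4-byte e_entry), 2 = 64-bit (8-byte e_entry) */
    width = hdr[4] == 1 ? 4 : hdr[4] == 2 ? 8 : 0;
    /* EI_DATA: 1 = little-endian, 2 = big-endian */
    if (width == 0 || (hdr[5] != 1 && hdr[5] != 2)) {
        return -1;
    }
    if (len < 24 + width) {
        return -1;                      /* header truncated */
    }
    *entry = read_uint(hdr + 24, width, hdr[5] == 2);
    return 0;
}
```

For a bare-metal Thumb image the low bit of the returned address would typically be set; what the board model does with the address (set the PC directly, or still fetch the initial SP from a vector table) is outside the scope of this sketch.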
[Qemu-devel] Canonical web qemu-doc.html location?
The wiki's Documentation tab links to: http://qemu.weilnetz.de/qemu-doc.html But Google's first hit for qemu-doc.html is: http://wiki.qemu.org/download/qemu-doc.html Which exists but is not remotely the same file. Which is correct? Rob
Re: [Qemu-devel] A Question
On 6 May 2011 00:20, Rob Landley r...@landley.net wrote: On 05/05/2011 05:32 PM, Peter Maydell wrote: (ARM kernels having alas not yet got to the point where you can build a single kernel that will boot on everything.) Grant Likely's working on making it happen via device trees. Here's my bookmark out of my todo list on the current status of using qemu Yes. Making sure ARM QEMU plays nicely with device tree is on Linaro's todo list for the upcoming six-month cycle. -- PMM
Re: [Qemu-devel] Allow ARMv7M to be started without a kernel
On 05/05/2011 06:26 PM, Alexander Graf wrote: As an aside: I think QEMU should have an option which is just load a plain ELF or raw binary, with no funny Linux-kernel-specific behaviour rather than overloading -kernel to mean if it's a raw image it's Linux and if it's an ELF file it's not. Traditionally, -bios has been that one. -kernel is more of a real bootloader replacement, including all the weirdness a bootloader does :). Except that neither qemu-system-x86_64 -bios vmlinux nor qemu-system-x86_64 -kernel vmlinux will load an ELF kernel on x86-64. The code to do this _exists_ within qemu, it's just not hooked up consistently on all targets. We have a universal cross-platform image format, and we have support in qemu for loading that format, and for some reason it's only enabled on certain targets. I've never understood why... Rob
Re: [Qemu-devel] [PATCH] qemu-kvm: Add CPUID support for VIA CPU
Hi Jan,

Thank you very much for your advice. It is helpful to me.

Hi, the subject's tag (qemu-kvm) is misleading. This is actually targeting the uq/master patch queue, i.e. the upstream kvm staging area.

If I want to submit a patch for qemu-kvm git, should I use [QEMU-DEVEL][Patch]... as the subject? Or are there other rules for qemu-kvm upstream? If so, could you tell me? Thanks!

On 2011-05-05 05:03, brill...@viatech.com.cn wrote:

When KVM is running on a VIA CPU with the host cpu's model, the features of the VIA CPU will be passed into the kvm guest by calling the CPUID instruction for Centaur.

Signed-off-by: BrillyWu brill...@viatech.com.cn
Signed-off-by: KaryJin kary...@viatech.com.cn
---
 target-i386/cpu.h   | 7 +++
 target-i386/cpuid.c | 48 +++-

Your patch is unfortunately line-wrapped.

Yes, I will be careful next time.

@@ -721,6 +725,9 @@ typedef struct CPUX86State {
 uint32_t cpuid_ext3_features;
 uint32_t cpuid_apic_id;
 int cpuid_vendor_override;
+/*Store the results of Centaur's CPUID instructions*/

Please format comments like this /* comment text */, i.e. with blanks after/before the /* / */.

OK, I will check it.

+1050,15 @@ void cpu_x86_cpuid(CPUX86State *env, uin uint32_t *ecx, uint32_t *edx)
 {
 /* test if maximum index reached */
-if (index & 0x80000000) {
+if ((index & 0xC0000000) == 0xC0000000) {
+ /* Handle the Centaur's CPUID instruction.*
+ * If cpuid_xlevel2 is 0, then put into the*
+ * default case. */
+ if (env->cpuid_xlevel2 == 0)
+ index = 0xF0000000;
+ else if (index > env->cpuid_xlevel2)
+ index = env->cpuid_xlevel2;

Please validate your patch before posting with scripts/checkpatch.pl.

OK, I will do it. I found that spaces rather than tabs are used for indentation. Should I follow that, or use tabs in my patch? If I use spaces, there are some warnings when using scripts/checkpatch.pl to validate the patch. Can I ignore them?
Re: [Qemu-devel] Canonical web qemu-doc.html location?
Am 06.05.2011 01:29, schrieb Rob Landley: The wiki's Documentation tab links to: http://qemu.weilnetz.de/qemu-doc.html But Google's first hit for qemu-doc.html is: http://wiki.qemu.org/download/qemu-doc.html Which exists but is not remotely the same file. Which is correct? Rob Both were made from the official source code, so both are correct in some way. The version on my private QEMU site is newer, because I update it often when I build new QEMU binaries for w32. It is not canonical because I am not an official QEMU maintainer. Ideally, the version on qemu.org would be updated automatically. This can only be done by a site administrator. Regards, Stefan W.