Re: [kvm-devel] RFC/patch portability: split kvm_vm_ioctl v3
Dong, Eddie wrote:
> BTW, why do we use vector here? Shouldn't it be irq_line or irq_no? Maybe you mean the Channel Subsystem (1st piece of knowledge and surprise known from s390 doc) is emulated in Qemu, correct?

The vector field was introduced by Avi's comment. I just copied that over. On s390, we only have irq numbers, no vectors. For now, we don't want to emulate the channel subsystem, just paravirt. Technically, we could do a passthrough in the long term, just like pci devices can be dedicated to a guest using an iommu in the memory mapped I/O world.

-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now http://get.splunk.com/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] RFC/patch portability: split kvm_vm_ioctl v3
On Monday 05 November 2007, Carsten Otte wrote:
> Dong, Eddie wrote:
> > BTW, why do we use vector here? Shouldn't it be irq_line or irq_no? Maybe you mean the Channel Subsystem (1st piece of knowledge and surprise known from s390 doc) is emulated in Qemu, correct?
>
> The vector field was introduced by Avi's comment. I just copied that over. On s390, we only have irq numbers, no vectors.

Actually, you have neither irq numbers nor vectors on s390 right now. I/O subchannels do not fit into the IRQ handling in Linux at all, and external interrupts are sufficiently different that you should not treat them as IRQ lines in Linux.

However, I would suggest that you use either one external interrupt or the thin interrupt as an event source for an interrupt controller for all the virtio devices, and use the generic IRQ subsystem for that, including interrupt lines and vectors.

In case of the thin interrupt, your virtual interrupt controller would more or less just consist of one lowcore address from which you can read the pending interrupt vector after an interrupt has been caused, as well as a single hcall that does an 'acknowledge interrupt, get next pending irq vector into lowcore and tell me whether there was one' operation.

You'll also need an operation to associate a virtio device with an interrupt vector, but that belongs in virtio.

Arnd
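The virtual interrupt controller described in this message can be sketched as a small user-space simulation. This is only an illustration under stated assumptions, not code from the thread: the names `lowcore_pending_vector`, `host_raise`, `host_fetch_next`, `hcall_ack_and_get_next` and `guest_extint_handler` are hypothetical, the 64-vector limit is arbitrary, and the "hcall" is an ordinary function call standing in for a real hypercall.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated lowcore slot from which the guest reads the pending vector. */
static uint32_t lowcore_pending_vector;

/* Simulated host state: bit n set means vector n is pending (assumption). */
static uint64_t host_pending;

/* Host side: mark a vector as pending. */
static void host_raise(unsigned int vector)
{
    assert(vector < 64);
    host_pending |= (uint64_t)1 << vector;
}

/* Host side: move the lowest pending vector into lowcore.
 * Returns 1 if a vector was stored, 0 if nothing was pending. */
static int host_fetch_next(void)
{
    for (unsigned int v = 0; v < 64; v++) {
        if (host_pending & ((uint64_t)1 << v)) {
            host_pending &= ~((uint64_t)1 << v);
            lowcore_pending_vector = v;
            return 1;
        }
    }
    return 0;
}

/* Stand-in for the single hcall: acknowledge the current interrupt,
 * fetch the next pending vector into lowcore, report whether there was one. */
static int hcall_ack_and_get_next(void)
{
    return host_fetch_next();
}

/* Guest side: entered after the external interrupt has been raised and the
 * first vector is already in lowcore. Records up to max vectors in seen[]
 * and returns how many vectors were drained in total. */
static unsigned int guest_extint_handler(unsigned int *seen, unsigned int max)
{
    unsigned int n = 0;

    do {
        if (n < max)
            seen[n] = lowcore_pending_vector;
        n++;
    } while (hcall_ack_and_get_next());

    return n;
}
```

The loop shape is the point of the sketch: one interrupt can drain several pending vectors, because each acknowledge also reports whether another vector was placed in lowcore.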
Re: [kvm-devel] RFC/patch portability: split kvm_vm_ioctl v3
Arnd Bergmann wrote:
> On Monday 05 November 2007, Carsten Otte wrote:
> > Dong, Eddie wrote:
> > > BTW, why do we use vector here? Shouldn't it be irq_line or irq_no? Maybe you mean the Channel Subsystem (1st piece of knowledge and surprise known from s390 doc) is emulated in Qemu, correct?
> >
> > The vector field was introduced by Avi's comment. I just copied that over. On s390, we only have irq numbers, no vectors.
>
> Actually, you have neither irq numbers nor vectors on s390 right now. I/O subchannels do not fit into the IRQ handling in Linux at all, and external interrupts are sufficiently different that you should not treat them as IRQ lines in Linux.

We're not emulating the I/O subsystem, and thus no I/O subchannels.

> However, I would suggest that you use either one external interrupt or the thin interrupt as an event source for an interrupt controller for all the virtio devices, and use the generic IRQ subsystem for that, including interrupt lines and vectors.
>
> In case of the thin interrupt, your virtual interrupt controller would more or less just consist of one lowcore address from which you can read the pending interrupt vector after an interrupt has been caused, as well as a single hcall that does an 'acknowledge interrupt, get next pending irq vector into lowcore and tell me whether there was one' operation.
>
> You'll also need an operation to associate a virtio device with an interrupt vector, but that belongs in virtio.

The irq subsystem does not fit the external interrupt model, and you'd definitely want to argue with Martin before suggesting to introduce the IRQ subsystem on s390. "Only over my dead body" was the last statement I remember.

Plus I don't see a benefit in pretending to have an interrupt controller: virtio abstracts from this, and can well be implemented over extint and hypercall like Christian has done it. What's the problem you're trying to solve?
Re: [kvm-devel] RFC/patch portability: split kvm_vm_ioctl v3
On Monday 05 November 2007, Carsten Otte wrote:
> > Actually, you have neither irq numbers nor vectors on s390 right now. I/O subchannels do not fit into the IRQ handling in Linux at all, and external interrupts are sufficiently different that you should not treat them as IRQ lines in Linux.
> > [snip]
>
> The irq subsystem does not fit the external interrupt model, and you'd definitely want to argue with Martin before suggesting to introduce the IRQ subsystem on s390. "Only over my dead body" was the last statement I remember.

Read again what I wrote above. I'm suggesting to have just one external interrupt for virtio and use the generic IRQ abstraction to handle everything that comes below that.

> Plus I don't see a benefit in pretending to have an interrupt controller: virtio abstracts from this, and can well be implemented over extint and hypercall like Christian has done it. What's the problem you're trying to solve?

Sorry, I can't find Christian's code right now, do you have a pointer to the patches? I suspect that he has done exactly what I was trying to explain, except that the implementation is not using the generic IRQ layer, which means you're duplicating some of the code.

Arnd
Re: [kvm-devel] RFC/patch portability: split kvm_vm_ioctl v3
On Monday, 5 November 2007, Arnd Bergmann wrote:
> Read again what I wrote above. I'm suggesting to have just one external interrupt for virtio and use the generic IRQ abstraction to handle everything that comes below that.

So you basically suggest implementing wrapper code around extint and lowcore memory to be able to use request_irq/free_irq?

> > Plus I don't see a benefit in pretending to have an interrupt controller: virtio abstracts from this, and can well be implemented over extint and hypercall like Christian has done it. What's the problem you're trying to solve?
>
> Sorry, I can't find Christian's code right now, do you have a pointer to the patches?

The code was only used for our prototype hypervisor. I never posted these virtio patches as Rusty was quicker in changing virtio than I was able to re-add them to our prototype code. ;-)

> I suspect that he has done exactly what I was trying to explain, except that the implementation is not using the generic IRQ layer, which means you're duplicating some of the code.

I used one external interrupt and I reserved an area in lowcore for a 64-bit extint parameter (I use the same address as z/VM for the PFAULT token). I defined a hypercall in which the guest could specify this 64-bit value for a given virtqueue. That allowed me to get the virtqueue pointer without looking it up in the list of (maybe many) virtqueues.

Christian
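The token scheme Christian describes can be illustrated with a short simulation. The prototype code was never posted, so everything here is an assumption made for illustration: `struct vq`, `hcall_set_vq_token`, `vq_register`, `extint_handler` and `host_inject` are hypothetical names, the lowcore slot is a plain variable, and the hypercall is modelled as a function call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a virtqueue; only what the sketch needs. */
struct vq {
    int id;
    unsigned int kicks;   /* interrupts delivered to this queue */
};

/* Simulated lowcore area holding the 64-bit external interrupt parameter. */
static uint64_t lowcore_extint_param;

/* Simulated host-side table: one registered token per queue index. */
#define MAX_VQS 8
static uint64_t host_tokens[MAX_VQS];

/* Stand-in for the hypercall: the guest tells the host which 64-bit value
 * to present in lowcore when this queue needs attention. */
static int hcall_set_vq_token(int vq_index, uint64_t token)
{
    if (vq_index < 0 || vq_index >= MAX_VQS)
        return -1;
    host_tokens[vq_index] = token;
    return 0;
}

/* Guest side: register the queue pointer itself as the token. */
static int vq_register(int vq_index, struct vq *q)
{
    assert(q != NULL);
    return hcall_set_vq_token(vq_index, (uint64_t)(uintptr_t)q);
}

/* Guest side: external interrupt handler. The token is the queue pointer,
 * so no search through a list of virtqueues is needed. */
static struct vq *extint_handler(void)
{
    struct vq *q = (struct vq *)(uintptr_t)lowcore_extint_param;

    if (q != NULL)
        q->kicks++;
    return q;
}

/* Host side: store the queue's token in lowcore, then "raise" the
 * external interrupt by calling the guest handler directly. */
static struct vq *host_inject(int vq_index)
{
    if (vq_index < 0 || vq_index >= MAX_VQS)
        return NULL;
    lowcore_extint_param = host_tokens[vq_index];
    return extint_handler();
}
```

The design choice being shown is that the guest picks the token, so it can choose a value (here a pointer) that leads straight to the right queue.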
Re: [kvm-devel] RFC/patch portability: split kvm_vm_ioctl v3
BTW, why do we use vector here? Shouldn't it be irq_line or irq_no? Maybe you mean the Channel Subsystem (1st piece of knowledge and surprise known from s390 doc) is emulated in Qemu, correct?

Eddie
Re: [kvm-devel] RFC/patch portability: split kvm_vm_ioctl v3
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Carsten Otte
Sent: 26 October 2007 20:02
To: Avi Kivity; Zhang, Xiantao; Hollis Blanchard
Cc: kvm-devel@lists.sourceforge.net
Subject: Re: [kvm-devel] RFC/patch portability: split kvm_vm_ioctl v3

> This patch splits kvm_vm_ioctl into architecture independent parts, and x86 specific parts which go to kvm_arch_vcpu_ioctl in x86.c. The patch has been updated to current git, and it leaves out memory slot registration work which is currently subject to a detailed discussion.
>
> Common ioctls for all architectures are:
> KVM_CREATE_VCPU, KVM_GET_DIRTY_LOG, KVM_SET_USER_MEMORY_REGION
>
> KVM_SET_USER_MEMORY_REGION implementation is no longer moved to x86.c. It seems to me that more fine-grained refinement than just moving the code is required here.
>
> x86 specific ioctls are:
> KVM_SET_MEMORY_REGION, KVM_GET/SET_NR_MMU_PAGES, KVM_SET_MEMORY_ALIAS, KVM_CREATE_IRQCHIP, KVM_CREATE_IRQ_LINE, KVM_GET/SET_IRQCHIP, KVM_SET_TSS_ADDR
>
> KVM_SET_TSS_ADDR has been added to the list of x86 specifics, as Izik's commit states it is used for emulating real mode on intel.

Why is KVM_IRQ_LINE x86 specific? The original idea is based on the ACPI spec, which I assume to be generic, though s390 may not adopt it.

thx, eddie
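The split described in the quoted patch summary follows a common dispatch pattern: the generic handler deals with the commands every architecture shares and passes everything else to an architecture hook. The sketch below only illustrates that pattern. The command enum, the function names and the `-ENOTTY` fallback are assumptions for the example and are not taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical command codes standing in for the real ioctl numbers. */
enum vm_cmd {
    CMD_CREATE_VCPU,
    CMD_GET_DIRTY_LOG,
    CMD_SET_USER_MEMORY_REGION,
    CMD_SET_TSS_ADDR,
    CMD_CREATE_IRQCHIP
};

typedef long (*arch_vm_ioctl_fn)(enum vm_cmd cmd);

/* Hypothetical x86 hook: accepts the x86-only commands. */
static long arch_vm_ioctl_x86(enum vm_cmd cmd)
{
    switch (cmd) {
    case CMD_SET_TSS_ADDR:
    case CMD_CREATE_IRQCHIP:
        return 0;          /* handled by x86 code */
    default:
        return -ENOTTY;    /* unknown command */
    }
}

/* Hypothetical s390 hook: has no arch-specific VM commands in this sketch. */
static long arch_vm_ioctl_s390(enum vm_cmd cmd)
{
    (void)cmd;
    return -ENOTTY;
}

/* Generic entry point: common commands first, arch hook as the fallback. */
static long vm_ioctl(enum vm_cmd cmd, arch_vm_ioctl_fn arch)
{
    assert(arch != NULL);

    switch (cmd) {
    case CMD_CREATE_VCPU:
    case CMD_GET_DIRTY_LOG:
    case CMD_SET_USER_MEMORY_REGION:
        return 0;          /* handled by common code */
    default:
        return arch(cmd);
    }
}
```

With this shape, a portable userspace program sees the same behaviour for the common commands on every architecture, and an architecture that lacks a feature simply rejects the corresponding command.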
Re: [kvm-devel] RFC/patch portability: split kvm_vm_ioctl v3
Dong, Eddie wrote:
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Carsten Otte
> Sent: 26 October 2007 20:02
> To: Avi Kivity; Zhang, Xiantao; Hollis Blanchard
> Cc: kvm-devel@lists.sourceforge.net
> Subject: Re: [kvm-devel] RFC/patch portability: split kvm_vm_ioctl v3
>
> > This patch splits kvm_vm_ioctl into architecture independent parts, and x86 specific parts which go to kvm_arch_vcpu_ioctl in x86.c. The patch has been updated to current git, and it leaves out memory slot registration work which is currently subject to a detailed discussion.
> >
> > Common ioctls for all architectures are:
> > KVM_CREATE_VCPU, KVM_GET_DIRTY_LOG, KVM_SET_USER_MEMORY_REGION
> >
> > KVM_SET_USER_MEMORY_REGION implementation is no longer moved to x86.c. It seems to me that more fine-grained refinement than just moving the code is required here.
> >
> > x86 specific ioctls are:
> > KVM_SET_MEMORY_REGION, KVM_GET/SET_NR_MMU_PAGES, KVM_SET_MEMORY_ALIAS, KVM_CREATE_IRQCHIP, KVM_CREATE_IRQ_LINE, KVM_GET/SET_IRQCHIP, KVM_SET_TSS_ADDR
> >
> > KVM_SET_TSS_ADDR has been added to the list of x86 specifics, as Izik's commit states it is used for emulating real mode on intel.
>
> Why is KVM_IRQ_LINE x86 specific? The original idea is based on the ACPI spec, which I assume to be generic, though s390 may not adopt it.

ia64 can probably share much. ppc will probably want KVM_IRQ_LINE with different parameters. s390, as far as I understand, will not.

-- 
Any sufficiently difficult bug is indistinguishable from a feature.
Re: [kvm-devel] RFC/patch portability: split kvm_vm_ioctl v3
Dong, Eddie wrote:
> Why is KVM_IRQ_LINE x86 specific? The original idea is based on the ACPI spec, which I assume to be generic, though s390 may not adopt it.

ACPI is not present on s390 and ppc. In fact, I doubt it is present on any architecture except those two intel ones: at least my mips router and my arm pda don't have it either. It's kind of based on the idea of having BIOS-like code.

so long,
Carsten
Re: [kvm-devel] RFC/patch portability: split kvm_vm_ioctl v3
OK, so how can a device inform the kernel of an IRQ on s390?

-----Original Message-----
From: Carsten Otte [mailto:[EMAIL PROTECTED]]
Sent: 30 October 2007 19:30
To: Dong, Eddie
Cc: Avi Kivity; Zhang, Xiantao; Hollis Blanchard; kvm-devel@lists.sourceforge.net
Subject: Re: [kvm-devel] RFC/patch portability: split kvm_vm_ioctl v3

> Dong, Eddie wrote:
> > Why is KVM_IRQ_LINE x86 specific? The original idea is based on the ACPI spec, which I assume to be generic, though s390 may not adopt it.
>
> ACPI is not present on s390 and ppc. In fact, I doubt it is present on any architecture except those two intel ones: at least my mips router and my arm pda don't have it either. It's kind of based on the idea of having BIOS-like code.
>
> so long,
> Carsten
Re: [kvm-devel] RFC/patch portability: split kvm_vm_ioctl v3
Avi Kivity wrote:
> > Why is KVM_IRQ_LINE x86 specific? The original idea is based on the ACPI spec, which I assume to be generic, though s390 may not adopt it.
>
> ia64 can probably share much. ppc will probably want KVM_IRQ_LINE with different parameters. s390, as far as I understand, will not.

I think we'll have to come up with a more modular approach later on: various aspects are of interest to various architectures and/or platforms. The generic kernel has CONFIG_FEATURE toggles for that. The portability patches are not intended to split kvm into components at this stage; I believe that is something we will have to come up with when actual ports are being integrated.

In my opinion, a reasonable next-step refinement here would be to come up with a generic interrupt injection call that can inject an interrupt on any architecture and platform. After userspace has adapted to use that one, we can keep the old call in a deprecated state for backward compatibility for some time before removing it.

For now, my goal is to separate out what is generic, in the sense that it is functionality that a portable user space program using kvm can expect to work the same way on all architectures and platforms.

so long,
Carsten
Re: [kvm-devel] RFC/patch portability: split kvm_vm_ioctl v3
Carsten Otte wrote:
> I think we'll have to come up with a more modular approach later on: various aspects are of interest to various architectures and/or platforms. The generic kernel has CONFIG_FEATURE toggles for that. The portability patches are not intended to split kvm into components at this stage; I believe that is something we will have to come up with when actual ports are being integrated.
>
> In my opinion, a reasonable next-step refinement here would be to come up with a generic interrupt injection call that can inject an interrupt on any architecture and platform. After userspace has adapted to use that one, we can keep the old call in a deprecated state for backward compatibility for some time before removing it.
>
> For now, my goal is to separate out what is generic, in the sense that it is functionality that a portable user space program using kvm can expect to work the same way on all architectures and platforms.

We have to be careful not to force too much portability on the code. After all, the instruction set is different and some of the hardware philosophy is different. You will never be able to run the same guest on different archs, or have exactly the same virtual devices. The differences are real, and the goal is not portability at any cost; it is to share as much as possible, but not more.

Architectures which have interrupt request lines that are edge-triggered or level-triggered and emulate the interrupt controller in the kernel can share the KVM_IRQ_LINE API in some way; architectures that don't will need another method.

-- 
Any sufficiently difficult bug is indistinguishable from a feature.
Re: [kvm-devel] RFC/patch portability: split kvm_vm_ioctl v3
Dong, Eddie wrote:
> OK, so how can a device inform the kernel of an IRQ on s390?

Oooh, that is a looong explanation. If you want to peek at it, see http://publibz.boulder.ibm.com/epubs/pdf/a2278324.pdf .

Chapter 6 covers Interruptions. I'd recommend starting with external interruptions, because that is the kind we'll primarily be using with kvm. External interruptions are used for things like timers, hypercalls, and IPIs. The Program Interruption Conditions are also worth reading; they cover things similar to a general protection fault on x86.

Chapter 11 covers a different type of interruption, such as error detection of hardware failures and hot-standby component failover.

Chapter 16 is also of interest; it covers I/O interruptions.

Whenever you see me in person (next kvm forum maybe?), you are invited to a lot of beer: I'll bring pen and paper and try to give you an overview while we get drunk :-).

so long,
Carsten
Re: [kvm-devel] RFC/patch portability: split kvm_vm_ioctl v3
Avi Kivity wrote:
> But that doesn't make the code portable. The s390 userspace has to know how to encode the number, and x86 will do it differently. If it's really different, let's keep it different. Unless you can push the encoding so far back it's transparent to userspace (e.g. qemu).

I agree that we should keep it separate unless it makes sense to have commonality. A paravirt driver, for example, could make use of this abstraction: it could request an interrupt, and hand the __u64 that it got back to a function that actually sends the interrupt over. But for now, I agree we should keep it separate. I am just thinking out loud here.

> > In addition, I would love to be able to specify which target CPUs may receive that interrupt, because our IPI equivalent comes out just like a regular interrupt on just one target CPU. That boils down to something like this:
> >
> >     struct kvm_interrupt_data {
> >         __u64 interrupt_number;
> >         cpuset_t possible_target_cpus;
> >     }
> >
> > and a KVM_INJECT_INTERRUPT common ioctl for the vm to provide this.
>
> Are cpusets exported to userspace? x86 has something similar (IPI to a set of cpus) but it's handled 100% in the kernel these days.

No, they aren't. We'd need to come up with a different data structure for that. Does IPI have an interrupt number too?
Re: [kvm-devel] RFC/patch portability: split kvm_vm_ioctl v3
Carsten Otte wrote:
> > > In addition, I would love to be able to specify which target CPUs may receive that interrupt, because our IPI equivalent comes out just like a regular interrupt on just one target CPU. That boils down to something like this:
> > >
> > >     struct kvm_interrupt_data {
> > >         __u64 interrupt_number;
> > >         cpuset_t possible_target_cpus;
> > >     }
> > >
> > > and a KVM_INJECT_INTERRUPT common ioctl for the vm to provide this.
> >
> > Are cpusets exported to userspace? x86 has something similar (IPI to a set of cpus) but it's handled 100% in the kernel these days.
>
> No, they aren't. We'd need to come up with a different data structure for that.

A bitmap would do it, but what size? Expandable ones are messy.

> Does IPI have an interrupt number too?

No, it's a command (mmio) to the APIC: you tell it which vector you want and to which cpus you want it delivered. So you can have many IPI interrupt vectors.

-- 
Any sufficiently difficult bug is indistinguishable from a feature.
Re: [kvm-devel] RFC/patch portability: split kvm_vm_ioctl v3
Avi Kivity wrote:
> A bitmap would do it, but what size? Expandable ones are messy.

We could have a #define KVM_CPU_BITMAP_SIZE in the arch specific header files that go to include/asm/. For s390, we have one of our rocket science virtualization accelerating facilities that limits us to 64 cpus per guest. This may well be extended later on, but for now that would be sufficient. Thinking about Christoph Lameter with his 4k CPU boxes, I believe ia64 would want far more than that.

> No, it's a command (mmio) to the APIC: you tell it which vector you want and to which cpus you want it delivered. So you can have many IPI interrupt vectors.

I see. But the interrupt vector can be encoded in __u64?
Re: [kvm-devel] RFC/patch portability: split kvm_vm_ioctl v3
Carsten Otte wrote:
> Avi Kivity wrote:
> > A bitmap would do it, but what size? Expandable ones are messy.
>
> We could have a #define KVM_CPU_BITMAP_SIZE in the arch specific header files that go to include/asm/. For s390, we have one of our rocket science virtualization accelerating facilities that limits us to 64 cpus per guest. This may well be extended later on, but for now that would be sufficient. Thinking about Christoph Lameter with his 4k CPU boxes, I believe ia64 would want far more than that.

If there's a single variable length array (which is the case here) it can be tacked on at the end:

    struct kvm_ipi {
        __u64 vector;
        __u32 size; /* bytes, must be multiple of 8 */
        __u32 pad;
        __u64 cpuset[0];
    };

We have this in a few places. Not pretty, but serviceable.

> > No, it's a command (mmio) to the APIC: you tell it which vector you want and to which cpus you want it delivered. So you can have many IPI interrupt vectors.
>
> I see. But the interrupt vector can be encoded in __u64?

The vector is just a u8. The x86 interrupt path looks like this:

    [devices] -- irq --> [interrupt controllers] -- vector --> [processor]

The interrupt controllers translate irq lines into vectors, which the processor consumes. Before kvm-irqchip, the API talked about vectors, since the interrupt controller was in userspace. Nowadays userspace talks irq lines to the kernel, which converts them into vectors.

If I understand correctly, s390 is interrupt vector oriented, no?

-- 
Any sufficiently difficult bug is indistinguishable from a feature.
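To make the trailing variable-length bitmap concrete, here is a minimal user-space sketch of how a caller might build such a structure. It mirrors the field layout proposed above but is not KVM code: the helper names `kvm_ipi_alloc`, `kvm_ipi_add_cpu` and `kvm_ipi_has_cpu` are hypothetical, `uint64_t`/`uint32_t` stand in for `__u64`/`__u32`, and the C99 flexible array member `cpuset[]` replaces the `cpuset[0]` spelling.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Same field layout as the struct proposed in the thread. */
struct kvm_ipi {
    uint64_t vector;
    uint32_t size;      /* bytes in cpuset[], must be a multiple of 8 */
    uint32_t pad;
    uint64_t cpuset[];  /* one bit per target cpu */
};

/* Allocate a zeroed request with room for nr_cpus bits (hypothetical helper).
 * The caller releases it with free(). */
static struct kvm_ipi *kvm_ipi_alloc(uint64_t vector, unsigned int nr_cpus)
{
    uint32_t words = (nr_cpus + 63) / 64;
    struct kvm_ipi *ipi = calloc(1, sizeof(*ipi) + words * sizeof(uint64_t));

    if (ipi == NULL)
        return NULL;
    ipi->vector = vector;
    ipi->size = words * (uint32_t)sizeof(uint64_t);
    return ipi;
}

/* Mark a cpu as a possible target; returns -1 if it is outside the bitmap. */
static int kvm_ipi_add_cpu(struct kvm_ipi *ipi, unsigned int cpu)
{
    assert(ipi != NULL);
    if (cpu >= ipi->size * 8u)
        return -1;
    ipi->cpuset[cpu / 64] |= (uint64_t)1 << (cpu % 64);
    return 0;
}

/* Test whether a cpu is in the target set. */
static int kvm_ipi_has_cpu(const struct kvm_ipi *ipi, unsigned int cpu)
{
    assert(ipi != NULL);
    if (cpu >= ipi->size * 8u)
        return 0;
    return (int)((ipi->cpuset[cpu / 64] >> (cpu % 64)) & 1);
}
```

Because `size` travels with the request, a 64-cpu s390 guest and a much larger ia64 guest could use the same structure, at the cost of the receiver having to validate `size` before reading the bitmap.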
Re: [kvm-devel] RFC/patch portability: split kvm_vm_ioctl v3
[EMAIL PROTECTED] wrote:
> Avi Kivity wrote:
> > A bitmap would do it, but what size? Expandable ones are messy.
>
> We could have a #define KVM_CPU_BITMAP_SIZE in the arch specific header files that go to include/asm/. For s390, we have one of our rocket science virtualization accelerating facilities that limits us to 64 cpus per guest. This may well be extended later on, but for now that would be sufficient. Thinking about Christoph Lameter with his 4k CPU boxes, I believe ia64 would want far more than that.

IMO, IA64/KVM will handle interrupts in the kernel, including IPI, so all that user level needs to tell the kernel is which platform IRQ pin is set/cleared. Can't s390 do it in a similar way? From the platform point of view, each irq can have a unique number, and the device itself doesn't need to know which CPU will receive it.

Are you talking about having your interrupt controller in user space? Or did I miss something? I'd love to study the spec later :-)

Eddie
Re: [kvm-devel] RFC/patch portability: split kvm_vm_ioctl v3
Dong, Eddie wrote:
> IMO, IA64/KVM will handle interrupts in the kernel, including IPI, so all that user level needs to tell the kernel is which platform IRQ pin is set/cleared. Can't s390 do it in a similar way? From the platform point of view, each irq can have a unique number, and the device itself doesn't need to know which CPU will receive it.
>
> Are you talking about having your interrupt controller in user space? Or did I miss something?

We don't have interrupt controllers in the first place, and therefore we don't need to emulate them. We want to handle IPI inside the kernel too, and we also need to be able to inject interrupts from userspace.

Would you be able to encode your interrupt related information into an __u64 data type? Do all CPUs have the same interrupts pending, or is the information per-cpu? Does the data structure that Avi suggested fit your interrupt injection needs?

    struct kvm_interrupt {
        __u64 vector;
        __u32 size; /* bytes, must be multiple of 8 */
        __u32 pad;
        __u64 cpuset[0];
    };
Re: [kvm-devel] RFC/patch portability: split kvm_vm_ioctl v2
On Thu, 2007-10-25 at 17:48 +0200, Carsten Otte wrote:
> This patch splits kvm_vm_ioctl into architecture independent parts, and x86 specific parts which go to kvm_arch_vcpu_ioctl in x86.c.
>
> Common ioctls for all architectures are:
> KVM_CREATE_VCPU, KVM_GET_DIRTY_LOG, KVM_SET_USER_MEMORY_REGION
>
> KVM_SET_USER_MEMORY_REGION is actually implemented in x86.c now, because the code behind it looks arch specific to me.

I think it is much better to split off just the part that allocates the rmap and the part that sets the number of shadow mmu pages; apart from these parts, it seems to me that it isn't arch specific.
Re: [kvm-devel] RFC/patch portability: split kvm_vm_ioctl v2
Hollis Blanchard wrote:
> On Thu, 2007-10-25 at 17:48 +0200, Izik Eidus wrote:
> > On Thu, 2007-10-25 at 17:48 +0200, Carsten Otte wrote:
> > > This patch splits kvm_vm_ioctl into architecture independent parts, and x86 specific parts which go to kvm_arch_vcpu_ioctl in x86.c.
> > >
> > > Common ioctls for all architectures are:
> > > KVM_CREATE_VCPU, KVM_GET_DIRTY_LOG, KVM_SET_USER_MEMORY_REGION
> > >
> > > KVM_SET_USER_MEMORY_REGION is actually implemented in x86.c now, because the code behind it looks arch specific to me.
>
> Reviewed-by: Hollis Blanchard [EMAIL PROTECTED]
>
> > I think it is much better to split off just the part that allocates the rmap and the part that sets the number of shadow mmu pages; apart from these parts, it seems to me that it isn't arch specific.
>
> Carsten omitted the explanation about memslots he had in his original patch. To quote that here:
>
> > We've got a totally different address layout on s390: we cannot support multiple slots, and a user memory range always equals the guest physical memory [guest_phys + vm specific offset = host user address]. We neither have nor need dedicated vmas for the guest memory, we just use what the memory management has in stock. This is true because we reuse the page table for user and guest mode.
>
> Given that explanation, and that kvm_vm_ioctl_set_memory_region() is entirely about memslots, I'm inclined to agree with this code movement.

OK, I was thinking: maybe we can rewrite the way kvm holds memory so that more code would be shared. Let's say we throw away all the slots and arch-dependent stuff, and we let kvm just hold the userspace-allocated memory address; then each arch will have to provide arch-specific functions that map the memory as it needs. For example, for x86 we will make gfn_to_page map on the fly how the memory should look.

I think I will write a patch to demonstrate this, but it might take me some time. Anyway, what do you think about this idea?
Re: [kvm-devel] RFC/patch portability: split kvm_vm_ioctl v2
Carsten Otte wrote:
> This patch splits kvm_vm_ioctl into architecture independent parts, and x86 specific parts which go to kvm_arch_vcpu_ioctl in x86.c.
>
> Common ioctls for all architectures are:
> KVM_CREATE_VCPU, KVM_GET_DIRTY_LOG, KVM_SET_USER_MEMORY_REGION
>
> KVM_SET_USER_MEMORY_REGION is actually implemented in x86.c now,

Hi Carsten,

I don't think we can move the whole function to the arch-specific part, because it should work well (or with few issues) for most archs. Basically, IA64 can mostly use it directly. If we move it to arch-specific code, it will introduce a lot of duplication. As you said, s390 is quite different in this respect, but I think maybe we can use macros, such as #ifndef CONFIG_S390, to compile these parts out, and s390 can define them in its arch-specific portions. Any other good ideas?

Thanks
Xiantao
Re: [kvm-devel] RFC/patch portability: split kvm_vm_ioctl
Avi Kivity wrote:

We need to distinguish between x86 specific (only x86 has this) and s390 special (everyone has it except s390). In the latter case it should still be in common code to avoid duplication, with a guard to disable compilation on s390. KVM_SET_MEMORY_REGION is in theory applicable to non-s390 but I'd like to deprecate it, so there's no point in implementing it on non-x86. The rest are s390 special rather than x86 specific. As related previously, KVM_SET_USER_MEMORY_REGION should work for s390 (with an extra check for the guest physical start address).

I thought about that all through the weekend. To me, it looks like I want to eliminate s390 special as far as possible. In the given case, I'll follow Anthony's suggestion and support KVM_SET_USER_MEMORY_REGION. KVM_SET_MEMORY_REGION will also go common, and in case we're pushing the actual port before you deprecate that call, we'll #ifdef CONFIG_ARCH_S390 around it. As for the pic/apic part, I think it is x86 specific. Christian Ehrhardt stated he believes ppc won't have that either. I will create another patch on vm_ioctl that reflects this later today.

so long,
Carsten
Re: [kvm-devel] RFC/patch portability: split kvm_vm_ioctl
Anthony Liguori wrote:

While the pic/apic related functions are obviously x86 specific, some other ioctls seem to be common at a first glance. KVM_SET_(USER)_MEMORY_REGION for example. We've got a totally different address layout on s390: we cannot support multiple slots, and a user memory range always equals the guest physical memory [guest_phys + vm specific offset = host user address]. We neither have nor need dedicated vmas for the guest memory, we just use what the memory management has in stock. This is true because we reuse the page table for user and guest mode.

You still need to tell the kernel about vm specific offset right? So doesn't KVM_SET_USER_MEMORY_REGION for you just become that? There's nothing wrong with s390 not supporting multiple memory slots, but there's no reason the ioctl interface can't be the same.

I've thought about that too. The thing is, the interface would really do something different on s390. I think it's more confusing to userland to have one interface that does two different things depending on the architecture than to have different interfaces for different things. You're right that KVM_SET_USER_MEMORY_REGION would cover our needs if we return -EINVAL in case slot != 0 or guest start address != 0. I'll talk it through with Martin on Monday.

Carsten
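The restriction Carsten describes (reject anything but slot 0 starting at guest address 0) can be sketched as below. This is a simplified userspace illustration under stated assumptions: the struct is a stand-in for the ioctl argument, its userspace_addr field and the demo_s390_* names are hypothetical, and storing the base and size is only a guess at what an s390 implementation would record.

```c
#include <errno.h>
#include <stdint.h>

/* Simplified stand-in for the ioctl argument (userspace_addr is an assumption). */
struct demo_memory_region {
	uint32_t slot;
	uint64_t guest_phys_addr;
	uint64_t memory_size;
	uint64_t userspace_addr;
};

/* Hypothetical per-VM state: one linear window, no slots. */
struct demo_s390_vm {
	uint64_t user_base; /* vm specific offset */
	uint64_t mem_size;  /* address limit */
};

/*
 * Accept the common request only in the single shape the s390 layout
 * can express: slot 0, guest memory starting at guest address 0.
 */
static int demo_s390_set_user_memory_region(struct demo_s390_vm *vm,
					    const struct demo_memory_region *mem)
{
	if (mem->slot != 0 || mem->guest_phys_addr != 0)
		return -EINVAL;

	vm->user_base = mem->userspace_addr;
	vm->mem_size = mem->memory_size;
	return 0;
}
```

With this shape, userspace keeps one ioctl number across architectures, and the s390 difference shows up only as a narrower set of accepted arguments.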
Re: [kvm-devel] RFC/patch portability: split kvm_vm_ioctl
Zhang, Xiantao wrote:

I don't know why we don't make KVM_SET_MEMORY_REGION and KVM_SET_USER_MEMORY_REGION common, although I have read the reasons you listed. I think they should work for most archs, although they are not very friendly to s390. If we make them arch-specific, we have to duplicate them many times in the KVM code.

On s390, we use regular userspace memory to back our guest. Our architecture allows us to specify an offset and an address limit for the guest, and we don't need shadow page tables and other tricks. We just use the userspace page table, and the guest memory is swappable to disk.

One suggestion: maybe we can disable the current memory allocation logic in userspace for s390, and let s390 use your approach to get its memory.

Userspace will definitely need a special case for setting up the guest memory anyway, due to major architectural differences. Thus I think it is fair to let it use two different ioctls, one for each case.
Re: [kvm-devel] RFC/patch portability: split kvm_vm_ioctl
Carsten Otte wrote:

This patch splits kvm_vm_ioctl into architecture independent parts, and x86 specific parts which go to kvm_arch_vcpu_ioctl in x86.c. Common ioctls for all architectures are: KVM_CREATE_VCPU, KVM_GET_DIRTY_LOG. I'd really like to see more commonalities, but all others did not fit our needs. I would love to keep KVM_GET_DIRTY_LOG common, so that the ingenious migration code does not need to care too much about different architectures. x86 specific ioctls are: KVM_SET_MEMORY_REGION, KVM_SET_USER_MEMORY_REGION, KVM_GET/SET_NR_MMU_PAGES, KVM_SET_MEMORY_ALIAS, KVM_CREATE_IRQCHIP, KVM_CREATE_IRQ_LINE, KVM_GET/SET_IRQCHIP

We need to distinguish between x86 specific (only x86 has this) and s390 special (everyone has it except s390). In the latter case it should still be in common code to avoid duplication, with a guard to disable compilation on s390. KVM_SET_MEMORY_REGION is in theory applicable to non-s390 but I'd like to deprecate it, so there's no point in implementing it on non-x86. The rest are s390 special rather than x86 specific. As related previously, KVM_SET_USER_MEMORY_REGION should work for s390 (with an extra check for the guest physical start address).

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
Re: [kvm-devel] RFC/patch portability: split kvm_vm_ioctl
Avi Kivity wrote:

We need to distinguish between x86 specific (only x86 has this) and s390 special (everyone has it except s390). In the latter case it should still be in common code to avoid duplication, with a guard to disable compilation on s390. KVM_SET_MEMORY_REGION is in theory applicable to non-s390 but I'd like to deprecate it, so there's no point in implementing it on non-x86. The rest are s390 special rather than x86 specific. As related previously, KVM_SET_USER_MEMORY_REGION should work for s390 (with an extra check for the guest physical start address).

That sounds reasonable to me. I'll make a patch that does this three-way split on Monday. Let's see how it comes out. KVM_SET_USER_MEMORY_REGION and KVM_SET_MEMORY_REGION will go back to common, the irq related ones to a third place.

thanks for reviewing,
Carsten
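One way to picture the split discussed here is a common handler that deals with the shared requests itself and hands everything else to an architecture handler. The sketch below is a userspace illustration only: the request numbers, the split_* names and the split_last_layer bookkeeping are hypothetical, and which requests end up in which layer is simply taken from this message.

```c
#include <errno.h>

/* Hypothetical request numbers, for illustration only. */
enum {
	SPLIT_CREATE_VCPU = 1,
	SPLIT_GET_DIRTY_LOG,
	SPLIT_SET_USER_MEMORY_REGION,
	SPLIT_CREATE_IRQCHIP,
};

/* Records which layer handled the last request: 1 = common, 2 = arch. */
static int split_last_layer;

/* Architecture layer: here it only knows the irq related request. */
static long split_arch_vm_ioctl(unsigned int ioctl)
{
	switch (ioctl) {
	case SPLIT_CREATE_IRQCHIP:
		split_last_layer = 2;
		return 0;
	default:
		return -EINVAL; /* assumed error code for unknown requests */
	}
}

/* Common layer: handles shared requests, forwards the rest. */
static long split_vm_ioctl(unsigned int ioctl)
{
	switch (ioctl) {
	case SPLIT_CREATE_VCPU:
	case SPLIT_GET_DIRTY_LOG:
	case SPLIT_SET_USER_MEMORY_REGION:
		split_last_layer = 1;
		return 0;
	default:
		return split_arch_vm_ioctl(ioctl);
	}
}
```

An architecture without an irqchip would simply provide an arch handler that rejects the request, so the common switch needs no conditional for it.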
Re: [kvm-devel] RFC/patch portability: split kvm_vm_ioctl
Carsten Otte wrote:

This patch splits kvm_vm_ioctl into architecture independent parts, and x86 specific parts which go to kvm_arch_vcpu_ioctl in x86.c. Common ioctls for all architectures are: KVM_CREATE_VCPU, KVM_GET_DIRTY_LOG. I'd really like to see more commonalities, but all others did not fit our needs. I would love to keep KVM_GET_DIRTY_LOG common, so that the ingenious migration code does not need to care too much about different architectures. x86 specific ioctls are: KVM_SET_MEMORY_REGION, KVM_SET_USER_MEMORY_REGION, KVM_GET/SET_NR_MMU_PAGES, KVM_SET_MEMORY_ALIAS, KVM_CREATE_IRQCHIP, KVM_CREATE_IRQ_LINE, KVM_GET/SET_IRQCHIP

While the pic/apic related functions are obviously x86 specific, some other ioctls seem to be common at a first glance. KVM_SET_(USER)_MEMORY_REGION for example. We've got a totally different address layout on s390: we cannot support multiple slots, and a user memory range always equals the guest physical memory [guest_phys + vm specific offset = host user address]. We neither have nor need dedicated vmas for the guest memory, we just use what the memory management has in stock. This is true because we reuse the page table for user and guest mode.

You still need to tell the kernel about vm specific offset right? So doesn't KVM_SET_USER_MEMORY_REGION for you just become that? There's nothing wrong with s390 not supporting multiple memory slots, but there's no reason the ioctl interface can't be the same.

Regards,
Anthony Liguori

Looks to me like the s390 might have a lot in common with a future AMD nested page table implementation. If AMD choose to reuse the page table too, we might share the same ioctl to set up guest addressing with them.

signed-off-by: Carsten Otte [EMAIL PROTECTED]
reviewed-by: Christian Borntraeger [EMAIL PROTECTED]
reviewed-by: Christian Ehrhardt [EMAIL PROTECTED]
---
Re: [kvm-devel] RFC/patch portability: split kvm_vm_ioctl
On Friday, 12.10.2007 at 15:37 +0200, Arnd Bergmann wrote:

I assume the contents are ok, since you're just moving code around, but please write this 'Signed-off-by' and 'Reviewed-by' (capital letters), and include a diffstat for any patch that doesn't fit on a few pages of mail client screen space.

The intent of an RFC is in general to review a patch, not to pick on formalities.

Signed-off-by: Carsten Otte [EMAIL PROTECTED]
Reviewed-by: Christian Borntraeger [EMAIL PROTECTED]
Reviewed-by: Christian Ehrhardt [EMAIL PROTECTED]
---
 kvm.h      |    3
 kvm_main.c |  460 ---
 x86.c      |  472 +
 3 files changed, 478 insertions(+), 457 deletions(-)

Index: kvm/drivers/kvm/kvm.h
===================================================================
--- kvm.orig/drivers/kvm/kvm.h	2007-10-12 13:38:59.0 +0200
+++ kvm/drivers/kvm/kvm.h	2007-10-12 14:22:40.0 +0200
@@ -661,6 +661,9 @@
 			unsigned int ioctl, unsigned long arg);
 long kvm_arch_vcpu_ioctl(struct file *filp,
 			 unsigned int ioctl, unsigned long arg);
+long kvm_arch_vm_ioctl(struct file *filp,
+		       unsigned int ioctl, unsigned long arg);
+void kvm_arch_destroy_vm(struct kvm *kvm);
 void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu);
 void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu);
 
Index: kvm/drivers/kvm/kvm_main.c
===================================================================
--- kvm.orig/drivers/kvm/kvm_main.c	2007-10-12 13:38:59.0 +0200
+++ kvm/drivers/kvm/kvm_main.c	2007-10-12 13:57:30.0 +0200
@@ -40,7 +40,6 @@
 #include <linux/anon_inodes.h>
 #include <linux/profile.h>
 #include <linux/kvm_para.h>
-#include <linux/pagemap.h>
 
 #include <asm/processor.h>
 #include <asm/msr.h>
@@ -319,61 +318,6 @@
 	return kvm;
 }
 
-static void kvm_free_userspace_physmem(struct kvm_memory_slot *free)
-{
-	int i;
-
-	for (i = 0; i < free->npages; ++i) {
-		if (free->phys_mem[i]) {
-			if (!PageReserved(free->phys_mem[i]))
-				SetPageDirty(free->phys_mem[i]);
-			page_cache_release(free->phys_mem[i]);
-		}
-	}
-}
-
-static void kvm_free_kernel_physmem(struct kvm_memory_slot *free)
-{
-	int i;
-
-	for (i = 0; i < free->npages; ++i)
-		if (free->phys_mem[i])
-			__free_page(free->phys_mem[i]);
-}
-
-/*
- * Free any memory in @free but not in @dont.
- */
-static void kvm_free_physmem_slot(struct kvm_memory_slot *free,
-				  struct kvm_memory_slot *dont)
-{
-	if (!dont || free->phys_mem != dont->phys_mem)
-		if (free->phys_mem) {
-			if (free->user_alloc)
-				kvm_free_userspace_physmem(free);
-			else
-				kvm_free_kernel_physmem(free);
-			vfree(free->phys_mem);
-		}
-	if (!dont || free->rmap != dont->rmap)
-		vfree(free->rmap);
-
-	if (!dont || free->dirty_bitmap != dont->dirty_bitmap)
-		vfree(free->dirty_bitmap);
-
-	free->phys_mem = NULL;
-	free->npages = 0;
-	free->dirty_bitmap = NULL;
-}
-
-static void kvm_free_physmem(struct kvm *kvm)
-{
-	int i;
-
-	for (i = 0; i < kvm->nmemslots; ++i)
-		kvm_free_physmem_slot(&kvm->memslots[i], NULL);
-}
-
 static void free_pio_guest_pages(struct kvm_vcpu *vcpu)
 {
 	int i;
@@ -421,7 +365,7 @@
 	kfree(kvm->vpic);
 	kfree(kvm->vioapic);
 	kvm_free_vcpus(kvm);
-	kvm_free_physmem(kvm);
+	kvm_arch_destroy_vm(kvm);
 	kfree(kvm);
 }
 
@@ -686,183 +630,6 @@
 EXPORT_SYMBOL_GPL(fx_init);
 
 /*
- * Allocate some memory and give it an address in the guest physical address
- * space.
- *
- * Discontiguous memory is allowed, mostly for framebuffers.
- */
-static int kvm_vm_ioctl_set_memory_region(struct kvm *kvm,
-					  struct
-					  kvm_userspace_memory_region *mem,
-					  int user_alloc)
-{
-	int r;
-	gfn_t base_gfn;
-	unsigned long npages;
-	unsigned long i;
-	struct kvm_memory_slot *memslot;
-	struct kvm_memory_slot old, new;
-
-	r = -EINVAL;
-	/* General sanity checks */
-	if (mem->memory_size & (PAGE_SIZE - 1))
-		goto out;
-	if (mem->guest_phys_addr & (PAGE_SIZE - 1))
-		goto out;
-	if (mem->slot >= KVM_MEMORY_SLOTS)
-		goto out;
-	if (mem->guest_phys_addr + mem->memory_size < mem->guest_phys_addr)
-		goto out;
-
-	memslot = &kvm->memslots[mem->slot];
-	base_gfn = mem->guest_phys_addr >> PAGE_SHIFT;
-	npages = mem->memory_size >> PAGE_SHIFT;
-
-	if (!npages)
[...]
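The "General sanity checks" at the top of kvm_vm_ioctl_set_memory_region can be tried out in isolation with a sketch like the following. It assumes the checks are the usual page-alignment masks, a slot bound, and an address wrap-around test; the 4 KiB page size, the slot count of 8 and the demo_* names are assumptions for the example, not values taken from the patch.

```c
#include <errno.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT   12                        /* assumed 4 KiB pages */
#define DEMO_PAGE_SIZE    (1ULL << DEMO_PAGE_SHIFT)
#define DEMO_MEMORY_SLOTS 8                         /* assumed slot count */

/* Returns 0 if the region description is plausible, -EINVAL otherwise. */
static int demo_check_region(uint32_t slot, uint64_t guest_phys_addr,
			     uint64_t memory_size)
{
	/* Size must be a whole number of pages. */
	if (memory_size & (DEMO_PAGE_SIZE - 1))
		return -EINVAL;
	/* Guest start address must be page aligned. */
	if (guest_phys_addr & (DEMO_PAGE_SIZE - 1))
		return -EINVAL;
	/* Slot index must be within the fixed slot array. */
	if (slot >= DEMO_MEMORY_SLOTS)
		return -EINVAL;
	/* Start + size must not wrap around the 64-bit address space. */
	if (guest_phys_addr + memory_size < guest_phys_addr)
		return -EINVAL;
	return 0;
}
```

None of these checks depends on how an architecture maps guest memory, which supports the argument elsewhere in the thread that at least part of this function could stay in common code.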
Re: [kvm-devel] RFC/patch portability: split kvm_vm_ioctl
On Friday 12 October 2007, Carsten Otte wrote:

This patch splits kvm_vm_ioctl into architecture independent parts, and x86 specific parts which go to kvm_arch_vcpu_ioctl in x86.c.

I assume the contents are ok, since you're just moving code around, but please

signed-off-by: Carsten Otte [EMAIL PROTECTED]
reviewed-by: Christian Borntraeger [EMAIL PROTECTED]
reviewed-by: Christian Ehrhardt [EMAIL PROTECTED]

write this 'Signed-off-by' and 'Reviewed-by' (capital letters), and include a diffstat for any patch that doesn't fit on a few pages of mail client screen space.

Arnd
Re: [kvm-devel] RFC/patch portability: split kvm_vm_ioctl
Carsten Otte wrote:

This patch splits kvm_vm_ioctl into architecture independent parts, and x86 specific parts which go to kvm_arch_vcpu_ioctl in x86.c. Common ioctls for all architectures are: KVM_CREATE_VCPU, KVM_GET_DIRTY_LOG. I'd really like to see more commonalities, but all others did not fit our needs. I would love to keep KVM_GET_DIRTY_LOG common, so that the ingenious migration code does not need to care too much about different architectures. x86 specific ioctls are: KVM_SET_MEMORY_REGION, KVM_SET_USER_MEMORY_REGION, KVM_GET/SET_NR_MMU_PAGES, KVM_SET_MEMORY_ALIAS, KVM_CREATE_IRQCHIP, KVM_CREATE_IRQ_LINE, KVM_GET/SET_IRQCHIP

I don't know why we don't make KVM_SET_MEMORY_REGION and KVM_SET_USER_MEMORY_REGION common, although I have read the reasons you listed. I think they should work for most archs, although they are not very friendly to s390. If we make them arch-specific, we have to duplicate them many times in the KVM code. One suggestion: maybe we can disable the current memory allocation logic in userspace for s390, and let s390 use your approach to get its memory.

While the pic/apic related functions are obviously x86 specific, some other ioctls seem to be common at a first glance. KVM_SET_(USER)_MEMORY_REGION for example. We've got a totally different address layout on s390: we cannot support multiple slots, and a user memory range always equals the guest physical memory [guest_phys + vm specific offset = host user address]. We neither have nor need dedicated vmas for the guest memory, we just use what the memory management has in stock. This is true because we reuse the page table for user and guest mode. Looks to me like the s390 might have a lot in common with a future AMD nested page table implementation. If AMD choose to reuse the page table too, we might share the same ioctl to set up guest addressing with them.