Re: [Xen-devel] [v8][PATCH 04/17] update the existing hypercall to support XEN_DOMCTL_set_rdm
On 09.12.14 at 08:47, tiejun.c...@intel.com wrote: On 2014/12/8 16:51, Jan Beulich wrote: The whole if-copy-unlock-and-return-EFAULT-otherwise-increment is identical and can be factored out pretty easily afaict. What about this? struct get_reserved_device_memory { struct xen_reserved_device_memory_map map; unsigned int used_entries; struct domain *domain; }; static int get_reserved_device_memory(xen_pfn_t start, xen_ulong_t nr, u32 id, void *ctxt) { struct get_reserved_device_memory *grdm = ctxt; struct domain *d = grdm-domain; unsigned int i, hit_one = 0; u32 sbdf; struct xen_reserved_device_memory rdm = { .start_pfn = start, .nr_pages = nr }; if ( !d-arch.hvm_domain.pci_force ) { for ( i = 0; i d-arch.hvm_domain.num_pcidevs; i++ ) { sbdf = PCI_SBDF2(d-arch.hvm_domain.pcidevs[i].seg, d-arch.hvm_domain.pcidevs[i].bus, d-arch.hvm_domain.pcidevs[i].devfn); if ( sbdf == id ) { hit_one = 1; break; } } if ( !hit_one ) return 0; } Why do you always pick other than the simplest possible solution? You don't need a separate variable here, you can simply check whether i reached d-arch.hvm_domain.num_pcidevs after the loop. And even if you added a variable, it would want to be a bool_t one with the way you use it. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v6 0/2] add new p2m type class and new p2m type
On 09.12.14 at 03:02, yu.c.zh...@linux.intel.com wrote: Thank you very much for your review. And could you please also help me about how to get an ACK? I'm not sure what's the next action I need to take. :-) I don't think you need to take any action at this point. The second patch will need Tim's ack, yes, but that's nothing to worry about (yet), since even with his ack the two patches wouldn't go in until after 4.5 got branched off of staging. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [v8][PATCH 02/17] introduce XEN_DOMCTL_set_rdm
On 09.12.14 at 02:06, tiejun.c...@intel.com wrote: Also how does this work with 32-bit dom0s? Is there a need to use the compat layer? Are you saying in xsm case? Others? Actually this new DOMCTL is similar with XEN_DOMCTL_assign_device in some senses but I don't see such an issue you're pointing. I was thinking about the compat layer and making sure it works properly. Do we really need this consideration? I mean I referred to that existing XEN_DOMCTL_assign_device to implement this new DOMCTL, but looks there's nothing related to this point. Or could you make your thought clear to me with an exiting example? Then I can take a look at what exactly should be taken in my new DOMCTL since I'm a fresh man to work out this properly inside xen. I think Konrad got a little confused here - domctl-s intentionally are structured so that they don't need a compat translation layer. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [Xen-users] 4.5 git: regression in xen systemd shutdown hangs the OS
On Mon, Dec 08, Wei Liu wrote: However I think the problem you're seeing is due to xenstored exits too early, which should be fixable by rearranging the shutdown order? I.e. xenstored shuts down after xendomains. True, and this is addressed by 268b8f1. Perhaps a crashed or otherwise unavailable xenstored isnt such a problem because its like that since a very long time. Olaf ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] XenServer Xen-4.5 (-rc3ish) testing
On 08.12.14 at 20:03, andrew.coop...@citrix.com wrote: XEN_DOMCTL_memory_mapping hypercall fails with EPERM where it didn't in xen-4.4, given identical parameters. The hypercall gained an extra permission check as part of 0561e1f01e. Our usecase here is a daemon in dom0 mapping guest RAM to emulate a graphics card, but I currently don't see how that is incompatible with the new permissions check. This seems quite obvious: The added check makes sure that what gets mapped is I/O memory both domains have access to, yet you say the daemon maps guest RAM. What I can't see is why you need this hypercall in this case - given what you say it's certainly not meant for the daemon to map memory into the guest? Mapping guest RAM ought to work via the privcmd kernel interface. We have certain machines which are showing reliable failure to boot under Xen-4.5, where they worked with 4.4. Symptoms range from the dom0 kernel crashing before printing anything, to complaining that the initrd is corrupt when attempting to decompress. This appears to be hardware specific. Any chance this is C-state related, just like narrowed down to for http://lists.xenproject.org/archives/html/xen-devel/2014-11/msg00228.html? I.e. Westmere Xeons being affected? If not, this would seem rather worrying to me (read: a release blocker). And even if so, a workaround would be minimally needed. Otoh you didn't report so for earlier RCs - was that just because the testing scope was more narrow then, or can we imply that this is a recently introduced regression? Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] XenServer Xen-4.5 (-rc3ish) testing
On 09.12.14 at 01:21, andrew.coop...@citrix.com wrote: On 08/12/2014 20:00, Konrad Rzeszutek Wilk wrote: On Mon, Dec 08, 2014 at 07:03:19PM +, Andrew Cooper wrote: XEN_DOMCTL_memory_mapping hypercall fails with EPERM where it didn't in xen-4.4, given identical parameters. The hypercall gained an extra permission check as part of 0561e1f01e. Our usecase here is a daemon in dom0 mapping guest RAM to emulate a graphics card, but I currently don't see how that is incompatible with the new permissions check. I presume the daemon also does 'XEN_DOMCTL_iomem_permission' to grant the other domain access to its RAM? And before it makes the XEN_DOMCTL_memory_mapping hypercall. This is purely an implementer of the ioreq server infrastructure providing an emulated set of BARs in the guest as qemu would, but without using dom0 map-foreign powers. The gfn ranges in question are regular guest RAM as far as I am aware, and should not require any special io permissions. Either way, the identified changeset has apparently caused a regression, but I am not yet certain whether it is legitimately disabling something which should not have worked in the first place, or whether it is a change which needs reconsidering. Actually I think this is still too lax: When we set up Dom0's iomem permissions, we blindly add all memory, when we should only add all I/O memory (or better, exclude all RAM). Iiuc such a change would similarly break your daemon, without need for Arianna's. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Xen 4.5.0rc3 + Linux v3.18 + dom0pvh=1 = BOOM
On 08.12.14 at 22:27, konrad.w...@oracle.com wrote: [8.761336] [ cut here ] [8.761342] kernel BUG at arch/x86/xen/smp.c:438! if (HYPERVISOR_vcpu_op(VCPUOP_initialise, cpu, ctxt)) BUG(); [8.761348] invalid opcode: [#1] SMP [8.761355] Modules linked in: [8.761362] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.18.0 #1 [8.761369] Hardware name: LENOVO 10A6S09R01/SHARKBAY, BIOS FBKTA1AUS 10/22/2014 [8.761377] task: 8803ef058000 ti: 8803ef06 task.ti: 8803ef06 [8.761386] RIP: 0010:[810114c5] [810114c5] xen_cpu_up+0x4c5/0x4d0 [8.761396] RSP: :8803ef063dd8 EFLAGS: 00010282 [8.761402] RAX: fff4 RBX: 0001 RCX: 0001 -ENOMEM Very strange. I suppose that Dom0 kernel doesn't exhaust all of the hypervisor's memory? Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Poor network performance between DomU with multiqueue support
On Mon, Dec 08, 2014 at 01:08:18PM +, Zhangleiqiang (Trump) wrote: On Mon, Dec 08, 2014 at 06:44:26AM +, Zhangleiqiang (Trump) wrote: On Fri, Dec 05, 2014 at 01:17:16AM +, Zhangleiqiang (Trump) wrote: [...] By the way, after rethinking the testing results for multi-queue pv (kernel 3.17.4+XEN 4.4) implementation, I find that when using four queues for netback/netfront, there will be about 3 netback process running with high CPU usage on receive Dom0 (about 85% usage per process running on one CPU core), and the aggregate throughout is only about 5Gbps. I doubt that there may be some bug or pitfall in current multi-queue implementation, because for 5Gbps throughout, occurring about all of 3 CPU core for packet receiving is somehow abnormal. 3.17.4 doesn't contain David Vrabel's fixes. Look for bc96f648df1bbc2729abbb84513cf4f64273a1f1 f48da8b14d04ca87ffcffe68829afd45f926ec6a ecf08d2dbb96d5a4b4bcc53a39e8d29cc8fef02e in David Miller's net tree. I have tried to testing with 3.18-rc5 which including these patches, however, it seems that the problem mentioned is not improved. There are still 3 netback receive processes each of which uses about 85% of CPU core. BTW there are some improvement planned for 4.6: [Xen-devel] [PATCH v3 0/2] gnttab: Improve scaleability. This is orthogonal to the problem you're trying to solve but it should help improve performance in general. Wei. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] help on qemu-xen for arch arm
I am trying to cross compile the qemu-xen for arm. Followed the link below http://wiki.xen.org/wiki/Xen_ARM_with_Virtualization_Extensions/CrossCompiling#Target_Environment Built with the command CONFIG_SITE=/etc/dpkg-cross/cross-config.armhf ./configure --build=x86_64-unknown-linux-gnu --host=arm-linux-gnueabihf --with-system-qemu make dist-tools CROSS_COMPILE=arm-linux-gnueabihf- XEN_TARGET_ARCH=arm32 I did not find the build process download and compile the qemu-xen as described it in this link. http://wiki.xen.org/wiki/QEMU_Upstream Does anybody help on how to cross-compile the qemu-xen for arm? Regards, Mao___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 11/19] xen: handle XENMEMF_get_vnumainfo in compat_memory_op
On 01.12.14 at 16:33, wei.l...@citrix.com wrote: @@ -273,6 +276,33 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat) break; } +case XENMEM_get_vnumainfo: + { Hard tab. +enum XLAT_vnuma_topology_info_vdistance vdistance = +XLAT_vnuma_topology_info_vdistance_h; +enum XLAT_vnuma_topology_info_vcpu_to_vnode vcpu_to_vnode = +XLAT_vnuma_topology_info_vcpu_to_vnode_h; +enum XLAT_vnuma_topology_info_vmemrange vmemrange = +XLAT_vnuma_topology_info_vmemrange_h; + +if ( copy_from_guest(cmp.vnuma, compat, 1) ) +return -EFAULT; + +#define XLAT_vnuma_topology_info_HNDL_vdistance_h(_d_, _s_) \ +guest_from_compat_handle((_d_)-vdistance.h, (_s_)-vdistance.h) +#define XLAT_vnuma_topology_info_HNDL_vcpu_to_vnode_h(_d_, _s_) \ +guest_from_compat_handle((_d_)-vcpu_to_vnode.h, (_s_)-vcpu_to_vnode.h) +#define XLAT_vnuma_topology_info_HNDL_vmemrange_h(_d_, _s_) \ +guest_from_compat_handle((_d_)-vmemrange.h, (_s_)-vmemrange.h) + +XLAT_vnuma_topology_info(nat.vnuma, cmp.vnuma); + +#undef XLAT_vnuma_topology_info_HNDL_vdistance_h +#undef XLAT_vnuma_topology_info_HNDL_vcpu_to_vnode_h +#undef XLAT_vnuma_topology_info_HNDL_vmemrange_h +break; + } Again. With these two fixed and the subject adjusted to name the sub-op correctly, Reviewed-by: Jan Beulich jbeul...@suse.com Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [v8][PATCH 04/17] update the existing hypercall to support XEN_DOMCTL_set_rdm
On 2014/12/9 16:19, Jan Beulich wrote: On 09.12.14 at 08:47, tiejun.c...@intel.com wrote: On 2014/12/8 16:51, Jan Beulich wrote: The whole if-copy-unlock-and-return-EFAULT-otherwise-increment is identical and can be factored out pretty easily afaict. What about this? struct get_reserved_device_memory { struct xen_reserved_device_memory_map map; unsigned int used_entries; struct domain *domain; }; static int get_reserved_device_memory(xen_pfn_t start, xen_ulong_t nr, u32 id, void *ctxt) { struct get_reserved_device_memory *grdm = ctxt; struct domain *d = grdm-domain; unsigned int i, hit_one = 0; u32 sbdf; struct xen_reserved_device_memory rdm = { .start_pfn = start, .nr_pages = nr }; if ( !d-arch.hvm_domain.pci_force ) { for ( i = 0; i d-arch.hvm_domain.num_pcidevs; i++ ) { sbdf = PCI_SBDF2(d-arch.hvm_domain.pcidevs[i].seg, d-arch.hvm_domain.pcidevs[i].bus, d-arch.hvm_domain.pcidevs[i].devfn); if ( sbdf == id ) { hit_one = 1; break; } } if ( !hit_one ) return 0; } Why do you always pick other than the simplest possible solution? I don't intend it to be, but I may go a complicated way, even a wrong way, based on my understanding. But as one main maintainer, if you always say to me in such a reproachful word more than once, I have to consider you may hint constantly I'm not a suitable candidate to finish this. Its fair to me, I'd really like to quit this to ask my manager if it can deliver to other guy to make sure this can move forward. You don't need a separate variable here, you can simply check whether i reached d-arch.hvm_domain.num_pcidevs after the loop. And even if you added a variable, it would want to be a Are you saying this? if ( i == d-arch.hvm_domain.num_pcidevs ) return 0; But if the last one happens to one hit, 'i' is equal to d-arch.hvm_domain.num_pcidevs. Thanks Tiejun bool_t one with the way you use it. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2] console: increase initial conring size
On 05.12.14 at 16:50, daniel.ki...@oracle.com wrote: In general initial conring size is sufficient. However, if log level is increased on platforms which have e.g. huge number of memory regions (I have an IBM System x3550 M2 with 8 GiB RAM which has more than 200 entries in EFI memory map) then some of earlier messages in console ring are overwritten. It means that in case of issues deeper analysis can be hindered. Sadly conring_size argument does not help because new console buffer is allocated late on heap. It means that it is not possible to allocate larger ring earlier. So, in this situation initial conring size should be increased. My experiments showed that even on not so big machines more than 26 KiB of free space are needed for initial messages. In theory we could increase conring size buffer to 32 KiB. However, I think that this value could be too small for huge machines with large number of ACPI tables and EFI memory regions. Hence, this patch increases initial conring size to 64 KiB. Signed-off-by: Daniel Kiper daniel.ki...@oracle.com Actually meanwhile I spotted a (minor) downside to this approach: It raises the lower limit of the permanent ring size to 64k too. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Crash of guest with nested vmx with Unknown nested vmexit reason 80000021.
On 09.12.14 at 10:09, groen...@grosc.com wrote: Did anyone find the time yet? I'm still more then willing testing any patches. Just yesterday we were told by Intel that they still can't foresee when they will find time. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [v8][PATCH 04/17] update the existing hypercall to support XEN_DOMCTL_set_rdm
On 09.12.14 at 10:12, tiejun.c...@intel.com wrote: On 2014/12/9 16:19, Jan Beulich wrote: On 09.12.14 at 08:47, tiejun.c...@intel.com wrote: On 2014/12/8 16:51, Jan Beulich wrote: The whole if-copy-unlock-and-return-EFAULT-otherwise-increment is identical and can be factored out pretty easily afaict. What about this? struct get_reserved_device_memory { struct xen_reserved_device_memory_map map; unsigned int used_entries; struct domain *domain; }; static int get_reserved_device_memory(xen_pfn_t start, xen_ulong_t nr, u32 id, void *ctxt) { struct get_reserved_device_memory *grdm = ctxt; struct domain *d = grdm-domain; unsigned int i, hit_one = 0; u32 sbdf; struct xen_reserved_device_memory rdm = { .start_pfn = start, .nr_pages = nr }; if ( !d-arch.hvm_domain.pci_force ) { for ( i = 0; i d-arch.hvm_domain.num_pcidevs; i++ ) { sbdf = PCI_SBDF2(d-arch.hvm_domain.pcidevs[i].seg, d-arch.hvm_domain.pcidevs[i].bus, d-arch.hvm_domain.pcidevs[i].devfn); if ( sbdf == id ) { hit_one = 1; break; } } if ( !hit_one ) return 0; } Why do you always pick other than the simplest possible solution? I don't intend it to be, but I may go a complicated way, even a wrong way, based on my understanding. But as one main maintainer, if you always say to me in such a reproachful word more than once, I have to consider you may hint constantly I'm not a suitable candidate to finish this. Its fair to me, I'd really like to quit this to ask my manager if it can deliver to other guy to make sure this can move forward. You don't need a separate variable here, you can simply check whether i reached d-arch.hvm_domain.num_pcidevs after the loop. And even if you added a variable, it would want to be a Are you saying this? if ( i == d-arch.hvm_domain.num_pcidevs ) return 0; Yes. Or use =. But if the last one happens to one hit, 'i' is equal to d-arch.hvm_domain.num_pcidevs. No, when the last one hits, i == d-arch.hvm_domain.num_pcidevs - 1. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [v8][PATCH 04/17] update the existing hypercall to support XEN_DOMCTL_set_rdm
On 2014/12/9 17:21, Jan Beulich wrote: On 09.12.14 at 10:12, tiejun.c...@intel.com wrote: On 2014/12/9 16:19, Jan Beulich wrote: On 09.12.14 at 08:47, tiejun.c...@intel.com wrote: On 2014/12/8 16:51, Jan Beulich wrote: The whole if-copy-unlock-and-return-EFAULT-otherwise-increment is identical and can be factored out pretty easily afaict. What about this? struct get_reserved_device_memory { struct xen_reserved_device_memory_map map; unsigned int used_entries; struct domain *domain; }; static int get_reserved_device_memory(xen_pfn_t start, xen_ulong_t nr, u32 id, void *ctxt) { struct get_reserved_device_memory *grdm = ctxt; struct domain *d = grdm-domain; unsigned int i, hit_one = 0; u32 sbdf; struct xen_reserved_device_memory rdm = { .start_pfn = start, .nr_pages = nr }; if ( !d-arch.hvm_domain.pci_force ) { for ( i = 0; i d-arch.hvm_domain.num_pcidevs; i++ ) { sbdf = PCI_SBDF2(d-arch.hvm_domain.pcidevs[i].seg, d-arch.hvm_domain.pcidevs[i].bus, d-arch.hvm_domain.pcidevs[i].devfn); if ( sbdf == id ) { hit_one = 1; break; } } if ( !hit_one ) return 0; } Why do you always pick other than the simplest possible solution? I don't intend it to be, but I may go a complicated way, even a wrong way, based on my understanding. But as one main maintainer, if you always say to me in such a reproachful word more than once, I have to consider you may hint constantly I'm not a suitable candidate to finish this. Its fair to me, I'd really like to quit this to ask my manager if it can deliver to other guy to make sure this can move forward. You don't need a separate variable here, you can simply check whether i reached d-arch.hvm_domain.num_pcidevs after the loop. And even if you added a variable, it would want to be a Are you saying this? if ( i == d-arch.hvm_domain.num_pcidevs ) return 0; Yes. Or use =. Okay. But if the last one happens to one hit, 'i' is equal to d-arch.hvm_domain.num_pcidevs. No, when the last one hits, i == d-arch.hvm_domain.num_pcidevs - 1. You're right. Thanks Tiejun ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] xen-netfront: Fix handling packets on compound pages with skb_linearize
On Mon, Dec 08, 2014 at 11:11:15AM +, David Vrabel wrote: On 08/12/14 10:19, Luis Henriques wrote: On Mon, Dec 01, 2014 at 09:55:24AM +0100, Stefan Bader wrote: On 11.08.2014 19:32, Zoltan Kiss wrote: There is a long known problem with the netfront/netback interface: if the guest tries to send a packet which constitues more than MAX_SKB_FRAGS + 1 ring slots, it gets dropped. The reason is that netback maps these slots to a frag in the frags array, which is limited by size. Having so many slots can occur since compound pages were introduced, as the ring protocol slice them up into individual (non-compound) page aligned slots. The theoretical worst case scenario looks like this (note, skbs are limited to 64 Kb here): linear buffer: at most PAGE_SIZE - 17 * 2 bytes, overlapping page boundary, using 2 slots first 15 frags: 1 + PAGE_SIZE + 1 bytes long, first and last bytes are at the end and the beginning of a page, therefore they use 3 * 15 = 45 slots last 2 frags: 1 + 1 bytes, overlapping page boundary, 2 * 2 = 4 slots Although I don't think this 51 slots skb can really happen, we need a solution which can deal with every scenario. In real life there is only a few slots overdue, but usually it causes the TCP stream to be blocked, as the retry will most likely have the same buffer layout. This patch solves this problem by linearizing the packet. This is not the fastest way, and it can fail much easier as it tries to allocate a big linear area for the whole packet, but probably easier by an order of magnitude than anything else. Probably this code path is not touched very frequently anyway. Signed-off-by: Zoltan Kiss zoltan.k...@citrix.com Cc: Wei Liu wei.l...@citrix.com Cc: Ian Campbell ian.campb...@citrix.com Cc: Paul Durrant paul.durr...@citrix.com Cc: net...@vger.kernel.org Cc: linux-ker...@vger.kernel.org Cc: xen-de...@lists.xenproject.org This does not seem to be marked explicitly as stable. Has someone already asked David Miller to put it on his stable queue? IMO it qualifies quite well and the actual change should be simple to pick/backport. Thank you Stefan, I'm queuing this for the next 3.16 kernel release. Don't backport this yes. It's broken. It produces malformed requests and netback will report a fatal error and stop all traffic on the VIF. David Ok, thank you. I've dropped it already. Cheers, -- Luís ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] arch arm qemu compile erro
While I tried to cross compile the qemu for xen with the following command: make dist-stubdom CROSS_COMPILE=arm-linux-gnueabihf- XEN_TARGET_ARCH=arm32 I got an error complain about the ./extras/mini-os/arch/arm32/arch.mk: No such file or directory is missing. My xen source version is on Xen-4.5.0-rc3 Is the build process for xen broken? or some ./configure needed? Regards, Mao ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [v8][PATCH 04/17] update the existing hypercall to support XEN_DOMCTL_set_rdm
At 08:19 + on 09 Dec (1418109561), Jan Beulich wrote: Why do you always pick other than the simplest possible solution? Jan, please don't make personal comments like this in code review. It can only offend and demoralize contributors, and deter other people from joining in. I understand that it can be frustrating, and I'm sure I have lashed out at people on-list in the past. But remember that people who are new to the project need time to learn, and keep the comments to the code itself. I can see that this series has been going for a long time, and is still getting hung up on coding style issues. Might it be useful to have a round of internal review from some of the more xen-experienced engineers at Intel before Jan looks at it again? Cheers, Tim. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] One question about the hypercall to translate gfn to mfn.
Hi all, As you can see, we are pushing our XenGT patches to the upstream. One feature we need in xen is to translate guests' gfn to mfn in XenGT dom0 device model. Here we may have 2 similar solutions: 1 Paul told me(and thank you, Paul :)) that there used to be a hypercall, XENMEM_translate_gpfn_list, which was removed by Keir in commit 2d2f7977a052e655db6748be5dabf5a58f5c5e32, because there was no usage at that time. So solution 1 is to revert this commit. However, since this hypercall was removed ages ago, the reverting met many conflicts, i.e. the gmfn_to_mfn is no longer used in x86, etc. 2 In our project, we defined a new hypercall XENMEM_get_mfn_from_pfn, which has a similar implementation like the previous XENMEM_translate_gpfn_list. One of the major differences is that this newly defined one is only for x86(called in arch_memory_op), so we do not have to worry about the arm side. Does anyone has any suggestions about this? Thanks in advance. :) B.R. Yu ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v6 0/2] add new p2m type class and new p2m type
On 12/9/2014 4:31 PM, Jan Beulich wrote: On 09.12.14 at 03:02, yu.c.zh...@linux.intel.com wrote: Thank you very much for your review. And could you please also help me about how to get an ACK? I'm not sure what's the next action I need to take. :-) I don't think you need to take any action at this point. The second patch will need Tim's ack, yes, but that's nothing to worry about (yet), since even with his ack the two patches wouldn't go in until after 4.5 got branched off of staging. Got it, and thanks! Yu Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] One question about the hypercall to translate gfn to mfn.
-Original Message- From: Yu, Zhang [mailto:yu.c.zh...@linux.intel.com] Sent: 09 December 2014 10:11 To: Paul Durrant; Keir (Xen.org); Tim (Xen.org); jbeul...@suse.com; Kevin Tian; Xen-devel@lists.xen.org Subject: One question about the hypercall to translate gfn to mfn. Hi all, As you can see, we are pushing our XenGT patches to the upstream. One feature we need in xen is to translate guests' gfn to mfn in XenGT dom0 device model. Here we may have 2 similar solutions: 1 Paul told me(and thank you, Paul :)) that there used to be a hypercall, XENMEM_translate_gpfn_list, which was removed by Keir in commit 2d2f7977a052e655db6748be5dabf5a58f5c5e32, because there was no usage at that time. So solution 1 is to revert this commit. However, since this hypercall was removed ages ago, the reverting met many conflicts, i.e. the gmfn_to_mfn is no longer used in x86, etc. 2 In our project, we defined a new hypercall XENMEM_get_mfn_from_pfn, which has a similar implementation like the previous XENMEM_translate_gpfn_list. One of the major differences is that this newly defined one is only for x86(called in arch_memory_op), so we do not have to worry about the arm side. Does anyone has any suggestions about this? IIUC what is needed is a means to IOMMU map a gfn in the service domain (dom0 for the moment) such that it can be accessed by the GPU. I think use of an raw mfn value currently works only because dom0 is using a 1:1 IOMMU mapping scheme. Is my understanding correct, or do you really need raw mfn values? Paul Thanks in advance. :) B.R. Yu ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [v8][PATCH 04/17] update the existing hypercall to support XEN_DOMCTL_set_rdm
On 09.12.14 at 11:11, t...@xen.org wrote: At 08:19 + on 09 Dec (1418109561), Jan Beulich wrote: Why do you always pick other than the simplest possible solution? Jan, please don't make personal comments like this in code review. It can only offend and demoralize contributors, and deter other people from joining in. I apologize - I shouldn't have permitted myself to do so. I understand that it can be frustrating, and I'm sure I have lashed out at people on-list in the past. But remember that people who are new to the project need time to learn, and keep the comments to the code itself. I can see that this series has been going for a long time, and is still getting hung up on coding style issues. Might it be useful to have a round of internal review from some of the more xen-experienced engineers at Intel before Jan looks at it again? I've been suggesting something along those lines a number of times, with (apparently) no success at all. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] Time in Xen
Hello, I'm using Xen 4.2 on CentOS kernel - 3.10.56-11.el6.centos.alt.x86_64 I need to interfere in Xen source code, I would like to ask several questions: - How to set tsc_mode to 1 using virt-manager or virt-install tool - or should I do it in different way? - Where in code (Xen 4.2) can I find part responsible for time emulation (while tsc_mode=1) - Where can I find code in Xen which sends the current time to kernel Thank you for responses ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] arch arm qemu compile erro
On Tue, 2014-12-09 at 10:04 +, Mao Mingya wrote: While I tried to cross compile the qemu for xen with the following command: make dist-stubdom CROSS_COMPILE=arm-linux-gnueabihf- XEN_TARGET_ARCH=arm32 I got an error complain about the ./extras/mini-os/arch/arm32/arch.mk: No such file or directory is missing. My xen source version is on Xen-4.5.0-rc3 Is the build process for xen broken? or some ./configure needed? stubdoms are not supported on ARM right now. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] XenServer Xen-4.5 (-rc3ish) testing
On 09/12/14 08:46, Jan Beulich wrote: We have certain machines which are showing reliable failure to boot under Xen-4.5, where they worked with 4.4. Symptoms range from the dom0 kernel crashing before printing anything, to complaining that the initrd is corrupt when attempting to decompress. This appears to be hardware specific. Any chance this is C-state related, just like narrowed down to for http://lists.xenproject.org/archives/html/xen-devel/2014-11/msg00228.html? I.e. Westmere Xeons being affected? If not, this would seem rather worrying to me (read: a release blocker). And even if so, a workaround would be minimally needed. Otoh you didn't report so for earlier RCs - was that just because the testing scope was more narrow then, or can we imply that this is a recently introduced regression? Jan I very much doubt it. We blanket set max cstate to 1 on that era of hardware, because the existing workarounds in Xen still experimentally don't work. https://github.com/xenserver/xen-4.4.pg/blob/master/detect-nehalem-c-state.patch The first system I am looking at with a view to fixing is a SandyBridge EN IBM Blade. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] One question about the hypercall to translate gfn to mfn.
On 09.12.14 at 11:10, yu.c.zh...@linux.intel.com wrote: As you can see, we are pushing our XenGT patches to the upstream. One feature we need in xen is to translate guests' gfn to mfn in XenGT dom0 device model. Here we may have 2 similar solutions: 1 Paul told me(and thank you, Paul :)) that there used to be a hypercall, XENMEM_translate_gpfn_list, which was removed by Keir in commit 2d2f7977a052e655db6748be5dabf5a58f5c5e32, because there was no usage at that time. So solution 1 is to revert this commit. However, since this hypercall was removed ages ago, the reverting met many conflicts, i.e. the gmfn_to_mfn is no longer used in x86, etc. 2 In our project, we defined a new hypercall XENMEM_get_mfn_from_pfn, which has a similar implementation like the previous XENMEM_translate_gpfn_list. One of the major differences is that this newly defined one is only for x86(called in arch_memory_op), so we do not have to worry about the arm side. Does anyone has any suggestions about this? Out of the two 1 seems preferable. But without background (see also Paul's reply) it's hard to tell whether that's what you want/need. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] One question about the hypercall to translate gfn to mfn.
On 12/9/2014 6:19 PM, Paul Durrant wrote: I think use of an raw mfn value currently works only because dom0 is using a 1:1 IOMMU mapping scheme. Is my understanding correct, or do you really need raw mfn values? Thanks for your quick response, Paul. Well, not exactly for this case. :) In XenGT, our need to translate gfn to mfn is for GPU's page table, which contains the translation between graphic address and the memory address. This page table is maintained by GPU drivers, and our service domain need to have a method to translate the guest physical addresses written by the vGPU into host physical ones. We do not use IOMMU in XenGT and therefore this translation may not necessarily be a 1:1 mapping. B.R. Yu ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] One question about the hypercall to translate gfn to mfn.
At 18:10 +0800 on 09 Dec (1418145055), Yu, Zhang wrote: Hi all, As you can see, we are pushing our XenGT patches to the upstream. One feature we need in xen is to translate guests' gfn to mfn in XenGT dom0 device model. Here we may have 2 similar solutions: 1 Paul told me(and thank you, Paul :)) that there used to be a hypercall, XENMEM_translate_gpfn_list, which was removed by Keir in commit 2d2f7977a052e655db6748be5dabf5a58f5c5e32, because there was no usage at that time. It's been suggested before that we should revive this hypercall, and I don't think it's a good idea. Whenever a domain needs to know the actual MFN of another domain's memory it's usually because the security model is problematic. In particular, finding the MFN is usually followed by a brute-force mapping from a dom0 process, or by passing the MFN to a device for unprotected DMA. These days DMA access should be protected by IOMMUs, or else the device drivers (and associated tools) are effectively inside the hypervisor's TCB. Luckily on x86 IOMMUs are widely available (and presumably present on anything new enough to run XenGT?). So I think the interface we need here is a please-map-this-gfn one, like the existing grant-table ops (which already do what you need by returning an address suitable for DMA). If adding a grant entry for every frame of the framebuffer within the guest is too much, maybe we can make a new interface for the guest to grant access to larger areas. Cheers, Tim. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [linux-next test] 32152: regressions - FAIL
flight 32152 linux-next real [real] http://www.chiark.greenend.org.uk/~xensrcts/logs/32152/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-i386-libvirt 5 xen-boot fail REGR. vs. 32100 test-amd64-i386-xl5 xen-boot fail REGR. vs. 32100 test-amd64-amd64-libvirt 5 xen-boot fail REGR. vs. 32100 test-amd64-i386-xl-credit25 xen-boot fail REGR. vs. 32100 test-amd64-i386-xl-multivcpu 5 xen-boot fail REGR. vs. 32100 test-amd64-amd64-xl 5 xen-boot fail REGR. vs. 32100 test-amd64-i386-qemut-rhel6hvm-amd 5 xen-bootfail REGR. vs. 32100 test-amd64-i386-rhel6hvm-amd 5 xen-boot fail REGR. vs. 32100 test-armhf-armhf-libvirt 7 debian-installfail REGR. vs. 32100 test-amd64-i386-qemuu-rhel6hvm-amd 5 xen-bootfail REGR. vs. 32100 test-amd64-i386-freebsd10-amd64 5 xen-boot fail REGR. vs. 32100 test-amd64-i386-freebsd10-i386 5 xen-bootfail REGR. vs. 32100 test-armhf-armhf-xl 7 debian-installfail REGR. vs. 32100 test-amd64-i386-xl-qemut-win7-amd64 5 xen-boot fail REGR. vs. 32100 test-amd64-i386-qemuu-rhel6hvm-intel 5 xen-boot fail REGR. vs. 32100 test-amd64-i386-qemut-rhel6hvm-intel 5 xen-boot fail REGR. vs. 32100 test-amd64-i386-rhel6hvm-intel 5 xen-bootfail REGR. vs. 32100 test-amd64-amd64-xl-qemut-debianhvm-amd64 5 xen-boot fail REGR. vs. 32100 test-amd64-i386-xl-qemuu-debianhvm-amd64 5 xen-boot fail REGR. vs. 32100 test-amd64-i386-xl-qemuu-ovmf-amd64 5 xen-boot fail REGR. vs. 32100 test-amd64-amd64-pair 8 xen-boot/dst_host fail REGR. vs. 32100 test-amd64-amd64-pair 7 xen-boot/src_host fail REGR. vs. 32100 test-amd64-amd64-xl-qemut-win7-amd64 5 xen-boot fail REGR. vs. 32100 test-amd64-i386-xl-qemut-debianhvm-amd64 5 xen-boot fail REGR. vs. 32100 test-amd64-amd64-xl-qemuu-debianhvm-amd64 5 xen-boot fail REGR. vs. 32100 test-amd64-i386-rumpuserxen-i386 5 xen-boot fail REGR. vs. 32100 test-amd64-amd64-rumpuserxen-amd64 5 xen-bootfail REGR. vs. 32100 test-amd64-amd64-xl-qemuu-ovmf-amd64 5 xen-boot fail REGR. vs. 32100 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 5 xen-boot fail REGR. vs. 32100 test-amd64-i386-pair 8 xen-boot/dst_host fail REGR. vs. 32100 test-amd64-i386-pair 7 xen-boot/src_host fail REGR. vs. 32100 test-amd64-i386-xl-qemuu-win7-amd64 5 xen-boot fail REGR. vs. 32100 test-amd64-i386-xl-winxpsp3 5 xen-boot fail REGR. vs. 32100 test-amd64-i386-xl-qemuu-winxpsp3 5 xen-boot fail REGR. vs. 32100 test-amd64-amd64-xl-win7-amd64 5 xen-bootfail REGR. vs. 32100 test-amd64-i386-xl-winxpsp3-vcpus1 5 xen-bootfail REGR. vs. 32100 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 5 xen-boot fail REGR. vs. 32100 test-amd64-amd64-xl-qemuu-win7-amd64 5 xen-boot fail REGR. vs. 32100 test-amd64-amd64-xl-winxpsp3 5 xen-boot fail REGR. vs. 32100 test-amd64-i386-xl-win7-amd64 5 xen-boot fail REGR. vs. 32100 test-amd64-amd64-xl-qemut-winxpsp3 5 xen-bootfail REGR. vs. 32100 Regressions which are regarded as allowable (not blocking): test-amd64-amd64-xl-sedf 5 xen-boot fail REGR. vs. 32100 test-amd64-amd64-xl-sedf-pin 5 xen-boot fail REGR. vs. 32100 test-amd64-i386-xl-qemut-winxpsp3 5 xen-bootfail blocked in 32100 test-amd64-amd64-xl-pcipt-intel 5 xen-boot fail REGR. vs. 32100 test-amd64-amd64-xl-qemuu-winxpsp3 5 xen-boot fail blocked in 32100 version targeted for testing: linux8191d07dbb16ae88cc2bc475584b9f185f02795f baseline version: linux7cc78f8fa02c2485104b86434acbc1538a3bd807 jobs: build-amd64 pass build-armhf pass build-i386 pass build-amd64-libvirt pass build-armhf-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-armhf-pvopspass build-i386-pvops pass build-amd64-rumpuserxen pass build-i386-rumpuserxen pass test-amd64-amd64-xl fail test-armhf-armhf-xl
Re: [Xen-devel] One question about the hypercall to translate gfn to mfn.
-Original Message- From: Tim Deegan [mailto:t...@xen.org] Sent: 09 December 2014 10:47 To: Yu, Zhang Cc: Paul Durrant; Keir (Xen.org); jbeul...@suse.com; Kevin Tian; Xen- de...@lists.xen.org Subject: Re: One question about the hypercall to translate gfn to mfn. At 18:10 +0800 on 09 Dec (1418145055), Yu, Zhang wrote: Hi all, As you can see, we are pushing our XenGT patches to the upstream. One feature we need in xen is to translate guests' gfn to mfn in XenGT dom0 device model. Here we may have 2 similar solutions: 1 Paul told me(and thank you, Paul :)) that there used to be a hypercall, XENMEM_translate_gpfn_list, which was removed by Keir in commit 2d2f7977a052e655db6748be5dabf5a58f5c5e32, because there was no usage at that time. It's been suggested before that we should revive this hypercall, and I don't think it's a good idea. Whenever a domain needs to know the actual MFN of another domain's memory it's usually because the security model is problematic. In particular, finding the MFN is usually followed by a brute-force mapping from a dom0 process, or by passing the MFN to a device for unprotected DMA. These days DMA access should be protected by IOMMUs, or else the device drivers (and associated tools) are effectively inside the hypervisor's TCB. Luckily on x86 IOMMUs are widely available (and presumably present on anything new enough to run XenGT?). So I think the interface we need here is a please-map-this-gfn one, like the existing grant-table ops (which already do what you need by returning an address suitable for DMA). If adding a grant entry for every frame of the framebuffer within the guest is too much, maybe we can make a new interface for the guest to grant access to larger areas. IIUC the in-guest driver is Xen-unaware so any grant entry would have to be put in the guests table by the tools, which would entail some form of flexibly sized reserved range of grant entries otherwise any PV driver that are present in the guest would merrily clobber the new grant entries. A domain can already priv map a gfn into the MMU, so I think we just need an equivalent for the IOMMU. Paul Cheers, Tim. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] One question about the hypercall to translate gfn to mfn.
On Tue, 2014-12-09 at 11:05 +, Paul Durrant wrote: -Original Message- From: Tim Deegan [mailto:t...@xen.org] Sent: 09 December 2014 10:47 To: Yu, Zhang Cc: Paul Durrant; Keir (Xen.org); jbeul...@suse.com; Kevin Tian; Xen- de...@lists.xen.org Subject: Re: One question about the hypercall to translate gfn to mfn. At 18:10 +0800 on 09 Dec (1418145055), Yu, Zhang wrote: Hi all, As you can see, we are pushing our XenGT patches to the upstream. One feature we need in xen is to translate guests' gfn to mfn in XenGT dom0 device model. Here we may have 2 similar solutions: 1 Paul told me(and thank you, Paul :)) that there used to be a hypercall, XENMEM_translate_gpfn_list, which was removed by Keir in commit 2d2f7977a052e655db6748be5dabf5a58f5c5e32, because there was no usage at that time. It's been suggested before that we should revive this hypercall, and I don't think it's a good idea. Whenever a domain needs to know the actual MFN of another domain's memory it's usually because the security model is problematic. In particular, finding the MFN is usually followed by a brute-force mapping from a dom0 process, or by passing the MFN to a device for unprotected DMA. These days DMA access should be protected by IOMMUs, or else the device drivers (and associated tools) are effectively inside the hypervisor's TCB. Luckily on x86 IOMMUs are widely available (and presumably present on anything new enough to run XenGT?). So I think the interface we need here is a please-map-this-gfn one, like the existing grant-table ops (which already do what you need by returning an address suitable for DMA). If adding a grant entry for every frame of the framebuffer within the guest is too much, maybe we can make a new interface for the guest to grant access to larger areas. IIUC the in-guest driver is Xen-unaware so any grant entry would have to be put in the guests table by the tools, which would entail some form of flexibly sized reserved range of grant entries otherwise any PV driver that are present in the guest would merrily clobber the new grant entries. A domain can already priv map a gfn into the MMU, so I think we just need an equivalent for the IOMMU. I'm not sure I'm fully understanding what's going on here, but is a variant of XENMEM_add_to_physmap+XENMAPSPACE_gmfn_foreign which also returns a DMA handle a plausible solution? Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC V8 2/3] libxl domain snapshot API design
On Mon, 2014-12-08 at 22:04 -0700, Chun Yan Liu wrote: Partly. At least for domain disk snapshot create/delete, I prefer using qmp commands instead of calling qemu-img one by one. Using qmp commands, libvirt will need libxl's help. Of course, if libxl doesn't supply that, libvirt can call qemu-img to each disk one by one, not preferred but can do. You can't use qmp unless the domain is active, for an inactive domain there is no qemu to talk to, so you have to use qemu-img anyway in that case. Does libvirt not have existing code to do all this sort of thing? (I thought so, but it turns out I may be wrong, see below). And for an active domain I expect that *must* use qmp, since it seems unlikely that you can go around changing things under the feet of an active process (maybe I'm wrong?). Following the constraint that it's better NOT to supply disk snapshot functions in libxl, then we let xl and libvirt do that by themselve separately, that's OK. Then I think NO new API needs to be exported in libxl, since: * saving/restoring memory, there are already APIs. The principle is that if existing API doesn't work good enough for you we will consider adding a new one. We probably need a new API. If you want to do a live snapshot, we would need to notify xl that we are in the middle of pausing and resuming a domain. This is where we discussed a lot. Do we really need libxl_domain_snapshot_create API? or does xl can do the work? Even for live snapshot, after calling libxl_domain_suspend with LIVE flags, memory is saved and domain is paused. xl then can call disk snapshot functions to finish disk snapshots, after all of that, call libxl_domain_unpause to unpause the domain. So I don't think xl has any trouble to do that. In case there is some misunderstanding, please point out. My mistake, I incorrectly remembered that libxl_domain_suspend would destroy (for save or migate) or resume (for checkpoint) the guest before returning. Having refreshed my memory I see that you are correct: it returns with the domain paused and it is up to the toolstack to resume or destroy it as it wishes. Sorry for the confusion. Given that it does seem like the toolstack could indeed take the disksnapshots itself without an additional API. However the current architecture for libvirt to use libxl is like libvirt libxl [other lower level stuffs] So implementing snapshot management in xl cannot work for you either. It's not part of the current architecture. This is correct, xl should not be involved in a libvirt control stack, it is orthogonal. You are right. I understand you are trying to suggest a way to ease the job. Here just to make clear this way is really better and finally acceptable? :-) Just IMO, I think xl snapshot-list is wanted, that means managing snapshots in xl is needed. The xl idiom is that you do this sort of operation with existing CLI commands e.g. ls /var/lib/vm-images/*.qcow2, lvs, qemu-img etc. Adding snapshot-list to xl would be a whole load of work to create a bunch of infrastructure which you do not need to do. My understanding was that your primary aim here was to enable snapshots via libvirt and that xl support was just a nice to have. Is that right? Not that I'm against the idea of managing domain snapshot in xl, I'm trying to reduce workload here. The downside is that now xl and libvirt are disconnected, but I think it's fine. Two things here: 1. connect xl and libvirt, then will need to manage snapshot info in libxl (or libxlu) That's not preferred since the initial design. This is not the point we discuss here. 2. for xl only, list snapshots and delete snapshots, also need to manage snapshot info (in xl) Considering manage snapshot info in xl, only question is about idl and gentypes.py, expected structure is as following and expected to be saved into json file, but it contains xl namespace and libxl namespace things, gentypes.py will have problem. Better ideas? typedef struct xl_domain_snapshot { char * name; char * description; uint64_t creation_time; char * memory_path; int num_disks; libxl_disk_snapshot *disks; char *parent; bool *current; } xl_domain_snapshot; The libxl_disk_snapshot suggests that you still want storage management inside libxl, which should probably be in libxlu? Yeah. I may put it in libxlu. This depends on who the consumers of this datastructure are: If xl only - put it in xl itself. If libvirt+xl - put it in libxlu. My understanding was that libvirt already has data structures for dealing with snapshots, but this was based entirely on the commands listed by: virsh help | grep -E pool-\|snapshot- which seemed to me to be pretty feature rich and suggested that libvirt
Re: [Xen-devel] One question about the hypercall to translate gfn to mfn.
-Original Message- From: Ian Campbell Sent: 09 December 2014 11:11 To: Paul Durrant Cc: Tim (Xen.org); Yu, Zhang; Kevin Tian; Keir (Xen.org); jbeul...@suse.com; Xen-devel@lists.xen.org Subject: Re: [Xen-devel] One question about the hypercall to translate gfn to mfn. On Tue, 2014-12-09 at 11:05 +, Paul Durrant wrote: -Original Message- From: Tim Deegan [mailto:t...@xen.org] Sent: 09 December 2014 10:47 To: Yu, Zhang Cc: Paul Durrant; Keir (Xen.org); jbeul...@suse.com; Kevin Tian; Xen- de...@lists.xen.org Subject: Re: One question about the hypercall to translate gfn to mfn. At 18:10 +0800 on 09 Dec (1418145055), Yu, Zhang wrote: Hi all, As you can see, we are pushing our XenGT patches to the upstream. One feature we need in xen is to translate guests' gfn to mfn in XenGT dom0 device model. Here we may have 2 similar solutions: 1 Paul told me(and thank you, Paul :)) that there used to be a hypercall, XENMEM_translate_gpfn_list, which was removed by Keir in commit 2d2f7977a052e655db6748be5dabf5a58f5c5e32, because there was no usage at that time. It's been suggested before that we should revive this hypercall, and I don't think it's a good idea. Whenever a domain needs to know the actual MFN of another domain's memory it's usually because the security model is problematic. In particular, finding the MFN is usually followed by a brute-force mapping from a dom0 process, or by passing the MFN to a device for unprotected DMA. These days DMA access should be protected by IOMMUs, or else the device drivers (and associated tools) are effectively inside the hypervisor's TCB. Luckily on x86 IOMMUs are widely available (and presumably present on anything new enough to run XenGT?). So I think the interface we need here is a please-map-this-gfn one, like the existing grant-table ops (which already do what you need by returning an address suitable for DMA). If adding a grant entry for every frame of the framebuffer within the guest is too much, maybe we can make a new interface for the guest to grant access to larger areas. IIUC the in-guest driver is Xen-unaware so any grant entry would have to be put in the guests table by the tools, which would entail some form of flexibly sized reserved range of grant entries otherwise any PV driver that are present in the guest would merrily clobber the new grant entries. A domain can already priv map a gfn into the MMU, so I think we just need an equivalent for the IOMMU. I'm not sure I'm fully understanding what's going on here, but is a variant of XENMEM_add_to_physmap+XENMAPSPACE_gmfn_foreign which also returns a DMA handle a plausible solution? I think we want be able to avoid setting up a PTE in the MMU since it's not needed in most (or perhaps all?) cases. Paul Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] One question about the hypercall to translate gfn to mfn.
On 09.12.14 at 12:17, paul.durr...@citrix.com wrote: I think we want be able to avoid setting up a PTE in the MMU since it's not needed in most (or perhaps all?) cases. With shared page tables, there's no way to do one without the other. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 5/6] libxl: events: Deregister evtchn fd when not needed
Ian Campbell writes (Re: [PATCH 5/6] libxl: events: Deregister evtchn fd when not needed): On Fri, 2014-11-28 at 14:47 +, Ian Jackson wrote: libxl__ev_evtchn_* is always called with the ctx lock held. For the most part this is implicit due to the caller being in an ao callback, right? Yes. However, that they don't take the lock is contrary to the rules stated for libxl__ev_* in the doc comment. That should be fixed for 4.6. OK. Should I take this as an ack ? Thanks, Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 01/19] xen: dump vNUMA information with debug key u
On Mon, Dec 08, 2014 at 05:01:29PM +, Jan Beulich wrote: [...] +for ( i = 0; i vnuma-nr_vnodes; i++ ) +{ +err = snprintf(keyhandler_scratch, 12, %3u, +vnuma-vnode_to_pnode[i]); +if ( err 0 || vnuma-vnode_to_pnode[i] == NUMA_NO_NODE ) +strlcpy(keyhandler_scratch, ???, 3); + +printk( vnode %3u - pnode %s\n, i, keyhandler_scratch); +for ( j = 0; j vnuma-nr_vmemranges; j++ ) +{ +if ( vnuma-vmemrange[j].nid == i ) +{ +mem = vnuma-vmemrange[j].end - vnuma-vmemrange[j].start; +printk(%16PRIu64 MB: %#016PRIx64 - %#016PRIx64\n, Am I misremembering that these were just 0x%PRIx64 originally? Yes. I ask because converting to the 0-padded fixed width form makes no sense together with the # modifier. For these ranges I think it's OK. quite obvious that the numbers are hex, so I'd suggest dropping the #s without replacement. And to be honest I'm also against printing duplicate information: The memory range already specifies how much memory this is. Is this what you want? +if ( vnuma-vmemrange[j].nid == i ) +{ +printk( %016PRIx64 - %016PRIx64\n, + vnuma-vmemrange[j].start, + vnuma-vmemrange[j].end); +} And it prints out something like: (XEN) 2 vnodes, 2 vcpus: (XEN)vnode0 - pnode 0 (XEN) - bb80 (XEN)vcpus: 0 Wei. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] One question about the hypercall to translate gfn to mfn.
On 09/12/14 11:23, Jan Beulich wrote: On 09.12.14 at 12:17, paul.durr...@citrix.com wrote: I think we want be able to avoid setting up a PTE in the MMU since it's not needed in most (or perhaps all?) cases. With shared page tables, there's no way to do one without the other. Interestingly the IOMMU in front of the Intel GPU is only capable of handling 4k pages and so we wouldn't end up with share page tables being used. For other PCI device's then shared page tables will be a problem. Malcolm Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] One question about the hypercall to translate gfn to mfn.
On Tue, 2014-12-09 at 11:17 +, Paul Durrant wrote: -Original Message- From: Ian Campbell Sent: 09 December 2014 11:11 To: Paul Durrant Cc: Tim (Xen.org); Yu, Zhang; Kevin Tian; Keir (Xen.org); jbeul...@suse.com; Xen-devel@lists.xen.org Subject: Re: [Xen-devel] One question about the hypercall to translate gfn to mfn. On Tue, 2014-12-09 at 11:05 +, Paul Durrant wrote: -Original Message- From: Tim Deegan [mailto:t...@xen.org] Sent: 09 December 2014 10:47 To: Yu, Zhang Cc: Paul Durrant; Keir (Xen.org); jbeul...@suse.com; Kevin Tian; Xen- de...@lists.xen.org Subject: Re: One question about the hypercall to translate gfn to mfn. At 18:10 +0800 on 09 Dec (1418145055), Yu, Zhang wrote: Hi all, As you can see, we are pushing our XenGT patches to the upstream. One feature we need in xen is to translate guests' gfn to mfn in XenGT dom0 device model. Here we may have 2 similar solutions: 1 Paul told me(and thank you, Paul :)) that there used to be a hypercall, XENMEM_translate_gpfn_list, which was removed by Keir in commit 2d2f7977a052e655db6748be5dabf5a58f5c5e32, because there was no usage at that time. It's been suggested before that we should revive this hypercall, and I don't think it's a good idea. Whenever a domain needs to know the actual MFN of another domain's memory it's usually because the security model is problematic. In particular, finding the MFN is usually followed by a brute-force mapping from a dom0 process, or by passing the MFN to a device for unprotected DMA. These days DMA access should be protected by IOMMUs, or else the device drivers (and associated tools) are effectively inside the hypervisor's TCB. Luckily on x86 IOMMUs are widely available (and presumably present on anything new enough to run XenGT?). So I think the interface we need here is a please-map-this-gfn one, like the existing grant-table ops (which already do what you need by returning an address suitable for DMA). If adding a grant entry for every frame of the framebuffer within the guest is too much, maybe we can make a new interface for the guest to grant access to larger areas. IIUC the in-guest driver is Xen-unaware so any grant entry would have to be put in the guests table by the tools, which would entail some form of flexibly sized reserved range of grant entries otherwise any PV driver that are present in the guest would merrily clobber the new grant entries. A domain can already priv map a gfn into the MMU, so I think we just need an equivalent for the IOMMU. I'm not sure I'm fully understanding what's going on here, but is a variant of XENMEM_add_to_physmap+XENMAPSPACE_gmfn_foreign which also returns a DMA handle a plausible solution? I think we want be able to avoid setting up a PTE in the MMU since it's not needed in most (or perhaps all?) cases. Another (wildly under-informed) thought then: A while back Global logic proposed (for ARM) an infrastructure for allowing dom0 drivers to maintain a set of iommu like pagetables under hypervisor supervision (they called these remoteprocessor iommu). I didn't fully grok what it was at the time, let alone remember the details properly now, but AIUI it was essentially a framework for allowing a simple Xen side driver to provide PV-MMU-like update operations for a set of PTs which were not the main-processor's PTs, with validation etc. See http://thread.gmane.org/gmane.comp.emulators.xen.devel/212945 The introductory email even mentions GPUs... Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 01/19] xen: dump vNUMA information with debug key u
On 09.12.14 at 12:22, wei.l...@citrix.com wrote: Is this what you want? +if ( vnuma-vmemrange[j].nid == i ) +{ +printk( %016PRIx64 - %016PRIx64\n, + vnuma-vmemrange[j].start, + vnuma-vmemrange[j].end); +} And it prints out something like: (XEN) 2 vnodes, 2 vcpus: (XEN)vnode0 - pnode 0 (XEN) - bb80 (XEN)vcpus: 0 This looks fine, yes. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] One question about the hypercall to translate gfn to mfn.
-Original Message- From: Ian Campbell Sent: 09 December 2014 11:29 To: Paul Durrant Cc: Tim (Xen.org); Yu, Zhang; Kevin Tian; Keir (Xen.org); jbeul...@suse.com; Xen-devel@lists.xen.org Subject: Re: [Xen-devel] One question about the hypercall to translate gfn to mfn. On Tue, 2014-12-09 at 11:17 +, Paul Durrant wrote: -Original Message- From: Ian Campbell Sent: 09 December 2014 11:11 To: Paul Durrant Cc: Tim (Xen.org); Yu, Zhang; Kevin Tian; Keir (Xen.org); jbeul...@suse.com; Xen-devel@lists.xen.org Subject: Re: [Xen-devel] One question about the hypercall to translate gfn to mfn. On Tue, 2014-12-09 at 11:05 +, Paul Durrant wrote: -Original Message- From: Tim Deegan [mailto:t...@xen.org] Sent: 09 December 2014 10:47 To: Yu, Zhang Cc: Paul Durrant; Keir (Xen.org); jbeul...@suse.com; Kevin Tian; Xen- de...@lists.xen.org Subject: Re: One question about the hypercall to translate gfn to mfn. At 18:10 +0800 on 09 Dec (1418145055), Yu, Zhang wrote: Hi all, As you can see, we are pushing our XenGT patches to the upstream. One feature we need in xen is to translate guests' gfn to mfn in XenGT dom0 device model. Here we may have 2 similar solutions: 1 Paul told me(and thank you, Paul :)) that there used to be a hypercall, XENMEM_translate_gpfn_list, which was removed by Keir in commit 2d2f7977a052e655db6748be5dabf5a58f5c5e32, because there was no usage at that time. It's been suggested before that we should revive this hypercall, and I don't think it's a good idea. Whenever a domain needs to know the actual MFN of another domain's memory it's usually because the security model is problematic. In particular, finding the MFN is usually followed by a brute-force mapping from a dom0 process, or by passing the MFN to a device for unprotected DMA. These days DMA access should be protected by IOMMUs, or else the device drivers (and associated tools) are effectively inside the hypervisor's TCB. Luckily on x86 IOMMUs are widely available (and presumably present on anything new enough to run XenGT?). So I think the interface we need here is a please-map-this-gfn one, like the existing grant-table ops (which already do what you need by returning an address suitable for DMA). If adding a grant entry for every frame of the framebuffer within the guest is too much, maybe we can make a new interface for the guest to grant access to larger areas. IIUC the in-guest driver is Xen-unaware so any grant entry would have to be put in the guests table by the tools, which would entail some form of flexibly sized reserved range of grant entries otherwise any PV driver that are present in the guest would merrily clobber the new grant entries. A domain can already priv map a gfn into the MMU, so I think we just need an equivalent for the IOMMU. I'm not sure I'm fully understanding what's going on here, but is a variant of XENMEM_add_to_physmap+XENMAPSPACE_gmfn_foreign which also returns a DMA handle a plausible solution? I think we want be able to avoid setting up a PTE in the MMU since it's not needed in most (or perhaps all?) cases. Another (wildly under-informed) thought then: A while back Global logic proposed (for ARM) an infrastructure for allowing dom0 drivers to maintain a set of iommu like pagetables under hypervisor supervision (they called these remoteprocessor iommu). I didn't fully grok what it was at the time, let alone remember the details properly now, but AIUI it was essentially a framework for allowing a simple Xen side driver to provide PV-MMU-like update operations for a set of PTs which were not the main-processor's PTs, with validation etc. See http://thread.gmane.org/gmane.comp.emulators.xen.devel/212945 The introductory email even mentions GPUs... That series does indeed seem to be very relevant. Paul Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH] netback: don't store invalid vif pointer
When xenvif_alloc() fails, it returns a non-NULL error indicator. To avoid eventual races, we shouldn't store that into struct backend_info as readers of it only check for NULL. Signed-off-by: Jan Beulich jbeul...@suse.com --- a/drivers/net/xen-netback/xenbus.c +++ b/drivers/net/xen-netback/xenbus.c @@ -404,6 +404,7 @@ static int backend_create_xenvif(struct int err; long handle; struct xenbus_device *dev = be-dev; + struct xenvif *vif; if (be-vif != NULL) return 0; @@ -414,13 +415,13 @@ static int backend_create_xenvif(struct return (err 0) ? err : -EINVAL; } - be-vif = xenvif_alloc(dev-dev, dev-otherend_id, handle); - if (IS_ERR(be-vif)) { - err = PTR_ERR(be-vif); - be-vif = NULL; + vif = xenvif_alloc(dev-dev, dev-otherend_id, handle); + if (IS_ERR(vif)) { + err = PTR_ERR(vif); xenbus_dev_fatal(dev, err, creating interface); return err; } + be-vif = vif; kobject_uevent(dev-dev.kobj, KOBJ_ONLINE); return 0; ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [libvirt] [PATCH] libxl: Set path to console on domain startup.
On Mon, Dec 08, 2014 at 12:03:36PM -0700, Jim Fehlig wrote: Anthony PERARD wrote: The path to the pty of a Xen PV console is set only in virDomainOpenConsole. But this is done too late. A call to virDomainGetXMLDesc done before OpenConsole will not have the path to the pty, but a call after OpenConsole will. Hi Anthony, Thanks for the patch. Can you address the comments made by others, my comments below, and provide a V2? Will do. e.g. of the current issue. Starting a domain with 'console type=pty/' Then: virDomainGetXMLDesc(): devices console type='pty' target type='xen' port='0'/ /console /devices virDomainOpenConsole() virDomainGetXMLDesc(): devices console type='pty' tty='/dev/pts/30' source path='/dev/pts/30'/ target type='xen' port='0'/ /console /devices The patch intend to get the tty path on the first call of GetXMLDesc. Signed-off-by: Anthony PERARD anthony.per...@citrix.com --- src/libxl/libxl_domain.c | 17 + 1 file changed, 17 insertions(+) diff --git a/src/libxl/libxl_domain.c b/src/libxl/libxl_domain.c index 9c62291..de56054 100644 --- a/src/libxl/libxl_domain.c +++ b/src/libxl/libxl_domain.c @@ -1290,6 +1290,23 @@ libxlDomainStart(libxlDriverPrivatePtr driver, virDomainObjPtr vm, if (libxlDomainSetVcpuAffinities(driver, vm) 0) goto cleanup_dom; +if (vm-def-nconsoles) { +virDomainChrDefPtr chr = NULL; +chr = vm-def-consoles[0]; +if (chr chr-source.type == VIR_DOMAIN_CHR_TYPE_PTY) { +libxl_console_type console_type; +char *console = NULL; +console_type = +(chr-targetType == VIR_DOMAIN_CHR_CONSOLE_TARGET_TYPE_SERIAL ? + LIBXL_CONSOLE_TYPE_SERIAL : LIBXL_CONSOLE_TYPE_PV); +ret = libxl_console_get_tty(priv-ctx, vm-def-id, chr-target.port, +console_type, console); +if (!ret) +ignore_value(VIR_STRDUP(chr-source.data.file.path, console)); VIR_STRDUP will not free an existing value. You should VIR_FREE first, which btw can handle a NULL argument. And since you're initializing source.data.file.path when starting the domain, I think we can drop the similar code in libxlDomainOpenConsole(). I will do that. Thanks, -- Anthony PERARD ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] Konrad R Wilk's Xen Project Committer Status confirmed (+ some governance policy concerns)
Hi all, I wanted to to summarize the outcome of the vote for Konrad's committer status. We had 4 votes in favor and one abstained. Which means Konrad is confirmed. A number of concerns were raised, e.g. shouldn't other maintainers (e.g. George Dunlap as past release manager, or Stefano Stabellini) also be nominated as a committer. This could be relatively easily fixed, but requires one of the other committers to make a nomination. Concerns about the definition of the committer role were raised. In summary, the governance policy today defines the committer role purely in terms of a maintainer with write access. However the role is actually also about being entrusted with the safety, governance and the stewardship of the project. This is implied in the governance document, but not spelled out. Tim Deegan, described the issue quite well: It's clear from the governance document that the committers are expected to resolve disagreements that the maintainers and contributors can't sort out between themselves. So maybe we should have a discussion about separating the roles of committer-with-tree-access and committer-for-governance? For me, the set of people who should be settling disputes is a subset of the set of people I would trust with push access to xen.git. IOW I'm a little wary of creating a group of people who make technical decisions but aren't themselves technical. This concern does not apply to Konrad, who has excellent standing in the community and has been actively involved in design, development and review inside the hypervisor. We don't need to resolve this issue before the 4.5 release. I think we should roll this up with a general review of our governance in spring. When I wrote up the contributor training document (see http://www.slideshare.net/xen_com_mgr/xen-project-contributor-training-part-2-processes-and-conventions-v10), it became clear that there are a number of conventions that are not well documented. These don't necessarily need to be part of the governance, but the governance document should probably link to some of our conventions. Best Regards Lars ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH] tools/hotplug: xendomains.service depends on network
Starting domains during boot will most likely require network for the local bridge and it may need access to remote filesystems. Add ordering tags to systemd service file. Signed-off-by: Olaf Hering o...@aepfle.de Cc: Ian Jackson ian.jack...@eu.citrix.com Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com Cc: Ian Campbell ian.campb...@citrix.com Cc: Wei Liu wei.l...@citrix.com --- This should go into 4.5 to avoid failures if xendomains is scheduled before remote mounts are fully available. tools/hotplug/Linux/systemd/xendomains.service.in | 2 ++ 1 file changed, 2 insertions(+) diff --git a/tools/hotplug/Linux/systemd/xendomains.service.in b/tools/hotplug/Linux/systemd/xendomains.service.in index 9962671..66e2065 100644 --- a/tools/hotplug/Linux/systemd/xendomains.service.in +++ b/tools/hotplug/Linux/systemd/xendomains.service.in @@ -2,6 +2,8 @@ Description=Xendomains - start and stop guests on boot and shutdown Requires=proc-xen.mount xenstored.service After=proc-xen.mount xenstored.service xenconsoled.service xen-init-dom0.service +After=network-online.target +After=remote-fs.target ConditionPathExists=/proc/xen/capabilities [Service] ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] netback: don't store invalid vif pointer
On Tue, 2014-12-09 at 11:47 +, Jan Beulich wrote: When xenvif_alloc() fails, it returns a non-NULL error indicator. To avoid eventual races, we shouldn't store that into struct backend_info as readers of it only check for NULL. Signed-off-by: Jan Beulich jbeul...@suse.com Acked-by: Ian Campbell ian.campb...@citrix.com Thanks. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [COVERITY ACCESS] Request for access to Coverity
On Tue, 2014-12-02 at 15:12 +, Ian Campbell wrote: On Tue, 2014-12-02 at 15:10 +, George Dunlap wrote: On Thu, Nov 27, 2014 at 3:11 PM, Konrad Rzeszutek Wilk konrad.w...@oracle.com wrote: On Nov 27, 2014 6:59 AM, Tim Deegan t...@xen.org wrote: At 11:39 + on 27 Nov (1417084797), George Dunlap wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA512 I agree to the conditions in the XenProject Coverity contribution guidelines [1]. I'm a developer working for Citrix Systems UK, Ltd. I've been active in the Xen community since 2006; I was release coordinator for the 4.3 and 4.4 releases, and I am currently maintainer of the Xen scheduling subsystem. I would like access primarily to be able to write and speak intelligently about Xen and Coverity in blog postings and conference talks. I figure it would be easier to go through the stats and history myself, rather than trying to get some other developer to do it for me. (Although of course once I have access I'll probably end up doing some work looking at scans anyway.) +1 +1 too. OK, that's +4 and no minuses after 5 days. How long to I have to wait? :-) http://www.xenproject.org/help/contribution-guidelines.html says seven days... The time for voting has now passed. There were several +1s and no objections and so George has been granted access to the Coverity repository. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2] introduce grant copy for user land
-Original Message- From: Boris Ostrovsky [mailto:boris.ostrov...@oracle.com] Sent: 05 December 2014 6:06 PM To: Thanos Makatos; xen-de...@lists.xenproject.org Cc: David Vrabel Subject: Re: [Xen-devel] [PATCH v2] introduce grant copy for user land On 12/02/2014 11:13 AM, Thanos Makatos wrote: This patch introduces the interface to allow user-space applications execute grant-copy operations. This is done by sending an ioctl to the grant device. Signed-off-by: Thanos Makatos thanos.maka...@citrix.com --- drivers/xen/gntdev.c | 171 + include/uapi/xen/gntdev.h | 69 ++ 2 files changed, 240 insertions(+) diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c index 51f4c95..7b4a8e0 100644 --- a/drivers/xen/gntdev.c +++ b/drivers/xen/gntdev.c @@ -705,6 +705,174 @@ static long gntdev_ioctl_notify(struct gntdev_priv *priv, void __user *u) return rc; } +static int gntdev_gcopy_batch(int nr_segments, unsigned long gcopy_cb, + struct gntdev_grant_copy_segment __user *__segments, int dir, + int src, int dst) { + + static const int batch_size = PAGE_SIZE / (sizeof(struct page*) + + sizeof(struct gnttab_copy) + sizeof(struct gntdev_grant_copy_segment)); + struct page **pages = (struct page **)gcopy_cb; + struct gnttab_copy *batch = (struct gnttab_copy *)((unsigned long)pages + + sizeof(struct page*) * batch_size); + struct gntdev_grant_copy_segment *segments = + (struct gntdev_grant_copy_segment *)((unsigned long)batch + + sizeof(struct gnttab_copy) * batch_size); + unsigned int nr_pinned = 0, nr_segs2cp = 0; + int err = 0, i; + const int write = dir == GNTCOPY_IOCTL_g2s; + + nr_segments = min(nr_segments, batch_size); + + if (unlikely(copy_from_user(segments, __segments, + sizeof(struct gntdev_grant_copy_segment) * nr_segments))) { + pr_debug(failed to copy %d segments from user, nr_segments); + err = -EFAULT; + goto out; + } + + for (i = 0; i nr_segments; i++) { + + xen_pfn_t pgaddr; + unsigned long start, offset; + struct gntdev_grant_copy_segment *seg = segments[i]; + + if (dir == GNTCOPY_IOCTL_s2g || dir == GNTCOPY_IOCTL_g2s) { + + start = (unsigned long)seg-self.iov.iov_base PAGE_MASK; + offset = (unsigned long)seg-self.iov.iov_base ~PAGE_MASK; + if (unlikely(offset + seg-self.iov.iov_len PAGE_SIZE)) { + pr_warn(segments crossing page boundaries not yet + implemented\n); + err = -ENOSYS; + goto out; + } + + err = get_user_pages_fast(start, 1, write, pages[i]); + if (unlikely(err != 1)) { + pr_debug(failed to get user page %lu, start); + err = -EFAULT; + goto out; + } + + nr_pinned++; + + pgaddr = pfn_to_mfn(page_to_pfn(pages[i])); + } + + nr_segs2cp++; + + switch (dir) { + case GNTCOPY_IOCTL_g2s: /* copy from guest */ + batch[i].len = seg-self.iov.iov_len; + batch[i].source.u.ref = seg-self.ref; + batch[i].source.domid = src; + batch[i].source.offset = seg-self.offset; + batch[i].dest.u.gmfn = pgaddr; + batch[i].dest.domid = DOMID_SELF; + batch[i].dest.offset = offset; + batch[i].flags = GNTCOPY_source_gref; + break; + case GNTCOPY_IOCTL_s2g: /* copy to guest */ + batch[i].len = seg-self.iov.iov_len; + batch[i].source.u.gmfn = pgaddr; + batch[i].source.domid = DOMID_SELF; + batch[i].source.offset = offset; + batch[i].dest.u.ref = seg-self.ref; + batch[i].dest.domid = dst; + batch[i].dest.offset = seg-self.offset; + batch[i].flags = GNTCOPY_dest_gref; + break; + case GNTCOPY_IOCTL_g2g: /* copy guest to guest */ + batch[i].len = seg-g2g.len; + batch[i].source.u.ref = seg-g2g.src.ref; + batch[i].source.domid = src; + batch[i].source.offset = seg-g2g.src.offset; + batch[i].dest.u.ref = seg-g2g.dst.ref; + batch[i].dest.domid = dst; + batch[i].dest.offset =
[Xen-devel] [PATCH for 4.5] libxl: Tell qemu to use raw format when using a tapdisk
At the moment libxl unconditinally passes the underlying file format to qemu in the device string. However, when tapdisk is in use, tapdisk handles the underlying format and presents qemu with effectively a raw disk. When qemu looks at the tapdisk block device and doesn't find the image format it was looking for, it will fail. This effectively means that tapdisk cannot be used with HVM domains at the moment except for raw files. Instead, if we're using a tapdisk backend, tell qemu to use a raw file format. Signed-off-by: George Dunlap george.dun...@eu.citrix.com --- CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@citrix.com CC: Wei Liu wei.l...@citrix.com CC: Konrad Wilk konrad.w...@oracle.com Release exception justification: This fixes a bug in functionality, in that at the moment HVM guests cannot boot with tapdisk and vhd format. This is not a regression in xl functionality per se, since (AFAICT) this has never worked. However, given that 4.5 is the first release without xend, this *does* represent a regression in functionality for Xen as a whole (since before people using hvm guest with vhd on blktap could use xend). The fix is very simple and should only affect codepaths that already don't work, so the risk of regressions should be very low. While preparing this patch, I also noticed that cdroms will ignore the backend parameter and treat everything as a file. This is a bug but I think it's a much less important one to address this late in the release cycle. --- tools/libxl/libxl_dm.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c index b25b574..10f3090 100644 --- a/tools/libxl/libxl_dm.c +++ b/tools/libxl/libxl_dm.c @@ -797,11 +797,14 @@ static char ** libxl__build_device_model_args_new(libxl__gc *gc, continue; } -if (disks[i].backend == LIBXL_DISK_BACKEND_TAP) +if (disks[i].backend == LIBXL_DISK_BACKEND_TAP) { +format = qemu_disk_format_string(LIBXL_DISK_FORMAT_RAW); pdev_path = libxl__blktap_devpath(gc, disks[i].pdev_path, disks[i].format); -else +} else { pdev_path = disks[i].pdev_path; +} + /* * Explicit sd disks are passed through as is. -- 1.9.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH for-4.5 1/3] python/xc: Fix multiple issues in pyflask_context_to_sid()
On Fri, 2014-11-28 at 11:37 +, Ian Campbell wrote: On Thu, 2014-11-27 at 12:34 +, Andrew Cooper wrote: The error handling from a failed memory allocation should return PyErr_SetFromErrno(xc_error_obj); rather than simply calling it and continuing to the memcpy() below, with the dest pointer being NULL. Furthermore, the context string is simply an input parameter to the hypercall, and is not mutated anywhere along the way. The error handling elsewhere in the function can be simplified by not duplicating it to start with. Signed-off-by: Andrew Cooper andrew.coop...@citrix.com Coverity-IDs: 1055305 1055721 CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@eu.citrix.com CC: Wei Liu wei.l...@citrix.com CC: Xen Coverity Team cover...@xen.org Acked-by: Ian Campbell ian.campb...@citrix.com This would have been far more obviously correct for 4.5 if you had stuck to fixing the issue in the first paragraph. Konrad, given http://article.gmane.org/gmane.comp.emulators.xen.devel/224881 does this have a release ack? Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH for-4.5 1/3] python/xc: Fix multiple issues in pyflask_context_to_sid()
On 09/12/14 14:27, Ian Campbell wrote: On Fri, 2014-11-28 at 11:37 +, Ian Campbell wrote: On Thu, 2014-11-27 at 12:34 +, Andrew Cooper wrote: The error handling from a failed memory allocation should return PyErr_SetFromErrno(xc_error_obj); rather than simply calling it and continuing to the memcpy() below, with the dest pointer being NULL. Furthermore, the context string is simply an input parameter to the hypercall, and is not mutated anywhere along the way. The error handling elsewhere in the function can be simplified by not duplicating it to start with. Signed-off-by: Andrew Cooper andrew.coop...@citrix.com Coverity-IDs: 1055305 1055721 CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@eu.citrix.com CC: Wei Liu wei.l...@citrix.com CC: Xen Coverity Team cover...@xen.org Acked-by: Ian Campbell ian.campb...@citrix.com This would have been far more obviously correct for 4.5 if you had stuck to fixing the issue in the first paragraph. Konrad, given http://article.gmane.org/gmane.comp.emulators.xen.devel/224881 does this have a release ack? Ian. I can resubmit with a clearer description if that would help clarity, but the code is correct for the fixes (not fantastically well) described. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH for 4.5] libxl: Tell qemu to use raw format when using a tapdisk
On Tue, Dec 09, 2014 at 02:04:19PM +, George Dunlap wrote: At the moment libxl unconditinally passes the underlying file format to qemu in the device string. However, when tapdisk is in use, tapdisk handles the underlying format and presents qemu with effectively a raw disk. When qemu looks at the tapdisk block device and doesn't find the image format it was looking for, it will fail. This effectively means that tapdisk cannot be used with HVM domains at the moment except for raw files. Instead, if we're using a tapdisk backend, tell qemu to use a raw file format. Signed-off-by: George Dunlap george.dun...@eu.citrix.com --- CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@citrix.com CC: Wei Liu wei.l...@citrix.com CC: Konrad Wilk konrad.w...@oracle.com Acked-by: Wei Liu wei.l...@citrix.com Release exception justification: This fixes a bug in functionality, in that at the moment HVM guests cannot boot with tapdisk and vhd format. This is not a regression in xl functionality per se, since (AFAICT) this has never worked. However, given that 4.5 is the first release without xend, this *does* represent a regression in functionality for Xen as a whole (since before people using hvm guest with vhd on blktap could use xend). The fix is very simple and should only affect codepaths that already don't work, so the risk of regressions should be very low. While preparing this patch, I also noticed that cdroms will ignore the backend parameter and treat everything as a file. This is a bug but I think it's a much less important one to address this late in the release cycle. We should create a bug tracker entry for this. --- tools/libxl/libxl_dm.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c index b25b574..10f3090 100644 --- a/tools/libxl/libxl_dm.c +++ b/tools/libxl/libxl_dm.c @@ -797,11 +797,14 @@ static char ** libxl__build_device_model_args_new(libxl__gc *gc, continue; } -if (disks[i].backend == LIBXL_DISK_BACKEND_TAP) +if (disks[i].backend == LIBXL_DISK_BACKEND_TAP) { +format = qemu_disk_format_string(LIBXL_DISK_FORMAT_RAW); pdev_path = libxl__blktap_devpath(gc, disks[i].pdev_path, disks[i].format); -else +} else { pdev_path = disks[i].pdev_path; +} + Minor nit, extra blank line, but this alone doesn't warrant a resend. /* * Explicit sd disks are passed through as is. -- 1.9.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] error of VM Migration (xl toolstack) failed when migrating Webserver VM with 100 connecitons using httperf
On Mon, Dec 8, 2014 at 4:36 AM, Minalkumar Patel patel...@yahoo.co.in wrote: error of Migration failed when migrating Webserver VM with 100 connecitons using httperf Thanks for the bug report. Can you please include more information about your setup? At very least the output of xl info, your domain config file, and preferrably your serial console output as well; or the output of dmesg if you have no serial console. See here for more details: http://wiki.xenproject.org/wiki/Reporting_Bugs_against_Xen_Project Thanks, -George First time it generats (a part of follwoing output): migration target: Transfer complete, requesting permission to start domain. migration sender: Target has acknowledged transfer. migration sender: Giving target permission to start. migration target: Got permission, starting domain. [everything ok uptill now] libxl: error: libxl.c:321:libxl__domain_rename: domain with name drbdvm1 already exists. migration target: Failure, destroying our copy. migration sender: Target reports startup failure (status code 6). migration target: Cleanup OK, granting sender permission to resume. migration sender: Trying to resume at our end. libxl: debug: libxl.c:435:libxl_domain_resume: ao 0xa91bf0: create: how=(nil) callback=(nil) poller=0xa91c50 xc: error: Cannot resume uncooperative HVM guests: Internal error libxl: error: libxl.c:404:libxl__domain_resume: xc_domain_resume failed for domain 10: No such file or directory libxl: debug: libxl_event.c:1499:libxl__ao_complete: ao 0xa91bf0: complete, rc=-3 libxl: debug: libxl.c:438:libxl_domain_resume: ao 0xa91bf0: inprogress: poller=0xa91c50, flags=ic libxl: debug: libxl_event.c:1471:libxl__ao__destroy: ao 0xa91bf0: destroy Migration failed due to problems at target. Second time it generates (a part of follwing output): xc: detail: Start last iteration [everything ok uptill now] libxl: debug: libxl_dom.c:933:libxl__domain_suspend_common_callback: issuing PVHVM suspend request via XenBus control node libxl: debug: libxl_dom.c:937:libxl__domain_suspend_common_callback: wait for the guest to acknowledge suspend request libxl: error: libxl_dom.c:958:libxl__domain_suspend_common_callback: guest didn't acknowledge suspend, cancelling request libxl: error: libxl_dom.c:980:libxl__domain_suspend_common_callback: guest didn't acknowledge suspend, request cancelled xc: error: Suspend request failed: Internal error xc: error: Domain appears not to have suspended: Internal error libxl: debug: libxl_event.c:512:libxl__ev_xswatch_register: watch w=0x21c5f80 wpath=/local/domain/0/device-model/10/logdirty/ret token=3/1: register slotnum=3 libxl: debug: libxl_event.c:457:watchfd_callback: watch w=0x21c5f80 wpath=/local/domain/0/device-model/10/logdirty/ret token=3/1: event epath=/local/domain/0/device-model/10/logdirty/ret libxl: debug: libxl_event.c:457:watchfd_callback: watch w=0x21c5f80 wpath=/local/domain/0/device-model/10/logdirty/ret token=3/1: event epath=/local/domain/0/device-model/10/logdirty/ret libxl: debug: libxl_event.c:549:libxl__ev_xswatch_deregister: watch w=0x21c5f80 wpath=/local/domain/0/device-model/10/logdirty/ret token=3/1: deregister slotnum=3 libxl: debug: libxl_event.c:426:watchfd_callback: watch epath=/local/domain/0/device-model/10/logdirty/ret token=3/1: empty slot xc: detail: Save exit rc=1 libxl-save-helper: debug: complete r=1: No such device libxl: error: libxl_dom.c:1266:libxl__xc_domain_save_done: saving domain: domain did not respond to suspend request: No such device libxl: debug: libxl_event.c:1499:libxl__ao_complete: ao 0x21c5bf0: complete, rc=-8 libxl: debug: libxl_event.c:1471:libxl__ao__destroy: ao 0x21c5bf0: destroy migration sender: libxl_domain_suspend failed (rc=-8) xc: error: 0-length read: Internal error xc: error: read_exact_timed failed (read rc: 0, errno: 0): Internal error xc: error: Error when reading batch size (0 = Success): Internal error xc: error: Error when reading batch (0 = Success): Internal error libxl: error: libxl_create.c:820:libxl__xc_domain_restore_done: restoring domain: Resource temporarily unavailable libxl: error: libxl_create.c:902:domcreate_rebuild_done: cannot (re-)build domain: -3 libxl: error: libxl.c:1394:libxl__destroy_domid: non-existant domain 6 libxl: error: libxl.c:1358:domain_destroy_callback: unable to destroy guest with domid 6 libxl: error: libxl_create.c:1153:domcreate_destruction_cb: unable to destroy domain 6 following failed creation migration target: Domain creation failed (code -3). libxl: info: libxl_exec.c:118:libxl_report_child_exitstatus: migration target process [11104] exited with error status 3 Migration failed, failed to suspend at sender. What goes wrong please tell me?It also shows migration of Webserver VM at destination and it can be manually started and in worknig condition. But at sender, VM is still continueing webserver based httperf connections and doesn't turn
Re: [Xen-devel] [PATCH for 4.5] libxl: Tell qemu to use raw format when using a tapdisk
On 12/09/2014 02:32 PM, Wei Liu wrote: On Tue, Dec 09, 2014 at 02:04:19PM +, George Dunlap wrote: At the moment libxl unconditinally passes the underlying file format to qemu in the device string. However, when tapdisk is in use, tapdisk handles the underlying format and presents qemu with effectively a raw disk. When qemu looks at the tapdisk block device and doesn't find the image format it was looking for, it will fail. This effectively means that tapdisk cannot be used with HVM domains at the moment except for raw files. Instead, if we're using a tapdisk backend, tell qemu to use a raw file format. Signed-off-by: George Dunlap george.dun...@eu.citrix.com --- CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@citrix.com CC: Wei Liu wei.l...@citrix.com CC: Konrad Wilk konrad.w...@oracle.com Acked-by: Wei Liu wei.l...@citrix.com Release exception justification: This fixes a bug in functionality, in that at the moment HVM guests cannot boot with tapdisk and vhd format. This is not a regression in xl functionality per se, since (AFAICT) this has never worked. However, given that 4.5 is the first release without xend, this *does* represent a regression in functionality for Xen as a whole (since before people using hvm guest with vhd on blktap could use xend). The fix is very simple and should only affect codepaths that already don't work, so the risk of regressions should be very low. While preparing this patch, I also noticed that cdroms will ignore the backend parameter and treat everything as a file. This is a bug but I think it's a much less important one to address this late in the release cycle. We should create a bug tracker entry for this. --- tools/libxl/libxl_dm.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c index b25b574..10f3090 100644 --- a/tools/libxl/libxl_dm.c +++ b/tools/libxl/libxl_dm.c @@ -797,11 +797,14 @@ static char ** libxl__build_device_model_args_new(libxl__gc *gc, continue; } -if (disks[i].backend == LIBXL_DISK_BACKEND_TAP) +if (disks[i].backend == LIBXL_DISK_BACKEND_TAP) { +format = qemu_disk_format_string(LIBXL_DISK_FORMAT_RAW); pdev_path = libxl__blktap_devpath(gc, disks[i].pdev_path, disks[i].format); -else +} else { pdev_path = disks[i].pdev_path; +} + Minor nit, extra blank line, but this alone doesn't warrant a resend. Yes, I noticed this as soon as I sent it. :-/ -George ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2] xen/serial: setup UART idle mode for OMAP
On Mon, 2014-12-08 at 15:51 +0200, Oleksandr Dmytryshyn wrote: UART is not able to receive bytes when idle mode is not configured properly. When we use Xen with old Linux Kernel (for example 3.8) this kernel configures hwmods for all devices even if the device tree nodes for those devices is absent in device tree. Thus UART idle mode is configured too. The fake node for the UART should be added to the device tree because the MMIO range is not mapped by Xen. So UART works normally in this case. But new Linux Kernel (3.12 and upper) doesn't configure idle mode for UART and UART can not work normally in this case. I think the focus is too much on the hack done with 3.8 to make this work rather than on the fix being made here itself. The hack is only really of peripheral/historic interest. How about instead: The UART is not able to receive bytes when idle mode is not configured properly, therefore setup the UART with autoidle and wakeup enabled. You could stop here or if you really want to cover the old hack you could go on to say: Older Linux kernels (for example 3.8) configures hwmods for all devices even if the device tree nodes for those devices is absent in device tree, thus UART idle mode is configured too. With such kernels we can workaround the issue by adding a fake node in the UART containing this MMIO range, which is therefore mapped by Xen to dom0, which reconfigures the UART, causing things to work normally. Newer Linux Kernels (3.12 and beyond) do not configure idle mode for UART and so this hack no longer works. If you are happy with the proposed wording (and indicate whether you want both bits or just the first) then, subject to Konrad giving a release Ack I'd be happy with this for 4.5 and I'll change the commit log as I go. Konrad, this is a bug fix for a particular piece of hardware, it can't affect anything other than the OMAP ARM platform. Signed-off-by: Oleksandr Dmytryshyn oleksandr.dmytrys...@globallogic.com --- Changed since v1: * corrected commit message xen/drivers/char/omap-uart.c | 3 +++ xen/include/xen/8250-uart.h | 4 2 files changed, 7 insertions(+) diff --git a/xen/drivers/char/omap-uart.c b/xen/drivers/char/omap-uart.c index a798b8d..16d1454 100644 --- a/xen/drivers/char/omap-uart.c +++ b/xen/drivers/char/omap-uart.c @@ -195,6 +195,9 @@ static void __init omap_uart_init_preirq(struct serial_port *port) omap_write(uart, UART_MCR, UART_MCR_DTR|UART_MCR_RTS); omap_write(uart, UART_OMAP_MDR1, UART_OMAP_MDR1_16X_MODE); + +/* setup iddle mode */ +omap_write(uart, UART_SYSC, OMAP_UART_SYSC_DEF_CONF); } static void __init omap_uart_init_postirq(struct serial_port *port) diff --git a/xen/include/xen/8250-uart.h b/xen/include/xen/8250-uart.h index a682bae..304b9dd 100644 --- a/xen/include/xen/8250-uart.h +++ b/xen/include/xen/8250-uart.h @@ -32,6 +32,7 @@ #define UART_MCR 0x04/* Modem control*/ #define UART_LSR 0x05/* line status */ #define UART_MSR 0x06/* Modem status */ +#define UART_SYSC 0x15/* System configuration register */ #define UART_USR 0x1f/* Status register (DW) */ #define UART_DLL 0x00/* divisor latch (ls) (DLAB=1) */ #define UART_DLM 0x01/* divisor latch (ms) (DLAB=1) */ @@ -145,6 +146,9 @@ /* SCR register bitmasks */ #define OMAP_UART_SCR_RX_TRIG_GRANU1_MASK (1 7) +/* System configuration register */ +#define OMAP_UART_SYSC_DEF_CONF 0x0d /* autoidle mode, wakeup is enabled */ + #endif /* __XEN_8250_UART_H__ */ /* ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH for 4.5] libxl: Tell qemu to use raw format when using a tapdisk
On Tue, 2014-12-09 at 14:04 +, George Dunlap wrote: At the moment libxl unconditinally passes the underlying file format to qemu in the device string. However, when tapdisk is in use, tapdisk handles the underlying format and presents qemu with effectively a raw disk. When qemu looks at the tapdisk block device and doesn't find the image format it was looking for, it will fail. This effectively means that tapdisk cannot be used with HVM domains at the moment except for raw files. Instead, if we're using a tapdisk backend, tell qemu to use a raw file format. Signed-off-by: George Dunlap george.dun...@eu.citrix.com Acked-by: Ian Campbell ian.campb...@citrix.com --- CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@citrix.com CC: Wei Liu wei.l...@citrix.com CC: Konrad Wilk konrad.w...@oracle.com Release exception justification: I agree with your reasoning. This fixes a bug in functionality, in that at the moment HVM guests cannot boot with tapdisk and vhd format. This is not a regression in xl functionality per se, since (AFAICT) this has never worked. However, given that 4.5 is the first release without xend, this *does* represent a regression in functionality for Xen as a whole (since before people using hvm guest with vhd on blktap could use xend). The fix is very simple and should only affect codepaths that already don't work, so the risk of regressions should be very low. While preparing this patch, I also noticed that cdroms will ignore the backend parameter and treat everything as a file. This is a bug but I think it's a much less important one to address this late in the release cycle. --- tools/libxl/libxl_dm.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c index b25b574..10f3090 100644 --- a/tools/libxl/libxl_dm.c +++ b/tools/libxl/libxl_dm.c @@ -797,11 +797,14 @@ static char ** libxl__build_device_model_args_new(libxl__gc *gc, continue; } -if (disks[i].backend == LIBXL_DISK_BACKEND_TAP) +if (disks[i].backend == LIBXL_DISK_BACKEND_TAP) { +format = qemu_disk_format_string(LIBXL_DISK_FORMAT_RAW); pdev_path = libxl__blktap_devpath(gc, disks[i].pdev_path, disks[i].format); -else +} else { pdev_path = disks[i].pdev_path; +} + /* * Explicit sd disks are passed through as is. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [xen-unstable test] 32153: tolerable trouble: broken/fail/pass - PUSHED
flight 32153 xen-unstable real [real] http://www.chiark.greenend.org.uk/~xensrcts/logs/32153/ Failures :-/ but no regressions. Tests which are failing intermittently (not blocking): test-amd64-amd64-libvirt 3 host-install(3) broken pass in 32139 test-armhf-armhf-xl 4 xen-installfail in 32139 pass in 32153 test-amd64-i386-xl-win7-amd64 5 xen-boot fail in 32139 pass in 32153 Regressions which are regarded as allowable (not blocking): test-amd64-i386-pair17 guest-migrate/src_host/dst_host fail like 32093 Tests which did not succeed, but are not blocking: test-amd64-i386-libvirt 9 guest-start fail never pass test-amd64-amd64-xl-pcipt-intel 9 guest-start fail never pass test-armhf-armhf-libvirt 9 guest-start fail never pass test-armhf-armhf-xl 10 migrate-support-checkfail never pass test-amd64-i386-xl-qemut-winxpsp3 14 guest-stopfail never pass test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 14 guest-stop fail never pass test-amd64-i386-xl-qemuu-winxpsp3 14 guest-stopfail never pass test-amd64-i386-xl-qemut-winxpsp3-vcpus1 14 guest-stop fail never pass test-amd64-amd64-xl-qemut-win7-amd64 14 guest-stop fail never pass test-amd64-amd64-xl-win7-amd64 14 guest-stop fail never pass test-amd64-amd64-xl-qemuu-win7-amd64 14 guest-stop fail never pass test-amd64-amd64-xl-qemut-winxpsp3 14 guest-stop fail never pass test-amd64-i386-xl-qemut-win7-amd64 14 guest-stop fail never pass test-amd64-i386-xl-win7-amd64 14 guest-stop fail never pass test-amd64-i386-xl-winxpsp3-vcpus1 14 guest-stop fail never pass test-amd64-amd64-xl-winxpsp3 14 guest-stop fail never pass test-amd64-i386-xl-winxpsp3 14 guest-stop fail never pass test-amd64-i386-xl-qemuu-win7-amd64 14 guest-stop fail never pass test-amd64-amd64-xl-qemuu-winxpsp3 14 guest-stop fail never pass test-amd64-amd64-libvirt 9 guest-start fail in 32139 never pass version targeted for testing: xen 10e7747bca53820e313574432f231070153b baseline version: xen 3a80985b894f54eb3b2e143e4dea737cf139a517 People who touched revisions under test: Andrew Cooper andrew.coop...@citrix.com Boris Ostrovsky boris.ostrov...@oracle.com Chunyan Liu cy...@suse.com Daniel Kiper daniel.ki...@oracle.com Don Dugger donald.d.dug...@intel.com Euan Harris euan.har...@citrix.com Ian Campbell ian.campb...@citrix.com Ian Jackson ian.jack...@eu.citrix.com Jan Beulich jbeul...@suse.com Julien Grall julien.gr...@linaro.org Kevin Tian kevin.t...@intel.com M A Young m.a.yo...@durham.ac.uk Michael Young m.a.yo...@durham.ac.uk Olaf Hering o...@aepfle.de Razvan Cojocaru rcojoc...@bitdefender.com Tim Deegan t...@xen.org Vitaly Kuznetsov vkuzn...@redhat.com Wei Liu wei.l...@citrix.com jobs: build-amd64 pass build-armhf pass build-i386 pass build-amd64-libvirt pass build-armhf-libvirt pass build-i386-libvirt pass build-amd64-oldkern pass build-i386-oldkern pass build-amd64-pvopspass build-armhf-pvopspass build-i386-pvops pass build-amd64-rumpuserxen pass build-i386-rumpuserxen pass test-amd64-amd64-xl pass test-armhf-armhf-xl pass test-amd64-i386-xl pass test-amd64-i386-rhel6hvm-amd pass test-amd64-i386-qemut-rhel6hvm-amd pass test-amd64-i386-qemuu-rhel6hvm-amd pass test-amd64-amd64-xl-qemut-debianhvm-amd64pass test-amd64-i386-xl-qemut-debianhvm-amd64 pass test-amd64-amd64-xl-qemuu-debianhvm-amd64pass test-amd64-i386-xl-qemuu-debianhvm-amd64 pass test-amd64-i386-freebsd10-amd64 pass test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
Re: [Xen-devel] [PATCH] tools/xenstore: fix link error with libsystemd
On Fri, 2014-12-05 at 12:19 -0500, Konrad Rzeszutek Wilk wrote: On Fri, Dec 05, 2014 at 10:53:03AM +, Ian Campbell wrote: On Fri, 2014-12-05 at 11:49 +0100, Olaf Hering wrote: Linking fails with undefined reference to the used systemd functions. Move LDFLAGS after the object files to fix the failure. Signed-off-by: Olaf Hering o...@aepfle.de Cc: Ian Jackson ian.jack...@eu.citrix.com Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com Acked-by: Ian Campbell ian.campb...@citrix.com This should go into 4.5. Release-Acked-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com Applied, thanks. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH for-4.5] xen/arm: Correct the opcode for BUG_INSTR on arm32
On Fri, 2014-12-05 at 10:54 +, Ian Campbell wrote: Not sure, why I dropped the 0 when I implemented the patch... This is a bug fixed for Xen 4.5. This is only affected ARM32 where the BUG opcode was malformed. With the malformed opcode, the ASSERT/BUG_ON is skipped and the processor may execute another patch (because the compiler has optimized s/patch/path/ ? Will fix on commit. Oh, this isn't in the main body of the commit log. due the unreachable in both macro). The code modified is only executed when Xen is in bad state. Release-Acked-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com Acked-by: Ian Campbell ian.campb...@citrix.com Applied (with no changes). ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2] xen/serial: setup UART idle mode for OMAP
On Tue, Dec 9, 2014 at 4:47 PM, Ian Campbell ian.campb...@citrix.com wrote: On Mon, 2014-12-08 at 15:51 +0200, Oleksandr Dmytryshyn wrote: UART is not able to receive bytes when idle mode is not configured properly. When we use Xen with old Linux Kernel (for example 3.8) this kernel configures hwmods for all devices even if the device tree nodes for those devices is absent in device tree. Thus UART idle mode is configured too. The fake node for the UART should be added to the device tree because the MMIO range is not mapped by Xen. So UART works normally in this case. But new Linux Kernel (3.12 and upper) doesn't configure idle mode for UART and UART can not work normally in this case. I think the focus is too much on the hack done with 3.8 to make this work rather than on the fix being made here itself. The hack is only really of peripheral/historic interest. How about instead: The UART is not able to receive bytes when idle mode is not configured properly, therefore setup the UART with autoidle and wakeup enabled. You could stop here or if you really want to cover the old hack you could go on to say: Older Linux kernels (for example 3.8) configures hwmods for all devices even if the device tree nodes for those devices is absent in device tree, thus UART idle mode is configured too. With such kernels we can workaround the issue by adding a fake node in the UART containing this MMIO range, which is therefore mapped by Xen to dom0, which reconfigures the UART, causing things to work normally. Newer Linux Kernels (3.12 and beyond) do not configure idle mode for UART and so this hack no longer works. If you are happy with the proposed wording (and indicate whether you want both bits or just the first) then, subject to Konrad giving a release Ack I'd be happy with this for 4.5 and I'll change the commit log as I go. I'm fully happy with proposed wording (and want the both bits to be used) Konrad, this is a bug fix for a particular piece of hardware, it can't affect anything other than the OMAP ARM platform. Signed-off-by: Oleksandr Dmytryshyn oleksandr.dmytrys...@globallogic.com --- Changed since v1: * corrected commit message xen/drivers/char/omap-uart.c | 3 +++ xen/include/xen/8250-uart.h | 4 2 files changed, 7 insertions(+) diff --git a/xen/drivers/char/omap-uart.c b/xen/drivers/char/omap-uart.c index a798b8d..16d1454 100644 --- a/xen/drivers/char/omap-uart.c +++ b/xen/drivers/char/omap-uart.c @@ -195,6 +195,9 @@ static void __init omap_uart_init_preirq(struct serial_port *port) omap_write(uart, UART_MCR, UART_MCR_DTR|UART_MCR_RTS); omap_write(uart, UART_OMAP_MDR1, UART_OMAP_MDR1_16X_MODE); + +/* setup iddle mode */ +omap_write(uart, UART_SYSC, OMAP_UART_SYSC_DEF_CONF); } static void __init omap_uart_init_postirq(struct serial_port *port) diff --git a/xen/include/xen/8250-uart.h b/xen/include/xen/8250-uart.h index a682bae..304b9dd 100644 --- a/xen/include/xen/8250-uart.h +++ b/xen/include/xen/8250-uart.h @@ -32,6 +32,7 @@ #define UART_MCR 0x04/* Modem control*/ #define UART_LSR 0x05/* line status */ #define UART_MSR 0x06/* Modem status */ +#define UART_SYSC 0x15/* System configuration register */ #define UART_USR 0x1f/* Status register (DW) */ #define UART_DLL 0x00/* divisor latch (ls) (DLAB=1) */ #define UART_DLM 0x01/* divisor latch (ms) (DLAB=1) */ @@ -145,6 +146,9 @@ /* SCR register bitmasks */ #define OMAP_UART_SCR_RX_TRIG_GRANU1_MASK (1 7) +/* System configuration register */ +#define OMAP_UART_SYSC_DEF_CONF 0x0d /* autoidle mode, wakeup is enabled */ + #endif /* __XEN_8250_UART_H__ */ /* ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 1/4] dma: add dma_get_required_mask_from_max_pfn()
On Mon, Dec 08, 2014 at 10:36:14AM +, David Vrabel wrote: On 05/12/14 21:31, Greg Kroah-Hartman wrote: On Fri, Dec 05, 2014 at 02:08:00PM +, David Vrabel wrote: A generic dma_get_required_mask() is useful even for architectures (such as ia64) that define ARCH_HAS_GET_REQUIRED_MASK. Signed-off-by: David Vrabel david.vra...@citrix.com Reviewed-by: Stefano Stabellini stefano.stabell...@eu.citrix.com --- drivers/base/platform.c | 10 -- Is this why you sent this to me? The x86 maintainers should handle this patch set, not me for a tiny 8 lines in just one of the files, sorry. This series will be merged via the Xen tree, but this patch still needs your review or ack. How about waiting until after the merge window and resending it, asking for that, instead of making me guess :) greg k-h ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH for-xen-4.5 1/3] tools/hotplug: distclean target should remove files generated by configure
On Tue, Dec 09, 2014 at 02:35:37PM +, Ian Campbell wrote: On Tue, 2014-12-02 at 16:16 +0100, Daniel Kiper wrote: +distclean: + rm -f Linux/init.d/sysconfig.xencommons Linux/init.d/xencommons NetBSD/rc.d/xencommons Configure generates a boatload more things than this, see e.g. $ grep hotplug/ tools/configure.ac Perhaps the answer would be to recurse into tools/hotplug/* and refactor the existing XEN_SCRIPTS to have the generated stuff in XEN_SCRIPTS_GEN instead and XEN_SCRIPTS += $(XEN_SCRIPTS). The on distclean remove the XEN_SCRIPTS_GEN ones. It's not ideal, but at least it puts the distclean logic in the same place as the logic to install the files, if not their generation, which at least increases the chance of someone adding it to the right place. Daniel pointed to me that the two other patches that were committed solve the problem of seeing those auto-generated files sticking (and forcing one to use git checkout XYZ -f). So this small patch can be ditched in favour of the more encompassing design that you have sketched out. Thank you! Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] libxl: Fix building libxlu_cfg_y.y with bison 3.0
Ian Campbell writes (Re: [Xen-devel] [PATCH] libxl: Fix building libxlu_cfg_y.y with bison 3.0): There was a point in time where the prevailing version of bison (or maybe flex) in stable distro releases had a bug which meant these files could not be regenerated easily on common distros. I don't recall the details well enough to know if that time has now passed. Perhaps Ian J does. We use (essential) features found only in non-ancient versions of bison and flex. The last time it was proposed to remove these pregenerated files, there were complaints from people who were using six-year-old bison and flex. I think we should prioritise compatibility with new versions of bison and flex, and retain the pregenerated versions of people with incompatibly old version. I think for 4.5 we should: * Regenerate the flex files with current wheezy's flex. (I have reviewed the diff in the generated code. It produces trivial changes which add declarations of xlu__cfg_yyget_column and xlu__cfg_yyset_column, but no code body changes - see below.) * Take the patch from Ed and regenerate the bison files too. Konrad: can we have two freeze exceptions please ? Risk analysis for Ed's patch: Not taking the patch hurts systems with bison 2.7.1 or later: - Affected systems include Debian jessie, which will be released during the lifetime of 4.5. - Bison 2.7.1 was released in April 2013, a year and a half ago. So any system which is less than 18 months out of date suffers pain from not updating. Taking the patch hurts systems with old bison: - Bison 2.4.1 is known to work and was released in December 2008. So systems which suffer pain from updating are using at least 4-year-old bison. - I have not investigated whether there are even older bison versions which are still OK with updating. On the affected systems: - Attempts to build actually-from-source, or with modified bison input, will definitely fail. - Some proportion of other builds will fail anyway due to timestamp skew. Risk analysis for regenerating with current wheezy's flex: There doesn't seem to be any actual change in the generated code apart from a few new function declarations and changes to #line directives. So the risk of breakage is small. Furthermore: Not taking the patch now means that our flex file will be out of date and may be regenerated and updated accidentally (or need to be regenerated) at some point in the future. That means that (a) whatever risk of breakage we are taking might be discovered at an inconvenient time (b) additional build system trouble might result from an out of date file. There isn't AFAICT a necessary connection between these two but we normally regenerate both the bison and flex files together, if necessary, before making any changes to the input files. Thanks, Ian. diff --git a/tools/libxl/libxlu_cfg_l.c b/tools/libxl/libxlu_cfg_l.c index df352aa..450863a 100644 --- a/tools/libxl/libxlu_cfg_l.c +++ b/tools/libxl/libxlu_cfg_l.c @@ -610,6 +610,10 @@ int xlu__cfg_yyget_lineno (yyscan_t yyscanner ); void xlu__cfg_yyset_lineno (int line_number ,yyscan_t yyscanner ); +int xlu__cfg_yyget_column (yyscan_t yyscanner ); + +void xlu__cfg_yyset_column (int column_no ,yyscan_t yyscanner ); + YYSTYPE * xlu__cfg_yyget_lval (yyscan_t yyscanner ); void xlu__cfg_yyset_lval (YYSTYPE * yylval_param ,yyscan_t yyscanner ); @@ -762,7 +766,7 @@ YY_DECL #line 53 libxlu_cfg_l.l -#line 766 libxlu_cfg_l.c +#line 770 libxlu_cfg_l.c yylval = yylval_param; @@ -971,7 +975,7 @@ YY_RULE_SETUP #line 104 libxlu_cfg_l.l YY_FATAL_ERROR( flex scanner jammed ); YY_BREAK -#line 975 libxlu_cfg_l.c +#line 979 libxlu_cfg_l.c case YY_STATE_EOF(INITIAL): case YY_STATE_EOF(lexerr): yyterminate(); diff --git a/tools/libxl/libxlu_cfg_l.h b/tools/libxl/libxlu_cfg_l.h index 4078302..151064e 100644 --- a/tools/libxl/libxlu_cfg_l.h +++ b/tools/libxl/libxlu_cfg_l.h @@ -276,6 +276,10 @@ int xlu__cfg_yyget_lineno (yyscan_t yyscanner ); void xlu__cfg_yyset_lineno (int line_number ,yyscan_t yyscanner ); +int xlu__cfg_yyget_column (yyscan_t yyscanner ); + +void xlu__cfg_yyset_column (int column_no ,yyscan_t yyscanner ); + YYSTYPE * xlu__cfg_yyget_lval (yyscan_t yyscanner ); void xlu__cfg_yyset_lval (YYSTYPE * yylval_param ,yyscan_t yyscanner ); @@ -352,6 +356,6 @@ extern int xlu__cfg_yylex \ #line 104 libxlu_cfg_l.l -#line 356 libxlu_cfg_l.h +#line 360 libxlu_cfg_l.h #undef xlu__cfg_yyIN_HEADER #endif /* xlu__cfg_yyHEADER_H */ diff --git a/tools/libxl/libxlu_disk_l.c b/tools/libxl/libxlu_disk_l.c index 2c6e8e3..beea7f9 100644 --- a/tools/libxl/libxlu_disk_l.c +++ b/tools/libxl/libxlu_disk_l.c @@ -1011,6 +1011,10 @@ int xlu__disk_yyget_lineno (yyscan_t yyscanner ); void xlu__disk_yyset_lineno (int line_number ,yyscan_t yyscanner ); +int xlu__disk_yyget_column (yyscan_t yyscanner ); + +void xlu__disk_yyset_column (int column_no ,yyscan_t
[Xen-devel] [PATCH V2] libxl: Set path to console on domain startup.
The path to the pty of a Xen PV console is set only in virDomainOpenConsole. But this is done too late. A call to virDomainGetXMLDesc done before OpenConsole will not have the path to the pty, but a call after OpenConsole will. e.g. of the current issue. Starting a domain with 'console type=pty/' Then: virDomainGetXMLDesc(): devices console type='pty' target type='xen' port='0'/ /console /devices virDomainOpenConsole() virDomainGetXMLDesc(): devices console type='pty' tty='/dev/pts/30' source path='/dev/pts/30'/ target type='xen' port='0'/ /console /devices The patch intend to have the TTY path on the first call of GetXMLDesc. This is done by setting up the path at domain start up instead of in OpenConsole. https://bugzilla.redhat.com/show_bug.cgi?id=1170743 Signed-off-by: Anthony PERARD anthony.per...@citrix.com --- Change in V2: Adding bug report link. Reword the last part of the patch description. Cleanup the code. Use VIR_FREE before VIR_STRDUP. Remove the code from OpenConsole as it is now a duplicate. --- src/libxl/libxl_domain.c | 20 src/libxl/libxl_driver.c | 15 --- 2 files changed, 20 insertions(+), 15 deletions(-) diff --git a/src/libxl/libxl_domain.c b/src/libxl/libxl_domain.c index 9c62291..325de79 100644 --- a/src/libxl/libxl_domain.c +++ b/src/libxl/libxl_domain.c @@ -1290,6 +1290,26 @@ libxlDomainStart(libxlDriverPrivatePtr driver, virDomainObjPtr vm, if (libxlDomainSetVcpuAffinities(driver, vm) 0) goto cleanup_dom; +if (vm-def-nconsoles) { +virDomainChrDefPtr chr = vm-def-consoles[0]; +if (chr chr-source.type == VIR_DOMAIN_CHR_TYPE_PTY) { +libxl_console_type console_type; +char *console = NULL; + +console_type = +(chr-targetType == VIR_DOMAIN_CHR_CONSOLE_TARGET_TYPE_SERIAL ? + LIBXL_CONSOLE_TYPE_SERIAL : LIBXL_CONSOLE_TYPE_PV); +ret = libxl_console_get_tty(priv-ctx, vm-def-id, +chr-target.port, console_type, +console); +if (!ret) { +VIR_FREE(chr-source.data.file.path); +ignore_value(VIR_STRDUP(chr-source.data.file.path, console)); +} +VIR_FREE(console); +} +} + if (!start_paused) { libxl_domain_unpause(priv-ctx, domid); virDomainObjSetState(vm, VIR_DOMAIN_RUNNING, VIR_DOMAIN_RUNNING_BOOTED); diff --git a/src/libxl/libxl_driver.c b/src/libxl/libxl_driver.c index 53c87ce..e79afac 100644 --- a/src/libxl/libxl_driver.c +++ b/src/libxl/libxl_driver.c @@ -3957,10 +3957,8 @@ libxlDomainOpenConsole(virDomainPtr dom, { virDomainObjPtr vm = NULL; int ret = -1; -libxl_console_type console_type; virDomainChrDefPtr chr = NULL; libxlDomainObjPrivatePtr priv; -char *console = NULL; virCheckFlags(VIR_DOMAIN_CONSOLE_FORCE, -1); @@ -4002,18 +4000,6 @@ libxlDomainOpenConsole(virDomainPtr dom, goto cleanup; } -console_type = -(chr-targetType == VIR_DOMAIN_CHR_CONSOLE_TARGET_TYPE_SERIAL ? -LIBXL_CONSOLE_TYPE_SERIAL : LIBXL_CONSOLE_TYPE_PV); - -ret = libxl_console_get_tty(priv-ctx, vm-def-id, chr-target.port, -console_type, console); -if (ret) -goto cleanup; - -if (VIR_STRDUP(chr-source.data.file.path, console) 0) -goto cleanup; - /* handle mutually exclusive access to console devices */ ret = virChrdevOpen(priv-devs, chr-source, @@ -4027,7 +4013,6 @@ libxlDomainOpenConsole(virDomainPtr dom, } cleanup: -VIR_FREE(console); if (vm) virObjectUnlock(vm); return ret; -- Anthony PERARD ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH 3/6] libxl: events: Deregister, don't just modify, sigchld pipe fd
We want to have no fd events registered when we are idle. This implies that we must be able to deregister our interest in the sigchld self-pipe fd, not just modify to request no events. Signed-off-by: Ian Jackson ian.jack...@eu.citrix.com Acked-by: Ian Campbell ian.campb...@citrix.com Tested-by: Ian Campbell ian.campb...@citrix.com Release-Acked-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com --- tools/libxl/libxl_fork.c |9 + 1 file changed, 1 insertion(+), 8 deletions(-) diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c index fa15095..144208a 100644 --- a/tools/libxl/libxl_fork.c +++ b/tools/libxl/libxl_fork.c @@ -372,15 +372,8 @@ static void sigchld_user_remove(libxl_ctx *ctx) /* idempotent */ void libxl__sigchld_notneeded(libxl__gc *gc) /* non-reentrant, idempotent */ { -int rc; - sigchld_user_remove(CTX); - -if (libxl__ev_fd_isregistered(CTX-sigchld_selfpipe_efd)) { -rc = libxl__ev_fd_modify(gc, CTX-sigchld_selfpipe_efd, 0); -if (rc) -libxl__ev_fd_deregister(gc, CTX-sigchld_selfpipe_efd); -} +libxl__ev_fd_deregister(gc, CTX-sigchld_selfpipe_efd); } int libxl__sigchld_needed(libxl__gc *gc) /* non-reentrant, idempotent */ -- 1.7.10.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 5/6] libxl: events: Deregister evtchn fd when not needed [and 1 more messages]
Ian Campbell writes (Re: [PATCH 5/6] libxl: events: Deregister evtchn fd when not needed): There were other comments further down my original review which you haven't answered. I don't think they were (all) predicated on a particular answer to the first question (although some were). Sorry, I didn't see those buried in down the patch ... Ian Campbell writes (Re: [PATCH 5/6] libxl: events: Deregister evtchn fd when not needed): On Thu, 2014-11-27 at 18:27 +, Ian Jackson wrote: @@ -733,14 +733,10 @@ int libxl__ctx_evtchn_init(libxl__gc *gc) { goto out; } -fd = xc_evtchn_fd(xce); -assert(fd = 0); +CTX-evtchn_fd = xc_evtchn_fd(xce); +assert(CTX-evtchn_fd = 0); Given that you can always retrieve this no demand with xc_evtchn_fd(xce) and that it is cheap do you need to stash it in the ctx? Good point. @@ -758,6 +760,13 @@ int libxl__ev_evtchn_wait(libxl__gc *gc, libxl__ev_evtchn *evev) DBG(ev_evtchn=%p port=%d wait (was waiting=%d), evev, evev-port, evev-waiting); +rc = libxl__ctx_evtchn_init(gc); +if (rc) goto out; + +rc = libxl__ev_fd_register(gc, CTX-evtchn_efd, + evtchn_fd_callback, CTX-evtchn_fd, POLLIN); +if (rc) goto out; Do you not need to do this only if evtchns_waiting is currently empty or the efd is idle? In fact, I should check libxl__ev_fd_isregistered. That makes the fragment idempotent. I'm surprised this worked for you as it was... if (evev-waiting) return 0; If you hit this you leave the stuff above done. Which may or may not matter depending on the answer above. Given the change above, I don't think it matters, because if evev-waiting, all of the other stuff is definitely already set up anyway. It is clearest to put the new initialisation fragment next to the existing one. I will resend with the two changes above. The diff between the previous version of this patch and the forthcoming new one is below. Thanks for the careful review. Ian. diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c index 716f318..a36e6d9 100644 --- a/tools/libxl/libxl_event.c +++ b/tools/libxl/libxl_event.c @@ -721,7 +721,7 @@ static void evtchn_fd_callback(libxl__egc *egc, libxl__ev_fd *ev, int libxl__ctx_evtchn_init(libxl__gc *gc) { xc_evtchn *xce; -int rc; +int rc, fd; if (CTX-xce) return 0; @@ -733,10 +733,10 @@ int libxl__ctx_evtchn_init(libxl__gc *gc) { goto out; } -CTX-evtchn_fd = xc_evtchn_fd(xce); -assert(CTX-evtchn_fd = 0); +fd = xc_evtchn_fd(xce); +assert(fd = 0); -rc = libxl_fd_set_nonblock(CTX, CTX-evtchn_fd, 1); +rc = libxl_fd_set_nonblock(CTX, fd, 1); if (rc) goto out; CTX-xce = xce; @@ -763,9 +763,11 @@ int libxl__ev_evtchn_wait(libxl__gc *gc, libxl__ev_evtchn *evev) rc = libxl__ctx_evtchn_init(gc); if (rc) goto out; -rc = libxl__ev_fd_register(gc, CTX-evtchn_efd, - evtchn_fd_callback, CTX-evtchn_fd, POLLIN); -if (rc) goto out; +if (!libxl__ev_fd_isregistered(CTX-evtchn_efd)) { +rc = libxl__ev_fd_register(gc, CTX-evtchn_efd, evtchn_fd_callback, + xc_evtchn_fd(CTX-xce), POLLIN); +if (rc) goto out; +} if (evev-waiting) return 0; diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h index 2eeba1e..9695f18 100644 --- a/tools/libxl/libxl_internal.h +++ b/tools/libxl/libxl_internal.h @@ -359,7 +359,6 @@ struct libxl__ctx { xc_evtchn *xce; /* waiting must be done only with libxl__ev_evtchn* */ LIBXL_LIST_HEAD(, libxl__ev_evtchn) evtchns_waiting; -int evtchn_fd; libxl__ev_fd evtchn_efd; LIBXL_TAILQ_HEAD(libxl__evgen_domain_death_list, libxl_evgen_domain_death) ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 5/6] libxl: events: Deregister evtchn fd when not needed
On Tue, 2014-12-09 at 15:54 +, Ian Jackson wrote: We want to have no fd events registered when we are idle. In this patch, deal with the evtchn fd: * Defer setup of the evtchn handle to the first use. * Defer registration of the evtchn fd; register as needed on use. * When cancelling an evtchn wait, or when wait setup fails, check whether there are now no evtchn waits and if so deregister the fd. * On libxl teardown, the evtchn fd should therefore be unregistered. assert that this is the case. Signed-off-by: Ian Jackson ian.jack...@eu.citrix.com Release-Acked-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com Acked-by: Ian Campbell ian.campb...@citrix.com ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH 2/6] libxl: events: Deregister xenstore watch fd when not needed
We want to have no fd events registered when we are idle. In this patch, deal with the xenstore watch fd: * Track the total number of active watches. * When deregistering a watch, or when watch registration fails, check whether there are now no watches and if so deregister the fd. * On libxl teardown, the watch fd should therefore be unregistered. assert that this is the case. Signed-off-by: Ian Jackson ian.jack...@eu.citrix.com --- tools/libxl/libxl.c |2 +- tools/libxl/libxl_event.c| 11 +++ tools/libxl/libxl_internal.h |2 +- 3 files changed, 13 insertions(+), 2 deletions(-) diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c index 55ef535..a238621 100644 --- a/tools/libxl/libxl.c +++ b/tools/libxl/libxl.c @@ -164,7 +164,7 @@ int libxl_ctx_free(libxl_ctx *ctx) for (i = 0; i ctx-watch_nslots; i++) assert(!libxl__watch_slot_contents(gc, i)); -libxl__ev_fd_deregister(gc, ctx-watch_efd); +assert(!libxl__ev_fd_isregistered(ctx-watch_efd)); libxl__ev_fd_deregister(gc, ctx-evtchn_efd); libxl__ev_fd_deregister(gc, ctx-sigchld_selfpipe_efd); diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c index 22b1227..da0a20e 100644 --- a/tools/libxl/libxl_event.c +++ b/tools/libxl/libxl_event.c @@ -524,6 +524,13 @@ static char *watch_token(libxl__gc *gc, int slotnum, uint32_t counterval) return libxl__sprintf(gc, %d/%PRIx32, slotnum, counterval); } +static void watches_check_fd_deregister(libxl__gc *gc) +{ +assert(CTX-nwatches=0); +if (!CTX-nwatches) +libxl__ev_fd_deregister(gc, CTX-watch_efd); +} + int libxl__ev_xswatch_register(libxl__gc *gc, libxl__ev_xswatch *w, libxl__ev_xswatch_callback *func, const char *path /* copied */) @@ -579,6 +586,7 @@ int libxl__ev_xswatch_register(libxl__gc *gc, libxl__ev_xswatch *w, w-slotnum = slotnum; w-path = path_copy; w-callback = func; +CTX-nwatches++; libxl__set_watch_slot_contents(use, w); CTX_UNLOCK; @@ -590,6 +598,7 @@ int libxl__ev_xswatch_register(libxl__gc *gc, libxl__ev_xswatch *w, if (use) LIBXL_SLIST_INSERT_HEAD(CTX-watch_freeslots, use, empty); free(path_copy); +watches_check_fd_deregister(gc); CTX_UNLOCK; return rc; } @@ -614,6 +623,8 @@ void libxl__ev_xswatch_deregister(libxl__gc *gc, libxl__ev_xswatch *w) libxl__ev_watch_slot *slot = CTX-watch_slots[w-slotnum]; LIBXL_SLIST_INSERT_HEAD(CTX-watch_freeslots, slot, empty); w-slotnum = -1; +CTX-nwatches--; +watches_check_fd_deregister(gc); } else { LOG(DEBUG, watch w=%p: deregister unregistered, w); } diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h index a38f695..9695f18 100644 --- a/tools/libxl/libxl_internal.h +++ b/tools/libxl/libxl_internal.h @@ -352,7 +352,7 @@ struct libxl__ctx { LIBXL_TAILQ_HEAD(, libxl__ev_time) etimes; libxl__ev_watch_slot *watch_slots; -int watch_nslots; +int watch_nslots, nwatches; LIBXL_SLIST_HEAD(, libxl__ev_watch_slot) watch_freeslots; uint32_t watch_counter; /* helps disambiguate slot reuse */ libxl__ev_fd watch_efd; -- 1.7.10.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH 1/6] libxl: events: Assert that libxl_ctx_free is not called from a hook
No-one in their right mind would do this, and if they did everything would definitely collapse. Arrange that if this happens, we crash ASAP. Signed-off-by: Ian Jackson ian.jack...@eu.citrix.com Acked-by: Ian Campbell ian.campb...@citrix.com Tested-by: Ian Campbell ian.campb...@citrix.com Release-Acked-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com --- tools/libxl/libxl.c |2 ++ 1 file changed, 2 insertions(+) diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c index 74c00dc..55ef535 100644 --- a/tools/libxl/libxl.c +++ b/tools/libxl/libxl.c @@ -148,6 +148,8 @@ int libxl_ctx_free(libxl_ctx *ctx) { if (!ctx) return 0; +assert(!ctx-osevent_in_hook); + int i; GC_INIT(ctx); -- 1.7.10.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH for-4.5 v2 0/6] libxl: events: Tear down fd interests when idle
If libxl_event_register_callbacks is called with nonzero hooks while any part of libxl has any fd interests registered internally, then: * There is no way for libxl to notify the application that it wants to be told about these fds, because the spec in the doc comment says that the new hooks are not called for existing interests. * When libxl becomes no longer interested, it will try to find the nexus for the deregistration hook. But such a nexus is only set up with nonzero hooks, so libxl will dereference NULL. * Specifically, since 66bff9fd492f, libxl would unconditionally become interested in its event channel fd during setup. So if the application sets nontrivial hooks it will always crash in libxl_ctx_free. (This case reported as a bug by Ian Campbell.) To fix this, it would be nice to simply forbid `late' registration of event hooks. But this would be an incompatible API changel. And indeed libvirt already registers event hooks after using the ctx to create a domain (!) So instead we add the minimum workable restriction: hooks can (only) be registered when libxl is idle. To do this we need to: * Defer registration of fds until they are needed. * Deregister fds again as they become idle. There is no need to do anything about timeouts because an idle libxl already never has any timeouts. In this series I add defensive assertions. This is a good idea because violations of the rules typically produce crashes anyway, but distant from the cause. The changes in version 2 of the series are: * Fix bogus non-idempotent of evtchn_efd. (Bug in the patch.) * Do not put evtchn_fd in the ctx. (Style improvement.) This series should be included in Xen 4.5: The evtchn fd issue is new in 4.5 - that is, we have a regression since 4.4. It causes libvirt to segfault. But even in 4.4 there are potential bugs, with symptoms such as the libxl watch fd not being properly registered, so that operations are unreasonably delayed, or crashes on ctx teardown. So after these patches make it into 4.5 master the relevant subset should probably be backported. Version 1 of this series was: Release-Acked-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com which I have transferred into this version. Version 1 had: Tested-by: Ian Campbell ian.campb...@citrix.com but that does not apply any more from patch 5 onwards since patch 5 has changed. Ian C, do you still have the setup you used to test v1 ? Thanks, Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH 5/6] libxl: events: Deregister evtchn fd when not needed
We want to have no fd events registered when we are idle. In this patch, deal with the evtchn fd: * Defer setup of the evtchn handle to the first use. * Defer registration of the evtchn fd; register as needed on use. * When cancelling an evtchn wait, or when wait setup fails, check whether there are now no evtchn waits and if so deregister the fd. * On libxl teardown, the evtchn fd should therefore be unregistered. assert that this is the case. Signed-off-by: Ian Jackson ian.jack...@eu.citrix.com Release-Acked-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com --- v2: Do not bother putting evtchn_fd in the ctx; instead, get it from xc_evtchn_fd when we need it. (Cosmetic.) Do not register the evtchn fd multiple times: check it's not registered before we call libxl__ev_fd_register. (Bugfix.) --- tools/libxl/libxl.c |4 +--- tools/libxl/libxl_event.c | 21 + 2 files changed, 18 insertions(+), 7 deletions(-) diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c index 8f06043..50a8928 100644 --- a/tools/libxl/libxl.c +++ b/tools/libxl/libxl.c @@ -118,8 +118,6 @@ int libxl_ctx_alloc(libxl_ctx **pctx, int version, rc = ERROR_FAIL; goto out; } -rc = libxl__ctx_evtchn_init(gc); - *pctx = ctx; return 0; @@ -166,7 +164,7 @@ int libxl_ctx_free(libxl_ctx *ctx) for (i = 0; i ctx-watch_nslots; i++) assert(!libxl__watch_slot_contents(gc, i)); assert(!libxl__ev_fd_isregistered(ctx-watch_efd)); -libxl__ev_fd_deregister(gc, ctx-evtchn_efd); +assert(!libxl__ev_fd_isregistered(ctx-evtchn_efd)); assert(!libxl__ev_fd_isregistered(ctx-sigchld_selfpipe_efd)); /* Now there should be no more events requested from the application: */ diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c index da0a20e..a36e6d9 100644 --- a/tools/libxl/libxl_event.c +++ b/tools/libxl/libxl_event.c @@ -739,10 +739,6 @@ int libxl__ctx_evtchn_init(libxl__gc *gc) { rc = libxl_fd_set_nonblock(CTX, fd, 1); if (rc) goto out; -rc = libxl__ev_fd_register(gc, CTX-evtchn_efd, - evtchn_fd_callback, fd, POLLIN); -if (rc) goto out; - CTX-xce = xce; return 0; @@ -751,6 +747,12 @@ int libxl__ctx_evtchn_init(libxl__gc *gc) { return rc; } +static void evtchn_check_fd_deregister(libxl__gc *gc) +{ +if (CTX-xce LIBXL_LIST_EMPTY(CTX-evtchns_waiting)) +libxl__ev_fd_deregister(gc, CTX-evtchn_efd); +} + int libxl__ev_evtchn_wait(libxl__gc *gc, libxl__ev_evtchn *evev) { int r, rc; @@ -758,6 +760,15 @@ int libxl__ev_evtchn_wait(libxl__gc *gc, libxl__ev_evtchn *evev) DBG(ev_evtchn=%p port=%d wait (was waiting=%d), evev, evev-port, evev-waiting); +rc = libxl__ctx_evtchn_init(gc); +if (rc) goto out; + +if (!libxl__ev_fd_isregistered(CTX-evtchn_efd)) { +rc = libxl__ev_fd_register(gc, CTX-evtchn_efd, evtchn_fd_callback, + xc_evtchn_fd(CTX-xce), POLLIN); +if (rc) goto out; +} + if (evev-waiting) return 0; @@ -773,6 +784,7 @@ int libxl__ev_evtchn_wait(libxl__gc *gc, libxl__ev_evtchn *evev) return 0; out: +evtchn_check_fd_deregister(gc); return rc; } @@ -786,6 +798,7 @@ void libxl__ev_evtchn_cancel(libxl__gc *gc, libxl__ev_evtchn *evev) evev-waiting = 0; LIBXL_LIST_REMOVE(evev, entry); +evtchn_check_fd_deregister(gc); } /* -- 1.7.10.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH 4/6] libxl: events: Tear down SIGCHLD machinery on ctx destruction
We want to have no fd events registered when we are idle. Also, we should put back the default SIGCHLD handler. So: * In libxl_ctx_free, use libxl_childproc_setmode to set the mode to the default, which is libxl_sigchld_owner_libxl (ie `libxl owns SIGCHLD only when it has active children'). But of course there are no active children at libxl teardown so this results in libxl__sigchld_notneeded: the ctx loses its interest in SIGCHLD (unsetting the SIGCHLD handler if we were the last ctx) and deregisters the per-ctx selfpipe fd. * assert that this is the case: ie that we are no longer interested in the selfpipe. Signed-off-by: Ian Jackson ian.jack...@eu.citrix.com Acked-by: Ian Campbell ian.campb...@citrix.com Tested-by: Ian Campbell ian.campb...@citrix.com Release-Acked-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com --- tools/libxl/libxl.c |3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c index a238621..8f06043 100644 --- a/tools/libxl/libxl.c +++ b/tools/libxl/libxl.c @@ -162,11 +162,12 @@ int libxl_ctx_free(libxl_ctx *ctx) while ((eject = LIBXL_LIST_FIRST(CTX-disk_eject_evgens))) libxl__evdisable_disk_eject(gc, eject); +libxl_childproc_setmode(CTX,0,0); for (i = 0; i ctx-watch_nslots; i++) assert(!libxl__watch_slot_contents(gc, i)); assert(!libxl__ev_fd_isregistered(ctx-watch_efd)); libxl__ev_fd_deregister(gc, ctx-evtchn_efd); -libxl__ev_fd_deregister(gc, ctx-sigchld_selfpipe_efd); +assert(!libxl__ev_fd_isregistered(ctx-sigchld_selfpipe_efd)); /* Now there should be no more events requested from the application: */ -- 1.7.10.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 2/6] libxl: events: Deregister xenstore watch fd when not needed
On Tue, 2014-12-09 at 15:54 +, Ian Jackson wrote: We want to have no fd events registered when we are idle. In this patch, deal with the xenstore watch fd: * Track the total number of active watches. * When deregistering a watch, or when watch registration fails, check whether there are now no watches and if so deregister the fd. * On libxl teardown, the watch fd should therefore be unregistered. assert that this is the case. Signed-off-by: Ian Jackson ian.jack...@eu.citrix.com Acked-by: Ian Campbell ian.campb...@citrix.com ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 5/5] tools/hotplug: support XENSTORED_TRACE in systemd
Olaf Hering writes (Re: [PATCH 5/5] tools/hotplug: support XENSTORED_TRACE in systemd): On Fri, Dec 05, Ian Jackson wrote: I think the only way to make this work properly is to factor the necessary parts out of init.d/xencommons into a new script which can be used by both xencommons and systemd. I'm not sure such a patch would be appropriate for 4.5 at this stage. I came up with this, it appears to work in my testing. Will do more testing later today. Thanks. I think this is going in roughly the right direction. But: I think the script is rather over-engineered, and that it ought to be in /etc. + $(INSTALL_PROG) $(XENSTORED_LIBEXEC) $(DESTDIR)$(LIBEXEC_BIN) Sysadmins might want to edit the script to do something we haven't thought of. So it should be in /etc where they can do so. diff --git a/tools/hotplug/Linux/xenstored.sh.in b/tools/hotplug/Linux/xenstored.sh.in new file mode 100644 index 000..11caf25 --- /dev/null +++ b/tools/hotplug/Linux/xenstored.sh.in ... +case $1 in +--exec) +do_exec=exec +;; +--opt) +opts=(${opts[@]} $2) +shift +;; +esac +shift +done I don't think this script wants to contain an option parser! +. @XEN_SCRIPT_DIR@/hotplugpath.sh ... +test -f $xencommons_config/xencommons . $xencommons_config/xencommons ... +# Wait for xenstored to actually come up, timing out after 30 seconds +while [ $time -lt $timeout ] ! `${BINDIR}/xenstore-read -s / /dev/null 21` ; do I would have expected this new script to contain only the functionality which was previously both in (a) the systemd unit and (b) the traditional init script, and which you are removing from both those places. So I think you should be able to say no ultimate functional change in your commit message. The systemd unit doesn't currently contain anything messing about with xenstore-read to detect when xenstored is working. Is that a bug that was previously in the systemd unit, or is it a mistake in your patch that this is added here ? Thanks, Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [Xen-users] 4.5 git: regression in xen systemd shutdown hangs the OS
Olaf Hering writes (Re: [Xen-devel] [Xen-users] 4.5 git: regression in xen systemd shutdown hangs the OS): Perhaps a crashed or otherwise unavailable xenstored isnt such a problem because its like that since a very long time. Without xenstored, the system is basically hosed. And it's not really restartable. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 2/6] libxl: events: Deregister xenstore watch fd when not needed
Ian Campbell writes (Re: [PATCH 2/6] libxl: events: Deregister xenstore watch fd when not needed): On Tue, 2014-12-09 at 15:54 +, Ian Jackson wrote: We want to have no fd events registered when we are idle. In this patch, deal with the xenstore watch fd: ... Signed-off-by: Ian Jackson ian.jack...@eu.citrix.com Acked-by: Ian Campbell ian.campb...@citrix.com Thanks. In fact you had acked this already but somehow I have dropped that. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH for-4.5 1/3] python/xc: Fix multiple issues in pyflask_context_to_sid()
On Tue, Dec 09, 2014 at 02:30:24PM +, Andrew Cooper wrote: On 09/12/14 14:27, Ian Campbell wrote: On Fri, 2014-11-28 at 11:37 +, Ian Campbell wrote: On Thu, 2014-11-27 at 12:34 +, Andrew Cooper wrote: The error handling from a failed memory allocation should return PyErr_SetFromErrno(xc_error_obj); rather than simply calling it and continuing to the memcpy() below, with the dest pointer being NULL. Furthermore, the context string is simply an input parameter to the hypercall, and is not mutated anywhere along the way. The error handling elsewhere in the function can be simplified by not duplicating it to start with. Signed-off-by: Andrew Cooper andrew.coop...@citrix.com Coverity-IDs: 1055305 1055721 CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@eu.citrix.com CC: Wei Liu wei.l...@citrix.com CC: Xen Coverity Team cover...@xen.org Acked-by: Ian Campbell ian.campb...@citrix.com This would have been far more obviously correct for 4.5 if you had stuck to fixing the issue in the first paragraph. Konrad, given http://article.gmane.org/gmane.comp.emulators.xen.devel/224881 does this have a release ack? Ian. I can resubmit with a clearer description if that would help clarity, but the code is correct for the fixes (not fantastically well) described. Please do - that is all I was waiting for. Thank you. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2] xen/serial: setup UART idle mode for OMAP
On Tue, Dec 09, 2014 at 05:25:08PM +0200, Oleksandr Dmytryshyn wrote: On Tue, Dec 9, 2014 at 4:47 PM, Ian Campbell ian.campb...@citrix.com wrote: On Mon, 2014-12-08 at 15:51 +0200, Oleksandr Dmytryshyn wrote: UART is not able to receive bytes when idle mode is not configured properly. When we use Xen with old Linux Kernel (for example 3.8) this kernel configures hwmods for all devices even if the device tree nodes for those devices is absent in device tree. Thus UART idle mode is configured too. The fake node for the UART should be added to the device tree because the MMIO range is not mapped by Xen. So UART works normally in this case. But new Linux Kernel (3.12 and upper) doesn't configure idle mode for UART and UART can not work normally in this case. I think the focus is too much on the hack done with 3.8 to make this work rather than on the fix being made here itself. The hack is only really of peripheral/historic interest. How about instead: The UART is not able to receive bytes when idle mode is not configured properly, therefore setup the UART with autoidle and wakeup enabled. You could stop here or if you really want to cover the old hack you could go on to say: Older Linux kernels (for example 3.8) configures hwmods for all devices even if the device tree nodes for those devices is absent in device tree, thus UART idle mode is configured too. With such kernels we can workaround the issue by adding a fake node in the UART containing this MMIO range, which is therefore mapped by Xen to dom0, which reconfigures the UART, causing things to work normally. Newer Linux Kernels (3.12 and beyond) do not configure idle mode for UART and so this hack no longer works. If you are happy with the proposed wording (and indicate whether you want both bits or just the first) then, subject to Konrad giving a release Ack I'd be happy with this for 4.5 and I'll change the commit log as I go. I'm fully happy with proposed wording (and want the both bits to be used) Konrad, this is a bug fix for a particular piece of hardware, it can't affect anything other than the OMAP ARM platform. OK, RElease-Acked-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com Signed-off-by: Oleksandr Dmytryshyn oleksandr.dmytrys...@globallogic.com --- Changed since v1: * corrected commit message xen/drivers/char/omap-uart.c | 3 +++ xen/include/xen/8250-uart.h | 4 2 files changed, 7 insertions(+) diff --git a/xen/drivers/char/omap-uart.c b/xen/drivers/char/omap-uart.c index a798b8d..16d1454 100644 --- a/xen/drivers/char/omap-uart.c +++ b/xen/drivers/char/omap-uart.c @@ -195,6 +195,9 @@ static void __init omap_uart_init_preirq(struct serial_port *port) omap_write(uart, UART_MCR, UART_MCR_DTR|UART_MCR_RTS); omap_write(uart, UART_OMAP_MDR1, UART_OMAP_MDR1_16X_MODE); + +/* setup iddle mode */ +omap_write(uart, UART_SYSC, OMAP_UART_SYSC_DEF_CONF); } static void __init omap_uart_init_postirq(struct serial_port *port) diff --git a/xen/include/xen/8250-uart.h b/xen/include/xen/8250-uart.h index a682bae..304b9dd 100644 --- a/xen/include/xen/8250-uart.h +++ b/xen/include/xen/8250-uart.h @@ -32,6 +32,7 @@ #define UART_MCR 0x04/* Modem control*/ #define UART_LSR 0x05/* line status */ #define UART_MSR 0x06/* Modem status */ +#define UART_SYSC 0x15/* System configuration register */ #define UART_USR 0x1f/* Status register (DW) */ #define UART_DLL 0x00/* divisor latch (ls) (DLAB=1) */ #define UART_DLM 0x01/* divisor latch (ms) (DLAB=1) */ @@ -145,6 +146,9 @@ /* SCR register bitmasks */ #define OMAP_UART_SCR_RX_TRIG_GRANU1_MASK (1 7) +/* System configuration register */ +#define OMAP_UART_SYSC_DEF_CONF 0x0d /* autoidle mode, wakeup is enabled */ + #endif /* __XEN_8250_UART_H__ */ /* ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 5/5] tools/hotplug: support XENSTORED_TRACE in systemd
On Tue, Dec 09, Ian Jackson wrote: Olaf Hering writes (Re: [PATCH 5/5] tools/hotplug: support XENSTORED_TRACE in systemd): On Fri, Dec 05, Ian Jackson wrote: I think the only way to make this work properly is to factor the necessary parts out of init.d/xencommons into a new script which can be used by both xencommons and systemd. I'm not sure such a patch would be appropriate for 4.5 at this stage. I came up with this, it appears to work in my testing. Will do more testing later today. Thanks. I think this is going in roughly the right direction. But: I think the script is rather over-engineered, and that it ought to be in /etc. Why should the wrapper be in /etc?! xendomains isnt in /etc either. + $(INSTALL_PROG) $(XENSTORED_LIBEXEC) $(DESTDIR)$(LIBEXEC_BIN) Sysadmins might want to edit the script to do something we haven't thought of. So it should be in /etc where they can do so. So they should mail here if they find substantial functionality missing. Until then they continue to not enable xencommons and run their own startup script. I don't think this script wants to contain an option parser! How should it handle exec vs. no-exec? Just a single yes/no knob, so essentially sysv vs systemd? The systemd unit doesn't currently contain anything messing about with xenstore-read to detect when xenstored is working. Is that a bug that was previously in the systemd unit, or is it a mistake in your patch that this is added here ? No idea how long it takes to have a functional xenstored after running it. Perhaps the forking has some overhead and it returns earlier than it can process requests. If thats the reason why the loop exists in the sysv runlevel script then that loop should be used only without --no-fork. Olaf ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [v8][PATCH 02/17] introduce XEN_DOMCTL_set_rdm
On Tue, Dec 09, 2014 at 08:33:56AM +, Jan Beulich wrote: On 09.12.14 at 02:06, tiejun.c...@intel.com wrote: Also how does this work with 32-bit dom0s? Is there a need to use the compat layer? Are you saying in xsm case? Others? Actually this new DOMCTL is similar with XEN_DOMCTL_assign_device in some senses but I don't see such an issue you're pointing. I was thinking about the compat layer and making sure it works properly. Do we really need this consideration? I mean I referred to that existing XEN_DOMCTL_assign_device to implement this new DOMCTL, but looks there's nothing related to this point. Or could you make your thought clear to me with an exiting example? Then I can take a look at what exactly should be taken in my new DOMCTL since I'm a fresh man to work out this properly inside xen. I think Konrad got a little confused here - domctl-s intentionally are structured so that they don't need a compat translation layer. Ah! Thank you for that reminder! Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 12/19] hvmloader: retrieve vNUMA information from hypervisor
On 01.12.14 at 16:33, wei.l...@citrix.com wrote: @@ -261,6 +262,8 @@ int main(void) init_hypercalls(); +init_vnuma_info(); This is very early, and I don't think it needs to be - I guess it could be done right before doing ACPI stuff? --- /dev/null +++ b/tools/firmware/hvmloader/vnuma.c @@ -0,0 +1,84 @@ +/* + * vnuma.c: obtain vNUMA information from hypervisor + * + * Copyright (c) 2014 Wei Liu, Citrix Systems (RD) Ltd. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + *notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + *notice, this list of conditions and the following disclaimer in the + *documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include util.h +#include hypercall.h +#include vnuma.h +#include errno.h This is the system header, not the Xen one. See the discussion at http://lists.xenproject.org/archives/html/xen-devel/2014-10/msg03206.html and perhaps build on the resulting patch http://lists.xenproject.org/archives/html/xen-devel/2014-12/msg00013.html. + +unsigned int nr_vnodes, nr_vmemranges; +unsigned int *vcpu_to_vnode, *vdistance; +xen_vmemrange_t *vmemrange; + +void init_vnuma_info(void) +{ +int rc, retry = 0; +struct xen_vnuma_topology_info vnuma_topo = { +.domid = DOMID_SELF, +}; + +vcpu_to_vnode = scratch_alloc(sizeof(uint32_t) * hvm_info-nr_vcpus, 0); +vdistance = scratch_alloc(sizeof(uint32_t) * MAX_VDISTANCE, 0); +vmemrange = scratch_alloc(sizeof(xen_vmemrange_t) * MAX_VMEMRANGES, 0); + +vnuma_topo.nr_vnodes = MAX_VNODES; +vnuma_topo.nr_vcpus = hvm_info-nr_vcpus; +vnuma_topo.nr_vmemranges = MAX_VMEMRANGES; + +set_xen_guest_handle(vnuma_topo.vdistance.h, vdistance); +set_xen_guest_handle(vnuma_topo.vcpu_to_vnode.h, vcpu_to_vnode); +set_xen_guest_handle(vnuma_topo.vmemrange.h, vmemrange); + +rc = hypercall_memory_op(XENMEM_get_vnumainfo, vnuma_topo); +while ( rc == -EAGAIN retry 10 ) +{ +rc = hypercall_memory_op(XENMEM_get_vnumainfo, vnuma_topo); +retry++; +} +if ( rc 0 ) +{ +printf(Failed to retrieve vNUMA information, rc = %d\n, rc); +goto out; return; +} + +nr_vnodes = vnuma_topo.nr_vnodes; +nr_vmemranges = vnuma_topo.nr_vmemranges; + +out: +return; Drop these two (or really three) unnecessary lines please. --- /dev/null +++ b/tools/firmware/hvmloader/vnuma.h @@ -0,0 +1,56 @@ +/** + * vnuma.h + * + * Copyright (c) 2014, Wei Liu + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License version 2 + * as published by the Free Software Foundation; or, when distributed + * separately from the Linux kernel or incorporated into other + * software packages, subject to the following license: + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this source file (the Software), to deal in the Software without + * restriction, including without limitation the rights to use, copy, modify, + * merge, publish, distribute, sublicense, and/or sell copies of the Software, + * and to permit persons to whom the Software is furnished to do so, subject to + * the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
Re: [Xen-devel] [PATCH v2 13/19] hvmloader: construct SRAT
On 01.12.14 at 16:33, wei.l...@citrix.com wrote: @@ -203,6 +204,65 @@ static struct acpi_20_waet *construct_waet(void) return waet; } +static struct acpi_20_srat *construct_srat(void) +{ +struct acpi_20_srat *srat; +struct acpi_20_srat_processor *processor; +struct acpi_20_srat_memory *memory; +unsigned int size; +void *p; +int i; +uint64_t mem; + +size = sizeof(*srat) + sizeof(*processor) * hvm_info-nr_vcpus + +sizeof(*memory) * nr_vmemranges; + +p = mem_alloc(size, 16); +if (!p) return NULL; Coding style (despite you likely copied it from elsewhere). + +srat = p; +memset(srat, 0, sizeof(*srat)); +srat-header.signature= ACPI_2_0_SRAT_SIGNATURE; +srat-header.revision = ACPI_2_0_SRAT_REVISION; +fixed_strcpy(srat-header.oem_id, ACPI_OEM_ID); +fixed_strcpy(srat-header.oem_table_id, ACPI_OEM_TABLE_ID); +srat-header.oem_revision = ACPI_OEM_REVISION; +srat-header.creator_id = ACPI_CREATOR_ID; +srat-header.creator_revision = ACPI_CREATOR_REVISION; +srat-table_revision = ACPI_SRAT_TABLE_REVISION; + +processor = (struct acpi_20_srat_processor *)(srat + 1); +for ( i = 0; i hvm_info-nr_vcpus; i++ ) +{ +memset(processor, 0, sizeof(*processor)); +processor-type = ACPI_PROCESSOR_AFFINITY; +processor-length = sizeof(*processor); +processor-domain = vcpu_to_vnode[i]; +processor-apic_id = LAPIC_ID(i); +processor-flags= ACPI_LOCAL_APIC_AFFIN_ENABLED; +processor++; +} + +memory = (struct acpi_20_srat_memory *)processor; +for ( i = 0; i nr_vmemranges; i++ ) +{ +mem = vmemrange[i].end - vmemrange[i].start; Do you really need this helper variable? +memset(memory, 0, sizeof(*memory)); +memory-type = ACPI_MEMORY_AFFINITY; +memory-length= sizeof(*memory); +memory-domain= vmemrange[i].nid; +memory-flags = ACPI_MEM_AFFIN_ENABLED; +memory-base_address = vmemrange[i].start; +memory-mem_length= mem; +memory++; +} + +srat-header.length = size; Mind checking size == memory - p here? @@ -346,6 +407,16 @@ static int construct_secondary_tables(unsigned long *table_ptrs, } } +/* SRAT */ +if ( nr_vnodes 0 ) +{ +srat = construct_srat(); +if (srat) Coding style again. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 14/19] hvmloader: construct SLIT
On 01.12.14 at 16:33, wei.l...@citrix.com wrote: --- a/tools/firmware/hvmloader/acpi/build.c +++ b/tools/firmware/hvmloader/acpi/build.c @@ -263,6 +263,38 @@ static struct acpi_20_srat *construct_srat(void) return srat; } +static struct acpi_20_slit *construct_slit(void) +{ +struct acpi_20_slit *slit; +unsigned int num, size; +int i; unsigned int please. Plus similar coding style issues further down as in the previous patch. @@ -415,6 +448,11 @@ static int construct_secondary_tables(unsigned long *table_ptrs, table_ptrs[nr_tables++] = (unsigned long)srat; else printf(Failed to build SRAT, skipping...\n); +slit = construct_slit(); +if (srat) DYM slit? Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 5/5] tools/hotplug: support XENSTORED_TRACE in systemd
Olaf Hering writes (Re: [PATCH 5/5] tools/hotplug: support XENSTORED_TRACE in systemd): On Tue, Dec 09, Ian Jackson wrote: But: I think the script is rather over-engineered, and that it ought to be in /etc. Why should the wrapper be in /etc?! Because it is the last chance the admin has to adjust how xenstored is run, which they ought to be able to do. xendomains isnt in /etc either. xendomains is rather different. + $(INSTALL_PROG) $(XENSTORED_LIBEXEC) $(DESTDIR)$(LIBEXEC_BIN) Sysadmins might want to edit the script to do something we haven't thought of. So it should be in /etc where they can do so. So they should mail here if they find substantial functionality missing. That is no good because it doesn't allow them to make the change immediately on their own system, in a way that won't be overwritten by their system's package manager. Until then they continue to not enable xencommons and run their own startup script. That is not a practical suggestion. Bottom line: as relevant maintainer, I'm afraid I'm going to insist that this script be in /etc. I don't think this script wants to contain an option parser! How should it handle exec vs. no-exec? Just a single yes/no knob, so essentially sysv vs systemd? I was imagining a named parameter as SuS calls them. One or both the sites which run this wrapper script would pass an environment variable. Something like this in the script: $xenstored_do_exec $XENSTORED $@ $XENSTORED_TRACE_ARGS $XENSTORED_ARGS where the systemd unit simply sets `xenstored_do_exec=exec'. The systemd unit doesn't currently contain anything messing about with xenstore-read to detect when xenstored is working. Is that a bug that was previously in the systemd unit, or is it a mistake in your patch that this is added here ? No idea how long it takes to have a functional xenstored after running it. Perhaps the forking has some overhead and it returns earlier than it can process requests. If thats the reason why the loop exists in the sysv runlevel script then that loop should be used only without --no-fork. I think it is up to you as the person proposing a change to explain what the situation is and why your change is justified. It's not really satisfactory for a maintainer or reviewer to ask questions of a submitter and get simply `I'm not sure ... perhaps ...' without some kind of acknowledgement that more information (and perhaps research) is needed before we can establish how the code should look. In this case that means I think you should find out whether the lack of this xenstore-read polling loop is actually a bug in the existing systemd unit. If (as I suspect) this is not a bug, then your code motion should not change this aspect of the operation under systemd. Thanks, Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2] VMX: don't allow PVH to reach handle_mmio()
El 08/12/14 a les 10.12, Jan Beulich ha escrit: PVH guests accessing I/O ports via string ops is not supported yet. Reported-by: Roger Pau Monnéroger@citrix.com Signed-off-by: Jan Beulich jbeul...@suse.com This looks fine to me (at least it doesn't break the existing IN/OUT users), and seems the best solution 4.5 wise: Acked-by: Roger Pau Monné roger@citrix.com For 4.6 I think we need to start using a different hvm_io_bitmap for PVH Dom0 that allows direct access to the IO ports, bypassing the vmexit and simplifying the code in Xen (this would also fix INS/OUTS). Still not sure what should be done for PVH DomUs, specially when PVH gains support for pci-passthrough. Roger. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2] VMX: don't allow PVH to reach handle_mmio()
On 09.12.14 at 18:01, roger@citrix.com wrote: For 4.6 I think we need to start using a different hvm_io_bitmap for PVH Dom0 that allows direct access to the IO ports, bypassing the vmexit and simplifying the code in Xen (this would also fix INS/OUTS). Still not sure what should be done for PVH DomUs, specially when PVH gains support for pci-passthrough. With the difficulty being that for PV the hypervisor intentionally intercepts accesses to some of the ports, so we can't blindly allow PVH access to all the ports its allowed access to. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH v2] console: allocate ring buffer earlier
... when conring_size= was specified on the command line. We can't really do this as early as we would want to when the option was not specified, as the default depends on knowing the system CPU count. Yet the parsing of the ACPI tables is one of the things that generates a lot of output especially on large systems. Signed-off-by: Jan Beulich jbeul...@suse.com --- v2: Also adjust ARM (as requested by Julien). Rename new function to console_init_ring(). --- a/xen/arch/arm/setup.c +++ b/xen/arch/arm/setup.c @@ -745,6 +745,7 @@ void __init start_xen(unsigned long boot dt_uart_init(); console_init_preirq(); +console_init_ring(); system_state = SYS_STATE_boot; --- a/xen/arch/x86/setup.c +++ b/xen/arch/x86/setup.c @@ -1187,6 +1187,7 @@ void __init noreturn __start_xen(unsigne } vm_init(); +console_init_ring(); vesa_init(); softirq_init(); --- a/xen/drivers/char/console.c +++ b/xen/drivers/char/console.c @@ -744,15 +744,14 @@ void __init console_init_preirq(void) } } -void __init console_init_postirq(void) +void __init console_init_ring(void) { char *ring; unsigned int i, order, memflags; - -serial_init_postirq(); +unsigned long flags; if ( !opt_conring_size ) -opt_conring_size = num_present_cpus() (9 + xenlog_lower_thresh); +return; order = get_order_from_bytes(max(opt_conring_size, conring_size)); memflags = MEMF_bits(crashinfo_maxaddr_bits); @@ -763,17 +762,30 @@ void __init console_init_postirq(void) } opt_conring_size = PAGE_SIZE order; -spin_lock_irq(console_lock); +spin_lock_irqsave(console_lock, flags); for ( i = conringc ; i != conringp; i++ ) ring[i (opt_conring_size - 1)] = conring[i (conring_size - 1)]; conring = ring; smp_wmb(); /* Allow users of console_force_unlock() to see larger buffer. */ conring_size = opt_conring_size; -spin_unlock_irq(console_lock); +spin_unlock_irqrestore(console_lock, flags); printk(Allocated console ring of %u KiB.\n, opt_conring_size 10); } +void __init console_init_postirq(void) +{ +serial_init_postirq(); + +if ( conring != _conring ) +return; + +if ( !opt_conring_size ) +opt_conring_size = num_present_cpus() (9 + xenlog_lower_thresh); + +console_init_ring(); +} + void __init console_endboot(void) { int i, j; --- a/xen/include/xen/console.h +++ b/xen/include/xen/console.h @@ -14,6 +14,7 @@ struct xen_sysctl_readconsole; long read_console_ring(struct xen_sysctl_readconsole *op); void console_init_preirq(void); +void console_init_ring(void); void console_init_postirq(void); void console_endboot(void); int console_has(const char *device); console: allocate ring buffer earlier ... when conring_size= was specified on the command line. We can't really do this as early as we would want to when the option was not specified, as the default depends on knowing the system CPU count. Yet the parsing of the ACPI tables is one of the things that generates a lot of output especially on large systems. Signed-off-by: Jan Beulich jbeul...@suse.com --- v2: Also adjust ARM (as requested by Julien). Rename new function to console_init_ring(). --- a/xen/arch/arm/setup.c +++ b/xen/arch/arm/setup.c @@ -745,6 +745,7 @@ void __init start_xen(unsigned long boot dt_uart_init(); console_init_preirq(); +console_init_ring(); system_state = SYS_STATE_boot; --- a/xen/arch/x86/setup.c +++ b/xen/arch/x86/setup.c @@ -1187,6 +1187,7 @@ void __init noreturn __start_xen(unsigne } vm_init(); +console_init_ring(); vesa_init(); softirq_init(); --- a/xen/drivers/char/console.c +++ b/xen/drivers/char/console.c @@ -744,15 +744,14 @@ void __init console_init_preirq(void) } } -void __init console_init_postirq(void) +void __init console_init_ring(void) { char *ring; unsigned int i, order, memflags; - -serial_init_postirq(); +unsigned long flags; if ( !opt_conring_size ) -opt_conring_size = num_present_cpus() (9 + xenlog_lower_thresh); +return; order = get_order_from_bytes(max(opt_conring_size, conring_size)); memflags = MEMF_bits(crashinfo_maxaddr_bits); @@ -763,17 +762,30 @@ void __init console_init_postirq(void) } opt_conring_size = PAGE_SIZE order; -spin_lock_irq(console_lock); +spin_lock_irqsave(console_lock, flags); for ( i = conringc ; i != conringp; i++ ) ring[i (opt_conring_size - 1)] = conring[i (conring_size - 1)]; conring = ring; smp_wmb(); /* Allow users of console_force_unlock() to see larger buffer. */ conring_size = opt_conring_size; -spin_unlock_irq(console_lock); +spin_unlock_irqrestore(console_lock, flags); printk(Allocated console ring of %u KiB.\n, opt_conring_size 10); } +void __init console_init_postirq(void) +{ +serial_init_postirq(); + +if ( conring !=
Re: [Xen-devel] [PATCH v2 12/19] hvmloader: retrieve vNUMA information from hypervisor
On Tue, Dec 09, 2014 at 04:46:22PM +, Jan Beulich wrote: On 01.12.14 at 16:33, wei.l...@citrix.com wrote: @@ -261,6 +262,8 @@ int main(void) init_hypercalls(); +init_vnuma_info(); This is very early, and I don't think it needs to be - I guess it could be done right before doing ACPI stuff? Yes, it can be moved right before ACPI stuff. --- /dev/null +++ b/tools/firmware/hvmloader/vnuma.c @@ -0,0 +1,84 @@ +/* + * vnuma.c: obtain vNUMA information from hypervisor + * + * Copyright (c) 2014 Wei Liu, Citrix Systems (RD) Ltd. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + *notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + *notice, this list of conditions and the following disclaimer in the + *documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include util.h +#include hypercall.h +#include vnuma.h +#include errno.h This is the system header, not the Xen one. See the discussion at http://lists.xenproject.org/archives/html/xen-devel/2014-10/msg03206.html and perhaps build on the resulting patch http://lists.xenproject.org/archives/html/xen-devel/2014-12/msg00013.html. I will cherry pick that patch and build on top of it. + +unsigned int nr_vnodes, nr_vmemranges; +unsigned int *vcpu_to_vnode, *vdistance; +xen_vmemrange_t *vmemrange; + [...] + */ + +#ifndef __HVMLOADER_VNUMA_H__ +#define __HVMLOADER_VNUMA_H__ + +#include xen/memory.h + +#define MAX_VNODES 64 +#define MAX_VDISTANCE (MAX_VNODES * MAX_VNODES) +#define MAX_VMEMRANGES (MAX_VNODES * 2) These look arbitrary - please add a (brief) comment giving some rationale so that people needing to change them know eventual consequences. Would it perhaps make sense to derive MAX_VNODES from HVM_MAX_VCPUS? Additionally I think the I don't think these two have very strong connection though. And I remember you saying HVM_MAX_VCPUS will be removed. code wouldn't become much more difficult if you didn't build in static upper limits. Yes I can make two hypercalls. First one to retrieve the number of nodes / vmemranges configured, allocate memory then fill in those arrays with second hypercall. Wei. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 13/19] hvmloader: construct SRAT
On Tue, Dec 09, 2014 at 04:53:40PM +, Jan Beulich wrote: On 01.12.14 at 16:33, wei.l...@citrix.com wrote: @@ -203,6 +204,65 @@ static struct acpi_20_waet *construct_waet(void) return waet; } +static struct acpi_20_srat *construct_srat(void) +{ +struct acpi_20_srat *srat; +struct acpi_20_srat_processor *processor; +struct acpi_20_srat_memory *memory; +unsigned int size; +void *p; +int i; +uint64_t mem; + +size = sizeof(*srat) + sizeof(*processor) * hvm_info-nr_vcpus + +sizeof(*memory) * nr_vmemranges; + +p = mem_alloc(size, 16); +if (!p) return NULL; Coding style (despite you likely copied it from elsewhere). Fixed. + +srat = p; +memset(srat, 0, sizeof(*srat)); +srat-header.signature= ACPI_2_0_SRAT_SIGNATURE; +srat-header.revision = ACPI_2_0_SRAT_REVISION; +fixed_strcpy(srat-header.oem_id, ACPI_OEM_ID); +fixed_strcpy(srat-header.oem_table_id, ACPI_OEM_TABLE_ID); +srat-header.oem_revision = ACPI_OEM_REVISION; +srat-header.creator_id = ACPI_CREATOR_ID; +srat-header.creator_revision = ACPI_CREATOR_REVISION; +srat-table_revision = ACPI_SRAT_TABLE_REVISION; + +processor = (struct acpi_20_srat_processor *)(srat + 1); +for ( i = 0; i hvm_info-nr_vcpus; i++ ) +{ +memset(processor, 0, sizeof(*processor)); +processor-type = ACPI_PROCESSOR_AFFINITY; +processor-length = sizeof(*processor); +processor-domain = vcpu_to_vnode[i]; +processor-apic_id = LAPIC_ID(i); +processor-flags= ACPI_LOCAL_APIC_AFFIN_ENABLED; +processor++; +} + +memory = (struct acpi_20_srat_memory *)processor; +for ( i = 0; i nr_vmemranges; i++ ) +{ +mem = vmemrange[i].end - vmemrange[i].start; Do you really need this helper variable? Removed. +memset(memory, 0, sizeof(*memory)); +memory-type = ACPI_MEMORY_AFFINITY; +memory-length= sizeof(*memory); +memory-domain= vmemrange[i].nid; +memory-flags = ACPI_MEM_AFFIN_ENABLED; +memory-base_address = vmemrange[i].start; +memory-mem_length= mem; +memory++; +} + +srat-header.length = size; Mind checking size == memory - p here? Why? There doesn't seem to be anything that would cause memory -p != size in between during runtime. @@ -346,6 +407,16 @@ static int construct_secondary_tables(unsigned long *table_ptrs, } } +/* SRAT */ +if ( nr_vnodes 0 ) +{ +srat = construct_srat(); +if (srat) Coding style again. Fixed. Wei. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCHv1 net] xen-netfront: use correct linear area after linearizing an skb
Commit 97a6d1bb2b658ac85ed88205ccd1ab809899884d (xen-netfront: Fix handling packets on compound pages with skb_linearize) attempted to fix a problem where an skb that would have required too many slots would be dropped causing TCP connections to stall. However, it filled in the first slot using the original buffer and not the new one and would use the wrong offset and grant access to the wrong page. Netback would notice the malformed request and stop all traffic on the VIF, reporting: vif vif-3-0 vif3.0: txreq.offset: 85e, size: 4002, end: 6144 vif vif-3-0 vif3.0: fatal error; disabling device Reported-by: Anthony Wright anth...@overnetdata.com Tested-by: Anthony Wright anth...@overnetdata.com Signed-off-by: David Vrabel david.vra...@citrix.com --- drivers/net/xen-netfront.c |3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c index ece8d18..eeed0ce 100644 --- a/drivers/net/xen-netfront.c +++ b/drivers/net/xen-netfront.c @@ -627,6 +627,9 @@ static int xennet_start_xmit(struct sk_buff *skb, struct net_device *dev) slots, skb-len); if (skb_linearize(skb)) goto drop; + data = skb-data; + offset = offset_in_page(data); + len = skb_headlen(skb); } spin_lock_irqsave(queue-tx_lock, flags); -- 1.7.10.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Xen-unstable: irq.c:xxxx: dom1: forcing unbind of pirq 40
Tuesday, December 9, 2014, 7:05:32 PM, you wrote: Tuesday, December 9, 2014, 6:29:08 PM, you wrote: On 09.12.14 at 17:24, li...@eikelenboom.it wrote: Currently i'm running Xen-unstable with Stefano's libxl fixup patches (not doing reset due to qmp race, switching order of xc_domain_irq_permission and xc_physdev_unmap_pirq) and with a kernel with Konrad's 3.19 patch series. Although i have seen these messages before, so i don't know if it is a recent regression (don't think so). I still see these in xl dmesg when shutting down guests with pci passthrough: irq.c:: dom1: forcing unbind of pirq 40 It seems to cleanup ok afterwards, but from the sound warning it shouldn't get to that point. I get these for both PV and HVM guests. These messages simply indicate that the guest didn't do some expected cleanup itself. OK, the thing is that at present is does this on every shutdown of guests with pci-passthrough, all running a recent kernel. That hasn't always been that way as far as i can remember. I will see if i have some old guest kernels and see if i get the same there. The most ancient i had around was 3.12, tested on the PV guest, but no difference. I see these too when Dom0 shuts down, at least for certain drivers. For HVM arguably qemu might know better and tear things down properly, but in no event are these messages - when occurring in connection with a guest going down - indications of a problem. We could even see to lower their severity or suppress them altogether when the domain is known to be dying (you may want to check whether -is_dying is already set at that point). Just tested, but for both PV and HVM when it enters unmap_domain_pirq d-is_dying ? 1 : 0 returns 0. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 2/2] x86, arm64, platform, xen, kconfig: add xen defconfig helper
On Tue, Dec 9, 2014 at 12:22 PM, Luis R. Rodriguez mcg...@do-not-panic.com wrote: If you are sure then yes. Likewise here. Luis ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 2/2] x86, arm64, platform, xen, kconfig: add xen defconfig helper
On 09/12/14 20:22, Luis R. Rodriguez wrote: On Tue, Dec 9, 2014 at 1:06 AM, Julien Grall julien.gr...@linaro.org wrote: Hello Luis, On 08/12/2014 23:05, Luis R. Rodriguez wrote: diff --git a/kernel/configs/xen.config b/kernel/configs/xen.config new file mode 100644 index 000..0d0eb6d --- /dev/null +++ b/kernel/configs/xen.config +CONFIG_XEN_MCE_LOG=y MCE is x86 specific. That's what I thought too but its available for arm64, so should we fix that Kconfig to depend on x86? Are you sure? On the Linus's repo I have: config XEN_MCE_LOG bool Xen platform mcelog depends on XEN_DOM0 X86_64 X86_MCE Anyway, the MCE interface in the hypervisor is implemented in arch/x86 not in common code. +CONFIG_XEN_HAVE_PVMMU=y We don't have PVMMU support on ARM. Shouldn't you move this config in architecture specific code? If you are sure then yes. I'm 100% sure. MMU is handled by the hardware on ARM. Thinking a bit more about this option. CONFIG_XEN_HAVE_PVMMU can't be selected by the user. It's automatically added per platform (for instance see arch/x86/xen/Kconfig). So maybe it should not even appear in the one of the fragment configs? Regards, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [libvirt test] 32180: regressions - FAIL
flight 32180 libvirt real [real] http://www.chiark.greenend.org.uk/~xensrcts/logs/32180/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: build-i386-libvirt5 libvirt-build fail REGR. vs. 32137 build-amd64-libvirt 5 libvirt-build fail REGR. vs. 32137 build-armhf-libvirt 5 libvirt-build fail REGR. vs. 32137 Tests which did not succeed, but are not blocking: test-amd64-i386-libvirt 1 build-check(1) blocked n/a test-amd64-amd64-libvirt 1 build-check(1) blocked n/a test-armhf-armhf-libvirt 1 build-check(1) blocked n/a version targeted for testing: libvirt 8a408b869ce343f953a7ca564598985504b9aa7f baseline version: libvirt 253319ced6e9e92992f59000e3810b7d3ab59750 People who touched revisions under test: Christophe Fergeau cferg...@redhat.com Eric Blake ebl...@redhat.com Kyle DeFrancia k...@linux.vnet.ibm.com Laine Stump la...@laine.org Martin Kletzander mklet...@redhat.com Peter Krempa pkre...@redhat.com jobs: build-amd64 pass build-armhf pass build-i386 pass build-amd64-libvirt fail build-armhf-libvirt fail build-i386-libvirt fail build-amd64-pvopspass build-armhf-pvopspass build-i386-pvops pass test-amd64-amd64-libvirt blocked test-armhf-armhf-libvirt blocked test-amd64-i386-libvirt blocked sg-report-flight on osstest.cam.xci-test.com logs: /home/xc_osstest/logs images: /home/xc_osstest/images Logs, config files, etc. are available at http://www.chiark.greenend.org.uk/~xensrcts/logs Test harness code can be found at http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary Not pushing. (No revision log; it would be 430 lines long.) ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] PV DomU running linux 3.17.3 causing xen-netback fatal error in Dom0
On 09/12/2014 22:30, Siegmann Joseph wrote: Would you mind sharing what it would take to correct this issue… is there a file I could just replace until a patch is released? We simply applied David Vrabel's patch from 8/12/14 to a stock 3.17.3 kernel we were running in the DomU and it fixed the problem. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH v2 0/2] x86/arm64: add xenconfig
From: Luis R. Rodriguez mcg...@suse.com This v2 addreses some minor feedback on patch 2, and just mends in the review tags. Luis R. Rodriguez (2): x86, platform, xen, kconfig: clarify kvmconfig is for kvm x86, arm, platform, xen, kconfig: add xen defconfig helper arch/x86/configs/xen.config | 7 +++ kernel/configs/xen.config | 30 ++ scripts/kconfig/Makefile| 7 ++- 3 files changed, 43 insertions(+), 1 deletion(-) create mode 100644 arch/x86/configs/xen.config create mode 100644 kernel/configs/xen.config -- 2.1.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH v2 2/2] x86, arm, platform, xen, kconfig: add xen defconfig helper
From: Luis R. Rodriguez mcg...@suse.com This lets you build a kernel which can support xen dom0 or xen guests by just using: make xenconfig on both x86 and arm64 kernels. This also splits out the options which are available currently to be built with x86 and 'make ARCH=arm64' under a shared config. Technically xen supports a dom0 kernel and also a guest kernel configuration but upon review with the xen team since we don't have many dom0 options its best to just combine these two into one. Cc: Josh Triplett j...@joshtriplett.org Cc: Borislav Petkov b...@suse.de Cc: Pekka Enberg penb...@kernel.org Cc: David Rientjes rient...@google.com Cc: Michal Marek mma...@suse.cz Cc: Randy Dunlap rdun...@infradead.org Cc: penb...@kernel.org Cc: levinsasha...@gmail.com Cc: mtosa...@redhat.com Cc: fengguang...@intel.com Cc: David Vrabel david.vra...@citrix.com Cc: Ian Campbell ian.campb...@citrix.com Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com Cc: xen-de...@lists.xenproject.org Reviewed-by: Josh Triplett j...@joshtriplett.org Signed-off-by: Luis R. Rodriguez mcg...@suse.com --- arch/x86/configs/xen.config | 7 +++ kernel/configs/xen.config | 30 ++ scripts/kconfig/Makefile| 5 + 3 files changed, 42 insertions(+) create mode 100644 arch/x86/configs/xen.config create mode 100644 kernel/configs/xen.config diff --git a/arch/x86/configs/xen.config b/arch/x86/configs/xen.config new file mode 100644 index 000..92b8587f --- /dev/null +++ b/arch/x86/configs/xen.config @@ -0,0 +1,7 @@ +# x86 xen specific config options +CONFIG_XEN_PVHVM=y +CONFIG_XEN_MAX_DOMAIN_MEMORY=500 +CONFIG_XEN_SAVE_RESTORE=y +# CONFIG_XEN_DEBUG_FS is not set +CONFIG_XEN_PVH=y +CONFIG_XEN_MCE_LOG=y diff --git a/kernel/configs/xen.config b/kernel/configs/xen.config new file mode 100644 index 000..d2ec010 --- /dev/null +++ b/kernel/configs/xen.config @@ -0,0 +1,30 @@ +# generic config +CONFIG_XEN=y +CONFIG_XEN_DOM0=y +CONFIG_PCI_XEN=y +CONFIG_XEN_PCIDEV_FRONTEND=m +CONFIG_XEN_BLKDEV_FRONTEND=m +CONFIG_XEN_BLKDEV_BACKEND=m +CONFIG_XEN_NETDEV_FRONTEND=m +CONFIG_XEN_NETDEV_BACKEND=m +CONFIG_INPUT_XEN_KBDDEV_FRONTEND=y +CONFIG_HVC_XEN=y +CONFIG_HVC_XEN_FRONTEND=y +CONFIG_TCG_XEN=m +CONFIG_XEN_WDT=m +CONFIG_XEN_FBDEV_FRONTEND=y +CONFIG_XEN_BALLOON=y +CONFIG_XEN_BALLOON_MEMORY_HOTPLUG=y +CONFIG_XEN_SCRUB_PAGES=y +CONFIG_XEN_DEV_EVTCHN=m +CONFIG_XEN_BACKEND=y +CONFIG_XENFS=m +CONFIG_XEN_COMPAT_XENFS=y +CONFIG_XEN_SYS_HYPERVISOR=y +CONFIG_XEN_XENBUS_FRONTEND=y +CONFIG_XEN_GNTDEV=m +CONFIG_XEN_GRANT_DEV_ALLOC=m +CONFIG_SWIOTLB_XEN=y +CONFIG_XEN_PCIDEV_BACKEND=m +CONFIG_XEN_PRIVCMD=m +CONFIG_XEN_ACPI_PROCESSOR=m diff --git a/scripts/kconfig/Makefile b/scripts/kconfig/Makefile index ff612b0..f4a8f89 100644 --- a/scripts/kconfig/Makefile +++ b/scripts/kconfig/Makefile @@ -117,6 +117,10 @@ PHONY += kvmconfig kvmconfig: $(call mergeconfig,kvm_guest) +PHONY += xenconfig +xenconfig: + $(call mergeconfig,xen) + PHONY += tinyconfig tinyconfig: allnoconfig $(call mergeconfig,tiny) @@ -142,6 +146,7 @@ help: @echo ' listnewconfig - List new options' @echo ' olddefconfig- Same as silentoldconfig but sets new symbols to their default value' @echo ' kvmconfig - Enable additional options for kvm guest kernel support' + @echo ' xenconfig - Enable additional options for xen dom0 and guest kernel support' @echo ' tinyconfig - Configure the tiniest possible kernel' # lxdialog stuff -- 2.1.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 1/2] x86, platform, xen, kconfig: clarify kvmconfig is for kvm
On Tue, Dec 09, 2014 at 03:35:37PM -0800, Luis R. Rodriguez wrote: From: Luis R. Rodriguez mcg...@suse.com We'll be adding options for xen as well. Cc: Josh Triplett j...@joshtriplett.org Cc: Borislav Petkov b...@suse.de Cc: Pekka Enberg penb...@kernel.org Cc: David Rientjes rient...@google.com Cc: Michal Marek mma...@suse.cz Cc: Randy Dunlap rdun...@infradead.org Cc: penb...@kernel.org Cc: levinsasha...@gmail.com Cc: mtosa...@redhat.com Cc: fengguang...@intel.com Cc: David Vrabel david.vra...@citrix.com Cc: Ian Campbell ian.campb...@citrix.com Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com Cc: xen-de...@lists.xenproject.org Acked-by: David Rientjes rient...@google.com Acked-by: Borislav Petkov b...@suse.de Signed-off-by: Luis R. Rodriguez mcg...@suse.com Reviewed-by: Josh Triplett j...@joshtriplett.org scripts/kconfig/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scripts/kconfig/Makefile b/scripts/kconfig/Makefile index 9645c07..ff612b0 100644 --- a/scripts/kconfig/Makefile +++ b/scripts/kconfig/Makefile @@ -141,7 +141,7 @@ help: @echo ' randconfig - New config with random answer to all options' @echo ' listnewconfig - List new options' @echo ' olddefconfig- Same as silentoldconfig but sets new symbols to their default value' - @echo ' kvmconfig - Enable additional options for guest kernel support' + @echo ' kvmconfig - Enable additional options for kvm guest kernel support' @echo ' tinyconfig - Configure the tiniest possible kernel' # lxdialog stuff -- 2.1.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] One question about the hypercall to translate gfn to mfn.
From: Jan Beulich [mailto:jbeul...@suse.com] Sent: Tuesday, December 09, 2014 6:50 PM On 09.12.14 at 11:37, yu.c.zh...@linux.intel.com wrote: On 12/9/2014 6:19 PM, Paul Durrant wrote: I think use of an raw mfn value currently works only because dom0 is using a 1:1 IOMMU mapping scheme. Is my understanding correct, or do you really need raw mfn values? Thanks for your quick response, Paul. Well, not exactly for this case. :) In XenGT, our need to translate gfn to mfn is for GPU's page table, which contains the translation between graphic address and the memory address. This page table is maintained by GPU drivers, and our service domain need to have a method to translate the guest physical addresses written by the vGPU into host physical ones. We do not use IOMMU in XenGT and therefore this translation may not necessarily be a 1:1 mapping. Hmm, that suggests you indeed need raw MFNs, which in turn seems problematic wrt PVH Dom0 (or you'd need a GFN-GMFN translation layer). But while you don't use the IOMMU yourself, I suppose the GPU accesses still don't bypass the IOMMU? In which case all you'd need returned is a frame number that guarantees that after IOMMU translation it refers to the correct MFN, i.e. still allowing for your Dom0 driver to simply set aside a part of its PFN space, asking Xen to (IOMMU-)map the necessary guest frames into there. No. What we require is the raw MFNs. One IOMMU device entry can't point to multiple VM's page tables, so that's why XenGT needs to use software shadow GPU page table to implement the sharing. Note it's not for dom0 to access the MFN. It's for dom0 to setup the correct shadow GPU page table, so a VM can access the graphics memory in a controlled way. Thanks Kevin ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] One question about the hypercall to translate gfn to mfn.
From: Tim Deegan [mailto:t...@xen.org] Sent: Tuesday, December 09, 2014 6:47 PM At 18:10 +0800 on 09 Dec (1418145055), Yu, Zhang wrote: Hi all, As you can see, we are pushing our XenGT patches to the upstream. One feature we need in xen is to translate guests' gfn to mfn in XenGT dom0 device model. Here we may have 2 similar solutions: 1 Paul told me(and thank you, Paul :)) that there used to be a hypercall, XENMEM_translate_gpfn_list, which was removed by Keir in commit 2d2f7977a052e655db6748be5dabf5a58f5c5e32, because there was no usage at that time. It's been suggested before that we should revive this hypercall, and I don't think it's a good idea. Whenever a domain needs to know the actual MFN of another domain's memory it's usually because the security model is problematic. In particular, finding the MFN is usually followed by a brute-force mapping from a dom0 process, or by passing the MFN to a device for unprotected DMA. In our case it's not because the security model is problematic. It's because GPU virtualization is done in Dom0 while the memory virtualization is done in hypervisor. We need a means to query GPFN-MFN so we can setup shadow GPU page table in Dom0 correctly, for a VM. These days DMA access should be protected by IOMMUs, or else the device drivers (and associated tools) are effectively inside the hypervisor's TCB. Luckily on x86 IOMMUs are widely available (and presumably present on anything new enough to run XenGT?). yes, IOMMU protect DMA accesses in a device-agnostic way. But in our case, IOMMU can't be used because it's only for exclusively assigned case, as I replied in another mail. And to reduce the hypervisor TCB, we put device model in Dom0 which is why a interface is required to connect p2m information. So I think the interface we need here is a please-map-this-gfn one, like the existing grant-table ops (which already do what you need by returning an address suitable for DMA). If adding a grant entry for every frame of the framebuffer within the guest is too much, maybe we can make a new interface for the guest to grant access to larger areas. A please-map-this-gfn interface assumes the logic behind lies in Xen hypervisor, e.g. managing CPU page table or IOMMU entry. However here the management of GPU page table is in Dom0, and what we want is a please-tell-me-mfn-for-a-gpfn interface, so we can translate from gpfn in guest GPU PTE to a mfn in shadow GPU PTE. Hope this makes the requirement clearer. Cheers, Tim. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] One question about the hypercall to translate gfn to mfn.
From: Malcolm Crossley Sent: Tuesday, December 09, 2014 6:52 PM On 09/12/14 10:37, Yu, Zhang wrote: On 12/9/2014 6:19 PM, Paul Durrant wrote: I think use of an raw mfn value currently works only because dom0 is using a 1:1 IOMMU mapping scheme. Is my understanding correct, or do you really need raw mfn values? Thanks for your quick response, Paul. Well, not exactly for this case. :) In XenGT, our need to translate gfn to mfn is for GPU's page table, which contains the translation between graphic address and the memory address. This page table is maintained by GPU drivers, and our service domain need to have a method to translate the guest physical addresses written by the vGPU into host physical ones. We do not use IOMMU in XenGT and therefore this translation may not necessarily be a 1:1 mapping. XenGT must use the IOMMU mappings that Xen has setup for the domain which owns the GPU. Currently Dom0 own's the GPU and so it's IOMMU mappings match the MFN's addresses. I suspect XenGT will not work if Xen is booted with iommu=dom0-strict. This is a good point. So yes in this case IOMMU is still active which contains a 1:1 IOMMU mapping table, but it's a separate thing from the interface discussed here, which is about setup a shadow GPU page table for other VM's graphics memory accesses. Thanks Kevin ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel