Re: [Xen-devel] [libvirt] [PATCH V4 1/3] Introduce support for parsing/formatting Xen xl config format
On 01/13/2015 04:47 PM, Jim Fehlig wrote:

>>> +++ b/src/Makefile.am
>>> @@ -1005,6 +1005,10 @@ XENCONFIG_SOURCES = \
>>>  	xenconfig/xen_common.c xenconfig/xen_common.h \
>>>  	xenconfig/xen_sxpr.c xenconfig/xen_sxpr.h \
>>>  	xenconfig/xen_xm.c xenconfig/xen_xm.h
>>> +if WITH_LIBXL
>>> +XENCONFIG_SOURCES += \
>>> +	xenconfig/xen_xl.c xenconfig/xen_xl.h
>>> +endif WITH_LIBXL
>>
>> Missing an EXTRA_DIST listing to ensure these two files are part of a
>> tarball even when configure does not build libxl sources (that is, make
>> sure 'make distcheck' will not fail if configured on a machine without
>> libxl support).
>
> Several of the tests fail on the old distro I'm testing on, which causes
> 'make distcheck' to fail. But I assumed I could simulate it on a newer
> distro with 'configure --without-libxl', yet 'make distcheck' succeeds
> and the files are included in the tarball. Are you seeing a failure on
> RHEL5?

I haven't yet tested a 'make distcheck' on that particular VM of mine;
I'll fire one off and see what happens. It might be simpler just to
check that 'make dist', when run after configuring --without-libxl,
still includes both files in the tarball.

-- 
Eric Blake   eblake redhat com   +1-919-301-3266
Libvirt virtualization library http://libvirt.org

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
Re: [Xen-devel] [libvirt] [PATCH V4 2/3] tests: Tests for the xen-xl parser
Eric Blake wrote:
> On 01/13/2015 08:53 AM, Jim Fehlig wrote:
>> From: Kiarie Kahurani davidkiar...@gmail.com
>>
>> add tests for the xen_xl config parser
>>
>> Signed-off-by: Kiarie Kahurani davidkiar...@gmail.com
>> Signed-off-by: Jim Fehlig jfeh...@suse.com
>> ---
>> V4: Only build xlconfigtest when libxl is available.
>>
>> @@ -227,6 +228,11 @@ if WITH_XEN
>>  test_programs += xml2sexprtest sexpr2xmltest \
>>  	xmconfigtest xencapstest statstest reconnect
>>  endif WITH_XEN
>> +
>> +if WITH_LIBXL
>> +test_programs += xlconfigtest
>> +endif WITH_LIBXL
>
> Nice.
>
>> +
>> +DO_TEST("new-disk", 3);
>> +//DO_TEST("spice", 3);
>> +
>
> Do we still need this comment?

Wow, I can't believe I missed that. Thanks for catching it. This series
is quite old and I forgot about this todo item. Enabling the spice test
not only exposed problems with the xlconfigdata test files, but caught a
bug in the xen-xl parser too! I'll fix those in V5 after seeing your
answer to my question in 1/3.

Regards,
Jim
Re: [Xen-devel] [RFC V9 2/4] domain snapshot overview
On 1/12/2015 at 09:54 PM, in message 1421070890.26317.69.ca...@citrix.com, Ian Campbell ian.campb...@citrix.com wrote: On Mon, 2015-01-12 at 00:01 -0700, Chun Yan Liu wrote: On 1/8/2015 at 08:26 PM, in message 1420719995.19787.62.ca...@citrix.com, Ian Campbell ian.campb...@citrix.com wrote: On Mon, 2014-12-22 at 20:42 -0700, Chun Yan Liu wrote: On 12/19/2014 at 06:25 PM, in message 1418984720.20028.15.ca...@citrix.com, Ian Campbell ian.campb...@citrix.com wrote: On Thu, 2014-12-18 at 22:45 -0700, Chun Yan Liu wrote: On 12/18/2014 at 11:10 PM, in message 1418915443.11882.86.ca...@citrix.com, Ian Campbell ian.campb...@citrix.com wrote: On Tue, 2014-12-16 at 14:32 +0800, Chunyan Liu wrote: Changes to V8: * add an overview document, so that one can has a overall look about the whole domain snapshot work, limits, requirements, how to do, etc. = Domain snapshot overview I don't see a similar section for disk snapshots, are you not considering those here except as a part of a domain snapshot or is this an oversight? There are three main use cases (that I know of at least) for snapshotting like behaviour. One is as you've mentioned below for backup, i.e. to preserve the VM at a certain point in time in order to be able to roll back to it. Is this the only usecase you are considering? Yes. I didn't take disk snapshot thing into the scope. A second use case is to support gold image type deployments, i.e. where you create one baseline single disk image and then clone it multiple times to deploy lots of guests. I think this is usually a disk snapshot type thing, but maybe it can be implemented as restoring a gold domain snapshot multiple times (e.g. for start of day performance reasons). As we initially discussed about the thing, disk snapshot thing can be done be existing tools directly like qemu-img, vhd-util. I was reading this section as a more generic overview of snapshotting, without reference to where/how things might ultimately be implemented. 
From a design point of view it would be useful to cover the various use cases, even if the solution is that the user implements them using CLI tools by hand (xl) or the toolstack does it for them internally (libvirt). This way we can more clearly see the full picture, which allows us to validate that we are making the right choices about what goes where. OK. I see. I think this user case is more like how to use the snapshot, rather than how to implement snapshot. Right? Correct, what the user is actually trying to achieve with the functionality. 'Gold image' or 'Gold domain', the needed work is more like cloning disks. Yes, or resuming multiple times. I see. But IMO it doesn't need change in snapshot design and implementation. Even resuming multiple times, they couldn't use the same image but duplicate the image multiple times. Perhaps, but the use case should be included so that this rationale for not worrying about it can be written down (so that people like me don't keep asking...) Got it. Thanks! The third case, (which is similar to the first), is taking a disk snapshot in order to be able to run you usual backup software on the snapshot (which is now unchanging, which is handy) and then deleting the disk snapshot (this differs from the first case in which disk is active after the snapshot, and due to the lack of the memory part). Sorry, I'm still not quite clear about what this user case wants to do. The user has an active domain which they want to backup, but backup software often does not cope well if the data is changing under its feet. So the users wants to take a snapshot of the domains disks while leaving the domain running, so they can backup that static version of the disk out of band from the VM itself (e.g. by attaching it to a separate backup VM). Got it. So that's simply disk-only snapshot when domian is active. As you
Re: [Xen-devel] [PATCH Linux-2.6.18] scsifront: avoid aquiring same lock twice if ring is full
On 01/13/2015 07:53 PM, Pasi Kärkkäinen wrote:
> Hi,
>
> On Tue, Jan 13, 2015 at 05:22:58PM +0100, Juergen Gross wrote:
>> The locking in scsifront_dev_reset_handler() is obviously wrong. In
>> case of a full ring the host lock is acquired twice.
>>
>> Fixing this issue makes it possible to get rid of the endless for loop
>> with an explicit break statement.
>
> Is this patch needed in the upstream Linux kernel as well,
> now that the Xen PVSCSI drivers are in upstream Linux?

No, especially this part of the code was reorganized and doesn't have
that issue.

Juergen

> Thanks,
>
> -- Pasi
>
>> Signed-off-by: Juergen Gross jgr...@suse.com
>> ---
>> diff -r 078f1bb69ea5 drivers/xen/scsifront/scsifront.c
>> --- a/drivers/xen/scsifront/scsifront.c	Wed Dec 10 10:22:39 2014 +0100
>> +++ b/drivers/xen/scsifront/scsifront.c	Tue Jan 13 14:32:33 2015 +0100
>> @@ -447,12 +447,10 @@ static int scsifront_dev_reset_handler(s
>>  	uint16_t rqid;
>>  	int err = 0;
>>
>> -	for (;;) {
>>  #if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,12)
>> -		spin_lock_irq(host->host_lock);
>> +	spin_lock_irq(host->host_lock);
>>  #endif
>> -		if (!RING_FULL(&info->ring))
>> -			break;
>> +	while (RING_FULL(&info->ring)) {
>>  		if (err) {
>>  #if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,12)
>>  			spin_unlock_irq(host->host_lock);
Re: [Xen-devel] [PATCH for-4.6 2/4] xen/arm: vgic: Keep track of vIRQ used by a domain
On 13/01/15 16:46, Ian Campbell wrote:
>> We need to track everything for interrupt assignment to a guest/dom0.
>> So if the guest asks for a free vIRQ we can give it directly.
>
> Makes sense. In that case your 0/4 mail doesn't fully describe the use
> case for the series, since it talks about the dom0 PPI only.

Sorry, I skipped this comment inadvertently. My cover letter explained
the current use case; I didn't think to explain the future use case.

I will update the cover letter.

Regards,

-- 
Julien Grall
Re: [Xen-devel] [PATCH 00/11] Alternate p2m: support multiple copies of host p2m
On 01/13/2015 12:56 AM, Jan Beulich wrote: On 12.01.15 at 18:36, edmund.h.wh...@intel.com wrote: On 01/12/2015 02:00 AM, Jan Beulich wrote: On 10.01.15 at 00:04, edmund.h.wh...@intel.com wrote: On 01/09/2015 02:41 PM, Andrew Cooper wrote: Having some non-OS part of the guest swap the EPT tables and accidentally turn a DMA buffer read-only is not going to end well. The agent can certainly do bad things, and at some level you have to assume it is sensible enough not to. However, I'm not sure this is fundamentally more dangerous than what a privileged domain can do today using the MEMOP... operations, and people are already using those for very similar purposes. I don't follow - how is what privileged domain can do related to the proposed changes here (which are - via VMFUNC - at least partially guest controllable, and that's also the case Andrew mentioned in his reply)? I'm having a hard time understanding how a P2M stripped of anything that's not plain RAM can be very useful to a guest. IOW without such fundamental aspects clarified I don't see a point in looking at the individual patches (which btw, according to your wording elsewhere, should have been marked RFC). In this patch series, none of the new hypercalls are protected by xsm policies. Earlier in the process of working on this code, I added such a check to all the hypercalls, but then removed them all because it dawned on me that I didn't actually understand what I was doing and my code only worked because I only ever built the dummy permit everything policy. Should some version of this patch series be accepted, my hope is that someone who does understand xsm policies would put the appropriate checks in place, and at that point I maintain that these extra capabilities would not be fundamentally more dangerous than existing mechanisms available to privileged domains, because policy can prevent the guest using vmfunc. That's obviously not true today. 
Please simply consult with the XSM maintainer on questions/issues like this. Proposing a partial (insecure) patch set isn't appropriate. The alternate p2m's only contain entries for ram pages with valid mfn's. All other page types are still handled in the nested page fault handler for the host p2m. Those pages (at least the ones I've encountered) don't require the hardware to have a valid EPTE for the page. I.e. the functionality requiring e.g. p2m_ram_logdirty and p2m_mmio_direct is then incompatible with your proposed additions (which I think was also already noted by Andrew). That's imo not a basis to think about accepting (or even reviewing) the series. Andrew raised that question, and I answered that pages needing special handling are compatible with these changes. Unless I misunderstood him, he accepted that. If the hardware is never intended to be able to satisfy an access to a page without generating an EPT violation, then all the hardware needs is a set of EPT's that guarantee that behaviour. These changes take of advantage of that to avoid copying any of the EPTE's for special pages into the alternate p2m's. Instead, the nested page fault handler for the alternate p2m returns a status to indicate that the host p2m nested page fault handler should handle the violation using the data in the host p2m. If the result is that the page becomes ram in the host p2m and the instruction is restarted, the hardware will generate another violation and this time the EPTE will be copied. This works. I have vram log-dirty working, something that does not work with the nestedhvm nested EPT code. Ed ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH RFC] make error codes a formal part of the ABI
Ian Campbell writes ("Re: [PATCH RFC] make error codes a formal part of the ABI"):
> On Tue, 2015-01-13 at 16:21 +0000, Jan Beulich wrote:
>> There's one small block commented with TBD left in the public header.
>> This is the main reason for the submission being RFC. While we don't
>> currently use these error codes, I'm not sure if we should leave all
>> or some of them out for the time being.
>
> I say let's omit any we don't use for now.

Is it possible that anyone is using the existing header file where
these values were defined? If so their code might say

    case ELOOP:

which would not compile when they switched to the new header.

I don't know whether this is likely, or a problem.

Ian.
Re: [Xen-devel] [PATCH v3 05/19] libxl: add vmemrange to libxl__domain_build_state
Wei Liu writes ("[PATCH v3 05/19] libxl: add vmemrange to libxl__domain_build_state"):
> A vnode consists of one or more vmemranges (virtual memory ranges). One
> example of multiple vmemranges is a vnode that contains a hole.

I'm finding this series a bit oddly structured. This patch, for
example, just introduces some new fields to an internal state struct -
but these fields are not initialised, set, or read.

Ian.
Re: [Xen-devel] [PATCH v3 18/19] libxlutil: nested list support
Wei Liu writes ("Re: [PATCH v3 18/19] libxlutil: nested list support"):
> On Tue, Jan 13, 2015 at 03:52:48PM +0000, Ian Jackson wrote:
>> This commit message is very brief. For example, under the heading of
>> `Rework internal representation of setting' I would expect a clear
>> description of every formulaic change.
>
> Originally the internal representation of a setting is a (string,
> string) pair, the first string being the name of the setting, the
> second string being the value of the setting. Now the internal
> representation is changed to a (string, ConfigValue) pair, where
> ConfigValue can refer to a string or a list of ConfigValue's. Internal
> functions to deal with settings are changed accordingly.
>
> Does the above description make things clearer?

Yes. Something like that should be in the commit message. It would
help to refer to the actual type names. You could say (if true) for
example, "internal functions now refer to a ConfigSetting; the public
APIs still talk about ConfigValues" or some such.

>> Also, I think would be much easier to review if split up into 3
>> parts, which from the description above ought to be doable without
>> trouble.
>
> OK. I can try to split this patch into three.

If it's difficult for some reason then do get back to me.

>> AFAICT from your changes, the API is not backward compatible. ICBW,
>> but if I'm right that's not acceptable I'm afraid, even in libxlu.
>
> The old APIs still have the same semantics as before, so any
> applications linked against those APIs still have the same results
> returned.

Oh yes. Sorry, I had misread the patch and read your changes to
libxlu_internal.h as being in libxlutil.h.

>>> Previous APIs work as before.
>>
>> That can't be right because you have to at least specify how they
>> deal with the additional config file syntax.
>
> No, the old APIs don't deal with new syntax. If applications want to
> support new syntax, they need to use new API.

It's obvious that the new API can't return the new syntax. The
question is what happens if you try.

> If old APIs try to get value from new syntax, it has no effect.

I don't think "has no effect" can be right. What is the actual return
value from the function? Will it be treated as an error? (IMO it
should be, and this should be documented.)

Ian.
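To make the representation change described above concrete, here is a minimal sketch of a (name, value) setting whose value is either a string or a nested list. All type and function names (cfg_value, cfg_setting, cfg_get_string) are hypothetical and are not the identifiers used in libxlu. The accessor models what Ian asks for: an old-style string getter that reports an error when the setting holds a list, rather than silently having "no effect".

```c
#include <stddef.h>

/* Hypothetical names for illustration only; not the libxlu identifiers. */
enum cfg_value_type { CFG_STRING, CFG_LIST };

struct cfg_value {
    enum cfg_value_type type;
    union {
        const char *string;              /* valid when type == CFG_STRING */
        struct {
            struct cfg_value **items;    /* valid when type == CFG_LIST */
            size_t nitems;
        } list;
    } u;
};

/* A setting is now a (name, value) pair instead of (name, string). */
struct cfg_setting {
    const char *name;
    struct cfg_value *value;
};

/*
 * Old-style accessor: succeeds only for plain string values.
 * Returns 0 and stores the string in *out on success, or -1 when the
 * setting is missing or holds a list (the "new syntax" case).
 */
static int cfg_get_string(const struct cfg_setting *s, const char **out)
{
    if (s == NULL || s->value == NULL || s->value->type != CFG_STRING)
        return -1;
    *out = s->value->u.string;
    return 0;
}
```

Whether the real API should return an error code, log a message, or both in the list case is exactly the open question in the thread; the sketch only shows that the distinction can be made explicit in the return value.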
[Xen-devel] [PULL 2/2] xen-hvm: increase maxmem before calling xc_domain_populate_physmap
Increase maxmem before calling xc_domain_populate_physmap_exact to
avoid the risk of running out of guest memory. This way we can also
avoid complex memory calculations in libxl at domain construction
time.

This patch fixes an abort() when assigning more than 4 NICs to a VM.

Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
Signed-off-by: Don Slutz dsl...@verizon.com
---
 xen-hvm.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/xen-hvm.c b/xen-hvm.c
index 7548794..e2e575b 100644
--- a/xen-hvm.c
+++ b/xen-hvm.c
@@ -90,6 +90,12 @@ static inline ioreq_t *xen_vcpu_ioreq(shared_iopage_t *shared_page, int vcpu)
 #endif

 #define BUFFER_IO_MAX_DELAY  100
+/* Leave some slack so that hvmloader does not complain about lack of
+ * memory at boot time ("Could not allocate order=0 extent").
+ * Once hvmloader is modified to cope with that situation without
+ * printing warning messages, QEMU_SPARE_PAGES can be removed.
+ */
+#define QEMU_SPARE_PAGES 16

 typedef struct XenPhysmap {
     hwaddr start_addr;
@@ -244,6 +250,8 @@ void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size, MemoryRegion *mr)
     unsigned long nr_pfn;
     xen_pfn_t *pfn_list;
     int i;
+    xc_domaininfo_t info;
+    unsigned long free_pages;

     if (runstate_check(RUN_STATE_INMIGRATE)) {
         /* RAM already populated in Xen */
@@ -266,6 +274,22 @@ void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size, MemoryRegion *mr)
         pfn_list[i] = (ram_addr >> TARGET_PAGE_BITS) + i;
     }

+    if ((xc_domain_getinfolist(xen_xc, xen_domid, 1, &info) != 1) ||
+        (info.domain != xen_domid)) {
+        hw_error("xc_domain_getinfolist failed");
+    }
+    free_pages = info.max_pages - info.tot_pages;
+    if (free_pages > QEMU_SPARE_PAGES) {
+        free_pages -= QEMU_SPARE_PAGES;
+    } else {
+        free_pages = 0;
+    }
+    if ((free_pages < nr_pfn) &&
+        (xc_domain_setmaxmem(xen_xc, xen_domid,
+                             ((info.max_pages + nr_pfn - free_pages)
+                              << (XC_PAGE_SHIFT - 10))) < 0)) {
+        hw_error("xc_domain_setmaxmem failed");
+    }
     if (xc_domain_populate_physmap_exact(xen_xc, xen_domid, nr_pfn, 0, 0, pfn_list)) {
         hw_error("xen: failed to populate ram at " RAM_ADDR_FMT, ram_addr);
     }
-- 
1.7.10.4
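The arithmetic in the hunk above can be checked in isolation. The following is a minimal userspace sketch of the same calculation, not QEMU code: the function name maxmem_kib_needed and the constants SPARE_PAGES and PAGE_SHIFT_MODEL are made up for illustration, a 4 KiB page size is assumed, and like the patch it assumes max_pages >= tot_pages. The shift by (page shift - 10) converts a page count to KiB, which is the unit the patch passes to xc_domain_setmaxmem.

```c
#include <stdint.h>

#define SPARE_PAGES      16  /* mirrors QEMU_SPARE_PAGES in the patch */
#define PAGE_SHIFT_MODEL 12  /* assumption: 4 KiB pages */

/*
 * Return the new maxmem value in KiB needed so that nr_pfn more pages
 * can be populated, or 0 if the current limit already leaves enough
 * room after setting aside the spare pages.
 * Assumes max_pages >= tot_pages, as the patch does.
 */
static uint64_t maxmem_kib_needed(uint64_t max_pages, uint64_t tot_pages,
                                  uint64_t nr_pfn)
{
    uint64_t free_pages = max_pages - tot_pages;

    /* Keep some slack that is never handed out to this allocation. */
    if (free_pages > SPARE_PAGES)
        free_pages -= SPARE_PAGES;
    else
        free_pages = 0;

    if (free_pages >= nr_pfn)
        return 0;  /* enough headroom, no need to raise maxmem */

    /* Pages to KiB: shift left by (page shift - 10). */
    return (max_pages + nr_pfn - free_pages) << (PAGE_SHIFT_MODEL - 10);
}
```

For example, with max_pages = 1000, tot_pages = 990 and nr_pfn = 50, the 10 free pages are all treated as slack, so the limit must grow to 1050 pages, i.e. 4200 KiB.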
[Xen-devel] [PULL 1/2] xen-pt: Fix PCI devices re-attach failed
From: Liang Li liang.z...@intel.com

Use the 'xl pci-attach $DomU $BDF' command to attach more than one PCI
device to the guest, then detach the devices with
'xl pci-detach $DomU $BDF'. After that, re-attaching these PCI devices
reports an error message like the following:

libxl: error: libxl_qmp.c:287:qmp_handle_error_response: receive an
error message from QMP server: Duplicate ID 'pci-pt-03_10.1' for device.

If 'address_space_memory' is used as the parameter of
'memory_listener_register', 'xen_pt_region_del' will not be called when
the device is detached if the memory region's name is not
'xen-pci-pt-*'. This causes the device's related QemuOpts object not to
be released properly.

Using the device's address space avoids this issue, because the number
of calls to 'xen_pt_region_add' when attaching and the number of calls
to 'xen_pt_region_del' when detaching are the same, so all the memory
regions ref'ed and unref'ed by 'xen_pt_region_add' and
'xen_pt_region_del' can be released properly.

Signed-off-by: Liang Li liang.z...@intel.com
Reviewed-by: Paolo Bonzini pbonz...@redhat.com
Reported-by: Longtao Pang longtaox.p...@intel.com
---
 hw/xen/xen_pt.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
index c1bf357..f2893b2 100644
--- a/hw/xen/xen_pt.c
+++ b/hw/xen/xen_pt.c
@@ -736,7 +736,7 @@ static int xen_pt_initfn(PCIDevice *d)
     }

 out:
-    memory_listener_register(&s->memory_listener, &address_space_memory);
+    memory_listener_register(&s->memory_listener, &s->dev.bus_master_as);
     memory_listener_register(&s->io_listener, &address_space_io);
     XEN_PT_LOG(d,
                "Real physical device %02x:%02x.%d registered successfully!\n",
-- 
1.7.10.4
[Xen-devel] [PATCHv1 0/3 net-next] xen-netfront: refactor making Tx requests
As netfront has evolved to handle different sorts of skbs, the code to
fill a Tx request has been copied and pasted several times. The series
refactors this and a few other areas.

The first patch is to a Xen header but this can be merged via net-next.

David
[Xen-devel] [PATCH 2/3] xen-netfront: refactor skb slot counting
A function to count the number of slots an skb needs is more useful
than one that counts the slots needed for only the frags.

Signed-off-by: David Vrabel david.vra...@citrix.com
---
 drivers/net/xen-netfront.c |   13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 22bcb4e..6b29b3a 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -521,13 +521,15 @@ static void xennet_make_frags(struct sk_buff *skb, struct netfront_queue *queue,
 }

 /*
- * Count how many ring slots are required to send the frags of this
- * skb. Each frag might be a compound page.
+ * Count how many ring slots are required to send this skb. Each frag
+ * might be a compound page.
  */
-static int xennet_count_skb_frag_slots(struct sk_buff *skb)
+static int xennet_count_skb_slots(struct sk_buff *skb)
 {
 	int i, frags = skb_shinfo(skb)->nr_frags;
-	int pages = 0;
+	int pages;
+
+	pages = PFN_UP(offset_in_page(skb->data) + skb_headlen(skb));

 	for (i = 0; i < frags; i++) {
 		skb_frag_t *frag = skb_shinfo(skb)->frags + i;
@@ -597,8 +599,7 @@ static int xennet_start_xmit(struct sk_buff *skb, struct net_device *dev)
 		goto drop;
 	}

-	slots = DIV_ROUND_UP(offset + len, PAGE_SIZE) +
-		xennet_count_skb_frag_slots(skb);
+	slots = xennet_count_skb_slots(skb);
 	if (unlikely(slots > MAX_SKB_FRAGS + 1)) {
 		net_dbg_ratelimited("xennet: skb rides the rocket: %d slots, %d bytes\n",
 				    slots, skb->len);
-- 
1.7.10.4
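The slot count in this patch is page arithmetic: the linear header needs as many slots as the pages it touches, and each frag adds the pages it touches after discarding whole pages skipped by its offset. Below is a userspace sketch of that counting, not the driver code; struct frag_model, count_slots and the 4 KiB page size are assumptions for illustration, and the per-frag loop follows what the thread describes (each frag may be a compound page).

```c
#include <stddef.h>

#define PAGE_SIZE_MODEL 4096UL  /* assumption: 4 KiB pages */

/* Round a byte count up to a whole number of pages (like PFN_UP). */
#define PFN_UP_MODEL(x) (((x) + PAGE_SIZE_MODEL - 1) / PAGE_SIZE_MODEL)

/* Hypothetical stand-in for the offset and size of one skb frag. */
struct frag_model {
    unsigned long page_offset;  /* may exceed a page for compound pages */
    unsigned long size;
};

/*
 * Count ring slots for a packet whose linear header starts at address
 * head_addr and is head_len bytes long, plus nr_frags fragments.
 */
static unsigned long count_slots(unsigned long head_addr,
                                 unsigned long head_len,
                                 const struct frag_model *frags,
                                 size_t nr_frags)
{
    /* Pages touched by the header: offset within its first page + length. */
    unsigned long slots = PFN_UP_MODEL(head_addr % PAGE_SIZE_MODEL + head_len);
    size_t i;

    for (i = 0; i < nr_frags; i++) {
        /* Skip unused whole pages at the start of a compound page. */
        unsigned long offset = frags[i].page_offset % PAGE_SIZE_MODEL;

        slots += PFN_UP_MODEL(offset + frags[i].size);
    }
    return slots;
}
```

A 512-byte header starting 3840 bytes into a page crosses a page boundary and therefore needs two slots even though it is much smaller than a page, which is why the header term cannot simply be assumed to be one.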
Re: [Xen-devel] [PATCH v3 00/24] xen/arm: Add support for non-pci passthrough
On 13/01/15 14:25, Julien Grall wrote:
> This series has been tested on Midway by assigning the secondary
> network card to a guest (see instructions below). I plan to do further
> testing on other boards.

I forgot to mention that the changes have only been build-tested on x86.

Regards,

-- 
Julien Grall
[Xen-devel] [PATCH 0/2] xen/arm: Misc for grant-table
Hi all,

This series contains a couple of changes for the grant-table header.
The first one only removes an unused/misplaced define. The second one
increases the number of grant frames initialized when the domain is
created.

Regards,

Julien Grall (2):
  xen/arm: Remove the define INVALID_GFN from arch-arm/grant_table.h
  xen/arm: grant-table: Increased the initial number of grant frame to 4

 xen/include/asm-arm/grant_table.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

-- 
2.1.4
Re: [Xen-devel] [PATCH RFC] make error codes a formal part of the ABI
On 13.01.15 at 17:57, ian.jack...@eu.citrix.com wrote:
> Ian Campbell writes ("Re: [PATCH RFC] make error codes a formal part of the ABI"):
>> On Tue, 2015-01-13 at 16:21 +0000, Jan Beulich wrote:
>>> There's one small block commented with TBD left in the public
>>> header. This is the main reason for the submission being RFC. While
>>> we don't currently use these error codes, I'm not sure if we should
>>> leave all or some of them out for the time being.
>>
>> I say let's omit any we don't use for now.
>
> Is it possible that anyone is using the existing header file where
> these values were defined? If so their code might say
>     case ELOOP:
> which would not compile when they switched to the new header.

The existing header is a hypervisor private one. Any code outside
the hypervisor using it imo deserves to get broken.

Jan
Re: [Xen-devel] [PATCH v3 07/19] libxl: x86: factor out e820_host_sanitize
Wei Liu writes ("[PATCH v3 07/19] libxl: x86: factor out e820_host_sanitize"):
> This function gets the machine E820 map and sanitizes it according to
> the PV guest configuration.
>
> This will be used in a later patch. No functional change introduced in
> this patch.

Thanks. It is easy to see that this is correct.

The way that `rc' is used to contain a libxc (syscall) return value is
contrary to the coding style but it is better not to fix this in the
same patch as the code motion.

Acked-by: Ian Jackson ian.jack...@eu.citrix.com

Ian.
Re: [Xen-devel] [PATCH RFC] make error codes a formal part of the ABI
On Tue, 2015-01-13 at 16:57 +0000, Ian Jackson wrote:
> Ian Campbell writes ("Re: [PATCH RFC] make error codes a formal part of the ABI"):
>> On Tue, 2015-01-13 at 16:21 +0000, Jan Beulich wrote:
>>> There's one small block commented with TBD left in the public
>>> header. This is the main reason for the submission being RFC. While
>>> we don't currently use these error codes, I'm not sure if we should
>>> leave all or some of them out for the time being.
>>
>> I say let's omit any we don't use for now.
>
> Is it possible that anyone is using the existing header file where
> these values were defined?

It's not installed or in the regular header paths, so it seems
unlikely, or at least they would have had to jump through some hoops
and no doubt have a big comment about their fragile hack...

> If so their code might say
>     case ELOOP:
> which would not compile when they switched to the new header.
>
> I don't know whether this is likely, or a problem.

Ian.
Re: [Xen-devel] [PATCH for-4.6 2/4] xen/arm: vgic: Keep track of vIRQ used by a domain
On Tue, 2015-01-13 at 16:57 +0000, Julien Grall wrote:
> (CC Jan)

I think you forgot, I added him.

>>>>> @@ -49,6 +49,21 @@ int domain_vtimer_init(struct domain *d)
>>>>>  {
>>>>>      d->arch.phys_timer_base.offset = NOW();
>>>>>      d->arch.virt_timer_base.offset = READ_SYSREG64(CNTPCT_EL0);
>>>>> +
>>>>> +    /* At this stage vgic_reserve_virq can't fail */
>>>>> +    if ( is_hardware_domain(d) )
>>>>> +    {
>>>>> +        BUG_ON(!vgic_reserve_virq(d, timer_get_irq(TIMER_PHYS_SECURE_PPI)));
>>>>> +        BUG_ON(!vgic_reserve_virq(d, timer_get_irq(TIMER_PHYS_NONSECURE_PPI)));
>>>>> +        BUG_ON(!vgic_reserve_virq(d, timer_get_irq(TIMER_VIRT_PPI)));
>>>>> +    }
>>>>> +    else
>>>>> +    {
>>>>> +        BUG_ON(!vgic_reserve_virq(d, GUEST_TIMER_PHYS_S_PPI));
>>>>> +        BUG_ON(!vgic_reserve_virq(d, GUEST_TIMER_PHYS_NS_PPI));
>>>>> +        BUG_ON(!vgic_reserve_virq(d, GUEST_TIMER_VIRT_PPI));
>>>>
>>>> Although BUG_ON is not conditional on $debug I think we still should
>>>> avoid side effects in the condition.
>>>
>>> I know, but this should never fail as it is called during domain
>>> construction. If so we may have some other issue later if we decide
>>> to assign PPI to a guest.
>>>
>>> I would prefer to keep the BUG_ON here
>>
>> I'm not objecting to the BUG_ON itself but to the fact that the
>> condition has a side effect.
>>
>> Please use:
>>     if (!do_something())
>>         BUG()
>> instead to avoid this.
>
> We have other places in the code where BUG_ON has a side-effect.

If we do then it is a tiny minority of places, and they are IMHO
wrong. I spotted one in the 600+ results of grepping for BUG_ON.

> IMHO, if (!do_something()) BUG() = BUG_ON.

No, BUG_ON() is a variant of ASSERT(), with the distinction being that
the former is not only included when debug=y. It is as wrong to have a
side-effect in the BUG_ON as it is to have one in an ASSERT.

> On the latter you know directly why it's failing, on the former you
> have to look at the code.

If it's important/possible to fail then a log message would be
appropriate.

Ian.
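The reason for treating BUG_ON like ASSERT can be shown with a small model. This is a sketch, not Xen code: ASSERT_DISABLED, reserve_virq and the bitmask are made up for illustration, and ASSERT_DISABLED models a build in which an assertion macro is compiled out (per the thread, Xen's BUG_ON is always compiled in, so the point is about keeping one habit for both macros). When the macro expands to nothing, a call placed inside its argument never runs, so the reservation is silently lost.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Models an assertion macro in a non-debug build: argument not evaluated. */
#define ASSERT_DISABLED(cond) ((void)0)

/* Hypothetical bitmap of reserved virtual IRQs (0..31). */
static unsigned int reserved_mask;

/* Reserve a vIRQ; returns false if out of range or already reserved. */
static bool reserve_virq(unsigned int virq)
{
    if (virq >= 32 || (reserved_mask & (1u << virq)))
        return false;
    reserved_mask |= 1u << virq;
    return true;
}

/* Side effect inside the assertion: vanishes when the macro is disabled. */
static void init_with_side_effect_in_assert(void)
{
    ASSERT_DISABLED(reserve_virq(27));
}

/* Recommended shape: do the work unconditionally, then check the result. */
static void init_explicit(void)
{
    if (!reserve_virq(27))
        abort();  /* stands in for BUG() */
}
```

After init_with_side_effect_in_assert() the mask is still empty, whereas init_explicit() reserves the IRQ regardless of how the checking macro is configured.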
[Xen-devel] [PATCH 3/3] xen-netfront: refactor making Tx requests
Eliminate all the duplicate code for making Tx requests by consolidating them into a single xennet_make_one_txreq() function. xennet_make_one_txreq() and xennet_make_txreqs() work with pages and offsets so it will be easier to make netfront handle highmem frags in the future. Signed-off-by: David Vrabel david.vra...@citrix.com --- drivers/net/xen-netfront.c | 181 1 file changed, 67 insertions(+), 114 deletions(-) diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c index 6b29b3a..68e0e8f 100644 --- a/drivers/net/xen-netfront.c +++ b/drivers/net/xen-netfront.c @@ -425,99 +425,56 @@ static void xennet_tx_buf_gc(struct netfront_queue *queue) xennet_maybe_wake_tx(queue); } -static void xennet_make_frags(struct sk_buff *skb, struct netfront_queue *queue, - struct xen_netif_tx_request *tx) -{ - char *data = skb-data; - unsigned long mfn; - RING_IDX prod = queue-tx.req_prod_pvt; - int frags = skb_shinfo(skb)-nr_frags; - unsigned int offset = offset_in_page(data); - unsigned int len = skb_headlen(skb); +static struct xen_netif_tx_request *xennet_make_one_txreq( + struct netfront_queue *queue, struct sk_buff *skb, + struct page *page, unsigned int offset, unsigned int len) +{ unsigned int id; + struct xen_netif_tx_request *tx; grant_ref_t ref; - int i; - /* While the header overlaps a page boundary (including being - larger than a page), split it it into page-sized chunks. 
*/ - while (len PAGE_SIZE - offset) { - tx-size = PAGE_SIZE - offset; - tx-flags |= XEN_NETTXF_more_data; - len -= tx-size; - data += tx-size; - offset = 0; + len = min_t(unsigned int, PAGE_SIZE - offset, len); - id = get_id_from_freelist(queue-tx_skb_freelist, queue-tx_skbs); - queue-tx_skbs[id].skb = skb_get(skb); - tx = RING_GET_REQUEST(queue-tx, prod++); - tx-id = id; - ref = gnttab_claim_grant_reference(queue-gref_tx_head); - BUG_ON((signed short)ref 0); + id = get_id_from_freelist(queue-tx_skb_freelist, queue-tx_skbs); + tx = RING_GET_REQUEST(queue-tx, queue-tx.req_prod_pvt++); + ref = gnttab_claim_grant_reference(queue-gref_tx_head); + BUG_ON((signed short)ref 0); - mfn = virt_to_mfn(data); - gnttab_grant_foreign_access_ref(ref, queue-info-xbdev-otherend_id, - mfn, GNTMAP_readonly); + gnttab_grant_foreign_access_ref(ref, queue-info-xbdev-otherend_id, + page_to_mfn(page), GNTMAP_readonly); - queue-grant_tx_page[id] = virt_to_page(data); - tx-gref = queue-grant_tx_ref[id] = ref; - tx-offset = offset; - tx-size = len; - tx-flags = 0; - } + queue-tx_skbs[id].skb = skb; + queue-grant_tx_page[id] = page; + queue-grant_tx_ref[id] = ref; - /* Grant backend access to each skb fragment page. 
*/ - for (i = 0; i frags; i++) { - skb_frag_t *frag = skb_shinfo(skb)-frags + i; - struct page *page = skb_frag_page(frag); + tx-id = id; + tx-gref = ref; + tx-offset = offset; + tx-size = len; + tx-flags = 0; - len = skb_frag_size(frag); - offset = frag-page_offset; + return tx; +} - /* Skip unused frames from start of page */ - page += offset PAGE_SHIFT; - offset = ~PAGE_MASK; +static struct xen_netif_tx_request *xennet_make_txreqs( + struct netfront_queue *queue, struct xen_netif_tx_request *tx, + struct sk_buff *skb, struct page *page, + unsigned int offset, unsigned int len) +{ + /* Skip unused frames from start of page */ + page += offset PAGE_SHIFT; + offset = ~PAGE_MASK; - while (len 0) { - unsigned long bytes; - - bytes = PAGE_SIZE - offset; - if (bytes len) - bytes = len; - - tx-flags |= XEN_NETTXF_more_data; - - id = get_id_from_freelist(queue-tx_skb_freelist, - queue-tx_skbs); - queue-tx_skbs[id].skb = skb_get(skb); - tx = RING_GET_REQUEST(queue-tx, prod++); - tx-id = id; - ref = gnttab_claim_grant_reference(queue-gref_tx_head); - BUG_ON((signed short)ref 0); - - mfn = pfn_to_mfn(page_to_pfn(page)); - gnttab_grant_foreign_access_ref(ref, -
[Xen-devel] [PATCH 1/3] xen: add page_to_mfn()
pfn_to_mfn(page_to_pfn(p)) is a common use case so add a generic
helper for it.

Signed-off-by: David Vrabel david.vra...@citrix.com
---
 include/xen/page.h |    5 +++++
 1 file changed, 5 insertions(+)

diff --git a/include/xen/page.h b/include/xen/page.h
index 12765b6..c5ed20b 100644
--- a/include/xen/page.h
+++ b/include/xen/page.h
@@ -3,6 +3,11 @@

 #include <asm/xen/page.h>

+static inline unsigned long page_to_mfn(struct page *page)
+{
+	return pfn_to_mfn(page_to_pfn(page));
+}
+
 struct xen_memory_region {
 	phys_addr_t start;
 	phys_addr_t size;
-- 
1.7.10.4
Re: [Xen-devel] [PATCH v3 03/19] libxc: allocate memory with vNUMA information for PV guest
Wei Liu writes ([PATCH v3 03/19] libxc: allocate memory with vNUMA information for PV guest): ... diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h index 07d7224..c459e77 100644 --- a/tools/libxc/include/xc_dom.h +++ b/tools/libxc/include/xc_dom.h @@ -167,6 +167,11 @@ struct xc_dom_image { ... +/* vNUMA information */ +unsigned int *vnode_to_pnode; /* vnode to pnode mapping array */ +uint64_t *vnode_size; /* vnode size array */ You don't specify the units. You should probably name the variable _bytes or _pages or something. Looking at the algorithm below it seems to be in _mby. But the domain size is specified in pages. So AFAICT if you try to create a domain which is not a whole number of pages, it is bound to fail ! Perhaps the vnode memory size should be in pages too. +unsigned int nr_vnodes; /* number of elements of above arrays */ Is there some reason to prefer this arrangement with multiple parallel arrays, to one with a single array of structs ? +/* Setup dummy vNUMA information if it's not provided. Not + * that this is a valid state if libxl doesn't provide any + * vNUMA information. + * + * In this case we setup some dummy value for the convenience + * of the allocation code. Note that from the user's PoV the + * guest still has no vNUMA configuration. + */ This arrangement for defaulting makes it difficult to supply only partial information - for example, to supply the number of vnodes but allow the system to make up the details. I have a similar complaint about the corresponding libxl code. I think you should decide where you want the defaulting to be, and do it in a more flexible way in that one place. Probably, libxl. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
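The unit mismatch raised in this review can be made concrete with a small sketch. It assumes 4 KiB pages and vnode sizes given in whole MiB (which is what the shifts by 20 suggest); `mib_to_pages()` and `vnodes_cover_domain()` are hypothetical helpers, not libxc functions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12

/* Convert a size in MiB to a number of 4 KiB pages. */
static uint64_t mib_to_pages(uint64_t mib)
{
    return (mib << 20) >> PAGE_SHIFT;
}

/* Return 1 if the vnode sizes (in MiB) add up to exactly total_pages. */
static int vnodes_cover_domain(const uint64_t *vnode_size_mib,
                               size_t nr_vnodes, uint64_t total_pages)
{
    uint64_t total = 0;
    size_t i;

    assert(vnode_size_mib != NULL || nr_vnodes == 0);

    for (i = 0; i < nr_vnodes; i++)
        total += mib_to_pages(vnode_size_mib[i]);

    return total == total_pages;
}
```

A 512 MiB domain (131072 pages) split into two 256 MiB vnodes passes the check. A domain of 131073 pages cannot be described by any set of MiB-sized vnodes, because each MiB contributes exactly 256 pages, so the equality test is bound to fail. Expressing vnode sizes in pages removes that failure mode.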
Re: [Xen-devel] [PATCH for-4.6 2/4] xen/arm: vgic: Keep track of vIRQ used by a domain
On 13/01/15 16:57, Julien Grall wrote: (CC Jan) Forgot to really CC Jan for the bool stuff. Hi Ian, On 13/01/15 16:46, Ian Campbell wrote: vgic_reserve_irq returns a boolean: Please use true/false then. In Xen we have xen/stdbool.h which differs from normal stdboot.h. I'm not sure what the rules are for use. Jan please correct me if I'm wrong, xen/stdbool.h has been introduced for the ELF code and should not be used anywhere else. true/false is defined in xen/stdbool.h together with Bool not bool_t. 0 = not reserved 1 = reserved I don't see why we should return an int in this case, as the caller should know how to use it. It's slightly more conventional to return error codes, but I guess I don't mind much. Agree, but in this particular case we don't have to know the error code. So it's pointless to return it. @@ -49,6 +49,21 @@ int domain_vtimer_init(struct domain *d) { d-arch.phys_timer_base.offset = NOW(); d-arch.virt_timer_base.offset = READ_SYSREG64(CNTPCT_EL0); + +/* At this stage vgic_reserve_virq can't fail */ +if ( is_hardware_domain(d) ) +{ +BUG_ON(!vgic_reserve_virq(d, timer_get_irq(TIMER_PHYS_SECURE_PPI))); +BUG_ON(!vgic_reserve_virq(d, timer_get_irq(TIMER_PHYS_NONSECURE_PPI))); +BUG_ON(!vgic_reserve_virq(d, timer_get_irq(TIMER_VIRT_PPI))); +} +else +{ +BUG_ON(!vgic_reserve_virq(d, GUEST_TIMER_PHYS_S_PPI)); +BUG_ON(!vgic_reserve_virq(d, GUEST_TIMER_PHYS_NS_PPI)); +BUG_ON(!vgic_reserve_virq(d, GUEST_TIMER_VIRT_PPI)); Although BUG_ON is not conditional on $debug I think we still should avoid side effects in the condition. I know, but this should never fail as it called during on domain construction. If so we may have some other issue later if we decide to assign PPI to a guest. I would prefer to keep the BUG_ON here I'm not objecting the the BUG_ON itself but to the fact that the condition has a side effect. Please use: if (!do_something()) BUG() instead to avoid this. We have other place in the code where BUG_ON as a side-effect. 
IMHO, if (!do_something()) BUG() = BUG_ON. On the latter you know directly why it's failing, on the former you have to look at the code. Regards, -- Julien Grall
Re: [Xen-devel] [PATCH v3 12/19] hvmloader: retrieve vNUMA information from hypervisor
On Tue, Jan 13, 2015 at 04:50:11PM +, Jan Beulich wrote: On 13.01.15 at 13:11, wei.l...@citrix.com wrote: +void init_vnuma_info(void) +{ +int rc, retry = 0; +struct xen_vnuma_topology_info vnuma_topo; + +vcpu_to_vnode = scratch_alloc(sizeof(uint32_t) * hvm_info-nr_vcpus, 0); sizeof(*vcpu_to_vnode) please. Done. +rc = -EAGAIN; +while ( rc == -EAGAIN retry 10 ) What's the justification for 10 here? A sane tool stack shouldn't alter the values while starting the domain. I wasn't sure if a toolstack will change the values whilst domain is running. But you now confirm that a sane toolstack shouldn't do that I can just remove this loop. +{ +vnuma_topo.domid = DOMID_SELF; +vnuma_topo.pad = 0; +vnuma_topo.nr_vcpus = 0; +vnuma_topo.nr_vnodes = 0; +vnuma_topo.nr_vmemranges = 0; + +set_xen_guest_handle(vnuma_topo.vdistance.h, NULL); +set_xen_guest_handle(vnuma_topo.vcpu_to_vnode.h, NULL); +set_xen_guest_handle(vnuma_topo.vmemrange.h, NULL); + +rc = hypercall_memory_op(XENMEM_get_vnumainfo, vnuma_topo); + +if ( rc == -EOPNOTSUPP ) +return; + +if ( rc != -ENOBUFS ) +break; + +ASSERT(vnuma_topo.nr_vcpus == hvm_info-nr_vcpus); I also wonder whether we shouldn't make the hypervisor return back the (modified) values in the -EAGAIN error case, so that you could move above first half of the loop body out of the loop. I don't think hypercall modifies values in -EAGAIN case. The first half of that loop is to prepare hypercall structure so that we can retrieve the new size. But since you say no sane toolstack should do that this issue becomes moot. Wei. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
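With the retry loop removed, what remains is a probe-then-fetch pattern: call once with empty buffers to learn the required sizes, allocate, then call again. A minimal sketch of that pattern follows. It does not use the real hypercall interface: `struct topo_query` and `mock_get_vnumainfo()` are hypothetical stand-ins, and the assumption that the first call reports the needed count together with -ENOBUFS is inferred from the discussion above.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the vNUMA topology structure. */
struct topo_query {
    unsigned int nr_vcpus;     /* in: buffer capacity, out: required count */
    uint32_t *vcpu_to_vnode;   /* caller-provided buffer, may be NULL */
};

/* Mock "hypervisor" with a fixed 4-vCPU layout; not a real hypercall. */
static int mock_get_vnumainfo(struct topo_query *q)
{
    static const uint32_t layout[4] = { 0, 0, 1, 1 };
    unsigned int i;

    if (q->nr_vcpus < 4 || q->vcpu_to_vnode == NULL) {
        q->nr_vcpus = 4;       /* report the size the caller needs */
        return -ENOBUFS;
    }
    for (i = 0; i < 4; i++)
        q->vcpu_to_vnode[i] = layout[i];
    q->nr_vcpus = 4;
    return 0;
}

/* Probe for the size, allocate, then fetch. Returns NULL on failure. */
static uint32_t *fetch_vcpu_to_vnode(unsigned int *nr_vcpus)
{
    struct topo_query q = { 0, NULL };
    int rc;

    assert(nr_vcpus != NULL);

    rc = mock_get_vnumainfo(&q);
    if (rc != -ENOBUFS)
        return NULL;

    /* sizeof(*ptr) keeps the element size tied to the pointer's type. */
    q.vcpu_to_vnode = malloc(sizeof(*q.vcpu_to_vnode) * q.nr_vcpus);
    if (q.vcpu_to_vnode == NULL)
        return NULL;

    rc = mock_get_vnumainfo(&q);
    if (rc != 0) {
        free(q.vcpu_to_vnode);
        return NULL;
    }
    *nr_vcpus = q.nr_vcpus;
    return q.vcpu_to_vnode;
}
```

The allocation uses `sizeof(*q.vcpu_to_vnode)` rather than `sizeof(uint32_t)`, which is the form requested in the review: if the element type ever changes, the allocation size follows automatically.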
Re: [Xen-devel] [PATCH v3 05/19] libxl: add vmemrange to libxl__domain_build_state
On Tue, Jan 13, 2015 at 05:02:10PM +, Ian Jackson wrote: Wei Liu writes ([PATCH v3 05/19] libxl: add vmemrange to libxl__domain_build_state): A vnode consists of one or more vmemranges (virtual memory range). One example of multiple vmemranges is that there is a hole in one vnode. I'm finding this series a bit oddly structured. This patch, for example, just introduces some new fields to an internal state struct - but these fields are not initialised, set, or read. The new fields (and other existing fields) are initialised to zero in initiate_domain_create, that's why it doesn't need to be explicitly initialised. These new fields are accessed in the next patch. I can either explicitly say so in commit log or squash this patch with the next one. Which way do you prefer? TBH I don't think this patch and next one should be squashed into one patch. Wei. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH for-4.6 2/4] xen/arm: vgic: Keep track of vIRQ used by a domain
On 13/01/15 17:18, Ian Campbell wrote:
On Tue, 2015-01-13 at 16:57 +, Julien Grall wrote:

(CC Jan)

I think you forgot, I added him.

@@ -49,6 +49,21 @@ int domain_vtimer_init(struct domain *d)
 {
     d->arch.phys_timer_base.offset = NOW();
     d->arch.virt_timer_base.offset = READ_SYSREG64(CNTPCT_EL0);
+
+    /* At this stage vgic_reserve_virq can't fail */
+    if ( is_hardware_domain(d) )
+    {
+        BUG_ON(!vgic_reserve_virq(d, timer_get_irq(TIMER_PHYS_SECURE_PPI)));
+        BUG_ON(!vgic_reserve_virq(d, timer_get_irq(TIMER_PHYS_NONSECURE_PPI)));
+        BUG_ON(!vgic_reserve_virq(d, timer_get_irq(TIMER_VIRT_PPI)));
+    }
+    else
+    {
+        BUG_ON(!vgic_reserve_virq(d, GUEST_TIMER_PHYS_S_PPI));
+        BUG_ON(!vgic_reserve_virq(d, GUEST_TIMER_PHYS_NS_PPI));
+        BUG_ON(!vgic_reserve_virq(d, GUEST_TIMER_VIRT_PPI));

Although BUG_ON is not conditional on $debug I think we still should avoid side effects in the condition.

I know, but this should never fail as it is called during domain construction. If so we may have some other issue later if we decide to assign PPI to a guest. I would prefer to keep the BUG_ON here.

I'm not objecting to the BUG_ON itself but to the fact that the condition has a side effect. Please use: if (!do_something()) BUG() instead to avoid this.

We have other places in the code where BUG_ON has a side-effect.

If we do then it is a tiny minority of places, and they are IMHO wrong. I spotted one in the 600+ results of grepping for BUG_ON.

I spotted more. Anyway, I will move to an if (!do_smth()) BUG() form.

Regards,
-- Julien Grall
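The objection to side effects inside an assertion-style condition is easiest to see with a check macro that a build option can compile away. As noted in the thread, Xen's BUG_ON is not conditional on a debug setting, so the sketch below illustrates the general habit being defended rather than an actual Xen bug. `CHECK_ON`, `CHECKS_ENABLED`, `reserve_irq()` and the IRQ number 27 are all hypothetical:

```c
#include <stdbool.h>
#include <stdlib.h>

static int reserved_count;

/* Hypothetical reservation routine with a side effect: it records a claim. */
static bool reserve_irq(unsigned int irq)
{
    (void)irq;
    reserved_count++;
    return true;
}

/* An assertion-style macro that disappears unless CHECKS_ENABLED is set. */
#ifdef CHECKS_ENABLED
#define CHECK_ON(cond) do { if (cond) abort(); } while (0)
#else
#define CHECK_ON(cond) do { } while (0)
#endif

/* Fragile: the reservation vanishes together with the check. */
static void init_fragile(void)
{
    CHECK_ON(!reserve_irq(27));
}

/* Robust: the call always happens; only the reaction to failure is a check. */
static void init_robust(void)
{
    if (!reserve_irq(27))
        abort();
}
```

Built without `CHECKS_ENABLED`, `init_fragile()` never reserves anything, while `init_robust()` behaves the same in every configuration. Keeping the call outside the macro means nobody has to remember which check macros are unconditional.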
Re: [Xen-devel] [PATCH v3 03/19] libxc: allocate memory with vNUMA information for PV guest
On Tue, Jan 13, 2015 at 05:05:26PM +, Ian Jackson wrote: Wei Liu writes ([PATCH v3 03/19] libxc: allocate memory with vNUMA information for PV guest): ... diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h index 07d7224..c459e77 100644 --- a/tools/libxc/include/xc_dom.h +++ b/tools/libxc/include/xc_dom.h @@ -167,6 +167,11 @@ struct xc_dom_image { ... +/* vNUMA information */ +unsigned int *vnode_to_pnode; /* vnode to pnode mapping array */ +uint64_t *vnode_size; /* vnode size array */ You don't specify the units. You should probably name the variable _bytes or _pages or something. Looking at the algorithm below it seems to be in _mby. But the domain size is specified in pages. So AFAICT if you try to create a domain which is not a whole number of pages, it is bound to fail ! Perhaps the vnode memory size should be in pages too. Let's use page as unit. +unsigned int nr_vnodes; /* number of elements of above arrays */ Is there some reason to prefer this arrangement with multiple parallel arrays, to one with a single array of structs ? No, I don't have preference. I can pack vnode_to_pnode and vnode_size(_pages) into a struct. +/* Setup dummy vNUMA information if it's not provided. Not + * that this is a valid state if libxl doesn't provide any + * vNUMA information. + * + * In this case we setup some dummy value for the convenience + * of the allocation code. Note that from the user's PoV the + * guest still has no vNUMA configuration. + */ This arrangement for defaulting makes it difficult to supply only partial information - for example, to supply the number of vnodes but allow the system to make up the details. I have a similar complaint about the corresponding libxl code. I think you should decide where you want the defaulting to be, and do it in a more flexible way in that one place. Probably, libxl. The defaulting will be in libxl. That's what Dario is working on. If libxl provides information, these dummy values will have no effect. 
Maybe the comment is confusing. I wasn't saying there is defaulting happening inside libxc. It's only for the convenience of the allocation code, because it needs to operate on one mapping. Wei. Ian.
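Two points agreed in this reply, sizes in pages and a single array of structs instead of parallel arrays, could look roughly like the sketch below. It also shows the dummy single-vnode layout used purely for the allocator's convenience. All names here (`struct vnode_info`, `struct dom_layout`, `ensure_vnode_layout()`, `NO_NODE`) are hypothetical and do not come from libxc:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define NO_NODE (~0u)  /* hypothetical marker: no physical node preference */

/* One entry per virtual node, replacing two parallel arrays. */
struct vnode_info {
    unsigned int pnode;   /* physical node backing this vnode */
    uint64_t pages;       /* size in pages; the unit is in the name */
};

struct dom_layout {
    struct vnode_info *vnodes;
    unsigned int nr_vnodes;
    uint64_t total_pages;
};

/*
 * If the caller supplied no vNUMA information, install a single dummy
 * vnode covering the whole domain so the allocator has one code path.
 * Returns 0 on success, -1 on allocation failure.
 */
static int ensure_vnode_layout(struct dom_layout *dom)
{
    assert(dom != NULL);

    if (dom->nr_vnodes != 0)
        return 0;

    dom->vnodes = malloc(sizeof(*dom->vnodes));
    if (dom->vnodes == NULL)
        return -1;

    dom->vnodes[0].pnode = NO_NODE;
    dom->vnodes[0].pages = dom->total_pages;
    dom->nr_vnodes = 1;
    return 0;
}
```

Because the dummy vnode is sized directly from the page count, it also covers domains whose size is not a whole number of MiB.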
Re: [Xen-devel] [PATCH for 2.3 v2 1/1] xen-hvm: increase maxmem before calling xc_domain_populate_physmap
On Mon, 12 Jan 2015, Stefano Stabellini wrote: On Wed, 3 Dec 2014, Don Slutz wrote: From: Stefano Stabellini stefano.stabell...@eu.citrix.com Increase maxmem before calling xc_domain_populate_physmap_exact to avoid the risk of running out of guest memory. This way we can also avoid complex memory calculations in libxl at domain construction time. This patch fixes an abort() when assigning more than 4 NICs to a VM. Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com Signed-off-by: Don Slutz dsl...@verizon.com --- v2: Changes by Don Slutz Switch from xc_domain_getinfo to xc_domain_getinfolist Fix error check for xc_domain_getinfolist Limit increase of maxmem to only do when needed: Add QEMU_SPARE_PAGES (How many pages to leave free) Add free_pages calculation xen-hvm.c | 19 +++ 1 file changed, 19 insertions(+) diff --git a/xen-hvm.c b/xen-hvm.c index 7548794..d30e77e 100644 --- a/xen-hvm.c +++ b/xen-hvm.c @@ -90,6 +90,7 @@ static inline ioreq_t *xen_vcpu_ioreq(shared_iopage_t *shared_page, int vcpu) #endif #define BUFFER_IO_MAX_DELAY 100 +#define QEMU_SPARE_PAGES 16 We need a big comment here to explain why we have this parameter and when we'll be able to get rid of it. Other than that the patch is fine. Thanks! Actually I'll just go ahead and add the comment and commit, if for you is OK. Cheers, Stefano ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
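The body of the patch is not quoted in this message, so the following is only an assumed reading of the v2 changelog ("limit increase of maxmem to only do when needed", a free_pages calculation, and QEMU_SPARE_PAGES pages left free). `required_max_pages()` is a hypothetical helper that models the arithmetic and makes no libxc calls:

```c
#include <stdint.h>

#define QEMU_SPARE_PAGES 16

/*
 * Given the domain's current limit and usage (both in pages), return the
 * limit needed so that nr_pages more can be populated while
 * QEMU_SPARE_PAGES stay free. The limit is only ever raised, never lowered.
 */
static uint64_t required_max_pages(uint64_t max_pages, uint64_t tot_pages,
                                   uint64_t nr_pages)
{
    uint64_t free_pages = max_pages > tot_pages ? max_pages - tot_pages : 0;

    /* Keep a small reserve out of the pool we are allowed to use. */
    if (free_pages > QEMU_SPARE_PAGES)
        free_pages -= QEMU_SPARE_PAGES;
    else
        free_pages = 0;

    if (free_pages >= nr_pages)
        return max_pages;          /* enough headroom already */

    return max_pages + (nr_pages - free_pages);
}
```

With a limit of 1000 pages and 960 in use, a request for 50 pages sees 40 free pages, of which 24 are usable after the reserve, so the limit would be raised by the missing 26 pages.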
Re: [Xen-devel] [PATCH v3 15/19] libxc: allocate memory with vNUMA information for HVM guest
On Tue, Jan 13, 2015 at 12:11:43PM +, Wei Liu wrote: The algorithm is more or less the same as the one used for PV guest. Libxc gets hold of the mapping of vnode to pnode and size of each vnode then allocate memory accordingly. Could you split this patch in two? One part for the adding of the code and the other for moving the existing code around? And then the function returns low memory end, high memory end and mmio start to caller. Libxl needs those values to construct vmemranges for that guest. Signed-off-by: Wei Liu wei.l...@citrix.com Cc: Ian Campbell ian.campb...@citrix.com Cc: Ian Jackson ian.jack...@eu.citrix.com Cc: Dario Faggioli dario.faggi...@citrix.com Cc: Elena Ufimtseva ufimts...@gmail.com --- Changes in v3: 1. Rewrite commit log. 2. Add a few code comments. --- tools/libxc/include/xenguest.h |7 ++ tools/libxc/xc_hvm_build_x86.c | 224 ++-- 2 files changed, 151 insertions(+), 80 deletions(-) diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h index 40bbac8..d1cbb4e 100644 --- a/tools/libxc/include/xenguest.h +++ b/tools/libxc/include/xenguest.h @@ -230,6 +230,13 @@ struct xc_hvm_build_args { struct xc_hvm_firmware_module smbios_module; /* Whether to use claim hypercall (1 - enable, 0 - disable). 
*/ int claim_enabled; +unsigned int nr_vnodes;/* Number of vnodes */ +unsigned int *vnode_to_pnode; /* Vnode to pnode mapping */ +uint64_t *vnode_size; /* Size of vnodes */ +/* Out parameters */ +uint64_t lowmem_end; +uint64_t highmem_end; +uint64_t mmio_start; }; /** diff --git a/tools/libxc/xc_hvm_build_x86.c b/tools/libxc/xc_hvm_build_x86.c index c81a25b..54d3dc8 100644 --- a/tools/libxc/xc_hvm_build_x86.c +++ b/tools/libxc/xc_hvm_build_x86.c @@ -89,7 +89,8 @@ static int modules_init(struct xc_hvm_build_args *args, } static void build_hvm_info(void *hvm_info_page, uint64_t mem_size, - uint64_t mmio_start, uint64_t mmio_size) + uint64_t mmio_start, uint64_t mmio_size, + struct xc_hvm_build_args *args) { struct hvm_info_table *hvm_info = (struct hvm_info_table *) (((unsigned char *)hvm_info_page) + HVM_INFO_OFFSET); @@ -119,6 +120,10 @@ static void build_hvm_info(void *hvm_info_page, uint64_t mem_size, hvm_info-high_mem_pgend = highmem_end PAGE_SHIFT; hvm_info-reserved_mem_pgstart = ioreq_server_pfn(0); +args-lowmem_end = lowmem_end; +args-highmem_end = highmem_end; +args-mmio_start = mmio_start; + /* Finish with the checksum. 
*/ for ( i = 0, sum = 0; i hvm_info-length; i++ ) sum += ((uint8_t *)hvm_info)[i]; @@ -244,7 +249,7 @@ static int setup_guest(xc_interface *xch, char *image, unsigned long image_size) { xen_pfn_t *page_array = NULL; -unsigned long i, nr_pages = args-mem_size PAGE_SHIFT; +unsigned long i, j, nr_pages = args-mem_size PAGE_SHIFT; unsigned long target_pages = args-mem_target PAGE_SHIFT; uint64_t mmio_start = (1ull 32) - args-mmio_size; uint64_t mmio_size = args-mmio_size; @@ -258,13 +263,13 @@ static int setup_guest(xc_interface *xch, xen_capabilities_info_t caps; unsigned long stat_normal_pages = 0, stat_2mb_pages = 0, stat_1gb_pages = 0; -int pod_mode = 0; +unsigned int memflags = 0; int claim_enabled = args-claim_enabled; xen_pfn_t special_array[NR_SPECIAL_PAGES]; xen_pfn_t ioreq_server_array[NR_IOREQ_SERVER_PAGES]; - -if ( nr_pages target_pages ) -pod_mode = XENMEMF_populate_on_demand; +uint64_t dummy_vnode_size; +unsigned int dummy_vnode_to_pnode; +uint64_t total; memset(elf, 0, sizeof(elf)); if ( elf_init(elf, image, image_size) != 0 ) @@ -276,6 +281,37 @@ static int setup_guest(xc_interface *xch, v_start = 0; v_end = args-mem_size; +if ( nr_pages target_pages ) +memflags |= XENMEMF_populate_on_demand; + +if ( args-nr_vnodes == 0 ) +{ +/* Build dummy vnode information */ +args-nr_vnodes = 1; +dummy_vnode_to_pnode = XC_VNUMA_NO_NODE; +dummy_vnode_size = args-mem_size 20; +args-vnode_size = dummy_vnode_size; +args-vnode_to_pnode = dummy_vnode_to_pnode; +} +else +{ +if ( nr_pages target_pages ) +{ +PERROR(Cannot enable vNUMA and PoD at the same time); +goto error_out; +} +} + +total = 0; +for ( i = 0; i args-nr_vnodes; i++ ) +total += (args-vnode_size[i] 20); +if ( total != args-mem_size ) +{ +PERROR(Memory size requested by vNUMA (0x%PRIx64) mismatches memory size configured for domain (0x%PRIx64), + total, args-mem_size); +goto error_out; +
Re: [Xen-devel] [PATCH v3 08/19] libxl: functions to build vmemranges for PV guest
Wei Liu writes ([PATCH v3 08/19] libxl: functions to build vmemranges for PV guest): Introduce a arch-independent routine to generate one vmemrange per vnode. Also introduce arch-dependent routines for different architectures because part of the process is arch-specific -- ARM has yet have NUMA support and E820 is x86 only. For those x86 guests who care about machine E820 map (i.e. with e820_host=1), vnode is further split into several vmemranges to accommodate memory holes. A few stubs for libxl_arm.c are created. ... +/* Generate one vmemrange for each virtual node. */ +next = 0; +for (i = 0; i b_info-num_vnuma_nodes; i++) { +libxl_vnode_info *p = b_info-vnuma_nodes[i]; + +v = libxl__realloc(gc, v, sizeof(*v) * (i+1)); Please use GCREALLOC_ARRAY. +v[i].start = next; +v[i].end = next + (p-mem 20); /* mem is in MiB */ Why are all these values in different units ? Also, it would be best if the units were in the field and variable names. Then you wouldn't have to write an explanatory comment. diff --git a/tools/libxl/libxl_x86.c b/tools/libxl/libxl_x86.c index e959e37..2018afc 100644 --- a/tools/libxl/libxl_x86.c +++ b/tools/libxl/libxl_x86.c @@ -338,6 +338,80 @@ int libxl__arch_domain_finalise_hw_description(libxl__gc *gc, ... +int libxl__arch_vnuma_build_vmemrange(libxl__gc *gc, + uint32_t domid, + libxl_domain_build_info *b_info, + libxl__domain_build_state *state) +{ ... +n = 0; /* E820 counter */ How about putting this information in the variable name rather than dropping it into a comment ? Likewise i. +while (remaining 0) { +if (n = nr_e820) { +rc = ERROR_FAIL; ERROR_NOMEM, surely ? +if (map[n].size = remaining) { +v[x].start = map[n].addr; +v[x].end = map[n].addr + remaining; +map[n].addr += remaining; +map[n].size -= remaining; +remaining = 0; +} else { +v[x].start = map[n].addr; +v[x].end = map[n].addr + map[n].size; +remaining -= map[n].size; +n++; +} It might be possible to write this more compactly with something like use = map[n].size remaining ? 
map[n].size : remaining; Ian.
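The more compact form suggested at the end of this review, computing how much of the current map entry to use and then updating everything from that one value, can be sketched as below. This is a stand-alone model, not the libxl code: `struct region`, `struct range` and `carve_ranges()` are hypothetical, and unlike the quoted loop it advances past an entry as soon as it is fully consumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct region { uint64_t addr, size; };   /* E820-like usable RAM entry */
struct range  { uint64_t start, end; };   /* one virtual memory range */

/*
 * Carve `remaining` bytes out of map[], starting at entry *map_idx, writing
 * one range per piece. Consumed space is removed from the map so the next
 * vnode continues where this one stopped. Returns the number of ranges
 * written, or -1 if the map or the output array runs out.
 */
static int carve_ranges(struct region *map, size_t nr_map, size_t *map_idx,
                        uint64_t remaining, struct range *out, size_t max_out)
{
    size_t nr_out = 0;

    assert(map != NULL && map_idx != NULL && out != NULL);

    while (remaining > 0) {
        uint64_t use;

        if (*map_idx >= nr_map || nr_out >= max_out)
            return -1;

        /* Take the whole entry, or only what is still needed. */
        use = map[*map_idx].size < remaining ? map[*map_idx].size : remaining;

        out[nr_out].start = map[*map_idx].addr;
        out[nr_out].end = map[*map_idx].addr + use;
        nr_out++;

        map[*map_idx].addr += use;
        map[*map_idx].size -= use;
        remaining -= use;

        if (map[*map_idx].size == 0)
            (*map_idx)++;
    }
    return (int)nr_out;
}
```

The names `map_idx` and `nr_out` follow the other suggestion in the review: the role of each counter is carried by its name instead of a trailing comment.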
Re: [Xen-devel] [PATCH 02/11] VMX: implement suppress #VE.
On 01/12/2015 09:45 AM, Ed White wrote: On 01/12/2015 08:43 AM, Andrew Cooper wrote: On 09/01/15 21:26, Ed White wrote: In preparation for selectively enabling hardware #VE in a later patch, set suppress #VE on all EPTE's on #VE-capable hardware. Suppress #VE should always be the default condition for two reasons: it is generally not safe to deliver #VE into a guest unless that guest has been modified to receive it; and even then for most EPT violations only the hypervisor is able to handle the violation. Signed-off-by: Ed White edmund.h.wh...@intel.com --- xen/arch/x86/mm/p2m-ept.c | 34 +- xen/include/asm-x86/hvm/vmx/vmx.h | 1 + 2 files changed, 34 insertions(+), 1 deletion(-) diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c index eb8b5f9..2b9f07c 100644 --- a/xen/arch/x86/mm/p2m-ept.c +++ b/xen/arch/x86/mm/p2m-ept.c @@ -41,7 +41,7 @@ #define is_epte_superpage(ept_entry)((ept_entry)-sp) static inline bool_t is_epte_valid(ept_entry_t *e) { -return (e-epte != 0 e-sa_p2mt != p2m_invalid); +return (e-valid != 0 e-sa_p2mt != p2m_invalid); } /* returns : 0 for success, -errno otherwise */ @@ -194,6 +194,19 @@ static int ept_set_middle_entry(struct p2m_domain *p2m, ept_entry_t *ept_entry) ept_entry-r = ept_entry-w = ept_entry-x = 1; +/* Disable #VE on all entries */ +if ( cpu_has_vmx_virt_exceptions ) +{ +ept_entry_t *table = __map_domain_page(pg); + +for ( int i = 0; i EPT_PAGETABLE_ENTRIES; i++ ) Style - please declare i in the upper scope, and it should be unsigned. 
+table[i].suppress_ve = 1; + +unmap_domain_page(table); + +ept_entry-suppress_ve = 1; +} + return 1; } @@ -243,6 +256,10 @@ static int ept_split_super_page(struct p2m_domain *p2m, ept_entry_t *ept_entry, epte-sp = (level 1); epte-mfn += i * trunk; epte-snp = (iommu_enabled iommu_snoop); + +if ( cpu_has_vmx_virt_exceptions ) +epte-suppress_ve = 1; + ASSERT(!epte-rsvd1); ept_p2m_type_to_flags(epte, epte-sa_p2mt, epte-access); @@ -753,6 +770,9 @@ ept_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn, ept_p2m_type_to_flags(new_entry, p2mt, p2ma); } +if ( cpu_has_vmx_virt_exceptions ) +new_entry.suppress_ve = 1; + rc = atomic_write_ept_entry(ept_entry, new_entry, target); if ( unlikely(rc) ) old_entry.epte = 0; @@ -1069,6 +1089,18 @@ int ept_p2m_init(struct p2m_domain *p2m) /* set EPT page-walk length, now it's actual walk length - 1, i.e. 3 */ ept-ept_wl = 3; +/* Disable #VE on all entries */ +if ( cpu_has_vmx_virt_exceptions ) +{ +ept_entry_t *table = +map_domain_page(pagetable_get_pfn(p2m_get_pagetable(p2m))); + +for ( int i = 0; i EPT_PAGETABLE_ENTRIES; i++ ) +table[i].suppress_ve = 1; Is it safe setting SVE on an entry which is not known to be a superpage or not present? The manual states that the bit is ignored in this case, but I am concerned that, as with SVE, this bit will suddenly gain meaning in the future. It is safe to do this. Never say never, but I am aware of no plans to overload this bit, and I would know. Unless you feel strongly about it, I would prefer to leave this as-is, since changing it would make the code more complex. One point that I should have clarified yesterday: the SDM says the bit is ignored for a non-terminal present entry; the bit is not ignored for non-present entries, which is why I have to set all the SVE bits in a new page -- my lazy EPTE copying algorithm wouldn't work otherwise because all the zero entries would generate #VE. 
Ed + +unmap_domain_page(table); +} + if ( !zalloc_cpumask_var(ept-synced_mask) ) return -ENOMEM; diff --git a/xen/include/asm-x86/hvm/vmx/vmx.h b/xen/include/asm-x86/hvm/vmx/vmx.h index 8bae195..70fee74 100644 --- a/xen/include/asm-x86/hvm/vmx/vmx.h +++ b/xen/include/asm-x86/hvm/vmx/vmx.h @@ -49,6 +49,7 @@ typedef union { suppress_ve : 1; /* bit 63 - suppress #VE */ }; u64 epte; +u64 valid : 63; /* entire EPTE except suppress #VE bit */ I am not sure 'valid' is a sensible name here. As it is only used in is_epte_valid(), might it be better to just use -epte and a bitmask for everything other than the #VE bit? This seemed more in the style of the code I was changing, but I can do it as you suggest. Ed } ept_entry_t; typedef struct { ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
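The alternative suggested for the `valid` bitfield, testing the raw entry against a mask that excludes the suppress-#VE bit, might look like the sketch below. It models the entry as a plain 64-bit value and covers only the "is anything other than bit 63 set" half of the check; the p2m type comparison in is_epte_valid() is left out. `EPTE_SUPPRESS_VE` and `epte_is_nonempty()` are hypothetical names:

```c
#include <stdbool.h>
#include <stdint.h>

#define EPTE_SUPPRESS_VE (1ULL << 63)   /* bit 63: suppress #VE */

/*
 * Treat an entry as empty when every bit other than suppress-#VE is clear,
 * so entries pre-initialised with only that bit set still count as empty.
 */
static bool epte_is_nonempty(uint64_t epte)
{
    return (epte & ~EPTE_SUPPRESS_VE) != 0;
}
```

This keeps the union free of an extra 63-bit member and makes it explicit at the call site which bit is being ignored.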
Re: [Xen-devel] [PATCH v3 00/14] Enable vTPM subsystem on TPM 2.0
-Original Message- From: Daniel De Graaf [mailto:dgde...@tycho.nsa.gov] Sent: Tuesday, January 13, 2015 11:54 PM To: Xu, Quan; xen-devel@lists.xen.org Cc: stefano.stabell...@eu.citrix.com; samuel.thiba...@ens-lyon.org; ian.campb...@citrix.com; ian.jack...@eu.citrix.com; jbeul...@suse.com; k...@xen.org; t...@xen.org Subject: Re: [PATCH v3 00/14] Enable vTPM subsystem on TPM 2.0 On 01/12/2015 11:06 AM, Xu, Quan wrote: Graaf, Now there are no more comments for this series of patch. Can this series of patch be merged in staging branch? or any other AR, let me know. If the series of patch are in staging branch, the Community and I can continue to develop and enhance it. A few remaining comments: Patch 6 adds an #if 0 block; is this test code that you meant to remove? Thanks, It is just an example how to bind/unbind. I will remove it in v4 and send out v4 ASAP. Patch 9 (see reply). I will fix it. Are you planning to replace TPM2_Bind with TPM2_Seal in a later series? If so, please make a note of this limitation in the documentation for TPM2, since using PCRs to seal the data can be an important security feature that users of the vtpmmgr rely on. Yes, I will replace TPM2_Bind with TPM2_Seal in a later series. For the other patches in this series (1-5,7-8,10): Acked-by: Daniel De Graaf dgde...@tycho.nsa.gov With patch #14 documenting the lack of TPM2 sealing, #11-13 are also Acked. I will fix the Patch#14 documenting the lack of TPM2 sealing in v4. Thanks again. Quan - Daniel Thanks Quan -Original Message- From: Xu, Quan Sent: Wednesday, December 31, 2014 1:50 PM To: xen-devel@lists.xen.org Cc: dgde...@tycho.nsa.gov; stefano.stabell...@eu.citrix.com; samuel.thiba...@ens-lyon.org; ian.campb...@citrix.com; ian.jack...@eu.citrix.com; jbeul...@suse.com; k...@xen.org; t...@xen.org; Xu, Quan Subject: [PATCH v3 00/14] Enable vTPM subsystem on TPM 2.0 ### # Happy New Year..# ### This series of patch enable the virtual Trusted Platform Module (vTPM) subsystem for Xen on TPM 2.0. 
Noted, functionality for a virtual guest operating system (a DomU) is still TPM 1.2. The main modifcation is on vtpmmgr-stubdom. The challenge is that TPM 2.0 is not backward compatible with TPM 1.2. -- DESIGN OVERVIEW -- The architecture of vTPM subsystem on TPM 2.0 is described below: +--+ |Linux DomU| ... | | ^ | | v | | | xen-tpmfront | +--+ | ^ v | +--+ | mini-os/tpmback | | | ^ | | v | | | vtpm-stubdom| ... | | ^ | | v | | | mini-os/tpmfront | +--+ | ^ v | +--+ | mini-os/tpmback | | | ^ | | v | | | vtpmmgr-stubdom | | | ^ | | v | | | mini-os/tpm2_tis | +--+ | ^ v | +--+ | Hardware TPM 2.0 | +--+ * Linux DomU: The Linux based guest that wants to use a vTPM. There many be more than one of these. * xen-tpmfront.ko: Linux kernel virtual TPM frontend driver. This driver provides vTPM access to a para-virtualized Linux based DomU. * mini-os/tpmback: Mini-os TPM backend driver. The Linux frontend driver connects to this backend driver to facilitate communications between the Linux DomU and its vTPM. This driver is also used by vtpmmgr-stubdom to communicate with vtpm-stubdom. * vtpm-stubdom: A mini-os stub domain that implements a vTPM. There is a one to one mapping between running vtpm-stubdom instances and logical vtpms on the system. The vTPM Platform Configuration Registers (PCRs) are all initialized to zero. * mini-os/tpmfront: Mini-os TPM frontend driver. The vTPM mini-os domain vtpm-stubdom uses this driver to communicate with vtpmmgr-stubdom. This driver could also be used separately to implement a mini-os domain that wishes to use a vTPM of its own. * vtpmmgr-stubdom: A mini-os domain that implements the vTPM manager. There is only one vTPM manager and it should be running during the entire lifetime of the machine. This domain regulates access to the physical TPM on the system and secures the persistent state of each vTPM. * mini-os/tpm2_tis: Mini-os TPM version 2.0 TPM Interface Specification
Re: [Xen-devel] [Xen-users] [TestDay] minor bug + possible configuration bug 4.5rc4 archlinux
On Tue, Jan 13, Doug McMillan wrote:

Also quick question if I am understanding my remaining issue [tmpfs: Bad mount option context] as described by a previous thread. Until the code that generates it changes I need to manually change var-lib-xenstored.mount from ... Or am I misunderstanding that also??

No, that's fine. The context= is already removed for 4.5.0.

Olaf
Re: [Xen-devel] (v2) Design proposal for RMRR fix
From: Jan Beulich [mailto:jbeul...@suse.com]
Sent: Wednesday, January 14, 2015 12:06 AM

On 13.01.15 at 17:00, george.dun...@eu.citrix.com wrote:

Another option I was thinking about: Before assigning a device to a guest, you have to unplug the device and assign it to pci-back (e.g., with xl pci-assignable-add). In addition to something like rmrr=host, we could add rmrr=assignable, which would add all of the RMRRs of all devices currently listed as assignable. The idea would then be that you first make all your devices assignable, then just start your guests, and everything you've made assignable will be able to be assigned.

Nice idea indeed, but I'm not sure about its practicability: It may not be desirable to make all devices eventually to be handed to a guest assignable prior to starting any of the guests they may get handed to. In particular there may be reasons why the host needs the device while (or until after) creating the guests.

And I'm not sure whether there's enough knowledge to judge whether a device is assignable, since potential conflicts may be detected only when the guest is launched.

Thanks
Kevin
Re: [Xen-devel] [PATCH 00/11] Alternate p2m: support multiple copies of host p2m
Ed White edmund.h.wh...@intel.com 01/13/15 9:03 PM
On 01/13/2015 11:01 AM, Andrew Cooper wrote:

One thing I have noticed while looking at the #VE stuff is that EPT also supports A/D tracking, which might be quite a nice optimisation and forgo the need for p2m_ram_logdirty, but I think this should be treated as an orthogonal item.

This is far from my area of expertise, but I believe there is code in Xen to use EPT D bits in migration.

There once was a patch series, but upon asking about the (performance) benefits, the submitting engineer stated that there was no measurable improvement, and hence the series never got applied. Right now PML is being worked on afaik, which from what I can tell will make it a lot easier (compared to scanning the whole tree for set D bits) to collect the modified bitmap when the tool stack asks for it.

Jan
Re: [Xen-devel] [PATCH 00/11] Alternate p2m: support multiple copies of host p2m
Ed White edmund.h.wh...@intel.com 01/13/15 10:32 PM
On 01/13/2015 12:45 PM, Andrew Cooper wrote:
On 13/01/15 20:02, Ed White wrote:

The set of mfn's is the same, but I do allow gfn-mfn mappings to be modified under certain circumstances. One use of this is to point the same VA to different physical pages (with different access permissions) in different p2m's to hide memory changes.

What is the practical use of being able to play paging tricks like this behind a VM's back?

I'm restricted in how much detail I can go into on a public mailing list, but imagine that you want a data read to see one thing and an instruction fetch to see something else.

How would that work? There can only be one P2M in use at a time, and that's used for both translations. Or are you saying at least one of the two accesses would be emulated nevertheless?

Jan
Re: [Xen-devel] [CALL-FOR-AGENDA] Monthly Xen.org Technical Call (2015-01-14)
On Wed, 2015-01-07 at 15:32 +, Ian Campbell wrote:

The first Xen technical call will be at: Wed 14 Jan 17:00:00 GMT 2015 `date -d @1421254800`

See http://lists.xen.org/archives/html/xen-devel/2015-01/msg00414.html for more information on the call.

In the absence of any further information from Konrad on the plans for the retrospective there are no agenda items and therefore no call tomorrow.

Thanks,
Ian.
Re: [Xen-devel] [PATCH v3 03/19] libxc: allocate memory with vNUMA information for PV guest
On 13/01/15 12:11, Wei Liu wrote:

From libxc's point of view, it only needs to know vnode to pnode mapping and size of each vnode to allocate memory accordingly. Add these fields to xc_dom structure.

The caller might not pass in vNUMA information. In that case, a dummy layout is generated for the convenience of libxc's allocation code. The upper layer (libxl etc) still sees the domain has no vNUMA configuration.

Signed-off-by: Wei Liu wei.l...@citrix.com
Cc: Ian Campbell ian.campb...@citrix.com
Cc: Ian Jackson ian.jack...@eu.citrix.com
Cc: Dario Faggioli dario.faggi...@citrix.com
Cc: Elena Ufimtseva ufimts...@gmail.com
---
Changes in v3:
1. Rewrite commit log.
2. Shorten some error messages.
---
 tools/libxc/include/xc_dom.h |  5 +++
 tools/libxc/xc_dom_x86.c     | 79 --
 tools/libxc/xc_private.h     |  2 ++
 3 files changed, 75 insertions(+), 11 deletions(-)

diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
index 07d7224..c459e77 100644
--- a/tools/libxc/include/xc_dom.h
+++ b/tools/libxc/include/xc_dom.h
@@ -167,6 +167,11 @@ struct xc_dom_image {
     struct xc_dom_loader *kernel_loader;
     void *private_loader;

+    /* vNUMA information */
+    unsigned int *vnode_to_pnode; /* vnode to pnode mapping array */
+    uint64_t *vnode_size; /* vnode size array */

Please make it very clear in the comment here that size is in MB (at least I presume so, given the shifts by 20). There are currently no specified units.

+    unsigned int nr_vnodes; /* number of elements of above arrays */
+
     /* kernel loader */
     struct xc_dom_arch *arch_hooks;
     /* allocate up to virt_alloc_end */
diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
index bf06fe4..06a7e54 100644
--- a/tools/libxc/xc_dom_x86.c
+++ b/tools/libxc/xc_dom_x86.c
@@ -759,7 +759,8 @@ static int x86_shadow(xc_interface *xch, domid_t domid)
 int arch_setup_meminit(struct xc_dom_image *dom)
 {
     int rc;
-    xen_pfn_t pfn, allocsz, i, j, mfn;
+    xen_pfn_t pfn, allocsz, mfn, total, pfn_base;
+    int i, j;

     rc = x86_compat(dom->xch, dom->guest_domid, dom->guest_type);
     if ( rc )
@@ -811,18 +812,74 @@ int arch_setup_meminit(struct xc_dom_image *dom)
         /* setup initial p2m */
         for ( pfn = 0; pfn < dom->total_pages; pfn++ )
             dom->p2m_host[pfn] = pfn;
-
+
+        /* Setup dummy vNUMA information if it's not provided. Not
+         * that this is a valid state if libxl doesn't provide any
+         * vNUMA information.
+         *
+         * In this case we setup some dummy value for the convenience
+         * of the allocation code. Note that from the user's PoV the
+         * guest still has no vNUMA configuration.
+         */
+        if ( dom->nr_vnodes == 0 )
+        {
+            dom->nr_vnodes = 1;
+            dom->vnode_to_pnode = xc_dom_malloc(dom,
+                                                sizeof(*dom->vnode_to_pnode));
+            dom->vnode_to_pnode[0] = XC_VNUMA_NO_NODE;
+            dom->vnode_size = xc_dom_malloc(dom, sizeof(*dom->vnode_size));
+            dom->vnode_size[0] = (dom->total_pages << PAGE_SHIFT) >> 20;
+        }
+
+        total = 0;
+        for ( i = 0; i < dom->nr_vnodes; i++ )
+            total += ((dom->vnode_size[i] << 20) >> PAGE_SHIFT);

Can I suggest a mb_to_pages() helper rather than opencoding this in several locations.

+        if ( total != dom->total_pages )
+        {
+            xc_dom_panic(dom->xch, XC_INTERNAL_ERROR,
+                         "%s: vNUMA page count mismatch (0x%"PRIpfn" != 0x%"PRIpfn")\n",
+                         __FUNCTION__, total, dom->total_pages);

__func__ please. It is part of C99 unlike __FUNCTION__ which is a gnuism.

andrewcoop:xen.git$ git grep __FUNCTION__ | wc -l
230
andrewcoop:xen.git$ git grep __func__ | wc -l
194

Looks like the codebase is very mixed, but best to err on the side of the standard.

+            return -EINVAL;
+        }
+
         /* allocate guest memory */
-        for ( i = rc = allocsz = 0;
-              (i < dom->total_pages) && !rc;
-              i += allocsz )
+        pfn_base = 0;
+        for ( i = 0; i < dom->nr_vnodes; i++ )
         {
-            allocsz = dom->total_pages - i;
-            if ( allocsz > 1024*1024 )
-                allocsz = 1024*1024;
-            rc = xc_domain_populate_physmap_exact(
-                dom->xch, dom->guest_domid, allocsz,
-                0, 0, &dom->p2m_host[i]);
+            unsigned int memflags;
+            uint64_t pages;
+
+            memflags = 0;
+            if ( dom->vnode_to_pnode[i] != XC_VNUMA_NO_NODE )
+            {
+                memflags |= XENMEMF_exact_node(dom->vnode_to_pnode[i]);
+                memflags |= XENMEMF_exact_node_request;
+            }
+
+            pages = (dom->vnode_size[i] << 20) >> PAGE_SHIFT;
+
+            for ( j = 0;
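For reference, the mb_to_pages() helper suggested in the review could look roughly like the sketch below. This is only an illustration, not code from the series: SKETCH_PAGE_SHIFT, pages_to_mb() and total_vnode_pages() are names made up here, and a 4 KiB page size is assumed (the real code would use PAGE_SHIFT from the Xen headers).

```c
#include <stdint.h>

/* Assumption: 4 KiB pages; the real code would use PAGE_SHIFT. */
#define SKETCH_PAGE_SHIFT 12

/* Convert a size in MiB to a number of pages. */
static inline uint64_t mb_to_pages(uint64_t mb)
{
    return (mb << 20) >> SKETCH_PAGE_SHIFT;
}

/* Convert a page count back to MiB (rounds down). Hypothetical helper. */
static inline uint64_t pages_to_mb(uint64_t pages)
{
    return (pages << SKETCH_PAGE_SHIFT) >> 20;
}

/* Sum per-vnode sizes (given in MiB) as pages, as the consistency check does. */
static uint64_t total_vnode_pages(const uint64_t *vnode_size,
                                  unsigned int nr_vnodes)
{
    uint64_t total = 0;
    unsigned int i;

    for ( i = 0; i < nr_vnodes; i++ )
        total += mb_to_pages(vnode_size[i]);

    return total;
}
```

With helpers like these, the three open-coded shift pairs in the hunk would each collapse to a single call, and the MiB unit of vnode_size becomes visible at every use.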
Re: [Xen-devel] [PATCH 00/11] Alternate p2m: support multiple copies of host p2m
On 01/13/2015 11:01 AM, Andrew Cooper wrote: On 09/01/15 21:26, Ed White wrote: This set of patches adds support to hvm domains for EPTP switching by creating multiple copies of the host p2m (currently limited to 10 copies). The primary use of this capability is expected to be in scenarios where access to memory needs to be monitored and/or restricted below the level at which the guest OS page tables operate. Two examples that were discussed at the 2014 Xen developer summit are: VM introspection: http://www.slideshare.net/xen_com_mgr/ zero-footprint-guest-memory-introspection-from-xen Secure inter-VM communication: http://www.slideshare.net/xen_com_mgr/nakajima-nvf Each p2m copy is populated lazily on EPT violations, and only contains entries for ram p2m types. Permissions for pages in alternate p2m's can be changed in a similar way to the existing memory access interface, and gfn-mfn mappings can be changed. All this is done through extra HVMOP types. The cross-domain HVMOP code has been compile-tested only. Also, the cross-domain code is hypervisor-only, the toolstack has not been modified. The intra-domain code has been tested. Violation notifications can only be received for pages that have been modified (access permissions and/or gfn-mfn mapping) intra-domain, and only on VCPU's that have enabled notification. VMFUNC and #VE will both be emulated on hardware without native support. This code is not compatible with nested hvm functionality and will refuse to work with nested hvm active. It is also not compatible with migration. It should be considered experimental. Having reviewed most of the series, I believe I now have a feeling for what you are trying to achieve, but I would like to discuss some of the design implications. The following is my understanding of the situation. Please correct me if I have made a mistake. Thanks for investing the time to do this. 
Maybe this first couple of days would have gone more smoothly if something like this was in the cover letter. With the exception of a couple of minor points, you are spot on. Currently, a domain has a single host p2m. This contains the guest physical address mappings, and a combination of p2m types which are used by existing components to allow certain actions to happen. All vcpus run with the same host p2m. A domain may have a number of nested p2ms (currently an arbitrary limit of 10). These are used for nested-virt and are translated by the host p2m. Vcpus in guest mode run under a nested p2m. This new altp2m infrastructure adds the ability to use a different set of tables in the place of the host p2m. This, in practice, allows for different translations, different p2m types, different access permissions. One usecase of alternate p2ms is to provide introspection information to out-of-guest entities (via the mem_event interface) or to in-guest entities (via #VE). Now for some observations and assumptions. It occurs to me that the altp2m mechanism is generic. From the look of the series, it is mostly implemented in a generic way, which is great. The only Intel specific bits appear to be the ept handling itself, 'vmfunc' instruction support and #VE injection to in-guest entities. That was my intention. I don't know enough about the state of AMD virtualization to know if it can support these patches by emulating vmfunc and #VE, but that was my target. I can't think of any reasonable case where the alternate p2m would want mappings different to the host p2m. That is to say, an altp2m will map the same set of mfns to make a guest physical address space, but may differ in page permissions and possibly p2m types. The set of mfn's is the same, but I do allow gfn-mfn mappings to be modified under certain circumstances. One use of this is to point the same VA to different physical pages (with different access permissions) in different p2m's to hide memory changes. 
Given the above restriction, I believe a lot of the existing features can continue to work and coexist. For generating mem_events, the permissions can be altered in the altp2m. For injecting #VE, the altp2m type can change to the new p2m_ram_rw, so long as the host p2m type is compatible. For both, a vmexit can occur. Xen can do the appropriate action and also inject a #VE on its way back into the guest. One thing I have noticed while looking at the #VE stuff that EPT also supports A/D tracking, which might be quite a nice optimisation and forgo the need for p2m_ram_logdirty, but I think this should be treated as an orthogonal item. This is far from my area of expertise, but I believe there is code in Xen to use EPT D bits in migration. Ed When shared ept/iommu is not in use, altp2m can safely be used by vcpus, as this will not interfere with the IOMMU permissions. Furthermore, I can't conceptually think of an
Re: [Xen-devel] [PATCH Linux-2.6.18] scsifront: avoid aquiring same lock twice if ring is full
Hi,

On Tue, Jan 13, 2015 at 05:22:58PM +0100, Juergen Gross wrote:

The locking in scsifront_dev_reset_handler() is obviously wrong. In case of a full ring the host lock is acquired twice. Fixing this issue makes it possible to get rid of the endless for loop with an explicit break statement.

Is this patch needed in the upstream Linux kernel as well, now that the Xen PVSCSI drivers are in upstream Linux?

Thanks,
-- Pasi

Signed-off-by: Juergen Gross jgr...@suse.com
---
diff -r 078f1bb69ea5 drivers/xen/scsifront/scsifront.c
--- a/drivers/xen/scsifront/scsifront.c Wed Dec 10 10:22:39 2014 +0100
+++ b/drivers/xen/scsifront/scsifront.c Tue Jan 13 14:32:33 2015 +0100
@@ -447,12 +447,10 @@ static int scsifront_dev_reset_handler(s
     uint16_t rqid;
     int err = 0;

-    for (;;) {
 #if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,12)
-        spin_lock_irq(host->host_lock);
+    spin_lock_irq(host->host_lock);
 #endif
-        if (!RING_FULL(&info->ring))
-            break;
+    while (RING_FULL(&info->ring)) {
         if (err) {
 #if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,12)
             spin_unlock_irq(host->host_lock);

___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
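The locking shape the patch moves to can be illustrated with a small user-space sketch. Everything here is a stand-in and not driver code: fake_lock, fake_ring, wait_for_slot() and acquire_slot() are hypothetical names, and the unlock/wait/relock steps inside the loop are an assumption, since the quoted hunk is cut off before that part. The point is only that the lock is taken once before the loop and re-taken exactly once after each wait, so it can never be acquired twice.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the host lock: records whether it is held. */
struct fake_lock { bool held; };

static void lock(struct fake_lock *l)   { assert(!l->held); l->held = true; }
static void unlock(struct fake_lock *l) { assert(l->held);  l->held = false; }

/* Hypothetical ring: reports "full" until free_after waits have happened. */
struct fake_ring { int free_after; };

static bool ring_full(const struct fake_ring *r) { return r->free_after > 0; }

/* Stand-in for sleeping until the backend frees a slot; 0 means success. */
static int wait_for_slot(struct fake_ring *r) { r->free_after--; return 0; }

/*
 * Returns 0 with the lock held once a slot is available,
 * or -1 with the lock released if a previous wait failed.
 */
static int acquire_slot(struct fake_lock *l, struct fake_ring *r)
{
    int err = 0;

    lock(l);                    /* taken exactly once before the loop */
    while (ring_full(r)) {
        if (err) {
            unlock(l);
            return -1;
        }
        unlock(l);              /* do not sleep with the lock held */
        err = wait_for_slot(r);
        lock(l);                /* re-take before re-checking the ring */
    }
    return 0;
}
```

The assert() inside lock() plays the role of a lockdep check: the old for (;;) shape would trip it on the second iteration when the ring is full.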
Re: [Xen-devel] [PATCH v2 2/2] x86, arm, platform, xen, kconfig: add xen defconfig helper
On Mon, Dec 15, 2014 at 02:58:26PM +, Stefano Stabellini wrote: On Tue, 9 Dec 2014, Luis R. Rodriguez wrote: From: Luis R. Rodriguez mcg...@suse.com This lets you build a kernel which can support xen dom0 or xen guests by just using: make xenconfig on both x86 and arm64 kernels. This also splits out the options which are available currently to be built with x86 and 'make ARCH=arm64' under a shared config. Technically xen supports a dom0 kernel and also a guest kernel configuration but upon review with the xen team since we don't have many dom0 options its best to just combine these two into one. Cc: Josh Triplett j...@joshtriplett.org Cc: Borislav Petkov b...@suse.de Cc: Pekka Enberg penb...@kernel.org Cc: David Rientjes rient...@google.com Cc: Michal Marek mma...@suse.cz Cc: Randy Dunlap rdun...@infradead.org Cc: penb...@kernel.org Cc: levinsasha...@gmail.com Cc: mtosa...@redhat.com Cc: fengguang...@intel.com Cc: David Vrabel david.vra...@citrix.com Cc: Ian Campbell ian.campb...@citrix.com Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com Cc: xen-de...@lists.xenproject.org Reviewed-by: Josh Triplett j...@joshtriplett.org Signed-off-by: Luis R. 
Rodriguez mcg...@suse.com --- arch/x86/configs/xen.config | 7 +++ kernel/configs/xen.config | 30 ++ scripts/kconfig/Makefile| 5 + 3 files changed, 42 insertions(+) create mode 100644 arch/x86/configs/xen.config create mode 100644 kernel/configs/xen.config diff --git a/arch/x86/configs/xen.config b/arch/x86/configs/xen.config new file mode 100644 index 000..92b8587f --- /dev/null +++ b/arch/x86/configs/xen.config @@ -0,0 +1,7 @@ +# x86 xen specific config options +CONFIG_XEN_PVHVM=y +CONFIG_XEN_MAX_DOMAIN_MEMORY=500 +CONFIG_XEN_SAVE_RESTORE=y +# CONFIG_XEN_DEBUG_FS is not set +CONFIG_XEN_PVH=y +CONFIG_XEN_MCE_LOG=y diff --git a/kernel/configs/xen.config b/kernel/configs/xen.config new file mode 100644 index 000..d2ec010 --- /dev/null +++ b/kernel/configs/xen.config @@ -0,0 +1,30 @@ +# generic config +CONFIG_XEN=y +CONFIG_XEN_DOM0=y +CONFIG_PCI_XEN=y This shouldn't be here If PCI is not supported on the arch this won't be selected as kconfig would not allow for it, what would be the issue of keeping it here? What xen instances would we not want to have this enabled for and can we instead manage that through Kconfig magic by negating PCI_XEN for it? +CONFIG_XEN_PCIDEV_FRONTEND=m +CONFIG_XEN_BLKDEV_FRONTEND=m +CONFIG_XEN_BLKDEV_BACKEND=m +CONFIG_XEN_NETDEV_FRONTEND=m +CONFIG_XEN_NETDEV_BACKEND=m +CONFIG_INPUT_XEN_KBDDEV_FRONTEND=y +CONFIG_HVC_XEN=y +CONFIG_HVC_XEN_FRONTEND=y +CONFIG_TCG_XEN=m neither should this OK! +CONFIG_XEN_WDT=m +CONFIG_XEN_FBDEV_FRONTEND=y +CONFIG_XEN_BALLOON=y +CONFIG_XEN_BALLOON_MEMORY_HOTPLUG=y +CONFIG_XEN_SCRUB_PAGES=y +CONFIG_XEN_DEV_EVTCHN=m +CONFIG_XEN_BACKEND=y +CONFIG_XENFS=m +CONFIG_XEN_COMPAT_XENFS=y +CONFIG_XEN_SYS_HYPERVISOR=y +CONFIG_XEN_XENBUS_FRONTEND=y +CONFIG_XEN_GNTDEV=m +CONFIG_XEN_GRANT_DEV_ALLOC=m +CONFIG_SWIOTLB_XEN=y +CONFIG_XEN_PCIDEV_BACKEND=m +CONFIG_XEN_PRIVCMD=m +CONFIG_XEN_ACPI_PROCESSOR=m and this OK! Luis ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] --enable-xsm ?
Hey, I was wondering whether there are any plans for configure.ac (or the m4 scripts) to gain an --enable-xsm option which would set XSM_ENABLE (or FLASK_ENABLE) to true? Right now, to build with XSM one has to manually change the XSM_ENABLE option in Config.mk to 'y'. Thanks. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 04/11] x86/MM: Improve p2m type checks.
On 01/12/2015 09:48 AM, Andrew Cooper wrote:
On 09/01/15 21:26, Ed White wrote:

diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index 5f7fe71..8193901 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -193,6 +193,9 @@ struct p2m_domain {
      * threaded on in LRU order. */
     struct list_head np2m_list;

+    /* Does this p2m belong to the altp2m code? */
+    bool_t alternate;
+
     /* Host p2m: Log-dirty ranges registered for the domain. */
     struct rangeset *logdirty_ranges;

@@ -290,7 +293,9 @@ struct p2m_domain *p2m_get_nestedp2m(struct vcpu *v, uint64_t np2m_base);
  */
 struct p2m_domain *p2m_get_p2m(struct vcpu *v);

-#define p2m_is_nestedp2m(p2m) ((p2m) != p2m_get_hostp2m((p2m->domain)))
+#define p2m_is_hostp2m(p2m) ((p2m) == p2m_get_hostp2m((p2m->domain)))
+#define p2m_is_altp2m(p2m) ((p2m)->alternate)
+#define p2m_is_nestedp2m(p2m) (!p2m_is_altp2m(p2m) && !p2m_ishostp2m(p2m))

Might this be better expressed as a p2m type, currently of the set {host, alt, nested}? p2m_is_nestedp2m() is starting to hide some moderately complicated calculations.

Any suggestions for the name? Unfortunately, p2m_type is already taken, and I can't think of a good alternative.

Ed

___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
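One way to read the suggestion is to store the kind of p2m in the structure itself, so each predicate becomes a single comparison. The sketch below is only an illustration of that idea: the names p2m_class_sketch_t, p2m_class and struct p2m_domain_sketch are placeholders invented here (the thread has not settled on a name), and the real struct p2m_domain has many more fields.

```c
#include <stdbool.h>

/* Hypothetical names: neither the enum nor the field is in the posted series. */
typedef enum {
    p2m_class_host,
    p2m_class_nested,
    p2m_class_alternate,
} p2m_class_sketch_t;

/* Cut-down stand-in for struct p2m_domain. */
struct p2m_domain_sketch {
    p2m_class_sketch_t p2m_class;   /* would replace the bool_t 'alternate' */
};

static inline bool p2m_is_hostp2m(const struct p2m_domain_sketch *p2m)
{
    return p2m->p2m_class == p2m_class_host;
}

static inline bool p2m_is_nestedp2m(const struct p2m_domain_sketch *p2m)
{
    return p2m->p2m_class == p2m_class_nested;
}

static inline bool p2m_is_altp2m(const struct p2m_domain_sketch *p2m)
{
    return p2m->p2m_class == p2m_class_alternate;
}
```

Compared with the macros in the patch, none of the predicates needs to look up the host p2m through the domain pointer, and adding a fourth kind later would not change the existing checks.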
[Xen-devel] [PATCH] xen/arm: Blacklist the memory mapped timer (armv7-timer-mem)
Some platforms (such as the FVP Base AEMv8 model) have a memory-mapped timer. We don't want DOM0 to use this timer rather than the generic ARM timer, so blacklist it for all platforms.

Signed-off-by: Julien Grall julien.gr...@linaro.org
---
This patch is a candidate for backporting to Xen 4.5 and Xen 4.4. It may not apply cleanly to Xen 4.4.
---
 xen/arch/arm/domain_build.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index bf8dc78..16ce248 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -1047,6 +1047,7 @@ static int handle_node(struct domain *d, struct kernel_info *kinfo,
         DT_MATCH_COMPATIBLE("arm,psci"),
         DT_MATCH_PATH("/cpus"),
         DT_MATCH_TYPE("memory"),
+        DT_MATCH_COMPATIBLE("arm,armv7-timer-mem"),
         { /* sentinel */ },
     };
     static const struct dt_device_match gic_matches[] __initconst =
--
2.1.4

___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
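For readers unfamiliar with the table being extended, the sketch below shows the general shape of a sentinel-terminated match list and how a node's compatible string is checked against it. It is a simplified stand-in, not Xen's implementation: struct match_sketch and node_is_skipped() are hypothetical, and only compatible-string matching is modelled (the real struct dt_device_match also matches on path and type).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for a device-tree match entry (compatible only). */
struct match_sketch {
    const char *compatible;
};

/* Nodes that should not be passed through to DOM0. */
static const struct match_sketch skip_matches[] = {
    { "arm,psci" },
    { "arm,armv7-timer-mem" },
    { NULL },   /* sentinel */
};

/* Walk the table until the sentinel; true if the node is blacklisted. */
static bool node_is_skipped(const char *compatible)
{
    const struct match_sketch *m;

    for ( m = skip_matches; m->compatible != NULL; m++ )
        if ( strcmp(m->compatible, compatible) == 0 )
            return true;

    return false;
}
```

Because the list ends with a sentinel rather than carrying a length, adding one entry before the sentinel, as the patch does, is the whole change.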
Re: [Xen-devel] [PATCH v2 2/2] x86, arm, platform, xen, kconfig: add xen defconfig helper
Hello Luis, On 13/01/15 19:03, Luis R. Rodriguez wrote: diff --git a/kernel/configs/xen.config b/kernel/configs/xen.config new file mode 100644 index 000..d2ec010 --- /dev/null +++ b/kernel/configs/xen.config @@ -0,0 +1,30 @@ +# generic config +CONFIG_XEN=y +CONFIG_XEN_DOM0=y +CONFIG_PCI_XEN=y +CONFIG_XEN_PCIDEV_FRONTEND=m +CONFIG_XEN_BLKDEV_FRONTEND=m +CONFIG_XEN_BLKDEV_BACKEND=m +CONFIG_XEN_NETDEV_FRONTEND=m +CONFIG_XEN_NETDEV_BACKEND=m +CONFIG_INPUT_XEN_KBDDEV_FRONTEND=y +CONFIG_HVC_XEN=y +CONFIG_HVC_XEN_FRONTEND=y +CONFIG_TCG_XEN=m +CONFIG_XEN_WDT=m +CONFIG_XEN_FBDEV_FRONTEND=y +CONFIG_XEN_BALLOON=y +CONFIG_XEN_BALLOON_MEMORY_HOTPLUG=y +CONFIG_XEN_SCRUB_PAGES=y +CONFIG_XEN_DEV_EVTCHN=m +CONFIG_XEN_BACKEND=y +CONFIG_XENFS=m +CONFIG_XEN_COMPAT_XENFS=y +CONFIG_XEN_SYS_HYPERVISOR=y +CONFIG_XEN_XENBUS_FRONTEND=y +CONFIG_XEN_GNTDEV=m +CONFIG_XEN_GRANT_DEV_ALLOC=m +CONFIG_SWIOTLB_XEN=y +CONFIG_XEN_PCIDEV_BACKEND=m +CONFIG_XEN_PRIVCMD=m +CONFIG_XEN_ACPI_PROCESSOR=m The common fragment config looks good for both ARM32 and ARM64: Acked-by: Julien Grall julien.gr...@linaro.org Can someone apply this? Who should this go through? Stefano had some comments on this patch. See: http://lists.xenproject.org/archives/html/xen-devel/2014-12/msg01531.html Regards, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 01/19] xen: dump vNUMA information with debug key u
On 13/01/15 12:11, Wei Liu wrote:

@@ -408,6 +413,49 @@ static void dump_numa(unsigned char key)
         for_each_online_node ( i )
             printk("Node %u: %u\n", i, page_num_node[i]);
+
+        if ( !d->vnuma )
+            continue;

Nit - extraneous whitespace.

+
+        vnuma = d->vnuma;
+        printk(" %u vnodes, %u vcpus:\n", vnuma->nr_vnodes, d->max_vcpus);
+        for ( i = 0; i < vnuma->nr_vnodes; i++ )
+        {
+            err = snprintf(keyhandler_scratch, 12, "%3u",
+                           vnuma->vnode_to_pnode[i]);
+            if ( err < 0 || vnuma->vnode_to_pnode[i] == NUMA_NO_NODE )
+                strlcpy(keyhandler_scratch, "???", 3);
+
+            printk(" vnode %3u - pnode %s\n", i, keyhandler_scratch);
+            for ( j = 0; j < vnuma->nr_vmemranges; j++ )
+            {
+                if ( vnuma->vmemrange[j].nid == i )
+                {
+                    printk(" %016"PRIx64" - %016"PRIx64"\n",
+                           vnuma->vmemrange[j].start,
+                           vnuma->vmemrange[j].end);
+                }
+            }
+
+            printk(" vcpus: ");
+            for ( j = 0, n = 0; j < d->max_vcpus; j++ )
+            {
+                if ( !(j & 0x3f) )
+                    process_pending_softirqs();
+
+                if ( vnuma->vcpu_to_vnode[j] == i )
+                {
+                    if ( (n + 1) % 8 == 0 )
+                        printk("%3d\n", j);
+                    else if ( !(n % 8) && n != 0 )
+                        printk("%17d ", j);
+                    else
+                        printk("%3d ", j);
+                    n++;
+                }

Do you have a sample of what this looks like for a VM with more than 8 vcpus?

~Andrew

+            }
+            printk("\n");
+        }
     }

     rcu_read_unlock(&domlist_read_lock);

___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
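To get a feel for the layout being asked about without booting a guest, the vcpu-printing branches can be replayed in user space. The sketch below copies only the three-way branch from the quoted loop and writes into a buffer instead of calling printk(); format_vcpus() and its parameters are hypothetical, and the exact leading whitespace of the surrounding printk() strings is not reproduced because it was lost in the archive.

```c
#include <stdio.h>
#include <string.h>

/*
 * Format the vcpus that belong to 'vnode' into buf, using the same
 * branch structure as the quoted loop. 'len' must be at least 1.
 * Returns the number of vcpus written.
 */
static unsigned int format_vcpus(char *buf, size_t len,
                                 const unsigned int *vcpu_to_vnode,
                                 unsigned int max_vcpus, unsigned int vnode)
{
    unsigned int j, n = 0;
    size_t off = 0;

    buf[0] = '\0';
    for ( j = 0; j < max_vcpus; j++ )
    {
        int w;

        if ( vcpu_to_vnode[j] != vnode )
            continue;

        if ( (n + 1) % 8 == 0 )             /* eighth entry ends the row */
            w = snprintf(buf + off, len - off, "%3d\n", (int)j);
        else if ( !(n % 8) && n != 0 )      /* first entry of a later row */
            w = snprintf(buf + off, len - off, "%17d ", (int)j);
        else
            w = snprintf(buf + off, len - off, "%3d ", (int)j);

        if ( w < 0 || (size_t)w >= len - off )
            break;                          /* buffer full: stop early */

        off += (size_t)w;
        n++;
    }

    return n;
}
```

Feeding it a map in which ten vcpus all sit on vnode 0 gives eight ids on the first row and the remaining two on a second row, with the first id of the second row padded to a width of 17 columns.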
Re: [Xen-devel] [PULL 0/2] Xen tree 2015-01-13
On 13 January 2015 at 18:24, Stefano Stabellini stefano.stabell...@eu.citrix.com wrote: The following changes since commit 7d5ad15d17f26dd4f9ff5f3491828bc34e74f28c: Merge remote-tracking branch 'remotes/stefanha/tags/net-pull-request' into staging (2015-01-12 11:13:24 +) are available in the git repository at: git://xenbits.xen.org/people/sstabellini/qemu-dm.git xen-2015-01-13 for you to fetch changes up to c1d322e6048796296555dd36fdd102d7fa2f50bf: xen-hvm: increase maxmem before calling xc_domain_populate_physmap (2015-01-13 18:05:52 +) Liang Li (1): xen-pt: Fix PCI devices re-attach failed Stefano Stabellini (1): xen-hvm: increase maxmem before calling xc_domain_populate_physmap Applied, thanks. -- PMM ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Architecture for dom0 integrity measurements.
On Jan 12, 3:53pm, Xu, Quan wrote: } Subject: RE: [Xen-devel] Architecture for dom0 integrity measurements. Hi, Dr. G.W. Wettstein Hi Quan, thanks for taking the time to reply. cc Graaf who is vTPM / XSM maintainer. Also cc Stefano. Greetings to everyone else as well. -Original Message- From: xen-devel-boun...@lists.xen.org [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of Dr. Greg Wettstein Sent: Saturday, January 10, 2015 10:59 PM To: xen-devel@lists.xen.org Subject: [Xen-devel] Architecture for dom0 integrity measurements. Hi, I hope the weekend is going well for everyone. We have been watching the discussions on the list over the holiday on the refinement and enhancement of the TPM architecture for Xen, including support for TPM 2.0. We are active in measured platform development and I wanted to pose what is perhaps a philosophical question to everyone. Our systems boot from a hardware root of trust via TXT and we heavily leverage the Linux Integrity Measurement Architecture (IMA) for mutual remote attestation. Is it based on TBoot / OpenAttestation ? Yes, we leverage TBOOT to implement the root of trust for our security supervisor. We have worked with OAT but our development efforts have been focused on something we refer to as POSSUM. We are heavily invested in the concept of intrinsically linking identity to authentication and ephemeral key exchange through mutual device attestation. Others may disagree but I wouldn't even contemplate delivering an integrity certified platform without including all of the dom0 infrastructure into the platform measurement status. We heavily leverage the current 4.4.x vTPM implementation for testing and development and the documentation states clearly to not integrate TPM/TIS support into the dom0 OS. Ditto. Everyone seems to agree on this point. The obvious model is to run a software TPM simulator in dom0 and have the vTPM I/O domains link to that. 
We are heavily invested in IBM's software TPM simulator and have been tossing around the idea of building up a proof of concept based on that. I wanted to make sure we were not misunderstanding anything with the current or proposed architecture before we invest the resources.

IBM's software TPM simulator, is it libtpms? As far as I know, libtpms is a library that targets the integration of TPM functionality into hypervisors. In this mode, libtpms is a dynamically linked library, so there is no root of trust. If you really want to enable it, I have some suggestions.

It is Ken Goldman's work at IBM and the library name is libtpm. It is a library of TPM functionality which is used to implement a TPM simulator/server. Trousers talks to the server and for testing and development we have been able to move our codebase between it and hardware without modification.

1. vTPM I/O domains is now needed in this mode, QEMU can implement another TPM Backend to link libtpms. Try to refer to http://lists.nongnu.org/archive/html/qemu-devel/2013-11/msg00674.html
2. Enabling seabios for HVM virtual machine. Refer to patch 'vTPM: add TPM TCPA and SSDT for HVM virtual machine' and https://github.com/virt2x/seabios2

Thanks for the references, we are following up on .

We have also been considering whether or not to implement the multiple TPM states in the context of the dom0 hardware virtualization instance.

Does it mean initial states from libtpms? Such as clear/save, etc. Correct me if I am wrong.

I believe we are talking about the same concept/technology. The library initializes its TPM 'state' but the state is not anchored in hardware.

Once again not as 'technically secure' but it does cut out a lot of complexity with the current model,

Yes, agree with this point.

Yes, it doesn't take very much work on this technology to appreciate the reproducibility, flexibility and debuggability of a simulator.
It is not, as I noted above, capable of implementing a hardware root of trust like a hardware TPM but the same rules apply to it as the vTPM/vtpmmgr architecture. If the simulator and its database are anchored to a hardware root of trust it should be possible to have its simulation services be trusted by a guest. We've started work on going through the code and building up a prototype which is capable of running multiple TPM instances, each of which could be coupled to a guest domain. We will see where that leads us with respect to coupling it via XEN's tpm front/back drivers to a guest. with the added benefit of that infrastructure being covered by a hardware rooted IMA state. Also we are extremely interested in what hardware and motherboards with TPM 2.0 support are being used for this development, obviously with TXT being a requirement. It wasn't too long ago we were advised directly by Intel that physical hardware was not available, perhaps that was a miscommunication. Given the work
Re: [Xen-devel] [PATCH v3 4/5] tools: code refactoring for MBM
On Tue, Jan 13, 2015 at 01:46:50PM +, Wei Liu wrote:
On Tue, Jan 13, 2015 at 04:02:12PM +0800, Chao Peng wrote:

Make some internal routines common so that total/local memory bandwidth monitoring in the next patch can make use of them.

Signed-off-by: Chao Peng chao.p.p...@linux.intel.com

Acked-by: Wei Liu wei.l...@citrix.com

Could you please include a short change log in the commit message of your later patch submissions, so that reviewers know what has changed? The change log can be separated with --- so that it does not appear in the commit message in the tree.

Thanks Wei. I added the change logs in the cover letter for the whole patch series. But your suggestion is valuable, as we can add more detailed change logs on a per-patch basis. I will take your VNUMA patch as an example :)

Chao

___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] Question about partitioning shared cache in Xen
Hi,

[Goal]
I want to investigate the impact of the shared cache on the performance of workloads in guest domains. I also want to partition the shared cache via a page coloring mechanism, so that guest domains use different cache colors of the shared cache and do not interfere with each other in it.

[Motivation: Why do I want to partition the shared cache?]
Because the shared cache is shared among all guest domains (I assume the machine has multiple cores sharing the same LLC; for example, the Intel(R) Xeon(R) CPU E5-1650 v2 has 6 physical cores sharing a 12MB L3 cache), the workload in one domU can interfere with another domU's memory-intensive workload on the same machine via the shared cache. This shared-cache interference makes the execution time of the workload in a domU non-deterministic and increases it considerably. (If we assume the worst case, the worst-case execution time of the workload will be too pessimistic.) A stable execution time is very important in real-time computation, where a real-time program, like the control program in an automobile, has to generate its result within a deadline.

I did some quick measurements to show how the shared cache can be used by a hostile domain to interfere with the execution time of another domain's workload. I pin the VCPUs of two domains to different physical cores and use one domain to pollute the shared cache. The result shows that shared-cache interference can slow down the execution time of another domain's workload by 4x. The whole experiment result can be found at https://github.com/PennPanda/cis601/blob/master/project/data/boxplot_cache_v2.pdf . (The workload in the figure is a program reading a large array. I ran the program 100 times and drew the latency of accessing the array in a box plot. The first column with name alone−d1v1 is the boxplot latency when the program in dom1 runs alone.
The fourth column d1v1d2v1−pindiffcore is the boxplot latency when the program in dom1 runs along with another program in dom2, and these two domains use different cores. dom1 and dom2 each have 1 vcpu with budget equal to period. The scheduler is the credit scheduler.)

[Idea of how to partition the shared cache]
When a PV guest domain is created, it will call xc_dom_boot_mem_init() to allocate memory for the domain, which finally calls xc_domain_populate_physmap_exact() to allocate memory pages from the domheap in Xen. The idea of partitioning the shared cache is as follows:
1) xl tool change: Add an option in the domain's configuration file which specifies which cache colors this domain should use. (I have done this, and when I use xl create --dry-run, I can see the parameters are parsed into the build information.)
2) hypervisor change: Add another hypercall xc_domain_populate_physmap_exact_ca() which has one more parameter, i.e., the cache colors this domain should use. I also need to reserve a memory pool which sorts the reserved memory pages based on their cache color.
When a PV domain is created, I can specify the cache colors it uses. Then the xl tool will call xc_domain_populate_physmap_exact_ca() to allocate only memory pages with the specified cache colors to this domain.

[Quick implementation]
I attached my quick implementation patch at the end of this email.

[Issues and Questions]
After I applied the patch to Xen's commit point 36174af3fbeb1b662c0eadbfa193e77f68cc955b and ran it on my machine, dom0 cannot boot up. :-( The error message from dom0 is:
[0.00] Kernel panic - not syncing: Failed to get contiguous memory for DMA from Xen!
[0.00] You either: don't have the permissions, do not have enough free memory under 4GB, or the hypervisor memory is too fragmented! (rc:-12)
I tried to print every message in the functions I touched in order to figure out where it goes wrong, but failed.
:-( The thing I cannot understand is this: my implementation hasn't reserved any memory pages in the cache-aware memory pool before the system boots up. Basically, none of the functions I modified has been called before the system boots up. But the system crashes. :-( (The system can boot up and work perfectly before applying my patch.)

I would really appreciate it if any of you could point out the part I missed or misunderstood. :-)

Thank you very very much!

Best,
Meng

The full crash message is as follows:
Xen 4.5.0-rc
(XEN) Xen version 4.5.0-rc (root@) (gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3) debug=y Sun Jan 11 11:39:23 EST 2015
(XEN) Latest ChangeSet: Sun Jan 4 22:19:40 2015 -0500 git:962a13f-dirty
(XEN) Bootloader: GRUB 1.99-21ubuntu3.14
(XEN) Command line: placeholder dom0_memory=512M sched=credit console=tty0 com1=115200n8 console=com1
(XEN) Video information:
(XEN) VGA is text mode 80x25, font 8x16
(XEN) Disc information:
(XEN) Found 1 MBR signatures
(XEN) Found 1 EDD information structures
(XEN) Xen-e820 RAM map:
(XEN) - 0009fc00 (usable)
(XEN) 0009fc00 - 000a (reserved)
(XEN) 000f -
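As background for the page coloring idea described in this message, the usual way to derive a page's cache color is sketched below. This is a generic illustration and not the attached patch: struct cache_geom, nr_colors() and mfn_to_color() are hypothetical names, the geometry has to be supplied by the caller, and the model assumes a physically indexed cache with no address hashing, which may not hold on CPUs that hash addresses across LLC slices.

```c
#include <stdint.h>

/* Hypothetical cache description; real values come from CPUID or a data sheet. */
struct cache_geom {
    uint64_t size_bytes;        /* total cache size */
    unsigned int ways;          /* associativity */
    unsigned int page_shift;    /* log2 of the page size */
};

/* Number of colors = bytes covered by one way / page size. */
static uint64_t nr_colors(const struct cache_geom *g)
{
    return (g->size_bytes / g->ways) >> g->page_shift;
}

/* Color of a machine frame: frame number modulo the number of colors. */
static uint64_t mfn_to_color(const struct cache_geom *g, uint64_t mfn)
{
    return mfn % nr_colors(g);
}
```

Under these assumptions, two frames with different colors can never map to the same cache sets, so giving each domain a disjoint set of colors partitions the cache; a color-aware pool would simply bucket free frames by mfn_to_color().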
Re: [Xen-devel] [PATCH v2] [Bugfix] x86/apic: Fix xen IRQ allocation failure caused by commit b81975eade8c
On Tue, 13 Jan 2015, Sander Eikelenboom wrote: Monday, January 12, 2015, 4:01:00 PM, you wrote: On 12/01/15 13:39, Jiang Liu wrote: Commit b81975eade8c (x86, irq: Clean up irqdomain transition code) breaks xen IRQ allocation because xen_smp_prepare_cpus() doesn't invoke setup_IO_APIC(), so no irqdomains created for IOAPICs and mp_map_pin_to_irq() fails at the very beginning. Enhance xen_smp_prepare_cpus() to call setup_IO_APIC() to initialize irqdomain for IOAPICs. Having Xen call setup_IO_APIC() to initialize the irq domains then having to add special cases to it is just wrong. The bits of init deferred by mp_register_apic() are also deferred to two different places which looks odd. What about something like the following (untested) patch? Hi David / Gerry, David's patch (after fixing a few compile issues) fixes the problem. The power button now works for me on: - intel baremetal - intel xen - amd baremetal (no issues with the override anymore) - amd xen (no freeze issues anymore) Can someone please send a proper patch with changelog? Thanks, tglx ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [Xen-users] [TestDay] minor bug + possible configuration bug 4.5rc4 archlinux
On Mon, Jan 12, Ian Campbell wrote: @devs -- we obviously need to do something about this (too late for 4.5, but for 4.6 + backport). Perhaps there is some alternative systemd construction which disassociates the actual path from the abstract service xenstored dir mounted? I don't think we can do anything about this systemd brain damage. Either it gets its Where= from such a line within the file, or it gets its Where= from the filename, in which case it has to stop looking at a Where= line. In any case, it's wrong to use --localstatedir=/tmpfs-mount-point because that means all mails in the spool subdirectory are in danger. If that's the mindset of ArchLinux, all we can do is recommend to stop using it for any serious task. Olaf ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 00/11] Alternate p2m: support multiple copies of host p2m
On 12.01.15 at 18:36, edmund.h.wh...@intel.com wrote:
> On 01/12/2015 02:00 AM, Jan Beulich wrote:
>> On 10.01.15 at 00:04, edmund.h.wh...@intel.com wrote:
>>> On 01/09/2015 02:41 PM, Andrew Cooper wrote:
>>>> Having some non-OS part of the guest swap the EPT tables and accidentally turn a DMA buffer read-only is not going to end well.
>>>
>>> The agent can certainly do bad things, and at some level you have to assume it is sensible enough not to. However, I'm not sure this is fundamentally more dangerous than what a privileged domain can do today using the MEMOP... operations, and people are already using those for very similar purposes.
>>
>> I don't follow - how is what privileged domain can do related to the proposed changes here (which are - via VMFUNC - at least partially guest controllable, and that's also the case Andrew mentioned in his reply)?
>>
>> I'm having a hard time understanding how a P2M stripped of anything that's not plain RAM can be very useful to a guest. IOW without such fundamental aspects clarified I don't see a point in looking at the individual patches (which btw, according to your wording elsewhere, should have been marked RFC).
>
> In this patch series, none of the new hypercalls are protected by xsm policies. Earlier in the process of working on this code, I added such a check to all the hypercalls, but then removed them all because it dawned on me that I didn't actually understand what I was doing and my code only worked because I only ever built the dummy permit everything policy. Should some version of this patch series be accepted, my hope is that someone who does understand xsm policies would put the appropriate checks in place, and at that point I maintain that these extra capabilities would not be fundamentally more dangerous than existing mechanisms available to privileged domains, because policy can prevent the guest using vmfunc. That's obviously not true today.

Please simply consult with the XSM maintainer on questions/issues like this. Proposing a partial (insecure) patch set isn't appropriate.

> The alternate p2m's only contain entries for ram pages with valid mfn's. All other page types are still handled in the nested page fault handler for the host p2m. Those pages (at least the ones I've encountered) don't require the hardware to have a valid EPTE for the page.

I.e. the functionality requiring e.g. p2m_ram_logdirty and p2m_mmio_direct is then incompatible with your proposed additions (which I think was also already noted by Andrew). That's imo not a basis to think about accepting (or even reviewing) the series.

Jan
[Xen-devel] [PATCH v3 5/5] tools: add total/local memory bandwith monitoring
Add Memory Bandwidth Monitoring(MBM) for VMs. Two types of monitoring
are supported: total and local memory bandwidth monitoring. To use it,
CMT should be enabled in hypervisor.

Signed-off-by: Chao Peng chao.p.p...@linux.intel.com
---
 docs/man/xl.pod.1             |    9 +
 tools/libxc/include/xenctrl.h |    2 +
 tools/libxc/xc_psr.c          |    8
 tools/libxl/libxl.h           |    8
 tools/libxl/libxl_psr.c       |   84 +
 tools/libxl/libxl_types.idl   |    2 +
 tools/libxl/xl_cmdimpl.c      |   21 ++-
 tools/libxl/xl_cmdtable.c     |    4 +-
 8 files changed, 136 insertions(+), 2 deletions(-)

diff --git a/docs/man/xl.pod.1 b/docs/man/xl.pod.1
index 6b89ba8..0370625 100644
--- a/docs/man/xl.pod.1
+++ b/docs/man/xl.pod.1
@@ -1461,6 +1461,13 @@ is domain level. To monitor a specific domain, just attach the domain id with
 the monitoring service. When the domain doesn't need to be monitored any more,
 detach the domain id from the monitoring service.
 
+Intel Broadwell and later server platforms also offer total/local memory
+bandwidth monitoring. Xen supports per-domain monitoring for these two
+additional monitoring types. Both memory bandwidth monitoring and L3 cache
+occupancy monitoring share the same set of underground monitoring service. Once
+a domain is attached to the monitoring service, monitoring data can be showed
+for any of these monitoring types.
+
 =over 4
 
 =item B<psr-cmt-attach> [I<domain-id>]
@@ -1476,6 +1483,8 @@ detach: Detach the platform shared resource monitoring service from a domain.
 Show monitoring data for a certain domain or all domains. Current supported
 monitor types are:
  - cache-occupancy: showing the L3 cache occupancy.
+ - total-mem-bandwidth: showing the total memory bandwidth.
+ - local-mem-bandwidth: showing the local memory bandwidth.
 
 =back
 
diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index c6e9e3e..06366b5 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -2688,6 +2688,8 @@ int xc_resource_op(xc_interface *xch, uint32_t nr_ops, xc_resource_op_t *ops);
 #if defined(__i386__) || defined(__x86_64__)
 enum xc_psr_cmt_type {
     XC_PSR_CMT_L3_OCCUPANCY,
+    XC_PSR_CMT_TOTAL_MEM_BANDWIDTH,
+    XC_PSR_CMT_LOCAL_MEM_BANDWIDTH,
 };
 typedef enum xc_psr_cmt_type xc_psr_cmt_type;
 int xc_psr_cmt_attach(xc_interface *xch, uint32_t domid);
diff --git a/tools/libxc/xc_psr.c b/tools/libxc/xc_psr.c
index a9881a4..5858693 100644
--- a/tools/libxc/xc_psr.c
+++ b/tools/libxc/xc_psr.c
@@ -23,6 +23,8 @@
 #define IA32_CMT_CTR_ERROR_MASK         (0x3ull << 62)
 
 #define EVTID_L3_OCCUPANCY             0x1
+#define EVTID_TOTAL_MEM_BANDWIDTH      0x2
+#define EVTID_LOCAL_MEM_BANDWIDTH      0x3
 
 int xc_psr_cmt_attach(xc_interface *xch, uint32_t domid)
 {
@@ -168,6 +170,12 @@ int xc_psr_cmt_get_data(xc_interface *xch, uint32_t rmid, uint32_t cpu,
     case XC_PSR_CMT_L3_OCCUPANCY:
         evtid = EVTID_L3_OCCUPANCY;
         break;
+    case XC_PSR_CMT_TOTAL_MEM_BANDWIDTH:
+        evtid = EVTID_TOTAL_MEM_BANDWIDTH;
+        break;
+    case XC_PSR_CMT_LOCAL_MEM_BANDWIDTH:
+        evtid = EVTID_LOCAL_MEM_BANDWIDTH;
+        break;
     default:
         return -1;
     }
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 596d2a0..347ef52 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -1462,6 +1462,14 @@ int libxl_psr_cmt_get_cache_occupancy(libxl_ctx *ctx,
                                       uint32_t domid,
                                       uint32_t socketid,
                                       uint32_t *l3_cache_occupancy);
+int libxl_psr_cmt_get_total_mem_bandwidth(libxl_ctx *ctx,
+                                          uint32_t domid,
+                                          uint32_t socketid,
+                                          uint32_t *bandwidth);
+int libxl_psr_cmt_get_local_mem_bandwidth(libxl_ctx *ctx,
+                                          uint32_t domid,
+                                          uint32_t socketid,
+                                          uint32_t *bandwidth);
 #endif
 
 /* misc */
diff --git a/tools/libxl/libxl_psr.c b/tools/libxl/libxl_psr.c
index c88c421..0c3e4e6 100644
--- a/tools/libxl/libxl_psr.c
+++ b/tools/libxl/libxl_psr.c
@@ -18,6 +18,7 @@
 
 #define IA32_QM_CTR_ERROR_MASK         (0x3ul << 62)
+#define MBM_SAMPLE_RETRY_MAX 4
 
 static void libxl__psr_cmt_log_err_msg(libxl__gc *gc, int err)
 {
@@ -240,6 +241,89 @@ out:
     return rc;
 }
 
+static int libxl__psr_cmt_get_mem_bandwidth(libxl__gc *gc,
+                                            uint32_t domid,
+                                            xc_psr_cmt_type type,
+                                            uint32_t socketid,
+                                            uint32_t *bandwidth)
+{
+    uint64_t sample1, sample2;
+    uint32_t upscaling_factor;
+    int retry_attempts = 0;
+    int rc;
+
[Xen-devel] [PATCH v3 4/5] tools: code refactoring for MBM
Make some internal routines common so that total/local memory bandwidth
monitoring in the next patch can make use of them.

Signed-off-by: Chao Peng chao.p.p...@linux.intel.com
---
 tools/libxl/libxl_psr.c  |   44 -
 tools/libxl/xl_cmdimpl.c |   54 +++---
 2 files changed, 61 insertions(+), 37 deletions(-)

diff --git a/tools/libxl/libxl_psr.c b/tools/libxl/libxl_psr.c
index 84819e6..c88c421 100644
--- a/tools/libxl/libxl_psr.c
+++ b/tools/libxl/libxl_psr.c
@@ -176,20 +176,16 @@ int libxl_psr_cmt_get_l3_event_mask(libxl_ctx *ctx, uint32_t *event_mask)
     return rc;
 }
 
-int libxl_psr_cmt_get_cache_occupancy(libxl_ctx *ctx,
-                                      uint32_t domid,
-                                      uint32_t socketid,
-                                      uint32_t *l3_cache_occupancy)
+static int libxl__psr_cmt_get_l3_monitoring_data(libxl__gc *gc,
+                                                 uint32_t domid,
+                                                 xc_psr_cmt_type type,
+                                                 uint32_t socketid,
+                                                 uint64_t *data)
 {
-    GC_INIT(ctx);
-
     unsigned int rmid;
-    uint32_t upscaling_factor;
-    uint64_t monitor_data;
     int cpu, rc;
-    xc_psr_cmt_type type;
 
-    rc = xc_psr_cmt_get_domain_rmid(ctx->xch, domid, &rmid);
+    rc = xc_psr_cmt_get_domain_rmid(CTX->xch, domid, &rmid);
     if (rc < 0 || rmid == 0) {
         LOGE(ERROR, "fail to get the domain rmid, "
             "or domain is not attached with platform QoS monitoring service");
@@ -204,14 +200,32 @@ int libxl_psr_cmt_get_cache_occupancy(libxl_ctx *ctx,
         goto out;
     }
 
-    type = XC_PSR_CMT_L3_OCCUPANCY;
-    rc = xc_psr_cmt_get_data(ctx->xch, rmid, cpu, type, &monitor_data);
+    rc = xc_psr_cmt_get_data(CTX->xch, rmid, cpu, type, data);
     if (rc < 0) {
         LOGE(ERROR, "failed to get monitoring data");
         rc = ERROR_FAIL;
-        goto out;
     }
 
+out:
+    return rc;
+}
+
+int libxl_psr_cmt_get_cache_occupancy(libxl_ctx *ctx,
+                                      uint32_t domid,
+                                      uint32_t socketid,
+                                      uint32_t *l3_cache_occupancy)
+{
+    GC_INIT(ctx);
+    uint64_t data;
+    uint32_t upscaling_factor;
+    int rc;
+
+    rc = libxl__psr_cmt_get_l3_monitoring_data(gc, domid,
+                                               XC_PSR_CMT_L3_OCCUPANCY,
+                                               socketid, &data);
+    if (rc < 0)
+        goto out;
+
     rc = xc_psr_cmt_get_l3_upscaling_factor(ctx->xch, &upscaling_factor);
     if (rc < 0) {
         LOGE(ERROR, "failed to get L3 upscaling factor");
@@ -219,8 +233,8 @@ int libxl_psr_cmt_get_cache_occupancy(libxl_ctx *ctx,
         goto out;
     }
 
-    *l3_cache_occupancy = upscaling_factor * monitor_data / 1024;
-    rc = 0;
+    *l3_cache_occupancy = upscaling_factor * data / 1024;
+
 out:
     GC_FREE;
     return rc;
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 24f3c8d..09ca73e 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -7846,12 +7846,13 @@ out:
 }
 
 #ifdef LIBXL_HAVE_PSR_CMT
-static void psr_cmt_print_domain_cache_occupancy(libxl_dominfo *dominfo,
-                                                 uint32_t nr_sockets)
+static void psr_cmt_print_domain_l3_info(libxl_dominfo *dominfo,
+                                         libxl_psr_cmt_type type,
+                                         uint32_t nr_sockets)
 {
     char *domain_name;
     uint32_t socketid;
-    uint32_t l3_cache_occupancy;
+    uint32_t data;
 
     if (!libxl_psr_cmt_domain_attached(ctx, dominfo->domid))
         return;
@@ -7861,15 +7862,21 @@ static void psr_cmt_print_domain_cache_occupancy(libxl_dominfo *dominfo,
     free(domain_name);
 
     for (socketid = 0; socketid < nr_sockets; socketid++) {
-        if (!libxl_psr_cmt_get_cache_occupancy(ctx, dominfo->domid, socketid,
-                                               &l3_cache_occupancy))
-            printf("%13u KB", l3_cache_occupancy);
+        switch (type) {
+        case LIBXL_PSR_CMT_TYPE_CACHE_OCCUPANCY:
+            if (!libxl_psr_cmt_get_cache_occupancy(ctx, dominfo->domid,
+                                                   socketid, &data))
+                printf("%13u KB", data);
+            break;
+        default:
+            return;
+        }
     }
 
     printf("\n");
 }
 
-static int psr_cmt_show_cache_occupancy(uint32_t domid)
+static int psr_cmt_show_l3_info(libxl_psr_cmt_type type, uint32_t domid)
 {
     uint32_t i, socketid, nr_sockets, total_rmid;
     uint32_t l3_cache_size;
@@ -7905,19 +7912,22 @@ static int psr_cmt_show_cache_occupancy(uint32_t domid)
         printf("%14s %d", "Socket", socketid);
     printf("\n");
 
-    /* Total L3 cache size */
-    printf("%-46s", "Total L3 Cache Size");
-
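As a side note on the occupancy conversion kept by this refactoring (raw counter times upscaling factor, divided by 1024), here is a minimal standalone sketch of the same arithmetic. It is not code from the series: occupancy_kb() is a hypothetical helper name, and the sketch assumes the upscaling factor converts counter units to bytes, which is how I read the "/ 1024" and the "KB" in the printf format.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: convert a raw L3 occupancy
 * counter value to KB.
 * ASSUMPTION: raw * upscaling_factor yields bytes.
 */
static uint32_t occupancy_kb(uint32_t upscaling_factor, uint64_t raw)
{
    /* Do the multiplication in 64 bits so it cannot wrap at 32 bits. */
    uint64_t bytes = (uint64_t)upscaling_factor * raw;
    uint64_t kb = bytes / 1024;

    /* The libxl interface returns a uint32_t, so the value must fit. */
    assert(kb <= UINT32_MAX);
    return (uint32_t)kb;
}
```

For instance, a counter value of 16 with an upscaling factor of 65536 corresponds to 1024 KB. The explicit 64-bit intermediate only makes visible what the expression in the patch already does implicitly, since data is a uint64_t there.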
[Xen-devel] [PATCH v3 1/5] x86: expose CMT L3 event mask to user space
L3 event mask indicates the event types supported in host, including
cache occupancy event as well as local/total memory bandwidth events
for Memory Bandwidth Monitoring(MBM). Expose it so all these events can
be monitored in user space.

Signed-off-by: Chao Peng chao.p.p...@linux.intel.com
Reviewed-by: Andrew Cooper andrew.coop...@citrix.com
Acked-by: Jan Beulich jbeul...@suse.com
---
 xen/arch/x86/sysctl.c       |    3 +++
 xen/include/public/sysctl.h |    1 +
 2 files changed, 4 insertions(+)

diff --git a/xen/arch/x86/sysctl.c b/xen/arch/x86/sysctl.c
index 57ad992..611a291 100644
--- a/xen/arch/x86/sysctl.c
+++ b/xen/arch/x86/sysctl.c
@@ -157,6 +157,9 @@ long arch_do_sysctl(
             sysctl->u.psr_cmt_op.u.data = (ret ? 0 : info.size);
             break;
         }
+        case XEN_SYSCTL_PSR_CMT_get_l3_event_mask:
+            sysctl->u.psr_cmt_op.u.data = psr_cmt->l3.features;
+            break;
         default:
             sysctl->u.psr_cmt_op.u.data = 0;
             ret = -ENOSYS;
diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
index b3713b3..8552dc6 100644
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -641,6 +641,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_sysctl_coverage_op_t);
 /* The L3 cache size is returned in KB unit */
 #define XEN_SYSCTL_PSR_CMT_get_l3_cache_size         2
 #define XEN_SYSCTL_PSR_CMT_enabled                   3
+#define XEN_SYSCTL_PSR_CMT_get_l3_event_mask         4
 struct xen_sysctl_psr_cmt_op {
     uint32_t cmd;       /* IN: XEN_SYSCTL_PSR_CMT_* */
     uint32_t flags;     /* padding variable, may be extended for future use */
-- 
1.7.9.5
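For readers wondering how user space might consume the exposed mask, a minimal sketch follows. It is not part of this series. The event IDs are the ones defined in tools/libxc/xc_psr.c by patch 5/5; l3_event_supported() is a hypothetical helper, and the mapping "event ID n is advertised in bit n - 1" is an assumption based on my reading of the CPUID leaf 0xF, sub-leaf 1 EDX layout, which the patch description itself does not spell out.

```c
#include <assert.h>
#include <stdint.h>

/* Event IDs as defined in tools/libxc/xc_psr.c by patch 5/5. */
#define EVTID_L3_OCCUPANCY        0x1
#define EVTID_TOTAL_MEM_BANDWIDTH 0x2
#define EVTID_LOCAL_MEM_BANDWIDTH 0x3

/*
 * Hypothetical helper, not from the series.
 * ASSUMPTION: event ID n is advertised by bit (n - 1) of the L3 event mask.
 * Returns 1 if the event is supported, 0 otherwise.
 */
static int l3_event_supported(uint32_t event_mask, unsigned int evtid)
{
    assert(evtid >= 1 && evtid <= 32);
    return (int)((event_mask >> (evtid - 1)) & 1u);
}
```

Under that assumption, a host reporting a mask of 0x7 supports all three events, while a CMT-only host would report 0x1 and a tool could refuse the bandwidth options up front instead of failing later when reading the counter.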
[Xen-devel] [PATCH v3 3/5] tools: correct coding style for psr
- space: remove space after '(' or before ')' in 'if' condition; - indention: align function definition/call arguments; Signed-off-by: Chao Peng chao.p.p...@linux.intel.com Acked-by: Wei Liu wei.l...@citrix.com --- tools/libxc/include/xenctrl.h |8 tools/libxc/xc_psr.c | 10 +- tools/libxl/libxl.h | 11 +++ tools/libxl/libxl_psr.c | 11 +++ tools/libxl/xl_cmdimpl.c | 11 ++- 5 files changed, 29 insertions(+), 22 deletions(-) diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h index 96b357c..c6e9e3e 100644 --- a/tools/libxc/include/xenctrl.h +++ b/tools/libxc/include/xenctrl.h @@ -2693,15 +2693,15 @@ typedef enum xc_psr_cmt_type xc_psr_cmt_type; int xc_psr_cmt_attach(xc_interface *xch, uint32_t domid); int xc_psr_cmt_detach(xc_interface *xch, uint32_t domid); int xc_psr_cmt_get_domain_rmid(xc_interface *xch, uint32_t domid, -uint32_t *rmid); + uint32_t *rmid); int xc_psr_cmt_get_total_rmid(xc_interface *xch, uint32_t *total_rmid); int xc_psr_cmt_get_l3_upscaling_factor(xc_interface *xch, -uint32_t *upscaling_factor); + uint32_t *upscaling_factor); int xc_psr_cmt_get_l3_event_mask(xc_interface *xch, uint32_t *event_mask); int xc_psr_cmt_get_l3_cache_size(xc_interface *xch, uint32_t cpu, uint32_t *l3_cache_size); -int xc_psr_cmt_get_data(xc_interface *xch, uint32_t rmid, -uint32_t cpu, uint32_t psr_cmt_type, uint64_t *monitor_data); +int xc_psr_cmt_get_data(xc_interface *xch, uint32_t rmid, uint32_t cpu, +uint32_t psr_cmt_type, uint64_t *monitor_data); int xc_psr_cmt_enabled(xc_interface *xch); #endif diff --git a/tools/libxc/xc_psr.c b/tools/libxc/xc_psr.c index ac19fe4..a9881a4 100644 --- a/tools/libxc/xc_psr.c +++ b/tools/libxc/xc_psr.c @@ -47,7 +47,7 @@ int xc_psr_cmt_detach(xc_interface *xch, uint32_t domid) } int xc_psr_cmt_get_domain_rmid(xc_interface *xch, uint32_t domid, -uint32_t *rmid) + uint32_t *rmid) { int rc; DECLARE_DOMCTL; @@ -88,7 +88,7 @@ int xc_psr_cmt_get_total_rmid(xc_interface *xch, uint32_t *total_rmid) } int 
xc_psr_cmt_get_l3_upscaling_factor(xc_interface *xch, -uint32_t *upscaling_factor) + uint32_t *upscaling_factor) { static int val = 0; int rc; @@ -130,7 +130,7 @@ int xc_psr_cmt_get_l3_event_mask(xc_interface *xch, uint32_t *event_mask) } int xc_psr_cmt_get_l3_cache_size(xc_interface *xch, uint32_t cpu, - uint32_t *l3_cache_size) + uint32_t *l3_cache_size) { static int val = 0; int rc; @@ -155,8 +155,8 @@ int xc_psr_cmt_get_l3_cache_size(xc_interface *xch, uint32_t cpu, return rc; } -int xc_psr_cmt_get_data(xc_interface *xch, uint32_t rmid, -uint32_t cpu, xc_psr_cmt_type type, uint64_t *monitor_data) +int xc_psr_cmt_get_data(xc_interface *xch, uint32_t rmid, uint32_t cpu, +xc_psr_cmt_type type, uint64_t *monitor_data) { xc_resource_op_t op; xc_resource_entry_t entries[2]; diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h index c9a64f9..596d2a0 100644 --- a/tools/libxl/libxl.h +++ b/tools/libxl/libxl.h @@ -1454,11 +1454,14 @@ int libxl_psr_cmt_detach(libxl_ctx *ctx, uint32_t domid); int libxl_psr_cmt_domain_attached(libxl_ctx *ctx, uint32_t domid); int libxl_psr_cmt_enabled(libxl_ctx *ctx); int libxl_psr_cmt_get_total_rmid(libxl_ctx *ctx, uint32_t *total_rmid); -int libxl_psr_cmt_get_l3_cache_size(libxl_ctx *ctx, uint32_t socketid, -uint32_t *l3_cache_size); +int libxl_psr_cmt_get_l3_cache_size(libxl_ctx *ctx, +uint32_t socketid, +uint32_t *l3_cache_size); int libxl_psr_cmt_get_l3_event_mask(libxl_ctx *ctx, uint32_t *event_mask); -int libxl_psr_cmt_get_cache_occupancy(libxl_ctx *ctx, uint32_t domid, -uint32_t socketid, uint32_t *l3_cache_occupancy); +int libxl_psr_cmt_get_cache_occupancy(libxl_ctx *ctx, + uint32_t domid, + uint32_t socketid, + uint32_t *l3_cache_occupancy); #endif /* misc */ diff --git a/tools/libxl/libxl_psr.c b/tools/libxl/libxl_psr.c index 07f2aee..84819e6 100644 --- a/tools/libxl/libxl_psr.c +++ b/tools/libxl/libxl_psr.c @@ -135,8 +135,9 @@ int libxl_psr_cmt_get_total_rmid(libxl_ctx *ctx, uint32_t *total_rmid) return rc; } -int 
libxl_psr_cmt_get_l3_cache_size(libxl_ctx *ctx, uint32_t socketid, - uint32_t *l3_cache_size) +int libxl_psr_cmt_get_l3_cache_size(libxl_ctx *ctx, +uint32_t socketid, +uint32_t
[Xen-devel] [PATCH v3 0/5] enable Memory Bandwidth Monitoring (MBM) for VMs
Changes from v2:
* Remove the usage of static to cache data in xc;
  NOTE: Other places that already existed before are not touched, because
  changing them would need an API change. Will fix in a separate patch if
  desirable.
* Coding style;

Changes from v1:
* Move event type check from xc to xl;
* Add retry capability for MBM sampling;
* Fix Coding style/docs;

Intel Memory Bandwidth Monitoring(MBM) is a new hardware feature which
builds on the CMT infrastructure to allow monitoring of system memory
bandwidth. Event codes are provided to monitor both total and local
bandwidth, meaning bandwidth over QPI and other external links can be
monitored.

For XEN, MBM is used to monitor memory bandwidth for VMs. Due to its
dependency on CMT, the software also makes use of most of the CMT code.
Actually, besides introducing two additional events and some cpuid
feature bits, there are no extra changes compared to cache occupancy
monitoring in CMT. Because of this, CMT must be enabled first to use
this feature.

For interface changes, the patch series only introduces a new command
XEN_SYSCTL_PSR_CMT_get_l3_event_mask which exposes MBM feature
capability to user space and introduces two additional options for
xl psr-cmt-show:
    total_mem_bandwidth: Show total memory bandwidth
    local_mem_bandwidth: Show local memory bandwidth

The usage flow remains the same as with CMT.

Chao Peng (5):
  x86: expose CMT L3 event mask to user space
  tools: add routine to get CMT L3 event mask
  tools: correct coding style for psr
  tools: code refactoring for MBM
  tools: add total/local memory bandwith monitoring

 docs/man/xl.pod.1             |    9 +++
 tools/libxc/include/xenctrl.h |   11 ++--
 tools/libxc/xc_psr.c          |   35 --
 tools/libxl/libxl.h           |   20 --
 tools/libxl/libxl_psr.c       |  142 +
 tools/libxl/libxl_types.idl   |    2 +
 tools/libxl/xl_cmdimpl.c      |   72 +++--
 tools/libxl/xl_cmdtable.c     |    4 +-
 xen/arch/x86/sysctl.c         |    3 +
 xen/include/public/sysctl.h   |    1 +
 10 files changed, 251 insertions(+), 48 deletions(-)

-- 
1.7.9.5
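The changelog above mentions a retry capability for MBM sampling. To illustrate the general idea only, here is a minimal sketch of taking two counter readings and retrying when the second is smaller than the first (for example because the counter wrapped). This is explicitly not the implementation from patch 5/5, whose body is not shown in full here. Everything except the retry limit of 4 (mirroring MBM_SAMPLE_RETRY_MAX) is a hypothetical name, and the replay helpers exist only so the sketch can be exercised without hardware.

```c
#include <assert.h>
#include <stdint.h>

#define SAMPLE_RETRY_MAX 4  /* mirrors MBM_SAMPLE_RETRY_MAX in patch 5/5 */

/* Hypothetical callback: reads the raw counter, returns 0 on success. */
typedef int (*sample_fn)(void *opaque, uint64_t *value);

/*
 * Take two readings and store their difference, scaled by the upscaling
 * factor, in *delta. If the second reading is smaller than the first,
 * discard both and try again, up to SAMPLE_RETRY_MAX attempts.
 * Returns 0 on success, -1 on read failure or when all attempts fail.
 */
static int counter_delta_bytes(sample_fn read_counter, void *opaque,
                               uint32_t upscaling_factor, uint64_t *delta)
{
    int attempt;

    for (attempt = 0; attempt < SAMPLE_RETRY_MAX; attempt++) {
        uint64_t s1, s2;

        if (read_counter(opaque, &s1))
            return -1;
        /* A real caller would wait a known interval here. */
        if (read_counter(opaque, &s2))
            return -1;

        if (s2 >= s1) {
            *delta = (s2 - s1) * upscaling_factor;
            return 0;
        }
    }
    return -1;
}

/* Test double: replays a fixed sequence of counter values. */
struct replay {
    const uint64_t *values;
    unsigned int count, next;
};

static int replay_read(void *opaque, uint64_t *value)
{
    struct replay *r = opaque;

    if (r->next >= r->count)
        return -1;
    *value = r->values[r->next++];
    return 0;
}

/* Run the sampler over a replayed sequence; UINT64_MAX signals failure. */
static uint64_t replay_delta(const uint64_t *values, unsigned int count,
                             uint32_t upscaling_factor)
{
    struct replay r = { values, count, 0 };
    uint64_t delta;

    if (counter_delta_bytes(replay_read, &r, upscaling_factor, &delta))
        return UINT64_MAX;
    return delta;
}
```

Dividing the resulting byte count by the sampling interval gives a bandwidth figure. Bounding the retries keeps a misbehaving counter from stalling the tool indefinitely, at the cost of occasionally reporting a failure.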
[Xen-devel] [PATCH v3 2/5] tools: add routine to get CMT L3 event mask
This is the tools side wrapper for XEN_SYSCTL_PSR_CMT_get_l3_event_mask
of XEN_SYSCTL_psr_cmt_op.

Signed-off-by: Chao Peng chao.p.p...@linux.intel.com
---
 tools/libxc/include/xenctrl.h |    1 +
 tools/libxc/xc_psr.c          |   17 +
 tools/libxl/libxl.h           |    1 +
 tools/libxl/libxl_psr.c       |   15 +++
 4 files changed, 34 insertions(+)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 0ad8b8d..96b357c 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -2697,6 +2697,7 @@ int xc_psr_cmt_get_domain_rmid(xc_interface *xch, uint32_t domid,
 int xc_psr_cmt_get_total_rmid(xc_interface *xch, uint32_t *total_rmid);
 int xc_psr_cmt_get_l3_upscaling_factor(xc_interface *xch,
     uint32_t *upscaling_factor);
+int xc_psr_cmt_get_l3_event_mask(xc_interface *xch, uint32_t *event_mask);
 int xc_psr_cmt_get_l3_cache_size(xc_interface *xch, uint32_t cpu,
     uint32_t *l3_cache_size);
 int xc_psr_cmt_get_data(xc_interface *xch, uint32_t rmid,
diff --git a/tools/libxc/xc_psr.c b/tools/libxc/xc_psr.c
index 872e6dc..ac19fe4 100644
--- a/tools/libxc/xc_psr.c
+++ b/tools/libxc/xc_psr.c
@@ -112,6 +112,23 @@ int xc_psr_cmt_get_l3_upscaling_factor(xc_interface *xch,
     return rc;
 }
 
+int xc_psr_cmt_get_l3_event_mask(xc_interface *xch, uint32_t *event_mask)
+{
+    int rc;
+    DECLARE_SYSCTL;
+
+    sysctl.cmd = XEN_SYSCTL_psr_cmt_op;
+    sysctl.u.psr_cmt_op.cmd =
+        XEN_SYSCTL_PSR_CMT_get_l3_event_mask;
+    sysctl.u.psr_cmt_op.flags = 0;
+
+    rc = xc_sysctl(xch, &sysctl);
+    if ( !rc )
+        *event_mask = sysctl.u.psr_cmt_op.u.data;
+
+    return rc;
+}
+
 int xc_psr_cmt_get_l3_cache_size(xc_interface *xch, uint32_t cpu,
     uint32_t *l3_cache_size)
 {
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 0a123f1..c9a64f9 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -1456,6 +1456,7 @@ int libxl_psr_cmt_enabled(libxl_ctx *ctx);
 int libxl_psr_cmt_get_total_rmid(libxl_ctx *ctx, uint32_t *total_rmid);
 int libxl_psr_cmt_get_l3_cache_size(libxl_ctx *ctx, uint32_t socketid,
     uint32_t *l3_cache_size);
+int libxl_psr_cmt_get_l3_event_mask(libxl_ctx *ctx, uint32_t *event_mask);
 int libxl_psr_cmt_get_cache_occupancy(libxl_ctx *ctx, uint32_t domid,
     uint32_t socketid, uint32_t *l3_cache_occupancy);
 #endif
diff --git a/tools/libxl/libxl_psr.c b/tools/libxl/libxl_psr.c
index 0437465..07f2aee 100644
--- a/tools/libxl/libxl_psr.c
+++ b/tools/libxl/libxl_psr.c
@@ -160,6 +160,21 @@ out:
     return rc;
 }
 
+int libxl_psr_cmt_get_l3_event_mask(libxl_ctx *ctx, uint32_t *event_mask)
+{
+    GC_INIT(ctx);
+    int rc;
+
+    rc = xc_psr_cmt_get_l3_event_mask(ctx->xch, event_mask);
+    if (rc < 0) {
+        libxl__psr_cmt_log_err_msg(gc, errno);
+        rc = ERROR_FAIL;
+    }
+
+    GC_FREE;
+    return rc;
+}
+
 int libxl_psr_cmt_get_cache_occupancy(libxl_ctx *ctx, uint32_t domid,
     uint32_t socketid, uint32_t *l3_cache_occupancy)
 {
-- 
1.7.9.5
[Xen-devel] [PATCH] xen-time: decreasing the rating of the xen clocksource below that of the tsc clocksource for dom0's
From: Palik, Imre im...@amazon.de

In Dom0, using the TSC clocksource (whenever it is stable enough to be
used) instead of the Xen clocksource should not cause any issues, as
Dom0 VMs are never live-migrated. The TSC clocksource is somewhat more
efficient than the Xen paravirtualised clocksource, thus it should have
a higher rating.

This patch decreases the rating of the Xen clocksource in Dom0 to 275,
which is halfway between the rating of the TSC clocksource (300) and
the hpet clocksource (250).

Cc: Anthony Liguori aligu...@amazon.com
Signed-off-by: Imre Palik im...@amazon.de
---
 arch/x86/xen/time.c |    4 
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c
index f473d26..c768726 100644
--- a/arch/x86/xen/time.c
+++ b/arch/x86/xen/time.c
@@ -487,6 +487,10 @@ static void __init xen_time_init(void)
 	int cpu = smp_processor_id();
 	struct timespec tp;
 
+	/* As Dom0 is never moved, no penalty on using TSC there */
+	if (xen_initial_domain())
+		xen_clocksource.rating = 275;
+
 	clocksource_register_hz(&xen_clocksource, NSEC_PER_SEC);
 
 	if (HYPERVISOR_vcpu_op(VCPUOP_stop_periodic_timer, cpu, NULL) == 0) {
-- 
1.7.9.5
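To make the effect of the rating change concrete, here is a small model of "the usable clocksource with the highest rating wins". It is a deliberately simplified sketch, not the kernel's clocksource selection code: struct clock_candidate, pick_clocksource() and dom0_choice() are hypothetical names, and the ratings are the ones quoted in the patch description (TSC 300, Xen 275 in Dom0 with this patch, hpet 250).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the kernel's struct clocksource. */
struct clock_candidate {
    const char *name;
    int rating;
    int usable;   /* non-zero if stable enough to be selected */
};

/* Return the usable candidate with the highest rating, or NULL if none. */
static const struct clock_candidate *
pick_clocksource(const struct clock_candidate *c, size_t n)
{
    const struct clock_candidate *best = NULL;
    size_t i;

    for (i = 0; i < n; i++) {
        if (!c[i].usable)
            continue;
        if (best == NULL || c[i].rating > best->rating)
            best = &c[i];
    }
    return best;
}

/* Name of the clocksource a Dom0 ends up with in this model. */
static const char *dom0_choice(int tsc_stable)
{
    const struct clock_candidate c[] = {
        { "tsc",  300, tsc_stable },
        { "xen",  275, 1 },   /* rating proposed by the patch for Dom0 */
        { "hpet", 250, 1 },
    };
    const struct clock_candidate *best =
        pick_clocksource(c, sizeof(c) / sizeof(c[0]));

    assert(best != NULL);     /* "xen" is always usable in this model */
    return best->name;        /* string literal, outlives the array */
}
```

In this model a Dom0 with a stable TSC picks "tsc", and one with an unstable TSC falls back to "xen" rather than "hpet", which is the ordering the commit message describes.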
[Xen-devel] [libvirt test] 33382: regressions - FAIL
flight 33382 libvirt real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/33382/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-armhf-libvirt           5 libvirt-build             fail REGR. vs. 32648
 build-i386-libvirt            5 libvirt-build             fail REGR. vs. 32648

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt      1 build-check(1)            blocked  n/a
 test-amd64-i386-libvirt       1 build-check(1)            blocked  n/a
 test-amd64-amd64-libvirt      9 guest-start               fail   never pass

version targeted for testing:
 libvirt              97fac17c77d9bdfacafff1c5c39b2df3c1530614
baseline version:
 libvirt              2360fe5d24175835d3f5fd1c7e8e6e13addab629

People who touched revisions under test:
  Alexander Burluka aburl...@parallels.com
  Cedric Bosdonnat cbosdon...@suse.com
  Chunyan Liu cy...@suse.com
  Cédric Bosdonnat cbosdon...@suse.com
  Daniel P. Berrange berra...@redhat.com
  Eric Blake ebl...@redhat.com
  Geoff Hickey ghic...@datagravity.com
  Jim Fehlig jfeh...@suse.com
  Jiri Denemark jdene...@redhat.com
  John Ferlan jfer...@redhat.com
  Ján Tomko jto...@redhat.com
  Kiarie Kahurani davidkiar...@gmail.com
  Luyao Huang lhu...@redhat.com
  Michal Privoznik mpriv...@redhat.com
  Nehal J Wani nehaljw.k...@gmail.com
  Pavel Hrdina phrd...@redhat.com
  Peter Krempa pkre...@redhat.com
  Stefan Berger stef...@linux.vnet.ibm.com

jobs:
 build-amd64                                                  pass
 build-armhf                                                  pass
 build-i386                                                   pass
 build-amd64-libvirt                                          pass
 build-armhf-libvirt                                          fail
 build-i386-libvirt                                           fail
 build-amd64-pvops                                            pass
 build-armhf-pvops                                            pass
 build-i386-pvops                                             pass
 test-amd64-amd64-libvirt                                     fail
 test-armhf-armhf-libvirt                                     blocked
 test-amd64-i386-libvirt                                      blocked

sg-report-flight on osstest.cam.xci-test.com
logs: /home/xc_osstest/logs
images: /home/xc_osstest/images

Logs, config files, etc. are available at
    http://www.chiark.greenend.org.uk/~xensrcts/logs

Test harness code can be found at
    http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary

Not pushing.

(No revision log; it would be 680 lines long.)
Re: [Xen-devel] Fwd: [OPW PATCH 1/4] tools/xl: Calling _init and _dispose function for libxl types
There was no v2 (v2 was not created properly). Yes, 1/4 was the cover letter. And 4/4 was not correct. Thank you for applying the patches.

On Mon, Jan 12, 2015 at 11:29 PM, Ian Campbell ian.campb...@citrix.com wrote:
> On Tue, 2014-10-21 at 18:04 +0100, George Dunlap wrote:
>
> Just getting back to these after the freeze.
>
>> On Tue, Oct 21, 2014 at 5:34 PM, Uma Sharma uma.sharma...@gmail.com wrote:
>>> Should I resend the patches then?
>>
>> On the xen-devel list, always reply at the bottom, like this. :-)
>>
>> I think normally it wouldn't matter, but since the point of the exercise is to get you familiar with the tools, I'd say yes, why don't you send them again (maybe using the 'v2' tag).
>
> Was there a v2 here? If so I seem to have misplaced it. As it stands it looks like I have:
>
> [OPW PATCH 2/4] tools/xl: Call init function for libxl_domain_sched_params (AKA 544581bd.847e460a.4ff9.a...@mx.google.com)
> [OPW PATCH 3/4] tools/xl: Call init function for libxl_bitmap (AKA 54458271.a28b420a.52e5.a...@mx.google.com)
>
> Both of which are acked by Wei, I have applied them.
>
> I don't seem to have the actual 1/4 patch, or was 1/4 just the cover letter?
>
> [OPW PATCH 4/4] tools/xl:Call init and dispose function for libxl_dominfo (AKA 544583e4.c8e7420a.6486.b...@mx.google.com) was incorrect, as was the followup tools/xl:Making _dispose function simplicity for libxl_dominfo. I think the code in that case is correct as is.
>
> Please let me know if there are any other outstanding patches from the OPW application process which I've missed.
>
> Ian.

-- 
Regards,
Uma Sharma
http://about.me/umasharma
[Xen-devel] [linux-linus test] 33377: regressions - FAIL
flight 33377 linux-linus real [real] http://www.chiark.greenend.org.uk/~xensrcts/logs/33377/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-i386-qemuu-rhel6hvm-intel 5 xen-boot fail REGR. vs. 32879 test-amd64-i386-xl-credit29 guest-start fail REGR. vs. 32879 Regressions which are regarded as allowable (not blocking): test-amd64-i386-freebsd10-i386 7 freebsd-install fail like 32879 test-amd64-i386-freebsd10-amd64 7 freebsd-install fail like 32879 test-amd64-i386-pair17 guest-migrate/src_host/dst_host fail like 32879 test-amd64-amd64-xl-qemuu-winxpsp3 7 windows-install fail like 32879 Tests which did not succeed, but are not blocking: test-armhf-armhf-libvirt 9 guest-start fail never pass test-amd64-amd64-xl-pvh-intel 9 guest-start fail never pass test-amd64-i386-libvirt 9 guest-start fail never pass test-armhf-armhf-xl 10 migrate-support-checkfail never pass test-amd64-amd64-libvirt 9 guest-start fail never pass test-amd64-amd64-xl-pcipt-intel 9 guest-start fail never pass test-amd64-amd64-xl-pvh-amd 9 guest-start fail never pass test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 14 guest-stop fail never pass test-amd64-amd64-xl-winxpsp3 14 guest-stop fail never pass test-amd64-i386-xl-qemuu-winxpsp3 14 guest-stopfail never pass test-amd64-amd64-xl-qemut-win7-amd64 14 guest-stop fail never pass test-amd64-i386-xl-qemut-win7-amd64 14 guest-stop fail never pass test-amd64-i386-xl-win7-amd64 14 guest-stop fail never pass test-amd64-amd64-xl-qemuu-win7-amd64 14 guest-stop fail never pass test-amd64-i386-xl-qemuu-win7-amd64 14 guest-stop fail never pass test-amd64-amd64-xl-win7-amd64 14 guest-stop fail never pass test-amd64-i386-xl-winxpsp3-vcpus1 14 guest-stop fail never pass test-amd64-amd64-xl-qemut-winxpsp3 14 guest-stop fail never pass test-amd64-i386-xl-qemut-winxpsp3-vcpus1 14 guest-stop fail never pass test-amd64-i386-xl-winxpsp3 14 guest-stop fail never pass test-amd64-i386-xl-qemut-winxpsp3 14 
guest-stopfail never pass version targeted for testing: linuxeaa27f34e91a14cdceed26ed6c6793ec1d186115 baseline version: linux9bb29b6b927bcd79cf185ee67bcebfe630f0dea1 People who touched revisions under test: John W. Linville linvi...@tuxdriver.com Aaron Brown aaron.f.br...@intel.com Aaron Plattner aplatt...@nvidia.com Alan Stern st...@rowland.harvard.edu Alex Deucher alexander.deuc...@amd.com Alex Thorlton athorl...@sgi.com Alex Williamson alex.william...@redhat.com Alexandre Courbot acour...@nvidia.com Alexey Khoroshilov khoroshi...@ispras.ru Andi Kleen a...@linux.intel.com Andreas Oehler andr...@oehler-net.de Andrew Jackson andrew.jack...@arm.com Andrew Morton a...@linux-foundation.org Andy Lutomirski l...@amacapital.net Andy Shevchenko andy.shevche...@gmail.com Anil Chintalapati (achintal) achin...@cisco.com Anil Chintalapati achin...@cisco.com Anton Vorontsov anton.voront...@linaro.org Antonio Quartulli anto...@meshcoding.com Ard Biesheuvel ard.biesheu...@linaro.org Arnaldo Carvalho de Melo a...@redhat.com Arne Goedeke e...@laramies.com Aron Szabo a...@ubit.hu Ben Goz ben@amd.com Ben Pfaff b...@nicira.com Ben Skeggs bske...@redhat.com Benjamin Tissoires benjamin.tissoi...@redhat.com Bjørn Mork bj...@mork.no Bruno Prémont bonb...@linux-vserver.org Catalin Marinas catalin.mari...@arm.com Chad Dupuis chad.dup...@qlogic.com Chris Mason c...@fb.com Chris Wilson ch...@chris-wilson.co.uk Christian König christian.koe...@amd.com Christoph Hellwig h...@lst.de Corey Minyard cminy...@mvista.com Dan Carpenter dan.carpen...@oracle.com Daniel Borkmann dbork...@redhat.com Daniel Mack dan...@zonque.org Daniel Nicoletti dantt...@gmail.com Daniel Thompson daniel.thomp...@linaro.org Daniel Walter d.wal...@0x90.at Dave Airlie airl...@gmail.com Dave Airlie airl...@redhat.com David Ahern dsah...@gmail.com David Drysdale drysd...@google.com David Howells dhowe...@redhat.com David Rientjes rient...@google.com David S. 
Miller da...@davemloft.net Davidlohr Bueso d...@stgolabs.net Doug Anderson diand...@chromium.org Fabian Frederick f...@skynet.be Fang, Yang A yang.a.f...@intel.com Felipe Balbi ba...@ti.com Filipe Manana fdman...@suse.com Francesco Virlinzi francesco.virli...@st.com Giedrius StatkeviÄius giedrius.statkevic...@gmail.com Govindarajulu Varadarajan _gov...@gmx.com Grygorii Strashko
Re: [Xen-devel] [PATCH v5 7/9] libxc: introduce soft reset for HVM domains
On Thu, 2014-12-11 at 14:45 +0100, Vitaly Kuznetsov wrote:
> Add new xc_domain_soft_reset() function which performs so-called 'soft reset'
> for an HVM domain. It is being performed in the following way:
> - Save HVM context and all HVM params;
> - Devour original domain with XEN_DOMCTL_devour;
> - Wait till original domain dies or has no pages left;
> - Restore HVM context, HVM params, seed grant table.

Are any of these operations slow, per the definition under 'Machinery for asynchronous operations (ao)' in libxl_internal.h? "Wait till original domain dies" sounds like it might be. That might have implications for the use of this functionality from libxl.

> +    xc_hvm_param_get(xch, source_dom, HVM_PARAM_IDENT_PT,
> +                     &hvm_params[HVM_PARAM_IDENT_PT]);

There's quite a risk of the set of HVM parameters retrieved getting out of sync, either with the hypervisor or with the sets done below.

I don't know if any part of the migration infrastructure (specifically Andy Cooper's v2 stuff, or some of the underlying hypercalls) could be reused here to pickle/unpickle the state?

Other possibilities:

A new hypercall pair to get/set all hvm params.

A list of params to save/restore locally here, which would at least stop the get/set parts getting out of sync, but doesn't help with the hypervisor getting out of sync (and therefore would not be my preferred solution).

Also this function needs to take arch specifics into account.

> +    while ( 1 )
> +    {
> +        sleep(SLEEP_INT);
> +        if ( xc_get_tot_pages(xch, source_dom) <= 0 )
> +        {
> +            DPRINTF("All pages were transferred");
> +            break;
> +        }
> +    }

I think we are going to need to find a better solution than this. Changing the nature of the hypercall as I suggested in a previous reply would also remove this, so I'll wait for a verdict on that before worrying about this bit any further.

[...]
> +        PERROR("Faled to perform soft reset, destroying domain %d",

"Failed"

Ian.
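To illustrate the "list of params to save/restore locally" option mentioned above, here is a minimal sketch in which one table drives both the get and the set loop, so the two cannot drift apart. It is only a model: the parameter indices, struct fake_domain and the fake_param_get/set helpers are hypothetical stand-ins so that the sketch compiles on its own. Real code would use the HVM_PARAM_* indices and the libxc accessors, and, as noted above, this still does nothing about the list drifting out of sync with the hypervisor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical parameter indices standing in for HVM_PARAM_*. */
enum { PARAM_A, PARAM_B, PARAM_C, PARAM_COUNT };

/* The single list that drives both the save and the restore loop. */
static const int params_to_copy[] = { PARAM_A, PARAM_C };
#define NR_PARAMS_TO_COPY (sizeof(params_to_copy) / sizeof(params_to_copy[0]))

/* Stand-in for a domain's parameter store. */
struct fake_domain {
    uint64_t params[PARAM_COUNT];
};

static int fake_param_get(const struct fake_domain *d, int idx, uint64_t *val)
{
    if (idx < 0 || idx >= PARAM_COUNT)
        return -1;
    *val = d->params[idx];
    return 0;
}

static int fake_param_set(struct fake_domain *d, int idx, uint64_t val)
{
    if (idx < 0 || idx >= PARAM_COUNT)
        return -1;
    d->params[idx] = val;
    return 0;
}

/* Save every listed parameter from src, then restore them all into dst. */
static int copy_params(const struct fake_domain *src, struct fake_domain *dst)
{
    uint64_t saved[NR_PARAMS_TO_COPY];
    size_t i;

    for (i = 0; i < NR_PARAMS_TO_COPY; i++)
        if (fake_param_get(src, params_to_copy[i], &saved[i]))
            return -1;

    for (i = 0; i < NR_PARAMS_TO_COPY; i++)
        if (fake_param_set(dst, params_to_copy[i], saved[i]))
            return -1;

    return 0;
}

/* Demo: copy from a populated domain into an empty one, read one slot. */
static uint64_t demo_copied_value(int idx)
{
    const struct fake_domain src = { { 11, 22, 33 } };
    struct fake_domain dst = { { 0, 0, 0 } };

    assert(idx >= 0 && idx < PARAM_COUNT);
    if (copy_params(&src, &dst))
        return UINT64_MAX;
    return dst.params[idx];
}
```

Adding a parameter then means touching params_to_copy only. PARAM_B is left out of the list on purpose, to show that unlisted parameters are not carried over.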
Re: [Xen-devel] [PATCH 01/10] xen/arm: Implement hip04-d01 platform
2015-01-13 11:58 GMT+00:00 Ian Campbell ian.campb...@citrix.com:
> On Mon, 2014-11-03 at 10:11 +, Frediano Ziglio wrote:
>> Add this new platform to Xen.
>> This platform require specific code to initialize CPUs.
>
> What is the bootwrapper? Are you running this on real silicon or on an emulator? Can the platform be made to do PSCI instead?

Very real. It's actually on my desk and I'm not in the Matrix :-)

It has no PSCI support. That would honestly be very great. As we (as a company) write the firmware, it could be technically doable, but there is no plan.

The bootwrapper is a piece of software meant to bring the CPU from Secure mode to Unsecure Hypervisor mode before calling kernel/hypervisor code, and to provide supervisor calls.

>> +    np_fab = dt_find_compatible_node(NULL, NULL, "hisilicon,hip04-fabric");
>
> Please add a reference to the DT bindings document for these values. linux/Documentation/devicetree/bindings/arm/hisilicon/hisilicon.txt seems related but doesn't talk about most of these fields.

There is documentation in the Linaro kernel, see https://git.linaro.org/kernel/linux-linaro-tracking.git/blob/HEAD:/Documentation/devicetree/bindings/arm/hisilicon/hisilicon.txt. I hope it will be merged soon.

Frediano
Re: [Xen-devel] [PATCH v5 8/9] libxl: soft reset support
On Thu, 2014-12-11 at 14:45 +0100, Vitaly Kuznetsov wrote: Supported for HVM guests only. Is it specifically PVHVM guests, or are unaware HVM guests also supported? (I think the answer is that an unaware HVM guest has no way to trigger a soft reset, so maybe it's moot...) diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h index 0a123f1..710dc0e 100644 --- a/tools/libxl/libxl.h +++ b/tools/libxl/libxl.h @@ -929,6 +929,12 @@ int static inline libxl_domain_create_restore_0x040200( #endif +int libxl_domain_soft_reset(libxl_ctx *ctx, libxl_domain_config *d_config, +uint32_t *domid, uint32_t domid_old, +const libxl_asyncop_how *ao_how, +const libxl_asyncprogress_how *aop_console_how) +LIBXL_EXTERNAL_CALLERS_ONLY; + /* A progress report will be made via ao_console_how, of type * domain_create_console_available, when the domain's primary * console is available and can be connected to. diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c index 1198225..0a840c9 100644 --- a/tools/libxl/libxl_create.c +++ b/tools/libxl/libxl_create.c @@ -25,6 +25,8 @@ #include xen/hvm/hvm_info_table.h #include xen/hvm/e820.h +#define INVALID_DOMID ~0 Is this completely internal to this file, or are you requiring that it matches the one in xl_cmdimpl.c (i.e does it cross the library interface)? 
+ +void libxl__xc_domain_soft_reset(libxl__egc *egc, + libxl__domain_create_state *dcs) +{ +STATE_AO_GC(dcs-ao); +libxl_ctx *ctx = libxl__gc_owner(gc); +const uint32_t domid_soft_reset = dcs-domid_soft_reset; +const uint32_t domid = dcs-guest_domid; +libxl_domain_config *const d_config = dcs-guest_config; +libxl_domain_build_info *const info = d_config-b_info; +uint8_t *buf; +uint32_t len; +uint32_t console_domid, store_domid; +unsigned long store_mfn, console_mfn; +int rc; +struct libxl__domain_suspend_state *dss; + +GCNEW(dss); + +dss-ao = ao; +dss-domid = domid_soft_reset; +dss-dm_savefile = GCSPRINTF(/var/lib/xen/qemu-save.%d, + domid_soft_reset); + +if (info-type == LIBXL_DOMAIN_TYPE_HVM) { I thought the alternative (PV) wasn't possible? +rc = libxl__domain_suspend_device_model(gc, dss); +if (rc) goto out; +} + +console_domid = dcs-build_state.console_domid; +store_domid = dcs-build_state.store_domid; [...] +rc = xc_domain_soft_reset(ctx-xch, domid_soft_reset, domid, console_domid, + console_mfn, store_domid, store_mfn); +if (rc) goto out; [..] +dcs-build_state.store_mfn = store_mfn; +dcs-build_state.console_mfn = console_mfn; Are you trying to avoid passing dcs-build_state.store_mfn to the xc function directly for some reason? + +rc = libxl__toolstack_save(domid_soft_reset, buf, len, dss); +if (rc) goto out; + +rc = libxl__toolstack_restore(domid, buf, len, dcs-shs); +if (rc) goto out; +out: +/* + * Now pretend we did normal restore and simply call + * libxl__xc_domain_restore_done(). 
+ */ +libxl__xc_domain_restore_done(egc, dcs, rc, 0, 0); +} + void libxl__srm_callout_callback_restore_results(unsigned long store_mfn, unsigned long console_mfn, void *user) { diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl index 4a0e2be..10ef652 100644 --- a/tools/libxl/libxl_types.idl +++ b/tools/libxl/libxl_types.idl @@ -121,6 +121,8 @@ libxl_action_on_shutdown = Enumeration(action_on_shutdown, [ (5, COREDUMP_DESTROY), (6, COREDUMP_RESTART), + +(7, SOFT_RESET), I think I mention a LIBXL_HAVE #define earlier on, since they are all related I think you can have a single one for the overall feature rather than ones for each new enum value. function etc. Probably LIBXL_HAVE_DOMAIN_SOFT_RESET fits best? @@ -2519,7 +2538,17 @@ start: * restore/migrate-receive it again. */ restoring = 0; -}else{ +} else if (domid_old != INVALID_DOMID) { +/* Do soft reset */ +ret = libxl_domain_soft_reset(ctx, d_config, + domid, domid_old, + 0, 0); + +if ( ret ) { +goto error_out; +} +domid_old = INVALID_DOMID; +} else { ret = libxl_domain_create_new(ctx, d_config, domid, 0, autoconnect_console_how); } @@ -2583,6 +2612,8 @@ start: event-u.domain_shutdown.shutdown_reason, event-u.domain_shutdown.shutdown_reason); switch (handle_domain_death(domid, event, d_config)) { +case 3: +domid_old = domid; Please comment when falling
[Xen-devel] [PATCH v3 06/24] xen/arm: Map disabled device in DOM0
The check to avoid mapping disabled devices in DOM0 was added in anticipation of device passthrough. However, a brand new property will be added later to mark devices which will be passed through.

At the same time, remove the memory type check because those nodes have been blacklisted.

Furthermore, some platforms (such as the OMAP) may try to poke a device even if its status property is set to disabled.

Signed-off-by: Julien Grall <julien.gr...@linaro.org>
Cc: Andrii Tseglytskyi <andrii.tseglyts...@globallogic.com>

---
    Changes in v3:
        - Patch added
        - "xen/arm: follow-up to allow DOM0 manage IRQ and MMIO" has been split into 2 patches [1]
        - Drop the check for memory type. Those nodes have been blacklisted.

    [1] https://patches.linaro.org/34669/
---
 xen/arch/arm/domain_build.c    | 19 +++
 xen/arch/arm/platforms/omap5.c | 12
 2 files changed, 3 insertions(+), 28 deletions(-)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c index 8f1b48e..f68755f 100644 --- a/xen/arch/arm/domain_build.c +++ b/xen/arch/arm/domain_build.c @@ -1104,22 +1104,9 @@ static int handle_node(struct domain *d, struct kernel_info *kinfo, return 0; } -/* - * Some device doesn't need to be mapped in Xen: - * - Memory: the guest will see a different view of memory. It will - * be allocated later. - * - Disabled device: Linux is able to cope with status=disabled - * property. Therefore these device doesn't need to be mapped. This - * solution can be use later for pass through. 
- */ -if ( !dt_device_type_is_equal(node, memory) - dt_device_is_available(node) ) -{ -res = map_device(d, node); - -if ( res ) -return res; -} +res = map_device(d, node); +if ( res) +return res; /* * The property name is used to have a different name on older FDT diff --git a/xen/arch/arm/platforms/omap5.c b/xen/arch/arm/platforms/omap5.c index 9d6e504..e7bf30d 100644 --- a/xen/arch/arm/platforms/omap5.c +++ b/xen/arch/arm/platforms/omap5.c @@ -155,17 +155,6 @@ static const char * const dra7_dt_compat[] __initconst = NULL }; -static const struct dt_device_match dra7_blacklist_dev[] __initconst = -{ -/* OMAP Linux kernel handles devices with status disabled in a - * weird manner - tries to reset them. While their memory ranges - * are not mapped, this leads to data aborts, so skip these devices - * from DT for dom0. - */ -DT_MATCH_NOT_AVAILABLE(), -{ /* sentinel */ }, -}; - PLATFORM_START(omap5, TI OMAP5) .compatible = omap5_dt_compat, .init_time = omap5_init_time, @@ -185,7 +174,6 @@ PLATFORM_START(dra7, TI DRA7) .dom0_gnttab_start = 0x4b00, .dom0_gnttab_size = 0x2, -.blacklist_dev = dra7_blacklist_dev, PLATFORM_END /* -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH v3 12/24] xen/arm: Release IRQ routed to a domain when it's destroying
Xen has to release IRQ routed to a domain in order to reuse later. Currently only SPIs can be routed to the guest so we only need to browse SPIs for a specific domain. Futhermore, a guest can crash and let the IRQ in an incorrect state (i.e has not being EOIed). Xen will have to reset the IRQ in order to be able to reuse the IRQ later. Introduce 2 new functions for release an IRQ routed to a domain: - release_guest_irq: upper level to retrieve the IRQ, call the GIC code and release the action - gic_remove_guest_irq: Check if we can remove the IRQ, and reset it if necessary Signed-off-by: Julien Grall julien.gr...@linaro.org --- Changes in v3: - Take the vgic rank lock to protect p-desc - Correctly check if the IRQ is disabled - Extend the check on the virq in release_guest_irq - Use vgic_get_target_vcpu to get the target vCPU - Remove spurious change Changes in v2: - Drop the desc-handler = no_irq_type in release_irq as it's buggy if the IRQ is routed to Xen - Add release_guest_irq and gic_remove_guest_irq --- xen/arch/arm/gic.c| 46 + xen/arch/arm/irq.c| 48 +++ xen/arch/arm/vgic.c | 16 xen/include/asm-arm/gic.h | 4 xen/include/asm-arm/irq.h | 2 ++ 5 files changed, 116 insertions(+) diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c index 240870f..bb298e9 100644 --- a/xen/arch/arm/gic.c +++ b/xen/arch/arm/gic.c @@ -162,6 +162,52 @@ out: return res; } +/* This function only works with SPIs for now */ +int gic_remove_irq_from_guest(struct domain *d, unsigned int virq, + struct irq_desc *desc) +{ +struct vcpu *v_target = vgic_get_target_vcpu(d-vcpu[0], virq); +struct vgic_irq_rank *rank = vgic_rank_irq(v_target, virq); +struct pending_irq *p = irq_to_pending(v_target, virq); +unsigned long flags; + +ASSERT(spin_is_locked(desc-lock)); +ASSERT(test_bit(_IRQ_GUEST, desc-status)); +ASSERT(p-desc == desc); + +vgic_lock_rank(v_target, rank, flags); + +/* If the IRQ is removed when the domain is dying, we only need to + * EOI the IRQ if it has not been done by the guest + 
*/ +if ( d-is_dying ) +{ +desc-handler-shutdown(desc); +if ( test_bit(_IRQ_INPROGRESS, desc-status) ) +gic_hw_ops-deactivate_irq(desc); +clear_bit(_IRQ_INPROGRESS, desc-status); +goto end; +} + +/* TODO: Handle eviction from LRs. For now, deny remove if the IRQ + * is inflight and not disabled. + */ +if ( test_bit(_IRQ_INPROGRESS, desc-status) || + !test_bit(_IRQ_DISABLED, desc-status) ) +return -EBUSY; + +end: +clear_bit(_IRQ_GUEST, desc-status); +desc-handler = no_irq_type; + +p-desc = NULL; + +vgic_unlock_rank(v_target, rank, flags); + + +return 0; +} + int gic_irq_xlate(const u32 *intspec, unsigned int intsize, unsigned int *out_hwirq, unsigned int *out_type) diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c index 0072347..ce5ae1a 100644 --- a/xen/arch/arm/irq.c +++ b/xen/arch/arm/irq.c @@ -504,6 +504,54 @@ free_info: return retval; } +int release_guest_irq(struct domain *d, unsigned int virq) +{ +struct irq_desc *desc; +struct irq_guest *info; +unsigned long flags; +struct pending_irq *p; +int ret; + +/* Only SPIs are supported */ +if ( virq 32 || virq = vgic_num_irqs(d) ) +return -EINVAL; + +p = irq_to_pending(d-vcpu[0], virq); +if ( !p-desc ) +return -EINVAL; + +desc = p-desc; + +spin_lock_irqsave(desc-lock, flags); + +ret = -EINVAL; +if ( !test_bit(_IRQ_GUEST, desc-status) ) +goto unlock; + +ret = -EINVAL; + +info = irq_get_guest_info(desc); +if ( d != info-d ) +goto unlock; + +ret = gic_remove_irq_from_guest(d, virq, desc); + +spin_unlock_irqrestore(desc-lock, flags); + +if ( !ret ) +{ +release_irq(desc-irq, info); +xfree(info); +} + +return ret; + +unlock: +spin_unlock_irqrestore(desc-lock, flags); + +return ret; +} + /* * pirq event channels. We don't use these on ARM, instead we use the * features of the GIC to inject virtualised normal interrupts. 
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c index fc8a270..4ddfd73 100644 --- a/xen/arch/arm/vgic.c +++ b/xen/arch/arm/vgic.c @@ -133,6 +133,22 @@ void register_vgic_ops(struct domain *d, const struct vgic_ops *ops) void domain_vgic_free(struct domain *d) { +int i; +int ret; + +for ( i = 0; i (d-arch.vgic.nr_spis); i++ ) +{ +struct pending_irq *p = d-arch.vgic.pending_irqs[i]; + +if ( p-desc ) +{ +ret = release_guest_irq(d, p-irq); +if ( ret ) +dprintk(XENLOG_G_WARNING, d%u: Failed to
[Xen-devel] [PATCH v3 03/24] xen/dts: Allow only IRQ translation that are mapped to main GIC
Xen is only able to handle one GIC controller. Some platforms may contain other interrupt controllers. Make sure to only translate IRQs mapped into the GIC handled by Xen.

Signed-off-by: Julien Grall <julien.gr...@linaro.org>

---
    Changes in v3:
        - Patch was previously sent as a separate series [1]
        - Rework the comment in dt_irq_translate.

    Changelog based on the separate series:

    Changes in v3:
        - Add an ASSERT to check that dt_interrupt_controller is not NULL.

    Changes in v2:
        - Fix compilation...

    [1] https://patches.linaro.org/33312/
---
 xen/common/device_tree.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
index f471008..bb9d7ce 100644
--- a/xen/common/device_tree.c
+++ b/xen/common/device_tree.c
@@ -1058,8 +1058,14 @@ int dt_irq_translate(const struct dt_raw_irq *raw,
                      struct dt_irq *out_irq)
 {
     ASSERT(dt_irq_xlate != NULL);
+    ASSERT(dt_interrupt_controller != NULL);
 
-    /* TODO: Retrieve the right irq_xlate. This is only work for the gic */
+    /*
+     * TODO: Retrieve the right irq_xlate. This is only works for the primary
+     * interrupt controller.
+     */
+    if ( raw->controller != dt_interrupt_controller )
+        return -EINVAL;
 
     return dt_irq_xlate(raw->specifier, raw->size,
                         &out_irq->irq, &out_irq->type);
-- 
2.1.4
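For readers without the Xen tree at hand, the guard this patch adds can be illustrated with a small standalone sketch. The types below are simplified stand-ins for Xen's device tree structures, and `set_primary_controller()` and `translate_irq()` are hypothetical names, not functions from the patch. The translation step assumes the usual three-cell GIC specifier (type, number, flags), where SPIs are offset by 32 and PPIs by 16.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified stand-ins for Xen's device tree types (assumption). */
struct dt_device_node {
    const char *name;
};

struct dt_raw_irq {
    const struct dt_device_node *controller;
    unsigned int size;          /* number of cells in the specifier */
    unsigned int specifier[4];
};

struct dt_irq {
    unsigned int irq;
    unsigned int type;
};

/* The single interrupt controller the hypervisor knows how to drive. */
static const struct dt_device_node *primary_controller;

void set_primary_controller(const struct dt_device_node *node)
{
    primary_controller = node;
}

/* Translate a raw IRQ only when it targets the primary controller. */
int translate_irq(const struct dt_raw_irq *raw, struct dt_irq *out)
{
    if (raw == NULL || out == NULL || primary_controller == NULL)
        return -EINVAL;

    /* Interrupts wired to any other controller are refused. */
    if (raw->controller != primary_controller)
        return -EINVAL;

    if (raw->size < 3)
        return -EINVAL;

    /* Cell 0: 0 = SPI (offset 32), otherwise treated as PPI (offset 16). */
    out->irq = raw->specifier[1] + (raw->specifier[0] == 0 ? 32u : 16u);
    out->type = raw->specifier[2];
    return 0;
}
```

With a specifier of 0 80 4 on the primary controller this yields IRQ 112; the same specifier on a different controller node is rejected with -EINVAL instead of being silently translated as if it belonged to the GIC.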
[Xen-devel] [PATCH v3 08/24] xen/arm: Allow virq != irq
Actually Xen is assuming that the virtual IRQ will always be the same as IRQ. Modify route_guest_irq to take the virtual IRQ in parameter and let Xen assign a different IRQ number. Also store the vIRQ in the desc action to easily retrieve easily the IRQ target when we need to inject the interrupt. As DOM0 will get most the devices, the vIRQ is equal to the IRQ in that case. At the same time modify the behavior of irq_get_domain. The function now assumes that the irq_desc belongs to an IRQ assigned to a guest. Signed-off-by: Julien Grall julien.gr...@linaro.org --- Changes in v3 - Spelling/grammar nits - Fix compilation on ARM64. Forgot to update route_irq_to_guest call for xgene platform. - Add a word about irq_get_domain behavior change - More s/irq/virq/ because of the rebasing on the latest staging Changes in v2: - Patch added --- xen/arch/arm/domain_build.c | 2 +- xen/arch/arm/gic.c | 5 ++-- xen/arch/arm/irq.c | 47 ++-- xen/arch/arm/platforms/xgene-storm.c | 2 +- xen/arch/arm/vgic.c | 20 +++ xen/include/asm-arm/gic.h| 3 ++- xen/include/asm-arm/irq.h| 4 +-- xen/include/asm-arm/vgic.h | 4 +-- 8 files changed, 55 insertions(+), 32 deletions(-) diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c index b48b5d0..06c1dec 100644 --- a/xen/arch/arm/domain_build.c +++ b/xen/arch/arm/domain_build.c @@ -1029,7 +1029,7 @@ static int handle_device(struct domain *d, struct dt_device_node *dev) * twice the IRQ. 
This can happen if the IRQ is shared */ vgic_reserve_virq(d, irq); -res = route_irq_to_guest(d, irq, dt_node_name(dev)); +res = route_irq_to_guest(d, irq, irq, dt_node_name(dev)); if ( res ) { printk(XENLOG_ERR Unable to route IRQ %u to domain %u\n, diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c index eb0c5d6..15de283 100644 --- a/xen/arch/arm/gic.c +++ b/xen/arch/arm/gic.c @@ -126,7 +126,8 @@ void gic_route_irq_to_xen(struct irq_desc *desc, const cpumask_t *cpu_mask, /* Program the GIC to route an interrupt to a guest * - desc.lock must be held */ -void gic_route_irq_to_guest(struct domain *d, struct irq_desc *desc, +void gic_route_irq_to_guest(struct domain *d, unsigned int virq, +struct irq_desc *desc, const cpumask_t *cpu_mask, unsigned int priority) { struct pending_irq *p; @@ -139,7 +140,7 @@ void gic_route_irq_to_guest(struct domain *d, struct irq_desc *desc, /* Use vcpu0 to retrieve the pending_irq struct. Given that we only * route SPIs to guests, it doesn't make any difference. 
*/ -p = irq_to_pending(d-vcpu[0], desc-irq); +p = irq_to_pending(d-vcpu[0], virq); p-desc = desc; } diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c index 25ecf1d..830832c 100644 --- a/xen/arch/arm/irq.c +++ b/xen/arch/arm/irq.c @@ -31,6 +31,13 @@ static unsigned int local_irqs_type[NR_LOCAL_IRQS]; static DEFINE_SPINLOCK(local_irqs_type_lock); +/* Describe an IRQ assigned to a guest */ +struct irq_guest +{ +struct domain *d; +unsigned int virq; +}; + static void ack_none(struct irq_desc *irq) { printk(unexpected IRQ trap at irq %02x\n, irq-irq); @@ -122,18 +129,20 @@ void __cpuinit init_secondary_IRQ(void) BUG_ON(init_local_irq_data() 0); } -static inline struct domain *irq_get_domain(struct irq_desc *desc) +static inline struct irq_guest *irq_get_guest_info(struct irq_desc *desc) { ASSERT(spin_is_locked(desc-lock)); - -if ( !test_bit(_IRQ_GUEST, desc-status) ) -return dom_xen; - +ASSERT(test_bit(_IRQ_GUEST, desc-status)); ASSERT(desc-action != NULL); return desc-action-dev_id; } +static inline struct domain *irq_get_domain(struct irq_desc *desc) +{ +return irq_get_guest_info(desc)-d; +} + void irq_set_affinity(struct irq_desc *desc, const cpumask_t *cpu_mask) { if ( desc != NULL ) @@ -197,7 +206,7 @@ void do_IRQ(struct cpu_user_regs *regs, unsigned int irq, int is_fiq) if ( test_bit(_IRQ_GUEST, desc-status) ) { -struct domain *d = irq_get_domain(desc); +struct irq_guest *info = irq_get_guest_info(desc); desc-handler-end(desc); @@ -206,7 +215,7 @@ void do_IRQ(struct cpu_user_regs *regs, unsigned int irq, int is_fiq) /* the irq cannot be a PPI, we only support delivery of SPIs to * guests */ -vgic_vcpu_inject_spi(d, irq); +vgic_vcpu_inject_spi(info-d, info-virq); goto out_no_end; } @@ -370,19 +379,30 @@ err: return rc; } -int route_irq_to_guest(struct domain *d, unsigned int irq, - const char * devname) +int route_irq_to_guest(struct domain *d, unsigned int virq, +
[Xen-devel] [PATCH v3 11/24] xen/arm: Let the toolstack configure the number of SPIs
Each domain may have a different number of IRQs depending on the devices assigned to it. Rather re-using the number of IRQs used by the hardwared GIC, let the toolstack specify the number of SPIs when the domain is created. This will avoid to waste memory. To calculate the number of SPIs, we assume that any IRQ given via the option irqs= in xl is mapped 1:1 to the guest. Signed-off-by: Julien Grall julien.gr...@linaro.org Cc: Ian Jackson ian.jack...@eu.citrix.com Cc: Jan Beulich jbeul...@suse.com Cc: Wei Liu wei.l...@citrix.com --- Changes in v3: - Fix typoes - A separate has been created to extend the DOMCTL create domain Changes in v2: - Patch added --- tools/libxc/xc_domain.c | 1 + tools/libxl/libxl_arm.c | 19 +++ xen/arch/arm/domain.c | 7 ++- xen/arch/arm/setup.c | 1 + xen/arch/arm/vgic.c | 10 +- xen/include/asm-arm/domain.h | 2 ++ xen/include/asm-arm/setup.h | 1 + xen/include/asm-arm/vgic.h| 2 +- xen/include/public/arch-arm.h | 2 ++ 9 files changed, 38 insertions(+), 7 deletions(-) diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c index eebc121..eb066cf 100644 --- a/tools/libxc/xc_domain.c +++ b/tools/libxc/xc_domain.c @@ -67,6 +67,7 @@ int xc_domain_create(xc_interface *xch, /* No arch-specific configuration for now */ #elif defined (__arm__) || defined(__aarch64__) config.gic_version = XEN_DOMCTL_CONFIG_GIC_DEFAULT; +config.nr_spis = 0; #else errno = ENOSYS; return -1; diff --git a/tools/libxl/libxl_arm.c b/tools/libxl/libxl_arm.c index cddce6e..53177eb 100644 --- a/tools/libxl/libxl_arm.c +++ b/tools/libxl/libxl_arm.c @@ -39,6 +39,25 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc, libxl_domain_config *d_config, xc_domain_configuration_t *xc_config) { +uint32_t nr_spis = 0; +unsigned int i; + +for (i = 0; i d_config-b_info.num_irqs; i++) { +int irq = d_config-b_info.irqs[i]; +int spi = irq - 32; + +if (irq 32) +continue; + +if (nr_spis = spi) +nr_spis = spi + 1; +} + +LOG(DEBUG, Configure the domain); + +xc_config-nr_spis = nr_spis; 
+LOG(DEBUG, - Allocate %u SPIs, nr_spis); + xc_config-gic_version = XEN_DOMCTL_CONFIG_GIC_DEFAULT; return 0; diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c index 2473b10..6e56665 100644 --- a/xen/arch/arm/domain.c +++ b/xen/arch/arm/domain.c @@ -560,10 +560,15 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags, } config-gic_version = gic_version; +/* Sanity check on the number of SPIs */ +rc = -EINVAL; +if ( config-nr_spis (gic_number_lines() - 32) ) +goto fail; + if ( (rc = gicv_setup(d)) != 0 ) goto fail; -if ( (rc = domain_vgic_init(d)) != 0 ) +if ( (rc = domain_vgic_init(d, config-nr_spis)) != 0 ) goto fail; if ( (rc = domain_vtimer_init(d)) != 0 ) diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c index 18227f6..b28a708 100644 --- a/xen/arch/arm/setup.c +++ b/xen/arch/arm/setup.c @@ -815,6 +815,7 @@ void __init start_xen(unsigned long boot_phys_offset, /* Create initial domain 0. */ /* The vGIC for DOM0 is exactly emulated the hardware GIC */ config.gic_version = XEN_DOMCTL_CONFIG_GIC_DEFAULT; +config.nr_spis = gic_number_lines() - 32; dom0 = domain_create(0, 0, 0, config); if ( IS_ERR(dom0) || (alloc_dom0_vcpu0(dom0) == NULL) ) diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c index c915670..fc8a270 100644 --- a/xen/arch/arm/vgic.c +++ b/xen/arch/arm/vgic.c @@ -67,16 +67,16 @@ static void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq) p-irq = virq; } -int domain_vgic_init(struct domain *d) +int domain_vgic_init(struct domain *d, unsigned int nr_spis) { int i; d-arch.vgic.ctlr = 0; -if ( is_hardware_domain(d) ) -d-arch.vgic.nr_spis = gic_number_lines() - 32; -else -d-arch.vgic.nr_spis = 0; /* We don't need SPIs for the guest */ +/* The number of SPIs has to be aligned to 32 see + * GICD_TYPER.ITLinesNumber definition + */ +d-arch.vgic.nr_spis = ROUNDUP(nr_spis, 32); switch ( gic_hw_version() ) { diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h index d302fc9..101b4e9 100644 --- 
a/xen/include/asm-arm/domain.h +++ b/xen/include/asm-arm/domain.h @@ -121,6 +121,8 @@ struct arch_domain unsigned int evtchn_irq; } __cacheline_aligned; +#define domain_is_configured(d) ((d)-arch.is_configured) + struct arch_vcpu { struct { diff --git a/xen/include/asm-arm/setup.h b/xen/include/asm-arm/setup.h index ba5a67d..254cc17 100644 --- a/xen/include/asm-arm/setup.h +++ b/xen/include/asm-arm/setup.h @@ -54,6 +54,7 @@ void copy_from_paddr(void *dst, paddr_t paddr,
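The SPI sizing done by the toolstack and the rounding done by the hypervisor in the patch above can be tried out in isolation. The sketch below is a standalone approximation: `count_spis()` and `round_up_spis()` are hypothetical names, and the 1:1 IRQ mapping is the same assumption the commit message states.

```c
#include <stddef.h>
#include <stdint.h>

#define FIRST_SPI 32u   /* IRQs 0-31 are SGIs and PPIs on a GIC */

/*
 * Smallest number of SPIs that covers every IRQ >= 32 in the list,
 * assuming each IRQ is mapped 1:1 into the guest.
 */
uint32_t count_spis(const uint32_t *irqs, size_t num_irqs)
{
    uint32_t nr_spis = 0;

    for (size_t i = 0; i < num_irqs; i++) {
        uint32_t spi;

        if (irqs[i] < FIRST_SPI)
            continue;           /* not an SPI, nothing to allocate */

        spi = irqs[i] - FIRST_SPI;
        if (nr_spis <= spi)
            nr_spis = spi + 1;  /* SPI numbers are zero-based */
    }

    return nr_spis;
}

/* Round up to a multiple of 32, as the vGIC allocates SPIs in ranks of 32. */
uint32_t round_up_spis(uint32_t nr_spis)
{
    return (nr_spis + 31u) & ~31u;
}
```

For irqs = [ 112, 113, 114 ] the highest SPI is 82, so 83 SPIs are requested and the vGIC ends up sized for 96.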
[Xen-devel] [PATCH v3 21/24] tools/(lib)xl: Add partial device tree support for ARM
Let the user to pass additional nodes to the guest device tree. For this purpose, everything in the node /passthrough from the partial device tree will be copied into the guest device tree. The node /aliases will be also copied to allow the user to define aliases which can be used by the guest kernel. A simple partial device tree will look like: /dts-v1/; / { #address-cells = 2; #size-cells = 2; passthrough { compatible = simple-bus; ranges; #address-cells = 2; #size-cells = 2; /* List of your nodes */ } }; Note that: * The interrupt-parent proporties will be added by the toolstack in the root node * The properties compatible, ranges, #address-cells and #size-cells in /passthrough are mandatory. Signed-off-by: Julien Grall julien.gr...@linaro.org Cc: Ian Jackson ian.jack...@eu.citrix.com Cc: Wei Liu wei.l...@citrix.com --- Changes in v3: - Patch added --- docs/man/xl.cfg.pod.5 | 7 ++ tools/libxl/libxl_arm.c | 253 tools/libxl/libxl_types.idl | 1 + tools/libxl/xl_cmdimpl.c| 1 + 4 files changed, 262 insertions(+) diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5 index e2f91fc..225b782 100644 --- a/docs/man/xl.cfg.pod.5 +++ b/docs/man/xl.cfg.pod.5 @@ -398,6 +398,13 @@ not emulated. Specify that this domain is a driver domain. This enables certain features needed in order to run a driver domain. +=item Bdevice_tree=PATH + +Specify a partial device tree (compiled via the Device Tree Compiler). +Everything under the node /passthrough will be copied into the guest +device tree. For convenience, the node /aliases is also copied to allow +the user to defined aliases which can be used by the guest kernel. 
+ =back =head2 Devices diff --git a/tools/libxl/libxl_arm.c b/tools/libxl/libxl_arm.c index 53177eb..619458b 100644 --- a/tools/libxl/libxl_arm.c +++ b/tools/libxl/libxl_arm.c @@ -540,6 +540,238 @@ out: } } +static bool check_overrun(uint64_t a, uint64_t b, uint32_t max) +{ +return ((a + b) UINT_MAX || (a + b) max); +} + +/* Only FDT v17 is supported */ +#define FDT_REQUIRED_VERSION0x11 + +static int check_partial_fdt(libxl__gc *gc, void *fdt, size_t size) +{ +int r; + +if (size FDT_V17_SIZE) { +LOG(ERROR, Partial FDT is too small); +return ERROR_FAIL; +} + +if (fdt_magic(fdt) != FDT_MAGIC) { +LOG(ERROR, Partial FDT is not a valid Flat Device Tree); +return ERROR_FAIL; +} + +if (fdt_version(fdt) != FDT_REQUIRED_VERSION) { +LOG(ERROR, Partial FDT version not supported. Required 0x%x got 0x%x, +FDT_REQUIRED_VERSION, fdt_version(fdt)); +return ERROR_FAIL; +} + +r = fdt_check_header(fdt); +if (r) { +LOG(ERROR, Failed to check the partial FDT (%d), r); +return ERROR_FAIL; +} + +/* Check if the *size and off* fields doesn't overrun the totalsize + * of the partial FDT. + */ +if (fdt_totalsize(fdt) size) { +LOG(ERROR, Partial FDT totalsize is too big); +return ERROR_FAIL; +} + +size = fdt_totalsize(fdt); +if (fdt_off_dt_struct(fdt) size || +fdt_off_dt_strings(fdt) size || +check_overrun(fdt_off_dt_struct(fdt), fdt_size_dt_struct(fdt), size) || +check_overrun(fdt_off_dt_strings(fdt), fdt_size_dt_strings(fdt), size)) { +LOG(ERROR, Failed to validate the header of the partial FDT); +return ERROR_FAIL; +} + +return 0; +} + +/* + * Check if a string stored the strings block section is correctly + * nul-terminated. + * off_dt_strings and size_dt_strings fields have been validity-check + * earlier, so it's safe to use them here. 
+ */ +static bool check_string(void *fdt, int nameoffset) +{ +const char *str = fdt_string(fdt, nameoffset); + +for (; nameoffset fdt_size_dt_strings(fdt); nameoffset++, str++) { +if (*str == '\0') +return true; +} + +return false; +} + +static int copy_properties(libxl__gc *gc, void *fdt, void *pfdt, + int nodeoff) +{ +int propoff, nameoff, r; +const struct fdt_property *prop; + +for (propoff = fdt_first_property_offset(pfdt, nodeoff); + propoff = 0; + propoff = fdt_next_property_offset(pfdt, propoff)) { + +if (!(prop = fdt_get_property_by_offset(pfdt, propoff, NULL))) { +return -FDT_ERR_INTERNAL; +} + +/* + * Libfdt doesn't perform any check on the validity of a string + * stored in the strings block section. As the property name is + * stored there, check it. + */ +nameoff = fdt32_to_cpu(prop-nameoff); +if (!check_string(pfdt, nameoff)) { +LOG(ERROR, The strings block section of the partial FDT is malformed); +return -FDT_ERR_BADSTRUCTURE; +} + +r =
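The two validation helpers in the patch above (range overrun and string termination) are easy to get subtly wrong, so here is a minimal standalone sketch of the same checks. It does not use libfdt; `range_overruns()` and `string_is_terminated()` are hypothetical names, and the blob is represented as a plain byte buffer.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when the region [off, off + len) does not fit in `total` bytes. */
bool range_overruns(uint32_t off, uint32_t len, uint32_t total)
{
    /* Widen before adding so the sum cannot wrap around. */
    uint64_t end = (uint64_t)off + (uint64_t)len;

    return end > total;
}

/*
 * True when a NUL terminator exists somewhere in strings[off..size).
 * `size` is the already validated length of the strings block.
 */
bool string_is_terminated(const char *strings, uint32_t size, uint32_t off)
{
    for (; off < size; off++) {
        if (strings[off] == '\0')
            return true;
    }

    return false;
}
```

The point of both checks is that offsets and sizes read from an untrusted blob are validated against the real buffer size before any pointer derived from them is dereferenced.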
[Xen-devel] [PATCH v3 00/24] xen/arm: Add support for non-pci passthrough
Hello all,

This is the third version of this patch series to add support for platform device passthrough on ARM.

Compared to the previous version [1], the automatic mapping of MMIO/IRQ and the generation of the device tree have been dropped. Instead the user will have to:
    - Map MMIO/IRQ manually
    - Describe the device using the new partial device tree support
    - Specify the list of devices protected by an IOMMU to assign to the guest.

While this solution is primitive, it allows us to support more complex devices in Xen with a little additional work for the user. Attempting to do it automatically is more difficult because we may not know the dependencies between devices (for instance a network card and a PHY).

To avoid adding code in DOM0 to manage platform device deassignment, the user has to add the property "xen,passthrough" to the device tree node describing the device. This can easily be done via U-Boot. For instance, to pass through the second network card of a Midway server to the guest, the user will have to add the following line to the U-Boot script:

    fdt set /soc/ethernet@fff51000 xen,passthrough

This series has been tested on Midway by assigning the secondary network card to a guest (see instructions below). I plan to do further testing on other boards.

There are some TODOs, mostly related to XSM, in different patches (see the commit messages or /* TODO: ... */ in the files).

This series is based on my series "Find automatically a PPI for DOM0 event channel IRQ" [2] and "xen/arm: Resync the SMMU driver with the Linux one" [3].

A working tree can be found here:
    git://xenbits.xen.org/julieng/xen-unstable.git branch passthrough-v3

Major changes in v3:
    - Rework the approach to passthrough a device (xen,passthrough + partial device tree).
    - Extend the existing hypercalls to assign/deassign devices rather than adding new ones.
    - Merge series [4] and [5] into this series.

Major changes in v2:
    - Drop patch #1 of the previous version
    - Virtual IRQs are no longer equal to the physical interrupts
    - Move the hypercall to get DT information for privcmd to domctl
    - Split the domain creation in two parts to allow per-guest VGIC configuration (such as the number of SPIs).
    - Bunch of typos, commit improvements, function renaming.

For all changes, see each patch.

I believe it's better to have basic support in Xen rather than nothing. This could be improved later.

Sincerely yours,

[1] http://lists.xen.org/archives/html/xen-devel/2014-07/msg04090.html
[2] http://lists.xenproject.org/archives/html/xen-devel/2014-12/msg01386.html
[3] http://lists.xenproject.org/archives/html/xen-devel/2014-12/msg01612.html
[4] http://lists.xen.org/archives/html/xen-devel/2014-11/msg01672.html
[5] http://lists.xenproject.org/archives/html/xen-devel/2014-07/msg02098.html

= Instructions to passthrough a non-PCI device

The example will use the secondary network card of the Midway server.

1) Mark the device to let Xen know it will be used for passthrough. This is done in the device tree node describing the device by adding the property "xen,passthrough". The command to do it in U-Boot is:

    fdt set /soc/ethernet@fff51000 xen,passthrough

2) Create the partial device tree describing the device. The IRQs are mapped 1:1 to the guest (i.e. VIRQ == IRQ). For MMIO, you will have to find a hole in the guest memory layout (see xen/include/public/arch-arm.h; note that the layout is not stable and can change between two Xen releases).

    /dts-v1/;

    / {
        #address-cells = <2>;
        #size-cells = <2>;

        aliases {
            net = &mac0;
        };

        passthrough {
            compatible = "simple-bus";
            ranges;
            #address-cells = <2>;
            #size-cells = <2>;

            mac0: ethernet@1000 {
                compatible = "calxeda,hb-xgmac";
                reg = <0 0x1000 0 0x1000>;
                interrupts = <0 80 4 0 81 4 0 82 4>;
                /* dma-coherent can't be set because it requires platform
                 * specific code for highbank */
                /* dma-coherent; */
            };

            foo {
                my = <&mac0>;
            };
        };
    };

3) Compile the partial guest device tree with dtc (Device Tree Compiler). For our purpose, the compiled file will be called guest-midway.dtb and placed in /root in DOM0.

4) Add the following options to the guest configuration file:

    device_tree = "/root/guest-midway.dtb"
    dtdev = [ "/soc/ethernet@fff51000" ]
    irqs = [ 112, 113, 114 ]
    iomem = [ "0xfff51,1@0x1" ]

Cc: manish.ja...@caviumnetworks.com
Cc: suravee.suthikulpa...@amd.com
Cc: andrii.tseglyts...@globallogic.com

Julien Grall (24):
  xen: Extend DOMCTL createdomain to support arch configuration
  xen/arm: Divide GIC initialization in 2 parts
  xen/dts: Allow only IRQ translation that are mapped to main GIC
  xen: guestcopy: Provide an helper to safely copy string from guest
[Xen-devel] [PATCH v3 04/24] xen: guestcopy: Provide an helper to safely copy string from guest
Flask code already provides a helper to copy a string from the guest. In a
later patch, the new DT hypercalls will need a similar function.

To avoid code duplication, copy the flask helper (flask_copyin_string) to
common code:
    - Rename it to safe_copy_string_from_guest
    - Add a comment to explain the extra +1
    - Return the buffer directly and use the macros provided by
      xen/err.h to return an error code if necessary.

Signed-off-by: Julien Grall <julien.gr...@linaro.org>
Cc: Daniel De Graaf <dgde...@tycho.nsa.gov>
Cc: Ian Jackson <ian.jack...@eu.citrix.com>
Cc: Jan Beulich <jbeul...@suse.com>
Cc: Keir Fraser <k...@xen.org>

---
    Changes in v3:
        - Use macros of xen/err.h to return either the buffer or an
          error code
        - Reuse size_t instead of unsigned long
        - Update comment and commit message

    Changes in v2:
        - Rename copy_string_from_guest into safe_copy_string_from_guest
        - Update commit message and comment in the code
---
 xen/common/Makefile            |  1 +
 xen/common/guestcopy.c         | 30 +
 xen/include/xen/guest_access.h |  5 +
 xen/xsm/flask/flask_op.c       | 43 ++
 4 files changed, 46 insertions(+), 33 deletions(-)
 create mode 100644 xen/common/guestcopy.c

diff --git a/xen/common/Makefile b/xen/common/Makefile
index 9ce75bb..3da774a 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -10,6 +10,7 @@ obj-y += event_2l.o
 obj-y += event_channel.o
 obj-y += event_fifo.o
 obj-y += grant_table.o
+obj-y += guestcopy.o
 obj-y += irq.o
 obj-y += kernel.o
 obj-y += keyhandler.o
diff --git a/xen/common/guestcopy.c b/xen/common/guestcopy.c
new file mode 100644
index 000..d974f5c
--- /dev/null
+++ b/xen/common/guestcopy.c
@@ -0,0 +1,30 @@
+#include <xen/config.h>
+#include <xen/lib.h>
+#include <xen/guest_access.h>
+#include <xen/err.h>
+
+/* The function copies a string from the guest and adds a NUL to
+ * make sure the string is correctly terminated.
+ */
+void *safe_copy_string_from_guest(XEN_GUEST_HANDLE(char) u_buf,
+                                  size_t size, size_t max_size)
+{
+    char *tmp;
+
+    if ( size > max_size )
+        return ERR_PTR(-ENOENT);
+
+    /* Add an extra +1 to append \0 */
+    tmp = xmalloc_array(char, size + 1);
+    if ( !tmp )
+        return ERR_PTR(-ENOMEM);
+
+    if ( copy_from_guest(tmp, u_buf, size) )
+    {
+        xfree(tmp);
+        return ERR_PTR(-EFAULT);
+    }
+    tmp[size] = 0;
+
+    return tmp;
+}
diff --git a/xen/include/xen/guest_access.h b/xen/include/xen/guest_access.h
index 373454e..55645e6 100644
--- a/xen/include/xen/guest_access.h
+++ b/xen/include/xen/guest_access.h
@@ -8,6 +8,8 @@
 #define __XEN_GUEST_ACCESS_H__
 
 #include <asm/guest_access.h>
+#include <xen/types.h>
+#include <public/xen.h>
 
 #define copy_to_guest(hnd, ptr, nr)                     \
     copy_to_guest_offset(hnd, 0, ptr, nr)
@@ -27,4 +29,7 @@
 #define __clear_guest(hnd, nr)                          \
     __clear_guest_offset(hnd, 0, nr)
 
+void *safe_copy_string_from_guest(XEN_GUEST_HANDLE(char) u_buf,
+                                  size_t size, size_t max_size);
+
 #endif /* __XEN_GUEST_ACCESS_H__ */
diff --git a/xen/xsm/flask/flask_op.c b/xen/xsm/flask/flask_op.c
index 7743aac..b14d306 100644
--- a/xen/xsm/flask/flask_op.c
+++ b/xen/xsm/flask/flask_op.c
@@ -12,6 +12,7 @@
 #include <xen/event.h>
 #include <xsm/xsm.h>
 #include <xen/guest_access.h>
+#include <xen/err.h>
 
 #include <public/xsm/flask_op.h>
 
@@ -76,29 +77,6 @@ static int domain_has_security(struct domain *d, u32 perms)
                         perms, NULL);
 }
 
-static int flask_copyin_string(XEN_GUEST_HANDLE(char) u_buf, char **buf,
-                               size_t size, size_t max_size)
-{
-    char *tmp;
-
-    if ( size > max_size )
-        return -ENOENT;
-
-    tmp = xmalloc_array(char, size + 1);
-    if ( !tmp )
-        return -ENOMEM;
-
-    if ( copy_from_guest(tmp, u_buf, size) )
-    {
-        xfree(tmp);
-        return -EFAULT;
-    }
-    tmp[size] = 0;
-
-    *buf = tmp;
-    return 0;
-}
-
 #endif /* COMPAT */
 
 static int flask_security_user(struct xen_flask_userlist *arg)
@@ -112,9 +90,9 @@ static int flask_security_user(struct xen_flask_userlist *arg)
     if ( rv )
         return rv;
 
-    rv = flask_copyin_string(arg->u.user, &user, arg->size, PAGE_SIZE);
-    if ( rv )
-        return rv;
+    user = safe_copy_string_from_guest(arg->u.user, arg->size, PAGE_SIZE);
+    if ( IS_ERR(user) )
+        return PTR_ERR(user);
 
     rv = security_get_user_sids(arg->start_sid, user, &sids, &nsids);
     if ( rv < 0 )
@@ -227,9 +205,9 @@ static int flask_security_context(struct xen_flask_sid_context *arg)
     if ( rv )
         return rv;
 
-    rv = flask_copyin_string(arg->context, &buf, arg->size, PAGE_SIZE);
-    if ( rv )
-        return rv;
+    buf = safe_copy_string_from_guest(arg->context, arg->size, PAGE_SIZE);
+    if ( IS_ERR(buf) )
+        return PTR_ERR(buf);
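For readers less familiar with the xen/err.h convention this patch relies on, here is a minimal userspace sketch of the pattern. It is an illustration under stated assumptions, not the Xen implementation: ERR_PTR, PTR_ERR and IS_ERR are re-defined locally as stand-ins, memcpy() stands in for copy_from_guest(), and copy_string_sketch is a hypothetical name.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Local stand-ins for the xen/err.h macros (assumption: the top 4095
 * values of the address space are never valid pointers). */
#define MAX_ERRNO_SKETCH 4095UL

static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO_SKETCH;
}

/* Hypothetical helper: returns either a NUL-terminated copy of the first
 * 'size' bytes of 'src', or an errno encoded in the pointer. */
static void *copy_string_sketch(const char *src, size_t size, size_t max_size)
{
    char *tmp;

    assert(src != NULL);

    if ( size > max_size )
        return ERR_PTR(-ENOENT);

    /* One extra byte so the result is always NUL-terminated. */
    tmp = malloc(size + 1);
    if ( !tmp )
        return ERR_PTR(-ENOMEM);

    /* memcpy() stands in for copy_from_guest() here. */
    memcpy(tmp, src, size);
    tmp[size] = '\0';

    return tmp;
}
```

A caller tests the returned pointer with IS_ERR() before using it and extracts the errno with PTR_ERR(), which is the shape the flask_op.c hunks take. The benefit over the old out-parameter interface is one fewer argument and no way to read an unset buffer pointer on the error path.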
[Xen-devel] [PATCH v3 16/24] xen/passthrough: Introduce iommu_construct
This new function will correctly initialize the IOMMU page table for the
current domain.

Also use it in iommu_assign_dt_device even though the current IOMMU
implementation on ARM shares P2M with the processor.

Signed-off-by: Julien Grall <julien.gr...@linaro.org>
Cc: Jan Beulich <jbeul...@suse.com>

---
    Changes in v3:
        - The ASSERT in iommu_construct was redundant with the if ()
        - Remove d->need_iommu = 1 in assign_device as it's already
          done by iommu_construct.
        - Simplify the code in the caller of iommu_construct

    Changes in v2:
        - Add missing Signed-off-by
        - Rename iommu_buildup to iommu_construct
---
 xen/drivers/passthrough/arm/iommu.c   |  6 ++
 xen/drivers/passthrough/device_tree.c |  4
 xen/drivers/passthrough/iommu.c       | 19 +++
 xen/drivers/passthrough/pci.c         | 15 ---
 xen/include/xen/iommu.h               |  2 ++
 5 files changed, 35 insertions(+), 11 deletions(-)

diff --git a/xen/drivers/passthrough/arm/iommu.c b/xen/drivers/passthrough/arm/iommu.c
index 3e9303a..5870aef 100644
--- a/xen/drivers/passthrough/arm/iommu.c
+++ b/xen/drivers/passthrough/arm/iommu.c
@@ -68,3 +68,9 @@ void arch_iommu_domain_destroy(struct domain *d)
 {
     iommu_dt_domain_destroy(d);
 }
+
+int arch_iommu_populate_page_table(struct domain *d)
+{
+    /* The IOMMU shares the p2m with the CPU */
+    return -ENOSYS;
+}
diff --git a/xen/drivers/passthrough/device_tree.c b/xen/drivers/passthrough/device_tree.c
index 377d41d..88e496e 100644
--- a/xen/drivers/passthrough/device_tree.c
+++ b/xen/drivers/passthrough/device_tree.c
@@ -41,6 +41,10 @@ int iommu_assign_dt_device(struct domain *d, struct dt_device_node *dev)
     if ( !list_empty(&dev->domain_list) )
         goto fail;
 
+    rc = iommu_construct(d);
+    if ( rc )
+        goto fail;
+
     rc = hd->platform_ops->assign_device(d, 0, dt_to_dev(dev));
 
     if ( rc )
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index cc12735..8915244 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -187,6 +187,25 @@ void iommu_teardown(struct domain *d)
     tasklet_schedule(&iommu_pt_cleanup_tasklet);
 }
 
+int iommu_construct(struct domain *d)
+{
+    int rc = 0;
+
+    if ( need_iommu(d) > 0 )
+        return 0;
+
+    if ( !iommu_use_hap_pt(d) )
+    {
+        rc = arch_iommu_populate_page_table(d);
+        if ( rc )
+            return rc;
+    }
+
+    d->need_iommu = 1;
+
+    return rc;
+}
+
 void iommu_domain_destroy(struct domain *d)
 {
     struct hvm_iommu *hd = domain_hvm_iommu(d);
diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index 43ce5dc..9a47a37 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -1355,18 +1355,11 @@ static int assign_device(struct domain *d, u16 seg, u8 bus, u8 devfn)
     if ( !spin_trylock(&pcidevs_lock) )
         return -ERESTART;
 
-    if ( need_iommu(d) <= 0 )
+    rc = iommu_construct(d);
+    if ( rc )
     {
-        if ( !iommu_use_hap_pt(d) )
-        {
-            rc = arch_iommu_populate_page_table(d);
-            if ( rc )
-            {
-                spin_unlock(&pcidevs_lock);
-                return rc;
-            }
-        }
-        d->need_iommu = 1;
+        spin_unlock(&pcidevs_lock);
+        return rc;
     }
 
     pdev = pci_get_pdev_by_domain(hardware_domain, seg, bus, devfn);
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index d0f99ef..c146ee4 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -65,6 +65,8 @@ int arch_iommu_domain_init(struct domain *d);
 int arch_iommu_populate_page_table(struct domain *d);
 void arch_iommu_check_autotranslated_hwdom(struct domain *d);
 
+int iommu_construct(struct domain *d);
+
 /* Function used internally, use iommu_domain_destroy */
 void iommu_teardown(struct domain *d);
 
-- 
2.1.4

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
[Xen-devel] [PATCH v3 09/24] xen/arm: route_irq_to_guest: Check validity of the IRQ
Currently Xen only supports routing SPIs to guests. Add a function
is_assignable_irq to check if we can assign a given IRQ to the guest.

Secondly, make sure the vIRQ is not greater than the number of IRQs
handled by the vGIC and that it's an SPI.

Thirdly, when the IRQ is already assigned to the domain, check the user
is not asking to use a different vIRQ than the one already bound.

Finally, desc->arch.type, which contains the IRQ type (i.e. level/edge),
must be correctly configured beforehand. The IRQ type won't be configured
when:
    - the device has been blacklisted for the current platform
    - the IRQ has not been described in the device tree

I think we can safely assume that a user will never ask to route such an
IRQ to the guest.

Also, use XENLOG_G_ERR in the error message within the function as it will
be later called from a guest.

Signed-off-by: Julien Grall <julien.gr...@linaro.org>

---
    Changes in v3:
        - Fix typo in commit message and comment
        - Add a check that the vIRQ is an SPI
        - Check if the user is not asking for a different vIRQ when the
          IRQ is already assigned to the guest

    Changes in v2:
        - Rename is_routable_irq into is_assignable_irq
        - Check if the IRQ is not greater than the number of IRQs
          handled by the gic
        - Move is_assignable_irq in irq.c rather than defining in the
          header irq.h
        - Retrieve the irq descriptor after checking the validity of the
          IRQ
        - vgic_num_irqs has been moved in a separate patch
        - Fix the irq check against vgic_num_irqs
        - Use virq instead of irq for vGIC sanity check
---
 xen/arch/arm/irq.c        | 58 +++
 xen/include/asm-arm/irq.h |  2 ++
 2 files changed, 56 insertions(+), 4 deletions(-)

diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c
index 830832c..af408ac 100644
--- a/xen/arch/arm/irq.c
+++ b/xen/arch/arm/irq.c
@@ -379,6 +379,15 @@ err:
     return rc;
 }
 
+bool_t is_assignable_irq(unsigned int irq)
+{
+    /* For now, we can only route SPIs to the guest */
+    return ((irq >= NR_LOCAL_IRQS) && (irq < gic_number_lines()));
+}
+
+/* Route an IRQ to a specific guest.
+ * For now only SPIs are assignabled to the guest.
+ */
 int route_irq_to_guest(struct domain *d, unsigned int virq,
                        unsigned int irq, const char * devname)
 {
@@ -388,6 +397,29 @@ int route_irq_to_guest(struct domain *d, unsigned int virq,
     unsigned long flags;
     int retval = 0;
 
+    if ( !is_assignable_irq(irq) )
+    {
+        dprintk(XENLOG_G_ERR, "the IRQ%u is not routable\n", irq);
+        return -EINVAL;
+    }
+
+    desc = irq_to_desc(irq);
+
+    if ( virq >= vgic_num_irqs(d) )
+    {
+        dprintk(XENLOG_G_ERR,
+                "the vIRQ number %u is too high for domain %u (max = %u)\n",
+                irq, d->domain_id, vgic_num_irqs(d));
+        return -EINVAL;
+    }
+
+    /* Only routing to virtual SPIs is supported */
+    if ( virq < 32 )
+    {
+        dprintk(XENLOG_G_ERR, "IRQ can only be routed to a virtual SPIs");
+        return -EINVAL;
+    }
+
     action = xmalloc(struct irqaction);
     if ( !action )
         return -ENOMEM;
@@ -408,8 +440,18 @@ int route_irq_to_guest(struct domain *d, unsigned int virq,
 
     spin_lock_irqsave(&desc->lock, flags);
 
+    if ( desc->arch.type == DT_IRQ_TYPE_INVALID )
+    {
+        dprintk(XENLOG_G_ERR, "IRQ %u has not been configured\n",
+                irq);
+        retval = -EIO;
+        goto out;
+    }
+
     /* If the IRQ is already used by someone
-     *  - If it's the same domain -> Xen doesn't need to update the IRQ desc
+     *  - If it's the same domain -> Xen doesn't need to update the IRQ desc.
+     *  For safety check if we are not trying to assign the IRQ to a
+     *  different vIRQ.
      *  - Otherwise -> For now, don't allow the IRQ to be shared between
      *  Xen and domains.
      */
@@ -418,13 +460,21 @@ int route_irq_to_guest(struct domain *d, unsigned int virq,
         struct domain *ad = irq_get_domain(desc);
 
         if ( test_bit(_IRQ_GUEST, &desc->status) && d == ad )
+        {
+            if ( irq_get_guest_info(desc)->virq != virq )
+            {
+                dprintk(XENLOG_G_ERR, "d%u: IRQ %u is already assigned to vIRQ %u\n",
+                        d->domain_id, irq, irq_get_guest_info(desc)->virq);
+                retval = -EPERM;
+            }
             goto out;
+        }
 
         if ( test_bit(_IRQ_GUEST, &desc->status) )
-            printk(XENLOG_ERR "ERROR: IRQ %u is already used by domain %u\n",
-                   irq, ad->domain_id);
+            dprintk(XENLOG_G_ERR, "IRQ %u is already used by domain %u\n",
+                    irq, ad->domain_id);
         else
-            printk(XENLOG_ERR "ERROR: IRQ %u is already used by Xen\n", irq);
+            dprintk(XENLOG_G_ERR, "IRQ %u is already used by Xen\n", irq);
         retval = -EBUSY;
         goto out;
     }
diff --git
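The up-front validity checks described in this patch can be condensed into a small standalone sketch. This is only an illustration of the ordering of the checks, not the Xen code: the function names carry a _sketch suffix because they are hypothetical, the GIC line count and vGIC IRQ count are passed in as plain parameters instead of being read from the hardware or the domain, and NR_LOCAL_IRQS_SKETCH is assumed to be 32 (the SGIs and PPIs).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Assumption: IRQs 0..31 are per-CPU (SGIs and PPIs), SPIs start at 32. */
#define NR_LOCAL_IRQS_SKETCH 32u

/* A physical IRQ is assignable only if it is an SPI the GIC implements. */
static bool is_assignable_irq_sketch(unsigned int irq, unsigned int gic_lines)
{
    return (irq >= NR_LOCAL_IRQS_SKETCH) && (irq < gic_lines);
}

/* Returns 0 if the (virq, irq) pair passes the checks made before any
 * allocation or locking, -EINVAL otherwise. */
static int check_route_sketch(unsigned int virq, unsigned int irq,
                              unsigned int gic_lines, unsigned int vgic_irqs)
{
    assert(gic_lines >= NR_LOCAL_IRQS_SKETCH);

    if ( !is_assignable_irq_sketch(irq, gic_lines) )
        return -EINVAL;

    /* The vIRQ must be one the domain's vGIC actually handles... */
    if ( virq >= vgic_irqs )
        return -EINVAL;

    /* ...and it must be a virtual SPI. */
    if ( virq < NR_LOCAL_IRQS_SKETCH )
        return -EINVAL;

    return 0;
}
```

Doing these checks before xmalloc() and before taking the descriptor lock means the early error paths have nothing to free or unlock.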
[Xen-devel] [PATCH v3 05/24] xen/arm: vgic: Introduce a function to initialize pending_irq
The structure pending_irq is initialized in the same way in 2 different
places. Introduce vgic_init_pending_irq to avoid code duplication.

Also move the setting of the irq field into this function as we need to
initialize it once rather than every time an IRQ is injected to the guest.

Finally, use unsigned int for the irq field to be consistent with the
virq variable.

Signed-off-by: Julien Grall <julien.gr...@linaro.org>
Acked-by: Stefano Stabellini <stefano.stabell...@eu.citrix.com>

---
    Changes in v3:
        - Add Stefano's acked
        - The irq field is now unsigned int
        - Update commit message to speak about the int -> unsigned int
          change
        - Use unsigned int rather than unsigned

    Changes in v2:
        - Patch added
---
 xen/arch/arm/gic.c         |  2 +-
 xen/arch/arm/vgic.c        | 19 ++-
 xen/include/asm-arm/vgic.h |  2 +-
 3 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 63147f3..eb0c5d6 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -627,7 +627,7 @@ void gic_dump_info(struct vcpu *v)
 
     list_for_each_entry ( p, &v->arch.vgic.inflight_irqs, inflight )
     {
-        printk("Inflight irq=%d lr=%u\n", p->irq, p->lr);
+        printk("Inflight irq=%u lr=%u\n", p->irq, p->lr);
     }
 
     list_for_each_entry( p, &v->arch.vgic.lr_pending, lr_queue )
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index 0b24eec..38216f7 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -60,6 +60,13 @@ struct vgic_irq_rank *vgic_rank_irq(struct vcpu *v, unsigned int irq)
     return vgic_get_rank(v, rank);
 }
 
+static void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq)
+{
+    INIT_LIST_HEAD(&p->inflight);
+    INIT_LIST_HEAD(&p->lr_queue);
+    p->irq = virq;
+}
+
 int domain_vgic_init(struct domain *d)
 {
     int i;
@@ -100,10 +107,8 @@ int domain_vgic_init(struct domain *d)
         return -ENOMEM;
 
     for (i=0; i<d->arch.vgic.nr_spis; i++)
-    {
-        INIT_LIST_HEAD(&d->arch.vgic.pending_irqs[i].inflight);
-        INIT_LIST_HEAD(&d->arch.vgic.pending_irqs[i].lr_queue);
-    }
+        vgic_init_pending_irq(&d->arch.vgic.pending_irqs[i], i + 32);
+
     for (i=0; i<DOMAIN_NR_RANKS(d); i++)
         spin_lock_init(&d->arch.vgic.shared_irqs[i].lock);
 
@@ -147,10 +152,7 @@ int vcpu_vgic_init(struct vcpu *v)
 
     memset(&v->arch.vgic.pending_irqs, 0, sizeof(v->arch.vgic.pending_irqs));
     for (i = 0; i < 32; i++)
-    {
-        INIT_LIST_HEAD(&v->arch.vgic.pending_irqs[i].inflight);
-        INIT_LIST_HEAD(&v->arch.vgic.pending_irqs[i].lr_queue);
-    }
+        vgic_init_pending_irq(&v->arch.vgic.pending_irqs[i], i);
 
     INIT_LIST_HEAD(&v->arch.vgic.inflight_irqs);
     INIT_LIST_HEAD(&v->arch.vgic.lr_pending);
@@ -407,7 +409,6 @@ void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int irq)
         goto out;
     }
 
-    n->irq = irq;
     n->priority = priority;
 
     /* the irq is enabled */
diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
index 460a2f3..8582d9d 100644
--- a/xen/include/asm-arm/vgic.h
+++ b/xen/include/asm-arm/vgic.h
@@ -67,7 +67,7 @@ struct pending_irq
 #define GIC_IRQ_GUEST_MIGRATING   4
     unsigned long status;
     struct irq_desc *desc; /* only set it the irq corresponds to a physical irq */
-    int irq;
+    unsigned int irq;
 #define GIC_INVALID_LR ~(uint8_t)0
     uint8_t lr;
     uint8_t priority;
-- 
2.1.4
Re: [Xen-devel] [PATCH 01/10] xen/arm: Implement hip04-d01 platform
On Tue, 2015-01-13 at 14:09 +, Frediano Ziglio wrote: 2015-01-13 11:58 GMT+00:00 Ian Campbell ian.campb...@citrix.com: On Mon, 2014-11-03 at 10:11 +, Frediano Ziglio wrote: Add this new platform to Xen. This platform require specific code to initialize CPUs. What is the bootwrapper? Are you running this on real silicon or on an emulator? Can the platform be made to do PSCI instead? Very real. It's actually on my desk and I'm not in Matrix :-) OK. The choice of bootwrapper as a name is a bit unfortunate, since it is already used for something else, but oh well. Has no PSCI support. Would be honestly very great. As we (as company) write the firmware could be technically doable. There is no plan. This piece of software is meant to bring the CPU from Secure mode to Unsecure Hypervisor mode before calling kernel/hypervisor code and provide supervisor calls. Sounds a lot like PSCI to me, except non-standard ;-) +np_fab = dt_find_compatible_node(NULL, NULL, hisilicon,hip04-fabric); Please add a reference to the DT bindings document for these values. linux/Documentation/devicetree/bindings/arm/hisilicon/hisilicon.txt seems related but doesn't talk about most of these fields. There are documentation in the Linaro kernel, see https://git.linaro.org/kernel/linux-linaro-tracking.git/blob/HEAD:/Documentation/devicetree/bindings/arm/hisilicon/hisilicon.txt. I hope it will be merged soon. Thanks, but this doesn't seem to cover many of the properties used by the code you are adding, e.g. bootwrapper-{size,magic}, relocation-{entry,size} (in fact it suggests they are part of a boot-method array). I get the feeling these might be legacy/deprecated. Perhaps we could get away without supporting such things? Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 08/14] xen-netback: use foreign page information from the pages themselves
On 12/01/15 15:43, David Vrabel wrote: From: Jenny Herbert jenny.herb...@citrix.com Use the foreign page flag in netback to get the domid and grant ref needed for the grant copy. This signficiantly simplifies the netback code and makes netback work with foreign pages from other backends (e.g., blkback). This allows blkback to use iSCSI disks provided by domUs running on the same host. Dave, This depends on several xen changes. It's been Acked-by: Ian Campbell ian.campb...@citrix.com Are you happy for me to merge this via the xen tree in 3.20? David Signed-off-by: Jenny Herbert jennifer.herb...@citrix.com Signed-off-by: David Vrabel david.vra...@citrix.com --- drivers/net/xen-netback/netback.c | 100 - 1 file changed, 9 insertions(+), 91 deletions(-) diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c index 6441318..ae3ab37 100644 --- a/drivers/net/xen-netback/netback.c +++ b/drivers/net/xen-netback/netback.c @@ -314,9 +314,7 @@ static struct xenvif_rx_meta *get_next_rx_buffer(struct xenvif_queue *queue, static void xenvif_gop_frag_copy(struct xenvif_queue *queue, struct sk_buff *skb, struct netrx_pending_operations *npo, struct page *page, unsigned long size, - unsigned long offset, int *head, - struct xenvif_queue *foreign_queue, - grant_ref_t foreign_gref) + unsigned long offset, int *head) { struct gnttab_copy *copy_gop; struct xenvif_rx_meta *meta; @@ -333,6 +331,8 @@ static void xenvif_gop_frag_copy(struct xenvif_queue *queue, struct sk_buff *skb offset = ~PAGE_MASK; while (size 0) { + struct xen_page_foreign *foreign; + BUG_ON(offset = PAGE_SIZE); BUG_ON(npo-copy_off MAX_BUFFER_OFFSET); @@ -361,9 +361,10 @@ static void xenvif_gop_frag_copy(struct xenvif_queue *queue, struct sk_buff *skb copy_gop-flags = GNTCOPY_dest_gref; copy_gop-len = bytes; - if (foreign_queue) { - copy_gop-source.domid = foreign_queue-vif-domid; - copy_gop-source.u.ref = foreign_gref; + foreign = xen_page_foreign(page); + if (foreign) { + 
copy_gop-source.domid = foreign-domid; + copy_gop-source.u.ref = foreign-gref; copy_gop-flags |= GNTCOPY_source_gref; } else { copy_gop-source.domid = DOMID_SELF; @@ -406,35 +407,6 @@ static void xenvif_gop_frag_copy(struct xenvif_queue *queue, struct sk_buff *skb } /* - * Find the grant ref for a given frag in a chain of struct ubuf_info's - * skb: the skb itself - * i: the frag's number - * ubuf: a pointer to an element in the chain. It should not be NULL - * - * Returns a pointer to the element in the chain where the page were found. If - * not found, returns NULL. - * See the definition of callback_struct in common.h for more details about - * the chain. - */ -static const struct ubuf_info *xenvif_find_gref(const struct sk_buff *const skb, - const int i, - const struct ubuf_info *ubuf) -{ - struct xenvif_queue *foreign_queue = ubuf_to_queue(ubuf); - - do { - u16 pending_idx = ubuf-desc; - - if (skb_shinfo(skb)-frags[i].page.p == - foreign_queue-mmap_pages[pending_idx]) - break; - ubuf = (struct ubuf_info *) ubuf-ctx; - } while (ubuf); - - return ubuf; -} - -/* * Prepare an SKB to be transmitted to the frontend. * * This function is responsible for allocating grant operations, meta @@ -459,8 +431,6 @@ static int xenvif_gop_skb(struct sk_buff *skb, int head = 1; int old_meta_prod; int gso_type; - const struct ubuf_info *ubuf = skb_shinfo(skb)-destructor_arg; - const struct ubuf_info *const head_ubuf = ubuf; old_meta_prod = npo-meta_prod; @@ -507,68 +477,16 @@ static int xenvif_gop_skb(struct sk_buff *skb, len = skb_tail_pointer(skb) - data; xenvif_gop_frag_copy(queue, skb, npo, - virt_to_page(data), len, offset, head, - NULL, - 0); + virt_to_page(data), len, offset, head); data += len; } for (i = 0; i nr_frags; i++) { - /* This variable also signals whether foreign_gref has a real - * value or not. - */ - struct xenvif_queue *foreign_queue = NULL; - grant_ref_t
Re: [Xen-devel] [PATCH v3 5/5] tools: add total/local memory bandwith monitoring
On Tue, Jan 13, 2015 at 04:02:13PM +0800, Chao Peng wrote:
> Add Memory Bandwidth Monitoring(MBM) for VMs. Two types of monitoring
> are supported: total and local memory bandwidth monitoring. To use it,
> CMT should be enabled in hypervisor.
>
> Signed-off-by: Chao Peng <chao.p.p...@linux.intel.com>
> ---
>  docs/man/xl.pod.1             |    9 +
>  tools/libxc/include/xenctrl.h |    2 +
>  tools/libxc/xc_psr.c          |    8
>  tools/libxl/libxl.h           |    8
>  tools/libxl/libxl_psr.c       |   84 +
>  tools/libxl/libxl_types.idl   |    2 +
>  tools/libxl/xl_cmdimpl.c      |   21 ++-
>  tools/libxl/xl_cmdtable.c     |    4 +-
>  8 files changed, 136 insertions(+), 2 deletions(-)
>
> diff --git a/docs/man/xl.pod.1 b/docs/man/xl.pod.1
> index 6b89ba8..0370625 100644
> --- a/docs/man/xl.pod.1
> +++ b/docs/man/xl.pod.1
> @@ -1461,6 +1461,13 @@ is domain level. To monitor a specific domain, just attach the domain id
>  with the monitoring service. When the domain doesn't need to be monitored any
>  more, detach the domain id from the monitoring service.
>
> +Intel Broadwell and later server platforms also offer total/local memory
> +bandwidth monitoring. Xen supports per-domain monitoring for these two
> +additional monitoring types. Both memory bandwidth monitoring and L3 cache
> +occupancy monitoring share the same set of underground monitoring service. Once
                                               ^^^
underlying? I'm not native speaker though.

I will defer reviewing this paragraph to a native English speaker.

> +a domain is attached to the monitoring service, monitoring data can be showed
> +for any of these monitoring types.
> +
>  =over 4
>

[...]

> +static int libxl__psr_cmt_get_mem_bandwidth(libxl__gc *gc,
> +                                            uint32_t domid,
> +                                            xc_psr_cmt_type type,
> +                                            uint32_t socketid,
> +                                            uint32_t *bandwidth)
> +{
> +    uint64_t sample1, sample2;
> +    uint32_t upscaling_factor;
> +    int retry_attempts = 0;
> +    int rc;
> +
> +    do {
> +        rc = libxl__psr_cmt_get_l3_monitoring_data(gc, domid, type, socketid,
> +                                                   &sample1);
> +        if (rc < 0) {
> +            rc = ERROR_FAIL;
> +            goto out;
> +        }
> +
> +        usleep(1);
> +
> +        rc = libxl__psr_cmt_get_l3_monitoring_data(gc, domid, type, socketid,
> +                                                   &sample2);
> +        if (rc < 0) {
> +             rc = ERROR_FAIL;
> +             goto out;
> +        }
> +
> +        if (sample2 >= sample1)

If sample2 == sample1 then bandwidth is zero. Is this expected?

> +            break;
> +
> +        if (retry_attempts < MBM_SAMPLE_RETRY_MAX) {
> +            retry_attempts++;
> +        } else {
> +            LOGE(ERROR, "event counter overflowed");
> +            rc = ERROR_FAIL;
> +            goto out;
> +        }
> +
> +    } while(1);

Minor nit, should be while (1).

The rest of this patch looks OK to me.

Wei.
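To make the sampling question above concrete, here is a small standalone sketch of the "take two samples, retry if the counter wrapped" loop. It is not the libxl code: the sampler is a hypothetical callback, SAMPLE_RETRY_MAX_SKETCH is an arbitrary bound, and the conversion from a counter delta to a bandwidth figure (the upscaling factor and the sampling interval) is deliberately left out because the quoted hunk does not show it.

```c
#include <assert.h>
#include <stdint.h>

#define SAMPLE_RETRY_MAX_SKETCH 4   /* hypothetical retry bound */

/* Hypothetical sampler: stores the current counter value in *out and
 * returns 0, or returns a negative value on failure. */
typedef int (*sample_fn)(void *ctx, uint64_t *out);

/* Fake counter replaying a fixed list of values, for demonstration only. */
struct fake_counter {
    const uint64_t *values;
    unsigned int next;
};

static int fake_sample(void *ctx, uint64_t *out)
{
    struct fake_counter *c = ctx;

    *out = c->values[c->next++];
    return 0;
}

/* Take pairs of samples until the second is not smaller than the first,
 * i.e. until no wrap-around happened between the two reads. */
static int counter_delta_sketch(sample_fn sample, void *ctx, uint64_t *delta)
{
    uint64_t s1, s2;
    int attempts;

    assert(sample && delta);

    for (attempts = 0; attempts <= SAMPLE_RETRY_MAX_SKETCH; attempts++) {
        if (sample(ctx, &s1) < 0 || sample(ctx, &s2) < 0)
            return -1;

        if (s2 >= s1) {
            /* Equal samples give a delta of zero: no traffic was seen. */
            *delta = s2 - s1;
            return 0;
        }
    }

    return -1;   /* the counter kept wrapping */
}
```

With this shape, sample2 == sample1 simply reports a delta of zero, which seems a plausible result for a domain that generated no memory traffic during the interval; whether that is the intended behaviour is for the patch author to confirm.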
Re: [Xen-devel] [PATCH v5 6/9] libxl: add libxl__domain_soft_reset_destroy()
On Thu, 2014-12-11 at 14:45 +0100, Vitaly Kuznetsov wrote: New libxl__domain_soft_reset_destroy() is an internal-only version of libxl_domain_destroy() which follows the same domain destroy path with the only difference: xc_domain_destroy() is being avoided so the domain is not actually being destroyed. Rather than duplicating the bulk of libxl_domain_destroy, please make this libxl__domain_destroy taking a flag and turn libxl_domain_destroy into a thin wrapper around the new internal version. Add soft_reset flag to libxl__domain_destroy_state structure to support the change. The original libxl_domain_destroy() function could be easily modified to support new flag but I'm trying to avoid that as it is part of public API. There are mechanisms which could be used here to rev the API if it was desirable to expose this flag to the calling toolstack for some reason, e.g. checkout the uses of LIBXL_API_VERSION in libxl.h. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
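The "internal function taking a flag, public function as a thin wrapper" shape suggested above could look roughly like the following. This is a toy model and not libxl code: struct domain_sketch and all three function names are hypothetical, and the two boolean fields merely stand in for "toolstack-side teardown was done" and "xc_domain_destroy() was called".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the state touched by a domain destroy. */
struct domain_sketch {
    bool devices_released;             /* toolstack-side teardown done */
    bool hypervisor_domain_destroyed;  /* stands in for xc_domain_destroy() */
};

/* Internal version: one teardown path, with the final step made optional. */
static int domain_destroy_common_sketch(struct domain_sketch *d,
                                        bool soft_reset)
{
    assert(d != NULL);

    d->devices_released = true;

    /* On soft reset the domain itself is kept alive. */
    if (!soft_reset)
        d->hypervisor_domain_destroyed = true;

    return 0;
}

/* Public entry point: signature unchanged, just forwards to the internal
 * version with the flag cleared. */
int domain_destroy_sketch(struct domain_sketch *d)
{
    return domain_destroy_common_sketch(d, false);
}

/* Internal-only soft reset variant shares the same path. */
int domain_soft_reset_destroy_sketch(struct domain_sketch *d)
{
    return domain_destroy_common_sketch(d, true);
}
```

The public function keeps its existing signature, so the API concern raised in the patch description does not arise, and the teardown logic lives in one place.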
Re: [Xen-devel] [PATCH] libxl: provide xenlight.pc
On Tue, 2015-01-13 at 12:56 +, Wei Liu wrote: On Tue, Jan 13, 2015 at 01:19:05PM +0100, Olaf Hering wrote: On Tue, Jan 13, Ian Campbell wrote: On Fri, 2015-01-09 at 14:32 +, Wei Liu wrote: A pkg-config file for libxl. It also contains two variables (xenfirmwaredir and libexec_bin) so that tools that are very keen on knowing the locations of Xen binaries (say, libvirt) can use them to determine the location of the binaries. Please rerun autogen.sh after applying this patch. Forgot to reply to this earlier: Should there really be another file.in.in.in.in mess? I think the major/minor values could be placed into some m4 file so that they can be substituted properly by configure. I was two minded when I wrote this path. On one hand I didn't want to place a m4 file here, on the other I didn't want to leak library version numbers to top level m4 directory. Finally I decided to do the .in.in trick. So if you have an argument for either of these please convince me... Or you have other idea about file placement please tell me. I think the library SONAME belongs in the relevant Makefile, not hidden in the m4 somewhere. Which I think necessitates .in.in. I think we can live with that. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH v3 20/24] xen/passthrough: Extend XEN_DOMCTL_assign_device to support DT device
TODO: Update the commit message A device node is described by a path. It will be used to retrieved the node in the device tree and assign the related device to the domain. Only device protected by an IOMMU can be assigned to a guest. Signed-off-by: Julien Grall julien.gr...@linaro.org Cc: Ian Jackson ian.jack...@eu.citrix.com Cc: Wei Liu wei.l...@citrix.com Cc: Jan Beulich jbeul...@suse.com --- Changes in v2: - Use a different number for XEN_DOMCTL_assign_dt_device --- tools/libxc/include/xenctrl.h | 10 tools/libxc/xc_domain.c | 95 -- xen/drivers/passthrough/device_tree.c | 97 +-- xen/drivers/passthrough/iommu.c | 7 +++ xen/drivers/passthrough/pci.c | 43 +++- xen/include/public/domctl.h | 15 +- xen/include/xen/iommu.h | 3 ++ 7 files changed, 249 insertions(+), 21 deletions(-) diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h index d66571f..db45475 100644 --- a/tools/libxc/include/xenctrl.h +++ b/tools/libxc/include/xenctrl.h @@ -2055,6 +2055,16 @@ int xc_deassign_device(xc_interface *xch, uint32_t domid, uint32_t machine_bdf); +int xc_assign_dt_device(xc_interface *xch, +uint32_t domid, +char *path); +int xc_test_assign_dt_device(xc_interface *xch, + uint32_t domid, + char *path); +int xc_deassign_dt_device(xc_interface *xch, + uint32_t domid, + char *path); + int xc_domain_memory_mapping(xc_interface *xch, uint32_t domid, unsigned long first_gfn, diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c index eb066cf..bca3aee 100644 --- a/tools/libxc/xc_domain.c +++ b/tools/libxc/xc_domain.c @@ -1637,7 +1637,8 @@ int xc_assign_device( domctl.cmd = XEN_DOMCTL_assign_device; domctl.domain = domid; -domctl.u.assign_device.machine_sbdf = machine_sbdf; +domctl.u.assign_device.dev = XEN_DOMCTL_DEV_PCI; +domctl.u.assign_device.u.pci.machine_sbdf = machine_sbdf; return do_domctl(xch, domctl); } @@ -1686,7 +1687,8 @@ int xc_test_assign_device( domctl.cmd = XEN_DOMCTL_test_assign_device; domctl.domain = domid; 
-domctl.u.assign_device.machine_sbdf = machine_sbdf; +domctl.u.assign_device.dev = XEN_DOMCTL_DEV_PCI; +domctl.u.assign_device.u.pci.machine_sbdf = machine_sbdf; return do_domctl(xch, domctl); } @@ -1700,11 +1702,96 @@ int xc_deassign_device( domctl.cmd = XEN_DOMCTL_deassign_device; domctl.domain = domid; -domctl.u.assign_device.machine_sbdf = machine_sbdf; - +domctl.u.assign_device.dev = XEN_DOMCTL_DEV_PCI; +domctl.u.assign_device.u.pci.machine_sbdf = machine_sbdf; + return do_domctl(xch, domctl); } +int xc_assign_dt_device( +xc_interface *xch, +uint32_t domid, +char *path) +{ +int rc; +size_t size = strlen(path); +DECLARE_DOMCTL; +DECLARE_HYPERCALL_BOUNCE(path, size, XC_HYPERCALL_BUFFER_BOUNCE_IN); + +if ( xc_hypercall_bounce_pre(xch, path) ) +return -1; + +domctl.cmd = XEN_DOMCTL_assign_device; +domctl.domain = (domid_t)domid; + +domctl.u.assign_device.dev = XEN_DOMCTL_DEV_DT; +domctl.u.assign_device.u.dt.size = size; +set_xen_guest_handle(domctl.u.assign_device.u.dt.path, path); + +rc = do_domctl(xch, domctl); + +xc_hypercall_bounce_post(xch, path); + +return rc; +} + +int xc_test_assign_dt_device( +xc_interface *xch, +uint32_t domid, +char *path) +{ +int rc; +size_t size = strlen(path); +DECLARE_DOMCTL; +DECLARE_HYPERCALL_BOUNCE(path, size, XC_HYPERCALL_BUFFER_BOUNCE_IN); + +if ( xc_hypercall_bounce_pre(xch, path) ) +return -1; + +domctl.cmd = XEN_DOMCTL_test_assign_device; +domctl.domain = (domid_t)domid; + +domctl.u.assign_device.dev = XEN_DOMCTL_DEV_DT; +domctl.u.assign_device.u.dt.size = size; +set_xen_guest_handle(domctl.u.assign_device.u.dt.path, path); + +rc = do_domctl(xch, domctl); + +xc_hypercall_bounce_post(xch, path); + +return rc; +} + +int xc_deassign_dt_device( +xc_interface *xch, +uint32_t domid, +char *path) +{ +int rc; +size_t size = strlen(path); +DECLARE_DOMCTL; +DECLARE_HYPERCALL_BOUNCE(path, size, XC_HYPERCALL_BUFFER_BOUNCE_IN); + +if ( xc_hypercall_bounce_pre(xch, path) ) +return -1; + +domctl.cmd = XEN_DOMCTL_deassign_device; 
+domctl.domain = (domid_t)domid; + +domctl.u.assign_device.dev = XEN_DOMCTL_DEV_DT; +domctl.u.assign_device.u.dt.size = size; +set_xen_guest_handle(domctl.u.assign_device.u.dt.path, path); + +rc = do_domctl(xch, domctl); + +xc_hypercall_bounce_post(xch, path); + +return rc; +} + + + + int
Re: [Xen-devel] [PATCH v5 4/9] xen: introduce XEN_DOMCTL_devour
At 13:53 + on 13 Jan (1421153637), Ian Campbell wrote: On Thu, 2014-12-11 at 14:45 +0100, Vitaly Kuznetsov wrote: +gmfn = mfn_to_gmfn(d, mfn); (I haven't thought about it super hard, but I'm taking it as given that this approach to kexec is going to be needed for ARM too, since that seems likely) mfn_to_gmfn is going to be a bit pricey on ARM, we don't have an m2p to refer to, I'm not sure what we would do instead, walking the p2m looking for mfns surely won't be a good idea! An alternative approach to this might be to walk the guest p2m (with appropriate continuations) and move each domheap page (this would also help us preserve super page mappings). It would also have the advantage of not needing additional stages in the destroy path and state in struct domain etc, since all the action would be constrained to the one hypercall. x86 folks, would that work for your p2m too? Without having looked at the details, it sounds plausible to me. Tim. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [RFC PATCHv1 net-next] xen-netback: always fully coalesce guest Rx packets
Always fully coalesce guest Rx packets into the minimum number of ring
slots.  Reducing the number of slots per packet has significant
performance benefits (e.g., 7.2 Gbit/s to 11 Gbit/s in an off-host
receive test).

However, this does increase the number of grant ops per packet which
decreases performance with some workloads (intrahost VM to VM)
/unless/ grant copy has been optimized for adjacent ops with the same
source or destination (see "grant-table: defer releasing pages
acquired in a grant copy"[1]).

Do we need to retain the existing path and make the always coalesce
path conditional on a suitable version of Xen?

[1] http://lists.xen.org/archives/html/xen-devel/2015-01/msg01118.html

Signed-off-by: David Vrabel <david.vra...@citrix.com>
---
 drivers/net/xen-netback/common.h  |    1 -
 drivers/net/xen-netback/netback.c |  106 ++---
 2 files changed, 3 insertions(+), 104 deletions(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index 5f1fda4..589fa25 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -251,7 +251,6 @@ struct xenvif {
 struct xenvif_rx_cb {
 	unsigned long expires;
 	int meta_slots_used;
-	bool full_coalesce;
 };
 
 #define XENVIF_RX_CB(skb) ((struct xenvif_rx_cb *)(skb)->cb)
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 908e65e..568238d 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -233,51 +233,6 @@ static void xenvif_rx_queue_drop_expired(struct xenvif_queue *queue)
 	}
 }
 
-/*
- * Returns true if we should start a new receive buffer instead of
- * adding 'size' bytes to a buffer which currently contains 'offset'
- * bytes.
- */
-static bool start_new_rx_buffer(int offset, unsigned long size, int head,
-				bool full_coalesce)
-{
-	/* simple case: we have completely filled the current buffer. */
-	if (offset == MAX_BUFFER_OFFSET)
-		return true;
-
-	/*
-	 * complex case: start a fresh buffer if the current frag
-	 * would overflow the current buffer but only if:
-	 *     (i)   this frag would fit completely in the next buffer
-	 * and (ii)  there is already some data in the current buffer
-	 * and (iii) this is not the head buffer.
-	 * and (iv)  there is no need to fully utilize the buffers
-	 *
-	 * Where:
-	 * - (i) stops us splitting a frag into two copies
-	 *   unless the frag is too large for a single buffer.
-	 * - (ii) stops us from leaving a buffer pointlessly empty.
-	 * - (iii) stops us leaving the first buffer
-	 *   empty. Strictly speaking this is already covered
-	 *   by (ii) but is explicitly checked because
-	 *   netfront relies on the first buffer being
-	 *   non-empty and can crash otherwise.
-	 * - (iv) is needed for skbs which can use up more than MAX_SKB_FRAGS
-	 *   slot
-	 *
-	 * This means we will effectively linearise small
-	 * frags but do not needlessly split large buffers
-	 * into multiple copies tend to give large frags their
-	 * own buffers as before.
-	 */
-	BUG_ON(size > MAX_BUFFER_OFFSET);
-	if ((offset + size > MAX_BUFFER_OFFSET) && offset && !head &&
-	    !full_coalesce)
-		return true;
-
-	return false;
-}
-
 struct netrx_pending_operations {
 	unsigned copy_prod, copy_cons;
 	unsigned meta_prod, meta_cons;
@@ -336,24 +291,13 @@ static void xenvif_gop_frag_copy(struct xenvif_queue *queue, struct sk_buff *skb
 		BUG_ON(offset >= PAGE_SIZE);
 		BUG_ON(npo->copy_off > MAX_BUFFER_OFFSET);
 
-		bytes = PAGE_SIZE - offset;
+		if (npo->copy_off == MAX_BUFFER_OFFSET)
+			meta = get_next_rx_buffer(queue, npo);
 
+		bytes = PAGE_SIZE - offset;
 		if (bytes > size)
 			bytes = size;
 
-		if (start_new_rx_buffer(npo->copy_off,
-					bytes,
-					*head,
-					XENVIF_RX_CB(skb)->full_coalesce)) {
-			/*
-			 * Netfront requires there to be some data in the head
-			 * buffer.
-			 */
-			BUG_ON(*head);
-
-			meta = get_next_rx_buffer(queue, npo);
-		}
-
 		if (npo->copy_off + bytes > MAX_BUFFER_OFFSET)
 			bytes = MAX_BUFFER_OFFSET - npo->copy_off;
 
@@ -652,60 +596,16 @@ static void xenvif_rx_action(struct xenvif_queue *queue)
 
 	while (xenvif_rx_ring_slots_available(queue, XEN_NETBK_RX_SLOTS_MAX)
 	       && (skb = xenvif_rx_dequeue(queue)) != NULL) {
-		RING_IDX max_slots_needed;
[Xen-devel] [seabios test] 33391: tolerable FAIL - PUSHED
flight 33391 seabios real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/33391/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-intel             9 guest-start  fail  never pass
 test-amd64-i386-libvirt                   9 guest-start  fail  never pass
 test-amd64-amd64-xl-pvh-amd               9 guest-start  fail  never pass
 test-amd64-amd64-libvirt                  9 guest-start  fail  never pass
 test-amd64-amd64-xl-pcipt-intel           9 guest-start  fail  never pass
 test-amd64-i386-xl-qemut-win7-amd64      14 guest-stop   fail  never pass
 test-amd64-amd64-xl-winxpsp3             14 guest-stop   fail  never pass
 test-amd64-i386-xl-winxpsp3-vcpus1       14 guest-stop   fail  never pass
 test-amd64-i386-xl-qemut-winxpsp3        14 guest-stop   fail  never pass
 test-amd64-amd64-xl-qemut-winxpsp3       14 guest-stop   fail  never pass
 test-amd64-i386-xl-win7-amd64            14 guest-stop   fail  never pass
 test-amd64-i386-xl-qemuu-win7-amd64      14 guest-stop   fail  never pass
 test-amd64-amd64-xl-win7-amd64           14 guest-stop   fail  never pass
 test-amd64-amd64-xl-qemuu-win7-amd64     14 guest-stop   fail  never pass
 test-amd64-amd64-xl-qemut-win7-amd64     14 guest-stop   fail  never pass
 test-amd64-amd64-xl-qemuu-winxpsp3       14 guest-stop   fail  never pass
 test-amd64-i386-xl-qemuu-winxpsp3        14 guest-stop   fail  never pass
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 14 guest-stop   fail  never pass
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 14 guest-stop   fail  never pass
 test-amd64-i386-xl-winxpsp3              14 guest-stop   fail  never pass

version targeted for testing:
 seabios              301dd092c2d04a5d70c94b9d873d810785e94a84
baseline version:
 seabios              60e0e55f212dadd043ab9e39bee05a48013ddd8f

People who touched revisions under test:
  Kevin O'Connor <ke...@koconnor.net>

jobs:
 build-amd64                                  pass
 build-i386                                   pass
 build-amd64-libvirt                          pass
 build-i386-libvirt                           pass
 build-amd64-pvops                            pass
 build-i386-pvops                             pass
 test-amd64-amd64-xl                          pass
 test-amd64-i386-xl                           pass
 test-amd64-amd64-xl-pvh-amd                  fail
 test-amd64-i386-rhel6hvm-amd                 pass
 test-amd64-i386-qemut-rhel6hvm-amd           pass
 test-amd64-i386-qemuu-rhel6hvm-amd           pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64    pass
 test-amd64-i386-xl-qemut-debianhvm-amd64     pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64    pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64     pass
 test-amd64-i386-freebsd10-amd64              pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64         pass
 test-amd64-i386-xl-qemuu-ovmf-amd64          pass
 test-amd64-amd64-xl-qemut-win7-amd64         fail
 test-amd64-i386-xl-qemut-win7-amd64          fail
 test-amd64-amd64-xl-qemuu-win7-amd64         fail
 test-amd64-i386-xl-qemuu-win7-amd64          fail
 test-amd64-amd64-xl-win7-amd64               fail
 test-amd64-i386-xl-win7-amd64                fail
 test-amd64-i386-xl-credit2                   pass
 test-amd64-i386-freebsd10-i386               pass
 test-amd64-amd64-xl-pcipt-intel              fail
 test-amd64-amd64-xl-pvh-intel                fail
 test-amd64-i386-rhel6hvm-intel               pass
 test-amd64-i386-qemut-rhel6hvm-intel         pass
 test-amd64-i386-qemuu-rhel6hvm-intel         pass
 test-amd64-amd64-libvirt                     fail
 test-amd64-i386-libvirt                      fail
 test-amd64-i386-xl-multivcpu                 pass
 test-amd64-amd64-pair                        pass
 test-amd64-i386-pair                         pass
 test-amd64-amd64-xl-sedf-pin                 pass
 test-amd64-amd64-xl-sedf                     pass
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1     fail
Re: [Xen-devel] [OSSTEST PATCH] make-flight: reorganize scheduling related test jobs
On Mon, 2015-01-12 at 16:52 +, Ian Jackson wrote:
> Dario Faggioli writes ("[OSSTEST PATCH] make-flight: reorganize scheduling related test jobs"):
> > Scheduling related tests are ok to run on ARM, so do not cut them off.
> > They also do not depend on a particular Dom0 architecture.
> >
> > The net effect is that the following tests are removed:
> >   test-amd64-i386-xl-credit2
> >   test-amd64-i386-xl-multivcpu
> >
> > while the following new ones are created:
> >   test-amd64-amd64-xl-credit2
> >   test-amd64-amd64-xl-multivcpu
> >   test-armhf-armhf-xl-credit2
> >   test-armhf-armhf-xl-multivcpu
> >   test-armhf-armhf-xl-sedf
> >   test-armhf-armhf-xl-sedf-pin
>
> This looks plausible but can you include the output of a diff between
> the two sets of runvars, please ?

Not sure I follow. Here is a diff of two invocations of
`./mg-show-flight-runvars standalone', one done before and the other
after the patch... Was it that?

$ diff -Nru runvars.orig runvars.patched
--- runvars.orig	2015-01-13 09:49:17.402478000 +
+++ runvars.patched	2015-01-13 09:49:56.794085000 +
@@ -3,6 +3,7 @@
 test-amd64-amd64-rumpuserxen-amd64        all_hostflags  arch-amd64,arch-xen-amd64,suite-wheezy,purpose-test
 test-amd64-amd64-xl                       all_hostflags  arch-amd64,arch-xen-amd64,suite-wheezy,purpose-test
 test-amd64-amd64-xl-credit2               all_hostflags  arch-amd64,arch-xen-amd64,suite-wheezy,purpose-test
+test-amd64-amd64-xl-multivcpu             all_hostflags  arch-amd64,arch-xen-amd64,suite-wheezy,purpose-test
 test-amd64-amd64-xl-pcipt-intel           all_hostflags  arch-amd64,arch-xen-amd64,suite-wheezy,purpose-test,hvm-intel,pcipassthrough-nic
 test-amd64-amd64-xl-pvh-amd               all_hostflags  arch-amd64,arch-xen-amd64,suite-wheezy,purpose-test,hvm-amd
 test-amd64-amd64-xl-pvh-intel             all_hostflags  arch-amd64,arch-xen-amd64,suite-wheezy,purpose-test,hvm-intel
@@ -29,7 +30,6 @@
 test-amd64-i386-rhel6hvm-intel            all_hostflags  arch-i386,arch-xen-amd64,suite-wheezy,purpose-test,hvm-intel
 test-amd64-i386-rumpuserxen-i386          all_hostflags  arch-i386,arch-xen-amd64,suite-wheezy,purpose-test
 test-amd64-i386-xl                        all_hostflags  arch-i386,arch-xen-amd64,suite-wheezy,purpose-test
-test-amd64-i386-xl-multivcpu              all_hostflags  arch-i386,arch-xen-amd64,suite-wheezy,purpose-test
 test-amd64-i386-xl-qemut-debianhvm-amd64  all_hostflags  arch-i386,arch-xen-amd64,suite-wheezy,purpose-test,hvm
 test-amd64-i386-xl-qemut-win7-amd64       all_hostflags  arch-i386,arch-xen-amd64,suite-wheezy,purpose-test,hvm
 test-amd64-i386-xl-qemut-winxpsp3         all_hostflags  arch-i386,arch-xen-amd64,suite-wheezy,purpose-test,hvm
@@ -44,6 +44,10 @@
 test-amd64-i386-xl-winxpsp3-vcpus1        all_hostflags  arch-i386,arch-xen-amd64,suite-wheezy,purpose-test,hvm
 test-armhf-armhf-libvirt                  all_hostflags  arch-armhf,arch-xen-armhf,suite-wheezy,purpose-test
 test-armhf-armhf-xl                       all_hostflags  arch-armhf,arch-xen-armhf,suite-wheezy,purpose-test
+test-armhf-armhf-xl-credit2               all_hostflags  arch-armhf,arch-xen-armhf,suite-wheezy,purpose-test
+test-armhf-armhf-xl-multivcpu             all_hostflags  arch-armhf,arch-xen-armhf,suite-wheezy,purpose-test
+test-armhf-armhf-xl-sedf                  all_hostflags  arch-armhf,arch-xen-armhf,suite-wheezy,purpose-test
+test-armhf-armhf-xl-sedf-pin              all_hostflags  arch-armhf,arch-xen-armhf,suite-wheezy,purpose-test
 build-amd64                               arch           amd64
 build-amd64-libvirt                       arch           amd64
 build-amd64-oldkern                       arch           amd64
@@ -62,6 +66,7 @@
 test-amd64-amd64-rumpuserxen-amd64        arch           amd64
[Xen-devel] [PATCH v3 13/24] xen/arm: Implement hypercall PHYSDEVOP_{, un}map_pirq
The physdev sub-hypercalls PHYSDEVOP_{,un}map_pirq allow the toolstack to
assign/deassign a physical IRQ to the guest (via the config option "irqs"
for xl). The x86 version uses them with PIRQs (IRQs bound to an event
channel). As ARM has no such concept, we can reuse them to bind a
physical IRQ to a virtual IRQ.

For now, we allow only SPIs to be mapped to the guest.
The type MAP_PIRQ_TYPE_GSI is used for this purpose.

Signed-off-by: Julien Grall <julien.gr...@linaro.org>
Cc: Jan Beulich <jbeul...@suse.com>

---
    I'm not sure it's the best solution to reuse hypercalls for a
    different purpose. If x86 plans to have such a concept (i.e. binding
    a physical IRQ to a virtual IRQ), we could introduce new hypercalls.
    Any thoughts?

    TODO: This patch lacks support for vIRQ != IRQ. I plan to handle it
    correctly in the next version.

    Changes in v3:
        - Functions to allocate/release/reserve a VIRQ have been moved
        to a separate patch
        - Make clear that only MAP_PIRQ_TYPE_GSI is supported for now

    Changes in v2:
        - Add PHYSDEVOP_unmap_pirq
        - Rework commit message
        - Add functions to allocate/release a VIRQ
        - is_routable_irq has been renamed into is_assignable_irq
---
 xen/arch/arm/physdev.c | 136 -
 1 file changed, 134 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/physdev.c b/xen/arch/arm/physdev.c
index 61b4a18..0cf9bbd 100644
--- a/xen/arch/arm/physdev.c
+++ b/xen/arch/arm/physdev.c
@@ -8,13 +8,145 @@
 #include <xen/types.h>
 #include <xen/lib.h>
 #include <xen/errno.h>
+#include <xen/iocap.h>
+#include <xen/guest_access.h>
+#include <xsm/xsm.h>
+#include <asm/current.h>
 #include <asm/hypercall.h>
+#include <public/physdev.h>
 
+static int physdev_map_pirq(domid_t domid, int type, int index, int *pirq_p)
+{
+    struct domain *d;
+    int ret;
+    int irq = index;
+    int virq;
+
+    d = rcu_lock_domain_by_any_id(domid);
+    if ( d == NULL )
+        return -ESRCH;
+
+    ret = xsm_map_domain_pirq(XSM_TARGET, d);
+    if ( ret )
+        goto free_domain;
+
+    /* For now we only support GSI */
+    if ( type != MAP_PIRQ_TYPE_GSI )
+    {
+        ret = -EINVAL;
+        dprintk(XENLOG_G_ERR,
+                "dom%u: wrong map_pirq type 0x%x, only MAP_PIRQ_TYPE_GSI is supported.\n",
+                d->domain_id, type);
+        goto free_domain;
+    }
+
+    if ( !is_assignable_irq(irq) )
+    {
+        ret = -EINVAL;
+        dprintk(XENLOG_G_ERR, "IRQ%u is not routable to a guest\n", irq);
+        goto free_domain;
+    }
+
+    ret = -EPERM;
+    if ( !irq_access_permitted(current->domain, irq) )
+        goto free_domain;
+
+    if ( *pirq_p < 0 )
+    {
+        BUG_ON(irq < 16);   /* is_assignable_irq already denies SGIs */
+        virq = vgic_allocate_virq(d, (irq >= 32));
+
+        ret = -ENOSPC;
+        if ( virq < 0 )
+            goto free_domain;
+    }
+    else
+    {
+        ret = -EBUSY;
+        virq = *pirq_p;
+
+        if ( !vgic_reserve_virq(d, virq) )
+            goto free_domain;
+    }
+
+    gdprintk(XENLOG_DEBUG, "irq = %u virq = %u\n", irq, virq);
+
+    ret = route_irq_to_guest(d, virq, irq, "routed IRQ");
+
+    if ( !ret )
+        *pirq_p = virq;
+    else
+        vgic_free_virq(d, virq);
+
+free_domain:
+    rcu_unlock_domain(d);
+
+    return ret;
+}
+
+int physdev_unmap_pirq(domid_t domid, int pirq)
+{
+    struct domain *d;
+    int ret;
+
+    d = rcu_lock_domain_by_any_id(domid);
+    if ( d == NULL )
+        return -ESRCH;
+
+    ret = xsm_unmap_domain_pirq(XSM_TARGET, d);
+    if ( ret )
+        goto free_domain;
+
+    ret = release_guest_irq(d, pirq);
+    if ( ret )
+        goto free_domain;
+
+    vgic_free_virq(d, pirq);
+
+free_domain:
+    rcu_unlock_domain(d);
+
+    return ret;
+}
 
 int do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 {
-    printk("%s %d cmd=%d: not implemented yet\n", __func__, __LINE__, cmd);
-    return -ENOSYS;
+    int ret;
+
+    switch ( cmd )
+    {
+    case PHYSDEVOP_map_pirq:
+        {
+            physdev_map_pirq_t map;
+
+            ret = -EFAULT;
+            if ( copy_from_guest(&map, arg, 1) != 0 )
+                break;
+
+            ret = physdev_map_pirq(map.domid, map.type, map.index, &map.pirq);
+
+            if ( __copy_to_guest(arg, &map, 1) )
+                ret = -EFAULT;
+        }
+        break;
+
+    case PHYSDEVOP_unmap_pirq:
+        {
+            physdev_unmap_pirq_t unmap;
+
+            ret = -EFAULT;
+            if ( copy_from_guest(&unmap, arg, 1) != 0 )
+                break;
+
+            ret = physdev_unmap_pirq(unmap.domid, unmap.pirq);
+        }
+
+    default:
+        ret = -ENOSYS;
+        break;
+    }
+
+    return ret;
 }
 
 /*
-- 
2.1.4
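A note for readers of the switch in do_physdev_op() as quoted in this message: the PHYSDEVOP_unmap_pirq case has no break before the default label, so as posted the unmap result would be overwritten with -ENOSYS. The stand-alone sketch below only reproduces that control-flow shape. The command number and the helper names (OP_UNMAP, handle_unmap, dispatch_as_posted, dispatch_with_break) are made up for the illustration and are not Xen code.

```c
#include <errno.h>

enum { OP_UNMAP = 2 };  /* hypothetical command number, for illustration only */

/* Stand-in for physdev_unmap_pirq(); pretend the unmap succeeded. */
static int handle_unmap(void)
{
    return 0;
}

/* Same shape as the quoted switch: no break after the unmap case. */
static int dispatch_as_posted(int cmd)
{
    int ret;

    switch (cmd) {
    case OP_UNMAP:
        ret = handle_unmap();
        /* falls through: the result above is overwritten below */
    default:
        ret = -ENOSYS;
        break;
    }

    return ret;
}

/* Same switch with the break added after the unmap case. */
static int dispatch_with_break(int cmd)
{
    int ret;

    switch (cmd) {
    case OP_UNMAP:
        ret = handle_unmap();
        break;
    default:
        ret = -ENOSYS;
        break;
    }

    return ret;
}
```

With these definitions, dispatch_as_posted(OP_UNMAP) returns -ENOSYS even though the unmap stand-in succeeded, while dispatch_with_break(OP_UNMAP) returns 0.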
[Xen-devel] [PATCH v3 01/24] xen: Extend DOMCTL createdomain to support arch configuration
On ARM the virtual GIC may differ between guests (emulated GIC version,
number of SPIs...). This information is already known at domain
creation and can never change.

For now only the gic_version is set. In the long run, there will be more
parameters such as the number of SPIs. All will be required to be set at
the same time.

A new arch-specific structure arch_domainconfig has been created. The x86
one doesn't have any specific configuration, so a dummy structure (C-spec
compliant) has been created to factorize the code in the toolstack.

Some external tools (qemu, xenstore) may need to create a domain. Rather
than asking them to take care of the arch-specific domain configuration,
let the current function (xc_domain_create) choose a default configuration
and introduce a new one (xc_domain_create_config).

This patch also drops the DOMCTL arm_configure_domain previously
introduced in Xen 4.5, as it has been made useless.

Signed-off-by: Julien Grall <julien.gr...@linaro.org>
Cc: Daniel De Graaf <dgde...@tycho.nsa.gov>
Cc: Ian Jackson <ian.jack...@eu.citrix.com>
Cc: Wei Liu <wei.l...@citrix.com>
Cc: Stefano Stabellini <stefano.stabell...@citrix.com>
Cc: Keir Fraser <k...@xen.org>
Cc: Jan Beulich <jbeul...@suse.com>
Cc: Andrew Cooper <andrew.coop...@citrix.com>
Cc: George Dunlap <george.dun...@eu.citrix.com>

---
This is a follow-up of
http://lists.xen.org/archives/html/xen-devel/2014-11/msg00522.html

TODO: What about migration? For now the configuration lives in an internal
libxl structure. We need a way to pass the domain configuration to the
other end. I'm not sure if we should care about this right now as
migration doesn't yet exist on ARM.

For xc_domain_create, Stefano S. was looking to drop PV domain creation
support in QEMU. So maybe I could simply extend xc_domain_create and drop
xc_domain_create_config.
Changes in v3:
    - Patch was previously sent in a separate series [1]
    - Rename arch_domainconfig to xen_arch_domainconfig
    - Drop the typedef
    - Pass NULL for DOM0 config on x86
    - Drop spurious changes
    - Update comment in start_xen in arch/arm/setup.c

[1] https://patches.linaro.org/41083/
---
 tools/flask/policy/policy/modules/xen/xen.if |  2 +-
 tools/libxc/include/xenctrl.h                | 14 +
 tools/libxc/xc_domain.c                      | 46
 tools/libxl/libxl_arch.h                     |  6
 tools/libxl/libxl_arm.c                      | 28 ++---
 tools/libxl/libxl_create.c                   | 21 ++---
 tools/libxl/libxl_dm.c                       |  3 +-
 tools/libxl/libxl_dom.c                      |  2 +-
 tools/libxl/libxl_internal.h                 |  7 +++--
 tools/libxl/libxl_x86.c                      | 10 ++
 xen/arch/arm/domain.c                        | 28 -
 xen/arch/arm/domctl.c                        | 34
 xen/arch/arm/mm.c                            |  6 ++--
 xen/arch/arm/setup.c                         |  6 +++-
 xen/arch/x86/domain.c                        |  3 +-
 xen/arch/x86/mm.c                            |  6 ++--
 xen/arch/x86/setup.c                         |  8 +++--
 xen/common/domain.c                          |  7 +++--
 xen/common/domctl.c                          |  3 +-
 xen/common/schedule.c                        |  3 +-
 xen/include/public/arch-arm.h                |  8 +
 xen/include/public/arch-x86/xen.h            |  4 +++
 xen/include/public/domctl.h                  | 18 +--
 xen/include/xen/domain.h                     |  3 +-
 xen/include/xen/sched.h                      |  9 --
 xen/xsm/flask/hooks.c                        |  3 --
 xen/xsm/flask/policy/access_vectors          |  2 --
 27 files changed, 170 insertions(+), 120 deletions(-)

diff --git a/tools/flask/policy/policy/modules/xen/xen.if b/tools/flask/policy/policy/modules/xen/xen.if
index 2d32e1c..620d151 100644
--- a/tools/flask/policy/policy/modules/xen/xen.if
+++ b/tools/flask/policy/policy/modules/xen/xen.if
@@ -51,7 +51,7 @@ define(`create_domain_common', `
 			getaffinity setaffinity setvcpuextstate };
 	allow $1 $2:domain2 { set_cpuid settsc setscheduler setclaim
 			set_max_evtchn set_vnumainfo get_vnumainfo cacheflush
-			psr_cmt_op configure_domain };
+			psr_cmt_op };
 	allow $1 $2:security check_context;
 	allow $1 $2:shadow enable;
 	allow $1 $2:mmu { map_read map_write adjust memorymap physmap pinpage mmuext_op updatemp };
diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 0ad8b8d..d66571f 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -477,18 +477,20 @@ typedef union
 } start_info_any_t;
 #endif
 
+
+typedef struct xen_arch_domainconfig xc_domain_configuration_t;
+int
[Xen-devel] [PATCH v3 19/24] xen/iommu: arm: Wire iommu DOMCTL for ARM
Signed-off-by: Julien Grall <julien.gr...@linaro.org>
Acked-by: Stefano Stabellini <stefano.stabell...@eu.citrix.com>
Cc: Jan Beulich <jbeul...@suse.com>

---
    Changes in v3:
        - Add Stefano's ack

    Changes in v2:
        - Don't move the call in common code.
---
 xen/arch/arm/domctl.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/domctl.c b/xen/arch/arm/domctl.c
index 485d3aa..cc4894e 100644
--- a/xen/arch/arm/domctl.c
+++ b/xen/arch/arm/domctl.c
@@ -33,7 +33,16 @@ long arch_do_domctl(struct xen_domctl *domctl, struct domain *d,
         return p2m_cache_flush(d, s, e);
     }
     default:
-        return subarch_do_domctl(domctl, d, u_domctl);
+    {
+        int rc;
+
+        rc = subarch_do_domctl(domctl, d, u_domctl);
+
+        if ( rc == -ENOSYS )
+            rc = iommu_do_domctl(domctl, d, u_domctl);
+
+        return rc;
+    }
     }
 }
 
-- 
2.1.4
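The default case in this patch uses -ENOSYS as a "not mine, try the next handler" signal: the subarch handler gets the first go, and the IOMMU handler is consulted only when the first one does not know the command. A minimal stand-alone sketch of that chaining pattern follows. The handler names, the handler_fn type and the command value 42 are all hypothetical and only stand in for subarch_do_domctl() and iommu_do_domctl().

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical handler type: returns -ENOSYS when the command is unknown. */
typedef int (*handler_fn)(int cmd);

/* Stand-in for a subarch handler that implements nothing. */
static int subarch_handler(int cmd)
{
    (void)cmd;
    return -ENOSYS;
}

/* Stand-in for an IOMMU handler that knows exactly one command (42). */
static int iommu_handler(int cmd)
{
    return cmd == 42 ? 0 : -ENOSYS;
}

/*
 * Try each handler in order and stop at the first one that returns
 * anything other than -ENOSYS (success or a real error).
 */
static int dispatch_chain(const handler_fn *handlers, size_t n, int cmd)
{
    int rc = -ENOSYS;
    size_t i;

    for (i = 0; i < n && rc == -ENOSYS; i++)
        rc = handlers[i](cmd);

    return rc;
}
```

One consequence of the pattern is that a handler must never return -ENOSYS for a command it does own, otherwise a later handler would get a chance to reinterpret it.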
[Xen-devel] [PATCH v3 02/24] xen/arm: Divide GIC initialization in 2 parts
Currently the function to translate IRQs from the device tree is set
unconditionally, so that the serial/timer IRQs can be retrieved before the
GIC has been initialized. It assumes that the xlate function will never
change. We may also need to have the primary interrupt controller very
early.

Rework the gic initialization in 2 parts:
    - gic_preinit: Get the interrupt controller device tree node and
    set up GIC and xlate callbacks
    - gic_init: Initialize the interrupt controller and the boot CPU
    interrupts.

The former function will be called just after the IRQ subsystem has been
initialized.

Signed-off-by: Julien Grall <julien.gr...@linaro.org>

---
    Changes in v3:
        - Patch was previously sent in a separate series [1]
        - Reorder the function to avoid forward declaration
        - Make gic-v3 driver compliant to the new interface
        - Remove spurious field addition in gicv2 structure

    Changelog based on the separate series:

    Changes in v3:
        - Patch added.

    [1] https://patches.linaro.org/33313/
---
 xen/arch/arm/gic-v2.c     | 70 ++-
 xen/arch/arm/gic-v3.c     | 75 ---
 xen/arch/arm/gic.c        | 16 --
 xen/arch/arm/setup.c      |  3 +-
 xen/include/asm-arm/gic.h |  8 +
 5 files changed, 100 insertions(+), 72 deletions(-)

diff --git a/xen/arch/arm/gic-v2.c b/xen/arch/arm/gic-v2.c
index 15916c9..016b0fd 100644
--- a/xen/arch/arm/gic-v2.c
+++ b/xen/arch/arm/gic-v2.c
@@ -655,37 +655,10 @@ static hw_irq_controller gicv2_guest_irq_type = {
     .set_affinity = gicv2_irq_set_affinity,
 };
 
-const static struct gic_hw_operations gicv2_ops = {
-    .info                = &gicv2_info,
-    .secondary_init      = gicv2_secondary_cpu_init,
-    .save_state          = gicv2_save_state,
-    .restore_state       = gicv2_restore_state,
-    .dump_state          = gicv2_dump_state,
-    .gicv_setup          = gicv2v_setup,
-    .gic_host_irq_type   = &gicv2_host_irq_type,
-    .gic_guest_irq_type  = &gicv2_guest_irq_type,
-    .eoi_irq             = gicv2_eoi_irq,
-    .deactivate_irq      = gicv2_dir_irq,
-    .read_irq            = gicv2_read_irq,
-    .set_irq_properties  = gicv2_set_irq_properties,
-    .send_SGI            = gicv2_send_SGI,
-    .disable_interface   = gicv2_disable_interface,
-    .update_lr           = gicv2_update_lr,
-    .update_hcr_status   = gicv2_hcr_status,
-    .clear_lr            = gicv2_clear_lr,
-    .read_lr             = gicv2_read_lr,
-    .write_lr            = gicv2_write_lr,
-    .read_vmcr_priority  = gicv2_read_vmcr_priority,
-    .read_apr            = gicv2_read_apr,
-    .make_dt_node        = gicv2_make_dt_node,
-};
-
-/* Set up the GIC */
-static int __init gicv2_init(struct dt_device_node *node, const void *data)
+static int __init gicv2_init(void)
 {
     int res;
-
-    dt_device_set_used_by(node, DOMID_XEN);
+    const struct dt_device_node *node = gicv2_info.node;
 
     res = dt_device_get_address(node, 0, &gicv2.dbase, NULL);
     if ( res || !gicv2.dbase || (gicv2.dbase & ~PAGE_MASK) )
@@ -708,9 +681,6 @@ static int __init gicv2_init(struct dt_device_node *node, const void *data)
         panic("GICv2: Cannot find the maintenance IRQ");
     gicv2_info.maintenance_irq = res;
 
-    /* Set the GIC as the primary interrupt controller */
-    dt_interrupt_controller = node;
-
     /* TODO: Add check on distributor, cpu size */
 
     printk("GICv2 initialization:\n"
@@ -755,8 +725,42 @@ static int __init gicv2_init(struct dt_device_node *node, const void *data)
 
     spin_unlock(&gicv2.lock);
 
+    return 0;
+}
+
+const static struct gic_hw_operations gicv2_ops = {
+    .info                = &gicv2_info,
+    .init                = gicv2_init,
+    .secondary_init      = gicv2_secondary_cpu_init,
+    .save_state          = gicv2_save_state,
+    .restore_state       = gicv2_restore_state,
+    .dump_state          = gicv2_dump_state,
+    .gicv_setup          = gicv2v_setup,
+    .gic_host_irq_type   = &gicv2_host_irq_type,
+    .gic_guest_irq_type  = &gicv2_guest_irq_type,
+    .eoi_irq             = gicv2_eoi_irq,
+    .deactivate_irq      = gicv2_dir_irq,
+    .read_irq            = gicv2_read_irq,
+    .set_irq_properties  = gicv2_set_irq_properties,
+    .send_SGI            = gicv2_send_SGI,
+    .disable_interface   = gicv2_disable_interface,
+    .update_lr           = gicv2_update_lr,
+    .update_hcr_status   = gicv2_hcr_status,
+    .clear_lr            = gicv2_clear_lr,
+    .read_lr             = gicv2_read_lr,
+    .write_lr            = gicv2_write_lr,
+    .read_vmcr_priority  = gicv2_read_vmcr_priority,
+    .read_apr            = gicv2_read_apr,
+    .make_dt_node        = gicv2_make_dt_node,
+};
+
+/* Set up the GIC */
+static int __init gicv2_preinit(struct dt_device_node *node, const void *data)
+{
     gicv2_info.hw_version = GIC_V2;
+    gicv2_info.node = node;
     register_gic_ops(&gicv2_ops);
+
[Xen-devel] [PATCH v3 24/24] xl: Add new option dtdev
The option "dtdev" will be used to passthrough a non-PCI device described
in the device tree to a guest.

Signed-off-by: Julien Grall <julien.gr...@linaro.org>
Cc: Ian Jackson <ian.jack...@eu.citrix.com>
Cc: Wei Liu <wei.l...@citrix.com>

---
    Changes in v2:
        - libxl_device_dt has been renamed to libxl_device_dtdev
        - use xrealloc instead of realloc
---
 docs/man/xl.cfg.pod.5    |  5 +
 tools/libxl/xl_cmdimpl.c | 21 -
 2 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index 225b782..cfd3d5f 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -721,6 +721,11 @@ More information about Xen gfx_passthru feature is available
 on the XenVGAPassthrough L<http://wiki.xen.org/wiki/XenVGAPassthrough>
 wiki page.
 
+=item B<dtdev=[ "DTDEV_PATH", "DTDEV_PATH", ... ]>
+
+Specifies the host device node to passthrough to this guest. Each DTDEV_PATH
+is the absolute path in the device tree.
+
 =item B<ioports=[ "IOPORT_RANGE", "IOPORT_RANGE", ... ]>
 
 Allow guest to access specific legacy I/O ports. Each B<IOPORT_RANGE>
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 31e89e8..80c9df6 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -986,7 +986,7 @@ static void parse_config_data(const char *config_source,
     long l;
     XLU_Config *config;
     XLU_ConfigList *cpus, *vbds, *nics, *pcis, *cvfbs, *cpuids, *vtpms;
-    XLU_ConfigList *channels, *ioports, *irqs, *iomem, *viridian;
+    XLU_ConfigList *channels, *ioports, *irqs, *iomem, *viridian, *dtdevs;
     int num_ioports, num_irqs, num_iomem, num_cpus, num_viridian;
     int pci_power_mgmt = 0;
     int pci_msitranslate = 0;
@@ -1746,6 +1746,25 @@ skip_vfb:
         libxl_defbool_set(&b_info->u.pv.e820_host, true);
     }
 
+    if (!xlu_cfg_get_list (config, "dtdev", &dtdevs, 0, 0)) {
+        d_config->num_dtdevs = 0;
+        d_config->dtdevs = NULL;
+        for (i = 0; (buf = xlu_cfg_get_listitem(dtdevs, i)) != NULL; i++) {
+            libxl_device_dtdev *dtdev;
+
+            d_config->dtdevs = (libxl_device_dtdev *) xrealloc(d_config->dtdevs, sizeof (libxl_device_dtdev) * (d_config->num_dtdevs + 1));
+            dtdev = d_config->dtdevs + d_config->num_dtdevs;
+            libxl_device_dtdev_init(dtdev);
+
+            dtdev->path = strdup(buf);
+            if (dtdev->path == NULL) {
+                fprintf(stderr, "unable to duplicate string for dtdevs\n");
+                exit(-1);
+            }
+            d_config->num_dtdevs++;
+        }
+    }
+
     switch (xlu_cfg_get_list(config, "cpuid", &cpuids, 0, 1)) {
     case 0:
         {
-- 
2.1.4
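The parsing loop in this patch grows an array by one element per list item and stores a private copy of each path. The sketch below shows the same grow-and-copy pattern in isolation, using only the C standard library. The struct names and dtdev_list_append() are invented for the example; the real code uses libxl_device_dtdev, xrealloc() and strdup(), and exits on allocation failure instead of returning an error.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for libxl_device_dtdev and the d_config fields. */
struct dtdev {
    char *path;
};

struct dtdev_list {
    struct dtdev *devs;
    int num;
};

/*
 * Append a copy of 'path' to the list, growing the array by one slot.
 * Returns 0 on success and -1 on allocation failure; on failure the
 * list is left in its previous, still valid state.
 */
static int dtdev_list_append(struct dtdev_list *l, const char *path)
{
    size_t len = strlen(path) + 1;
    struct dtdev *grown;
    char *copy;

    grown = realloc(l->devs, sizeof(*grown) * (size_t)(l->num + 1));
    if (grown == NULL)
        return -1;
    l->devs = grown;

    copy = malloc(len);
    if (copy == NULL)
        return -1;
    memcpy(copy, path, len);

    l->devs[l->num].path = copy;
    l->num++;
    return 0;
}

/* Release every stored path and the array itself. */
static void dtdev_list_free(struct dtdev_list *l)
{
    int i;

    for (i = 0; i < l->num; i++)
        free(l->devs[i].path);
    free(l->devs);
    l->devs = NULL;
    l->num = 0;
}
```

Growing by one element per item is quadratic in the worst case, but for a handful of configuration entries that cost is negligible and the code stays simple.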
[Xen-devel] [PATCH v3 22/24] tools/libxl: arm: Use an higher value for the GIC phandle
The partial device tree may contain phandles. The Device Tree Compiler
tends to allocate phandles starting from 1.

Reserve the ID 65000 for the GIC phandle. I think we can safely assume
that the partial device tree will never contain such an ID.

Signed-off-by: Julien Grall <julien.gr...@linaro.org>
Cc: Ian Jackson <ian.jack...@eu.citrix.com>
Cc: Wei Liu <wei.l...@citrix.com>

---
    Changes in v3:
        - Patch added
---
 tools/libxl/libxl_arm.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/tools/libxl/libxl_arm.c b/tools/libxl/libxl_arm.c
index 619458b..dc745fb 100644
--- a/tools/libxl/libxl_arm.c
+++ b/tools/libxl/libxl_arm.c
@@ -78,10 +78,11 @@ static struct arch_info {
     {"xen-3.0-aarch64", "arm,armv8-timer", "arm,armv8" },
 };
 
-enum {
-    PHANDLE_NONE = 0,
-    PHANDLE_GIC,
-};
+/*
+ * The device tree compiler (DTC) allocates phandles from 1
+ * onwards. Reserve a high value for the GIC phandle.
+ */
+#define PHANDLE_GIC (65000)
 
 typedef uint32_t be32;
 typedef be32 gic_interrupt[3];
-- 
2.1.4
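The patch relies on the assumption that no phandle in the partial device tree equals 65000. It does not check this. If one wanted to verify the assumption, a check could look roughly like the sketch below. It assumes the phandle values have already been collected into an array by some other means; phandle_collides_with_gic() is a hypothetical helper and is not part of libxl.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PHANDLE_GIC 65000u  /* value reserved by the patch */

/*
 * Return true if any phandle collected from the partial device tree
 * equals the value reserved for the GIC.
 */
static bool phandle_collides_with_gic(const uint32_t *phandles, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (phandles[i] == PHANDLE_GIC)
            return true;
    }

    return false;
}
```

A toolstack could refuse the partial device tree when this returns true, rather than silently producing two nodes with the same phandle.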
[Xen-devel] [PATCH v3 17/24] xen/passthrough: arm: release earlier the DT devices assigned to a guest
The toolstack may not have deassigned every device used by a guest.
Therefore we have to go through the device list and remove them before
asking the IOMMU drivers to release memory for this domain. This can be
done by moving the call to the release function to the point where we
relinquish the resources. The IOMMU part will be destroyed later when the
domain is freed.

Signed-off-by: Julien Grall <julien.gr...@linaro.org>
Cc: Jan Beulich <jbeul...@suse.com>

---
    Changes in v3:
        - Patch added. Supersedes the patch "xen/passthrough: call
        arch_iommu_domain_destroy before calling iommu teardown" in the
        previous patch series.
---
 xen/arch/arm/domain.c                 | 4
 xen/drivers/passthrough/arm/iommu.c   | 1 -
 xen/drivers/passthrough/device_tree.c | 5 -
 xen/include/xen/iommu.h               | 2 +-
 4 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 6e56665..d85748a 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -772,6 +772,10 @@ int domain_relinquish_resources(struct domain *d)
     switch ( d->arch.relmem )
     {
     case RELMEM_not_started:
+        ret = iommu_release_dt_devices(d);
+        if ( ret )
+            return ret;
+
         d->arch.relmem = RELMEM_xen;
         /* Falltrough */
 
diff --git a/xen/drivers/passthrough/arm/iommu.c b/xen/drivers/passthrough/arm/iommu.c
index 5870aef..8223a39 100644
--- a/xen/drivers/passthrough/arm/iommu.c
+++ b/xen/drivers/passthrough/arm/iommu.c
@@ -66,7 +66,6 @@ int arch_iommu_domain_init(struct domain *d)
 
 void arch_iommu_domain_destroy(struct domain *d)
 {
-    iommu_dt_domain_destroy(d);
 }
 
 int arch_iommu_populate_page_table(struct domain *d)
diff --git a/xen/drivers/passthrough/device_tree.c b/xen/drivers/passthrough/device_tree.c
index 88e496e..e7eb34f 100644
--- a/xen/drivers/passthrough/device_tree.c
+++ b/xen/drivers/passthrough/device_tree.c
@@ -97,7 +97,7 @@ int iommu_dt_domain_init(struct domain *d)
     return 0;
 }
 
-void iommu_dt_domain_destroy(struct domain *d)
+int iommu_release_dt_devices(struct domain *d)
 {
     struct hvm_iommu *hd = domain_hvm_iommu(d);
     struct dt_device_node *dev, *_dev;
@@ -109,5 +109,8 @@ void iommu_dt_domain_destroy(struct domain *d)
         if ( rc )
             dprintk(XENLOG_ERR, "Failed to deassign %s in domain %u\n",
                     dt_node_full_name(dev), d->domain_id);
+            return rc;
     }
+
+    return 0;
 }
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index c146ee4..d03df14 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -117,7 +117,7 @@ void iommu_read_msi_from_ire(struct msi_desc *msi_desc, struct msi_msg *msg);
 int iommu_assign_dt_device(struct domain *d, struct dt_device_node *dev);
 int iommu_deassign_dt_device(struct domain *d, struct dt_device_node *dev);
 int iommu_dt_domain_init(struct domain *d);
-void iommu_dt_domain_destroy(struct domain *d);
+int iommu_release_dt_devices(struct domain *d);
 
 #endif /* HAS_DEVICE_TREE */
 
-- 
2.1.4
Re: [Xen-devel] [RFC PATCHv1 net-next] xen-netback: always fully coalesce guest Rx packets
On Tue, Jan 13, 2015 at 02:05:17PM +, David Vrabel wrote:
> Always fully coalesce guest Rx packets into the minimum number of ring
> slots.  Reducing the number of slots per packet has significant
> performance benefits (e.g., 7.2 Gbit/s to 11 Gbit/s in an off-host
> receive test).

Good number.

> However, this does increase the number of grant ops per packet which
> decreases performance with some workloads (intrahost VM to VM)

Do you have figures before and after this change?

> /unless/ grant copy has been optimized for adjacent ops with the same
> source or destination (see "grant-table: defer releasing pages
> acquired in a grant copy"[1]).
>
> Do we need to retain the existing path and make the always coalesce
> path conditional on a suitable version of Xen?

If the new path improves off-host RX on all Xen versions and doesn't
degrade intrahost VM to VM RX that much, I think we should use it
unconditionally.

Is intrahost VM to VM RX important to XenServer?

I don't consider intrahost VM to VM RX a very important use case, at
least not as important as off-host RX. I would expect that in a cloud
environment users would not count on their VMs residing on the same
host. Plus, some cloud providers might deliberately route traffic
off-host for various reasons even if the VMs are on the same host.
(Verizon, for one, mentioned they do that during last year's Xen Summit
IIRC.)

Others might disagree. Let's wait for other people to chime in.

> [1] http://lists.xen.org/archives/html/xen-devel/2015-01/msg01118.html
>
> Signed-off-by: David Vrabel <david.vra...@citrix.com>
> ---
>  drivers/net/xen-netback/common.h  |    1 -
>  drivers/net/xen-netback/netback.c |  106 ++---
>  2 files changed, 3 insertions(+), 104 deletions(-)

Love the diffstat!

Wei.
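To make the trade-off discussed in this thread concrete (fewer ring slots versus more copy operations), here is a small stand-alone model. It transcribes the start_new_rx_buffer() heuristic that David's patch removes and packs a list of fragment sizes with and without full coalescing. The 4096-byte MAX_BUFFER_OFFSET, the pack_frags() helper and struct pack_result are assumptions made for the example, and every fragment is assumed to start at offset 0 of its own page. This is a model of the logic, not netback code.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_BUFFER_OFFSET 4096u  /* assumed: one 4 KiB page per ring slot */

struct pack_result {
    unsigned slots;     /* ring slots (receive buffers) consumed */
    unsigned copy_ops;  /* copy operations issued */
};

/* The removed heuristic, transcribed from the patch. */
static bool start_new_rx_buffer(unsigned offset, unsigned size, bool head,
                                bool full_coalesce)
{
    /* simple case: the current buffer is completely full */
    if (offset == MAX_BUFFER_OFFSET)
        return true;

    /* complex case: give the frag its own buffer instead of splitting it */
    if (offset + size > MAX_BUFFER_OFFSET && offset && !head && !full_coalesce)
        return true;

    return false;
}

/* Pack 'n' fragments and count the slots and copy operations needed. */
static struct pack_result pack_frags(const unsigned *frags, size_t n,
                                     bool full_coalesce)
{
    struct pack_result r = { 1, 0 };  /* the head buffer is always used */
    unsigned copy_off = 0;
    bool head = true;
    size_t i;

    for (i = 0; i < n; i++) {
        unsigned size = frags[i];

        while (size > 0) {
            /* a single copy never crosses a source page boundary */
            unsigned bytes = size > MAX_BUFFER_OFFSET ? MAX_BUFFER_OFFSET
                                                      : size;

            if (start_new_rx_buffer(copy_off, bytes, head, full_coalesce)) {
                r.slots++;
                copy_off = 0;
            }

            /* split the copy if it does not fit in the current buffer */
            if (copy_off + bytes > MAX_BUFFER_OFFSET)
                bytes = MAX_BUFFER_OFFSET - copy_off;

            copy_off += bytes;
            size -= bytes;
            r.copy_ops++;
            head = false;
        }
    }

    return r;
}
```

For fragments of 100, 4000 and 200 bytes, the old heuristic (full_coalesce false) uses 3 slots and 3 copies in this model, while always coalescing uses 2 slots and 4 copies, because the 4000-byte fragment is split across two buffers. That is the "fewer slots, more grant ops" effect described in the commit message.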
Re: [Xen-devel] [PATCH v2] tools/Rules.mk: Don't optimize debug builds; add macro debugging information
On Tue, 2015-01-13 at 13:52 +0800, Wen Congyang wrote: On 12/01/2014 10:21 PM, Euan Harris wrote: Tools debug builds are built with optimization level -O1, inherited from the CFLAGS definition in StdGNU.mk. Optimizations confuse the debugger, and the comment justifying -O1 in StdGNU.mk should not apply for a userspace library. Disable optimization by appending -O0 to CFLAGS, which overrides the -O1 flag specified earlier. Also specify -g3, to add macro debugging information which allows gdb to expand macro invocations. This is useful as libxl uses many non-trivial macros. Signed-off-by: Euan Harris euan.har...@citrix.com Changes since v1: * moved flag override to tools/Rules.mk so it affects all tools --- tools/Rules.mk |5 + 1 files changed, 5 insertions(+), 0 deletions(-) diff --git a/tools/Rules.mk b/tools/Rules.mk index 87a56dc..7ef1ce5 100644 --- a/tools/Rules.mk +++ b/tools/Rules.mk @@ -54,6 +54,11 @@ CFLAGS_libxenvchan = -I$(XEN_LIBVCHAN) LDLIBS_libxenvchan = $(SHLIB_libxenctrl) $(SHLIB_libxenstore) -L$(XEN_LIBVCHAN) -lxenvchan SHLIB_libxenvchan = -Wl,-rpath-link=$(XEN_LIBVCHAN) +ifeq ($(debug),y) +# Disable optimizations and debugging information for macros +CFLAGS += -O0 -g3 +endif + LIBXL_BLKTAP ?= $(CONFIG_BLKTAP2) ifeq ($(LIBXL_BLKTAP),y) This patch causes a building error: gcc -fno-strict-aliasing -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -DNDEBUG -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -O1 -fno-omit-frame-pointer -m64 -g -fno-strict-aliasing -std=gnu99 -Wall -Wstrict-prototypes -Wdeclaration-after-statement -Wno-unused-but-set-variable -Wno-unused-local-typedefs -O0 -g3 -D__XEN_TOOLS__ -MMD -MF .install.d -D_LARGEFILE_SOURCE 
-D_LARGEFILE64_SOURCE -fno-optimize-sibling-calls -fPIC -I../../tools/include -I../../tools/libxc/include -Ixen/lowlevel/xc -I/usr/include/python2.7 -c xen/lowlevel/xc/xc.c -o build/temp.linux-x86_64-2.7/xen/lowlevel/xc/xc.o -fno-strict-aliasing -Werror In file included from /usr/include/limits.h:25:0, from /usr/lib/gcc/x86_64-redhat-linux/4.9.2/include/limits.h:168, from /usr/lib/gcc/x86_64-redhat-linux/4.9.2/include/syslimits.h:7, from /usr/lib/gcc/x86_64-redhat-linux/4.9.2/include/limits.h:34, from /usr/include/python2.7/Python.h:19, from xen/lowlevel/xc/xc.c:7: /usr/include/features.h:328:4: error: #warning _FORTIFY_SOURCE requires compiling with optimization (-O) [-Werror=cpp] Where is _FORTIFY_SOURCE coming from? I don't see it in our tree anywhere except stubdom/Makefile which is disabling it and the build worked for me. Perhaps it is coming from your build environment somewhere? How are you configuring and building Xen? Maybe what we want to do is only disable optimisations if debug=y AND -D_FORTIFY_SOURCE is not set? Might involve some autoconf checks to determine the fortification level in the user provided CFLAGS, which might be a bit faffsome. Ian. # warning _FORTIFY_SOURCE requires compiling with optimization (-O) ^ cc1: all warnings being treated as errors error: command 'gcc' failed with exit status 1 The following patch can fix this problem: From d16961971e14f6e50f9a9905449929d5a7c60860 Mon Sep 17 00:00:00 2001 From: Wen Congyang we...@cn.fujitsu.com Date: Tue, 13 Jan 2015 12:05:30 +0800 Subject: [PATCH] Fix a building error Commit 1166ecf7 disables optimization. But _FORTIFY_SOURCE requires compiling with optimization (-O). Disable _FORTIFY_SOURCE by appending -Wp,-U_FORTIFY_SOURCE to CFLAGS. 
Signed-off-by: Wen Congyang we...@cn.fujitsu.com --- tools/Rules.mk | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/Rules.mk b/tools/Rules.mk index 962a743..8ad1b05 100644 --- a/tools/Rules.mk +++ b/tools/Rules.mk @@ -56,7 +56,7 @@ SHLIB_libxenvchan = -Wl,-rpath-link=$(XEN_LIBVCHAN) ifeq ($(debug),y) # Disable optimizations and enable debugging information for macros -CFLAGS += -O0 -g3 +CFLAGS += -O0 -g3 -Wp,-U_FORTIFY_SOURCE endif LIBXL_BLKTAP ?= $(CONFIG_BLKTAP2)
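As a rough illustration of the alternative raised earlier in this thread (only disabling optimisations when debug=y and _FORTIFY_SOURCE is not already requested), the decision could be sketched as a small POSIX shell helper. The function name debug_cflags and its two arguments are hypothetical and not part of tools/Rules.mk; a real fix would more likely live in the makefile or in autoconf checks. The substring match is deliberately simple and does not notice a later -U_FORTIFY_SOURCE that cancels the define.

```shell
#!/bin/sh
# Hypothetical helper, not part of the Xen tree.
# Usage: debug_cflags DEBUG CFLAGS
#   DEBUG  - "y" for a debug build, anything else otherwise
#   CFLAGS - the flags already in effect (e.g. from the build environment)
# Prints the extra flags a debug build would append.
debug_cflags() {
    debug=$1
    cflags=$2

    # Non-debug builds get no extra flags.
    [ "$debug" = y ] || return 0

    case " $cflags " in
        *-D_FORTIFY_SOURCE*)
            # Fortification needs optimisation, so leave the -O level alone
            # and only add macro debugging information.
            printf '%s\n' '-g3'
            ;;
        *)
            # No fortification requested: safe to disable optimisation.
            printf '%s\n' '-O0 -g3'
            ;;
    esac
}
```

With the flags from the failing build above, debug_cflags y '-O2 -g -Wp,-D_FORTIFY_SOURCE=2' would print only -g3, whereas an environment without fortification would get -O0 -g3. The patch above takes the opposite approach and always undefines _FORTIFY_SOURCE in debug builds.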
Re: [Xen-devel] [PATCH v3 1/5] x86: expose CMT L3 event mask to user space
On Tue, Jan 13, 2015 at 10:00:58AM +, Jan Beulich wrote: On 13.01.15 at 09:02, chao.p.p...@linux.intel.com wrote: L3 event mask indicates the event types supported in host, including cache occupancy event as well as local/total memory bandwidth events for Memory Bandwidth Monitoring(MBM). Expose it so all these events can be monitored in user space. Signed-off-by: Chao Peng chao.p.p...@linux.intel.com Reviewed-by: Andrew Cooper andrew.coop...@citrix.com Acked-by: Jan Beulich jbeul...@suse.com Please don't re-send patches already applied. Oh, sorry! Just noticed it's already in. Chao
[Xen-devel] [PATCH v1 1/2] tools: unhook blktap1 from the build and remove all references to it
This was disabled by default in Xen 4.4. Since xend has now been removed from the tree I don't believe anything is using it. We need to pass an explicit CONFIG_BLKTAP1=n to qemu-xen-traditional otherwise it defaults to y and doesn't build. This patch does all the ground work, the tools/blktap directory will be removed in the next (*huge*) patch. Note that this has no impact on blktap2, which is what libxl supports. blktap1 was only usable via xend which has already been removed. Signed-off-by: Ian Campbell ian.campb...@citrix.com --- INSTALL |1 - config/Tools.mk.in |1 - tools/Makefile |2 +- tools/configure | 29 + tools/configure.ac |4 +- tools/hotplug/Linux/Makefile |1 - tools/hotplug/Linux/blktap | 94 -- tools/hotplug/Linux/xen-backend.rules.in |2 - 8 files changed, 3 insertions(+), 131 deletions(-) delete mode 100644 tools/hotplug/Linux/blktap diff --git a/INSTALL b/INSTALL index 71dd0eb..33f65ba 100644 --- a/INSTALL +++ b/INSTALL @@ -142,7 +142,6 @@ this detection and the sysv runlevel scripts have to be used. The old backend drivers are disabled because qdisk is now the default. This option can be used to build them anyway. - --enable-blktap1 --enable-blktap2 Build various stubom components, some are only example code. 
Its usually diff --git a/config/Tools.mk.in b/config/Tools.mk.in index 89de5bd..30267fa 100644 --- a/config/Tools.mk.in +++ b/config/Tools.mk.in @@ -57,7 +57,6 @@ CONFIG_ROMBIOS := @rombios@ CONFIG_SEABIOS := @seabios@ CONFIG_QEMU_TRAD:= @qemu_traditional@ CONFIG_QEMU_XEN := @qemu_xen@ -CONFIG_BLKTAP1 := @blktap1@ CONFIG_BLKTAP2 := @blktap2@ CONFIG_QEMUU_EXTRA_ARGS:= @EXTRA_QEMUU_CONFIGURE_ARGS@ CONFIG_REMUS_NETBUF := @remus_netbuf@ diff --git a/tools/Makefile b/tools/Makefile index af9798a..1ad7a5d 100644 --- a/tools/Makefile +++ b/tools/Makefile @@ -16,7 +16,6 @@ SUBDIRS-y += console SUBDIRS-y += xenmon SUBDIRS-y += xenstat SUBDIRS-$(CONFIG_Linux) += memshr -SUBDIRS-$(CONFIG_BLKTAP1) += blktap SUBDIRS-$(CONFIG_BLKTAP2) += blktap2 SUBDIRS-$(CONFIG_NetBSD) += xenbackendd SUBDIRS-y += libfsimage @@ -169,6 +168,7 @@ subdir-all-qemu-xen-traditional-dir: qemu-xen-traditional-dir-find subdir-install-qemu-xen-traditional-dir: qemu-xen-traditional-dir-find set -e; \ $(buildmakevars2shellvars); \ + export CONFIG_BLKTAP1=n; \ cd qemu-xen-traditional-dir; \ $(QEMU_ROOT)/xen-setup \ --extra-cflags=$(EXTRA_CFLAGS_QEMU_TRADITIONAL) \ diff --git a/tools/configure b/tools/configure index e971070..4117c83 100755 --- a/tools/configure +++ b/tools/configure @@ -700,7 +700,6 @@ rombios qemu_traditional blktap2 LINUX_BACKEND_MODULES -blktap1 debug seabios ovmf @@ -790,7 +789,6 @@ enable_xsmpolicy enable_ovmf enable_seabios enable_debug -enable_blktap1 with_linux_backend_modules enable_blktap2 enable_qemu_traditional @@ -1463,7 +1461,6 @@ Optional Features: --enable-ovmf Enable OVMF (default is DISABLED) --disable-seabios Disable SeaBIOS (default is ENABLED) --disable-debug Disable debug build of tools (default is ENABLED) - --enable-blktap1Enable blktap1 tools (default is DISABLED) --enable-blktap2Enable blktap2, (DEFAULT is on for Linux, otherwise off) --enable-qemu-traditional @@ -3991,29 +3988,6 @@ debug=$ax_cv_debug -# Check whether --enable-blktap1 was given. 
-if test ${enable_blktap1+set} = set; then : - enableval=$enable_blktap1; -fi - - -if test x$enable_blktap1 = xno; then : - -ax_cv_blktap1=n - -elif test x$enable_blktap1 = xyes; then : - -ax_cv_blktap1=y - -elif test -z $ax_cv_blktap1; then : - -ax_cv_blktap1=n - -fi -blktap1=$ax_cv_blktap1 - - - # Check whether --with-linux-backend-modules was given. if test ${with_linux_backend_modules+set} = set; then : @@ -4037,7 +4011,6 @@ usbbk pciback xen-acpi-processor blktap2 -blktap ;; *) @@ -7935,7 +7908,7 @@ fi -if test x$enable_blktap1 = xyes || test x$enable_blktap2 = xyes; then : +if test x$enable_blktap2 = xyes; then : { $as_echo $as_me:${as_lineno-$LINENO}: checking for io_setup in -laio 5 $as_echo_n checking for io_setup in -laio... 6; } diff --git a/tools/configure.ac b/tools/configure.ac index 1ac63a3..72e2465 100644 --- a/tools/configure.ac +++ b/tools/configure.ac @@ -89,7 +89,6 @@ AX_ARG_DEFAULT_ENABLE([xsmpolicy], [Disable XSM policy compilation]) AX_ARG_DEFAULT_DISABLE([ovmf], [Enable OVMF]) AX_ARG_DEFAULT_ENABLE([seabios], [Disable SeaBIOS]) AX_ARG_DEFAULT_ENABLE([debug], [Disable debug build of tools]) -AX_ARG_DEFAULT_DISABLE([blktap1], [Enable blktap1 tools]) AC_ARG_WITH([linux-backend-modules],
[Xen-devel] [PATCH v2 2/2] tools: remove blktap1
Now that it is unhooked we can just remove it. Signed-off-by: Ian Campbell ian.campb...@citrix.com --- .gitignore |5 - .hgignore |5 - tools/blktap/Makefile | 13 - tools/blktap/README | 122 -- tools/blktap/drivers/Makefile | 73 -- tools/blktap/drivers/aes.c | 1319 --- tools/blktap/drivers/aes.h | 28 - tools/blktap/drivers/blk.h |3 - tools/blktap/drivers/blk_linux.c| 42 - tools/blktap/drivers/blktapctrl.c | 937 -- tools/blktap/drivers/blktapctrl.h | 36 - tools/blktap/drivers/blktapctrl_linux.c | 89 -- tools/blktap/drivers/block-aio.c| 259 tools/blktap/drivers/block-qcow.c | 1434 - tools/blktap/drivers/block-qcow2.c | 2098 --- tools/blktap/drivers/block-ram.c| 295 - tools/blktap/drivers/block-sync.c | 242 tools/blktap/drivers/block-vmdk.c | 428 --- tools/blktap/drivers/bswap.h| 178 --- tools/blktap/drivers/img2qcow.c | 282 - tools/blktap/drivers/qcow-create.c | 130 -- tools/blktap/drivers/qcow2raw.c | 348 - tools/blktap/drivers/tapaio.c | 357 -- tools/blktap/drivers/tapaio.h | 108 -- tools/blktap/drivers/tapdisk.c | 872 - tools/blktap/drivers/tapdisk.h | 259 tools/blktap/lib/Makefile | 60 - tools/blktap/lib/blkif.c| 185 --- tools/blktap/lib/blktaplib.h| 240 tools/blktap/lib/list.h | 59 - tools/blktap/lib/xenbus.c | 617 - tools/blktap/lib/xs_api.c | 360 -- tools/blktap/lib/xs_api.h | 50 - 33 files changed, 11533 deletions(-) delete mode 100644 tools/blktap/Makefile delete mode 100644 tools/blktap/README delete mode 100644 tools/blktap/drivers/Makefile delete mode 100644 tools/blktap/drivers/aes.c delete mode 100644 tools/blktap/drivers/aes.h delete mode 100644 tools/blktap/drivers/blk.h delete mode 100644 tools/blktap/drivers/blk_linux.c delete mode 100644 tools/blktap/drivers/blktapctrl.c delete mode 100644 tools/blktap/drivers/blktapctrl.h delete mode 100644 tools/blktap/drivers/blktapctrl_linux.c delete mode 100644 tools/blktap/drivers/block-aio.c delete mode 100644 tools/blktap/drivers/block-qcow.c delete mode 100644 tools/blktap/drivers/block-qcow2.c delete mode 
100644 tools/blktap/drivers/block-ram.c delete mode 100644 tools/blktap/drivers/block-sync.c delete mode 100644 tools/blktap/drivers/block-vmdk.c delete mode 100644 tools/blktap/drivers/bswap.h delete mode 100644 tools/blktap/drivers/img2qcow.c delete mode 100644 tools/blktap/drivers/qcow-create.c delete mode 100644 tools/blktap/drivers/qcow2raw.c delete mode 100644 tools/blktap/drivers/tapaio.c delete mode 100644 tools/blktap/drivers/tapaio.h delete mode 100644 tools/blktap/drivers/tapdisk.c delete mode 100644 tools/blktap/drivers/tapdisk.h delete mode 100644 tools/blktap/lib/Makefile delete mode 100644 tools/blktap/lib/blkif.c delete mode 100644 tools/blktap/lib/blktaplib.h delete mode 100644 tools/blktap/lib/list.h delete mode 100644 tools/blktap/lib/xenbus.c delete mode 100644 tools/blktap/lib/xs_api.c delete mode 100644 tools/blktap/lib/xs_api.h [... actual patch omitted, see git://xenbits.xen.org/people/ianc/xen.git remove-blktap1-v2 ] -- 1.7.10.4
Re: [Xen-devel] [PATCH 00/11] Alternate p2m: support multiple copies of host p2m
Ed White writes (Re: [PATCH 00/11] Alternate p2m: support multiple copies of host p2m): On 01/12/2015 10:00 AM, Ian Jackson wrote: To support this code in-tree, I think we will need Open Source code for exercising it, surely ? I'm hoping that, as Andrew says, there will be people interested in using these capabilities, and that some of them will be prepared to help fill in the gaps. That's why I wanted to send the series to the list very early in the 4.6 development cycle. That makes perfect sense, thanks. There's absolutely nothing wrong with posting a patch series early. If that doesn't turn out to be the case, I'll see if I can find some help internally, but I have neither the bandwidth nor the expertise to do everything myself. Right. Ian.