[Xen-devel] [libvirt test] 32102: regressions - FAIL
flight 32102 libvirt real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/32102/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-i386-libvirt  5  libvirt-build  fail  REGR. vs. 32005
 build-amd64-libvirt  5  libvirt-build  fail  REGR. vs. 32005
 build-armhf-libvirt  5  libvirt-build  fail  REGR. vs. 32005

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt  1  build-check(1)  blocked  n/a
 test-amd64-i386-libvirt  1  build-check(1)  blocked  n/a
 test-amd64-amd64-libvirt  1  build-check(1)  blocked  n/a

version targeted for testing:
 libvirt  9d31a0e4f6be9f90c0ccefa9b49463fb8da98a9c
baseline version:
 libvirt  ff018e686a8a412255bc34d3dc558a1bcf74fac5

People who touched revisions under test:
 Cole Robinson
 Conrad Meyer
 Daniel Hansel
 Daniel P. Berrange
 Dmitry Guryanov
 Erik Skultety
 Ian Campbell
 John Ferlan
 Ján Tomko
 Laine Stump
 Luyao Huang
 Martin Kletzander
 Michal Privoznik
 Nehal J Wani
 Pavel Hrdina
 Peter Krempa
 Shanzhi Yu
 Wang Rui

jobs:
 build-amd64  pass
 build-armhf  pass
 build-i386  pass
 build-amd64-libvirt  fail
 build-armhf-libvirt  fail
 build-i386-libvirt  fail
 build-amd64-pvops  pass
 build-armhf-pvops  pass
 build-i386-pvops  pass
 test-amd64-amd64-libvirt  blocked
 test-armhf-armhf-libvirt  blocked
 test-amd64-i386-libvirt  blocked

sg-report-flight on osstest.cam.xci-test.com
logs: /home/xc_osstest/logs
images: /home/xc_osstest/images

Logs, config files, etc. are available at
 http://www.chiark.greenend.org.uk/~xensrcts/logs

Test harness code can be found at
 http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary

Not pushing.

(No revision log; it would be 610 lines long.)

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
[Xen-devel] [PATCH v6 0/2] add new p2m type class and new p2m type
The XenGT (Intel Graphics Virtualization technology, please refer to https://01.org/xen/blogs/srclarkx/2013/graphics-virtualization-xengt) driver runs inside Dom0 as a virtual graphics device model, and needs to trap and emulate the guest's write operations to some specific memory pages, such as the pages used by the guest graphics driver as PPGTT (per-process graphics translation table). We added a new p2m type, p2m_mmio_write_dm, to trap and emulate the write operations on these graphics page tables.

Handling of this new p2m type is similar to that of the existing p2m_ram_ro in most condition checks; the only difference is the final policy of emulation vs. drop. For p2m_ram_ro types, write operations will not trigger the device model, and will be discarded later in __hvm_copy(); while for p2m_mmio_write_dm type pages, writes will go to the device model via ioreq-server.

Previously, the conclusion of our v3 patch review was to provide a more generalized HVMOP_map_io_range_to_ioreq_server hypercall, by separating the rangesets inside an ioreq server into read-protected/write-protected/both-protected. Yet, after offline discussion with Paul, we believe a simpler solution may suffice. We can keep the existing HVMOP_map_io_range_to_ioreq_server hypercall, and let the user decide whether or not a p2m type change is necessary, because in most cases the emulator will already use the p2m_mmio_dm type.

Changes from v5:
- Stricter type checks for p2m type transitions;
- One code style change.
Changes from v4:
- A new p2m type class, P2M_DISCARD_WRITE_TYPES, is added;
- A new predicate, p2m_is_discard_write, is used in __hvm_copy()/__hvm_clear()/emulate_gva_to_mfn()/hvm_hap_nested_page_fault(), to discard the write operations;
- The new p2m type, p2m_mmio_write_dm, is added to P2M_RO_TYPES;
- Coding style changes;

Changes from v3:
- Use the existing HVMOP_map_io_range_to_ioreq_server hypercall to add write protected range;
- Modify the HVMOP_set_mem_type hypercall to support the new p2m type for this range.

Changes from v2:
- Remove the execute attribute of the new p2m type p2m_mmio_write_dm;
- Use the existing rangeset for keeping the write protection page range instead of introducing a hash table;
- Some code style fixes.

Changes from v1:
- Change the new p2m type name from p2m_ram_wp to p2m_mmio_write_dm. This means that we treat the pages as a special mmio range instead of ram;
- Move macros to the c file since only this file is using them.
- Address various comments from Jan.

Yu Zhang (2):
 Add a new p2m type class - P2M_DISCARD_WRITE_TYPES
 add a new p2m type - p2m_mmio_write_dm

 xen/arch/x86/hvm/hvm.c | 25 ++---
 xen/arch/x86/mm/p2m-ept.c | 1 +
 xen/arch/x86/mm/p2m-pt.c | 1 +
 xen/arch/x86/mm/shadow/multi.c | 2 +-
 xen/include/asm-x86/p2m.h | 9 -
 xen/include/public/hvm/hvm_op.h | 1 +
 6 files changed, 22 insertions(+), 17 deletions(-)

--
1.9.1
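[Editor's note] The write policy described in the cover letter can be sketched in a few lines of C. This is only an illustration under stated assumptions, not code from the series: the enum below is a reduced, renumbered subset of the real p2m types, and write_outcome_t, guest_write_outcome() and guest_read_is_emulated() are hypothetical names introduced here.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative subset of p2m types; only the names come from the series,
 * the numbering here is arbitrary. */
typedef enum {
    p2m_ram_rw,
    p2m_ram_ro,
    p2m_mmio_dm,
    p2m_mmio_write_dm,
} p2m_type_t;

/* Hypothetical names for the three possible fates of a guest write. */
typedef enum {
    WRITE_ALLOWED,          /* ordinary RAM: the write simply happens */
    WRITE_DISCARDED,        /* p2m_ram_ro: dropped, device model not involved */
    WRITE_TO_DEVICE_MODEL,  /* forwarded to the emulator via the ioreq server */
} write_outcome_t;

/* Final policy for a guest write, as described in the cover letter. */
static write_outcome_t guest_write_outcome(p2m_type_t t)
{
    switch ( t )
    {
    case p2m_ram_ro:
        return WRITE_DISCARDED;
    case p2m_mmio_dm:
    case p2m_mmio_write_dm:
        return WRITE_TO_DEVICE_MODEL;
    default:
        return WRITE_ALLOWED;
    }
}

/* Reads of p2m_mmio_write_dm pages are served from memory (the type is
 * read-only, not unmapped); only p2m_mmio_dm reads need emulation. */
static bool guest_read_is_emulated(p2m_type_t t)
{
    return t == p2m_mmio_dm;
}
```

The point of the sketch is the asymmetry: p2m_mmio_write_dm behaves like p2m_ram_ro for reads and like p2m_mmio_dm for writes.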
[Xen-devel] [PATCH v6 2/2] add a new p2m type - p2m_mmio_write_dm
From: Yu Zhang A new p2m type, p2m_mmio_write_dm, is added to trap and emulate the write operations on GPU's page tables. Handling of this new p2m type are similar with existing p2m_ram_ro in most condition checks, with only difference on final policy of emulation vs. drop. For p2m_ram_ro types, write operations will not trigger the device model, and will be discarded later in __hvm_copy(); while for the p2m_mmio_write_dm type pages, writes will go to the device model via ioreq-server. Signed-off-by: Yu Zhang Signed-off-by: Wei Ye --- xen/arch/x86/hvm/hvm.c | 11 --- xen/arch/x86/mm/p2m-ept.c | 1 + xen/arch/x86/mm/p2m-pt.c| 1 + xen/include/asm-x86/p2m.h | 4 +++- xen/include/public/hvm/hvm_op.h | 1 + 5 files changed, 14 insertions(+), 4 deletions(-) diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index 967f822..25114fc 100644 --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -2837,7 +2837,8 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla, * to the mmio handler. 
*/ if ( (p2mt == p2m_mmio_dm) || - (npfec.write_access && (p2m_is_discard_write(p2mt))) ) + (npfec.write_access && + (p2m_is_discard_write(p2mt) || (p2mt == p2m_mmio_write_dm))) ) { put_gfn(p2m->domain, gfn); @@ -5904,6 +5905,8 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg) get_gfn_query_unlocked(d, a.pfn, &t); if ( p2m_is_mmio(t) ) a.mem_type = HVMMEM_mmio_dm; +else if ( t == p2m_mmio_write_dm ) +a.mem_type = HVMMEM_mmio_write_dm; else if ( p2m_is_readonly(t) ) a.mem_type = HVMMEM_ram_ro; else if ( p2m_is_ram(t) ) @@ -5931,7 +5934,8 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg) static const p2m_type_t memtype[] = { [HVMMEM_ram_rw] = p2m_ram_rw, [HVMMEM_ram_ro] = p2m_ram_ro, -[HVMMEM_mmio_dm] = p2m_mmio_dm +[HVMMEM_mmio_dm] = p2m_mmio_dm, +[HVMMEM_mmio_write_dm] = p2m_mmio_write_dm }; if ( copy_from_guest(&a, arg, 1) ) @@ -5978,7 +5982,8 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg) goto param_fail4; } if ( !p2m_is_ram(t) && - (!p2m_is_hole(t) || a.hvmmem_type != HVMMEM_mmio_dm) ) + (!p2m_is_hole(t) || a.hvmmem_type != HVMMEM_mmio_dm) && + (t != p2m_mmio_write_dm || a.hvmmem_type != HVMMEM_ram_rw) ) { put_gfn(d, pfn); goto param_fail4; diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c index 15c6e83..e21a92d 100644 --- a/xen/arch/x86/mm/p2m-ept.c +++ b/xen/arch/x86/mm/p2m-ept.c @@ -136,6 +136,7 @@ static void ept_p2m_type_to_flags(ept_entry_t *entry, p2m_type_t type, p2m_acces entry->x = 0; break; case p2m_grant_map_ro: +case p2m_mmio_write_dm: entry->r = 1; entry->w = entry->x = 0; break; diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c index e48b63a..26fb18d 100644 --- a/xen/arch/x86/mm/p2m-pt.c +++ b/xen/arch/x86/mm/p2m-pt.c @@ -94,6 +94,7 @@ static unsigned long p2m_type_to_flags(p2m_type_t t, mfn_t mfn) default: return flags | _PAGE_NX_BIT; case p2m_grant_map_ro: +case p2m_mmio_write_dm: return flags | P2M_BASE_FLAGS | _PAGE_NX_BIT; case p2m_ram_ro: case 
p2m_ram_logdirty: diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h index 42de75d..2cf73ca 100644 --- a/xen/include/asm-x86/p2m.h +++ b/xen/include/asm-x86/p2m.h @@ -72,6 +72,7 @@ typedef enum { p2m_ram_shared = 12, /* Shared or sharable memory */ p2m_ram_broken = 13, /* Broken page, access cause domain crash */ p2m_map_foreign = 14,/* ram pages from foreign domain */ +p2m_mmio_write_dm = 15, /* Read-only; writes go to the device model */ } p2m_type_t; /* Modifiers to the query */ @@ -111,7 +112,8 @@ typedef unsigned int p2m_query_t; #define P2M_RO_TYPES (p2m_to_mask(p2m_ram_logdirty) \ | p2m_to_mask(p2m_ram_ro) \ | p2m_to_mask(p2m_grant_map_ro) \ - | p2m_to_mask(p2m_ram_shared) ) + | p2m_to_mask(p2m_ram_shared) \ + | p2m_to_mask(p2m_mmio_write_dm)) /* Write-discard types, which should discard the write operations */ #define P2M_DISCARD_WRITE_TYPES (p2m_to_mask(p2m_ram_ro) \ diff --git a/xen/include/public/hvm/hvm_op.h b/xen/include/public/hvm/hvm_op.h index eeb0a60..a4e5345 100644 --- a/xen/include/public/hvm/hvm_op.h +++ b/xen/include/public/hvm/hvm_op.h @@ -81,6 +81,7 @@ typedef enum { HVMMEM_ram_rw, /* Normal read/write guest RAM */ HVMMEM_ram_ro,
[Xen-devel] [PATCH v6 1/2] add a new p2m type class - P2M_DISCARD_WRITE_TYPES
From: Yu Zhang Currently, the P2M_RO_TYPES bears 2 meanings: one is "_PAGE_RW bit is clear in their PTEs", and another is to discard the write operations on these pages. This patch adds a p2m type class, P2M_DISCARD_WRITE_TYPES, to bear the second meaning, so we can use this type class instead of the P2M_RO_TYPES, to decide if a write operation is to be ignored. Signed-off-by: Yu Zhang Reviewed-by: Tim Deegan --- xen/arch/x86/hvm/hvm.c | 16 +++- xen/arch/x86/mm/shadow/multi.c | 2 +- xen/include/asm-x86/p2m.h | 5 + 3 files changed, 9 insertions(+), 14 deletions(-) diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index 51ffc90..967f822 100644 --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -2837,7 +2837,7 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla, * to the mmio handler. */ if ( (p2mt == p2m_mmio_dm) || - (npfec.write_access && (p2mt == p2m_ram_ro)) ) + (npfec.write_access && (p2m_is_discard_write(p2mt))) ) { put_gfn(p2m->domain, gfn); @@ -2882,16 +2882,6 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla, goto out_put_gfn; } -/* Shouldn't happen: Maybe the guest was writing to a r/o grant mapping? */ -if ( npfec.write_access && (p2mt == p2m_grant_map_ro) ) -{ -gdprintk(XENLOG_WARNING, - "trying to write to read-only grant mapping\n"); -hvm_inject_hw_exception(TRAP_gp_fault, 0); -rc = 1; -goto out_put_gfn; -} - /* If we fell through, the vcpu will retry now that access restrictions have * been removed. It may fault again if the p2m entry type still requires so. * Otherwise, this is an error condition. 
*/ @@ -3941,7 +3931,7 @@ static enum hvm_copy_result __hvm_copy( if ( flags & HVMCOPY_to_guest ) { -if ( p2mt == p2m_ram_ro ) +if ( p2m_is_discard_write(p2mt) ) { static unsigned long lastpage; if ( xchg(&lastpage, gfn) != gfn ) @@ -4035,7 +4025,7 @@ static enum hvm_copy_result __hvm_clear(paddr_t addr, int size) p = (char *)__map_domain_page(page) + (addr & ~PAGE_MASK); -if ( p2mt == p2m_ram_ro ) +if ( p2m_is_discard_write(p2mt) ) { static unsigned long lastpage; if ( xchg(&lastpage, gfn) != gfn ) diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c index 225290e..94cf06d 100644 --- a/xen/arch/x86/mm/shadow/multi.c +++ b/xen/arch/x86/mm/shadow/multi.c @@ -4575,7 +4575,7 @@ static mfn_t emulate_gva_to_mfn(struct vcpu *v, { return _mfn(BAD_GFN_TO_MFN); } -if ( p2m_is_readonly(p2mt) ) +if ( p2m_is_discard_write(p2mt) ) { put_page(page); return _mfn(READONLY_GFN); diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h index 5f7fe71..42de75d 100644 --- a/xen/include/asm-x86/p2m.h +++ b/xen/include/asm-x86/p2m.h @@ -113,6 +113,10 @@ typedef unsigned int p2m_query_t; | p2m_to_mask(p2m_grant_map_ro) \ | p2m_to_mask(p2m_ram_shared) ) +/* Write-discard types, which should discard the write operations */ +#define P2M_DISCARD_WRITE_TYPES (p2m_to_mask(p2m_ram_ro) \ + | p2m_to_mask(p2m_grant_map_ro)) + /* Types that can be subject to bulk transitions. 
*/ #define P2M_CHANGEABLE_TYPES (p2m_to_mask(p2m_ram_rw) \ | p2m_to_mask(p2m_ram_logdirty) ) @@ -145,6 +149,7 @@ typedef unsigned int p2m_query_t; #define p2m_is_hole(_t) (p2m_to_mask(_t) & P2M_HOLE_TYPES) #define p2m_is_mmio(_t) (p2m_to_mask(_t) & P2M_MMIO_TYPES) #define p2m_is_readonly(_t) (p2m_to_mask(_t) & P2M_RO_TYPES) +#define p2m_is_discard_write(_t) (p2m_to_mask(_t) & P2M_DISCARD_WRITE_TYPES) #define p2m_is_changeable(_t) (p2m_to_mask(_t) & P2M_CHANGEABLE_TYPES) #define p2m_is_pod(_t) (p2m_to_mask(_t) & P2M_POD_TYPES) #define p2m_is_grant(_t) (p2m_to_mask(_t) & P2M_GRANT_TYPES) -- 1.9.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
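[Editor's note] The type-class idea in patch 1/2 is a bitmask membership test: each p2m type maps to one bit, and a class is the OR of its members' bits. The standalone sketch below shows how P2M_RO_TYPES and P2M_DISCARD_WRITE_TYPES can disagree about a type. It rests on assumptions: p2m_to_mask() is taken to be a plain shift, and the enum values other than p2m_ram_shared = 12 (which appears in the diff) are assumed for illustration.

```c
#include <assert.h>

/* Reduced copy of the p2m type enum. Only p2m_ram_shared = 12 is confirmed
 * by the patch context; the other values are assumptions for illustration. */
typedef enum {
    p2m_ram_rw = 1,
    p2m_ram_logdirty = 2,
    p2m_ram_ro = 3,
    p2m_grant_map_ro = 8,
    p2m_ram_shared = 12,
} p2m_type_t;

/* Assumed definition: one bit per type. */
#define p2m_to_mask(_t) (1UL << (_t))

/* Types whose PTEs have the write bit clear (as in patch 1/2). */
#define P2M_RO_TYPES (p2m_to_mask(p2m_ram_logdirty)   \
                      | p2m_to_mask(p2m_ram_ro)       \
                      | p2m_to_mask(p2m_grant_map_ro) \
                      | p2m_to_mask(p2m_ram_shared))

/* Types for which a write is silently dropped (as in patch 1/2). */
#define P2M_DISCARD_WRITE_TYPES (p2m_to_mask(p2m_ram_ro) \
                                 | p2m_to_mask(p2m_grant_map_ro))

#define p2m_is_readonly(_t)      (p2m_to_mask(_t) & P2M_RO_TYPES)
#define p2m_is_discard_write(_t) (p2m_to_mask(_t) & P2M_DISCARD_WRITE_TYPES)
```

With these definitions p2m_ram_logdirty is read-only but not write-discarding, which is exactly the distinction the patch introduces: a write to a log-dirty page must be retried after the fault is handled, not dropped.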
Re: [Xen-devel] [PATCH] VMX: don't allow PVH to reach handle_pio() or handle_mmio()
On Fri, 05 Dec 2014 14:06:53 + "Jan Beulich" wrote:

> PVH guests are not supposed to access I/O ports they weren't given
> access to (there's nothing to handle emulation of such accesses).
>
> Reported-by: Roger Pau Monné
> Signed-off-by: Jan Beulich
> ---
> Note: Only compile tested so far.
>
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -3082,6 +3082,9 @@ void vmx_vmexit_handler(struct cpu_user_
> }
>
> case EXIT_REASON_IO_INSTRUCTION:
> +if ( unlikely(is_pvh_vcpu(v)) )
> +goto exit_and_crash;
> +
> __vmread(EXIT_QUALIFICATION, &exit_qualification);
> if ( exit_qualification & 0x10 )
> {

Actually, handle_pio() will eventually reach handle_pvh_io(), which would do the access check via admin_io_okay, so that path should be OK, right?

thanks,
Mukesh
[Xen-devel] [RFC PATCH] xen/arm: Manage uart TX interrupt correctly
From: Vijaya Kumar K

In pl011.c, when a TX interrupt is received and the TX buffer is empty, the TX interrupt is not disabled; hence the UART interrupt routine always sees the TX interrupt in the MIS register and the CPU loops infinitely. With this patch, mask and unmask the TX interrupt when required.

Signed-off-by: Vijaya Kumar K
--- xen/drivers/char/pl011.c | 18 ++ xen/drivers/char/serial.c | 30 +- xen/include/xen/serial.h |4 3 files changed, 51 insertions(+), 1 deletion(-) diff --git a/xen/drivers/char/pl011.c b/xen/drivers/char/pl011.c index dd19ce8..ad48df3 100644 --- a/xen/drivers/char/pl011.c +++ b/xen/drivers/char/pl011.c @@ -109,6 +109,8 @@ static void __init pl011_init_preirq(struct serial_port *port) panic("pl011: No Baud rate configured\n"); uart->baud = (uart->clock_hz << 2) / divisor; } +/* Trigger RX interrupt at 1/2 full, TX interrupt at 7/8 empty */ +pl011_write(uart, IFLS, (2<<3 | 0)); /* This write must follow FBRD and IBRD writes. */ pl011_write(uart, LCR_H, (uart->data_bits - 5) << 5 | FEN @@ -197,6 +199,20 @@ static const struct vuart_info *pl011_vuart(struct serial_port *port) return &uart->vuart; } +static void pl011_tx_stop(struct serial_port *port) +{ +struct pl011 *uart = port->uart; + +pl011_write(uart, IMSC, pl011_read(uart, IMSC) & ~(TXI)); +} + +static void pl011_tx_start(struct serial_port *port) +{ +struct pl011 *uart = port->uart; + +pl011_write(uart, IMSC, pl011_read(uart, IMSC) | (TXI)); +} + static struct uart_driver __read_mostly pl011_driver = { .init_preirq = pl011_init_preirq, .init_postirq = pl011_init_postirq, @@ -207,6 +223,8 @@ static struct uart_driver __read_mostly pl011_driver = { .putc = pl011_putc, .getc = pl011_getc, .irq = pl011_irq, +.start_tx = pl011_tx_start, +.stop_tx = pl011_tx_stop, .vuart_info = pl011_vuart, }; diff --git a/xen/drivers/char/serial.c b/xen/drivers/char/serial.c index 44026b1..d2ce8a8 100644 --- a/xen/drivers/char/serial.c +++ b/xen/drivers/char/serial.c @@ -76,6 +76,19 @@ void serial_tx_interrupt(struct serial_port *port,
struct cpu_user_regs *regs) cpu_relax(); } +if ( port->txbufc == port->txbufp ) +{ +/* Disable TX. nothing to send */ +if ( port->driver->stop_tx != NULL ) +port->driver->stop_tx(port); +spin_unlock(&port->tx_lock); +goto out; +} +else +{ +if ( port->driver->tx_ready(port) && (port->driver->start_tx != NULL) ) +port->driver->start_tx(port); +} for ( i = 0, n = port->driver->tx_ready(port); i < n; i++ ) { if ( port->txbufc == port->txbufp ) @@ -117,6 +130,9 @@ static void __serial_putc(struct serial_port *port, char c) cpu_relax(); if ( n > 0 ) { +/* Enable TX before sending chars */ +if ( port->driver->start_tx != NULL ) +port->driver->start_tx(port); while ( n-- ) port->driver->putc( port, @@ -135,6 +151,9 @@ static void __serial_putc(struct serial_port *port, char c) if ( ((port->txbufp - port->txbufc) == 0) && port->driver->tx_ready(port) > 0 ) { +/* Enable TX before sending chars */ +if ( port->driver->start_tx != NULL ) +port->driver->start_tx(port); /* Buffer and UART FIFO are both empty, and port is available. */ port->driver->putc(port, c); } @@ -152,11 +171,18 @@ static void __serial_putc(struct serial_port *port, char c) while ( !(n = port->driver->tx_ready(port)) ) cpu_relax(); if ( n > 0 ) +{ +/* Enable TX before sending chars */ +if ( port->driver->start_tx != NULL ) +port->driver->start_tx(port); port->driver->putc(port, c); +} } else { /* Simple synchronous transmitter. 
*/ +if ( port->driver->start_tx != NULL ) +port->driver->start_tx(port); port->driver->putc(port, c); } } @@ -403,7 +429,9 @@ void serial_start_sync(int handle) if ( n < 0 ) /* port is unavailable and might not come up until reenabled by dom0, we can't really do proper sync */ -break; +break; +if ( port->driver->start_tx != NULL ) +port->driver->start_tx(port); port->driver->putc( port, port->txbuf[mask_serial_txbuf_idx(port->txbufc++)]); } diff --git a/xen/include/xen/serial.h b/xen/include/xen/serial.h index 9f4451b..71e6ade 100644 --- a/xen/include/xen/serial.h +++ b/xen/include/xen/serial.h @@ -81
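[Editor's note] The core of the RFC is a read-modify-write of the PL011 interrupt mask register (IMSC) that touches only the TX bit, driven by whether the software TX buffer is empty. A minimal host-side sketch follows. It makes several assumptions: the register is modelled as a plain variable rather than MMIO, struct fake_uart and on_tx_interrupt() are hypothetical names, the bit positions for RXI and TXI are taken from the PL011 register layout rather than from this patch, and the tx_ready() check in the patch's else branch is omitted.

```c
#include <assert.h>
#include <stdint.h>

/* PL011 IMSC bits; positions assumed from the PL011 register layout. */
#define RXI (1u << 4)
#define TXI (1u << 5)

/* Hypothetical stand-in for the port: a variable instead of MMIO, plus the
 * producer/consumer indices of the software TX buffer. */
struct fake_uart {
    uint32_t imsc;        /* interrupt mask set/clear register */
    unsigned int txbufp;  /* producer index */
    unsigned int txbufc;  /* consumer index */
};

/* Mask the TX interrupt without disturbing the other mask bits. */
static void tx_stop(struct fake_uart *u)
{
    u->imsc &= ~TXI;
}

/* Unmask the TX interrupt without disturbing the other mask bits. */
static void tx_start(struct fake_uart *u)
{
    u->imsc |= TXI;
}

/* Simplified version of the check added to serial_tx_interrupt():
 * nothing queued means the TX interrupt is masked, otherwise unmasked. */
static void on_tx_interrupt(struct fake_uart *u)
{
    if ( u->txbufc == u->txbufp )
        tx_stop(u);
    else
        tx_start(u);
}
```

Masking on an empty buffer is what breaks the loop described in the commit message: with nothing left to send, the TX interrupt can no longer be reported as pending in MIS.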
[Xen-devel] [qemu-mainline test] 32096: tolerable FAIL - PUSHED
flight 32096 qemu-mainline real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/32096/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-pair  17  guest-migrate/src_host/dst_host  fail  like 32029

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt  9  guest-start  fail  never pass
 test-armhf-armhf-xl  10  migrate-support-check  fail  never pass
 test-amd64-i386-libvirt  9  guest-start  fail  never pass
 test-amd64-amd64-xl-win7-amd64  14  guest-stop  fail  never pass
 test-amd64-i386-xl-win7-amd64  14  guest-stop  fail  never pass
 test-amd64-amd64-xl-winxpsp3  14  guest-stop  fail  never pass
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1  14  guest-stop  fail  never pass
 test-amd64-amd64-xl-qemut-winxpsp3  14  guest-stop  fail  never pass
 test-amd64-i386-xl-qemuu-win7-amd64  14  guest-stop  fail  never pass
 test-amd64-amd64-xl-qemut-win7-amd64  14  guest-stop  fail  never pass
 test-amd64-i386-xl-qemut-winxpsp3  14  guest-stop  fail  never pass
 test-amd64-i386-xl-qemut-win7-amd64  14  guest-stop  fail  never pass
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1  14  guest-stop  fail  never pass
 test-amd64-amd64-libvirt  9  guest-start  fail  never pass
 test-amd64-amd64-xl-pcipt-intel  9  guest-start  fail  never pass
 test-amd64-amd64-xl-qemuu-winxpsp3  14  guest-stop  fail  never pass
 test-amd64-amd64-xl-qemuu-win7-amd64  14  guest-stop  fail  never pass
 test-amd64-i386-xl-winxpsp3-vcpus1  14  guest-stop  fail  never pass
 test-amd64-i386-xl-qemuu-winxpsp3  14  guest-stop  fail  never pass
 test-amd64-i386-xl-winxpsp3  14  guest-stop  fail  never pass

version targeted for testing:
 qemuu  54f3a180a3d0b334c55d0f61d6e9fe5c7c6d42d5
baseline version:
 qemuu  0d7954c288e91b8a457f15a0a8e8244facf6594b

People who touched revisions under test:
 Gerd Hoffmann
 Peter Maydell

jobs:
 build-amd64  pass
 build-armhf  pass
 build-i386  pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt  pass
 build-amd64-pvops  pass
 build-armhf-pvops  pass
 build-i386-pvops  pass
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  pass
 test-amd64-i386-xl  pass
 test-amd64-i386-rhel6hvm-amd  pass
 test-amd64-i386-qemut-rhel6hvm-amd  pass
 test-amd64-i386-qemuu-rhel6hvm-amd  pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64  pass
 test-amd64-i386-xl-qemut-debianhvm-amd64  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64  pass
 test-amd64-i386-freebsd10-amd64  pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64  pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass
 test-amd64-amd64-xl-qemut-win7-amd64  fail
 test-amd64-i386-xl-qemut-win7-amd64  fail
 test-amd64-amd64-xl-qemuu-win7-amd64  fail
 test-amd64-i386-xl-qemuu-win7-amd64  fail
 test-amd64-amd64-xl-win7-amd64  fail
 test-amd64-i386-xl-win7-amd64  fail
 test-amd64-i386-xl-credit2  pass
 test-amd64-i386-freebsd10-i386  pass
 test-amd64-amd64-xl-pcipt-intel  fail
 test-amd64-i386-rhel6hvm-intel  pass
 test-amd64-i386-qemut-rhel6hvm-intel  pass
 test-amd64-i386-qemuu-rhel6hvm-intel  pass
 test-amd64-amd64-libvirt  fail
 test-armhf-armhf-libvirt  fail
 test-amd64-i386-libvirt  fail
 test-amd64-i386-xl-multivcpu  pass
[Xen-devel] Some questions regarding QEMU, UEFI, PCI/VGA Passthrough, and other things
While I am not a developer myself (I have always been bad at reading and writing code), there are several capabilities of Xen and its supporting software whose progress I like to follow, more out of curiosity than anything else. However, the documentation usually lags well behind what is currently implemented in code, and sometimes you catch a mail here with some useful data on a topic but never hear about it again, either because you missed later progress or because the whole discussion was inconclusive. So this mail is pretty much a compilation of small questions about things I came across that never came up again, but that might give someone ideas, which is why I believe it is more useful for xen-devel than for xen-users.

QEMU

Because, as a VGA Passthrough user, I'm currently forced to use qemu-xen-traditional (though I hear that some users have had success with qemu-xen in Xen 4.4, I didn't have any luck with it myself), I'm stuck with an old QEMU version. However, looking at the changelogs of the latest versions, I always see some interesting features which, as far as I know, Xen doesn't currently incorporate.

1a - One of the things that newer QEMU versions seem to be capable of is emulating the much newer Intel Q35 chipset, instead of only the current 440FX from the P5 Pentium era. Some data on Q35 emulation here:
www.linux-kvm.org/wiki/images/0/06/2012-forum-Q35.pdf
wiki.qemu.org/Features/Q35
I'm aware that newer doesn't necessarily mean better, especially because the practical advantages of Q35 over 440FX aren't very clear. There are several new emulated features, like an AHCI controller and a PCIe bus, which sound interesting on paper, but I don't know whether they add any useful features or improve performance/compatibility. Some comments I read on the matter wrongly stated that Q35 would be needed to do PCIe Passthrough, but this is currently possible on 440FX, though I don't know about the low-level implementation differences.
I think most of the idea behind Q35 is to make the VM look more like real hardware, instead of looking like a ridiculously obvious emulated platform. In the case of the AHCI controller, I suppose that the OS would need to include drivers for the controller at installation time, which, if I recall correctly, both Windows Vista/7/8 and Linux should have, though for a Windows XP install the Q35 AHCI controller drivers would probably need to be slipstreamed into an install ISO with nLite for it to work.

1b - Another experimental feature that recently appeared in QEMU is IOMMU emulation. Info here:
www.mulix.org/pubs/iommu/viommu.pdf
www.linux-kvm.org/wiki/images/4/4a/2010-forum-joro-pv-iommu.pdf
The use of IOMMU emulation seems to be that you can do PCI Passthrough in a Nested Virtualization environment. At first sight this looked a bit useless, because using a DomU to do PCI Passthrough with an emulated IOMMU sounds like too much overhead if you can simply emulate that device in the nested DomU. However, I also read about the possibility of Xen using hardware virtualization for Dom0 instead of it being paravirtualized. In that case, would it be possible to provide the IOMMU emulation layer to Dom0 so you could do PCI Passthrough on platforms without proper support for it? It seems a rather interesting idea. I think it would also be useful as a standardized debug platform for IOMMU virtualization and passthrough, because some years ago missing or malformed ACPI DMAR/IVRS tables were all over the place, and getting IOMMU virtualization working was pretty much random luck and at the mercy of the goodwill of the motherboard maker to fix their BIOSes.

UEFI for DomUs

I managed to get this one working, but it seems to need some clarifications here and there.
2a - As far as I know, if you add --enable-ovmf to ./configure before building Xen, it downloads and builds some extra code from an OVMF repository which Xen maintains, though I don't know whether it's a snapshot of whatever the edk2 repository had at that time, or whether it includes custom patches for the OVMF firmware to work in Xen. Xen also has another ./configure option, --with-system-ovmf, which is supposed to be used to specify a path to an OVMF firmware binary. However, when I tried that option some months ago, I never managed to get it working, either using a package with a precompiled ovmf.bin from the Arch Linux User Repository, or using another package with the source to compile it myself. Both binaries worked with standalone QEMU, though. Besides the fact that the parameter itself was quite hidden, there is absolutely no info on whether the provided OVMF binary has to comply with some special requirements, be it custom patches for OVMF so it works with Xen, whether it has to be a binary that only includes TianoCore, or the unified one that includes the NVRAM in a single file. In Arch Linux, for the Xen 4.4 package, t
Re: [Xen-devel] [PATCH] console: allocate ring buffer earlier
Hi Jan,

On 05/12/2014 16:55, Jan Beulich wrote:
> I didn't change ARM, as I wasn't sure how far ahead this call could be pulled.

AFAIU, the new function only requires that the page tables are set up (because of the alloc_xenheap_pages). So console_init_mem could be called right after console_init_preirq.

Regards,

--
Julien Grall
[Xen-devel] Steps to run XenServer on ARM Platform
Hi,

I am trying to find a tutorial to jumpstart installing XenServer / XCP on an ARM 64-bit platform. Could the mailing list help?

-Regards
Manish
[Xen-devel] Xen 4.5 Development Update (RC3)
Feature patchsets that did not make it in by today have been put on the deferred list.

Xen 4.5-rc3 was out on Wednesday (3rd). There are two known issues (see 'Known issues' below), one of which already has a fix in the tree (so it will show up in RC4). They are to be fixed by RC4 if:
- The maintainers are fine with it,
- The risk is minimal to common code paths.

Details for the test-day are at
http://wiki.xen.org/wiki/Xen_4.5_RC3_test_instructions

In terms of bugs, we have:
#11 qxl hypervisor support
#13 Re: [Xen-devel] man page example: xm block-attach
#18 xl improve support for migration over non-sshlike tunnels
#19 xl migrate transport improvements
#22 xl does not support specifying virtual function for passthrough device
#23 Remove arbitrary LIBXL_MAXMEM_CONSTANT from libxl, see what breaks
#24 xl missing support for encrypted VNC
#27 Re: [Xen-devel] xend vs xl with pci=['' are not owned by pciback or pcistub will still launch.
#28 support PCI hole resize in qemu-xen [ 'mmio_hole' fix it, but the ultimate way is to fix it in QEMU]
#30 libxl should implement non-suspend-cancel based resume path
#36 credit2 only uses one runqueue instead of one runq per socket
#38 Implement VT-d large pages so we can avoid sharing between EPT
#40 linux pvops: fpu corruption due to incorrect assumptions
#42 "linux, S3 resume of PVHVM fails - missing call to xen_arch_post_suspend?"
#43 "30s delay loading xenfb driver on some systems"
#44 Security policy ambiguities - XSA-108 process post-mortem
#45 arm: domain 0 disables clocks which are in fact being used
#46 qemu-upstream: limitation on 4 emulated NICs prevents guest from starting unless PV override is used.

= Timeline =

We were planning on a 9-month release cycle, but it is more like a 10-month one. Based on that, below are the estimated dates:
* Feature Freeze: 24th September 2014
* First RC: 24th October [Friday!]
* RC2: Nov 11th
* RC2 Test-day: Nov 13th
* RC3: Dec 3rd.
* RC3 Test-day: Dec 4th
<==== WE ARE HERE ===>
* RC4: Dec 15th
* RC4 Test-day: Dec 17th
Release Date: Jan 7th.

The RCs and release will of course depend on stability and bugs, and will therefore be fairly unpredictable. The feature freeze may be slipped for especially important features which are near completion.

Bug fixes, if acked by the maintainer, can go in anytime before the first RC. Later on we will need to weigh the risk of regression against the reward, to eliminate the possibility of a bug introducing another bug.

= Prognosis =

The states are: none -> fair -> ok -> good -> done
none - nothing yet
fair - still working on it, patches are prototypes or RFC
ok - patches posted, acting on review
good - some last minute pieces
done - all done, might have bugs

= Feature freeze exception =

Remember our goal for the release:
1. A bug-free release
2. An awesome release
3. An on-time release

Accepting a new feature may make Xen more awesome; but it also introduces a risk that it will introduce more bugs. That bug may be found before the release (threatening #3), or it may not be found until after the release (threatening #1). Each freeze exception request will attempt to balance the benefits (how awesome the exception is) vs the risks (will it cause the release to slip, or worse, cause a bug which goes unnoticed into the final release).

The idea is that today we will be pretty permissive, but that we will become progressively more conservative until the first RC, which was scheduled for 3 weeks' time (October 25). After that, we will only accept bug fixes.

Bug fixes can be checked in without a freeze exception throughout the code freeze, unless the maintainer thinks they are particularly high risk. In later RCs, we may even begin rejecting bug fixes if the broken functionality is small and the risk to other functionality is high.

Document changes can go in anytime if the maintainer is OK with it.
Features which are currently marked "experimental" or do not at the moment work at all cannot be broken really; so changes to code only used by those features should be able to get a freeze exception easily.

Features which change or add new interfaces which will need to be supported in a backwards-compatible way (for instance, vNUMA) will need freeze exceptions to make sure that the interface itself has enough time to be considered stable.

These are guidelines and principles to give you an idea where we're coming from; if you think there's a good reason why making an exception for you will help us achieve goals 1-3 above better than not doing so, feel free to make your case.

= Open =

== Known issues ==

* xc_reserved_device_memory_map in hvmloader to avoid conflicting MMIO/RAM (good)
 v7 posted. Treating pieces as bug-fixes only. Low likelihood of making it in Xen 4.5.
 - Tiejun Chen

* pygrub does not handle certain configurations. (done)
 went in after RC3
 - Andrew Cooper and Boris Ostrovsky

== Linux ==

* Linux block multiqueue (ok)
 v2 posted.
 - Arianna Avanzi
Re: [Xen-devel] [PATCH 1/4] dma: add dma_get_required_mask_from_max_pfn()
On Fri, Dec 05, 2014 at 02:08:00PM +, David Vrabel wrote:
> A generic dma_get_required_mask() is useful even for architectures (such
> as ia64) that define ARCH_HAS_GET_REQUIRED_MASK.
>
> Signed-off-by: David Vrabel
> Reviewed-by: Stefano Stabellini
> ---
> drivers/base/platform.c | 10 --

Is this why you sent this to me? The x86 maintainers should handle this patch set, not me for a tiny 8 lines in just one of the files, sorry.

greg k-h
[Xen-devel] [xen-4.4-testing test] 32095: regressions - trouble: blocked/broken/fail/pass
flight 32095 xen-4.4-testing real [real] http://www.chiark.greenend.org.uk/~xensrcts/logs/32095/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-i386-pair 17 guest-migrate/src_host/dst_host fail REGR. vs. 31781 Tests which are failing intermittently (not blocking): test-amd64-amd64-xl-sedf 8 debian-fixupfail pass in 32055 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 3 host-install(3) broken pass in 32055 test-amd64-amd64-pv 16 guest-stop fail in 32055 pass in 32095 test-amd64-i386-qemut-rhel6hvm-amd 3 host-install(3) broken in 32055 pass in 32095 Regressions which are regarded as allowable (not blocking): test-amd64-i386-xl-win7-amd64 7 windows-install fail in 32055 like 31733 Tests which did not succeed, but are not blocking: test-amd64-i386-rumpuserxen-i386 1 build-check(1) blocked n/a test-amd64-amd64-rumpuserxen-amd64 1 build-check(1) blocked n/a test-amd64-i386-libvirt 9 guest-start fail never pass test-armhf-armhf-xl 10 migrate-support-checkfail never pass test-armhf-armhf-libvirt 9 guest-start fail never pass test-amd64-amd64-libvirt 9 guest-start fail never pass build-i386-rumpuserxen6 xen-buildfail never pass build-amd64-rumpuserxen 6 xen-buildfail never pass test-amd64-amd64-xl-qemut-winxpsp3 14 guest-stop fail never pass test-amd64-amd64-xl-pcipt-intel 9 guest-start fail never pass test-amd64-i386-xl-qemut-win7-amd64 14 guest-stop fail never pass test-amd64-i386-xend-winxpsp3 17 leak-check/check fail never pass test-amd64-i386-xl-qemuu-win7-amd64 14 guest-stop fail never pass test-amd64-amd64-xl-qemuu-win7-amd64 14 guest-stop fail never pass test-amd64-amd64-xl-win7-amd64 14 guest-stop fail never pass test-amd64-i386-xl-winxpsp3-vcpus1 14 guest-stop fail never pass test-amd64-amd64-xl-qemut-win7-amd64 14 guest-stop fail never pass test-amd64-i386-xl-qemut-winxpsp3-vcpus1 14 guest-stop fail never pass test-amd64-i386-xend-qemut-winxpsp3 17 leak-check/checkfail never pass 
test-amd64-amd64-xl-winxpsp3 14 guest-stop fail never pass test-amd64-i386-xl-win7-amd64 14 guest-stop fail never pass test-amd64-amd64-xl-qemuu-winxpsp3 14 guest-stop fail never pass test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 14 guest-stop fail in 32055 never pass version targeted for testing: xen a39f202031d7f1d8d9e14b8c3d7d11c812db253e baseline version: xen 7679aeb444ed3bc4de0f473c16c47eab7d2f9d33 People who touched revisions under test: Jan Beulich jobs: build-amd64-xend pass build-i386-xend pass build-amd64 pass build-armhf pass build-i386 pass build-amd64-libvirt pass build-armhf-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-armhf-pvopspass build-i386-pvops pass build-amd64-rumpuserxen fail build-i386-rumpuserxen fail test-amd64-amd64-xl pass test-armhf-armhf-xl pass test-amd64-i386-xl pass test-amd64-i386-rhel6hvm-amd pass test-amd64-i386-qemut-rhel6hvm-amd pass test-amd64-i386-qemuu-rhel6hvm-amd pass test-amd64-amd64-xl-qemut-debianhvm-amd64pass test-amd64-i386-xl-qemut-debianhvm-amd64 pass test-amd64-amd64-xl-qemuu-debianhvm-amd64pass test-amd64-i386-xl-qemuu-debianhvm-amd64 pass test-amd64-i386-freebsd10-amd64 pass test-amd64-amd64-xl-qemuu-ovmf-amd64 pass test-amd64-i386-xl-qemuu-ovmf-amd64 pass test-amd64-amd64-rumpuserxen-amd64 blocked test-amd64-amd64-xl-qemu
Re: [Xen-devel] [PATCHv5 0/4] dma, x86, xen: reduce SWIOTLB usage in Xen guests
On Fri, Dec 05, 2014 at 02:07:59PM +, David Vrabel wrote:
> On systems where DMA addresses and physical addresses are not 1:1
> (such as Xen PV guests), the generic dma_get_required_mask() will not
> return the correct mask (since it uses max_pfn).
>
> Some device drivers (such as mptsas, mpt2sas) use
> dma_get_required_mask() to set the device's DMA mask to allow them to use
> only 32-bit DMA addresses in hardware structures. This results in
> unnecessary use of the SWIOTLB if DMA addresses are more than 32-bits,
> impacting performance significantly.
>
> This series allows Xen PV guests to override the default
> dma_get_required_mask() with a more suitable one.
>
> Changes in v5:
> - xen_swiotlb_get_required_mask() is x86 only.
>
> Changes in v4:
> - Assume 64-bit mask is required.
>
> Changes in v3:
> - fix off-by-one in xen_dma_get_required_mask()
> - split ia64 changes into separate patch.
>
> Changes in v2:
> - split x86 and xen changes into separate patches
>
> David

Why are you sending these to me? Am I the DMA maintainer and forgot about it?

/me digs in MAINTAINERS...

Nope, not me!

Patches are now deleted from my queue, go use scripts/get_maintainer.pl like you should have done...

greg k-h
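
For readers following the cover letter quoted above, here is a rough user-space sketch of the idea behind a "required mask": the smallest all-ones mask that covers the highest address a device may have to reach. It is not the kernel's dma_get_required_mask() implementation; the names required_mask_for() and fits_in_32bit_dma() are made up for illustration. The cover letter's point is that on a Xen PV guest the highest DMA (machine) address is not bounded by max_pfn, so a mask derived from max_pfn can be too small.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative only (hypothetical helper, not a kernel API):
 * return the smallest mask of the form 2^n - 1 that covers
 * highest_addr, i.e. the highest address a device must reach.
 */
static uint64_t required_mask_for(uint64_t highest_addr)
{
    uint64_t mask = highest_addr;

    /* Smear the top set bit into every lower bit position. */
    mask |= mask >> 1;
    mask |= mask >> 2;
    mask |= mask >> 4;
    mask |= mask >> 8;
    mask |= mask >> 16;
    mask |= mask >> 32;
    return mask;
}

/* Hypothetical helper: would a 32-bit DMA mask be enough? */
static int fits_in_32bit_dma(uint64_t highest_addr)
{
    return required_mask_for(highest_addr) <= UINT32_MAX;
}
```

A driver that asks this question with a guest-physical upper bound, when the real DMA addresses are machine addresses above 4 GiB, would wrongly pick a 32-bit mask and end up bouncing through the SWIOTLB, which is the situation the series describes.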
[Xen-devel] [xen-unstable baseline test] 32093: regressions - FAIL
"Old" tested version had not actually been tested; therefore in this flight we test it, rather than a new candidate. The baseline, if any, is the most recent actually tested revision. flight 32093 xen-unstable real [real] http://www.chiark.greenend.org.uk/~xensrcts/logs/32093/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-i386-pair 17 guest-migrate/src_host/dst_host fail REGR. vs. 32051 Regressions which are regarded as allowable (not blocking): test-amd64-amd64-xl-sedf-pin 5 xen-boot fail blocked in 32051 Tests which did not succeed, but are not blocking: test-amd64-i386-libvirt 9 guest-start fail never pass test-amd64-amd64-xl-pcipt-intel 9 guest-start fail never pass test-armhf-armhf-libvirt 9 guest-start fail never pass test-armhf-armhf-xl 10 migrate-support-checkfail never pass test-amd64-amd64-libvirt 9 guest-start fail never pass test-amd64-i386-xl-qemut-winxpsp3 14 guest-stopfail never pass test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 14 guest-stop fail never pass test-amd64-i386-xl-qemuu-winxpsp3 14 guest-stopfail never pass test-amd64-i386-xl-qemut-winxpsp3-vcpus1 14 guest-stop fail never pass test-amd64-amd64-xl-qemut-win7-amd64 14 guest-stop fail never pass test-amd64-amd64-xl-win7-amd64 14 guest-stop fail never pass test-amd64-amd64-xl-qemuu-win7-amd64 14 guest-stop fail never pass test-amd64-i386-xl-qemut-win7-amd64 14 guest-stop fail never pass test-amd64-amd64-xl-qemut-winxpsp3 14 guest-stop fail never pass test-amd64-i386-xl-win7-amd64 14 guest-stop fail never pass test-amd64-i386-xl-winxpsp3-vcpus1 14 guest-stop fail never pass test-amd64-amd64-xl-winxpsp3 14 guest-stop fail never pass test-amd64-i386-xl-winxpsp3 14 guest-stop fail never pass test-amd64-i386-xl-qemuu-win7-amd64 14 guest-stop fail never pass test-amd64-amd64-xl-qemuu-winxpsp3 14 guest-stop fail never pass version targeted for testing: xen 3a80985b894f54eb3b2e143e4dea737cf139a517 baseline version: xen 
4d1a77ba7ab94183c203226d3fe7ac1cd087c59b People who touched revisions under test: Konrad Rzeszutek Wilk jobs: build-amd64 pass build-armhf pass build-i386 pass build-amd64-libvirt pass build-armhf-libvirt pass build-i386-libvirt pass build-amd64-oldkern pass build-i386-oldkern pass build-amd64-pvopspass build-armhf-pvopspass build-i386-pvops pass build-amd64-rumpuserxen pass build-i386-rumpuserxen pass test-amd64-amd64-xl pass test-armhf-armhf-xl pass test-amd64-i386-xl pass test-amd64-i386-rhel6hvm-amd pass test-amd64-i386-qemut-rhel6hvm-amd pass test-amd64-i386-qemuu-rhel6hvm-amd pass test-amd64-amd64-xl-qemut-debianhvm-amd64pass test-amd64-i386-xl-qemut-debianhvm-amd64 pass test-amd64-amd64-xl-qemuu-debianhvm-amd64pass test-amd64-i386-xl-qemuu-debianhvm-amd64 pass test-amd64-i386-freebsd10-amd64 pass test-amd64-amd64-xl-qemuu-ovmf-amd64 pass test-amd64-i386-xl-qemuu-ovmf-amd64 pass test-amd64-amd64-rumpuserxen-amd64 pass test-amd64-amd64-xl-qemut-win7-amd64 fail test-amd64-i386-xl-qemut-win7-amd64 fail test-amd64-amd64-xl-qemuu-win7-amd64 fail test-amd64-i386-xl-qemuu-win7-amd64 fail test-amd64-amd64-xl-win7-amd64 fail test-amd64-i386-xl-win7-amd64fail test-amd64-i386-xl-credit2
Re: [Xen-devel] [PATCH] libxl: Set path to console on domain startup.
Not really familiar with libvirt, but... On Fri, 5 Dec 2014 16:30:06 + Anthony PERARD wrote: > The path to the pty of a Xen PV console is set only in > virDomainOpenConsole. But this is done too late. A call to > virDomainGetXMLDesc done before OpenConsole will not have the path to > the pty, but a call after OpenConsole will. > > e.g. of the current issue. > Starting a domain with '' > Then: > virDomainGetXMLDesc(): > > > > > > virDomainOpenConsole() > virDomainGetXMLDesc(): > > > > > > > > The patch intend to get the tty path on the first call of GetXMLDesc. > > Signed-off-by: Anthony PERARD > --- > src/libxl/libxl_domain.c | 17 + > 1 file changed, 17 insertions(+) > > diff --git a/src/libxl/libxl_domain.c b/src/libxl/libxl_domain.c > index 9c62291..de56054 100644 > --- a/src/libxl/libxl_domain.c > +++ b/src/libxl/libxl_domain.c > @@ -1290,6 +1290,23 @@ libxlDomainStart(libxlDriverPrivatePtr driver, > virDomainObjPtr vm, > if (libxlDomainSetVcpuAffinities(driver, vm) < 0) > goto cleanup_dom; > > +if (vm->def->nconsoles) { > +virDomainChrDefPtr chr = NULL; Pointless initializer. Possibly combine with following statement. -d > +chr = vm->def->consoles[0]; > +if (chr && chr->source.type == VIR_DOMAIN_CHR_TYPE_PTY) { > +libxl_console_type console_type; > +char *console = NULL; > +console_type = > +(chr->targetType == > VIR_DOMAIN_CHR_CONSOLE_TARGET_TYPE_SERIAL ? > + LIBXL_CONSOLE_TYPE_SERIAL : LIBXL_CONSOLE_TYPE_PV); > +ret = libxl_console_get_tty(priv->ctx, vm->def->id, > chr->target.port, > +console_type, &console); > +if (!ret) > +ignore_value(VIR_STRDUP(chr->source.data.file.path, > console)); > +VIR_FREE(console); > +} > +} > + > if (!start_paused) { > libxl_domain_unpause(priv->ctx, domid); > virDomainObjSetState(vm, VIR_DOMAIN_RUNNING, > VIR_DOMAIN_RUNNING_BOOTED); > -- > Anthony PERARD > > > ___ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel > ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] RFC: Cleaning up the Mini-OS namespace
po...@iki.fi said:
> I wonder if work is minimized if we attempt to merge before or after
> we (I?) take the carving knife for a second round in the rumprun-xen
> repo to minimize MiniOS to run only on top of itself.

Before, I think. Minimizing our copy of Mini-OS duplicates what we would need to do to the upstream copy.

I think the steps are roughly as follows:

a) split the current rumprun-xen build out of the Mini-OS Makefile.
b) replace our fork of Mini-OS with the vanilla upstream Mini-OS.
c) re-apply my work and your work, while checking things keep working with upstream xen.git, until we get rumprun-xen working again.

c) will leave us with a set of patches to upstream.

Does this make sense? It's a fair amount of work but mostly retracing steps we've already done.

It'd help if we had a full list of "what exactly needs to keep working upstream", see my other reply to Andrew. Maybe also osstest building and running Mini-OS related tests off our branch while we do the work? (Ian: ping? Doable?)

Martin
Re: [Xen-devel] Poor network performance between DomU with multiqueue support
On Fri, Dec 05, 2014 at 03:20:55PM +, Zoltan Kiss wrote: > > > On 04/12/14 14:31, Zhangleiqiang (Trump) wrote: > >>-Original Message- > >>From: Zoltan Kiss [mailto:zoltan.k...@linaro.org] > >>Sent: Thursday, December 04, 2014 9:35 PM > >>To: Zhangleiqiang (Trump); Wei Liu; xen-devel@lists.xen.org > >>Cc: Xiaoding (B); Zhuangyuxin; zhangleiqiang; Luohao (brian); Yuzhou (C) > >>Subject: Re: [Xen-devel] Poor network performance between DomU with > >>multiqueue support > >> > >> > >> > >>On 04/12/14 12:09, Zhangleiqiang (Trump) wrote: > I think that's expected, because guest RX data path still uses > grant_copy while > >guest TX uses grant_map to do zero-copy transmit. > >>>As I understand, the RX process is as follows: > >>>1. Phy NIC receive packet > >>>2. XEN Hypervisor trigger interrupt to Dom0 3. Dom0' s NIC driver do > >>>the "RX" operation, and the packet is stored into SKB which is also > >>>owned/shared with netback > >>Not that easy. There is something between the NIC driver and netback which > >>directs the packets, e.g. the old bridge driver, ovs, or the IP stack of > >>the kernel. > >>>4. NetBack notify netfront through event channel that a packet is > >>>receiving 5. Netfront grant a buffer for receiving and notify netback > >>>the GR (if using grant-resue mechanism, netfront just notify the GR to > >>>netback) through IO Ring > >>It looks a bit confusing in the code, but netfront put "requests" on the > >>ring > >>buffer, which contains the grant ref of the guest page where the backend can > >>copy. When the packet comes, netback consumes these requests and send > >>back a response telling the guest the grant copy of the packet finished, it > >>can > >>start handling the data. (sending a response means it's placing a response > >>in > >>the ring and trigger the event channel) And ideally netback should always > >>have > >>requests in the ring, so it doesn't have to wait for the guest to fill it > >>up. > > > >>>6. 
NetBack do the grant_copy to copy packet from its SKB to the buffer > >>>referenced by GR, and notify netfront through event channel 7. > >>>Netfront copy the data from buffer to user-level app's SKB > >>Or wherever that SKB should go, yes. Like with any received packet on a real > >>network interface. > >>> > >>>Am I right? Why not using zero-copy transmit in guest RX data pash too ? > >>Because that means you are mapping that memory to the guest, and you won't > >>have any guarantee when the guest will release them. And netback can't just > >>unmap them forcibly after a timeout, because finding a correct timeout value > >>would be quite impossible. > >>A malicious/buggy/overloaded guest can hold on to Dom0 memory indefinitely, > >>but it even becomes worse if the memory came from another > >>guest: you can't shutdown that guest for example, until all its memory is > >>returned to him. > > > >Thanks for your detailed explanation about RX data path, I have get it, :) > > > >About the issue that poor performance between DomU to DomU, but high > >throughout between Dom0 to remote Dom0/DomU mentioned in my previous mail, > >do you have any idea about it? > > > >I am wondering if netfront/netback can be optimized to reach the 10Gbps > >throughout between DomUs running on different hosts connected with 10GE > >network. Currently, it seems like the TX is not the bottleneck, because we > >can reach the aggregate throughout of 9Gbps when sending packets from one > >DomU to other 3 DomUs running on different host. So I think the bottleneck > >maybe the RX, are you agreed with me? > > > >I am wondering what is the main reason that prevent RX to reach the higher > >throughout? Compared to KVM+virtio+vhost, which can reach high throughout, > >the RX has extra grantcopy operation, and the grantcopy operation may be one > >reason for it. Do you have any idea about it too? > It's quite sure that the grant copy is the bottleneck for a single queue RX > traffic. 
I don't know what's the plan to help that, currently only a faster > CPU can help you with that. Could the Intel QuickData help with that? > > > > >> > >>Regards, > >> > >>Zoli > > ___ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] RFC: Cleaning up the Mini-OS namespace
andrew.coop...@citrix.com said: > I think this is a very good idea, and I am completely in favour of it. > > There are already-identified issues such as MiniOS leaking things like > ARRAY_SIZE() into linked namespaces, which I havn't yet had enough tuits > to fix. > > I think splitting things like the stub libc away from the "MiniOS Xen > Framework" is also a good idea. Ideally, the result of a "MiniOS Build" > would be a small set of .a's which can then be linked against some > normal C to make a minios guest. (How feasible this is in reality > remains to be seen.) The approach I used for rumprun-xen is to link all of MiniOS' object files except the startfile into a final .o with "ld -r". This then allows me to use "objcopy -w -GPREFIX..." to make all symbols in minios.o *except* those starting with PREFIX local. This has the advantage that I only had to rename symbols I really wanted to keep global rather than going through all the MiniOS code adding "static" in places where it was missing and sorting out the resulting inter-dependencies. > From a not-public-API point of view, all you have to worry about is that > the existing minios stuff in xen.git, including the stubdom stuff, > continues to work. We have never made any guarantees to anyone using > minios out-of-tree. "Existing minios stuff" meaning the default build of extras/mini-os? What's up with the -DHAVE_LIBC codepaths in mini-os? Who or what uses these? Grepping around in stubdom/ doesn't come up with anything... "Stubdom stuff" meaning the default build of stubdom/, plus the "make c-stubdom" and "make caml-stubdom" examples documented in README? Anything else? Sorry if this is obvious but I'm not that familiar with all of xen.git. Thanks, Martin ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2] introduce grant copy for user land
On 12/02/2014 11:13 AM, Thanos Makatos wrote: This patch introduces the interface to allow user-space applications execute grant-copy operations. This is done by sending an ioctl to the grant device. Signed-off-by: Thanos Makatos --- drivers/xen/gntdev.c | 171 + include/uapi/xen/gntdev.h | 69 ++ 2 files changed, 240 insertions(+) diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c index 51f4c95..7b4a8e0 100644 --- a/drivers/xen/gntdev.c +++ b/drivers/xen/gntdev.c @@ -705,6 +705,174 @@ static long gntdev_ioctl_notify(struct gntdev_priv *priv, void __user *u) return rc; } +static int gntdev_gcopy_batch(int nr_segments, unsigned long gcopy_cb, + struct gntdev_grant_copy_segment __user *__segments, int dir, + int src, int dst) { + + static const int batch_size = PAGE_SIZE / (sizeof(struct page*) + + sizeof(struct gnttab_copy) + sizeof(struct gntdev_grant_copy_segment)); + struct page **pages = (struct page **)gcopy_cb; + struct gnttab_copy *batch = (struct gnttab_copy *)((unsigned long)pages + + sizeof(struct page*) * batch_size); + struct gntdev_grant_copy_segment *segments = + (struct gntdev_grant_copy_segment *)((unsigned long)batch + + sizeof(struct gnttab_copy) * batch_size); + unsigned int nr_pinned = 0, nr_segs2cp = 0; + int err = 0, i; + const int write = dir == GNTCOPY_IOCTL_g2s; + + nr_segments = min(nr_segments, batch_size); + + if (unlikely(copy_from_user(segments, __segments, + sizeof(struct gntdev_grant_copy_segment) * nr_segments))) { + pr_debug("failed to copy %d segments from user", nr_segments); + err = -EFAULT; + goto out; + } + + for (i = 0; i < nr_segments; i++) { + + xen_pfn_t pgaddr; + unsigned long start, offset; + struct gntdev_grant_copy_segment *seg = &segments[i]; + + if (dir == GNTCOPY_IOCTL_s2g || dir == GNTCOPY_IOCTL_g2s) { + + start = (unsigned long)seg->self.iov.iov_base & PAGE_MASK; + offset = (unsigned long)seg->self.iov.iov_base & ~PAGE_MASK; + if (unlikely(offset + seg->self.iov.iov_len > PAGE_SIZE)) { + pr_warn("segments 
crossing page boundaries not yet " + "implemented\n"); + err = -ENOSYS; + goto out; + } + + err = get_user_pages_fast(start, 1, write, &pages[i]); + if (unlikely(err != 1)) { + pr_debug("failed to get user page %lu", start); + err = -EFAULT; + goto out; + } + + nr_pinned++; + + pgaddr = pfn_to_mfn(page_to_pfn(pages[i])); + } + + nr_segs2cp++; + + switch (dir) { + case GNTCOPY_IOCTL_g2s: /* copy from guest */ + batch[i].len = seg->self.iov.iov_len; + batch[i].source.u.ref = seg->self.ref; + batch[i].source.domid = src; + batch[i].source.offset = seg->self.offset; + batch[i].dest.u.gmfn = pgaddr; + batch[i].dest.domid = DOMID_SELF; + batch[i].dest.offset = offset; + batch[i].flags = GNTCOPY_source_gref; + break; + case GNTCOPY_IOCTL_s2g: /* copy to guest */ + batch[i].len = seg->self.iov.iov_len; + batch[i].source.u.gmfn = pgaddr; + batch[i].source.domid = DOMID_SELF; + batch[i].source.offset = offset; + batch[i].dest.u.ref = seg->self.ref; + batch[i].dest.domid = dst; + batch[i].dest.offset = seg->self.offset; + batch[i].flags = GNTCOPY_dest_gref; + break; + case GNTCOPY_IOCTL_g2g: /* copy guest to guest */ + batch[i].len = seg->g2g.len; + batch[i].source.u.ref = seg->g2g.src.ref; + batch[i].source.domid = src; + batch[i].source.offset = seg->g2g.src.offset; + batch[i].dest.u.ref = seg->g2g.dst.ref; + batch[i].dest.domid = dst; + batch[i].dest.offset = seg->g2g.dst.offset; + batch[i].flags = GNTCOPY_source_gref | GNTCOPY_dest_gref; + break; + default: +
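
The page-boundary handling in the quoted gntdev_gcopy_batch() (split the user address into a page-aligned start and an in-page offset, then reject segments that would cross into the next page) can be sketched in plain C as follows. This is a stand-alone illustration under assumptions: a 4 KiB page size, and the names SKETCH_PAGE_SIZE, struct page_span and split_within_page() are invented here and do not appear in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096UL                    /* assumed page size */
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1)) /* clears the in-page bits */

/* Hypothetical result type: page-aligned start plus offset into that page. */
struct page_span {
    uintptr_t start;
    unsigned long offset;
};

/*
 * Returns 0 and fills *out if [addr, addr + len) lies within a single
 * page; returns -1 if the segment would cross a page boundary.
 */
static int split_within_page(uintptr_t addr, size_t len, struct page_span *out)
{
    unsigned long offset = addr & ~SKETCH_PAGE_MASK;

    /* Written as a subtraction so a huge len cannot overflow the check. */
    if (len > SKETCH_PAGE_SIZE - offset)
        return -1;

    out->start = addr & SKETCH_PAGE_MASK;
    out->offset = offset;
    return 0;
}
```

In the patch the equivalent failure path logs a warning and returns -ENOSYS, since segments crossing page boundaries are stated to be not yet implemented.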
Re: [Xen-devel] A few EFI code questions
On Fri, Dec 05, 2014 at 05:00:52PM +, Jan Beulich wrote:
> >>> On 05.12.14 at 17:40, wrote:
> > On Fri, Dec 05, 2014 at 03:00:14PM +, Jan Beulich wrote:
> >> but I don't think this possibility of renaming warrants a much longer
> >> discussion. Please also remember that renaming always implies more
> >> cumbersome backporting, even if only slightly more.
> >
> > I suppose that you are thinking about backporting my EFI + multiboot2
> > patches somewhere.
>
> Not really, I was just thinking about bug fixes in general.

OK. So, go or no go for efi-boot.h name change to boot.h (of course after 4.5 release)? If yes then when? After or before my EFI + multiboot2 patches? I would like to know that in advance because I am going to release first version next week.

Daniel
Re: [Xen-devel] [Xen-users] 4.5 git: regression in xen systemd shutdown hangs the OS
On Tue, Dec 02, Olaf Hering wrote:
> On Tue, Dec 02, Ian Campbell wrote:
>
> > On Mon, 2014-12-01 at 23:41 +, Mark Pryor wrote:
> > > list,
> >
> > Thanks. If you've identified a buggy changeset then it is fine to post
> > to the devel lists. I've added a CC. I've also CCd everyone listed in
> > the commit which you've fingered.
> >
> > Olaf, does this suggested change look correct? If so then can you turn
> > it into a patch please.
>
> Yes, something like this (sed -i 's@socket@service@g' *.in):

But even with that change xendomains is hanging if it can't talk to xenstored for whatever reason. The result is that the system hangs forever at shutdown. I will try to fix that for 4.5.

Olaf
Re: [Xen-devel] [PATCH] console: allocate ring buffer earlier
On Fri, Dec 05, 2014 at 04:55:24PM +, Jan Beulich wrote: > ... when "conring_size=" was specified on the command line. We can't > really do this as early as we would want to when the option was not > specified, as the default depends on knowing the system CPU count. Yet > the parsing of the ACPI tables is one of the things that generates a > lot of output especially on large systems. > > I didn't change ARM, as I wasn't sure how far ahead this call could be > pulled. > > Signed-off-by: Jan Beulich Make sense for me but I think that we should have the same thing for ARM too. > --- a/xen/arch/x86/setup.c > +++ b/xen/arch/x86/setup.c > @@ -1187,6 +1187,7 @@ void __init noreturn __start_xen(unsigne > } > > vm_init(); > +console_init_mem(); > vesa_init(); > > softirq_init(); > --- a/xen/drivers/char/console.c > +++ b/xen/drivers/char/console.c > @@ -744,15 +744,14 @@ void __init console_init_preirq(void) > } > } > > -void __init console_init_postirq(void) > +void __init console_init_mem(void) > { > char *ring; > unsigned int i, order, memflags; > - > -serial_init_postirq(); > +unsigned long flags; > > if ( !opt_conring_size ) > -opt_conring_size = num_present_cpus() << (9 + xenlog_lower_thresh); > +return; > > order = get_order_from_bytes(max(opt_conring_size, conring_size)); > memflags = MEMF_bits(crashinfo_maxaddr_bits); > @@ -763,17 +762,28 @@ void __init console_init_postirq(void) > } > opt_conring_size = PAGE_SIZE << order; > > -spin_lock_irq(&console_lock); > +spin_lock_irqsave(&console_lock, flags); I am not sure why are you change spin_lock_irq() to spin_lock_irqsave() here. Could you explain this in commit message? > for ( i = conringc ; i != conringp; i++ ) > ring[i & (opt_conring_size - 1)] = conring[i & (conring_size - 1)]; > conring = ring; > smp_wmb(); /* Allow users of console_force_unlock() to see larger > buffer. */ > conring_size = opt_conring_size; > -spin_unlock_irq(&console_lock); > +spin_unlock_irqrestore(&console_lock, flags); Ditto. 
Daniel
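
The copy loop in the patch quoted above depends on both the old and the new ring size being powers of two, so that a free-running index can be reduced to a slot with a simple mask. A minimal stand-alone sketch of that resize step follows; struct ring and ring_grow() are hypothetical stand-ins, not Xen's actual console code, and the locking that the patch does around the copy is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the console ring state (not Xen's types). */
struct ring {
    char *buf;
    size_t size;        /* must be a power of two */
    unsigned int prod;  /* free-running producer index */
    unsigned int cons;  /* free-running consumer index */
};

static int is_pow2(size_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Move the unread bytes into a larger buffer. The indices are left
 * untouched: only the mask used to map an index to a slot changes.
 */
static void ring_grow(struct ring *r, char *new_buf, size_t new_size)
{
    unsigned int i;

    assert(is_pow2(r->size) && is_pow2(new_size) && new_size >= r->size);

    for (i = r->cons; i != r->prod; i++)
        new_buf[i & (new_size - 1)] = r->buf[i & (r->size - 1)];

    r->buf = new_buf;
    r->size = new_size;
}
```

Because the indices never wrap at the buffer size, the same index value addresses the same logical byte before and after the switch, which is why the patch can publish the new buffer pointer first and the new size afterwards.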
Re: [Xen-devel] [PATCH v5 9/9] xen/pciback: Implement PCI reset slot or bus with 'do_flr' SysFS attribute
On Fri, Dec 05, 2014 at 10:30:01AM +, David Vrabel wrote:
> On 04/12/14 15:39, Alex Williamson wrote:
> >
> > I don't know what workaround you're talking about. As devices are
> > released from the user, vfio-pci attempts to reset them. If
> > pci_reset_function() returns success we mark the device clean, otherwise
> > it gets marked dirty. Each time a device is released, if there are
> > dirty devices we test whether we can try a bus/slot reset to clean them.
> > In the case of assigning a GPU this typically means that the GPU or
> > audio function come through first, there's no reset mechanism so it gets
> > marked dirty, the next device comes through and we manage to try a bus
> > reset. vfio-pci does not have any device specific resets, all
> > functionality is added to the PCI-core, thank-you-very-much. I even
> > posted a generic PCI quirk patch recently that marks AMD VGA PM reset as
> > bad so that pci_reset_function() won't claim that worked. All VGA
> > access quirks are done in QEMU, the kernel doesn't have any business in
> > remapping config space over MMIO regions or trapping other config space
> > backdoors.
>
> Thanks for the info Alex, I hadn't got around to actually looking at
> the vfio-pci code and was just going on what Sander said.
>
> We probably do need to have a more in depth look at how PCI devices are
> handled by both the toolstack and pciback but in the short term I would
> like a simple solution that does not extend the ABI.

Could you enumerate the 'simple solution' then please? I am having a frustrating time figuring out what it is that you are proposing.

>
> David
Re: [Xen-devel] [PATCH] tools/xenstore: fix link error with libsystemd
On Fri, Dec 05, 2014 at 10:53:03AM +, Ian Campbell wrote: > On Fri, 2014-12-05 at 11:49 +0100, Olaf Hering wrote: > > Linking fails with undefined reference to the used systemd functions. > > Move LDFLAGS after the object files to fix the failure. > > > > Signed-off-by: Olaf Hering > > Cc: Ian Jackson > > Cc: Stefano Stabellini > > Acked-by: Ian Campbell > > This should go into 4.5. Release-Acked-by: Konrad Rzeszutek Wilk > > FWIW my suspicion is that this relates to toolstacks using --as-needed > by default. > > > Cc: Wei Liu > > --- > > tools/xenstore/Makefile | 10 +- > > 1 file changed, 5 insertions(+), 5 deletions(-) > > > > diff --git a/tools/xenstore/Makefile b/tools/xenstore/Makefile > > index bff9b25..11b6a06 100644 > > --- a/tools/xenstore/Makefile > > +++ b/tools/xenstore/Makefile > > @@ -74,10 +74,10 @@ endif > > init-xenstore-domain.o: CFLAGS += $(CFLAGS_libxenguest) > > > > init-xenstore-domain: init-xenstore-domain.o $(LIBXENSTORE) > > - $(CC) $(LDFLAGS) $^ $(LDLIBS_libxenctrl) $(LDLIBS_libxenguest) > > $(LDLIBS_libxenstore) -o $@ $(APPEND_LDFLAGS) > > + $(CC) $^ $(LDFLAGS) $(LDLIBS_libxenctrl) $(LDLIBS_libxenguest) > > $(LDLIBS_libxenstore) -o $@ $(APPEND_LDFLAGS) > > > > xenstored: $(XENSTORED_OBJS) > > - $(CC) $(LDFLAGS) $^ $(LDLIBS_libxenctrl) $(SOCKET_LIBS) -o $@ > > $(APPEND_LDFLAGS) > > + $(CC) $^ $(LDFLAGS) $(LDLIBS_libxenctrl) $(SOCKET_LIBS) -o $@ > > $(APPEND_LDFLAGS) > > > > xenstored.a: $(XENSTORED_OBJS) > > $(AR) cr $@ $^ > > @@ -86,13 +86,13 @@ $(CLIENTS): xenstore > > ln -f xenstore $@ > > > > xenstore: xenstore_client.o $(LIBXENSTORE) > > - $(CC) $(LDFLAGS) $< $(LDLIBS_libxenstore) $(SOCKET_LIBS) -o $@ > > $(APPEND_LDFLAGS) > > + $(CC) $< $(LDFLAGS) $(LDLIBS_libxenstore) $(SOCKET_LIBS) -o $@ > > $(APPEND_LDFLAGS) > > > > xenstore-control: xenstore_control.o $(LIBXENSTORE) > > - $(CC) $(LDFLAGS) $< $(LDLIBS_libxenstore) $(SOCKET_LIBS) -o $@ > > $(APPEND_LDFLAGS) > > + $(CC) $< $(LDFLAGS) $(LDLIBS_libxenstore) $(SOCKET_LIBS) -o $@ > > 
$(APPEND_LDFLAGS) > > > > xs_tdb_dump: xs_tdb_dump.o utils.o tdb.o talloc.o > > - $(CC) $(LDFLAGS) $^ -o $@ $(APPEND_LDFLAGS) > > + $(CC) $^ $(LDFLAGS) -o $@ $(APPEND_LDFLAGS) > > > > libxenstore.so: libxenstore.so.$(MAJOR) > > ln -sf $< $@ > > ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 2/4] sysctl/libxl: Add interface for returning IO topology data
On 12/05/2014 11:03 AM, Jan Beulich wrote: On 05.12.14 at 16:55, wrote: On 02.12.14 at 22:34, wrote: +struct xen_sysctl_iotopo { +uint16_t seg; +uint8_t bus; +uint8_t devfn; +uint32_t node; +}; This is PCI-centric without expressing in the name or layout. xen_sysctl_pcitopo would be a better name. Perhaps the first part should be a union from the very beginning? And I wonder whether that supposed union part wouldn't be nicely done using struct physdev_pci_device. The do look strikingly similar ;-) How would a union be useful here? Additionally please add IN and OUT annotations. When I first saw this I assumed they would all be OUT (in which case the long running loop problem mentioned in the reply to one of the other patches wouldn't have been there), matching their CPU counterpart... I don't follow this. Are you saying that if ti->max_devs in patch 3/4 is an IN (which it is) then we don't have to guard for long-running loops? -boris ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH for-4.5] flask/policy: Example policy updates for migration
The example XSM policy was missing permission for dom0_t to migrate domains; add these permissions. Reported-by: Wei Liu Signed-off-by: Daniel De Graaf --- This has been tested with xl save/restore on a PV domain, which now succeeds without producing AVC denials. tools/flask/policy/policy/modules/xen/xen.if | 11 +++ tools/flask/policy/policy/modules/xen/xen.te | 3 +++ 2 files changed, 10 insertions(+), 4 deletions(-) diff --git a/tools/flask/policy/policy/modules/xen/xen.if b/tools/flask/policy/policy/modules/xen/xen.if index fa69c9d..bf5e135 100644 --- a/tools/flask/policy/policy/modules/xen/xen.if +++ b/tools/flask/policy/policy/modules/xen/xen.if @@ -48,11 +48,13 @@ define(`create_domain_common', ` allow $1 $2:domain { create max_vcpus setdomainmaxmem setaddrsize getdomaininfo hypercall setvcpucontext setextvcpucontext getscheduler getvcpuinfo getvcpuextstate getaddrsize - getaffinity setaffinity }; - allow $1 $2:domain2 { set_cpuid settsc setscheduler setclaim set_max_evtchn set_vnumainfo get_vnumainfo psr_cmt_op configure_domain }; + getaffinity setaffinity setvcpuextstate }; + allow $1 $2:domain2 { set_cpuid settsc setscheduler setclaim + set_max_evtchn set_vnumainfo get_vnumainfo cacheflush + psr_cmt_op configure_domain }; allow $1 $2:security check_context; allow $1 $2:shadow enable; - allow $1 $2:mmu { map_read map_write adjust memorymap physmap pinpage mmuext_op }; + allow $1 $2:mmu { map_read map_write adjust memorymap physmap pinpage mmuext_op updatemp }; allow $1 $2:grant setup; allow $1 $2:hvm { cacheattr getparam hvmctl irqlevel pciroute sethvmc setparam pcilevel trackdirtyvram nested }; @@ -80,7 +82,7 @@ define(`create_domain_build_label', ` define(`manage_domain', ` allow $1 $2:domain { getdomaininfo getvcpuinfo getaffinity getaddrsize pause unpause trigger shutdown destroy - setaffinity setdomainmaxmem getscheduler }; + setaffinity setdomainmaxmem getscheduler resume }; allow $1 $2:domain2 set_vnumainfo; ') @@ -88,6 +90,7 @@ 
define(`manage_domain', ` # Allow creation of a snapshot or migration image from a domain # (inbound migration is the same as domain creation) define(`migrate_domain_out', ` + allow $1 domxen_t:mmu map_read; allow $1 $2:hvm { gethvmc getparam irqlevel }; allow $1 $2:mmu { stat pageinfo map_read }; allow $1 $2:domain { getaddrsize getvcpucontext getextvcpucontext getvcpuextstate pause destroy }; diff --git a/tools/flask/policy/policy/modules/xen/xen.te b/tools/flask/policy/policy/modules/xen/xen.te index d214470..c0128aa 100644 --- a/tools/flask/policy/policy/modules/xen/xen.te +++ b/tools/flask/policy/policy/modules/xen/xen.te @@ -129,12 +129,14 @@ create_domain(dom0_t, domU_t) manage_domain(dom0_t, domU_t) domain_comms(dom0_t, domU_t) domain_comms(domU_t, domU_t) +migrate_domain_out(dom0_t, domU_t) domain_self_comms(domU_t) declare_domain(isolated_domU_t) create_domain(dom0_t, isolated_domU_t) manage_domain(dom0_t, isolated_domU_t) domain_comms(dom0_t, isolated_domU_t) +migrate_domain_out(dom0_t, isolated_domU_t) domain_self_comms(isolated_domU_t) # Declare a boolean that denies creation of prot_domU_t domains @@ -142,6 +144,7 @@ gen_bool(prot_doms_locked, false) declare_domain(prot_domU_t) if (!prot_doms_locked) { create_domain(dom0_t, prot_domU_t) + migrate_domain_out(dom0_t, prot_domU_t) } domain_comms(dom0_t, prot_domU_t) domain_comms(domU_t, prot_domU_t) -- 1.9.3 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 1/4] pci: Do not ignore device's PXM information
On 12/05/2014 10:53 AM, Jan Beulich wrote:
> --- a/xen/include/xen/pci.h
> +++ b/xen/include/xen/pci.h
> @@ -56,6 +56,8 @@ struct pci_dev {
> u8 phantom_stride;
> +int node; /* NUMA node */
>
> I don't think we currently support node IDs wider than 8 bits.

I used an int because pxm_to_node() returns an int. OTOH, pxm2node[], for which pxm_to_node() is essentially a wrapper, is a u8.

-boris
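For readers following the int-versus-u8 point, here is a minimal sketch of narrowing an int node ID into 8-bit storage. It assumes a sentinel for "no node"; the names node_to_u8 and NODE_NONE and the value 0xFF are made up for illustration and are not taken from the patch.

```c
#include <stdint.h>

/* Hypothetical sentinel meaning "no node known"; not taken from the patch. */
#define NODE_NONE 0xFF

/*
 * Narrow an int node ID (the type the lookup returns) to an 8-bit
 * storage type, mapping anything that does not fit to NODE_NONE.
 */
uint8_t node_to_u8(int node)
{
    if ( node < 0 || node >= NODE_NONE )
        return NODE_NONE;

    return (uint8_t)node;
}
```

With a helper of this kind the field could stay 8 bits wide while the lookup keeps returning an int; whether that is preferable to a plain int field is the judgment call being discussed.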
Re: [Xen-devel] A few EFI code questions
>>> On 05.12.14 at 17:40, wrote: > On Fri, Dec 05, 2014 at 03:00:14PM +, Jan Beulich wrote: >> but I don't think this possibility of renaming warrants a much longer >> discussion. Please also remember that renaming always implies more >> cumbersome backporting, even if only slightly more. > > I suppose that you are thinking about backporting my EFI + multiboot2 > patches somewhere. Not really, I was just thinking about bug fixes in general. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH] console: allocate ring buffer earlier
... when "conring_size=" was specified on the command line. We can't really do this as early as we would want to when the option was not specified, as the default depends on knowing the system CPU count. Yet the parsing of the ACPI tables is one of the things that generates a lot of output especially on large systems. I didn't change ARM, as I wasn't sure how far ahead this call could be pulled. Signed-off-by: Jan Beulich --- a/xen/arch/x86/setup.c +++ b/xen/arch/x86/setup.c @@ -1187,6 +1187,7 @@ void __init noreturn __start_xen(unsigne } vm_init(); +console_init_mem(); vesa_init(); softirq_init(); --- a/xen/drivers/char/console.c +++ b/xen/drivers/char/console.c @@ -744,15 +744,14 @@ void __init console_init_preirq(void) } } -void __init console_init_postirq(void) +void __init console_init_mem(void) { char *ring; unsigned int i, order, memflags; - -serial_init_postirq(); +unsigned long flags; if ( !opt_conring_size ) -opt_conring_size = num_present_cpus() << (9 + xenlog_lower_thresh); +return; order = get_order_from_bytes(max(opt_conring_size, conring_size)); memflags = MEMF_bits(crashinfo_maxaddr_bits); @@ -763,17 +762,28 @@ void __init console_init_postirq(void) } opt_conring_size = PAGE_SIZE << order; -spin_lock_irq(&console_lock); +spin_lock_irqsave(&console_lock, flags); for ( i = conringc ; i != conringp; i++ ) ring[i & (opt_conring_size - 1)] = conring[i & (conring_size - 1)]; conring = ring; smp_wmb(); /* Allow users of console_force_unlock() to see larger buffer. 
*/ conring_size = opt_conring_size; -spin_unlock_irq(&console_lock); +spin_unlock_irqrestore(&console_lock, flags); printk("Allocated console ring of %u KiB.\n", opt_conring_size >> 10); } +void __init console_init_postirq(void) +{ +serial_init_postirq(); + +if ( !opt_conring_size ) +opt_conring_size = num_present_cpus() << (9 + xenlog_lower_thresh); + +if ( conring == _conring ) +console_init_mem(); +} + void __init console_endboot(void) { int i, j; --- a/xen/include/xen/console.h +++ b/xen/include/xen/console.h @@ -14,6 +14,7 @@ struct xen_sysctl_readconsole; long read_console_ring(struct xen_sysctl_readconsole *op); void console_init_preirq(void); +void console_init_mem(void); void console_init_postirq(void); void console_endboot(void); int console_has(const char *device);
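The copy loop in the patch relies on both rings having power-of-two sizes, so that the free-running consumer and producer indices can be masked into either buffer. The following is a standalone sketch of that idea only; ring_migrate and its parameters are illustrative names, and locking and memory barriers are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Illustrative only: copy the unread part of a power-of-two ring into a
 * larger power-of-two ring, keeping the free-running indices unchanged.
 * "cons" and "prod" play the role of conringc and conringp in the patch.
 */
void ring_migrate(char *dst, size_t dst_size,
                  const char *src, size_t src_size,
                  unsigned int cons, unsigned int prod)
{
    unsigned int i;

    /* Masking with (size - 1) is only valid for power-of-two sizes. */
    assert(dst_size && !(dst_size & (dst_size - 1)));
    assert(src_size && !(src_size & (src_size - 1)));
    assert(dst_size >= src_size);

    /* Each index is masked separately for the old and the new ring. */
    for ( i = cons; i != prod; i++ )
        dst[i & (dst_size - 1)] = src[i & (src_size - 1)];
}
```

Because the indices are never reset, readers can keep using them after the switch; only the mask changes once the new size is published.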
Re: [Xen-devel] [PATCH v2] console: increase initial conring size
On Fri, Dec 05, 2014 at 04:21:35PM +, Jan Beulich wrote: > >>> On 05.12.14 at 16:50, wrote: > > This bug (or lack of feature if you prefer) should be fixed, as it > > was pointed out by Jan Beulich and Olaf Hering, by allocating conring > > earlier. I though about that before posting this patch (I did not > > know beforehand about Olaf's work made in 2011). However, I stated > > that it is too late to make so intrusive changes. > > I continue to disagree. If anything, I'd rather see us hide (e.g. behind > opt_cpu_info) some of the worst offenders causing the log to become > that large. Even if yielding a bigger patch, that would have less impact Nowadays the worst offender is the EFI memmap which can be quite big. We could hide it behind 'opt_efi_info' and only print out some rather odd entries. But that would be 4.6 material, while this patch nicely fixes it for 4.5. > functionality wise and likely benefit more people. Nor do I see the > change to move the allocation earlier all that intrusive. > > But then again, considering that all you enlarge is an __initdata item, > perhaps this is acceptable. This has the other side-benefit that it will help us troubleshoot in the field without having the customer try extra parameters to extend the log data. I am all up for less round-trip to troubleshoot issues and I can't see this causing any regressions (unless we have some hard-coded EFL section data). > > Jan > ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] A few EFI code questions
On Fri, Dec 05, 2014 at 03:00:14PM +, Jan Beulich wrote: > >>> On 05.12.14 at 15:51, wrote: > > On Thu, Dec 04, 2014 at 09:35:01AM +, Jan Beulich wrote: > >> >>> On 03.12.14 at 22:02, wrote: > >> > 3) Should not we change xen/arch/*/efi/efi-boot.h to > >> >xen/arch/*/efi/efi-boot.c? efi-boot.h contains more > >> >code than definitions, declarations and short static > >> >functions. So, I think that it is more regular *.c file > >> >than header file. > >> > >> That's a matter of taste - I'd probably have made it .c too, but > >> didn't mind it being .h as done by Roy (presumably on the basis > >> that #include directives are preferred to have .h files as their > >> operands). The only thing I regret is that I didn't ask for the > >> pointless efi- prefix to be dropped. > > > > As I can see a few people people agree to some extent with my suggestion. > > Great! Sadly if we wish .c file than simple boot.c (as Jan suggested we can > > drop efi- prefix) conflicts with exiting boot.c link. Is efi-boot.c OK? > > Or maybe boot-arch.c? boot.h is OK for sure. Which one do you prefer? > > Do you have better ideas? > > boot.h would be my preference given how things look like right now, Granted! > but I don't think this possibility of renaming warrants a much longer > discussion. Please also remember that renaming always implies more > cumbersome backporting, even if only slightly more. I suppose that you are thinking about backporting my EFI + multiboot2 patches somewhere. If you wish I can rename this file after my patch series or even later to take some fixes for bugs in my code not discovered earlier. Is it OK for you? Daniel ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] have architectures specify the number of PIRQs a hardware domain gets
On Fri, 5 Dec 2014, Jan Beulich wrote: > >>> On 05.12.14 at 17:05, wrote: > > On 05/12/14 15:42, Jan Beulich wrote: > > On 05.12.14 at 16:25, wrote: > >>> - XEN_DOMCTL_irq_permission => I don't really understand this bits. > >>> AFAIU the pirq number is different on each domain. But we use it to > >>> check permission on both domain. Shouldn't we translate the pirq to irq > >>> for the current->domain? > >> > >> Indeed, see also > >> http://lists.xenproject.org/archives/html/xen-devel/2014-12/msg00219.html > > > > Do you plan to send a patch to resolve this problem? > > So far I assumed Stefano would, as he was running into an issue > which iirc fixing this would help. I was only interested in fixing a bug for the 4.5 release, this work is not suitable for 4.5 and I don't know if I'll do it for 4.6. Regarding the original bug I was trying to fix, I think the original patch should go in as is: http://marc.info/?i=alpine.DEB.2.02.1412011852390.14135%40kaball.uk.xensource.com ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH] libxl: Set path to console on domain startup.
The path to the pty of a Xen PV console is set only in virDomainOpenConsole. But this is done too late. A call to virDomainGetXMLDesc done before OpenConsole will not have the path to the pty, but a call after OpenConsole will. e.g. of the current issue. Starting a domain with '' Then: virDomainGetXMLDesc(): virDomainOpenConsole() virDomainGetXMLDesc(): The patch intend to get the tty path on the first call of GetXMLDesc. Signed-off-by: Anthony PERARD --- src/libxl/libxl_domain.c | 17 + 1 file changed, 17 insertions(+) diff --git a/src/libxl/libxl_domain.c b/src/libxl/libxl_domain.c index 9c62291..de56054 100644 --- a/src/libxl/libxl_domain.c +++ b/src/libxl/libxl_domain.c @@ -1290,6 +1290,23 @@ libxlDomainStart(libxlDriverPrivatePtr driver, virDomainObjPtr vm, if (libxlDomainSetVcpuAffinities(driver, vm) < 0) goto cleanup_dom; +if (vm->def->nconsoles) { +virDomainChrDefPtr chr = NULL; +chr = vm->def->consoles[0]; +if (chr && chr->source.type == VIR_DOMAIN_CHR_TYPE_PTY) { +libxl_console_type console_type; +char *console = NULL; +console_type = +(chr->targetType == VIR_DOMAIN_CHR_CONSOLE_TARGET_TYPE_SERIAL ? + LIBXL_CONSOLE_TYPE_SERIAL : LIBXL_CONSOLE_TYPE_PV); +ret = libxl_console_get_tty(priv->ctx, vm->def->id, chr->target.port, +console_type, &console); +if (!ret) +ignore_value(VIR_STRDUP(chr->source.data.file.path, console)); +VIR_FREE(console); +} +} + if (!start_paused) { libxl_domain_unpause(priv->ctx, domid); virDomainObjSetState(vm, VIR_DOMAIN_RUNNING, VIR_DOMAIN_RUNNING_BOOTED); -- Anthony PERARD ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH] libxl: Set path to console on domain startup
Hi,

I'm trying to fix an issue when using OpenStack with libvirt+xen (libxenlight). OpenStack cannot access the console output of a Xen PV guest, because the XML generated by libvirt for a domain is missing the path to the pty. The path only appears in the XML once virDomainOpenConsole() has been called.

The patch intends to get the path to the pty without having to call virDomainOpenConsole(), so I've done the work in libxlDomainStart().

So I have a few questions: Will libxlDomainStart() be called on restore/migrate/reboot? I guess the function libxlDomainOpenConsole() would not need to do the same work if the console path is set up properly.

There is a bug report about this: https://bugzilla.redhat.com/show_bug.cgi?id=1170743

Regards,

Anthony PERARD (1): libxl: Set path to console on domain startup.

src/libxl/libxl_domain.c | 17 + 1 file changed, 17 insertions(+)

-- Anthony PERARD
Re: [Xen-devel] PVH cleanups after 4.5
On Fri, Dec 05, 2014 at 10:42:27AM +, Andrew Cooper wrote: > On 05/12/14 09:54, Ian Campbell wrote: > > On Fri, 2014-12-05 at 10:49 +0100, Tim Deegan wrote: > >> At 09:20 + on 05 Dec (1417767654), Jan Beulich wrote: > >> On 04.12.14 at 18:25, wrote: > Potential feature flags, based on whiteboard notes at the session. > Things that are 'Yes' in both columns might not need actual flags :) > > 'HVM' 'PVH' > 64bit hypercalls Yes Yes > 32bit hypercalls Yes No > >>> Iiuc the lack of support of 32-bit hypercalls is simply because PVH > >>> guests aren't expected to use them as being always 64-bit right > >>> now. I.e. I can't really see why we couldn't just enable them once > >>> the 64-bit hypercall tables got combined, in which case we wouldn't > >>> need a feature flag here either. > >> Agreed -- I think the same will apply to a few other things, like shadow > >> pagetables and some of the other MM tricks. > > Might we want to constrain a given PVH domain to only make 32- or 64-bit > > hypercalls? > > > > Or do we consider already having crossed that bridge with HVM enough > > reason to allow it for PVH? I'm wonder if that, even if it is > > technically possible to support not, doing so might mitigate some > > potential security issues down the line. There's obviously a tradeoff > > against in-guest flexibility though. > > Madating a 32/64bit split serves only to cause booting issues; you need > to know a-priori what the eventual kernel is going to be before you > build the domain. This is an awkward issue with PV domains which > *really* wants not to apply to PVH as well. > > PVH guests with the plan of "HVM - qemu" should be able to fully choose > their operating mode, and allow for in-guest bootstrapping which is far > superior from a security/isolation point of view than toolstack > bootstrapping. Or another use-case: kexec-ing from within an 64-bit PVH guest to an 32-bit PVH or vice-versa. 
> > ~Andrew > > > ___ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] ucode=scan usefulness
On Fri, Dec 05, 2014 at 03:05:01PM +, Jan Beulich wrote:
> Konrad,
>
> having been surprised to find your cpio scanning code to not work I
> had to realize that this can't possibly work when the initrd is
> compressed. Considering that you found this useful nevertheless -

Heh. Right.

> am I to imply that you're running with (and only considering) non-
> compressed initrd? Are there plans to support compressed ones too?

I hadn't thought of that use-case, as the vehicle to create the payload is 'dracut'. Its mechanism is to prepend the uncompressed cpio with microcode to the compressed cpio with the normal initramfs.

Though of course there is nothing stopping one from having a compressed initramfs _with_ the microcode blobs. I will put this on the Xen 4.6 roadmap.

P.S. Though let me double-check that 'dracut' does not compress - it was not doing that in Fedora 20.

>
> Jan
>
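To illustrate why a plain cpio scan cannot see into a compressed initrd, here is a small sketch that classifies a buffer by its leading magic bytes. It is not the scanning code discussed in this thread; classify_blob and the enum are hypothetical names, and only the "newc" cpio magic and the gzip magic are checked.

```c
#include <stddef.h>
#include <string.h>

enum blob_kind { BLOB_UNKNOWN, BLOB_CPIO_NEWC, BLOB_GZIP };

/* Illustrative helper: look only at the first bytes of the buffer. */
enum blob_kind classify_blob(const unsigned char *buf, size_t len)
{
    /* "newc" cpio headers start with the ASCII magic "070701". */
    if ( len >= 6 && !memcmp(buf, "070701", 6) )
        return BLOB_CPIO_NEWC;

    /* gzip streams start with the two bytes 0x1f 0x8b. */
    if ( len >= 2 && buf[0] == 0x1f && buf[1] == 0x8b )
        return BLOB_GZIP;

    return BLOB_UNKNOWN;
}
```

In the layout described above, the buffer would begin with an uncompressed cpio archive, so a scanner can walk its entries directly; anything behind a gzip header would need decompressing first.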
Re: [Xen-devel] [PATCH] have architectures specify the number of PIRQs a hardware domain gets
>>> On 05.12.14 at 17:05, wrote: > On 05/12/14 15:42, Jan Beulich wrote: > On 05.12.14 at 16:25, wrote: >>> - XEN_DOMCTL_irq_permission => I don't really understand this bits. >>> AFAIU the pirq number is different on each domain. But we use it to >>> check permission on both domain. Shouldn't we translate the pirq to irq >>> for the current->domain? >> >> Indeed, see also >> http://lists.xenproject.org/archives/html/xen-devel/2014-12/msg00219.html > > Do you plan to send a patch to resolve this problem? So far I assumed Stefano would, as he was running into an issue which iirc fixing this would help. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC V8 2/3] libxl domain snapshot API design
On Fri, Dec 05, 2014 at 04:11:48PM +, Ian Campbell wrote: > On Fri, 2014-12-05 at 16:06 +, Wei Liu wrote: > > Regarding JSON API, as Ian said, feel free to hook it up to libxlu. > > *If* it is useful to multiple toolstacks but not suitable for libxl then > libxlu would be the right place. > > As I understood things the need for JSON here was xl specific, and it is > IMHO fine for xl to also use the idl infrastructure, without needing to > launder it via libxlu. > Hmm... I was think about if by any chance Chunyan wants to unify xl and libvirt's knowledge of a domain snapshot, it can go into libxlu. I'm no libvirt expert though. If libvirt doesn't need this then putting it in xl is enough. Wei. > Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2] console: increase initial conring size
>>> On 05.12.14 at 16:50, wrote: > This bug (or lack of feature if you prefer) should be fixed, as it > was pointed out by Jan Beulich and Olaf Hering, by allocating conring > earlier. I though about that before posting this patch (I did not > know beforehand about Olaf's work made in 2011). However, I stated > that it is too late to make so intrusive changes. I continue to disagree. If anything, I'd rather see us hide (e.g. behind opt_cpu_info) some of the worst offenders causing the log to become that large. Even if yielding a bigger patch, that would have less impact functionality wise and likely benefit more people. Nor do I see the change to move the allocation earlier all that intrusive. But then again, considering that all you enlarge is an __initdata item, perhaps this is acceptable. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] xen: privcmd: schedule() after private hypercall when non CONFIG_PREEMPT
On Wed, Dec 03, 2014 at 08:39:47PM +0100, Luis R. Rodriguez wrote: > On Wed, Dec 03, 2014 at 05:37:51AM +0100, Juergen Gross wrote: > > On 12/03/2014 03:28 AM, Luis R. Rodriguez wrote: > >> On Tue, Dec 02, 2014 at 11:11:18AM +, David Vrabel wrote: > >>> On 01/12/14 22:36, Luis R. Rodriguez wrote: > > Then I do agree its a fair analogy (and find this obviously odd that how > widespread cond_resched() is), we just don't have an equivalent for IRQ > context, why not avoid the special check then and use this all the time > in the > middle of a hypercall on the return from an interrupt (e.g., the timer > interrupt)? > >>> > >>> http://lists.xen.org/archives/html/xen-devel/2014-02/msg01101.html > >> > >> OK thanks! That explains why we need some asm code but in that submission > >> you > >> still also had used is_preemptible_hypercall(regs) and in the new > >> implementation you use a CPU variable xen_in_preemptible_hcall prior to > >> calling > >> preempt_schedule_irq(). I believe you added the CPU variable because > >> preempt_schedule_irq() will preempt first without any checks if it should, > >> I'm > >> asking why not do something like cond_resched_irq() where we check with > >> should_resched() prior to preempting and that way we can avoid having to > >> use > >> the CPU variable? > > > > Because that could preempt at any asynchronous interrupt making the > > no-preempt kernel fully preemptive. > > OK yeah I see. That still doesn't negate the value of using something > like cond_resched_irq() with a should_resched() on only critical hypercalls. > The current implementation (patch by David) forces preemption without > checking for should_resched() so it would preempt unnecessarily at least > once. > > > How would you know you are just > > doing a critical hypercall which should be preempted? > > You would not, you're right. 
I was just trying to see if we could generalize > an API for this to avoid having users having to create their own CPU variables > but this all seems very specialized as we want to use this on the timer > so if we do generalize a cond_resched_irq() perhaps the documentation can > warn about this type of case or abuse.

David's patch had the check, only it was x86-based. If we use cond_resched_irq() we can leave that aspect to be done through asm inlines, or it will use the generic should_resched(), which should save some code in the asm implementations. I have some patches now which generalize this. I also have more information about how exactly this can happen, and a way to trigger it on small systems with some hacks to emulate possible backend behaviour on larger systems. In the worst case this can be a dangerous situation to be in. I'll send some new RFTs.

Luis
Re: [Xen-devel] [PATCH] tools/hotplug: update systemd dependency to use service instead of socket
On Fri, Dec 05, 2014 at 09:28:44AM +0100, Olaf Hering wrote: > On Fri, Dec 05, Olaf Hering wrote: > > > So looking again at > > tools/hotplug/Linux/systemd/var-lib-xenstored.mount.in it seems that it > > happens to work for me because XENSTORED_MOUNT_CTX is set within that > > file. So if something happens to need a different value for > > XENSTORED_MOUNT_CTX it has to be provided in the to-be-created config > > file: EnvironmentFile=-@CONFIG_DIR@/@CONFIG_LEAF_DIR@/xenstored > > This config file is not part of xen. > > And I wonder why a new config file has to be created, instead of just > reusing the existing tools/hotplug/Linux/init.d/sysconfig.xencommons.in? Right. > > I will send out a few patches to adjust the EnvironmentFile handling. Excellent. Will be happy to test them out. > > Its just the question if a configure --with-selinux-mount-context=VAL is > needed. OK. That might be complicated in that the context could change between bootup and run-time (I think that is what Michael told me). > > Olaf > > ___ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [v8][PATCH 17/17] xen/vtd: re-enable USB device assignment if enable pci_force
On Mon, Dec 01, 2014 at 05:24:35PM +0800, Tiejun Chen wrote: > Before we refine RMRR mechanism, USB RMRR may conflict with guest bios > region so we always ignore USB RMRR. Now this can be gone when we enable > pci_force to check/reserve RMRR. > > Signed-off-by: Tiejun Chen > --- > xen/drivers/passthrough/vtd/dmar.h | 1 + > xen/drivers/passthrough/vtd/iommu.c | 12 > xen/drivers/passthrough/vtd/utils.c | 18 ++ > 3 files changed, 27 insertions(+), 4 deletions(-) > > diff --git a/xen/drivers/passthrough/vtd/dmar.h > b/xen/drivers/passthrough/vtd/dmar.h > index a57c0d4..832dc32 100644 > --- a/xen/drivers/passthrough/vtd/dmar.h > +++ b/xen/drivers/passthrough/vtd/dmar.h > @@ -132,6 +132,7 @@ do {\ > int vtd_hw_check(void); > void disable_pmr(struct iommu *iommu); > int is_usb_device(u16 seg, u8 bus, u8 devfn); > +int is_reserve_device_memory(struct domain *d, u8 bus, u8 devfn); > int is_igd_drhd(struct acpi_drhd_unit *drhd); > > #endif /* _DMAR_H_ */ > diff --git a/xen/drivers/passthrough/vtd/iommu.c > b/xen/drivers/passthrough/vtd/iommu.c > index ba40209..1f1ceb7 100644 > --- a/xen/drivers/passthrough/vtd/iommu.c > +++ b/xen/drivers/passthrough/vtd/iommu.c > @@ -2264,9 +2264,11 @@ static int reassign_device_ownership( > * remove it from the hardware domain, because BIOS may use RMRR at > * booting time. Also account for the special casing of USB below (in > * intel_iommu_assign_device()). > + * But if we already check to reserve RMRR, this should be fine. > */ > if ( !is_hardware_domain(source) && > - !is_usb_device(pdev->seg, pdev->bus, pdev->devfn) ) > + !is_usb_device(pdev->seg, pdev->bus, pdev->devfn) && > + !is_reserve_device_memory(source, pdev->bus, pdev->devfn) ) > { > const struct acpi_rmrr_unit *rmrr; > u16 bdf; > @@ -2315,12 +2317,14 @@ static int intel_iommu_assign_device( > if ( ret ) > return ret; > > -/* FIXME: Because USB RMRR conflicts with guest bios region, > - * ignore USB RMRR temporarily. 
> +/* > + * Because USB RMRR conflicts with guest bios region, > + * ignore USB RMRR temporarily in case of non-reserving-RMRR. > */ > seg = pdev->seg; > bus = pdev->bus; > -if ( is_usb_device(seg, bus, pdev->devfn) ) > +if ( is_usb_device(seg, bus, pdev->devfn) && > + !is_reserve_device_memory(d, bus, pdev->devfn) ) > return 0; > > /* Setup rmrr identity mapping */ > diff --git a/xen/drivers/passthrough/vtd/utils.c > b/xen/drivers/passthrough/vtd/utils.c > index a33564b..1045ac1 100644 > --- a/xen/drivers/passthrough/vtd/utils.c > +++ b/xen/drivers/passthrough/vtd/utils.c > @@ -36,6 +36,24 @@ int is_usb_device(u16 seg, u8 bus, u8 devfn) > return (class == 0xc03); > } > > +int is_reserve_device_memory(struct domain *d, u8 bus, u8 devfn) > +{ > +int i = 0; > + > +if ( d->arch.hvm_domain.pci_force == PCI_DEV_RDM_CHECK ) > +return 1; Ouch. What if the 'hvm_domain' is not there? Please check first for that. > + > +for ( i = 0; i < d->arch.hvm_domain.num_pcidevs; i++ ) > +{ > +if ( d->arch.hvm_domain.pcidevs[i].bus == bus && > + d->arch.hvm_domain.pcidevs[i].devfn == devfn && > + d->arch.hvm_domain.pcidevs[i].flags == PCI_DEV_RDM_CHECK ) > +return 1; > +} > + > +return 0; > +} > + > /* Disable vt-d protected memory registers. */ > void disable_pmr(struct iommu *iommu) > { > -- > 1.9.1 > ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
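The review comment asks for a check before the HVM-specific state is read. Below is a standalone sketch of what such a guard might look like. The types dom_sketch and pcidev_entry, the is_hvm flag and the RDM_CHECK value are all stand-ins invented for this example; they are not the hypervisor's real structures.

```c
#include <stdbool.h>
#include <stddef.h>

#define RDM_CHECK 1u   /* hypothetical flag value */

struct pcidev_entry {
    unsigned char bus, devfn;
    unsigned int flags;
};

/* Stand-in for a domain; only the fields the check needs. */
struct dom_sketch {
    bool is_hvm;                        /* false: HVM state must not be read */
    unsigned int pci_force;
    unsigned int num_pcidevs;
    const struct pcidev_entry *pcidevs;
};

bool reserves_device_memory(const struct dom_sketch *d,
                            unsigned char bus, unsigned char devfn)
{
    unsigned int i;

    /* Bail out before touching any HVM-only fields. */
    if ( !d || !d->is_hvm )
        return false;

    if ( d->pci_force == RDM_CHECK )
        return true;

    for ( i = 0; i < d->num_pcidevs; i++ )
        if ( d->pcidevs[i].bus == bus &&
             d->pcidevs[i].devfn == devfn &&
             d->pcidevs[i].flags == RDM_CHECK )
            return true;

    return false;
}
```

The only point of the sketch is the ordering: the domain-type test comes first, so the remaining checks run only when the HVM fields are meaningful.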
Re: [Xen-devel] [RFC V8 2/3] libxl domain snapshot API design
On Fri, 2014-12-05 at 16:06 +, Wei Liu wrote: > Regarding JSON API, as Ian said, feel free to hook it up to libxlu. *If* it is useful to multiple toolstacks but not suitable for libxl then libxlu would be the right place. As I understood things the need for JSON here was xl specific, and it is IMHO fine for xl to also use the idl infrastructure, without needing to launder it via libxlu. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 1/5] tools/hotplug: move XENSTORED_MOUNT_CTX to sysconfig.xencommons
On Fri, Dec 05, 2014 at 04:51:49PM +0100, Olaf Hering wrote: > On Fri, Dec 05, Anthony PERARD wrote: > > > On Fri, Dec 05, 2014 at 01:05:48PM +0100, Olaf Hering wrote: > > > On a non-SELinux system the mount option "context=none" works fine. > > > > That's not true. I've tested 'context=none' on ArchLinux which have no > > support (or limited) for selinux, and the option give this error: > > "tmpfs: Bad mount option context" > > Appears to work for me, at least on SLE12. No idea how much SELinux they > put in there. My old 11.4 behaves the same. Perhaps your kernel lacks > certain functionality? I assome the error msg comes from the kernel. Yes, the message comes from the kernel. I don't know what functionality is needed to for the kernel to handle context=none, so I can't tell. At least, there is not CONFIG_SECURITY_SELINUX in the config, but CONFIG_SECURITY=y is set. In anyway, I'll continue to edit the systemd unit to remove the context option, that's not a big deal. > root@bax:~ # mount -vt tmpfs xxx -o context=foo /mnt/ > mount: xxx mounted on /mnt. $ sudo mount -vt tmpfs xxx -o context=foo /tmp/tmp.lhw79USQUe mount: wrong fs type, bad option, bad superblock on xxx, missing codepage or helper program, or other error In some cases useful info is found in syslog - try dmesg | tail or so. $ dmesg | tail -1 [1569927.987083] tmpfs: Bad mount option context -- Anthony PERARD ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC V8 2/3] libxl domain snapshot API design
I have to admit I'm confused by the back and forth discussion. It's hard to justify the design of a new API without knowing what the constraints and requirements are from your PoV.

Here are my two cents, not about details, but about general constraints. There are two layers: the users of libxl (clients -- xl, libvirt etc) and libxl (the library itself).

1. It's better to *not* have storage management in libxl. It's likely that clients already have their own management functionality. I'm told that libvirt has that, as well as XAPI. Having this functionality in libxl is a bit redundant and requires lots of work (teaching libxl what a disk looks like and calling out to various utilities).

2. It's *not* a requirement for xl to have the capability to manage snapshots. It's the same argument as xl having no idea how to manage snapshots created by "xl save". This should ease your concern about having to duplicate code for libvirt and xl. IMHO xl only needs to have the capability to create a snapshot and create a domain from a snapshot. The downside is that now xl and libvirt are disconnected, but I think it's fine. The argument is that you're not allowed to run two toolstacks on the same host (think about xl and xend in previous releases).

Do these two constraints make your work easier (or harder)?

Regarding JSON API, as Ian said, feel free to hook it up to libxlu.

Wei.
Re: [Xen-devel] [PATCH] have architectures specify the number of PIRQs a hardware domain gets
On 05/12/14 15:42, Jan Beulich wrote: On 05.12.14 at 16:25, wrote: >> - XEN_DOMCTL_irq_permission => I don't really understand this bits. >> AFAIU the pirq number is different on each domain. But we use it to >> check permission on both domain. Shouldn't we translate the pirq to irq >> for the current->domain? > > Indeed, see also > http://lists.xenproject.org/archives/html/xen-devel/2014-12/msg00219.html Do you plan to send a patch to resolve this problem? Regards, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 2/4] sysctl/libxl: Add interface for returning IO topology data
>>> On 05.12.14 at 16:55, wrote:
>>>> On 02.12.14 at 22:34, wrote:
>> +struct xen_sysctl_iotopo {
>> +    uint16_t seg;
>> +    uint8_t bus;
>> +    uint8_t devfn;
>> +    uint32_t node;
>> +};
>
> This is PCI-centric without expressing in the name or layout. Perhaps
> the first part should be a union from the very beginning?

And I wonder whether that supposed union part wouldn't be nicely done using struct physdev_pci_device.

Additionally please add IN and OUT annotations. When I first saw this I assumed they would all be OUT (in which case the long running loop problem mentioned in the reply to one of the other patches wouldn't have been there), matching their CPU counterpart...

Jan
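[Archive note] The two review suggestions above (a union for the device-specific part, and IN/OUT annotations) could look roughly like the following. This is only a sketch under stated assumptions: every name here (iotopo_entry, iotopo_pci, IOTOPO_DEV_PCI, IOTOPO_NO_NODE, known_dev, iotopo_lookup) is hypothetical and not part of the Xen sysctl interface, and the lookup table merely stands in for the hypervisor's device list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical; this is not the Xen sysctl ABI. */
#define IOTOPO_DEV_PCI   0u        /* only device class modelled here */
#define IOTOPO_NO_NODE   (~0u)     /* stand-in for "node unknown" */

/* Same three fields as the PCI part of the posted structure. */
struct iotopo_pci {
    uint16_t seg;
    uint8_t  bus;
    uint8_t  devfn;
};

struct iotopo_entry {
    uint32_t type;                 /* IN:  which union member is valid */
    union {
        struct iotopo_pci pci;     /* IN:  device to look up */
        uint32_t raw;              /* keeps the union at a fixed 4 bytes */
    } dev;
    uint32_t node;                 /* OUT: NUMA node or IOTOPO_NO_NODE */
};

/* Toy stand-in for the hypervisor's list of known devices. */
struct known_dev {
    struct iotopo_pci pci;
    uint32_t node;
};

/* Fill the OUT field of one entry from a table of known devices. */
static void iotopo_lookup(struct iotopo_entry *e,
                          const struct known_dev *tbl, size_t n)
{
    size_t i;

    assert(e != NULL);
    e->node = IOTOPO_NO_NODE;
    if ( e->type != IOTOPO_DEV_PCI )
        return;                    /* unknown device class: no node */

    for ( i = 0; i < n; i++ )
        if ( tbl[i].pci.seg == e->dev.pci.seg &&
             tbl[i].pci.bus == e->dev.pci.bus &&
             tbl[i].pci.devfn == e->dev.pci.devfn )
        {
            e->node = tbl[i].node;
            return;
        }
}
```

With a type tag in front of the union, a non-PCI device class could be added later without changing the size or the meaning of existing entries, which is presumably the point of asking for the union "from the very beginning".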
Re: [Xen-devel] [PATCH 3/4] sysctl/libxl: Provide information about IO topology
>>> On 02.12.14 at 22:34, wrote: > @@ -362,6 +363,35 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) > u_sysctl) > u.topologyinfo.max_cpu_index) ) > ret = -EFAULT; > } > + > +if ( !ret && !guest_handle_is_null(ti->iotopo) ) > +{ > +for ( i = 0; i < ti->max_devs; i++ ) Careful about long running loops here please. Jan > +{ > +xen_sysctl_iotopo_t iotopo; > +struct pci_dev *pdev; > + > +if ( copy_from_guest_offset(&iotopo, ti->iotopo, i, 1) ) > +{ > +ret = -EFAULT; > +break; > +} > + > +spin_lock(&pcidevs_lock); > +pdev = pci_get_pdev(iotopo.seg, iotopo.bus, iotopo.devfn); > +if ( !pdev || (pdev->node == NUMA_NO_NODE) ) > +iotopo.node = INVALID_TOPOLOGY_ID; > +else > +iotopo.node = pdev->node; > +spin_unlock(&pcidevs_lock); > + > +if ( copy_to_guest_offset(ti->iotopo, i, &iotopo, 1) ) > +{ > +ret = -EFAULT; > +break; > +} > +} > +} > } > break; > > -- > 1.8.4.2 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 2/4] sysctl/libxl: Add interface for returning IO topology data
>>> On 02.12.14 at 22:34, wrote:
> +struct xen_sysctl_iotopo {
> +    uint16_t seg;
> +    uint8_t bus;
> +    uint8_t devfn;
> +    uint32_t node;
> +};

This is PCI-centric without expressing in the name or layout. Perhaps the first part should be a union from the very beginning?

Jan
Re: [Xen-devel] [PATCH 1/4] pci: Do not ignore device's PXM information
>>> On 02.12.14 at 22:34, wrote: > If ACPI provides PXM data for IO devices then dom0 will pass it to > hypervisor during PHYSDEVOP_pci_device_add call. This information, > however, is currently ignored. > > We should remember it (in the form of nodeID). We will also print it > when user requests device information dump. This on its own seems pretty little reason for changing the code; considering that subsequent patches will at least convey the information to the tool stack, maybe that should be mentioned here as the primary reason for the change? > @@ -597,13 +598,14 @@ ret_t do_physdev_op(int cmd, > XEN_GUEST_HANDLE_PARAM(void) arg) > pdev_info.physfn.devfn = manage_pci_ext.physfn.devfn; > ret = pci_add_device(0, manage_pci_ext.bus, > manage_pci_ext.devfn, > - &pdev_info); > + &pdev_info, NUMA_NO_NODE); > break; > } > > case PHYSDEVOP_pci_device_add: { > struct physdev_pci_device_add add; > struct pci_dev_info pdev_info; > +int node; Here and everywhere else as applicable: unsigned int unless a negative value is possible. > @@ -618,7 +620,19 @@ ret_t do_physdev_op(int cmd, > XEN_GUEST_HANDLE_PARAM(void) arg) > } > else > pdev_info.is_virtfn = 0; > -ret = pci_add_device(add.seg, add.bus, add.devfn, &pdev_info); > + > +if ( add.flags & XEN_PCI_DEV_PXM ) { Coding style. > --- a/xen/drivers/passthrough/pci.c > +++ b/xen/drivers/passthrough/pci.c > @@ -568,7 +568,8 @@ static void pci_enable_acs(struct pci_dev *pdev) > pci_conf_write16(seg, bus, dev, func, pos + PCI_ACS_CTRL, ctrl); > } > > -int pci_add_device(u16 seg, u8 bus, u8 devfn, const struct pci_dev_info > *info) > +int pci_add_device(u16 seg, u8 bus, u8 devfn, > + const struct pci_dev_info *info, const int node) I don't see the need for the const. > --- a/xen/include/xen/pci.h > +++ b/xen/include/xen/pci.h > @@ -56,6 +56,8 @@ struct pci_dev { > > u8 phantom_stride; > > +int node; /* NUMA node */ I don't think we currently support node IDs wider than 8 bits. 
Jan
Re: [Xen-devel] [PATCH 1/5] tools/hotplug: move XENSTORED_MOUNT_CTX to sysconfig.xencommons
On Fri, Dec 05, Anthony PERARD wrote:

> On Fri, Dec 05, 2014 at 01:05:48PM +0100, Olaf Hering wrote:
> > On a non-SELinux system the mount option "context=none" works fine.
>
> That's not true. I've tested 'context=none' on ArchLinux which have no
> support (or limited) for selinux, and the option give this error:
> "tmpfs: Bad mount option context"

Appears to work for me, at least on SLE12. No idea how much SELinux they put in there. My old 11.4 behaves the same. Perhaps your kernel lacks certain functionality? I assume the error msg comes from the kernel.

root@bax:~ # mount -vt tmpfs xxx -o context=foo /mnt/
mount: xxx mounted on /mnt.
root@bax:~ # grep mnt /proc/mounts
xxx /mnt tmpfs rw,relatime 0 0

Olaf
[Xen-devel] [PATCH v2] console: increase initial conring size
In general the initial conring size is sufficient. However, if the log level is increased on platforms which have e.g. a huge number of memory regions (I have an IBM System x3550 M2 with 8 GiB RAM which has more than 200 entries in the EFI memory map) then some of the earlier messages in the console ring are overwritten. It means that in case of issues deeper analysis can be hindered. Sadly the conring_size argument does not help because the new console buffer is allocated late, on the heap. It means that it is not possible to allocate a larger ring earlier. So, in this situation the initial conring size should be increased.

My experiments showed that even on not so big machines more than 26 KiB of free space are needed for initial messages. In theory we could increase the conring buffer size to 32 KiB. However, I think that this value could be too small for huge machines with a large number of ACPI tables and EFI memory regions. Hence, this patch increases the initial conring size to 64 KiB.

Signed-off-by: Daniel Kiper
---
This bug (or lack of feature if you prefer) should be fixed, as it was pointed out by Jan Beulich and Olaf Hering, by allocating conring earlier. I thought about that before posting this patch (I did not know beforehand about Olaf's work made in 2011). However, I stated that it is too late to make such intrusive changes. So, I think we should (sadly) apply this "band-aid" to 4.5 because, as you can see in the Xen-devel archive, this bug hits more and more people and they fix this issue in the same way as I did in this patch. On the other hand I agree that we should finally fix this issue in a better way. Hence, I am adding this thing to my TODO list.

v2 - suggestions/fixes:
   - update documentation (suggested by Andrew Cooper),
   - add rationale (suggested by Jan Beulich).
---
 docs/misc/xen-command-line.markdown |2 +-
 xen/drivers/char/console.c |2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index 0866df2..2ad2340 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -286,7 +286,7 @@ A typical setup for most situations might be `com1=115200,8n1`
 ### conring\_size
 > `= `
 
-> Default: `conring_size=16k`
+> Default: `conring_size=64k`
 
 Specify the size of the console ring buffer.
 
diff --git a/xen/drivers/char/console.c b/xen/drivers/char/console.c
index 2f03259..429d296 100644
--- a/xen/drivers/char/console.c
+++ b/xen/drivers/char/console.c
@@ -67,7 +67,7 @@ custom_param("console_timestamps", parse_console_timestamps);
 static uint32_t __initdata opt_conring_size;
 size_param("conring_size", opt_conring_size);
 
-#define _CONRING_SIZE 16384
+#define _CONRING_SIZE 65536
 #define CONRING_IDX_MASK(i) ((i)&(conring_size-1))
 static char __initdata _conring[_CONRING_SIZE];
 static char *__read_mostly conring = _conring;
-- 
1.7.10.4
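[Archive note] Why a too-small initial ring loses the earliest boot messages can be seen in a toy model of a power-of-two ring indexed with a mask, in the style of CONRING_IDX_MASK above. This is a simplified sketch, not the code in xen/drivers/char/console.c: the names (struct ring, ring_write, ring_lost, ring_read, RING_SIZE) are hypothetical, the ring is shrunk to 16 bytes so the effect is visible, and wrap-around of the 32-bit producer counter is ignored.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified model of a console ring; not the Xen code. */
#define RING_SIZE 16u                            /* must be a power of two */
#define RING_IDX_MASK(i) ((i) & (RING_SIZE - 1))

struct ring {
    char buf[RING_SIZE];
    uint32_t prod;               /* total number of bytes ever written */
};

/* Append a message; once the ring is full the oldest bytes are overwritten. */
static void ring_write(struct ring *r, const char *msg)
{
    assert(r != NULL && msg != NULL);
    for ( ; *msg; msg++ )
        r->buf[RING_IDX_MASK(r->prod++)] = *msg;
}

/* Number of early bytes that have been overwritten and are gone for good. */
static uint32_t ring_lost(const struct ring *r)
{
    return r->prod > RING_SIZE ? r->prod - RING_SIZE : 0;
}

/*
 * Copy the surviving bytes, oldest first, into out and NUL-terminate.
 * out must have room for RING_SIZE + 1 bytes.
 */
static void ring_read(const struct ring *r, char *out)
{
    uint32_t i;

    for ( i = ring_lost(r); i < r->prod; i++ )
        *out++ = r->buf[RING_IDX_MASK(i)];
    *out = '\0';
}

/* Usage: 20 bytes of "boot log" written into the 16-byte ring. */
static uint32_t ring_demo(char *out)
{
    struct ring r = { {0}, 0 };

    ring_write(&r, "0123456789");
    ring_write(&r, "abcdefghij");
    ring_read(&r, out);
    return ring_lost(&r);
}
```

In this model the first bytes written are exactly the ones lost, which matches the observation in the patch description that the earlier messages are the ones overwritten when boot output exceeds the static buffer.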
Re: [Xen-devel] [PATCH OSSTEST] ts-xen-build-prep: Install libxml-xpath-perl on build machines
On Fri, 2014-12-05 at 14:55 +0000, Ian Campbell wrote:
> Required by latest libvirt, to build docs.
>
> Signed-off-by: Ian Campbell

Ian acked this on IRC and I have pushed it along with some other bits and bobs floating around already acked. Specifically I have pushed to pretest:

0d8405e Add simple helper to update DI for all architectures.
e7ed319 ts-kernel-build: enable CONFIG_IKCONFIG{_PROC}
6184712 standalone: Introduce "HostGroups" for use in OSSTEST_CONFIG
a70253f ts-xen-build-prep: Install libxml-xpath-perl on build machines
60670dd linux-next tests: Use correct branch for baseline
Re: [Xen-devel] [PATCH] have architectures specify the number of PIRQs a hardware domain gets
>>> On 05.12.14 at 16:25, wrote:
> - XEN_DOMCTL_irq_permission => I don't really understand this bits.
> AFAIU the pirq number is different on each domain. But we use it to
> check permission on both domain. Shouldn't we translate the pirq to irq
> for the current->domain?

Indeed, see also
http://lists.xenproject.org/archives/html/xen-devel/2014-12/msg00219.html

Jan
Re: [Xen-devel] [xen-unstable test] 32051: regressions - FAIL
On Thu, Dec 04, 2014 at 10:12:18AM +, xen.org wrote: > flight 32051 xen-unstable real [real] > http://www.chiark.greenend.org.uk/~xensrcts/logs/32051/ > > Regressions :-( There does not seem to be anything warranting that. The failure is that this: 07 Z executing ssh ... osstest@10.80.250.27 cd /home/osstest/build.32051.build-amd64-pvops/linux && git rev-parse HEAD^0 2014-12-03 13:15:37 Z command timed out [30]: ssh -o StrictHostKeyChecking=no -o BatchMode=yes -o ConnectTimeout=100 -o ServerAliveInterval=100 -o PasswordAuthentication=no -o ChallengeResponseAuthentication=no -o UserKnownHostsFile=tmp/t.known_hosts_32051.build-amd64-pvops osstest@10.80.250.27 cd /home/osstest/build.32051.build-amd64-pvops/linux && git rev-parse HEAD^0 status (timed out) at Osstest/TestSupport.pm line 392. Which taking more than 30 seconds is quite odd. But perhaps it triggered git compression? Oh wait, the ConnectionTimeout is 100 seconds but this failed in 30 seconds? And the http://www.chiark.greenend.org.uk/~xensrcts/logs/32051/build-amd64-pvops/6.ts-logs-capture.log was able to capture data - so the host did not crash. ? > > Tests which did not succeed and are blocking, > including tests which could not be run: > build-amd64-pvops 5 kernel-build fail REGR. vs. 
> 31986 > > Regressions which are regarded as allowable (not blocking): > test-amd64-i386-pair 18 guest-migrate/dst_host/src_host fail blocked in > 31986 > > Tests which did not succeed, but are not blocking: > test-amd64-amd64-rumpuserxen-amd64 1 build-check(1) blocked > n/a > test-amd64-i386-libvirt 9 guest-start fail never > pass > test-armhf-armhf-libvirt 9 guest-start fail never > pass > test-amd64-amd64-xl-pcipt-intel 1 build-check(1) blocked n/a > test-amd64-amd64-xl-sedf-pin 1 build-check(1) blocked n/a > test-amd64-amd64-xl-sedf 1 build-check(1) blocked n/a > test-amd64-amd64-xl 1 build-check(1) blocked n/a > test-armhf-armhf-xl 10 migrate-support-checkfail never > pass > test-amd64-amd64-libvirt 1 build-check(1) blocked n/a > test-amd64-i386-xl-qemut-winxpsp3 14 guest-stopfail never > pass > test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 14 guest-stop fail never > pass > test-amd64-i386-xl-qemuu-winxpsp3 14 guest-stopfail never > pass > test-amd64-i386-xl-qemut-winxpsp3-vcpus1 14 guest-stop fail never > pass > test-amd64-amd64-xl-qemut-win7-amd64 1 build-check(1) blocked > n/a > test-amd64-amd64-xl-win7-amd64 1 build-check(1) blocked n/a > test-amd64-amd64-xl-qemuu-win7-amd64 1 build-check(1) blocked > n/a > test-amd64-amd64-pair 1 build-check(1) blocked n/a > test-amd64-i386-xl-qemut-win7-amd64 14 guest-stop fail never > pass > test-amd64-amd64-xl-qemut-debianhvm-amd64 1 build-check(1)blocked > n/a > test-amd64-amd64-xl-qemuu-debianhvm-amd64 1 build-check(1)blocked > n/a > test-amd64-amd64-xl-qemuu-ovmf-amd64 1 build-check(1) blocked > n/a > test-amd64-amd64-xl-qemut-winxpsp3 1 build-check(1) blocked > n/a > test-amd64-i386-xl-win7-amd64 14 guest-stop fail never > pass > test-amd64-i386-xl-winxpsp3-vcpus1 14 guest-stop fail never > pass > test-amd64-amd64-xl-winxpsp3 1 build-check(1) blocked n/a > test-amd64-i386-xl-winxpsp3 14 guest-stop fail never > pass > test-amd64-i386-xl-qemuu-win7-amd64 14 guest-stop fail never > pass > 
test-amd64-amd64-xl-qemuu-winxpsp3 1 build-check(1) blocked > n/a > > version targeted for testing: > xen 4d1a77ba7ab94183c203226d3fe7ac1cd087c59b > baseline version: > xen 188336bb86d0992a2a034ece5f39eccc5d10f337 > > > People who touched revisions under test: > Chunyan Liu > Euan Harris > Ian Campbell > Ian Jackson > Jan Beulich > M A Young > Michael Young > Razvan Cojocaru > Tim Deegan > Wei Liu > > > jobs: > build-amd64 pass > build-armhf pass > build-i386 pass > build-amd64-libvirt pass > build-armhf-libvirt pass > build-i386-libvirt pass > build-amd64-oldkern pass > build-i386-oldkern pass > build-amd64-pvops
Re: [Xen-devel] [PATCH 1/5] tools/hotplug: move XENSTORED_MOUNT_CTX to sysconfig.xencommons
On Fri, Dec 05, 2014 at 01:05:48PM +0100, Olaf Hering wrote:
> On a non-SELinux system the mount option "context=none" works fine.

That's not true. I've tested 'context=none' on ArchLinux which have no support (or limited) for selinux, and the option give this error:
"tmpfs: Bad mount option context"

-- 
Anthony PERARD
Re: [Xen-devel] [xen-4.4-testing test] 31991: regressions - FAIL [and 1 more messages]
xen.org writes ("[xen-4.4-testing test] 31991: regressions - FAIL"): > flight 31991 xen-4.4-testing real [real] > http://www.chiark.greenend.org.uk/~xensrcts/logs/31991/ > > Regressions :-( > > Tests which did not succeed and are blocking, > including tests which could not be run: > test-amd64-i386-pair 17 guest-migrate/src_host/dst_host fail REGR. vs. > 31781 This is the swiotlb problem which is not a recent regression in Xen 4.3, but probably a gradually-regressing kernel problem. > version targeted for testing: > xen a39f202031d7f1d8d9e14b8c3d7d11c812db253e xen.org writes ("[xen-4.3-testing test] 32089: regressions - FAIL"): > flight 32089 xen-4.3-testing real [real] > http://www.chiark.greenend.org.uk/~xensrcts/logs/32089/ > > Regressions :-( > > Tests which did not succeed and are blocking, > including tests which could not be run: > test-amd64-i386-pair 17 guest-migrate/src_host/dst_host fail REGR. vs. > 31811 Likewise. > version targeted for testing: > xen e0921ec746410f0a07eb3767e95e5eda25d4934a In both of these cases, that was the only reason osstest didn't do a push. Following discussion with Jan on IRC, I am going to do a manual force push of both trees. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] lock down hypercall continuation encoding masks
On 05/12/14 15:18, Jan Beulich wrote: On 05.12.14 at 16:01, wrote: >> On 05/12/14 14:47, Jan Beulich wrote: >> On 05.12.14 at 15:36, wrote: On 05/12/14 11:31, Jan Beulich wrote: > Andrew validly points out that even if these masks aren't a formal part > of the hypercall interface, we aren't free to change them: A guest > suspended for migration in the middle of a continuation would fail to > work if resumed on a hypervisor using a different value. Hence add > respective comments to their definitions. > > Additionally, to help future extensibility as well as in the spirit of > reducing undefined behavior as much as possible, refuse hypercalls made > with the respective bits non-zero when the respective sub-ops don't > make use of those bits. > > Reported-by: Andrew Cooper > Signed-off-by: Jan Beulich General principle looks good. A couple of issues. > --- a/xen/arch/x86/mm.c > +++ b/xen/arch/x86/mm.c > @@ -4661,9 +4661,8 @@ int xenmem_add_to_physmap_one( > long arch_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg) > { > int rc; > -int op = cmd & MEMOP_CMD_MASK; This needs a blanket start_iter check, as do_memory_op() has not done so. >>> Not sure what you're asking for - why is removing the masking not >>> sufficient? >> There is no check to ensure that a non-preemptible arch_memoy_op is not >> called with a non-zero start_iter. >> >> This location needs something like >> >> if ( cmd & ~MEMOP_CMD_MASK ) >> return -ENOSYS; > I'm sorry - the default case of sub_arch_memory_op() will ensure > this. Ah - I see now. That is subtle. Better remember to double check the first patch which needs to add a preemptible subop. Reviewed-by: Andrew Cooper ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] have architectures specify the number of PIRQs a hardware domain gets
On 05/12/14 14:42, Ian Campbell wrote: > On Fri, 2014-12-05 at 14:36 +, Julien Grall wrote: >> Hi, >> >> On 05/12/14 14:27, Ian Campbell wrote: >>> On Fri, 2014-12-05 at 13:51 +, Jan Beulich wrote: #define nr_static_irqs NR_IRQS +#define arch_hwdom_irqs(domid) NR_IRQS >>> >>> FWIW gic_number_lines() is the ARM equivalent of getting the number of >>> GSIs. >>> >>> *BUT* we don't actually use pirqs on ARM (everything goes via the >>> virtualised interrupt controller). So maybe we should be setting >>> nr_pirqs to 0 on ARM. I appreciate you likely want such a patch to come >>> from an ARM person, so I'm fine with you making this NR_IRQS in the >>> meantime. >> >> As we already know that PIRQ is not used on ARM, it would make sense to >> use directly in this patch 0. > > Are you offering to give a tested-by if Jan posts such a patch? nr_pirqs is used in 2 different place (without counting this setting): - event channel => We don't care on ARM as alloc_pirq_struct is returning NULL - XEN_DOMCTL_irq_permission => I don't really understand this bits. AFAIU the pirq number is different on each domain. But we use it to check permission on both domain. Shouldn't we translate the pirq to irq for the current->domain? Regards, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Poor network performance between DomU with multiqueue support
On 04/12/14 14:31, Zhangleiqiang (Trump) wrote:
> -----Original Message-----
> From: Zoltan Kiss [mailto:zoltan.k...@linaro.org]
> Sent: Thursday, December 04, 2014 9:35 PM
> To: Zhangleiqiang (Trump); Wei Liu; xen-devel@lists.xen.org
> Cc: Xiaoding (B); Zhuangyuxin; zhangleiqiang; Luohao (brian); Yuzhou (C)
> Subject: Re: [Xen-devel] Poor network performance between DomU with multiqueue support
>
>> On 04/12/14 12:09, Zhangleiqiang (Trump) wrote:
>>>> I think that's expected, because guest RX data path still uses grant_copy while guest TX uses grant_map to do zero-copy transmit.
>>>
>>> As I understand, the RX process is as follows:
>>> 1. Phy NIC receive packet
>>> 2. XEN Hypervisor trigger interrupt to Dom0
>>> 3. Dom0's NIC driver do the "RX" operation, and the packet is stored into SKB which is also owned/shared with netback
>>
>> Not that easy. There is something between the NIC driver and netback which directs the packets, e.g. the old bridge driver, ovs, or the IP stack of the kernel.
>>
>>> 4. NetBack notify netfront through event channel that a packet is receiving
>>> 5. Netfront grant a buffer for receiving and notify netback the GR (if using grant-reuse mechanism, netfront just notify the GR to netback) through IO Ring
>>
>> It looks a bit confusing in the code, but netfront puts "requests" on the ring buffer, which contain the grant ref of the guest page where the backend can copy. When the packet comes, netback consumes these requests and sends back a response telling the guest the grant copy of the packet finished, it can start handling the data. (Sending a response means it's placing a response in the ring and triggering the event channel.) And ideally netback should always have requests in the ring, so it doesn't have to wait for the guest to fill it up.
>>
>>> 6. NetBack do the grant_copy to copy packet from its SKB to the buffer referenced by GR, and notify netfront through event channel
>>> 7. Netfront copy the data from buffer to user-level app's SKB
>>
>> Or wherever that SKB should go, yes. Like with any received packet on a real network interface.
>>
>>> Am I right? Why not using zero-copy transmit in guest RX data path too?
>>
>> Because that means you are mapping that memory to the guest, and you won't have any guarantee when the guest will release them. And netback can't just unmap them forcibly after a timeout, because finding a correct timeout value would be quite impossible. A malicious/buggy/overloaded guest can hold on to Dom0 memory indefinitely, but it even becomes worse if the memory came from another guest: you can't shutdown that guest for example, until all its memory is returned to him.
>
> Thanks for your detailed explanation about RX data path, I have get it, :)
>
> About the issue that poor performance between DomU to DomU, but high throughput between Dom0 to remote Dom0/DomU mentioned in my previous mail, do you have any idea about it?
>
> I am wondering if netfront/netback can be optimized to reach the 10Gbps throughput between DomUs running on different hosts connected with 10GE network. Currently, it seems like the TX is not the bottleneck, because we can reach the aggregate throughput of 9Gbps when sending packets from one DomU to other 3 DomUs running on different host. So I think the bottleneck maybe the RX, are you agreed with me?
>
> I am wondering what is the main reason that prevent RX to reach the higher throughput? Compared to KVM+virtio+vhost, which can reach high throughput, the RX has extra grantcopy operation, and the grantcopy operation may be one reason for it. Do you have any idea about it too?

It's quite sure that the grant copy is the bottleneck for a single queue RX traffic. I don't know what's the plan to help that, currently only a faster CPU can help you with that.

Regards,

Zoli
Re: [Xen-devel] Poor network performance between DomU with multiqueue support
On 05/12/14 12:42, Wei Liu wrote:
On Fri, Dec 05, 2014 at 01:17:16AM +0000, Zhangleiqiang (Trump) wrote:
[...]

I think that's expected, because guest RX data path still uses grant_copy while guest TX uses grant_map to do zero-copy transmit.

As far as I know, there are three main grant-related operations used in split device model: grant mapping, grant transfer and grant copy. Grant transfer is not used now, and grant mapping and grant transfer both involve "TLB" refresh work for hypervisor, am I right? Or only grant transfer has this overhead?

Transfer is not used so I can't tell. Grant unmap causes TLB flush. I saw in an email the other day XenServer folks have some planned improvement to avoid TLB flush in Xen to upstream in 4.6 window. I can't speak for sure it will get upstreamed as I don't work on that.

Does grant copy surely have more overhead than grant mapping?

At the very least the zero-copy TX path is faster than previous copying path. But speaking of the micro operation I'm not sure. There was once persistent map prototype netback / netfront that establishes a memory pool between FE and BE then use memcpy to copy data. Unfortunately that prototype was not done right so the result was not good.

From the code, I see that in TX, netback will do gnttab_batch_copy as well as gnttab_map_refs:

    //netback.c:xenvif_tx_action
    xenvif_tx_build_gops(queue, budget, &nr_cops, &nr_mops);

    if (nr_cops == 0)
        return 0;

    gnttab_batch_copy(queue->tx_copy_ops, nr_cops);
    if (nr_mops != 0) {
        ret = gnttab_map_refs(queue->tx_map_ops, NULL, queue->pages_to_map, nr_mops);
        BUG_ON(ret);
    }

The copy is for the packet header. Mapping is for packet data. We need to copy header from guest so that it doesn't change under netback's feet.

It is also important because if the above mentioned "TLB flush avoidance" patch goes in to Xen, it will be important to grant copy the header rather than grant map plus memcpy. The latter is the old way, it touches the page so you can't avoid TLB flush.

Wei.
Re: [Xen-devel] [PATCH] lock down hypercall continuation encoding masks
>>> On 05.12.14 at 16:01, wrote: > On 05/12/14 14:47, Jan Beulich wrote: > On 05.12.14 at 15:36, wrote: >>> On 05/12/14 11:31, Jan Beulich wrote: Andrew validly points out that even if these masks aren't a formal part of the hypercall interface, we aren't free to change them: A guest suspended for migration in the middle of a continuation would fail to work if resumed on a hypervisor using a different value. Hence add respective comments to their definitions. Additionally, to help future extensibility as well as in the spirit of reducing undefined behavior as much as possible, refuse hypercalls made with the respective bits non-zero when the respective sub-ops don't make use of those bits. Reported-by: Andrew Cooper Signed-off-by: Jan Beulich >>> General principle looks good. A couple of issues. >>> --- a/xen/arch/x86/mm.c +++ b/xen/arch/x86/mm.c @@ -4661,9 +4661,8 @@ int xenmem_add_to_physmap_one( long arch_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg) { int rc; -int op = cmd & MEMOP_CMD_MASK; >>> This needs a blanket start_iter check, as do_memory_op() has not done so. >> Not sure what you're asking for - why is removing the masking not >> sufficient? > > There is no check to ensure that a non-preemptible arch_memoy_op is not > called with a non-zero start_iter. > > This location needs something like > > if ( cmd & ~MEMOP_CMD_MASK ) > return -ENOSYS; I'm sorry - the default case of sub_arch_memory_op() will ensure this. >>> The ARM code also needs one, as the caller has applied partial checks. >> The ARM code never applied a mask. > > But the common code does, so the ARM code must follow suit for consistency. > > Otherwise, we end up with ARM non-preemptible memory subops not failing > with -ENOSYS where primary memory ops would. Again, the default case results in -ENOSYS for any with the high bits set. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] ucode=scan usefulness
>>> On 05.12.14 at 16:05, wrote:
> having been surprised to find your cpio scanning code to not work I
> had to realize that this can't possibly work when the initrd is
> compressed. Considering that you found this useful nevertheless -
> am I to imply that you're running with (and only considering) non-
> compressed initrd? Are there plans to support compressed ones too?

Never mind, I forgot that the blob gets prefixed uncompressed to the compressed one, and got confused by seeing a fully compressed image simply because the installer for some reason decided not to install any microcode data.

Sorry for the noise, Jan
[Xen-devel] [xen-4.3-testing test] 32089: regressions - FAIL
flight 32089 xen-4.3-testing real [real] http://www.chiark.greenend.org.uk/~xensrcts/logs/32089/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-i386-pair 17 guest-migrate/src_host/dst_host fail REGR. vs. 31811 Tests which did not succeed, but are not blocking: test-amd64-amd64-rumpuserxen-amd64 1 build-check(1) blocked n/a test-amd64-i386-rumpuserxen-i386 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-ovmf-amd64 7 debian-hvm-install fail never pass test-amd64-i386-libvirt 9 guest-start fail never pass test-amd64-amd64-xl-qemuu-ovmf-amd64 7 debian-hvm-install fail never pass test-amd64-amd64-libvirt 9 guest-start fail never pass test-amd64-amd64-xl-pcipt-intel 9 guest-start fail never pass test-armhf-armhf-xl 5 xen-boot fail never pass test-armhf-armhf-libvirt 5 xen-boot fail never pass test-amd64-i386-xl-qemut-win7-amd64 14 guest-stop fail never pass build-i386-rumpuserxen6 xen-buildfail never pass test-amd64-i386-xend-winxpsp3 17 leak-check/check fail never pass build-amd64-rumpuserxen 6 xen-buildfail never pass test-amd64-i386-xl-qemut-winxpsp3-vcpus1 14 guest-stop fail never pass test-amd64-i386-xl-winxpsp3-vcpus1 14 guest-stop fail never pass test-amd64-i386-xend-qemut-winxpsp3 17 leak-check/checkfail never pass test-amd64-i386-xl-win7-amd64 14 guest-stop fail never pass test-amd64-amd64-xl-win7-amd64 14 guest-stop fail never pass test-amd64-amd64-xl-qemut-win7-amd64 14 guest-stop fail never pass test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 14 guest-stop fail never pass test-amd64-amd64-xl-qemuu-win7-amd64 14 guest-stop fail never pass test-amd64-i386-xl-qemuu-win7-amd64 14 guest-stop fail never pass test-amd64-amd64-xl-winxpsp3 14 guest-stop fail never pass test-amd64-amd64-xl-qemut-winxpsp3 14 guest-stop fail never pass test-amd64-amd64-xl-qemuu-winxpsp3 14 guest-stop fail never pass version targeted for testing: xen e0921ec746410f0a07eb3767e95e5eda25d4934a baseline version: xen 
62f1b78417f3a9afe8d40ee3c0d2f0495240cf47 People who touched revisions under test: Jan Beulich jobs: build-amd64 pass build-armhf pass build-i386 pass build-amd64-libvirt pass build-armhf-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-armhf-pvopspass build-i386-pvops pass build-amd64-rumpuserxen fail build-i386-rumpuserxen fail test-amd64-amd64-xl pass test-armhf-armhf-xl fail test-amd64-i386-xl pass test-amd64-i386-rhel6hvm-amd pass test-amd64-i386-qemut-rhel6hvm-amd pass test-amd64-i386-qemuu-rhel6hvm-amd pass test-amd64-amd64-xl-qemut-debianhvm-amd64pass test-amd64-i386-xl-qemut-debianhvm-amd64 pass test-amd64-amd64-xl-qemuu-debianhvm-amd64pass test-amd64-i386-xl-qemuu-debianhvm-amd64 pass test-amd64-i386-freebsd10-amd64 pass test-amd64-amd64-xl-qemuu-ovmf-amd64 fail test-amd64-i386-xl-qemuu-ovmf-amd64 fail test-amd64-amd64-rumpuserxen-amd64 blocked test-amd64-amd64-xl-qemut-win7-amd64 fail test-amd64-i386-xl-qemut-win7-amd64 fail test-amd64-amd64-xl-qemuu-win7-amd64 fail test-amd64-i386-xl-qemuu-win7-amd64 fail test-amd64-amd64-xl-win7-amd64 fail test-amd64-i386-xl-win7-amd64fail test-amd64-i386-xl-credit2 pass test-amd64-i386-freebsd10-i386
[Xen-devel] ucode=scan usefulness
Konrad,

having been surprised to find your cpio scanning code to not work I
had to realize that this can't possibly work when the initrd is
compressed. Considering that you found this useful nevertheless -
am I to imply that you're running with (and only considering) non-
compressed initrd? Are there plans to support compressed ones too?

Jan
Re: [Xen-devel] [PATCH 1/5] tools/hotplug: move XENSTORED_MOUNT_CTX to sysconfig.xencommons
Olaf Hering writes ("Re: [PATCH 1/5] tools/hotplug: move XENSTORED_MOUNT_CTX to sysconfig.xencommons"):
> On Fri, Dec 05, Ian Jackson wrote:
> > I confess I don't know very much about selinux, but shouldn't we be
> > providing a reasonable default policy, rather than leaving it to the
> > distro or user to pass special options to configure ? Or are things
> > in the selinux world so fragmented or fast-moving that such a generic
> > policy couldn't be written ?
>
> I know nothing about SELinux. Not sure why a context= is required
> anyway. But I can find out next week if noone else has an idea how to
> deal with SELinux.

OK, thanks.

Anyway, I don't think this question should stand in the way of this hunk of your patch, which is IMO obviously a move in the right direction.

So if you shuffle things about as I suggested I will ack this hunk in your next version of the series.

Thanks,
Ian.
Re: [Xen-devel] [PATCH] lock down hypercall continuation encoding masks
On 05/12/14 14:47, Jan Beulich wrote: On 05.12.14 at 15:36, wrote: >> On 05/12/14 11:31, Jan Beulich wrote: >>> Andrew validly points out that even if these masks aren't a formal part >>> of the hypercall interface, we aren't free to change them: A guest >>> suspended for migration in the middle of a continuation would fail to >>> work if resumed on a hypervisor using a different value. Hence add >>> respective comments to their definitions. >>> >>> Additionally, to help future extensibility as well as in the spirit of >>> reducing undefined behavior as much as possible, refuse hypercalls made >>> with the respective bits non-zero when the respective sub-ops don't >>> make use of those bits. >>> >>> Reported-by: Andrew Cooper >>> Signed-off-by: Jan Beulich >> General principle looks good. A couple of issues. >> >>> --- a/xen/arch/x86/mm.c >>> +++ b/xen/arch/x86/mm.c >>> @@ -4661,9 +4661,8 @@ int xenmem_add_to_physmap_one( >>> long arch_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg) >>> { >>> int rc; >>> -int op = cmd & MEMOP_CMD_MASK; >> This needs a blanket start_iter check, as do_memory_op() has not done so. > Not sure what you're asking for - why is removing the masking not > sufficient? There is no check to ensure that a non-preemptible arch_memory_op is not called with a non-zero start_iter. This location needs something like if ( cmd & ~MEMOP_CMD_MASK ) return -ENOSYS; > >> The ARM code also needs one, as the caller has applied partial checks. > The ARM code never applied a mask. But the common code does, so the ARM code must follow suit for consistency. Otherwise, we end up with ARM non-preemptible memory subops not failing with -ENOSYS where primary memory ops would.
> >>> --- a/xen/common/memory.c >>> +++ b/xen/common/memory.c >>> @@ -977,6 +992,9 @@ long do_memory_op(unsigned long cmd, XEN >>> unsigned int dom_vnodes, dom_vranges, dom_vcpus; >>> struct vnuma_info tmp; >>> >>> +if ( unlikely(start_extent) ) >>> +return -ENOSYS; >>> + >>> /* >>> * Guest passes nr_vnodes, number of regions and nr_vcpus thus >>> * we know how much memory guest has allocated. >> XENMEM_get_vnumainfo needs a guard. > Again - I don't understand what you're asking for: The hunk above > is modifying the XENMEM_get_vnumainfo case. My apologies - I can't see now why I identified get_vnumainfo as missing a check. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
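For readers following the thread, the check Andrew asks for can be sketched in isolation. This is only an illustration under stated assumptions: the EX_ names, the 6-bit shift, and the sub-op number 9 are invented for the example and are not taken from the patch or from the Xen headers. The idea is that the low bits of cmd carry the sub-op and the high bits carry the continuation's start extent, so a non-preemptible handler can refuse any cmd with high bits set.

```c
#include <errno.h>

/* Assumed layout, for illustration only: the low bits of cmd select the
 * sub-op, the remaining high bits hold the continuation's start extent. */
#define EX_MEMOP_EXTENT_SHIFT 6
#define EX_MEMOP_CMD_MASK     ((1UL << EX_MEMOP_EXTENT_SHIFT) - 1)

/* Extract the sub-op from an encoded command. */
static unsigned long ex_memop_op(unsigned long cmd)
{
    return cmd & EX_MEMOP_CMD_MASK;
}

/* Extract the start extent from an encoded command. */
static unsigned long ex_memop_start_extent(unsigned long cmd)
{
    return cmd >> EX_MEMOP_EXTENT_SHIFT;
}

/* Hypothetical non-preemptible handler: none of its sub-ops use the
 * continuation bits, so any non-zero start extent is rejected up front. */
static long ex_arch_memory_op(unsigned long cmd)
{
    if ( cmd & ~EX_MEMOP_CMD_MASK )
        return -ENOSYS;

    switch ( cmd )
    {
    case 9: /* hypothetical sub-op number */
        return 0;

    default:
        return -ENOSYS;
    }
}
```

With the blanket check in place, switching on cmd directly (as the patch does) is equivalent to switching on the masked op, because the two can only differ when the high bits are non-zero.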
Re: [Xen-devel] A few EFI code questions
>>> On 05.12.14 at 15:51, wrote: > On Thu, Dec 04, 2014 at 09:35:01AM +, Jan Beulich wrote: >> >>> On 03.12.14 at 22:02, wrote: >> > 3) Should not we change xen/arch/*/efi/efi-boot.h to >> >xen/arch/*/efi/efi-boot.c? efi-boot.h contains more >> >code than definitions, declarations and short static >> >functions. So, I think that it is more regular *.c file >> >than header file. >> >> That's a matter of taste - I'd probably have made it .c too, but >> didn't mind it being .h as done by Roy (presumably on the basis >> that #include directives are preferred to have .h files as their >> operands). The only thing I regret is that I didn't ask for the >> pointless efi- prefix to be dropped. > > As I can see a few people agree to some extent with my suggestion. > Great! Sadly if we wish .c file then simple boot.c (as Jan suggested we can > drop efi- prefix) conflicts with existing boot.c link. Is efi-boot.c OK? > Or maybe boot-arch.c? boot.h is OK for sure. Which one do you prefer? > Do you have better ideas? boot.h would be my preference given how things look like right now, but I don't think this possibility of renaming warrants a much longer discussion. Please also remember that renaming always implies more cumbersome backporting, even if only slightly more. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH OSSTEST] ts-xen-build-prep: Install libxml-xpath-perl on build machines
Required by latest libvirt, to build docs. Signed-off-by: Ian Campbell --- ts-xen-build-prep | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ts-xen-build-prep b/ts-xen-build-prep index 05a7857..a7d0d03 100755 --- a/ts-xen-build-prep +++ b/ts-xen-build-prep @@ -177,7 +177,7 @@ sub prep () { libglib2.0-dev liblzma-dev pkg-config autoconf automake libtool xsltproc libxml2-utils libxml2-dev libnl-dev - libdevmapper-dev w3c-dtd-xhtml + libdevmapper-dev w3c-dtd-xhtml libxml-xpath-perl ccache)); target_cmd_root($ho, "chmod -R a+r /usr/share/git-core/templates"); -- 2.1.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] have architectures specify the number of PIRQs a hardware domain gets
>>> On 05.12.14 at 15:48, wrote: > On 05/12/14 13:51, Jan Beulich wrote: >> +d->nr_pirqs = extra_hwdom_irqs ? nr_static_irqs + >> extra_hwdom_irqs >> + : arch_hwdom_irqs(domid); > > This means if the user asks for 0 extra (by the command line) for hwdoms > they get the default which is non-obvious. I can certainly add another sentence saying so to the documentation. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH for-xen-4.5 1/3] tools/hotplug: distclean target should remove files generated by configure
On Thu, Dec 04, 2014 at 09:53:40PM -0400, Konrad Rzeszutek Wilk wrote: > On Tue, Dec 02, 2014 at 01:36:20PM -0500, Konrad Rzeszutek Wilk wrote: > > On Tue, Dec 02, 2014 at 04:16:28PM +0100, Daniel Kiper wrote: > > > Signed-off-by: Daniel Kiper > > > > This usage scenario which I can see this being useful (and > > I've tripped over this) is when you rebuild a new version > > from the same repo. As in, this affects developers, but > > > Lets get it in. It fixes an issues that I keep on tripping > on and it is harmless enough that I don't see a way > for this to cause any regressions. > > Release-Acked-by: Konrad Rzeszutek Wilk Great! Thanks! Daniel ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH for-xen-4.5 2/3] gitignore: ignore some files generated by configure
On Thu, Dec 04, 2014 at 01:25:55PM +, Ian Campbell wrote: > On Tue, 2014-12-02 at 16:16 +0100, Daniel Kiper wrote: > > Signed-off-by: Daniel Kiper > > .gitignore updates seem harmless enough. so I've applied this and the > third patch. Awaiting Konrad's verdict on the first. Thanks! Daniel ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] A few EFI code questions
On Thu, Dec 04, 2014 at 09:35:01AM +, Jan Beulich wrote: > >>> On 03.12.14 at 22:02, wrote: > > Hey, > > > > 1) Why is there in EFI code so many functions (e.g. efi_start(), > >efi_arch_edd(), ...) with local variables declared as a static? > >Though some of them have also regular local variables. I do not > >know why it was decided that some of them must be the static and > >some of them do not. It is a bit confusing. As I can see there is > >only one place which have to have local static (place_string()). > >Other seems to me as thing to save space on the stack but I do > >not think we need that. According to UEFI spec there will be > >"128 KiB or more of available stack space" when system runs in > >boot services mode. It is a lot of space. So, I think we can > >safely convert most of local static variables to normal local > >variables. Am I right? > > No. Consider what code results when you e.g. make an EFI_GUID > instance non-static. It could be quite big... > > 2) I am going to add EDID support to EFI code. Should it be x86 > >specific code or common one? As I can see EDID is defined as > >part of GOP so I think that EDID code should be placed in > >xen/common/efi/boot.c. > > Yes. Granted! > > 3) Should not we change xen/arch/*/efi/efi-boot.h to > >xen/arch/*/efi/efi-boot.c? efi-boot.h contains more > >code than definitions, declarations and short static > >functions. So, I think that it is more regular *.c file > >than header file. > > That's a matter of taste - I'd probably have made it .c too, but > didn't mind it being .h as done by Roy (presumably on the basis > that #include directives are preferred to have .h files as their > operands). The only thing I regret is that I didn't ask for the > pointless efi- prefix to be dropped. As I can see a few people agree to some extent with my suggestion. Great! Sadly if we wish .c file then simple boot.c (as Jan suggested we can drop efi- prefix) conflicts with existing boot.c link. Is efi-boot.c OK?
Or maybe boot-arch.c? boot.h is OK for sure. Which one do you prefer? Do you have better ideas? Daniel ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] have architectures specify the number of PIRQs a hardware domain gets
On Fri, 2014-12-05 at 14:48 +, Jan Beulich wrote: > >>> On 05.12.14 at 15:27, wrote: > > On Fri, 2014-12-05 at 13:51 +, Jan Beulich wrote: > >> #define nr_static_irqs NR_IRQS > >> +#define arch_hwdom_irqs(domid) NR_IRQS > > > > FWIW gic_number_lines() is the ARM equivalent of getting the number of > > GSIs. > > > > *BUT* we don't actually use pirqs on ARM (everything goes via the > > virtualised interrupt controller). So maybe we should be setting > > nr_pirqs to 0 on ARM. I appreciate you likely want such a patch to come > > from an ARM person, so I'm fine with you making this NR_IRQS in the > > meantime. > > Considering Julien also asking for this, I don't mind changing this to > zero for ARM. Just let me know which way I can get this ack-ed. If you are happy to provide a version using zero and Julien wants to provide a tested-by then I'm fine with going that way. I'm not happy taking that change untested though, and I don't expect you to test it. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] have architectures specify the number of PIRQs a hardware domain gets
On 05/12/14 13:51, Jan Beulich wrote: > The current value of nr_static_irqs + 256 is often too small for larger > systems. Make it dependent on CPU count and number of IO-APIC pins on > x86, and (until it obtains PCI support) simply NR_IRQS on ARM. > > Signed-off-by: Jan Beulich I obviously prefer the simpler version that removes an unnecessary configuration option. But in the sense that this resolves the immediate problem at least for the short to medium term: Acked-by: David Vrabel > +d->nr_pirqs = extra_hwdom_irqs ? nr_static_irqs + > extra_hwdom_irqs > + : arch_hwdom_irqs(domid); This means if the user asks for 0 extra (by the command line) for hwdoms they get the default which is non-obvious. David ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
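David's point about the zero case can be shown with a small stand-alone sketch. The function below is hypothetical and only mirrors the shape of the quoted expression; the parameter names and the sample numbers are assumptions made for the illustration, not values from the patch.

```c
/* Hypothetical stand-in for the quoted selection logic: a non-zero
 * command-line value is added to the static IRQ count, while zero is
 * indistinguishable from "option not given" and selects the
 * architecture default instead. */
static unsigned int ex_hwdom_nr_pirqs(unsigned int nr_static_irqs,
                                      unsigned int extra_hwdom_irqs,
                                      unsigned int arch_default)
{
    return extra_hwdom_irqs ? nr_static_irqs + extra_hwdom_irqs
                            : arch_default;
}
```

So a user who explicitly asks for 0 extra PIRQs does not get nr_static_irqs, but whatever the architecture hook returns, which is why documenting the behaviour was suggested.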
Re: [Xen-devel] [PATCH] have architectures specify the number of PIRQs a hardware domain gets
>>> On 05.12.14 at 15:27, wrote: > On Fri, 2014-12-05 at 13:51 +, Jan Beulich wrote: >> #define nr_static_irqs NR_IRQS >> +#define arch_hwdom_irqs(domid) NR_IRQS > > FWIW gic_number_lines() is the ARM equivalent of getting the number of > GSIs. > > *BUT* we don't actually use pirqs on ARM (everything goes via the > virtualised interrupt controller). So maybe we should be setting > nr_pirqs to 0 on ARM. I appreciate you likely want such a patch to come > from an ARM person, so I'm fine with you making this NR_IRQS in the > meantime. Considering Julien also asking for this, I don't mind changing this to zero for ARM. Just let me know which way I can get this ack-ed. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] lock down hypercall continuation encoding masks
>>> On 05.12.14 at 15:36, wrote: > On 05/12/14 11:31, Jan Beulich wrote: >> Andrew validly points out that even if these masks aren't a formal part >> of the hypercall interface, we aren't free to change them: A guest >> suspended for migration in the middle of a continuation would fail to >> work if resumed on a hypervisor using a different value. Hence add >> respective comments to their definitions. >> >> Additionally, to help future extensibility as well as in the spirit of >> reducing undefined behavior as much as possible, refuse hypercalls made >> with the respective bits non-zero when the respective sub-ops don't >> make use of those bits. >> >> Reported-by: Andrew Cooper >> Signed-off-by: Jan Beulich > > General principle looks good. A couple of issues. > >> --- a/xen/arch/x86/mm.c >> +++ b/xen/arch/x86/mm.c >> @@ -4661,9 +4661,8 @@ int xenmem_add_to_physmap_one( >> long arch_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg) >> { >> int rc; >> -int op = cmd & MEMOP_CMD_MASK; > > This needs a blanket start_iter check, as do_memory_op() has not done so. Not sure what you're asking for - why is removing the masking not sufficient? > The ARM code also needs one, as the caller has applied partial checks. The ARM code never applied a mask. >> --- a/xen/common/memory.c >> +++ b/xen/common/memory.c >> @@ -977,6 +992,9 @@ long do_memory_op(unsigned long cmd, XEN >> unsigned int dom_vnodes, dom_vranges, dom_vcpus; >> struct vnuma_info tmp; >> >> +if ( unlikely(start_extent) ) >> +return -ENOSYS; >> + >> /* >> * Guest passes nr_vnodes, number of regions and nr_vcpus thus >> * we know how much memory guest has allocated. > > XENMEM_get_vnumainfo needs a guard. Again - I don't understand what you're asking for: The hunk above is modifying the XENMEM_get_vnumainfo case. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] have architectures specify the number of PIRQs a hardware domain gets
On Fri, 2014-12-05 at 14:36 +, Julien Grall wrote: > Hi, > > On 05/12/14 14:27, Ian Campbell wrote: > > On Fri, 2014-12-05 at 13:51 +, Jan Beulich wrote: > >> #define nr_static_irqs NR_IRQS > >> +#define arch_hwdom_irqs(domid) NR_IRQS > > > > FWIW gic_number_lines() is the ARM equivalent of getting the number of > > GSIs. > > > > *BUT* we don't actually use pirqs on ARM (everything goes via the > > virtualised interrupt controller). So maybe we should be setting > > nr_pirqs to 0 on ARM. I appreciate you likely want such a patch to come > > from an ARM person, so I'm fine with you making this NR_IRQS in the > > meantime. > > As we already know that PIRQ is not used on ARM, it would make sense to > use directly in this patch 0. Are you offering to give a tested-by if Jan posts such a patch? Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] have architectures specify the number of PIRQs a hardware domain gets
Hi, On 05/12/14 14:27, Ian Campbell wrote: > On Fri, 2014-12-05 at 13:51 +, Jan Beulich wrote: >> #define nr_static_irqs NR_IRQS >> +#define arch_hwdom_irqs(domid) NR_IRQS > > FWIW gic_number_lines() is the ARM equivalent of getting the number of > GSIs. > > *BUT* we don't actually use pirqs on ARM (everything goes via the > virtualised interrupt controller). So maybe we should be setting > nr_pirqs to 0 on ARM. I appreciate you likely want such a patch to come > from an ARM person, so I'm fine with you making this NR_IRQS in the > meantime. As we already know that PIRQ is not used on ARM, it would make sense to use directly in this patch 0. Regards, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] lock down hypercall continuation encoding masks
On 05/12/14 11:31, Jan Beulich wrote: > Andrew validly points out that even if these masks aren't a formal part > of the hypercall interface, we aren't free to change them: A guest > suspended for migration in the middle of a continuation would fail to > work if resumed on a hypervisor using a different value. Hence add > respective comments to their definitions. > > Additionally, to help future extensibility as well as in the spirit of > reducing undefined behavior as much as possible, refuse hypercalls made > with the respective bits non-zero when the respective sub-ops don't > make use of those bits. > > Reported-by: Andrew Cooper > Signed-off-by: Jan Beulich General principle looks good. A couple of issues. > --- a/xen/arch/x86/mm.c > +++ b/xen/arch/x86/mm.c > @@ -4661,9 +4661,8 @@ int xenmem_add_to_physmap_one( > long arch_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg) > { > int rc; > -int op = cmd & MEMOP_CMD_MASK; This needs a blanket start_iter check, as do_memory_op() has not done so. The ARM code also needs one, as the caller has applied partial checks. > > -switch ( op ) > +switch ( cmd ) > { > case XENMEM_set_memory_map: > { > --- a/xen/arch/x86/x86_64/mm.c > +++ b/xen/arch/x86/x86_64/mm.c > @@ -906,9 +906,8 @@ long subarch_memory_op(unsigned long cmd > xen_pfn_t mfn, last_mfn; > unsigned int i; > long rc = 0; > -int op = cmd & MEMOP_CMD_MASK; It is probably best to have a blanket check here even if arch_memory_op() has a check. It will reduce the chance of the check being missed if/when arch_memory_op() gains a presentable subop. 
> > -switch ( op ) > +switch ( cmd ) > { > case XENMEM_machphys_mfn_list: > if ( copy_from_guest(&xmml, arg, 1) ) > --- a/xen/common/memory.c > +++ b/xen/common/memory.c > @@ -977,6 +992,9 @@ long do_memory_op(unsigned long cmd, XEN > unsigned int dom_vnodes, dom_vranges, dom_vcpus; > struct vnuma_info tmp; > > +if ( unlikely(start_extent) ) > +return -ENOSYS; > + > /* > * Guest passes nr_vnodes, number of regions and nr_vcpus thus > * we know how much memory guest has allocated. XENMEM_get_vnumainfo needs a guard. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] have architectures specify the number of PIRQs a hardware domain gets
On Fri, 2014-12-05 at 13:51 +, Jan Beulich wrote: > #define nr_static_irqs NR_IRQS > +#define arch_hwdom_irqs(domid) NR_IRQS FWIW gic_number_lines() is the ARM equivalent of getting the number of GSIs. *BUT* we don't actually use pirqs on ARM (everything goes via the virtualised interrupt controller). So maybe we should be setting nr_pirqs to 0 on ARM. I appreciate you likely want such a patch to come from an ARM person, so I'm fine with you making this NR_IRQS in the meantime. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] xen/serial: setup UART idle mode for OMAP
Hi Oleksandr, On 05/12/14 13:46, Oleksandr Dmytryshyn wrote: > UART is not able to receive bytes when idle mode is not > configured properly. When we use Xen with old Linux > Kernel (for example 3.8) this kernel configures UART > idle mode even if the UART node in device tree is absent. I don't understand how the kernel can configure the UART as the MMIO range is not mapped. Is there another way to set the idle mode? Regards, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] PV DomU running linux 3.17.3 causing xen-netback fatal error in Dom0
On 05/12/14 12:48, Zoltan Kiss wrote: > Hi, > > Maybe I'm misreading it, but it seems to me that netfront doesn't slice > up the linear buffer at all, just blindly sends it. In xennet_start_xmit: This is handled in the beginning of xennet_make_frags() (which I would agree isn't the obvious place for it). David ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH 3/4] x86: allow dma_get_required_mask() to be overridden
Use dma_ops->get_required_mask() if provided, defaulting to dma_get_required_mask_from_max_pfn(). This is needed on systems (such as Xen PV guests) where the DMA address and the physical address are not equal. ARCH_HAS_DMA_GET_REQUIRED_MASK is defined in asm/device.h instead of asm/dma-mapping.h because linux/dma-mapping.h uses the define before including asm/dma-mapping.h Signed-off-by: David Vrabel Reviewed-by: Stefano Stabellini --- arch/x86/include/asm/device.h |2 ++ arch/x86/kernel/pci-dma.c |8 2 files changed, 10 insertions(+) diff --git a/arch/x86/include/asm/device.h b/arch/x86/include/asm/device.h index 03dd729..10bc628 100644 --- a/arch/x86/include/asm/device.h +++ b/arch/x86/include/asm/device.h @@ -13,4 +13,6 @@ struct dev_archdata { struct pdev_archdata { }; +#define ARCH_HAS_DMA_GET_REQUIRED_MASK + #endif /* _ASM_X86_DEVICE_H */ diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c index a25e202..5154400 100644 --- a/arch/x86/kernel/pci-dma.c +++ b/arch/x86/kernel/pci-dma.c @@ -140,6 +140,14 @@ void dma_generic_free_coherent(struct device *dev, size_t size, void *vaddr, free_pages((unsigned long)vaddr, get_order(size)); } +u64 dma_get_required_mask(struct device *dev) +{ + if (dma_ops->get_required_mask) + return dma_ops->get_required_mask(dev); + return dma_get_required_mask_from_max_pfn(dev); +} +EXPORT_SYMBOL_GPL(dma_get_required_mask); + /* * See for the iommu kernel * parameter documentation. -- 1.7.10.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCHv5 0/4] dma, x86, xen: reduce SWIOTLB usage in Xen guests
On systems where DMA addresses and physical addresses are not 1:1 (such as Xen PV guests), the generic dma_get_required_mask() will not return the correct mask (since it uses max_pfn). Some device drivers (such as mptsas, mpt2sas) use dma_get_required_mask() to set the device's DMA mask to allow them to use only 32-bit DMA addresses in hardware structures. This results in unnecessary use of the SWIOTLB if DMA addresses are more than 32-bits, impacting performance significantly. This series allows Xen PV guests to override the default dma_get_required_mask() with a more suitable one. Changes in v5: - xen_swiotlb_get_required_mask() is x86 only. Changes in v4: - Assume 64-bit mask is required. Changes in v3: - fix off-by-one in xen_dma_get_required_mask() - split ia64 changes into separate patch. Changes in v2: - split x86 and xen changes into separate patches David ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
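As a rough illustration of why a max_pfn-based mask is wrong for PV guests, the generic calculation can be sketched as follows. This is a simplified 64-bit rewrite for illustration, not the kernel's code: the function name, the fixed 4 KiB page size, and the bit-smearing approach are assumptions of the sketch, and it expects max_pfn to be at least 2.

```c
#include <stdint.h>

#define EX_PAGE_SHIFT 12 /* assumed 4 KiB pages */

/* Smallest mask of the form 2^n - 1 that covers the base address of the
 * highest page frame below max_pfn. */
static uint64_t ex_required_mask_from_max_pfn(uint64_t max_pfn)
{
    uint64_t mask = (max_pfn - 1) << EX_PAGE_SHIFT;

    /* Propagate the highest set bit into every lower bit position. */
    mask |= mask >> 1;
    mask |= mask >> 2;
    mask |= mask >> 4;
    mask |= mask >> 8;
    mask |= mask >> 16;
    mask |= mask >> 32;

    return mask;
}
```

In a PV guest, max_pfn counts pseudo-physical frames, so a guest with less than 4 GiB of memory would get a 32-bit mask from this calculation even when its pages are backed by machine frames above 4 GiB.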
[Xen-devel] [PATCH 2/4] ia64: use common dma_get_required_mask_from_pfn()
Signed-off-by: David Vrabel Reviewed-by: Stefano Stabellini Cc: Tony Luck Cc: Fenghua Yu Cc: linux-i...@vger.kernel.org --- arch/ia64/include/asm/machvec.h |2 +- arch/ia64/include/asm/machvec_init.h |1 - arch/ia64/pci/pci.c | 20 3 files changed, 1 insertion(+), 22 deletions(-) diff --git a/arch/ia64/include/asm/machvec.h b/arch/ia64/include/asm/machvec.h index 9c39bdf..beaa47d 100644 --- a/arch/ia64/include/asm/machvec.h +++ b/arch/ia64/include/asm/machvec.h @@ -287,7 +287,7 @@ extern struct dma_map_ops *dma_get_ops(struct device *); # define platform_dma_get_ops dma_get_ops #endif #ifndef platform_dma_get_required_mask -# define platform_dma_get_required_mask ia64_dma_get_required_mask +# define platform_dma_get_required_mask dma_get_required_mask_from_max_pfn #endif #ifndef platform_irq_to_vector # define platform_irq_to_vector__ia64_irq_to_vector diff --git a/arch/ia64/include/asm/machvec_init.h b/arch/ia64/include/asm/machvec_init.h index 37a4698..ef964b2 100644 --- a/arch/ia64/include/asm/machvec_init.h +++ b/arch/ia64/include/asm/machvec_init.h @@ -3,7 +3,6 @@ extern ia64_mv_send_ipi_t ia64_send_ipi; extern ia64_mv_global_tlb_purge_t ia64_global_tlb_purge; -extern ia64_mv_dma_get_required_mask ia64_dma_get_required_mask; extern ia64_mv_irq_to_vector __ia64_irq_to_vector; extern ia64_mv_local_vector_to_irq __ia64_local_vector_to_irq; extern ia64_mv_pci_get_legacy_mem_t ia64_pci_get_legacy_mem; diff --git a/arch/ia64/pci/pci.c b/arch/ia64/pci/pci.c index 291a582..79da21b 100644 --- a/arch/ia64/pci/pci.c +++ b/arch/ia64/pci/pci.c @@ -791,26 +791,6 @@ static void __init set_pci_dfl_cacheline_size(void) pci_dfl_cache_line_size = (1 << cci.pcci_line_size) / 4; } -u64 ia64_dma_get_required_mask(struct device *dev) -{ - u32 low_totalram = ((max_pfn - 1) << PAGE_SHIFT); - u32 high_totalram = ((max_pfn - 1) >> (32 - PAGE_SHIFT)); - u64 mask; - - if (!high_totalram) { - /* convert to mask just covering totalram */ - low_totalram = (1 << (fls(low_totalram) - 1)); - 
low_totalram += low_totalram - 1; - mask = low_totalram; - } else { - high_totalram = (1 << (fls(high_totalram) - 1)); - high_totalram += high_totalram - 1; - mask = (((u64)high_totalram) << 32) + 0x; - } - return mask; -} -EXPORT_SYMBOL_GPL(ia64_dma_get_required_mask); - u64 dma_get_required_mask(struct device *dev) { return platform_dma_get_required_mask(dev); -- 1.7.10.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH 4/4] x86/xen: assume a 64-bit DMA mask is required
On a Xen PV guest the DMA addresses and physical addresses are not 1:1 and the generic dma_get_required_mask() does not return the correct mask (since it uses max_pfn). Some device drivers (such as mptsas, mpt2sas) use dma_get_required_mask() to set the device's DMA mask to allow them to use only 32-bit DMA addresses in hardware structures. This results in unnecessary use of the SWIOTLB if DMA addresses are more than 32-bits, impacting performance significantly. We could base the DMA mask on the maximum MFN but: a) The hypercall op to get the maximum MFN (XENMEM_maximum_ram_page) will truncate the result to an int in 32-bit guests. b) Future uses of the IOMMU in Xen may map frames at bus addresses above the end of RAM. So, just assume a 64-bit DMA mask is always required. Signed-off-by: David Vrabel --- arch/x86/xen/pci-swiotlb-xen.c |6 ++ 1 file changed, 6 insertions(+) diff --git a/arch/x86/xen/pci-swiotlb-xen.c b/arch/x86/xen/pci-swiotlb-xen.c index 0e98e5d..35774f8 100644 --- a/arch/x86/xen/pci-swiotlb-xen.c +++ b/arch/x86/xen/pci-swiotlb-xen.c @@ -18,6 +18,11 @@ int xen_swiotlb __read_mostly; +static u64 xen_swiotlb_get_required_mask(struct device *dev) +{ + return DMA_BIT_MASK(64); +} + static struct dma_map_ops xen_swiotlb_dma_ops = { .mapping_error = xen_swiotlb_dma_mapping_error, .alloc = xen_swiotlb_alloc_coherent, @@ -31,6 +36,7 @@ static struct dma_map_ops xen_swiotlb_dma_ops = { .map_page = xen_swiotlb_map_page, .unmap_page = xen_swiotlb_unmap_page, .dma_supported = xen_swiotlb_dma_supported, + .get_required_mask = xen_swiotlb_get_required_mask, }; /* -- 1.7.10.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
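The driver-side effect of the required mask might be sketched like this. The helper is hypothetical and does not reproduce mptsas or mpt2sas code; EX_DMA_BIT_MASK is a local macro written for the example, modelled on the way the kernel's DMA_BIT_MASK() special-cases 64 bits.

```c
#include <stdint.h>

/* Local illustration macro: n low bits set, with 64 handled separately
 * to avoid an out-of-range shift. */
#define EX_DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Hypothetical driver decision: use compact 32-bit address fields only
 * when the platform reports that 32 bits are enough. */
static unsigned int ex_choose_dma_bits(uint64_t required_mask)
{
    return required_mask > EX_DMA_BIT_MASK(32) ? 64 : 32;
}
```

With the patch, a Xen PV guest always reports a 64-bit required mask, so a driver following this pattern would pick 64-bit addressing and avoid bouncing through the SWIOTLB.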
[Xen-devel] [PATCH 1/4] dma: add dma_get_required_mask_from_max_pfn()
A generic dma_get_required_mask() is useful even for architectures (such as ia64) that define ARCH_HAS_DMA_GET_REQUIRED_MASK. Signed-off-by: David Vrabel Reviewed-by: Stefano Stabellini --- drivers/base/platform.c | 10 -- include/linux/dma-mapping.h |1 + 2 files changed, 9 insertions(+), 2 deletions(-) diff --git a/drivers/base/platform.c b/drivers/base/platform.c index b2afc29..f9f3930 100644 --- a/drivers/base/platform.c +++ b/drivers/base/platform.c @@ -1009,8 +1009,7 @@ int __init platform_bus_init(void) return error; } -#ifndef ARCH_HAS_DMA_GET_REQUIRED_MASK -u64 dma_get_required_mask(struct device *dev) +u64 dma_get_required_mask_from_max_pfn(struct device *dev) { u32 low_totalram = ((max_pfn - 1) << PAGE_SHIFT); u32 high_totalram = ((max_pfn - 1) >> (32 - PAGE_SHIFT)); @@ -1028,6 +1027,13 @@ u64 dma_get_required_mask(struct device *dev) } return mask; } +EXPORT_SYMBOL_GPL(dma_get_required_mask_from_max_pfn); + +#ifndef ARCH_HAS_DMA_GET_REQUIRED_MASK +u64 dma_get_required_mask(struct device *dev) +{ + return dma_get_required_mask_from_max_pfn(dev); +} EXPORT_SYMBOL_GPL(dma_get_required_mask); #endif diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h index d5d3881..6e2fdfc 100644 --- a/include/linux/dma-mapping.h +++ b/include/linux/dma-mapping.h @@ -127,6 +127,7 @@ static inline int dma_coerce_mask_and_coherent(struct device *dev, u64 mask) return dma_set_mask_and_coherent(dev, mask); } +extern u64 dma_get_required_mask_from_max_pfn(struct device *dev); extern u64 dma_get_required_mask(struct device *dev); #ifndef set_arch_dma_coherent_ops -- 1.7.10.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH] VMX: don't allow PVH to reach handle_pio() or handle_mmio()
PVH guests are not supposed to access I/O ports they weren't given access to (there's nothing to handle emulation of such accesses). Reported-by: Roger Pau Monné Signed-off-by: Jan Beulich --- Note: Only compile tested so far. --- a/xen/arch/x86/hvm/vmx/vmx.c +++ b/xen/arch/x86/hvm/vmx/vmx.c @@ -3082,6 +3082,9 @@ void vmx_vmexit_handler(struct cpu_user_ } case EXIT_REASON_IO_INSTRUCTION: +if ( unlikely(is_pvh_vcpu(v)) ) +goto exit_and_crash; + __vmread(EXIT_QUALIFICATION, &exit_qualification); if ( exit_qualification & 0x10 ) { ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC V8 2/3] libxl domain snapshot API design
On Tue, 2014-12-02 at 23:14 -0700, Chun Yan Liu wrote: > > >>> On 11/28/2014 at 11:43 PM, in message > >>> <1417189409.23604.62.ca...@citrix.com>, > Ian Campbell wrote: > > On Tue, 2014-11-25 at 02:08 -0700, Chun Yan Liu wrote: > > > Hi, Ian, > > > > > > According to previous discussion, snapshot delete and revert are > > > inclined to be done by high level application itself, won't supply a > > > libxl API. > > > > I thought you had explained a scenario where the toolstack needed to be > > at least aware of delete, specifically when you are deleting a snapshot > > from the middle of an active chain. > > The reason why I post such an overview here before sending next > version is: I'm puzzled about what should be in libxl and what > in toolstack after previous discussion. So posted here to seek > some ideas or agreement first. It's not a full design, not break > down to libxl and toolstack yet. I guess I thought we had gotten closer to this than we actually have. > > Maybe that's not "snapshot delete API in libxl" though, but rather a > > notification API which the toolstack can use to tell libxl something is > > going on. > > About notification API, after looking at lvm, vhd-util and qcow2, > I don't think we need it. No extra work needs to do to handle > disk snapshot chain. > lvm: doesn't support snapshot of snapshot. > vhd-util: backing file chain, external snapshot. Don't need to > delete the disk snapshot when deleting domain snapshot. > qcow2: > * internal disk snapshot: each snapshot increases the refcount > of data, deleting snapshot only decrease the refcount, won't > affect other snapshots. > * external disk snapshot: same as vhd-util, backing file chain. > Don't need to delete disk snapshot when deleting domain snapshot. You don't need to, but might a toolstack (or user) want to consolidate anyway, e.g. to reduce chain length? (which might otherwise be overly long.) 
> > I don't believe xl can take a disk snapshot of an active disk, it > > doesn't have the machinery to deal with that sort of thing, nor should > > it, this is exactly the sort of thing which libxl is provided to deal > > with. > > Like delete a disk snapshot, xl can call external command to do that > (e.g. qemu-img). But it's better to call qmp to do that. The toolstack (xl or libvirt) doesn't have direct access to qmp, it would have to go via a libxl API, for an Active domain at least. qemu-img is the right answer for an Inactive domain. Secondly, the disk snapshot has to happen while the domain is paused/quiesced for consistency. This happens deep in the bowels of the libxl save/restore code. So either libxl has to do the disk snapshots at the same time or we need a callback to the toolstack in order for it to make the snapshots. > Anyway, if for domain snapshot create, we should put creating disk > snapshot process in libxl, then for domain snapshot delete, we > should put deleting disk snapshot process in libxl. That is, in libxl > there should be: > libxl_disk_snapshot_create (which handles creating disk snapshot) > libxl_disk_snapshot_delete (which handles deleting disk snapshot) > > Otherwise I would think it's weird to have in libxl: > libxl_domain_snapshot_create (wrap saving memory [already has API] > and creating disk snapshot) > libxl_disk_snapshot_delete (deleting disk snapshot) The create and delete cases are subtly different, so it may be that the API ends up asymmetric. The create mechanism (whichever one it is) operates on a single Active domain and is reasonably well defined. The delete operation however can potentially operate on multiple Active domains, e.g. 2 domains are running with a common ancestor snapshot which is being removed. How would the delete interface deal with this case? In particular without libxl becoming involved in "storage management". 
The reason I'm thinking of a "delete notify" style interface for Active domains is that it then applies to a single Active domain at a time. If multiple domains are affected by a snapshot deletion then the notification is called multiple times. > And about the snapshot json file store and retrieve, using > gentype.py to autogenerate xx_to_json and xx_from_json functions > is very convenient, there would be a group of functions > set/get/update/delete_snapshot_metadata based on that. > But I didn't see other such usage in xl, and it's not proper to > place in libxl. Anywhere could it be placed but used by xl? > Wei might have some ideas about this? xl hasn't needed to use the autogeneration infrastructure to date, but there's no reason why it couldn't do so if there was a need. Just create xl_types.idl and hook it into the Makefile. It would be harder to extend this to other toolstacks, but I suspect we don't need to. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH] have architectures specify the number of PIRQs a hardware domain gets
The current value of nr_static_irqs + 256 is often too small for larger systems. Make it dependent on CPU count and number of IO-APIC pins on x86, and (until it obtains PCI support) simply NR_IRQS on ARM. Signed-off-by: Jan Beulich --- This is meant to be an alternative proposal to David's: http://lists.xenproject.org/archives/html/xen-devel/2014-12/msg00421.html --- a/docs/misc/xen-command-line.markdown +++ b/docs/misc/xen-command-line.markdown @@ -596,13 +596,14 @@ Force or disable use of EFI runtime serv ### extra\_guest\_irqs > `= [][,]` -> Default: `32,256` +> Default: `32,` Change the number of PIRQs available for guests. The optional first number is common for all domUs, while the optional second number (preceded by a comma) is for dom0. Changing the setting for domU has no impact on dom0 and vice versa. For example to change dom0 without changing domU, use -`extra_guest_irqs=,512` +`extra_guest_irqs=,512`. The default value for Dom0 and an eventual separate +hardware domain is architecture dependent. 
### flask\_enabled > `= ` --- a/xen/arch/x86/domain_build.c +++ b/xen/arch/x86/domain_build.c @@ -101,7 +101,7 @@ static void __init parse_dom0_max_vcpus( } custom_param("dom0_max_vcpus", parse_dom0_max_vcpus); -struct vcpu *__init alloc_dom0_vcpu0(struct domain *dom0) +unsigned int __init dom0_max_vcpus(void) { unsigned max_vcpus; @@ -113,6 +113,13 @@ struct vcpu *__init alloc_dom0_vcpu0(str if ( max_vcpus > MAX_VIRT_CPUS ) max_vcpus = MAX_VIRT_CPUS; +return max_vcpus; +} + +struct vcpu *__init alloc_dom0_vcpu0(struct domain *dom0) +{ +unsigned int max_vcpus = dom0_max_vcpus(); + dom0->vcpu = xzalloc_array(struct vcpu *, max_vcpus); if ( !dom0->vcpu ) return NULL; --- a/xen/arch/x86/io_apic.c +++ b/xen/arch/x86/io_apic.c @@ -33,6 +33,7 @@ #include #include #include +#include #include #include #include @@ -2606,3 +2607,14 @@ void __init init_ioapic_mappings(void) nr_irqs_gsi, nr_irqs - nr_irqs_gsi); } +unsigned int arch_hwdom_irqs(domid_t domid) +{ +unsigned int n = fls(num_present_cpus()); + +if ( !domid ) +n = min(n, dom0_max_vcpus()); +n = min(nr_irqs_gsi + n * NR_DYNAMIC_VECTORS, nr_irqs); +printk("Dom%d has maximum %u PIRQs\n", domid, n); + +return n; +} --- a/xen/common/domain.c +++ b/xen/common/domain.c @@ -231,14 +231,14 @@ static int late_hwdom_init(struct domain #endif } -static unsigned int __read_mostly extra_dom0_irqs = 256; +static unsigned int __read_mostly extra_hwdom_irqs; static unsigned int __read_mostly extra_domU_irqs = 32; static void __init parse_extra_guest_irqs(const char *s) { if ( isdigit(*s) ) extra_domU_irqs = simple_strtoul(s, &s, 0); if ( *s == ',' && isdigit(*++s) ) -extra_dom0_irqs = simple_strtoul(s, &s, 0); +extra_hwdom_irqs = simple_strtoul(s, &s, 0); } custom_param("extra_guest_irqs", parse_extra_guest_irqs); @@ -326,7 +326,8 @@ struct domain *domain_create( if ( !is_hardware_domain(d) ) d->nr_pirqs = nr_static_irqs + extra_domU_irqs; else -d->nr_pirqs = nr_static_irqs + extra_dom0_irqs; +d->nr_pirqs = extra_hwdom_irqs ? 
nr_static_irqs + extra_hwdom_irqs + : arch_hwdom_irqs(domid); if ( d->nr_pirqs > nr_irqs ) d->nr_pirqs = nr_irqs; --- a/xen/include/asm-arm/irq.h +++ b/xen/include/asm-arm/irq.h @@ -21,10 +21,10 @@ struct arch_irq_desc { #define NR_LOCAL_IRQS 32 #define NR_IRQS1024 -#define nr_irqs NR_IRQS #define nr_irqs NR_IRQS #define nr_static_irqs NR_IRQS +#define arch_hwdom_irqs(domid) NR_IRQS struct irq_desc; struct irqaction; --- a/xen/include/asm-x86/setup.h +++ b/xen/include/asm-x86/setup.h @@ -35,6 +35,8 @@ int construct_dom0( unsigned long initial_images_nrpages(void); void discard_initial_images(void); +unsigned int dom0_max_vcpus(void); + int xen_in_range(unsigned long mfn); void microcode_grab_module( --- a/xen/include/xen/irq.h +++ b/xen/include/xen/irq.h @@ -168,4 +168,8 @@ static inline void set_native_irq_info(u unsigned int set_desc_affinity(struct irq_desc *, const cpumask_t *); +#ifndef arch_hwdom_irqs +unsigned int arch_hwdom_irqs(domid_t); +#endif + #endif /* __XEN_IRQ_H__ */
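To make the x86 default in the patch above easier to reason about, here is a minimal stand-alone sketch of the arch_hwdom_irqs() arithmetic. It is a model, not the hypervisor code: the function name hwdom_irqs_model and the helpers fls_u/min_u are hypothetical, and every value that Xen takes from globals or macros (present CPU count, dom0 vCPU limit, nr_irqs_gsi, NR_DYNAMIC_VECTORS, nr_irqs) is passed in as a plain parameter here.

```c
/* Find last set bit, 1-based; returns 0 for x == 0 (the usual fls() convention). */
static unsigned int fls_u(unsigned int x)
{
    unsigned int r = 0;

    while (x) {
        r++;
        x >>= 1;
    }
    return r;
}

static unsigned int min_u(unsigned int a, unsigned int b)
{
    return a < b ? a : b;
}

/*
 * Hypothetical model of the x86 arch_hwdom_irqs() from the patch.
 * All inputs are explicit parameters; in Xen they are globals or macros.
 */
unsigned int hwdom_irqs_model(unsigned int domid,
                              unsigned int present_cpus,
                              unsigned int dom0_vcpus,
                              unsigned int nr_irqs_gsi,
                              unsigned int nr_dynamic_vectors,
                              unsigned int nr_irqs)
{
    /* Scale with the position of the highest set bit of the CPU count. */
    unsigned int n = fls_u(present_cpus);

    /* For Dom0 (domid 0) the factor is additionally bounded by its vCPU limit. */
    if (domid == 0)
        n = min_u(n, dom0_vcpus);

    /* GSIs plus n blocks of dynamic vectors, capped at the global IRQ count. */
    return min_u(nr_irqs_gsi + n * nr_dynamic_vectors, nr_irqs);
}
```

With illustrative inputs of 64 present CPUs, 8 dom0 vCPUs, 48 GSIs, 192 dynamic vectors and a global limit of 4096, the model gives 48 + 7 * 192 = 1392. Because of the fls() step, the result grows with the logarithm of the CPU count rather than linearly.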
[Xen-devel] [PATCH] xen/serial: setup UART idle mode for OMAP
UART is not able to receive bytes when idle mode is not configured properly. When we use Xen with old Linux Kernel (for example 3.8) this kernel configures UART idle mode even if the UART node in device tree is absent. So UART works normally in this case. But new Linux Kernel (3.12 and upper) doesn't configure idle mode for UART and UART can not work normally in this case. Signed-off-by: Oleksandr Dmytryshyn --- xen/drivers/char/omap-uart.c | 3 +++ xen/include/xen/8250-uart.h | 4 2 files changed, 7 insertions(+) diff --git a/xen/drivers/char/omap-uart.c b/xen/drivers/char/omap-uart.c index a798b8d..16d1454 100644 --- a/xen/drivers/char/omap-uart.c +++ b/xen/drivers/char/omap-uart.c @@ -195,6 +195,9 @@ static void __init omap_uart_init_preirq(struct serial_port *port) omap_write(uart, UART_MCR, UART_MCR_DTR|UART_MCR_RTS); omap_write(uart, UART_OMAP_MDR1, UART_OMAP_MDR1_16X_MODE); + +/* setup iddle mode */ +omap_write(uart, UART_SYSC, OMAP_UART_SYSC_DEF_CONF); } static void __init omap_uart_init_postirq(struct serial_port *port) diff --git a/xen/include/xen/8250-uart.h b/xen/include/xen/8250-uart.h index a682bae..304b9dd 100644 --- a/xen/include/xen/8250-uart.h +++ b/xen/include/xen/8250-uart.h @@ -32,6 +32,7 @@ #define UART_MCR 0x04/* Modem control*/ #define UART_LSR 0x05/* line status */ #define UART_MSR 0x06/* Modem status */ +#define UART_SYSC 0x15/* System configuration register */ #define UART_USR 0x1f/* Status register (DW) */ #define UART_DLL 0x00/* divisor latch (ls) (DLAB=1) */ #define UART_DLM 0x01/* divisor latch (ms) (DLAB=1) */ @@ -145,6 +146,9 @@ /* SCR register bitmasks */ #define OMAP_UART_SCR_RX_TRIG_GRANU1_MASK (1 << 7) +/* System configuration register */ +#define OMAP_UART_SYSC_DEF_CONF 0x0d /* autoidle mode, wakeup is enabled */ + #endif /* __XEN_8250_UART_H__ */ /* -- 1.9.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
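The patch above writes the single constant 0x0d to UART_SYSC and describes it only as "autoidle mode, wakeup is enabled". The sketch below decodes that value into fields. The bit layout used here (AUTOIDLE in bit 0, SOFTRESET in bit 1, ENAWAKEUP in bit 2, IDLEMODE in bits 4:3) is an assumption that is not stated in the patch and should be checked against the OMAP technical reference manual; the SYSC_* names, struct sysc_fields and sysc_decode are hypothetical.

```c
/* Assumed SYSC field layout; NOT taken from the patch, verify against the TRM. */
#define SYSC_AUTOIDLE        (1u << 0)
#define SYSC_SOFTRESET       (1u << 1)
#define SYSC_ENAWAKEUP       (1u << 2)
#define SYSC_IDLEMODE_SHIFT  3
#define SYSC_IDLEMODE_MASK   (3u << SYSC_IDLEMODE_SHIFT)

/* Hypothetical container for the decoded fields. */
struct sysc_fields {
    unsigned int autoidle;   /* 1 if the AUTOIDLE bit is set */
    unsigned int softreset;  /* 1 if the SOFTRESET bit is set */
    unsigned int enawakeup;  /* 1 if the ENAWAKEUP bit is set */
    unsigned int idlemode;   /* raw 2-bit IDLEMODE field */
};

/* Split a raw SYSC value into its fields under the assumed layout. */
struct sysc_fields sysc_decode(unsigned int v)
{
    struct sysc_fields f;

    f.autoidle  = (v & SYSC_AUTOIDLE) ? 1 : 0;
    f.softreset = (v & SYSC_SOFTRESET) ? 1 : 0;
    f.enawakeup = (v & SYSC_ENAWAKEUP) ? 1 : 0;
    f.idlemode  = (v & SYSC_IDLEMODE_MASK) >> SYSC_IDLEMODE_SHIFT;
    return f;
}
```

Under that assumed layout, sysc_decode(0x0d) reports AUTOIDLE and ENAWAKEUP set, SOFTRESET clear and an IDLEMODE field of 1, which matches the "autoidle, wakeup enabled" wording of the patch comment.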
Re: [Xen-devel] [PATCH 5/5] tools/hotplug: support XENSTORED_TRACE in systemd
On Fri, Dec 05, Ian Jackson wrote: > Can systemd not launch these daemons by running the existing > xencommons et al init scripts ? Obviously that won't give you all of > systemd's shiny features but IMO it ought to work. I think the point was to let systemd pass the file descriptors. That's why the service file does the "exec xenstored". Olaf ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 1/5] tools/hotplug: move XENSTORED_MOUNT_CTX to sysconfig.xencommons
On Fri, Dec 05, Ian Jackson wrote: > Olaf Hering writes ("Re: [PATCH 1/5] tools/hotplug: move XENSTORED_MOUNT_CTX > to sysconfig.xencommons"): > > On Fri, Dec 05, Ian Jackson wrote: > > > This patch looks like just the hook. It seems to be missing the part > > > where the actual selinux context is defined and plumbed through. > > > > The context in xen source is "none". As asked in the cover letter (which > > unfortunately got send to just Konrad and xen-devel, no idea how to fix > > that) a configure --with-something may be the way to inject it into the > > sources, if required. > > I confess I don't know very much about selinux, but shouldn't we be > providing a reasonable default policy, rather than leaving it to the > distro or user to pass special options to configure ? Or are things > in the selinux world so fragmented or fast-moving that such a generic > policy couldn't be written ? I know nothing about SELinux. Not sure why a context= is required anyway. But I can find out next week if no one else has an idea how to deal with SELinux. Olaf ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 5/5] tools/hotplug: support XENSTORED_TRACE in systemd
Olaf Hering writes ("Re: [PATCH 5/5] tools/hotplug: support XENSTORED_TRACE in systemd"): > On Fri, Dec 05, Ian Jackson wrote: > > I think the only way to make this work properly is to factor the > > necessary parts out of init.d/xencommons into a new script which can > > be used by both xencommons and systemd. I'm not sure such a patch > > would be appropriate for 4.5 at this stage. > > Yes, a helper script to launch just xenstored would help. But which part > would do the final "exec"? Perhaps the sysv script has to fork a shell > like its done above. I will have a look at this. If there's no other way to do it, you could have the helper script take an argument (or a named (environment) parameter) to discover whether to call exec. > Are you opposed to the idea to support XENSTORED_TRACE for systemd right > in 4.5.0? Ideally I would like to support XENSTORED_TRACE for systemd in 4.5.0. But I do not want to duplicate the functionality at all. systemd seems to make it difficult to support XENSTORED_TRACE without either duplicating functionality or refactoring the existing init.d script. (Indeed the very fact that XENSTORED_TRACE does not work with systemd right now is due to the systemd startup of xenstored being decoupled from the init script code which handles XENSTORED_TRACE. I seem to remember making some comments about this kind of thing at the time...) And I am currently unconvinced that refactoring things at this stage of the 4.5 release is appropriate. But others may have a different view. Can systemd not launch these daemons by running the existing xencommons et al init scripts ? Obviously that won't give you all of systemd's shiny features but IMO it ought to work. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] PV DomU running linux 3.17.3 causing xen-netback fatal error in Dom0
Hi, Maybe I'm misreading it, but it seems to me that netfront doesn't slice up the linear buffer at all, just blindly sends it. In xennet_start_xmit: unsigned int offset = offset_in_page(data); unsigned int len = skb_headlen(skb); ... tx->offset = offset; tx->size = len; Although in the slot counting it calculates it correctly: DIV_ROUND_UP(offset + len, PAGE_SIZE) Am I missing something? Zoli On 04/12/14 15:53, David Vrabel wrote: On 04/12/14 15:36, Anthony Wright wrote: On 01/12/14 14:22, David Vrabel wrote: This VIF protocol is weird. The first slot contains a txreq with a size for the total length of the packet, subsequent slots have sizes for that fragment only. netback then has to calculate how long the first slot is, by subtracting all the size from the following slots. So something has gone wrong but it's not obvious what it is. Any chance you can dump the ring state when it happens? We think we've worked out how to dump the ring state, please see below. We need the full contents of the ring which isn't currently available via debugfs and I haven't had time to put together a debug patch to make it available. David ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
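The slot-counting expression quoted in the message above, DIV_ROUND_UP(offset + len, PAGE_SIZE), can be tried out in isolation. The following is a minimal sketch and not netfront code: the helper names linear_slots and crosses_page are hypothetical, and a 4096-byte page is assumed (named MODEL_PAGE_SIZE here to avoid clashing with any system definition).

```c
/* Assumed page size for this model only. */
#define MODEL_PAGE_SIZE 4096u

/* Integer division rounding up, as in the expression quoted from netfront. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Number of page-sized slots spanned by a linear buffer of `len` bytes
 * that starts `offset` bytes into its first page.
 */
unsigned int linear_slots(unsigned int offset, unsigned int len)
{
    return DIV_ROUND_UP(offset + len, MODEL_PAGE_SIZE);
}

/*
 * Non-zero if a single request described by (offset, len) would extend
 * past the end of the page it starts in.
 */
int crosses_page(unsigned int offset, unsigned int len)
{
    return offset + len > MODEL_PAGE_SIZE;
}
```

For example, a 200-byte linear area starting at offset 4000 spans two slots and crosses the page boundary, which is the situation the message is asking about: the slot count accounts for two pages while a single request carrying offset 4000 and size 200 describes a range that runs past the end of the first page.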
Re: [Xen-devel] [PATCH 1/5] tools/hotplug: move XENSTORED_MOUNT_CTX to sysconfig.xencommons
Olaf Hering writes ("Re: [PATCH 1/5] tools/hotplug: move XENSTORED_MOUNT_CTX to sysconfig.xencommons"): > On Fri, Dec 05, Ian Jackson wrote: > > This patch looks like just the hook. It seems to be missing the part > > where the actual selinux context is defined and plumbed through. > > The context in xen source is "none". As asked in the cover letter (which > unfortunately got send to just Konrad and xen-devel, no idea how to fix > that) a configure --with-something may be the way to inject it into the > sources, if required. I confess I don't know very much about selinux, but shouldn't we be providing a reasonable default policy, rather than leaving it to the distro or user to pass special options to configure ? Or are things in the selinux world so fragmented or fast-moving that such a generic policy couldn't be written ? > > > There is no need to require the creation of a new sysconfig file, just > > > reuse the existing /etc/sysconfig/xencommons file. > > > > This seems to be an unrelated change ? If not I confess I don't see > > the connection. > > The context has to be defined somewhere. And that place is > sysconfig/xencommons. Oh, I see. I think you should do this change as a pre-patch, along with the abolition of /etc/{default,sysconfig}/{xenconsoled,xenstored} Your patch 2/5 involving xenconsoled has a mixture of code motion and other semantic changes, which makes it hard to review. > > And won't this break existing systems which have an > > /etc/{default,sysconfig}/xenstored ? > > Which systems would that be? That file is new in 4.5. Oh, good. In that case we should abolish these ASAP - before 4.5. Thanks, Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Poor network performance between DomU with multiqueue support
On Fri, Dec 05, 2014 at 01:17:16AM +, Zhangleiqiang (Trump) wrote: [...] > > I think that's expected, because guest RX data path still uses grant_copy > > while > > guest TX uses grant_map to do zero-copy transmit. > > As far as I know, there are three main grant-related operations used in split > device model: grant mapping, grant transfer and grant copy. > Grant transfer has not used now, and grant mapping and grant transfer both > involve "TLB" refresh work for hypervisor, am I right? Or only grant > transfer has this overhead? Transfer is not used so I can't tell. Grant unmap causes TLB flush. I saw in an email the other day XenServer folks has some planned improvement to avoid TLB flush in Xen to upstream in 4.6 window. I can't speak for sure it will get upstreamed as I don't work on that. > Does grant copy surely has more overhead than grant mapping? > At the very least the zero-copy TX path is faster than previous copying path. But speaking of the micro operation I'm not sure. There was once persistent map prototype netback / netfront that establishes a memory pool between FE and BE then use memcpy to copy data. Unfortunately that prototype was not done right so the result was not good. > >From the code, I see that in TX, netback will do gnttab_batch_copy as well > >as gnttab_map_refs: > > //netback.c:xenvif_tx_action > xenvif_tx_build_gops(queue, budget, &nr_cops, &nr_mops); > > if (nr_cops == 0) > return 0; > > gnttab_batch_copy(queue->tx_copy_ops, nr_cops); > if (nr_mops != 0) { > ret = gnttab_map_refs(queue->tx_map_ops, > NULL, > queue->pages_to_map, > nr_mops); > BUG_ON(ret); > } > > The copy is for the packet header. Mapping is for packet data. We need to copy header from guest so that it doesn't change under netback's feet. Wei. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 1/5] tools/hotplug: move XENSTORED_MOUNT_CTX to sysconfig.xencommons
On Fri, Dec 05, Olaf Hering wrote: > On Fri, Dec 05, Ian Jackson wrote: > > And won't this break existing systems which have an > > /etc/{default,sysconfig}/xenstored ? > Which systems would that be? That file is new in 4.5. ... Not the file itself but the usage of a to-be-created file ... Olaf ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel