[Xen-devel] [PATCH v3 14/17] SUPPORT.md: Add statement on PCI passthrough

2017-11-22 Thread George Dunlap
Signed-off-by: George Dunlap 
---
Changes since v2:
- Separate PV and HVM passthrough (excluding PVH by implication)
- + not compatible with PoD
- 'will be' -> 'are'

NB that we don't seem to have the referenced file yet; left as a reference.

CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Konrad Wilk 
CC: Tim Deegan 
CC: Rich Persaud 
CC: Marek Marczykowski-Górecki 
CC: Christopher Clark 
CC: James McKenzie 
---
 SUPPORT.md | 36 +++-
 1 file changed, 35 insertions(+), 1 deletion(-)

diff --git a/SUPPORT.md b/SUPPORT.md
index 63f6a6d127..c8fec4daa8 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -486,9 +486,23 @@ but has no xl support.
 
 ## Security
 
+### Driver Domains
+
+Status: Supported, with caveats
+
+"Driver domains" means allowing non-Domain 0 domains
+with access to physical devices to act as back-ends.
+
+See the appropriate "Device Passthrough" section
+for more information about security support.
+
 ### Device Model Stub Domains
 
-Status: Supported
+Status: Supported, with caveats
+
+Vulnerabilities of a device model stub domain
+to a hostile driver domain (either compromised or untrusted)
+are excluded from security support.
 
 ### KCONFIG Expert
 
@@ -559,6 +573,26 @@ Virtual Performance Management Unit for HVM guests
 Disabled by default (enable with hypervisor command line option).
This feature is not security supported: see http://xenbits.xen.org/xsa/advisory-163.html
 
+### x86/PCI Device Passthrough
+
+Status, x86 PV: Supported, with caveats
+Status, x86 HVM: Supported, with caveats
+
+Only systems using IOMMUs are supported.
+
+Not compatible with migration, populate-on-demand, altp2m,
+introspection, memory sharing, or memory paging.
+
+Because of hardware limitations
+(affecting any operating system or hypervisor),
+it is generally not safe to use this feature
+to expose a physical device to completely untrusted guests.
+However, this feature can still confer significant security benefit
+when used to remove drivers and backends from domain 0
+(i.e., Driver Domains).
+
+XXX See docs/PCI-IOMMU-bugs.txt for more information.
+
 ### ARM/Non-PCI device passthrough
 
 Status: Supported, not security supported
-- 
2.15.0




[Xen-devel] [PATCH v3 17/17] SUPPORT.md: Miscellaneous additions

2017-11-22 Thread George Dunlap
Mostly as a placeholder for things not yet considered

Signed-off-by: George Dunlap 
---
 SUPPORT.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/SUPPORT.md b/SUPPORT.md
index 72be1414a1..08f3a808be 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -132,6 +132,8 @@ Fully virtualised guest using hardware virtualisation extensions
 
 Requires hardware virtualisation support (Intel VMX / AMD SVM)
 
+XXX Figure out if we need to add qemu-trad / qemu-upstream to this mix
+
 ### x86/PVH guest
 
 Status: Supported
-- 
2.15.0




[Xen-devel] [PATCH v3 13/17] SUPPORT.md: Add secondary memory management features

2017-11-22 Thread George Dunlap
Signed-off-by: George Dunlap 
Acked-by: Jan Beulich 
---
Changes since v2:
- Add PoD entry
- memsharing x86 -> experimental, ARM -> {}

CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Konrad Wilk 
CC: Tim Deegan 
CC: Tamas K Lengyel 
---
 SUPPORT.md | 37 +
 1 file changed, 37 insertions(+)

diff --git a/SUPPORT.md b/SUPPORT.md
index 2d4386ad68..63f6a6d127 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -205,6 +205,43 @@ Export hypervisor coverage data suitable for analysis by gcov or lcov.
 Allows a guest to add or remove memory after boot-time.
 This is typically done by a guest kernel agent known as a "balloon driver".
 
+### Populate-on-demand memory
+
+Status, x86 HVM: Supported
+
+This is a mechanism that allows normal operating systems with only a balloon driver
+to boot with memory < maxmem.
+
+### Memory Sharing
+
+Status, x86 HVM: Experimental
+
+Allow sharing of identical pages between guests
+
+### Memory Paging
+
+Status, x86 HVM: Experimental
+
+Allow pages belonging to guests to be paged to disk
+
+### Transcendent Memory
+
+Status: Experimental
+
+Transcendent Memory (tmem) allows the creation of hypervisor memory pools
+which guests can use to store memory
+rather than caching in their own memory or swapping to disk.
+Having these in the hypervisor
+can allow more efficient aggregate use of memory across VMs.
+
+### Alternative p2m
+
+Status, x86 HVM: Tech Preview
+Status, ARM: Tech Preview
+
+Allows external monitoring of hypervisor memory
+by maintaining multiple physical to machine (p2m) memory mappings.
+
 ## Resource Management
 
 ### CPU Pools
-- 
2.15.0




[Xen-devel] [PATCH v3 11/17] SUPPORT.md: Add 'easy' HA / FT features

2017-11-22 Thread George Dunlap
Migration being one of the key 'non-easy' ones to be added later.

Signed-off-by: George Dunlap 
Acked-by: Jan Beulich 
---
Changes since v2:
- Capitalization error

CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Konrad Wilk 
CC: Tim Deegan 
---
 SUPPORT.md | 16 
 1 file changed, 16 insertions(+)

diff --git a/SUPPORT.md b/SUPPORT.md
index ee069f8499..cc8b754749 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -285,6 +285,22 @@ which add paravirtualized functionality to HVM guests
 for improved performance and scalability.
 This includes exposing event channels to HVM guests.
 
+## High Availability and Fault Tolerance
+
+### Remus Fault Tolerance
+
+Status: Experimental
+
+### COLO Manager
+
+Status: Experimental
+
+### x86/vMCE
+
+Status: Supported
+
+Forward Machine Check Exceptions to appropriate guests
+
 ## Virtual driver support, guest side
 
 ### Blkfront
-- 
2.15.0




[Xen-devel] [PATCH v3 16/17] SUPPORT.md: Add limits RFC

2017-11-22 Thread George Dunlap
Signed-off-by: George Dunlap 
---
Changes since v2:
- Update memory limits for PV guests

CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Konrad Wilk 
CC: Tim Deegan 
---
 SUPPORT.md | 68 +-
 1 file changed, 67 insertions(+), 1 deletion(-)

diff --git a/SUPPORT.md b/SUPPORT.md
index aa58fb0de3..72be1414a1 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -62,6 +62,58 @@ for the definitions of the support status levels etc.
 
 Extension to the GICv3 interrupt controller to support MSI.
 
+## Limits/Host
+
+### CPUs
+
+Limit, x86: 4095
+Limit, ARM32: 8
+Limit, ARM64: 128
+
+Note that for x86, a very large number of cpus may not work/boot,
+but we will still provide security support.
+
+### RAM
+
+Limit, x86: 123TiB
+Limit, ARM32: 16GiB
+Limit, ARM64: 5TiB
+
+## Limits/Guest
+
+### Virtual CPUs
+
+Limit, x86 PV: 8192
+Limit-security, x86 PV: 32
+Limit, x86 HVM: 128
+Limit-security, x86 HVM: 32
+Limit, ARM32: 8
+Limit, ARM64: 128
+
+### Virtual RAM
+
+Limit-security, x86 PV 64-bit: 2047GiB
+Limit-security, x86 PV 32-bit: 168GiB (see below)
+Limit-security, x86 HVM: 1.5TiB
+Limit, ARM32: 16GiB
+Limit, ARM64: 1TiB
+
+Note that there are no theoretical limits to 64-bit PV or HVM guest sizes
+other than those determined by the processor architecture.
+
+All 32-bit PV guest memory must be under 168GiB;
+this means the total memory for all 32-bit PV guests cannot exceed 168GiB.
+On larger hosts, this limit is 128GiB.
+
+### Event Channel 2-level ABI
+
+Limit, 32-bit: 1024
+Limit, 64-bit: 4096
+
+### Event Channel FIFO ABI
+
+Limit: 131072
+
 ## Guest Type
 
 ### x86/PV
@@ -634,7 +686,7 @@ that covers the DMA of the device to be passed through.
 
 Status: Supported, with caveats
 
-No support for QEMU backends in a 16K or 64K domain.
+No support for QEMU backends in a 16K or 64K domain.
 
 ### ARM: Guest Devicetree support
 
@@ -736,6 +788,20 @@ If support differs based on implementation
 (for instance, x86 / ARM, Linux / QEMU / FreeBSD),
 one line for each set of implementations will be listed.
 
+### Limit-security
+
+For size limits.
+This figure shows the largest configuration which will receive
+security support.
+It is generally determined by the maximum amount that is regularly tested.
+This limit will only be listed explicitly
+if it is different than the theoretical limit.
+
+### Limit
+
+This figure shows a theoretical size limit.
+This does not mean that such a large configuration will actually work.
+
 ## Definition of Status labels
 
 Each Status value corresponds to levels of security support,
-- 
2.15.0




[Xen-devel] [PATCH v3 12/17] SUPPORT.md: Add Security-related features

2017-11-22 Thread George Dunlap
With the exception of driver domains, which depend on PCI passthrough,
and will be introduced later.

Signed-off-by: George Dunlap 
Reviewed-by: Konrad Rzeszutek Wilk 
---
Changes since v2:
- Reference XSA-77 as well under the XSM & FLASK section

CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Konrad Wilk 
CC: Tim Deegan 
CC: Tamas K Lengyel 
CC: Rich Persaud 
---
 SUPPORT.md | 40 
 1 file changed, 40 insertions(+)

diff --git a/SUPPORT.md b/SUPPORT.md
index cc8b754749..2d4386ad68 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -447,6 +447,46 @@ but has no xl support.
 
 Status: Supported
 
+## Security
+
+### Device Model Stub Domains
+
+Status: Supported
+
+### KCONFIG Expert
+
+Status: Experimental
+
+### Live Patching
+
+Status, x86: Supported
+Status, ARM: Experimental
+
+Compile time disabled for ARM
+
+### Virtual Machine Introspection
+
+Status, x86: Supported, not security supported
+
+### XSM & FLASK
+
+Status: Experimental
+
+Compile time disabled.
+
+Also note that using XSM
+to delegate various domain control hypercalls
+to particular other domains, rather than only permitting use by dom0,
+is also specifically excluded from security support for many hypercalls.
+Please see XSA-77 for more details.
+
+### FLASK default policy
+
+Status: Experimental
+
+The default policy includes FLASK labels and roles for a "typical" Xen-based system
+with dom0, driver domains, stub domains, domUs, and so on.
+
 ## Virtual Hardware, Hypervisor
 
 ### x86/Nested PV
-- 
2.15.0




[Xen-devel] [PATCH v3 15/17] SUPPORT.md: Add statement on migration RFC

2017-11-22 Thread George Dunlap
Signed-off-by: George Dunlap 
---
Would someone be willing to take over this one?

CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Konrad Wilk 
CC: Tim Deegan 
CC: Roger Pau Monne 
CC: Anthony Perard 
CC: Paul Durrant 
CC: Julien Grall 
---
 SUPPORT.md | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/SUPPORT.md b/SUPPORT.md
index c8fec4daa8..aa58fb0de3 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -324,6 +324,36 @@ This includes exposing event channels to HVM guests.
 
 ## High Availability and Fault Tolerance
 
+### Live Migration, Save & Restore
+
+Status, x86: Supported, with caveats
+
+A number of features don't work with live migration / save / restore.  These include:
+ * PCI passthrough
+ * vNUMA
+ * Nested HVM
+
+XXX Need to check the following:
+
+ * Guest serial console
+ * Crash kernels
+ * Transcendent Memory
+ * Alternative p2m
+ * vMCE
+ * vPMU
+ * Intel Platform QoS
+ * Remus
+ * COLO
+ * PV protocols: Keyboard, PVUSB, PVSCSI, PVTPM, 9pfs, pvcalls?
+ * FLASK?
+ * CPU / memory hotplug?
+
+Additionally, if an HVM guest was booted with memory != maxmem,
+and the balloon driver hadn't hit the target before migration,
+the size of the guest on the far side might be unexpected.
+
+See docs/features/migration.pandoc for more details
+
 ### Remus Fault Tolerance
 
 Status: Experimental
-- 
2.15.0




[Xen-devel] [PATCH v3 08/17] SUPPORT.md: Add x86-specific virtual hardware

2017-11-22 Thread George Dunlap
x86-specific virtual hardware provided by the hypervisor, toolstack,
or QEMU.

Signed-off-by: George Dunlap 
---
Changes since v2:
- Updated Nested PV / HVM sections
- Removed AVX section
- EFI -> OVMF

Changes since v1:
- Added emulated QEMU support, to replace docs/misc/qemu-xen-security.

Need to figure out what to do with the "backing storage image format"
section of that document.

CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Konrad Wilk 
CC: Tim Deegan 
CC: Roger Pau Monne 
CC: Anthony Perard 
CC: Paul Durrant 
---
 SUPPORT.md | 105 +
 1 file changed, 105 insertions(+)

diff --git a/SUPPORT.md b/SUPPORT.md
index 96c381fb55..98ed18098a 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -373,6 +373,111 @@ but has no xl support.
 
 Status: Supported
 
+## Virtual Hardware, Hypervisor
+
+### x86/Nested PV
+
+Status, x86 Xen HVM: Tech Preview
+
+This means running a Xen hypervisor inside an HVM domain on a Xen system,
+with support for PV L2 guests only
+(i.e., hardware virtualization extensions not provided
+to the guest).
+
+This works, but has performance limitations
+because the L1 dom0 can only access emulated L1 devices.
+
+Xen may also run inside other hypervisors (KVM, Hyper-V, VMWare),
+but nobody has reported on performance.
+
+### x86/Nested HVM
+
+Status, x86 HVM: Experimental
+
+This means providing hardware virtualization support to guest VMs
+allowing, for instance, a nested Xen to support both PV and HVM guests.
+It also implies support for other hypervisors,
+such as KVM, Hyper-V, Bromium, and so on as guests.
+
+### vPMU
+
+Status, x86: Supported, Not security supported
+
+Virtual Performance Management Unit for HVM guests
+
+Disabled by default (enable with hypervisor command line option).
+This feature is not security supported: see http://xenbits.xen.org/xsa/advisory-163.html
+
+## Virtual Hardware, QEMU
+
+These are devices available in HVM mode using a qemu device model (the default).
+Note that other devices are available but not security supported.
+
+### x86/Emulated platform devices (QEMU):
+
+Status, piix3: Supported
+
+### x86/Emulated network (QEMU):
+
+Status, e1000: Supported
+Status, rtl8139: Supported
+Status, virtio-net: Supported
+
+### x86/Emulated storage (QEMU):
+
+Status, piix3 ide: Supported
+Status, ahci: Supported
+
+### x86/Emulated graphics (QEMU):
+
+Status, cirrus-vga: Supported
+Status, stdvga: Supported
+
+### x86/Emulated audio (QEMU):
+
+Status, sb16: Supported
+Status, es1370: Supported
+Status, ac97: Supported
+
+### x86/Emulated input (QEMU):
+
+Status, usbmouse: Supported
+Status, usbtablet: Supported
+Status, ps/2 keyboard: Supported
+Status, ps/2 mouse: Supported
+
+### x86/Emulated serial card (QEMU):
+
+Status, UART 16550A: Supported
+
+### x86/Host USB passthrough (QEMU):
+
+Status: Supported, not security supported
+
+## Virtual Firmware
+
+### x86/HVM iPXE
+
+Status: Supported, with caveats
+
+Booting a guest via PXE.
+PXE inherently places full trust of the guest in the network,
+and so should only be used
+when the guest network is under the same administrative control
+as the guest itself.
+
+### x86/HVM BIOS
+
+Status: Supported
+
+Booting a guest via guest BIOS firmware
+
+### x86/HVM OVMF
+
+Status: Supported
+
+OVMF firmware implements the UEFI boot protocol.
+
 # Format and definitions
 
 This file contains prose, and machine-readable fragments.
-- 
2.15.0




[Xen-devel] [PATCH v3 03/17] SUPPORT.md: Add some x86 features

2017-11-22 Thread George Dunlap
Including host architecture support and guest types.

Signed-off-by: George Dunlap 
---
Changes since v2:
- No Host ACPI listing for PVH dom0
- Add IOMMU entries for AMD and Intel

CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Konrad Wilk 
CC: Tim Deegan 
CC: Roger Pau Monne 
---
 SUPPORT.md | 57 +
 1 file changed, 57 insertions(+)

diff --git a/SUPPORT.md b/SUPPORT.md
index 934028074b..a4cf2da50d 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -16,6 +16,63 @@ for the definitions of the support status levels etc.
 
 # Feature Support
 
+## Host Architecture
+
+### x86-64
+
+Status: Supported
+
+## Host hardware support
+
+### Physical CPU Hotplug
+
+Status, x86: Supported
+
+### Physical Memory Hotplug
+
+Status, x86: Supported
+
+### Host ACPI (via Domain 0)
+
+Status, x86 PV: Supported
+
+### x86/Intel Platform QoS Technologies
+
+Status: Tech Preview
+
+### IOMMU
+
+Status, AMD IOMMU: Supported
+Status, Intel VT-d: Supported
+
+## Guest Type
+
+### x86/PV
+
+Status: Supported
+
+Traditional Xen PV guest
+
+No hardware requirements
+
+### x86/HVM
+
+Status: Supported
+
+Fully virtualised guest using hardware virtualisation extensions
+
+Requires hardware virtualisation support (Intel VMX / AMD SVM)
+
+### x86/PVH guest
+
+Status: Supported
+
+PVH is a next-generation paravirtualized mode 
+designed to take advantage of hardware virtualization support when possible.
+During development this was sometimes called HVMLite or PVHv2.
+
+Requires hardware virtualisation support (Intel VMX / AMD SVM)
+
 ## Memory Management
 
 ### Dynamic memory control
-- 
2.15.0




[Xen-devel] [PATCH v3 09/17] SUPPORT.md: Add ARM-specific virtual hardware

2017-11-22 Thread George Dunlap
Signed-off-by: George Dunlap 
---
Changes since v2:
- Update "non-pci passthrough" section
- Add DT / ACPI sections

CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Konrad Wilk 
CC: Tim Deegan 
CC: Julien Grall 
---
 SUPPORT.md | 21 +
 1 file changed, 21 insertions(+)

diff --git a/SUPPORT.md b/SUPPORT.md
index 98ed18098a..f357291e4e 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -408,6 +408,27 @@ Virtual Performance Management Unit for HVM guests
 Disabled by default (enable with hypervisor command line option).
This feature is not security supported: see http://xenbits.xen.org/xsa/advisory-163.html
 
+### ARM/Non-PCI device passthrough
+
+Status: Supported, not security supported
+
+Note that this still requires an IOMMU
+that covers the DMA of the device to be passed through.
+
+### ARM: 16K and 64K page granularity in guests
+
+Status: Supported, with caveats
+
+No support for QEMU backends in a 16K or 64K domain.
+
+### ARM: Guest Devicetree support
+
+Status: Supported
+
+### ARM: Guest ACPI support
+
+Status: Supported
+
 ## Virtual Hardware, QEMU
 
These are devices available in HVM mode using a qemu device model (the default).
-- 
2.15.0




[Xen-devel] [PATCH v3 02/17] SUPPORT.md: Add core functionality

2017-11-22 Thread George Dunlap
Core memory management and scheduling.

Signed-off-by: George Dunlap 
---
Changes since v2:
- s/Memory Ballooning/Dynamic memory control/;
- And add a description that mentions ballooning

CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Tim Deegan 
CC: Dario Faggioli 
CC: Nathan Studer 
---
 SUPPORT.md | 62 ++
 1 file changed, 62 insertions(+)

diff --git a/SUPPORT.md b/SUPPORT.md
index e3d5d1de8d..934028074b 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -16,6 +16,68 @@ for the definitions of the support status levels etc.
 
 # Feature Support
 
+## Memory Management
+
+### Dynamic memory control
+
+Status: Supported
+
+Allows a guest to add or remove memory after boot-time.
+This is typically done by a guest kernel agent known as a "balloon driver".
+
+## Resource Management
+
+### CPU Pools
+
+Status: Supported
+
+Groups physical cpus into distinct groups called "cpupools",
+with each pool having the capability
+of using different schedulers and scheduling properties.
+
+### Credit Scheduler
+
+Status: Supported
+
+A weighted proportional fair share virtual CPU scheduler.
+This is the default scheduler.
+
+### Credit2 Scheduler
+
+Status: Supported
+
+A general purpose scheduler for Xen,
+designed with particular focus on fairness, responsiveness, and scalability
+
+### RTDS based Scheduler
+
+Status: Experimental
+
+A soft real-time CPU scheduler
+built to provide guaranteed CPU capacity to guest VMs on SMP hosts
+
+### ARINC653 Scheduler
+
+Status: Supported
+
+A periodically repeating fixed timeslice scheduler.
+Currently only single-vcpu domains are supported.
+
+### Null Scheduler
+
+Status: Experimental
+
+A very simple, very static scheduling policy
+that always schedules the same vCPU(s) on the same pCPU(s).
+It is designed for maximum determinism and minimum overhead
+on embedded platforms.
+
+### NUMA scheduler affinity
+
+Status, x86: Supported
+
+Enables NUMA aware scheduling in Xen
+
 # Format and definitions
 
 This file contains prose, and machine-readable fragments.
-- 
2.15.0




[Xen-devel] [PATCH v3 04/17] SUPPORT.md: Add core ARM features

2017-11-22 Thread George Dunlap
Hardware support and guest type.

Signed-off-by: George Dunlap 
---
Changes since v2:
- Moved SMMUv* into generic IOMMU section

CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Konrad Wilk 
CC: Tim Deegan 
CC: Julien Grall 
---
 SUPPORT.md | 25 -
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/SUPPORT.md b/SUPPORT.md
index a4cf2da50d..5945ab4926 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -22,6 +22,14 @@ for the definitions of the support status levels etc.
 
 Status: Supported
 
+### ARM v7 + Virtualization Extensions
+
+Status: Supported
+
+### ARM v8
+
+Status: Supported
+
 ## Host hardware support
 
 ### Physical CPU Hotplug
@@ -35,6 +43,7 @@ for the definitions of the support status levels etc.
 ### Host ACPI (via Domain 0)
 
 Status, x86 PV: Supported
+Status, ARM: Experimental
 
 ### x86/Intel Platform QoS Technologies
 
@@ -44,6 +53,14 @@ for the definitions of the support status levels etc.
 
 Status, AMD IOMMU: Supported
 Status, Intel VT-d: Supported
+Status, ARM SMMUv1: Supported
+Status, ARM SMMUv2: Supported
+
+### ARM/GICv3 ITS
+
+Status: Experimental
+
+Extension to the GICv3 interrupt controller to support MSI.
 
 ## Guest Type
 
@@ -67,12 +84,18 @@ Requires hardware virtualisation support (Intel VMX / AMD SVM)
 
 Status: Supported
 
-PVH is a next-generation paravirtualized mode 
+PVH is a next-generation paravirtualized mode
 designed to take advantage of hardware virtualization support when possible.
 During development this was sometimes called HVMLite or PVHv2.
 
 Requires hardware virtualisation support (Intel VMX / AMD SVM)
 
+### ARM guest
+
+Status: Supported
+
+ARM only has one guest type at the moment
+
 ## Memory Management
 
 ### Dynamic memory control
-- 
2.15.0




[Xen-devel] [PATCH v3 07/17] SUPPORT.md: Add virtual devices common to ARM and x86

2017-11-22 Thread George Dunlap
Mostly PV protocols.

Signed-off-by: George Dunlap 
---
Changes since v2:
- Define "having xl support" as a requirement for Tech Preview and Supported
- ...and remove backend from xl support section
- Add OpenBSD blkback
- Fix Linux backend names
- Remove non-existent implementation (PV USB Linux)
- Remove support for PV keyboard in Windows (Fix in qemu tree didn't make it)

CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Konrad Wilk 
CC: Tim Deegan 
CC: Roger Pau Monne 
CC: Anthony Perard 
CC: Paul Durrant 
CC: Julien Grall 
---
 SUPPORT.md | 150 +
 1 file changed, 150 insertions(+)

diff --git a/SUPPORT.md b/SUPPORT.md
index dd3632b913..96c381fb55 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -128,6 +128,10 @@ Output of information in machine-parseable JSON format
 
 Status: Supported
 
+### QEMU backend hotplugging for xl
+
+Status: Supported
+
 ## Toolstack/3rd party
 
 ### libvirt driver for xl
@@ -223,6 +227,152 @@ which add paravirtualized functionality to HVM guests
 for improved performance and scalability.
 This includes exposing event channels to HVM guests.
 
+## Virtual driver support, guest side
+
+### Blkfront
+
+Status, Linux: Supported
+Status, FreeBSD: Supported, Security support external
+Status, NetBSD: Supported, Security support external
+Status, OpenBSD: Supported, Security support external
+Status, Windows: Supported
+
+Guest-side driver capable of speaking the Xen PV block protocol
+
+### Netfront
+
+Status, Linux: Supported
+Status, Windows: Supported
+Status, FreeBSD: Supported, Security support external
+Status, NetBSD: Supported, Security support external
+Status, OpenBSD: Supported, Security support external
+
+Guest-side driver capable of speaking the Xen PV networking protocol
+
+### PV Framebuffer (frontend)
+
+Status, Linux (xen-fbfront): Supported
+
+Guest-side driver capable of speaking the Xen PV Framebuffer protocol
+
+### PV Console (frontend)
+
+Status, Linux (hvc_xen): Supported
+Status, Windows: Supported
+Status, FreeBSD: Supported, Security support external
+Status, NetBSD: Supported, Security support external
+
+Guest-side driver capable of speaking the Xen PV console protocol
+
+### PV keyboard (frontend)
+
+Status, Linux (xen-kbdfront): Supported
+
+Guest-side driver capable of speaking the Xen PV keyboard protocol
+
+### PV USB (frontend)
+
+Status, Linux: Supported
+
+### PV SCSI protocol (frontend)
+
+Status, Linux: Supported, with caveats
+
+NB that while the PV SCSI frontend is in Linux and tested regularly,
+there is currently no xl support.
+
+### PV TPM (frontend)
+
+Status, Linux (xen-tpmfront): Tech Preview
+
+Guest-side driver capable of speaking the Xen PV TPM protocol
+
+### PV 9pfs frontend
+
+Status, Linux: Tech Preview
+
+Guest-side driver capable of speaking the Xen 9pfs protocol
+
+### PVCalls (frontend)
+
+Status, Linux: Tech Preview
+
+Guest-side driver capable of making pv system calls
+
+## Virtual device support, host side
+
+For host-side virtual device support,
+"Supported" and "Tech preview" include xl/libxl support
+unless otherwise noted.
+
+### Blkback
+
+Status, Linux (xen-blkback): Supported
+Status, FreeBSD (blkback): Supported, Security support external
+Status, NetBSD (xbdback): Supported, security support external
+Status, QEMU (xen_disk): Supported
+Status, Blktap2: Deprecated
+
+Host-side implementations of the Xen PV block protocol
+
+### Netback
+
+Status, Linux (xen-netback): Supported
+Status, FreeBSD (netback): Supported, Security support external
+Status, NetBSD (xennetback): Supported, Security support external
+
+Host-side implementations of Xen PV network protocol
+
+### PV Framebuffer (backend)
+
+Status, QEMU: Supported
+
+Host-side implementation of the Xen PV framebuffer protocol
+
+### PV Console (xenconsoled)
+
+Status: Supported
+
+Host-side implementation of the Xen PV console protocol
+
+### PV keyboard (backend)
+
+Status, QEMU: Supported
+
+Host-side implementation of the Xen PV keyboard protocol
+
+### PV USB (backend)
+
+Status, QEMU: Supported
+
+Host-side implementation of the Xen PV USB protocol
+
+### PV SCSI protocol (backend)
+
+Status, Linux: Experimental
+
+NB that while the PV SCSI backend is in Linux and tested regularly,
+there is currently no xl support.
+
+### PV TPM (backend)
+
+Status: Tech Preview
+
+### PV 9pfs (backend)
+
+Status, QEMU: Tech Preview
+
+### PVCalls (backend)
+
+Status, Linux: Experimental
+
+PVCalls backend has been checked into Linux,
+but has no xl support.
+
+### Online resize of virtual disks
+
+Status: Supported
+
 # Format and definitions
 
 This file contains prose, and machine-readable fragments.
-- 
2.15.0



[Xen-devel] [PATCH v3 01/17] Introduce skeleton SUPPORT.md

2017-11-22 Thread George Dunlap
Add a machine-readable file to describe what features are in what
state of being 'supported', as well as information about how long this
release will be supported, and so on.

The document should be formatted using "semantic newlines" [1], to make
changes easier.

Begin with the basic framework.

Signed-off-by: Ian Jackson 
Signed-off-by: George Dunlap 
Acked-by: Jan Beulich 

[1] http://rhodesmill.org/brandon/2012/one-sentence-per-line/
---
CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Tim Deegan 
CC: Dario Faggioli 
CC: Tamas K Lengyel 
CC: Roger Pau Monne 
CC: Stefano Stabellini 
CC: Anthony Perard 
CC: Paul Durrant 
CC: Konrad Wilk 
CC: Julien Grall 
---
 SUPPORT.md | 194 +
 1 file changed, 194 insertions(+)
 create mode 100644 SUPPORT.md

diff --git a/SUPPORT.md b/SUPPORT.md
new file mode 100644
index 00..e3d5d1de8d
--- /dev/null
+++ b/SUPPORT.md
@@ -0,0 +1,194 @@
+# Support statement for this release
+
+This document describes the support status
+and in particular the security support status of the Xen branch
+within which you find it.
+
+See the bottom of the file
+for the definitions of the support status levels etc.
+
+# Release Support
+
+Xen-Version: 4.10-unstable
+Initial-Release: n/a
+Supported-Until: TBD
+Security-Support-Until: Unreleased - not yet security-supported
+
+# Feature Support
+
+# Format and definitions
+
+This file contains prose, and machine-readable fragments.
+The data in a machine-readable fragment relate to
+the section and subsection in which it is found.
+
+The file is in markdown format.
+The machine-readable fragments are markdown literals
+containing RFC-822-like (deb822-like) data.
+
+## Keys found in the Feature Support subsections
+
+### Status
+
+This gives the overall status of the feature,
+including security support status, functional completeness, etc.
+Refer to the detailed definitions below.
+
+If support differs based on implementation
+(for instance, x86 / ARM, Linux / QEMU / FreeBSD),
+one line for each set of implementations will be listed.
+
+## Definition of Status labels
+
+Each Status value corresponds to levels of security support,
+testing, stability, etc., as follows:
+
+### Experimental
+
+Functional completeness: No
+Functional stability: Here be dragons
+Interface stability: Not stable
+Security supported: No
+
+### Tech Preview
+
+Functional completeness: Yes
+Functional stability: Quirky
+Interface stability: Provisionally stable
+Security supported: No
+
+### Supported
+
+Functional completeness: Yes
+Functional stability: Normal
+Interface stability: Yes
+Security supported: Yes
+
+### Deprecated
+
+Functional completeness: Yes
+Functional stability: Quirky
+Interface stability: No (as in, may disappear the next release)
+Security supported: Yes
+
+All of these may appear in modified form.
+There are several interfaces, for instance,
+which are officially declared as not stable;
+in such a case this feature may be described as "Stable / Interface not stable".
+
+## Definition of the status label interpretation tags
+
+### Functionally complete
+
+Does it behave like a fully functional feature?
+Does it work on all expected platforms,
+or does it only work for a very specific sub-case?
+Does it have a sensible UI,
+or do you have to have a deep understanding of the internals
+to get it to work properly?
+
+### Functional stability
+
+What is the risk of it exhibiting bugs?
+
+General answers to the above:
+
+ * **Here be dragons**
+
+   Pretty likely to still crash / fail to work.
+   Not recommended unless you like life on the bleeding edge.
+
+ * **Quirky**
+
+   Mostly works but may have odd behavior here and there.
+   Recommended for playing around or for non-production use cases.
+
+ * **Normal**
+
+   Ready for production use
+
+### Interface stability
+
+If I build a system based on the current interfaces,
+will they still work when I upgrade to the next version?
+
+ * **Not stable**
+
+   Interface is still in the early stages and
+   still fairly likely to be broken in future updates.
+
+ * **Provisionally stable**
+
+   We're not yet promising backwards compatibility,
+   but we think this is probably the final form of the interface.
+   It may still require some tweaks.
+
+ * **Stable**
+
+   We will try very hard to avoid breaking backwards compatibility,
+   and to fix any regressions that are reported.
+
+### Security supported
+
+Will XSAs be issued if security-related bugs are discovered
+in the functionality?
+
+If "no",
+anyone who finds a security-related bug in the feature
+will be advised to
+post it publicly to the Xen Project mailing lists
+(or contact another security response team,
+if a relevant one exists).
+
+Bugs found after the end of **Security-Support-Until**
+in the Release Support section will receive no security support.

[Xen-devel] [PATCH v3 05/17] SUPPORT.md: Toolstack core

2017-11-22 Thread George Dunlap
For now only include xl-specific features, or interaction with the
system.  Feature support matrix will be added when features are
mentioned.

Signed-off-by: George Dunlap 
---
CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Konrad Wilk 
CC: Tim Deegan 
---
 SUPPORT.md | 38 ++
 1 file changed, 38 insertions(+)

diff --git a/SUPPORT.md b/SUPPORT.md
index 5945ab4926..df429cb3c4 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -96,6 +96,44 @@ Requires hardware virtualisation support (Intel VMX / AMD SVM)
 
 ARM only has one guest type at the moment
 
+## Toolstack
+
+### xl
+
+Status: Supported
+
+### Direct-boot kernel image format
+
+Supported, x86: bzImage
+Supported, ARM32: zImage
+Supported, ARM64: Image
+
+Format which the toolstack accepts for direct-boot kernels
+
+### systemd support for xl
+
+Status: Supported
+
+### JSON output support for xl
+
+Status: Experimental
+
+Output of information in machine-parseable JSON format
+
+### Open vSwitch integration for xl
+
+Status, Linux: Supported
+
+### Virtual cpu hotplug
+
+Status: Supported
+
+## Toolstack/3rd party
+
+### libvirt driver for xl
+
+Status: Supported, Security support external
+
 ## Memory Management
 
 ### Dynamic memory control
-- 
2.15.0




[Xen-devel] [PATCH v3 06/17] SUPPORT.md: Add scalability features

2017-11-22 Thread George Dunlap
Superpage support and PVHVM.

Signed-off-by: George Dunlap 
---
Changes since v2:
- Reworked superpage section

CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Konrad Wilk 
CC: Tim Deegan 
CC: Julien Grall 
---
 SUPPORT.md | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/SUPPORT.md b/SUPPORT.md
index df429cb3c4..dd3632b913 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -196,6 +196,33 @@ on embedded platforms.
 
 Enables NUMA aware scheduling in Xen
 
+## Scalability
+
+### Super page support
+
+Status, x86 HVM/PVH, HAP: Supported
+Status, x86 HVM/PVH, Shadow, 2MiB: Supported
+Status, ARM: Supported
+
+NB that this refers to the ability of guests
+to have higher-level page table entries point directly to memory,
+improving TLB performance.
+On ARM, and on x86 in HAP mode,
+the guest has whatever support is enabled by the hardware.
+On x86 in shadow mode, only 2MiB (L2) superpages are available;
+furthermore, they do not have the performance characteristics of hardware superpages.
+
+Also note that this feature is independent of the ARM "page granularity" feature (see below).
+
+### x86/PVHVM
+
+Status: Supported
+
+This is a useful label for a set of hypervisor features
+which add paravirtualized functionality to HVM guests
+for improved performance and scalability.
+This includes exposing event channels to HVM guests.
+
 # Format and definitions
 
 This file contains prose, and machine-readable fragments.
-- 
2.15.0




[Xen-devel] [PATCH v3 10/17] SUPPORT.md: Add Debugging, analysis, crash post-mortem

2017-11-22 Thread George Dunlap
Signed-off-by: George Dunlap 
---
Changes since v2:
- gdbsx -> not security suported
- Added host serial, host debug keys, and host sync_console entries

CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Konrad Wilk 
CC: Tim Deegan 
---
 SUPPORT.md | 58 ++
 1 file changed, 58 insertions(+)

diff --git a/SUPPORT.md b/SUPPORT.md
index f357291e4e..ee069f8499 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -138,6 +138,64 @@ Output of information in machine-parseable JSON format
 
 Status: Supported, Security support external
 
+## Debugging, analysis, and crash post-mortem
+
+### Host serial console
+
+Status, NS16550: Supported
+Status, EHCI: Supported
+Status, Cadence UART (ARM): Supported
+Status, PL011 UART (ARM): Supported
+Status, Exynos 4210 UART (ARM): Supported
+Status, OMAP UART (ARM): Supported
+Status, SCI(F) UART: Supported
+
+XXX Should NS16550 and EHCI be limited to x86?  Unlike the ARM
+entries, they don't depend on x86 being configured
+
+### Hypervisor 'debug keys'
+
+Status: Supported, not security supported
+
+These are functions triggered either from the host serial console,
+or via the xl 'debug-keys' command,
+which cause Xen to dump various hypervisor state to the console.
+
+### Hypervisor synchronous console output (sync_console)
+
+Status: Supported, not security supported
+
+Xen command-line flag to force synchronous console output.
+Useful for debugging, but not suitable for production environments
+due to incurred overhead.
+
+### gdbsx
+
+Status, x86: Supported, not security supported
+
+Debugger to debug ELF guests
+
+### Soft-reset for PV guests
+
+Status: Supported
+
+Soft-reset allows a new kernel to start 'from scratch' with a fresh VM state,
+but with all the memory from the previous state of the VM intact.
+This is primarily designed to allow "crash kernels",
+which can do core dumps of memory to help with debugging in the event of a crash.
+
+### xentrace
+
+Status, x86: Supported
+
+Tool to capture Xen trace buffer data
+
+### gcov
+
+Status: Supported, Not security supported
+
+Export hypervisor coverage data suitable for analysis by gcov or lcov.
+
 ## Memory Management
 
 ### Dynamic memory control
-- 
2.15.0




Re: [Xen-devel] [PATCH 14/16] SUPPORT.md: Add statement on PCI passthrough

2017-11-22 Thread George Dunlap
On 11/16/2017 03:43 PM, Julien Grall wrote:
> Hi George,
> 
> On 13/11/17 15:41, George Dunlap wrote:
>> Signed-off-by: George Dunlap 
>> ---
>> CC: Ian Jackson 
>> CC: Wei Liu 
>> CC: Andrew Cooper 
>> CC: Jan Beulich 
>> CC: Stefano Stabellini 
>> CC: Konrad Wilk 
>> CC: Tim Deegan 
>> CC: Rich Persaud 
>> CC: Marek Marczykowski-Górecki 
>> CC: Christopher Clark 
>> CC: James McKenzie 
>> ---
>>   SUPPORT.md | 33 -
>>   1 file changed, 32 insertions(+), 1 deletion(-)
>>
>> diff --git a/SUPPORT.md b/SUPPORT.md
>> index 3e352198ce..a8388f3dc5 100644
>> --- a/SUPPORT.md
>> +++ b/SUPPORT.md
>> @@ -454,9 +454,23 @@ there is currently no xl support.
>>     ## Security
>>   +### Driver Domains
>> +
>> +    Status: Supported, with caveats
>> +
>> +"Driver domains" means allowing non-Domain 0 domains
>> +with access to physical devices to act as back-ends.
>> +
>> +See the appropriate "Device Passthrough" section
>> +for more information about security support.
>> +
>>   ### Device Model Stub Domains
>>   -    Status: Supported
>> +    Status: Supported, with caveats
>> +
>> +Vulnerabilities of a device model stub domain
>> +to a hostile driver domain (either compromised or untrusted)
>> +are excluded from security support.
>>     ### KCONFIG Expert
>>   @@ -522,6 +536,23 @@ Virtual Performance Management Unit for HVM guests
>>   Disabled by default (enable with hypervisor command line option).
>>   This feature is not security supported: see http://xenbits.xen.org/xsa/advisory-163.html
>>   +### x86/PCI Device Passthrough
>> +
>> +    Status: Supported, with caveats
>> +
>> +Only systems using IOMMUs will be supported.
>> +
>> +Not compatible with migration, altp2m, introspection, memory sharing,
>> or memory paging.
>> +
>> +Because of hardware limitations
>> +(affecting any operating system or hypervisor),
>> +it is generally not safe to use this feature
>> +to expose a physical device to completely untrusted guests.
>> +However, this feature can still confer significant security benefit
>> +when used to remove drivers and backends from domain 0
>> +(i.e., Driver Domains).
>> +See docs/PCI-IOMMU-bugs.txt for more information.
> 
> Where can I find this file? Is it in staging?

No, I took this from a recommendation made to me, without checking.

Rich, are you going to send a patch adding this file, or did you mean to
point to a different file?

 -George



Re: [Xen-devel] [PATCH 16/16] SUPPORT.md: Add limits RFC

2017-11-22 Thread George Dunlap


> On Nov 21, 2017, at 9:26 AM, Jan Beulich  wrote:
>
 On 13.11.17 at 16:41,  wrote:
>> +### Virtual CPUs
>> +
>> +    Limit, x86 PV: 8192
>> +    Limit-security, x86 PV: 32
>> +    Limit, x86 HVM: 128
>> +    Limit-security, x86 HVM: 32
>
> Personally I consider the "Limit-security" numbers too low here, but
> I have no proof that higher numbers will work _in all cases_.

You don’t have to have conclusive proof that the numbers work in all
cases; we only need to have reasonable evidence that higher numbers are
generally reliable.  To use US legal terminology, it’s “preponderance of
evidence” (usually used in civil trials) rather than “beyond a
reasonable doubt” (used in criminal trials).

In this case, there are credible claims that using more vcpus opens
users up to a host DoS, and no evidence (or arguments) to the contrary.
 I think it would be irresponsible, under those circumstances, to tell
people that they should provide more vcpus to untrusted guests.

It wouldn’t be too hard to gather further evidence.  If someone
competent spent a few days trying to crash a larger guest and failed,
then that would be reason to think that perhaps larger numbers were safe.

>
>> +### Virtual RAM
>> +
>> +    Limit-security, x86 PV: 2047GiB
>
> I think this needs splitting for 64- and 32-bit (the latter can go up
> to 168Gb only on hosts with no memory past the 168Gb boundary,
> and up to 128Gb only on larger ones, without this being a processor
> architecture limitation).

OK.  Below is an updated section.  It might be good to specify how large
is "larger".

---
### Virtual RAM

    Limit-security, x86 PV 64-bit: 2047GiB
    Limit-security, x86 PV 32-bit: 168GiB (see below)
    Limit-security, x86 HVM: 1.5TiB
    Limit, ARM32: 16GiB
    Limit, ARM64: 1TiB

Note that there are no theoretical limits to 64-bit PV or HVM guest sizes
other than those determined by the processor architecture.

All 32-bit PV guest memory must be under 168GiB;
this means the total memory for all 32-bit PV guests cannot exceed 168GiB.
On larger hosts, this limit is 128GiB.
---

>> +### Event Channel FIFO ABI
>> +
>> +    Limit: 131072
>
> Are we certain this is a security supportable limit? There is at least
> one loop (in get_free_port()) which can potentially have this number
> of iterations.

I have no idea.  Do you have another limit you’d like to propose instead?
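
For reference, the 131072 figure is the FIFO ABI's own design limit;
from memory (worth double-checking against the tree) the public header has:

    /* xen/include/public/event_channel.h */
    #define EVTCHN_FIFO_NR_CHANNELS (1 << 17)   /* = 131072 */

so anything lower would be an artificial security limit rather than an
ABI one.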

> That's already leaving aside the one in the 'e' key handler. Speaking
> of which - I think we should state somewhere that there's no security
> support if any key whatsoever was sent to Xen via the console or
> the sysctl interface.

That's a good starting point.  I've added the following:

---
### Hypervisor synchronous console output (sync_console)

    Status: Supported, not security supported

Xen command-line flag to force synchronous console output.
Useful for debugging, but not suitable for production environments
due to incurred overhead.
---
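
To make the entry concrete: sync_console is a Xen boot parameter
(documented in docs/misc/xen-command-line.markdown), so a typical debug
setup would look something like this on the Xen command line (the serial
settings are purely illustrative):

    console=com1 com1=115200,8n1 sync_console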
> And more generally - surely there are items that aren't present in
> the series and no-one can realistically spot right away. What do we
> mean to imply for functionality not covered in the doc? One thing
> coming to mind here are certain command line options, an example
> being "sync_console" - the description states "not suitable for
> production environments", but I think this should be tightened to
> exclude security support.

Well specifically for sync_console, I would think given our definition
of "Supported", "not suitable for production environments" would imply
"not security supported"; but it wouldn't hurt to add an entry for it
under "Debugging, analysis, and post-mortem", so I've written one up:

---
### Hypervisor 'debug keys'

    Status: Supported, not security supported
   
These are functions triggered either from the host serial console,
or via the xl 'debug-keys' command,
which cause Xen to dump various hypervisor state to the console.
---
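
As a usage sketch (output lands in the Xen console ring):

    xl debug-keys h    # ask Xen to print the list of available debug keys
    xl dmesg           # read the resulting output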

In general, if a feature is explicitly listed *but* some configuration
is not listed (e.g., 'x86 PV' and 'x86 HVM' are listed but not 'x86
PVH'), then that feature is not implemented for that configuration.

If a feature is not listed at all, then this document isn't saying
anything one way or another (which is no worse than you were before).

Also, I realized that I somehow failed to send out the 17th patch (!),
which primarily had XXX entries for qemu-upstream/qemu-traditional, and
host serial console support.

Shall I try to make a list of supported serial cards from
/build/hg/xen.git/xen/drivers/char/Kconfig?

 -George



Re: [Xen-devel] [PATCH 14/16] SUPPORT.md: Add statement on PCI passthrough

2017-11-22 Thread George Dunlap
On 11/21/2017 08:59 AM, Jan Beulich wrote:
 On 13.11.17 at 16:41,  wrote:
>> +### x86/PCI Device Passthrough
>> +
>> +Status: Supported, with caveats
> 
> I think this wants to be
> 
> ### PCI Device Passthrough
> 
> Status, x86 HVM: Supported, with caveats
> Status, x86 PV: Supported, with caveats
> 
> to (a) allow later extending for ARM and (b) exclude PVH (assuming
> that its absence means non-existing code).

Good call.

> 
>> +Only systems using IOMMUs will be supported.
>> +
>> +Not compatible with migration, altp2m, introspection, memory sharing, or 
>> memory paging.
> 
> And PoD, iirc.

Ack

> 
> With these adjustments (or substantially similar ones)
> Acked-by: Jan Beulich 

Great, thanks.



Re: [Xen-devel] [PATCH 14/16] SUPPORT.md: Add statement on PCI passthrough

2017-11-22 Thread George Dunlap
On 11/14/2017 01:25 PM, Marek Marczykowski-Górecki wrote:
> On Mon, Nov 13, 2017 at 03:41:24PM +0000, George Dunlap wrote:
>> Signed-off-by: George Dunlap 
>> ---
>> CC: Ian Jackson 
>> CC: Wei Liu 
>> CC: Andrew Cooper 
>> CC: Jan Beulich 
>> CC: Stefano Stabellini 
>> CC: Konrad Wilk 
>> CC: Tim Deegan 
>> CC: Rich Persaud 
>> CC: Marek Marczykowski-Górecki 
>> CC: Christopher Clark 
>> CC: James McKenzie 
>> ---
>>  SUPPORT.md | 33 -
>>  1 file changed, 32 insertions(+), 1 deletion(-)
>>
>> diff --git a/SUPPORT.md b/SUPPORT.md
>> index 3e352198ce..a8388f3dc5 100644
>> --- a/SUPPORT.md
>> +++ b/SUPPORT.md
> 
> (...)
> 
>> @@ -522,6 +536,23 @@ Virtual Performance Management Unit for HVM guests
>>  Disabled by default (enable with hypervisor command line option).
>>  This feature is not security supported: see http://xenbits.xen.org/xsa/advisory-163.html
>>  
>> +### x86/PCI Device Passthrough
>> +
>> +Status: Supported, with caveats
>> +
>> +Only systems using IOMMUs will be supported.
> 
> s/will be/are/ ?

Ack

 -George



Re: [Xen-devel] [PATCH 13/16] SUPPORT.md: Add secondary memory management features

2017-11-22 Thread George Dunlap
On 11/21/2017 07:55 PM, Andrew Cooper wrote:
> On 13/11/17 15:41, George Dunlap wrote:
>> Signed-off-by: George Dunlap 
>> ---
>> CC: Ian Jackson 
>> CC: Wei Liu 
>> CC: Andrew Cooper 
>> CC: Jan Beulich 
>> CC: Stefano Stabellini 
>> CC: Konrad Wilk 
>> CC: Tim Deegan 
>> CC: Tamas K Lengyel 
>> ---
>>  SUPPORT.md | 31 +++
>>  1 file changed, 31 insertions(+)
>>
>> diff --git a/SUPPORT.md b/SUPPORT.md
>> index 0f7426593e..3e352198ce 100644
>> --- a/SUPPORT.md
>> +++ b/SUPPORT.md
>> @@ -187,6 +187,37 @@ Export hypervisor coverage data suitable for analysis by gcov or lcov.
>>  
>>  Status: Supported
>>  
>> +### Memory Sharing
>> +
>> +Status, x86 HVM: Tech Preview
>> +Status, ARM: Tech Preview
>> +
>> +Allow sharing of identical pages between guests
> 
> "Tech Preview" should imply there is any kind of `xl dedup-these-domains
> $X $Y` functionality.
> 
> The only thing we appear to have is an example wrapper around the libxc
> interface, which requires the user to nominate individual frames, and
> this doesn't qualify as "functionally complete" IMO.

Right, I was getting confused with paging, which does have at least some
code in the tools/ directory.  (But perhaps should also be considered
experimental?  When was the last time anyone tried to use it?)

> There also doesn't appear to be any ARM support in the slightest. 
> mem_sharing_{memop,domctl}() are only implemented for x86.

Ack.

 -George



Re: [Xen-devel] [PATCH 12/16] SUPPORT.md: Add Security-related features

2017-11-22 Thread George Dunlap
On 11/21/2017 08:52 AM, Jan Beulich wrote:
>>>> On 13.11.17 at 16:41,  wrote:
>> With the exception of driver domains, which depend on PCI passthrough,
>> and will be introduced later.
>>
>> Signed-off-by: George Dunlap 
> 
> Shouldn't we also explicitly exclude tool stack disaggregation here,
> with reference to XSA-77?

Well in this document, we already consider XSM "experimental"; that
would seem to subsume the specific exclusions listed in XSA-77.

I've modified the "XSM & FLASK" as below; let me know what you think.

The other option would be to make separate entries for specific uses of
XSM (i.e., "for simple domain restriction" vs "for domain disaggregation").

 -George


### XSM & FLASK

Status: Experimental

Compile time disabled.

Also note that using XSM
to delegate various domain control hypercalls
to particular other domains, rather than only permitting use by dom0,
is also specifically excluded from security support for many hypercalls.
Please see XSA-77 for more details.




Re: [Xen-devel] [PATCH 10/16] SUPPORT.md: Add Debugging, analysis, crash post-mortem

2017-11-22 Thread George Dunlap
On 11/22/2017 11:15 AM, Jan Beulich wrote:
 On 21.11.17 at 19:19,  wrote:
>> xentrace I would argue for security support; I've asked customers to
>> send me xentrace data as part of analysis before.  I also know enough
>> about it that I'm reasonably confident the risk of an attack vector is
>> pretty low.
> 
> Knowing pretty little about xentrace I will trust you here. What I
> was afraid of is that generally anything adding overhead can have
> unintended side effects, the more with the - aiui - huge amounts of
> data this may produce.

The data is fundamentally limited by the size of the in-hypervisor
buffers.  Once those are full, the trace overhead shouldn't be
significantly different than having tracing disabled.  And regardless of
how big they are, the total amount of trace data will be limited by the
throughput of the dom0-based xentrace process writing to disk.  If the
throughput of that process is (say) 50MB/s, then the "steady state" of
trace creation will be the same (one way or another).  Or, at very most,
at the rate a single processor can copy data out of the in-hypervisor
buffers.
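
For concreteness, the workflow I have in mind is roughly the following
(flags from memory -- see the xentrace(8) man page; xenalyze lives in
tools/xentrace):

    xentrace -D -e all /tmp/xen.trace     # discard stale buffers, trace all event classes
    xenalyze --summary /tmp/xen.trace     # post-process into per-domain/per-vcpu summaries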

Back when I was using xentrace heavily, I regularly hit this limit, and
never had any stability issues.

I suppose with faster disks (SSDs?  SAN on a 40GiB NIC?) this limit will
be higher, but I still have trouble thinking that it would be
significantly more dangerous than, say, any other kind of domain 0 logging.

I mean, there may be something I'm missing; but I've just spent 10
minutes or so trying to brainstorm ways that an attacker could cause
problems on the system, and other than "fill the buffers with junk so
that the admin can't find what she's looking for", came up with nothing.  Any other flaws
should be no more likely than from any other feature we expose to guests.

 -George



Re: [Xen-devel] [PATCH 09/16] SUPPORT.md: Add ARM-specific virtual hardware

2017-11-22 Thread George Dunlap
On 11/16/2017 03:41 PM, Julien Grall wrote:
> Hi George,
> 
> On 13/11/17 15:41, George Dunlap wrote:
>> Signed-off-by: George Dunlap 
>> ---
>> Do we need to add anything more here?
>>
>> And do we need to include ARM ACPI for guests?
>>
>> CC: Ian Jackson 
>> CC: Wei Liu 
>> CC: Andrew Cooper 
>> CC: Jan Beulich 
>> CC: Stefano Stabellini 
>> CC: Konrad Wilk 
>> CC: Tim Deegan 
>> CC: Julien Grall 
>> ---
>>   SUPPORT.md | 10 ++
>>   1 file changed, 10 insertions(+)
>>
>> diff --git a/SUPPORT.md b/SUPPORT.md
>> index b95ee0ebe7..8235336c41 100644
>> --- a/SUPPORT.md
>> +++ b/SUPPORT.md
>> @@ -412,6 +412,16 @@ Virtual Performance Management Unit for HVM guests
>>   Disabled by default (enable with hypervisor command line option).
>>   This feature is not security supported: see
>> http://xenbits.xen.org/xsa/advisory-163.html
>>   +### ARM/Non-PCI device passthrough
>> +
>> +    Status: Supported
> 
> Sorry I didn't notice that until now. I am not comfortable to say
> "Supported" without any caveats.
> 
> As with PCI device passthrough, you at least need an IOMMU present on
> the platform. Sadly, it does not mean all DMA-capable devices on that
> platform will be protected by the IOMMU. This is also assuming, the
> IOMMU do sane things.
> 
> There are potentially other problem coming up with MSI support. But I
> haven't yet fully thought about it.

Shall we make this simply, 'Not security supported' for now?

I'll also mention needing an SMMU and other caveats.

 -George



Re: [Xen-devel] [PATCH 08/16] SUPPORT.md: Add x86-specific virtual hardware

2017-11-22 Thread George Dunlap
On 11/22/2017 11:11 AM, Jan Beulich wrote:
 On 21.11.17 at 19:02,  wrote:
>> On 11/21/2017 08:39 AM, Jan Beulich wrote:
>> On 13.11.17 at 16:41,  wrote:
 +### x86/Nested PV
 +
 +Status, x86 HVM: Tech Preview
 +
 +This means running a Xen hypervisor inside an HVM domain,
 +with support for PV L2 guests only
 +(i.e., hardware virtualization extensions not provided
 +to the guest).
 +
 +This works, but has performance limitations
 +because the L1 dom0 can only access emulated L1 devices.
>>>
>>> So is this explicitly meaning Xen-on-Xen? Xen-on-KVM, for example,
>>> could be considered "nested PV", too. IOW I think it needs to be
>>> spelled out whether this means the host side of things here, the
>>> guest one, or both.
>>
>> Yes, that's true.  But I forget: Can a Xen dom0 use virtio guest
>> drivers?  I'm pretty sure Stefano tried it at some point but I don't
>> remember what the result was.
> 
> I have no idea at all.

I've changed this to "Status, x86 Xen HVM: Tech Preview", and noted
that it may work for other hypervisors but we haven't received
any concrete reports.

 -George



Re: [Xen-devel] [PATCH 07/16] SUPPORT.md: Add virtual devices common to ARM and x86

2017-11-22 Thread George Dunlap
On 11/22/2017 11:05 AM, Jan Beulich wrote:
 On 21.11.17 at 18:20,  wrote:
>> On 11/21/2017 11:41 AM, Jan Beulich wrote:
>> On 21.11.17 at 11:56,  wrote:
 On 11/21/2017 08:29 AM, Jan Beulich wrote:
 On 13.11.17 at 16:41,  wrote:
>> +### PV USB support for xl
>> +
>> +Status: Supported
>> +
>> +### PV 9pfs support for xl
>> +
>> +Status: Tech Preview
>
> Why are these two being called out, but xl support for other device
> types isn't?

 Do you see how big this document is? :-)  If you think something else
 needs to be covered, don't ask why I didn't mention it, just say what
 you think I missed.
>>>
>>> Well, (not very) implicitly here: The same for all other PV protocols.
>>
>> Oh, I see -- you didn't read my comment below the `---` pointing this
>> out.  :-)
> 
> Oops, sorry.
> 
>> Yes, I wasn't quite sure what to do here.  We already list all the PV
>> protocols in at least 2 places (frontend and backend support); it seemed
>> a bit redundant to list them all again in xl and/or libxl support.
>>
>> Except, of course, that there are a number of protocols *not* plumbed
>> through the toolstack yet -- PVSCSI being one example.
>>
>> Any suggestions would be welcome.
> 
> How about putting that as a note to the respective frontend /
> backend entries? And then, wouldn't lack of xl support anyway
> mean "experimental" at best?

Yes.

Since the toolstack mainly sets up the backend, I added a note in the
'backend' section saying that unless otherwise noted, "Tech preview" and
"Supported" imply xl support for creating backends.

We might want to add in libvirt support enumeration at some point.

 -George



Re: [Xen-devel] [PATCH 08/16] SUPPORT.md: Add x86-specific virtual hardware

2017-11-22 Thread George Dunlap
On 11/22/2017 11:11 AM, Jan Beulich wrote:
 +### x86/HVM EFI
 +
 +Status: Supported
 +
 +Booting a guest via guest EFI firmware
>>>
>>> Shouldn't this say OVMF, to avoid covering possible other
>>> implementations?
>>
>> I don't expect that we'll ever need more than one EFI implementation in
>> the tree.  If a time comes when it makes sense to have two, we can
>> adjust the entry accordingly.
> 
> But that's part of my point - you say "in the tree", but this is a
> separate tree, and there could be any number of separate ones.

I've put the following:

---
### x86/HVM OVMF

Status: Supported

OVMF firmware implements the UEFI boot protocol.
---
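
For reference, the guest-config side of this is just the BIOS selection;
a minimal sketch (assuming an HVM guest and that OVMF is built and
installed alongside Xen):

    builder = "hvm"
    bios = "ovmf"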

 -George



Re: [Xen-devel] [PATCH 08/16] SUPPORT.md: Add x86-specific virtual hardware

2017-11-22 Thread George Dunlap
On 11/22/2017 11:11 AM, Jan Beulich wrote:
 On 21.11.17 at 19:02,  wrote:
>> On 11/21/2017 08:39 AM, Jan Beulich wrote:
>> On 13.11.17 at 16:41,  wrote:
 +### x86/Nested PV
 +
 +Status, x86 HVM: Tech Preview
 +
 +This means running a Xen hypervisor inside an HVM domain,
 +with support for PV L2 guests only
 +(i.e., hardware virtualization extensions not provided
 +to the guest).
 +
 +This works, but has performance limitations
 +because the L1 dom0 can only access emulated L1 devices.
>>>
>>> So is this explicitly meaning Xen-on-Xen? Xen-on-KVM, for example,
>>> could be considered "nested PV", too. IOW I think it needs to be
>>> spelled out whether this means the host side of things here, the
>>> guest one, or both.
>>
>> Yes, that's true.  But I forget: Can a Xen dom0 use virtio guest
>> drivers?  I'm pretty sure Stefano tried it at some point but I don't
>> remember what the result was.
> 
> I have no idea at all.
> 
 +### x86/Nested HVM
 +
 +Status, x86 HVM: Experimental
 +
 +This means running a Xen hypervisor inside an HVM domain,
 +with support for running both PV and HVM L2 guests
 +(i.e., hardware virtualization extensions provided
 +to the guest).
>>>
>>> "Nested HVM" generally means more than using Xen as the L1
>>> hypervisor. If this is really to mean just L1 Xen, I think the title
>>> should already say so, not just the description.
>>
>> Yes, I mean any sort of nested guest support here.
> 
> In which case would you mind inserting "for example"?

Yes, I was planning to do that.  Sorry for not making my intention clear.
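
(For reference, the guest-config side of nested HVM is a single knob --
a minimal sketch, assuming an HVM L1 guest:

    builder = "hvm"
    nestedhvm = 1

Nested PV needs nothing extra, since by definition no virtualization
extensions are exposed to the L1 guest.)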

> 
 +### x86/Advanced Vector eXtension
 +
 +Status: Supported
>>>
>>> As indicated before, I think this either needs to be dropped or
>>> be extended by an entry for virtually every CPUID bit exposed
>>> to guests. Furthermore, in this isolated fashion it is not clear
>>> what derived features (e.g. FMA, FMA4, AVX2, or even AVX-512)
>>> it is meant to imply. If any of them are implied, "with caveats"
>>> would need to be added as long as the instruction emulator isn't
>>> capable of handling the instructions, yet.
>>
>> Adding a section for CPUID bits supported (and to what level) sounds
>> like a useful thing to do, perhaps in the next release.
> 
> May I suggest then that until then the section above be dropped?

Ditto.

 +### x86/HVM EFI
 +
 +Status: Supported
 +
 +Booting a guest via guest EFI firmware
>>>
>>> Shouldn't this say OVMF, to avoid covering possible other
>>> implementations?
>>
>> I don't expect that we'll ever need more than one EFI implementation in
>> the tree.  If a time comes when it makes sense to have two, we can
>> adjust the entry accordingly.
> 
> But that's part of my point - you say "in the tree", but this is a
> separate tree, and there could be any number of separate ones.

But not ones wired into xl or libxl.

On the other hand, it looks like the actual value you put in the xl
config file is 'ovmf', so that probably makes more sense.

 -George



Re: [Xen-devel] [PATCH 10/16] SUPPORT.md: Add Debugging, analysis, crash post-mortem

2017-11-22 Thread George Dunlap
On 11/21/2017 07:21 PM, Andrew Cooper wrote:
> On 21/11/17 19:05, Ian Jackson wrote:
>> George Dunlap writes ("Re: [PATCH 10/16] SUPPORT.md: Add Debugging, 
>> analysis, crash post-mortem"):
>>> gdbsx security support: Someone may want to debug an untrusted guest,
>>> so I think we should say 'yes' here.
>> I think running gdb on a potentially hostile program is foolish.
>>
>>> I don't have a strong opinion on gdbsx; I'd call it 'supported', but if
>>> you think we need to exclude it from security support I'm happy with
>>> that as well.
>> gdbsx itself is probably simple enough to be fine but I would rather
>> not call it security supported because that might encourage people to
>> use it with gdb.
>>
>> If someone wants to use gdbsx with something that's not gdb then they
>> might want to ask us to revisit that.
> 
> If gdbsx chooses (or gets tricked into using) DOMID_XEN, then it gets
> arbitrary read/write access over hypervisor virtual address space, due
> to the behaviour of the hypercalls it uses.
> 
> As a tool, it mostly functions (there are some rather sharp corners
> which I've not had time to fix so far), but it is definitely not
> something I would trust in a hostile environment.

Right -- "not security supported" it is. :-)

 -George



Re: [Xen-devel] [PATCH 10/16] SUPPORT.md: Add Debugging, analysis, crash post-mortem

2017-11-21 Thread George Dunlap
On 11/21/2017 08:48 AM, Jan Beulich wrote:
 On 13.11.17 at 16:41,  wrote:
>> --- a/SUPPORT.md
>> +++ b/SUPPORT.md
>> @@ -152,6 +152,35 @@ Output of information in machine-parseable JSON format
>>  
>>  Status: Supported, Security support external
>>  
>> +## Debugging, analysis, and crash post-mortem
>> +
>> +### gdbsx
>> +
>> +Status, x86: Supported
>> +
>> +Debugger to debug ELF guests
>> +
>> +### Soft-reset for PV guests
>> +
>> +Status: Supported
>> +
>> +Soft-reset allows a new kernel to start 'from scratch' with a fresh VM state,
>> +but with all the memory from the previous state of the VM intact.
>> +This is primarily designed to allow "crash kernels", 
>> +which can do core dumps of memory to help with debugging in the event of a crash.
>> +
>> +### xentrace
>> +
>> +Status, x86: Supported
>> +
>> +Tool to capture Xen trace buffer data
>> +
>> +### gcov
>> +
>> +Status: Supported, Not security supported
> 
> I agree with excluding security support here, but why wouldn't the
> same be the case for gdbsx and xentrace?

From my initial post:

---

gdbsx security support: Someone may want to debug an untrusted guest,
so I think we should say 'yes' here.

xentrace: Users may want to trace guests in production environments,
so I think we should say 'yes'.

gcov: No good reason to run a gcov hypervisor in a production
environment.  There may be ways for a rogue guest to DoS the host.

---

xentrace I would argue for security support; I've asked customers to
send me xentrace data as part of analysis before.  I also know enough
about it that I'm reasonably confident the risk of an attack vector is
pretty low.
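
For reference, the sort of capture I ask for is along these lines
(the event mask below selects scheduler-class records and is only an
example; see the xentrace man page):

    # collect trace records until interrupted
    xentrace -D -e 0x0002f000 /tmp/trace.bin
    # post-process offline
    xenalyze --summary /tmp/trace.bin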

I don't have a strong opinion on gdbsx; I'd call it 'supported', but if
you think we need to exclude it from security support I'm happy with
that as well.
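
(For anyone who does want to poke at gdbsx, invocation is roughly as
follows -- domid and port are made up for illustration:

    # attach to domain 3, a 64-bit guest, and listen on port 9999
    gdbsx -a 3 64 9999

after which you connect with "target remote <host>:9999" from a gdb
that has the guest kernel's symbols.)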

 -George



Re: [Xen-devel] [PATCH 08/16] SUPPORT.md: Add x86-specific virtual hardware

2017-11-21 Thread George Dunlap
On 11/21/2017 08:39 AM, Jan Beulich wrote:
 On 13.11.17 at 16:41,  wrote:
>> +### x86/Nested PV
>> +
>> +Status, x86 HVM: Tech Preview
>> +
>> +This means running a Xen hypervisor inside an HVM domain,
>> +with support for PV L2 guests only
>> +(i.e., hardware virtualization extensions not provided
>> +to the guest).
>> +
>> +This works, but has performance limitations
>> +because the L1 dom0 can only access emulated L1 devices.
> 
> So is this explicitly meaning Xen-on-Xen? Xen-on-KVM, for example,
> could be considered "nested PV", too. IOW I think it needs to be
> spelled out whether this means the host side of things here, the
> guest one, or both.

Yes, that's true.  But I forget: Can a Xen dom0 use virtio guest
drivers?  I'm pretty sure Stefano tried it at some point but I don't
remember what the result was.

>> +### x86/Nested HVM
>> +
>> +Status, x86 HVM: Experimental
>> +
>> +This means running a Xen hypervisor inside an HVM domain,
>> +with support for running both PV and HVM L2 guests
>> +(i.e., hardware virtualization extensions provided
>> +to the guest).
> 
> "Nested HVM" generally means more than using Xen as the L1
> hypervisor. If this is really to mean just L1 Xen, I think the title
> should already say so, not just the description.

Yes, I mean any sort of nested guest support here.

>> +### x86/Advanced Vector eXtension
>> +
>> +Status: Supported
> 
> As indicated before, I think this either needs to be dropped or
> be extended by an entry for virtually every CPUID bit exposed
> to guests. Furthermore, in this isolated fashion it is not clear
> what derived features (e.g. FMA, FMA4, AVX2, or even AVX-512)
> it is meant to imply. If any of them are implied, "with caveats"
> would need to be added as long as the instruction emulator isn't
> capable of handling the instructions, yet.

Adding a section for CPUID bits supported (and to what level) sounds
like a useful thing to do, perhaps in the next release.

>> +### x86/HVM EFI
>> +
>> +Status: Supported
>> +
>> +Booting a guest via guest EFI firmware
> 
> Shouldn't this say OVMF, to avoid covering possible other
> implementations?

I don't expect that we'll ever need more than one EFI implementation in
the tree.  If a time comes when it makes sense to have two, we can
adjust the entry accordingly.

 -George




Re: [Xen-devel] [PATCH 06/16] SUPPORT.md: Add scalability features

2017-11-21 Thread George Dunlap
On 11/21/2017 05:31 PM, Julien Grall wrote:
> Hi George,
> 
> On 11/21/2017 04:43 PM, George Dunlap wrote:
>> On 11/16/2017 03:19 PM, Julien Grall wrote:
>>> On 13/11/17 15:41, George Dunlap wrote:
>>>> Signed-off-by: George Dunlap 
>>>> ---
>>>> CC: Ian Jackson 
>>>> CC: Wei Liu 
>>>> CC: Andrew Cooper 
>>>> CC: Jan Beulich 
>>>> CC: Stefano Stabellini 
>>>> CC: Konrad Wilk 
>>>> CC: Tim Deegan 
>>>> CC: Julien Grall 
>>>> ---
>>>>    SUPPORT.md | 21 +
>>>>    1 file changed, 21 insertions(+)
>>>>
>>>> diff --git a/SUPPORT.md b/SUPPORT.md
>>>> index c884fac7f5..a8c56d13dd 100644
>>>> --- a/SUPPORT.md
>>>> +++ b/SUPPORT.md
>>>> @@ -195,6 +195,27 @@ on embedded platforms.
>>>>      Enables NUMA aware scheduling in Xen
>>>>    +## Scalability
>>>> +
>>>> +### 1GB/2MB super page support
>>>> +
>>>> +    Status, x86 HVM/PVH: Supported
>>>> +    Status, ARM: Supported
>>>> +
>>>> +NB that this refers to the ability of guests
>>>> +to have higher-level page table entries point directly to memory,
>>>> +improving TLB performance.
>>>> +This is independent of the ARM "page granularity" feature (see below).
>>>
>>> I am not entirely sure about this paragraph for Arm. I understood this
>>> section as support for stage-2 page-table (aka EPT on x86) but the
>>> paragraph led me to believe it is for the guest.
>>>
>>> The size of super pages in guests will depend on the page granularity
>>> the guest itself uses and the format of the page-table (e.g. LPAE vs
>>> short descriptor). We have no control over that.
>>>
>>> What we do have control over is the size of the mappings used for the stage-2 page-table.
>>
>> Stepping back from the document for a minute: would it make sense to use
>> "hardware assisted paging" (HAP) for Intel EPT, AMD RVI (previously
>> NPT), and ARM stage-2 pagetables?  HAP was already a general term used
>> to describe the two x86 technologies; and I think the description makes
>> sense, because if we didn't have hardware-assisted stage 2 pagetables
>> we'd need Xen-provided shadow pagetables.
> 
> I think using the term "hardware assisted paging" should be fine to
> refer the 3 technologies.

OK, great.

[snip]

> Short-descriptor always uses 4KB granularity and supports 16MB, 1MB, and 64KB superpages.
> 
> LPAE supports 4KB, 16KB, and 64KB granularities, each of them having
> different superpage sizes.

Yes, that's why I started saying "L2 and L3 superpages", to mean
"Superpage entries in L2 or L3 pagetables", instead of 2MiB or 1GiB.
(Let me know if you can think of a better way to describe that.)
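
(To spell out the arithmetic: with 4KiB granularity a table has 512
entries, so an L2 block entry maps 512 x 4KiB = 2MiB and an L3 entry
maps 512 x 2MiB = 1GiB; with 64KiB granularity a table has 8192
entries, so an L2 block entry already maps 8192 x 64KiB = 512MiB.)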

>> 3. Whether Xen provides the *interface* for a guest to use L2 or L3
>> superpages (for 4k page granularity, 2MiB or 1GiB respectively) in its
>> own pagetables.  I *think* HAP on x86 provides the interface whenever
>> the underlying hardware does.  I assume it's the same for ARM?  In the
>> case of shadow mode, we only provide the interface for 2MiB superpages.
> 
> See above. We have no way to control that in the guest.

We don't control whether the guest uses *any* features.  Should we not
mention PV disks or SMMUv2 or whatever because we don't know if the
guest will use them?

Of course not.  This document describes whether the guest *has the
features available to use*, either provided by the hardware or emulated
by Xen.

It sounds like you may not have ever thought about whether an ARM guest
has L2 or L3 superpages available, because it's always had all of them;
but it's different on x86.

[snip]

>> 2. Whether Xen uses superpage mappings for HAP.  Xen uses this on x86
>> when hardware support is available -- I take it Xen does this on ARM as well?
>
> The size of superpages supported will depend on the page-table format
> (short-descriptor vs LPAE) and the granularity used.
>
> Supersection (16MB) for short-descriptor is optional but mandatory when
> the processor supports LPAE. LPAE is mandatory with virtualization. So
> all superpage sizes are supported.
>
> Note that stage-2 page-tables can only use the LPAE page-table format.
>
> I would also rather avoid mentioning any superpage sizes for Arm in
> SUPPORT.md as there are a lot of them.

So it sounds like basically everything supported on native was supported
in virtualization (and under Xen) from the start, so it's probably less
important to mention.  But since we *will* need to do that for x86, we
probably need to say *something* in case people want to know.

Let me see what I can come up with.

 -George



Re: [Xen-devel] [PATCH 07/16] SUPPORT.md: Add virtual devices common to ARM and x86

2017-11-21 Thread George Dunlap
On 11/21/2017 08:29 AM, Jan Beulich wrote:
>> +### QEMU backend hotplugging for xl
>> +
>> +Status: Supported
> 
> Wouldn't this more appropriately be
> 
> ### QEMU backend hotplugging
> 
> Status, xl: Supported

You mean, for this whole section (i.e., everything here that says 'for
xl')?  If not, why this one in particular?

>> +## Virtual driver support, guest side
>> +
>> +### Blkfront
>> +
>> +Status, Linux: Supported
>> +Status, FreeBSD: Supported, Security support external
>> +Status, NetBSD: Supported, Security support external
>> +Status, Windows: Supported
>> +
>> +Guest-side driver capable of speaking the Xen PV block protocol
>> +
>> +### Netfront
>> +
>> +Status, Linux: Supported
>> +Status, Windows: Supported
>> +Status, FreeBSD: Supported, Security support external
>> +Status, NetBSD: Supported, Security support external
>> +Status, OpenBSD: Supported, Security support external
> 
> Seeing the difference in OSes between the two (with the variance
> increasing in entries further down) - what does the absence of an
> OS on one list, but its presence on another mean? While not
> impossible, I would find it surprising if e.g. OpenBSD had netfront
> but not even a basic blkfront.

Actually -- at least according to the paper presenting PV frontends for
OpenBSD in 2016 [1], they implemented xenstore and netfront frontends,
but not (at least at that point) a disk frontend.

However, blkfront does appear as a feature in OpenBSD 6.1, released in
April [2]; so I'll add that one in.  (Perhaps Roger hadn't heard that it
had been implemented.)

[1] https://www.openbsd.org/papers/asiabsdcon2016-xen-paper.pdf

[2] https://www.openbsd.org/61.html

 -George



Re: [Xen-devel] [PATCH 07/16] SUPPORT.md: Add virtual devices common to ARM and x86

2017-11-21 Thread George Dunlap
On 11/21/2017 11:41 AM, Jan Beulich wrote:
 On 21.11.17 at 11:56,  wrote:
>> On 11/21/2017 08:29 AM, Jan Beulich wrote:
>> On 13.11.17 at 16:41,  wrote:
 +### PV USB support for xl
 +
 +Status: Supported
 +
 +### PV 9pfs support for xl
 +
 +Status: Tech Preview
>>>
>>> Why are these two being called out, but xl support for other device
>>> types isn't?
>>
>> Do you see how big this document is? :-)  If you think something else
>> needs to be covered, don't ask why I didn't mention it, just say what
>> you think I missed.
> 
> Well, (not very) implicitly here: The same for all other PV protocols.

Oh, I see -- you didn't read my comment below the `---` pointing this
out.  :-)

Yes, I wasn't quite sure what to do here.  We already list all the PV
protocols in at least 2 places (frontend and backend support); it seemed
a bit redundant to list them all again in xl and/or libxl support.

Except, of course, that there are a number of protocols *not* plumbed
through the toolstack yet -- PVSCSI being one example.

Any suggestions would be welcome.

 -George



Re: [Xen-devel] Ping: [PATCH] VMX: sync CPU state upon vCPU destruction

2017-11-21 Thread George Dunlap
On 11/21/2017 04:42 PM, Dario Faggioli wrote:
> On Tue, 2017-11-21 at 08:29 -0700, Jan Beulich wrote:
> On 21.11.17 at 15:07,  wrote:
>>>
>> The question here is: In what other cases do we expect an RCU
>> callback to possibly touch guest state? I think the common use is
>> to merely free some memory in a delayed fashion.
>>
>>> Those choices that you outlined appear to be different in terms of
>>> whether we solve the general problem and probably have some minor
>>> performance impact, or we solve the ad-hoc problem but make the
>>> system more entangled. Here I'm more inclined to the first choice
>>> because in this particular scenario the performance impact should be
>>> negligible.
>>
>> For the problem at hand there's no question about a
>> performance effect. The question is whether doing this for _other_
>> RCU callbacks would introduce a performance drop in certain cases.
>>
> Well, I personally favour the approach of making the piece of code that
> plays with the context responsible for not messing up when doing so.
> 
> And (replying to Igor's comment above), I don't think that syncing
> context before RCU handlers solves the general problem --as you're
> calling it-- of "VMX code asynchronously messing with the context".
> In fact, it solves the specific problem of "VMX code called via RCU,
> asynchronously messing with the context".
> There may be other places where (VMX?) code messes with context, *not*
> from within an RCU handler, and that would still be an issue.

Yes, to expand on what I said earlier: Given that we cannot (at least
between now and the release) make it so that developers *never* have to
think about syncing state, it seems like the best thing to do is to make
coders *always* think about syncing state.  Syncing always in the RCU
handler means coders can get away sometimes without syncing; which makes
it more likely we'll forget in some other circumstance where it matters.

But that's my take on general principles; like Dario I wouldn't argue
too strongly if someone felt differently.

 -George



Re: [Xen-devel] [PATCH 06/16] SUPPORT.md: Add scalability features

2017-11-21 Thread George Dunlap
On 11/16/2017 03:19 PM, Julien Grall wrote:
> Hi George,
> 
> On 13/11/17 15:41, George Dunlap wrote:
>> Superpage support and PVHVM.
>>
>> Signed-off-by: George Dunlap 
>> ---
>> CC: Ian Jackson 
>> CC: Wei Liu 
>> CC: Andrew Cooper 
>> CC: Jan Beulich 
>> CC: Stefano Stabellini 
>> CC: Konrad Wilk 
>> CC: Tim Deegan 
>> CC: Julien Grall 
>> ---
>>   SUPPORT.md | 21 +
>>   1 file changed, 21 insertions(+)
>>
>> diff --git a/SUPPORT.md b/SUPPORT.md
>> index c884fac7f5..a8c56d13dd 100644
>> --- a/SUPPORT.md
>> +++ b/SUPPORT.md
>> @@ -195,6 +195,27 @@ on embedded platforms.
>>     Enables NUMA aware scheduling in Xen
>>   +## Scalability
>> +
>> +### 1GB/2MB super page support
>> +
>> +    Status, x86 HVM/PVH: Supported
>> +    Status, ARM: Supported
>> +
>> +NB that this refers to the ability of guests
>> +to have higher-level page table entries point directly to memory,
>> +improving TLB performance.
>> +This is independent of the ARM "page granularity" feature (see below).
> 
> I am not entirely sure about this paragraph for Arm. I understood this
> section as support for stage-2 page-table (aka EPT on x86) but the
> paragraph led me to believe it is for the guest.
> 
> The size of super pages in guests will depend on the page granularity
> the guest itself uses and the format of the page-table (e.g. LPAE vs
> short descriptor). We have no control over that.
> 
> What we do have control over is the size of the mappings used for the stage-2 page-table.

Stepping back from the document for a minute: would it make sense to use
"hardware assisted paging" (HAP) for Intel EPT, AMD RVI (previously
NPT), and ARM stage-2 pagetables?  HAP was already a general term used
to describe the two x86 technologies; and I think the description makes
sense, because if we didn't have hardware-assisted stage 2 pagetables
we'd need Xen-provided shadow pagetables.

Back to the question at hand, there are four different things:

1. Whether Xen itself uses superpage mappings for its virtual address
space.  (Not sure if Xen does this or not.)

2. Whether Xen uses superpage mappings for HAP.  Xen uses this on x86
>> when hardware support is available -- I take it Xen does this on ARM as well?

3. Whether Xen provides the *interface* for a guest to use L2 or L3
superpages (for 4k page granularity, 2MiB or 1GiB respectively) in its
own pagetables.  I *think* HAP on x86 provides the interface whenever
the underlying hardware does.  I assume it's the same for ARM?  In the
>> case of shadow mode, we only provide the interface for 2MiB superpages.

4. Whether a guest using L2 or L3 superpages will actually have
superpages, or whether it's "only emulated".  As Jan said, for shadow
pagetables on x86, the underlying pagetables still only have 4k pages,
so the guest will get no benefit from using L2 superpages in its
pagetables (either in terms of reduced memory reads on a tlb miss, or in
terms of larger effectiveness of each TLB entry).

#3 and #4 are probably the most pertinent to users, with #2 being next
on the list, and #1 being least.

Does that make sense?

 -George



Re: [Xen-devel] Ping: [PATCH] VMX: sync CPU state upon vCPU destruction

2017-11-21 Thread George Dunlap
On 11/21/2017 01:22 PM, Jan Beulich wrote:
 On 09.11.17 at 15:49,  wrote:
>> See the code comment being added for why we need this.
>>
>> Reported-by: Igor Druzhinin 
>> Signed-off-by: Jan Beulich 
> 
> I realize we aren't settled yet on where to put the sync call. The
> discussion appears to have stalled, though. Just to recap,
> alternatives to the placement below are
> - at the top of complete_domain_destroy(), being the specific
>   RCU callback exhibiting the problem (others are unlikely to
>   touch guest state)
> - in rcu_do_batch(), paralleling the similar call from
>   do_tasklet_work()

I read through the discussion yesterday without digging into the code.
At the moment, I'd say that specific code needing to touch potentially
non-sync'd state should be marked to sync it, rather than syncing it all
the time.  But I don't have a strong opinion (particularly as I haven't
dug into the code).

 -George



Re: [Xen-devel] [PATCH 04/16] SUPPORT.md: Add core ARM features

2017-11-21 Thread George Dunlap


On Nov 21, 2017, at 11:37 AM, Jan Beulich <jbeul...@suse.com> wrote:

On 21.11.17 at 11:45, <george.dun...@citrix.com> wrote:
On 11/21/2017 08:11 AM, Jan Beulich wrote:
On 13.11.17 at 16:41, <george.dun...@citrix.com> wrote:
+### ARM/SMMUv1
+
+Status: Supported
+
+### ARM/SMMUv2
+
+Status: Supported

Do these belong here, when IOMMU isn't part of the corresponding
x86 patch?

Since there was recently a time when these weren't supported, I think
it's useful to have them in here.  (Julien, let me know if you think
otherwise.)

Do you think it would be useful to include an IOMMU line for x86?

At this point of the series I would surely have said "yes". The
later PCI passthrough additions state this implicitly at least (by
requiring an IOMMU for passthrough to be supported at all).
But even then saying so explicitly may be better.

How much do we specifically need to break down?  AMD / Intel?

What about something like this?

### IOMMU

Status, AMD IOMMU: Supported
Status, Intel VT-d: Supported
Status, ARM SMMUv1: Supported
Status, ARM SMMUv2: Supported

 -George


Re: [Xen-devel] [PATCH 03/16] SUPPORT.md: Add some x86 features

2017-11-21 Thread George Dunlap


On Nov 21, 2017, at 11:35 AM, Jan Beulich <jbeul...@suse.com> wrote:

On 21.11.17 at 11:42, <george.dun...@citrix.com> wrote:
On 11/21/2017 08:09 AM, Jan Beulich wrote:
On 13.11.17 at 16:41, <george.dun...@citrix.com> wrote:
+### x86/PVH guest
+
+Status: Supported
+
+PVH is a next-generation paravirtualized mode
+designed to take advantage of hardware virtualization support when possible.
+During development this was sometimes called HVMLite or PVHv2.
+
+Requires hardware virtualisation support (Intel VMX / AMD SVM)

I think it needs to be said that only DomU is considered supported.
Dom0 is perhaps not even experimental at this point, considering
the panic() in dom0_construct_pvh().

Indeed, that's why dom0 PVH isn't in the list, and why this says 'PVH
guest', and is in the 'Guest Type' section.  We generally don't say,
"Oh, and we don't have this feature at all".

If you think it's important we could add a sentence here explicitly
stating that dom0 PVH isn't supported, but I sort of feel like it isn't
necessary.

Much depends on whether you think "guest" == "DomU". To me
Dom0 is a guest, too.

That’s not how I’ve ever understood those terms.

A guest at a hotel is someone who is served, and who does not have (legal) 
access to the internals of the system.  The maids who clean the room and the 
janitors who sweep the floors are hosts, because they have (to various degrees) 
extra access designed to help them serve the guests.

A “guest” is a virtual machine that does not have access to the internals of 
the system; that is the “target” of virtualization.  As such, the dom0 kernel 
and all the toolstack / emulation code running in domain 0 are part of the 
“host”.

Domain 0 is a domain and a VM, but only domUs are guests.

Any other opinions on this?  Do we need to add these to the terms defined at 
the bottom?

 -George


Re: [Xen-devel] [PATCH 07/16] SUPPORT.md: Add virtual devices common to ARM and x86

2017-11-21 Thread George Dunlap
On 11/21/2017 08:29 AM, Jan Beulich wrote:
 On 13.11.17 at 16:41,  wrote:
>> +### PV USB support for xl
>> +
>> +Status: Supported
>> +
>> +### PV 9pfs support for xl
>> +
>> +Status: Tech Preview
> 
> Why are these two being called out, but xl support for other device
> types isn't?

Do you see how big this document is? :-)  If you think something else
needs to be covered, don't ask why I didn't mention it, just say what
you think I missed.

> 
>> +### QEMU backend hotplugging for xl
>> +
>> +Status: Supported
> 
> Wouldn't this more appropriately be
> 
> ### QEMU backend hotplugging
> 
> Status, xl: Supported

Maybe -- let me think about it.

> 
> ?
> 
>> +## Virtual driver support, guest side
>> +
>> +### Blkfront
>> +
>> +Status, Linux: Supported
>> +Status, FreeBSD: Supported, Security support external
>> +Status, NetBSD: Supported, Security support external
>> +Status, Windows: Supported
>> +
>> +Guest-side driver capable of speaking the Xen PV block protocol
>> +
>> +### Netfront
>> +
>> +Status, Linux: Supported
>> +Status, Windows: Supported
>> +Status, FreeBSD: Supported, Security support external
>> +Status, NetBSD: Supported, Security support external
>> +Status, OpenBSD: Supported, Security support external
> 
> Seeing the difference in OSes between the two (with the variance
> increasing in entries further down) - what does the absence of an
> OS on one list, but its presence on another mean? While not
> impossible, I would find it surprising if e.g. OpenBSD had netfront
> but not even a basic blkfront.

Good catch.  Roger suggested that I add the OpenBSD Netfront; he's away
so I'll have to see if I can figure out if they have blkfront support or
not.

>> +Guest-side driver capable of speaking the Xen PV networking protocol
>> +
>> +### PV Framebuffer (frontend)
>> +
>> +Status, Linux (xen-fbfront): Supported
>> +
>> +Guest-side driver capable of speaking the Xen PV Framebuffer protocol
>> +
>> +### PV Console (frontend)
>> +
>> +Status, Linux (hvc_xen): Supported
>> +Status, Windows: Supported
>> +Status, FreeBSD: Supported, Security support external
>> +Status, NetBSD: Supported, Security support external
>> +
>> +Guest-side driver capable of speaking the Xen PV console protocol
>> +
>> +### PV keyboard (frontend)
>> +
>> +Status, Linux (xen-kbdfront): Supported
>> +Status, Windows: Supported
>> +
>> +Guest-side driver capable of speaking the Xen PV keyboard protocol
> 
> Are these three active/usable in guests regardless of whether the
> guest is being run PV, PVH, or HVM? If not, wouldn't this need
> spelling out?

In theory I think they could be used; I suspect it's just that they
aren't used.  Let me see if I can think of a way to concisely express that.

>> +## Virtual device support, host side
>> +
>> +### Blkback
>> +
>> +Status, Linux (blkback): Supported
> 
> Strictly speaking, if the driver name is to be spelled out here in
> the first place, it's xen-blkback here and ...
> 
>> +Status, FreeBSD (blkback): Supported, Security support external
>> +Status, NetBSD (xbdback): Supported, security support external
>> +Status, QEMU (xen_disk): Supported
>> +Status, Blktap2: Deprecated
>> +
>> +Host-side implementations of the Xen PV block protocol
>> +
>> +### Netback
>> +
>> +Status, Linux (netback): Supported
> 
> ... xen-netback here for the upstream kernels.

Ack.


>> +### PV USB (backend)
>> +
>> +Status, Linux: Experimental
> 
> What existing/upstream code does this refer to?

I guess a bunch of patches posted to a mailing list?  Yeah, that's
probably something we should take out.

 -George



Re: [Xen-devel] [PATCH 04/16] SUPPORT.md: Add core ARM features

2017-11-21 Thread George Dunlap
On 11/21/2017 08:11 AM, Jan Beulich wrote:
 On 13.11.17 at 16:41,  wrote:
>> +### ARM/SMMUv1
>> +
>> +Status: Supported
>> +
>> +### ARM/SMMUv2
>> +
>> +Status: Supported
> 
> Do these belong here, when IOMMU isn't part of the corresponding
> x86 patch?

Since there was recently a time when these weren't supported, I think
it's useful to have them in here.  (Julien, let me know if you think
otherwise.)

Do you think it would be useful to include an IOMMU line for x86?

 -George



Re: [Xen-devel] [PATCH 03/16] SUPPORT.md: Add some x86 features

2017-11-21 Thread George Dunlap
On 11/21/2017 08:09 AM, Jan Beulich wrote:
 On 13.11.17 at 16:41,  wrote:
>> +### x86/PVH guest
>> +
>> +Status: Supported
>> +
>> +PVH is a next-generation paravirtualized mode 
>> +designed to take advantage of hardware virtualization support when possible.
>> +During development this was sometimes called HVMLite or PVHv2.
>> +
>> +Requires hardware virtualisation support (Intel VMX / AMD SVM)
> 
> I think it needs to be said that only DomU is considered supported.
> Dom0 is perhaps not even experimental at this point, considering
> the panic() in dom0_construct_pvh().

Indeed, that's why dom0 PVH isn't in the list, and why this says 'PVH
guest', and is in the 'Guest Type' section.  We generally don't say,
"Oh, and we don't have this feature at all".

If you think it's important we could add a sentence here explicitly
stating that dom0 PVH isn't supported, but I sort of feel like it isn't
necessary.

>> +### Host ACPI (via Domain 0)
>> +
>> +Status, x86 PV: Supported
>> +Status, x86 PVH: Tech preview
>
> Are we this far already? Preview implies functional completeness,
> but I'm not sure about all ACPI related parts actually having been
> implemented (and see also below). But perhaps things like P and C
> state handling come as individual features later on.

Hmm, yeah, it doesn't make much sense to say that we have "Tech preview"
status for a feature with a PVH dom0, when PVH dom0 itself isn't even
'experimental' yet.  I'll remove this (unless Roger or Wei want to object).

 -George




Re: [Xen-devel] [PATCH 02/16] SUPPORT.md: Add core functionality

2017-11-21 Thread George Dunlap
On 11/21/2017 08:03 AM, Jan Beulich wrote:
 On 13.11.17 at 16:41,  wrote:
>> --- a/SUPPORT.md
>> +++ b/SUPPORT.md
>> @@ -16,6 +16,65 @@ for the definitions of the support status levels etc.
>>  
>>  # Feature Support
>>  
>> +## Memory Management
>> +
>> +### Memory Ballooning
>> +
>> +Status: Supported
> 
> Is this a proper feature in the context we're talking about? To me
> it's meaningful in guest OS context only. I also wouldn't really
> consider it "core", but placement within the series clearly is a minor
> aspect.
> 
> I'd prefer this to be dropped altogether as a feature, but

This doesn't make any sense to me.  Allowing a guest to modify its own
memory requires a *lot* of support, spread throughout the hypervisor;
and there are a huge number of recent security holes that would have
been much more difficult to exploit if guests didn't have the ability to
balloon up or down.

If what you mean is *specifically* the technique of making a "memory
balloon" to trick the guest OS into handing back memory without knowing
it, then it's just a matter of semantics.  We could call this "dynamic
memory control" or something like that if you prefer (although we'd have
to mention ballooning in the description to make sure people can find it).
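
(Whatever we end up calling it, the toolstack-facing side of the feature
is e.g.:

    # ask the balloon driver in the guest to reach a 2048MiB target
    xl mem-set <domain> 2048

and it's all the hypervisor machinery behind that which needs the
security support.)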

> Acked-by: Jan Beulich 
> is independent of that.
> 
>> +### Credit2 Scheduler
>> +
>> +Status: Supported
> 
> Sort of unrelated, but with this having been the case since 4.8 as it
> looks, is there a reason it still isn't the default scheduler?
Well, first of all, it was missing some features which credit1 had:
namely, soft affinity (required for host NUMA awareness) and caps.
These were checked in during this release cycle; but we also wanted to
switch the default at the beginning of a development cycle to get the
highest chance of shaking out any weird bugs.

So according to those criteria, we could switch to credit2 being the
default scheduler as soon as the 4.10 development window opens.

At some point recently Dario said there was still some unusual behavior
he wanted to dig into; but I think with him not working for Citrix
anymore, it's doubtful we'll have the resources to take that up; the best
option might be to just pull the lever and see what happens.
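
(Mechanically the switch is trivial, and anyone who wants to test the
water can already opt in, with something like:

    # on the Xen command line
    sched=credit2

    # or per-cpupool, in the cpupool config file
    sched="credit2"

It's only the default that's at stake.)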

 -George



Re: [Xen-devel] [PATCH 06/16] SUPPORT.md: Add scalability features

2017-11-16 Thread George Dunlap
On 11/16/2017 03:19 PM, Julien Grall wrote:
> Hi George,
> 
> On 13/11/17 15:41, George Dunlap wrote:
>> Superpage support and PVHVM.
>>
>> Signed-off-by: George Dunlap 
>> ---
>> CC: Ian Jackson 
>> CC: Wei Liu 
>> CC: Andrew Cooper 
>> CC: Jan Beulich 
>> CC: Stefano Stabellini 
>> CC: Konrad Wilk 
>> CC: Tim Deegan 
>> CC: Julien Grall 
>> ---
>>   SUPPORT.md | 21 +
>>   1 file changed, 21 insertions(+)
>>
>> diff --git a/SUPPORT.md b/SUPPORT.md
>> index c884fac7f5..a8c56d13dd 100644
>> --- a/SUPPORT.md
>> +++ b/SUPPORT.md
>> @@ -195,6 +195,27 @@ on embedded platforms.
>>     Enables NUMA aware scheduling in Xen
>>   +## Scalability
>> +
>> +### 1GB/2MB super page support
>> +
>> +    Status, x86 HVM/PVH: Supported
>> +    Status, ARM: Supported
>> +
>> +NB that this refers to the ability of guests
>> +to have higher-level page table entries point directly to memory,
>> +improving TLB performance.
>> +This is independent of the ARM "page granularity" feature (see below).
> 
> I am not entirely sure about this paragraph for Arm. I understood this
> section as support for stage-2 page-table (aka EPT on x86) but the
> paragraph led me to believe it is for the guest.

Hmm, yes likely there was some confusion when this was listed.  We
probably should make separate entries for HAP / stage 2 superpage
support and guest PT superpage support.

 -George



Re: [Xen-devel] [BUG] Error applying XSA240 update 5 on 4.8 and 4.9 (patch 3 references CONFIG_PV_LINEAR_PT, 3285e75dea89, x86/mm: Make PV linear pagetables optional)

2017-11-16 Thread George Dunlap
On 11/16/2017 01:04 PM, Jan Beulich wrote:
 On 16.11.17 at 13:30,  wrote:
>> On Thursday, 16 November 2017 8:30:39 PM AEDT Jan Beulich wrote:
>> On 15.11.17 at 23:48,  wrote:
 I am having trouble applying patch 3 from XSA240 update 5 for Xen
 stable 4.8 and 4.9
 xsa240 0003 contains:

 CONFIG_PV_LINEAR_PT

 from:

 x86/mm: Make PV linear pagetables optional
 https://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=3285e75dea89afb0ef5b3ee39bd15194bd7cc110

 I cannot find this string in an XSA, nor is an XSA referenced in the
 commit.
 Am I missing a patch, or doing something wrong?
>>>
>>> Well, you're expected to apply all patches which haven't been
>>> applied so far. In particular, in the stable version trees, the 2nd
>>> patch hasn't gone in yet (I'm intending to do this later today),
>>> largely because it (a) wasn't ready at the time the first patch
>>> went in and (b) it is more a courtesy patch than an actual part of
>>> the security fix.
>>
>> I'm not quite sure this is a great idea... They should work on the released 
>> versions - hence xsa240 patchset should apply to the base tarball + current 
>> XSA patches. If there is something in the git that *isn't* in the latest 
>> release, it should be included in the XSA patchset - otherwise the set is 
>> incomplete.
> 
> Well, I've been taking a different view: The only valid (or so to say
> canonical) base to supply patches against is the current tip of the
> respective staging branch. Anyone wanting to apply to anything
> older will need to make adjustments, if need be. Otherwise what
> would keep you or others from requesting, say, not only patches against
> 4.7.3, but also against 4.7.0, 4.7.1, and 4.7.2?

Jan,

These are two different things.  Steve's reluctance to backport a
potentially arbitrary number of non-security-related patches is
completely reasonable.

Steve, one of the problems with what you ask is that as a security team,
we'd like to be able to take the patches given in the advisory and check
them in, as-is, to the staging branches.  That makes it easier, for
instance, to make sure that all the XSAs have been applied before we do
a release; and it means that we only need to review one patch per
supported release (up to 5 potential patches at this time in addition to
the one to xen-unstable) rather than two (up to 10 potential patches).

 -George



Re: [Xen-devel] [PATCH 2/2 v2] xen: Fix 16550 UART console for HP Moonshot (Aarch64) platform

2017-11-16 Thread George Dunlap
On Nov 15, 2017, at 9:20 PM, Konrad Rzeszutek Wilk wrote:
> 
> On Thu, Nov 09, 2017 at 03:49:24PM +0530, Bhupinder Thakur wrote:
>>The console was not working on HP Moonshot (HPE Proliant Aarch64) because
>>the UART registers were accessed as 8-bit aligned addresses. However,
>>registers are 32-bit aligned for HP Moonshot.
>> 
>>Since ACPI/SPCR table does not specify the register shift to be applied to the
>>register offset, this patch implements an erratum to correctly set the register
>>shift for HP Moonshot.
>> 
>>Similar erratum was implemented in linux:
>> 
>>commit 79a648328d2a604524a30523ca763fbeca0f70e3
>>Author: Loc Ho 
>>Date:   Mon Jul 3 14:33:09 2017 -0700
>> 
>>ACPI: SPCR: Workaround for APM X-Gene 8250 UART 32-alignment errata
>> 
>>APM X-Gene version 1 and 2 have an 8250 UART with its register
>>aligned to 32-bit. In addition, the latest released BIOS
>>encodes the access field as 8-bit access instead of 32-bit access.
>>This causes no console with ACPI boot as the console
>>will not match X-Gene UART port due to the lack of mmio32
>>option.
>> 
>>Signed-off-by: Loc Ho 
>>Acked-by: Greg Kroah-Hartman 
>>Signed-off-by: Rafael J. Wysocki 
> 
> Any particular reason you offset this whole commit description by four spaces?

I get this effect when I use “git show” to look at a changeset for some reason. 
 Bhupinder, did you perhaps export a changeset as a patch using “git show” and 
then re-import it?
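
(The round-trip that preserves the original formatting is:

    git format-patch -1 <commit>   # export
    git am 0001-*.patch            # re-import

"git show" indents the commit message by four spaces, which is exactly
what we see here.)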

In any case, this needs to be fixed.

 -George



[Xen-devel] [PATCH 13/16] SUPPORT.md: Add secondary memory management features

2017-11-13 Thread George Dunlap
Signed-off-by: George Dunlap 
---
CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Konrad Wilk 
CC: Tim Deegan 
CC: Tamas K Lengyel 
---
 SUPPORT.md | 31 +++
 1 file changed, 31 insertions(+)

diff --git a/SUPPORT.md b/SUPPORT.md
index 0f7426593e..3e352198ce 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -187,6 +187,37 @@ Export hypervisor coverage data suitable for analysis by gcov or lcov.
 
 Status: Supported
 
+### Memory Sharing
+
+Status, x86 HVM: Tech Preview
+Status, ARM: Tech Preview
+
+Allow sharing of identical pages between guests
+
+### Memory Paging
+
+Status, x86 HVM: Experimental
+
+Allow pages belonging to guests to be paged to disk
+
+### Transcendent Memory
+
+Status: Experimental
+
+Transcendent Memory (tmem) allows the creation of hypervisor memory pools
+which guests can use to store memory 
+rather than caching in their own memory or swapping to disk.
+Having these in the hypervisor
+can allow more efficient aggregate use of memory across VMs.
+
+### Alternative p2m
+
+Status, x86 HVM: Tech Preview
+Status, ARM: Tech Preview
+
+Allows external monitoring of guest memory
+by maintaining multiple physical to machine (p2m) memory mappings.
+
 ## Resource Management
 
 ### CPU Pools
-- 
2.15.0




[Xen-devel] [PATCH 16/16] SUPPORT.md: Add limits RFC

2017-11-13 Thread George Dunlap
Signed-off-by: George Dunlap 
---
Could someone take this one over as well?

CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Konrad Wilk 
CC: Tim Deegan 
---
 SUPPORT.md | 61 +
 1 file changed, 61 insertions(+)

diff --git a/SUPPORT.md b/SUPPORT.md
index e72f9f3892..d11e05fc2a 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -64,6 +64,53 @@ for the definitions of the support status levels etc.
 
 Extension to the GICv3 interrupt controller to support MSI.
 
+## Limits/Host
+
+### CPUs
+
+Limit, x86: 4095
+Limit, ARM32: 8
+Limit, ARM64: 128
+
+Note that for x86, a very large number of CPUs may not work/boot,
+but we will still provide security support
+
+### RAM
+
+Limit, x86: 123TiB
+Limit, ARM32: 16GiB
+Limit, ARM64: 5TiB
+
+## Limits/Guest
+
+### Virtual CPUs
+
+Limit, x86 PV: 8192
+Limit-security, x86 PV: 32
+Limit, x86 HVM: 128
+Limit-security, x86 HVM: 32
+Limit, ARM32: 8
+Limit, ARM64: 128
+
+### Virtual RAM
+
+Limit-security, x86 PV: 2047GiB
+Limit-security, x86 HVM: 1.5TiB
+Limit, ARM32: 16GiB
+Limit, ARM64: 1TiB
+
+Note that there are no theoretical limits to PV or HVM guest sizes
+other than those determined by the processor architecture.
+
+### Event Channel 2-level ABI
+
+Limit, 32-bit: 1024
+Limit, 64-bit: 4096
+
+### Event Channel FIFO ABI
+
+Limit: 131072
+
 ## Guest Type
 
 ### x86/PV
@@ -685,6 +732,20 @@ If support differs based on implementation
 (for instance, x86 / ARM, Linux / QEMU / FreeBSD),
 one line for each set of implementations will be listed.
 
+### Limit-security
+
+For size limits.
+This figure shows the largest configuration which will receive
+security support.
+It is generally determined by the maximum amount that is regularly tested.
+This limit will only be listed explicitly
+if it is different than the theoretical limit.
+
+### Limit
+
+This figure shows a theoretical size limit.
+This does not mean that such a large configuration will actually work.
+
 ## Definition of Status labels
 
 Each Status value corresponds to levels of security support,
-- 
2.15.0




[Xen-devel] [PATCH 15/16] SUPPORT.md: Add statement on migration RFC

2017-11-13 Thread George Dunlap
Signed-off-by: George Dunlap 
---
Would someone be willing to take over this one?

CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Konrad Wilk 
CC: Tim Deegan 
CC: Roger Pau Monne 
CC: Anthony Perard 
CC: Paul Durrant 
CC: Julien Grall 
---
 SUPPORT.md | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/SUPPORT.md b/SUPPORT.md
index a8388f3dc5..e72f9f3892 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -294,6 +294,36 @@ This includes exposing event channels to HVM guests.
 
 ## High Availability and Fault Tolerance
 
+### Live Migration, Save & Restore
+
+Status, x86: Supported, with caveats
+
+A number of features don't work with live migration / save / restore.  These include:
+ * PCI passthrough
+ * vNUMA
+ * Nested HVM
+
+XXX Need to check the following:
+ 
+ * Guest serial console
+ * Crash kernels
+ * Transcendent Memory
+ * Alternative p2m
+ * vMCE
+ * vPMU
+ * Intel Platform QoS
+ * Remus
+ * COLO
+ * PV protocols: Keyboard, PVUSB, PVSCSI, PVTPM, 9pfs, pvcalls?
+ * FLASK?
+ * CPU / memory hotplug?
+
+Additionally, if an HVM guest was booted with memory != maxmem,
+and the balloon driver hadn't hit the target before migration,
+the size of the guest on the far side might be unexpected.
+
+See docs/features/migration.pandoc for more details
+
 ### Remus Fault Tolerance
 
 Status: Experimental
-- 
2.15.0




[Xen-devel] [PATCH 11/16] SUPPORT.md: Add 'easy' HA / FT features

2017-11-13 Thread George Dunlap
Migration being one of the key 'non-easy' ones to be added later.

Signed-off-by: George Dunlap 
---
CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Konrad Wilk 
CC: Tim Deegan 
---
 SUPPORT.md | 16 
 1 file changed, 16 insertions(+)

diff --git a/SUPPORT.md b/SUPPORT.md
index bd83c81557..722a29fec5 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -261,6 +261,22 @@ which add paravirtualized functionality to HVM guests
 for improved performance and scalability.
 This includes exposing event channels to HVM guests.
 
+## High Availability and Fault Tolerance
+
+### Remus Fault Tolerance
+
+Status: Experimental
+
+### COLO Manager
+
+Status: Experimental
+
+### x86/vMCE
+
+Status: Supported
+
+Forward Machine Check Exceptions to appropriate guests
+
 ## Virtual driver support, guest side
 
 ### Blkfront
-- 
2.15.0




[Xen-devel] [PATCH 12/16] SUPPORT.md: Add Security-related features

2017-11-13 Thread George Dunlap
With the exception of driver domains, which depend on PCI passthrough,
and will be introduced later.

Signed-off-by: George Dunlap 
---
CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Konrad Wilk 
CC: Tim Deegan 
CC: Tamas K Lengyel 
CC: Rich Persaud 
---
 SUPPORT.md | 34 ++
 1 file changed, 34 insertions(+)

diff --git a/SUPPORT.md b/SUPPORT.md
index 722a29fec5..0f7426593e 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -421,6 +421,40 @@ there is currently no xl support.
 
 Status: Supported
 
+## Security
+
+### Device Model Stub Domains
+
+Status: Supported
+
+### KCONFIG Expert
+
+Status: Experimental
+
+### Live Patching
+
+Status, x86: Supported
+Status, ARM: Experimental
+
+Compile time disabled for ARM
+
+### Virtual Machine Introspection
+
+Status, x86: Supported, not security supported
+
+### XSM & FLASK
+
+Status: Experimental
+
+Compile time disabled
+
+### FLASK default policy
+
+Status: Experimental
+
+The default policy includes FLASK labels and roles for a "typical" Xen-based system
+with dom0, driver domains, stub domains, domUs, and so on.
+
 ## Virtual Hardware, Hypervisor
 
 ### x86/Nested PV
-- 
2.15.0




[Xen-devel] [PATCH 14/16] SUPPORT.md: Add statement on PCI passthrough

2017-11-13 Thread George Dunlap
Signed-off-by: George Dunlap 
---
CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Konrad Wilk 
CC: Tim Deegan 
CC: Rich Persaud 
CC: Marek Marczykowski-Górecki 
CC: Christopher Clark 
CC: James McKenzie 
---
 SUPPORT.md | 33 -
 1 file changed, 32 insertions(+), 1 deletion(-)

diff --git a/SUPPORT.md b/SUPPORT.md
index 3e352198ce..a8388f3dc5 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -454,9 +454,23 @@ there is currently no xl support.
 
 ## Security
 
+### Driver Domains
+
+Status: Supported, with caveats
+
+"Driver domains" means allowing non-Domain 0 domains 
+with access to physical devices to act as back-ends.
+
+See the appropriate "Device Passthrough" section
+for more information about security support.
+
 ### Device Model Stub Domains
 
-Status: Supported
+Status: Supported, with caveats
+
+Vulnerabilities of a device model stub domain 
+to a hostile driver domain (either compromised or untrusted)
+are excluded from security support.
 
 ### KCONFIG Expert
 
@@ -522,6 +536,23 @@ Virtual Performance Management Unit for HVM guests
 Disabled by default (enable with hypervisor command line option).
+This feature is not security supported: see http://xenbits.xen.org/xsa/advisory-163.html
 
+### x86/PCI Device Passthrough
+
+Status: Supported, with caveats
+
+Only systems using IOMMUs will be supported.
+
+Not compatible with migration, altp2m, introspection, memory sharing, or memory paging.
+
+Because of hardware limitations
+(affecting any operating system or hypervisor),
+it is generally not safe to use this feature 
+to expose a physical device to completely untrusted guests.
+However, this feature can still confer significant security benefit 
+when used to remove drivers and backends from domain 0
+(i.e., Driver Domains).
+See docs/PCI-IOMMU-bugs.txt for more information.
+
 ### ARM/Non-PCI device passthrough
 
 Status: Supported
-- 
2.15.0




Re: [Xen-devel] [PATCH 01/16] Introduce skeleton SUPPORT.md

2017-11-13 Thread George Dunlap
On 11/13/2017 03:41 PM, George Dunlap wrote:
> Add a machine-readable file to describe what features are in what
> state of being 'supported', as well as information about how long this
> release will be supported, and so on.
> 
> The document should be formatted using "semantic newlines" [1], to make
> changes easier.
> 
> Begin with the basic framework.
> 
> Signed-off-by: Ian Jackson 
> Signed-off-by: George Dunlap 

Sending this series out slightly unfinished, as I've gotten diverted
with some security issues.

I think patches 1-14 should be mostly ready.  Patches 15 and 16 both
need some work; if anyone could pick them up I'd appreciate it.

Thanks,
 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH 02/16] SUPPORT.md: Add core functionality

2017-11-13 Thread George Dunlap
Core memory management and scheduling.

Signed-off-by: George Dunlap 
---
CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Tim Deegan 
CC: Dario Faggioli 
CC: Nathan Studer 
---
 SUPPORT.md | 59 +++
 1 file changed, 59 insertions(+)

diff --git a/SUPPORT.md b/SUPPORT.md
index d7f2ae45e4..064a2f43e9 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -16,6 +16,65 @@ for the definitions of the support status levels etc.
 
 # Feature Support
 
+## Memory Management
+
+### Memory Ballooning
+
+Status: Supported
+
+## Resource Management
+
+### CPU Pools
+
+Status: Supported
+
+Groups physical cpus into distinct groups called "cpupools",
+with each pool having the capability
+of using different schedulers and scheduling properties.
+
+### Credit Scheduler
+
+Status: Supported
+
+A weighted proportional fair share virtual CPU scheduler.
+This is the default scheduler.
+
+### Credit2 Scheduler
+
+Status: Supported
+
+A general purpose scheduler for Xen,
+designed with particular focus on fairness, responsiveness, and scalability
+
+### RTDS based Scheduler
+
+Status: Experimental
+
+A soft real-time CPU scheduler 
+built to provide guaranteed CPU capacity to guest VMs on SMP hosts
+
+### ARINC653 Scheduler
+
+Status: Supported
+
+A periodically repeating fixed timeslice scheduler.
+Currently only single-vcpu domains are supported.
+
+### Null Scheduler
+
+Status: Experimental
+
+A very simple, very static scheduling policy 
+that always schedules the same vCPU(s) on the same pCPU(s). 
+It is designed for maximum determinism and minimum overhead
+on embedded platforms.
+
+### NUMA scheduler affinity
+
+Status, x86: Supported
+
+Enables NUMA aware scheduling in Xen
+
 # Format and definitions
 
 This file contains prose, and machine-readable fragments.
-- 
2.15.0




[Xen-devel] [PATCH 07/16] SUPPORT.md: Add virtual devices common to ARM and x86

2017-11-13 Thread George Dunlap
Mostly PV protocols.

Signed-off-by: George Dunlap 
---
The xl side of this seems a bit incomplete: There are a number of
things supported but not mentioned (like networking, &c), and a number
of things not in xl (PV SCSI).  Couldn't find evidence of pvcall or pv
keyboard support.  Also we seem to be missing "PV channels" from this
list entirely

CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Konrad Wilk 
CC: Tim Deegan 
CC: Roger Pau Monne 
CC: Anthony Perard 
CC: Paul Durrant 
CC: Julien Grall 
---
 SUPPORT.md | 160 +
 1 file changed, 160 insertions(+)

diff --git a/SUPPORT.md b/SUPPORT.md
index a8c56d13dd..20c58377a5 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -130,6 +130,22 @@ Output of information in machine-parseable JSON format
 
 Status: Supported
 
+### Qemu based disk backend (qdisk) for xl
+
+Status: Supported
+
+### PV USB support for xl
+
+Status: Supported
+
+### PV 9pfs support for xl
+
+Status: Tech Preview
+
+### QEMU backend hotplugging for xl
+
+Status: Supported
+
 ## Toolstack/3rd party
 
 ### libvirt driver for xl
@@ -216,6 +232,150 @@ which add paravirtualized functionality to HVM guests
 for improved performance and scalability.
 This includes exposing event channels to HVM guests.
 
+## Virtual driver support, guest side
+
+### Blkfront
+
+Status, Linux: Supported
+Status, FreeBSD: Supported, Security support external
+Status, NetBSD: Supported, Security support external
+Status, Windows: Supported
+
+Guest-side driver capable of speaking the Xen PV block protocol
+
+### Netfront
+
+Status, Linux: Supported
+Status, Windows: Supported
+Status, FreeBSD: Supported, Security support external
+Status, NetBSD: Supported, Security support external
+Status, OpenBSD: Supported, Security support external
+
+Guest-side driver capable of speaking the Xen PV networking protocol
+
+### PV Framebuffer (frontend)
+
+Status, Linux (xen-fbfront): Supported
+
+Guest-side driver capable of speaking the Xen PV Framebuffer protocol
+
+### PV Console (frontend)
+
+Status, Linux (hvc_xen): Supported
+Status, Windows: Supported
+Status, FreeBSD: Supported, Security support external
+Status, NetBSD: Supported, Security support external
+
+Guest-side driver capable of speaking the Xen PV console protocol
+
+### PV keyboard (frontend)
+
+Status, Linux (xen-kbdfront): Supported
+Status, Windows: Supported
+
+Guest-side driver capable of speaking the Xen PV keyboard protocol
+
+[XXX 'Supported' here depends on the version we ship in 4.10 having some fixes]
+
+### PV USB (frontend)
+
+Status, Linux: Supported
+
+### PV SCSI protocol (frontend)
+
+Status, Linux: Supported, with caveats
+
+NB that while the PV SCSI backend is in Linux and tested regularly,
+there is currently no xl support.
+
+### PV TPM (frontend)
+
+Status, Linux (xen-tpmfront): Tech Preview
+
+Guest-side driver capable of speaking the Xen PV TPM protocol
+
+### PV 9pfs frontend
+
+Status, Linux: Tech Preview
+
+Guest-side driver capable of speaking the Xen 9pfs protocol
+
+### PVCalls (frontend)
+
+Status, Linux: Tech Preview
+
+Guest-side driver capable of making pv system calls
+
+Note that there is currently no xl support for pvcalls.
+
+## Virtual device support, host side
+
+### Blkback
+
+Status, Linux (blkback): Supported
+Status, FreeBSD (blkback): Supported, Security support external
+Status, NetBSD (xbdback): Supported, Security support external
+Status, QEMU (xen_disk): Supported
+Status, Blktap2: Deprecated
+
+Host-side implementations of the Xen PV block protocol
+
+### Netback
+
+Status, Linux (netback): Supported
+Status, FreeBSD (netback): Supported, Security support external
+Status, NetBSD (xennetback): Supported, Security support external
+
+Host-side implementations of Xen PV network protocol
+
+### PV Framebuffer (backend)
+
+Status, QEMU: Supported
+
+Host-side implementation of the Xen PV framebuffer protocol
+
+### PV Console (xenconsoled)
+
+Status: Supported
+
+Host-side implementation of the Xen PV console protocol
+
+### PV keyboard (backend)
+
+Status, QEMU: Supported
+
+Host-side implementation of the Xen PV keyboard protocol
+
+### PV USB (backend)
+
+Status, Linux: Experimental
+Status, QEMU: Supported
+
+Host-side implementation of the Xen PV USB protocol
+
+### PV SCSI protocol (backend)
+
+Status, Linux: Supported, with caveats
+
+NB that while the PV SCSI backend is in Linux and tested regularly,
+there is currently no xl support.
+
+### PV TPM (backend)
+
+Status: Tech Preview
+
+### PV 9pfs (backend)
+
+Status, QEMU: Tech Preview
+
+### PVCalls (backend)
+
+Status, Linux: Tech Preview
+
+### Online resize of virtual disks
+
+Status: Supported
+
 # Format and definitions
 
 This file contains prose, and machine-readable fragments.

[Xen-devel] [PATCH 05/16] SUPPORT.md: Toolstack core

2017-11-13 Thread George Dunlap
For now only include xl-specific features, or interaction with the
system.  Feature support matrix will be added when features are
mentioned.

Signed-off-by: George Dunlap 
---
CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Konrad Wilk 
CC: Tim Deegan 
---
 SUPPORT.md | 38 ++
 1 file changed, 38 insertions(+)

diff --git a/SUPPORT.md b/SUPPORT.md
index 7c01d8cf9a..c884fac7f5 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -98,6 +98,44 @@ Requires hardware virtualisation support (Intel VMX / AMD SVM)
 
 ARM only has one guest type at the moment
 
+## Toolstack
+
+### xl
+
+Status: Supported
+
+### Direct-boot kernel image format
+
+Supported, x86: bzImage
+Supported, ARM32: zImage
+Supported, ARM64: Image
+
+Format which the toolstack accepts for direct-boot kernels
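For example, the direct-boot fragment of a guest config might look like this (paths invented):

    kernel  = "/boot/guest/vmlinuz"      # bzImage on x86; zImage on ARM32, Image on ARM64
    ramdisk = "/boot/guest/initrd.img"
    extra   = "root=/dev/xvda1 console=hvc0"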
+
+### systemd support for xl
+
+Status: Supported
+
+### JSON output support for xl
+
+Status: Experimental
+
+Output of information in machine-parseable JSON format
+
+### Open vSwitch integration for xl
+
+Status, Linux: Supported
+
+### Virtual cpu hotplug
+
+Status: Supported
+
+## Toolstack/3rd party
+
+### libvirt driver for xl
+
+Status: Supported, Security support external
+
 ## Memory Management
 
 ### Memory Ballooning
-- 
2.15.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH 04/16] SUPPORT.md: Add core ARM features

2017-11-13 Thread George Dunlap
Hardware support and guest type.

Signed-off-by: George Dunlap 
---
CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Konrad Wilk 
CC: Tim Deegan 
CC: Julien Grall 
---
 SUPPORT.md | 29 +
 1 file changed, 29 insertions(+)

diff --git a/SUPPORT.md b/SUPPORT.md
index 6b09f98331..7c01d8cf9a 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -22,6 +22,14 @@ for the definitions of the support status levels etc.
 
 Status: Supported
 
+### ARM v7 + Virtualization Extensions
+
+Status: Supported
+
+### ARM v8
+
+Status: Supported
+
 ## Host hardware support
 
 ### Physical CPU Hotplug
@@ -36,11 +44,26 @@ for the definitions of the support status levels etc.
 
 Status, x86 PV: Supported
 Status, x86 PVH: Tech preview
+Status, ARM: Experimental
 
 ### x86/Intel Platform QoS Technologies
 
 Status: Tech Preview
 
+### ARM/SMMUv1
+
+Status: Supported
+
+### ARM/SMMUv2
+
+Status: Supported
+
+### ARM/GICv3 ITS
+
+Status: Experimental
+
+Extension to the GICv3 interrupt controller to support MSI.
+
 ## Guest Type
 
 ### x86/PV
@@ -69,6 +92,12 @@ During development this was sometimes called HVMLite or PVHv2.
 
 Requires hardware virtualisation support (Intel VMX / AMD SVM)
 
+### ARM guest
+
+Status: Supported
+
+ARM only has one guest type at the moment
+
 ## Memory Management
 
 ### Memory Ballooning
-- 
2.15.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH 01/16] Introduce skeleton SUPPORT.md

2017-11-13 Thread George Dunlap
Add a machine-readable file to describe what features are in what
state of being 'supported', as well as information about how long this
release will be supported, and so on.

The document should be formatted using "semantic newlines" [1], to make
changes easier.

Begin with the basic framework.

Signed-off-by: Ian Jackson 
Signed-off-by: George Dunlap 

[1] http://rhodesmill.org/brandon/2012/one-sentence-per-line/
---
CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Tim Deegan 
CC: Dario Faggioli 
CC: Tamas K Lengyel 
CC: Roger Pau Monne 
CC: Stefano Stabellini 
CC: Anthony Perard 
CC: Paul Durrant 
CC: Konrad Wilk 
CC: Julien Grall 
---
 SUPPORT.md | 196 +
 1 file changed, 196 insertions(+)
 create mode 100644 SUPPORT.md

diff --git a/SUPPORT.md b/SUPPORT.md
new file mode 100644
index 00..d7f2ae45e4
--- /dev/null
+++ b/SUPPORT.md
@@ -0,0 +1,196 @@
+# Support statement for this release
+
+This document describes the support status 
+and in particular the security support status of the Xen branch
+within which you find it.
+
+See the bottom of the file 
+for the definitions of the support status levels etc.
+
+# Release Support
+
+Xen-Version: 4.10-unstable
+Initial-Release: n/a
+Supported-Until: TBD
+Security-Support-Until: Unreleased - not yet security-supported
+
+# Feature Support
+
+# Format and definitions
+
+This file contains prose, and machine-readable fragments.
+The data in a machine-readable fragment relate to
+the section and subsection in which it is found.
+
+The file is in markdown format.
+The machine-readable fragments are markdown literals
+containing RFC-822-like (deb822-like) data.
+
+## Keys found in the Feature Support subsections
+
+### Status
+
+This gives the overall status of the feature,
+including security support status, functional completeness, etc.
+Refer to the detailed definitions below.
+
+If support differs based on implementation
+(for instance, x86 / ARM, Linux / QEMU / FreeBSD),
+one line for each set of implementations will be listed.
+
+## Definition of Status labels
+
+Each Status value corresponds to levels of security support,
+testing, stability, etc., as follows:
+
+### Experimental
+
+Functional completeness: No
+Functional stability: Here be dragons
+Interface stability: Not stable
+Security supported: No
+
+### Tech Preview
+
+Functional completeness: Yes
+Functional stability: Quirky
+Interface stability: Provisionally stable
+Security supported: No
+
+### Supported
+
+Functional completeness: Yes
+Functional stability: Normal
+Interface stability: Yes
+Security supported: Yes
+
+### Deprecated
+
+Functional completeness: Yes
+Functional stability: Quirky
+Interface stability: No (as in, may disappear the next release)
+Security supported: Yes
+
+All of these may appear in modified form.  
+There are several interfaces, for instance,
+which are officially declared as not stable;
+in such a case this feature may be described as "Stable / Interface not stable".
+
+## Definition of the status label interpretation tags
+
+### Functionally complete
+
+Does it behave like a fully functional feature?
+Does it work on all expected platforms,
+or does it only work for a very specific sub-case?
+Does it have a sensible UI,
+or do you have to have a deep understanding of the internals
+to get it to work properly?
+
+### Functional stability
+
+What is the risk of it exhibiting bugs?
+
+General answers to the above:
+
+ * **Here be dragons**
+
+   Pretty likely to still crash / fail to work.
+   Not recommended unless you like life on the bleeding edge.
+
+ * **Quirky**
+
+   Mostly works but may have odd behavior here and there.
+   Recommended for playing around or for non-production use cases.
+
+ * **Normal**
+
+   Ready for production use
+
+### Interface stability
+
+If I build a system based on the current interfaces,
+will they still work when I upgrade to the next version?
+
+ * **Not stable**
+
+   Interface is still in the early stages and
+   still fairly likely to be broken in future updates.
+
+ * **Provisionally stable**
+
+   We're not yet promising backwards compatibility,
+   but we think this is probably the final form of the interface.
+   It may still require some tweaks.
+
+ * **Stable**
+
+   We will try very hard to avoid breaking backwards compatibility,
+   and to fix any regressions that are reported.
+
+### Security supported
+
+Will XSAs be issued if security-related bugs are discovered
+in the functionality?
+
+If "no",
+anyone who finds a security-related bug in the feature
+will be advised to
+post it publicly to the Xen Project mailing lists
+(or contact another security response team,
+if a relevant one exists).
+
+Bugs found after the end of **Security-Support-Until**
+in the Release Support section will receive no security response.

[Xen-devel] [PATCH 06/16] SUPPORT.md: Add scalability features

2017-11-13 Thread George Dunlap
Superpage support and PVHVM.

Signed-off-by: George Dunlap 
---
CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Konrad Wilk 
CC: Tim Deegan 
CC: Julien Grall 
---
 SUPPORT.md | 21 +
 1 file changed, 21 insertions(+)

diff --git a/SUPPORT.md b/SUPPORT.md
index c884fac7f5..a8c56d13dd 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -195,6 +195,27 @@ on embedded platforms.
 
 Enables NUMA aware scheduling in Xen
 
+## Scalability
+
+### 1GB/2MB super page support
+
+Status, x86 HVM/PVH: Supported
+Status, ARM: Supported
+
+NB that this refers to the ability of guests
+to have higher-level page table entries point directly to memory,
+improving TLB performance.
+This is independent of the ARM "page granularity" feature (see below).
+
+### x86/PVHVM
+
+Status: Supported
+
+This is a useful label for a set of hypervisor features
+which add paravirtualized functionality to HVM guests 
+for improved performance and scalability.
+This includes exposing event channels to HVM guests.
+
 # Format and definitions
 
 This file contains prose, and machine-readable fragments.
-- 
2.15.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH 10/16] SUPPORT.md: Add Debugging, analysis, crash post-mortem

2017-11-13 Thread George Dunlap
Signed-off-by: George Dunlap 
---
CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Konrad Wilk 
CC: Tim Deegan 
---
 SUPPORT.md | 29 +
 1 file changed, 29 insertions(+)

diff --git a/SUPPORT.md b/SUPPORT.md
index 8235336c41..bd83c81557 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -152,6 +152,35 @@ Output of information in machine-parseable JSON format
 
 Status: Supported, Security support external
 
+## Debugging, analysis, and crash post-mortem
+
+### gdbsx
+
+Status, x86: Supported
+
+Debugger to debug ELF guests
+
+### Soft-reset for PV guests
+
+Status: Supported
+
+Soft-reset allows a new kernel to start 'from scratch' with a fresh VM state, 
+but with all the memory from the previous state of the VM intact.
+This is primarily designed to allow "crash kernels", 
+which can do core dumps of memory to help with debugging in the event of a crash.
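A guest requests this via SCHEDOP_shutdown with the SHUTDOWN_soft_reset reason; what the toolstack then does is chosen in the guest config, roughly:

    on_soft_reset = "soft-reset"   # rebuild the domain around its existing memory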
+
+### xentrace
+
+Status, x86: Supported
+
+Tool to capture Xen trace buffer data
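A hedged usage sketch (the mask value is just an example picking out the scheduler event class):

    xentrace -e 0x0002f000 /tmp/sched.trace      # stop collection with ^C
    xentrace_format formats < /tmp/sched.trace   # 'formats' ships in tools/xentrace/

(or post-process the binary trace with xenalyze)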
+
+### gcov
+
+Status: Supported, Not security supported
+
+Export hypervisor coverage data suitable for analysis by gcov or lcov.
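Assuming a hypervisor built with CONFIG_COVERAGE=y, collection might look roughly like:

    xencov reset                # zero the in-hypervisor counters
    # ... run the workload of interest ...
    xencov read > coverage.dat
    xencov_split coverage.dat   # unpack into .gcda files for gcov/lcov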
+
 ## Memory Management
 
 ### Memory Ballooning
-- 
2.15.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH 08/16] SUPPORT.md: Add x86-specific virtual hardware

2017-11-13 Thread George Dunlap
x86-specific virtual hardware provided by the hypervisor, toolstack,
or QEMU.

Signed-off-by: George Dunlap 
---
Added emulated QEMU support, to replace docs/misc/qemu-xen-security.

Need to figure out what to do with the "backing storage image format"
section of that document.

CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Konrad Wilk 
CC: Tim Deegan 
CC: Roger Pau Monne 
CC: Anthony Perard 
CC: Paul Durrant 
---
 SUPPORT.md | 106 +
 1 file changed, 106 insertions(+)

diff --git a/SUPPORT.md b/SUPPORT.md
index 20c58377a5..b95ee0ebe7 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -376,6 +376,112 @@ there is currently no xl support.
 
 Status: Supported
 
+## Virtual Hardware, Hypervisor
+
+### x86/Nested PV
+
+Status, x86 HVM: Tech Preview
+
+This means running a Xen hypervisor inside an HVM domain,
+with support for PV L2 guests only
+(i.e., hardware virtualization extensions not provided
+to the guest).
+
+This works, but has performance limitations
+because the L1 dom0 can only access emulated L1 devices.
+
+### x86/Nested HVM
+
+Status, x86 HVM: Experimental
+
+This means running a Xen hypervisor inside an HVM domain,
+with support for running both PV and HVM L2 guests
+(i.e., hardware virtualization extensions provided
+to the guest).
+
+### x86/Advanced Vector eXtension
+
+Status: Supported
+
+### vPMU
+
+Status, x86: Supported, Not security supported
+
+Virtual Performance Management Unit for HVM guests
+
+Disabled by default (enable with hypervisor command line option).
+This feature is not security supported: see http://xenbits.xen.org/xsa/advisory-163.html
+
+## Virtual Hardware, QEMU
+
+These are devices available in HVM mode using a qemu device model (the default).
+Note that other devices are available but not security supported.
+
+### x86/Emulated platform devices (QEMU):
+
+Status, piix3: Supported
+
+### x86/Emulated network (QEMU):
+
+Status, e1000: Supported
+Status, rtl8139: Supported
+Status, virtio-net: Supported
+
+### x86/Emulated storage (QEMU):
+
+Status, piix3 ide: Supported
+Status, ahci: Supported
+
+### x86/Emulated graphics (QEMU):
+
+Status, cirrus-vga: Supported
+Status, stdvga: Supported
+
+### x86/Emulated audio (QEMU):
+
+Status, sb16: Supported
+Status, es1370: Supported
+Status, ac97: Supported
+
+### x86/Emulated input (QEMU):
+
+Status, usbmouse: Supported
+Status, usbtablet: Supported
+Status, ps/2 keyboard: Supported
+Status, ps/2 mouse: Supported
+
+### x86/Emulated serial card (QEMU):
+
+Status, UART 16550A: Supported
+
+### x86/Host USB passthrough (QEMU):
+
+Status: Supported, not security supported 
+
+## Virtual Firmware
+
+### x86/HVM iPXE
+
+Status: Supported, with caveats
+
+Booting a guest via PXE.
+PXE inherently places full trust of the guest in the network,
+and so should only be used
+when the guest network is under the same administrative control
+as the guest itself.
+
+### x86/HVM BIOS
+
+Status: Supported
+
+Booting a guest via guest BIOS firmware
+
+### x86/HVM EFI
+
+Status: Supported
+
+Booting a guest via guest EFI firmware
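Firmware selection is per guest in the config file; a minimal sketch:

    bios = "ovmf"        # EFI boot
    # bios = "seabios"   # the (default) BIOS boot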
+
 # Format and definitions
 
 This file contains prose, and machine-readable fragments.
-- 
2.15.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH 09/16] SUPPORT.md: Add ARM-specific virtual hardware

2017-11-13 Thread George Dunlap
Signed-off-by: George Dunlap 
---
Do we need to add anything more here?

And do we need to include ARM ACPI for guests?

CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Konrad Wilk 
CC: Tim Deegan 
CC: Julien Grall 
---
 SUPPORT.md | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/SUPPORT.md b/SUPPORT.md
index b95ee0ebe7..8235336c41 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -412,6 +412,16 @@ Virtual Performance Management Unit for HVM guests
 Disabled by default (enable with hypervisor command line option).
+This feature is not security supported: see http://xenbits.xen.org/xsa/advisory-163.html
 
+### ARM/Non-PCI device passthrough
+
+Status: Supported
+
+### ARM: 16K and 64K page granularity in guests
+
+Status: Supported, with caveats
+
+No support for QEMU backends in a 16K or 64K domain.
+
 ## Virtual Hardware, QEMU
 
 These are devices available in HVM mode using a qemu device model (the default).
-- 
2.15.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH 03/16] SUPPORT.md: Add some x86 features

2017-11-13 Thread George Dunlap
Including host architecture support and guest types.

Signed-off-by: George Dunlap 
---
CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Konrad Wilk 
CC: Tim Deegan 
CC: Roger Pau Monne 
---
 SUPPORT.md | 53 +
 1 file changed, 53 insertions(+)

diff --git a/SUPPORT.md b/SUPPORT.md
index 064a2f43e9..6b09f98331 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -16,6 +16,59 @@ for the definitions of the support status levels etc.
 
 # Feature Support
 
+## Host Architecture
+
+### x86-64
+
+Status: Supported
+
+## Host hardware support
+
+### Physical CPU Hotplug
+
+Status, x86: Supported
+
+### Physical Memory Hotplug
+
+Status, x86: Supported
+
+### Host ACPI (via Domain 0)
+
+Status, x86 PV: Supported
+Status, x86 PVH: Tech preview
+
+### x86/Intel Platform QoS Technologies
+
+Status: Tech Preview
+
+## Guest Type
+
+### x86/PV
+
+Status: Supported
+
+Traditional Xen PV guest
+
+No hardware requirements
+
+### x86/HVM
+
+Status: Supported
+
+Fully virtualised guest using hardware virtualisation extensions
+
+Requires hardware virtualisation support (Intel VMX / AMD SVM)
+
+### x86/PVH guest
+
+Status: Supported
+
+PVH is a next-generation paravirtualized mode 
+designed to take advantage of hardware virtualization support when possible.
+During development this was sometimes called HVMLite or PVHv2.
+
+Requires hardware virtualisation support (Intel VMX / AMD SVM)
+
 ## Memory Management
 
 ### Memory Ballooning
-- 
2.15.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Commit moratorium to staging

2017-11-06 Thread George Dunlap
On 11/03/2017 06:35 PM, Juergen Gross wrote:
> On 03/11/17 19:29, Roger Pau Monné wrote:
>> On Fri, Nov 03, 2017 at 05:57:52PM +0000, George Dunlap wrote:
>>> On 11/03/2017 02:52 PM, George Dunlap wrote:
>>>> On 11/03/2017 02:14 PM, Roger Pau Monné wrote:
>>>>> On Thu, Nov 02, 2017 at 09:55:11AM +, Paul Durrant wrote:
>>>>>> Hmm. I wonder whether the guest is actually healthy after the migrate. 
>>>>>> One could imagine a situation where the storage device model (IDE in our 
>>>>>> case I guess) gets stuck in some way but recovers after a timeout in the 
>>>>>> guest storage stack. Thus, if you happen to try shut down while it is 
>>>>>> still stuck Windows starts trying to shut down but can't. Try after the 
>>>>>> timeout though and it can.
>>>>>> In the past we did make attempts to support Windows without PV drivers 
>>>>>> in XenServer but xenrt would never reliably pass VM lifecycle tests 
>>>>>> using emulated devices. That was with qemu trad, but I wonder whether 
>>>>>> upstream qemu is actually any better particularly if using older device 
>>>>>> models such as IDE and RTL8139 (which are probably largely unmodified 
>>>>>> from trad).
>>>>>
>>>>> Since I've been looking into this for a couple of days, and found no
>>>>> solution I'm going to write what I've found so far:
>>>>>
>>>>>  - The issue only affects Windows guests.
>>>>>  - It only manifests itself when doing live migration, non-live
>>>>>migration or save/resume work fine.
>>>>>  - It affects all x86 hardware, the amount of migrations in order to
>>>>>trigger it seems to depend on the hardware, but doing 20 migrations
>>>>>reliably triggers it on all the hardware I've tested.
>>>>
>>>> Not good.
>>>>
>>>> You said that Windows reported that the login process failed somehow?
>>>>
>>>> Is it possible something bad is happening, like sending spurious page
>>>> faults to the guest in logdirty mode?
>>>>
>>>> I wonder if we could reproduce something like it on Linux -- set a build
>>>> going and start localhost migrating; a spurious page fault is likely to
>>>> cause the build to fail.
>>>
>>> Well, with a looping xen-build going on in the guest, I've done 40 local
>>> migrates with no problems yet.
>>>
>>> But Roger -- is this on emulated devices only, no PV drivers?
>>>
>>> That might be something worth looking at.
>>
>> Yes, windows doesn't have PV devices. But save/restore and non-live
>> migration seems fine, so it doesn't look to be related to devices, but
>> rather to log-dirty or some other aspect of live-migration.
> 
> log-dirty for read-I/Os of emulated devices?

FWIW I booted a Linux guest with "xen_nopv" on the command-line, gave it
256 MiB of RAM, and then ran a Xen build on it in a loop (see command
below).

Then I started migrating it in a loop.

After an hour or two it had done 146 local migrations, and 46 builds of
Xen (swapping onto emulated disk is pretty slow), without any issues.

Build command:

# while make -j 3 xen ; do git clean -ffdx ; done
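Migration loop, reconstructed from memory (guest name invented):

# while xl migrate linux-guest localhost ; do : ; done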

I'm shutting down the VM and I'll leave it running overnight.

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Ping: [PATCH] x86emul: keep compiler from using {x, y, z}mm registers itself

2017-11-06 Thread George Dunlap
On 11/06/2017 11:59 AM, Jan Beulich wrote:
>>>> On 16.10.17 at 14:42,  wrote:
>>>>> On 16.10.17 at 14:37,  wrote:
>>> On 16/10/17 13:32, Jan Beulich wrote:
>>>> Since the emulator acts on the live hardware registers, we need to
>>>> prevent the compiler from using them e.g. for inlined memcpy() /
>>>> memset() (as gcc7 does). We can't, however, set this from the command
>>>> line, as otherwise the 64-bit build would face issues with functions
>>>> returning floating point values and being declared in standard headers.
>>>>
>>>> As the pragma isn't available prior to gcc6, we need to invoke it
>>>> conditionally. Luckily up to gcc6 we haven't seen generated code access
>>>> SIMD registers beyond what our asm()s do.
>>>>
>>>> Reported-by: George Dunlap 
>>>> Signed-off-by: Jan Beulich 
>>>> ---
>>>> While this doesn't affect core functionality, I think it would still be
>>>> nice for it to be allowed in for 4.10.
>>>
>>> Agreed.
>>>
>>> Has this been tested with Clang?
>>
>> Sorry, no - still haven't got around to set up a suitable Clang
>> locally.
>>
>>>  It stands a good chance of being
>>> compatible, but we may need an && !defined(__clang__) included.
>>
>> Should non-gcc silently ignore "#pragma GCC ..." it doesn't
>> recognize, or not define __GNUC__ in the first place if it isn't
>> sufficiently compatible? I.e. if anything I'd expect we need
>> "#elif defined(__clang__)" to achieve the same for Clang by
>> some different pragma (if such exists).
> 
> Not having received any reply so far, I'm wondering whether
> being able to build the test harness with clang is more
> important than for it to work correctly when built with gcc. I
> can't predict when I would get around to set up a suitable
> clang on my dev systems.

I agree with the argument you make above.  On the unlikely chance
there's a problem Travis should catch it, and someone who actually has a
clang setup can help sort it out.

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [For Xen-4.10 PATCH] docs/features: update the status of Credit2 implemented features

2017-11-06 Thread George Dunlap
On 11/06/2017 10:35 AM, Dario Faggioli wrote:
> As soft-affinity and caps will be available in Xen 4.10.
> 
> Signed-off-by: Dario Faggioli 

Reviewed-by: George Dunlap 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Commit moratorium to staging

2017-11-03 Thread George Dunlap
On 11/03/2017 02:52 PM, George Dunlap wrote:
> On 11/03/2017 02:14 PM, Roger Pau Monné wrote:
>> On Thu, Nov 02, 2017 at 09:55:11AM +, Paul Durrant wrote:
>>> Hmm. I wonder whether the guest is actually healthy after the migrate. One 
>>> could imagine a situation where the storage device model (IDE in our case I 
>>> guess) gets stuck in some way but recovers after a timeout in the guest 
>>> storage stack. Thus, if you happen to try shut down while it is still stuck 
>>> Windows starts trying to shut down but can't. Try after the timeout though 
>>> and it can.
>>> In the past we did make attempts to support Windows without PV drivers in 
>>> XenServer but xenrt would never reliably pass VM lifecycle tests using 
>>> emulated devices. That was with qemu trad, but I wonder whether upstream 
>>> qemu is actually any better particularly if using older device models such 
>>> as IDE and RTL8139 (which are probably largely unmodified from trad).
>>
>> Since I've been looking into this for a couple of days, and found no
>> solution I'm going to write what I've found so far:
>>
>>  - The issue only affects Windows guests.
>>  - It only manifests itself when doing live migration, non-live
>>migration or save/resume work fine.
>>  - It affects all x86 hardware, the amount of migrations in order to
>>trigger it seems to depend on the hardware, but doing 20 migrations
>>reliably triggers it on all the hardware I've tested.
> 
> Not good.
> 
> You said that Windows reported that the login process failed somehow?
> 
> Is it possible something bad is happening, like sending spurious page
> faults to the guest in logdirty mode?
> 
> I wonder if we could reproduce something like it on Linux -- set a build
> going and start localhost migrating; a spurious page fault is likely to
> cause the build to fail.

Well, with a looping xen-build going on in the guest, I've done 40 local
migrates with no problems yet.

But Roger -- is this on emulated devices only, no PV drivers?

That might be something worth looking at.

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Commit moratorium to staging

2017-11-03 Thread George Dunlap
On 11/03/2017 02:14 PM, Roger Pau Monné wrote:
> On Thu, Nov 02, 2017 at 09:55:11AM +, Paul Durrant wrote:
>>> -Original Message-
>>> From: Roger Pau Monne
>>> Sent: 02 November 2017 09:42
>>> To: Paul Durrant 
>>> Cc: Ian Jackson ; Lars Kurth
>>> ; Wei Liu ; Julien Grall
>>> ; committ...@xenproject.org; xen-devel >> de...@lists.xenproject.org>
>>> Subject: Re: [Xen-devel] Commit moratorium to staging
>>>
>>> On Thu, Nov 02, 2017 at 09:20:10AM +, Paul Durrant wrote:
> -Original Message-
> From: Roger Pau Monne
> Sent: 02 November 2017 09:15
> To: Roger Pau Monne 
> Cc: Ian Jackson ; Lars Kurth
> ; Wei Liu ; Julien Grall
> ; Paul Durrant ;
> committ...@xenproject.org; xen-devel >> de...@lists.xenproject.org>
> Subject: Re: [Xen-devel] Commit moratorium to staging
>
> On Wed, Nov 01, 2017 at 04:17:10PM +, Roger Pau Monné wrote:
>> On Wed, Nov 01, 2017 at 02:07:48PM +, Ian Jackson wrote:
>>> * Affected hosts differ from unaffected hosts according to cpuid.
>>>   Roger has repro'd the bug on an unaffected host by masking out
>>>   certain cpuid bits.  There are 6 implicated bits and he is working
>>>   to narrow that down.
>>
>> I'm currently trying to narrow this down and make sure the above is
>> accurate.
>
> So I was wrong with this, I guess I've run the tests on the wrong
> host. Even when masking the different cpuid bits in the guest the
> tests still succeeds.
>
> AFAICT the test fail or succeed reliably depending on the host
> hardware. I don't really have many ideas about what to do next, but I
> think it would be useful to create a manual osstest flight that runs
> the win16 job in all the different hosts in the colo. I would also
> capture the normal information that Xen collects after each test (xl
> info, /proc/cpuid, serial logs...).
>
> Is there anything else not captured by ts-logs-capture that would be
> interesting in order to help debug the issue?

 Does the shutdown reliably complete prior to migrate and then only fail
>>> intermittently after a localhost migrate?
>>>
>>> AFAICT yes, but it can also be added to the test in order to be sure.
>>>
 It might be useful to know what cpuid info is seen by the guest before and
>>> after migrate.
>>>
>>> Is there anyway to get that from windows in an automatic way? If not I
>>> could test that with a Debian guest. In fact it might even be a good
>>> thing for Linux based guest to be added to the regular migration tests
>>> in order to make sure cpuid bits don't change across migrations.
>>>
>>
>> I found this for windows:
>>
>> https://www.cpuid.com/downloads/cpu-z/cpu-z_1.81-en.exe
>>
>> It can generate a text or html report as well as being run interactively. 
>> But you may get more mileage from using a debian HVM guest. I guess it may 
>> also be useful is we can get a scan of available MSRs and content before and 
>> after migrate too.
>>
 Another datapoint... does the shutdown fail if you insert a delay of a 
 couple
>>> of minutes between the migrate and the shutdown?
>>>
>>> Sometimes, after a variable number of calls to xl shutdown ... the
>>> guest usually ends up shutting down.
>>>
>>
>> Hmm. I wonder whether the guest is actually healthy after the migrate. One 
>> could imagine a situation where the storage device model (IDE in our case I 
>> guess) gets stuck in some way but recovers after a timeout in the guest 
>> storage stack. Thus, if you happen to try shut down while it is still stuck 
>> Windows starts trying to shut down but can't. Try after the timeout though 
>> and it can.
>> In the past we did make attempts to support Windows without PV drivers in 
>> XenServer but xenrt would never reliably pass VM lifecycle tests using 
>> emulated devices. That was with qemu trad, but I wonder whether upstream 
>> qemu is actually any better particularly if using older device models such 
>> as IDE and RTL8139 (which are probably largely unmodified from trad).
> 
> Since I've been looking into this for a couple of days, and found no
> solution I'm going to write what I've found so far:
> 
>  - The issue only affects Windows guests.
>  - It only manifests itself when doing live migration, non-live
>migration or save/resume work fine.
>  - It affects all x86 hardware, the amount of migrations in order to
>trigger it seems to depend on the hardware, but doing 20 migrations
>reliably triggers it on all the hardware I've tested.

Not good.

You said that Windows reported that the login process failed somehow?

Is it possible something bad is happening, like sending spurious page
faults to the guest in logdirty mode?

I wonder if we could reproduce something like it on Linux -- set a build
going and start localhost migrating; a spurious page fault is likely to
cause the build to fail.

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH RFC v2] Add SUPPORT.md

2017-11-02 Thread George Dunlap
On 10/27/2017 04:09 PM, NathanStuder wrote:
> 
> 
> On 10/09/2017 10:14 AM, Lars Kurth wrote:
>>
>>> On 27 Sep 2017, at 13:57, Robert VanVossen 
>>>  wrote:
>>>
>>>
>>>
>>> On 9/26/2017 3:12 AM, Dario Faggioli wrote:
>>>> [Cc-list modified by removing someone and adding someone else]
>>>>
>>>> On Mon, 2017-09-25 at 16:10 -0700, Stefano Stabellini wrote:
>>>>> On Mon, 11 Sep 2017, George Dunlap wrote:
>>>>>> +### RTDS based Scheduler
>>>>>> +
>>>>>> +Status: Experimental
>>>>>> +
>>>>>> +A soft real-time CPU scheduler built to provide guaranteed CPU
>>>>>> capacity to guest VMs on SMP hosts
>>>>>> +
>>>>>> +### ARINC653 Scheduler
>>>>>> +
>>>>>> +Status: Supported, Not security supported
>>>>>> +
>>>>>> +A periodically repeating fixed timeslice scheduler. Multicore
>>>>>> support is not yet implemented.
>>>>>> +
>>>>>> +### Null Scheduler
>>>>>> +
>>>>>> +Status: Experimental
>>>>>> +
>>>>>> +A very simple, very static scheduling policy 
>>>>>> +that always schedules the same vCPU(s) on the same pCPU(s). 
>>>>>> +It is designed for maximum determinism and minimum overhead
>>>>>> +on embedded platforms.
>>
>> ...
>>
>>>> Actually, the best candidate for gaining security support, is IMO
>>>> ARINC. Code is also rather simple and "stable" (hasn't changed in the
>>>> last... years!) and it's used by DornerWorks' people for some of their
>>>> projects (I think?). It's also not tested in OSSTest, though, and
>>>> considering how special purpose it is, I think we're not totally
>>>> comfortable marking it as Sec-Supported, without feedback from the
>>>> maintainers.
>>>>
>>>> George, Josh, Robert?
>>>>
>>>
>>> Yes, we do still use the ARINC653 scheduler. Since it is so simple, it 
>>> hasn't
>>> really needed any modifications in the last couple years.
>>>
>>> We are not really sure what kind of feedback you are looking from us in 
>>> regards
>>> to marking it sec-supported, but would be happy to try and answer any 
>>> questions.
>>> If you have any specific questions or requests, we can discuss it 
>>> internally and
>>> get back to you.
>>
>> I think there are two sets of issues: one around testing, which Dario 
>> outlined.
>>
>> For example, if you had some test harnesses that could be run on Xen release 
>> candidates, which verify that the scheduler works as expected, that would
>> help. It would imply a commitment to run the tests on release candidates.
> 
> We have an internal Xen test harness that we use to test the scheduler, but I
> assume you would like it converted to use OSSTest instead, so that the
> tests could be integrated into the main test suite someday?

In our past discussions I don't think anyone has thought the "everything
has to be tested in osstest" strategy is really feasible.  So I think we
were going for a model where it just had to be regularly tested
*somewhere*, more or less as a marker for "is this functionality
important enough to people to give security support".

>> The second question is what happens if someone reported a security issue on
>> the scheduler. The security team would not have the capability to fix issues 
>> in 
>> the ARINC scheduler: so it would be necessary to pull in an expert under 
>> embargo to help triage the issue, fix the issue and prove that the fix 
>> works. This 
>> would most likely require "the expert" to work to the timeline of the 
>> security
>> team (which may require prioritising it over other work), as once a security 
>> issue 
>> has been reported, the reporter may insist on a disclosure schedule. If we 
>> didn't 
>> have a fix in time, because we don't get expert bandwidth, we could be 
>> forced to 
>> disclose an XSA without a fix.
> 
> We can support this and have enough staff familiar with the scheduler that
> prioritizing security issues shouldn't be a problem.  The maintainers (Robbie
> and Josh) can triage issues if and when the time comes, but if you need a more
> dedicated "expert" for this type of issue, then that would likely be me.

OK -- in that case, if it's OK with you, I'll list ARINC653 as 'Supported'.

Thanks,
 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Commit moratorium to staging

2017-11-02 Thread George Dunlap
On 11/01/2017 02:07 PM, Ian Jackson wrote:
> So, investigations (mostly by Roger, and also a bit of archaeology in
> the osstest db by me) have determined:
> 
> * This bug is 100% reproducible on affected hosts.  The repro is
>   to boot the Windows guest, save/restore it, then migrate it,
>   then shut down.  (This is from an IRL conversation with Roger and
>   may not be 100% accurate.  Roger, please correct me.)

I presume when you say 'migrate' you mean localhost migration?

Are the results different if you:
- only save/restore *or* migrate it?
- save/restore twice or migrate twice, rather than save/restore + migrate?

Going through the save/restore path suggests that there's something
about the domain that's being set up one way on initial creation than on
restoring/receiving from a migration: i.e., something not being saved
and restored properly.

An alternate explanation would be a 'hitch' somewhere in the 're-attach'
driver code.

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH RFC v2] Add SUPPORT.md

2017-11-02 Thread George Dunlap
On 11/01/2017 05:10 PM, Konrad Rzeszutek Wilk wrote:
> On Tue, Oct 24, 2017 at 04:22:38PM +0100, George Dunlap wrote:
>> On Fri, Sep 15, 2017 at 3:51 PM, Konrad Rzeszutek Wilk
>>  wrote:
>>>> +### Soft-reset for PV guests
>>>
>>> s/PV/HVM/
>>
>> Is it?  I thought this was for RHEL 5 PV guests to be able to do crash 
>> kernels.
>>
>>>> +### Transcendent Memory
>>>> +
>>>> +Status: Experimental
>>>> +
>>>> +[XXX Add description]
>>>
>>> Guests with tmem drivers autoballoon memory out allowing a fluid
>>> and dynamic memory allocation - in effect memory overcommit without
>>> the need to swap. Only works with Linux guests (as it requires
>>> OS drivers).
>>
>> But autoballooning doesn't require any support in Xen, right?  I
>> thought the TMEM support in Xen was more about the transcendent memory
>> backends.
> 
> frontends you mean? That is Linux guests when compiled with XEN_TMEM will
> balloon down (using the self-shrinker) to using the normal balloon code
> (XENMEM_decrease_reservation, XENMEM_populate_physmap) to make the
> guest smaller. Then the Linux code starts hitting the case where it starts
> swapping memory out - and that is where the tmem comes in and the
> pages are swapped out to the hypervisor.

Right -- so TMEM itself actually consists of this ephemeral and
non-ephemeral memory pools.  Autoballooning is just a trick to get Linux
to put the least-used pages into one of the pools.

How about this:

---
Transcendent Memory (tmem) allows the creation of hypervisor memory
pools which guests can use to store memory rather than caching in its
own memory or swapping to disk.  Having these in the hypervisor can
allow more efficient aggregate use of memory across VMs.
---

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH RFC v2] Add SUPPORT.md

2017-11-01 Thread George Dunlap
On 09/12/2017 08:52 PM, Stefano Stabellini wrote:
>>> +### Xen Framebuffer
>>> +
>>> +Status, Linux: Supported
>>
>> Frontend?
> 
> Yes, please. If you write "Xen Framebuffer" I only take it to mean the
> protocol as should be documented somewhere under docs/. Then I read
> Linux, and I don't understand what you mean. Then I read QEMU and I have
> to guess you are talking about the backend?

Well this was in the "backend" section, so it was just completely wrong.
 I've removed it. :-)


>>> +### ARM: 16K and 64K pages in guests
>>> +
>>> +Status: Supported, with caveats
>>> +
>>> +No support for QEMU backends in a 16K or 64K domain.
>>
>> Needs to be merged with the "1GB/2MB super page support"?
>  
> Super-pages are different from page granularity. 1GB and 2MB pages are
> based on the same 4K page granularity, while 512MB pages are based on
> 64K granularity. Does it make sense?

It does -- wondering what the best way to describe this concisely is.
Would it make sense to say "L2 and L3 superpages", and then explain in
the comment that for 4k page granularity that's 2MiB and 1GiB, and for
64k granularity it's 512MiB?
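For what it's worth, the arithmetic behind those numbers (assuming 8-byte
page-table entries): a 4KiB granule gives 4096/8 = 512 entries per table,
so an L2 superpage maps 512 * 4KiB = 2MiB and an L3 one 512 * 2MiB = 1GiB;
a 64KiB granule gives 65536/8 = 8192 entries, so the corresponding
superpage maps 8192 * 64KiB = 512MiB.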

> Maybe we want to say "ARM: 16K and 64K page granularity in guest" to
> clarify.

Clarifying that this is "page granularity" would be helpful.

If we had a document describing this in more detail we could point to
that also might be useful.

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH RFC v2] Add SUPPORT.md

2017-11-01 Thread George Dunlap
On 09/12/2017 11:39 AM, Roger Pau Monné wrote:
> On Mon, Sep 11, 2017 at 06:01:59PM +0100, George Dunlap wrote:
>> +## Toolstack
>> +
>> +### xl
>> +
>> +Status: Supported
>> +
>> +### Direct-boot kernel image format
>> +
>> +Supported, x86: bzImage
> 
> ELF
> 
>> +Supported, ARM32: zImage
>> +Supported, ARM64: Image
>> +
>> +Format which the toolstack accept for direct-boot kernels
> 
> IMHO it would be good to provide references to the specs, for ELF that
> should be:
> 
> http://refspecs.linuxbase.org/elf/elf.pdf

I'm having trouble evaluating these to recommendations because I don't
really know what the point of this section is.  Who wants this
information and why?

I think most end-users will want to build a Linux / whatever binary.
From that perspective, "bzImage" is probably the thing people want to
know about.  If you're doing unikernels or rolling your own custom
system somehow, knowing that it's ELF is probably more useful.

>> +### Qemu based disk backend (qdisk) for xl
>> +
>> +Status: Supported
>> +
>> +### Open vSwitch integration for xl
>> +
>> +Status: Supported
> 
> Status, Linux: Supported
> 
> I haven't played with vswitch on FreeBSD at all.

Ack

>> +### systemd support for xl
>> +
>> +Status: Supported
>> +
>> +### JSON output support for xl
>> +
>> +Status: Experimental
>> +
>> +Output of information in machine-parseable JSON format
>> +
>> +### AHCI support for xl
>> +
>> +Status, x86: Supported
>> +
>> +### ACPI guest
>> +
>> +Status, x86 HVM: Supported
>> +Status, ARM: Tech Preview
> 
> status, x86 PVH: Tech preview

Is the interface and functionality mostly stable?  Or are the interfaces
likely to change / people using it likely to have crashes?

>> +### PVUSB support for xl
>> +
>> +Status: Supported
>> +
>> +### HVM USB passthrough for xl
>> +
>> +Status, x86: Supported
>> +
>> +### QEMU backend hotplugging for xl
>> +
>> +Status: Supported
> 
> What's this exactly? Is it referring to hot-adding PV disk and nics?
> If so it shouldn't specifically reference xl, the same can be done
> with blkback or netback for example.

I think it means, xl knows how to hotplug QEMU backends.  There was a
time when I think this wasn't true.


>> +## Scalability
>> +
>> +### 1GB/2MB super page support
>> +
>> +Status: Supported
> 
> This needs something like:
> 
> Status, x86 HVM/PVH: Supported

Sounds good -- I'll have a line for ARM as well.

> IIRC on ARM page sizes are different (64K?)
> 
>> +
>> +### x86/PV-on-HVM
>> +
>> +Status: Supported
>> +
>> +This is a useful label for a set of hypervisor features
>> +which add paravirtualized functionality to HVM guests 
>> +for improved performance and scalability.  
>> +This includes exposing event channels to HVM guests.
>> +
>> +### x86/Deliver events to PVHVM guests using Xen event channels
>> +
>> +Status: Supported
> 
> I think this should be labeled as "x86/HVM deliver guest events using
> event channels", and the x86/PV-on-HVM section removed.

Actually, I think 'PVHVM' should be the feature and this one should be
removed.


>> +### Blkfront
>> +
>> +Status, Linux: Supported
>> +Status, FreeBSD: Supported, Security support external
>> +Status, Windows: Supported
> 
> Status, NetBSD: Supported, Security support external

Ack


>> +### Xen Console
>> +
>> +Status, Linux (hvc_xen): Supported
>> +Status, Windows: Supported
>> +
>> +Guest-side driver capable of speaking the Xen PV console protocol
> 
> Status, FreeBSD: Supported, Security support external
> Status, NetBSD: Supported, Security support external

Ack

> 
>> +
>> +### Xen PV keyboard
>> +
>> +Status, Linux (xen-kbdfront): Supported
>> +Status, Windows: Supported
>> +
>> +Guest-side driver capable of speaking the Xen PV keyboard protocol
>> +
>> +[XXX 'Supported' here depends on the version we ship in 4.10 having some 
>> fixes]
>> +
>> +### Xen PVUSB protocol
>> +
>> +Status, Linux: Supported
>> +
>> +### Xen PV SCSI protocol
>> +
>> +Status, Linux: Supported, with caveats
> 
> Should both of the above items be labeled with frontend/backend?

Done.

> And do we really need the 'Xen' prefix in all the items? Se

Re: [Xen-devel] [PATCH for-4.10] common/multicall: Increase debugability for bad hypercalls

2017-10-31 Thread George Dunlap
On 10/31/2017 05:18 PM, Andrew Cooper wrote:
> While investigating an issue (in a new codepath I'd introduced, as it turns
> out), leaving interrupts disabled manifested as a subsequent op in the
> multicall failing a check_lock() test.
> 
> The codepath would have hit the ASSERT_NOT_IN_ATOMIC on the return-to-guest
> path, had it not hit the check_lock() first.
> 
> Call ASSERT_NOT_IN_ATOMIC() after each operation in the multicall, to make
> failures more obvious.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: George Dunlap 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH for-4.10] common/spinlock: Improve the output from check_lock() if it trips

2017-10-31 Thread George Dunlap
On 10/31/2017 10:49 AM, Andrew Cooper wrote:
> If check_lock() triggers, a crash will occur.  Instead of simply identifying
> "the irq context was different", indicate the expected and current irq
> context.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: George Dunlap 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2] scripts: introduce a script for build test

2017-10-25 Thread George Dunlap
On 10/25/2017 04:27 PM, Wei Liu wrote:
> On Wed, Oct 25, 2017 at 04:25:21PM +0100, Ian Jackson wrote:
>> Wei Liu writes ("Re: [PATCH v2] scripts: introduce a script for build test"):
>>> On Tue, Oct 24, 2017 at 02:38:39PM +0100, Ian Jackson wrote:
 Anthony PERARD writes ("Re: [PATCH v2] scripts: introduce a script for 
 build test"):
> That feels wrong. How do I run the same exact command at the default
> one, but with -j8 instead of -j4?

  .../build-test sh -ec 'make -j4 distclean && ./configure && make -j4'

 But I think Anthony has a point.  The clean should 1. be git-clean,
 not make distclean 2. be run anyway.
>>>
>>> I don't think we should call git-clean unconditionally -- imagine
>>> someone knew for sure they only needed to build part of the tools or the
>>> hypervisor.
>>
>> If you are worried about this you should check that there are no
>> uncommitted files before starting.
> 
> This is already done in this version.
> 
> I don't worry if there is uncommitted file, I just don't want to stop
> developers from being smarter than the script when they know git-clean
> is not necessary.

What kind of "smarter" did you have in mind?

This script sounds like an aid to developers who don't have the
motivation / experience / whatever to write their own script (or do
something fancier, like git rebase --exec).  If people want to be
smarter they can write their own script, using this as a starting point.
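The rebase variant I have in mind is roughly this (base ref invented):

    git rebase -i --exec ./scripts/build-test origin/staging

which re-runs the build test on every commit of the series.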

FWIW in xsatool I use 'git clean' extensively.

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH RFC v2] Add SUPPORT.md

2017-10-25 Thread George Dunlap
On Tue, Oct 24, 2017 at 12:42 PM, Andrew Cooper
 wrote:
> On 24/10/17 11:27, George Dunlap wrote:
>> On 10/23/2017 06:55 PM, Andrew Cooper wrote:
>>> On 23/10/17 17:22, George Dunlap wrote:
>>>> On 09/11/2017 06:53 PM, Andrew Cooper wrote:
>>>>> On 11/09/17 18:01, George Dunlap wrote:
>>>>>> +### x86/RAM
>>>>>> +
>>>>>> +Limit, x86: 16TiB
>>>>>> +Limit, ARM32: 16GiB
>>>>>> +Limit, ARM64: 5TiB
>>>>>> +
>>>>>> +[XXX: Andy to suggest what this should say for x86]
>>>>> The limit for x86 is either 16TiB or 123TiB, depending on
>>>>> CONFIG_BIGMEM.  CONFIG_BIGMEM is exposed via menuconfig without
>>>>> XEN_CONFIG_EXPERT, so falls into at least some kind of support statement.
>>>>>
>>>>> As for practical limits, I don't think its reasonable to claim anything
>>>>> which we can't test.  What are the specs in the MA colo?
>>>> At the moment the "Limit" tag specifically says that it's theoretical
>>>> and may not work.
>>>>
>>>> We could add another tag, "Limit-tested", or something like that.
>>>>
>>>> Or, we could simply have the Limit-security be equal to the highest
>>>> amount which has been tested (either by osstest or downstreams).
>>>>
>>>> For simplicity's sake I'd go with the second one.
>>> It think it would be very helpful to distinguish the upper limits from
>>> the supported limits.  There will be a large difference between the two.
>>>
>>> Limit-Theoretical and Limit-Supported ?
>> Well "supported" without any modifiers implies "security supported".  So
>> perhaps we could just `s/Limit-security/Limit-supported/;` ?
>
> By this, you mean use Limit-Supported throughout this document?  That
> sounds like a good plan.

Yes, that's basically what I meant.

>>>>>> +Limit, x86 HVM: 128
>>>>>> +Limit, ARM32: 8
>>>>>> +Limit, ARM64: 128
>>>>>> +
>>>>>> +[XXX Andrew Cooper: Do want to add "Limit-Security" here for some of 
>>>>>> these?]
>>>>> 32 for each.  64 vcpu HVM guests can exert enough p2m lock pressure to
>>>>> trigger a 5 second host watchdog timeout.
>>>> Is that "32 for x86 PV and x86 HVM", or "32 for x86 HVM and ARM64"?  Or
>>>> something else?
>>> The former.  I'm not qualified to comment on any of the ARM limits.
>>>
>>> There are several non-trivial for_each_vcpu() loops in the domain_kill
>>> path which aren't handled by continuations.  ISTR 128 vcpus is enough to
>>> trip a watchdog timeout when freeing pagetables.
>> I don't think 32 is a really practical limit.
>
> What do you mean by practical here, and what evidence are you basing
> this on?
>
> Amongst other things, there is an ABI boundary in Xen at 32 vcpus, and
> given how often it is broken in Linux, its clear that there isn't
> regular testing happening beyond this limit.

Is that true for dom0 as well?

>> I'm inclined to say that if a rogue guest can crash a host with 33 vcpus, we 
>> should issue an XSA
>> and fix it.
>
> The reason XenServer limits at 32 vcpus is that I can crash Xen with a
> 64 vcpu HVM domain.  The reason it hasn't been my top priority to fix
> this is because there is very little customer interest in pushing this
> limit higher.
>
> Obviously, we should fix issues as and when they are discovered, and
> work towards increasing the limits in the longterm, but saying "this
> limit seems too low, so lets provisionally set it higher" is short
> sighted and a recipe for more XSAs.

OK -- I'll set this to 32 for now and see if anyone else wants to
argue for a different value.

>>>>>> +
>>>>>> +### x86 PV/Event Channels
>>>>>> +
>>>>>> +Limit: 131072
>>>>> Why do we call out event channel limits but not grant table limits?
>>>>> Also, why is this x86?  The 2l and fifo ABIs are arch agnostic, as far
>>>>> as I am aware.
>>>> Sure, but I'm pretty sure that ARM guests don't (perhaps cannot?) use PV
>>>> event channels.
>>> This is mixing the hypervisor API/ABI capabilities with the actual
>>> abilities of guests (which is also different to what Linux would use in
>>> the gue

Re: [Xen-devel] [PATCH RFC v2] Add SUPPORT.md

2017-10-24 Thread George Dunlap
On Fri, Sep 15, 2017 at 3:51 PM, Konrad Rzeszutek Wilk
 wrote:
>> +### Soft-reset for PV guests
>
> s/PV/HVM/

Is it?  I thought this was for RHEL 5 PV guests to be able to do crash kernels.

>> +### Transcendent Memory
>> +
>> +Status: Experimental
>> +
>> +[XXX Add description]
>
> Guests with tmem drivers autoballoon memory out allowing a fluid
> and dynamic memory allocation - in effect memory overcommit without
> the need to swap. Only works with Linux guests (as it requires
> OS drivers).

But autoballooning doesn't require any support in Xen, right?  I
thought the TMEM support in Xen was more about the transcendent memory
backends.

> ..snip..
>> +### Live Patching
>> +
>> +Status, x86: Supported
>> +Status, ARM: Experimental
>> +
>> +Compile time disabled
>
> for ARM.
>
> As the patch will do:
>
>  config LIVEPATCH
> -   bool "Live patching support (TECH PREVIEW)"
> -   default n
> +   bool "Live patching support"
> +   default X86
> depends on HAS_BUILD_ID = "y"
> ---help---
>   Allows a running Xen hypervisor to be dynamically patched using

Ack

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH RFC v2] Add SUPPORT.md

2017-10-24 Thread George Dunlap
On Tue, Sep 12, 2017 at 4:35 PM, Rich Persaud  wrote:
>> On Sep 11, 2017, at 13:01, George Dunlap  wrote:
>>
>> +### XSM & FLASK
>> +
>> +Status: Experimental
>> +
>> +Compile time disabled
>> +
>> +### XSM & FLASK support for IS_PRIV
>> +
>> +Status: Experimental
>
> In which specific areas is XSM lacking in Functional completeness, Functional 
> stability and/or Interface stability, resulting in "Experimental" status?  
> What changes to XSM would be needed for it to qualify for "Supported" status?

So first of all, I guess there's two "features" here: One is XSM /
FLASK itself, which downstreams such as OpenXT can use to make their own
policies.  The second is the "default FLASK policy", shipped with Xen,
which has rules and labels for things in a "normal" Xen system: domUs,
driver domains, stub domains, dom0, &c.  There was a time when you
could simply enable that and a basic Xen System would Just Work, and
(in theory) would be more secure than the default Xen system.  It
probably makes sense to treat these separately.

Two problems we have so far: The first is that the policy bitrots
fairly quickly.  At the moment we don't have proper testing, and we
don't really have anyone that knows how to fix it if it does break.

The second problem is that while functional testing can show that the
default policy is *at least* as permissive as not having FLASK enabled
at all, it's a lot more difficult to show that having FLASK enabled
isn't in some cases *more permissive* than we would like to be by
default.  We've noticed issues before where enabling XSM accidentally
gives a domU access to hypercalls or settings it wouldn't have access
to otherwise.  Absent some way of automatically catching these
changes, we're not sure we could recommend people use the default
policy, even if we had confidence (via testing) that it wouldn't break
people's functionality on update.

The "default policy bitrot" problem won't be one for you, because (as
I understand it) you write your own custom policies.  But the second
issue should be more concerning: when you update to a new version of
Xen, what confidence do you have that your old policies will still
adequately restrict guests from dangerous new functionality?

I think sorting the second question out is basically what it would
take to call FLASK by itself (as opposed to the default policy)
"Supported".  (And if you can make an argument that this is already
sorted, then we can list FLASK itself as "supported".)

> If there will be no security support for features in Experimental status, 
> would Xen Project accept patches to fix XSM security issues?  Could 
> downstream projects issue CVEs for XSM security issues, if these will not be 
> issued by Xen Project?

Experimental status is about 1) our assessment of how reliable the
feature is, and 2) whether we will issue XSAs if security-related bugs
are found.  We will of course accept patches to improve functionality,
and it's likely that if someone only *reports* a bug that people on
the list will be able to come up with a fix.

Regarding CVEs, I guess what you care about is whether as our own CNA,
the XenProject would be willing to issue CVEs for XSM security issues,
and/or perhaps whether we would mind if you asked Mitre directly
instead.

That's slightly a different topic, which we should probably discuss
when we become a CNA.  But to give you an idea where I'm at, I think
the question is: What kind of a bug do you think you'd issue a CVE for
(and/or, an XSA)?

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH RFC v2] Add SUPPORT.md

2017-10-24 Thread George Dunlap
On 10/23/2017 06:55 PM, Andrew Cooper wrote:
> On 23/10/17 17:22, George Dunlap wrote:
>> On 09/11/2017 06:53 PM, Andrew Cooper wrote:
>>> On 11/09/17 18:01, George Dunlap wrote:
>>>> +### x86/RAM
>>>> +
>>>> +Limit, x86: 16TiB
>>>> +Limit, ARM32: 16GiB
>>>> +Limit, ARM64: 5TiB
>>>> +
>>>> +[XXX: Andy to suggest what this should say for x86]
>>> The limit for x86 is either 16TiB or 123TiB, depending on
>>> CONFIG_BIGMEM.  CONFIG_BIGMEM is exposed via menuconfig without
>>> XEN_CONFIG_EXPERT, so falls into at least some kind of support statement.
>>>
>>> As for practical limits, I don't think it's reasonable to claim anything
>>> which we can't test.  What are the specs in the MA colo?
>> At the moment the "Limit" tag specifically says that it's theoretical
>> and may not work.
>>
>> We could add another tag, "Limit-tested", or something like that.
>>
>> Or, we could simply have the Limit-security be equal to the highest
>> amount which has been tested (either by osstest or downstreams).
>>
>> For simplicity's sake I'd go with the second one.
> 
> It think it would be very helpful to distinguish the upper limits from
> the supported limits.  There will be a large difference between the two.
> 
> Limit-Theoretical and Limit-Supported ?

Well "supported" without any modifiers implies "security supported".  So
perhaps we could just `s/Limit-security/Limit-supported/;` ?

> 
> In all cases, we should identify why the limit is where it is, even if
> that is only "maximum people have tested to".  Other

This document is already fairly complicated, and a massive amount of
work (as each line is basically an invitation to bike-shedding).  If
it's OK with you, I'll leave the introduction of where the limit comes
from for a motivated individual to add in a subsequent patch. :-)

>> Shall I write an e-mail with a more direct query for the maximum amounts
>> of various numbers tested by the XenProject (via osstest), Citrix, SuSE,
>> and Oracle?
> 
> For XenServer,
> http://docs.citrix.com/content/dam/docs/en-us/xenserver/current-release/downloads/xenserver-config-limits.pdf
> 
>>> [root@fusebot ~]# python
>>> Python 2.7.5 (default, Nov 20 2015, 02:00:19)
>>> [GCC 4.8.5 20150623 (Red Hat 4.8.5-4)] on linux2
>>> Type "help", "copyright", "credits" or "license" for more information.
>>>>>> from xen.lowlevel.xc import xc as XC
>>>>>> xc = XC()
>>>>>> xc.domain_create()
>>> 1
>>>>>> xc.domain_max_vcpus(1, 8192)
>>> 0
>>>>>> xc.domain_create()
>>> 2
>>>>>> xc.domain_max_vcpus(2, 8193)
>>> Traceback (most recent call last):
>>>   File "", line 1, in 
>>> xen.lowlevel.xc.Error: (22, 'Invalid argument')
>>>
>>> Trying to shut such a domain down however does tickle a host watchdog
>>> timeout as the for_each_vcpu() loops in domain_kill() are very long.
>> For now I'll set 'Limit' to 8192, and 'Limit-security' to 512.
>> Depending on what I get for the "test limit" survey I may adjust it
>> afterwards.
> 
> The largest production x86 server I am aware of is a Skylake-S system
> with 496 threads.  512 is not a plausibly-tested number.
> 
>>
>>>> +Limit, x86 HVM: 128
>>>> +Limit, ARM32: 8
>>>> +Limit, ARM64: 128
>>>> +
>>>> +[XXX Andrew Cooper: Do want to add "Limit-Security" here for some of 
>>>> these?]
>>> 32 for each.  64 vcpu HVM guests can exert enough p2m lock pressure to
>>> trigger a 5 second host watchdog timeout.
>> Is that "32 for x86 PV and x86 HVM", or "32 for x86 HVM and ARM64"?  Or
>> something else?
> 
> The former.  I'm not qualified to comment on any of the ARM limits.
> 
> There are several non-trivial for_each_vcpu() loops in the domain_kill
> path which aren't handled by continuations.  ISTR 128 vcpus is enough to
> trip a watchdog timeout when freeing pagetables.

I don't think 32 is a really practical limit.  I'm inclined to say that
if a rogue guest can crash a host with 33 vcpus, we should issue an XSA
and fix it.
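
To sketch what I mean (toy standalone C; Xen's real mechanism is
hypercall continuations via hypercall_create_continuation() plus
preemption checks, not these invented names): bound the work done per
pass and resume where you left off, so no single pass can outlast the
watchdog:

#include <stdio.h>

#define NR_VCPUS      8192
#define PREEMPT_EVERY   64
#define ERESTART        (-1)

/* Tear down per-vcpu state, bailing out periodically so the caller
 * can reschedule and retry instead of hogging the CPU. */
static int teardown_vcpus(unsigned int *progress)
{
    unsigned int i;

    for ( i = *progress; i < NR_VCPUS; i++ )
    {
        /* ... the expensive per-vcpu work would happen here ... */

        if ( ((i + 1) % PREEMPT_EVERY) == 0 && (i + 1) < NR_VCPUS )
        {
            *progress = i + 1;  /* remember where to resume */
            return ERESTART;    /* Xen would create a continuation here */
        }
    }

    return 0;
}

int main(void)
{
    unsigned int progress = 0, restarts = 0;

    while ( teardown_vcpus(&progress) == ERESTART )
        restarts++;

    printf("done after %u restarts\n", restarts);
    return 0;
}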

>>>> +### Virtual RAM
>>>> +
>>>> +Limit, x86 PV: >1TB
>>>> +Limit, x86 HVM: 1TB
>>>> +Limit, ARM32: 16GiB
>>>> +Limit, ARM64: 1TB

Re: [Xen-devel] [PATCH for-4.10] xenalyze: fix compilation

2017-10-23 Thread George Dunlap
On 10/23/2017 05:28 PM, Roger Pau Monne wrote:
> Recent changes in xenalyze introduced INT_MIN without also adding the
> required header, fix this by adding the header.
> 
> Signed-off-by: Roger Pau Monné 

Acked-by: George Dunlap 

> ---
> Cc: George Dunlap 
> Cc: Ian Jackson 
> Cc: Wei Liu 
> Cc: Julien Grall 
> ---
> This should be accepted for 4.10 because it's a build bug fix, with no
> functional change at all.
> ---
>  tools/xentrace/xenalyze.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/tools/xentrace/xenalyze.c b/tools/xentrace/xenalyze.c
> index 79bdba7fed..5768b54f86 100644
> --- a/tools/xentrace/xenalyze.c
> +++ b/tools/xentrace/xenalyze.c
> @@ -23,6 +23,7 @@
>  #include 
>  #include 
>  #include 
> +#include <limits.h>
>  #include 
>  #include 
>  #include 
> 


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH RFC v2] Add SUPPORT.md

2017-10-23 Thread George Dunlap
On 09/11/2017 06:53 PM, Andrew Cooper wrote:
> On 11/09/17 18:01, George Dunlap wrote:
>> +### x86/PV
>> +
>> +Status: Supported
>> +
>> +Traditional Xen Project PV guest
> 
> What's a "Xen Project" PV guest?  Just Xen here.
> 
> Also, a perhaps a statement of "No hardware requirements" ?

OK.

> 
>> +### x86/RAM
>> +
>> +Limit, x86: 16TiB
>> +Limit, ARM32: 16GiB
>> +Limit, ARM64: 5TiB
>> +
>> +[XXX: Andy to suggest what this should say for x86]
> 
> The limit for x86 is either 16TiB or 123TiB, depending on
> CONFIG_BIGMEM.  CONFIG_BIGMEM is exposed via menuconfig without
> XEN_CONFIG_EXPERT, so falls into at least some kind of support statement.
> 
> As for practical limits, I don't think it's reasonable to claim anything
> which we can't test.  What are the specs in the MA colo?

At the moment the "Limit" tag specifically says that it's theoretical
and may not work.

We could add another tag, "Limit-tested", or something like that.

Or, we could simply have the Limit-security be equal to the highest
amount which has been tested (either by osstest or downstreams).

For simplicity's sake I'd go with the second one.

Shall I write an e-mail with a more direct query for the maximum amounts
of various numbers tested by the XenProject (via osstest), Citrix, SuSE,
and Oracle?

>> +
>> +## Limits/Guest
>> +
>> +### Virtual CPUs
>> +
>> +Limit, x86 PV: 512
> 
> Where did this number come from?  The actual limit as enforced in Xen is
> 8192, and it has been like that for a very long time (i.e. the 3.x days)

Looks like Lars copied this from
https://wiki.xenproject.org/wiki/Xen_Project_Release_Features.  Not sure
where it came from before that.

> [root@fusebot ~]# python
> Python 2.7.5 (default, Nov 20 2015, 02:00:19)
> [GCC 4.8.5 20150623 (Red Hat 4.8.5-4)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
>>>> from xen.lowlevel.xc import xc as XC
>>>> xc = XC()
>>>> xc.domain_create()
> 1
>>>> xc.domain_max_vcpus(1, 8192)
> 0
>>>> xc.domain_create()
> 2
>>>> xc.domain_max_vcpus(2, 8193)
> Traceback (most recent call last):
>   File "", line 1, in 
> xen.lowlevel.xc.Error: (22, 'Invalid argument')
> 
> Trying to shut such a domain down however does tickle a host watchdog
> timeout as the for_each_vcpu() loops in domain_kill() are very long.

For now I'll set 'Limit' to 8192, and 'Limit-security' to 512.
Depending on what I get for the "test limit" survey I may adjust it
afterwards.

>> +Limit, x86 HVM: 128
>> +Limit, ARM32: 8
>> +Limit, ARM64: 128
>> +
>> +[XXX Andrew Cooper: Do want to add "Limit-Security" here for some of these?]
> 
> 32 for each.  64 vcpu HVM guests can exert enough p2m lock pressure to
> trigger a 5 second host watchdog timeout.

Is that "32 for x86 PV and x86 HVM", or "32 for x86 HVM and ARM64"?  Or
something else?

>> +### Virtual RAM
>> +
>> +Limit, x86 PV: >1TB
>> +Limit, x86 HVM: 1TB
>> +Limit, ARM32: 16GiB
>> +Limit, ARM64: 1TB
> 
> There is no specific upper bound on the size of PV or HVM guests that I
> am aware of.  1.5TB HVM domains definitely work, because that's what we
> test and support in XenServer.

Are there limits for 32-bit guests?  There's some complicated limit
having to do with the m2p, right?

>> +
>> +### x86 PV/Event Channels
>> +
>> +Limit: 131072
> 
> Why do we call out event channel limits but not grant table limits? 
> Also, why is this x86?  The 2l and fifo ABIs are arch agnostic, as far
> as I am aware.

Sure, but I'm pretty sure that ARM guests don't (perhaps cannot?) use PV
event channels.
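
FWIW, the 131072 figure matches my reading of the ABI headers: it's the
FIFO ABI's EVTCHN_FIFO_NR_CHANNELS (1<<17), while the 2-level ABI tops
out at the square of the guest word size.  A quick standalone sanity
check (plain C, not Xen code):

#include <stdio.h>

int main(void)
{
    unsigned int fifo  = 1u << 17;  /* EVTCHN_FIFO_NR_CHANNELS */
    unsigned int l2_64 = 64 * 64;   /* 2-level ABI, 64-bit guest */
    unsigned int l2_32 = 32 * 32;   /* 2-level ABI, 32-bit guest */

    printf("fifo=%u 2l/64-bit=%u 2l/32-bit=%u\n", fifo, l2_64, l2_32);
    return 0;
}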

> 
>> +## High Availability and Fault Tolerance
>> +
>> +### Live Migration, Save & Restore
>> +
>> +Status, x86: Supported
> 
> With caveats.  From docs/features/migration.pandoc

This would extend the meaning of "caveats" from "when it's not security
supported" to "when it doesn't work"; which is probably the best thing
at the moment.

> * x86 HVM with nested-virt (no relevant information included in the stream)
[snip]
> Also, features such as vNUMA and nested virt (which are two I know for
> certain) have all state discarded on the source side, because they were
> never suitably plumbed in.

OK, I'll list these, as well as PCI pass-through.

(Actually, vNUMA doesn't seem to be 

Re: [Xen-devel] [PATCH for-4.10] scripts: add a script for build testing

2017-10-23 Thread George Dunlap
On 10/20/2017 06:32 PM, Wei Liu wrote:
> Signed-off-by: Wei Liu 
> ---
> Cc: Andrew Cooper 
> Cc: George Dunlap 
> Cc: Ian Jackson 
> Cc: Jan Beulich 
> Cc: Konrad Rzeszutek Wilk 
> Cc: Stefano Stabellini 
> Cc: Tim Deegan 
> Cc: Wei Liu 
> Cc: Julien Grall 
> 
> The risk for this is zero, hence the for-4.10 tag.

I'm not necessarily arguing against this, but in my estimation this
isn't zero risk.  It's a new feature (even if one only for developers).
It's not *intended* to destroy anything, but a bug in it well could
destroy data.

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 for-4.10] x86/mm: Make PV linear pagetables optional

2017-10-18 Thread George Dunlap
On 10/18/2017 02:41 PM, Jan Beulich wrote:
 On 18.10.17 at 12:51,  wrote:
>> --- a/xen/arch/x86/Kconfig
>> +++ b/xen/arch/x86/Kconfig
>> @@ -37,6 +37,26 @@ source "arch/Kconfig"
>>  config PV
>>  def_bool y
>>  
>> +config PV_LINEAR_PT
>> +   bool "Support for PV linear pagetables"
>> +   depends on PV
>> +   default y
>> +   ---help---
>> + Linear pagetables (also called "recursive pagetables") refers
>> + to the practice of a guest operating system having pagetable
>> + entries pointing to other pagetables of the same level (i.e.,
>> + allowing L2 PTEs to point to other L2 pages).  Some operating
>> + systems use it as a simple way to consistently map the current
>> + process's pagetables into its own virtual address space.
>> +
>> + Linux and MiniOS don't use this technique.  NetBSD and Novell
>> + Netware do; there may be other custom operating systems which
>> + do.  If you're certain you don't plan on having PV guests
>> + which use this feature, turning it off can reduce the attack
>> + surface.
>> +
>> + If unsure, say Y.
>> +
>>  config HVM
>>  def_bool y
> 
> Note how the options in context use tab indentation. Granted
> there are other examples of space indentation in this file, but
> at least they're using 8 spaces (except of course of the help
> text), while you're using 7.
> 
>> @@ -2320,6 +2353,7 @@ static int _put_page_type(struct page_info *page, bool 
>> preemptible,
>>  break;
>>  }
>>  
>> +#ifdef CONFIG_PV_LINEAR_PT
>>  if ( ptpg && PGT_type_equal(x, ptpg->u.inuse.type_info) )
>>  {
>>  /*
>> @@ -2334,6 +2368,9 @@ static int _put_page_type(struct page_info *page, bool 
>> preemptible,
>>  ASSERT(ptpg->linear_pt_count > 0);
>>  ptpg = NULL;
>>  }
>> +#else /* CONFIG_PV_LINEAR_PT */
>> +BUG_ON(ptpg && PGT_type_equal(x, ptpg->u.inuse.type_info));
>> +#endif
> 
> Along the lines of my most recent reply to v1 (which I realize I
> did send only after v2 had arrived), I'm not really certain about
> the usefulness of the preprocessor conditionals - I'd prefer if
> we went without them, but I can live with them if you strongly
> think they're better than the alternative. If you keep them,
> please convert the BUG_ON() to ASSERT() though, to be in
> line with the #ifdef side.

I would argue that if linear pagetables are disabled, and we nonetheless
detect a linear pagetable, then BUG_ON() is the right behavior.  Since
we're not properly tracking any of it, it is almost certainly the result
of a security vulnerability.  Having a DoS in that case is much
preferable to having a privilege escalation.

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 2/3] x86/mm: Consolidate all Xen L2 slot writing into init_xen_pae_l2_slots()

2017-10-18 Thread George Dunlap
On 10/12/2017 02:54 PM, Andrew Cooper wrote:
> Having all of this logic together makes it easier to follow Xen's virtual
> setup across the whole system.
> 
> No functional change.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: George Dunlap 

> ---
> CC: Jan Beulich 
> CC: Tim Deegan 
> CC: George Dunlap 
> CC: Wei Liu 
> CC: Julien Grall 
> ---
>  xen/arch/x86/mm.c  | 16 +---
>  xen/arch/x86/mm/shadow/multi.c | 42 
> +++---
>  xen/include/asm-x86/mm.h   |  1 +
>  3 files changed, 25 insertions(+), 34 deletions(-)
> 
> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
> index f90a42a..ea4af16 100644
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -1433,13 +1433,7 @@ static int alloc_l2_table(struct page_info *page, 
> unsigned long type,
>  }
>  
>  if ( rc >= 0 && (type & PGT_pae_xen_l2) )
> -{
> -/* Xen private mappings. */
> -memcpy(&pl2e[COMPAT_L2_PAGETABLE_FIRST_XEN_SLOT(d)],
> -   &compat_idle_pg_table_l2[
> -   l2_table_offset(HIRO_COMPAT_MPT_VIRT_START)],
> -   COMPAT_L2_PAGETABLE_XEN_SLOTS(d) * sizeof(*pl2e));
> -}
> +init_xen_pae_l2_slots(pl2e, d);
>  
>  unmap_domain_page(pl2e);
>  return rc > 0 ? 0 : rc;
> @@ -1518,6 +1512,14 @@ static int alloc_l3_table(struct page_info *page)
>  return rc > 0 ? 0 : rc;
>  }
>  
> +void init_xen_pae_l2_slots(l2_pgentry_t *l2t, const struct domain *d)
> +{
> +memcpy(&l2t[COMPAT_L2_PAGETABLE_FIRST_XEN_SLOT(d)],
> +   &compat_idle_pg_table_l2[
> +   l2_table_offset(HIRO_COMPAT_MPT_VIRT_START)],
> +   COMPAT_L2_PAGETABLE_XEN_SLOTS(d) * sizeof(*l2t));
> +}
> +
>  /*
>   * This function must write all ROOT_PAGETABLE_PV_XEN_SLOTS, to clobber any
>   * values a guest may have left there from alloc_l4_table().
> diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
> index d540af1..1b76e0c 100644
> --- a/xen/arch/x86/mm/shadow/multi.c
> +++ b/xen/arch/x86/mm/shadow/multi.c
> @@ -1521,31 +1521,6 @@ void sh_install_xen_entries_in_l4(struct domain *d, 
> mfn_t gl4mfn, mfn_t sl4mfn)
>  }
>  #endif
>  
> -#if GUEST_PAGING_LEVELS >= 3
> -// For 3-on-3 PV guests, we need to make sure the xen mappings are in
> -// place, which means that we need to populate the l2h entry in the l3
> -// table.
> -
> -static void sh_install_xen_entries_in_l2h(struct domain *d, mfn_t sl2hmfn)
> -{
> -shadow_l2e_t *sl2e;
> -
> -if ( !is_pv_32bit_domain(d) )
> -return;
> -
> -sl2e = map_domain_page(sl2hmfn);
> -BUILD_BUG_ON(sizeof (l2_pgentry_t) != sizeof (shadow_l2e_t));
> -
> -/* Copy the common Xen mappings from the idle domain */
> -memcpy(
> -&sl2e[COMPAT_L2_PAGETABLE_FIRST_XEN_SLOT(d)],
> -
> &compat_idle_pg_table_l2[l2_table_offset(HIRO_COMPAT_MPT_VIRT_START)],
> -COMPAT_L2_PAGETABLE_XEN_SLOTS(d) * sizeof(*sl2e));
> -
> -unmap_domain_page(sl2e);
> -}
> -#endif
> -
>  
>  /**/
>  /* Create a shadow of a given guest page.
> @@ -1610,7 +1585,14 @@ sh_make_shadow(struct vcpu *v, mfn_t gmfn, u32 
> shadow_type)
>  #endif
>  #if GUEST_PAGING_LEVELS >= 3
>  case SH_type_l2h_shadow:
> -sh_install_xen_entries_in_l2h(v->domain, smfn);
> +BUILD_BUG_ON(sizeof(l2_pgentry_t) != sizeof(shadow_l2e_t));
> +if ( is_pv_32bit_domain(d) )
> +{
> +shadow_l2e_t *l2t = map_domain_page(smfn);
> +
> +init_xen_pae_l2_slots(l2t, d);
> +unmap_domain_page(l2t);
> +}
>  break;
>  #endif
>  default: /* Do nothing */ break;
> @@ -1677,6 +1659,8 @@ sh_make_monitor_table(struct vcpu *v)
>  
>  if ( is_pv_32bit_domain(d) )
>  {
> +l2_pgentry_t *l2t;
> +
>  /* For 32-bit PV guests, we need to map the 32-bit Xen
>   * area into its usual VAs in the monitor tables */
>  m3mfn = shadow_alloc(d, SH_type_monitor_table, 0);
> @@ -1687,7 +1671,11 @@ sh_make_monitor_table(struct vcpu *v)
>  mfn_to_page(m2mfn)->shadow_flags = 2;
>  l3e = map_domain_page(m3mfn);
>  l3e[3] = l3e_from_mfn(m2mfn, _PAGE_PRESENT);
> -sh_install_xen_entries_in_l2h(d, m2mfn);
> +
> +l2t = map_domain_page(m2mfn);
> +init_xen_pae_l2_slots(l2t, d);
> + 

Re: [Xen-devel] [PATCH 3/3] x86/mm: Consolidate all Xen L4 slot writing into init_xen_l4_slots()

2017-10-18 Thread George Dunlap
On 10/12/2017 02:54 PM, Andrew Cooper wrote:
> There are currently three functions which write L4 pagetables for Xen, but
> they all behave subtly differently.  sh_install_xen_entries_in_l4() in
> particular is catering for two different usecases, which makes the safety of
> the linear mappings hard to follow.
> 
> By consolidating the L4 pagetable writing in a single function, the resulting
> setup of Xen's virtual layout is easier to understand.
> 
> No practical changes to the resulting L4, although the logic has been
> rearranged to avoid rewriting some slots.  This changes the zap_ro_mpt
> parameter to simply ro_mpt.
> 
> Both {hap,sh}_install_xen_entries_in_l4() get folded into their callers.  The
> hap side has only a single caller, while the shadow side has two.  The shadow
> split helps highlight the correctness of the linear slots.
> 
> Signed-off-by: Andrew Cooper 

Acked-by: George Dunlap 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2 for-4.10] x86/mm: Make PV linear pagetables optional

2017-10-18 Thread George Dunlap
Allowing pagetables to point to other pagetables of the same level
(often called 'linear pagetables') has been included in Xen since its
inception; but recently it has been the source of a number of subtle
reference-counting bugs.

It is not used by Linux or MiniOS; but it is used by NetBSD and
Novell Netware.  There are significant numbers of people who are never
going to use the feature, along with significant numbers who need the
feature.

Add a Kconfig option for the feature (default to 'y').  Also add a
command-line option to control whether PV linear pagetables are
allowed (default to 'true').

NB that we leave linear_pt_count in the page struct.  It's in a union,
so its presence doesn't increase the size of the data struct.
Changing the layout of the other elements based on configuration
options is asking for trouble however; so we'll just leave it there
and ASSERT that it's zero.
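
(For readers unfamiliar with the trick, here is a toy standalone C
illustration -- not Xen or guest code -- of what a "linear" entry is: a
pagetable slot that references a table of the same level, in the
degenerate case the table itself:)

#include <stdint.h>
#include <stdio.h>

#define ENTRIES 512
#define PRESENT 0x1u

int main(void)
{
    static uint64_t l2[ENTRIES] __attribute__((aligned(4096)));

    /* Normal slots would reference L1 pages; the "linear" slot points
     * back at the L2 page itself, so the table maps its own entries. */
    l2[ENTRIES - 1] = (uint64_t)(uintptr_t)l2 | PRESENT;

    printf("slot %d -> %#llx (the table itself)\n",
           ENTRIES - 1, (unsigned long long)l2[ENTRIES - 1]);
    return 0;
}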

Reported-by: Jann Horn 
Signed-off-by: George Dunlap 
---
Changes since v1
- Remove stray blank lines added from previous patch
- Leave pg->linear_pt_count present, assert it's 0
- Rename variable to opt_pv_linear_pt
- Add spaces around #ifdef/#else/#endif for large code block
- Add /* CONFIG_LINEAR_PV_PT */ after #else/#endif for clarity
- Correct documented default value
- Mention in documentation that the option is only available if configured
- Move config option to below PV (for the day when we make that selectable)

Changes since XSA
- Add a Kconfig option
- Default to 'on' (rather than 'off').

Release justification: This was originally part of a security fix
embargoed until after the freeze date; it wasn't checked in with the
other security patches in order to allow a discussion about the
default.

CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Konrad Wilk 
CC: Julien Grall 
---
 docs/misc/xen-command-line.markdown | 19 +++
 xen/arch/x86/Kconfig| 20 
 xen/arch/x86/mm.c   | 37 +
 3 files changed, 76 insertions(+)

diff --git a/docs/misc/xen-command-line.markdown 
b/docs/misc/xen-command-line.markdown
index eb4995e68b..781110d4b2 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -1422,6 +1422,25 @@ The following resources are available:
 CDP, one COS will corespond two CBMs other than one with CAT, due to the
 sum of CBMs is fixed, that means actual `cos_max` in use will automatically
 reduce to half when CDP is enabled.
+   
+### pv-linear-pt
+> `= <boolean>`
+
+> Default: `true`
+
+Only available if Xen is compiled with CONFIG\_PV\_LINEAR\_PT support
+enabled.
+
+Allow PV guests to have pagetable entries pointing to other pagetables
+of the same level (i.e., allowing L2 PTEs to point to other L2 pages).
+This technique is often called "linear pagetables", and is sometimes
+used to allow operating systems a simple way to consistently map the
+current process's pagetables into its own virtual address space.
+
+Linux and MiniOS don't use this technique.  NetBSD and Novell Netware
+do; there may be other custom operating systems which do.  If you're
+certain you don't plan on having PV guests which use this feature,
+turning it off can reduce the attack surface.
 
 ### rcu-idle-timer-period-ms
 > `= <integer>`
diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
index 64955dc017..a8bbaa652b 100644
--- a/xen/arch/x86/Kconfig
+++ b/xen/arch/x86/Kconfig
@@ -37,6 +37,26 @@ source "arch/Kconfig"
 config PV
def_bool y
 
+config PV_LINEAR_PT
+   bool "Support for PV linear pagetables"
+   depends on PV
+   default y
+   ---help---
+ Linear pagetables (also called "recursive pagetables") refers
+ to the practice of a guest operating system having pagetable
+ entries pointing to other pagetables of the same level (i.e.,
+ allowing L2 PTEs to point to other L2 pages).  Some operating
>> + systems use it as a simple way to consistently map the current
+ process's pagetables into its own virtual address space.
+
+ Linux and MiniOS don't use this technique.  NetBSD and Novell
+ Netware do; there may be other custom operating systems which
+ do.  If you're certain you don't plan on having PV guests
+ which use this feature, turning it off can reduce the attack
+ surface.
+
+ If unsure, say Y.
+
 config HVM
def_bool y
 
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 62d313e3f5..2f9febd1ee 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -587,6 +587,8 @@ static void put_data_page(
 put_page(page);
 }
 
+#ifdef CONFIG_PV_LINEAR_PT
+
 static bool inc_linear_entries(struct page_info *pg)
 {
 typeof(pg->linear_pt_c

Re: [Xen-devel] [PATCH] x86/mm: Make PV linear pagetables optional

2017-10-18 Thread George Dunlap
On 10/18/2017 10:39 AM, Jan Beulich wrote:
 On 17.10.17 at 19:10,  wrote:
>> --- a/docs/misc/xen-command-line.markdown
>> +++ b/docs/misc/xen-command-line.markdown
>> @@ -1422,6 +1422,22 @@ The following resources are available:
>>  CDP, one COS will corespond two CBMs other than one with CAT, due to the
>>  sum of CBMs is fixed, that means actual `cos_max` in use will 
>> automatically
>>  reduce to half when CDP is enabled.
>> +
>> +### pv-linear-pt
>> +> `= <boolean>`
>> +
>> +> Default: `false`
> 
> This looks to be wrong now.
> 
>> --- a/xen/arch/x86/Kconfig
>> +++ b/xen/arch/x86/Kconfig
>> @@ -97,6 +97,27 @@ config TBOOT
>>Technology (TXT)
>>  
>>If unsure, say Y.
>> +
>> +config PV_LINEAR_PT
>> +   bool "Support for PV linear pagetables"
>> +   depends on PV
> 
> For this to look reasonable in a hierarchical menu, it should follow
> PV (with - if there were any - only other options also depending on
> PV in between) rather than being added at a random place.
> 
>> +   default y
>> +   ---help---
>> + Linear pagetables (also called "recursive pagetables") refers
>> + to the practice of a guest operating system having pagetable
> 
> The two lines above should match in how they're being indented.
> 
>> --- a/xen/arch/x86/mm.c
>> +++ b/xen/arch/x86/mm.c
>> @@ -587,6 +587,12 @@ static void put_data_page(
>>  put_page(page);
>>  }
>>  
>> +#ifdef CONFIG_PV_LINEAR_PT
>> +static void zero_linear_entries(struct page_info *pg)
> 
> When framing multiple functions, I think it is better to have a blank
> line between #ifdef and first piece of code (as well as around the
> #else and prior to the #endif), and I think the #else and #endif
> would also benefit from having /* PV_LINEAR_PT */ or some such
> added on their lines.
> 
>> @@ -719,6 +735,20 @@ get_##level##_linear_pagetable( 
>> \
>>  
>> \
>>  return 1;   
>> \
>>  }
>> +#define LPT_ASSERT ASSERT
>> +#else
>> +#define define_get_linear_pagetable(level)  \
>> +static int  \
>> +get_##level##_linear_pagetable( \
>> +level##_pgentry_t pde, unsigned long pde_pfn, struct domain *d) \
>> +{   \
>> +return 0;   \
>> +}
>> +#define zero_linear_entries(pg)
>> +#define dec_linear_uses(pg)
>> +#define dec_linear_entries(pg)
> 
> Would perhaps be better if these evaluated their arguments.
> 
>> +#define LPT_ASSERT(x)
>> +#endif
>>  
>>  
>>  bool is_iomem_page(mfn_t mfn)
> 
> Could you arrange for the double blank lines to go away here with
> the blank line additions asked for above?
> 
>> @@ -2330,8 +2360,8 @@ static int _put_page_type(struct page_info *page, bool 
>> preemptible,
>>   * necessary anymore for a dying domain.
>>   */
>>  ASSERT(page_get_owner(page)->is_dying);
>> -ASSERT(page->linear_pt_count < 0);
>> -ASSERT(ptpg->linear_pt_count > 0);
>> +LPT_ASSERT(page->linear_pt_count < 0);
>> +LPT_ASSERT(ptpg->linear_pt_count > 0);
> 
> Other than Andrew has suggested, with these I don't think
> LPT_ASSERT() can go away, unless you played tricks and forced
> the function's ptpg to be NULL regardless of caller, or unless you
> put the entire if() into an #ifdef.

Actually, coming back to this -- if we disable linear pagetables, how
can it ever be the case that "PGT_type_equal(x,
ptpg->u.inuse.type_info)" evaluates to true?  The ASSERT()s should never
be executed.

OTOH, if we know the code will be completely unused (and that the
compiler won't know), it's probably a better idea to just block it out
entirely anyway (and maybe add a BUG_ON() on the types being equal).
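
Concretely, the shape I'd expect is roughly what v2 ends up doing
(fragment mirroring the v2 hunk posted elsewhere in this archive, not a
standalone build):

#ifdef CONFIG_PV_LINEAR_PT
    if ( ptpg && PGT_type_equal(x, ptpg->u.inuse.type_info) )
    {
        /* ... unwind the linear pagetable reference counts ... */
    }
#else  /* CONFIG_PV_LINEAR_PT */
    BUG_ON(ptpg && PGT_type_equal(x, ptpg->u.inuse.type_info));
#endif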

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] x86/mm: Make PV linear pagetables optional

2017-10-18 Thread George Dunlap
On 10/18/2017 10:39 AM, Jan Beulich wrote:
 On 17.10.17 at 19:10,  wrote:
>> --- a/docs/misc/xen-command-line.markdown
>> +++ b/docs/misc/xen-command-line.markdown
>> @@ -1422,6 +1422,22 @@ The following resources are available:
>>  CDP, one COS will corespond two CBMs other than one with CAT, due to the
>>  sum of CBMs is fixed, that means actual `cos_max` in use will 
>> automatically
>>  reduce to half when CDP is enabled.
>> +
>> +### pv-linear-pt
>> +> `= <boolean>`
>> +
>> +> Default: `false`
> 
> This looks to be wrong now.
> 
>> --- a/xen/arch/x86/Kconfig
>> +++ b/xen/arch/x86/Kconfig
>> @@ -97,6 +97,27 @@ config TBOOT
>>Technology (TXT)
>>  
>>If unsure, say Y.
>> +
>> +config PV_LINEAR_PT
>> +   bool "Support for PV linear pagetables"
>> +   depends on PV
> 
> For this to look reasonable in a hierarchical menu, it should follow
> PV (with - if there were any - only other options also depending on
> PV in between) rather than being added at a random place.

AFAICT there's no way to select PV or HVM options in the menu at the
moment.  I could move this below the 'PV' option in case that should
ever change.

>> +   default y
>> +   ---help---
>> + Linear pagetables (also called "recursive pagetables") refers
>> + to the practice of a guest operating system having pagetable
> 
> The two lines above should match in how they're being indented.

Gah -- this isn't a .c file so my .c style isn't being applied.  Let me
see what I can do.

> 
>> --- a/xen/arch/x86/mm.c
>> +++ b/xen/arch/x86/mm.c
>> @@ -587,6 +587,12 @@ static void put_data_page(
>>  put_page(page);
>>  }
>>  
>> +#ifdef CONFIG_PV_LINEAR_PT
>> +static void zero_linear_entries(struct page_info *pg)
> 
> When framing multiple functions, I think it is better to have a blank
> line between #ifdef and first piece of code (as well as around the
> #else and prior to the #endif), and I think the #else and #endif
> would also benefit from having /* PV_LINEAR_PT */ or some such
> added on their lines.

Ack

> 
>> @@ -719,6 +735,20 @@ get_##level##_linear_pagetable( 
>> \
>>  
>> \
>>  return 1;   
>> \
>>  }
>> +#define LPT_ASSERT ASSERT
>> +#else
>> +#define define_get_linear_pagetable(level)  \
>> +static int  \
>> +get_##level##_linear_pagetable( \
>> +level##_pgentry_t pde, unsigned long pde_pfn, struct domain *d) \
>> +{   \
>> +return 0;   \
>> +}
>> +#define zero_linear_entries(pg)
>> +#define dec_linear_uses(pg)
>> +#define dec_linear_entries(pg)
> 
> Would perhaps be better if these evaluated their arguments.

Following Andy's suggestion I'm changing them to static inlines and
adding an ASSERT().

>> +#define LPT_ASSERT(x)
>> +#endif
>>  
>>  
>>  bool is_iomem_page(mfn_t mfn)
> 
> Could you arrange for the double blank lines to go away here with
> the blank line additions asked for above?

Ack

> 
>> @@ -2330,8 +2360,8 @@ static int _put_page_type(struct page_info *page, bool 
>> preemptible,
>>   * necessary anymore for a dying domain.
>>   */
>>  ASSERT(page_get_owner(page)->is_dying);
>> -ASSERT(page->linear_pt_count < 0);
>> -ASSERT(ptpg->linear_pt_count > 0);
>> +LPT_ASSERT(page->linear_pt_count < 0);
>> +LPT_ASSERT(ptpg->linear_pt_count > 0);
> 
> Other than Andrew has suggested, with these I don't think
> LPT_ASSERT() can go away, unless you played tricks and forced
> the function's ptpg to be NULL regardless of caller, or unless you
> put the entire if() into an #ifdef.

Good point -- I'll see what I can do.

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] x86/mm: Make PV linear pagetables optional

2017-10-18 Thread George Dunlap
On 10/17/2017 07:05 PM, Andrew Cooper wrote:
> On 17/10/17 18:10, George Dunlap wrote:
>> Allowing pagetables to point to other pagetables of the same level
>> (often called 'linear pagetables') has been included in Xen since its
>> inception; but recently it has been the source of a number of subtle
>> reference-counting bugs.
>>
>> It is not used by Linux or MiniOS; but it is used by NetBSD and
>> Novell Netware.  There are significant numbers of people who are never
>> going to use the feature, along with significant numbers who need the
>> feature.
>>
>> Add a Kconfig option for the feature (default to 'y').  Also add a
>> command-line option to control whether PV linear pagetables are
>> allowed (default to 'true').
>>
>> In order to make the code clean:
>> - Introduce LPT_ASSERT(), which only exists if CONFIG_PV_LINEAR_PT is defined
>> - Introduce zero_linear_entries() to set page->linear_pt_count to zero
>>   (or do nothing, as appropriate)
>>
>> Reported-by: Jann Horn 
>> Signed-off-by: George Dunlap 
> 
> Definitely +1 to this kind of arrangement of user choices.  Some notes
> below.
> 
>> diff --git a/docs/misc/xen-command-line.markdown 
>> b/docs/misc/xen-command-line.markdown
>> index eb4995e68b..952368d3be 100644
>> --- a/docs/misc/xen-command-line.markdown
>> +++ b/docs/misc/xen-command-line.markdown
>> @@ -1422,6 +1422,22 @@ The following resources are available:
>>  CDP, one COS will corespond two CBMs other than one with CAT, due to the
>>  sum of CBMs is fixed, that means actual `cos_max` in use will 
>> automatically
>>  reduce to half when CDP is enabled.
>> +
>> +### pv-linear-pt
>> +> `= <boolean>`
>> +
>> +> Default: `false`
> 
> Only available if Xen is compiled with CONFIG_PV_LINEAR_PT support enabled.

Ack

> 
>> +
>> +Allow PV guests to have pagetable entries pointing to other pagetables
>> +of the same level (i.e., allowing L2 PTEs to point to other L2 pages).
>> +This technique is often called "linear pagetables", and is sometimes
>> +used to allow operating systems a simple way to consistently map the
>> +current process's pagetables into its own virtual address space.
>> +
>> +Linux and MiniOS don't use this technique.  NetBSD and Novell Netware
>> +do; there may be other custom operating systems which do.  If you're
>> +certain you don't plan on having PV guests which use this feature,
>> +turning it off can reduce the attack surface.
>>  
>>  ### rcu-idle-timer-period-ms
>>  > `= <integer>`
>> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
>> index 62d313e3f5..5881b64608 100644
>> --- a/xen/arch/x86/mm.c
>> +++ b/xen/arch/x86/mm.c
>> @@ -654,6 +660,9 @@ static void dec_linear_uses(struct page_info *pg)
>>   * frame if it is mapped by a different root table. This is sufficient 
>> and
>>   * also necessary to allow validation of a root table mapping itself.
>>   */
>> +static bool __read_mostly pv_linear_pt_enable = true;
>> +boolean_param("pv-linear-pt", pv_linear_pt_enable);
> 
> The _enable suffix just makes the name longer, and (semi-upheld)
> convention would be for opt_pv_linear_pt, which is fine even in its used
> context below.

Ack

> 
>> +
>>  #define define_get_linear_pagetable(level)  
>> \
>>  static int  
>> \
>>  get_##level##_linear_pagetable( 
>> \
>> diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
>> index 26f0153164..7825f36316 100644
>> --- a/xen/include/asm-x86/mm.h
>> +++ b/xen/include/asm-x86/mm.h
>> @@ -177,10 +177,15 @@ struct page_info
>>   *   in use.
>>   */
>>  struct {
>> +#ifdef CONFIG_PV_LINEAR_PT
>>  u16 nr_validated_ptes:PAGETABLE_ORDER + 1;
>>  u16 :16 - PAGETABLE_ORDER - 1 - 2;
>>  s16 partial_pte:2;
>>  s16 linear_pt_count;
>> +#else
>> +u16 nr_validated_ptes;
>> +s8 partial_pte;
>> +#endif
> 
> I don't think this is a clever move.  Having CONFIG_PV_LINEAR_PT change
> the behaviour of nr_validated_ptes and partial_pte is a recipe for
> subtle bugs.
>
> An alternative would be to have the dec_linear_{uses,entries}()
> BUG_ON(pg->linear_pt_count != 0) when !CONFIG_PV_LINEAR_PT

Oh, I just noticed this was a union; so cutting out linear_pt_count
doesn't actually save you any space.

Yeah, in that case, leaving it in and adding ASSERTs that it's 0 makes
sense.  (I think an ASSERT is better than a BUG_ON() in this case.)

 -George


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH] x86/mm: Make PV linear pagetables optional

2017-10-17 Thread George Dunlap
Allowing pagetables to point to other pagetables of the same level
(often called 'linear pagetables') has been included in Xen since its
inception; but recently it has been the source of a number of subtle
reference-counting bugs.

It is not used by Linux or MiniOS; but it is used by NetBSD and
Novell Netware.  There are significant numbers of people who are never
going to use the feature, along with significant numbers who need the
feature.

Add a Kconfig option for the feature (default to 'y').  Also add a
command-line option to control whether PV linear pagetables are
allowed (default to 'true').

In order to make the code clean:
- Introduce LPT_ASSERT(), which only exists if CONFIG_PV_LINEAR_PT is defined
- Introduce zero_linear_entries() to set page->linear_pt_count to zero
  (or do nothing, as appropriate)

Reported-by: Jann Horn 
Signed-off-by: George Dunlap 
---
Changes since XSA
- Add a Kconfig option
- Default to 'on' (rather than 'off').

Release justification: This was originally part of a security fix
embargoed until after the freeze date; it wasn't checked in with the
other security patches in order to allow a discussion about the
default.

CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Konrad Wilk 
CC: Julien Grall 
---
 docs/misc/xen-command-line.markdown | 16 
 xen/arch/Kconfig|  1 +
 xen/arch/arm/mm.c   |  1 +
 xen/arch/x86/Kconfig| 21 
 xen/arch/x86/mm.c   | 38 +
 xen/include/asm-x86/mm.h|  5 +
 6 files changed, 78 insertions(+), 4 deletions(-)

diff --git a/docs/misc/xen-command-line.markdown 
b/docs/misc/xen-command-line.markdown
index eb4995e68b..952368d3be 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -1422,6 +1422,22 @@ The following resources are available:
 CDP, one COS will corespond two CBMs other than one with CAT, due to the
 sum of CBMs is fixed, that means actual `cos_max` in use will automatically
 reduce to half when CDP is enabled.
+   
+### pv-linear-pt
+> `= <boolean>`
+
+> Default: `false`
+
+Allow PV guests to have pagetable entries pointing to other pagetables
+of the same level (i.e., allowing L2 PTEs to point to other L2 pages).
+This technique is often called "linear pagetables", and is sometimes
+used to allow operating systems a simple way to consistently map the
+current process's pagetables into its own virtual address space.
+
+Linux and MiniOS don't use this technique.  NetBSD and Novell Netware
+do; there may be other custom operating systems which do.  If you're
+certain you don't plan on having PV guests which use this feature,
+turning it off can reduce the attack surface.
 
 ### rcu-idle-timer-period-ms
 > `= <integer>`
diff --git a/xen/arch/Kconfig b/xen/arch/Kconfig
index cf0acb7e89..47287a4985 100644
--- a/xen/arch/Kconfig
+++ b/xen/arch/Kconfig
@@ -6,3 +6,4 @@ config NR_CPUS
default "128" if ARM
---help---
  Specifies the maximum number of physical CPUs which Xen will support.
+
diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 3c328e2df5..199155fcd8 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -42,6 +42,7 @@
 #include 
 #include 
 
+
 struct domain *dom_xen, *dom_io, *dom_cow;
 
 /* Override macros from asm/page.h to make them work with mfn_t */
diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
index 64955dc017..e2fcbaf5cc 100644
--- a/xen/arch/x86/Kconfig
+++ b/xen/arch/x86/Kconfig
@@ -97,6 +97,27 @@ config TBOOT
  Technology (TXT)
 
  If unsure, say Y.
+
+config PV_LINEAR_PT
+   bool "Support for PV linear pagetables"
+   depends on PV
+   default y
+   ---help---
+ Linear pagetables (also called "recursive pagetables") refers
+to the practice of a guest operating system having pagetable
+entries pointing to other pagetables of the same level (i.e.,
+allowing L2 PTEs to point to other L2 pages).  Some operating
+systems use it as a simple way to consistently map the current
+process's pagetables into its own virtual address space.
+
+Linux and MiniOS don't use this technique.  NetBSD and Novell
+Netware do; there may be other custom operating systems which
+do.  If you're certain you don't plan on having PV guests
+which use this feature, turning it off can reduce the attack
+surface.
+
+If unsure, say Y.
+
 endmenu
 
 source "common/Kconfig"
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 62d313e3f5..5881b64608 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -587,6 +587,12 @@ static void put_data_page(
 put_page(page);
 }
 
+#ifdef CONFIG_PV_LINEAR_PT

Re: [Xen-devel] [PATCH] x86/mm: Make PV linear pagetables optional

2017-10-17 Thread George Dunlap
On 10/17/2017 06:10 PM, George Dunlap wrote:
> Allowing pagetables to point to other pagetables of the same level
> (often called 'linear pagetables') has been included in Xen since its
> inception; but recently it has been the source of a number of subtle
> reference-counting bugs.
> 
> It is not used by Linux or MiniOS; but it is used by NetBSD and
> Novell Netware.  There are significant numbers of people who are never
> going to use the feature, along with significant numbers who need the
> feature.
> 
> Add a Kconfig option for the feature (default to 'y').  Also add a
> command-line option to control whether PV linear pagetables are
> allowed (default to 'true').
> 
> In order to make the code clean:
> - Introduce LPT_ASSERT(), which only exists if CONFIG_PV_LINEAR_PT is defined
> - Introduce zero_linear_entries() to set page->linear_pt_count to zero
>   (or do nothing, as appropriate)
> 
> Reported-by: Jann Horn 
> Signed-off-by: George Dunlap 
> ---
> Changes since XSA
> - Add a Kconfig option
> - Default to 'on' (rather than 'off').
> 
> Release justification: This was originally part of a security fix
> embargoed until after the freeze date; it wasn't checked in with the
> other security patches in order to allow a discussion about the
> default.
> 
> CC: Ian Jackson 
> CC: Wei Liu 
> CC: Andrew Cooper 
> CC: Jan Beulich 
> CC: Stefano Stabellini 
> CC: Konrad Wilk 
> CC: Julien Grall 
> ---
>  docs/misc/xen-command-line.markdown | 16 
>  xen/arch/Kconfig|  1 +
>  xen/arch/arm/mm.c   |  1 +
>  xen/arch/x86/Kconfig| 21 
>  xen/arch/x86/mm.c   | 38 
> +
>  xen/include/asm-x86/mm.h|  5 +
>  6 files changed, 78 insertions(+), 4 deletions(-)
> 
> diff --git a/docs/misc/xen-command-line.markdown 
> b/docs/misc/xen-command-line.markdown
> index eb4995e68b..952368d3be 100644
> --- a/docs/misc/xen-command-line.markdown
> +++ b/docs/misc/xen-command-line.markdown
> @@ -1422,6 +1422,22 @@ The following resources are available:
>  CDP, one COS will corespond two CBMs other than one with CAT, due to the
>  sum of CBMs is fixed, that means actual `cos_max` in use will 
> automatically
>  reduce to half when CDP is enabled.
> + 
> +### pv-linear-pt
> +> `= <boolean>`
> +
> +> Default: `false`
> +
> +Allow PV guests to have pagetable entries pointing to other pagetables
> +of the same level (i.e., allowing L2 PTEs to point to other L2 pages).
> +This technique is often called "linear pagetables", and is sometimes
> +used to allow operating systems a simple way to consistently map the
> +current process's pagetables into its own virtual address space.
> +
> +Linux and MiniOS don't use this technique.  NetBSD and Novell Netware
> +do; there may be other custom operating systems which do.  If you're
> +certain you don't plan on having PV guests which use this feature,
> +turning it off can reduce the attack surface.
>  
>  ### rcu-idle-timer-period-ms
>  > `= <integer>`
> diff --git a/xen/arch/Kconfig b/xen/arch/Kconfig
> index cf0acb7e89..47287a4985 100644
> --- a/xen/arch/Kconfig
> +++ b/xen/arch/Kconfig
> @@ -6,3 +6,4 @@ config NR_CPUS
>   default "128" if ARM
>   ---help---
> Specifies the maximum number of physical CPUs which Xen will support.
> +
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index 3c328e2df5..199155fcd8 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -42,6 +42,7 @@
>  #include 
>  #include 
>  
> +

Gah -- sorry about the blank lines.  Should have looked over the patch
better first.

I'll wait for feedback on the rest of the patch before I resend.

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] x86emul: keep compiler from using {x, y, z}mm registers itself

2017-10-16 Thread George Dunlap
On 10/16/2017 01:32 PM, Jan Beulich wrote:
> Since the emulator acts on the live hardware registers, we need to
> prevent the compiler from using them e.g. for inlined memcpy() /
> memset() (as gcc7 does). 

Why doesn't this affect the rest of the hypervisor too, since we don't
save and restore the *mm registers?

> We can't, however, set this from the command
> line, as otherwise the 64-bit build would face issues with functions
> returning floating point values and being declared in standard headers.

Sorry, just to clarify: You mean that there are standard headers which
contain prototypes for functions which return floating point values; we
include those headers but do not call the functions; and adding the
#pragma to the command-line would cause the compiler to choke on the
prototypes (even though the functions are never actually called)?

 -George


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v4 11/12] fuzz/x86_emulate: Set and fuzz more CPU state

2017-10-13 Thread George Dunlap
On 10/12/2017 04:38 PM, Jan Beulich wrote:
 On 11.10.17 at 19:52,  wrote:
>> The Intel manual claims that, "If [certain CPUID bits] are set, the
>> processor deprecates FCS and FDS, and the field is saved as 0000h";
>> but experimentally it would be more accurate to say, "the field is
>> occasionally saved as 0000h".  This causes the --rerun checking to
>> trip non-deterministically.  Sanitize them to zero.
> 
> I think we've meanwhile settled on the field being saved as zero
> being a side effect of using 32-bit fxsave plus a context switch in
> the OS kernel.
> 
>> @@ -594,6 +595,75 @@ static const struct x86_emulate_ops all_fuzzer_ops = {
>>  };
>>  #undef SET
>>  
>> +/*
>> + * This function will read or write fxsave to the fpu.  When writing,
>> + * it 'sanitizes' the state: It will mask off the appropriate bits in
>> + * the mxcsr, 'restore' the state to the fpu, then 'save' it again so
>> + * that the data in fxsave reflects what's actually in the FPU.
>> + *
>> + * TODO: Extend state beyond just FPU (ymm registers, &c)
>> + */
>> +static void _set_fpu_state(char *fxsave, bool write)
>> +{
>> +if ( cpu_has_fxsr )
>> +{
>> +static union __attribute__((__aligned__(16))) {
>> +char x[512];
>> +struct {
>> +uint16_t cw, sw;
>> +uint8_t  tw, _rsvd1;
>> +uint16_t op;
>> +uint32_t ip;
>> +uint16_t cs, _rsvd2;
>> +uint32_t dp;
>> +uint16_t ds, _rsvd3;
>> +uint32_t mxcsr;
>> +uint32_t mxcsr_mask;
>> +/* ... */
>> +};
>> +} *fxs;
>> +
>> +fxs = (typeof(fxs))fxsave;
>> +
>> +if ( write )
>> +{
>> +/* 
>> + * Clear reserved bits to make sure we don't get any
>> + * exceptions
>> + */
>> +fxs->mxcsr &= mxcsr_mask;
>> +
>> +/*
>> + * The Intel manual says that on newer models CS/DS are
>> + * deprecated and that these fields "are saved as 0000h".
>> + * Experimentally, however, at least on my test box,
>> + * whether this is saved as 0000h or as the previously
>> + * written value is random; meaning that when run with
>> + * --rerun, we occasionally detect a "state mismatch" in these
>> + * bytes.  Instead, simply sanitize them to zero.
>> + *
>> + * TODO Check CPUID as specified in the manual before
>> + * clearing
>> + */
>> +fxs->cs = fxs->ds = 0;
> 
> Shouldn't be needed anymore with ...
> 
>> +asm volatile( "fxrstor %0" :: "m" (*fxs) );
> 
> rex64 (or fxrstor64) used here and ...
> 
>> +}
>> +
>> +asm volatile( "fxsave %0" : "=m" (*fxs) );
> 
> ... here (of course the alternative here then is fxsave64).
> 
> Also please add blanks before the opening parentheses.
> 
>> @@ -732,6 +806,18 @@ static void setup_state(struct x86_emulate_ctxt *ctxt)
>>  printf("Setting cpu_user_regs offset %x\n", offset);
>>  continue;
>>  }
>> +offset -= sizeof(struct cpu_user_regs);
>> +
>> +/* Fuzz fxsave state */
>> +if ( offset < sizeof(s->fxsave) / 4 )
> 
> You've switched to sizeof() here but ...
> 
>> +{
>> +/* 32-bit size is arbitrary; see comment above */
>> +if ( !input_read(s, s->fxsave + (offset * 4), 4) )
>> +return;
>> +printf("Setting fxsave offset %x\n", offset * 4);
>> +continue;
>> +}
>> +offset -= 128;
> 
> ... not here.
> 
>> @@ -1008,6 +1098,16 @@ static void compare_states(struct fuzz_state state[2])
>>  if ( memcmp(&state[0].ops, &state[1].ops, sizeof(state[0].ops)) )
>>  printf("ops differ!\n");
>>  
>> +if ( memcmp(&state[0].fxsave, &state[1].fxsave, 
>> sizeof(state[0].fxsave)) )
>> +{
>> +printf("fxsave differs!\n");
>> +for ( i = 0;  i < sizeof(state[0].fxsave)/sizeof(unsigned); i++ 
>> )
> 
> Blanks around / again please.
> 
>> +{
>> +printf("[%04lu] %08x %08x\n",
> 
> I think I've indicated before that I consider leading zeros on decimal
> numbers misleading. 

Come to think of it I agree with you.

> Could I talk you into using %4lu instead (or
> really %4zu, considering the expression type) in places like this one
> (i.e. also in the earlier patch, where I notice only now the l -> z
> conversion wasn't done consistently either)?

/me looks up what %zu is supposed to do

Sure.
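
(For the archive: 'z' is the size_t length modifier, so a sizeof-derived
index needs no cast.  A trivial standalone example:)

#include <stdio.h>

int main(void)
{
    size_t i = sizeof(unsigned) * 4;

    printf("[%4zu]\n", i);  /* right-aligned, no leading zeros, no cast */
    return 0;
}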

> 
>> +i * sizeof(unsigned), ((unsigned 
>> *)&state[0].fxsave)[i], ((unsigned *)&state[1].fxsave)[i]);
> 
> Long line.

Ack.

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH for-4.10] fuzz/x86_emulate: Fix afl-harness batch mode file pointer leak

2017-10-13 Thread George Dunlap
On 10/13/2017 11:31 AM, Jan Beulich wrote:
 On 13.10.17 at 12:23,  wrote:
>> On 10/13/2017 10:20 AM, Jan Beulich wrote:
>> On 13.10.17 at 11:10,  wrote:
 On 10/13/2017 10:06 AM, Jan Beulich wrote:
 On 13.10.17 at 11:00,  wrote:
>> --- a/tools/fuzz/x86_instruction_emulator/afl-harness.c
>> +++ b/tools/fuzz/x86_instruction_emulator/afl-harness.c
>> @@ -99,13 +99,17 @@ int main(int argc, char **argv)
>>  exit(-1);
>>  }
>>  
>> -if ( !feof(fp) )
>> +/* Only run the test if the input file was smaller than 
>> INPUT_SIZE */
>> +if ( feof(fp) )
>> +{
>> +LLVMFuzzerTestOneInput(input, size);
>> +}
>
> ... ideally with the unnecessary braces dropped here
> Reviewed-by: Jan Beulich 

 Do you really want this to look like this?

 if ( ... )
foo();
 else
 {
...
 }
>>>
>>> Yes. It's Linux and qemu who dislike non-matched if/else bodies,
>>> but our ./CODING_STYLE only says
>>>
>>> "Braces should be omitted for blocks with a single statement. e.g.,
>>>  
>>>  if ( condition )
>>>  single_statement();"
>>>
>>> and personally I'm happy that it doesn't say anything more.
>>
>> Hmm, I personally think it's ugly enough that I'd rather restructure the
>> code to avoid it looking like that. :-)
>>
>> I'll see what I can do.
> 
> Well, assuming you would think that way I've intentionally said
> "ideally", i.e. if you really don't want to change it, I can live with
> the braces.

OK, thanks. :-)

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH for-4.10] fuzz/x86_emulate: Fix afl-harness batch mode file pointer leak

2017-10-13 Thread George Dunlap
On 10/13/2017 10:20 AM, Jan Beulich wrote:
 On 13.10.17 at 11:10,  wrote:
>> On 10/13/2017 10:06 AM, Jan Beulich wrote:
>> On 13.10.17 at 11:00,  wrote:
 --- a/tools/fuzz/x86_instruction_emulator/afl-harness.c
 +++ b/tools/fuzz/x86_instruction_emulator/afl-harness.c
 @@ -99,13 +99,17 @@ int main(int argc, char **argv)
  exit(-1);
  }
  
 -if ( !feof(fp) )
 +/* Only run the test if the input file was smaller than 
 INPUT_SIZE */
 +if ( feof(fp) )
 +{
 +LLVMFuzzerTestOneInput(input, size);
 +}
>>>
>>> ... ideally with the unnecessary braces dropped here
>>> Reviewed-by: Jan Beulich 
>>
>> Do you really want this to look like this?
>>
>> if ( ... )
>>foo();
>> else
>> {
>>...
>> }
> 
> Yes. It's Linux and qemu who dislike non-matched if/else bodies,
> but our ./CODING_STYLE only says
> 
> "Braces should be omitted for blocks with a single statement. e.g.,
>  
>  if ( condition )
>  single_statement();"
> 
> and personally I'm happy that it doesn't say anything more.

Hmm, I personally think it's ugly enough that I'd rather restructure the
code to avoid it looking like that. :-)

I'll see what I can do.
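
One possible restructure -- just a sketch of the direction under the
same feof() semantics, not necessarily what will get committed -- is to
make the error path the braced early exit, leaving the success path a
bare call:

#include <stdio.h>
#include <stdlib.h>

#define INPUT_SIZE 4096

static unsigned char input[INPUT_SIZE];

/* Stub standing in for the real fuzzing entry point. */
static int LLVMFuzzerTestOneInput(const unsigned char *data, size_t size)
{
    (void)data;
    (void)size;
    return 0;
}

static void run_one(FILE *fp, size_t size)
{
    if ( !feof(fp) )  /* input didn't fit in the buffer: reject it */
    {
        printf("Input too large\n");
        exit(-1);
    }

    LLVMFuzzerTestOneInput(input, size);
}

int main(void)
{
    size_t size = fread(input, 1, INPUT_SIZE, stdin);

    run_one(stdin, size);
    return 0;
}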

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v4 08/12] fuzz/x86_emulate: Move all state into fuzz_state

2017-10-13 Thread George Dunlap
On 10/13/2017 10:54 AM, Jan Beulich wrote:
 On 13.10.17 at 11:22,  wrote:
>> On 10/12/2017 04:16 PM, Jan Beulich wrote:
>> On 11.10.17 at 19:52,  wrote:
 @@ -761,12 +757,11 @@ static void disable_hooks(struct x86_emulate_ctxt 
 *ctxt)
  static void sanitize_input(struct x86_emulate_ctxt *ctxt)
  {
  struct fuzz_state *s = ctxt->data;
 -struct fuzz_corpus *c = s->corpus;
 -struct cpu_user_regs *regs = &c->regs;
 -unsigned long bitmap = c->options;
 +struct cpu_user_regs *regs = ctxt->regs;
 +unsigned long bitmap = s->options;
  
  /* Some hooks can't be disabled. */
>> -c->options &= ~((1<<HOOK_read)|(1<<HOOK_insn_fetch));
>> +s->options &= ~((1<<HOOK_read)|(1<<HOOK_insn_fetch));
>>> Mind adding the missing blanks here while you touch this?
>>
>> Like this?
>>
>> s->options &= ~((1 << HOOK_read) | (1 << HOOK_insn_fetch));
> Even farther (at the same time adding the missing number suffixes):
> 
> s->options &= ~((1UL << HOOK_read) | (1UL << HOOK_insn_fetch));

Got it.  (I was actually trying to verify the 'snuggly' outer braces,
but missed spaces around the '<<'s).

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v4 10/12] fuzz/x86_emulate: Add --rerun option to try to track down instability

2017-10-13 Thread George Dunlap
On 10/12/2017 04:24 PM, Jan Beulich wrote:
 On 11.10.17 at 19:52,  wrote:
>> @@ -884,20 +891,146 @@ int LLVMFuzzerInitialize(int *argc, char ***argv)
>>  return 0;
>>  }
>>  
>> -int LLVMFuzzerTestOneInput(const uint8_t *data_p, size_t size)
>> +static void setup_fuzz_state(struct fuzz_state *state, const void *data_p, 
>> size_t size)
>>  {
>> -struct fuzz_state state = {
>> -.ops = all_fuzzer_ops,
>> -};
>> -struct x86_emulate_ctxt ctxt = {
>> -.data = &state,
>> -.regs = &state.regs,
>> -.addr_size = 8 * sizeof(void *),
>> -.sp_size = 8 * sizeof(void *),
>> -};
>> +memset(state, 0, sizeof(*state));
>> +state->corpus = data_p;
>> +state->data_num = size;
>> +}
>> +
>> +static int runtest(struct fuzz_state *state) {
>>  int rc;
>>  
>> -if ( size <= fuzz_minimal_input_size() )
>> +struct x86_emulate_ctxt *ctxt = &state->ctxt;
> 
> Please don't leave a blank line between declarations.
> 
>> +static void compare_states(struct fuzz_state state[2])
>> +{
>> +/* First zero any "internal" pointers */
>> +state[0].corpus = state[1].corpus = NULL;
>> +state[0].ctxt.data = state[1].ctxt.data = NULL;
>> +state[0].ctxt.regs = state[1].ctxt.regs = NULL;
>> +
>> +if ( memcmp(&state[0], &state[1], sizeof(struct fuzz_state)) )
>> +{
>> +unsigned int i;
>> +
>> +printf("State mismatch\n");
>> +
>> +for ( i = 0; i < ARRAY_SIZE(state[0].cr); i++ )
>> +if ( state[0].cr[i] != state[1].cr[i] )
>> +printf("cr[%u]: %lx != %lx\n",
>> +   i, state[0].cr[i], state[1].cr[i]);
>> +
>> +for ( i = 0; i < ARRAY_SIZE(state[0].msr); i++ )
>> +if ( state[0].msr[i] != state[1].msr[i] )
>> +printf("msr[%u]: %lx != %lx\n",
>> +   i, state[0].msr[i], state[1].msr[i]);
>> +
>> +for ( i = 0; i < ARRAY_SIZE(state[0].segments); i++ )
>> +if ( memcmp(&state[0].segments[i], &state[1].segments[i],
>> +sizeof(state[0].segments[0])) )
>> +printf("segments[%u]: [%x:%x:%x:%lx] != [%x:%x:%x:%lx]!\n", 
>> i,
>> +   (unsigned)state[0].segments[i].sel,
>> +   (unsigned)state[0].segments[i].attr,
>> +   state[0].segments[i].limit,
>> +   state[0].segments[i].base,
>> +   (unsigned)state[1].segments[i].sel,
>> +   (unsigned)state[1].segments[i].attr,
>> +   state[1].segments[i].limit,
>> +   state[1].segments[i].base);
>> +
>> +if ( state[0].data_num != state[1].data_num )
>> +printf("data_num: %lx !=  %lx\n", state[0].data_num,
>> +   state[1].data_num);
>> +if ( state[0].data_index != state[1].data_index )
>> +printf("data_index: %lx !=  %lx\n", state[0].data_index,
>> +   state[1].data_index);
>> +
>> +if ( memcmp(&state[0].regs, &state[1].regs, sizeof(state[0].regs)) )
>> +{
>> +printf("registers differ!\n");
>> +/* Print If Not Equal */
>> +#define PRINT_NE(elem)\
>> +if ( state[0].elem != state[1].elem ) \
>> +printf(#elem " differ: %lx != %lx\n", \
>> +   (unsigned long)state[0].elem, \
>> +   (unsigned long)state[1].elem)
>> +PRINT_NE(regs.r15);
>> +PRINT_NE(regs.r14);
>> +PRINT_NE(regs.r13);
>> +PRINT_NE(regs.r12);
>> +PRINT_NE(regs.rbp);
>> +PRINT_NE(regs.rbx);
>> +PRINT_NE(regs.r10);
>> +PRINT_NE(regs.r11);
>> +PRINT_NE(regs.r9);
>> +PRINT_NE(regs.r8);
>> +PRINT_NE(regs.rax);
>> +PRINT_NE(regs.rcx);
>> +PRINT_NE(regs.rdx);
>> +PRINT_NE(regs.rsi);
>> +PRINT_NE(regs.rdi);
> 
> Aren't these register fields all of the same type? If so, why do you
> need to casts to unsigned long in the macro?

As it happens, they're all the same size; when I wrote the macro it was
designed such that the same macro could be used for all the elements
regardless of what size they were.  Since there's no time pressure,
would you rather I add the segment registers (and leave the cast), or
only add rflags (and remove the cast)?

> 
>> +for ( i = offsetof(struct cpu_user_regs, error_code) / 
>> sizeof(unsigned);
>> +  i < sizeof(state[1].regs)/sizeof(unsigned); i++ )
> 
> Blanks around binary operators please (also elsewhere).

Ack

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v4 08/12] fuzz/x86_emulate: Move all state into fuzz_state

2017-10-13 Thread George Dunlap
On 10/12/2017 04:16 PM, Jan Beulich wrote:
 On 11.10.17 at 19:52,  wrote:
>> --- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
>> +++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
>> @@ -22,34 +22,31 @@
>>  
>>  #define SEG_NUM x86_seg_none
>>  
>> -/* Layout of data expected as fuzzing input. */
>> -struct fuzz_corpus
>> +/*
>> + * State of the fuzzing harness and emulated cpu.  Calculated
>> + * initially from the input corpus, and later mutated by the emulation
>> + * callbacks (and the emulator itself, in the case of regs).
>> + */
>> +struct fuzz_state
>>  {
>> +/* Emulated CPU state */
>> +unsigned long options;
>>  unsigned long cr[5];
>>  uint64_t msr[MSR_INDEX_MAX];
>> -struct cpu_user_regs regs;
>>  struct segment_register segments[SEG_NUM];
>> -unsigned long options;
>> -unsigned char data[INPUT_SIZE];
>> -} input;
>> -#define DATA_OFFSET offsetof(struct fuzz_corpus, data)
>> +struct cpu_user_regs regs;
>>  
>> -/*
>> - * Internal state of the fuzzing harness.  Calculated initially from the 
>> input
>> - * corpus, and later mutates by the emulation callbacks.
>> - */
>> -struct fuzz_state
>> -{
>>  /* Fuzzer's input data. */
>> -struct fuzz_corpus *corpus;
>> +#define DATA_OFFSET offsetof(struct fuzz_state, corpus)
>> +const unsigned char * corpus;
> 
> Stray blank after *. Also any reason this can't be uint8_t,
> matching LLVMFuzzerTestOneInput()'s parameter and making
> it possible to avoid the cast you currently use on that
> assignment?

For some reason I thought this would make things uglier; but it actually
works pretty well.

>> @@ -646,11 +634,20 @@ static void set_sizes(struct x86_emulate_ctxt *ctxt)
>>  ctxt->addr_size = ctxt->sp_size = 64;
>>  else
>>  {
>> -ctxt->addr_size = c->segments[x86_seg_cs].db ? 32 : 16;
>> -ctxt->sp_size   = c->segments[x86_seg_ss].db ? 32 : 16;
>> +ctxt->addr_size = s->segments[x86_seg_cs].db ? 32 : 16;
>> +ctxt->sp_size   = s->segments[x86_seg_ss].db ? 32 : 16;
>>  }
>>  }
>>  
>> +static void setup_state(struct x86_emulate_ctxt *ctxt)
>> +{
>> +struct fuzz_state *s = ctxt->data;
>> +
>> +/* Fuzz all of the emulated state in one go */
>> +if (!input_read(s, s, DATA_OFFSET))
> 
> Missing blanks.

Ack

> 
>> @@ -761,12 +757,11 @@ static void disable_hooks(struct x86_emulate_ctxt 
>> *ctxt)
>>  static void sanitize_input(struct x86_emulate_ctxt *ctxt)
>>  {
>>  struct fuzz_state *s = ctxt->data;
>> -struct fuzz_corpus *c = s->corpus;
>> -struct cpu_user_regs *regs = &c->regs;
>> -unsigned long bitmap = c->options;
>> +struct cpu_user_regs *regs = ctxt->regs;
>> +unsigned long bitmap = s->options;
>>  
>>  /* Some hooks can't be disabled. */
>> -c->options &= ~((1<<HOOK_read)|(1<<HOOK_insn_fetch));
>> +s->options &= ~((1<<HOOK_read)|(1<<HOOK_insn_fetch));
> Mind adding the missing blanks here while you touch this?

Like this?

s->options &= ~((1 << HOOK_read) | (1 << HOOK_insn_fetch));
