Re: [Xen-devel] PVH dom0 creation fails - the system freezes

2018-08-08 Thread bercarug

On 08/08/2018 01:11 PM, Roger Pau Monné wrote:

On Wed, Aug 08, 2018 at 11:44:28AM +0200, Roger Pau Monné wrote:

I just realized that I've dropped a chunk from my series while
rebasing, could you please try again with the following diff applied
on top of my series?

diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c
index 6aec43ed1a..6271d8b671 100644
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -209,7 +209,13 @@ void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
         if ( !hwdom_iommu_map(d, pfn, max_pfn) )
             continue;
 
-        rc = iommu_map_page(d, pfn, pfn, IOMMUF_readable|IOMMUF_writable);
+        if ( iommu_use_hap_pt(d) )
+        {
+            ASSERT(is_hvm_domain(d));
+            rc = set_identity_p2m_entry(d, pfn, p2m_access_rw, 0);
+        }
+        else
+            rc = iommu_map_page(d, pfn, pfn, IOMMUF_readable|IOMMUF_writable);
         if ( rc )
             printk(XENLOG_WARNING " d%d: IOMMU mapping failed: %d\n",
                    d->domain_id, rc);

I've pushed a new version that has this chunk, so it might be easier
for you to just fetch and test:

git://xenbits.xen.org/people/royger/xen.git iommu_inclusive_v4

Thanks, Roger.

I already recompiled Xen using this patch, and the USB devices are 
functional again.



Thanks,

Gabriel




Amazon Development Center (Romania) S.R.L. registered office: 27A Sf. Lazar 
Street, UBC5, floor 2, Iasi, Iasi County, 700045, Romania. Registered in 
Romania. Registration number J22/2621/2005.
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] PVH dom0 creation fails - the system freezes

2018-08-08 Thread bercarug

On 08/08/2018 11:51 AM, Roger Pau Monné wrote:

On Wed, Aug 08, 2018 at 09:43:39AM +0100, Paul Durrant wrote:

-Original Message-
From: Roger Pau Monne
Sent: 08 August 2018 09:08
To: berca...@amazon.com
Cc: Paul Durrant ; xen-devel ; David Woodhouse ;
Jan Beulich ; Belgun, Adrian 
Subject: Re: [Xen-devel] PVH dom0 creation fails - the system freezes

On Wed, Aug 08, 2018 at 10:46:40AM +0300, berca...@amazon.com wrote:

On 08/02/2018 04:55 PM, Roger Pau Monné wrote:

Please try to avoid top posting.

On Thu, Aug 02, 2018 at 11:36:26AM +, Bercaru, Gabriel wrote:

I applied the patch mentioned, but the system fails to boot. Instead, it
drops to a BusyBox shell. It seems to be a file system issue.

So you have applied 173c7803592065d27bf2e60d50e08e197a0efa83 and it
causes a regression for you?

As I understand it, before applying 173c780359206 you were capable of
booting the PVH Dom0, albeit with non-working USB?

And after applying 173c780359206 you are no longer able to boot?

Right, after applying 173c780359206 the system fails to boot (it drops to
the BusyBox shell).

Following is a sequence of the boot log regarding the file system issue.

At least part of the issue seems to be that the IO-APIC information
provided to Dom0 is wrong (from the attached log):

[0.00] IOAPIC[0]: apic_id 2, version 152, address 0xfec0, GSI 0-0

[0.00] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[0.00] ERROR: Unable to locate IOAPIC for GSI 2
[0.00] Failed to find ioapic for gsi : 2
[0.00] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
[0.00] ERROR: Unable to locate IOAPIC for GSI 9
[0.00] Failed to find ioapic for gsi : 9
[0.00] ERROR: Unable to locate IOAPIC for GSI 1
[0.00] ERROR: Unable to locate IOAPIC for GSI 2
[0.00] ERROR: Unable to locate IOAPIC for GSI 3
[0.00] ERROR: Unable to locate IOAPIC for GSI 4
[0.00] ERROR: Unable to locate IOAPIC for GSI 5
[0.00] ERROR: Unable to locate IOAPIC for GSI 6
[0.00] ERROR: Unable to locate IOAPIC for GSI 7
[0.00] ERROR: Unable to locate IOAPIC for GSI 8
[0.00] ERROR: Unable to locate IOAPIC for GSI 9
[0.00] ERROR: Unable to locate IOAPIC for GSI 10
[0.00] ERROR: Unable to locate IOAPIC for GSI 11
[0.00] ERROR: Unable to locate IOAPIC for GSI 12
[0.00] ERROR: Unable to locate IOAPIC for GSI 13
[0.00] ERROR: Unable to locate IOAPIC for GSI 14
[0.00] ERROR: Unable to locate IOAPIC for GSI 15

Can you try to boot with just the staging branch (current commit is
008a8fb249b9d433) and see how that goes?

Thanks, Roger.


I recompiled Xen using the staging branch, commit 008a8fb249b9d433 and the
system boots,

OK, so your issues were not caused by 173c780359206 then?

008a8fb249b9d433 already contains 173c780359206 because it was
committed earlier. In any case it's good to know you are able to boot
(albeit with issues) using the current staging branch.


however the USB problem persists. I was able to log in using the serial port
and after executing

Yes, the fixes for this issue have not been committed yet, see:

https://lists.xenproject.org/archives/html/xen-devel/2018-08/msg00528.html

If you want you can give this branch a try, it should hopefully solve
your USB issues.


"xl list -l" the memory decrease problem appeared.

Yup, I will look into this now in order to find some kind of
workaround.


I attached the boot log. Following are some lines extracted from the log,
regarding the USB devices problem:

[    5.864084] usb 1-1: device descriptor read/64, error -71

[    7.564025] usb 1-1: new full-speed USB device number 4 using xhci_hcd
[    7.571347] usb 1-1: Device not responding to setup address.

[    8.008031] usb 1-1: device not accepting address 4, error -71

[    8.609623] usb 1-1: device not accepting address 5, error -71


At the beginning of the log, there is a message regarding
iommu_inclusive_mapping:


(XEN) [VT-D]found ACPI_DMAR_RMRR:
(XEN) [VT-D]  RMRR address range 3e2e..3e2f not in reserved memory;
need "iommu_inclusive_mapping=1"?
(XEN) [VT-D] endpoint: :00:14.0


I mention that I tried to boot again using this command line option, but
this message and the USB messages persist.

Does USB work despite the warning message?

No, it does not.



Yes, iommu_inclusive_mapping doesn't work for PVH, that's what my
patch series is trying to address. The error is caused by
missing/wrong RMRR regions in the ACPI tables.


Looks like this warning is suggesting that there is an RMRR that falls outside 
of an E820 reserved region. For PV I guess that 'inclusive' will fix this, but 
for PVH 'reserved' isn't going to fix it. I hope that the range at least falls 
in a hole, so maybe we also need a dom0_iommu=holes option too? Ugly, but maybe 
necessary.

I wanted to avoid adding such option because I think it's going to
interact quite badly 

Re: [Xen-devel] PVH dom0 creation fails - the system freezes

2018-08-08 Thread bercarug

On 08/08/2018 11:08 AM, Roger Pau Monné wrote:

On Wed, Aug 08, 2018 at 10:46:40AM +0300, berca...@amazon.com wrote:

On 08/02/2018 04:55 PM, Roger Pau Monné wrote:

Please try to avoid top posting.

On Thu, Aug 02, 2018 at 11:36:26AM +, Bercaru, Gabriel wrote:

I applied the patch mentioned, but the system fails to boot. Instead, it
drops to a BusyBox shell. It seems to be a file system issue.

So you have applied 173c7803592065d27bf2e60d50e08e197a0efa83 and it
causes a regression for you?

As I understand it, before applying 173c780359206 you were capable of
booting the PVH Dom0, albeit with non-working USB?

And after applying 173c780359206 you are no longer able to boot?

Right, after applying 173c780359206 the system fails to boot (it drops to
the BusyBox shell).

Following is a sequence of the boot log regarding the file system issue.

At least part of the issue seems to be that the IO-APIC information
provided to Dom0 is wrong (from the attached log):

[0.00] IOAPIC[0]: apic_id 2, version 152, address 0xfec0, GSI 0-0
[0.00] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[0.00] ERROR: Unable to locate IOAPIC for GSI 2
[0.00] Failed to find ioapic for gsi : 2
[0.00] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
[0.00] ERROR: Unable to locate IOAPIC for GSI 9
[0.00] Failed to find ioapic for gsi : 9
[0.00] ERROR: Unable to locate IOAPIC for GSI 1
[0.00] ERROR: Unable to locate IOAPIC for GSI 2
[0.00] ERROR: Unable to locate IOAPIC for GSI 3
[0.00] ERROR: Unable to locate IOAPIC for GSI 4
[0.00] ERROR: Unable to locate IOAPIC for GSI 5
[0.00] ERROR: Unable to locate IOAPIC for GSI 6
[0.00] ERROR: Unable to locate IOAPIC for GSI 7
[0.00] ERROR: Unable to locate IOAPIC for GSI 8
[0.00] ERROR: Unable to locate IOAPIC for GSI 9
[0.00] ERROR: Unable to locate IOAPIC for GSI 10
[0.00] ERROR: Unable to locate IOAPIC for GSI 11
[0.00] ERROR: Unable to locate IOAPIC for GSI 12
[0.00] ERROR: Unable to locate IOAPIC for GSI 13
[0.00] ERROR: Unable to locate IOAPIC for GSI 14
[0.00] ERROR: Unable to locate IOAPIC for GSI 15

Can you try to boot with just the staging branch (current commit is
008a8fb249b9d433) and see how that goes?

Thanks, Roger.


I recompiled Xen using the staging branch, commit 008a8fb249b9d433 and the
system boots,

OK, so your issues were not caused by 173c780359206 then?

008a8fb249b9d433 already contains 173c780359206 because it was
committed earlier. In any case it's good to know you are able to boot
(albeit with issues) using the current staging branch.


however the USB problem persists. I was able to log in using the serial port
and after executing

Yes, the fixes for this issue have not been committed yet, see:

https://lists.xenproject.org/archives/html/xen-devel/2018-08/msg00528.html

If you want you can give this branch a try, it should hopefully solve
your USB issues.


"xl list -l" the memory decrease problem appeared.

Yup, I will look into this now in order to find some kind of
workaround.


I attached the boot log. Following are some lines extracted from the log,
regarding the USB devices problem:

[    5.864084] usb 1-1: device descriptor read/64, error -71

[    7.564025] usb 1-1: new full-speed USB device number 4 using xhci_hcd
[    7.571347] usb 1-1: Device not responding to setup address.

[    8.008031] usb 1-1: device not accepting address 4, error -71

[    8.609623] usb 1-1: device not accepting address 5, error -71


At the beginning of the log, there is a message regarding
iommu_inclusive_mapping:


(XEN) [VT-D]found ACPI_DMAR_RMRR:
(XEN) [VT-D]  RMRR address range 3e2e..3e2f not in reserved memory;
need "iommu_inclusive_mapping=1"?
(XEN) [VT-D] endpoint: :00:14.0


I mention that I tried to boot again using this command line option, but
this message and the USB messages persist.

Yes, iommu_inclusive_mapping doesn't work for PVH, that's what my
patch series is trying to address. The error is caused by
missing/wrong RMRR regions in the ACPI tables.

Thanks, Roger.


I tried compiling from the branch mentioned. I changed the command line by
adding "dom0-iommu=reserved", but I get the same error messages about USB
during boot.


Gabriel





Re: [Xen-devel] PVH dom0 creation fails - the system freezes

2018-08-08 Thread bercarug

On 08/02/2018 04:55 PM, Roger Pau Monné wrote:

Please try to avoid top posting.

On Thu, Aug 02, 2018 at 11:36:26AM +, Bercaru, Gabriel wrote:

I applied the patch mentioned, but the system fails to boot. Instead, it
drops to a BusyBox shell. It seems to be a file system issue.

So you have applied 173c7803592065d27bf2e60d50e08e197a0efa83 and it
causes a regression for you?

As I understand it, before applying 173c780359206 you were capable of
booting the PVH Dom0, albeit with non-working USB?

And after applying 173c780359206 you are no longer able to boot?

Right, after applying 173c780359206 the system fails to boot (it drops to
the BusyBox shell).

Following is a sequence of the boot log regarding the file system issue.

At least part of the issue seems to be that the IO-APIC information
provided to Dom0 is wrong (from the attached log):

[0.00] IOAPIC[0]: apic_id 2, version 152, address 0xfec0, GSI 0-0
[0.00] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[0.00] ERROR: Unable to locate IOAPIC for GSI 2
[0.00] Failed to find ioapic for gsi : 2
[0.00] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
[0.00] ERROR: Unable to locate IOAPIC for GSI 9
[0.00] Failed to find ioapic for gsi : 9
[0.00] ERROR: Unable to locate IOAPIC for GSI 1
[0.00] ERROR: Unable to locate IOAPIC for GSI 2
[0.00] ERROR: Unable to locate IOAPIC for GSI 3
[0.00] ERROR: Unable to locate IOAPIC for GSI 4
[0.00] ERROR: Unable to locate IOAPIC for GSI 5
[0.00] ERROR: Unable to locate IOAPIC for GSI 6
[0.00] ERROR: Unable to locate IOAPIC for GSI 7
[0.00] ERROR: Unable to locate IOAPIC for GSI 8
[0.00] ERROR: Unable to locate IOAPIC for GSI 9
[0.00] ERROR: Unable to locate IOAPIC for GSI 10
[0.00] ERROR: Unable to locate IOAPIC for GSI 11
[0.00] ERROR: Unable to locate IOAPIC for GSI 12
[0.00] ERROR: Unable to locate IOAPIC for GSI 13
[0.00] ERROR: Unable to locate IOAPIC for GSI 14
[0.00] ERROR: Unable to locate IOAPIC for GSI 15

Can you try to boot with just the staging branch (current commit is
008a8fb249b9d433) and see how that goes?

Thanks, Roger.

I recompiled Xen using the staging branch, commit 008a8fb249b9d433 and the
system boots, however the USB problem persists. I was able to log in using
the serial port and after executing "xl list -l" the memory decrease problem
appeared.

I attached the boot log. Following are some lines extracted from the log,
regarding the USB devices problem:

[    5.864084] usb 1-1: device descriptor read/64, error -71

[    7.564025] usb 1-1: new full-speed USB device number 4 using xhci_hcd
[    7.571347] usb 1-1: Device not responding to setup address.

[    8.008031] usb 1-1: device not accepting address 4, error -71

[    8.609623] usb 1-1: device not accepting address 5, error -71
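The repeated "error -71" in the lines above is a negative Linux errno value; on Linux, 71 is EPROTO ("Protocol error"). As a minimal sketch for pulling the failing lines out of a larger boot log, something like the following could be used. The regular expression, the `LINUX_ERRNO` table and the `usb_errors` helper are illustrative assumptions, not part of any Xen or kernel tooling.

```python
import re

# Assumption: Linux errno numbering; only the value seen in this log is listed.
LINUX_ERRNO = {71: "EPROTO"}

# The USB lines quoted in the message above.
LOG = """\
[    5.864084] usb 1-1: device descriptor read/64, error -71
[    7.564025] usb 1-1: new full-speed USB device number 4 using xhci_hcd
[    7.571347] usb 1-1: Device not responding to setup address.
[    8.008031] usb 1-1: device not accepting address 4, error -71
[    8.609623] usb 1-1: device not accepting address 5, error -71
"""

# Hypothetical pattern: timestamp, USB device path, message, errno value.
ERROR_LINE = re.compile(
    r"^\[\s*([\d.]+)\] usb ([\w.-]+): (.+), error -(\d+)$", re.MULTILINE
)


def usb_errors(log):
    """Return (timestamp, device, message, errno name) for each error line."""
    found = []
    for ts, dev, msg, code in ERROR_LINE.findall(log):
        found.append((float(ts), dev, msg, LINUX_ERRNO.get(int(code), "unknown")))
    return found


errors = usb_errors(LOG)
for entry in errors:
    print(entry)
```

Lines without an "error -N" suffix are skipped, so only the three failing transfers are reported.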


At the beginning of the log, there is a message regarding 
iommu_inclusive_mapping:



(XEN) [VT-D]found ACPI_DMAR_RMRR:
(XEN) [VT-D]  RMRR address range 3e2e..3e2f not in reserved memory;
need "iommu_inclusive_mapping=1"?
(XEN) [VT-D] endpoint: :00:14.0


I mention that I tried to boot again using this command line option, but
this message and the USB messages persist.


Gabriel






staging.cap
Description: application/vnd.tcpdump.pcap

Re: [Xen-devel] PVH dom0 creation fails - the system freezes

2018-07-26 Thread bercarug

On 07/25/2018 07:12 PM, Roger Pau Monné wrote:

On Wed, Jul 25, 2018 at 05:05:35PM +0300, berca...@amazon.com wrote:

On 07/25/2018 05:02 PM, Wei Liu wrote:

On Wed, Jul 25, 2018 at 03:41:11PM +0200, Juergen Gross wrote:

On 25/07/18 15:35, Roger Pau Monné wrote:

What could be causing the available memory loss problem?

That seems to be Linux aggressively ballooning out memory, you go from
7129M total memory to 246M. Are you creating a lot of domains?

This might be related to the tools thinking dom0 is a PV domain.

Good point.

In that case, xenstore-ls -fp would also be useful. The output should
show the balloon target for Dom0.

You can also try to set the autoballoon to off in /etc/xen/xl.cfg to see
if it makes any difference.

Wei.

Also tried setting autoballooning off, but it had no effect.

This is a Linux/libxl issue that I'm not sure what's the best way to
solve. Linux has the following 'workaround' in the balloon driver:

    err = xenbus_scanf(XBT_NIL, "memory", "static-max", "%llu",
                       &static_max);
    if (err != 1)
        static_max = new_target;
    else
        static_max >>= PAGE_SHIFT - 10;
    target_diff = xen_pv_domain() ? 0
                : static_max - balloon_stats.target_pages;

I suppose this is used to cope with the memory reporting mismatch
usually seen on HVM guests. This however interacts quite badly on a
PVH Dom0 that has for example:

/local/domain/0/memory/target = "8391840"   (n0)
/local/domain/0/memory/static-max = "17179869180"   (n0)

One way to solve this is to set target and static-max to the same
value initially, so that target_diff on Linux is 0. Another option
would be to force target_diff = 0 for Dom0.
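To make the size of the mismatch concrete, here is a small worked calculation using the two xenstore values quoted above. It is only a sketch: it assumes 4 KiB pages (PAGE_SHIFT = 12) and that balloon_stats.target_pages equals the xenstore target converted to pages, neither of which is stated in the thread. The helper names are hypothetical.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages on x86


def kib_to_pages(kib):
    # Mirrors `static_max >>= PAGE_SHIFT - 10`: xenstore memory values are in KiB.
    return kib >> (PAGE_SHIFT - 10)


def target_diff(static_max_kib, target_pages, is_pv):
    # Mirrors `xen_pv_domain() ? 0 : static_max - balloon_stats.target_pages`.
    return 0 if is_pv else kib_to_pages(static_max_kib) - target_pages


target_kib = 8391840          # /local/domain/0/memory/target
static_max_kib = 17179869180  # /local/domain/0/memory/static-max

# Assumption: balloon_stats.target_pages matches the xenstore target.
target_pages = kib_to_pages(target_kib)

# PVH Dom0 with the values from the thread: a very large difference.
diff_mismatch = target_diff(static_max_kib, target_pages, is_pv=False)

# Same domain if static-max were written equal to target: no difference.
diff_equal = target_diff(target_kib, target_pages, is_pv=False)

print(target_pages, diff_mismatch, diff_equal)
```

Under these assumptions static-max corresponds to 4294967295 pages against a target of 2097960 pages, so target_diff is far larger than the memory the domain actually has, while equal values give a target_diff of 0.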

I'm attaching a patch for libxl that should solve this, could you
please give it a try and report back?

I'm still unsure however about the best way to fix this, need to think
about it.

Roger.
---8<---
diff --git a/tools/libxl/libxl_mem.c b/tools/libxl/libxl_mem.c
index e551e09fed..2c984993d8 100644
--- a/tools/libxl/libxl_mem.c
+++ b/tools/libxl/libxl_mem.c
@@ -151,7 +151,9 @@ retry_transaction:
         *target_memkb = info.current_memkb;
     }
     if (staticmax == NULL) {
-        libxl__xs_printf(gc, t, max_path, "%"PRIu64, info.max_memkb);
+        libxl__xs_printf(gc, t, max_path, "%"PRIu64,
+                         libxl__domain_type(gc, 0) == LIBXL_DOMAIN_TYPE_PV ?
+                         info.max_memkb : info.current_memkb);
         *max_memkb = info.max_memkb;
     }
 



I have tried Roger's patch and it fixed the memory decrease problem. "xl
list -l" no longer causes any issue.

The output of "xenstore-ls -fp" shows that both target and static-max are now
set to the same value.
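A check like this can be scripted. The sketch below parses lines in the format shown earlier in the thread and compares the two Dom0 memory keys. The post-patch output was not posted, so the sample input is the pre-patch pair quoted by Roger; the regular expression and the `parse_xenstore` helper are assumptions made for illustration.

```python
import re

# Hypothetical pattern for lines such as: /path/to/key = "value"   (n0)
XENSTORE_LINE = re.compile(r'^(\S+) = "([^"]*)"')


def parse_xenstore(text):
    """Map each xenstore path to its string value."""
    entries = {}
    for line in text.splitlines():
        match = XENSTORE_LINE.match(line.strip())
        if match:
            entries[match.group(1)] = match.group(2)
    return entries


# Pre-patch values quoted earlier in the thread.
SAMPLE = '''\
/local/domain/0/memory/target = "8391840"   (n0)
/local/domain/0/memory/static-max = "17179869180"   (n0)
'''

entries = parse_xenstore(SAMPLE)
target = int(entries["/local/domain/0/memory/target"])
static_max = int(entries["/local/domain/0/memory/static-max"])

print("match" if target == static_max else "mismatch")
```

With the patch applied, the same comparison on the real output would be expected to report a match.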


Gabriel





Re: [Xen-devel] PVH dom0 creation fails - the system freezes

2018-07-25 Thread bercarug

On 07/25/2018 05:02 PM, Wei Liu wrote:

On Wed, Jul 25, 2018 at 03:41:11PM +0200, Juergen Gross wrote:

On 25/07/18 15:35, Roger Pau Monné wrote:

What could be causing the available memory loss problem?

That seems to be Linux aggressively ballooning out memory, you go from
7129M total memory to 246M. Are you creating a lot of domains?

This might be related to the tools thinking dom0 is a PV domain.

Good point.

In that case, xenstore-ls -fp would also be useful. The output should
show the balloon target for Dom0.

You can also try to set the autoballoon to off in /etc/xen/xl.cfg to see
if it makes any difference.

Wei.

Also tried setting autoballooning off, but it had no effect.

Gabriel




Juergen



Re: [Xen-devel] PVH dom0 creation fails - the system freezes

2018-07-25 Thread bercarug

On 07/25/2018 04:35 PM, Roger Pau Monné wrote:

On Wed, Jul 25, 2018 at 01:06:43PM +0300, berca...@amazon.com wrote:

On 07/24/2018 12:54 PM, Jan Beulich wrote:

On 23.07.18 at 13:50,  wrote:

For the last few days, I have been trying to get a PVH dom0 running,
however I encountered the following problem: the system seems to
freeze after the hypervisor boots, the screen goes black. I have tried to
debug it via a serial console (using Minicom) and managed to get some
more Xen output, after the screen turns black.

I mention that I have tried to boot the PVH dom0 using different kernel
images (from 4.9.0 to 4.18-rc3), different Xen  versions (4.10, 4.11, 4.12).

Below I attached my system / hypervisor configuration, as well as the
output captured through the serial console, corresponding to the latest
versions for Xen and the Linux Kernel (Xen staging and Kernel from the
xen/tip tree).
[...]
(XEN) [VT-D]iommu.c:919: iommu_fault_status: Fault Overflow
(XEN) [VT-D]iommu.c:921: iommu_fault_status: Primary Pending Fault
(XEN) [VT-D]DMAR:[DMA Write] Request device [:00:14.0] fault addr 8deb3000, iommu reg = 82c00021b000

Can you figure out which PCI device is 00:14.0?

This is the output of lspci -vvv for device 00:14.0:

00:14.0 USB controller: Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller (rev 31) (prog-if 30 [XHCI])
    Subsystem: Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller
    Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- SERR- 
    Latency: 0
    Interrupt: pin A routed to IRQ 178
    Region 0: Memory at a2e0 (64-bit, non-prefetchable) [size=64K]
    Capabilities: [70] Power Management version 2
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0-,D1-,D2-,D3hot+,D3cold+)
        Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [80] MSI: Enable+ Count=1/8 Maskable- 64bit+
        Address: fee0e000  Data: 4021
    Kernel driver in use: xhci_hcd
    Kernel modules: xhci_pci

(XEN) [VT-D]DMAR: reason 05 - PTE Write access is not set
(XEN) print_vtd_entries: iommu #0 dev :00:14.0 gmfn 8deb3
(XEN) root_entry[00] = 1021c60001
(XEN) context[a0] = 2_1021d6d001
(XEN) l4[000] = 9c1021d6c107
(XEN) l3[002] = 9c1021d3e107
(XEN) l2[06f] = 9c10218c0107
(XEN) l1[0b3] = 8000
(XEN) l1[0b3] not present
(XEN) Dom0 callback via changed to Direct Vector 0xf3

This might be a hint at a missing RMRR entry in the ACPI tables, as
we've seen to be the case for a number of systems (I dare to guess
that :00:14.0 is a USB controller, perhaps one with a keyboard
and/or mouse connected). You may want to play with the respective
command line option ("rmrr="). Note that "iommu_inclusive_mapping"
as you're using it does not have any meaning for PVH (see
intel_iommu_hwdom_init()).

Jan
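The indices printed by print_vtd_entries in the dump above can be reproduced from the fault address. The sketch below assumes 4 KiB pages and a four-level table with 9 index bits per level, which matches the l4/l3/l2/l1 labels in the dump but is not stated explicitly in the thread. The `vtd_indices` helper is hypothetical.

```python
PAGE_SHIFT = 12   # assumption: 4 KiB pages
LEVEL_BITS = 9    # assumption: 512 entries per page-table level
LEVEL_MASK = (1 << LEVEL_BITS) - 1


def vtd_indices(addr):
    """Split a DMA address into the frame number and per-level table indices."""
    gmfn = addr >> PAGE_SHIFT
    return {
        "gmfn": gmfn,
        "l4": (gmfn >> (3 * LEVEL_BITS)) & LEVEL_MASK,
        "l3": (gmfn >> (2 * LEVEL_BITS)) & LEVEL_MASK,
        "l2": (gmfn >> LEVEL_BITS) & LEVEL_MASK,
        "l1": gmfn & LEVEL_MASK,
    }


fault = vtd_indices(0x8deb3000)  # "fault addr 8deb3000" from the log
for name, value in fault.items():
    print(f"{name} = {value:#x}")
```

For the logged address this yields gmfn 0x8deb3 and indices l4[000], l3[002], l2[06f], l1[0b3], the same slots the dump walks before reporting the l1 entry as not present.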




Hello,

Following Roger's advice, I rebuilt Xen (4.12) using the staging branch and
I managed to get a PVH dom0 starting. However, some other problems appeared:

1) The USB devices are not usable anymore (keyboard and mouse), so the
system is only accessible through the serial port.

Can you boot with iommu=debug and see if you get any extra IOMMU
information on the serial console?

The debug flag was already set, so the log I attached on the first
message already contains the IOMMU info.
In Xen's command line I used iommu=debug,verbose,workaround_bios_bug.



2) I can run any usual command in dom0, but the ones involving xl (except
for xl info) will make the system run out of memory very fast. Eventually,
when there is no more free memory available, the OOM killer begins removing
processes until the system auto reboots.

I attached a file containing the output of a lsusb, as well as the output of
xl info and xl list -l.
After xl list -l, the “free -m” commands show the available memory
decreasing.
Each command has a timestamp appended, so it can be seen how fast the
available memory decreases.

I removed much of the process killing logs and kept the last one, since they
were following the same pattern.

Dom0 still appears to be of type PV (output of xl list -l), however during
boot, the following messages were displayed: “Building a PVH Dom0” and
“Booting paravirtualized kernel on Xen PVH”.

I mention that I had to add “workaround_bios_bug” in GRUB_CMDLINE_XEN for
iommu to get dom0 running.

It seems to me like your ACPI DMAR table contains errors, and I
wouldn't be surprised if those also cause the USB devices to
malfunction.


What could be causing the available memory loss problem?

That seems to be Linux aggressively ballooning out memory, you go from
7129M total memory to 246M. Are you creating a lot of domains?

Roger.


I did not create any guest before issuing "xl list -l". However, creating

a PVH domU will work - "xl create " does not produce 

[Xen-devel] PVH dom0 creation fails - the system freezes

2018-07-23 Thread bercarug

Hello,

For the last few days, I have been trying to get a PVH dom0 running,
however I encountered the following problem: the system seems to
freeze after the hypervisor boots, the screen goes black. I have tried to
debug it via a serial console (using Minicom) and managed to get some
more Xen output, after the screen turns black.

I mention that I have tried to boot the PVH dom0 using different kernel
images (from 4.9.0 to 4.18-rc3), different Xen  versions (4.10, 4.11, 4.12).

Below I attached my system / hypervisor configuration, as well as the
output captured through the serial console, corresponding to the latest
versions for Xen and the Linux Kernel (Xen staging and Kernel from the
xen/tip tree).


OS + Distro: Linux / Debian 9 Stretch
Kernel Version: 4.17-rc5, tagged with for-linus-4.18-rc5-tag from the
xen/tip tree.
Xen Version: 4.12, commit id e3f667bc5f51d0aa44357a64ca134cd952679c81
of the Xen tree.
Host system: attached cpuinfo.log
Serial console output: attached boot.log
My grub configuration file, containing the Xen command line arguments:
attached grub.log

I can provide additional info as requested.
Any ideas why this happens? Do you have any recommendations for additional
debugging?

Here are the last few lines of the boot log. The last (separated) ones were
only visible through the serial console, since at that point the screen was
completely black.

(XEN) *** Building a PVH Dom0 ***
(XEN) [VT-D]d0:Hostbridge: skip :00:00.0 map
(XEN) [VT-D]d0:PCI: map :00:14.0
(XEN) [VT-D]d0:PCI: map :00:14.2
(XEN) [VT-D]d0:PCI: map :00:16.0
(XEN) [VT-D]d0:PCI: map :00:16.1
(XEN) [VT-D]d0:PCI: map :00:17.0
(XEN) [VT-D]d0:PCI: map :00:1f.0
(XEN) [VT-D]d0:PCI: map :00:1f.2
(XEN) [VT-D]d0:PCI: map :00:1f.4
(XEN) [VT-D]d0:PCIe: map :01:00.0
(XEN) [VT-D]d0:PCIe: map :02:00.0
(XEN) [VT-D]d0:PCIe: map :03:00.0
(XEN) [VT-D]d0:PCIe: map :04:00.0
(XEN) [VT-D]iommu_enable_translation: iommu->reg = 82c00021b000
(XEN) WARNING: PVH is an experimental mode with limited functionality
(XEN) Initial low memory virq threshold set at 0x4000 pages.
(XEN) Scrubbing Free RAM on 1 nodes using 4 CPUs
(XEN) 
...done.

(XEN) Std. Loglevel: All
(XEN) Guest Loglevel: All
(XEN) ***
(XEN) WARNING: CONSOLE OUTPUT IS SYNCHRONOUS
(XEN) This option is intended to aid debugging of Xen by ensuring
(XEN) that all output is synchronously delivered on the serial line.
(XEN) However it can introduce SIGNIFICANT latencies and affect
(XEN) timekeeping. It is NOT recommended for production use!
(XEN) ***
(XEN) 3... 2... 1...

(XEN) Xen is relinquishing VGA console.
(XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch input to Xen)

(XEN) Freed 468kB init memory
(XEN) [VT-D]iommu.c:919: iommu_fault_status: Fault Overflow
(XEN) [VT-D]iommu.c:921: iommu_fault_status: Primary Pending Fault
(XEN) [VT-D]DMAR:[DMA Write] Request device [:00:14.0] fault addr 8deb3000, iommu reg = 82c00021b000

(XEN) [VT-D]DMAR: reason 05 - PTE Write access is not set
(XEN) print_vtd_entries: iommu #0 dev :00:14.0 gmfn 8deb3
(XEN) root_entry[00] = 1021c60001
(XEN) context[a0] = 2_1021d6d001
(XEN) l4[000] = 9c1021d6c107
(XEN) l3[002] = 9c1021d3e107
(XEN) l2[06f] = 9c10218c0107
(XEN) l1[0b3] = 8000
(XEN) l1[0b3] not present
(XEN) Dom0 callback via changed to Direct Vector 0xf3
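One quick check on a fault like this is to see which e820 region the faulting address falls in. The sketch below uses only the four e820 entries from the attached boot log that appear intact in this archive (other entries have lost digits and are left out). It assumes each printed range is start-inclusive and end-exclusive. The `region_type` helper is hypothetical.

```python
# (start, end, type) taken from the "Xen-e820 RAM map" lines of the boot log.
# Assumption: ranges are [start, end).
E820 = [
    (0x8C1C4000, 0x8C1C5000, "ACPI NVS"),
    (0x8C1C5000, 0x8C20F000, "reserved"),
    (0x8C20F000, 0x8C281000, "usable"),
    (0x8C281000, 0x8DEC1000, "reserved"),
]


def region_type(addr):
    """Return the e820 type covering addr, or None if no listed entry matches."""
    for start, end, kind in E820:
        if start <= addr < end:
            return kind
    return None


fault_addr = 0x8DEB3000  # "fault addr 8deb3000" from the log above
kind = region_type(fault_addr)
print(f"{fault_addr:#x}: {kind}")
```

Under these assumptions the faulting DMA write targets the reserved range 8c281000 - 8dec1000, that is, memory the device expects to reach but that has no writable IOMMU mapping for the PVH Dom0.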

Thanks,
Gabriel




(XEN) Xen version 4.12-unstable (root@) (gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516) debug=y  Mon Jul 16 09:20:26 EDT 2018
(XEN) Latest ChangeSet: Thu Jul 12 18:48:06 2018 +0200 git:e3f667bc5f
(XEN) Console output is synchronous.
(XEN) Bootloader: GRUB 2.02~beta3-5
(XEN) Command line: placeholder dom0=pvh dom0_mem=4096M loglvl=all sync_console console_to_ring=true console=com1,vga com1=115200,8n1 iommu=debug,verbose,workaround_bios_bug iommu_inclusive_mapping=true
(XEN) Xen image load base address: 0
(XEN) Video information:
(XEN)  VGA is text mode 80x25, font 8x16
(XEN)  VBE/DDC methods: V2; EDID transfer time: 1 seconds
(XEN) Disc information:
(XEN)  Found 1 MBR signatures
(XEN)  Found 2 EDD information structures
(XEN) Xen-e820 RAM map:
(XEN)   - 00098c00 (usable)
(XEN)  00098c00 - 000a (reserved)
(XEN)  000e - 0010 (reserved)
(XEN)  0010 - 8c1c4000 (usable)
(XEN)  8c1c4000 - 8c1c5000 (ACPI NVS)
(XEN)  8c1c5000 - 8c20f000 (reserved)
(XEN)  8c20f000 - 8c281000 (usable)
(XEN)  8c281000 - 8dec1000 (reserved)