On 01/15/2016 12:05 PM, Andrew Cooper wrote:
On 15/01/16 10:58, Håkon Alstadheim wrote:
This is just a preliminary report, mostly just for the record.
I will report again if this keeps happening after 4.7 is out, or upon
request. Anyone working on this, please mail me and request more
information. I have available logs from dom0 boot (I dump dmesg and xl
dmesg to disk after every boot, and log dom0 serial console to disk).
I will send boot logs if requested. I will turn on maximum verbosity
and provide all output. My serial console is very slow, so I cannot
keep running at max verbosity all the time.
At the end of this mail there is "xl info" and output from dom0 serial
console.
CPUINFO:
vendor_id : GenuineIntel
cpu family : 6
model : 63
model name : Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz
# smbios-sys-info
Libsmbios version: 2.2.28
Product Name: Z10PE-D8 WS
Vendor: ASUSTeK COMPUTER INC.
BIOS Version: 3101
Dom0 OS:
Linux gentoo 4.1.12-gentoo #1 SMP Sat Jan 2 09:36:31 CET 2016 x86_64
Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz GenuineIntel GNU/Linux.
The kernel is gentoo-sources with the experimental USE flag; CPU type is
set to Haswell. The issue also happened without the experimental flag.
# cat /proc/cmdline
placeholder root=LABEL=ssdroot ro
xen-pciback.hide=(02:00.*)(08:00.*)(00:1b.*)(81:00.*)(82:00.*)(83:00.*)
console=hvc0
console=vga domodules domdadm dolvm intel_iommu=on earlyprintk=xen
usbcore.autosuspend=-1
The system is mostly built with stable packages, xen and xen-tools
keyworded to ~amd64.
I have been experiencing issues with domains that have PCIe devices
passed through ever since I first installed Xen, which was at version
4.5.x. I am now at 4.6.0 with Gentoo patches. The crashes SEEM mostly
related to PCI pass-through and interrupts (USB cards, sound cards).
Recently the system has been more stable, whether it is because I pass
through as few things as possible, or because of improvements in Xen I
do not know. I have also taken to building with debug, which leads to
more abrupt but less mysterious failures. Earlier (without debug, under
Xen 4.5), things would gradually stop working and end in a total hang
of the whole system. So, hey, things are improving :-b
This isn't the first time we have seen this on Haswell processors. Do
you have microcode loading set up?
Not entirely sure to be honest. Is microcode : 0x31 the newest?
I AM running the very latest bios from Asus, but I do not have
confidence in my microcode loading setup, so I have not had one in place.
Trying now.
Downloading microcode.dat from Intel
Installing iucode_tool, which in its --help states:
  -w, --write-to=file   Write selected microcodes to a file in binary
                        format. The binary format is suitable to be
                        uploaded to the kernel
Ran "iucode_tool microcode.dat -w microcode.bin"
----
# ls -l micro*
-rwxr-xr-x 1 root root 693248 Jan 15 12:40 microcode.bin
-rwxr-xr-x 1 root root 2081807 Nov 6 04:04 microcode.dat
----
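As an aside on matching this file to the CPU: as far as I understand, each
update in microcode.dat carries a CPUID signature, so knowing the signature
helps when reading a listing of the file (iucode_tool should have a
-l/--list option for that; check its --help). A minimal sketch of deriving
the signature from the family and model in the CPUINFO above follows. The
function name cpu_signature is made up for illustration, and the stepping
value 2 is an assumption, since the stepping is not shown in this mail; the
real one is in /proc/cpuinfo.

```shell
#!/bin/sh
# Hypothetical helper: build the CPUID (leaf 1) signature from the
# family, model and stepping values as printed in /proc/cpuinfo.
cpu_signature() {
    family=$1
    model=$2
    stepping=$3

    # Families above 15 are encoded as base family 15 plus an extended part.
    if [ "$family" -ge 15 ]; then
        base_family=15
        ext_family=$((family - 15))
    else
        base_family=$family
        ext_family=0
    fi

    # The extended model bits are only used for base family 6 and 15.
    if [ "$base_family" -eq 6 ] || [ "$base_family" -eq 15 ]; then
        ext_model=$((model >> 4))
    else
        ext_model=0
    fi
    base_model=$((model & 15))

    printf '0x%x\n' $(( (ext_family << 20) | (ext_model << 16) | (base_family << 8) | (base_model << 4) | stepping ))
}

# Family 6, model 63 as in CPUINFO above; stepping 2 is ASSUMED.
cpu_signature 6 63 2
```

With the assumed stepping this prints 0x306f2; a different stepping only
changes the last hex digit.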
Placed microcode.bin in /boot/microcode.bin and
booted with:
---
xen_commandline : ssd-xen-debug-marker console_timestamps=date
loglvl=all guest_loglvl=all sync_console iommu=1,verbose,debug
iommu_inclusive_mapping=1 com1=115200,8n1 console=com1 dom0_max_vcpus=4
dom0_vcpus_pin=1 dom0_mem=8G,max:8G cpufreq=xen:performance,verbose
tmem=1 sched_smt_power_savings=1 apic_verbosity=debug e820-verbose=1
core_parking=power ucode=microcode.bin
---
#cat /proc/cpuinfo | grep micro
says: microcode : 0x31
This is no change from previous boot.
Now: How do I know whether 0x31 is the newest?
Grepping the console output reveals no reference to ucode or microcode
other than the Xen command-line.
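
One way to check this a bit more systematically is sketched below: collect
the distinct microcode revisions reported in /proc/cpuinfo, then compare the
running revision against whatever revision a listing of the microcode file
shows for this CPU. The function names microcode_revisions and rev_is_newer
are hypothetical helpers, not part of any tool. Note that in dom0,
/proc/cpuinfo only covers the vCPUs that dom0 sees.

```shell
#!/bin/sh
# Hypothetical helper: print the distinct "microcode" revisions found in
# cpuinfo-style text read from stdin.
microcode_revisions() {
    awk -F': *' '/^microcode[ \t]*:/ { print $2 }' | sort -u
}

# Hypothetical helper: succeed (exit status 0) if revision $1 is
# numerically greater than revision $2. Both are hex strings like 0x31.
rev_is_newer() {
    [ $(( $1 )) -gt $(( $2 )) ]
}
```

Usage would be along the lines of "microcode_revisions < /proc/cpuinfo",
followed by "rev_is_newer <revision from the file> 0x31", where the first
argument has to come from an actual listing of microcode.dat. Running
"xl dmesg | grep -i microcode" is another way to look for any hypervisor
messages about an update.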
---
Håkon