Hi all,

Hopefully it's OK to reply-all at this point.


On 9/13/23 23:38, Hans van Kranenburg wrote:
> I have a few quick additional questions already:
>
> 1. For clarification.. From your text, I understand that only this one
> single server is showing the problem after the Debian version upgrade.
> Does this mean that this is the only server you have running with
> exactly this combination of hardware (and BIOS version, CPU microcode
> etc etc)? Or, is there another one with same hardware which does not
> show the problem?

Yes, this is the only server we have with this hardware combination (server type Dell R750xs, CPU type 'Intel Xeon Silver 4310').


> 2. Can you reply with the output of 'xl dmesg' when the problem happens?
> Or, if the system gets unusable too quick, do you have a serial console
> connection to capture the output?

The 'xl dmesg' output is attached.


> 3. To confirm... I understand that there are many of these messages.
> Since you pasted only one, does that mean that all of them look exactly
> the same, with "1 of 1 multicall(s) failed: cpu 10" "call  1: op=1
> arg=[ffff8888a1a9eb10] result=-22"? Or are there variations? If so, can
> you reply with a few different ones?

They all look exactly the same: always "1 of 1 multicall(s) failed", with the same result.
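
For anyone who wants to double-check this on a large log, here is a minimal sketch (not something we ran for this report) that counts the per-call detail lines by op and result, ignoring the argument address. The regular expression is an assumption based only on the single message quoted above, and the function name summarize_multicall_failures is made up for illustration.

```python
import re
from collections import Counter

# Matches the per-call detail line as it appears in the quoted message, e.g.
#   call  1: op=1 arg=[ffff8888a1a9eb10] result=-22
# The pattern is an assumption derived from that one example.
CALL_RE = re.compile(
    r"call\s+\d+:\s+op=(\d+)\s+arg=\[([0-9a-f]+)\]\s+result=(-?\d+)"
)


def summarize_multicall_failures(lines):
    """Count failed multicall entries by (op, result).

    The argument address is deliberately ignored, since it can differ
    between otherwise identical failures.
    """
    counts = Counter()
    for line in lines:
        match = CALL_RE.search(line)
        if match:
            op = int(match.group(1))
            result = int(match.group(3))
            counts[(op, result)] += 1
    return counts
```

Feeding it the kernel log lines (for example, an open file object for a saved log) returns a Counter; a single key in the result would confirm that every failure has the same op and result.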



On 9/14/23 07:43, Juergen Gross wrote:
>>> kernel: [   99.768181] Call Trace:
>>> kernel: [   99.768436]  <TASK>
>>> kernel: [   99.768691]  ? __warn+0x7d/0xc0
>>> kernel: [   99.768947]  ? xen_mc_flush+0x196/0x220
>>> kernel: [   99.769204]  ? report_bug+0xe6/0x170
>>> kernel: [   99.769460]  ? handle_bug+0x41/0x70
>>> kernel: [   99.769713]  ? exc_invalid_op+0x13/0x60
>>> kernel: [   99.769967]  ? asm_exc_invalid_op+0x16/0x20
>>> kernel: [   99.770223]  ? xen_mc_flush+0x196/0x220
>>> kernel: [   99.770478]  xen_mc_issue+0x6d/0x70
>>> kernel: [   99.770726]  xen_set_pmd_hyper+0x54/0x90
>>> kernel: [   99.770965]  do_set_pmd+0x188/0x2a0
>
> This looks like an attempt to map a hugepage, which isn't supported
> when running as a Xen PV guest (this includes dom0).
>
> Are transparent hugepages enabled somehow? In a Xen PV guest there
> should be no /sys/kernel/mm/transparent_hugepage directory. Depending
> on the presence of that directory either hugepage_init() has a bug, or
> a test for hugepages being supported is missing in filemap_map_pages()
> or do_set_pmd().
>
>>> kernel: [   99.771200]  filemap_map_pages+0x1a9/0x6e0
>>> kernel: [   99.771434]  xfs_filemap_map_pages+0x41/0x60 [xfs]
>>> kernel: [   99.771714]  do_fault+0x1a4/0x410
>>> kernel: [   99.771947]  __handle_mm_fault+0x660/0xfa0

In both the faulty state (Linux 6.1) and the good state (Linux 5.10), the directory /sys/kernel/mm/transparent_hugepage is not present.

We also tried booting with 'transparent_hugepage=never', but it made no difference.
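
In case it helps others reproduce the check, here is a small sketch of how the transparent hugepage state could be collected on a running dom0. It only looks at the sysfs directory mentioned above, its 'enabled' file, and /proc/cmdline. The function name thp_status and the overridable root parameters are hypothetical, added so the helper can be tried against a test directory.

```python
from pathlib import Path


def thp_status(sysfs_root="/sys", proc_root="/proc"):
    """Report whether the transparent hugepage sysfs directory exists.

    Returns a dict with:
      thp_dir_present: True if <sysfs_root>/kernel/mm/transparent_hugepage exists
      enabled:         contents of its 'enabled' file, or None if absent
      cmdline:         contents of <proc_root>/cmdline, or None if absent
    """
    thp_dir = Path(sysfs_root) / "kernel" / "mm" / "transparent_hugepage"
    enabled_file = thp_dir / "enabled"
    cmdline_file = Path(proc_root) / "cmdline"
    return {
        "thp_dir_present": thp_dir.is_dir(),
        "enabled": enabled_file.read_text().strip() if enabled_file.is_file() else None,
        "cmdline": cmdline_file.read_text().strip() if cmdline_file.is_file() else None,
    }
```

Calling thp_status() with the defaults on the affected machine would be expected to report thp_dir_present as False, matching what we see on both kernels.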


best regards
bodik
(XEN) Xen version 4.17.2-pre (Debian 4.17.1+2-gb773c48e36-1) (pkg-xen-de...@lists.alioth.debian.org) (x86_64-linux-gnu-gcc (Debian 12.2.0-14) 12.2.0) debug=n Thu May 18 19:26:30 UTC 2023
(XEN) Bootloader: GRUB 2.06-13
(XEN) Command line: placeholder dom0_mem=32G,max:32G
(XEN) Xen image load base address: 0x5e800000
(XEN) Video information:
(XEN)  VGA is text mode 80x25, font 8x16
(XEN)  VBE/DDC methods: none; EDID transfer time: 0 seconds
(XEN)  EDID info not retrieved because no DDC retrieval method detected
(XEN) Disc information:
(XEN)  Found 2 MBR signatures
(XEN)  Found 2 EDD information structures
(XEN) Xen-e820 RAM map:
(XEN)  [0000000000000000, 0000000000098fff] (usable)
(XEN)  [0000000000099000, 000000000009ffff] (reserved)
(XEN)  [00000000000e0000, 00000000000fffff] (reserved)
(XEN)  [0000000000100000, 000000004a413fff] (usable)
(XEN)  [000000004a414000, 000000004b413fff] (ACPI NVS)
(XEN)  [000000004b414000, 000000004bfc2fff] (usable)
(XEN)  [000000004bfc3000, 000000004c0c8fff] (reserved)
(XEN)  [000000004c0c9000, 000000004cffffff] (usable)
(XEN)  [000000004d000000, 000000004d1fffff] (reserved)
(XEN)  [000000004d200000, 000000005eefdfff] (usable)
(XEN)  [000000005eefe000, 000000006e3fefff] (reserved)
(XEN)  [000000006e3ff000, 000000006f3fefff] (ACPI NVS)
(XEN)  [000000006f3ff000, 000000006f7fefff] (ACPI data)
(XEN)  [000000006f7ff000, 000000006f7fffff] (usable)
(XEN)  [000000006f800000, 000000008fffffff] (reserved)
(XEN)  [00000000fd000000, 00000000fe7fffff] (reserved)
(XEN)  [00000000fec00000, 00000000fec00fff] (reserved)
(XEN)  [00000000fec80000, 00000000fed00fff] (reserved)
(XEN)  [00000000fed40000, 00000000fed44fff] (reserved)
(XEN)  [00000000ff000000, 00000000ffffffff] (reserved)
(XEN)  [0000000100000000, 000000407fffffff] (usable)
(XEN) ACPI: RSDP 000FE320, 0024 (r2 DELL  )
(XEN) ACPI: XSDT 6F40A188, 00F4 (r1 DELL   PE_SC3          0 DELL  1000013)
(XEN) ACPI: FACP 6F7F6000, 0114 (r6 DELL   PE_SC3          0 DELL        1)
(XEN) ACPI: DSDT 6F770000, 7FAD3 (r2 DELL   PE_SC3          3 DELL        1)
(XEN) ACPI: FACS 6F373000, 0040
(XEN) ACPI: SSDT 6F7FB000, 1571 (r2  INTEL RAS_ACPI        1 INTL 20210331)
(XEN) ACPI: SSDT 6F7FA000, 0745 (r2  INTEL ADDRXLAT        1 INTL 20210331)
(XEN) ACPI: EINJ 6F7F9000, 0150 (r1 DELL   PE_SC3          1 INTL        1)
(XEN) ACPI: BERT 6F7F8000, 0030 (r1 DELL   PE_SC3          1 INTL        1)
(XEN) ACPI: ERST 6F7F7000, 0230 (r1 DELL   PE_SC3          1 INTL        1)
(XEN) ACPI: HMAT 6F7F5000, 0180 (r1 DELL   PE_SC3          1 DELL        1)
(XEN) ACPI: HPET 6F7F4000, 0038 (r1 DELL   PE_SC3          1 DELL        1)
(XEN) ACPI: MCFG 6F7F3000, 003C (r1 DELL   PE_SC3          1 DELL        1)
(XEN) ACPI: MIGT 6F7F2000, 0040 (r1 DELL   PE_SC3          0 DELL        1)
(XEN) ACPI: MSCT 6F7F1000, 0090 (r1 DELL   PE_SC3          1 DELL        1)
(XEN) ACPI: WSMT 6F7F0000, 0028 (r1 DELL   PE_SC3          0 DELL        1)
(XEN) ACPI: APIC 6F76F000, 035E (r4 DELL   PE_SC3          0 DELL        1)
(XEN) ACPI: SLIT 6F76E000, 0030 (r1 DELL   PE_SC3          1 DELL  1000013)
(XEN) ACPI: SRAT 6F767000, 6430 (r3 DELL   PE_SC3          2 DELL  1000013)
(XEN) ACPI: OEM4 6F5DF000, 187A61 (r2  INTEL CPU  CST     3000 INTL 20210331)
(XEN) ACPI: OEM1 6F4CB000, 113489 (r2  INTEL CPU EIST     3000 INTL 20210331)
(XEN) ACPI: OEM2 6F484000, 46031 (r2  INTEL CPU  HWP     3000 INTL 20210331)
(XEN) ACPI: SSDT 6F40D000, 764A5 (r2  INTEL SSDT  PM     4000 INTL 20210331)
(XEN) ACPI: SSDT 6F40C000, 0AA3 (r2 DELL   PE_SC3          0 DELL        1)
(XEN) ACPI: HEST 6F40B000, 017C (r1 DELL   PE_SC3          1 INTL        1)
(XEN) ACPI: SSDT 6F7FD000, 0623 (r2 DELL   Tpm2Tabl     1000 INTL 20210331)
(XEN) ACPI: TPM2 6F409000, 004C (r4 DELL   PE_SC3          2 DELL  1000013)
(XEN) ACPI: SSDT 6F401000, 7299 (r2  INTEL SpsNm           2 INTL 20210331)
(XEN) ACPI: SSDT 6F400000, 06EA (r2 DELL   PE_SC3          2 DELL        1)
(XEN) ACPI: DMAR 6F3FF000, 0188 (r1 DELL   PE_SC3          1 DELL        1)
(XEN) System RAM: 261595MB (267873864kB)
(XEN) Domain heap initialised DMA width 32 bits
(XEN) x2APIC mode is already enabled by BIOS.
(XEN) ACPI: 32/64X FACS address mismatch in FADT - 6f373000/0000000000000000, using 32
(XEN) IOAPIC[0]: apic_id 8, version 32, address 0xfec00000, GSI 0-119
(XEN) CPU0: TSC: ratio: 168 / 2
(XEN) CPU0: bus: 100 MHz base: 2100 MHz max: 3300 MHz
(XEN) CPU0: 800 ... 2100 MHz
(XEN) xstate: size: 0xa88 and states: 0x2e7
(XEN) Unrecognised CPU model 0x6a - assuming vulnerable to LazyFPU
(XEN) Speculative mitigation facilities:
(XEN)   Hardware hints: RDCL_NO IBRS_ALL SKIP_L1DFL MDS_NO TAA_NO SBDR_SSDP_NO PSDP_NO
(XEN)   Hardware features: IBPB IBRS STIBP SSBD PSFD L1D_FLUSH MD_CLEAR TSX_CTRL FB_CLEAR FB_CLEAR_CTRL
(XEN)   Compiled-in support: INDIRECT_THUNK SHADOW_PAGING
(XEN)   Xen settings: BTI-Thunk JMP, SPEC_CTRL: IBRS+ STIBP+ SSBD- PSFD- TSX+, Other: IBPB-ctxt BRANCH_HARDEN
(XEN)   Support for HVM VMs: MSR_SPEC_CTRL MSR_VIRT_SPEC_CTRL RSB EAGER_FPU
(XEN)   Support for PV VMs: MSR_SPEC_CTRL EAGER_FPU
(XEN)   XPTI (64-bit PV only): Dom0 disabled, DomU disabled (with PCID)
(XEN)   PV L1TF shadowing: Dom0 disabled, DomU disabled
(XEN) Using scheduler: SMP Credit Scheduler rev2 (credit2)
(XEN) Initializing Credit2 scheduler
(XEN) Platform timer is 24.000MHz HPET
(XEN) Detected 2095.078 MHz processor.
(XEN) Intel VT-d iommu 8 supported page sizes: 4kB, 2MB, 1GB
(XEN) Intel VT-d iommu 7 supported page sizes: 4kB, 2MB, 1GB
(XEN) Intel VT-d iommu 6 supported page sizes: 4kB, 2MB, 1GB
(XEN) Intel VT-d iommu 5 supported page sizes: 4kB, 2MB, 1GB
(XEN) Intel VT-d iommu 4 supported page sizes: 4kB, 2MB, 1GB
(XEN) Intel VT-d iommu 3 supported page sizes: 4kB, 2MB, 1GB
(XEN) Intel VT-d iommu 2 supported page sizes: 4kB, 2MB, 1GB
(XEN) Intel VT-d iommu 1 supported page sizes: 4kB, 2MB, 1GB
(XEN) Intel VT-d iommu 0 supported page sizes: 4kB, 2MB, 1GB
(XEN) Intel VT-d iommu 9 supported page sizes: 4kB, 2MB, 1GB
(XEN) Intel VT-d Snoop Control enabled.
(XEN) Intel VT-d Dom0 DMA Passthrough not enabled.
(XEN) Intel VT-d Queued Invalidation enabled.
(XEN) Intel VT-d Interrupt Remapping enabled.
(XEN) Intel VT-d Posted Interrupt not enabled.
(XEN) Intel VT-d Shared EPT tables enabled.
(XEN) I/O virtualisation enabled
(XEN)  - Dom0 mode: Relaxed
(XEN) Interrupt remapping enabled
(XEN) Enabled directed EOI with ioapic_ack_old on!
(XEN) Enabling APIC mode:  Clustered.  Using 1 I/O APICs
(XEN) ENABLING IO-APIC IRQs
(XEN)  -> Using old ACK method
(XEN) Allocated console ring of 128 KiB.
(XEN) VMX: Supported advanced features:
(XEN)  - APIC MMIO access virtualisation
(XEN)  - APIC TPR shadow
(XEN)  - Extended Page Tables (EPT)
(XEN)  - Virtual-Processor Identifiers (VPID)
(XEN)  - Virtual NMI
(XEN)  - MSR direct-access bitmap
(XEN)  - Unrestricted Guest
(XEN)  - APIC Register Virtualization
(XEN)  - Virtual Interrupt Delivery
(XEN)  - Posted Interrupt Processing
(XEN)  - VMCS shadowing
(XEN)  - VM Functions
(XEN)  - Virtualisation Exceptions
(XEN)  - Page Modification Logging
(XEN)  - TSC Scaling
(XEN)  - Bus Lock Detection
(XEN) HVM: ASIDs enabled.
(XEN) HVM: VMX enabled
(XEN) HVM: Hardware Assisted Paging (HAP) detected
(XEN) HVM: HAP page sizes: 4kB, 2MB, 1GB
(XEN) Brought up 48 CPUs
(XEN) Scheduling granularity: cpu, 1 CPU per sched-resource
(XEN) Initializing Credit2 scheduler
(XEN) Dom0 has maximum 1368 PIRQs
(XEN)  Xen  kernel: 64-bit, lsb
(XEN)  Dom0 kernel: 64-bit, PAE, lsb, paddr 0x1000000 -> 0x4a00000
(XEN) PHYSICAL MEMORY ARRANGEMENT:
(XEN)  Dom0 alloc.:   0000004020000000->0000004028000000 (8345580 pages to be allocated)
(XEN)  Init. ramdisk: 000000407d7ec000->000000407ffff69e
(XEN) VIRTUAL MEMORY ARRANGEMENT:
(XEN)  Loaded kernel: ffffffff81000000->ffffffff84a00000
(XEN)  Phys-Mach map: 0000008000000000->0000008004000000
(XEN)  Start info:    ffffffff84a00000->ffffffff84a004b8
(XEN)  Page tables:   ffffffff84a01000->ffffffff84a2a000
(XEN)  Boot stack:    ffffffff84a2a000->ffffffff84a2b000
(XEN)  TOTAL:         ffffffff80000000->ffffffff84c00000
(XEN)  ENTRY ADDRESS: ffffffff830721c0
(XEN) Dom0 has maximum 48 VCPUs
(XEN) Initial low memory virq threshold set at 0x4000 pages.
(XEN) Scrubbing Free RAM in background
(XEN) Std. Loglevel: Errors and warnings
(XEN) Guest Loglevel: Nothing (Rate-limited: Errors and warnings)
(XEN) *** Serial input to DOM0 (type 'CTRL-a' three times to switch input)
(XEN) Freed 624kB init memory
