On 26/08/2021 22:00, Julien Grall wrote:
> Hi Andrew,
>
> While doing more testing today, I noticed that only one vCPU would be
> brought up with HVM guest with Xen 4.16 on my setup (QEMU):
>
> [    1.122180]
> ================================================================================
> [    1.122180] UBSAN: shift-out-of-bounds in
> oss/linux/arch/x86/kernel/apic/apic.c:2362:13
> [    1.122180] shift exponent -1 is negative
> [    1.122180] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0-rc7+ #304
> [    1.122180] Hardware name: Xen HVM domU, BIOS 4.16-unstable 06/07/2021
> [    1.122180] Call Trace:
> [    1.122180]  dump_stack_lvl+0x56/0x6c
> [    1.122180]  ubsan_epilogue+0x5/0x50
> [    1.122180]  __ubsan_handle_shift_out_of_bounds+0xfa/0x140
> [    1.122180]  ? cgroup_kill_write+0x4d/0x150
> [    1.122180]  ? cpu_up+0x6e/0x100
> [    1.122180]  ? _raw_spin_unlock_irqrestore+0x30/0x50
> [    1.122180]  ? rcu_read_lock_held_common+0xe/0x40
> [    1.122180]  ? irq_shutdown_and_deactivate+0x11/0x30
> [    1.122180]  ? lock_release+0xc7/0x2a0
> [    1.122180]  ? apic_id_is_primary_thread+0x56/0x60
> [    1.122180]  apic_id_is_primary_thread+0x56/0x60
> [    1.122180]  cpu_up+0xbd/0x100
> [    1.122180]  bringup_nonboot_cpus+0x4f/0x60
> [    1.122180]  smp_init+0x26/0x74
> [    1.122180]  kernel_init_freeable+0x183/0x32d
> [    1.122180]  ? _raw_spin_unlock_irq+0x24/0x40
> [    1.122180]  ? rest_init+0x330/0x330
> [    1.122180]  kernel_init+0x17/0x140
> [    1.122180]  ? rest_init+0x330/0x330
> [    1.122180]  ret_from_fork+0x22/0x30
> [    1.122244]
> ================================================================================
> [    1.123176] installing Xen timer for CPU 1
> [    1.123369] x86: Booting SMP configuration:
> [    1.123409] .... node  #0, CPUs:      #1
> [    1.154400] Callback from call_rcu_tasks_trace() invoked.
> [    1.154491] smp: Brought up 1 node, 1 CPU
> [    1.154526] smpboot: Max logical packages: 2
> [    1.154570] smpboot: Total of 1 processors activated (5999.99
> BogoMIPS)
>
> I have tried a PV guest (same setup) and the kernel could bring up all
> the vCPUs.
>
> Digging down, Linux will set smp_num_siblings to 0 (via
> detect_ht_early()) and as a result will skip all the CPUs. The value
> is retrieve from a CPUID leaf. So it sounds like we don't set the
> leaft correctly.
>
> FWIW, I have also tried on Xen 4.11 and could spot the same issue.
> Does this ring any bell to you?

The CPUID data we give to guests is generally nonsense when it comes to
topology.  By any chance does the hardware you're booting this on not
have hyperthreading enabled/active to begin with?

Fixing this is on the todo list, but it needs libxl to start using
policy objects (series for the next phase of this still pending on
xen-devel).  Exactly how you represent the topology to the guest
correctly depends on the vendor and rough generation - I believe there
are 5 different algorithms to use, and for AMD in particular, it even
depends on how many IO-APICs are visible in the guest.

~Andrew


Reply via email to