Hi,

On Monday, 28 October 2019, 18:30:12 CET, Stonehouse, Robert wrote:
> This is a heads-up as I have observed that the following commit (backported 
> onto an Amazon 4.11 tree) causes kexec (and hence kdump) to fail. 
> ========
> commit c719519a4183d0630121f6abeba420f49dbc3229
> Author: Jan Beulich <jbeul...@suse.com>
> AuthorDate: Fri Jul 5 10:32:41 2019 +0200
> Commit: Jan Beulich <jbeul...@suse.com>
> CommitDate: Fri Jul 5 10:32:41 2019 +0200
> 
> x86/SMP: don't try to stop already stopped CPUs
>     
>     In particular with an enabled IOMMU (but not really limited to this
>     case), trying to invoke fixup_irqs() after having already done
>     disable_IO_APIC() -> clear_IO_APIC() is a rather bad idea:
> ========
> 
> The test was performing "echo c > /proc/sysrq-trigger" in dom0 and the loaded 
> crash kernel fails to show any signs of starting. This is the end of the Xen 
> console ...
> ========
> (XEN) Hardware Dom0 crashed: rebooting machine in 5 seconds.
> (XEN) Resetting with ACPI MEMORY or I/O RESET_REG.
> <machine hangs here then reboots via the BIOS after 5 seconds>
> ========
> Expected behaviour is that the kdump kernel immediately loads and then 
> performs the crash dump

I can confirm this behavior, but with the Xen version (4.11.0_08-1) from
SUSE SLES12 SP4, which doesn't contain the said commit
c719519a4183d0630121f6abeba420f49dbc3229. However, I see it only on systems
with newer Intel CPUs such as the
"Intel(R) Xeon(R) Gold 6242 CPU".



> 
> I'm sorry that I have not yet had time to check if this affects vanilla 
> stable-4.11 or master. I just wanted to be certain that you don't have the 
> same issue.
> 
> 
> Reverting one hunk via the following commit fixes things for me (this is an 
> experiment and not at all a proposed fix)
> ========
> --- a/xen/arch/x86/smp.c
> +++ b/xen/arch/x86/smp.c
> @@ -303,15 +303,15 @@ static void stop_this_cpu(void *dummy)
>  void smp_send_stop(void)
>  {
>      unsigned int cpu = smp_processor_id();
> +    
> +    local_irq_disable();
> +    fixup_irqs(cpumask_of(cpu), 0);
> +    local_irq_enable();
>  
>      if ( num_online_cpus() > 1 )
>      {
>          int timeout = 10;
>  
> -        local_irq_disable();
> -        fixup_irqs(cpumask_of(cpu), 0);
> -        local_irq_enable();
> -
>          smp_call_function(stop_this_cpu, NULL, 0);
>  
>          /* Wait 10ms for all other CPUs to go offline. */
> ========
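
To spell out what this hunk changes, here is a minimal model of the control
flow, not Xen code. The helper name fixup_irqs_reached() and its parameters
are made up for illustration. It only encodes what the diff shows: without
the revert, the fixup_irqs() call sits inside the num_online_cpus() > 1
check; with the revert, it runs before that check. Whether the kexec path
really reaches smp_send_stop() with a single CPU online is an assumption I
have not verified.

```c
#include <stdbool.h>

/*
 * Hypothetical model of smp_send_stop() as shown in the quoted diff.
 * Returns true if the fixup_irqs() call would be reached.
 *
 * online_cpus   - stand-in for the value of num_online_cpus()
 * hunk_reverted - true models the experimental revert quoted above
 */
static bool fixup_irqs_reached(unsigned int online_cpus, bool hunk_reverted)
{
    if (hunk_reverted)
        return true;        /* call placed before the online-CPU check */

    return online_cpus > 1; /* call placed inside the online-CPU check */
}
```

In this model fixup_irqs_reached(1, false) is false and
fixup_irqs_reached(1, true) is true, so the two variants differ only when
one CPU is online at the time of the call.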
> 
> Regards
> Rob
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xenproject.org
> https://lists.xenproject.org/mailman/listinfo/xen-devel
