RE: [V3 PATCH 0/4] Fix race issues among panic, NMI and crash_kexec

2015-08-21 Thread 河合英宏 / KAWAI,HIDEHIRO
> From: Peter Zijlstra [mailto:pet...@infradead.org]
> User-Agent: StGit/0.16
> 
> Fwiw, stgit is broken wrt sending email, all your emails have the exact
> same timestamp, which means that the emails will be ordered on received
> timestamp when threaded and generate the below mess:

Sorry for the inconvenience.  I'll try to find some workaround.

Regards,

Hidehiro Kawai
Hitachi, Ltd. Research & Development Group



Re: [V3 PATCH 0/4] Fix race issues among panic, NMI and crash_kexec

2015-08-20 Thread Peter Zijlstra

User-Agent: StGit/0.16

Fwiw, stgit is broken wrt sending email, all your emails have the exact
same timestamp, which means that the emails will be ordered on received
timestamp when threaded and generate the below mess:


 Aug 06 Hidehiro Kawai  (1.9K) [V3 PATCH 0/4] Fix race issues among panic, NMI and crash_kexec
 Aug 06 Hidehiro Kawai  (2.4K) ├─>[V3 PATCH 3/4] kexec: Fix race between panic() and crash_kexec() called directly
 Aug 06 Hidehiro Kawai  (4.9K) ├─>[V3 PATCH 1/4] panic/x86: Fix re-entrance problem due to panic on NMI
 Aug 06 Hidehiro Kawai  (5.3K) ├─>[V3 PATCH 2/4] panic/x86: Allow cpus to save registers even if they are looping in NMI context
 Aug 06 Hidehiro Kawai  (2.5K) ├─>[V3 PATCH 4/4] x86/apic: Introduce noextnmi boot option


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [V3 PATCH 0/4] Fix race issues among panic, NMI and crash_kexec

2015-08-07 Thread Michal Hocko
JFYI I have those patches tested on a largish box. Will come back to
you as soon as I have some feedback.

I will also try to review these patches sometime next week.

Thanks!

On Thu 06-08-15 14:45:43, Hidehiro Kawai wrote:
> When HA cluster software or an administrator detects that a host is
> unresponsive, they issue an NMI to the host to completely stop its
> current work and take a crash dump.  If the kernel has already panicked
> or is capturing a crash dump at that time, a further NMI can cause
> the crash dump to fail.
> 
> Also, crash_kexec() called from oops context and panic() can
> cause race conditions.
> 
> To solve these issues, this patch set does the following things:
> 
> - Don't panic on NMI if the kernel has already panicked
> - Extend exclusion control currently done by panic_lock to crash_kexec
> - Introduce "noextnmi" boot option which masks external NMIs at
>   boot time (supported only for x86)
> 
> V3:
> - Introduce nmi_panic() macro to reduce code duplication
> - In the case of panic on NMI, don't return from NMI handlers
>   if another cpu already panicked
> 
> V2:
> - Use atomic_cmpxchg() instead of current spin_trylock() to exclude
>   concurrent accesses to panic() and crash_kexec()
> - Don't introduce no-lock version of panic() and crash_kexec()
> 
> ---
> 
> Hidehiro Kawai (4):
>   panic/x86: Fix re-entrance problem due to panic on NMI
>   panic/x86: Allow cpus to save registers even if they are looping in NMI context
>   kexec: Fix race between panic() and crash_kexec() called directly
>   x86/apic: Introduce noextnmi boot option
> 
> 
>  Documentation/kernel-parameters.txt |    4
>  arch/x86/kernel/apic/apic.c         |   17 -
>  arch/x86/kernel/nmi.c               |   15 +++
>  arch/x86/kernel/reboot.c            |   11 +++
>  include/linux/kernel.h              |   21 +
>  kernel/kexec.c                      |   20
>  kernel/panic.c                      |   23 ---
>  kernel/watchdog.c                   |    5 +++--
>  8 files changed, 106 insertions(+), 10 deletions(-)
> 
> 
> -- 
> Hidehiro Kawai
> Hitachi, Ltd. Research & Development Group
> 
> 

-- 
Michal Hocko
SUSE Labs
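
The V2 change quoted above, using atomic_cmpxchg() rather than
spin_trylock() to exclude concurrent callers of panic() and
crash_kexec(), can be sketched in userspace C11 roughly as follows.
This is only an illustration under stated assumptions, not code from
the patch set: panic_owner, PANIC_CPU_NONE, claim_panic_path() and
panic_path_owner() are hypothetical names, and
atomic_compare_exchange_strong() from <stdatomic.h> stands in for the
kernel's atomic_cmpxchg().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical sentinel: no CPU has entered the panic path yet. */
#define PANIC_CPU_NONE (-1)

/* Hypothetical owner variable shared by the panic and crash-dump paths. */
static atomic_int panic_owner = PANIC_CPU_NONE;

/*
 * Try to become the single CPU allowed to proceed.
 * Only the first caller gets true; the variable then records
 * which CPU won, which a plain trylock would not.
 */
bool claim_panic_path(int cpu)
{
    int expected = PANIC_CPU_NONE;

    /* Atomically: if panic_owner == expected, store cpu and succeed. */
    return atomic_compare_exchange_strong(&panic_owner, &expected, cpu);
}

/* Return the recorded owner, or PANIC_CPU_NONE if nobody has claimed it. */
int panic_path_owner(void)
{
    return atomic_load(&panic_owner);
}
```

In this sketch the first caller wins and every later caller gets
false.  Because the variable stores the winner's CPU number, a later
caller can also check whether it is itself the CPU that already owns
the path.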