Re: [V4 PATCH 4/4] x86/apic: Introduce noextnmi boot option

2015-10-27 Thread 'Baoquan He'
On 10/27/15 at 09:01am, 河合英宏 / KAWAI,HIDEHIRO wrote:
> Hi,
> 
> > I just had a look at this thread. I am wondering why we don't use
> > the existing is_kdump_kernel() directly to disable external NMIs when
> > running in the kdump kernel. Then there would be no need to introduce
> > another boot option, "noextnmi", which is used only for the kdump kernel.
> 
> As I stated in another mail, there is a case where we don't want to
> mask external NMIs in the second kernel.  So, we need some
> configurable way.  Please see the following quotation.

Got it. Thanks for explaining.

> 
> > We shouldn't enable this feature silently.  Some users wouldn't want
> > it enabled.  For example, a user enables a watchdog timer
> > which raises an external NMI when the counter is not reset for a
> > specific duration.  Then, the second kernel hangs while saving the
> > crash dump, and the NMI is delivered to the CPU.  The kernel panics
> > due to the NMI, prints some information to the display and serial
> > console, and then automatically reboots.  In this case, users don't
> > want to block external NMIs.
> 
> Regards,
> 
> Hidehiro Kawai
> Hitachi, Ltd. Research & Development Group
> 
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: Re: [V4 PATCH 4/4] x86/apic: Introduce noextnmi boot option

2015-10-27 Thread 河合英宏 / KAWAI,HIDEHIRO
Hi,

> I just had a look at this thread. I am wondering why we don't use
> the existing is_kdump_kernel() directly to disable external NMIs when
> running in the kdump kernel. Then there would be no need to introduce
> another boot option, "noextnmi", which is used only for the kdump kernel.

As I stated in another mail, there is a case where we don't want to
mask external NMIs in the second kernel.  So, we need some
configurable way.  Please see the following quotation.

> We shouldn't enable this feature silently.  Some users wouldn't want
> it enabled.  For example, a user enables a watchdog timer
> which raises an external NMI when the counter is not reset for a
> specific duration.  Then, the second kernel hangs while saving the
> crash dump, and the NMI is delivered to the CPU.  The kernel panics
> due to the NMI, prints some information to the display and serial
> console, and then automatically reboots.  In this case, users don't
> want to block external NMIs.

Regards,

Hidehiro Kawai
Hitachi, Ltd. Research & Development Group
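
The difference between the two policies in this exchange can be shown with a small model. This is only a sketch: struct boot_state and both function names are hypothetical and not part of the posted patch; the is_kdump field merely stands in for what is_kdump_kernel() would report.

```c
#include <stdbool.h>

/* Hypothetical model of the state the second kernel can see at boot. */
struct boot_state {
	bool is_kdump;	/* stands in for is_kdump_kernel() */
	bool noextnmi;	/* "noextnmi" was given on the command line */
};

/* Policy suggested in the question: mask whenever this is a kdump kernel. */
bool mask_extnmi_implicit(const struct boot_state *s)
{
	return s->is_kdump;
}

/* Policy of the patch: mask only when the user asked for it. */
bool mask_extnmi_explicit(const struct boot_state *s)
{
	return s->noextnmi;
}
```

For the watchdog user described above (kdump kernel, option not given), the implicit policy would mask the NMI and defeat the watchdog, while the explicit policy leaves it deliverable.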




Re: [V4 PATCH 4/4] x86/apic: Introduce noextnmi boot option

2015-10-27 Thread Baoquan He
Hi,

I just had a look at this thread. I am wondering why we don't use
the existing is_kdump_kernel() directly to disable external NMIs when
running in the kdump kernel. Then there would be no need to introduce
another boot option, "noextnmi", which is used only for the kdump kernel.

Thanks
Baoquan

On 09/25/15 at 08:28pm, Hidehiro Kawai wrote:
> This patch introduces new boot option "noextnmi" which disables
> external NMI.  This option is useful for the dump capture kernel
> so that an HA application or administrator wouldn't mistakenly
> shoot down the kernel by NMI.
> 
> Currently, only x86 supports this option.
> 
> Signed-off-by: Hidehiro Kawai 
> Cc: Thomas Gleixner 
> Cc: Ingo Molnar 
> Cc: "H. Peter Anvin" 
> Cc: Jonathan Corbet 
> ---
>  Documentation/kernel-parameters.txt |    4 ++++
>  arch/x86/kernel/apic/apic.c         |   17 ++++++++++++++++-
>  2 files changed, 20 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
> index 22a4b68..8bcaccd 100644
> --- a/Documentation/kernel-parameters.txt
> +++ b/Documentation/kernel-parameters.txt
> @@ -2379,6 +2379,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
>  			noexec=on: enable non-executable mappings (default)
>  			noexec=off: disable non-executable mappings
>  
> +	noextnmi	[X86]
> +			Mask external NMI.  This option is useful for a
> +			dump capture kernel to be shot down by NMI.
> +
>  	nosmap		[X86]
>  			Disable SMAP (Supervisor Mode Access Prevention)
>  			even if it is supported by processor.
> diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
> index 24e94ce..fd47128 100644
> --- a/arch/x86/kernel/apic/apic.c
> +++ b/arch/x86/kernel/apic/apic.c
> @@ -82,6 +82,12 @@
>  static unsigned int disabled_cpu_apicid __read_mostly = BAD_APICID;
>  
>  /*
> + * Don't enable external NMI via LINT1 on BSP.  This is useful for
> + * the dump capture kernel.
> + */
> +static bool apic_noextnmi;
> +
> +/*
>   * Map cpu index to physical APIC ID
>   */
>  DEFINE_EARLY_PER_CPU_READ_MOSTLY(u16, x86_cpu_to_apicid, BAD_APICID);
> @@ -1161,6 +1167,8 @@ void __init init_bsp_APIC(void)
>  	value = APIC_DM_NMI;
>  	if (!lapic_is_integrated()) /* 82489DX */
>  		value |= APIC_LVT_LEVEL_TRIGGER;
> +	if (apic_noextnmi)
> +		value |= APIC_LVT_MASKED;
>  	apic_write(APIC_LVT1, value);
>  }
>  
> @@ -1380,7 +1388,7 @@ void setup_local_APIC(void)
>  	/*
>  	 * only the BP should see the LINT1 NMI signal, obviously.
>  	 */
> -	if (!cpu)
> +	if (!cpu && !apic_noextnmi)
>  		value = APIC_DM_NMI;
>  	else
>  		value = APIC_DM_NMI | APIC_LVT_MASKED;
> @@ -2548,3 +2556,10 @@ static int __init apic_set_disabled_cpu_apicid(char *arg)
>  	return 0;
>  }
>  early_param("disable_cpu_apicid", apic_set_disabled_cpu_apicid);
> +
> +static int __init apic_set_noextnmi(char *arg)
> +{
> +	apic_noextnmi = true;
> +	return 0;
> +}
> +early_param("noextnmi", apic_set_noextnmi);
> 
> 
> 
> ___
> kexec mailing list
> ke...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec
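
The effect of the two apic.c hunks on the LINT1 LVT register value can be modelled in user space. This is a sketch only: the two function names are hypothetical, the bit values are reproduced from memory of arch/x86/include/asm/apicdef.h and should be checked against the tree, and only the lines visible in the hunks are modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match arch/x86/include/asm/apicdef.h. */
#define APIC_DM_NMI		0x00400u
#define APIC_LVT_LEVEL_TRIGGER	(1u << 15)
#define APIC_LVT_MASKED		(1u << 16)

/* Models the init_bsp_APIC() hunk: value written to APIC_LVT1 on the BSP. */
uint32_t bsp_lvt1_value(bool lapic_integrated, bool noextnmi)
{
	uint32_t value = APIC_DM_NMI;

	if (!lapic_integrated)		/* 82489DX */
		value |= APIC_LVT_LEVEL_TRIGGER;
	if (noextnmi)
		value |= APIC_LVT_MASKED;
	return value;
}

/* Models the setup_local_APIC() hunk: CPU 0 is the boot processor. */
uint32_t cpu_lvt1_value(int cpu, bool noextnmi)
{
	if (cpu == 0 && !noextnmi)
		return APIC_DM_NMI;
	return APIC_DM_NMI | APIC_LVT_MASKED;
}
```

With the option set, the BSP ends up with the same masked LINT1 entry that the application processors always get.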


RE: [V4 PATCH 4/4] x86/apic: Introduce noextnmi boot option

2015-10-15 Thread 河合英宏 / KAWAI,HIDEHIRO
> > By the way, I have a pending patch which expands this option like
> > this:
> >
> > apic_extnmi={ bsp | all | none }
> >
> > If apic_extnmi=all is specified, external NMIs are broadcast to
> > all CPUs.  This raises the success rate of kernel panic in the case
> > where an external NMI to CPU 0 is swallowed by other NMI handlers or
> > blocked due to a hang-up in NMI context.  The patch works without any
> > problems, but I'm going to drop the feature if it causes a long
> > discussion.  I'd like to get this patch set settled first.  At least,
> > I'm going to change this option to the apic_extnmi={bsp|none} style to
> > allow for future expansion.
> >
> > What do you think about this?
> 
> Do it right away with all three variants. They make a lot of sense to
> me.

OK. I'll do that.

Regards,

--
Hidehiro Kawai
Hitachi, Ltd. Research & Development Group
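
A three-way option of the form apic_extnmi={bsp|all|none} could be parsed roughly as follows. This is a user-space sketch under stated assumptions: the enum, parse_apic_extnmi() and extnmi_unmasked() are hypothetical names chosen for illustration and are not taken from the pending patch mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical modes mirroring the three proposed option values. */
enum extnmi_mode { EXTNMI_BSP, EXTNMI_ALL, EXTNMI_NONE };

/*
 * Parse the value part of "apic_extnmi=<value>".
 * Returns 0 on success, -1 on a missing or unknown value
 * (*mode is left untouched in that case).
 */
int parse_apic_extnmi(const char *arg, enum extnmi_mode *mode)
{
	assert(mode != NULL);

	if (arg == NULL)
		return -1;
	if (strcmp(arg, "bsp") == 0)
		*mode = EXTNMI_BSP;
	else if (strcmp(arg, "all") == 0)
		*mode = EXTNMI_ALL;
	else if (strcmp(arg, "none") == 0)
		*mode = EXTNMI_NONE;
	else
		return -1;
	return 0;
}

/* Should LINT1 be left unmasked on this CPU?  CPU 0 is the BSP. */
int extnmi_unmasked(enum extnmi_mode mode, int cpu)
{
	switch (mode) {
	case EXTNMI_ALL:
		return 1;
	case EXTNMI_BSP:
		return cpu == 0;
	case EXTNMI_NONE:
	default:
		return 0;
	}
}
```

Here "bsp" corresponds to the behaviour without any option, "none" to the original "noextnmi", and "all" to the broadcast variant.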





RE: [V4 PATCH 4/4] x86/apic: Introduce noextnmi boot option

2015-10-15 Thread 河合英宏 / KAWAI,HIDEHIRO
> * Thomas Gleixner  wrote:
> 
> > Borislav,
> >
> > On Mon, 5 Oct 2015, Borislav Petkov wrote:
> > > On Mon, Oct 05, 2015 at 02:03:58AM +, 河合英宏 / KAWAI,HIDEHIRO wrote:
> > > > That's different from my point of view.  I'm not going to pass
> > > > some data from the first kernel to the second kernel. I'm just going to
> > > > provide a configurable option for the second kernel to users.
> > >
> > > Dude, WTF?! You're adding a kernel command line which is supposed to
> > > be used *only* by the kdump kernel. But nooo, it is there in the open
> > > and visible to people. And anyone can type it in during boot. AND THAT
> > > SHOULDN'T BE POSSIBLE IN THE FIRST PLACE!
> > >
> > > This information is strictly for the kdump kernel - it shouldn't be a
> > > generic command line option. How hard it is to understand that simple
> > > fact?!
> >
> > Calm down!
> >
> > Disabling that NMI in the first kernel is not going to make the world
> > explode. We have enough command line options a user can type in, which
> > are way worse than that one. If some "expert" types nonsense on the
> > first kernel command line, then it's none of our problems, really.
> >
> > If Kawai would have marked that option as a debug feature, this
> > discussion would have probably never happened.
> >
> > Aside of that, the best way to hand in options for the kdump kernel is
> > the command line. We have an existing interface for that.
> >
> > Let's move on. Nothing to see here.
> 
> So Boris kind of has a point: there are numerous problems with boot options as
> kexec parameter interface:
> 
>  - boot option strings are not a well defined programmatic interface:
>     - failures are not obvious (they are often ignored)
>     - inserting/setting parameters is awkward as well.
> 
>  - boot options are not an ABI, so when options have dual use with kexec it's easy
>    to break things inadvertently: without that failure being apparent other than
>    reintroducing obscure kexec failure modes again.
> 
>  - in the booted up kexec kernel it's not really obvious which options got set by
>    kexec and which got set by the user - as there's no separation of namespaces.
> 
>  - likewise, if the user specifies an option in conflict with a kexec requirement
>    it's not really obvious what's happening: the user setting should probably
>    dominate - I'm not sure that's the case with the current kexec code.

Thanks for the detailed explanation.  I understand the disadvantages
of using boot options.  So these are inherent problems with boot options
in general, rather than with this new boot option for the kexec'ed kernel.
 
> Boot options are basically a user interface.
> 
> On the other hand the hack of (ab-)using boot parameters as kexec parameter
> passing is an existing, many years old practice and we cannot just stop it
> without offering an alternative (and better!) interface first.
> 
> We could improve things by either adding a separate kexec-only parameter passing
> facility (like programmatic boot parameters are) - or by creating some sort of
> boot parameter alias that clearly identifies kexec parameters.
> 
> So for example when introducing 'noextnmi' we'd also add a facility to add a
> 'kexec_noextnmi' alias - which clearly identifies this boot parameter as a
> kexec related one.
> 
> Every 'kexec' inserted parameter would be prefixed by kexec_ - and then the
> separation of namespaces (and the prioritization of user vs. kexec requirements)
> becomes well defined as well.
> 
> We should also probably print a warning if a kexec_* parameter is passed in that
> has no matching handler in the kexec()-ed kernel.

That would be reasonable.  Alternatively, we might improve the kexec
command so that it removes conflicting boot options with a warning.

As I stated in another mail, I'm going to change "noextnmi" to
"apic_extnmi={bsp|all|none}", which can be used in both the first and
second kernels.  So, I'll add this option without the "kexec_" prefix
at this point.

> I do concur that this patch is probably OK given existing practices.

Thanks!

--
Hidehiro Kawai
Hitachi, Ltd. Research & Development Group
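
The idea of having the kexec command drop a conflicting boot option with a warning could look roughly like this. It is a sketch only: strip_option() is a hypothetical helper, not an existing kexec-tools function, and it splits on spaces without handling quoted values.

```c
#include <stdio.h>
#include <string.h>

/*
 * Copy cmdline into out, dropping every token that is exactly "key"
 * or starts with "key=".  Returns the number of dropped tokens, or -1
 * if out is too small.  Quoted values containing spaces are not handled.
 */
int strip_option(const char *cmdline, const char *key, char *out, size_t outsz)
{
	size_t klen = strlen(key);
	size_t used = 0;
	int removed = 0;

	if (outsz == 0)
		return -1;
	out[0] = '\0';

	while (*cmdline) {
		size_t len;

		cmdline += strspn(cmdline, " ");	/* skip separators */
		len = strcspn(cmdline, " ");		/* length of this token */
		if (len == 0)
			break;

		if (len >= klen && strncmp(cmdline, key, klen) == 0 &&
		    (len == klen || cmdline[klen] == '=')) {
			fprintf(stderr, "warning: dropping conflicting option '%.*s'\n",
				(int)len, cmdline);
			removed++;
		} else {
			size_t need = len + (used ? 1 : 0);

			if (used + need >= outsz)
				return -1;	/* no room for token plus NUL */
			if (used)
				out[used++] = ' ';
			memcpy(out + used, cmdline, len);
			used += len;
			out[used] = '\0';
		}
		cmdline += len;
	}
	return removed;
}
```

Matching on "key" or "key=" only keeps an unrelated option that merely shares the prefix from being removed by accident.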



Re: [V4 PATCH 4/4] x86/apic: Introduce noextnmi boot option

2015-10-14 Thread Ingo Molnar

* Thomas Gleixner  wrote:

> Borislav,
> 
> On Mon, 5 Oct 2015, Borislav Petkov wrote:
> > On Mon, Oct 05, 2015 at 02:03:58AM +, 河合英宏 / KAWAI,HIDEHIRO wrote:
> > > That's different from my point of view.  I'm not going to pass
> > > some data from the first kernel to the second kernel. I'm just going to
> > > provide a configurable option for the second kernel to users.
> > 
> > Dude, WTF?! You're adding a kernel command line which is supposed to
> > be used *only* by the kdump kernel. But nooo, it is there in the open
> > and visible to people. And anyone can type it in during boot. AND THAT
> > SHOULDN'T BE POSSIBLE IN THE FIRST PLACE!
> > 
> > This information is strictly for the kdump kernel - it shouldn't be a
> > generic command line option. How hard it is to understand that simple
> > fact?!
> 
> Calm down!
> 
> Disabling that NMI in the first kernel is not going to make the world
> explode. We have enough command line options a user can type in, which
> are way worse than that one. If some "expert" types nonsense on the
> first kernel command line, then it's none of our problems, really.
> 
> If Kawai would have marked that option as a debug feature, this
> discussion would have probably never happened.
> 
> Aside of that, the best way to hand in options for the kdump kernel is
> the command line. We have an existing interface for that.
> 
> Let's move on. Nothing to see here.

So Boris kind of has a point: there are numerous problems with boot options as
kexec parameter interface:

 - boot option strings are not a well defined programmatic interface:
    - failures are not obvious (they are often ignored)
    - inserting/setting parameters is awkward as well.

 - boot options are not an ABI, so when options have dual use with kexec it's easy
   to break things inadvertently: without that failure being apparent other than
   reintroducing obscure kexec failure modes again.

 - in the booted up kexec kernel it's not really obvious which options got set by
   kexec and which got set by the user - as there's no separation of namespaces.

 - likewise, if the user specifies an option in conflict with a kexec requirement
   it's not really obvious what's happening: the user setting should probably
   dominate - I'm not sure that's the case with the current kexec code.

Boot options are basically a user interface.

On the other hand the hack of (ab-)using boot parameters as kexec parameter
passing is an existing, many years old practice and we cannot just stop it
without offering an alternative (and better!) interface first.

We could improve things by either adding a separate kexec-only parameter passing
facility (like programmatic boot parameters are) - or by creating some sort of
boot parameter alias that clearly identifies kexec parameters.

So for example when introducing 'noextnmi' we'd also add a facility to add a
'kexec_noextnmi' alias - which clearly identifies this boot parameter as a
kexec related one.

Every 'kexec' inserted parameter would be prefixed by kexec_ - and then the
separation of namespaces (and the prioritization of user vs. kexec requirements)
becomes well defined as well.

We should also probably print a warning if a kexec_* parameter is passed in that
has no matching handler in the kexec()-ed kernel.

I do concur that this patch is probably OK given existing practices.

Thanks,

Ingo
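
The kexec_ alias idea, including the warning for an unmatched kexec_* parameter, might be sketched like this in user space. Everything here is hypothetical: the known_params table, the enum and classify_param() are illustrative names, and no such facility is claimed to exist in the kernel.

```c
#include <stdio.h>
#include <string.h>

#define KEXEC_PREFIX "kexec_"

/* Hypothetical list of option names this kernel has handlers for. */
static const char *const known_params[] = { "noextnmi", "nosmap" };

enum param_origin { PARAM_UNKNOWN, PARAM_USER, PARAM_KEXEC };

/*
 * Classify one command-line token by namespace.  A token carrying the
 * kexec_ prefix is looked up by its unprefixed name; if no handler
 * matches, a warning is printed instead of ignoring it silently.
 */
enum param_origin classify_param(const char *token)
{
	size_t plen = strlen(KEXEC_PREFIX);
	int from_kexec = strncmp(token, KEXEC_PREFIX, plen) == 0;
	const char *name = from_kexec ? token + plen : token;
	size_t i;

	for (i = 0; i < sizeof(known_params) / sizeof(known_params[0]); i++) {
		if (strcmp(name, known_params[i]) == 0)
			return from_kexec ? PARAM_KEXEC : PARAM_USER;
	}
	if (from_kexec)
		fprintf(stderr, "warning: no handler for kexec parameter '%s'\n",
			token);
	return PARAM_UNKNOWN;
}
```

Keeping the origin separate from the handler lookup is what would let a later step decide whether the user setting or the kexec setting should win when both are present.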


RE: [V4 PATCH 4/4] x86/apic: Introduce noextnmi boot option

2015-10-13 Thread 河合英宏 / KAWAI,HIDEHIRO
> On Fri, 25 Sep 2015, Hidehiro Kawai wrote:
> 
> > This patch introduces new boot option "noextnmi" which disables
> > external NMI.  This option is useful for the dump capture kernel
> > so that an HA application or administrator wouldn't mistakenly
> > shoot down the kernel by NMI.
> >
> > Currently, only x86 supports this option.
> 
> You might add that it can be used for debugging purposes as
> well. External NMIs can be their own source of trouble. :)

Thanks for your comments!  I'll do that.

By the way, I have a pending patch which expands this option like
this:

apic_extnmi={ bsp | all | none }

If apic_extnmi=all is specified, external NMIs are broadcast to
all CPUs.  This raises the success rate of kernel panic in the case
where an external NMI to CPU 0 is swallowed by other NMI handlers or
blocked due to a hang-up in NMI context.  The patch works without any
problems, but I'm going to drop the feature if it causes a long
discussion.  I'd like to get this patch set settled first.  At least,
I'm going to change this option to the apic_extnmi={bsp|none} style to
allow for future expansion.

What do you think about this?

> > Signed-off-by: Hidehiro Kawai 
> > Cc: Thomas Gleixner 
> > Cc: Ingo Molnar 
> > Cc: "H. Peter Anvin" 
> > Cc: Jonathan Corbet 
> > ---
> >  Documentation/kernel-parameters.txt |    4 ++++
> >  arch/x86/kernel/apic/apic.c         |   17 ++++++++++++++++-
> >  2 files changed, 20 insertions(+), 1 deletion(-)
> >
> > diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
> > index 22a4b68..8bcaccd 100644
> > --- a/Documentation/kernel-parameters.txt
> > +++ b/Documentation/kernel-parameters.txt
> > @@ -2379,6 +2379,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
> >  			noexec=on: enable non-executable mappings (default)
> >  			noexec=off: disable non-executable mappings
> >
> > +	noextnmi	[X86]
> > +			Mask external NMI.  This option is useful for a
> > +			dump capture kernel to be shot down by NMI.
> 
> That should read: "...not to be shot down", right?

Yes, you are right.  I'll fix it.

> Other than that.
> 
> Acked-by: Thomas Gleixner 

Regards,

Hidehiro Kawai
Hitachi, Ltd. Research & Development Group





Re: [V4 PATCH 4/4] x86/apic: Introduce noextnmi boot option

2015-10-13 Thread Thomas Gleixner
On Fri, 25 Sep 2015, Hidehiro Kawai wrote:

> This patch introduces new boot option "noextnmi" which disables
> external NMI.  This option is useful for the dump capture kernel
> so that an HA application or administrator wouldn't mistakenly
> shoot down the kernel by NMI.
> 
> Currently, only x86 supports this option.

You might add that it can be used for debugging purposes as
well. External NMIs can be their own source of trouble. :)
 
> Signed-off-by: Hidehiro Kawai 
> Cc: Thomas Gleixner 
> Cc: Ingo Molnar 
> Cc: "H. Peter Anvin" 
> Cc: Jonathan Corbet 
> ---
>  Documentation/kernel-parameters.txt |    4 ++++
>  arch/x86/kernel/apic/apic.c         |   17 ++++++++++++++++-
>  2 files changed, 20 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
> index 22a4b68..8bcaccd 100644
> --- a/Documentation/kernel-parameters.txt
> +++ b/Documentation/kernel-parameters.txt
> @@ -2379,6 +2379,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
>                         noexec=on: enable non-executable mappings (default)
>                         noexec=off: disable non-executable mappings
>  
> +       noextnmi        [X86]
> +                       Mask external NMI.  This option is useful for a
> +                       dump capture kernel to be shot down by NMI.

That should read: "...not to be shot down", right?

Other than that.

Acked-by: Thomas Gleixner 


RE: [V4 PATCH 4/4] x86/apic: Introduce noextnmi boot option

2015-10-13 Thread 河合英宏 / KAWAI,HIDEHIRO
Hello, Boris

Sorry for the late reply.

> On Mon, Oct 05, 2015 at 09:21:02AM +, 河合英宏 / KAWAI,HIDEHIRO wrote:
> > So, the problem for you is that "noextnmi" option is visible and effective
> > in the first kernel, isn't it?
> 
> No, such an option shouldn't exist at all. You should be passing
> information *in* *a* *different* *manner* to the kdump kernel - not with
> a kernel command line option.

Sorry, I couldn't find the reason why I shouldn't use a cmdline option.
It doesn't need a new user I/F to inform the 1st kernel about masking/unmasking
external NMIs in the 2nd kernel, doesn't need new data passing infrastructure,
and is easy to configure.  Also, "elfcorehdr" and "disable_cpu_apicid"
have already been introduced as cmdline options for the dump capture kernel.
This suggests the cmdline option approach would be mostly acceptable.

> I get the feeling I'm starting to sound like a broken record on this
> mail thread... :-(
> 
> One other thing we could probably try to do is use boot_params which
> is, IIUC, passed to the second kernel. So we can add another bit to
> boot_params.hdr.loadflags or so and use that. Or something similar.

I think using boot_params would be worse than the ELF header approach.
It would need boot_params bits to be reserved for all boot loaders.

Regards,

Hidehiro Kawai
Hitachi, Ltd. Research & Development Group




Re: [V4 PATCH 4/4] x86/apic: Introduce noextnmi boot option

2015-10-05 Thread Borislav Petkov
On Mon, Oct 05, 2015 at 09:21:02AM +, 河合英宏 / KAWAI,HIDEHIRO wrote:
> So, the problem for you is that "noextnmi" option is visible and effective
> in the first kernel, isn't it?

No, such an option shouldn't exist at all. You should be passing
information *in* *a* *different* *manner* to the kdump kernel - not with
a kernel command line option.

I get the feeling I'm starting to sound like a broken record on this
mail thread... :-(

One other thing we could probably try to do is use boot_params which
is, IIUC, passed to the second kernel. So we can add another bit to
boot_params.hdr.loadflags or so and use that. Or something similar.

Not particularly crazy about it but it is still much better than a
command line param...
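
To make the boot_params idea concrete, here is a minimal userspace
sketch of a loader setting one bit in loadflags and the booted kernel
testing it.  Everything named here is hypothetical: the struct is a
reduced stand-in for the real setup header, and the bit position is
chosen arbitrarily for the sketch and is not defined by the x86 boot
protocol.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit; NOT part of the real x86 boot protocol. */
#define HYPOTHETICAL_NOEXTNMI_FLAG (1u << 2)

/* Reduced stand-in for the setup header; only the field used here. */
struct setup_header_sketch {
	uint8_t loadflags;
};

/* Loader side: ask the next kernel to mask external NMIs. */
void request_noextnmi(struct setup_header_sketch *hdr)
{
	hdr->loadflags |= HYPOTHETICAL_NOEXTNMI_FLAG;
}

/* Kernel side: check whether the loader set the bit. */
bool noextnmi_requested(const struct setup_header_sketch *hdr)
{
	return (hdr->loadflags & HYPOTHETICAL_NOEXTNMI_FLAG) != 0;
}
```

The point of such a scheme would be that only the kexec loader can set
the bit, so nobody can type it in at a normal boot prompt.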

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.


RE: [V4 PATCH 4/4] x86/apic: Introduce noextnmi boot option

2015-10-05 Thread 河合英宏 / KAWAI,HIDEHIRO
> On Mon, Oct 05, 2015 at 02:03:58AM +, 河合英宏 / KAWAI,HIDEHIRO wrote:
> > That's different from my point of view.  I'm not going to pass
> > some data from the first kernel to the second kernel. I'm just going to
> > provide a configurable option for the second kernel to users.
> 
> Dude, WTF?! You're adding a kernel command line which is supposed to
> be used *only* by the kdump kernel. But nooo, it is there in the open
> and visible to people. And anyone can type it in during boot. AND THAT
> SHOULDN'T BE POSSIBLE IN THE FIRST PLACE!
> 
> This information is strictly for the kdump kernel - it shouldn't be a
> generic command line option. How hard it is to understand that simple
> fact?!

So, the problem for you is that the "noextnmi" option is visible and effective
in the first kernel, isn't it?  If so, we can ignore the "noextnmi" option
if we are in the first kernel and remove it from the documentation.
The "elfcorehdr" cmdline option prepared by the kexec command is passed only
to the second kernel, and it is also used to check whether the booted kernel
is a kdump kernel.  Thus, if "elfcorehdr" is NOT specified, "noextnmi" would
be ignored.

Documentation/kernel-parameters.txt:
> elfcorehdr=[size[KMG]@]offset[KMG] [IA64,PPC,SH,X86,S390]
> Specifies physical address of start of kernel core
> image elf header and optionally the size. Generally
> kexec loader will pass this option to capture kernel.
> See Documentation/kdump/kdump.txt for details.
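
The rule described above (honor "noextnmi" only when "elfcorehdr=" is
also present) can be sketched in userspace as follows.  This is only an
illustration under the assumption that the command line is a plain
space-separated string; both function names are hypothetical, and the
kernel itself would rather rely on its existing kdump detection than
re-scan the command line.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true if the space-separated cmdline has a token that starts with
 * opt (prefix_only == true) or that is exactly opt (prefix_only == false).
 */
bool cmdline_has(const char *cmdline, const char *opt, bool prefix_only)
{
	size_t optlen = strlen(opt);
	const char *p = cmdline;

	while (*p) {
		size_t toklen;

		/* Skip separators, then measure the next token. */
		p += strspn(p, " ");
		toklen = strcspn(p, " ");

		if (toklen >= optlen && !strncmp(p, opt, optlen) &&
		    (prefix_only || toklen == optlen))
			return true;
		p += toklen;
	}
	return false;
}

/* Honor "noextnmi" only if "elfcorehdr=" marks this boot as a capture kernel. */
bool should_mask_extnmi(const char *cmdline)
{
	return cmdline_has(cmdline, "elfcorehdr=", true) &&
	       cmdline_has(cmdline, "noextnmi", false);
}
```

A first kernel booted with "noextnmi" but without "elfcorehdr=" would
then keep receiving external NMIs.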

> 
> 
> > I think we should use the ELF header only if the passed information
> > is saved to a crash dump.
> 
> So what?! ELF header will contain the additional bit of information that
> the second kernel wasn't reacting to NMIs. But that's fine, that *is*
> the desired behavior anyway.
> 
> All I'm saying is, this is a strict kdump kernel "command", so to speak,
> and it doesn't belong with the generic kernel command line parameters.

Regards,

Hidehiro Kawai
Hitachi, Ltd. Research & Development Group





Re: [V4 PATCH 4/4] x86/apic: Introduce noextnmi boot option

2015-10-05 Thread Borislav Petkov
On Mon, Oct 05, 2015 at 02:03:58AM +, 河合英宏 / KAWAI,HIDEHIRO wrote:
> That's different from my point of view.  I'm not going to pass
> some data from the first kernel to the second kernel. I'm just going to
> provide a configurable option for the second kernel to users.

Dude, WTF?! You're adding a kernel command line which is supposed to
be used *only* by the kdump kernel. But nooo, it is there in the open
and visible to people. And anyone can type it in during boot. AND THAT
SHOULDN'T BE POSSIBLE IN THE FIRST PLACE!

This information is strictly for the kdump kernel - it shouldn't be a
generic command line option. How hard it is to understand that simple
fact?!



> I think we should use the ELF header only if the passed information
> is saved to a crash dump.

So what?! ELF header will contain the additional bit of information that
the second kernel wasn't reacting to NMIs. But that's fine, that *is*
the desired behavior anyway.

All I'm saying is, this is a strict kdump kernel "command", so to speak,
and it doesn't belong with the generic kernel command line parameters.

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.


RE: [V4 PATCH 4/4] x86/apic: Introduce noextnmi boot option

2015-10-04 Thread 河合英宏 / KAWAI,HIDEHIRO
> On Fri, Oct 02, 2015 at 12:58:02AM +, 河合英宏 / KAWAI,HIDEHIRO wrote:
> > > On Thu, Oct 01, 2015 at 10:24:19AM +, 河合英宏 / KAWAI,HIDEHIRO wrote:
> > > > But how do we check if the starting kernel is a dump capture kernel?
> > >
> > > How does that first kernel pass info to the capture kernel?
> >
> > As I described in the previous mail,
> 
> I meant: "How does the first kernel pass info to the capture kernel by
> *not* using the kernel command line"?
> 
> The kernel command line is not the channel to pass data to the kdump
> kernel.

That's different from my point of view.  I'm not going to pass
some data from the first kernel to the second kernel. I'm just going to
provide a configurable option for the second kernel to users.

We shouldn't enable this feature silently.  Some users wouldn't want
it enabled.  For example, a user enables a watchdog timer
which raises an external NMI when the counter is not reset for a
specific duration.  Then, the second kernel hangs up while saving the
crash dump, and an NMI is delivered to the CPU.  The kernel panics
due to the NMI, prints some information to the display and serial
console, and then automatically reboots.  In this case, users don't
want to block external NMIs.

So, making this feature configurable by a command line option is
reasonable.

> > Yes, your first kernel doesn't get external NMIs, but basically
> > you don't have to set "noextnmi" option to the first kernel.
> 
> So it doesn't belong there as a kernel command line parameter in the
> first place.
> 
> IOW, you need a different method to pass data to the second kernel. Be
> it an ELF header, be it a shared page, whatever.

I think we should use the ELF header only if the passed information
is saved to a crash dump.

Also, we wouldn't want to introduce a new shared page for that purpose.
A memory segment provided by the kexec syscall is not usable because
the second kernel doesn't know what is in a segment without a
command line option.  Please note that the "elfcorehdr" command line option
prepared by the kexec command is used to inform the second kernel about
the address of the ELF header memory segment.
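
For reference, the documented syntax is
elfcorehdr=[size[KMG]@]offset[KMG].  The following is a minimal
userspace sketch of how such a value could be decoded into a size and
an address.  The function names are hypothetical; the suffix handling is
similar in spirit to the kernel's memparse() helper, but this is not
the kernel's code.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Parse a number with an optional K/M/G suffix; *end points past what was used. */
unsigned long long parse_size(const char *s, char **end)
{
	unsigned long long v = strtoull(s, end, 0);

	if (*end == s)
		return 0;	/* no digits consumed, *end stays at s */

	switch (**end) {
	case 'G': case 'g':
		v <<= 10;
		/* fall through */
	case 'M': case 'm':
		v <<= 10;
		/* fall through */
	case 'K': case 'k':
		v <<= 10;
		(*end)++;
		break;
	default:
		break;
	}
	return v;
}

/* Decode the value of "elfcorehdr=[size[KMG]@]offset[KMG]". */
bool parse_elfcorehdr(const char *arg, unsigned long long *size,
		      unsigned long long *addr)
{
	char *end;
	unsigned long long first = parse_size(arg, &end);

	if (end == arg)
		return false;		/* no number at all */
	if (*end == '@') {
		const char *off = end + 1;

		*size = first;
		*addr = parse_size(off, &end);
		if (end == off)
			return false;	/* '@' without an offset */
	} else {
		*size = 0;		/* size was omitted */
		*addr = first;
	}
	return *end == '\0';		/* reject trailing garbage */
}
```

So the command line carries only the location of the segment, and the
second kernel reads everything else from the ELF header found there.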

Regards,


Hidehiro Kawai
Hitachi, Ltd. Research & Development Group


Re: [V4 PATCH 4/4] x86/apic: Introduce noextnmi boot option

2015-10-02 Thread Borislav Petkov
On Fri, Oct 02, 2015 at 12:58:02AM +, 河合英宏 / KAWAI,HIDEHIRO wrote:
> > On Thu, Oct 01, 2015 at 10:24:19AM +, 河合英宏 / KAWAI,HIDEHIRO wrote:
> > > But how do we check if the starting kernel is a dump capture kernel?
> > 
> > How does that first kernel pass info to the capture kernel?
> 
> As I described in the previous mail,

I meant: "How does the first kernel pass info to the capture kernel by
*not* using the kernel command line"?

The kernel command line is not the channel to pass data to the kdump
kernel.

> Yes, your first kernel doesn't get external NMIs, but basically
> you don't have to set "noextnmi" option to the first kernel.

So it doesn't belong there as a kernel command line parameter in the
first place.

IOW, you need a different method to pass data to the second kernel. Be
it an ELF header, be it a shared page, whatever.

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.


RE: [V4 PATCH 4/4] x86/apic: Introduce noextnmi boot option

2015-10-01 Thread 河合英宏 / KAWAI,HIDEHIRO
> On Thu, Oct 01, 2015 at 10:24:19AM +, 河合英宏 / KAWAI,HIDEHIRO wrote:
> > But how do we check if the starting kernel is a dump capture kernel?
> 
> How does that first kernel pass info to the capture kernel?

As I described in the previous mail, you just have to add "noextnmi"
to KDUMP_COMMANDLINE_APPEND in /etc/sysconfig/kdump.  Then, the "noextnmi"
option is passed to the capture kernel by the kexec command.

A cmdline option gives users flexibility.  I'm not sure all users
want to disable external NMIs in the 2nd kernel.

> > I think using cmdline option is the simplest way.
> 
> More often than not, simplest != correct.
> 
> What happens if I pass this option to the first kernel? All of a sudden
> my *first* kernel doesn't get external NMIs.

Yes, your first kernel doesn't get external NMIs, but basically
you don't have to set "noextnmi" option to the first kernel.


Hidehiro Kawai
Hitachi, Ltd. Research & Development Group




Re: [V4 PATCH 4/4] x86/apic: Introduce noextnmi boot option

2015-10-01 Thread Borislav Petkov
On Thu, Oct 01, 2015 at 10:24:19AM +, 河合英宏 / KAWAI,HIDEHIRO wrote:
> But how do we check if the starting kernel is a dump capture kernel?

How does that first kernel pass info to the capture kernel?

> I think using cmdline option is the simplest way.

More often than not, simplest != correct.

What happens if I pass this option to the first kernel? All of a sudden
my *first* kernel doesn't get external NMIs.

Do you catch my drift?

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.


RE: [V4 PATCH 4/4] x86/apic: Introduce noextnmi boot option

2015-10-01 Thread 河合英宏 / KAWAI,HIDEHIRO
> On Thu, Oct 01, 2015 at 07:01:50AM +, 河合英宏 / KAWAI,HIDEHIRO wrote:
> > I suppose that a server which uses this feature will equip a BMC
> > and BMC mandatorily supports hard reset command for the server.
> > If the HA clustering software detects no response from the server
> > after relatively long timeout, it might want to insert hard reset
> > to the server by IPMI over LAN.
> 
> So why doesn't the capture kernel *automatically* ignore external NMIs,
> without a cmdline option?
> 
> Before it starts capturing, it says: "I'm starting capturing and am
> ignoring external NMIs from now on."

But how do we check if the starting kernel is a dump capture kernel?
I think using a cmdline option is the simplest way.  You just have to
add "noextnmi" to KDUMP_COMMANDLINE_APPEND in /etc/sysconfig/kdump.

Regards,

Hidehiro Kawai
Hitachi, Ltd. Research & Development Group



Re: [V4 PATCH 4/4] x86/apic: Introduce noextnmi boot option

2015-10-01 Thread Borislav Petkov
On Thu, Oct 01, 2015 at 07:01:50AM +, 河合英宏 / KAWAI,HIDEHIRO wrote:
> I suppose that a server which uses this feature will equip a BMC
> and BMC mandatorily supports hard reset command for the server.
> If the HA clustering software detects no response from the server
> after relatively long timeout, it might want to insert hard reset
> to the server by IPMI over LAN.

So why doesn't the capture kernel *automatically* ignore external NMIs,
without a cmdline option?

Before it starts capturing, it says: "I'm starting capturing and am
ignoring external NMIs from now on."

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.


RE: [V4 PATCH 4/4] x86/apic: Introduce noextnmi boot option

2015-10-01 Thread 河合英宏 / KAWAI,HIDEHIRO
> On Thu, Oct 01, 2015 at 02:33:18AM +, 河合英宏 / KAWAI,HIDEHIRO wrote:
> > > On Fri, Sep 25, 2015 at 08:28:11PM +0900, Hidehiro Kawai wrote:
> > > > This patch introduces new boot option "noextnmi" which disables
> > > > external NMI.  This option is useful for the dump capture kernel
> > > > so that an HA application or administrator wouldn't mistakenly
> > > > shoot down the kernel by NMI.
> > >
> > > So that they can get really stuck when the crash kernel crashes, right?
> > > ;-)
> >
> > No, it is different from my intention.
> >
> > `mistakenly' in the above means; they issue NMI due to a misconception
> > that the monitored host is stuck in the 1st kernel while it is actually
> > taking a crash dump in the 2nd kernel.  To avoid this kind of accident,
> > there is a tool such as fence_kdump which notifies "I'm taking a crash
> > dump, so don't send NMI" to the HA clustering software.  However, there
> > is a time window between kernel panic and the notification.
> >
> > "noextnmi" allows users to avoid this kind of accident all the time of
> > 2nd kernel.
> 
> Yes yes, I understand. But if the crash kernel also gets stuck they have
> no means of recovery, right? (other than power cycling the hardware)

Yes, but I think it's not a big problem.

I suppose that a server which uses this feature will be equipped with
a BMC, and a BMC necessarily supports a hard reset command for the server.
If the HA clustering software detects no response from the server
after a relatively long timeout, it might want to issue a hard reset
to the server by IPMI over LAN.

> Just playing devil's advocate here, I don't actually object to the patch.

Regards,

Hidehiro Kawai
Hitachi, Ltd. Research & Development Group





Re: [V4 PATCH 4/4] x86/apic: Introduce noextnmi boot option

2015-10-01 Thread Peter Zijlstra
On Thu, Oct 01, 2015 at 02:33:18AM +, 河合英宏 / KAWAI,HIDEHIRO wrote:
> > On Fri, Sep 25, 2015 at 08:28:11PM +0900, Hidehiro Kawai wrote:
> > > This patch introduces new boot option "noextnmi" which disables
> > > external NMI.  This option is useful for the dump capture kernel
> > > so that an HA application or administrator wouldn't mistakenly
> > > shoot down the kernel by NMI.
> > 
> > So that they can get really stuck when the crash kernel crashes, right?
> > ;-)
> 
> No, that is not what I intended.
> 
> "Mistakenly" above means that they issue an NMI under the misconception
> that the monitored host is stuck in the 1st kernel while it is actually
> taking a crash dump in the 2nd kernel.  To avoid this kind of accident,
> tools such as fence_kdump notify the HA clustering software: "I'm taking
> a crash dump, so don't send an NMI".  However, there is a time window
> between the kernel panic and that notification.
> 
> "noextnmi" lets users avoid this kind of accident for the entire time
> the 2nd kernel is running.

Yes yes, I understand. But if the crash kernel also gets stuck they have
no means of recovery, right? (other than power cycling the hardware)

Just playing devil's advocate here, I don't actually object to the patch.


Re: [V4 PATCH 4/4] x86/apic: Introduce noextnmi boot option

2015-10-01 Thread Borislav Petkov
On Thu, Oct 01, 2015 at 10:24:19AM +, 河合英宏 / KAWAI,HIDEHIRO wrote:
> But how do we check if the starting kernel is a dump capture kernel?

How does that first kernel pass info to the capture kernel?

> I think using a cmdline option is the simplest way.

More often than not, simplest != correct.

What happens if I pass this option to the first kernel? All of a sudden
my *first* kernel doesn't get external NMIs.

Do you catch my drift?

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.


RE: [V4 PATCH 4/4] x86/apic: Introduce noextnmi boot option

2015-10-01 Thread 河合英宏 / KAWAI,HIDEHIRO
> On Thu, Oct 01, 2015 at 07:01:50AM +, 河合英宏 / KAWAI,HIDEHIRO wrote:
> > I assume that a server which uses this feature is equipped with a BMC,
> > and that the BMC is required to support a hard reset command for the
> > server.  If the HA clustering software detects no response from the
> > server after a relatively long timeout, it might want to issue a hard
> > reset to the server via IPMI over LAN.
> 
> So why doesn't the capture kernel *automatically* ignore external NMIs,
> without a cmdline option?
> 
> Before it starts capturing, it says: "I'm starting capturing and am
> ignoring external NMIs from now on."

But how do we check if the starting kernel is a dump capture kernel?
I think using a cmdline option is the simplest way.  You just have to
add "noextnmi" to KDUMP_COMMANDLINE_APPEND in /etc/sysconfig/kdump.

Regards,

Hidehiro Kawai
Hitachi, Ltd. Research & Development Group



RE: [V4 PATCH 4/4] x86/apic: Introduce noextnmi boot option

2015-10-01 Thread 河合英宏 / KAWAI,HIDEHIRO
> On Thu, Oct 01, 2015 at 10:24:19AM +, 河合英宏 / KAWAI,HIDEHIRO wrote:
> > But how do we check if the starting kernel is a dump capture kernel?
> 
> How does that first kernel pass info to the capture kernel?

As I described in the previous mail, you just have to add "noextnmi"
to KDUMP_COMMANDLINE_APPEND in /etc/sysconfig/kdump.  The "noextnmi"
option is then passed to the capture kernel by the kexec command.
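
The mechanism described above can be illustrated with a small user-space
sketch in Python.  It is neither kernel code nor the kdump scripts: the
helper names build_capture_cmdline and has_boot_flag and the example
command lines are assumptions made for illustration, and the sketch
assumes that the capture kernel's command line is simply the first
kernel's command line with KDUMP_COMMANDLINE_APPEND added at the end.

```python
def build_capture_cmdline(base_cmdline: str, append: str) -> str:
    """Join a base command line with extra options (hypothetical helper)."""
    return " ".join(base_cmdline.split() + append.split())


def has_boot_flag(cmdline: str, flag: str) -> bool:
    """Return True if `flag` appears as a whole, value-less token."""
    return flag in cmdline.split()


# Assumed example values; real command lines differ from system to system.
first_kernel_cmdline = "root=/dev/sda1 ro quiet"
kdump_commandline_append = "irqpoll nr_cpus=1 noextnmi"

capture_cmdline = build_capture_cmdline(first_kernel_cmdline,
                                        kdump_commandline_append)

# Only the capture kernel sees the flag; the first kernel is unaffected.
print(has_boot_flag(first_kernel_cmdline, "noextnmi"))
print(has_boot_flag(capture_cmdline, "noextnmi"))
```

In this sketch the flag reaches only the kernel whose command line
carries it, so the first kernel keeps receiving external NMIs unless
someone passes the option to it explicitly.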

A cmdline option gives users flexibility.  I'm not sure all users
want to disable external NMIs in the 2nd kernel.

> > I think using a cmdline option is the simplest way.
> 
> More often than not, simplest != correct.
> 
> What happens if I pass this option to the first kernel? All of a sudden
> my *first* kernel doesn't get external NMIs.

Yes, your first kernel would not get external NMIs, but there is
basically no need to pass the "noextnmi" option to the first kernel.


Hidehiro Kawai
Hitachi, Ltd. Research & Development Group




Re: [V4 PATCH 4/4] x86/apic: Introduce noextnmi boot option

2015-10-01 Thread Borislav Petkov
On Thu, Oct 01, 2015 at 07:01:50AM +, 河合英宏 / KAWAI,HIDEHIRO wrote:
> I assume that a server which uses this feature is equipped with a BMC,
> and that the BMC is required to support a hard reset command for the
> server.  If the HA clustering software detects no response from the
> server after a relatively long timeout, it might want to issue a hard
> reset to the server via IPMI over LAN.

So why doesn't the capture kernel *automatically* ignore external NMIs,
without a cmdline option?

Before it starts capturing, it says: "I'm starting capturing and am
ignoring external NMIs from now on."
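
The difference between the two approaches discussed in this thread can
be laid out as a small decision table.  The following Python sketch is
only a model of the policies, not kernel code; the function names and
the example command lines are hypothetical, and whether the kernel is a
capture kernel is passed in as a plain boolean rather than detected.

```python
def masks_with_cmdline_option(cmdline: str) -> bool:
    """Option policy: mask external NMIs only if "noextnmi" is present."""
    return "noextnmi" in cmdline.split()


def masks_automatically(is_capture_kernel: bool) -> bool:
    """Automatic policy: mask external NMIs in every capture kernel."""
    return is_capture_kernel


# (description, is_capture_kernel, command line) -- assumed example values
scenarios = [
    ("first kernel, no option", False, "ro quiet"),
    ("first kernel, option passed by mistake", False, "ro quiet noextnmi"),
    ("capture kernel, option set", True, "ro quiet noextnmi"),
    ("capture kernel, option not set", True, "ro quiet"),
]

for name, is_capture, cmdline in scenarios:
    print(f"{name}: option={masks_with_cmdline_option(cmdline)} "
          f"automatic={masks_automatically(is_capture)}")
```

The two policies disagree in the second and fourth scenarios: the
option can be set on the first kernel by mistake, while the automatic
policy leaves no way to keep external NMIs enabled in the capture
kernel.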

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.


RE: [V4 PATCH 4/4] x86/apic: Introduce noextnmi boot option

2015-09-30 Thread 河合英宏 / KAWAI,HIDEHIRO
> On Fri, Sep 25, 2015 at 08:28:11PM +0900, Hidehiro Kawai wrote:
> > This patch introduces new boot option "noextnmi" which disables
> > external NMI.  This option is useful for the dump capture kernel
> > so that an HA application or administrator wouldn't mistakenly
> > shoot down the kernel by NMI.
> 
> So that they can get really stuck when the crash kernel crashes, right?
> ;-)

No, that is not what I intended.

"Mistakenly" above means that they issue an NMI under the misconception
that the monitored host is stuck in the 1st kernel while it is actually
taking a crash dump in the 2nd kernel.  To avoid this kind of accident,
tools such as fence_kdump notify the HA clustering software: "I'm taking
a crash dump, so don't send an NMI".  However, there is a time window
between the kernel panic and that notification.

"noextnmi" lets users avoid this kind of accident for the entire time
the 2nd kernel is running.

Regards,


Hidehiro Kawai
Hitachi, Ltd. Research & Development Group




Re: [V4 PATCH 4/4] x86/apic: Introduce noextnmi boot option

2015-09-30 Thread Peter Zijlstra
On Fri, Sep 25, 2015 at 08:28:11PM +0900, Hidehiro Kawai wrote:
> This patch introduces new boot option "noextnmi" which disables
> external NMI.  This option is useful for the dump capture kernel
> so that an HA application or administrator wouldn't mistakenly
> shoot down the kernel by NMI.

So that they can get really stuck when the crash kernel crashes, right?
;-)

