RE: [V3 PATCH 3/4] kexec: Fix race between panic() and crash_kexec() called directly

2015-08-31 Thread 河合英宏 / KAWAI,HIDEHIRO
Hello Peter,

> From: linux-kernel-ow...@vger.kernel.org 
> [mailto:linux-kernel-ow...@vger.kernel.org] On Behalf Of 河合英宏 / KAWAI,
> 
> Hi,
> 
> > From: Peter Zijlstra [mailto:pet...@infradead.org]
> >
> > On Sat, Aug 22, 2015 at 02:35:24AM +, 河合英宏 / KAWAI,HIDEHIRO wrote:
> > > > From: Peter Zijlstra [mailto:pet...@infradead.org]
> > > >
> > > > On Thu, Aug 06, 2015 at 02:45:43PM +0900, Hidehiro Kawai wrote:
> > > > >  void crash_kexec(struct pt_regs *regs)
> > > > >  {
> > > > > + int old_cpu, this_cpu;
> > > > > +
> > > > > + /*
> > > > > +  * `old_cpu == -1' means we are the first comer and 
> > > > > crash_kexec()
> > > > > +  * was called without entering panic().
> > > > > +  * `old_cpu == this_cpu' means crash_kexec() was called from 
> > > > > panic().
> > > > > +  */
> > > > > + this_cpu = raw_smp_processor_id();
> > > > > + old_cpu = atomic_cmpxchg(_cpu, -1, this_cpu);
> > > > > + if (old_cpu != -1 && old_cpu != this_cpu)
> > > > > + return;
> > > >
> > > > This allows recursive calling of crash_kexec(), the Changelog did not
> > > > mention that. Is this really required?
> > >
> > > What part are you arguing?  Recursive call of crash_kexec() doesn't
> > > happen.  In the first place, one of the purpose of this patch is
> > > to prevent a recursive call of crash_kexec() in the following case
> > > as I stated in the description:
> > >
> > > CPU 0:
> > >   oops_end()
> > > crash_kexec()
> > >   mutex_trylock() // acquired
> > > 
> > > io_check_error()
> > >   panic()
> > > crash_kexec()
> > >   mutex_trylock() // failed to acquire
> > > infinite loop
> > >
> >
> > Yes, but what to we want to do there? It seems to me that is wrong, we
> > do not want to let a recursive crash_kexec() proceed.
> >
> > Whereas the condition you created explicitly allows this recursion by
> > virtue of the 'old_cpu != this_cpu' check.
> 
> I understand your question.  I don't intend to permit the recursive
> call of crash_kexec() as for 'old_cpu != this_cpu' check.  That is
> needed for the case of panic() --> crash_kexec().  Since panic_cpu has
> already been set to this_cpu in panic() (please see PATCH 1/4), no one
> can run crash_kexec() without 'old_cpu != this_cpu' check.
> 
> If you don't like this check, I would also be able to handle this case
> like below:
> 
> crash_kexec()
> {
>   old_cpu = atomic_cmpxchg(_cpu, -1, this_cpu);
>   if (old_cpu != -1)
>   return;
> 
>   __crash_kexec();
> }
> 
> panic()
> {
>   atomic_cmpxchg(_cpu, -1, this_cpu);
>   __crash_kexec();
> ...
> 

Is that OK?

Regards,

Hidehiro Kawai
Hitachi, Ltd. Research & Development Group


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [V3 PATCH 3/4] kexec: Fix race between panic() and crash_kexec() called directly

2015-08-31 Thread Peter Zijlstra
On Mon, Aug 31, 2015 at 08:53:11AM +, 河合英宏 / KAWAI,HIDEHIRO wrote:
> > I understand your question.  I don't intend to permit the recursive
> > call of crash_kexec() as for 'old_cpu != this_cpu' check.  That is
> > needed for the case of panic() --> crash_kexec().  Since panic_cpu has
> > already been set to this_cpu in panic() (please see PATCH 1/4), no one
> > can run crash_kexec() without 'old_cpu != this_cpu' check.
> > 
> > If you don't like this check, I would also be able to handle this case
> > like below:
> > 
> > crash_kexec()
> > {
> > old_cpu = atomic_cmpxchg(_cpu, -1, this_cpu);
> > if (old_cpu != -1)
> > return;
> > 
> > __crash_kexec();
> > }
> > 
> > panic()
> > {
> > atomic_cmpxchg(_cpu, -1, this_cpu);
> > __crash_kexec();
> > ...
> > 
> 
> Is that OK?

I suppose so, but I think me getting confused means comments can be
added/improved.

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [V3 PATCH 3/4] kexec: Fix race between panic() and crash_kexec() called directly

2015-08-25 Thread Peter Zijlstra
On Sat, Aug 22, 2015 at 02:35:24AM +, 河合英宏 / KAWAI,HIDEHIRO wrote:
  From: Peter Zijlstra [mailto:pet...@infradead.org]
  
  On Thu, Aug 06, 2015 at 02:45:43PM +0900, Hidehiro Kawai wrote:
void crash_kexec(struct pt_regs *regs)
{
   + int old_cpu, this_cpu;
   +
   + /*
   +  * `old_cpu == -1' means we are the first comer and crash_kexec()
   +  * was called without entering panic().
   +  * `old_cpu == this_cpu' means crash_kexec() was called from panic().
   +  */
   + this_cpu = raw_smp_processor_id();
   + old_cpu = atomic_cmpxchg(panic_cpu, -1, this_cpu);
   + if (old_cpu != -1  old_cpu != this_cpu)
   + return;
  
  This allows recursive calling of crash_kexec(), the Changelog did not
  mention that. Is this really required?
 
 What part are you arguing?  Recursive call of crash_kexec() doesn't
 happen.  In the first place, one of the purpose of this patch is
 to prevent a recursive call of crash_kexec() in the following case
 as I stated in the description:
 
 CPU 0:
   oops_end()
 crash_kexec()
   mutex_trylock() // acquired
 NMI
 io_check_error()
   panic()
 crash_kexec()
   mutex_trylock() // failed to acquire
 infinite loop
 

Yes, but what to we want to do there? It seems to me that is wrong, we
do not want to let a recursive crash_kexec() proceed.

Whereas the condition you created explicitly allows this recursion by
virtue of the 'old_cpu != this_cpu' check.

You changelog does not explain why you want a recursive crash_kexec().

 Also, the logic doesn't change form V1 (although the implementation
 changed), so I didn't add changelogs any more.

I cannot remember V1, nor can any prior patch be relevant.


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


RE: [V3 PATCH 3/4] kexec: Fix race between panic() and crash_kexec() called directly

2015-08-25 Thread 河合英宏 / KAWAI,HIDEHIRO
Hi,

 From: Peter Zijlstra [mailto:pet...@infradead.org]
 
 On Sat, Aug 22, 2015 at 02:35:24AM +, 河合英宏 / KAWAI,HIDEHIRO wrote:
   From: Peter Zijlstra [mailto:pet...@infradead.org]
  
   On Thu, Aug 06, 2015 at 02:45:43PM +0900, Hidehiro Kawai wrote:
 void crash_kexec(struct pt_regs *regs)
 {
+   int old_cpu, this_cpu;
+
+   /*
+* `old_cpu == -1' means we are the first comer and 
crash_kexec()
+* was called without entering panic().
+* `old_cpu == this_cpu' means crash_kexec() was called from 
panic().
+*/
+   this_cpu = raw_smp_processor_id();
+   old_cpu = atomic_cmpxchg(panic_cpu, -1, this_cpu);
+   if (old_cpu != -1  old_cpu != this_cpu)
+   return;
  
   This allows recursive calling of crash_kexec(), the Changelog did not
   mention that. Is this really required?
 
  What part are you arguing?  Recursive call of crash_kexec() doesn't
  happen.  In the first place, one of the purpose of this patch is
  to prevent a recursive call of crash_kexec() in the following case
  as I stated in the description:
 
  CPU 0:
oops_end()
  crash_kexec()
mutex_trylock() // acquired
  NMI
  io_check_error()
panic()
  crash_kexec()
mutex_trylock() // failed to acquire
  infinite loop
 
 
 Yes, but what to we want to do there? It seems to me that is wrong, we
 do not want to let a recursive crash_kexec() proceed.
 
 Whereas the condition you created explicitly allows this recursion by
 virtue of the 'old_cpu != this_cpu' check.

I understand your question.  I don't intend to permit the recursive
call of crash_kexec() as for 'old_cpu != this_cpu' check.  That is
needed for the case of panic() -- crash_kexec().  Since panic_cpu has
already been set to this_cpu in panic() (please see PATCH 1/4), no one
can run crash_kexec() without 'old_cpu != this_cpu' check.

If you don't like this check, I would also be able to handle this case
like below:

crash_kexec()
{
old_cpu = atomic_cmpxchg(panic_cpu, -1, this_cpu);
if (old_cpu != -1)
return;

__crash_kexec();
}

panic()
{
atomic_cmpxchg(panic_cpu, -1, this_cpu);
__crash_kexec();
...


Regards,

Hidehiro Kawai
Hitachi, Ltd. Research  Development Group

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


RE: [V3 PATCH 3/4] kexec: Fix race between panic() and crash_kexec() called directly

2015-08-21 Thread 河合英宏 / KAWAI,HIDEHIRO
 From: Peter Zijlstra [mailto:pet...@infradead.org]
 
 On Thu, Aug 06, 2015 at 02:45:43PM +0900, Hidehiro Kawai wrote:
   void crash_kexec(struct pt_regs *regs)
   {
  +   int old_cpu, this_cpu;
  +
  +   /*
  +* `old_cpu == -1' means we are the first comer and crash_kexec()
  +* was called without entering panic().
  +* `old_cpu == this_cpu' means crash_kexec() was called from panic().
  +*/
  +   this_cpu = raw_smp_processor_id();
  +   old_cpu = atomic_cmpxchg(panic_cpu, -1, this_cpu);
  +   if (old_cpu != -1  old_cpu != this_cpu)
  +   return;
 
 This allows recursive calling of crash_kexec(), the Changelog did not
 mention that. Is this really required?

What part are you arguing?  Recursive call of crash_kexec() doesn't
happen.  In the first place, one of the purpose of this patch is
to prevent a recursive call of crash_kexec() in the following case
as I stated in the description:

CPU 0:
  oops_end()
crash_kexec()
  mutex_trylock() // acquired
NMI
io_check_error()
  panic()
crash_kexec()
  mutex_trylock() // failed to acquire
infinite loop


Also, the logic doesn't change form V1 (although the implementation
changed), so I didn't add changelogs any more.

Regards,

Hidehiro Kawai
Hitachi, Ltd. Research  Development Group


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [V3 PATCH 3/4] kexec: Fix race between panic() and crash_kexec() called directly

2015-08-20 Thread Peter Zijlstra
On Thu, Aug 06, 2015 at 02:45:43PM +0900, Hidehiro Kawai wrote:
  void crash_kexec(struct pt_regs *regs)
  {
 + int old_cpu, this_cpu;
 +
 + /*
 +  * `old_cpu == -1' means we are the first comer and crash_kexec()
 +  * was called without entering panic().
 +  * `old_cpu == this_cpu' means crash_kexec() was called from panic().
 +  */
 + this_cpu = raw_smp_processor_id();
 + old_cpu = atomic_cmpxchg(panic_cpu, -1, this_cpu);
 + if (old_cpu != -1  old_cpu != this_cpu)
 + return;

This allows recursive calling of crash_kexec(), the Changelog did not
mention that. Is this really required?

 +
   /* Take the kexec_mutex here to prevent sys_kexec_load
* running on one cpu from replacing the crash kernel
* we are using after a panic on a different cpu.

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec