RE: [V3 PATCH 3/4] kexec: Fix race between panic() and crash_kexec() called directly
Hello Peter, > From: linux-kernel-ow...@vger.kernel.org > [mailto:linux-kernel-ow...@vger.kernel.org] On Behalf Of 河合英宏 / KAWAI, > > Hi, > > > From: Peter Zijlstra [mailto:pet...@infradead.org] > > > > On Sat, Aug 22, 2015 at 02:35:24AM +, 河合英宏 / KAWAI,HIDEHIRO wrote: > > > > From: Peter Zijlstra [mailto:pet...@infradead.org] > > > > > > > > On Thu, Aug 06, 2015 at 02:45:43PM +0900, Hidehiro Kawai wrote: > > > > > void crash_kexec(struct pt_regs *regs) > > > > > { > > > > > + int old_cpu, this_cpu; > > > > > + > > > > > + /* > > > > > + * `old_cpu == -1' means we are the first comer and > > > > > crash_kexec() > > > > > + * was called without entering panic(). > > > > > + * `old_cpu == this_cpu' means crash_kexec() was called from > > > > > panic(). > > > > > + */ > > > > > + this_cpu = raw_smp_processor_id(); > > > > > + old_cpu = atomic_cmpxchg(_cpu, -1, this_cpu); > > > > > + if (old_cpu != -1 && old_cpu != this_cpu) > > > > > + return; > > > > > > > > This allows recursive calling of crash_kexec(), the Changelog did not > > > > mention that. Is this really required? > > > > > > What part are you arguing? Recursive call of crash_kexec() doesn't > > > happen. In the first place, one of the purpose of this patch is > > > to prevent a recursive call of crash_kexec() in the following case > > > as I stated in the description: > > > > > > CPU 0: > > > oops_end() > > > crash_kexec() > > > mutex_trylock() // acquired > > > > > > io_check_error() > > > panic() > > > crash_kexec() > > > mutex_trylock() // failed to acquire > > > infinite loop > > > > > > > Yes, but what to we want to do there? It seems to me that is wrong, we > > do not want to let a recursive crash_kexec() proceed. > > > > Whereas the condition you created explicitly allows this recursion by > > virtue of the 'old_cpu != this_cpu' check. > > I understand your question. I don't intend to permit the recursive > call of crash_kexec() as for 'old_cpu != this_cpu' check. That is > needed for the case of panic() --> crash_kexec(). Since panic_cpu has > already been set to this_cpu in panic() (please see PATCH 1/4), no one > can run crash_kexec() without 'old_cpu != this_cpu' check. > > If you don't like this check, I would also be able to handle this case > like below: > > crash_kexec() > { > old_cpu = atomic_cmpxchg(_cpu, -1, this_cpu); > if (old_cpu != -1) > return; > > __crash_kexec(); > } > > panic() > { > atomic_cmpxchg(_cpu, -1, this_cpu); > __crash_kexec(); > ... > Is that OK? Regards, Hidehiro Kawai Hitachi, Ltd. Research & Development Group ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [V3 PATCH 3/4] kexec: Fix race between panic() and crash_kexec() called directly
On Mon, Aug 31, 2015 at 08:53:11AM +, 河合英宏 / KAWAI,HIDEHIRO wrote: > > I understand your question. I don't intend to permit the recursive > > call of crash_kexec() as for 'old_cpu != this_cpu' check. That is > > needed for the case of panic() --> crash_kexec(). Since panic_cpu has > > already been set to this_cpu in panic() (please see PATCH 1/4), no one > > can run crash_kexec() without 'old_cpu != this_cpu' check. > > > > If you don't like this check, I would also be able to handle this case > > like below: > > > > crash_kexec() > > { > > old_cpu = atomic_cmpxchg(_cpu, -1, this_cpu); > > if (old_cpu != -1) > > return; > > > > __crash_kexec(); > > } > > > > panic() > > { > > atomic_cmpxchg(_cpu, -1, this_cpu); > > __crash_kexec(); > > ... > > > > Is that OK? I suppose so, but I think me getting confused means comments can be added/improved. ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [V3 PATCH 3/4] kexec: Fix race between panic() and crash_kexec() called directly
On Sat, Aug 22, 2015 at 02:35:24AM +, 河合英宏 / KAWAI,HIDEHIRO wrote: From: Peter Zijlstra [mailto:pet...@infradead.org] On Thu, Aug 06, 2015 at 02:45:43PM +0900, Hidehiro Kawai wrote: void crash_kexec(struct pt_regs *regs) { + int old_cpu, this_cpu; + + /* + * `old_cpu == -1' means we are the first comer and crash_kexec() + * was called without entering panic(). + * `old_cpu == this_cpu' means crash_kexec() was called from panic(). + */ + this_cpu = raw_smp_processor_id(); + old_cpu = atomic_cmpxchg(panic_cpu, -1, this_cpu); + if (old_cpu != -1 old_cpu != this_cpu) + return; This allows recursive calling of crash_kexec(), the Changelog did not mention that. Is this really required? What part are you arguing? Recursive call of crash_kexec() doesn't happen. In the first place, one of the purpose of this patch is to prevent a recursive call of crash_kexec() in the following case as I stated in the description: CPU 0: oops_end() crash_kexec() mutex_trylock() // acquired NMI io_check_error() panic() crash_kexec() mutex_trylock() // failed to acquire infinite loop Yes, but what to we want to do there? It seems to me that is wrong, we do not want to let a recursive crash_kexec() proceed. Whereas the condition you created explicitly allows this recursion by virtue of the 'old_cpu != this_cpu' check. You changelog does not explain why you want a recursive crash_kexec(). Also, the logic doesn't change form V1 (although the implementation changed), so I didn't add changelogs any more. I cannot remember V1, nor can any prior patch be relevant. ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
RE: [V3 PATCH 3/4] kexec: Fix race between panic() and crash_kexec() called directly
Hi, From: Peter Zijlstra [mailto:pet...@infradead.org] On Sat, Aug 22, 2015 at 02:35:24AM +, 河合英宏 / KAWAI,HIDEHIRO wrote: From: Peter Zijlstra [mailto:pet...@infradead.org] On Thu, Aug 06, 2015 at 02:45:43PM +0900, Hidehiro Kawai wrote: void crash_kexec(struct pt_regs *regs) { + int old_cpu, this_cpu; + + /* +* `old_cpu == -1' means we are the first comer and crash_kexec() +* was called without entering panic(). +* `old_cpu == this_cpu' means crash_kexec() was called from panic(). +*/ + this_cpu = raw_smp_processor_id(); + old_cpu = atomic_cmpxchg(panic_cpu, -1, this_cpu); + if (old_cpu != -1 old_cpu != this_cpu) + return; This allows recursive calling of crash_kexec(), the Changelog did not mention that. Is this really required? What part are you arguing? Recursive call of crash_kexec() doesn't happen. In the first place, one of the purpose of this patch is to prevent a recursive call of crash_kexec() in the following case as I stated in the description: CPU 0: oops_end() crash_kexec() mutex_trylock() // acquired NMI io_check_error() panic() crash_kexec() mutex_trylock() // failed to acquire infinite loop Yes, but what to we want to do there? It seems to me that is wrong, we do not want to let a recursive crash_kexec() proceed. Whereas the condition you created explicitly allows this recursion by virtue of the 'old_cpu != this_cpu' check. I understand your question. I don't intend to permit the recursive call of crash_kexec() as for 'old_cpu != this_cpu' check. That is needed for the case of panic() -- crash_kexec(). Since panic_cpu has already been set to this_cpu in panic() (please see PATCH 1/4), no one can run crash_kexec() without 'old_cpu != this_cpu' check. If you don't like this check, I would also be able to handle this case like below: crash_kexec() { old_cpu = atomic_cmpxchg(panic_cpu, -1, this_cpu); if (old_cpu != -1) return; __crash_kexec(); } panic() { atomic_cmpxchg(panic_cpu, -1, this_cpu); __crash_kexec(); ... Regards, Hidehiro Kawai Hitachi, Ltd. Research Development Group ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
RE: [V3 PATCH 3/4] kexec: Fix race between panic() and crash_kexec() called directly
From: Peter Zijlstra [mailto:pet...@infradead.org] On Thu, Aug 06, 2015 at 02:45:43PM +0900, Hidehiro Kawai wrote: void crash_kexec(struct pt_regs *regs) { + int old_cpu, this_cpu; + + /* +* `old_cpu == -1' means we are the first comer and crash_kexec() +* was called without entering panic(). +* `old_cpu == this_cpu' means crash_kexec() was called from panic(). +*/ + this_cpu = raw_smp_processor_id(); + old_cpu = atomic_cmpxchg(panic_cpu, -1, this_cpu); + if (old_cpu != -1 old_cpu != this_cpu) + return; This allows recursive calling of crash_kexec(), the Changelog did not mention that. Is this really required? What part are you arguing? Recursive call of crash_kexec() doesn't happen. In the first place, one of the purpose of this patch is to prevent a recursive call of crash_kexec() in the following case as I stated in the description: CPU 0: oops_end() crash_kexec() mutex_trylock() // acquired NMI io_check_error() panic() crash_kexec() mutex_trylock() // failed to acquire infinite loop Also, the logic doesn't change form V1 (although the implementation changed), so I didn't add changelogs any more. Regards, Hidehiro Kawai Hitachi, Ltd. Research Development Group ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [V3 PATCH 3/4] kexec: Fix race between panic() and crash_kexec() called directly
On Thu, Aug 06, 2015 at 02:45:43PM +0900, Hidehiro Kawai wrote: void crash_kexec(struct pt_regs *regs) { + int old_cpu, this_cpu; + + /* + * `old_cpu == -1' means we are the first comer and crash_kexec() + * was called without entering panic(). + * `old_cpu == this_cpu' means crash_kexec() was called from panic(). + */ + this_cpu = raw_smp_processor_id(); + old_cpu = atomic_cmpxchg(panic_cpu, -1, this_cpu); + if (old_cpu != -1 old_cpu != this_cpu) + return; This allows recursive calling of crash_kexec(), the Changelog did not mention that. Is this really required? + /* Take the kexec_mutex here to prevent sys_kexec_load * running on one cpu from replacing the crash kernel * we are using after a panic on a different cpu. ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec