RE: [RFC][PATCH] pstore: Skip spinlock when just one cpu is online

2012-12-10 Thread Seiji Aguchi
> A boot argument might help - so we can force use of pstore in cases where > kdump is failing (or prevent use of pstore in cases where it > seems to be preventing us getting to kdump ... I don't have a preference). > BUT this would only be useful if we had a repeatable > problem so that we could

RE: [RFC][PATCH] pstore: Skip spinlock when just one cpu is online

2012-12-10 Thread Seiji Aguchi
> > If we can fix it with a small patch in advance, it is really helpful for us. > > As I said in my email I just sent, it may not help you without testing it. > As there are probably other problems in that un-tested theoretical scenario. OK. I understood. > > > > 2) > > In the long term, I pla

RE: [RFC][PATCH] pstore: Skip spinlock when just one cpu is online

2012-12-10 Thread Seiji Aguchi
> Now my first reaction would be, if that is the scenario, why couldn't cpuA > release the lock within one second. Because if cpuA is stuck > talking with firmware, then your patch to force the unlock is probably going > to trip over the same problems. > (those problems include dealing with rese

RE: [RFC][PATCH] pstore: Skip spinlock when just one cpu is online

2012-12-10 Thread Luck, Tony
> But you are assuming that kmsg_dump is perfect and it isn't, in which case > by putting kmsg_dump in the kdump path, you actually may be blocking kdump > from working. I think the concern is that kdump isn't perfect, so sometimes we don't get a good dump from it. In those cases it would have be

Re: [RFC][PATCH] pstore: Skip spinlock when just one cpu is online

2012-12-10 Thread Don Zickus
On Fri, Dec 07, 2012 at 11:43:03PM +, Seiji Aguchi wrote: > > Can all these things really happen (did you run into this problem on a real > > system?). Or is this just a theoretical problem. Ugly (but > > practical) hacks might be OK to solve real problems. > > It is a theoretical problem r

Re: [RFC][PATCH] pstore: Skip spinlock when just one cpu is online

2012-12-10 Thread Don Zickus
On Fri, Dec 07, 2012 at 09:41:13PM +, Seiji Aguchi wrote: > [Issue] > > If one cpu, which is taking psinfo->buf_lock, > receives an NMI from a panicked cpu via smp_send_stop(), > the panicked cpu hangs up in pstore_dump() called by > kmsg_dump(KMSG_DUMP_PANIC) > because the psinfo->buf_lock is

RE: [RFC][PATCH] pstore: Skip spinlock when just one cpu is online

2012-12-07 Thread Seiji Aguchi
> Can all these things really happen (did you run into this problem on a real > system?). Or is this just a theoretical problem. Ugly (but > practical) hacks might be OK to solve real problems. It is a theoretical problem right now. But it is a timing issue and there is a possibility to happen

RE: [RFC][PATCH] pstore: Skip spinlock when just one cpu is online

2012-12-07 Thread Luck, Tony
> This patch skips taking a psinfo->buf_lock when just one cpu is online > because stopped cpus turn to offline via smp_send_stop() > in some architectures like x86, powerpc or arm64. That seems an impressive list of preconditions. So for this to help we need to have taken all but one cpu offline
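
For illustration only, the idea in the patch summary quoted above (skip psinfo->buf_lock when just one cpu is online) might be sketched in user-space C roughly as follows. This is not the actual patch. fake_pstore, fake_online_cpus and fake_pstore_dump are hypothetical stand-ins for psinfo, num_online_cpus() and pstore_dump(), and giving up when the lock is busy on a multi-cpu system is an assumption made for this sketch, not something stated in the thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these names exist in the kernel. */
struct fake_pstore {
    atomic_flag buf_lock;   /* models psinfo->buf_lock */
    int dumps_written;      /* counts successful dumps */
};

/* Models num_online_cpus(); set by the caller in this sketch. */
static int fake_online_cpus = 1;

/* Returns true if the dump was written, false if it was abandoned. */
static bool fake_pstore_dump(struct fake_pstore *ps)
{
    bool locked = false;

    if (fake_online_cpus > 1) {
        /*
         * Other cpus may still be running, so the lock is needed.
         * Assumption for this sketch: try once and give up rather
         * than spin forever on a lock that may never be released.
         */
        if (atomic_flag_test_and_set(&ps->buf_lock))
            return false;
        locked = true;
    }
    /*
     * With one cpu online, any holder of the lock has been stopped
     * and can never release it, so the lock is skipped entirely.
     */

    ps->dumps_written++;    /* stands in for writing the record */

    if (locked)
        atomic_flag_clear(&ps->buf_lock);
    return true;
}
```

With fake_online_cpus set to 1, the dump proceeds even if a stopped cpu left buf_lock held. With more than one cpu online and the lock busy, the sketch returns false instead of hanging, which is the situation the thread is worried about.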