>>> On 17.03.17 at 07:46, <haozhong.zh...@intel.com> wrote: > @@ -52,8 +52,8 @@ void mce_barrier_exit(struct mce_softirq_barrier *bar) > } > } > > -void mce_barrier(struct mce_softirq_barrier *bar) > +void mce_barrier(struct mce_softirq_barrier *bar, bool nowait) > { > - mce_barrier_enter(bar); > - mce_barrier_exit(bar); > + mce_barrier_enter(bar, nowait); > + mce_barrier_exit(bar, nowait); > }
I don't think this currently unused function needs the extra parameter. Should a future user find this necessary, it can be done then. > --- a/xen/arch/x86/cpu/mcheck/mce.c > +++ b/xen/arch/x86/cpu/mcheck/mce.c > @@ -42,6 +42,13 @@ DEFINE_PER_CPU_READ_MOSTLY(struct mca_banks *, > poll_bankmask); > DEFINE_PER_CPU_READ_MOSTLY(struct mca_banks *, no_cmci_banks); > DEFINE_PER_CPU_READ_MOSTLY(struct mca_banks *, mce_clear_banks); > > +/* > + * Flag to indicate that at least one non-local MCE's on this CPU have ... one non-local MCE on this CPU has ... > + * not been completed handled. It's set by mcheck_cmn_handler() and > + * cleared by mce_softirq(). > + */ > +DEFINE_PER_CPU(bool, mce_in_process); static Also please consider whether mce_in_progress might not be a better alternative name, and whether the non-local aspect shouldn't also be expressed by the name chosen. > @@ -1704,10 +1717,11 @@ static void mce_softirq(void) > { > int cpu = smp_processor_id(); > unsigned int workcpu; > + bool nowait = !this_cpu(mce_in_process); > > mce_printk(MCE_VERBOSE, "CPU%d enter softirq\n", cpu); > > - mce_barrier_enter(&mce_inside_bar); > + mce_barrier_enter(&mce_inside_bar, nowait); > > /* > * Everybody is here. Now let's see who gets to do the > @@ -1720,10 +1734,10 @@ static void mce_softirq(void) > > atomic_set(&severity_cpu, cpu); > > - mce_barrier_enter(&mce_severity_bar); > + mce_barrier_enter(&mce_severity_bar, nowait); > if (!mctelem_has_deferred(cpu)) > atomic_set(&severity_cpu, cpu); > - mce_barrier_exit(&mce_severity_bar); > + mce_barrier_exit(&mce_severity_bar, nowait); > > /* We choose severity_cpu for further processing */ > if (atomic_read(&severity_cpu) == cpu) { The logic here looks pretty suspicious even without your changes, but I think we should try hard to not make it worse. I think you need to avoid setting severity_cpu in the LMCE case (with, obviously, further resulting adjustments). And it also needs to be at least clarified (in the patch description) whether this_cpu(mce_in_process) changing (namely from 0 to 1) after you've latched it can't have any bad effect. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel