On 3/18/19 5:40 AM, Peter Zijlstra wrote:
> On Fri, Mar 15, 2019 at 08:41:00PM +0000, Lendacky, Thomas wrote:
>> This issue is different from a previous issue related to perf NMIs that
>> was fixed in commit:
>>   63e6be6d98e1 ("perf, x86: Catch spurious interrupts after disabling counters")
>>
>> The difference here is that the NMI latency can contribute to what appear
>> to be spurious NMIs during the handling of PMC counter overflow while the
>> counter is active, as opposed to when the counter is being disabled.
> 
> But could we not somehow merge these two cases? The cause is similar
> IIUC. The PMI gets triggered, but then:
> 
>   - a previous PMI handler handled our overflow, or
>   - our event gets disabled.
> 
> and when our PMI triggers it finds nothing to do.

Yes, let me look into what we can do here.
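One rough thought: since 63e6be6d98e1 already folds the disabled-counter
case into the handled count via cpuc->running, maybe both cases could feed
a single per-CPU count of "empty" NMIs we still expect to see. Something
along these lines (completely untested sketch; perf_nmi_expected is a
made-up name, not an existing symbol):

  static DEFINE_PER_CPU(unsigned int, perf_nmi_expected);

  static int amd_pmu_handle_irq(struct pt_regs *regs)
  {
          int handled = x86_pmu_handle_irq(regs);

          if (handled) {
                  /*
                   * Each extra overflow consumed here may still have its
                   * own NMI in flight; remember how many to swallow.
                   */
                  this_cpu_write(perf_nmi_expected,
                                 handled > 1 ? handled - 1 : 0);
                  return handled;
          }

          /* Nothing handled: claim the NMI only if one was expected. */
          if (this_cpu_read(perf_nmi_expected)) {
                  this_cpu_dec(perf_nmi_expected);
                  return NMI_HANDLED;
          }

          return NMI_DONE;
  }

That still wouldn't cover an NMI that never made it to the LAPIC in time,
so some variant of the "2" fudge below may still be needed.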

> 
>> +/*
>> + * Because of NMI latency, if multiple PMC counters are active or other
>> + * sources of NMIs are received, the perf NMI handler can handle one or
>> + * more overflowed PMC counters outside of the NMI associated with the
>> + * PMC overflow. If the NMI doesn't arrive at the LAPIC in time to become
>> + * a pending NMI, then the kernel back-to-back NMI support won't be
>> + * active. This PMC handler needs to take into account that this can
>> + * occur, otherwise this could result in unknown NMI messages being
>> + * issued. Examples of this are PMC overflow while in the NMI handler
>> + * when multiple PMCs are active, or PMC overflow while handling some
>> + * other source of an NMI.
>> + *
>> + * Attempt to mitigate this by using the number of active PMCs to
>> + * determine whether to return NMI_HANDLED if the perf NMI handler did
>> + * not handle/reset any PMCs. The per-CPU perf_nmi_counter variable is
>> + * set to the minimum of the number of active PMCs and 2. The value of 2
>> + * is used in case an NMI does not arrive at the LAPIC in time to be
>> + * collapsed into an already pending NMI.
>> + */
>> +static int amd_pmu_handle_irq(struct pt_regs *regs)
>> +{
>> +    struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
>> +    int active, handled;
>> +
>> +    active = __bitmap_weight(cpuc->active_mask, X86_PMC_IDX_MAX);
>> +    handled = x86_pmu_handle_irq(regs);
>> +
>> +    /*
>> +     * If no counters are active, reset perf_nmi_counter and return
>> +     * NMI_DONE
>> +     */
>> +    if (!active) {
>> +            this_cpu_write(perf_nmi_counter, 0);
>> +            return NMI_DONE;
>> +    }
> 
> This will actively render 63e6be6d98e1 void I think. Because that can
> return !0 while !active -- that's rather the whole point of it.

Yup, I didn't take that into account when I changed that. Let me take a
closer look at all this along with your suggestion from the other email.
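To make sure I'm re-reading it right (quoting the gist from memory, so the
exact wording may differ), the catch that 63e6be6d98e1 added to
x86_pmu_handle_irq() is roughly:

  for (idx = 0; idx < x86_pmu.num_counters; idx++) {
          if (!test_bit(idx, cpuc->active_mask)) {
                  /*
                   * Though we deactivated the counter some cpus
                   * might still deliver spurious interrupts still
                   * in flight. Catch them:
                   */
                  if (__test_and_clear_bit(idx, cpuc->running))
                          handled++;
                  continue;
          }
          ...
  }

So handled can be non-zero with an empty active_mask, and the early
NMI_DONE return above throws that away.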

I'm traveling this week, so it might be a few days.

Thanks,
Tom

> 
>> +    /*
>> +     * If a counter was handled, record the number of possible remaining
>> +     * NMIs that can occur.
>> +     */
>> +    if (handled) {
>> +            this_cpu_write(perf_nmi_counter,
>> +                           min_t(unsigned int, 2, active));
>> +
>> +            return handled;
>> +    }
>> +
>> +    if (!this_cpu_read(perf_nmi_counter))
>> +            return NMI_DONE;
>> +
>> +    this_cpu_dec(perf_nmi_counter);
>> +
>> +    return NMI_HANDLED;
>> +}
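For completeness: the per-CPU variable used above is declared earlier in
the patch, outside this hunk; from memory it's along the lines of:

  static DEFINE_PER_CPU(unsigned int, perf_nmi_counter);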
> 
> 
