On Mon, 11 May 2015, Vikas Shivappa wrote:

> Signed-off-by: Vikas Shivappa <[email protected]>
> 
> Conflicts:
>       arch/x86/kernel/cpu/perf_event_intel_cqm.c

And that's interesting for what?

> --- /dev/null
> +++ b/arch/x86/include/asm/rdt_common.h

> @@ -0,0 +1,13 @@
> +#ifndef _X86_RDT_H_
> +#define _X86_RDT_H_
> +
> +#define MSR_IA32_PQR_ASSOC   0x0c8f
> +
> +struct intel_pqr_state {
> +     raw_spinlock_t  lock;
> +     int             rmid;
> +     int             clos;

Those want to be u32. We have no type checks there, but u32 is the
proper data type for wrmsr.

> +     int             cnt;
> +};
> +

What's wrong with having this in intel_rdt.h? It seems you're a big
fan of sprinkling stuff all over the place so reading becomes a
chasing game.
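A rough sketch of what the documented struct with u32 members could look like, along with an illustrative helper showing how the two halves combine into the 64-bit PQR_ASSOC value (the helper and its name are not from the patch, purely for illustration):

```c
#include <stdint.h>

typedef uint32_t u32;
typedef uint64_t u64;

#define MSR_IA32_PQR_ASSOC 0x0c8f

/*
 * Sketch only: per-cpu software cache of the PQR_ASSOC MSR.
 * PQR_ASSOC carries the closid in its high 32 bits and the CQM
 * RMID in its low 10 bits, hence u32 is the right type for both.
 */
struct intel_pqr_state {
	u32 rmid;	/* cached CQM resource monitoring ID (low 10 bits) */
	u32 clos;	/* cached class of service (high 32 bits) */
	u32 cnt;	/* reference count for the cached rmid */
};

/* Illustrative: how clos/rmid map onto the 64-bit MSR value (edx:eax). */
static u64 pqr_assoc_val(const struct intel_pqr_state *s)
{
	return ((u64)s->clos << 32) | s->rmid;
}
```

With the layout documented at the struct definition, the comment above the wrmsr() call in the switch path becomes redundant.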

>  {
>       struct task_struct *task = current;
>       struct intel_rdt *ir;
> +     struct intel_pqr_state *state = this_cpu_ptr(&pqr_state);
> +     unsigned long flags;
>  
> +     raw_spin_lock_irqsave(&state->lock, flags);

finish_arch_switch() is called with interrupts disabled already ...

>       rcu_read_lock();

So we now have a spin_lock() and rcu_read_lock() and no explanation
what is protecting what.

>       ir = task_rdt(task);
> -     if (ir->clos == clos) {
> +     if (ir->clos == state->clos) {

And of course taking the spin lock for checking state->clos is
complete and utter nonsense.

state->clos can only be changed by this code and the only reason why
we need the lock is to protect against concurrent modification of
state->rmid.

So the check for ir->clos == state->clos can be done lockless.

And I seriously doubt, that state->lock is necessary at all.

Q: What is it protecting?
A: state->clos, state->rmid, state->cnt

Q: What's the context?
A: Per cpu context. The per cpu pqr state is NEVER modified from a
   remote cpu.

Q: What is the lock for?
A: Nothing.

Q: Why?
A: Because interrupt disabled regions protect per cpu state perfectly
   fine and there is no memory ordering issue which would require a
   lock or barrier either.

Peter explained it to you several times already that context switch is
one of the most sensitive hot paths where we care about every cycle. But
you just go ahead and mindlessly create pointless overhead.
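The lockless variant being described could be sketched roughly as below. This is a userland mock-up, not the actual patch: per-cpu access and wrmsr() are stubbed out, and the real code relies on finish_arch_switch() already running with interrupts disabled, so the per-cpu state is never touched concurrently.

```c
#include <stdint.h>

typedef uint32_t u32;

struct intel_pqr_state {
	u32 rmid;
	u32 clos;
	u32 cnt;
};

/* Stand-in for this_cpu_ptr(&pqr_state): one "cpu" only. */
static struct intel_pqr_state pqr_state;

/* Stub for wrmsr(MSR_IA32_PQR_ASSOC, lo, hi); records the last write. */
static u32 last_lo, last_hi;
static void wrmsr_stub(u32 lo, u32 hi) { last_lo = lo; last_hi = hi; }

/*
 * Lockless switch path sketch: interrupts are already disabled and the
 * per-cpu pqr state is never modified from a remote cpu, so no lock or
 * barrier is needed. The clos comparison is done without any locking.
 */
static void rdt_sched_in(u32 task_clos)
{
	struct intel_pqr_state *state = &pqr_state;

	if (state->clos == task_clos)
		return;		/* fast path: nothing to rewrite */

	/* Keep the cached rmid in the low bits, write the new closid. */
	wrmsr_stub(state->rmid, task_clos);
	state->clos = task_clos;
}
```

The point of the sketch: the only serialization required is the implicit one provided by the interrupts-off region, which the context switch path already guarantees.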

> +     /*
> +      * PQR has closid in high 32 bits and CQM-RMID
> +      * in low 10 bits. Rewrite the exsting rmid from
> +      * software cache.

This comment wouldn't be necessary if you had properly documented
struct pqr_state.

> +      */
> +     wrmsr(MSR_IA32_PQR_ASSOC, state->rmid, ir->clos);
> +     state->clos = ir->clos;
>       rcu_read_unlock();
> +     raw_spin_unlock_irqrestore(&state->lock, flags);
> +
>  }

> -static DEFINE_PER_CPU(struct intel_cqm_state, cqm_state);
> +DEFINE_PER_CPU(struct intel_pqr_state, pqr_state);

With CONFIG_PERF=n and CONFIG_CGROUP_RDT=y the linker will fail.

Thanks,

        tglx