Re: [PATCH 5/7] x86/intel_rdt: Software Cache for IA32_PQR_MSR
On Wed, 20 May 2015, Thomas Gleixner wrote:
> On Wed, 20 May 2015, Vikas Shivappa wrote:
> > sorry, my reason was wrong though.
> >
> > So PERF_EVENTS can't be disabled on x86 and RDT is only in
> > x86 && SUP_INTEL.
>
> That should have been your answer to #1, right?

Yes, that's why I said ">> sorry, my reason was wrong though." above.

I thought the .config edit would have spit out an error message when
PERF_EVENTS was reselected (basically, print any such dependency changes it
does when starting to compile). I was only curious if you had compiled it by
any chance, as some of the folks were trying to get the Intel systems with
RDT..

Thanks,
Vikas

> Thanks,
>
> 	tglx

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 5/7] x86/intel_rdt: Software Cache for IA32_PQR_MSR
On Wed, 20 May 2015, Vikas Shivappa wrote:
> On Wed, 20 May 2015, Thomas Gleixner wrote:
> > On Wed, 20 May 2015, Vikas Shivappa wrote:
> > > On Fri, 15 May 2015, Thomas Gleixner wrote:
> > > > > -static DEFINE_PER_CPU(struct intel_cqm_state, cqm_state);
> > > > > +DEFINE_PER_CPU(struct intel_pqr_state, pqr_state);
> > > >
> > > > With CONFIG_PERF=n and CONFIG_CGROUP_RDT=y the linker will fail.
> > >
> > > copy from Makefile below -
> > > obj-$(CONFIG_CPU_SUP_INTEL) += perf_event_intel_rapl.o perf_event_intel_cqm.o
> > >
> > > should work with CONFIG_PERF_EVENTS=n and CGROUP_RDT=y ?
> >
> > Groan. Did you try to compile it? Certainly not.
>
> Of course I did compile it, successfully, after I changed PERF_EVENTS=n and
> RDT=y in the .config. Sorry, my reason was wrong though.
>
> What I had not noticed was that the .config simply gets changed back to
> CONFIG_PERF_EVENTS=y even though I disable it by editing .config. That's
> because x86 is selecting it. So PERF_EVENTS can't be disabled on x86 and
> RDT is only in x86 && SUP_INTEL.
>
> How did you configure CONFIG_PERF_EVENTS=n and CGROUP_RDT=y and see the
> error?

I did not compile it. I looked at the config switches. And my two
observations still stand:

 1) It will fail with CONFIG_PERF_EVENTS=n and CGROUP_RDT=y

 2) perf_event_intel_rapl.o depends on CONFIG_PERF_EVENTS=y

It's up to you to provide me a proper reason why it will always be built.
I'm reviewing code and it's not my job to figure out why something magically
works. It's your job to provide proper answers to my review observations.

And your answer to my observation #1 definitely does not fall into that
category:

> > > copy from Makefile below -
> > > obj-$(CONFIG_CPU_SUP_INTEL) += perf_event_intel_rapl.o perf_event_intel_cqm.o
> > >
> > > should work with CONFIG_PERF_EVENTS=n and CGROUP_RDT=y ?

Then I told you that this is crap, because of #2.

So now you gave me an explanation WHY it works magically:

> So PERF_EVENTS can't be disabled on x86 and RDT is only in x86 && SUP_INTEL.

That should have been your answer to #1, right?

Thanks,

	tglx
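[Editor's note: the "x86 is selecting it" point refers to Kconfig `select`
semantics. The following is a paraphrased sketch of the dependency wiring, not
the verbatim in-tree Kconfig text:]

```kconfig
# A "select" is forced: editing .config by hand to CONFIG_PERF_EVENTS=n
# is silently reverted the next time oldconfig runs at build time.
config X86
	def_bool y
	select PERF_EVENTS

# The RDT/CQM code in question only exists under x86 && CPU_SUP_INTEL,
# so PERF_EVENTS is always =y wherever CGROUP_RDT can be enabled at all.
```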
Re: [PATCH 5/7] x86/intel_rdt: Software Cache for IA32_PQR_MSR
On Wed, 20 May 2015, Thomas Gleixner wrote:
> On Wed, 20 May 2015, Vikas Shivappa wrote:
> > On Fri, 15 May 2015, Thomas Gleixner wrote:
> > > > -static DEFINE_PER_CPU(struct intel_cqm_state, cqm_state);
> > > > +DEFINE_PER_CPU(struct intel_pqr_state, pqr_state);
> > >
> > > With CONFIG_PERF=n and CONFIG_CGROUP_RDT=y the linker will fail.
> >
> > copy from Makefile below -
> > obj-$(CONFIG_CPU_SUP_INTEL) += perf_event_intel_rapl.o perf_event_intel_cqm.o
> >
> > should work with CONFIG_PERF_EVENTS=n and CGROUP_RDT=y ?
>
> Groan. Did you try to compile it? Certainly not.

Of course I did compile it, successfully, after I changed PERF_EVENTS=n and
RDT=y in the .config. Sorry, my reason was wrong though.

What I had not noticed was that the .config simply gets changed back to
CONFIG_PERF_EVENTS=y even though I disable it by editing .config. That's
because x86 is selecting it. So PERF_EVENTS can't be disabled on x86 and RDT
is only in x86 && SUP_INTEL.

How did you configure CONFIG_PERF_EVENTS=n and CGROUP_RDT=y and see the
error?

Thanks,
Vikas

> Simply because the whole section which contains the perf_event* object
> files is conditional on "ifdef CONFIG_PERF_EVENTS".
>
> I'm starting to get really grumpy and tired of your attitude.
>
> Thanks,
>
> 	tglx
Re: [PATCH 5/7] x86/intel_rdt: Software Cache for IA32_PQR_MSR
On Wed, 20 May 2015, Vikas Shivappa wrote:
> On Fri, 15 May 2015, Thomas Gleixner wrote:
> > > -static DEFINE_PER_CPU(struct intel_cqm_state, cqm_state);
> > > +DEFINE_PER_CPU(struct intel_pqr_state, pqr_state);
> >
> > With CONFIG_PERF=n and CONFIG_CGROUP_RDT=y the linker will fail.
>
> copy from Makefile below -
> obj-$(CONFIG_CPU_SUP_INTEL) += perf_event_intel_rapl.o perf_event_intel_cqm.o
>
> should work with CONFIG_PERF_EVENTS=n and CGROUP_RDT=y ?

Groan. Did you try to compile it? Certainly not. Simply because the whole
section which contains the perf_event* object files is conditional on
"ifdef CONFIG_PERF_EVENTS".

I'm starting to get really grumpy and tired of your attitude.

Thanks,

	tglx
Re: [PATCH 5/7] x86/intel_rdt: Software Cache for IA32_PQR_MSR
On Fri, 15 May 2015, Thomas Gleixner wrote:
> On Mon, 11 May 2015, Vikas Shivappa wrote:
> > Signed-off-by: Vikas Shivappa
> >
> > Conflicts:
> > 	arch/x86/kernel/cpu/perf_event_intel_cqm.c
>
> And that's interesting for what?

Will remove this; I fixed some conflicts as this code changes both CQM and
cache allocation.

> > --- /dev/null
> > +++ b/arch/x86/include/asm/rdt_common.h
> > @@ -0,0 +1,13 @@
> > +#ifndef _X86_RDT_H_
> > +#define _X86_RDT_H_
> > +
> > +#define MSR_IA32_PQR_ASSOC	0x0c8f
> > +
> > +struct intel_pqr_state {
> > +	raw_spinlock_t	lock;
> > +	int		rmid;
> > +	int		clos;
>
> Those want to be u32. We have no type checks there, but u32 is the proper
> data type for wrmsr.

Will fix.

> > +	int		cnt;
> > +};
> > +
>
> What's wrong with having this in intel_rdt.h?

intel_rdt.h is specific to only cache allocation and the RDT features in the
cgroup. Well, that's a criterion to separate them, isn't it? CQM won't use
most of the things (and growing) in intel_rdt.h.

> It seems you're a big fan of sprinkling stuff all over the place so
> reading becomes a chasing game.
>
> >  {
> >  	struct task_struct *task = current;
> >  	struct intel_rdt *ir;
> > +	struct intel_pqr_state *state = this_cpu_ptr(&pqr_state);
> > +	unsigned long flags;
> >
> > +	raw_spin_lock_irqsave(&state->lock, flags);
>
> finish_arch_switch() is called with interrupts disabled already ...

OK, I somehow could not locate this cli when I thought about it. So if this
runs with interrupts disabled and no preemption, then we don't need either
the spin lock or the rcu lock (since rcu lock sync would obviously wait for
the preempt disable..). Will remove both of them.

> >  	rcu_read_lock();
>
> So we now have a spin_lock() and rcu_read_lock() and no explanation what
> is protecting what.
>
> >  	ir = task_rdt(task);
> > -	if (ir->clos == clos) {
> > +	if (ir->clos == state->clos) {
>
> And of course taking the spin lock for checking state->clos is complete
> and utter nonsense. state->clos can only be changed by this code and the
> only reason why we need the lock is to protect against concurrent
> modification of state->rmid.
>
> So the check for ir->clos == state->clos can be done lockless.
>
> And I seriously doubt, that state->lock is necessary at all.
>
> Q: What is it protecting?
> A: state->clos, state->rmid, state->cnt
>
> Q: What's the context?
> A: Per cpu context. The per cpu pqr state is NEVER modified from a
>    remote cpu.
>
> Q: What is the lock for?
> A: Nothing.
>
> Q: Why?
> A: Because interrupt disabled regions protect per cpu state perfectly
>    fine and there is no memory ordering issue which would require a lock
>    or barrier either.
>
> Peter explained it to you several times already that context switch is
> one of the most sensitive hot paths where we care about every cycle. But
> you just go ahead and mindlessly create pointless overhead.
>
> > +	/*
> > +	 * PQR has closid in high 32 bits and CQM-RMID
> > +	 * in low 10 bits. Rewrite the exsting rmid from
> > +	 * software cache.
>
> This comment wouldn't be necessary if you had properly documented
> struct pqr_state.

Will fix, that would be a lot better.

> > +	 */
> > +	wrmsr(MSR_IA32_PQR_ASSOC, state->rmid, ir->clos);
> > +	state->clos = ir->clos;
> >  	rcu_read_unlock();
> > +	raw_spin_unlock_irqrestore(&state->lock, flags);
> > +
> >  }
>
> > -static DEFINE_PER_CPU(struct intel_cqm_state, cqm_state);
> > +DEFINE_PER_CPU(struct intel_pqr_state, pqr_state);
>
> With CONFIG_PERF=n and CONFIG_CGROUP_RDT=y the linker will fail.

copy from Makefile below -
obj-$(CONFIG_CPU_SUP_INTEL) += perf_event_intel_rapl.o perf_event_intel_cqm.o

should work with CONFIG_PERF_EVENTS=n and CGROUP_RDT=y ?

Thanks,
Vikas

> Thanks,
>
> 	tglx
Re: [PATCH 5/7] x86/intel_rdt: Software Cache for IA32_PQR_MSR
On Mon, 11 May 2015, Vikas Shivappa wrote:
> Signed-off-by: Vikas Shivappa
>
> Conflicts:
> 	arch/x86/kernel/cpu/perf_event_intel_cqm.c

And that's interesting for what?

> --- /dev/null
> +++ b/arch/x86/include/asm/rdt_common.h
> @@ -0,0 +1,13 @@
> +#ifndef _X86_RDT_H_
> +#define _X86_RDT_H_
> +
> +#define MSR_IA32_PQR_ASSOC	0x0c8f
> +
> +struct intel_pqr_state {
> +	raw_spinlock_t	lock;
> +	int		rmid;
> +	int		clos;

Those want to be u32. We have no type checks there, but u32 is the proper
data type for wrmsr.

> +	int		cnt;
> +};
> +

What's wrong with having this in intel_rdt.h? It seems you're a big fan of
sprinkling stuff all over the place so reading becomes a chasing game.

>  {
>  	struct task_struct *task = current;
>  	struct intel_rdt *ir;
> +	struct intel_pqr_state *state = this_cpu_ptr(&pqr_state);
> +	unsigned long flags;
>
> +	raw_spin_lock_irqsave(&state->lock, flags);

finish_arch_switch() is called with interrupts disabled already ...

>  	rcu_read_lock();

So we now have a spin_lock() and rcu_read_lock() and no explanation what is
protecting what.

>  	ir = task_rdt(task);
> -	if (ir->clos == clos) {
> +	if (ir->clos == state->clos) {

And of course taking the spin lock for checking state->clos is complete and
utter nonsense. state->clos can only be changed by this code and the only
reason why we need the lock is to protect against concurrent modification of
state->rmid.

So the check for ir->clos == state->clos can be done lockless.

And I seriously doubt, that state->lock is necessary at all.

Q: What is it protecting?
A: state->clos, state->rmid, state->cnt

Q: What's the context?
A: Per cpu context. The per cpu pqr state is NEVER modified from a remote
   cpu.

Q: What is the lock for?
A: Nothing.

Q: Why?
A: Because interrupt disabled regions protect per cpu state perfectly fine
   and there is no memory ordering issue which would require a lock or
   barrier either.

Peter explained it to you several times already that context switch is one
of the most sensitive hot paths where we care about every cycle. But you
just go ahead and mindlessly create pointless overhead.

> +	/*
> +	 * PQR has closid in high 32 bits and CQM-RMID
> +	 * in low 10 bits. Rewrite the exsting rmid from
> +	 * software cache.

This comment wouldn't be necessary if you had properly documented
struct pqr_state.

> +	 */
> +	wrmsr(MSR_IA32_PQR_ASSOC, state->rmid, ir->clos);
> +	state->clos = ir->clos;
>  	rcu_read_unlock();
> +	raw_spin_unlock_irqrestore(&state->lock, flags);
> +
>  }

> -static DEFINE_PER_CPU(struct intel_cqm_state, cqm_state);
> +DEFINE_PER_CPU(struct intel_pqr_state, pqr_state);

With CONFIG_PERF=n and CONFIG_CGROUP_RDT=y the linker will fail.

Thanks,

	tglx
[PATCH 5/7] x86/intel_rdt: Software Cache for IA32_PQR_MSR
This patch implements a common software cache for IA32_PQR_MSR (RMID 0:9,
CLOSid 32:63) to be used by both cache monitoring (CMT) and cache
allocation. CMT updates the RMID, whereas cache allocation updates the
CLOSid in the software cache. During scheduling, when the new RMID/CLOSid
value is different from the cached values, IA32_PQR_MSR is updated. Since
the measured rdmsr latency for IA32_PQR_MSR is very high (~250 cycles), this
software cache is necessary to avoid reading the MSR back just to compare
the current CLOSid value. Caching reduces the frequency of MSR writes during
the scheduler hot path for cache allocation. During CPU hotplug the pqr
cache is updated to zero.

Signed-off-by: Vikas Shivappa

Conflicts:
	arch/x86/kernel/cpu/perf_event_intel_cqm.c
---
 arch/x86/include/asm/intel_rdt.h           |  9 ++---
 arch/x86/include/asm/rdt_common.h          | 13 +
 arch/x86/kernel/cpu/intel_rdt.c            | 30 +-
 arch/x86/kernel/cpu/perf_event_intel_cqm.c | 20 +++-
 4 files changed, 43 insertions(+), 29 deletions(-)
 create mode 100644 arch/x86/include/asm/rdt_common.h

diff --git a/arch/x86/include/asm/intel_rdt.h b/arch/x86/include/asm/intel_rdt.h
index 589394b..f4372d8 100644
--- a/arch/x86/include/asm/intel_rdt.h
+++ b/arch/x86/include/asm/intel_rdt.h
@@ -4,16 +4,16 @@
 #ifdef CONFIG_CGROUP_RDT

 #include <linux/cgroup.h>
+#include <asm/rdt_common.h>

-#define MSR_IA32_PQR_ASSOC	0xc8f
 #define MAX_CBM_LENGTH		32
 #define IA32_L3_CBM_BASE	0xc90
 #define CBM_FROM_INDEX(x)	(IA32_L3_CBM_BASE + x)
-DECLARE_PER_CPU(unsigned int, x86_cpu_clos);
+
+DECLARE_PER_CPU(struct intel_pqr_state, pqr_state);
 extern struct static_key rdt_enable_key;
 extern void __rdt_sched_in(void);
-
 struct rdt_subsys_info {
 	/* Clos Bitmap to keep track of available CLOSids.*/
 	unsigned long *closmap;
@@ -67,6 +67,9 @@ static inline struct intel_rdt *task_rdt(struct task_struct *task)
  *   IA32_PQR_MSR writes until the user starts really using the feature
  *   ie creates a rdt cgroup directory and assigns a cache_mask thats
  *   different from the root cgroup's cache_mask.
+ * - Caches the per cpu CLOSid values and does the MSR write only
+ *   when a task with a different CLOSid is scheduled in. That
+ *   means the task belongs to a different cgroup.
  * - Closids are allocated so that different cgroup directories
  *   with same cache_mask gets the same CLOSid. This minimizes CLOSids
  *   used and reduces MSR write frequency.
diff --git a/arch/x86/include/asm/rdt_common.h b/arch/x86/include/asm/rdt_common.h
new file mode 100644
index 000..33fd8ea
--- /dev/null
+++ b/arch/x86/include/asm/rdt_common.h
@@ -0,0 +1,13 @@
+#ifndef _X86_RDT_H_
+#define _X86_RDT_H_
+
+#define MSR_IA32_PQR_ASSOC	0x0c8f
+
+struct intel_pqr_state {
+	raw_spinlock_t	lock;
+	int		rmid;
+	int		clos;
+	int		cnt;
+};
+
+#endif
diff --git a/arch/x86/kernel/cpu/intel_rdt.c b/arch/x86/kernel/cpu/intel_rdt.c
index fe3ce4e..2415965 100644
--- a/arch/x86/kernel/cpu/intel_rdt.c
+++ b/arch/x86/kernel/cpu/intel_rdt.c
@@ -85,27 +85,28 @@ void __rdt_sched_in(void)
 {
 	struct task_struct *task = current;
 	struct intel_rdt *ir;
-	unsigned int clos;
-
-	/*
-	 * This needs to be fixed
-	 * to cache the whole PQR instead of just CLOSid.
-	 * PQR has closid in high 32 bits and CQM-RMID in low 10 bits.
-	 * Should not write a 0 to the low 10 bits of PQR
-	 * and corrupt RMID.
-	 */
-	clos = this_cpu_read(x86_cpu_clos);
+	struct intel_pqr_state *state = this_cpu_ptr(&pqr_state);
+	unsigned long flags;

+	raw_spin_lock_irqsave(&state->lock, flags);
 	rcu_read_lock();
 	ir = task_rdt(task);
-	if (ir->clos == clos) {
+	if (ir->clos == state->clos) {
 		rcu_read_unlock();
+		raw_spin_unlock_irqrestore(&state->lock, flags);
 		return;
 	}

-	wrmsr(MSR_IA32_PQR_ASSOC, 0, ir->clos);
-	this_cpu_write(x86_cpu_clos, ir->clos);
+	/*
+	 * PQR has closid in high 32 bits and CQM-RMID
+	 * in low 10 bits. Rewrite the existing rmid from
+	 * software cache.
+	 */
+	wrmsr(MSR_IA32_PQR_ASSOC, state->rmid, ir->clos);
+	state->clos = ir->clos;
 	rcu_read_unlock();
+	raw_spin_unlock_irqrestore(&state->lock, flags);
+
 }

 static void __clos_get(unsigned int closid)
@@ -372,6 +373,9 @@ static inline bool intel_rdt_update_cpumask(int cpu)
  */
 static inline void intel_rdt_cpu_start(int cpu)
 {
+	struct intel_pqr_state *state = &per_cpu(pqr_state, cpu);
+
+	state->clos = 0;
 	mutex_lock(&rdt_group_mutex);
 	if (intel_rdt_update_cpumask(cpu))
 		cbm_update_msrs(cpu);
diff --git a/arch/x86/kernel/cpu/perf_event_intel_cqm.c b/arch/x86/kernel/cpu/perf_event_intel_cqm.c
index e4d1b8b..fd039899 100644
---
[PATCH 5/7] x86/intel_rdt: Software Cache for IA32_PQR_MSR
This patch implements a common software cache for IA32_PQR_MSR (RMID bits 0:9, CLOSid bits 32:63) to be used by both Cache Monitoring (CMT) and Cache Allocation. CMT updates the RMID, whereas cache allocation updates the CLOSid in the software cache. During scheduling, when the new RMID/CLOSid value is different from the cached values, IA32_PQR_MSR is updated. Since the measured rdmsr latency for IA32_PQR_MSR is very high (~250 cycles), this software cache is necessary to avoid reading the MSR just to compare against the current CLOSid value. Caching reduces the frequency of MSR writes during the scheduler hot path for cache allocation. During CPU hotplug the pqr cache is updated to zero.

Signed-off-by: Vikas Shivappa vikas.shiva...@linux.intel.com

Conflicts:
	arch/x86/kernel/cpu/perf_event_intel_cqm.c
---
 arch/x86/include/asm/intel_rdt.h           |  9 ++---
 arch/x86/include/asm/rdt_common.h          | 13 +
 arch/x86/kernel/cpu/intel_rdt.c            | 30 +-
 arch/x86/kernel/cpu/perf_event_intel_cqm.c | 20 +++-
 4 files changed, 43 insertions(+), 29 deletions(-)
 create mode 100644 arch/x86/include/asm/rdt_common.h

diff --git a/arch/x86/include/asm/intel_rdt.h b/arch/x86/include/asm/intel_rdt.h
index 589394b..f4372d8 100644
--- a/arch/x86/include/asm/intel_rdt.h
+++ b/arch/x86/include/asm/intel_rdt.h
@@ -4,16 +4,16 @@
 #ifdef CONFIG_CGROUP_RDT

 #include <linux/cgroup.h>
+#include <asm/rdt_common.h>

-#define MSR_IA32_PQR_ASSOC	0xc8f
 #define MAX_CBM_LENGTH		32
 #define IA32_L3_CBM_BASE	0xc90
 #define CBM_FROM_INDEX(x)	(IA32_L3_CBM_BASE + x)

-DECLARE_PER_CPU(unsigned int, x86_cpu_clos);
+
+DECLARE_PER_CPU(struct intel_pqr_state, pqr_state);
 extern struct static_key rdt_enable_key;
 extern void __rdt_sched_in(void);
-
 struct rdt_subsys_info {
	/* Clos Bitmap to keep track of available CLOSids.*/
	unsigned long *closmap;
@@ -67,6 +67,9 @@ static inline struct intel_rdt *task_rdt(struct task_struct *task)
  * IA32_PQR_MSR writes until the user starts really using the feature
  * ie creates a rdt cgroup directory and assigns a cache_mask thats
  * different from the root cgroup's cache_mask.
+ * - Caches the per cpu CLOSid values and does the MSR write only
+ *   when a task with a different CLOSid is scheduled in. That
+ *   means the task belongs to a different cgroup.
  * - Closids are allocated so that different cgroup directories
  * with same cache_mask gets the same CLOSid. This minimizes CLOSids
  * used and reduces MSR write frequency.
diff --git a/arch/x86/include/asm/rdt_common.h b/arch/x86/include/asm/rdt_common.h
new file mode 100644
index 000..33fd8ea
--- /dev/null
+++ b/arch/x86/include/asm/rdt_common.h
@@ -0,0 +1,13 @@
+#ifndef _X86_RDT_H_
+#define _X86_RDT_H_
+
+#define MSR_IA32_PQR_ASSOC	0x0c8f
+
+struct intel_pqr_state {
+	raw_spinlock_t	lock;
+	int		rmid;
+	int		clos;
+	int		cnt;
+};
+
+#endif
diff --git a/arch/x86/kernel/cpu/intel_rdt.c b/arch/x86/kernel/cpu/intel_rdt.c
index fe3ce4e..2415965 100644
--- a/arch/x86/kernel/cpu/intel_rdt.c
+++ b/arch/x86/kernel/cpu/intel_rdt.c
@@ -85,27 +85,28 @@ void __rdt_sched_in(void)
 {
	struct task_struct *task = current;
	struct intel_rdt *ir;
-	unsigned int clos;
-
-	/*
-	 * This needs to be fixed
-	 * to cache the whole PQR instead of just CLOSid.
-	 * PQR has closid in high 32 bits and CQM-RMID in low 10 bits.
-	 * Should not write a 0 to the low 10 bits of PQR
-	 * and corrupt RMID.
-	 */
-	clos = this_cpu_read(x86_cpu_clos);
+	struct intel_pqr_state *state = this_cpu_ptr(&pqr_state);
+	unsigned long flags;

+	raw_spin_lock_irqsave(&state->lock, flags);
	rcu_read_lock();
	ir = task_rdt(task);
-	if (ir->clos == clos) {
+	if (ir->clos == state->clos) {
		rcu_read_unlock();
+		raw_spin_unlock_irqrestore(&state->lock, flags);
		return;
	}

-	wrmsr(MSR_IA32_PQR_ASSOC, 0, ir->clos);
-	this_cpu_write(x86_cpu_clos, ir->clos);
+	/*
+	 * PQR has closid in high 32 bits and CQM-RMID
+	 * in low 10 bits. Rewrite the existing rmid from
+	 * software cache.
+	 */
+	wrmsr(MSR_IA32_PQR_ASSOC, state->rmid, ir->clos);
+	state->clos = ir->clos;
	rcu_read_unlock();
+	raw_spin_unlock_irqrestore(&state->lock, flags);
+
 }

 static void __clos_get(unsigned int closid)
@@ -372,6 +373,9 @@ static inline bool intel_rdt_update_cpumask(int cpu)
  */
 static inline void intel_rdt_cpu_start(int cpu)
 {
+	struct intel_pqr_state *state = &per_cpu(pqr_state, cpu);
+
+	state->clos = 0;
	mutex_lock(&rdt_group_mutex);
	if (intel_rdt_update_cpumask(cpu))
		cbm_update_msrs(cpu);
diff --git a/arch/x86/kernel/cpu/perf_event_intel_cqm.c
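For readers skimming the thread, the caching policy the changelog describes can be modelled in a few lines of plain C. This is a minimal user-space sketch, not kernel code: `pqr_model_state`, `fake_wrmsr()` and `model_sched_in()` are hypothetical stand-ins for the per-cpu `intel_pqr_state`, `wrmsr()` and `__rdt_sched_in()` in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* User-space model of the IA32_PQR_MSR software cache. RMID lives in
 * bits 0:9 of the MSR, CLOSid in the high 32 bits. */
struct pqr_model_state {
	uint32_t rmid;  /* cached low word  (RMID)   */
	uint32_t clos;  /* cached high word (CLOSid) */
};

uint64_t fake_msr;        /* stands in for the hardware MSR       */
unsigned int msr_writes;  /* counts the expensive "wrmsr" accesses */

void fake_wrmsr(uint32_t lo, uint32_t hi)
{
	fake_msr = ((uint64_t)hi << 32) | lo;
	msr_writes++;
}

/* Scheduler hot-path analogue: write the MSR only when the incoming
 * task's CLOSid differs from the cached one, and always re-write the
 * cached RMID so the low 10 bits are never clobbered. */
void model_sched_in(struct pqr_model_state *s, uint32_t task_clos)
{
	if (s->clos == task_clos)
		return;                     /* cache hit: skip the ~250-cycle access */
	fake_wrmsr(s->rmid, task_clos); /* preserve RMID, update CLOSid */
	s->clos = task_clos;
}
```

The point of the cache shows up in the write counter: back-to-back schedules of tasks with the same CLOSid cost no MSR access at all, which is exactly what the patch claims for the scheduler hot path.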
[PATCH 5/7] x86/intel_rdt: Software Cache for IA32_PQR_MSR
This patch implements a common software cache for IA32_PQR_MSR (RMID bits 0:9, CLOSid bits 32:63) to be used by both CMT and CAT. CMT updates the RMID, whereas CAT updates the CLOSid in the software cache. When the new RMID/CLOSid value is different from the cached values, IA32_PQR_MSR is updated. Since the measured rdmsr latency for IA32_PQR_MSR is very high (~250 cycles), this software cache is necessary to avoid reading the MSR just to compare against the current CLOSid value. During CPU hotplug the pqr cache is updated to zero.

Signed-off-by: Vikas Shivappa vikas.shiva...@linux.intel.com

Conflicts:
	arch/x86/kernel/cpu/perf_event_intel_cqm.c
---
 arch/x86/include/asm/intel_rdt.h           | 31 +++---
 arch/x86/include/asm/rdt_common.h          | 13 +
 arch/x86/kernel/cpu/intel_rdt.c            |  3 +++
 arch/x86/kernel/cpu/perf_event_intel_cqm.c | 20 +++
 4 files changed, 39 insertions(+), 28 deletions(-)
 create mode 100644 arch/x86/include/asm/rdt_common.h

diff --git a/arch/x86/include/asm/intel_rdt.h b/arch/x86/include/asm/intel_rdt.h
index 2fc496f..6aae109 100644
--- a/arch/x86/include/asm/intel_rdt.h
+++ b/arch/x86/include/asm/intel_rdt.h
@@ -4,12 +4,13 @@
 #ifdef CONFIG_CGROUP_RDT

 #include <linux/cgroup.h>
+#include <asm/rdt_common.h>

-#define MSR_IA32_PQR_ASSOC	0xc8f
 #define MAX_CBM_LENGTH		32
 #define IA32_L3_CBM_BASE	0xc90
 #define CBM_FROM_INDEX(x)	(IA32_L3_CBM_BASE + x)

-DECLARE_PER_CPU(unsigned int, x86_cpu_clos);
+
+DECLARE_PER_CPU(struct intel_pqr_state, pqr_state);
 extern struct static_key rdt_enable_key;

 struct rdt_subsys_info {
@@ -61,30 +62,30 @@ static inline struct intel_rdt *task_rdt(struct task_struct *task)
 static inline void rdt_sched_in(struct task_struct *task)
 {
	struct intel_rdt *ir;
-	unsigned int clos;
+	struct intel_pqr_state *state = this_cpu_ptr(&pqr_state);
+	unsigned long flags;

	if (!rdt_enabled())
		return;

-	/*
-	 * This needs to be fixed
-	 * to cache the whole PQR instead of just CLOSid.
-	 * PQR has closid in high 32 bits and CQM-RMID in low 10 bits.
-	 * Should not write a 0 to the low 10 bits of PQR
-	 * and corrupt RMID.
-	 */
-	clos = this_cpu_read(x86_cpu_clos);
-
+	raw_spin_lock_irqsave(&state->lock, flags);
	rcu_read_lock();
	ir = task_rdt(task);
-	if (ir->clos == clos) {
+	if (ir->clos == state->clos) {
		rcu_read_unlock();
+		raw_spin_unlock_irqrestore(&state->lock, flags);
		return;
	}

-	wrmsr(MSR_IA32_PQR_ASSOC, 0, ir->clos);
-	this_cpu_write(x86_cpu_clos, ir->clos);
+	/*
+	 * PQR has closid in high 32 bits and CQM-RMID
+	 * in low 10 bits. Rewrite the existing rmid from
+	 * software cache.
+	 */
+	wrmsr(MSR_IA32_PQR_ASSOC, state->rmid, ir->clos);
+	state->clos = ir->clos;
	rcu_read_unlock();
+	raw_spin_unlock_irqrestore(&state->lock, flags);
 }

 #else
diff --git a/arch/x86/include/asm/rdt_common.h b/arch/x86/include/asm/rdt_common.h
new file mode 100644
index 000..33fd8ea
--- /dev/null
+++ b/arch/x86/include/asm/rdt_common.h
@@ -0,0 +1,13 @@
+#ifndef _X86_RDT_H_
+#define _X86_RDT_H_
+
+#define MSR_IA32_PQR_ASSOC	0x0c8f
+
+struct intel_pqr_state {
+	raw_spinlock_t	lock;
+	int		rmid;
+	int		clos;
+	int		cnt;
+};
+
+#endif
diff --git a/arch/x86/kernel/cpu/intel_rdt.c b/arch/x86/kernel/cpu/intel_rdt.c
index 74b1e28..9da61b2 100644
--- a/arch/x86/kernel/cpu/intel_rdt.c
+++ b/arch/x86/kernel/cpu/intel_rdt.c
@@ -347,6 +347,9 @@ static inline bool rdt_update_cpumask(int cpu)
  */
 static inline void rdt_cpu_start(int cpu)
 {
+	struct intel_pqr_state *state = &per_cpu(pqr_state, cpu);
+
+	state->clos = 0;
	mutex_lock(&rdt_group_mutex);
	if (rdt_update_cpumask(cpu))
		cbm_update_msrs(cpu);
diff --git a/arch/x86/kernel/cpu/perf_event_intel_cqm.c b/arch/x86/kernel/cpu/perf_event_intel_cqm.c
index e4d1b8b..fd039899 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_cqm.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_cqm.c
@@ -7,22 +7,16 @@
 #include <linux/perf_event.h>
 #include <linux/slab.h>
 #include <asm/cpu_device_id.h>
+#include <asm/rdt_common.h>

 #include "perf_event.h"

-#define MSR_IA32_PQR_ASSOC	0x0c8f
 #define MSR_IA32_QM_CTR		0x0c8e
 #define MSR_IA32_QM_EVTSEL	0x0c8d

 static unsigned int cqm_max_rmid = -1;
 static unsigned int cqm_l3_scale; /* supposedly cacheline size */

-struct intel_cqm_state {
-	raw_spinlock_t	lock;
-	int		rmid;
-	int		cnt;
-};
-
-static DEFINE_PER_CPU(struct intel_cqm_state, cqm_state);
+DEFINE_PER_CPU(struct intel_pqr_state, pqr_state);

 /*
  * Protects cache_cgroups and cqm_rmid_free_lru and cqm_rmid_limbo_lru.
@@ -961,7 +955,7 @@ out:
 static void intel_cqm_event_start(struct perf_event *event, int mode)
 {
-	struct intel_cqm_state *state
[PATCH 5/7] x86/intel_rdt: Software Cache for IA32_PQR_MSR
This patch implements a common software cache for IA32_PQR_MSR (RMID bits 0:9, CLOSid bits 32:63) to be used by both CMT and CAT. CMT updates the RMID, whereas CAT updates the CLOSid in the software cache. When the new RMID/CLOSid value is different from the cached values, IA32_PQR_MSR is updated. Since the measured rdmsr latency for IA32_PQR_MSR is very high (~250 cycles), this software cache is necessary to avoid reading the MSR just to compare against the current CLOSid value.

Signed-off-by: Vikas Shivappa vikas.shiva...@linux.intel.com
---
 arch/x86/include/asm/intel_rdt.h           | 31 +++---
 arch/x86/include/asm/rdt_common.h          | 13 +
 arch/x86/kernel/cpu/perf_event_intel_cqm.c | 20 +++
 3 files changed, 36 insertions(+), 28 deletions(-)
 create mode 100644 arch/x86/include/asm/rdt_common.h

diff --git a/arch/x86/include/asm/intel_rdt.h b/arch/x86/include/asm/intel_rdt.h
index 6383a24..5a8139e 100644
--- a/arch/x86/include/asm/intel_rdt.h
+++ b/arch/x86/include/asm/intel_rdt.h
@@ -4,12 +4,13 @@
 #ifdef CONFIG_CGROUP_RDT

 #include <linux/cgroup.h>
+#include <asm/rdt_common.h>

-#define MSR_IA32_PQR_ASSOC	0xc8f
 #define MAX_CBM_LENGTH		32
 #define IA32_L3_CBM_BASE	0xc90
 #define CBM_FROM_INDEX(x)	(IA32_L3_CBM_BASE + x)

-DECLARE_PER_CPU(unsigned int, x86_cpu_clos);
+
+DECLARE_PER_CPU(struct intel_pqr_state, pqr_state);
 extern struct static_key rdt_enable_key;

 struct rdt_subsys_info {
@@ -62,30 +63,30 @@ static inline struct intel_rdt *task_rdt(struct task_struct *task)
 static inline void rdt_sched_in(struct task_struct *task)
 {
	struct intel_rdt *ir;
-	unsigned int clos;
+	struct intel_pqr_state *state = this_cpu_ptr(&pqr_state);
+	unsigned long flags;

	if (!rdt_enabled())
		return;

-	/*
-	 * This needs to be fixed after CQM code stabilizes
-	 * to cache the whole PQR instead of just CLOSid.
-	 * PQR has closid in high 32 bits and CQM-RMID in low 10 bits.
-	 * Should not write a 0 to the low 10 bits of PQR
-	 * and corrupt RMID.
-	 */
-	clos = this_cpu_read(x86_cpu_clos);
-
+	raw_spin_lock_irqsave(&state->lock, flags);
	rcu_read_lock();
	ir = task_rdt(task);
-	if (ir->clos == clos) {
+	if (ir->clos == state->clos) {
		rcu_read_unlock();
+		raw_spin_unlock_irqrestore(&state->lock, flags);
		return;
	}

-	wrmsr(MSR_IA32_PQR_ASSOC, 0, ir->clos);
-	this_cpu_write(x86_cpu_clos, ir->clos);
+	/*
+	 * PQR has closid in high 32 bits and CQM-RMID
+	 * in low 10 bits. Rewrite the existing rmid from
+	 * software cache.
+	 */
+	wrmsr(MSR_IA32_PQR_ASSOC, state->rmid, ir->clos);
+	state->clos = ir->clos;
	rcu_read_unlock();
+	raw_spin_unlock_irqrestore(&state->lock, flags);
 }

 #else
diff --git a/arch/x86/include/asm/rdt_common.h b/arch/x86/include/asm/rdt_common.h
new file mode 100644
index 000..c87f908
--- /dev/null
+++ b/arch/x86/include/asm/rdt_common.h
@@ -0,0 +1,13 @@
+#ifndef _X86_RDT_H_
+#define _X86_RDT_H_
+
+#define MSR_IA32_PQR_ASSOC	0x0c8f
+
+struct intel_pqr_state {
+	raw_spinlock_t	lock;
+	int		rmid;
+	int		clos;
+	int		cnt;
+};
+
+#endif
diff --git a/arch/x86/kernel/cpu/perf_event_intel_cqm.c b/arch/x86/kernel/cpu/perf_event_intel_cqm.c
index 596d1ec..63c52e0 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_cqm.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_cqm.c
@@ -7,22 +7,16 @@
 #include <linux/perf_event.h>
 #include <linux/slab.h>
 #include <asm/cpu_device_id.h>
+#include <asm/rdt_common.h>

 #include "perf_event.h"

-#define MSR_IA32_PQR_ASSOC	0x0c8f
 #define MSR_IA32_QM_CTR		0x0c8e
 #define MSR_IA32_QM_EVTSEL	0x0c8d

 static unsigned int cqm_max_rmid = -1;
 static unsigned int cqm_l3_scale; /* supposedly cacheline size */

-struct intel_cqm_state {
-	raw_spinlock_t	lock;
-	int		rmid;
-	int		cnt;
-};
-
-static DEFINE_PER_CPU(struct intel_cqm_state, cqm_state);
+DEFINE_PER_CPU(struct intel_pqr_state, pqr_state);

 /*
  * Protects cache_cgroups and cqm_rmid_free_lru and cqm_rmid_limbo_lru.
@@ -931,7 +925,7 @@ out:
 static void intel_cqm_event_start(struct perf_event *event, int mode)
 {
-	struct intel_cqm_state *state = this_cpu_ptr(&cqm_state);
+	struct intel_pqr_state *state = this_cpu_ptr(&pqr_state);
	unsigned int rmid = event->hw.cqm_rmid;
	unsigned long flags;
@@ -948,14 +942,14 @@ static void intel_cqm_event_start(struct perf_event *event, int mode)
	WARN_ON_ONCE(state->rmid);

	state->rmid = rmid;
-	wrmsrl(MSR_IA32_PQR_ASSOC, state->rmid);
+	wrmsr(MSR_IA32_PQR_ASSOC, state->rmid, state->clos);

	raw_spin_unlock_irqrestore(&state->lock, flags);
 }

 static void intel_cqm_event_stop(struct perf_event *event, int mode)
 {
-	struct intel_cqm_state *state = this_cpu_ptr(&cqm_state);
+	struct intel_pqr_state
[PATCH 5/7] x86/intel_rdt: Software Cache for IA32_PQR_MSR
This patch implements a common software cache for IA32_PQR_MSR (RMID bits 0:9, CLOSid bits 32:63) to be used by both CMT and CAT. CMT updates the RMID, whereas CAT updates the CLOSid in the software cache. When the new RMID/CLOSid value is different from the cached values, IA32_PQR_MSR is updated. Since the measured rdmsr latency for IA32_PQR_MSR is very high (~250 cycles), this software cache is necessary to avoid reading the MSR just to compare against the current CLOSid value.

Signed-off-by: Vikas Shivappa vikas.shiva...@linux.intel.com
---
 arch/x86/include/asm/intel_rdt.h           | 31 +++---
 arch/x86/include/asm/rdt_common.h          | 13 +
 arch/x86/kernel/cpu/perf_event_intel_cqm.c | 20 +++
 3 files changed, 36 insertions(+), 28 deletions(-)
 create mode 100644 arch/x86/include/asm/rdt_common.h

diff --git a/arch/x86/include/asm/intel_rdt.h b/arch/x86/include/asm/intel_rdt.h
index bc57b56..27621c8 100644
--- a/arch/x86/include/asm/intel_rdt.h
+++ b/arch/x86/include/asm/intel_rdt.h
@@ -4,12 +4,13 @@
 #ifdef CONFIG_CGROUP_RDT

 #include <linux/cgroup.h>
+#include <asm/rdt_common.h>

-#define MSR_IA32_PQR_ASSOC	0xc8f
 #define MAX_CBM_LENGTH		32
 #define IA32_L3_CBM_BASE	0xc90
 #define CBM_FROM_INDEX(x)	(IA32_L3_CBM_BASE + x)

-DECLARE_PER_CPU(unsigned int, x86_cpu_clos);
+
+DECLARE_PER_CPU(struct intel_pqr_state, pqr_state);
 extern struct static_key rdt_enable_key;

 struct rdt_subsys_info {
@@ -64,30 +65,30 @@ static inline struct intel_rdt *task_rdt(struct task_struct *task)
 static inline void rdt_sched_in(struct task_struct *task)
 {
	struct intel_rdt *ir;
-	unsigned int clos;
+	struct intel_pqr_state *state = this_cpu_ptr(&pqr_state);
+	unsigned long flags;

	if (!rdt_enabled())
		return;

-	/*
-	 * This needs to be fixed after CQM code stabilizes
-	 * to cache the whole PQR instead of just CLOSid.
-	 * PQR has closid in high 32 bits and CQM-RMID in low 10 bits.
-	 * Should not write a 0 to the low 10 bits of PQR
-	 * and corrupt RMID.
-	 */
-	clos = this_cpu_read(x86_cpu_clos);
-
+	raw_spin_lock_irqsave(&state->lock, flags);
	rcu_read_lock();
	ir = task_rdt(task);
-	if (ir->clos == clos) {
+	if (ir->clos == state->clos) {
		rcu_read_unlock();
+		raw_spin_unlock_irqrestore(&state->lock, flags);
		return;
	}

-	wrmsr(MSR_IA32_PQR_ASSOC, 0, ir->clos);
-	this_cpu_write(x86_cpu_clos, ir->clos);
+	/*
+	 * PQR has closid in high 32 bits and CQM-RMID
+	 * in low 10 bits. Rewrite the existing rmid from
+	 * software cache.
+	 */
+	wrmsr(MSR_IA32_PQR_ASSOC, state->rmid, ir->clos);
+	state->clos = ir->clos;
	rcu_read_unlock();
+	raw_spin_unlock_irqrestore(&state->lock, flags);
 }

 #else
diff --git a/arch/x86/include/asm/rdt_common.h b/arch/x86/include/asm/rdt_common.h
new file mode 100644
index 000..c87f908
--- /dev/null
+++ b/arch/x86/include/asm/rdt_common.h
@@ -0,0 +1,13 @@
+#ifndef _X86_RDT_H_
+#define _X86_RDT_H_
+
+#define MSR_IA32_PQR_ASSOC	0x0c8f
+
+struct intel_pqr_state {
+	raw_spinlock_t	lock;
+	int		rmid;
+	int		clos;
+	int		cnt;
+};
+
+#endif
diff --git a/arch/x86/kernel/cpu/perf_event_intel_cqm.c b/arch/x86/kernel/cpu/perf_event_intel_cqm.c
index 596d1ec..63c52e0 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_cqm.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_cqm.c
@@ -7,22 +7,16 @@
 #include <linux/perf_event.h>
 #include <linux/slab.h>
 #include <asm/cpu_device_id.h>
+#include <asm/rdt_common.h>

 #include "perf_event.h"

-#define MSR_IA32_PQR_ASSOC	0x0c8f
 #define MSR_IA32_QM_CTR		0x0c8e
 #define MSR_IA32_QM_EVTSEL	0x0c8d

 static unsigned int cqm_max_rmid = -1;
 static unsigned int cqm_l3_scale; /* supposedly cacheline size */

-struct intel_cqm_state {
-	raw_spinlock_t	lock;
-	int		rmid;
-	int		cnt;
-};
-
-static DEFINE_PER_CPU(struct intel_cqm_state, cqm_state);
+DEFINE_PER_CPU(struct intel_pqr_state, pqr_state);

 /*
  * Protects cache_cgroups and cqm_rmid_free_lru and cqm_rmid_limbo_lru.
@@ -931,7 +925,7 @@ out:
 static void intel_cqm_event_start(struct perf_event *event, int mode)
 {
-	struct intel_cqm_state *state = this_cpu_ptr(&cqm_state);
+	struct intel_pqr_state *state = this_cpu_ptr(&pqr_state);
	unsigned int rmid = event->hw.cqm_rmid;
	unsigned long flags;
@@ -948,14 +942,14 @@ static void intel_cqm_event_start(struct perf_event *event, int mode)
	WARN_ON_ONCE(state->rmid);

	state->rmid = rmid;
-	wrmsrl(MSR_IA32_PQR_ASSOC, state->rmid);
+	wrmsr(MSR_IA32_PQR_ASSOC, state->rmid, state->clos);

	raw_spin_unlock_irqrestore(&state->lock, flags);
 }

 static void
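The last hunk in this version changes `wrmsrl(MSR_IA32_PQR_ASSOC, state->rmid)` to `wrmsr(MSR_IA32_PQR_ASSOC, state->rmid, state->clos)`. A small user-space model shows why that matters once the CLOSid lives in the high half of the MSR; the `model_*` helpers below are hypothetical stand-ins for the kernel's MSR accessors, not the real APIs.

```c
#include <assert.h>
#include <stdint.h>

uint64_t pqr;  /* stand-in for IA32_PQR_ASSOC */

/* wrmsrl(msr, val) writes one 64-bit value: passing only the RMID
 * zeroes bits 32:63 and so wipes out the CLOSid. */
void model_wrmsrl(uint64_t val)
{
	pqr = val;
}

/* wrmsr(msr, lo, hi) takes low and high words separately, so the
 * caller can keep the cached CLOSid in the high word while updating
 * the RMID in the low word. */
void model_wrmsr(uint32_t lo, uint32_t hi)
{
	pqr = ((uint64_t)hi << 32) | lo;
}
```

With a CLOSid of 7 already programmed, `model_wrmsrl(rmid)` leaves the high half zero (CLOSid clobbered), while `model_wrmsr(rmid, clos)` preserves both fields, which is the bug the hunk fixes.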